# PyTorch `torch.Tensor`
The `torch.Tensor` class is the foundational building block of PyTorch. It is a multi-dimensional matrix containing elements of a single data type, similar to NumPy's `ndarray`, but with two critical enhancements: it can utilize GPUs for accelerated computing, and it can track operations to perform automatic differentiation (autograd).
Understanding how to instantiate, manipulate, and optimize `torch.Tensor` objects is essential for building, training, and deploying deep learning models.
---
## Introduction
In PyTorch, all data (including model parameters, inputs, outputs, and gradients) is represented as tensors. A `torch.Tensor` is a dynamic, multi-dimensional array.
### Why It Is Used
* **GPU Acceleration:** Tensors can be moved between the CPU and NVIDIA GPUs (via CUDA) or Apple Silicon GPUs (via MPS), which can speed up large, highly parallel computations considerably.
* **Automatic Differentiation:** By setting `requires_grad=True`, PyTorch tracks all operations performed on a tensor, allowing for automatic backpropagation via the `autograd` engine.
* **Interoperability:** CPU tensors can share memory with NumPy arrays (via `torch.from_numpy()` and `Tensor.numpy()`), allowing zero-copy conversions between the two libraries.
---
## Syntax and Parameters
There are two primary ways to work with tensors in PyTorch:
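1. **`torch.Tensor` (The Class):** The main tensor class. `torch.Tensor` is an alias for the default tensor type (`torch.FloatTensor`), so calling `torch.Tensor(data)` returns a tensor of the default floating-point type (`float32` unless the default has been changed), whatever the type of the input.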
2. **`torch.tensor()` (The Factory Function):** The recommended way to construct a tensor from existing Python data (like lists or NumPy arrays). It infers the data type automatically and copies the underlying data.
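
The copy semantics of `torch.tensor()` can be seen by comparing it with `torch.from_numpy()`, which shares memory instead. The following is a minimal sketch; the variable names (`source`, `copied`, `shared`) are illustrative only.

```python
import numpy as np
import torch

source = np.array([1, 2, 3], dtype=np.int64)

copied = torch.tensor(source)      # independent copy of the data
shared = torch.from_numpy(source)  # CPU tensor backed by the same memory

source[0] = 100  # mutate the NumPy array in place

print(copied)  # tensor([1, 2, 3])    -> the copy is unaffected
print(shared)  # tensor([100, 2, 3])  -> the shared tensor sees the change
```

Copying is the safer default; sharing avoids an allocation but couples the lifetime and contents of the two objects.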
### `torch.tensor()` Factory Function Signature
```python
torch.tensor(data, *, dtype=None, device=None, requires_grad=False, pin_memory=False) -> Tensor
```
### Parameters
| Parameter | Type | Description | Default |
| :--- | :--- | :--- | :--- |
| `data` | Array-like | The initial data for the tensor. Can be a list, tuple, NumPy `ndarray`, scalar, or another tensor. | *Required* |
| `dtype` | `torch.dtype` | The desired data type of the returned tensor (e.g., `torch.float32`, `torch.int64`). If `None`, PyTorch infers it. | `None` |
| `device` | `torch.device` | The device on which the tensor is allocated (e.g., `'cpu'`, `'cuda'`, `'mps'`). | `None` (Current default device) |
| `requires_grad` | `bool` | If `True`, autograd will record operations on this tensor for gradient computation. | `False` |
| `pin_memory` | `bool` | If `True`, allocates the tensor in page-locked (pinned) CPU memory, enabling faster copies to CUDA GPUs. | `False` |
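
As a brief illustration of how these parameters interact, the sketch below passes `dtype`, `device`, and `requires_grad` explicitly and compares the result with plain type inference. It assumes the process-wide default dtype has not been changed from `float32`, and it stays on the CPU so that it runs anywhere.

```python
import torch

# With dtype=None, the dtype is inferred from the Python values.
inferred_int = torch.tensor([1, 2, 3])      # integers -> torch.int64
inferred_float = torch.tensor([1.0, 2.0])   # floats   -> torch.float32 (default dtype)

# An explicit dtype overrides inference: integer inputs are stored as float64.
weights = torch.tensor([1, 2, 3], dtype=torch.float64)

# requires_grad=True is only valid for floating-point (or complex) dtypes.
params = torch.tensor([0.5, -1.5], device="cpu", requires_grad=True)

print(inferred_int.dtype, inferred_float.dtype, weights.dtype)
print(params.requires_grad, params.device)
```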
### Common Tensor Attributes
Every `torch.Tensor` object has several key attributes:
* **`shape` (or `size()`):** Returns the dimensions of the tensor (e.g., `torch.Size([3, 224, 224])`).
* **`dtype`:** The data type of the elements stored in the tensor.
* **`device`:** The hardware device where the tensor's data resides.
* **`layout`:** The memory layout of the tensor (usually `torch.strided`).
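
These attributes can be inspected directly on any tensor. The short sketch below uses an image-shaped tensor as an example; `ndim` and `numel()` are not listed above but are included because they are commonly used alongside `shape`.

```python
import torch

image = torch.zeros(3, 224, 224)  # e.g. channels x height x width

print(image.shape)    # torch.Size([3, 224, 224])
print(image.size(0))  # 3 (size() also accepts a dimension index)
print(image.ndim)     # 3 (number of dimensions)
print(image.numel())  # 150528 (total number of elements)
print(image.dtype)    # torch.float32
print(image.device)   # cpu
print(image.layout)   # torch.strided
```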
---
## Code Example
The following complete, runnable script demonstrates how to initialize tensors, inspect their attributes, perform basic operations, move them to a GPU (if available), and compute gradients.
```python
import torch
import numpy as np


def main():
    print(f"PyTorch Version: {torch.__version__}\n")

    # 1. Creation and Type Inference
    # Creating a tensor from a Python list
    data_list = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    tensor_from_list = torch.tensor(data_list)
    print("--- Tensor Creation ---")
    print(f"Tensor:\n{tensor_from_list}")
    print(f"Shape: {tensor_from_list.shape}")
    print(f"Data Type: {tensor_from_list.dtype}")
    print(f"Device: {tensor_from_list.device}\n")

    # 2. NumPy Interoperability (Zero-Copy)
    np_array = np.array([10, 20, 30])
    # torch.from_numpy shares memory with the NumPy array
    tensor_from_np = torch.from_numpy(np_array)
    print("--- NumPy Interoperability ---")
    print(f"Original NumPy Array: {np_array}")
    print(f"Tensor from NumPy: {tensor_from_np}")
    # Modifying the tensor in place modifies the original NumPy array.
    # (Plain assignment, `tensor_from_np = 99`, would only rebind the name.)
    tensor_from_np[0] = 99
    print(f"Modified NumPy Array: {np_array}\n")

    # 3. Device Management (CPU to GPU/MPS)
    # Determine the best available accelerator
    if torch.cuda.is_available():
        device = torch.device("cuda")
    elif torch.backends.mps.is_available():
        device = torch.device("mps")
    else:
        device = torch.device("cpu")
    print("--- Device Allocation ---")
    cpu_tensor = torch.randn(2, 2)
    # Move tensor to the selected device
    accelerated_tensor = cpu_tensor.to(device)
    print(f"Tensor moved to: {accelerated_tensor.device}\n")

    # 4. Autograd (Automatic Differentiation)
    print("--- Autograd Example ---")
    # Create a tensor and track operations on it
    x = torch.tensor([2.0, 3.0], requires_grad=True)
    # Define a function: y = 3*x^2 + 2
    y = 3 * (x ** 2) + 2
    # Compute the sum to get a scalar output for backward pass
    out = y.sum()
    # Backpropagate
    out.backward()
    # dy/dx = 6*x
    # For x = [2, 3], dy/dx should be [12, 18]
    print(f"Input x: {x}")
    print(f"Output out: {out}")
    print(f"Gradients (dy/dx): {x.grad}")


if __name__ == "__main__":
    main()
```
---
## Best Practices and Common Pitfalls
### 1. Use `torch.tensor()` instead of `torch.Tensor()`
Avoid using the class constructor `torch.Tensor()` (with a capital 'T') to create tensors from existing data.
* **Why:** `torch.Tensor()` is an alias for `torch.FloatTensor()` and will silently convert integer inputs into floats. Because it follows the process-wide default tensor type, its output dtype can also change if that default is changed elsewhere in the program.
* **Best Practice:** Always use the factory function `torch.tensor()` to copy data, or `torch.as_tensor()` / `torch.from_numpy()` to share memory.
```python
# Avoid:
x = torch.Tensor([1, 2, 3]) # Results in float32: tensor([1., 2., 3.])
# Preferred:
x = torch.tensor([1, 2, 3]) # Preserves type as int64: tensor([1, 2, 3])
```
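
A related pitfall, not covered above but worth a quick sketch: when the class constructor receives a bare integer, it treats the argument as a size rather than as data, whereas the factory function treats it as a value.

```python
import torch

# The legacy constructor interprets an integer argument as a SIZE:
# this is a 1-D tensor with 3 uninitialized (arbitrary) elements.
uninitialized = torch.Tensor(3)

# The factory function interprets it as DATA:
# this is a 0-D (scalar) tensor holding the value 3.
scalar = torch.tensor(3)

print(uninitialized.shape)  # torch.Size([3])
print(scalar.shape)         # torch.Size([])
print(scalar.item())        # 3
```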
### 2. Avoid Device Mismatch Errors
PyTorch cannot perform operations on tensors that reside on different devices. Attempting to add a CPU tensor to a GPU tensor raises a runtime error: `RuntimeError: Expected all tensors to be on the same device...`
* **Best Practice:** Always define a target `device` variable at the beginning of your script and explicitly send newly created tensors or models to that device.
```python
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Ensure both tensors are on the same device
a = torch.randn(3, 3).to(device)
b = torch.ones(3, 3, device=device) # Directly create on device
result = a + b
```
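
Another way to keep tensors co-located, sketched here as a complement to a global `device` variable, is to derive new tensors from an existing one so that they inherit its device (and dtype). The tensor names below are illustrative.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
a = torch.randn(3, 3, device=device)

# Each of these lands on the same device as `a` without naming it again.
b = torch.ones_like(a)                                  # same shape, dtype, device
c = torch.zeros(3, 3, dtype=a.dtype, device=a.device)   # reuse a's attributes explicitly
d = a.new_full((3, 3), 2.0)                             # new tensor with a's dtype and device

result = a + b + c + d  # no device mismatch is possible here
print(result.device)
```

This pattern is useful inside functions that receive tensors as arguments and should not assume where those tensors live.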
### 3. Detach Tensors When Converting to NumPy or Plotting
If a tensor has `requires_grad=True`, you cannot directly convert it to a NumPy array using `.numpy()`. Doing so will raise a `RuntimeError`.
* **Best Practice:** Use `.detach()` to obtain a tensor that is excluded from gradient tracking, followed by `.cpu()` if the tensor may live on a GPU, before converting it or passing it to external libraries like Matplotlib.
```python
x = torch.randn(3, requires_grad=True)
# Avoid:
# arr = x.numpy() # Throws RuntimeError
# Preferred:
arr = x.detach().cpu().numpy()
```
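
One detail worth keeping in mind is that `.detach()` does not copy data: for a CPU tensor, the detached tensor and the resulting NumPy array still share memory with the original. The sketch below demonstrates both the error and the sharing; it runs on the CPU only.

```python
import torch

x = torch.randn(3, requires_grad=True)

# Direct conversion fails while the tensor is tracked by autograd.
try:
    x.numpy()
except RuntimeError as err:
    print(f"RuntimeError: {err}")

detached = x.detach()         # same data, no gradient tracking
arr = detached.cpu().numpy()  # .cpu() is a no-op for a tensor already on the CPU

# Update x in place without recording the operation in the autograd graph.
with torch.no_grad():
    x[0] = 1.0

print(detached.requires_grad)  # False
print(arr[0])                  # 1.0 -> the array sees the in-place update
```

If an independent snapshot is needed, adding `.clone()` after `.detach()` produces a copy that later in-place updates will not affect.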