Tensors are the core data structure in TensorFlow. A tensor generalizes the multi-dimensional array: scalars, vectors, and matrices are tensors with 0, 1, and 2 dimensions.
In machine learning, almost all data is eventually represented and processed in tensor form.
Basic Properties of Tensors
- Data type (dtype): Each tensor has a specific data type, such as tf.float32, tf.int64, etc.
- Shape: Represents the size of each dimension of the tensor, e.g., (2,3) represents a matrix with 2 rows and 3 columns
- Device location (device): Indicates whether the tensor is stored on CPU or GPU
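The three properties above can be read directly from any tensor. The following is a minimal sketch, assuming eager execution (the TensorFlow 2 default); the variable name `t` is only illustrative.

```python
import tensorflow as tf

# A 2x3 float tensor; the name `t` is only for illustration.
t = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

print(t.dtype)   # data type of the elements
print(t.shape)   # size of each dimension
print(t.device)  # device string; the exact value depends on your machine
```

The device string differs between machines, so it is best treated as information rather than something to compare against a fixed value.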
Dimensions of Tensors
- 0-dimensional tensor: scalar, e.g., tf.constant(5)
- 1-dimensional tensor: vector, e.g., tf.constant([1,2,3])
- 2-dimensional tensor: matrix, e.g., tf.constant([[1,2],[3,4]])
- 3-dimensional and above: higher-order tensor, e.g., tf.ones((2,3,4)) represents 2 matrices of size 3×4
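The number of dimensions (the rank) can be checked in code. This sketch rebuilds the four example tensors from the list above and prints the rank of each; the variable names are illustrative.

```python
import tensorflow as tf

scalar = tf.constant(5)
vector = tf.constant([1, 2, 3])
matrix = tf.constant([[1, 2], [3, 4]])
cube = tf.ones((2, 3, 4))

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    # shape.rank is the number of dimensions of the tensor
    print(name, "rank:", t.shape.rank, "shape:", t.shape)
```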
Common Methods for Creating Tensors
1. Creating from Python Lists/NumPy Arrays
Example
import tensorflow as tf
import numpy as np
# Create from Python list
tensor_from_list = tf.constant([[1,2],[3,4]])
# Create from NumPy array
numpy_array = np.array([[5,6],[7,8]])
tensor_from_numpy = tf.constant(numpy_array)
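When no dtype is given, tf.constant infers one from the input. The sketch below shows the inference for a few common inputs and how to override it; the variable names are illustrative.

```python
import numpy as np
import tensorflow as tf

# Python ints are inferred as tf.int32
ints = tf.constant([[1, 2], [3, 4]])
# Python floats are inferred as tf.float32
floats = tf.constant([[1.0, 2.0], [3.0, 4.0]])
# A NumPy float64 array keeps its dtype
from_np = tf.constant(np.array([1.0, 2.0], dtype=np.float64))
# An explicit dtype overrides the inference
explicit = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)

print(ints.dtype, floats.dtype, from_np.dtype, explicit.dtype)
```

Passing dtype explicitly is a reasonable habit when tensors from Python lists and NumPy arrays will later be combined, since TensorFlow does not convert between dtypes automatically by default.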
2. Creating Special Value Tensors
Example
# All-zero tensor
zeros = tf.zeros((2,3))  # 2×3 all-zero matrix
# All-one tensor
ones = tf.ones((3,2))  # 3×2 all-one matrix
# Identity matrix
eye = tf.eye(3)  # 3×3 identity matrix
# Fill with specific value
filled = tf.fill((2,2), 7)  # 2×2 matrix with all elements as 7
3. Creating Random Tensors
Example
# Uniform distribution random numbers
uniform = tf.random.uniform((2,2), minval=0, maxval=1)
# Normal distribution random numbers
normal = tf.random.normal((3,3), mean=0, stddev=1)
# Random shuffle
shuffled = tf.random.shuffle(tf.constant([1,2,3,4,5]))
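Random tensors differ from run to run unless a seed is fixed. One way to get reproducible values is tf.random.Generator, sketched here; the seed value 42 and the variable names are arbitrary choices, not part of the original example.

```python
import numpy as np
import tensorflow as tf

# Two generators built from the same seed produce the same stream of numbers.
# The seed value 42 is arbitrary.
gen_a = tf.random.Generator.from_seed(42)
gen_b = tf.random.Generator.from_seed(42)

normal_a = gen_a.normal(shape=(3, 3), mean=0.0, stddev=1.0)
normal_b = gen_b.normal(shape=(3, 3), mean=0.0, stddev=1.0)
print(np.array_equal(normal_a.numpy(), normal_b.numpy()))

# Uniform samples fall in the half-open interval [minval, maxval)
uniform = gen_a.uniform(shape=(2, 2), minval=0.0, maxval=1.0)
print(uniform.numpy())
```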
Basic Tensor Operations
1. Mathematical Operations
Example
a = tf.constant([[1,2],[3,4]])
b = tf.constant([[5,6],[7,8]])
# Element-wise addition
add = tf.add(a, b)  # or use operator overloading: a + b
# Element-wise multiplication
mul = tf.multiply(a, b)  # or a * b
# Matrix multiplication
matmul = tf.matmul(a, b)  # or a @ b
# Other mathematical operations
sqrt = tf.sqrt(tf.cast(a, tf.float32))  # square root (requires conversion to float type)
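To make the difference between element-wise and matrix multiplication concrete, this sketch computes both on the same two matrices and prints the values, using the operator forms.

```python
import tensorflow as tf

a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])

# Element-wise: each entry is combined with the entry at the same position
print((a + b).numpy())  # [[ 6  8] [10 12]]
print((a * b).numpy())  # [[ 5 12] [21 32]]

# Matrix product: rows of a are combined with columns of b
# e.g. top-left entry = 1*5 + 2*7 = 19
print((a @ b).numpy())  # [[19 22] [43 50]]
```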
2. Shape Operations
Example
tensor = tf.constant([[1,2,3],[4,5,6]])
# Get shape
shape = tensor.shape  # TensorShape([2, 3])
# Reshape
reshaped = tf.reshape(tensor, (3,2))  # 3 rows and 2 columns, element order kept: [[1,2],[3,4],[5,6]]
# Transpose
transposed = tf.transpose(tensor)  # also 3 rows and 2 columns, but rows and columns are swapped: [[1,4],[2,5],[3,6]]
# Expand dimensions
expanded = tf.expand_dims(tensor, axis=0)  # shape changes from (2,3) to (1,2,3)
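Two related shape operations often appear alongside these: letting tf.reshape infer one dimension with -1, and removing a size-1 dimension with tf.squeeze. A short sketch with illustrative variable names:

```python
import tensorflow as tf

tensor = tf.constant([[1, 2, 3], [4, 5, 6]])

# -1 asks TensorFlow to infer that dimension from the total element count
flat = tf.reshape(tensor, (-1,))       # shape (6,)
inferred = tf.reshape(tensor, (3, -1)) # shape (3, 2)

# squeeze removes a size-1 dimension, undoing expand_dims
expanded = tf.expand_dims(tensor, axis=0)  # shape (1, 2, 3)
squeezed = tf.squeeze(expanded, axis=0)    # back to shape (2, 3)

print(flat.shape, inferred.shape, expanded.shape, squeezed.shape)
```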
3. Indexing and Slicing
Example
tensor = tf.constant([[1,2,3],[4,5,6],[7,8,9]])
# Get single element
elem = tensor[1,2]  # Get element at row 2, column 3 (value is 6)
# Slicing operation
row = tensor[1, :]  # Get all elements of row 2: [4,5,6]
col = tensor[:,1]  # Get all elements of column 2: [2,5,8]
sub = tensor[0:2,1:]  # Get rows 1-2, columns 2-3: [[2,3],[5,6]]
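Beyond basic slices, tensors also accept negative indices and strides, and TensorFlow offers tf.gather and tf.boolean_mask for picking non-contiguous elements. The following sketch applies them to the same 3×3 tensor; the variable names are illustrative.

```python
import tensorflow as tf

tensor = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

last_row = tensor[-1]        # negative index counts from the end
every_other = tensor[::2]    # stride of 2 over the rows

# Pick columns 0 and 2 (a non-contiguous selection)
cols = tf.gather(tensor, [0, 2], axis=1)

# Keep only the elements for which the condition holds (result is 1-D)
large = tf.boolean_mask(tensor, tensor > 5)

print(last_row.numpy())     # [7 8 9]
print(every_other.numpy())  # [[1 2 3] [7 8 9]]
print(cols.numpy())         # [[1 3] [4 6] [7 9]]
print(large.numpy())        # [6 7 8 9]
```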
Tensor Broadcasting Mechanism
Broadcasting is how TensorFlow handles element-wise operations between tensors of different shapes: the smaller tensor is automatically expanded to match the shape of the larger one.
Broadcasting Rules
- Compare the two shapes dimension by dimension, starting from the last dimension and moving forward
- Two dimensions are compatible if they are equal, if one of them is 1, or if one of them does not exist
- The tensor is copied and expanded along dimensions that are missing or are 1
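The rules above can be written out as a small function. This is a sketch for illustration only: `broadcast_shape` is a hypothetical helper, not a TensorFlow API. Its result is compared with TensorFlow's own tf.broadcast_static_shape.

```python
import tensorflow as tf

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rules to two shape tuples."""
    result = []
    # Walk both shapes from the last dimension forward.
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        # A missing dimension behaves like a dimension of size 1.
        dim_a = shape_a[-i] if i <= len(shape_a) else 1
        dim_b = shape_b[-i] if i <= len(shape_b) else 1
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(f"shapes {shape_a} and {shape_b} cannot be broadcast")
        # The size-1 side is expanded to the size of the other side.
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(reversed(result))

print(broadcast_shape((3, 1), (3,)))  # (3, 3)

# TensorFlow's built-in check gives the same answer
tf_shape = tf.broadcast_static_shape(tf.TensorShape([3, 1]), tf.TensorShape([3]))
print(tf_shape)
```

Shapes such as (2, 3) and (2,) fail the check, because the last dimensions are 3 and 2 and neither is 1.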
Broadcasting Examples
Example
# Vector (3,) added with scalar ()
a = tf.constant([1,2,3])
b = tf.constant(2)
c = a + b # Result is [3,4,5], b is broadcast to [2,2,2]
# Matrix (3,1) added with vector (3,)
d = tf.constant([[1],[2],[3]])
e = tf.constant([10,20,30])
f = d + e # d is broadcast to [[1,1,1],[2,2,2],[3,3,3]]
# Result is [[11,21,31],[12,22,32],[13,23,33]]
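The implicit expansion can be made visible with tf.broadcast_to, which materializes the expanded tensors. This sketch assumes d is the 3×1 column [[1],[2],[3]] and e is the vector [10,20,30], and checks that adding the expanded tensors gives the same result as d + e.

```python
import tensorflow as tf

d = tf.constant([[1], [2], [3]])  # shape (3, 1)
e = tf.constant([10, 20, 30])     # shape (3,)

# Expand both operands to the common shape (3, 3) explicitly
d_full = tf.broadcast_to(d, (3, 3))  # each row repeats one value of d
e_full = tf.broadcast_to(e, (3, 3))  # e is repeated as every row

print(d_full.numpy())  # [[1 1 1] [2 2 2] [3 3 3]]
print(e_full.numpy())  # [[10 20 30] [10 20 30] [10 20 30]]

explicit_sum = d_full + e_full
implicit_sum = d + e  # broadcasting does the same expansion behind the scenes
```

In practice the implicit form is preferable, since it avoids allocating the expanded copies.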
Tensor Aggregation Operations
Common Aggregation Functions
Example
tensor = tf.constant([[1,2,3],[4,5,6]])
# Sum
sum_all = tf.reduce_sum(tensor)  # Sum of all elements → 21
sum_axis0 = tf.reduce_sum(tensor, 0)  # Sum along dimension 0 (down the rows, one sum per column) → [5,7,9]
sum_axis1 = tf.reduce_sum(tensor, 1)  # Sum along dimension 1 (across the columns, one sum per row) → [6,15]
# Mean
mean_all = tf.reduce_mean(tf.cast(tensor, tf.float32))  # Mean of all elements → 3.5 (cast first: the mean of an int32 tensor is an int32, here 3)
# Max/Min
max_val = tf.reduce_max(tensor)  # Maximum value → 6
min_val = tf.reduce_min(tensor)  # Minimum value → 1
# Logical operations
any_true = tf.reduce_any(tensor > 4)  # Is any element >4 → True
all_true = tf.reduce_all(tensor > 0)  # Are all elements >0 → True
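Two options that often accompany these functions are keepdims, which keeps the reduced dimension with size 1 so the result still broadcasts against the input, and tf.argmax, which returns the position of the maximum rather than its value. A sketch with illustrative variable names:

```python
import tensorflow as tf

tensor = tf.constant([[1, 2, 3], [4, 5, 6]])

# keepdims=True keeps the reduced axis, so the result has shape (2, 1)
row_sums = tf.reduce_sum(tensor, axis=1, keepdims=True)
print(row_sums.numpy())  # [[ 6] [15]]

# The (2, 1) result broadcasts against the (2, 3) input,
# so each row can be divided by its own sum.
normalized = tf.cast(tensor, tf.float32) / tf.cast(row_sums, tf.float32)

# Index of the largest element in each row
max_index = tf.argmax(tensor, axis=1)
print(max_index.numpy())  # [2 2]
```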
Practical Exercises
Exercise 1: Creating and Manipulating Tensors
Example
# 1. Create a 3×3 random integer matrix with element values from 0 to 9 (maxval=10 is exclusive)
random_matrix = tf.random.uniform((3,3), minval=0, maxval=10, dtype=tf.int32)
# 2. Calculate the transpose of this matrix
transposed_matrix = tf.transpose(random_matrix)
# 3. Calculate the product of the matrix and its transpose
product = tf.matmul(random_matrix, transposed_matrix)
# 4. Calculate the sum of diagonal elements of the product matrix
diag_sum = tf.reduce_sum(tf.linalg.diag_part(product))
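One way to sanity-check the result of this exercise is the identity that the diagonal sum (trace) of a matrix times its transpose equals the sum of the squared elements of the matrix. The sketch below uses a fixed matrix instead of a random one so the numbers are reproducible; the matrix values are chosen only for illustration.

```python
import tensorflow as tf

# Fixed matrix so the result is reproducible (values chosen for illustration)
fixed_matrix = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

product = tf.matmul(fixed_matrix, tf.transpose(fixed_matrix))

# Three ways to reach the same number
diag_sum = tf.reduce_sum(tf.linalg.diag_part(product))
trace = tf.linalg.trace(product)
sum_of_squares = tf.reduce_sum(tf.square(fixed_matrix))

print(int(diag_sum), int(trace), int(sum_of_squares))  # 285 285 285
```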
Exercise 2: Broadcasting Mechanism Application
Example
# 1. Create a 4×1 matrix and a length-4 vector (shape (4,), which broadcasts like a 1×4 row)
matrix = tf.constant([[1],[2],[3],[4]])
vector = tf.constant([10,20,30,40])
# 2. Use broadcasting mechanism to calculate their sum
broadcast_sum = matrix + vector
# 3. Verify the shape and values of the result
print("Shape:", broadcast_sum.shape)  # Should be (4,4)
print("Result:", broadcast_sum.numpy())
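The verification step can also be automated instead of read off the printed output. This sketch assumes the matrix values are [[1],[2],[3],[4]] (an assumption made for illustration) and uses tf.debugging.assert_equal, which raises an error if any element differs.

```python
import tensorflow as tf

# Matrix values are an assumption for illustration
matrix = tf.constant([[1], [2], [3], [4]])  # shape (4, 1)
vector = tf.constant([10, 20, 30, 40])      # shape (4,)
broadcast_sum = matrix + vector

# Entry [i, j] should equal matrix[i, 0] + vector[j]
expected = tf.constant([[11, 21, 31, 41],
                        [12, 22, 32, 42],
                        [13, 23, 33, 43],
                        [14, 24, 34, 44]])

# Raises an error if any element differs; passes silently otherwise
tf.debugging.assert_equal(broadcast_sum, expected)
print("Shape:", broadcast_sum.shape)
```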