SciPy Sparse Matrix

A sparse matrix is a matrix in which the vast majority of elements are zero. Conversely, a matrix in which most elements are non-zero is considered dense.

Large sparse matrices frequently appear when solving linear models in scientific and engineering fields.

Image 1

The image above shows a sparse matrix on the left, containing many 0 elements, and a dense matrix on the right, where most elements are not 0.

Let's look at a simple example:

Image 2

The sparse matrix above contains only 9 non-zero elements and 26 zero elements, 35 elements in total. Its sparsity is 26/35 ≈ 74%, and its density is 9/35 ≈ 26%.
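The sparsity and density figures can be reproduced with a short calculation. The sketch below is only an illustration: the 5×7 matrix and the positions of its non-zero entries are made up to match the counts quoted above (9 non-zero, 26 zero), not taken from the image.

```python
import numpy as np

# Hypothetical 5x7 matrix (35 elements) with 9 non-zero entries.
# The positions are invented for illustration only.
m = np.zeros((5, 7), dtype=int)
rows = [0, 0, 1, 2, 2, 3, 3, 4, 4]
cols = [0, 3, 5, 1, 6, 2, 4, 0, 5]
m[rows, cols] = 1

nonzero = np.count_nonzero(m)   # number of non-zero elements
total = m.size                  # total number of elements
density = nonzero / total       # share of non-zero elements
sparsity = 1 - density          # share of zero elements

print(nonzero, total - nonzero)
print(f"sparsity = {sparsity:.0%}, density = {density:.0%}")
```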

SciPy's scipy.sparse module provides functions for handling sparse matrices.

We primarily use the following two types of sparse matrices:

  • CSC - Compressed Sparse Column, compressed by column.
  • CSR - Compressed Sparse Row, compressed by row.

In this chapter, we will mainly use the CSR matrix.
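To see what "compressed by row" and "compressed by column" mean in practice, it can help to look at the three internal arrays that both formats keep: data, indices and indptr. The following is a minimal sketch using a small example array chosen for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix

arr = np.array([[0, 0, 0], [0, 0, 1], [1, 0, 2]])
csr = csr_matrix(arr)
csc = csc_matrix(arr)

# CSR: indices holds column numbers; indptr marks where each row
# starts and ends inside data/indices.
print("CSR", csr.data, csr.indices, csr.indptr)

# CSC: indices holds row numbers; indptr marks where each column
# starts and ends inside data/indices.
print("CSC", csc.data, csc.indices, csc.indptr)
```

In CSR, the entries of row i sit in data[indptr[i]:indptr[i+1]]; in CSC the same slicing applies per column. This is why CSR tends to suit row-oriented access and CSC column-oriented access.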

CSR Matrix

We can create a CSR matrix by passing an array to scipy.sparse.csr_matrix().

Example

Create a CSR matrix.

import numpy as np
from scipy.sparse import csr_matrix
arr = np.array([0,0,0,0,0,1,1,0,2])
print(csr_matrix(arr))

The output of the above code is:

  (0, 5)	1
  (0, 6)	1
  (0, 8)	2

Result Explanation:

  • First line: In the first row of the matrix (index 0), at the sixth position (index 5), there is a value of 1.
  • Second line: In the first row of the matrix (index 0), at the seventh position (index 6), there is a value of 1.
  • Third line: In the first row of the matrix (index 0), at the ninth position (index 8), there is a value of 2.
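The (row, column) pairs in the printed output can also be retrieved programmatically. The sketch below assumes the same one-dimensional input array as the example above; note that a 1-D array is stored as a matrix with a single row, which is why every row index is 0.

```python
import numpy as np
from scipy.sparse import csr_matrix

arr = np.array([0, 0, 0, 0, 0, 1, 1, 0, 2])
mat = csr_matrix(arr)

# A 1-D input becomes a single-row matrix.
print(mat.shape)

# nonzero() returns the row and column indices of the non-zero entries.
rows, cols = mat.nonzero()
for r, c, v in zip(rows, cols, mat.data):
    print(r, c, v)

# toarray() converts back to a dense NumPy array.
print(mat.toarray())
```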

CSR Matrix Methods

We can use the data attribute to view the stored data (excluding 0 elements):

Example

import numpy as np
from scipy.sparse import csr_matrix
arr = np.array([[0,0,0],[0,0,1],[1,0,2]])
print(csr_matrix(arr).data)

The output of the above code is:

[1 1 2]

Use the count_nonzero() method to count the total number of non-zero elements:

Example

import numpy as np
from scipy.sparse import csr_matrix
arr = np.array([[0,0,0],[0,0,1],[1,0,2]])
print(csr_matrix(arr).count_nonzero())

The output of the above code is:

3

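Use the eliminate_zeros() method to remove explicitly stored 0 entries from the matrix. A matrix built from a dense array stores no zeros to begin with, so in this example the call leaves the matrix unchanged: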

Example

import numpy as np
from scipy.sparse import csr_matrix
arr = np.array([[0,0,0],[0,0,1],[1,0,2]])
mat = csr_matrix(arr)
mat.eliminate_zeros()
print(mat)

The output of the above code is:

  (1, 2)	1
  (2, 0)	1
  (2, 2)	2
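The effect of eliminate_zeros() only becomes visible when the matrix actually stores a zero. The following sketch creates one artificially by overwriting a stored value; the variable names stored_before and stored_after are illustrative only.

```python
import numpy as np
from scipy.sparse import csr_matrix

arr = np.array([[0, 0, 0], [0, 0, 1], [1, 0, 2]])
mat = csr_matrix(arr)

# Overwrite the first stored value with 0. The entry stays in the
# data array as an explicitly stored zero.
mat.data[0] = 0
stored_before = mat.nnz      # counts stored entries, explicit zeros included

mat.eliminate_zeros()        # drops the explicit zero in place
stored_after = mat.nnz

print(stored_before, stored_after)
print(mat.data)
```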

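Use the sum_duplicates() method to merge duplicate entries by adding them together. The matrix in this example contains no duplicates, so the output is unchanged: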

Example

import numpy as np
from scipy.sparse import csr_matrix
arr = np.array([[0,0,0],[0,0,1],[1,0,2]])
mat = csr_matrix(arr)
mat.sum_duplicates()
print(mat)

The output of the above code is:

  (1, 2)	1
  (2, 0)	1
  (2, 2)	2
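Duplicates arise when the same (row, column) position is stored more than once, which can happen when a matrix is assembled directly from index arrays. The sketch below builds such a matrix by hand using the (data, indices, indptr) form of the constructor; the values are chosen purely for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

data = np.array([1, 2, 3])
indices = np.array([0, 0, 2])   # column 0 appears twice in the same row
indptr = np.array([0, 3])       # a single row holding all three entries
mat = csr_matrix((data, indices, indptr), shape=(1, 3))

stored_before = mat.nnz
mat.sum_duplicates()            # adds the two entries at (0, 0) together, in place

print(stored_before, mat.nnz)
print(mat.data, mat.indices)
```

After the call, position (0, 0) holds a single entry with the value 1 + 2 = 3.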

To convert a CSR matrix to a CSC matrix, use the tocsc() method:

Example

import numpy as np
from scipy.sparse import csr_matrix
arr = np.array([[0,0,0],[0,0,1],[1,0,2]])
newarr = csr_matrix(arr).tocsc()
print(newarr)

The output of the above code is:

  (2, 0)	1
  (1, 2)	1
  (2, 2)	2
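The conversion changes only the storage layout, not the values, which is why the entries above are listed column by column. As a rough check, the sketch below compares the dense form of both matrices and then selects a column from the CSC result.

```python
import numpy as np
from scipy.sparse import csr_matrix

arr = np.array([[0, 0, 0], [0, 0, 1], [1, 0, 2]])
csr = csr_matrix(arr)
csc = csr.tocsc()

print(csr.format, csc.format)

# Both formats represent the same matrix.
same = bool((csr.toarray() == csc.toarray()).all())
print(same)

# Select the last column from the CSC matrix.
print(csc[:, 2].toarray())

# tocsr() converts back to CSR.
back = csc.tocsr()
print(back.format)
```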