A Gentle Introduction to Matrix Factorization for Machine Learning
Many complex matrix operations cannot be solved efficiently or with stability using the limited precision of computers.
Matrix decompositions are methods that reduce a matrix into constituent parts that make it easier to calculate more complex matrix operations. Matrix decomposition methods, also called matrix factorization methods, are a foundation of linear algebra in computers, even for basic operations such as solving systems of linear equations, calculating the inverse, and calculating the determinant of a matrix.
In this tutorial, you will discover matrix decompositions and how to calculate them in Python.
After completing this tutorial, you will know:
- What a matrix decomposition is and why these types of operations are important.
- How to calculate the LU and QR matrix decompositions in Python.
- How to calculate a Cholesky matrix decomposition in Python.
Let’s get started.
- Update Mar/2018: Fixed small typo in the description of QR Decomposition.
Tutorial Overview
This tutorial is divided into 4 parts; they are:
- What is a Matrix Decomposition?
- LU Matrix Decomposition
- QR Matrix Decomposition
- Cholesky Decomposition
What is a Matrix Decomposition?
A matrix decomposition is a way of reducing a matrix into its constituent parts.
It is an approach that can simplify more complex matrix operations by performing them on the decomposed matrix rather than on the original matrix itself.
A common analogy for matrix decomposition is the factoring of numbers, such as the factoring of 10 into 2 x 5. For this reason, matrix decomposition is also called matrix factorization. Like factoring real values, there are many ways to decompose a matrix, hence there are a range of different matrix decomposition techniques.
Two simple and widely used matrix decomposition methods are the LU matrix decomposition and the QR matrix decomposition.
Next, we will take a closer look at each of these methods.
LU Matrix Decomposition
The LU decomposition is for square matrices and decomposes a matrix into L and U components.
A = L . U
Or, without the dot notation.
A = LU
Where A is the square matrix that we wish to decompose, L is the lower triangular matrix, and U is the upper triangular matrix.
The factors L and U are triangular matrices. The factorization that comes from elimination is A = LU.
The LU decomposition is found using an iterative numerical process and can fail for matrices that cannot be decomposed, or that cannot be decomposed easily.
A variation of this decomposition that is numerically more stable to solve in practice is called the LUP decomposition, or the LU decomposition with partial pivoting.
A = P . L . U
The rows of the parent matrix are re-ordered to simplify the decomposition process, and the additional P matrix specifies a way to permute the result, or return the result to the original order. There are also other variations of the LU decomposition.
The LU decomposition is often used to simplify the solving of systems of linear equations, such as finding the coefficients in a linear regression, as well as in calculating the determinant and inverse of a matrix.
The LU decomposition can be implemented in Python with the lu() function from SciPy. More specifically, this function calculates an LU decomposition with partial pivoting, returning the P, L, and U components such that A = P . L . U.
The example below first defines a 3×3 square matrix. The LU decomposition is calculated, then the original matrix is reconstructed from the components.
```python
# LU decomposition
from numpy import array
from scipy.linalg import lu
# define a square matrix
A = array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(A)
# LU decomposition
P, L, U = lu(A)
print(P)
print(L)
print(U)
# reconstruct
B = P.dot(L).dot(U)
print(B)
```
Running the example first prints the defined 3×3 matrix, then the P, L, and U components of the decomposition, then finally the original matrix is reconstructed.
```
[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[ 0.  1.  0.]
 [ 0.  0.  1.]
 [ 1.  0.  0.]]
[[ 1.          0.          0.        ]
 [ 0.14285714  1.          0.        ]
 [ 0.57142857  0.5         1.        ]]
[[  7.00000000e+00   8.00000000e+00   9.00000000e+00]
 [  0.00000000e+00   8.57142857e-01   1.71428571e+00]
 [  0.00000000e+00   0.00000000e+00  -1.58603289e-16]]
[[ 1.  2.  3.]
 [ 4.  5.  6.]
 [ 7.  8.  9.]]
```
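A key benefit of the factorization is that the factors can be reused to solve A . x = b cheaply for many different right-hand sides. Below is a minimal sketch using SciPy's lu_factor() and lu_solve() functions; the matrix and right-hand side are made-up values for illustration.

```python
# solve A . x = b by reusing the LU factors (a minimal sketch)
from numpy import array
from scipy.linalg import lu_factor, lu_solve
# define a small square system with a made-up right-hand side
A = array([[3., 1.], [1., 2.]])
b = array([9., 8.])
# factor the matrix once
lu_piv = lu_factor(A)
# solve by forward and back substitution using the stored factors
x = lu_solve(lu_piv, b)
print(x)  # expected: [2. 3.]
```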
QR Matrix Decomposition
The QR decomposition is for m x n matrices (not limited to square matrices) and decomposes a matrix into Q and R components.
A = Q . R
Or, without the dot notation.
A = QR
Where A is the matrix that we wish to decompose, Q is an orthogonal matrix with the size m x m, and R is an upper triangular matrix with the size m x n.
The QR decomposition is found using an iterative numerical method that can fail for matrices that cannot be decomposed, or that cannot be decomposed easily.
Like the LU decomposition, the QR decomposition is often used to solve systems of linear equations, although it is not limited to square matrices.
The QR decomposition can be implemented in NumPy using the qr() function. By default, the function returns the Q and R matrices with smaller, or 'reduced', dimensions, which is more economical. We can change this to return the expected sizes of m x m for Q and m x n for R by specifying the mode argument as 'complete', although this is not required for most applications.
The example below defines a 3×2 matrix, calculates the QR decomposition, then reconstructs the original matrix from the decomposed elements.
```python
# QR decomposition
from numpy import array
from numpy.linalg import qr
# define a 3x2 matrix
A = array([[1, 2], [3, 4], [5, 6]])
print(A)
# QR decomposition
Q, R = qr(A, 'complete')
print(Q)
print(R)
# reconstruct
B = Q.dot(R)
print(B)
```
Running the example first prints the defined 3×2 matrix, then the Q and R elements, then finally the reconstructed matrix that matches what we started with.
```
[[1 2]
 [3 4]
 [5 6]]
[[-0.16903085  0.89708523  0.40824829]
 [-0.50709255  0.27602622 -0.81649658]
 [-0.84515425 -0.34503278  0.40824829]]
[[-5.91607978 -7.43735744]
 [ 0.          0.82807867]
 [ 0.          0.        ]]
[[ 1.  2.]
 [ 3.  4.]
 [ 5.  6.]]
```
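One reason the QR decomposition is useful for rectangular matrices is linear least squares: minimizing the norm of A . x - b reduces to solving the triangular system R . x = Q^T . b. The sketch below uses the default 'reduced' mode of qr() and a made-up target vector b to illustrate the idea.

```python
# least squares via the QR decomposition (a sketch with made-up data)
from numpy import array
from numpy.linalg import qr
from scipy.linalg import solve_triangular
# define an overdetermined 3x2 system
A = array([[1., 2.], [3., 4.], [5., 6.]])
b = array([1., 2., 3.])
# reduced QR: Q is 3x2, R is 2x2
Q, R = qr(A)
# solve the upper triangular system R . x = Q^T . b by back substitution
x = solve_triangular(R, Q.T.dot(b))
print(x)  # expected: [0.  0.5]
```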
Cholesky Decomposition
The Cholesky decomposition is for square symmetric matrices where all eigenvalues are greater than zero, so-called positive definite matrices.
For our interests in machine learning, we will focus on the Cholesky decomposition for real-valued matrices and ignore the cases when working with complex numbers.
The decomposition is defined as follows:
A = L . L^T
Or without the dot notation:
A = LL^T
Where A is the matrix being decomposed, L is the lower triangular matrix and L^T is the transpose of L.
The decomposition can also be written as the product of an upper triangular matrix and its transpose, for example:
A = U^T . U
Where U is the upper triangular matrix.
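Because the decomposition only exists for positive definite matrices, attempting it is itself a convenient, if indirect, test for positive definiteness: NumPy's cholesky() raises a LinAlgError when the input is not positive definite. Below is a small sketch with made-up example matrices.

```python
# use cholesky() as a positive definiteness test (a sketch)
from numpy import array
from numpy.linalg import cholesky, LinAlgError

def is_positive_definite(A):
    # the factorization succeeds only for positive definite matrices
    try:
        cholesky(A)
        return True
    except LinAlgError:
        return False

print(is_positive_definite(array([[2., 1.], [1., 2.]])))  # True
print(is_positive_definite(array([[1., 2.], [2., 1.]])))  # False, has a negative eigenvalue
```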
The Cholesky decomposition is used for solving the linear least squares problem in linear regression, as well as in simulation and optimization methods.
When decomposing symmetric matrices, the Cholesky decomposition is nearly twice as efficient as the LU decomposition and should be preferred in these cases.
While symmetric, positive definite matrices are rather special, they occur quite frequently in some applications, so their special factorization, called Cholesky decomposition, is good to know about. When you can use it, Cholesky decomposition is about a factor of two faster than alternative methods for solving linear equations.
The Cholesky decomposition can be implemented in NumPy by calling the cholesky() function. The function only returns L, as we can easily calculate the transpose of L as needed.
The example below defines a 3×3 symmetric and positive definite matrix and calculates the Cholesky decomposition, then the original matrix is reconstructed.
```python
# Cholesky decomposition
from numpy import array
from numpy.linalg import cholesky
# define a 3x3 matrix
A = array([[2, 1, 1], [1, 2, 1], [1, 1, 2]])
print(A)
# Cholesky decomposition
L = cholesky(A)
print(L)
# reconstruct
B = L.dot(L.T)
print(B)
```
Running the example first prints the symmetric matrix, then the lower triangular matrix from the decomposition followed by the reconstructed matrix.
```
[[2 1 1]
 [1 2 1]
 [1 1 2]]
[[ 1.41421356  0.          0.        ]
 [ 0.70710678  1.22474487  0.        ]
 [ 0.70710678  0.40824829  1.15470054]]
[[ 2.  1.  1.]
 [ 1.  2.  1.]
 [ 1.  1.  2.]]
```
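As with LU, the Cholesky factor can be reused to solve A . x = b efficiently. Below is a minimal sketch using SciPy's cho_factor() and cho_solve() functions, with a made-up right-hand side for illustration.

```python
# solve A . x = b using the Cholesky factor (a minimal sketch)
from numpy import array
from scipy.linalg import cho_factor, cho_solve
# the same symmetric positive definite matrix as above
A = array([[2., 1., 1.], [1., 2., 1.], [1., 1., 2.]])
b = array([4., 4., 4.])
# factor once, then solve by forward and back substitution
c, low = cho_factor(A)
x = cho_solve((c, low), b)
print(x)  # expected: [1. 1. 1.]
```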
Extensions
This section lists some ideas for extending the tutorial that you may wish to explore.
- Create 5 examples using each operation with your own data.
- Search machine learning papers and find 1 example of each operation being used.
If you explore any of these extensions, I’d love to know.
Summary
In this tutorial, you discovered matrix decompositions and how to calculate them in Python.
Specifically, you learned:
- What a matrix decomposition is and why these types of operations are important.
- How to calculate the LU and QR matrix decompositions in Python.
- How to calculate a Cholesky matrix decomposition in Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.