Working with vectors
1:Vectors
Let's take a deeper look at matrices. Matrices are made up of rows and columns.
A matrix with a single column is called a column vector:
$$\begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}$$
A matrix with a single row is called a row vector:
$$\begin{bmatrix} 3 & 1 & 2 \end{bmatrix}$$
As you can see, a vector is a single row or a single column. We can add vectors together:
$$\begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 4 \\ 3 \\ 5 \end{bmatrix}$$
When we add two vectors, we just add each of their elements in the same position together.
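As a quick check of the element-by-element rule, here is a minimal sketch that reproduces the addition above with NumPy. The names a, b and total are illustrative and not part of the exercise.

```python
import numpy as np

# The two vectors from the equation above, written as 1-D arrays.
a = np.asarray([3, 1, 2])
b = np.asarray([1, 2, 3])

# The + operator adds arrays of the same shape position by position.
total = a + b
print(total.tolist())  # [4, 3, 5]
```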
Instructions
- Add vector1 and vector2 and assign the result to vector1_2.
- Add vector3 and vector1 and assign the result to vector3_1.
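import numpy as np

vector1 = np.asarray([4, 5, 7, 10])
# vector2 is not defined in this listing; it is assumed to be preloaded by the exercise environment.
vector3 = np.asarray([10, 4, 6, -1])
vector1_2 = vector1 + vector2
vector3_1 = vector3 + vector1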
2:Vectors And Scalars
We can also multiply vectors by single numbers, called scalars.
$$4 * \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 4 \\ 8 \\ 12 \end{bmatrix}$$
In the example above, 4 is a scalar that we are multiplying the vector by. We multiply each element in the vector by the scalar.
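The scalar rule can be sketched in NumPy as follows. The variable names are illustrative, and the division line is an extra example that is not in the equation above.

```python
import numpy as np

v = np.asarray([1, 2, 3])

scaled = 4 * v   # every element is multiplied by 4
halved = v / 2   # dividing by a scalar is also element-wise and yields floats

print(scaled.tolist())  # [4, 8, 12]
print(halved.tolist())  # [0.5, 1.0, 1.5]
```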
Instructions
- Multiply vector by the scalar 7 and assign the result to vector_7.
- Divide vector by the scalar 8 and assign the result to vector_8.
vector = np.asarray([4, -1, 7])
vector_7 = vector * 7
vector_8 = vector / 8
3:Geometric Intuition
So far, we've worked with one-dimensional arrays, or vectors. The number of elements in a vector is called the vector's dimension. For instance, the dimension of this vector is 2:
$$\begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
We can draw vectors of dimension 1 on a 1-dimensional coordinate system, like this:
$$\begin{bmatrix} 1 \end{bmatrix}$$
[Figure: number line with ticks at 0, 1, 2; the vector runs from 0 to 1]
In the above diagram, we have a vector with 1 dimension. Each dimension can have a magnitude, which indicates how large the vector is in that dimension. In this case, our vector has magnitude 1, so it stretches from 0 to 1. Note that there's nothing in our vector that indicates it has to start at 0. We could also start our vector at 1:
[Figure: number line with ticks at 0, 1, 2; the same vector drawn starting at 1]
A vector only encodes how far it goes and in which direction (its magnitude and direction), not where it originates. We can also have a vector that goes in a negative direction:
$$\begin{bmatrix} -1 \end{bmatrix}$$
[Figure: number line with ticks at -1, 0, 1]
We can apply this same principle to 2-dimensional vectors, except we have to draw them in 2-dimensional coordinate spaces:
$$\begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
As you can see above, we've drawn the same vector twice, once where it starts at 0,0, and once where it starts at 1,0.
We can add vectors by starting one vector where another ends:
As you can see, the final vector we end up with is:
$$\begin{bmatrix} 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$$
As we add more dimensions to vectors, we need to add more coordinate dimensions to accurately plot them. We won't plot out 3-dimensional vectors now, but it's useful to understand conceptually what's happening.
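The tip-to-tail picture can also be followed numerically. The sketch below assumes a starting point of 0,0 (any other start would give the same displacement) and uses illustrative variable names.

```python
import numpy as np

# Assumed starting point; a vector only stores displacement, so any start works.
start = np.asarray([0, 0])
first = np.asarray([1, 2])
second = np.asarray([2, 2])

# Walk along the first vector, then along the second one (tip-to-tail).
after_first = start + first
after_second = after_first + second

# The overall displacement equals the element-wise sum of the two vectors.
displacement = after_second - start
print(displacement.tolist())  # [3, 4]
```

Changing start to another point, such as [1, 0], moves both arrows but leaves displacement unchanged.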
4:Plotting Vectors
We can make the geometric interpretation of vectors clearer by plotting them. We can do this with the quiver() function of matplotlib.pyplot. This enables us to plot vectors on a 2-D coordinate grid. We can then see what adding vectors together looks like.
In order to plot vectors, we would use:
import matplotlib.pyplot as plt
plt.quiver(X, Y, U, V, angles='xy', scale_units='xy', scale=1)
- X -- the x coordinate of the vector's origin.
- Y -- the y coordinate of the vector's origin.
- U -- the distance the vector moves along the x axis.
- V -- the distance the vector moves along the y axis.
Each of X, Y, U, and V is a one-dimensional numpy array (vector) or list. The first item in each array corresponds to the first vector, the second item corresponds to the second vector, and so on. We can make the arrays as long or short as we want, as long as all four have the same length.
If you look at the plot produced by the code below, both vectors are stacked. The second vector starts right where the first vector ends. In fact, if you look at both vectors together, they end up getting us to the coordinates 4,4.
This is vector addition! By drawing one vector starting where the first one ended, we have effectively found the result of adding the two vectors.
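The stacking can be checked without drawing anything. This sketch reuses the X, Y, U, V values from the plotting code in this section and computes where each arrow ends; tips_x, tips_y, end_point and summed are illustrative names.

```python
import numpy as np

# Origins (X, Y) and displacements (U, V) of the two stacked vectors.
X = np.asarray([0, 1])
Y = np.asarray([0, 2])
U = np.asarray([1, 3])
V = np.asarray([2, 2])

# The tip of each arrow is its origin plus its displacement.
tips_x = X + U
tips_y = Y + V

# Where the second arrow ends, and the sum of the two displacements.
end_point = (int(tips_x[-1]), int(tips_y[-1]))
summed = (int(U.sum()), int(V.sum()))
print(end_point, summed)  # (4, 4) (4, 4)
```

The first arrow's tip (1, 2) is also the second arrow's origin, which is what makes the two arrows stack.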
Instructions
- Make a new plot that contains the two vectors in the first plot, but also adds a vector that starts at 0,0, and goes over 4 and up 4.
  - This will end up at the coordinates 4,4.
  - The final plot will have three vectors on it.
- Set the x and y axis limits to [0,6].
import numpy as np
import matplotlib.pyplot as plt
# We're going to plot 2 vectors
# The first will start at origin 0,0 , then go over 1 and up 2.
# The second will start at origin 1,2 then go over 3 and up 2.
X = [0,1]
Y = [0,2]
U = [1,3]
V = [2,2]
# Actually make the plot.
plt.quiver(X, Y, U, V, angles='xy', scale_units='xy', scale=1)
# Set the x axis limits
plt.xlim([0,6])
# Set the y axis limits
plt.ylim([0,6])
# Show the plot.
plt.show()
plt.quiver([0,1,0], [0,2,0], [1,3,4], [2,2,4], angles='xy', scale_units='xy', scale=1)
plt.xlim([0,6])
plt.ylim([0,6])
plt.show()
5:Vector Length
Now that we can plot vectors, we can intuitively figure out vector length. We just saw that a 2 dimensional vector can be represented as a line. Let's say we have this vector:
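$$X = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$$
Since it's a line, we can calculate its length with the Pythagorean theorem. If you think about it, this vector is just the sum of these two component vectors:
$$X = \begin{bmatrix} 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix} + \begin{bmatrix} 2 \\ 0 \end{bmatrix}$$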
Both component vectors only have length in one dimension. If we were creating the components of a three dimensional vector, there would be three components, and so on for even higher dimensional vectors.
We have a plot of this below for our [2,3] vector, and it's really just a right triangle we're making. We can find the length of the hypotenuse of a right triangle with the famous formula $a^2 + b^2 = c^2$, the Pythagorean theorem. We can rewrite this to find the length of $c$, the hypotenuse (long side). This gives us $c = \sqrt{a^2 + b^2}$. The length of any vector, no matter how many dimensions, is just the square root of the sum of all of its elements squared.
To find the length of a vector, we just apply the formula. In a two dimensional vector, the first element is the length of the bottom of the triangle (a), and the second element is the length of the right side of the triangle (b). By taking the square root of $a^2 + b^2$, we can find the length of the vector. We'll plot this out below and it will become more clear.
Below, we'll plot the two component vectors of the [2,3] vector we care about.
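The rule "square root of the sum of the squared elements" can be written as a small helper. This is a sketch only: length_of is a hypothetical name, and the two sample vectors are not from the lesson but were picked because their lengths are whole numbers.

```python
import math
import numpy as np

def length_of(v):
    """Square root of the sum of squared elements; works for any dimension."""
    return math.sqrt(sum(x * x for x in v))

two_d = [3, 4]       # hypothetical 2-dimensional example
three_d = [1, 2, 2]  # hypothetical 3-dimensional example

print(length_of(two_d))    # 5.0
print(length_of(three_d))  # 3.0

# NumPy provides the same computation as np.linalg.norm.
assert math.isclose(length_of(two_d), np.linalg.norm(two_d))
```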
Instructions
- Compute the length of the vector [2,3] and assign the result to vector_length.
# We're going to plot 3 vectors
# The first will start at origin 0,0 , then go over 2 (this represents the bottom of the triangle)
# The second will start at origin 2,0, and go up 3 (this is the right side of the triangle)
# The third will start at origin 0,0, and go over 2 and up 3 (this is our vector, and is the hypotenuse of the triangle)
X = [0,2,0]
Y = [0,0,0]
U = [2,0,2]
V = [0,3,3]
# Actually make the plot.
plt.quiver(X, Y, U, V, angles='xy', scale_units='xy', scale=1)
plt.xlim([0,6])
plt.ylim([0,6])
plt.show()
# Square root of the sum of squares; 0.5 avoids integer division of 1/2 on Python 2.
vector_length = (2**2 + 3**2) ** 0.5
6:Dot Product
The dot product can tell us how much of one vector is pointing in the same direction as another vector. We find the dot product for vectors like this:
$$\vec{a} \cdot \vec{b} = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \cdot \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = a_1 b_1 + a_2 b_2 + a_3 b_3$$
$\vec{a}$ and $\vec{b}$ are vectors. $a_1$ is the first element of the a vector, $a_2$ is the second, and so on. What this equation is saying is that we calculate the dot product by taking the first element of a, multiplying it by the first element of b, then adding that to the second element of a multiplied by the second element of b, then adding that to the third element of a multiplied by the third element of b.
This gives us a number that reflects how much of a is pointing in the same direction as b (more precisely, the length of a's projection onto b multiplied by the length of b). If you project a onto the vector b, then it indicates how much of a is "in" vector b. When two vectors are at a 90 degree angle, the dot product will be zero.
Dot products can be applied to vectors with any number of dimensions -- we just multiply the elements at the same positions in both vectors and add the results.
Here's an example:
$$\begin{bmatrix} 1 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} -1 \\ 1 \end{bmatrix} = 1 * (-1) + 1 * 1 = 0$$
When two vectors are the same, the dot product will be the square of the vector length:
$$\begin{bmatrix} 2 \\ 3 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 3 \end{bmatrix} = 2 * 2 + 3 * 3 = 4 + 9 = 13$$
It's not extremely important to understand the meaning of the dot product right now, but its calculation is important in many ways. Chief among them is determining if vectors are orthogonal. Two vectors are orthogonal if they are perpendicular (that is, at a 90 degree angle to each other), and their dot product is zero.
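The two worked examples above can be reproduced in a few lines. This sketch computes the dot product both by hand and with np.dot; the names a, b, c and manual are illustrative.

```python
import numpy as np

a = np.asarray([1, 1])
b = np.asarray([-1, 1])
c = np.asarray([2, 3])

# Multiply matching positions and add the results.
manual = sum(int(x) * int(y) for x, y in zip(a, b))

print(manual)             # 0, so a and b are orthogonal
print(int(np.dot(a, b)))  # 0, np.dot agrees with the manual sum
print(int(np.dot(c, c)))  # 13, the squared length of c
```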
Instructions
- Assign the dot product of the vector [3, 4, 5, 6] and the vector [5, 6, 7, 8] to dot.
# These two vectors are orthogonal
X = [0,0]
Y = [0,0]
U = [1,-1]
V = [1,1]
plt.quiver(X, Y, U, V, angles='xy', scale_units='xy', scale=1)
plt.xlim([-2,2])
plt.ylim([-2,2])
plt.show()
dot=3*5+4*6+5*7+6*8
7:Making Predictions
Now, let's try predicting how many points NBA players scored in 2013 using how many field goals they attempted. Our algorithm will be single variable linear regression. Remember that a single variable linear regression takes the form $y = mx + b$. $y$ is the variable we want to predict, $x$ is the value of the predictor variable, $m$ is the coefficient (slope), and $b$ is an intercept term.
If you've done the statistics lessons, we worked out how to calculate the slope and intercept there. We won't rehash, but the slope is 1.26, and the intercept is -18.92.
Now, using the slope and intercept, we want to make predictions on the nba dataframe.
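To see the equation applied to several values at once, here is a sketch that uses a plain NumPy array in place of the nba dataframe. The fga values are made up for illustration; only the slope and intercept come from the text.

```python
import numpy as np

# Slope and intercept as quoted in the text.
slope = 1.26
intercept = -18.92

# Hypothetical field-goal-attempt counts; the real nba data is not reproduced here.
fga = np.asarray([100.0, 500.0, 1000.0])

# y = m * x + b, applied to every element at once.
predicted_pts = slope * fga + intercept
print(np.round(predicted_pts, 2).tolist())
```

The same expression works unchanged on a Pandas column, because arithmetic on a Series is also element-wise.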
Instructions
- For each row in nba:
  - Predict the pts column using the fga column.
  - Use the variables slope and intercept (already loaded in) to complete the linear regression equation.
- Your final output should be a Pandas Series, which you should assign to predictions.
# Slope and intercept are defined, and nba is loaded in
predictions = slope * nba["fga"] + intercept
8:Vector And Matrix Multiplication
We can multiply vectors and matrices. This kind of multiplication can enable us to perform linear regression much faster and more efficiently. When multiplying a vector and a matrix, it can be useful to think of multiplying a matrix by a one column matrix instead:
$$\begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} * \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
When we multiply a matrix and a vector, we write out the dimensions in terms of numbers of rows and columns:
2x2 * 2x1
In this case, we're multiplying a 2x2 matrix (A) by a 2x1 matrix (B). The inner numbers must match in order for multiplication to work. So the first matrix needs the same number of columns as the second matrix has rows. We can't multiply B * A, only A * B.
When we multiply matrices, we multiply each row of the first matrix by each column of the second matrix.
$$\begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} * \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} (3 * 2) + (1 * 1) \\ (2 * 2) + (2 * 1) \end{bmatrix}$$
We multiply the first item in the first row of A by the first item in the first column of B. We then multiply the second item in the first row of A by the second item in the first column of B. We then add these values together to get the item at the position 1,1 in the result matrix.
We multiply the first item in the second row of A by the first item in the first column of B. We then multiply the second item in the second row of A by the second item in the first column of B. We then add these values together to get the item at the position 2,1 in the result matrix.
The resulting matrix will always have the same number of rows as the first matrix being multiplied has rows, and the same number of columns as the second matrix being multiplied has columns.
We can generalize what we do by assigning $A_{11}$ as the value of A in the first row and first column, $A_{12}$ as the value of A in the first row and second column, and so on:
$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} * \begin{bmatrix} B_{11} \\ B_{21} \end{bmatrix} = \begin{bmatrix} (A_{11} * B_{11}) + (A_{12} * B_{21}) \\ (A_{21} * B_{11}) + (A_{22} * B_{21}) \end{bmatrix}$$
This gives us more general rules to apply to any case of matrix multiplication.
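The row-by-column rule for the 2x2 * 2x1 example can be spelled out with explicit loops. This is an illustrative sketch rather than how NumPy computes it internally; result and reversed_ok are hypothetical names.

```python
import numpy as np

A = np.asarray([[3, 1], [2, 2]])  # 2x2
B = np.asarray([[2], [1]])        # 2x1

# Inner dimensions must match: columns of A equal rows of B.
assert A.shape[1] == B.shape[0]

# Each result entry is one row of A times the column of B, summed.
result = [
    [sum(int(A[i, k]) * int(B[k, 0]) for k in range(A.shape[1]))]
    for i in range(A.shape[0])
]
print(result)  # [[7], [6]]

# Reversing the order (2x1 times 2x2) is rejected because 1 != 2.
try:
    np.dot(B, A)
    reversed_ok = True
except ValueError:
    reversed_ok = False
print(reversed_ok)  # False
```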
9:Multiplying A Matrix By A Vector
What we did when we made predictions with our linear regression coefficients is fine when we have one column we're using to predict, but remember that the linear regression equation has a coefficient for every single column that we're using to predict. So for three variables, it would be $y = m_1 x_1 + m_2 x_2 + m_3 x_3 + b$. Sometimes, we'll have thousands of columns we want to use to predict. Writing this out gets tedious (although we could use a for loop), and it's also slow computationally, because we have to do thousands of separate calculations, and we can't optimize them.
Luckily, there's a faster and better way to solve linear regression equations, among other things. It's matrix multiplication, and it's a foundational block of a lot of machine learning.
For example, let's say that this matrix represents the coefficients of a linear regression (the first row is the x coefficient, the second row is the intercept term -- there is only one column, so this is a column vector):
$$\begin{bmatrix} 3 \\ -1 \end{bmatrix}$$
And this matrix represents the values of rows that we want to use to generate predictions. The first column is the x values, and the second column is there so that the intercept term can be added to the equation (this will make sense soon):
$$\begin{bmatrix} 2 & 1 \\ 5 & 1 \\ -1 & 1 \end{bmatrix}$$
We can do matrix multiplication like this:
$$\begin{bmatrix} 2 & 1 \\ 5 & 1 \\ -1 & 1 \end{bmatrix} * \begin{bmatrix} 3 \\ -1 \end{bmatrix} = \begin{bmatrix} 2 * 3 + 1 * (-1) \\ 5 * 3 + 1 * (-1) \\ -1 * 3 + 1 * (-1) \end{bmatrix} = \begin{bmatrix} 5 \\ 14 \\ -4 \end{bmatrix}$$
What we're doing is starting at the first row in our data. Then we multiply the first element of the first row by the first element in the coefficients. Then we multiply the second element in the first row by the second element in the coefficients column. We add these together. Then, we do the same for the second row in the data. We go across the rows in the first matrix we multiply, and go down the columns in the second matrix we multiply.
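A more generic version:
$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} * \begin{bmatrix} b_{11} \\ b_{21} \end{bmatrix} = \begin{bmatrix} a_{11} * b_{11} + a_{12} * b_{21} \\ a_{21} * b_{11} + a_{22} * b_{21} \\ a_{31} * b_{11} + a_{32} * b_{21} \end{bmatrix}$$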
In practice this is usually much faster for a machine to compute than evaluating the equation term by term as we did earlier, because numerical libraries optimize matrix multiplication heavily. Adding the second column of 1s to the rows matrix lets us add the intercept term when we multiply everything out.
We can perform matrix multiplication in Python using numpy's dot() function.
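To see why the column of 1s works, the sketch below compares the matrix product with the plain y = mx + b formula. The x values are hypothetical, and np.column_stack is used here as one possible way to build the rows matrix.

```python
import numpy as np

# Slope and intercept as quoted earlier in the text.
slope, intercept = 1.26, -18.92

# Hypothetical predictor values; the real data is not reproduced here.
x = np.asarray([100.0, 500.0, 1000.0])

# First column is x, second column is a constant 1 for the intercept.
rows = np.column_stack([x, np.ones(len(x))])  # shape (3, 2)
coefs = np.asarray([[slope], [intercept]])    # shape (2, 1)

by_matrix = np.dot(rows, coefs)     # shape (3, 1)
by_formula = slope * x + intercept  # shape (3,)

# Both routes give the same predictions once the column is flattened.
print(np.allclose(by_matrix.ravel(), by_formula))  # True
```

Each row of the product is x * slope + 1 * intercept, which is exactly the regression equation.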
Instructions
- Multiply nba_rows by nba_coefs.
  - nba_rows contains two columns -- the first is the field goals attempted by each player in 2013, and the second is a constant 1 value that enables us to add in the intercept.
- Assign the result to predictions.
import numpy as np
# Set up the coefficients as a column vector
coefs = np.asarray([[3], [-1]])
# Setup the rows we're using to make predictions
rows = np.asarray([[2,1], [5,1], [-1,1]])
# We can use np.dot to do matrix multiplication. This multiplies rows by coefficients -- the order is important.
np.dot(rows, coefs)
nba_coefs = np.asarray([[slope], [intercept]])
nba_rows = np.vstack([nba["fga"], np.ones(nba.shape[0])]).T
predictions=np.dot(nba_rows,nba_coefs)
10:Matrix Multiplication
We looked at a special case of matrix multiplication in the earlier screens -- when one matrix only has a single column or row. We can generalize, and look at matrix multiplication more broadly. The same rules apply, but we need to worry about a bit more complexity. Here's an example:
$$\begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} * \begin{bmatrix} 3 & 1 \\ 4 & 2 \end{bmatrix} = \begin{bmatrix} (1 * 3) + (3 * 4) & (1 * 1) + (3 * 2) \\ (2 * 3) + (4 * 4) & (2 * 1) + (4 * 2) \end{bmatrix} = \begin{bmatrix} 15 & 7 \\ 22 & 10 \end{bmatrix}$$
The result matrix has the same number of rows as A (the first matrix being multiplied) has rows, and the same number of columns as B (the second matrix being multiplied) has columns.
Here's the general rule for how the multiplication works:
$$\begin{bmatrix} (A_{11} * B_{11}) + (A_{12} * B_{21}) & (A_{11} * B_{12}) + (A_{12} * B_{22}) \\ (A_{21} * B_{11}) + (A_{22} * B_{21}) & (A_{21} * B_{12}) + (A_{22} * B_{22}) \end{bmatrix}$$
Each cell in the result matrix comes from multiplying one row of A by one column of B.
- Position 1,1 in the result is created by multiplying the first row of A by the first column of B.
- Position 1,2 in the result is created by multiplying the first row of A by the second column of B.
- Position 2,1 in the result is created by multiplying the second row of A by the first column of B.
- Position 2,2 in the result is created by multiplying the second row of A by the second column of B.
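The position rules above amount to three nested loops. The following sketch works on plain Python lists to make each step visible; matmul is a hypothetical helper name, and in practice np.dot would be used instead.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    n_rows, inner, n_cols = len(A), len(B), len(B[0])
    # The number of columns in A must equal the number of rows in B.
    if any(len(row) != inner for row in A):
        raise ValueError("columns of A must match rows of B")
    # Entry (i, j) is row i of A times column j of B, summed.
    return [
        [sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(n_cols)]
        for i in range(n_rows)
    ]

A = [[1, 3], [2, 4]]
B = [[3, 1], [4, 2]]
print(matmul(A, B))  # [[15, 7], [22, 10]]
```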
11:Applying Matrix Multiplication
Multiplying a matrix by a vector, like we did a few screens ago, is a special case of matrix multiplication. The more general case is multiplying two matrices by each other. We multiply a matrix by another matrix in many machine learning methods, including neural networks. Just like with linear regression, it enables us to do multiple calculations much more quickly than we could otherwise.
Let's say we wanted to multiply two matrices. First, the number of columns of the first matrix has to equal the number of rows of the second matrix. The final matrix will have as many rows as the first matrix, and as many columns as the second matrix. An easy way to think of this is in terms of matrix dimensions. We can multiply a 3x2 (rows x columns) matrix by a 2x3 matrix, and the final result will be 3x3.
Here's the generic version:
$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} * \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} * b_{11} + a_{12} * b_{21} & a_{11} * b_{12} + a_{12} * b_{22} \\ a_{21} * b_{11} + a_{22} * b_{21} & a_{21} * b_{12} + a_{22} * b_{22} \\ a_{31} * b_{11} + a_{32} * b_{21} & a_{31} * b_{12} + a_{32} * b_{22} \end{bmatrix}$$
And here's an example:
$$\begin{bmatrix} 2 & 1 \\ 5 & 1 \\ -1 & 1 \end{bmatrix} * \begin{bmatrix} 3 & 1 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 2 * 3 + 1 * (-1) & 2 * 1 + 1 * 2 \\ 5 * 3 + 1 * (-1) & 5 * 1 + 1 * 2 \\ -1 * 3 + 1 * (-1) & -1 * 1 + 1 * 2 \end{bmatrix} = \begin{bmatrix} 5 & 4 \\ 14 & 7 \\ -4 & 1 \end{bmatrix}$$
We can multiply matrices with numpy's dot() function. It's important to understand how matrix multiplication works, but you'll almost never have to do it by hand.
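The worked example above can be verified directly. This sketch also prints the result shape to confirm the "rows of the first, columns of the second" rule; the names left, right and product are illustrative.

```python
import numpy as np

left = np.asarray([[2, 1], [5, 1], [-1, 1]])  # 3x2
right = np.asarray([[3, 1], [-1, 2]])         # 2x2

product = np.dot(left, right)

# Result shape: rows of the first matrix by columns of the second.
print(product.shape)     # (3, 2)
print(product.tolist())  # [[5, 4], [14, 7], [-4, 1]]

# For 2-D arrays, the @ operator computes the same product as np.dot.
print(np.array_equal(product, left @ right))  # True
```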
Instructions
- Multiply A and B together and assign the result to C.
A = np.asarray([[5,2], [3,5], [6,5]])
B = np.asarray([[3,1], [4,2]])
C=np.dot(A,B)
Reposted from: https://my.oschina.net/Bettyty/blog/751089