
13-11-21
Feature Column from the AMS
www.ams.org/samplings/feature-column/fcarc-svd

We Recommend a Singular Value Decomposition

In this article, we will offer a geometric explanation of singular value decompositions and look at some of their applications.

David Austin
Grand Valley State University
david at merganser.math.gvsu.edu

Introduction

The topic of this article, the singular value decomposition, is one that should be a part of the standard mathematics undergraduate curriculum but all too often slips between the cracks. Besides being rather intuitive, these decompositions are incredibly useful. For instance, Netflix, the online movie rental company, is currently offering a $1 million prize to anyone who can improve the accuracy of its movie recommendation system by 10%. Surprisingly, this seemingly modest problem turns out to be quite challenging, and the groups involved are now using rather sophisticated techniques. At the heart of all of them is the singular value decomposition.

A singular value decomposition provides a convenient way of breaking a matrix, which perhaps contains some data we are interested in, into simpler, meaningful pieces. In this article, we will offer a geometric explanation of singular value decompositions and look at some of their applications.

The geometry of linear transformations

Let us begin by looking at some simple matrices, namely those with two rows and two columns. Our first example is the diagonal matrix

  M = [ 3  0 ]
      [ 0  1 ]

Geometrically, we may think of a matrix like this as taking a point (x, y) in the plane and transforming it into another point using matrix multiplication:

  [ 3  0 ] [ x ]  =  [ 3x ]
  [ 0  1 ] [ y ]     [  y ]

The effect of this transformation is shown below: the plane is horizontally stretched by a factor of 3, while there is no vertical change.
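To make this concrete, here is a minimal NumPy sketch (not part of the original column) applying the diagonal matrix to a point:

```python
import numpy as np

# The diagonal matrix above: stretch the x-direction by 3, leave y unchanged.
M = np.array([[3.0, 0.0],
              [0.0, 1.0]])

point = np.array([2.0, 5.0])
print(M @ point)  # the x-coordinate is tripled, the y-coordinate is unchanged
```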

Now let's look at

  M = [ 2  1 ]
      [ 1  2 ]

which produces this effect:

It is not so clear how to describe simply the geometric effect of this transformation. However, let's rotate our grid through a 45 degree angle and see what happens.

Ah ha. We see now that this new grid is transformed in the same way that the original grid was transformed by the diagonal matrix: the grid is stretched by a factor of 3 in one direction.

This is a very special situation that results from the fact that the matrix M is symmetric; that is, the transpose of M, the matrix obtained by flipping the entries about the diagonal, is equal to M itself. If we have a symmetric 2×2 matrix, it turns out that we may always rotate the grid in the domain so that the matrix acts by stretching and perhaps reflecting in the two directions. In other words, symmetric matrices behave like diagonal matrices.

Said with more mathematical precision: given a symmetric matrix M, we may find a set of orthogonal vectors vi so that Mvi is a scalar multiple of vi; that is,

  Mvi = λi vi

where λi is a scalar. Geometrically, this means that the vectors vi are simply stretched and/or reflected when multiplied by M. Because of this property, we call the vectors vi eigenvectors of M; the scalars λi are called eigenvalues. An important fact, which is easily verified, is that eigenvectors of a symmetric matrix corresponding to different eigenvalues are orthogonal.

If we use the eigenvectors of a symmetric matrix to align the grid, the matrix stretches and reflects the grid in the same way that it does the eigenvectors.

The geometric description we gave for this linear transformation is a simple one: the grid is simply stretched in one direction. For more general matrices, we will ask whether we can find an orthogonal grid that is transformed into another orthogonal grid.
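Assuming the symmetric matrix in this example is [[2, 1], [1, 2]] (a 45-degree eigenvector grid with a factor-of-3 stretch pins it down), NumPy's symmetric eigensolver exhibits these facts directly; this sketch is not part of the original column:

```python
import numpy as np

# Assumed symmetric matrix for this example.
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is NumPy's solver for symmetric matrices; the columns of V are the v_i.
eigenvalues, V = np.linalg.eigh(M)

print(eigenvalues)                      # 1 and 3: the stretch factors
print(np.allclose(V.T @ V, np.eye(2)))  # True: the eigenvectors are orthonormal
```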

Let's consider a final example using a matrix that is not symmetric:

  M = [ 1  1 ]
      [ 0  1 ]

This matrix produces the geometric effect known as a shear.

It is easy to find one family of eigenvectors along the horizontal axis. However, our figure above shows that these eigenvectors cannot be used to create an orthogonal grid that is transformed into another orthogonal grid. Nonetheless, let's see what happens when we rotate the grid first by 30 degrees.

Notice that the angle at the origin formed by the red parallelogram on the right has increased. Let's next rotate the grid by 60 degrees.

Hmm. It appears that the grid on the right is now almost orthogonal. In fact, by rotating the grid in the domain by an angle of roughly 58.28 degrees, both grids become orthogonal.
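We can confirm the 58.28-degree figure numerically. A NumPy sketch, assuming the shear matrix is [[1, 1], [0, 1]] (not part of the original column):

```python
import numpy as np

# Assumed shear matrix for this example.
M = np.array([[1.0, 1.0],
              [0.0, 1.0]])

U, s, Vt = np.linalg.svd(M)

golden = (1.0 + np.sqrt(5.0)) / 2.0
print(s)  # the singular values: the golden ratio and its reciprocal

# Direction of the first right-singular vector v_1, normalized to [0, 180)
# degrees since v_1 is only determined up to sign.
angle = np.degrees(np.arctan2(Vt[0, 1], Vt[0, 0])) % 180.0
print(angle)  # roughly 58.28 degrees
```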

The singular value decomposition

This is the geometric essence of the singular value decomposition for 2×2 matrices: for any 2×2 matrix, we may find an orthogonal grid that is transformed into another orthogonal grid.

We will express this fact using vectors: with an appropriate choice of orthogonal unit vectors v1 and v2, the vectors Mv1 and Mv2 are orthogonal.

We will use u1 and u2 to denote unit vectors in the direction of Mv1 and Mv2. The lengths of Mv1 and Mv2, denoted by σ1 and σ2, describe the amount that the grid is stretched in those particular directions. These numbers are called the singular values of M. (In this case, the singular values are the golden ratio and its reciprocal, but that is not so important here.)

We therefore have

  Mv1 = σ1 u1
  Mv2 = σ2 u2

We may now give a simple description of how the matrix M treats a general vector x. Since the vectors v1 and v2 are orthogonal unit vectors, we have

  x = (v1 · x) v1 + (v2 · x) v2

This means that

  Mx = (v1 · x) Mv1 + (v2 · x) Mv2
     = (v1 · x) σ1 u1 + (v2 · x) σ2 u2

Remember that the dot product may be computed using the vector transpose:

  v · x = vT x

which leads to

  Mx = u1 σ1 v1T x + u2 σ2 v2T x
  M  = u1 σ1 v1T + u2 σ2 v2T

This is usually expressed by writing

  M = U Σ VT

where U is a matrix whose columns are the vectors u1 and u2, Σ is a diagonal matrix whose entries are σ1 and σ2, and V is a matrix whose columns are v1 and v2. The superscript T on the matrix V denotes the transpose of V.

This shows how to decompose the matrix M into the product of three matrices: V describes an orthonormal basis in the domain, U describes an orthonormal basis in the co-domain, and Σ describes how much the vectors in V are stretched to give the vectors in U.

How do we find the singular value decomposition?

The power of the singular value decomposition lies in the fact that we may find it for any matrix. How do we do it? Let's look at our earlier example and add the unit circle in the domain. Its image will be an ellipse whose major and minor axes define the orthogonal grid in the co-domain.
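The factorization M = UΣVT above is precisely what np.linalg.svd returns. A quick check, again assuming the shear matrix from earlier as the example (a sketch, not part of the original column):

```python
import numpy as np

M = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # assumed example matrix

U, s, Vt = np.linalg.svd(M)  # s holds the singular values, largest first

# Reassembling U Sigma V^T reproduces M.
print(np.allclose(U @ np.diag(s) @ Vt, M))   # True

# U and V have orthonormal columns.
print(np.allclose(U.T @ U, np.eye(2)), np.allclose(Vt @ Vt.T, np.eye(2)))
```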

Notice that the major and minor axes are defined by Mv1 and Mv2. These vectors therefore are the longest and shortest vectors among all the images of vectors on the unit circle.

In other words, the function |Mx| on the unit circle has a maximum at v1 and a minimum at v2. This reduces the problem to a rather standard calculus problem in which we wish to optimize a function over the unit circle. It turns out that the critical points of this function occur at the eigenvectors of the matrix MTM. Since this matrix is symmetric, eigenvectors corresponding to different eigenvalues will be orthogonal. This gives the family of vectors vi.

The singular values are then given by σi = |Mvi|, and the vectors ui are obtained as unit vectors in the direction of Mvi. But why are the vectors ui orthogonal?

To explain this, we will assume that σi and σj are distinct singular values. We have

  Mvi = σi ui
  Mvj = σj uj.

Let's begin by looking at the expression Mvi · Mvj, assuming, for convenience, that the singular values are non-zero. On one hand, this expression is zero because the vectors vi, which are eigenvectors of the symmetric matrix MTM, are orthogonal to one another:

  Mvi · Mvj = viT MT M vj = vi · MTM vj = λj vi · vj = 0.

On the other hand, we have

  Mvi · Mvj = σi σj ui · uj = 0.

Therefore, ui and uj are orthogonal, so we have found an orthogonal set of vectors vi that is transformed into another orthogonal set ui. The singular values describe the amount of stretching in the different directions.

In practice, this is not the procedure used to find the singular value decomposition of a matrix, since it is neither particularly efficient nor numerically well-behaved.
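The recipe above — eigenvectors of MTM give the vi, σi = |Mvi|, and ui is the unit vector along Mvi — can be sketched directly (NumPy, with the shear matrix as an assumed example; as the text notes, real SVD routines work differently):

```python
import numpy as np

M = np.array([[1.0, 1.0],
              [0.0, 1.0]])        # assumed example matrix

# The v_i are eigenvectors of the symmetric matrix M^T M.
eigenvalues, V = np.linalg.eigh(M.T @ M)
order = np.argsort(eigenvalues)[::-1]     # reorder so sigma_1 >= sigma_2
V = V[:, order]

# sigma_i = |M v_i|, and u_i is the unit vector in the direction of M v_i.
sigmas = np.linalg.norm(M @ V, axis=0)
U = (M @ V) / sigmas

print(np.allclose(U @ np.diag(sigmas) @ V.T, M))  # True: M = U Sigma V^T
print(np.allclose(sigmas, np.linalg.svd(M, compute_uv=False)))  # True
```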

Another example

Let's now look at the singular matrix

The geometric effect of this matrix is the following:

In this case, the second singular value is zero, so that we may write

  M = u1 σ1 v1T.

In other words, if some of the singular values are zero, the corresponding terms do not appear in the decomposition for M. In this way, we see that the rank of M, which is the dimension of the image of the linear transformation, is equal to the number of non-zero singular values.

Data compression

Singular value decompositions can be used to represent data efficiently. Suppose, for instance, that we wish to transmit the following image, which consists of an array of 15×25 black or white pixels.

Since there are only three types of columns in this image, as shown below, it should be possible to represent the data in a more compact form.

We will represent the image as a 15×25 matrix in which each entry is either a 0, representing a black pixel, or a 1, representing white. As such, there are 375 entries in the matrix.

If we perform a singular value decomposition on M, we find there are only three non-zero singular values:

  σ1 = 14.72
  σ2 = 5.22
  σ3 = 3.31

Therefore, the matrix may be represented as

  M = u1 σ1 v1T + u2 σ2 v2T + u3 σ3 v3T

This means that we have three vectors vi, each of which has 25 entries, three vectors ui, each of which has 15 entries, and three singular values σi. This implies that we may represent the matrix using only 123 numbers rather than the 375 that appear in the matrix. In this way, the singular value decomposition discovers the redundancy in the matrix and provides a format for eliminating it.

Why are there only three non-zero singular values? Remember that the number of non-zero singular values equals the rank of the matrix. In this case, we see that there are three linearly independent columns in the matrix, which means that the rank will be three.

Noise reduction

The previous example showed how we can exploit a situation in which many singular values are zero. Typically speaking, the large singular values point to where the interesting information is. For example, imagine we have used a scanner to enter this image into our computer. However, our scanner introduces some imperfections (usually called "noise") in the image.

We may proceed in the same way: represent the data using a 15×25 matrix and perform a singular value decomposition. We find the following singular values:

  σ1 = 14.15
  σ2 = 4.67
  σ3 = 3.00
  σ4 = 0.21
  σ5 = 0.19
  ...
  σ15 = 0.05

Clearly, the first three singular values are the most important, so we will assume that the others are due to the noise in the image and make the approximation

  M ≈ u1 σ1 v1T + u2 σ2 v2T + u3 σ3 v3T

This leads to the following improved image.

Noisy image / Improved image

Data analysis

Noise also arises anytime we collect data: no matter how good the instruments are, measurements will always contain some error. If we remember the theme that large singular values point to important features in a matrix, it seems natural to use a singular value decomposition to study data once it is collected.

As an example, suppose that we collect some data as shown below. We may take the data and put it into a matrix:

  [ -1.03  0.74  -0.02  0.51  -1.31  0.99  0.69  -0.12  -0.72  1.11 ]
  [ -2.23  1.61  -0.02  0.88  -2.39  2.02  1.62  -0.35  -1.67  2.46 ]

and perform a singular value decomposition. We find the singular values

  σ1 = 6.04
  σ2 = 0.22

With one singular value so much larger than the other, it may be safe to assume that the small value of σ2 is due to noise in the data and that this singular value would ideally be zero.
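Since the data matrix is given in full, this computation can be reproduced; a NumPy sketch (not part of the original column):

```python
import numpy as np

# The 2 x 10 data matrix above: each column is one measured (x, y) point.
A = np.array([
    [-1.03, 0.74, -0.02, 0.51, -1.31, 0.99, 0.69, -0.12, -0.72, 1.11],
    [-2.23, 1.61, -0.02, 0.88, -2.39, 2.02, 1.62, -0.35, -1.67, 2.46],
])

U, s, Vt = np.linalg.svd(A)
print(s)  # one singular value dominates the other

# Zeroing the small singular value projects the data onto the line along u_1.
A_rank1 = s[0] * np.outer(U[:, 0], Vt[0])

# By the Eckart-Young theorem, the spectral-norm error of this rank-one
# approximation is exactly the discarded singular value.
print(np.linalg.norm(A - A_rank1, 2))
```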

In that case, the matrix would have rank one, meaning that all the data lies on the line defined by u1.

This brief example points to the beginnings of a field known as principal component analysis, a set of techniques that uses singular values to detect dependencies and redundancies in data.

In a similar way, singular value decompositions can be used to detect groupings in data, which explains why singular value decompositions are being used in attempts to improve Netflix's movie recommendation system. Ratings of movies you have watched allow a program to sort you into a group of others whose ratings are similar to yours. Recommendations may then be made by choosing movies that others in your group have rated highly.

Summary

As mentioned at the beginning of this article, the singular value decomposition should be a central part of an undergraduate mathematics major's linear algebra curriculum. Besides having a rather simple geometric explanation, the singular value decomposition offers extremely effective techniques for putting linear algebraic ideas into practice. All too often, however, a proper treatment in an undergraduate linear algebra course seems to be missing.

This article has been somewhat impressionistic: I have aimed to provide some intuitive insights into the central idea behind singular value decompositions and then illustrate how this idea can be applied in a variety of situations.
