- Matlab - https://matlab.diyez.net -

Matlab Tutorial 5: Linear Equations

In this Matlab tutorial we will deal with linear equations, the least-squares method, condition numbers, and over- and under-determined systems.

Let’s start with an example.

Example 1

We have an equation system with three unknown variables and three equations. What will be the solution to the system below?

3w-2y+4z=8
5w+8y-6z=-5
9w-2y+7z=-17

Collect the coefficients of w, y and z in a matrix A, and the numbers on the right-hand side of the equations in a vector b. We can thus rewrite our equations as:

AX=b , where X contains the unknowns (w, y and z), and A and b are shown below.

b=     A=
  8    3 -2  4
 -5    5  8 -6
-17    9 -2  7

How will we find the solution?

>> X=inv(A)*b

or

>> X=A\b

both give the correct answer. The second form performs Gaussian elimination when A is a square matrix. In our case X becomes:

X=
-36.7778
71.2778
65.2222
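The elimination behind A\b can be verified outside Matlab as well. Here is a minimal plain-Python sketch of Gaussian elimination with partial pivoting (standard library only; the helper name gauss_solve is ours, not a library API), applied to the system from Example 1:

```python
def gauss_solve(A, b):
    """Solve A*x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Build the augmented matrix [A | b] so row operations touch both sides.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        # Pivot: swap in the row with the largest entry in column k.
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        # Eliminate column k below the pivot.
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    # Back substitution.
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

A = [[3, -2, 4], [5, 8, -6], [9, -2, 7]]
b = [8, -5, -17]
print(gauss_solve(A, b))  # ~[-36.7778, 71.2778, 65.2222]
```

The exact rational solution is w = -331/9, y = 1283/18, z = 587/9, which rounds to the values Matlab prints.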

This is not an exact solution, only a floating-point approximation of it. Try also a dedicated iterative solver:

>> lsqr(A,b)

The condition number of an equation system can be examined. In short, it measures how sensitive the solution of a linear system is to disturbances in A or b. The condition number is always >= 1: the greater the sensitivity, the greater the number.

Calculate the sensitivity in our system.

>> cond(A)

The condition number gives a partial indication of how much trust to put in a solution returned by lsqr. To see how close the two methods come to each other, compare their results:

>> norm(A\b - lsqr(A,b))
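Matlab's cond(A) uses the 2-norm by default. As a rough cross-check outside Matlab, one can compute the 1-norm condition number ||A||1 * ||A^-1||1 with nothing more than Gauss-Jordan inversion. A plain-Python sketch (the helper names inverse and norm1 are ours); the number it prints differs from cond(A), but it tells the same story: close to 1 is well conditioned, large is ill conditioned.

```python
def inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination on [A | I]."""
    n = len(A)
    M = [[float(v) for v in row] + [float(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for k in range(n):
        # Partial pivoting for numerical stability.
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        piv = M[k][k]
        M[k] = [v / piv for v in M[k]]
        for r in range(n):
            if r != k and M[r][k] != 0.0:
                f = M[r][k]
                M[r] = [v - f * w for v, w in zip(M[r], M[k])]
    return [row[n:] for row in M]

def norm1(A):
    """Matrix 1-norm: the maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

A = [[3, -2, 4], [5, 8, -6], [9, -2, 7]]
cond1 = norm1(A) * norm1(inverse(A))
print(cond1)  # always >= 1; a large value signals an ill-conditioned system
```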

As you have seen, there are many different commands for solving equation systems. We will look at a few more. They take the same arguments as lsqr. Try the following: pcg, qmr, symmlq, minres. Some of them succeed and one or two will fail (note that pcg assumes a symmetric positive definite matrix, and symmlq and minres assume a symmetric one). Try to figure out which of these methods produce a correct answer.

Round-off errors can be magnified and we can lose accuracy. For a badly conditioned matrix A, the computed product A*A^-1 will not exactly equal the identity matrix. Return to the matrix A and change the element A(3,3). See below!

A =
3.0000 -2.0000 4.0000
5.0000 8.0000 -6.0000
9.0000 -2.0000 7.5600

What will the solution become, and what happens to the condition number? The right-hand side is the same as before. So far the equation system has had a square matrix A (n×n), with b a column vector of n elements.

What happens if we do not have the same number of equations as unknowns? Assume that A is an m×n matrix with m>n. This means we have more equations than unknowns: the system is over-determined, and an exact solution probably does not exist. This is a very realistic case in real life.

Suppose we have 58 measurements of current and voltage over a resistor, and we wish to determine the resistance using Ohm's law. In theory these measurements would lie on a straight line in a U-I plot, but in practice the measurements will deviate somewhat from the line. We want the sum of squared deviations to be as small as possible; this gives the best solution according to the least-squares method. In Matlab this is achieved by:

>> A\b

Solve the following example where we have 3 equations, but only 2 unknowns.

x + y = 2
x + 2y = 0
x + 3y =-1

Our coefficient matrix is A and the measurement vector is b: AX=b

A =        b =
1 1         2
1 2         0
1 3        -1
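One way to check a least-squares answer by hand is to form the normal equations (A^T A)X = A^T b; with only 2 unknowns this leaves a 2×2 system solvable by Cramer's rule. A plain-Python sketch (the helper name lsq_2col is ours) using the values above:

```python
def lsq_2col(A, b):
    """Least squares for an m x 2 system via the normal equations
    (A^T A) x = A^T b, with the 2x2 system solved by Cramer's rule."""
    # Entries of the 2x2 matrix A^T A.
    s11 = sum(r[0] * r[0] for r in A)
    s12 = sum(r[0] * r[1] for r in A)
    s22 = sum(r[1] * r[1] for r in A)
    # Entries of the vector A^T b.
    t1 = sum(r[0] * bi for r, bi in zip(A, b))
    t2 = sum(r[1] * bi for r, bi in zip(A, b))
    det = s11 * s22 - s12 * s12
    return [(t1 * s22 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det]

A = [[1, 1], [1, 2], [1, 3]]
b = [2, 0, -1]
x, y = lsq_2col(A, b)
print(x, y)  # ~3.3333 and -1.5
```

The same numbers should come out of A\b in Matlab for this over-determined system.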

What do x and y become according to the least-squares method? If we have fewer equations than unknowns, then we have an under-determined system. It has infinitely many solutions, but Matlab presents only one of them, without any warning about all the others. Solve the equation system below:

x + y + z = 2
x + 2y + 3z = 3

What solution is presented? In Example 2 below we solve an over-determined equation system without any more advanced commands. Instead we rely on our mathematical background. First, specify the normal equations!

Example 2

We have the following output from an experiment: y=[0 1.1 1.93 3.3], and the corresponding input is x=[0 1 2 3]. Let us look for a function that fits this data set as well as possible. First we make a plot of the data. See Figure 1a.

In all real-life measurements we always have the presence of disturbances and noise. These will of course have some impact on our measurement and give us some error. It will happen no matter how good we are in modeling. We will not find a perfect match between our model and the measured data.


Figure 1a

In Figure 1a it seems that a suitable function could be a straight line, y = a1x + a2, where a1 and a2 must be determined. If we insert our y- and x-values we get the following:

3.3 = a1 *3 + a2
1.93 = a1 *2 + a2
1.1 = a1 *1 + a2
0 = a1 *0 + a2

We now have 4 equations and 2 unknowns. Let us look for the best possible solution (least squares) that fits our data, i.e. the one that makes the sum of (a1xi + a2 - yi)^2 as small as possible, where the index i runs from 0 to 3. We rewrite the 4 equations above as a matrix equation Y = X*a, where

Y =        X =
3.3         3 1
1.93        2 1
1.1         1 1
0           0 1

and

a = a1
    a2

We need to solve for the vector a in the matrix equation. X itself cannot be inverted, since only square matrices have inverses, so we multiply both sides from the left by the transpose X^T. The matrix equation then reads:

X^T*Y = X^T*X*a

X^T*X is a square matrix, and we will try to invert it. If the inverse exists:

a = (X^T*X)^-1 * X^T*Y

The vector a that we get contains the values a1 and a2. This is the best solution according to the least-squares method.

It is time to try the solution. Write the following in Matlab:

>> a=inv(X'*X)*X'*Y

a =
1.0730
-0.0270

If we plot this it gives a straight line. See Figure 1b.
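For a straight-line fit the formula a = (X^T*X)^-1 * X^T*Y reduces to a pair of closed-form expressions for slope and intercept. A plain-Python cross-check with the data from Example 2 (the helper name fit_line is ours):

```python
def fit_line(xs, ys):
    """Least-squares straight line y = a1*x + a2 via the normal equations."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # determinant of X^T * X
    a1 = (n * sxy - sx * sy) / det   # slope
    a2 = (sxx * sy - sx * sxy) / det # intercept
    return a1, a2

a1, a2 = fit_line([0, 1, 2, 3], [0, 1.1, 1.93, 3.3])
print(a1, a2)  # ~1.0730 and ~-0.0270, matching inv(X'*X)*X'*Y
```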

Figure 1b

It is not always necessary to fit the data with a linear function, but in this example the data fitted a first-order function nicely. For other data we might have looked for a quadratic, cubic or fourth-order polynomial.

y = a1x^2 + a2x + a3 , quadratic function
y = a1x^3 + a2x^2 + a3x + a4 , cubic function

Eigenvalues and eigenvectors

Eigenvalue problems appear in many different fields. One application is locating cracks in building structures: we know that a building has its own natural frequencies (eigenvalues) when there are no cracks, but the frequencies change if cracks appear. The same principle can be used in oil prospecting: by listening to the ground you get different answers (frequencies) depending on whether it contains oil, mud, water or something else. Different linear systems mean different eigenvalues.

We can find eigenvalue problems in cars, ships, airplanes, bridges and so on. These issues are of course not limited to mechanical systems, but can also be found in any kind of system modeled by linear equations, including political, economic, biological, electrical or whatever systems. They could in fact have an identical model description and the same eigenvalues.

Yet another example could be the rotation of the earth. In such a system there will be a fixed direction, namely the axis of the rotation. There must be a north and a south pole and thus a rotation axis. This becomes the eigenvector. An eigenvector specifies the direction. How can we define these?

Let A be an n×n matrix and let X be a vector different from the zero vector such that AX = λX for some number λ. Then X is called an eigenvector of the matrix A with eigenvalue λ. Eigenvalues and eigenvectors can be complex. An n×n matrix always has n eigenvalues (poles), counted with multiplicity. The eigenvalues are determined by det(A - λI) = 0.

This equation is commonly known as the characteristic equation (setting the denominator of the transfer function to zero gives the same poles), and solving it yields the eigenvalues. For each eigenvalue one can then determine the eigenvectors from (A - λI)X = 0, where I is the identity matrix. Note that the eigenvalues are intrinsic to the system, but this is not true for the eigenvectors: they change depending on how we describe the system.

Let us solve a couple of examples with Matlab.

Example 3

Let us solve the following example. We have a matrix A=[1 2 3; 4 5 6; 7 8 7]. Find the corresponding eigenvalues!

>> eig(A)
ans =
15.0235
-1.8018
-0.2217
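eig computes all the eigenvalues at once. If only the dominant (largest-magnitude) eigenvalue is needed, it can also be approximated by power iteration: repeatedly multiply a vector by A and normalize. A plain-Python sketch on the same matrix (the helper name power_iteration is ours):

```python
def power_iteration(A, steps=200):
    """Approximate the dominant eigenvalue of A by repeated multiplication."""
    n = len(A)
    v = [1.0] * n
    lam = 0.0
    for _ in range(steps):
        # w = A * v
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w, key=abs)          # estimate: largest-magnitude entry of A*v
        v = [wi / lam for wi in w]     # normalize to avoid overflow
    return lam

A = [[1, 2, 3], [4, 5, 6], [7, 8, 7]]
print(power_iteration(A))  # ~15.0235, the largest eigenvalue found by eig
```

The convergence rate depends on the ratio of the two largest eigenvalue magnitudes; here |−1.8018/15.0235| is small, so it converges quickly.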

Now let us find the eigenvectors to the matrix.

>> [X,D]=eig(A) % Gives us the eigenvectors in X and the eigenvalues as
% diagonal elements in the matrix D.
X = % Note that the column vectors are normalized.
% They all have length 1.
-0.2488 -0.5718 0.5956
-0.5686 -0.3273 -0.7589
-0.7841 0.7523 0.2634
D =
15.0235 0 0
0 -1.8018 0
0 0 -0.2217

We find the eigenvectors as columns in the matrix X. This means that the matrix A can be diagonalized (linear algebra). We can express the relation as A = X*D*X^-1 or D = X^-1*A*X. You may remember from linear algebra that the determinant of a matrix is the product of its eigenvalues. Check it if you like: compare det(A) and prod(eig(A))!

Let us see if we can always find 3 linearly independent eigenvectors. If we can, the matrix can be diagonalized and the 3 vectors span the whole space R3. Now investigate a different matrix A:

>> A=[1 2 4; 2 4 8; 3 7 9]

Are the rows and columns linearly independent ? What are the eigenvalues for A ?

det(A): Calculates the determinant for the matrix A.

rank(A): Gives the number of linearly independent rows and columns of the matrix A. If all rows and columns are independent, A is invertible.

inv(A): Inverts the matrix A. If A is invertible, then det(A) is nonzero.

trace(A): The sum of the diagonal elements of A, which equals the sum of its eigenvalues.

System of differential equations

In an ordinary differential equation we often have a time-dependent variable like x(t). Here we know that the time derivative or growth (velocity) x'(t) depends on x(t). Sometimes we have several such differential equations linked to each other, for instance as below:

x1'(t) = a11x1(t) + a12x2(t) + … + a1nxn(t)
x2'(t) = a21x1(t) + a22x2(t) + … + a2nxn(t)
.
.
.
xn'(t) = an1x1(t) + an2x2(t) + … + annxn(t)

This is an n×n system: we have n first-order differential equations, where the aij are coefficients. A single differential equation x'(t) = ax(t) has, according to single-variable calculus, the general solution x(t) = Ce^(at).

Thus n differential equations give n solutions of this form, where the exponents a are the eigenvalues of the coefficient matrix; these can be real or complex conjugated. The solutions of the system above are combinations of exponential factors multiplied by constants or, if we have complex eigenvalues, exponential factors multiplied by sines or cosines, again together with constants.

Example 4

Let us exemplify the previous part:

x1'(t) = x1(t) + 2x2(t)
x2'(t) = 2x1(t) - 2x2(t)

We collect all coefficients in a matrix A=[1 2; 2 -2]. Calculate the eigenvalues and normalized eigenvectors for the coefficient matrix A.

>> [X,D]=eig(A)
X =
-0.4472 -0.8944
0.8944 -0.4472
D =
-3 0
0 2
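For a 2×2 matrix the characteristic equation det(A − λI) = 0 is just a quadratic, λ^2 − trace(A)·λ + det(A) = 0, so the eigenvalues can be checked by hand. A plain-Python sketch for the coefficient matrix above (the helper name eig_2x2 is ours; it assumes real eigenvalues):

```python
import math

def eig_2x2(A):
    """Eigenvalues of a 2x2 matrix from lambda^2 - tr*lambda + det = 0."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(tr * tr - 4 * det)   # assumes a nonnegative discriminant
    return sorted([(tr - disc) / 2, (tr + disc) / 2])

print(eig_2x2([[1, 2], [2, -2]]))  # [-3.0, 2.0], matching the diagonal of D
```

Here tr = -1 and det = -6, so λ^2 + λ - 6 = 0, giving λ = -3 and λ = 2.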

The general solution to the differential equations can then be written:

x1(t) = a1e^(-3t)(-0.4472) + a2e^(2t)(-0.8944)   (Eq1)
x2(t) = a1e^(-3t)(0.8944) + a2e^(2t)(-0.4472)   (Eq2)

The constants a1 and a2 can be determined once we know the initial values of the differential equations. The other numerical values can be read directly from the X and D matrices: X holds the 2 eigenvectors (columns) and D holds the eigenvalues (diagonal elements). Assume the initial values x1(0)=1 and x2(0)=4.

Inserting these initial values into Eq1 and Eq2 gives an entirely new equation system.

-0.4472 a1 – 0.8944 a2 = 1

0.8944 a1 – 0.4472 a2 = 4

Now we have an equation system with 2 unknowns and 2 equations. We can now find an exact solution. This is easily solved in Matlab.

X*a=b where a=[ a1 ; a2] , b=[ 1 ; 4] and X=[-0.4472 -0.8944; 0.8944 -0.4472]

The solution becomes:

>> a=X\b
a =
3.1305
-2.6833
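Because the columns of X returned by eig are normalized (and here also orthogonal), this small system is easy to verify by hand. A general 2×2 solve with Cramer's rule in plain Python (the helper name solve_2x2 is ours) confirms the numbers:

```python
def solve_2x2(X, b):
    """Solve X*a = b for a 2x2 system using Cramer's rule."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    a1 = (b[0] * X[1][1] - X[0][1] * b[1]) / det
    a2 = (X[0][0] * b[1] - b[0] * X[1][0]) / det
    return [a1, a2]

X = [[-0.4472, -0.8944], [0.8944, -0.4472]]
b = [1, 4]
print(solve_2x2(X, b))  # ~[3.13, -2.68], matching a = X\b
```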

It is now possible to plot the solutions x1(t) and x2(t) during the first second.

>> t=0:0.1:1;
>> x1=a(1)*exp(D(1,1)*t)*X(1,1)+a(2)*exp(D(2,2)*t)*X(1,2);
>> x2=a(1)*exp(D(1,1)*t)*X(2,1)+a(2)*exp(D(2,2)*t)*X(2,2);
>> plot(t,x1,t,x2,'-.'), grid, legend('x1(t)','x2(t)')

See the result of the plot below in Figure 2.
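A quick sanity check on the final solution: evaluating x1(t) and x2(t) at t = 0 should reproduce the initial values 1 and 4. A plain-Python version of the plotting formulas, with the numbers copied from X, D and a above:

```python
import math

# Eigen-decomposition data and constants from the worked example.
X = [[-0.4472, -0.8944], [0.8944, -0.4472]]
lam = [-3.0, 2.0]          # diagonal elements of D
a = [3.1305, -2.6833]

def x1(t):
    return a[0] * math.exp(lam[0] * t) * X[0][0] + a[1] * math.exp(lam[1] * t) * X[0][1]

def x2(t):
    return a[0] * math.exp(lam[0] * t) * X[1][0] + a[1] * math.exp(lam[1] * t) * X[1][1]

print(x1(0.0), x2(0.0))  # ~1.0 and ~4.0, the assumed initial values
```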

Figure 2: Solutions for Example 4

Sparse matrices are matrices with very few nonzero elements. This is useful to know when we are dealing with heavy calculations and memory allocation. In Matlab we can request that an m×m matrix be stored as a sparse matrix: only the positions and the values of the nonzero elements need to be kept in memory. Create an identity matrix!

>> A=eye(1000) % The diagonal elements are 1, the others are 0.
% The size of the matrix is 1000x1000.

Check how many bytes are allocated in the memory (Workspace). Now use the sparse command to save memory.

>> B=sparse(A) % A has become a sparse matrix.
>> whos

Compare the memory allocation of A and B. What do you think? Also check the number of rows and columns in A and B. Does the storage format matter for execution time? Let us make a simple calculation: multiply the matrix A by the scalar 3.

>> 3*A

We can simply measure the time to execute the command. This can be done using the commands tic and toc.

>> tic, 3*A, toc % tic starts clock and toc stops clock.

The time elapsed can be found in the command window. Compare it with the scalar multiplication of the sparse matrix below.

>> tic, 3*B, toc % Starts timer for the command 3*B.

We can see a clear difference, so the sparse format matters when we need to save both time and memory. The table below lists some Matlab commands that can be applied to sparse matrices.

speye(n): Gives a sparse n×n identity matrix, with ones on the main diagonal.
sprand(A): Gives a sparse matrix with uniformly distributed random numbers (0-1) in the nonzero positions of A.
sprandn(A): Same as above, but the elements are normally distributed (mean 0, variance 1).
spones(A): Gives a sparse matrix with ones as the only nonzero elements.
full(A): Makes a full matrix from a sparse matrix A.
nnz(A): Gives the number of nonzero elements in the matrix A.
issparse(A): Returns 1 if the matrix is stored as sparse, otherwise 0.
nonzeros(A): Returns a vector with all the nonzero elements of A.
find(A): Gives the indices of all nonzero elements in the matrix A, whether or not A is sparse.
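The storage idea behind these commands can be sketched in a few lines: keep only (row, column) -> value pairs for the nonzeros. This is essentially a dictionary-of-keys sparse matrix in plain Python (Matlab internally uses a compressed-column format, but the memory argument is the same; the helper names are ours):

```python
def sparse_eye(n):
    """Identity matrix stored as {(row, col): value} - n entries, not n*n."""
    return {(i, i): 1.0 for i in range(n)}

def scale(S, c):
    """Scalar multiplication only touches the stored nonzeros."""
    return {pos: c * v for pos, v in S.items()}

def nnz(S):
    """Number of stored nonzero elements, like Matlab's nnz."""
    return len(S)

B = sparse_eye(1000)
print(nnz(B))     # 1000 stored entries, versus 1,000,000 in a full matrix
C = scale(B, 3)
print(C[(0, 0)])  # 3.0
```

This also explains the timing difference seen with tic/toc: 3*B visits 1000 stored values, while 3*A visits all 1,000,000 positions.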