- Matlab - https://matlab.diyez.net -

Matlab Tutorial 4: Data Analysis and Statistics with Matlab

This tutorial covers data analysis and statistics using Matlab [1].

Histogram Charts in Matlab

The elements of a vector can be displayed with bars or histograms. To create a histogram you divide the elements into classes and count how many elements belong to each class. The counts are then presented as rectangular bars in a diagram, where the height of each rectangle equals the number of elements in that class. Read the vector.

Matlab Histogram Example


Create a histogram with 15 intervals for the vector x above, for example with hist(x,15). It is rather difficult to see how long the intervals are. It may be better to use a histogram with 6 intervals, since we know the difference between the maximum and minimum value.

>> hist(x,6)
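
To make the counting step concrete, the sketch below asks hist to return the class counts and class centers instead of drawing them. The vector v is made-up illustration data, not the vector x used in this tutorial.

```matlab
% Hypothetical data, used only for illustration.
v = [1 2 2 3 5 7 7 8 10 12];

% With output arguments, hist returns the number of elements in each
% class (n) and the center of each class (c) without drawing anything.
[n, c] = hist(v, 6);

width = (max(v) - min(v)) / 6;   % all six classes have the same width
disp(n)                          % elements per class
disp(c)                          % class centers
fprintf('class width: %.4f, elements counted: %d\n', width, sum(n));
```

The counts always add up to the number of elements in the vector, which is a quick check that no value fell outside the classes.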

If we don’t know the dataset, we can instead define the intervals that we want in the histogram. Suppose we want the integer values between 0 and 10 as class centers.

>> y=0:10;
>> hist(x,y)

The elements themselves can also be shown as bars, one bar per element:

>> bar(x),grid
>> title('bar for vector x')

Histogram Bars with Grid in the Background



Stem Plot

To make this a bit more complete we show some other plotting possibilities. In Matlab we can also illustrate a discrete sequence (stem plot). This is done as:

>> stem(x), grid
>> title('Stem diagram for vector x')

See the figure below. Notice that the values in vector x are plotted versus their index.

Matlab Stem Plot Example


Assume you want to produce a curve from sampled data. If you take samples from a sine curve with an amplitude of 1, the result is a discrete sequence of data. The sine curve is y=sin(x):

>> z=0:0.2:10;
>> yy=sin(z)
>> stem(yy),grid,title('Stem plot for a sine curve')

We can regard this as a sampled version of a continuous sine wave. This is how a computer system sees the signal.

Stem Plot for Sine Curve
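
Since stem(yy) plots the samples against their index, the horizontal axis runs from 1 to 51 rather than from 0 to 10. As a small sketch, the samples can instead be plotted against z and laid over a finely sampled sine curve. The vector zf is an assumption introduced here to stand in for the continuous curve.

```matlab
z  = 0:0.2:10;          % sample instants, 51 points
yy = sin(z);            % sampled sine with amplitude 1

zf = 0:0.01:10;         % fine grid standing in for the continuous curve
plot(zf, sin(zf)), hold on
stem(z, yy), grid
hold off
title('Sampled sine plotted against z')
xlabel('z')
```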


Staircase Plot and Pie Charts

What other presentations can we find? We will also take a look at pie and staircase diagrams. Below we have an m-file.

% M-file created by MatlabCorner.com
% M-file makes two presentations of vector x.
subplot(1,2,1)
pie(x),grid, title('Pie diagram for vector x')
subplot(1,2,2)
stairs(x), grid,title('Staircase plot for vector x')

See the result in the figures below.

Staircase Plot and Pie Diagram Example




Each element in the pie diagram is given as a percentage of the sum of vector x. In a staircase plot the elements of vector x are plotted versus their index. Labels can also be attached to the slices of a pie diagram:

>> pie([ 2 4 3 5],{'North','South','East','West'})
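
The percentages behind the slices can be checked by hand. The sketch below uses the same four values as the pie command above; the variable names v and p are chosen here for illustration.

```matlab
v = [2 4 3 5];
p = 100 * v / sum(v);        % share of each element in percent
fprintf('%5.1f %%\n', p);    % one line per slice
```

The shares are 2/14, 4/14, 3/14 and 5/14 of the whole, so they add up to 100 percent.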

Several of the commands that we have presented here can also be used in three dimensions. The commands are changed to: bar3(x), stem3(x) and pie3(x). Try them.

Statistics Commands in Matlab

We will now focus on some commands for statistics. These are needed to evaluate measured data. Some statistics functions can also be added directly in the figure window. Let’s start with some simple commands, but to use them we first need to review a few concepts.

Exercise 1: Read two vectors and try some of the statistics commands

>> X=1:5; Z=[0 1 4 7 12];
% Calculate the mean and median value of X and Z.
>> mean(X), mean(Z)
% The result should become 3 and 4.8.
>> median(X), median(Z)
% The result should be: 3 and 4.
% Standard deviation for the vectors.
>> std(X), std(Z)
% the answer is 1.5811 and 4.8683 respectively.
% This seems very logical, due to the large spread
% in the elements of vector Z.
% Now let’s plot Z versus X. See figure below
>> plot(X,Z), grid

Plot Z versus X
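
The numbers quoted in Exercise 1 can be reproduced from the definitions. This is a minimal sketch for the vector Z, assuming Z = [0 1 4 7 12], which is the vector that gives the quoted mean of 4.8. Note that std divides by n-1 by default, not by n.

```matlab
Z = [0 1 4 7 12];
n = numel(Z);

m  = sum(Z) / n;                        % mean: 24/5 = 4.8
s  = sqrt(sum((Z - m).^2) / (n - 1));   % standard deviation, n-1 in the denominator
Zs = sort(Z);
md = Zs((n + 1) / 2);                   % median: middle element of an odd-length vector

fprintf('mean %.4f  median %d  std %.4f\n', m, md, s);
```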




Use the menu of the figure window and choose Tools -> Data Statistics. A small box appears where you can choose min, max, mean, median, std and range for both the X and Z values. Mark mean and std for the Z vector. This draws three dotted horizontal lines on the plot: the upper one is the mean value plus the standard deviation, the middle one is the mean value, and the lower one is the mean value minus the standard deviation. See the figure below.

The Relationship Between Standard Deviation and Mean Value


In the figure above it seems plausible that there is a positive correlation between X and Z. That is, when X increases so does Z. Let’s use Matlab to calculate the correlation.

>> corrcoef(X,Z)
ans =
1 0.97435
0.97435 1

The result is a 2×2 matrix [3]. Element (1,1) is the correlation of X with itself, which is 1. Element (2,2) gives the same information for Z. Elements (2,1) and (1,2) give the correlation between X and Z. This means we have a strong positive correlation (0.97435) between X and Z. The statistics commands we have used can also take a matrix as an argument.
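
As a cross-check, the off-diagonal value can be computed directly from the definition of the correlation coefficient. The helper names dx, dz and r are introduced here for illustration.

```matlab
X = 1:5;  Z = [0 1 4 7 12];

dx = X - mean(X);        % deviations from the mean of X
dz = Z - mean(Z);        % deviations from the mean of Z
r  = sum(dx .* dz) / sqrt(sum(dx.^2) * sum(dz.^2));

R = corrcoef(X, Z);
fprintf('r = %.5f, corrcoef gives %.5f\n', r, R(1,2));
```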

Create the matrix A=[X;Z].

Try the commands. Are there any surprises in the table?
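
One likely surprise is that commands such as mean and std work column by column when given a matrix. The sketch below shows this and then uses the dimension argument to work along the rows instead. The matrix is called M here so that it does not overwrite A.

```matlab
X = 1:5;  Z = [0 1 4 7 12];
M = [X; Z];               % 2-by-5 matrix, called A in the text

colMean = mean(M);        % one value per column: 0.5 1.5 3.5 5.5 8.5
rowMean = mean(M, 2);     % one value per row: 3 and 4.8
rowStd  = std(M, 0, 2);   % row-wise standard deviation (n-1 normalization)

disp(colMean), disp(rowMean), disp(rowStd)
```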

We will also repeat some previous matrix manipulating commands that can be used to calculate sums, differences and products.

Use the matrix A below and the vector X that was stated earlier.

>> A=[1 2 3
4 5 6
7 8 9]

The commands in the above list can also be used on a vector. Try them on X.

>> prod(X), sum(X),diff(X)
% or
>> diff(X,2) % the same as diff(diff(X))
% Now try with the matrix A instead.
>> diff(A), prod(A),sum(A)
ans =
3 3 3
3 3 3
ans =
28 80 162
ans =
12 15 18
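
The results above are computed down each column. A short sketch of the row-wise versions follows, using the dimension argument. The matrix is named B here, and every result is assigned to a variable, so neither A nor ans is changed.

```matlab
B = [1 2 3; 4 5 6; 7 8 9];   % same values as A in the text

rowSum  = sum(B, 2);         % sums along each row: 6, 15, 24
rowProd = prod(B, 2);        % products along each row: 6, 120, 504
rowDiff = diff(B, 1, 2);     % first difference along each row: all ones

disp(rowSum), disp(rowProd), disp(rowDiff)
```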

Finally create a new matrix F in order to find out how we can put a matrix together.

>> F=[ans; A]
F=
12 15 18
1 2 3
4 5 6
7 8 9

Use the command sort on the resulting matrix F.

>> [A,index]=sort(F) % results in two matrices: one sorted matrix A
% and one matrix containing the original row position in the matrix F.
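
To see what the index matrix holds, the sorted matrix can be rebuilt from F one column at a time. This is only a sketch; the names S, idx and R are chosen here so that A is not overwritten.

```matlab
F = [12 15 18; 1 2 3; 4 5 6; 7 8 9];
[S, idx] = sort(F);          % S corresponds to the sorted matrix A in the text

% Rebuild each sorted column from F and the index matrix.
R = zeros(size(F));
for k = 1:size(F, 2)
    R(:, k) = F(idx(:, k), k);
end
disp(isequal(R, S))          % displays 1 when the reconstruction matches
```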

Notice that the sort command only operates within each column, so we only need one index to keep track of the element position.

So far in the course we have only produced simple text display or very rudimentary tables. I will try to show some useful commands for creating better-looking output. The command fprintf, together with format codes, specifies the output: how many positions should be used, how many decimals, and so on. We can use it to write to the command window as well as to a text file.

Let’s start by writing to the command window. Example: we would like to create a table containing three columns. The first one has the numbers from 1 to 3, the second contains the square roots of the numbers and the third contains the cubes of the numbers. See below for a suggestion of an m-file.

% Alt_1.m file created by MatlabCorner.com
% The m-file makes a table consisting of 3 columns.
% We also use format codes in order to control the output.
% \t=horizontal tab, \n=new line, %6.3f=6 positions and 3 decimals
x=1:3;
y1=sqrt(x); y2=x.^3;
Y=[x' y1' y2'];
disp(' x sqrt(x) x^3')
fprintf('%4.0f \t %6.3f \t %6.3f \n', Y')

The output in the command window will be:

x sqrt(x) x^3
1 1.000 1.000
2 1.414 8.000
3 1.732 27.000
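
The transpose in fprintf(..., Y') matters because fprintf consumes its data column by column and reuses the format string until the data runs out. The sketch below uses sprintf, which takes the same format codes but returns the text as a string, on a small made-up matrix M.

```matlab
M = [1 2; 3 4];

a = sprintf('%d %d\n', M);     % consumes 1 3 2 4, the column order of M
b = sprintf('%d %d\n', M');    % consumes 1 2 3 4, the rows of M
w = sprintf('%6.3f', pi);      % 3 decimals, padded with a blank to 6 positions

disp(a), disp(b), disp(['[' w ']'])
```

Without the transpose, each printed line would hold one column of the matrix instead of one row.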

A slightly changed m-file will more or less accomplish the same thing.

% Alt_2.m file created by MatlabCorner.com
% The m-file makes a table consisting of 3 columns. We also use format
% codes in order to control the output.
% \t=horizontal tab, \n=new line, %6.3f=6 positions and 3 decimals
disp(' x sqrt(x) x^3')
disp('---------------------------------')
for x=1:3
y1=sqrt(x); y2=x.^3;
Y=[x' y1' y2'];
fprintf('%1.0f \t %6.3f \t %6.3f \n', Y')
end

Finally we will use fprintf to write to a text file that we create. We modify the m-file Alt_2.m.

% Alt_2.m file created by MatlabCorner.com
% The m-file makes a table consisting of 3 columns. We also use
% format codes in order to control the output and write to text
% file:Alt_2.txt \t=horizontal tab, \n=New line, 
% %6.3f=6 positions and 3 decimals
disp(' x sqrt(x) x^3')
disp('---------------------------------')
fid=fopen('Alt_2.txt','w'); % creates a txt-file and opens it for writing.
for x=1:3
y1=sqrt(x); y2=x.^3;
Y=[x' y1' y2'];
fprintf(fid, '%1.0f \t %6.3f \t %6.3f \n', Y'); % fid=file identifier
end
fclose(fid); % closes the txt-file.

Run the m-file Alt_2. Then please check the file: Alt_2.txt
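
The file can also be checked from within Matlab. The sketch below writes the same three-column table without a loop, tests that fopen succeeded (it returns -1 on failure), and reads the file back with fileread. The file name Alt_2_demo.txt is an assumption made here so that Alt_2.txt is left untouched.

```matlab
fname = 'Alt_2_demo.txt';          % hypothetical file name for this sketch
fid = fopen(fname, 'w');
if fid == -1
    error('Could not open %s for writing.', fname);
end

x = 1:3;
% Each column of the matrix holds x, sqrt(x) and x^3 for one value of x,
% so fprintf writes one table row per column.
fprintf(fid, '%1.0f \t %6.3f \t %6.3f \n', [x; sqrt(x); x.^3]);
fclose(fid);

txt = fileread(fname);             % read the whole file back as text
disp(txt)
```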

As you can imagine, these formatting codes are very useful, and there are many others that can be used with the fprintf command. Other commands use the same formatting codes: fscanf reads from text files, and textscan reads text and converts it to a cell array. See the table below for several format codes.

String Formatting Codes

On the homepage we have a text file: name.txt. We shall now try to read it with the command textscan.

>> fid=fopen('name.txt','r'); % opens the file for reading.
>> C=textscan(fid,'%u%s%u%u'); % Gives a cell array.
>> fclose(fid) % closes the file for reading.

Take a look at the cell array C!

>> C{1,1},C{1,2},C{1,3},C{1,4}
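
The contents of name.txt are not reproduced here, so the sketch below first writes a small file with made-up contents that fit the format string '%u%s%u%u' (a number, a word, and two more numbers per line) and then reads it back. The file name name_demo.txt and the data in it are assumptions for illustration only.

```matlab
% Hypothetical contents; the real name.txt is not shown in this tutorial.
fname = 'name_demo.txt';
fid = fopen(fname, 'w');
fprintf(fid, '1 Anna 25 170\n2 Bertil 31 182\n');
fclose(fid);

fid = fopen(fname, 'r');
C = textscan(fid, '%u%s%u%u');   % same format string as above
fclose(fid);

ids   = C{1};                    % %u gives a uint32 column vector
names = C{2};                    % %s gives a cell array of strings
disp(class(ids)), disp(names{2})
```

Each format code produces one cell in C, so C has four cells with one entry per line of the file.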

As I have said earlier there are other commands that can do this equally well. Use the Matlab help to find out what other possibilities there are to solve this.