# Using Matlab and Principal Component Analysis (PCA) to Reduce Dimensionality of .csv Data

This information is really out of date; I have a much easier method here that does away with doing everything yourself.

I used Matlab to reduce the number of dimensions in my gesture data. After a bit of experimentation with different numbers of dimensions, I found I could halve the number of dimensions using PCA and still get quite low errors between the original data and the data reconstructed from the reduced dimensions. Some gesturers made such consistent movements that I could use just 2 dimensions to describe almost their entire range of motion.
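
As a rough illustration of that kind of experiment, here is a minimal sketch that measures the reconstruction error for several numbers of components. The data is synthetic and deliberately rank 3 (an assumption standing in for the gesture data), and the names ks and rmsError are made up for this example.

```matlab
% Sketch: reconstruction error for different numbers of principal components.
% x is synthetic (an assumption); substitute a matrix read from a .csv file.
rng(0);                                   % fixed seed so the run is repeatable
x = randn(200, 3) * randn(3, 10);         % 200 samples, 10 columns, rank 3

m = mean(x);                              % column means
y = x - ones(size(x,1),1) * m;            % centre the data
[V, D] = eig(cov(y));                     % eigenvectors and eigenvalues
[~, idx] = sort(diag(D), 'descend');      % largest variance first

ks = [1 2 3 5];                           % numbers of components to try
rmsError = zeros(size(ks));
for j = 1:numel(ks)
    Vk = V(:, idx(1:ks(j)));              % top-k eigenvectors
    reduced = y * Vk;                     % project onto k dimensions
    reconstructed = reduced * Vk' + ones(size(x,1),1) * m;  % back to 10 columns
    rmsError(j) = sqrt(mean((x(:) - reconstructed(:)).^2));
    fprintf('%d components: RMS error %g\n', ks(j), rmsError(j));
end
```

Because the synthetic data only has three independent directions, the error should fall to roughly zero once three or more components are kept. Real data will usually tail off more gradually.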

The method is relatively clear in Matlab, although I am still a bit unsure of the multiple transforms made in the following code. I think I may have performed a few too many, but at least it works! The code “ReduceUsingPCA.m” takes the directory to perform the conversion on and the number of output dimensions you require. So to convert every .csv in “c:input” to 20-dimensional data, you run ReduceUsingPCA('c:input', 20) in Matlab.

% DirName is the directory containing the .csv files to work on, OutputSize
% is the no. of dimensions to output after PCA
function ReduceUsingPCA(DirName,OutputSize)

files = dir(fullfile(DirName, '*.csv'));

for i=1:length(files)
FileName= [DirName '/' files(i).name];
% read in csv file from FileName and store in x (assumes a purely numeric .csv)
x = csvread(FileName);

[Rows, Columns] = size(x);  % find size of input matrix
m=mean(x);                  % find mean of input matrix
y=x-ones(size(x,1),1)*m;    % normalise by subtracting mean
c=cov(y);                   % find covariance matrix
[V,D]=eig(c);               % find eigenvectors (V) and eigenvalues (D) of covariance matrix
[D,idx] = sort(diag(D));    % take the eigenvalues from the diagonal of D and sort them (ascending); idx stores order to use when ordering eigenvectors
D = D(end:-1:1)';           % reverse so the eigenvalues are in descending order
V = V(:,idx(end:-1:1));     % put eigenvectors in order to correspond with eigenvalues
V2d=V(:,1:OutputSize);        % (significant Principal Components we use, OutputSize is input variable)
prefinal=V2d'*y';
final=prefinal’;            % final is normalised data projected onto eigenspace

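[~, infile] = fileparts(files(i).name);     % file name without directory or extension
outdir = [num2str(OutputSize) 'PC'];        % output folder, e.g. 20PC
if ~exist(outdir, 'dir'), mkdir(outdir); end  % create it only if it is missing
outputfilename = fullfile(outdir, [infile '_' num2str(OutputSize) 'PCs.csv']);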

csvwrite(outputfilename,final);
end
end
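
On the worry about performing too many transforms: the two transposes around the projection cancel out, since (V2d'*y')' is the same matrix as y*V2d. A small check on made-up data (the random matrix and the names direct and maxDiff are assumptions for illustration, not part of ReduceUsingPCA):

```matlab
% Sketch: the double transpose in ReduceUsingPCA equals a single multiplication.
rng(1);                                   % fixed seed so the run is repeatable
y = randn(50, 6);                         % synthetic data, 50 samples, 6 columns
y = y - ones(size(y,1),1) * mean(y);      % centre, as in ReduceUsingPCA
[V, D] = eig(cov(y));
[~, idx] = sort(diag(D), 'descend');      % largest variance first
V2d = V(:, idx(1:2));                     % keep two components

prefinal = V2d' * y';                     % route used in ReduceUsingPCA
final = prefinal';
direct = y * V2d;                         % single multiplication, no transposes

maxDiff = max(abs(final(:) - direct(:))); % should be zero up to rounding
fprintf('Largest difference between the two routes: %g\n', maxDiff);
```

So the extra transposes are harmless, but final = y*V2d would give the same result in one step.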

The output files from ReduceUsingPCA are saved in a folder named after the number of components, created in Matlab's current directory, e.g. “20PC/filename_20PCs.csv”.
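
For comparison with the hand-rolled steps, here is a hedged sketch using the pca function. It assumes the Statistics and Machine Learning Toolbox is installed, and it is not necessarily the easier method mentioned at the top of this post. The data is synthetic, and the sign of each component is arbitrary, so only magnitudes are compared.

```matlab
% Sketch: compare the hand-rolled projection with the toolbox pca function.
% Requires the Statistics and Machine Learning Toolbox (an assumption).
rng(2);                                   % fixed seed so the run is repeatable
x = randn(100, 8);                        % synthetic stand-in for one .csv file
k = 3;                                    % number of components to keep

% Built-in route: score holds the centred data projected onto the components
[coeff, score] = pca(x, 'NumComponents', k);

% Hand-rolled route, following the steps in ReduceUsingPCA
y = x - ones(size(x,1),1) * mean(x);      % centre the data
[V, D] = eig(cov(y));
[~, idx] = sort(diag(D), 'descend');      % largest variance first
final = y * V(:, idx(1:k));

% Each component's sign is arbitrary, so compare magnitudes only
maxDiff = max(abs(abs(score(:)) - abs(final(:))));
fprintf('Largest magnitude difference: %g\n', maxDiff);
```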

## 26 thoughts on “Using Matlab and Principal Component Analysis (PCA) to Reduce Dimensionality of .csv Data”

1. Hi, sorry for my English, I am from Chihuahua, Mexico.
So, let me ask you something: can I reduce a matrix or a vector from 10304×1 to 40×1?
I have implemented code similar to yours in Matlab in an application for face recognition, and the PCA function works great when I have a matrix of 10304×72 (for example, the result of encoding 72 pictures of 24 persons), but when I encode the picture of just one person I get a 10304×1 vector.
Could you help me work out how to transform it to 40×1?
Thanks a lot.

• Hi… I kindly and humbly request you to mail me the source code for face recognition using PCA, as I am also doing the same project. For a long time I have not been able to write the correct code… please help me… please mail me the source code (.m file). Thanking you in advance.
You can mail me at murarkaankit@gmail.com

• Hi, I am doing a project on dimensionality reduction. For that I have decided to use PCA. Can you please tell me what function is used to reduce the dimensions using PCA?

• How did you solve that problem? I am having the same problem. I am doing it for hand signs. It works fine for a data set containing a number of hand signs in the training data set, but it doesn't work for a matrix from only one image. It just gives only 0s. Can you please tell me how you solved that problem?

2. Nice to meet you bro,
so can you please help me with
face recognition using PCA with Matlab?
Thanks

3. Hi, I am working on image fusion using PCA and I am using your code to calculate the principal components, but I am not getting the answer.
I get the error here:
y=x-ones(size(x,1),1)*m;

4. Hi, I'm working with PCA in PRTools, and I would like to know if there is a special command that can tell me which of the features have been retained… I'm stuck on this part of the work. It would be really nice if someone knows how to do it!!! :) Thanks

5. Sir, I'm doing a project on face hallucination and recognition. Please help me find the eigenvalues of the images and do recognition using PCA.

6. I have JPEG data; we want to reduce the dimension using PCA for feature extraction with the help of Matlab.

• Hi Imran, or James

Please, I need to use PCA for feature extraction with Matlab. Can you send the .m file if you did the extraction of features?

7. Hello James, could you please explain how I can perform KPCA + LDA on large databases such as MNIST (handwritten digits)? If you could give us some examples I would really appreciate it. Thanks and keep up the good work.

8. Sir, I am doing a project on fetal ECG extraction using PCA in MATLAB. Will you please kindly send me the source code?

9. Hi sir,
I need source code for foetal ECG extraction using a neural network… I'm working on it, but I still didn't get the output.

10. Hello Sir,
You are projecting the data set onto its eigenspace. Can I project it onto any subspace?

11. I am applying PCA for feature selection on the normalized NSL-KDD data set. For this purpose I was using your code, but I am getting an error like:
Input argument “DirName” is undefined.

Error in ==> ReduceUsingPCA at 10
FileName= [DirName '/' files(i).name];
For many days I have been working on PCA, but I am not able to select features exactly using PCA. My data set is an intrusion detection data set, i.e. the NSL-KDD data set.

• Hi… I have to reduce the dimensions of the insurance benchmark dataset. While doing my project I have had some problems. Can you please send a copy of your file?

12. Hi,
I’m doing a project in BCI. I have data in a .csv file and I just want to capture the fluctuation in the waveform and indicate it with some indicator.
So can anyone help me regarding this?

13. I need code for PCA. Please can you mail the source code?

Where is the function or code for ReduceUsingPCA used in the program? Can you please mail the code?

14. Hi everybody,
I must use PCA in Matlab for feature selection and reducing dimensions.
My data set is big data.
Please can you send me the .m file for it?
Please help me.

15. Hi sir, I am doing a project on dimensionality reduction. For that I have decided to use PCA. Can you please tell me what function is used to reduce the dimensions using PCA?

16. Hi sir, I am doing a project on dimensionality reduction. Can you help me understand why I should go for PCA?