16.05.2015 Views

Grid Computing Assignment 1 Discrete Fourier Transform ...


Grid Computing
Assignment 1
Discrete Fourier Transform implementation using Vishwa

Shamju Joseph K
I. M.Tech
CS05M038

Problem statement

Finding the frequency spectrum of a given signal using the Discrete Fourier Transform (DFT) algorithm.

Algorithm

The DFT is given by the equation:

$$X[k] = \frac{1}{N} \sum_{n=0}^{N-1} x[n] \left( \cos\frac{2\pi k n}{N} - j \sin\frac{2\pi k n}{N} \right), \qquad k = 0, 1, \dots, N-1$$

where x[n] is the input data, X[k] is the spectrum output, and N is the total number of samples.

Parallelization

As the equation shows, the DFT requires on the order of N² operations, which makes it very computation intensive for large values of N, as is the case in most real-world applications. The DFT equation is parallelized by dividing the outer loop (the loop over k) into m subtasks, which are executed on different nodes. The input signal is available in a single file. Each task is given the total number of tasks, the current task number and the number of samples, and processes the section of the equation determined by these parameters. The parameters are provided in a separate file for each task, so the data file need not be partitioned.

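The DFT code executed by each subtask is given below:

#include <math.h>
#include <stdlib.h>

/* Assumed to be file-scope variables, set from the task parameters
   before DFT() is called */
int blockSize;
int offset;

/* x1/y1 hold the real/imaginary input (samples values each). On return,
   their first blockSize entries hold this task's part of the spectrum.
   The 1/N scaling of the DFT equation is not applied here. */
int DFT(int samples,double *x1,double *y1)
{
   long i,k,i2;
   double arg;
   double cosarg,sinarg;
   double *x2=NULL,*y2=NULL;

   x2 = (double*) malloc(samples*sizeof(double));
   y2 = (double*) malloc(samples*sizeof(double));
   if (x2 == NULL || y2 == NULL)
   {
      free(x2);
      free(y2);
      return(0);
   }

   /* Assumption: i2 = offset + i is the index of the spectrum value
      computed in this iteration (the loop header and the setup lines
      up to cosarg are reconstructed). */
   for (i=0;i<blockSize;i++)
   {
      i2 = offset + i;
      x2[i2] = 0;
      y2[i2] = 0;
      arg = -2.0 * 3.141592653589793 * (double)i2 / (double)samples;
      for (k=0;k<samples;k++)
      {
         cosarg = cos(k * arg);
         sinarg = sin(k * arg);
         x2[i2] += (x1[k] * cosarg - y1[k] * sinarg);
         y2[i2] += (x1[k] * sinarg + y1[k] * cosarg);
      }
   }

   // Copy this task's block back
   for( i = 0; i < blockSize; i++ )
   {
      x1[i] = x2[offset + i];
      y1[i] = y2[offset + i];
   }

   free(x2);
   free(y2);
   return(1);
}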

where blockSize and offset are determined by the total number of tasks (TotalTasks) and the current task number (CurrentTask), and are given by:

blockSize = samples / TotalTasks;
offset = CurrentTask * blockSize;
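A minimal sketch of this partitioning follows, using the hypothetical helper name `task_range`. It applies the two formulas literally, so it assumes that samples is an exact multiple of TotalTasks; with integer division, any remainder samples would not be assigned to a task.

```c
/* Hypothetical helper: derives the block of spectrum indices handled by
 * one task. Assumes samples is an exact multiple of totalTasks. */
void task_range(int samples, int totalTasks, int currentTask,
                int *blockSize, int *offset)
{
    *blockSize = samples / totalTasks;       /* outputs per task          */
    *offset    = currentTask * (*blockSize); /* first output of this task */
}
```

With 1024 samples and 4 tasks, task 2 gets blockSize = 256 and offset = 512, that is, k = 512 to 767.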

Execution Procedure

The following files are required to parallelize the DFT computation of a 1024-sample data file using 4 grid nodes:

1. data.dat : Input data file
2. param0.dat : Parameter file for task 0
3. param1.dat : Parameter file for task 1
4. param2.dat : Parameter file for task 2
5. param3.dat : Parameter file for task 3
6. metafile.txt : Configuration file which specifies the arguments to each subtask

Content of the file ‘param0.dat’

4 (No of Tasks)
0 (current task no)
1024 (total samples)

Content of the file ‘param1.dat’

4 (No of Tasks)
1 (current task no)
1024 (total samples)

Content of the file ‘param2.dat’

4 (No of Tasks)
2 (current task no)
1024 (total samples)

Content of the file ‘param3.dat’

4 (No of Tasks)
3 (current task no)
1024 (total samples)
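Reading such a parameter file can be sketched as below. This is an illustration rather than the code in dft_task.c, which is not shown here. The function name `read_params` is hypothetical, and the sketch assumes that each line starts with an integer and that any trailing annotation on the line, such as "(No of Tasks)", should be ignored.

```c
#include <stdio.h>

/* Hypothetical helper: reads three integers, one per line, skipping the
 * rest of each line. Returns 1 on success, 0 on a malformed file. */
int read_params(FILE *fp, int *totalTasks, int *currentTask, int *samples)
{
    int *fields[3] = { totalTasks, currentTask, samples };

    for (int i = 0; i < 3; i++) {
        int c;
        if (fscanf(fp, "%d", fields[i]) != 1)
            return 0;
        /* discard the remainder of the line, e.g. "(No of Tasks)" */
        while ((c = fgetc(fp)) != '\n' && c != EOF)
            ;
    }
    return 1;
}
```

A subtask would call this once on its own parameter file and then derive its share of the work from the three values.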

Content of the file ‘metafile.txt’

4 (No of Tasks)
2 (No of arguments to each task)
param0.dat data.dat (arguments to task 0)
param1.dat data.dat (arguments to task 1)
param2.dat data.dat (arguments to task 2)
param3.dat data.dat (arguments to task 3)
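Since the metafile follows a regular pattern, it could be generated rather than typed by hand. The sketch below is only an illustration: `write_metafile` is a hypothetical helper, and the layout it writes (task count, argument count, then one line of arguments per task) is taken from the listing above without the parenthesised annotations, on the assumption that these are explanatory and not part of the file.

```c
#include <stdio.h>

/* Hypothetical helper: writes a metafile for `tasks` subtasks, each given
 * its own parameter file and the shared data file as its two arguments.
 * Returns 1 on success, 0 if writing fails. */
int write_metafile(FILE *out, int tasks, const char *dataFile)
{
    /* first line: number of tasks; second line: arguments per task */
    if (fprintf(out, "%d\n2\n", tasks) < 0)
        return 0;
    for (int t = 0; t < tasks; t++) {
        if (fprintf(out, "param%d.dat %s\n", t, dataFile) < 0)
            return 0;
    }
    return 1;
}
```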

1. Run ‘purezonalserver’ on one of the nodes.
2. Next, set up the grid by running ‘purefaultgridnode’ on the four nodes which are to participate in the grid.
3. Submit the grid task by running ‘user’. The parameters for the program ‘user’ are:
   • Meta file name : metafile.txt
   • Source file for tasks : dft_task.c
   • No of splits : 4
   • Output file : result.dat
   • Source file for aggregation : dft_agg.c
4. The result will be available in the file ‘result.dat’ after the computation.

Vishwa starts the grid computation by running ‘dft_task’ at each node with the files ‘paramx.dat’ (where x is the task number) and ‘data.dat’ as arguments, as specified in the file ‘metafile.txt’. Each subtask reads its ‘paramx.dat’ file to get the number of tasks, the current task number and the number of samples, computes its portion of the equation using these values and writes the result to a file. Once all the subtasks have finished, ‘dft_agg’ is executed to aggregate the results into the file ‘result.dat’.
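The aggregation step can be pictured as placing each task's block back at its offset in the full spectrum. The source of dft_agg is not reproduced here, so the following is only a sketch under assumptions: `aggregate_blocks` is a hypothetical name, and each task's output is assumed to be available in memory as an array of blockSize values, ordered by task number.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies per-task result blocks into one output
 * array. blocks[t] points to the blockSize values produced by task t;
 * result must have room for totalTasks * blockSize values. */
void aggregate_blocks(double *result, const double *const *blocks,
                      size_t totalTasks, size_t blockSize)
{
    for (size_t t = 0; t < totalTasks; t++) {
        size_t offset = t * blockSize;  /* offset = CurrentTask * blockSize */
        memcpy(result + offset, blocks[t], blockSize * sizeof(double));
    }
}
```

Because every block is written at an offset derived from its task number, the order in which the subtasks finish does not affect the final result.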

Sample Plots


Observations

1. A provision could be added for the user to list the grid nodes participating in the computation, along with their status (busy, idle, etc.).
2. Vishwa should pass the total number of tasks and the current task number to the subtasks, to make parallel code easier to write. Splitting the data manually is tedious and time consuming; once these two parameters are available, a subtask can read just the selected region of the input data for processing.
3. When the number of splits is greater than one, the subtasks are most often executed on the same grid node, and when they are executed on different nodes the result is not combined properly.
4. It is difficult to measure the time required for the computation, including task distribution and result aggregation.
5. A provision could be added for submitting multiple source files, including cpp files.
6. The ‘user’ program could read its parameters (metafile, subtask file, result file, etc.) from a file, to avoid entering them again for every run.
7. Displaying the entire result file on the screen could be avoided.

