Grid Computing Assignment 1 Discrete Fourier Transform ...
Grid Computing<br />
Assignment 1<br />
Discrete Fourier Transform implementation using Vishwa<br />
Shamju Joseph K<br />
I. M.Tech<br />
CS05M038<br />
Problem statement<br />
Finding the frequency spectrum of a given signal using the Discrete Fourier Transform (DFT)<br />
algorithm.<br />
Algorithm<br />
The DFT is given by the equation:<br />
$$X[k] = \frac{1}{N} \sum_{n=0}^{N-1} x[n] \left( \cos\frac{2\pi k n}{N} - j \sin\frac{2\pi k n}{N} \right), \qquad k = 0, 1, \ldots, N-1$$<br />
where x[n] is the input data, X[k] is the spectrum output, N is the total number of samples.<br />
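As a quick check of the equation, consider a unit impulse with N = 4, that is x = [1, 0, 0, 0] (an input chosen here purely for illustration). Only the n = 0 term contributes, so every output has the same value:<br />

```latex
X[k] = \frac{1}{4}\, x[0] \left( \cos 0 - j \sin 0 \right) = \frac{1}{4}, \qquad k = 0, 1, 2, 3
```

The spectrum of an impulse is flat, which makes it a convenient test input for any DFT routine.<br />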
Parallelization<br />
As the equation shows, each of the N outputs requires a sum over all N input samples, so the direct<br />
DFT needs on the order of N² operations. This is very computation-intensive for large values of N,<br />
which is the case in most real-world applications. The DFT equation is parallelized by dividing the<br />
outer loop (the loop over k) into m subtasks, and these subtasks are executed on different nodes. The<br />
input signal is available in a single file. Each task is given three parameters (total number of tasks,<br />
current task number and number of samples) and processes the section of the equation determined by<br />
them. These parameters are provided in a separate file for each task, so the data file need not be<br />
partitioned.<br />
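The sequential DFT code, restricted to the block of outputs assigned to one task, is given below:<br />
#include <math.h><br />
#include <stdlib.h><br />
/* blockSize and offset are assumed to be file-scope variables set from the task parameters */<br />
int DFT(int samples,double *x1,double *y1)<br />
{<br />
long i,i2,k;<br />
double arg;<br />
double cosarg,sinarg;<br />
double *x2=NULL,*y2=NULL;<br />
x2 = (double*) malloc(samples*sizeof(double));<br />
y2 = (double*) malloc(samples*sizeof(double));<br />
if (x2 == NULL || y2 == NULL)<br />
{<br />
free(x2); // free(NULL) is a no-op, so this is safe if only one allocation failed<br />
free(y2);<br />
return(0); // allocation failure<br />
}<br />
// Outer loop covers only this task's outputs X[offset] .. X[offset+blockSize-1]<br />
for( i = offset; i < offset + blockSize; i++ )<br />
{<br />
i2 = i - offset; // index into the local result block<br />
x2[i2] = 0;<br />
y2[i2] = 0;<br />
arg = -2.0 * 3.141592654 * (double)i / (double)samples;<br />
for( k = 0; k < samples; k++ )<br />
{<br />
cosarg = cos(k * arg);<br />
sinarg = sin(k * arg);<br />
x2[i2] += (x1[k] * cosarg - y1[k] * sinarg);<br />
y2[i2] += (x1[k] * sinarg + y1[k] * cosarg);<br />
}<br />
}<br />
// Copy the data back (the 1/N factor of the equation is not applied here)<br />
for( i = 0; i < blockSize; i++ )<br />
{<br />
x1[i] = x2[i];<br />
y1[i] = y2[i];<br />
}<br />
free(x2);<br />
free(y2);<br />
return(1); // success<br />
}<br />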
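where blockSize and offset are determined by the TotalTasks and CurrentTask parameters as<br />
follows:<br />
blockSize = samples / TotalTasks;<br />
offset = CurrentTask * blockSize;<br />
Because this is an integer division, samples must be a multiple of TotalTasks; otherwise the last<br />
samples % TotalTasks outputs are not computed by any task.<br />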
Execution Procedure<br />
The following files are required to parallelize the DFT computation of a 1024-sample data file<br />
on 4 grid nodes:<br />
1. data.dat : Input data file<br />
2. param0.dat : Parameter file for task 0<br />
3. param1.dat : Parameter file for task 1<br />
4. param2.dat : Parameter file for task 2<br />
5. param3.dat : Parameter file for task 3<br />
6. metafile.txt : Configuration file that specifies the arguments to each subtask<br />
Content of the file ‘param0.dat’<br />
4 (No of Tasks)<br />
0 (current task no)<br />
1024 (total samples)<br />
Content of the file ‘param1.dat’<br />
4 (No of Tasks)<br />
1 (current task no)<br />
1024 (total samples)
Content of the file ‘param2.dat’<br />
4 (No of Tasks)<br />
2 (current task no)<br />
1024 (total samples)<br />
Content of the file ‘param3.dat’<br />
4 (No of Tasks)<br />
3 (current task no)<br />
1024 (total samples)<br />
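A subtask could load its parameter file along the lines of the sketch below. This is an illustration, not the code of ‘dft_task.c’: the `TaskParams` struct and both function names are hypothetical, and the reader assumes one integer per line, optionally followed by an annotation such as "(No of Tasks)" that is skipped.<br />

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the three values in a parameter file. */
typedef struct {
    int total_tasks;
    int current_task;
    int samples;
} TaskParams;

/* Reads one integer from the start of a line and discards the rest of the
 * line, so a trailing annotation is tolerated.
 * Returns 1 on success, 0 on failure. */
static int read_int_line(FILE *fp, int *value)
{
    int c;
    if (fscanf(fp, "%d", value) != 1)
        return 0;
    while ((c = fgetc(fp)) != EOF && c != '\n')
        ;   /* skip to end of line */
    return 1;
}

/* Loads and validates a parameter file; returns 1 on success, 0 on error. */
int read_params(const char *path, TaskParams *p)
{
    assert(path != NULL && p != NULL);
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return 0;
    int ok = read_int_line(fp, &p->total_tasks)
          && read_int_line(fp, &p->current_task)
          && read_int_line(fp, &p->samples)
          && p->total_tasks > 0
          && p->current_task >= 0 && p->current_task < p->total_tasks
          && p->samples > 0;
    fclose(fp);
    return ok;
}
```

Validating the ranges at load time means a mistyped task number is rejected before any computation starts.<br />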
Content of the file ‘metafile.txt’<br />
4 (No of Tasks)<br />
2 (No of arguments to each task)<br />
param0.dat data.dat (arguments to task 0 )<br />
param1.dat data.dat (arguments to task 1 )<br />
param2.dat data.dat (arguments to task 2 )<br />
param3.dat data.dat (arguments to task 3 )<br />
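Since the parameter files and the meta file differ only in the task number, they can be generated instead of typed by hand. The sketch below is one possible way to do that; `write_task_files` is a hypothetical helper, it reproduces only the layout shown above, and it leaves out the parenthesized annotations because this report does not state whether Vishwa expects them in the files.<br />

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes param0.dat .. param<total_tasks-1>.dat and a
 * meta file in the layout shown above (task count, arguments per task, then
 * one line of arguments per task).
 * Returns 1 on success, 0 on any I/O error. */
int write_task_files(int total_tasks, int samples,
                     const char *data_file, const char *meta_file)
{
    assert(total_tasks > 0 && samples > 0);
    assert(data_file != NULL && meta_file != NULL);
    FILE *meta = fopen(meta_file, "w");
    if (meta == NULL)
        return 0;
    /* First two lines: number of tasks, number of arguments per task. */
    fprintf(meta, "%d\n2\n", total_tasks);

    for (int t = 0; t < total_tasks; t++) {
        char name[64];
        snprintf(name, sizeof name, "param%d.dat", t);
        FILE *fp = fopen(name, "w");
        if (fp == NULL) {
            fclose(meta);
            return 0;
        }
        /* Same three values for every task except the task number. */
        fprintf(fp, "%d\n%d\n%d\n", total_tasks, t, samples);
        fclose(fp);
        fprintf(meta, "%s %s\n", name, data_file);
    }
    return fclose(meta) == 0;
}
```

For the run described here the call would be `write_task_files(4, 1024, "data.dat", "metafile.txt")`.<br />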
The computation is then run as follows:<br />
1. Run ‘purezonalserver’ on one of the nodes.<br />
2. Set up the grid by running ‘purefaultgridnode’<br />
on each of the four nodes that will participate in the grid.<br />
3. Submit the grid task by running ‘user’.<br />
The parameters for the program ‘user’ are:<br />
• Meta file name : metafile.txt<br />
• Source file for tasks : dft_task.c<br />
• No of splits : 4<br />
• Output file : result.dat<br />
• Source file for aggregation : dft_agg.c<br />
4. The result will be available in the file ‘result.dat’ after the computation.<br />
Vishwa starts the grid computation by running ‘dft_task’ at each node with the files<br />
‘paramx.dat’ (where x is the task number) and ‘data.dat’ as arguments, as specified in the file<br />
‘metafile.txt’. Each subtask reads its ‘paramx.dat’ file to get the number of tasks, the current task<br />
number and the number of samples, computes its portion of the equation using these values and writes<br />
the result to a file. Once all the subtasks have finished, ‘dft_agg’ is executed to aggregate the partial<br />
results into the ‘result.dat’ file.<br />
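Because each subtask produces a contiguous block of the spectrum, aggregation amounts to concatenating the partial results in task order. The sketch below illustrates this under stated assumptions; it is not the code of ‘dft_agg.c’, `aggregate_results` is a hypothetical name, and the partial file names are passed in by the caller because this report does not say how Vishwa names them.<br />

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: appends the contents of each partial-result file,
 * in task order, to the output file.
 * Returns 1 on success, 0 on any I/O error. */
int aggregate_results(const char *const *part_files, int total_tasks,
                      const char *out_file)
{
    assert(part_files != NULL && total_tasks > 0 && out_file != NULL);
    FILE *out = fopen(out_file, "w");
    if (out == NULL)
        return 0;

    for (int t = 0; t < total_tasks; t++) {
        FILE *in = fopen(part_files[t], "r");
        if (in == NULL) {           /* a missing block makes the result invalid */
            fclose(out);
            return 0;
        }
        char buf[4096];
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
            if (fwrite(buf, 1, n, out) != n) {
                fclose(in);
                fclose(out);
                return 0;
            }
        }
        fclose(in);
    }
    return fclose(out) == 0;
}
```

The order of the file list is what keeps the frequency bins in sequence, so the list should be built from task numbers and not from the order in which the subtasks finish.<br />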
Sample Plots
Observations<br />
1. Vishwa could provide a way for the user to list the grid nodes participating in the<br />
computation, together with their status (busy, idle, etc.).<br />
2. Vishwa should pass the total number of tasks and the current task number to each subtask so that<br />
parallel code can be written directly. Splitting the data manually is tedious and time-consuming. Once<br />
these two parameters are available, a subtask can read only its selected region of the input data.<br />
3. When the number of splits was greater than one, the subtasks were executed on the same grid node<br />
most of the time; when they were executed on different nodes, the results were not combined<br />
properly.<br />
4. It is difficult to measure the time taken by the computation, including task distribution and<br />
result aggregation.<br />
5. Vishwa could allow multiple source files, including C++ (.cpp) files, to be submitted.<br />
6. The 'user' program could read its parameters (meta file, subtask file, result file, etc.) from a file<br />
to avoid re-entering them for every run.<br />
7. Displaying the entire result file on the screen can be avoided.<br />
References<br />
1. Digital Signal Processing, Alan V. Oppenheim and Ronald W. Schafer<br />
2. http://dos.iitm.ac.in