21.07.2015

"TMS320C55x DSP Library DSPLIB Programmer's Reference"

mmul

Description
    This function multiplies two matrices.

Algorithm
    Multiply input matrix A (M by N) by input matrix B (N by P) using nested loops:

    for i = 1 to M
        for k = 1 to P
        {
            temp = 0
            for j = 1 to N
                temp = temp + A(i,j) * B(j,k)
            C(i,k) = temp
        }

Overflow Handling Methodology
    Not applicable

Special Requirements
    Verify that the dimensions of the input matrices are legal, i.e. col1 == row2.

Implementation Notes
    To take advantage of the dual-MAC architecture of the C55x, this implementation checks the size of the matrix x1. For small matrices x1 (row1 < 4 or col1 < 2), single-MAC loops are used. For larger matrices x1 (row1 ≥ 4 and col1 ≥ 2), dual-MAC loops are more efficient and quickly make up for the additional initialization overhead.

Example
    See examples/mmul subdirectory

Benchmarks (preliminary)

    Cycles †
        Core:
            if (row1 < 4 || col1 < 2), use single MAC:
                ((col1 + 2) * row1 + 4) * col2
            if (row1 is even && row1 ≥ 4 && col1 ≥ 2), use dual MAC:
                ((col1 + 4) * 0.5 * row1 + 10) * col2
            if (row1 is odd && row1 ≥ 4 && col1 ≥ 2), use dual MAC:
                ((col1 + 4) * 0.5 * (row1 - 1) + col1 + 12) * col2
        Overhead: 30

    Code size (in bytes): 215

    † Assumes all data is in on-chip dual-access RAM and that there is no bus conflict due to twiddle table reads and instruction fetches (provided linker command file reflects those conditions).
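The algorithm and the preliminary core-cycle formulas above can be sketched in portable C. This is a minimal illustration, not the DSPLIB implementation: the names mmul_ref and mmul_core_cycles are hypothetical, the matrices are assumed to be stored row-major, and the products are accumulated in a 64-bit integer rather than modelling the library's fixed-point behaviour. The cycle helper only evaluates the three core formulas and leaves out the fixed overhead.

```c
#include <stdint.h>

/* Hypothetical reference routine (NOT the DSPLIB mmul signature).
 * Multiplies row-major x1 (row1 x col1) by row-major x2 (row2 x col2)
 * and stores the row1 x col2 result in r.
 * Returns 0 on success, -1 if the dimensions are illegal (col1 != row2). */
int mmul_ref(const int16_t *x1, int row1, int col1,
             const int16_t *x2, int row2, int col2,
             int64_t *r)
{
    if (col1 != row2)
        return -1;

    for (int i = 0; i < row1; i++) {
        for (int k = 0; k < col2; k++) {
            int64_t temp = 0;
            /* Inner product of row i of x1 with column k of x2. */
            for (int j = 0; j < col1; j++)
                temp += (int64_t)x1[i * col1 + j] * x2[j * col2 + k];
            r[i * col2 + k] = temp;
        }
    }
    return 0;
}

/* Hypothetical helper: evaluates the preliminary core-cycle formulas
 * for a given matrix size. The fixed overhead is not included. */
long mmul_core_cycles(int row1, int col1, int col2)
{
    if (row1 < 4 || col1 < 2)            /* single-MAC case */
        return ((long)(col1 + 2) * row1 + 4) * col2;
    if (row1 % 2 == 0)                   /* dual MAC, even row1 */
        return ((long)(col1 + 4) * row1 / 2 + 10) * col2;
    /* dual MAC, odd row1: (row1 - 1) is even, so the division is exact */
    return ((long)(col1 + 4) * (row1 - 1) / 2 + col1 + 12) * col2;
}
```

Under these assumptions, multiplying a 2 by 3 matrix by a 3 by 2 matrix falls in the single-MAC case, while a 4 by 4 or 5 by 4 matrix x1 falls in the even and odd dual-MAC cases respectively.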
