Quick introduction to reverse engineering for beginners
    mov esi, [esp+10h+ar1]
    mov edi, [esp+10h+ar2]
    xor ecx, ecx
loc_14D: ; CODE XREF: f(int,int *,int *,int *)+159
    mov ebx, [esi+ecx*4]
    add ebx, [edi+ecx*4]
    mov [eax+ecx*4], ebx
    inc ecx
    cmp ecx, edx
    jb short loc_14D
loc_15B: ; CODE XREF: f(int,int *,int *,int *)+A
    ; f(int,int *,int *,int *)+129 ...
    xor eax, eax
    pop ecx
    pop ebx
    pop esi
    pop edi
    retn
; ---------------------------------------------------------------------------
loc_162: ; CODE XREF: f(int,int *,int *,int *)+8C
    ; f(int,int *,int *,int *)+9F
    xor ecx, ecx
    jmp short loc_127
?f@@YAHHPAH00@Z endp
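The scalar loop at loc_14D can be read back into C roughly as follows. This is only a sketch: the name add_scalar is hypothetical (the original function is f), and it assumes that edx holds the element count and eax holds ar3, which the visible part of the listing suggests but does not prove. The zero-count guard is also an assumption, since the corresponding check sits outside the fragment shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C rendering of the scalar loop at loc_14D.
   Assumed register roles: esi = ar1, edi = ar2, eax = ar3,
   ecx = index, edx = element count. */
int add_scalar(unsigned count, const int *ar1, const int *ar2, int *ar3)
{
    assert(ar1 != NULL && ar2 != NULL && ar3 != NULL);
    if (count == 0)          /* assumed: checked before the loop is entered */
        return 0;
    unsigned i = 0;          /* xor ecx, ecx */
    do {
        /* mov ebx, [esi+ecx*4] / add ebx, [edi+ecx*4] / mov [eax+ecx*4], ebx
           The unsigned casts mirror the wrap-around of the ADD instruction. */
        ar3[i] = (int)((unsigned)ar1[i] + (unsigned)ar2[i]);
        i++;                 /* inc ecx */
    } while (i < count);     /* cmp ecx, edx / jb: unsigned comparison */
    return 0;                /* xor eax, eax before retn */
}
```

The do/while shape matches the listing, where the comparison sits at the bottom of the loop and JB treats both operands as unsigned.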
SSE2-related instructions are:
∙ MOVDQU (Move Unaligned Double Quadword): it simply loads 16 bytes from memory into an XMM register.
∙ PADDD (Add Packed Integers): adds 4 pairs of 32-bit numbers and leaves the result in the first operand.
By the way, no exception is raised in case of overflow and no flags are set; only the low 32 bits of each result
are stored. If one of the PADDD operands is the address of a value in memory, the address must be aligned on a
16-byte boundary. If it is not aligned, an exception will be raised (see footnote 71).
∙ MOVDQA (Move Aligned Double Quadword): the same as MOVDQU, but requires the address of the value in
memory to be aligned on a 16-byte boundary. If it is not aligned, an exception will be raised. MOVDQA may work
faster than MOVDQU, but requires the aforesaid alignment.
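The two properties described in the list, wrap-around addition in four independent 32-bit lanes and the 16-byte alignment requirement, can be modelled in portable C. This is a sketch rather than real SSE2 code: the names is_aligned16 and paddd_model are hypothetical, and the pointer-to-integer cast assumes a flat address space, as on mainstream x86 systems.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p lies on a 16-byte boundary, i.e. the low 4 bits
   of its address are zero. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Portable model of PADDD: four independent 32-bit additions.
   Each lane keeps only the low 32 bits of its sum; nothing is
   carried into the neighbouring lane and no flags are involved. */
void paddd_model(uint32_t dst[4], const uint32_t src[4])
{
    assert(dst != NULL && src != NULL);
    for (int lane = 0; lane < 4; lane++)
        dst[lane] += src[lane];   /* unsigned arithmetic wraps modulo 2^32 */
}
```

For instance, adding 1 to a lane holding 0xFFFFFFFF leaves 0 in that lane and does not disturb the other three.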
So, these SSE2 instructions will be executed only if there are more than 4 pairs to work on and the pointer
ar3 is aligned on a 16-byte boundary.
Moreover, if ar2 is aligned on a 16-byte boundary too, this piece of code will be executed:
movdqu xmm0, xmmword ptr [ebx+edi*4] ; ar1+i*4<br />
paddd xmm0, xmmword ptr [esi+edi*4] ; ar2+i*4<br />
movdqa xmmword ptr [eax+edi*4], xmm0 ; ar3+i*4<br />
Otherwise, the value from ar2 will be loaded into XMM0 using MOVDQU, which does not require an aligned pointer but
may work slower:
movdqu xmm1, xmmword ptr [ebx+edi*4] ; ar1+i*4<br />
movdqu xmm0, xmmword ptr [esi+edi*4] ; ar2+i*4 is not 16-byte aligned, so load it to xmm0
paddd xmm1, xmm0<br />
movdqa xmmword ptr [eax+edi*4], xmm1 ; ar3+i*4<br />
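The choice between the two variants can be sketched in C with SSE2 intrinsics. This is not the source that produced the listing: add_arrays and aligned16 are hypothetical names, the code assumes an x86 target where <emmintrin.h> is available, and it follows the simplified conditions stated in the text (more than 4 pairs, ar3 aligned). The compiler remains free to pick the actual instructions, for example by folding the aligned load into PADDD's memory operand.

```c
#include <emmintrin.h>   /* SSE2 intrinsics; assumes an x86 target */
#include <stddef.h>
#include <stdint.h>

static int aligned16(const void *p) { return ((uintptr_t)p & 15u) == 0; }

/* Hypothetical sketch: ar3[i] = ar1[i] + ar2[i], four ints at a time
   when the conditions described in the text hold. */
int add_arrays(size_t count, const int *ar1, const int *ar2, int *ar3)
{
    size_t i = 0;
    if (count > 4 && aligned16(ar3)) {
        if (aligned16(ar2)) {
            /* ar2 aligned: it may be read with an aligned access */
            for (; i + 4 <= count; i += 4) {
                __m128i a = _mm_loadu_si128((const __m128i *)(ar1 + i)); /* MOVDQU */
                __m128i b = _mm_load_si128((const __m128i *)(ar2 + i));  /* aligned read */
                _mm_store_si128((__m128i *)(ar3 + i), _mm_add_epi32(a, b)); /* PADDD, MOVDQA */
            }
        } else {
            /* ar2 not aligned: load it with MOVDQU first, then add registers */
            for (; i + 4 <= count; i += 4) {
                __m128i a = _mm_loadu_si128((const __m128i *)(ar1 + i)); /* MOVDQU */
                __m128i b = _mm_loadu_si128((const __m128i *)(ar2 + i)); /* MOVDQU */
                _mm_store_si128((__m128i *)(ar3 + i), _mm_add_epi32(a, b)); /* PADDD, MOVDQA */
            }
        }
    }
    /* scalar tail for the remaining elements (or for the whole array) */
    for (; i < count; i++)
        ar3[i] = (int)((unsigned)ar1[i] + (unsigned)ar2[i]);
    return 0;
}
```

Testing the alignment of ar2 once, before the loop, is sufficient because each iteration advances the pointer by exactly 16 bytes, so its alignment never changes.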
71. More about data alignment: Wikipedia, "Data structure alignment".