23.03.2013 Views

Quick introduction to reverse engineering for beginners


        mov     esi, [esp+10h+ar1]
        mov     edi, [esp+10h+ar2]
        xor     ecx, ecx

loc_14D:                                ; CODE XREF: f(int,int *,int *,int *)+159
        mov     ebx, [esi+ecx*4]
        add     ebx, [edi+ecx*4]
        mov     [eax+ecx*4], ebx
        inc     ecx
        cmp     ecx, edx
        jb      short loc_14D

loc_15B:                                ; CODE XREF: f(int,int *,int *,int *)+A
                                        ; f(int,int *,int *,int *)+129 ...
        xor     eax, eax
        pop     ecx
        pop     ebx
        pop     esi
        pop     edi
        retn
; ---------------------------------------------------------------------------

loc_162:                                ; CODE XREF: f(int,int *,int *,int *)+8C
                                        ; f(int,int *,int *,int *)+9F
        xor     ecx, ecx
        jmp     short loc_127
?f@@YAHHPAH00@Z endp
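The scalar loop at loc_14D can be read back into C roughly as follows. This is only a sketch inferred from the listing above: the name f_scalar and the parameter names are assumptions, and the mapping of eax to ar3 and edx to sz is inferred from how the registers are used, since that part of the function is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reconstruction of the non-vectorized loop (loc_14D):
   one 32-bit addition per iteration. */
int f_scalar(int sz, const int *ar1, const int *ar2, int *ar3)
{
    assert(ar1 != NULL && ar2 != NULL && ar3 != NULL);

    for (int i = 0; i < sz; i++) {
        /* mov ebx, [esi+ecx*4]
           add ebx, [edi+ecx*4]
           mov [eax+ecx*4], ebx */
        ar3[i] = ar1[i] + ar2[i];
    }

    return 0;   /* xor eax, eax before retn */
}
```

Note that the compiled loop tests its condition at the bottom (cmp ecx, edx / jb), so the check for an empty range must have been done earlier in the function.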

SSE2-related instructions are:

∙ MOVDQU (Move Unaligned Double Quadword): it just loads 16 bytes from memory into an XMM register.

∙ PADDD (Add Packed Integers): adds 4 pairs of 32-bit numbers and leaves the result in the first operand. By the way, no exception is raised in case of overflow and no flags are set; only the low 32 bits of each result are stored. If one of the PADDD operands is the address of a value in memory, the address must be aligned on a 16-byte boundary. If it is not aligned, an exception will be raised (see footnote 71).

∙ MOVDQA (Move Aligned Double Quadword): the same as MOVDQU, but requires the address of the value in memory to be aligned on a 16-byte boundary. If it is not aligned, an exception will be raised. MOVDQA works faster than MOVDQU, but requires the aforesaid alignment.
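The overflow behaviour of PADDD described above can be modelled in plain C. This is a hypothetical scalar model, not the instruction itself: the function name paddd_model is an assumption, and each XMM register is represented as an array of four unsigned 32-bit lanes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of PADDD: four independent 32-bit additions.
   The result is left in the first operand (dst), as with the instruction.
   No carry crosses from one lane into the next. */
void paddd_model(uint32_t dst[4], const uint32_t src[4])
{
    assert(dst != NULL && src != NULL);

    for (int lane = 0; lane < 4; lane++) {
        /* Unsigned 32-bit addition wraps modulo 2^32, so only the
           low 32 bits of each sum are kept. */
        dst[lane] += src[lane];
    }
}
```

For example, a lane holding 0xFFFFFFFF plus a lane holding 1 yields 0 in that lane, and the neighbouring lanes are unaffected.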

So, these SSE2 instructions will be executed only if there are more than 4 pairs to work on and the pointer ar3 is aligned on a 16-byte boundary.
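The condition just described can be written down as a small C check. This is a sketch under stated assumptions: the helper names is_aligned16 and sse2_path_taken are hypothetical, and the threshold of more than 4 pairs is taken from the text rather than from the part of the listing shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A pointer is 16-byte aligned when the low 4 bits of its address are zero. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical summary of the dispatch: the vectorized path is taken only
   when there are more than 4 pairs and the destination is 16-byte aligned. */
int sse2_path_taken(int sz, const int *ar3)
{
    assert(ar3 != NULL);
    return sz > 4 && is_aligned16(ar3);
}
```

A pointer that is 4 bytes past an aligned address, such as the address of the second int in an aligned array, fails the check.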

Moreover, if ar2 is aligned on a 16-byte boundary too, this piece of code will be executed:

        movdqu  xmm0, xmmword ptr [ebx+edi*4]   ; ar1+i*4
        paddd   xmm0, xmmword ptr [esi+edi*4]   ; ar2+i*4
        movdqa  xmmword ptr [eax+edi*4], xmm0   ; ar3+i*4

Otherwise, the value from ar2 will be loaded into XMM0 using MOVDQU, which does not require an aligned pointer but may work slower:

        movdqu  xmm1, xmmword ptr [ebx+edi*4]   ; ar1+i*4
        movdqu  xmm0, xmmword ptr [esi+edi*4]   ; ar2+i*4 is not 16-byte aligned, so load it to xmm0
        paddd   xmm1, xmm0
        movdqa  xmmword ptr [eax+edi*4], xmm1   ; ar3+i*4
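The two variants can be sketched with SSE2 compiler intrinsics. This is an illustration only, assuming an x86 target with SSE2 and a compiler that provides emmintrin.h; the function name add4 is hypothetical, and the instructions a compiler actually emits for these intrinsics may differ from the listing (for instance, it may or may not fold the aligned load into the PADDD memory operand).

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Hypothetical helper: adds four 32-bit integers from src1 and src2 and
   stores them to dst. dst must be 16-byte aligned, as MOVDQA requires. */
void add4(const int *src1, const int *src2, int *dst)
{
    assert(((uintptr_t)dst & 15u) == 0);

    /* ar1 is always read with an unaligned load (MOVDQU). */
    __m128i x = _mm_loadu_si128((const __m128i *)src1);

    __m128i y;
    if (((uintptr_t)src2 & 15u) == 0) {
        /* Aligned ar2: an aligned load is allowed. */
        y = _mm_load_si128((const __m128i *)src2);
    } else {
        /* Unaligned ar2: it has to go through MOVDQU first. */
        y = _mm_loadu_si128((const __m128i *)src2);
    }

    /* PADDD, then an aligned store (MOVDQA) to the destination. */
    _mm_store_si128((__m128i *)dst, _mm_add_epi32(x, y));
}
```

Both branches produce the same sums; the difference is only in which load is legal for the given address.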

71 More about data alignment: Wikipedia, "Data structure alignment".
