23.03.2013 Views

Quick introduction to reverse engineering for beginners

Quick introduction to reverse engineering for beginners

Quick introduction to reverse engineering for beginners

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

C/C++ standard defines char type as signed. If we have two values, one is char and another is int,<br />

(int is signed <strong>to</strong>o), and if first value contain -2 (it is coded as 0xFE) and we just copying this byte in<strong>to</strong> int<br />

container, there will be 0x000000FE, and this, from the point of signed int view is 254, but not -2. In signed<br />

int, -2 is coded as 0xFFFFFFFE. So if we need <strong>to</strong> transfer 0xFE value from variable of char type <strong>to</strong> int, we<br />

need <strong>to</strong> identify its sign and extend it. That is what MOVSX 1.10 does.<br />

See also in section “Signed number representations” 2.4.<br />

I’m not sure if compiler need <strong>to</strong> s<strong>to</strong>re char variable in EDX, it could take 8-bit register part (let’s say DL).<br />

But probably, compiler’s register alloca<strong>to</strong>r 21 works like that.<br />

Then we see TEST EDX, EDX. About TEST instruction, read more in section about bit fields 1.14. But<br />

here, this instruction just checking EDX value, if it is equals <strong>to</strong> 0.<br />

Let’s try GCC 4.4.1:<br />

public strlen<br />

strlen proc near<br />

eos = dword ptr -4<br />

arg_0 = dword ptr 8<br />

push ebp<br />

mov ebp, esp<br />

sub esp, 10h<br />

mov eax, [ebp+arg_0]<br />

mov [ebp+eos], eax<br />

loc_80483F0:<br />

mov eax, [ebp+eos]<br />

movzx eax, byte ptr [eax]<br />

test al, al<br />

setnz al<br />

add [ebp+eos], 1<br />

test al, al<br />

jnz short loc_80483F0<br />

mov edx, [ebp+eos]<br />

mov eax, [ebp+arg_0]<br />

mov ecx, edx<br />

sub ecx, eax<br />

mov eax, ecx<br />

sub<br />

leave<br />

retn<br />

eax, 1<br />

strlen endp<br />

The result almost the same as MSVC did, but here we see MOVZX isntead of MOVSX 1.10. MOVZX mean<br />

MOV with Zero-Extent. This instruction place 8-bit or 16-bit value in<strong>to</strong> 32-bit register and set the rest bits<br />

<strong>to</strong> zero. In fact, this instruction is handy only because it is able <strong>to</strong> replace two instructions at once: xor<br />

eax, eax / mov al, [...].<br />

On other hand, it is obvious <strong>to</strong> us that compiler could produce that code: mov al, byte ptr [eax] /<br />

test al, al — it is almost the same, however, highest EAX register bits will contain random noise. But let’s<br />

think it is compiler’s drawback — it can’t produce more understandable code. Strictly speaking, compiler<br />

is not obliged <strong>to</strong> emit understandable (<strong>to</strong> humans) code at all.<br />

Next new instruction <strong>for</strong> us is SETNZ. Here, if AL contain not zero, test al, al will set zero <strong>to</strong> ZF flag,<br />

but SETNZ, if ZF==0 (NZ mean not zero) will set 1 <strong>to</strong> AL. Speaking in natural language, if AL is not zero,<br />

let’s jump <strong>to</strong> loc_80483F0. Compiler emitted slightly redundant code, but let’s not <strong>for</strong>get that optimization<br />

21 compiler’s function assigning local variables <strong>to</strong> CPU registers<br />

32

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!