Quick introduction to reverse engineering for beginners

Recommendations

Info

1.10 strlen() Now let’s talk about loops one more time. Often, strlen() function 20 is implemented using while() statement. Here is how it’s done in MSVC standard libraries: int strlen (const char * str) { const char *eos = str; } while( *eos++ ) ; return( eos - str - 1 ); Let’s compile: _eos$ = -4 ; size = 4 _str$ = 8 ; size = 4 _strlen PROC push ebp mov ebp, esp push ecx mov eax, DWORD PTR _str$[ebp] ; place pointer to string from str mov DWORD PTR _eos$[ebp], eax ; place it to local varuable eos $LN2@strlen_: mov ecx, DWORD PTR _eos$[ebp] ; ECX=eos ; take 8-bit byte from address in ECX and place it as 32-bit value to EDX with sign extension movsx edx, BYTE PTR [ecx] mov eax, DWORD PTR _eos$[ebp] ; EAX=eos add eax, 1 ; increment EAX mov DWORD PTR _eos$[ebp], eax ; place EAX back to eos test edx, edx ; EDX is zero? je SHORT $LN1@strlen_ ; yes, then finish loop jmp SHORT $LN2@strlen_ ; continue loop $LN1@strlen_: ; here we calculate the difference between two pointers mov eax, DWORD PTR _eos$[ebp] sub eax, DWORD PTR _str$[ebp] sub eax, 1 ; subtract 1 and return result mov esp, ebp pop ebp ret 0 _strlen_ ENDP Two new instructions here: MOVSX 1.10 and TEST. About first: MOVSX 1.10 is intended to take byte from some place and store value in 32-bit register. MOVSX 1.10 meaning MOV with Sign-Extent. Rest bits starting at 8th till 31th MOVSX 1.10 will set to 1 if source byte in memory has minus sign or to 0 if plus. And here is why all this. 20 counting characters in string in C language 31
C/C++ standard defines char type as signed. If we have two values, one is char and another is int, (int is signed too), and if first value contain -2 (it is coded as 0xFE) and we just copying this byte into int container, there will be 0x000000FE, and this, from the point of signed int view is 254, but not -2. In signed int, -2 is coded as 0xFFFFFFFE. So if we need to transfer 0xFE value from variable of char type to int, we need to identify its sign and extend it. That is what MOVSX 1.10 does. See also in section “Signed number representations” 2.4. I’m not sure if compiler need to store char variable in EDX, it could take 8-bit register part (let’s say DL). But probably, compiler’s register allocator 21 works like that. Then we see TEST EDX, EDX. About TEST instruction, read more in section about bit fields 1.14. But here, this instruction just checking EDX value, if it is equals to 0. Let’s try GCC 4.4.1: public strlen strlen proc near eos = dword ptr -4 arg_0 = dword ptr 8 push ebp mov ebp, esp sub esp, 10h mov eax, [ebp+arg_0] mov [ebp+eos], eax loc_80483F0: mov eax, [ebp+eos] movzx eax, byte ptr [eax] test al, al setnz al add [ebp+eos], 1 test al, al jnz short loc_80483F0 mov edx, [ebp+eos] mov eax, [ebp+arg_0] mov ecx, edx sub ecx, eax mov eax, ecx sub leave retn eax, 1 strlen endp The result almost the same as MSVC did, but here we see MOVZX isntead of MOVSX 1.10. MOVZX mean MOV with Zero-Extent. This instruction place 8-bit or 16-bit value into 32-bit register and set the rest bits to zero. In fact, this instruction is handy only because it is able to replace two instructions at once: xor eax, eax / mov al, [...]. On other hand, it is obvious to us that compiler could produce that code: mov al, byte ptr [eax] / test al, al — it is almost the same, however, highest EAX register bits will contain random noise. But let’s think it is compiler’s drawback — it can’t produce more understandable code. Strictly speaking, compiler is not obliged to emit understandable (to humans) code at all. Next new instruction for us is SETNZ. Here, if AL contain not zero, test al, al will set zero to ZF flag, but SETNZ, if ZF==0 (NZ mean not zero) will set 1 to AL. Speaking in natural language, if AL is not zero, let’s jump to loc_80483F0. Compiler emitted slightly redundant code, but let’s not forget that optimization 21 compiler’s function assigning local variables to CPU registers 32
Page 1 and 2: Quick introduction to reverse engin
Page 3 and 4: 1.16 C++ classes . . . . . . . . .
Page 5 and 6: Preface Here (will be) some of my n
Page 7 and 8: 1.1 Hello, world! Let’s start wit
Page 9 and 10: 1.2 Stack Stack — is one of the m
Page 11 and 12: (_snprintf() function works just li
Page 13 and 14: 1.3 printf() with several arguments
Page 15 and 16: 1.4 scanf() Now let’s use scanf()
Page 17 and 18: GCC replaced first printf() call to
Page 19 and 20: }; return 0; printf ("What you ente
Page 21 and 22: 1.5 Passing arguments via stack Now
Page 23 and 24: 1.6 One more word about results ret
Page 25 and 26: push OFFSET $SG741 ; ’a
Page 27 and 28: 1.8 switch()/case/default 1.8.1 Few
Page 29 and 30: 1.8.2 A lot of cases If switch() st
Page 31 and 32: loc_804840C: ; DATA XREF: .rodata:0
Page 33 and 34: et 0 _main ENDP Nothing very specia
Page 35: add esp, 1Ch xor eax, eax ; return
Page 39 and 40: eax=eax+ecx return eax ... and this
Page 41 and 42: mov eax, ecx imul edx sar edx, 1 mo
Page 43 and 44: ; current stack state: ST(0) = resu
Page 45 and 46: ; in local stack here 8 bytes still
Page 47 and 48: fld QWORD PTR _a$[esp-4] ; current
Page 49 and 50: In other words, fnstsw ax / sahf in
Page 51 and 52: 1.13 Arrays Array is just a set of
Page 53 and 54: mov [esp+70h+var_70], eax call _pri
Page 55 and 56: jge SHORT $LN1@main mov ecx, DWORD
Page 57 and 58: ebp 0x15 0x15 esi 0x0 0 edi 0x0 0 e
Page 59 and 60: 1.14 Bit fields A lot of functions
Page 61 and 62: So, let’s download Linux Kernel 2
Page 63: There are some redundant code prese
Page 66 and 67: y 10 just by dropping last digit (3
Page 68 and 69: _key$ = 8 ; size = 4 _len$ = 12 ; s
Page 70 and 71: 1.15 Structures It can be defined t
Page 72 and 73: call DWORD PTR __imp__GetSystemTime
Page 74 and 75: 1.15.4 Fields packing in structure
Page 76 and 77: }; int b; struct outer_struct { cha
Page 78 and 79: }; printf ("stepping=%d\n", tmp->st
Page 80 and 81: call ___printf_chk mov eax, esi shr
Page 82 and 83: mov edx, DWORD PTR _t$[ebp] or edx,
Page 84 and 85: 1.16 C++ classes I placed a C++ cla
Page 86 and 87:
push ecx mov DWORD PTR _this$[ebp],
Page 88 and 89:
call _ZN1c4dumpEv lea eax, [esp+20h
Page 90 and 91:
1.17 Unions 1.17.1 Pseudo-random nu
Page 92 and 93:
xor eax, eax pop esi mov esp, ebp p
Page 94 and 95:
Let’s compile it in MSVC 2010 (I
Page 96 and 97:
comp() function: public comp comp p
Page 98 and 99:
Some compilers can do vectorization
Page 100 and 101:
mov ecx, [esp+10h+var_10] mov edx,
Page 102 and 103:
GCC In all other cases, non-SSE2 co
Page 104 and 105:
add ebx, edx lea esi, [esi+0] loc_8
Page 106 and 107:
movdqa xmm1, XMMWORD PTR [ecx+16] a
Page 108 and 109:
1.20 x86-64 It’s a 64-bit extensi
Page 110 and 111:
} x40 = a3 | x31; x41 = x24 & ~x37;
Page 112 and 113:
or ebx, DWORD PTR _x6$[esp+36] mov
Page 114 and 115:
and r11, r8 and rsi, r8 and r10, r1
Page 116 and 117:
pop rdi pop rsi ret 0 s1 ENDP Nothi
Page 118 and 119:
2.1 LEA instruction LEA (Load Effec
Page 120 and 121:
2.3 npad It’s an assembler macro
Page 122 and 123:
2.4 Signed number representations T
Page 124 and 125:
call function function: .. do somet
Page 126 and 127:
3.2 String Debugging messages are o
Page 128 and 129:
.text:3011E91B DD 1E fstp qword ptr
Page 130 and 131:
loc_80483F2: pop ebp retn _f endp 4
Page 132 and 133:
loc_8048444: loc_8048448: loc_80484
Page 134 and 135:
public f2 f2 proc near push ebp mov
Page 136 and 137:
loc_804840A: loc_804842E: loc_80484
Page 138 and 139:
mov esi, DWORD PTR _k$[esp+16] push
Page 140 and 141:
or eax, edx retn f endp 4.1.8 Task
Page 142 and 143:
4.2 Middle level 4.2.1 Task 2.1 Wel
Page 144 and 145:
mov [edx+eax], bl mov ebx, edi mov
Page 146 and 147:
loc_80485AB: ; CODE XREF: f+1E0 ; f
Page 148 and 149:
mov [ecx+eax], dl inc eax cmp ebx,
Page 150 and 151:
Chapter 5 Tools ∙ IDA as disassem
Page 152 and 153:
Chapter 7 More examples 147
Page 154 and 155:
.text:00541050 arg_0 = dword ptr 4
Page 156 and 157:
.text:005410F3 .text:005410F3 loc_5
Page 158 and 159:
.text:005411A9 pop ebx .text:005411
Page 160 and 161:
.text:00541249 .text:00541249 next_
Page 162 and 163:
.text:005412FC mov esi, offset cube
Page 164 and 165:
.text:005413DC call _fwrite ; write
Page 166 and 167:
.text:005414A7 push ecx ; Filename
Page 168 and 169:
.text:005413A3 mov ecx, [esp+44h+pa
Page 170 and 171:
.text:00541408 push offset aRb ; "r
Page 172 and 173:
}; }; return; fseek (f, 0, SEEK_END
Page 174 and 175:
.text:005411B5 mov ebp, eax Check f
Page 176 and 177:
.text:0054124C inc ebp .text:005412
Page 178 and 179:
This instruction clears bit, in oth
Page 180 and 181:
.text:005410CD add esp, 40h .text:0
Page 182 and 183:
}; for (i=0; i
Page 184 and 185:
}; tmp[y][z]=get_bit (row, y, z); f
Page 186 and 187:
}; fread (buf, flen, 1, f); fclose
Page 188 and 189:
7.2 SAP client network traffic comp
Page 190 and 191:
.text:64413F68 .text:64413F68 loc_6
Page 192 and 193:
.text:64405171 call dbg .text:64405
Page 194 and 195:
.text:64404FF7 push offset aLogging
Page 196 and 197:
.text:64405113 loc_64405113: .text:
Page 198 and 199:
Now let’s dig deeper and find con
Page 200 and 201:
.text:64411E0B .text:64411E0B loc_6
Page 202 and 203:
Chapter 8 Other things 8.1 Compiler
Page 204 and 205:
Chapter 9 Tasks solutions 9.1 Easy
Page 206 and 207:
9.1.5 Task 1.5 Hint #1: Keep in min
Page 208 and 209:
} 9.1.8 Task 1.8 Solution: two 100*
Page 210 and 211:
#define PUSH(low, high) ((void) ((t
Page 212 and 213:
} } ignore one or both. Otherwise,
show all

Quick introduction to reverse engineering for beginners

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?