
SQL Injection Attacks and Defense - 2009


Code-Level Defenses • Chapter 8

Table 8.6 Example Single-Quote Representations

Representation    Type of encoding
%27               URL encoding
%2527             Double URL encoding
%%317             Nested double URL encoding
%u0027            Unicode representation
%u02b9            Unicode representation
%ca%b9            Unicode representation
&apos;            HTML entity
&#39;             Decimal HTML entity
&#x27;            Hexadecimal HTML entity
%26apos;          Mixed URL/HTML encoding

Some of these are alternative encodings of the character (%27 is the URL-encoded
representation of the single quote). Others are double-encoded on the assumption that
the application will explicitly decode the data (%2527 becomes %27 when URL-decoded,
as shown in Table 8.6, as does %%317), and others still are various Unicode
representations, either valid or invalid. Not all of these representations are normally
interpreted as a single quote; in most cases, they rely on certain conditions being in
place (such as decoding at the application, application server, WAF, or Web server
level), so it is very difficult to predict whether your application will interpret them
this way.
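
To make the layering concrete, the following is a minimal sketch in Python (the language is our choice for illustration, not something the text prescribes). It uses only the standard library functions urllib.parse.unquote and html.unescape to show how several of the Table 8.6 representations behave under a single decoding pass. The variable names are illustrative assumptions.

```python
import html
from urllib.parse import unquote

# One URL-decoding pass removes exactly one layer of percent-encoding.
once = unquote("%2527")   # "%25" decodes to "%", leaving "%27"
twice = unquote(once)     # a second pass finally yields the single quote

# HTML entities need an HTML decoder; URL decoding leaves them alone.
named = html.unescape("&apos;")
decimal = html.unescape("&#39;")
hexadecimal = html.unescape("&#x27;")

# Mixed encoding: URL-decode first (giving "&apos;"), then HTML-decode.
mixed = html.unescape(unquote("%26apos;"))

# "%ca%b9" is the UTF-8 byte sequence for U+02B9, a look-alike character
# rather than the ASCII single quote.
lookalike = unquote("%ca%b9")

# The non-standard "%u0027" form is not recognized by unquote at all.
unchanged = unquote("%u0027")

for label, value in [("once", once), ("twice", twice), ("mixed", mixed),
                     ("lookalike", lookalike), ("unchanged", unchanged)]:
    print(f"{label}: {value!r}")
```

Whether a given representation ever turns into a real single quote therefore depends on which decoders run, and how many times, between the client and the database.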

For these reasons, it is important to consider canonicalization as part of your input
validation approach. Canonicalization is the process of reducing input to a standard or
simple form. For the single-quote examples in Table 8.6, this would normally be a
single-quote character (').
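
One way to picture canonicalization is as repeated decoding until the input stops changing. The sketch below is an assumption-laden illustration in Python, not the book's implementation: the function name canonicalize and the max_rounds limit are hypothetical, and it handles only standard URL encoding and HTML entities (not the %u forms or look-alike Unicode characters from Table 8.6).

```python
import html
from urllib.parse import unquote


def canonicalize(value: str, max_rounds: int = 5) -> str:
    """Decode URL and HTML encodings repeatedly until the value is stable.

    Hypothetical helper: raises ValueError if the value is still changing
    after max_rounds passes, which suggests deliberately nested encoding.
    """
    for _ in range(max_rounds):
        decoded = html.unescape(unquote(value))
        if decoded == value:
            # Nothing left to decode: this is the canonical form.
            return value
        value = decoded
    raise ValueError("input still changing after max_rounds decoding passes")


if __name__ == "__main__":
    for sample in ["%27", "%2527", "&#39;", "%26apos;", "O'Brien"]:
        print(f"{sample!r} -> {canonicalize(sample)!r}")
```

Validation should then run against the canonical value, not the raw input. Note that decoding this aggressively can also change legitimate data that happens to contain a literal % or &, so a routine like this needs to match what the application actually expects to receive.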

Canonicalization Approaches<br />

So, what alternatives for handling unusual input should you consider? One method, which
is often the easiest to implement, is to reject all input that is not already in a
canonical format. For example, the application can refuse to accept any HTML- or
URL-encoded input. This is one of the most reliable methods in situations where you are
not expecting encoded input. It is also the approach that is often adopted by default
when you do whitelist input validation, as you may not accept unusual forms of
characters when validating for known good input. At the very least, this could mean
not accepting the characters used to encode data (such as %, &, and # from the examples
in Table 8.6), so that these characters cannot be input at all.
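
The rejection approach described above can be sketched as follows, again in Python and under stated assumptions: the helper names, the choice of a username field, and its 1-to-32-character pattern are all hypothetical and would need to reflect your own application's rules.

```python
import re

# Characters used by the encodings in Table 8.6.
ENCODING_CHARS = set("%&#")

# Hypothetical whitelist for a username field: letters, digits, underscore.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,32}")


def contains_encoding_chars(value: str) -> bool:
    """Return True if the value contains any character used to encode data."""
    return any(ch in ENCODING_CHARS for ch in value)


def is_valid_username(value: str) -> bool:
    """Whitelist check: the whole value must match the known-good pattern."""
    return USERNAME_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for sample in ["alice_01", "alice%27", "&#39;", "%26apos;"]:
        print(f"{sample!r}: encoding chars={contains_encoding_chars(sample)}, "
              f"valid username={is_valid_username(sample)}")
```

The whitelist check rejects encoded input as a side effect, because %, &, and # are simply not in the set of accepted characters. The explicit check for encoding characters is the fallback for fields where a strict whitelist is not practical.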
