
centraldevideoscomhomensmaduros.blogspot.com).

False negatives. A false negative occurs when a malicious URL goes undetected. Most false negatives were spam or phishing URLs hosted by popular social networking sites: because these sites have high link popularity and host mostly benign URLs, the malicious URLs generated features similar to those of benign URLs. More than 95% of the false negatives belonged to this case (e.g., blog.libero.it/matteof97/ and digilander.libero.it/Malvin92/?). This is discussed further in Section 4.4.

4.3 Attack Type Identification Results

To evaluate the performance of attack type identification, we used the following metrics given in [37] for multi-label classification: 1) micro-averaged and macro-averaged metrics, and 2) ranking-based metrics with respect to the ground truth of the multi-label data.

Identification metrics. We first introduce some additional notation. Assume an evaluation data set of multi-label examples $(x_i, Y_i)$, $i = 1, \ldots, m$, where $x_i$ is a feature vector, $Y_i \subseteq L$ is the set of true labels, and $L = \{\lambda_j : j = 1, \ldots, q\}$ is the set of all labels.

• Micro-averaged and macro-averaged metrics. To evaluate the average performance across multiple categories, we apply two conventional methods: micro-average and macro-average [45]. The micro-average gives an equal weight to every individual example, while the macro-average gives an equal weight to every category, regardless of its frequency. Let $tp_\lambda$, $tn_\lambda$, $fp_\lambda$, and $fn_\lambda$ denote the number of true positives, true negatives, false positives, and false negatives, respectively, after evaluating a binary classification metric $B$ (accuracy, true positives, etc.) for a label $\lambda$. The micro-averaged and macro-averaged versions of $B$ are calculated as follows (here $M$ is the number of labels, i.e., $M = q$):

$$B_{\text{micro}} = B\left(\sum_{\lambda=1}^{M} tp_\lambda,\ \sum_{\lambda=1}^{M} tn_\lambda,\ \sum_{\lambda=1}^{M} fp_\lambda,\ \sum_{\lambda=1}^{M} fn_\lambda\right),$$

$$B_{\text{macro}} = \frac{1}{M} \sum_{\lambda=1}^{M} B(tp_\lambda, tn_\lambda, fp_\lambda, fn_\lambda).$$
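To make the distinction concrete, here is a minimal sketch in plain Python of how a binary metric such as accuracy is micro- and macro-averaged from per-label confusion counts; the label names and counts are illustrative placeholders, not values from the paper.

```python
# Minimal sketch: micro- vs. macro-averaging a binary metric B
# over per-label confusion counts. All counts are made up.

def accuracy(tp, tn, fp, fn):
    """Binary metric B: accuracy from confusion counts."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical per-label counts (tp, tn, fp, fn).
counts = {
    "spam":     (80, 900, 10, 10),
    "phishing": (40, 950,  5,  5),
    "malware":  ( 5, 990,  2,  3),
}

# Micro-average: sum the counts over all labels, then apply B once.
tp, tn, fp, fn = (sum(c[i] for c in counts.values()) for i in range(4))
b_micro = accuracy(tp, tn, fp, fn)

# Macro-average: apply B per label, then take the unweighted mean.
b_macro = sum(accuracy(*c) for c in counts.values()) / len(counts)

print(f"micro: {b_micro:.4f}, macro: {b_macro:.4f}")
```

The micro-average is dominated by frequent labels (their counts dominate the sums), while the macro-average treats every label equally, which is why the two can diverge on skewed label distributions.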

• Ranking-based metrics. Among several ranking-based metrics, we employ the ranking loss and the average precision for the evaluation. Let $r_i(\lambda)$ denote the rank predicted by a label ranking method for a label $\lambda$; the most relevant label receives the highest rank, while the least relevant label receives the lowest rank. The ranking loss counts how often irrelevant labels are ranked higher than relevant labels. The ranking loss, denoted $R_{\text{loss}}$, is calculated as follows:

$$R_{\text{loss}} = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{|Y_i|\,|\overline{Y}_i|} \left|\left\{ (\lambda_a, \lambda_b) : r_i(\lambda_a) > r_i(\lambda_b),\ (\lambda_a, \lambda_b) \in Y_i \times \overline{Y}_i \right\}\right|,$$

where $\overline{Y}_i$ is the complementary set of $Y_i$ with respect to $L$. The average precision, denoted $P_{\text{avg}}$, is the average fraction of labels ranked above a particular label $\lambda \in Y_i$ that are themselves in $Y_i$. It is calculated as follows:

$$P_{\text{avg}} = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{|Y_i|} \sum_{\lambda \in Y_i} \frac{\left|\{\lambda' \in Y_i : r_i(\lambda') \leq r_i(\lambda)\}\right|}{r_i(\lambda)}.$$
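As a sanity check on the two definitions above, the following sketch computes $R_{\text{loss}}$ and $P_{\text{avg}}$ directly from predicted scores; the label names, scores, and label sets are made up for illustration, and rank 1 corresponds to the most relevant label, matching the formulas.

```python
# Sketch: ranking loss and average precision for multi-label ranking.
# Scores and label sets are illustrative; rank 1 = most relevant label.

LABELS = ["spam", "phishing", "malware"]

# Per-example: (set of true labels Y_i, predicted score per label).
data = [
    ({"spam"},             {"spam": 0.9, "phishing": 0.4, "malware": 0.1}),
    ({"spam", "phishing"}, {"spam": 0.3, "phishing": 0.8, "malware": 0.5}),
]

def ranks(scores):
    """Map each label to its rank (1 = highest score)."""
    ordered = sorted(LABELS, key=lambda l: -scores[l])
    return {l: i + 1 for i, l in enumerate(ordered)}

def ranking_loss(data):
    total = 0.0
    for y, scores in data:
        r, y_bar = ranks(scores), set(LABELS) - y
        # Count (relevant, irrelevant) pairs ranked in the wrong order.
        bad = sum(1 for a in y for b in y_bar if r[a] > r[b])
        total += bad / (len(y) * len(y_bar))
    return total / len(data)

def average_precision(data):
    total = 0.0
    for y, scores in data:
        r = ranks(scores)
        # For each true label, the fraction of labels ranked at or
        # above it that are themselves true labels.
        total += sum(
            sum(1 for lp in y if r[lp] <= r[l]) / r[l] for l in y
        ) / len(y)
    return total / len(data)

print(ranking_loss(data), average_precision(data))  # 0.25, ~0.9167
```

A perfect ranker yields $R_{\text{loss}} = 0$ and $P_{\text{avg}} = 1$; the second example above misranks spam below malware, which is what produces the nonzero loss.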

Table 10: Multi-label classification results (%)

            Label    ACC    micro TP  macro TP   Rloss   Pavg
  RAkEL     LSAd     90.70    87.55     88.51     3.45   96.87
            LWOT     90.38    88.45     89.59     4.68   93.52
            LBoth    92.79    91.23     89.04     2.88   97.66
  ML-kNN    LSAd     91.34    86.45     87.93     3.42   95.85
            LWOT     91.04    88.96     89.77     3.77   96.12
            LBoth    93.11    91.02     89.33     2.61   97.85

(micro TP and macro TP are the averaged metrics; Rloss and Pavg are the ranking-based metrics.)

Identification accuracy. We performed the multi-label classification using the three label sets LSAd, LWOT, and LBoth mentioned in Section 4.1. The results for the two learning algorithms, RAkEL and ML-kNN, are shown in Table 10, where micro TP and macro TP are the micro-averaged and macro-averaged true positives, respectively. The average accuracy was 92.95%, and the average ranking precision of the two algorithms was 97.76%. The accuracy on the label set LBoth was always higher than that on either LSAd or LWOT, which implies that a more accurate label set produces a more accurate identification of attack types.
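The paper does not specify which implementation of the two algorithms was used (both RAkEL and ML-kNN are available, for example, in the Java Mulan library). As a rough illustration only, an experiment of this shape could be sketched in Python with the scikit-multilearn package as below; the feature matrix X and the binary label matrix Y_both are random placeholders standing in for the paper's URL features and attack-type labels, and exact APIs may vary across package versions.

```python
# Illustrative sketch only: multi-label classification with RAkEL and
# ML-kNN via scikit-multilearn. X and Y_both are placeholder arrays,
# not the paper's data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from skmultilearn.ensemble import RakelD
from skmultilearn.adapt import MLkNN

rng = np.random.default_rng(0)
X = rng.random((200, 20))                           # 200 URLs x 20 features
Y_both = (rng.random((200, 4)) > 0.7).astype(int)   # 4 attack-type labels

for name, clf in [
    ("RAkEL", RakelD(base_classifier=DecisionTreeClassifier(),
                     labelset_size=3)),
    ("ML-kNN", MLkNN(k=10)),
]:
    clf.fit(X[:150], Y_both[:150])          # train on the first 150 examples
    pred = clf.predict(X[150:])             # sparse 0/1 prediction matrix
    acc = (pred.toarray() == Y_both[150:]).mean()
    print(name, f"per-label accuracy: {acc:.3f}")
```

The same loop would be repeated once per label set (LSAd, LWOT, LBoth) to reproduce the structure of Table 10.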

[Figure: bar chart of average accuracy vs. average true positives across the feature sets LEX, LPOP, CONT, DNS, DNSF, and NET; y-axis: accuracy and true positives (%), ranging from 50 to 100.]

Figure 6: Average accuracy and micro-averaged true positives (%).
