2nd USENIX Conference on Web Application Development ...

More documents

Recommendations

Info

ackend channel. This approach is easier than trying to include long polling in existing MVC frameworks but is not as clean as keeping all application logic in one framework. Event-based Web frameworks differ from other frameworks because they only run one thread at a time, eliminating the thread issues traditional MVC frameworks have when implementing long polling. Since there is only one thread, it is important that it does not block. If an expensive operation is performed, such as disk or network access, a callback is specified by the caller and the server stops processing the current request and starts processing a different request. When the operation finishes, the server runs the callback, passing in values generated by the operation. This style of programming makes it easy to resume a request at a later time as is required with Vault. Node.js [5] is an javascript event-based IO library used to write server-side javascript applications. Tornado [6] is a similar Web server for Python, based on FriendFeed [4]. 10 Conclusion Long polling allows for new interactive Web applications that respond quickly to external events. Existing implementations, however, have not been general or scalable, and they require too much glue code for application developers. We have presented a modification of the MVC pattern that allows developers to write their applications in a familiar style without needing to know about the details of how long polling is implemented. Furthermore, Vault makes it easy to develop a variety of feeds that can be reused in many different applications. Internally, the Vault implementation handles long polling in a scalable way using a distributed notifier to minimize the amount of work that is required per request. We hope that the Vault architecture will be implemented in a variety of mainstream frameworks in order to encourage the development of interesting applications based on long polling. 11 Acknowledgments This research was supported by the National Science Foundation under Grant No. 0963859. Thanks to Tomer London, Bobby Johnson, and anonymous program committee members for reviewing various drafts of this paper. References [1] Bayeux protocol. http://svn.cometd.com/trunk/ bayeux/bayeux.html. [2] Cometd homepage. http://cometd.org/. 12 [3] Django. http://www.djangoproject.com/. [4] Friendfeed homepage. http://friendfeed.com/. [5] Node.js homepage. http://nodejs.org/. [6] Tornado web server homepage. http://nodejs.org/. [7] Web socket api. http://dev.w3.org/html5/ websockets/. [8] GARRETT, J. J. Ajax: a new approach to web applications, February 2005. http://www.adaptivepath.com/ ideas/essays/archives/000385.php. [9] LAMPSON, B. W., AND REDELL, D. D. Experience with processes and monitors in mesa. Commun. ACM 23 (February 1980), 105–117. [10] OUSTERHOUT, J. Fiz: A Component Framework for Web Applications. Stanford CSD Technical Report. http://www. stanford.edu/˜ouster/cgibin/papers/fiz.pdf, 2009. [11] OUSTERHOUT, J., AND STRATMANN, E. Managing state for ajax-driven web components. <strong>USENIX</strong> <strong>Conference</strong> on Web Application Development (June 2010), 73–85. [12] REENSKAUG, T. Models-views-controllers. Xerox PARC technical note http://heim.ifi.uio.no/˜trygver/ mvc-1/1979-05-MVC.pdf, May 1979. [13] REENSKAUG, T. Thing-model-view-editor. Xerox PARC technical note http://heim.ifi.uio.no/˜trygver/ mvc-2/1979-12-MVC.pdf, May 1979. [14] RUSSEL, A. Comet: Low latency data for the browser, March 2006. http://alex.dojotoolkit.org/2006/ 03/comet-lowlatency-data-for-the-browser/. [15] STOICA, I., MORRIS, R., LIBEN-NOWELL, D., KARGER, D. R., KAASHOEK, M. F., DABEK, F., AND BALAKRISHNAN, H. Chord: a scalable peer-to-peer lookup protocol for internet applications. IEEE/ACM Trans. Netw. 11 (February 2003), 17– 32. 124 WebApps ’11: <strong>2nd</strong> <strong>USENIX</strong> <strong>Conference</strong> on Web Application Development <strong>USENIX</strong> Association
Abstract Detecting Malicious Web Links and Identifying Their Attack Types Hyunsang Choi Korea University Seoul, Korea realchs@korea.ac.kr Malicious URLs have been widely used to mount various cyber attacks including spamming, phishing and malware. Detection of malicious URLs and identification of threat types are critical to thwart these attacks. Knowing the type of a threat enables estimation of severity of the attack and helps adopt an effective countermeasure. Existing methods typically detect malicious URLs of a single attack type. In this paper, we propose method using machine learning to detect malicious URLs of all the popular attack types and identify the nature of attack a malicious URL attempts to launch. Our method uses a variety of discriminative features including textual properties, link structures, webpage contents, DNS information, and network traffic. Many of these features are novel and highly effective. Our experimental studies with 40,000 benign URLs and 32,000 malicious URLs obtained from real-life Internet sources show that our method delivers a superior performance: the accuracy was over 98% in detecting malicious URLs and over 93% in identifying attack types. We also report our studies on the effectiveness of each group of discriminative features, and discuss their evadability. 1 Introduction While the World Wide Web has become a killer application on the Internet, it has also brought in an immense risk of cyber attacks. Adversaries have used the Web as a vehicle to deliver malicious attacks such as phishing, spamming, and malware infection. For example, phishing typically involves sending an email seemingly from a trustworthy source to trick people to click a URL (Uniform Resource Locator) contained in the email that links to a counterfeit webpage. To address Web-based attacks, a great effort has been directed towards detection of malicious URLs. A common countermeasure is to use a blacklist of malicious URLs, which can be constructed from various sources, This work was done when Hyunsang Choi was an intern at Microsoft Research Asia. Contact author: Bin B. Zhu (binzhu@ieee.org). Bin B. Zhu Microsoft Research Asia Beijing, China binzhu@microsoft.com 1 Heejo Lee Korea University Seoul, Korea heejo@korea.ac.kr particularly human feedbacks that are highly accurate yet time-consuming. Blacklisting incurs no false positives, yet is effective only for known malicious URLs. It cannot detect unknown malicious URLs. The very nature of exact match in blacklisting renders it easy to be evaded. This weakness of blacklisting has been addressed by anomaly-based detection methods designed to detect unknown malicious URLs. In these methods, a classification model based on discriminative rules or features is built with either knowledge a priori or through machine learning. Selection of discriminative rules or features plays a critical role for the performance of a detector. A main research effort in malicious URL detection has focused on selecting highly effective discriminative features. Existing methods were designed to detect malicious URLs of a single attack type, such as spamming, phishing, or malware. In this paper, we propose a method using machine learning to detect malicious URLs of all the popular attack types including phishing, spamming and malware infection, and identify the attack types malicious URLs attempt to launch. We have adopted a large set of discriminative features related to textual patterns, link structures, content composition, DNS information, and network traffic. Many of these features are novel and highly effective. As described later in our experimental studies, link popularity and certain lexical and DNS features are highly discriminative in not only detecting malicious URLs but also identifying attack types. In addition, our method is robust against known evasion techniques such as redirection [42], link manipulation [16], and fast-flux hosting [17]. Identification of attack types is useful since the knowledge of the nature of a potential threat allows us to take a proper reaction as well as a pertinent and effective countermeasure against the threat. For example, we may conveniently ignore spamming but should respond immediately to malware infection. Our experiments on 40,000 benign URLs and 32,000 malicious URLs obtained from real-life Internet sources show that our method has achieved an accuracy rate of more than 98% in detecting malicious URLs and an accuracy rate <strong>USENIX</strong> Association WebApps ’11: <strong>2nd</strong> <strong>USENIX</strong> <strong>Conference</strong> on Web Application Development 125
Page 1 and 2:
Proceedings of the 2nd</str
Page 3 and 4:
USENIX Association
Page 5:
WebApps ’11: 2nd
Page 9 and 10:
GuardRails: A Data-Centric Web Appl
Page 11 and 12:
Policy Annotation Only admins can d
Page 13 and 14:
@edit, price, false, :nothing To ma
Page 15 and 16:
tags in the Project description, bu
Page 17 and 18:
String Command Result "foobar".gsub
Page 19 and 20:
Transformation Status Homepage Logi
Page 21 and 22:
Abstract Ioannis Papagiannis Imperi
Page 23 and 24:
Num. of Occurrences Type Wordpress
Page 25 and 26:
Sanitisation htmlentities() functio
Page 27 and 28:
Original expression Transformed exp
Page 29 and 30:
non-tracking code with a simple str
Page 31 and 32:
Direct request vulnerabilities. The
Page 33 and 34:
Abstract Secure Data Preservers for
Page 35 and 36:
e found for a variety of services.
Page 37 and 38:
3.2 Operational View The lifecycle
Page 39 and 40:
it automatically). A user or servic
Page 41 and 42:
given the availability of client-si
Page 43 and 44:
can only be decrypted by the base l
Page 45 and 46:
BenchLab: An Open Testbed for Reali
Page 47 and 48:
2.2. Realistic load generation Real
Page 49 and 50:
virtual machine with the software s
Page 51 and 52:
the Olio calendaring applications (
Page 53 and 54:
able to match Firefox’s optimized
Page 55 and 56:
5.4.1. Local vs remote users We obs
Page 57 and 58:
Resource Provisioning of Web Applic
Page 59 and 60:
in a directed acyclic graph [3]. Th
Page 61 and 62:
accordingly. This is the goal of in
Page 63 and 64:
Although expressing performance pro
Page 65 and 66:
Response time (ms) Response time (m
Page 67 and 68:
Response time (ms) Response time (m
Page 69 and 70:
C3: An Experimental, Extensible, Re
Page 71 and 72:
HtmlParser CssParser IHtmlParser IC
Page 73 and 74:
node during construction. This immu
Page 75 and 76:
public interface IDOMTagFactory { I
Page 77 and 78:
3.2 JS execution Point (2): Runtime
Page 79 and 80:
Extensions IE: Available from C3-eq
Page 81: extensions for web-scale use. The r
Page 84 and 85: dangerous permissions. In compariso
Page 86 and 87: Permission Popular Unpopular Plug-i
Page 88 and 89: (a) Prevalence of Dangerous permiss
Page 90 and 91: its functionality to the permission
Page 92 and 93: y ACCESS COARSE LOCATION. 358 appli
Page 94 and 95: 7 Conclusion This study contributes
Page 96 and 97: 5 discusses the pros and cons of th
Page 98 and 99: the client is free to use them how
Page 100 and 101: data it uses RESTful API which in t
Page 102 and 103: On top of Django we are using a sma
Page 104 and 105: nal design. Moreover, when the whol
Page 106 and 107: [5] The spring framework. http://ww
Page 108 and 109: of strcpy can preclude the introduc
Page 110 and 111: Integer-valued Binary Stored XSS Re
Page 112 and 113: Java Perl PHP Number of programmers
Page 114 and 115: 0 2 4 6 8 10 0 1 2 3 Team Number La
Page 116 and 117: 0 10 20 30 Number of Vulnerabilitie
Page 118 and 119: Automated black-box penetration tes
Page 121 and 122: Abstract Integrating Long Polling w
Page 123 and 124: the Web server, the framework dispa
Page 125 and 126: all() => [Record] filter(condition)
Page 127 and 128: in any future requests associated w
Page 129 and 130: ��
Page 131: quests to be put to sleep. For exam
Page 135 and 136: 3 Discriminative Features Our metho
Page 137 and 138: the total number of unique IPs and
Page 139 and 140: Table 8: Detection accuracy and tru
Page 141 and 142: centraldevideoscomhomensmaduros. bl
Page 143 and 144: Web content by executing the conten
Page 145 and 146: Maverick: Providing Web Application
Page 147 and 148: must schedule USB messages to provi
Page 149 and 150: priate Web driver. USB events sent
Page 151 and 152: hardware IDs to the browser kernel
Page 153 and 154: PhotoBooth latency JavaScript drive
Page 155 and 156: delegate the higher-level abstracti
show all

2nd USENIX Conference on Web Application Development ...

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?