Proceedings [PDF] - Measurement and Analysis of P2P Activity ...
International Conference Advances in the Analysis of Online Paedophile Activity Paris, France; 2-3 June, 2009
Measurement of paedophile activity in eDonkey
using a client sending queries
Firas Bessadok, Karim Bessaoud, Matthieu Latapy and Clémence Magnien
LIP6 - CNRS and University Pierre & Marie Curie
firas.bess@gmail.com, karim.bessaoud@complexnetworks.lip6.fr, matthieu.latapy@lip6.fr,
clemence.magnien@lip6.fr
1. INTRODUCTION
The observation of peer-to-peer (P2P) file exchange
systems is an active research topic addressed by several works [1][2].
In our proposal, we focus on observing and analyzing
paedophile activity in P2P networks. Our goal
is to provide different indicators related to paedophile
files found in the eDonkey network. To this end, we
used a modified client which automatically sends queries
to eDonkey servers and collects information about the files
sent back by the servers (file-id, file name, file size,
or the users who possess the file). The obtained data is then
analyzed to gain insight into paedophile activity in the
system.
2. MEASUREMENT
Our system is composed of a single client which connects
to several servers belonging to the eDonkey
P2P network. A session is defined as one 12-hour execution
of the client, which restarts after each session.
In each session, the client sends 15 different
queries (8 of which are well-known paedophile keywords)
one by one, every 6 minutes. Once the answers
(file-id, filenames or file sizes) are received, the client
formats them into XML files. We ran this measurement
for 140 days, from October 2008 to February
2009.
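The session timing described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say in which order the 15 queries are sent, so the round-robin cycling here is an assumption, and the function names, placeholder query strings and XML layout are all hypothetical rather than taken from the actual client.

```python
import xml.etree.ElementTree as ET
from itertools import cycle, islice

SESSION_SECONDS = 12 * 60 * 60  # session length stated in the text
QUERY_INTERVAL = 6 * 60         # one query every 6 minutes

def session_schedule(queries):
    """Return (offset_in_seconds, query) pairs for one session.

    Assumption: queries are cycled in round-robin order.
    """
    slots = SESSION_SECONDS // QUERY_INTERVAL
    return [(i * QUERY_INTERVAL, q)
            for i, q in enumerate(islice(cycle(queries), slots))]

def answer_to_xml(query, files):
    """Format one server answer as XML (hypothetical element names)."""
    root = ET.Element("answer", query=query)
    for f in files:
        ET.SubElement(root, "file",
                      id=f["id"], name=f["name"], size=str(f["size"]))
    return ET.tostring(root, encoding="unicode")

# Placeholder query strings; the real keyword list is not given in the text.
queries = [f"query-{i}" for i in range(15)]
schedule = session_schedule(queries)
```

Under these assumptions a session has 120 query slots, so each of the 15 queries would be sent 8 times per session.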
3. OBSERVATIONS
We collected 2 784 583 distinct files during this experiment.
We then computed several statistics on these
files, for instance the number of files found in each session,
the distribution of ages in filenames, and the
numbers of paedophile and non-paedophile files.
We present some of the obtained results below.
3.1 Number of file-id
Figure 1 shows the evolution of the number of distinct
file-id (vertical axis) observed during our measurement
as a function of the number of sessions executed, which is
equivalent to the time elapsed since the beginning of the measurement
(horizontal axis).
Figure 1: Evolution of the number of file-id observed
during the measurement.
We note that the plot grows, as expected, but
non-linearly. In the first twenty sessions, the curve
has a steep slope because most file-id have not yet been
seen. After that, the slope decreases gradually because
a growing share of the file-id returned have already been seen.
On the other hand, the slope remains significant. We
therefore conclude that the number of file-id present in
the eDonkey network is so large that we cannot see all
of them with our measurements, even when conducted over a
long period of time (140 days here).
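A curve of this kind can be computed by keeping a running set of the file-id already seen. The sketch below is a minimal illustration, not the authors' analysis code: the function name is hypothetical and the toy data is made up. The number of new file-id per session plays the role of the slope discussed above.

```python
def discovery_curve(sessions):
    """Return (cumulative, new_per_session) for per-session file-id collections.

    cumulative[i] is the number of distinct file-id seen up to session i;
    new_per_session[i] is how many of them were first seen in session i.
    """
    seen = set()
    cumulative, new_per_session = [], []
    for ids in sessions:
        before = len(seen)
        seen.update(ids)
        new_per_session.append(len(seen) - before)
        cumulative.append(len(seen))
    return cumulative, new_per_session

# Toy data, invented for illustration (not taken from the measurement).
toy_sessions = [
    {"a", "b", "c", "d"},
    {"c", "d", "e", "f"},
    {"a", "e", "f", "g"},
    {"b", "g", "h"},
]
cumulative, new_per_session = discovery_curve(toy_sessions)
print(cumulative)       # [4, 6, 7, 8]
print(new_per_session)  # [4, 2, 1, 1]
```

In the toy data the cumulative count keeps growing while the per-session increment shrinks, the same qualitative shape as the text describes for Figure 1.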