Proceedings [PDF] - Measurement and Analysis of P2P Activity ...


International Conference Advances in the Analysis of Online Paedophile Activity, Paris, France; 2-3 June, 2009

Measurement of paedophile activity in eDonkey using a client sending queries

Firas Bessadok, Karim Bessaoud, Matthieu Latapy and Clémence Magnien

LIP6 - CNRS and University Pierre & Marie Curie

firas.bess@gmail.com, karim.bessaoud@complexnetworks.lip6.fr, matthieu.latapy@lip6.fr, clemence.magnien@lip6.fr

1. INTRODUCTION

The observation of peer-to-peer (P2P) file exchange systems is an active research topic covered by several works [1][2]. In this work, we focus on observing and analyzing paedophile activity in P2P networks. Our goal is to provide indicators related to paedophile files found in the eDonkey network. To this end, we used a modified client which automatically sends queries to eDonkey servers and collects information about the files returned by the servers (file-id, file name, file size, or the users who possess the file). The collected data is then analyzed to gain insight into paedophile activity in the system.

2. MEASUREMENT

Our system is composed of a single client which connects to several servers of the eDonkey P2P network. The client runs in sessions of 12 hours and restarts after each session. In each session, the client sends 15 different queries (8 of which are well-known paedophile keywords), one by one, every 6 minutes. Once the answers (file-ids, filenames, or file sizes) are received, the client formats them into XML files. We ran this measurement for 140 days, from October 2008 to February 2009.
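The timing described above can be sketched as a simple schedule. This is only an illustration under stated assumptions: we read "one by one every 6 minutes" as one query per 6-minute slot, cycling through the 15 queries in a fixed order, and the query labels are hypothetical placeholders (the actual keywords are not listed in the text). The session count is an upper bound that assumes sessions run back to back with no downtime.

```python
from datetime import timedelta

SESSION_LENGTH = timedelta(hours=12)   # session duration, from the text
QUERY_INTERVAL = timedelta(minutes=6)  # delay between queries, from the text
NUM_QUERIES = 15                       # distinct queries per session, from the text

# Hypothetical placeholder labels; the real keywords are not given in the text.
QUERIES = [f"query_{i:02d}" for i in range(1, NUM_QUERIES + 1)]


def session_schedule(queries=QUERIES,
                     session_length=SESSION_LENGTH,
                     interval=QUERY_INTERVAL):
    """Return (offset, query) pairs for one session.

    Assumption: one query is sent per slot, cycling through the list.
    """
    slots = session_length // interval  # timedelta // timedelta gives an int
    return [(i * interval, queries[i % len(queries)]) for i in range(slots)]


schedule = session_schedule()
slots_per_session = len(schedule)                      # query slots in 12 hours
rounds_per_session = slots_per_session // NUM_QUERIES  # times each query is sent
max_sessions = timedelta(days=140) // SESSION_LENGTH   # upper bound over 140 days
```

Under this reading, a session has 120 query slots, so each of the 15 queries is sent 8 times per session, and 140 days allow at most 280 sessions.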

3. OBSERVATIONS

We collected 2 784 583 distinct files during this experiment. We then computed a number of statistics on these files, for instance the number of files found in each session, the distribution of ages appearing in filenames, and the number of paedophile and non-paedophile files. We present some of the obtained results below.

3.1 Number of file-id

Figure 1 shows the evolution of the number of distinct file-ids observed during our measurement (vertical axis) as a function of the number of sessions executed (horizontal axis), which is similar to the time elapsed since the beginning of the measurement.

Figure 1: Evolution of the number of file-id observed during the measurement.

We note that the curve grows, as expected, but in a non-linear way. In the first twenty sessions, the curve has a steep slope because most file-ids have not yet been seen. After that, the slope decreases gradually because most of the file-ids returned have already been seen. On the other hand, the slope remains significant. We therefore conclude that the number of file-ids present in the eDonkey network is so large that we cannot see all of them with our measurements, even when conducted over a long period of time (140 days here).
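The kind of curve discussed here can be computed by accumulating the set of file-ids seen so far, session by session. The following is a minimal sketch, not the authors' analysis code: the function names and the toy data are hypothetical, and the input is assumed to be one collection of file-ids per session.

```python
def cumulative_distinct(sessions):
    """For each session, return the number of distinct file-ids seen so far."""
    seen = set()
    counts = []
    for file_ids in sessions:
        seen.update(file_ids)     # add the file-ids returned in this session
        counts.append(len(seen))  # value of the curve after this session
    return counts


def new_per_session(counts):
    """Slope of the curve: number of file-ids first seen in each session."""
    return [cur - prev for prev, cur in zip([0] + counts, counts)]


# Toy data invented for illustration only (not measurement results).
sessions = [
    {"a", "b", "c", "d"},
    {"a", "b", "e", "f"},
    {"a", "e", "g"},
    {"b", "f", "h"},
]
counts = cumulative_distinct(sessions)
print(counts)                   # [4, 6, 7, 8]
print(new_per_session(counts))  # [4, 2, 1, 1]
```

In the toy data the per-session increment shrinks but never reaches zero, which mirrors the behaviour described for Figure 1: a decreasing slope that nonetheless remains positive.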
