Upgrade Report - Department of Informatics - King's College London

A SAMPLING MANUFACTURED GRAPHS: BFS TREE FILTERING

Vertex coverage $C_V$. We define vertex coverage as the fraction of vertices which have been discovered over the total number of vertices in the target graph:

$$C_V = \frac{|V_s|}{|V|} \quad (A.2)$$

Edge coverage $C_E$. We define edge coverage as the fraction of edges which have been discovered over the total number of edges in the target graph:

$$C_E = \frac{|E_s|}{|E|} \quad (A.3)$$

Success rate $R_s$. We define the success rate as the number of inspected vertices $|V_i|$ over the total number of crawler steps $s$:

$$R_s = \frac{|V_i|}{s} \quad (A.4)$$

Density proximity $C_D$. We define density proximity as follows:

$$C_D = \frac{D_s}{D} \quad (A.5)$$

where $D$ is the density of the original graph and $D_s$ is the density of the sampled graph. We measure $D_s$ as follows:

$$D_s = \frac{|E_s|}{|V_i|(|V_i| - 1) + |V_s \setminus V_i|\,|V_i|} \quad (A.6)$$

where $|V_i|(|V_i| - 1) + |V_s \setminus V_i|\,|V_i|$ is the maximum possible number of edges we could have observed via crawling.

A.2.1 Crawlers With Memory

A crawling method with memory is a method which uses some form of secondary storage to keep information about the data it has received so far, and which decides on its next step based on all stored information. Our crawler proceeds as follows:

• Our crawler stores the sets $V_i$, $V_o$ and $L$.
• We start with an initial set of vertices $s_0, \dots, s_i$, which are our seeds, and add them to $V_o$. In this particular case $i = 1$ and only a single seed, identical for all methods, was used.
• At each step we select a vertex $v$ from $V_o$ using a certain selection policy (Section A.3).
• If $v \notin V_i$, we remove $v$ from $V_o$ and inspect it using a specific inspection policy (Section A.1).
• For each edge $(v, u)$ acquired, we add $u$ to $V_o$. $V_o$ may include vertices which are also in $V_i$ and may also contain duplicate vertices.
• We then add $v$ to $V_i$ and consider it to be inspected.
• We repeat the process until $|V_i|$ reaches a satisfactory size.

Our crawler does not have a concept of position. It simply selects a vertex from those it has seen at some point during its previous inspections and chooses to inspect it; this can be seen as prioritizing the BFS tree. Additionally, the sampling does not allow vertices to be revisited.
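A minimal Python sketch of this loop may help make the steps concrete; it is not the report's implementation. The graph is assumed to be an adjacency-list dictionary, the selection policy (Section A.3) is stood in for by a uniform random pick from $V_o$, the inspection policy (Section A.1) by returning the full neighbour list, and the set $L$ is omitted since this excerpt does not define it.

```python
import random

def crawl_with_memory(graph, seeds, target_size):
    """Sketch of a crawler with memory.

    graph: dict mapping each vertex to a list of its neighbours (assumed).
    seeds: the initial vertices s_0, ..., s_i.
    Returns (V_i, V_s, E_s, steps): inspected vertices, discovered
    vertices, sampled edges, and the number of crawler steps taken.
    """
    V_i = set()           # inspected vertices
    V_o = list(seeds)     # observed vertices; may contain duplicates
    V_s = set(seeds)      # all vertices discovered so far
    E_s = set()           # edges acquired so far
    steps = 0

    while V_o and len(V_i) < target_size:
        steps += 1
        # Selection policy (Section A.3): here, a uniform random pick.
        v = random.choice(V_o)
        V_o.remove(v)
        if v in V_i:
            continue      # vertices are never re-inspected
        # Inspection policy (Section A.1): here, the full neighbour list.
        for u in graph[v]:
            E_s.add(frozenset((v, u)))
            V_o.append(u)     # may duplicate vertices already seen
            V_s.add(u)
        V_i.add(v)            # v is now considered inspected
    return V_i, V_s, E_s, steps
```

Note that this sketch removes $v$ from $V_o$ on every selection, even when $v$ is already inspected; this is one reading of the steps above and prevents the loop from stalling on duplicate entries.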
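Given the sets returned by such a run, the metrics of equations (A.2) to (A.6) are direct ratios. The sketch below simply mirrors the notation; the excerpt does not define the density $D$ of the original graph, so a directed-style density $|E|/(|V|(|V|-1))$, matching the $|V_i|(|V_i|-1)$ term in (A.6), is assumed here.

```python
def sample_metrics(V, E, V_i, V_s, E_s, steps):
    """Compute C_V, C_E, R_s and C_D per equations (A.2)-(A.6).

    V, E: vertex and edge sets of the target graph;
    V_i, V_s, E_s, steps: the output of a crawl as sketched above.
    """
    C_V = len(V_s) / len(V)        # (A.2) vertex coverage
    C_E = len(E_s) / len(E)        # (A.3) edge coverage
    R_s = len(V_i) / steps         # (A.4) success rate

    n_i = len(V_i)
    # Denominator of (A.6): maximum edges observable via crawling,
    # i.e. inspected-inspected pairs plus inspected-uninspected pairs.
    max_observable = n_i * (n_i - 1) + len(V_s - V_i) * n_i
    D_s = len(E_s) / max_observable            # (A.6) sampled density

    n = len(V)
    D = len(E) / (n * (n - 1))     # density of the original graph (assumed form)
    C_D = D_s / D                  # (A.5) density proximity
    return C_V, C_E, R_s, C_D
```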
