Upgrade Report - Department of Informatics - King's College London

A SAMPLING MANUFACTURED GRAPHS: BFS TREE FILTERING

Vertex coverage $C_V$. We define vertex coverage as the fraction of vertices which have been discovered over the total number of vertices in the target graph:

$$C_V = \frac{|V_s|}{|V|} \quad (A.2)$$

Edge coverage $C_E$. We define edge coverage as the fraction of edges which have been discovered over the total number of edges in the target graph:

$$C_E = \frac{|E_s|}{|E|} \quad (A.3)$$

Success rate $R_s$. We define the success rate as the number of inspected vertices $|V_i|$ over the total number of crawler steps $s$:

$$R_s = \frac{|V_i|}{s} \quad (A.4)$$

Density proximity $C_D$. We define density proximity as follows:

$$C_D = \frac{D_s}{D} \quad (A.5)$$

where $D$ is the density of the original graph and $D_s$ is the density of the sampled graph. We measure $D_s$ as follows:

$$D_s = \frac{|E_s|}{|V_i|(|V_i| - 1) + |V_s \setminus V_i|\,|V_i|} \quad (A.6)$$

where $|V_i|(|V_i| - 1) + |V_s \setminus V_i|\,|V_i|$ is the maximum possible number of edges we could have observed via crawling.

A.2.1 Crawlers With Memory

A crawling method with memory is a method which uses some form of secondary storage to keep information about the data it has received so far, and which decides on its next step based on all stored information. Our crawler proceeds as follows:

• Our crawler stores the sets $V_i$, $V_o$ and $L$.
• We start with an initial set of vertices $s_0, \dots, s_i$, which are our seeds, and add them to $V_o$. In this particular case $i = 1$ and only a single seed, identical for all methods, was used.
• At each step we select a vertex $v$ from $V_o$ using a certain selection policy (Section A.3).
• If $v \notin V_i$, we remove $v$ from $V_o$ and inspect it using a specific inspection policy (Section A.1).
• For each edge $(v, u)$ acquired, we add $u$ to $V_o$. $V_o$ may include vertices which are also in $V_i$ and may also contain duplicate vertices.
• We then add $v$ to $V_i$ and consider it to be inspected.
• We repeat the process until $|V_i|$ reaches a satisfactory size.

Our crawler does not have a concept of position. It simply selects a vertex from those it has seen at some point during its previous inspections and chooses to inspect it; this can be seen as prioritizing the BFS tree. Additionally, the sampling does not allow vertices to be revisited.
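A minimal Python sketch of this loop may help make the steps concrete; it is not the report's implementation. The graph is assumed to be an adjacency-list dictionary, the selection policy (Section A.3) is stood in for by a uniform random pick from $V_o$, the inspection policy (Section A.1) by returning the full neighbour list, and the set $L$ is omitted since this excerpt does not define it.

```python
import random

def crawl_with_memory(graph, seeds, target_size):
    """Sketch of a crawler with memory.

    graph: dict mapping each vertex to a list of its neighbours (assumed).
    seeds: the initial vertices s_0, ..., s_i.
    Returns (V_i, V_s, E_s, steps): inspected vertices, discovered
    vertices, sampled edges, and the number of crawler steps taken.
    """
    V_i = set()           # inspected vertices
    V_o = list(seeds)     # observed vertices; may contain duplicates
    V_s = set(seeds)      # all vertices discovered so far
    E_s = set()           # edges acquired so far
    steps = 0

    while V_o and len(V_i) < target_size:
        steps += 1
        # Selection policy (Section A.3): here, a uniform random pick.
        v = random.choice(V_o)
        V_o.remove(v)
        if v in V_i:
            continue      # vertices are never re-inspected
        # Inspection policy (Section A.1): here, the full neighbour list.
        for u in graph[v]:
            E_s.add(frozenset((v, u)))
            V_o.append(u)     # may duplicate vertices already seen
            V_s.add(u)
        V_i.add(v)            # v is now considered inspected
    return V_i, V_s, E_s, steps
```

Note that this sketch removes $v$ from $V_o$ on every selection, even when $v$ is already inspected; this is one reading of the steps above and prevents the loop from stalling on duplicate entries.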
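Given the sets returned by such a run, the metrics of equations (A.2) to (A.6) are direct ratios. The sketch below simply mirrors the notation; the excerpt does not define the density $D$ of the original graph, so a directed-style density $|E|/(|V|(|V|-1))$, matching the $|V_i|(|V_i|-1)$ term in (A.6), is assumed here.

```python
def sample_metrics(V, E, V_i, V_s, E_s, steps):
    """Compute C_V, C_E, R_s and C_D per equations (A.2)-(A.6).

    V, E: vertex and edge sets of the target graph;
    V_i, V_s, E_s, steps: the output of a crawl as sketched above.
    """
    C_V = len(V_s) / len(V)        # (A.2) vertex coverage
    C_E = len(E_s) / len(E)        # (A.3) edge coverage
    R_s = len(V_i) / steps         # (A.4) success rate

    n_i = len(V_i)
    # Denominator of (A.6): maximum edges observable via crawling,
    # i.e. inspected-inspected pairs plus inspected-uninspected pairs.
    max_observable = n_i * (n_i - 1) + len(V_s - V_i) * n_i
    D_s = len(E_s) / max_observable            # (A.6) sampled density

    n = len(V)
    D = len(E) / (n * (n - 1))     # density of the original graph (assumed form)
    C_D = D_s / D                  # (A.5) density proximity
    return C_V, C_E, R_s, C_D
```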
