Multilevel Graph Clustering with Density-Based Quality Measures

3 The Multi-Level Refinement Algorithm

Table 3.3: Hierarchical Naming Convention for File Names

Filename                         Description
gname.meta.gz                    meta-information about the graph gname (source, content, type, ...)
gname.vertices.gz                the vertex data (vertex name)
gname.edges.gz                   the edge data (start-vertex, end-vertex, weight)
gname.cluster.algname.gz         the clustering computed with algorithm algname
gname.cluster.algname.log        free-form log file of the algorithm
gname.cluster.algname.meta.gz    configuration of the algorithm (parameters, cluster count, modularity, ...)
gname.cluster_best.gz            copy of the best clustering found
gname.cluster_info.gz            table comparing the clusterings (algorithm, modularity, runtime, ...)
gname.vertex-degree.gz           weighted and unweighted vertex degrees

[...] added which receive the graph object as argument. Based on the argument types, the compiler then exploits this function overloading to select the correct implementation. Another advantage of this strategy is better readability: many small details can be hidden behind the adapter functions. Therefore such adapters are provided for the project's own graph implementations and index maps. The C++ code in Figure 3.17 shows a small example that calculates the weighted degree of all vertices. Using a for-each construct, the outer loop iterates over all vertices of the graph. At each vertex, the inner loop visits the out-going edges and sums up their weights.

In summary, the index spaces and maps with their abstract access by indices are conceptually very similar to graphs and property maps in Boost. Here the concept is extended by the explicit integration of index spaces. Without them it is not possible to directly describe the dependencies between property maps and the structures they attribute. The Boost Graph Library already supplies the user with basic implementations for various graph types. However, these rely on the simple collections (vectors, maps) of the Standard Template Library, which do not scale well to big graphs.
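Figure 3.17 is not reproduced in this excerpt. The weighted-degree computation and the adapter idea described above might look roughly like the following sketch. The types `Edge` and `AdjacencyGraph` and the adapter names `vertex_count` and `out_edges` are hypothetical, invented for illustration, and plain `std::vector` storage is used only for brevity.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal graph type, invented for this sketch; the thesis'
// own graph classes and index maps are not shown in this excerpt.
struct Edge {
    std::size_t target;  // index of the end-vertex
    double weight;       // edge weight
};

struct AdjacencyGraph {
    std::vector<std::vector<Edge> > adjacency;  // out-going edges per vertex
};

// Adapter functions: free functions that take the graph as argument, so
// overload resolution selects the implementation matching the graph type.
inline std::size_t vertex_count(const AdjacencyGraph& g) {
    return g.adjacency.size();
}

inline const std::vector<Edge>& out_edges(const AdjacencyGraph& g,
                                          std::size_t v) {
    return g.adjacency[v];
}

// Generic algorithm written only against the adapter functions.
template <typename Graph>
std::vector<double> weighted_degrees(const Graph& g) {
    std::vector<double> degree(vertex_count(g), 0.0);
    for (std::size_t v = 0; v < vertex_count(g); ++v) {  // outer loop: vertices
        for (const auto& e : out_edges(g, v)) {          // inner loop: out-edges
            degree[v] += e.weight;                       // sum up the weights
        }
    }
    return degree;
}
```

Supporting another graph type in this sketch would only require further overloads of `vertex_count` and `out_edges`; `weighted_degrees` itself stays unchanged.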
The chunk-based implementation of index maps used here provides a scalable alternative to these collections.

3.5.2 Data Management

The management of graphs, clusterings, and other data faces some special requirements in this project. Most importantly, the evaluation will handle many graphs and produce huge amounts of clusterings. Thus no duplicate information, such as the graph structure and vertex names, should be stored in each clustering. Besides the actual graph data, a considerable amount of structured meta data emerges. This includes information about the algorithms used, parameters, source references, and comments. This data should be easily accessible to both computers and humans. Finally, the data should be easy to import into other software like GNU R for post-processing.

Consequently, simple text files similar to the CSV (comma-separated values) format are used. Each file stores exactly one table. The columns are separated by tabs (\t) and the rows by a single newline character (\n). Strings may be delimited by quotation marks. The value NA may be used for missing values. The first row contains the table header with the column titles, and the first column is always used as the index. To save storage space, all files are transparently compressed using the gzip format.

Separate files are used for different data. For example, a graph is stored in three files containing vertex data, edge data, and meta data. For each vertex, its name or original identifier is stored. The edge table contains the pairs of end-vertex indices and the edge weights. Additional files are used to store clusterings and similar data.

In order to retrieve data easily, a hierarchical naming convention is employed. The dot is universally used as separator in filenames. The first component names the graph, and the following components differentiate the data sets. The naming scheme is best explained by example, as in Table 3.3. Meta data is organized in a similar fashion: character strings are used as row indices, and the dot is used as hierarchical separator.
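As a rough illustration of the file format and naming convention, the following sketch parses the text of one already decompressed table and splits a file name into its dot-separated components. It is not the thesis' implementation: the names `Cell`, `Table`, `split`, `unquote`, and `parse_table` are hypothetical, gzip decompression is left out, and it assumes that quoted strings contain no tabs and that only an unquoted NA marks a missing value.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One table cell: either a string value or missing (written as NA).
struct Cell {
    bool missing;
    std::string value;
};

struct Table {
    std::vector<std::string> header;       // column titles from the first row
    std::vector<std::vector<Cell> > rows;  // data rows; column 0 is the index
};

// Split text at every occurrence of sep; empty fields are preserved.
std::vector<std::string> split(const std::string& text, char sep) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type pos = text.find(sep, start);
        if (pos == std::string::npos) {
            parts.push_back(text.substr(start));
            return parts;
        }
        parts.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }
}

// Remove one pair of enclosing quotation marks, if present.
std::string unquote(const std::string& raw) {
    if (raw.size() >= 2 && raw[0] == '"' && raw[raw.size() - 1] == '"') {
        return raw.substr(1, raw.size() - 2);
    }
    return raw;
}

// Parse a tab-separated table whose first row is the header.
Table parse_table(const std::string& text) {
    Table table;
    std::istringstream in(text);
    std::string line;
    bool is_header = true;
    while (std::getline(in, line)) {
        if (line.empty()) continue;  // assumption: blank lines carry no data
        const std::vector<std::string> fields = split(line, '\t');
        if (is_header) {
            for (const std::string& f : fields) {
                table.header.push_back(unquote(f));
            }
            is_header = false;
            continue;
        }
        std::vector<Cell> row;
        for (const std::string& f : fields) {
            Cell cell;
            cell.missing = (f == "NA");  // unquoted NA marks a missing value
            cell.value = cell.missing ? std::string() : unquote(f);
            row.push_back(cell);
        }
        table.rows.push_back(row);
    }
    return table;
}
```

With these helpers, `split("gname.cluster.algname.meta.gz", '.')` yields the components `gname`, `cluster`, `algname`, `meta`, and `gz`, the first of which names the graph. A graph name that itself contains a dot would break this simple scheme.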