Visualization of Hyperedges in Fixed Graph Layouts

Brandenburg University of Technology Cottbus

Computer Science Department

Software Systems Engineering Research Group

Diploma Thesis

(Diplomarbeit)

Visualization of Hyperedges

in Fixed Graph Layouts

Martin Junghans

Matrikel-Nr.: 2203076

October 2008

Adviser: Prof. Dr. Claus Lewerentz


Statutory Declaration (Eidesstattliche Erklärung)

I affirm that I have written this thesis independently and without using any literature or aids other than those stated. All aids and sources used are listed completely in the bibliography, and all passages taken verbatim or in substance from these sources are marked as such. This thesis has not previously been submitted in the same or a similar form to any other examination authority, nor has it been published.

Martin Junghans

Cottbus, November 9, 2008

Declaration of Authorship

I certify that the work presented here is, to the best of my knowledge and belief, original

and the result of my own investigations, except as acknowledged, and has not been submitted,

either in part or whole, for a degree at this or any other University.

Martin Junghans

Cottbus, November 9, 2008


Abstract

Graphs and their visualizations are widely used to communicate the structure of complex data in a formal way. Hypergraphs are well suited to representing real-world data because they can relate multiple objects to each other. However, existing graph drawing techniques lack the ability to embed hyperedges into fixed two-dimensional graph layouts. We utilize a set of curves to visualize hyperedges and employ an energy-based technique to position them in the layout. By avoiding node occlusion and cluster intersections, we are able to preserve the expressiveness of the given graph layout. Additionally, we investigate techniques to reduce the visual complexity of hypergraph drawings. A comprehensive evaluation using real-world data sets demonstrates the suitability of the proposed hyperedge layout techniques.


Contents

1 Introduction
1.1 Motivation
1.2 Applications of Hypergraph Visualization
1.3 Scope of this Thesis
1.4 Structure of this Thesis

2 Preliminaries of Hypergraph Visualization
2.1 Graphs
2.1.1 Notation
2.1.2 Energy-Based Graph Layouts
2.1.3 Readability and Aesthetics of Graph Drawings
2.2 Hypergraphs
2.2.1 Notation
2.2.2 Requirements of Hypergraph Visualizations
2.2.3 Criteria for Hypergraph Visualizations
2.2.4 Related Work
2.3 Hyperedge Visualization Structures

3 Hyperedge Routing
3.1 Motivation
3.2 Related Work
3.3 Routing Based on Cluster Bounds
3.3.1 Clustering of Graph Layouts
3.3.2 Solid Cluster Bounds
3.3.3 Floating Cluster Bounds
3.3.4 Conclusion
3.4 Energy-Based Routing
3.4.1 Modeling of Curves
3.4.2 Repulsion
3.4.3 Strain of Curves
3.4.4 Equilibrium
3.4.5 Conclusion
3.5 Discussion

4 Reduction of Visual Complexity
4.1 Classification of Visual Complexity
4.1.1 Cognitive Load
4.1.2 Utilization of Cognitive Load Theory
4.2 Techniques to Reduce Visual Complexity
4.3 Related Work
4.4 Energy-Based Curve Bundling
4.4.1 Attraction
4.4.2 Order of Energy-Based Routing and Bundling
4.4.3 Conclusion
4.5 Energy-Threshold-Based Aggregation
4.5.1 Rationale
4.5.2 Conclusion
4.6 Cluster-Based Curve Aggregation
4.6.1 Identification of Curve Groups
4.6.2 Branch Out of Aggregated Curves
4.6.3 Movable Curves
4.6.4 Conclusion
4.7 Energy-Based Curve Widening
4.7.1 Visualization of Curves
4.7.2 Prerequisites
4.7.3 Model of a Widened Curve
4.7.4 Rationale
4.7.5 Formalization
4.7.6 Conclusion
4.8 Discussion

5 Evaluation
5.1 Implementation
5.2 Example Hypergraphs
5.3 Preservation of Graph Layout Expressiveness
5.3.1 Experiment 1 – Node Occlusion
5.3.2 Experiment 2 – Cluster Intersection
5.4 Reduction of Visual Complexity
5.4.1 Experiment 3 – Model-Based Aggregation
5.4.2 Experiment 4 – Visual Bundling
5.5 Comprehensive Hypergraph Layouts

6 Summary
6.1 Conclusions
6.2 Future Work

Bibliography


1 Introduction

1.1 Motivation

In 1736, the study of graphs began when Leonhard Euler presented his solution to the problem of “The Seven Bridges of Königsberg” [27], which is deemed to be the oldest application of graph theory. Since then, graphs have been widely used to represent abstract problems. Well-investigated graph algorithms have become indispensable for research on the complexity of mathematical problems. Theoretical computer science, for instance, employs graphs to reason about the computability and complexity of problems. If a graph problem that is known to be NP-hard, e.g., the traveling salesman problem [6], can be reduced in polynomial time to a given problem, then that problem is NP-hard as well.

With the advent of the Internet, the implied growth of computer networking, and the accompanying wide range of internetworking problems, graphs gained even more importance as a model of the topological structure of networks [65], which, among other things, allows the analysis and development of internetworking technologies [18]. To advance the Internet, ranging from the development of strategies for resource reservation to efficient routing protocols [64], it is crucial to capture the Internet’s large-scale topology. Due to the gigantic scale of the Internet, visualizations cannot be used to model Internet traffic or to evaluate the topology of internetworks in general. A viewer’s cognition and the capacity to display a graph visualization are always limited to a small fraction of the real world.

Although not all graph-related problems of Internet topologies and their visualization can be mentioned in this work, the bottom line can be summarized as follows: graphs are used for computer-based analysis and simulation studies [65]; visualizations allow a quick cognition of general topological characteristics and of the interconnectedness within a limited extent of reality. The arrangement of vertices and edges in graph visualizations is meaningful, as it reveals groupings and relations between vertices. Furthermore, a visualization easily and clearly communicates the displayed data.

For this work, it is essential to distinguish graphs, as a representation of structured data, from their visualization. Numerous applications of graphs, a few of which are listed below, also rely on an adequate visualization for the sake of cognition and communication of the displayed information.

Versatility of Graphs. In business environments, graphs and their visualizations are utilized in document management systems, project management (for instance PERT network charts), and also to depict organizational structures. Taxonomies that portray the relations between species, and evolutionary trees, are applications of graph visualizations in biology. Furthermore, there are numerous applications related to computer science and information technology that crucially rely on graphs and (often) their visual representation [31]. Such applications include:

• Web site maps, browsing history

• Semantic networks, knowledge presentation

• Object-oriented systems, class browsers, data structures (compiler), real-time systems

(state-transition diagrams, Petri nets)

• Data flow diagrams, (subroutine-) call graphs, entity relationship diagrams (UML,

database structures)

• Logic programming (derivation trees)

• Computer aided design, computer aided modeling

• Integrated circuit creation (very-large-scale integration)

The purpose of graph visualizations is to convey the structure of information. Usually, the visualizations are constrained to binary relationships between vertices. However, real-world graphs that represent realistic scenarios usually relate more than two objects to each other. For this purpose, hyperedges are required to model and visually represent a subset of vertices that is distinguished from the remaining vertices by a certain property. The following section therefore motivates applications of graph visualization that benefit from hyperedges.

1.2 Applications of Hypergraph Visualization

Graph drawings depict a set of objects and the relationships among them. Usually, it

is desired to arrange objects based on a similarity metric, according to the cross-linkage

between objects, or by using a hierarchy relation if available.

Hypergraphs are a generalization of ordinary graphs, as they allow hyperedges that connect

several vertices of the graph and can be treated as a subset of vertices. A definition

is given in Section 2.2.1 on page 16. The visual representation of a hyperedge distinguishes

this subset of vertices from the remaining vertices. Abstract use case scenarios of

a visualization of hyperedges embedded into box-and-line graph layouts are:

• A hyperedge emphasizes a subset of displayed objects with a certain property.

• A hyperedge visualizes a further grouping criterion of the displayed objects, while

the positions of vertices may also reveal another grouping criterion.
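
The view of a hyperedge as a subset of vertices can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the data model of the thesis prototype: the vertex names, the hyperedge names, and the helper functions hyperedges_of and remaining_vertices are hypothetical.

```python
# Minimal sketch: a hypergraph as a vertex set plus named hyperedges.
# Each hyperedge is a subset of the vertices (all names are hypothetical).
vertices = {"A", "B", "C", "D", "E"}

hyperedges = {
    "h1": frozenset({"A", "C", "E"}),  # e.g., objects sharing a certain property
    "h2": frozenset({"B", "C"}),       # e.g., a further grouping criterion
}


def hyperedges_of(vertex):
    """Return the sorted names of all hyperedges that contain the vertex."""
    return sorted(name for name, members in hyperedges.items() if vertex in members)


def remaining_vertices(name):
    """Return the vertices from which the hyperedge distinguishes its members."""
    return vertices - hyperedges[name]


# Every hyperedge has to be a subset of the vertex set.
assert all(members.issubset(vertices) for members in hyperedges.values())

print(hyperedges_of("C"))
print(sorted(remaining_vertices("h1")))
```

A vertex may belong to several hyperedges at once, which is what allows a hyperedge to express a grouping that differs from the one implied by the vertex positions.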

This section introduces several concrete applications of hypergraph visualizations and demonstrates their usefulness in comparison to visualizations of ordinary graphs. The scenarios that follow cover only a small part of the possible applications and benefits of well-designed hypergraph visualizations.

Software Visualization

The visualization of software systems is the main motivation of this work. Software artifacts can be highly related to each other. There is a plethora of possible scenarios in which the visualization of hypergraphs assists the software development process. More specifically, it is crucial for engineers to visualize software systems in the software design and quality assurance phases, as it allows them to gain an understanding of the overall structure and the interactions between components.

Software artifacts like packages, classes, methods, or attributes are represented by vertices.

Edges may reflect the usage of components, call traces, method calls, or any communication

between software artifacts. The computation of a proper graph layout is fundamental, as it determines the readability and usability of a visualization, as Figure 2.2 on page 14 illustrates.

Generally, a layout that reflects the inclusion hierarchy of the presented software system

or the relation among artifacts is favored. It enables engineers to easily and quickly grasp the affiliation of artifacts to modules or the interrelationships of artifacts, respectively. The following scenarios reveal benefits of an additional visualization of hyperedges, which makes it possible to emphasize relations between more than two artifacts without losing the meaning of the layout.

For the following use case scenarios, it is assumed that graph layouts either represent the

hierarchy of the system or reveal the relationships and connectivity among the artifacts.

Thus, vertices are positioned close to each other if they are affiliated to the same module

or the represented software artifacts are tightly coupled.

Evaluation of Software Design. Besides design and development of software, software engineers also have to focus on software quality, e.g., the quality of design. Various techniques that support these important tasks are known and assist engineers. In a few cases, tools suggest hints or improve the software’s quality automatically after analyzing the source code. More often, however, engineers have to revise the code manually and rely on meaningful visualizations that present the structure of software systems.

Common change, sometimes called co-change, denotes the stability of a group of software artifacts against change. A change of a software system is typically considered to be restricted to a certain (logic) functionality of the software. Hence, software artifacts that are frequently changed together are likely to be logically coupled with regard to object-oriented development paradigms. To reduce the maintenance cost of software, a low coupling between different modules, i.e., subsets of software artifacts that each implement a certain functionality, is the goal of software designers and developers [54]. As a consequence, if artifacts that are frequently changed at the same time are distributed over multiple parts of the software system, this may indicate design flaws.

Analysis tools identify groups with a high co-change value. A visualization that features

hyperedges allows software engineers to explore those groups visually and to study the

distribution of the artifacts. Only hypergraph visualizations allow engineers to quickly answer the question “Where are the frequently and simultaneously changed software artifacts?”.

Further, a co-change visualization can also suggest a more preferable software modularization,

as done by Beyer in [15]. Beyer’s visualization is limited to a distinct coloring of

vertices to display hyperedges of co-changed artifacts. In contrast, this thesis aims at a

structural representation of hyperedges.

Bug tracking systems are a valuable source of information about co-change. Bug reports

usually describe one flawed functionality and the associated bug fix repairs the source

code that caused the flawed functionality. Thus, fixing a bug should not involve various

software artifacts of multiple software subsystems, because a good software design aims at

the decoupling of functional components. Consequently, a hyperedge that connects vertices

of multiple software subsystems strongly indicates a design that should be reviewed and

possibly improved.
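
The check described above, whether the artifacts touched by one bug fix are spread over several subsystems, can be sketched as follows. This is a hedged illustration only: the artifact names, the subsystem_of mapping, and the functions spanned_subsystems and needs_review are hypothetical and do not stem from a concrete analysis tool.

```python
# Hypothetical mapping from software artifacts to their subsystems.
subsystem_of = {
    "Parser": "frontend",
    "Lexer": "frontend",
    "Cache": "storage",
    "Index": "storage",
}

# Each hyperedge contains the artifacts changed by one bug fix (assumed data).
bug_fix_1 = frozenset({"Parser", "Lexer"})
bug_fix_2 = frozenset({"Parser", "Cache"})


def spanned_subsystems(hyperedge, subsystem_of):
    """Return the set of subsystems that contain at least one hyperedge member."""
    return {subsystem_of[artifact] for artifact in hyperedge}


def needs_review(hyperedge, subsystem_of):
    """A hyperedge that spans more than one subsystem hints at a design to review."""
    return len(spanned_subsystems(hyperedge, subsystem_of)) > 1


print(needs_review(bug_fix_1, subsystem_of))
print(needs_review(bug_fix_2, subsystem_of))
```

Such a check only flags candidates. The visualization of the hyperedge is what lets an engineer see where the affected artifacts are located in the layout.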

Abstraction of Subsystems. Software systems often have a layered software architecture,

and major components rely on the functionalities of a base system. Several components

of the remaining system access the low-level base system or an external library that also

offers common functionalities.

In such a scenario, the inner structure of a subsystem is not a matter of particular

interest. The subsystem can be hidden to save the valuable screen space resource. The

visualization of a hyperedge can connect those software artifacts of the remaining software

system that access the hidden subsystem.

The visualization of the hyperedge allows an engineer to answer the question: “Which

software artifacts access a particular subsystem, or any other software artifact?”. Furthermore,

a hyperedge visualization could also answer the question “Where are the software

artifacts that access a particular subsystem?”.

A slight variation of this scenario is to visualize the relationships between a subsystem and the remaining system. To this end, the subsystem of interest is collapsed to save screen space in the visualization. A collapsed subsystem joins all contained artifacts into one vertex; edges of contained artifacts are connected to the collapsed vertex instead. A hyperedge connects the collapsed subsystem with those artifacts of the remaining system that are connected to artifacts of the collapsed subsystem. Hence, screen space is saved and the relations between the subsystem and the remaining system are visualized.

Code Coverage. With the advent of agile software development, extreme programming, and test-driven development, the importance of testing grew steadily. Thorough testing, including planning and analysis of unit tests, promises a high software quality, and so there is a demand to create complete sets of unit tests that cover the entire implementation. As testing of a complex software system involves a huge number of unit test cases, developers have to assess the test coverage to identify untested software artifacts. Moreover, the visualization of test cases aids the comprehension of software systems [19].

The need for visualizations of test coverage comes with the need to visualize relations

between more than two artifacts. The visualization of a hyperedge could present software

artifacts covered by a particular test case or test suite. Further, to attain a soundly structured

set of unit tests, an engineer may require that each visualized software component

is covered by exactly one hyperedge to avoid overlaps. Consequently, hyperedges will be

a useful supplement for these scenarios.
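
The requirement that each component is covered by exactly one hyperedge can be checked with a short sketch like the one below. It is an illustration under assumptions: the component names, the test suite names, and the function coverage_report are hypothetical, and each hyperedge is taken to be the set of components covered by one test suite.

```python
from collections import Counter


def coverage_report(components, test_hyperedges):
    """Return (untested, overlapping) components as two sorted lists.

    untested:    components that are covered by no hyperedge
    overlapping: components that are covered by more than one hyperedge
    """
    counts = Counter()
    for members in test_hyperedges.values():
        counts.update(members)  # each hyperedge counts once per member
    untested = sorted(c for c in components if counts[c] == 0)
    overlapping = sorted(c for c in components if counts[c] > 1)
    return untested, overlapping


# Hypothetical example data.
components = {"Parser", "Lexer", "Cache", "Index"}
test_hyperedges = {
    "test_frontend": {"Parser", "Lexer"},
    "test_cache": {"Cache", "Parser"},
}

untested, overlapping = coverage_report(components, test_hyperedges)
print(untested, overlapping)
```

If both lists are empty, the hyperedges partition the set of components, which corresponds to the soundly structured set of unit tests mentioned above.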

“Software As A City” Metaphor. Much research has been conducted on the visualization of software systems as a city. Buildings represent software artifacts. The positioning of vertices in box-and-line visualizations corresponds to the positioning of buildings in the landscape. This vivid metaphor of a city benefits from the local scope of the visualized information displayed to a viewer. As a viewer navigates through a three-dimensional city, only a few artifacts in the close environment are visible. More distant buildings are only visible if they are comparably tall, which corresponds to larger (major) software artifacts. The limited amount of displayed information allows a viewer to focus on a particular subsystem while utilizing the entire screen area. Entering a building allows a viewer to explore the inner structure of software artifacts. A neighborhood of nearby buildings implies a grouping, which can represent the modular software design.

A hypergraph may model scenarios as described above. With this metaphor of a city in mind, a hyperedge enriches the city as a body of water curling around the buildings. The water, for instance, depicts an external library, and software artifacts that access this library also have access to the water through a pier.

Social Networks

Of course, the visualization of hypergraphs can be applied to various fields beyond software

engineering. A layout that spatially groups vertices already distinguishes subsets

of vertices as hyperedges do. The additional visualization of hyperedges adds a further

dimension to depict grouping information.

Visualizations of social networks reveal groupings of humans and their relationships.

For instance, a graph models students of a university. A layout places vertices according

to the students’ field of study such that a viewer easily identifies the students’ affiliations

to departments.

The binary edge relation is sufficient to model friendships between students, but not

circles of friends. As the friendship relation is not transitive, no statement about circles of

friends can be derived from the graph nor from its visualization. Therefore, a hypergraph

is required to remedy this limitation of binary relations (note that a hyperedge is not the

transitive closure of the binary relation). A visualization of hyperedges reveals information

about the interaction of students beyond department boundaries and thus gives a more

precise prediction about the students’ friendships.
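
Why circles of friends cannot be derived from binary friendships can be illustrated with a small sketch. It assumes, for illustration only, that all members of a circle are pairwise friends; the student names and the function friendships are hypothetical.

```python
from itertools import combinations


def friendships(circles):
    """Return the binary friendship edges implied by a list of circles.

    Assumption: all members of a circle are pairwise friends.
    """
    return {
        frozenset(pair)
        for circle in circles
        for pair in combinations(sorted(circle), 2)
    }


# One circle of three friends ...
one_circle = [{"Ann", "Bob", "Cem"}]
# ... versus three separate circles of two friends each.
three_pairs = [{"Ann", "Bob"}, {"Bob", "Cem"}, {"Ann", "Cem"}]

# Both hypergraphs induce exactly the same binary friendship graph, so the
# circles cannot be reconstructed from the binary edges alone.
print(friendships(one_circle) == friendships(three_pairs))
```

The two hypergraphs differ, yet their binary projections coincide. This is the information that is lost when only binary edges are modeled and visualized.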

Internet Topologies

Another field of application of graph visualization techniques is computer network (or Internet) topologies. Servers are modeled by vertices of a graph and their positions in the layout may indicate their geographical location. The distance between nodes in a visualization reflects the latency between servers. Edges illustrate network connections and are weighted by their bandwidth. In this scenario, the advantage of using a hypergraph, in comparison to a weighted graph with binary edges only, is the additional potential to distinguish sub-networks or private corporate net structures within the given network topology. Hyperedges connect network nodes of the same sub-network. Since there are also logical sub-networks, i.e., servers might be distributed globally, a visualization without hyperedges is not capable of revealing geographical location, bandwidth, latency, and sub-network memberships at the same time.

Summary

Hypergraphs are more powerful than ordinary graphs that feature binary edges only. A

hyperedge represents a certain group of objects with a distinct property. Assuming that

the layout already represents grouping information, the additional display of hyperedges

is able to visualize a further grouping of vertices with respect to another grouping criterion.

The simplest way to visualize hyperedges is to label nodes according to their affiliation,

using different node color or shape. Since this approach is not able to answer the questions

mentioned in the scenarios above, this work will focus on the visualization of hyperedges

as structures that visually connect all vertices of the hyperedge.

The purpose of the visualization of hyperedges can be concluded from the use case scenarios mentioned above. A visualized hyperedge generally allows a viewer to find objects

sharing a certain property and to understand the relationship between those objects. We

do not aim at hypergraph visualizations that convey the specific value or degree of that

property.

1.3 Scope of this Thesis

The goal of this thesis is to present and evaluate techniques to visualize hypergraphs.

A graph layout is already given and the visualization of hyperedges must not affect the

given graph layout. We therefore study the embedding of hyperedges into a given layout.

Hyperedges should utilize the free space of the graph layout plane between the nodes.

A contiguous shape that visually connects all hyperedge nodes equally with each other

represents a hyperedge.

A preliminary depiction of our notion of a hypergraph visualization embedded in a given

graph drawing is shown in Figure 1.1. As can be seen in the figure, the hyperedge is laid

out in the unused space between the nodes of the graph. In this sketch the hyperedge

corresponds to a plane that connects all hyperedge nodes. In contrast to published hypergraph

drawings that use planes to represent hyperedges, the plane in Figure 1.1 is laid out

in such a way that it considers the graph layout immediately surrounding the visualized

hyperedge. The nodes that are not part of the hyperedge cause recesses in the plane.

We also envision a curve-based hypergraph visualization. Such a hypergraph drawing

might be similar to Figure 1.1 if solely the contours of the plane are employed to represent

the hyperedge. In contrast to available hypergraph drawings that use curve-based hypergraph

visualizations, our visualization is not based on straight-line curves. Straight-line curves cannot avoid node occlusion if the graph layout is immutable.

Both targeted visualization approaches, i.e., planes and curves, enable a viewer to find

the nodes of a hyperedge. The object that represents a hyperedge, either by curves or by

a plane, leads the viewer’s eyes from the hyperedge to all hyperedge nodes.

Readability is a significant concern for the visualization of hypergraphs. The additional display of hyperedges must not introduce visual clutter and must permit a quick cognition of nodes connected by hyperedges without interfering with the interpretability of the remaining graph.

Figure 1.1: Sketch of our initial notion of a hypergraph visualization

More specifically, this work is going to introduce techniques to visualize hypergraphs

in two-dimensional box-and-line drawings. These techniques mainly rely on energy-based

layout methods. As the layout of a graph is immutable, the hyperedges are embedded

into the two-dimensional graph drawings. Thus, the computation of hypergraph layouts

is independent of the computation of the graph layouts. Only the positions of the vertices

in the graph layout are required to compute the layout of hyperedges. Consequently,

specific types of graphs such as directed or hierarchical graphs do not have to be handled

separately.

Different approaches and techniques are introduced and their practicability is discussed

on the basis of their advantages and disadvantages. For the sake of an evaluation, a prototype

implementation produces hypergraph visualizations using the presented approaches.

Focus on Software Engineering Scenarios. As motivated in the previous section, multiple tasks of the software development process rely on the visualization of software systems. As this field inspired us to conduct research on hypergraph visualization, scenarios related to software development are potential applications. Hypergraphs make it possible to group software artifacts in their visual representation, either by their position or by the hyperedge connectivity.

The major advantage of hypergraph visualizations can be briefly described as their ability to add another grouping dimension to the visualization. The values of a certain property, which is a criterion for grouping, are not displayed. Similar to a spatial clustering of graphs that easily reveals groups of nodes, the visualization of hyperedges should allow a viewer to recognize nodes as members of a group. The demand for hypergraph visualization in the field of software visualization already emerged in Section 1.2.

1.4 Structure of this Thesis

This thesis is structured as follows. The subsequent Chapter 2 first formalizes graphs

and hypergraphs in Sections 2.1 and 2.2, respectively. Section 2.2 also introduces three

requirements of hypergraph visualizations and infers criteria that are used to evaluate

whether these requirements are met. At the end of Chapter 2, structures of hyperedge

visualizations are discussed. A proper choice of the structure fulfills the first requirement

of hypergraph visualizations.

The main contributions of this thesis are introduced in the following two Chapters 3

and 4, which present layout techniques to fulfill the remaining two requirements of hypergraph

visualizations.

Chapter 3 presents two approaches of hyperedge routing. A routing technique is required

to create hypergraph visualizations that avoid node occlusion and cluster intersections,

which is the second requirement of hypergraph visualizations. The first approach routes

hyperedges based on cluster bounds and is presented in Section 3.3. The second technique,

an energy-based routing technique, is elaborated in Section 3.4.

The reduction of the visual complexity of hypergraph drawings is investigated in Chapter

4. First, the notion of visual complexity of hypergraph drawings is clarified. In

Sections 4.4 through 4.7, four techniques are presented that can be used to simplify the

visualized information of hyperedges, either by a simplification of the hypergraph model

or by a visual simplification. As a consequence, these techniques improve the readability

of hypergraph visualizations by reducing the visual complexity, which is the third requirement

of hypergraph visualizations.

The techniques introduced in this thesis are evaluated in Chapter 5. The experiments

demonstrate the capabilities of the proposed layout techniques and illustrate their achievements with several example graph drawings. They show in practice that hyperedge routing avoids

node occlusion and cluster intersection, and that the visual complexity of hyperedges can

be reduced by the techniques presented in Chapter 4.

Finally, Chapter 6 concludes this thesis and outlines areas of future work.


2 Preliminaries of Hypergraph Visualization

After giving an overview of the topic and motivating applications of hypergraph visualizations

in the previous chapter, this chapter introduces fundamental concepts used for

hypergraph visualizations in this thesis. The first part of this chapter gives an overview

of ordinary graphs. The computation of graph layouts and aesthetic criteria of graph

drawings are described in Sections 2.1.2 and 2.1.3, respectively.

The second part of this chapter formalizes hypergraphs in Section 2.2.1. Then, Section

2.2.2 clarifies the requirements of our notion of hypergraph visualization as the aesthetic

criteria of graph visualizations are insufficient for this purpose. Based on these requirements,

criteria for evaluating hypergraph visualizations are derived in Section 2.2.3.

A discussion of approaches and techniques related to hypergraph visualizations of other

authors is given in Section 2.2.4.

The last section of this chapter discusses visualization structures of hyperedges. The

choice of such a structure makes it possible to fulfill a first requirement of hypergraph visualizations.

The hypergraph layout techniques that aim at meeting the remaining requirements are the

major contribution of this thesis and are introduced in the subsequent Chapters 3 and 4.

We introduce techniques that are capable of producing hypergraph visualizations compliant

with the notion, requirements, and constraints mentioned in the present chapter.

2.1 Graphs

This section formalizes graphs. The definitions are based on notation that is common in the literature on graph theory, for instance in [62]. The author of that book published a discussion of notation in graph theory in [61]. Here, the most preferable and common notation is applied.

2.1.1 Notation

A finite graph G = (V, E) is a pair of a finite set V(G) of vertices and a set E(G) ⊆ V(G) × V(G) of edges, i.e., pairs of vertices of V(G) [22, page 2]. An undirected graph does not observe the direction of edges, whereas a directed graph (digraph) distinguishes the start vertex (tail) from the end vertex (head) of an edge.

Edges join two vertices v1, v2 ∈ V(G). Two vertices v1 and v2 are called adjacent if there is an edge joining them in E(G). A directed edge is indicated by (v1, v2) ∈ E(G), whereas {v1, v2} ∈ E(G) signifies an undirected edge, where the order of vertices is irrelevant. Two edges e1, e2 ∈ E(G) are adjacent if they have a node in common. The incident edges of a node v are the edges {e ∈ E(G) | v ∈ e} that start or end in v. The degree deg(v) of a node v ∈ V(G) denotes the number of incident edges of v.
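The notions of adjacency, incidence, and degree can be illustrated with a minimal sketch. The example graph and the function names below are assumptions made for illustration only; undirected edges are stored as two-element sets.

```python
# Hypothetical example graph; each undirected edge {v1, v2} is a two-element frozenset.
V = {"a", "b", "c", "d"}
E = {frozenset({"a", "b"}), frozenset({"a", "c"}), frozenset({"b", "c"})}

def incident_edges(v, edges):
    """Return the edges {e in E | v in e} that start or end in v."""
    return {e for e in edges if v in e}

def degree(v, edges):
    """deg(v): the number of edges incident to v."""
    return len(incident_edges(v, edges))

def adjacent(u, v, edges):
    """Two vertices are adjacent if an edge joins them."""
    return frozenset({u, v}) in edges
```

In this example, vertex "d" has no incident edges and therefore degree 0.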

A connected graph G is a graph where any two vertices v1, v2 ∈ V(G) are connected with each other by a path in G. A path is a sequence of vertices from v1 to v2 in which each pair of consecutive vertices is adjacent.
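Connectivity can be checked by a breadth-first search from an arbitrary vertex. The following sketch assumes undirected edges given as pairs; the function name is chosen for illustration.

```python
from collections import deque

def is_connected(vertices, edges):
    """Check whether every vertex is reachable by a path from an arbitrary start vertex."""
    vertices = set(vertices)
    if not vertices:
        return True
    neighbors = {v: set() for v in vertices}
    for u, v in edges:                      # undirected: register both directions
        neighbors[u].add(v)
        neighbors[v].add(u)
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors[current] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen == vertices
```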

A weighted graph G = (V, E, w) assigns each vertex and edge a weight. A weighting function w that is a union of vertex and edge weighting functions, w : (V(G) ∪ E(G)) → ℝ, maps vertices and edges to their weight, i.e., a real number. Unweighted graphs can be treated as weighted graphs by adding the constant weighting w : (V(G) ∪ E(G)) → {1}. The weights of a vertex v ∈ V(G) and an edge e ∈ E(G) are abbreviated by $w_v$ and $w_e$, respectively.

A hierarchical graph H = (V, E, w, T ) is defined by a base graph and a hierarchy tree.

A base graph is a (weighted) graph (V, E, w) that models the non-hierarchical adjacency

relation between vertices as in Figure 2.1a. In addition to a base graph, a hierarchical

graph models parent-child relations between vertices, for instance the inclusion relation,

by means of an acyclic hierarchy tree T (H). Figure 2.1b depicts the hierarchy tree of the

hierarchical graph in Figure 2.1a. Vertices V (H) of the base graph are the leaves in T (H).

By definition, each leaf has exactly one hierarchical parent. At each level of the tree T (H),

inner (non-leaf) nodes of the hierarchy tree imply subgraphs of the base graph (V, E, w)

with a partition of base graph vertices.
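The clustering implied by a hierarchy tree can be sketched as follows. The tree, its node names, and the dictionary representation are assumptions for illustration; they are not taken from Figure 2.1.

```python
# Hypothetical hierarchy tree T(H): each inner node maps to its children.
# Names without an entry are leaves, i.e., vertices of the base graph.
hierarchy = {
    "root": ["c1", "c2"],
    "c1": ["v1", "v2"],
    "c2": ["v3", "c3"],
    "c3": ["v4", "v5"],
}

def leaves_below(node, tree):
    """Collect the base graph vertices (leaves) in the subtree of `node`."""
    children = tree.get(node)
    if children is None:                    # a leaf, i.e., a base graph vertex
        return {node}
    result = set()
    for child in children:
        result |= leaves_below(child, tree)
    return result

def clustering(tree):
    """Map every inner node to the cluster of base graph vertices it implies."""
    return {inner: leaves_below(inner, tree) for inner in tree}
```

Each inner node thus yields one cluster, and the clusters of the children of a node partition the cluster of that node.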

Figure 2.1: Example hierarchical graph and its hierarchy tree: (a) a box-and-line visualization of a hierarchical graph, (b) the hierarchy tree of the hierarchical graph of (a). The hierarchical structure of the graph implies a graph clustering that is marked with gray circles.

The hierarchical graph clustering of a base graph denominates the hierarchical grouping

of base graph vertices that is implied by the hierarchy tree, as Figure 2.1 illustrates. Graph

clustering does not consider the spatial position of vertices in a layout; it is solely based

on the information of a (hierarchical) graph as described above.

The set of vertices V (G) and edges E(G) of a graph G are denoted by V and E,

respectively, if it is clear from the context which graph is of interest. |V | and |E| denote

the number of vertices and edges of a graph, respectively.

Visualization In reference to [22], a typical way to visualize graphs is the box-and-line

diagram: each vertex is depicted by a dot and an edge is depicted by a line that joins two

of these dots. Regardless of the gray enclosing ellipses, Figure 2.1a depicts a graph in a

box-and-line diagram.

Strictly speaking, a graph drawing is a visualization that assigns further properties like

shape, color, and size to graph objects. Box-and-line diagrams, also called straight-line

drawings, are used in this work. This allows us to focus on the computation of the graph or

hypergraph layout, because it is not relevant how the dots and lines are drawn.

A two-dimensional graph layout $p = (p_v)_{v \in V}$ of a graph G = (V, E) is a vector of vertex positions $p_v \in \mathbb{R}^2$. The abscissa and ordinate of a vertex position $p_v$ are denoted as $p_v^x$ and $p_v^y$, respectively.

A vertex is called node in the context of a visualization to distinguish a vertex as an

object of the graph model from its visualization. The term node is also used to emphasize

that a position in the layout was assigned to the corresponding vertex of the graph.

2.1.2 Energy-Based Graph Layouts

This section provides an in-depth understanding of the computation of graph layouts

and describes different criteria of readability of graph visualizations. The energy-based

graph layout computation is introduced, since these concepts are later applied to compute

hypergraph layouts.

The astrophysical N-body problem is similar to the energy-based computation of graph

layouts. Analogous to planets in a solar system, forces act on the vertices of a graph. Typically,

vertices of a graph repel each other and edges cause an attraction between joined

vertices. Layout algorithms alter the graph layout from initial (maybe arbitrary) positions

of vertices to a stable configuration, where the forces compensate each other and create

an equilibrium.

The following two paragraphs explain the principle of layout computation based on

forces or an energy function. By this, the relation and analogy between both approaches

is revealed.

Force-Directed Force-directed methods [29] are generally applied to undirected graphs. Similarly, spring embedder algorithms [23] model mechanical systems that use Hooke’s law to describe forces between vertices. Edges are modeled as springs tied to the corresponding vertices. A graph layout is produced by computing an equilibrium: the heuristic algorithms move the vertices in several iterations to a stable and final location. In each iteration, the forces on each vertex are calculated. The net force acting on a vertex is the vector sum of the individual forces acting on it. Direction and magnitude of the net force determine the displacement of each vertex in each iteration of the layout algorithm. Once all vertices have reached stable positions, i.e., the forces acting at each vertex position compensate each other, no vertex is moved anymore.
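The iteration described above can be sketched as follows. This is a minimal illustration, not the algorithm of [23] or [29]: the repulsion law (magnitude proportional to 1/distance), the parameter values, and all names are assumptions.

```python
import math

def net_forces(pos, edges, rest_length=1.0, stiffness=0.1, repulsion=0.05):
    """Return the net force (vector sum of all individual forces) per vertex."""
    force = {v: [0.0, 0.0] for v in pos}

    def apply(u, v, magnitude_of):
        dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
        dist = math.hypot(dx, dy) or 1e-9   # avoid division by zero
        f = magnitude_of(dist)              # > 0 pushes apart, < 0 pulls together
        force[u][0] += f * dx / dist
        force[u][1] += f * dy / dist
        force[v][0] -= f * dx / dist
        force[v][1] -= f * dy / dist

    nodes = list(pos)
    for i, u in enumerate(nodes):           # assumed repulsion between all vertex pairs
        for v in nodes[i + 1:]:
            apply(u, v, lambda d: repulsion / d)
    for u, v in edges:                      # Hooke's law: springs along the edges
        apply(u, v, lambda d: -stiffness * (d - rest_length))
    return force

def spring_layout(pos, edges, iterations=500, step=1.0):
    """Displace every vertex along its net force in each iteration."""
    pos = {v: tuple(p) for v, p in pos.items()}
    for _ in range(iterations):
        force = net_forces(pos, edges)
        pos = {v: (pos[v][0] + step * force[v][0],
                   pos[v][1] + step * force[v][1]) for v in pos}
    return pos
```

For two vertices joined by one edge, the iteration stops changing the layout at the distance where spring attraction and repulsion cancel each other.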

Energy-Based Energy-based algorithms also move vertices iteratively. An energy function

assigns a value to each layout. This value expresses the quality of the graph layout.

A lower energy stands for better layout quality than a higher one. Thus, the computation

of graph layouts is equivalent to minimizing the system’s energy, a common mathematical

optimization problem. Since the force is the negative gradient of the energy, the forces

that determine the displacement of the vertices in each iteration are easily derived from

the energy functions. Because of this relation, energy and

force equations can be used interchangeably to describe layout algorithms.
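Written out, the relation between energy and force reads as follows. The symbol $F_v$ for the net force on vertex v and the step size $\eta > 0$ are notational assumptions introduced here for illustration.

```latex
\[
F_v(p) = -\nabla_{p_v} E(p), \qquad p_v \leftarrow p_v + \eta \, F_v(p)
\]
```

A layout is in equilibrium when $F_v(p) = 0$ for all vertices v, which is the case exactly when the gradient of the energy vanishes.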

2.1.2.1 Concrete Energy Models for Layout Computation

Energy models specify the type of layouts that are computed by energy minimization

algorithms [46]. The computation of layouts denotes the calculation and assignment of

two- or three-dimensional positions to the vertices of the graph. Energy-based graph

drawing algorithms are able to compute interpretable box-and-line visualizations. For

instance, the distance between nodes may reflect the cluster structure of graphs as in [46].

It is also possible to meet aesthetic criteria like uniform node distribution or a uniform

edge length. However, these layouts do not reveal information by the positions of nodes.

This section introduces concrete energy models, as similar models serve as a starting

point of the layout techniques used for the energy-based computation of hypergraph visualizations.

LinLog Energy Models The idea of LinLog energy models is that nodes repel each other and adjacent nodes attract each other. They were introduced by Noack in [44]. The LinLog energy models produce layouts in which densely connected nodes are grouped closely, whereas unconnected or sparsely connected nodes are separated. In contrast to the model of Fruchterman and Reingold [29], this approach considers clustering criteria to position the nodes. Thus, the layout reveals affiliation information of nodes. The total (node-repulsion) LinLog energy $E_{nr}(p)$ of a graph layout p was defined as

\[
E_{nr}(p) = \sum_{\{u,v\} \in E} \lVert p(u) - p(v) \rVert - \sum_{\{u,v\} \in V^{(2)}} \ln \lVert p(u) - p(v) \rVert \tag{2.1}
\]

As the name of the energy model suggests, the attraction energy (the first sum in Equation

2.1) grows linearly with increasing distance between two adjacent nodes u and v. The

repulsion energy (the second sum) decreases logarithmically with increasing distance between a pair of nodes.
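As an illustration, the node-repulsion LinLog energy of Equation 2.1 can be evaluated for a given layout as sketched below. The dictionary representation of the layout and the function name are assumptions; nodes at identical positions are not handled, because the logarithm is undefined at distance zero.

```python
import math
from itertools import combinations

def linlog_energy(pos, edges):
    """Node-repulsion LinLog energy of a layout given as {node: (x, y)}."""
    # Attraction: linear in the distance between adjacent nodes.
    attraction = sum(math.dist(pos[u], pos[v]) for u, v in edges)
    # Repulsion: logarithm of the distance, summed over all unordered node pairs.
    repulsion = sum(math.log(math.dist(pos[u], pos[v]))
                    for u, v in combinations(pos, 2))
    return attraction - repulsion
```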

The edge-repulsion LinLog energy $E_{er}(p)$ of a graph layout p further takes the degree of nodes into account. The repulsion energy of each pair of nodes is weighted by the number of incident edges of both nodes. Thus, a node with a rather high degree repels nearby nodes more strongly, which removes the bias of node-repulsion LinLog energy models towards centering nodes with high degrees in the layout area.

\[
E_{er}(p) = \sum_{\{u,v\} \in E} \lVert p(u) - p(v) \rVert - \sum_{\{u,v\} \in V^{(2)}} \deg(u) \cdot \deg(v) \cdot \ln \lVert p(u) - p(v) \rVert \tag{2.2}
\]

PolyPoly Energy Models The parameterized PolyPoly energy models generalize the LinLog energy models. A more thorough investigation of PolyPoly energy models, which exposes their differences and enables a conversion between them, is given in [35]. Attraction and repulsion are both described by polynomials.

\[
E_{pp}(p) = \sum_{\{u,v\} \in E} \lVert p(u) - p(v) \rVert^{a} - \sum_{\{u,v\} \in V^{(2)}} \deg(u) \cdot \deg(v) \cdot \lVert p(u) - p(v) \rVert^{r} \tag{2.3}
\]

The attraction exponent a > 0 and repulsion exponent r ≤ 0 are customizable parameters that allow layouts reflecting different criteria to be generated. For r = 0, the repulsion is defined by the logarithm, as in the edge-repulsion LinLog energy model in Equation 2.2. PolyPoly energy models are also denoted as (a, r)-energy models. The nomenclature may be extended to r-LinPoly energy models, if a = 1 and r < 0, and to a-PolyLog energy models, if a > 1 and r = 0. PolyPoly energy models are the generalization of LinLog energy models, which correspond to the (1, 0)-energy models. For instance, a (2, −2)-energy model was also used in association with simulated annealing [21].
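The role of the two exponents can be illustrated by a sketch that transcribes Equation 2.3 as printed, with the logarithmic repulsion of Equation 2.2 for r = 0. The function name and the layout representation are assumptions; normalization factors that other formulations of (a, r)-energy models may use are deliberately not included.

```python
import math
from itertools import combinations

def polypoly_energy(pos, edges, a=1.0, r=0.0):
    """(a, r)-energy of a layout {node: (x, y)}, following Equation 2.3 as printed."""
    deg = {v: 0 for v in pos}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    attraction = sum(math.dist(pos[u], pos[v]) ** a for u, v in edges)
    repulsion = 0.0
    for u, v in combinations(pos, 2):
        d = math.dist(pos[u], pos[v])
        # r = 0 selects the logarithmic repulsion of the LinLog model.
        term = math.log(d) if r == 0 else d ** r
        repulsion += deg[u] * deg[v] * term
    return attraction - repulsion
```

With the default parameters (1, 0), the function evaluates the edge-repulsion LinLog energy.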

2.1.2.2 Global vs. Local Energy Minima

A graph layout with minimal energy has reached a stable state, i.e., further displacements

of nodes do not decrease the system’s energy. Equivalently, all forces acting on each node

compensate each other. The net forces of all nodes are zero. The nodes are consequently

not moved anymore. Achieving globally minimal energy is the goal of energy-based graph

drawing algorithms.

A local energy minimum is a local minimum of the energy function. If a layout p has locally minimal energy, then every similar graph layout p′ has a higher energy E(p′) > E(p), but both energies, E(p) and E(p′), are potentially far above the globally minimal energy $E_{min}$, i.e., E(p′) > E(p) ≫ $E_{min}$. A graph drawing algorithm with the goal of minimizing the energy of the layout p moves the nodes according to the acting forces. Any similar layout p′ of the next iteration, however, would have a higher energy E(p′) > E(p). Consequently, the nodes are not moved further; the layout p is trapped in a local energy minimum.
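The effect can be reproduced with a toy example. The one-dimensional function below is an assumption chosen only because it has one local and one global minimum; it is not a graph layout energy.

```python
def energy(x):
    """Toy energy with a local minimum near x = 1.13 and a global one near x = -1.30."""
    return x**4 - 3 * x**2 + x

def gradient(x):
    return 4 * x**3 - 6 * x + 1

def descend(x, step=0.01, iterations=2000):
    """Follow the force, i.e., the negative gradient, from the start value x."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x

trapped = descend(2.0)    # ends in the local minimum
best = descend(-2.0)      # ends in the global minimum
```

Both runs end in a state where the force vanishes, but the energy reached from the first start value is higher than the energy reached from the second.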

LinLog energy models can produce interpretable and readable graph layouts by encoding

structural information of the graph into the node positions. However, the computation

of these layouts is not straightforward, since LinLog energy models have more local energy

minima than, e.g., PolyPoly energy models, and thus are more susceptible to getting

trapped in local energy minima [35]. PolyPoly (except for LinLog) energy models fortunately

have fewer local energy minima, but are less likely to produce interpretable layouts

than LinLog energy models. In [35], we evaluate a method to combine the benefits of

both models. Local energy minima are bypassed and the energy minimization algorithms

still produce interpretable LinLog layouts. This approach makes it possible to disregard local energy

minima for the energy-based graph visualization, as it is capable of overcoming them.

2.1.3 Readability and Aesthetics of Graph Drawings

A graph is eligible [31] for visualization if there is an inherent relation among the data

elements to be visualized. Data can be represented by vertices of a graph, with edges

Figure 2.2: The meaning of layout: two graph drawings, (a) and (b), of the same graph. Taken from [65], “The dangers of visual representation.”

representing their relations. The graph drawing problem denotes the process of arranging

vertices in a drawing area and visualizing edges by curves. A visualization that communicates

the displayed data and its structure to a viewer is only as good as the positioning of

vertices and edges. Figure 2.2 shows two distinct drawings of the same graph. The layout

in Figure 2.2a lets the viewer recognize the graph structure. In contrast, the visual clutter

of curves and the even distribution of vertices in the layout of Figure 2.2b do not allow

any conclusion to be drawn from the graph drawing.

2.1.3.1 Aesthetic Criteria

Graph drawings are a common language to express structures in a formal way. A major

quality criterion for graph drawings is readability, i.e., the ability to capture the meaning

of a drawing [59]. However, it is very complex to generate aesthetically pleasing drawings

automatically. For instance, the minimization of the number of edge crossings is NP-hard.

Aesthetics describes criteria that aid readability of graph drawings. Examples of such

criteria are:

• Minimization of the number of edge crossings

• Minimization of the number of edge bends

• Minimization of total or average edge length

• Uniform edge length

• Uniform vertex density

• Avoidance of unnecessary squandering of space

Each criterion features the capability to foster readability. However, the context of the

visualization and the viewers’ preferences demand a prioritization of the influence of these

Figure 2.3: Different aesthetic criteria applied on the same graph: (a) minimal number of edge bends, (b) minimal number of edge crossings. Taken from [59], “Conflicts among aesthetics.”

criteria. Figure 2.3 depicts the implications of pursuing two different criteria for the

visualizations. The graph drawing in Figure 2.3a was drawn with a focus on a minimal

number of edge bends. The graph drawing in Figure 2.3b instead minimizes

the number of edge crossings. It remains arguable which layout has a higher

readability. The evaluation of graph drawings thus is biased towards a certain set and

prioritization of criteria. Nevertheless, this freedom of choice allows drawings to be fitted to

specific applications and purposes.

But what characterizes a good graph drawing? Besides the definition of properties,

like planarity and the classification of layouts, there are plenty of aesthetic rules. Not

all of those are without controversy, because very few of the findings of cognitive science

have practical application [31] and only a few usability studies were published. Some very

common aesthetic rules are uniform edge length, minimal number of edge crossings, and

an even distribution of vertices. In [50, 51], Purchase shows the importance of a reduction

of edge crossings to foster readability of graph drawings, but the minimization of edge

bends seems to be less beneficial to readability.

Graph layouts are typically computed by layout algorithms. The choice of a layout

algorithm implies the set of aesthetic criteria that are considered. A taxonomy of graph

drawing algorithms regarding their aesthetic criteria was published in [59].

The aesthetic criteria of graph drawings serve as a measure for readability of graph

drawings. However, they are not applicable to our notion of hypergraph visualizations.

For instance, Figure 2.2 and Figure 2.3 do not consider node occlusions explicitly, which

is in accordance with the majority of available publications. Consequently, we have to

define different requirements and criteria for the visualizations of hypergraphs. These are

introduced in Sections 2.2.2 and 2.2.3.

2.1.3.2 Limitations of Graph Visualizations

From a viewer’s perspective, a graph visualization should answer the questions “Where am

I?” and “Where is the object of interest?” [31]. But there are two general limitations on

information visualization: the available screen space and the cognition as a human ability.

The latter limit is reached more quickly [52].

Screen space is a valuable and restricted resource. A visualization is not useful in terms

of readability if there is a shortage of screen space. Besides the screen resolution, the most

critical parameter is the amount of data to display. Miscellaneous techniques overcome

the problem of information overload, for instance by the specification of multiple levels

of abstraction. The part of interest is shown in detail, whereas the rest is aggregated

according to a hierarchical structure of the graph [52, 24]. A viewer can set the level of

detail manually, or the viewer’s perspective is used to determine the degree of abstraction

automatically [7]. Another option to browse graphs is the use of so-called fisheye views [53].

The amount of displayed information, i.e., typically the amount of vertices and edges,

mainly influences the required size of the screen space resource. A large graph size may

involve the following consequences:

• Exceeding the display size. Independent of any graph drawing algorithm, the screen

space will always be limited.

• Decreasing readability. If the layout is too dense, it may become impossible to

distinguish between vertices and edges. Vertices may even overlap.

• Decreasing usability. Usability is fairly subjective since the cognition depends on

human factors, but information overload generally hinders cognition.

• Decreasing performance of drawing algorithms. Time complexity is crucial, as small

changes of the graph must not cause complex computations, in order to allow user

interactions and on-line visualizations.

The effort required to comprehend and analyze a graph visualization correlates with the size of the

graph. However, as illustrated in Figure 2.2a, a visualization may easily reveal the overall

structure of a large graph. Several visualization techniques try to remedy the information

overflow, for instance by navigation-dependent abstractions [7] or the use of non-Euclidean

geometry. Considering these techniques is beyond the scope of this work.

2.2 Hypergraphs

2.2.1 Notation

A hypergraph H = (V, E) allows edges of any (non-zero) cardinality. Thus, a hypergraph is

a generalization of an ordinary graph, as defined above in Section 2.1.1. Conversely, a graph is

a hypergraph with a constant edge cardinality of 2.

A hyperedge is a subset of the vertices of a hypergraph. The set of hyperedges $E \subseteq V(H)^n$ is a subset of the Cartesian product $V(H)^n$ of the set of vertices. The number of vertices of a hyperedge ε ∈ E is denoted by its cardinality |ε| ≥ 1. An unordered hyperedge ε, i.e., a hyperedge in which the order of the vertices is insignificant, is denoted by ε = {v1, v2, ..., vn} with vi ∈ V(H) for 1 ≤ i ≤ n = |ε|. A hyperedge node is a node v ∈ V(H) that is part of a hyperedge, i.e., v ∈ ε for some ε ∈ E(H). This term is used to distinguish hyperedge nodes from nodes of the hypergraph that are not part of a hyperedge.
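A minimal sketch of this notation represents unordered hyperedges as sets of vertices. The example hypergraph and the function name are assumptions for illustration.

```python
# Hypothetical hypergraph; each unordered hyperedge is a frozenset of vertices.
V = {"v1", "v2", "v3", "v4", "v5"}
E = {frozenset({"v1", "v2"}), frozenset({"v2", "v3", "v4"})}

def hyperedge_nodes(hyperedges):
    """Return all nodes that are part of at least one hyperedge."""
    return set().union(*hyperedges)

cardinalities = {e: len(e) for e in E}     # |ε| for every hyperedge ε
```

Here, "v5" is a node of the hypergraph but not a hyperedge node.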

This work distinguishes hypergraphs from graphs in the following way. The underlying

graph G(H) of a hypergraph H denominates the hypergraph without non-binary edges.

G(H) comprises binary edges, i.e., edges of the cardinality 2, of the hypergraph H only.

The set of binary edges that are included in the set of hyperedges of a hypergraph is denoted by $\bar{E}(H)$.

\[
G(H) = (V(H), \bar{E}(H)), \qquad \bar{E}(H) = E(H) \cap (V(H) \times V(H)) \tag{2.4}
\]

The same distinction applies to the notion of edges and hyperedges. Edges, i.e., binary

relations between nodes, are consequently denominated as “edges”, even though the term

hyperedges would also comprise binary edges.
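The underlying graph can be obtained by keeping only the hyperedges of cardinality 2, as the following sketch shows. The function name and the set-based representation of hyperedges are assumptions.

```python
def underlying_graph(vertices, hyperedges):
    """Return (V(H), binary edges): the hypergraph without non-binary edges."""
    binary_edges = {e for e in hyperedges if len(e) == 2}
    return set(vertices), binary_edges
```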

Analogously to the extension of the notion of graphs, the definition of hypergraphs

can be extended to weighted hypergraphs (V, E, w), and weighted hierarchical hypergraphs

(V, E, w, T ).

Terms of Hypergraph Visualization Undirected and weighted hypergraphs are the subject

of the present work. The approaches and techniques of hypergraph visualization

proposed below rely on the following notions and notations.

The distance between two nodes, or two points p1 and p2 in general, is the Euclidean

distance, i.e., the length of the vector p2 − p1, which is denoted ‖p2 − p1‖.

A curve is the visualization of the relation between two nodes. This notion includes

binary edges e ∈ E(G) of a graph G = (V, E) as well as constituent parts of a hyperedge

ε ∈ E(H). A hyperedge ε consists of a set of curves C(ε). A curve c ∈ C(ε) is the

connection of its two end points u and v, which are denoted by start(c) and end(c).

A curve c can be represented by a set ∆(c) of distinguished points. Each distinguished

point δ ∈ ∆(c) has two neighbors neighbor(δ) = {δ1, δ2 |δ1, δ2 ∈ ∆(c) ∪ {start(c), end(c)}}

on the same curve. Near the ends of curves, one neighbor of δ is an end point of the curve.

An end point of a curve has exactly one neighboring distinguished point on the same curve.

The set of distinguished points of a hyperedge is abbreviated by

\[
\Delta(\varepsilon) = \bigcup_{c \in C(\varepsilon)} \Delta(c).
\]

These distinguished points ∆(c) divide the curve c into curve segments connecting neighboring

points of ∆(c). The number of curve segments |∆(c)| + 1 is determined by the

number of distinguished points of the curve.

The length len(c) of a straight-line curve c is the Euclidean distance between the end

points of the curve. Otherwise, the length of a curve is the sum of the lengths of its curve

segments, where the length of a curve segment is the Euclidean distance between the

end points of the curve segment.
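The length of a curve and its number of segments can be computed as sketched below. Representing a curve by its start point, an ordered list of distinguished points, and its end point is an assumption for illustration.

```python
import math

def curve_length(start, distinguished, end):
    """Sum of the Euclidean lengths of the segments between neighboring points."""
    points = [start, *distinguished, end]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def segment_count(distinguished):
    """A curve with |Δ(c)| distinguished points consists of |Δ(c)| + 1 segments."""
    return len(distinguished) + 1
```

For a straight-line curve without distinguished points, the sum reduces to the distance between the two end points.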

A two-dimensional hypergraph layout p of a hypergraph H = (V, E) is a vector of positions

in $\mathbb{R}^2$. The hypergraph layout p comprises the positions of all points necessary

to visualize a hypergraph. Consequently, p contains positions of end and distinguished

points of each curve and the position of the vertices of the hypergraph.

2.2.2 Requirements of Hypergraph Visualizations

This section clearly states the requirements of the envisioned hypergraph visualizations,

which were already outlined above. Every layout and drawing technique presented in

this work must meet these conditions. The requirements thus constrain the options and

freedom to design layout techniques.

2.2.2.1 Uniformity

Formally, a hyperedge is a set of vertices. This work does not impose an order on nodes

of a hyperedge. There is also no weighting of a particular subset of hyperedge nodes.

Consequently, all vertices of a hyperedge are equally related with each other. This entails

that, regardless of how the hyperedge nodes are interconnected with each other, visualizations

of hyperedges must not emphasize any particular part of the connectivity of the

hyperedge.

2.2.2.2 Preservation of Layout Expressiveness

Graph drawings are able to express information by the positions of nodes. Energy-based

algorithms for instance may arrange nodes with respect to connectivity between vertices as

outlined in Section 2.1.2.1. Such a layout places densely connected vertices close together and sparsely

connected vertices farther apart [46]. This characteristic allows viewers to perceive the

relationships among vertices due to the layout without displaying the binary edges. Furthermore,

the overall structure of a depicted graph is revealed by a grouping of nodes in

the layout, which reflects the graph clustering. In order to preserve this expressiveness of

the graph drawing, hyperedge visualizations must meet the following three requirements.

Fixed Graph Layout The graph layout is immutable. If nodes were displaced by adding

a hyperedge into the graph layout, then the displayed information might be significantly

distorted. The connectivity between nodes and the graph clustering in particular are not

properly reflected by an altered graph layout.

Consequently, the visualization of hyperedges must not alter the layout, because otherwise

the expressiveness of a layout would be compromised and the revealed information

about the graph would be faulty and thus less useful.

Avoidance of Node Occlusion Also, a hyperedge must not hide nodes of the graph.

Otherwise the displayed information is incomplete, and the drawing’s expressiveness is

faulty and may cause misleading conclusions. This requirement applies to hyperedges

that are visualized as curves as well as to those visualized as planes. The latter case is

obvious, as a plane can cover nodes completely. Edges or curves might not completely cover

the visual representation of nodes, but since a discussion about the line width of curves is

not part of this work, curves of the hyperedges must not cover nodes either.

Avoidance of Cluster Intersections Curves or slim extents of a plane can be misinterpreted

as boundaries and thus visually divide the surrounding graph layout. An intersected

cluster is visually partitioned by the hyperedge. Consequently, a hyperedge must

not intersect node clusters that do not contain hyperedge nodes.

2.2.2.3 Visual Complexity

A proper hypergraph drawing offers high readability and allows for quick comprehension of

the displayed connectivity between hyperedge nodes. A simple structure of the visualized

hyperedge avoids information overload. Only a minimal amount of information is shown

to a viewer.

2.2.3 Criteria for Hypergraph Visualizations

Criteria are directly derived from the requirements of hypergraph visualizations. They

measure the compliance of actual hypergraph drawings with the requirements and are

used to evaluate techniques that are presented in the subsequent chapters of this work.

These criteria for the quality of hypergraph drawings are also used for the evaluation in

Chapter 5 at the end of this work.

Node Occlusion The number of nodes occluded by a hyperedge is a direct measure

of layout quality of hypergraphs. It indicates the amount of visual clutter a hyperedge

introduces. A high number of occluded nodes implies a high amount of information that a

hyperedge hides or distorts and thus implies a low layout quality. Algorithms

that compute the layouts of hyperedges are therefore evaluated with respect to their ability to avoid node

occlusions.
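One way to count occluded nodes is sketched below. It assumes that nodes are drawn as disks of a common radius, that a curve is given as a list of points joined by straight segments, and that the hyperedge nodes themselves are not counted; none of these choices is prescribed by the criterion itself.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the straight segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:                      # degenerate segment
        return math.dist(p, a)
    # Parameter of the projection of p onto the segment, clamped to [0, 1].
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))

def occluded_nodes(node_pos, curve, hyperedge_nodes, radius):
    """Nodes (other than hyperedge nodes) whose disk is crossed by the curve."""
    segments = list(zip(curve, curve[1:]))
    return {v for v, p in node_pos.items()
            if v not in hyperedge_nodes
            and any(point_segment_distance(p, a, b) < radius for a, b in segments)}
```

A straight curve between two hyperedge nodes may occlude a node lying between them, whereas a curve routed through an additional point can avoid it.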

Cluster Intersections As mentioned above, hyperedges should not intersect dense groups

of nodes. Clusters may only be penetrated to connect a hyperedge node of the respective

cluster. A “good” hypergraph layout does not intersect dense groups of nodes without connecting

one of them, and a high number of cluster intersections signifies a bad hypergraph

layout.

Visual Complexity The visual complexity of a hyperedge is the amount of visualized

information required to visualize the hyperedge. A low visual complexity is preferred

since it promises higher readability and quicker comprehension of the displayed relations

between hyperedge nodes.

The complexity of a hyperedge is measured by the number or extent of visualized objects,

for instance the number of curves, the total length of all curves, or the total circumference

of a plane. The surface area of a plane hyperedge visualization is not a valid measure,

because its size can be highly restricted by the surrounding graph layout. The notion of

visual complexity of hypergraph drawings is elaborated in Section 4.1.

Stability Predictability [47], also referred to as the stability of layouts, is an important

criterion to evaluate layout algorithms. It denotes the stability of calculated graph layouts

against minor changes of the graph. Repeated layout computations for the same graph should also

lead to the same or a very similar result.

This criterion is especially important for on-line animations of the evolution of a system

that is depicted by the graph. For example, to animate the evolution of software

systems, i.e., sequentially visualize a set of source code snapshots with smooth transitions

in-between, it is crucial to avoid radical changes of graph and hypergraph layouts. A small

change of the given graph layout, caused by an added node or by some slightly rearranged

nodes, should only have a minor impact on the layout of hyperedges.

This criterion might be considered to evaluate hypergraph visualizations, though it is

not a primary concern of this work.

2.2.4 Related Work

Hypergraphs are frequently used in the context of information visualization. This is substantiated

by a plethora of multifaceted and well-established applications, for instance

the relationships in database systems (e.g., entity relationship diagrams), electrical circuits,

dynamic programming (e.g., debugging of Dyna), parallel programming, debugging

of makefiles, chemical reactions [25] (vertices represent chemical substances and each hyperedge

represents a chemical reaction), and theorem proving.

However, little research has been conducted on the visualization of hypergraphs in particular.

The following paragraphs describe the work of other authors related to the visualization of

hypergraphs and distinguish their conception from the requirements of the present work.

First, two general types of hypergraph visualizations are discussed. Then, the outcomes

of published research on topics that do not primarily focus on hypergraphs are covered.

These instances utilize hypergraphs in their visualizations but are too dissimilar to the

goals of this work.

2.2.4.1 Types of Hypergraph Visualizations

Mäkinen distinguished two general types of hypergraph drawings: edge standard and

subset standard [40]. Points in two-dimensional space are utilized to represent the vertices

of a hyperedge. The edge standard (Figure 2.4a) connects hyperedge nodes either with

straight-line curves or smooth curve lines [38]. The subset standard as in Figure 2.4b

represents each hyperedge by a closed curve that contains all vertices of the hyperedge.

Subset Standard

Bertault and Eades investigated hypergraph drawings in the subset standard [14]. Their

PATATE system uses force-directed methods to compute the closed curves, optionally as

convex hulls of the subset of vertices. Three structures of the hyperedges were proposed.

The first one introduces a dummy node that represents the hyperedge and is connected

to all vertices of the hyperedge. The second structure is a minimal Euclidean spanning

Figure 2.4: Hypergraph drawings produced by the PATATE system, taken from [14]: (a) edge standard, (b) subset standard.

Figure 2.5: Two hypergraph drawings in the subset standard from [14] show that this method obviously suffers from the absence of readability: (a) a “simple” hypergraph drawing [14] with fairly poor readability; (b) the readability problem, as in (a), is a prevalent issue.

tree that is build upon the vertices of the hyperedges. Lastly, the third option is a Steiner

tree. Steiner points are introduced and connected to the vertices of the hyperedge. In all

three cases, the underlying structure of a hyperedge is modeled by introduced vertices and

binary edges on these vertices. A force directed method places this underlying structure

in the layout.

The Steiner tree produces reasonable layouts for small hypergraphs, but fails for a high

number of hyperedges [14]. Unfortunately these methods do not consider the graph layout

at all, and thus occlusions of nodes and cluster intersections were not addressed. Figure 2.5

shows two example hypergraph drawings produced with the PATATE system, which were

published in their demo paper [14]. They both suffer visual clutter introduced by only a


few hyperedges in these small graphs. In contrast to these figures, our work investigates

approaches that focus on readability and the preservation of the graphs’ expressiveness.

Neither issue is addressed by Bertault and Eades. Still, the introduction of vertices

that represent a hyperedge is also utilized for hypergraph visualizations in the present

work.
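The dummy-node and spanning-tree structures described above can be illustrated with a short sketch. It is not the PATATE implementation; the function names, the dummy node label "h", and the use of Prim's algorithm for the Euclidean spanning tree are assumptions made for illustration only.

```python
import math

def dummy_node_edges(hyperedge, dummy="h"):
    """Binary edges from a (hypothetical) dummy node to every hyperedge vertex."""
    return [(dummy, v) for v in hyperedge]

def euclidean_mst_edges(positions):
    """Prim's algorithm on the complete Euclidean graph of the hyperedge vertices.

    positions: dict mapping a vertex name to its (x, y) position.
    Returns a list of binary edges (u, v) that form a minimum spanning tree.
    """
    names = list(positions)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        best = None
        # Find the shortest edge leaving the current tree.
        for u in in_tree:
            for v in names:
                if v in in_tree:
                    continue
                d = math.dist(positions[u], positions[v])
                if best is None or d < best[0]:
                    best = (d, u, v)
        _, u, v = best
        in_tree.add(v)
        edges.append((u, v))
    return edges
```

In both cases a hyperedge is reduced to binary edges, which is what allows a conventional force-directed method to place the structure.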

Euler Diagrams The complexity to compute hypergraph

layouts in the subset standard is similar to the

complexity of drawing Venn diagrams. Euler diagrams

are a generalization of Venn diagrams. Work on hypergraph
layouts in the subset standard often focuses on the
planarity of hypergraphs [20]. Mäkinen also mainly considered
the planarity of hyperedge drawings [40]. Mutton

et al. enhanced Euler diagrams with hypergraphs [42] in

order to compute Euler diagram layouts. This approach

also introduces vertices that constitute hyperedges. Each

vertex represents a contour of an Euler diagram and is

placed by force-directed methods, cf. Figure 2.6 that

shows a trivial example. The methods aim at a minimization

of edge crossings and of the hypergraph’s total

edge length in order to increase aesthetics and layout

quality of the Euler diagrams.

Figure 2.6: An Euler diagram

enhanced by a hypergraph,

taken from [42]

In contrast to Euler diagrams, hypergraph drawings in the subset standard should avoid

intersections and occlusions. Further, this approach is limited to straight-line edges (and to

some simply shaped curves). But a visualization of hypergraphs with the goal of generality

must allow arbitrarily shaped edges.

Edge Standard

Local browsing is an approach to navigate through large or infinite directed hypergraphs

where only a small subgraph is visible at any given time [25]. Real world graphs are

usually too large for static graph drawings and the local browsing approach might be

applicable to several drawings. More important for this work is the technique used to visualize
hypergraphs. In [25] Eisner et al. also introduce an intermediate vertex that is

connected by directed binary edges to each vertex of the hyperedge. Unfortunately, their

presented method has several limitations, which make the method inapplicable to general

hypergraph drawing techniques that are introduced in the present work. For example, the

layouts are based on Sugiyama [55], which means that the ordinate of a vertex position is

chosen such that upwards edges are minimized. A top-to-bottom flow, similar to a tree,

with edges represented as splines, however, is much too restrictive for a general approach

for hypergraph visualizations. The directional flow does not allow force-directed layout

methods. Furthermore, edge crossings are not prevented and the hyperedges may occlude

the remaining vertices of the graph.

A distinction between directed and undirected hypergraphs as in [38] is not investigated

in the present work. The given scenarios and applications of hypergraph visualizations

rarely benefit from the additional display of the direction.

2.2.4.2 Implicit Research on Hypergraph Visualizations

The computation of graph and hypergraph layouts is addressed in various areas, such as

hardware and software design. Also, diagrams of the information systems life cycle, like

PERT charts, data flow diagrams, entity relationship diagrams, and database management

system model diagrams [59] are laid out as networks.

An overview of literature on this topic is given by Battista et al. in [12]. The hypergraph

layouts are either in the straight-line standard (for instance for the visualization of trees)

or in the grid standard (also called orthogonalization) [59]. The latter standard allows vertices
and edge bends to be placed on the tracks of the grid only. The edges are fragmented

into horizontal and vertical straight-line edge segments.

Orthogonal Standard Network diagrams, PERT charts, and entity relationship diagrams

are familiar instances of graph layouts that are usually drawn in the orthogonal standard,

which is among the most popular standards [34]. Vertices are represented by rectangular

blocks and are placed on intersections of the underlying grid. According to Mäkinen,

the orthogonal standard is preferred over any other graph standard, since the perceptual

organization of symmetric and aligned patterns is preferred over a free positioning [39,

page 441].

The layouts of these diagrams are computed considering aesthetic criteria [51] like a

minimization of edge bends, edge crossings, total edge length, maximum edge length, or

the establishment of a uniform edge length. The occlusion of blocks (by edges or other

blocks) is avoided by choosing other edge paths on the grid or moving the blocks [36].

However, limitation of available routes and vertex positions is a restriction of the layout

that is not admissible for this work. Further, orthogonal layouts tend to consume more

screen space than box-and-line visualizations with unrestricted positioning of vertices and

edges.

Hypergraphs and Electrical Circuits Due to the way electrical circuits, such as integrated

circuits, are fabricated, their design depends on the orthogonal graph standard as well. The

design and visualization of circuits highly rely on hypergraphs. The gates are represented

by vertices and interconnected by several wires. Algorithms that are similar to the ones

described above for the orthogonal standard are available to route edges (wires) such

that edge crossings and bends are minimized. But the design and visualization (e.g.,

for teaching and documentation purposes) of electrical circuits pursue different goals. A

visualization usually utilizes two dimensions. In contrast, the circuit is designed on a stack

of layers in the third dimension. The wires in circuits are mainly routed with respect to

the efficiency, whereas a visualization targets less confusing wire routes to improve the

clarity of drawings, enable comprehension, and increase readability.

Common visualizations align gates to (horizontal) layers and connect them with orthogonal

straight lines [26]. Eschbach et al. assume that the horizontal layers are initially


at a good position with minimal hyperedge crossings. Then, the relations between the

hyperedge nodes of a hyperedge are aggregated to a single track until a local edge crossing

optimum is reached.

The aggregation of connections of a hyperedge reduces the crossings and the occlusion of
remaining graph objects, which will also be useful for this work. However, node occlusion

is not prevented in this visualization technique, since the given gates are only placed on

a lattice and the edges are geometrically routed with respect to the discrete positions of

the lattice.

Matrix Visualizations Matrix visualizations also reveal the relations between vertices

of a graph. The advantage of matrix views is the clear and stable layout, but it is less

intuitive, especially in comparison to box-and-line diagrams [30].

A thorough comparison between both diagram types included the comparison of node,

path, and subset characteristics. Node characteristics, like the degree of nodes, discovery of

outliers, and labeling, were measured by the average time a concrete task took, e.g., to find

the most connected node. The number of links, the existence of a path between two nodes,

critical paths, or loops are characteristics of paths. Unfortunately, the characteristics of

subgraphs were not measured and compared in [30]. The characteristics of subgraphs, such

as the group of nodes connected to a node or the identification of densely connected node

groups, would be particularly interesting, because a hyperedge corresponds to a subset of

nodes.

The conclusion of the work in [30] is that both visualization types are qualified. Matrices

are preferable for large graphs. Path related tasks are difficult on both types, but viewers

are usually more familiar with box-and-line visualization.

A two-dimensional matrix of vertices indicates a binary edge as an entry in the matrix,

possibly encoded by color, shade, or a number. The ordering of the vertices may reflect

a hierarchy or a clustering of vertices. The representation of hyperedges would require

more dimensions, which is visualizable only to a very limited extent, as any visualization

nowadays is bound to a maximum of three dimensions.

Two imaginable matrix visualization types of hypergraphs are discussed now. First,

each hyperedge is encoded by a certain color (or shade, number) in a two-dimensional

matrix. Any hyperedge cardinality is allowed. However, each vertex is connected to a

maximum of one hyperedge at a time. This also includes binary edges, if they are not

omitted. This approach can only present very few edges, which might be impractical for

real world hypergraphs. Secondly, the matrix is visualized in a 2.5-dimensional layout, i.e.,

the matrix is laid out in the two-dimensional plane as usual. The binary edges between

vertices are indicated in the matrix plane. Additional hyperedges are depicted by dummy

nodes in the third dimension. The dummy node of each hyperedge then connects each

vertex of the hyperedge. The “Hierarchical Net” by Balzer et al. in software landscapes is

similar to this approach and illustrated in Figure 4.5 on page 62.

As a consequence, the fairly intuitive box-and-line visualization of graphs and their

habitual usage promise a higher readability of the displayed graph information. After

all, the communication of structural information of a graph is the main motivation of


information visualization, which can be accelerated by a simple and quick cognition of

graph drawings. Consequently, a matrix view is not studied in the following.

2.2.4.3 Box-And-Line Drawings of Hypergraphs

The essential problem of all box-and-line visualizations is the clutter of edges and nodes.

Edges intersect other edges and occlude vertices of the graph. Viewers have difficulties

to identify vertices and concrete relationships between vertices. This dilemma boils down

to the finding that too much information is displayed in the two or three dimensions of a

box-and-line visualization.

One option is to generate graph layouts that reflect the graph structure by the positions

of the vertices. Edges can be omitted if densely connected vertices are placed closely and

sparsely connected vertices are separated [44]. Thus, no visual clutter of edges confuses a

viewer. Even if it is not clear which pairs of vertices are connected and which are not, a

viewer gets a rough impression of the connectivity of the overall graph. In the majority
of cases this will meet a viewer’s needs. This reduction of visual complexity by

indicating the connectivity by node positions rather than by drawing edges is discussed in

various publications [44, 45, 46] by Noack et al.

In summary, the visualization of hypergraphs with box-and-line diagrams without rigorous
limitations, e.g., the dependency on an orthogonal standard, does not seem to have been
investigated or published yet.

2.3 Hyperedge Visualization Structures

This section represents a first step towards the computation of hypergraph layouts. Unlike

a simple labeling of hyperedge nodes, our notion requires connecting all nodes of a hyperedge
equally, either in the edge or the subset standard. The structures of hyperedge
visualizations, denoted as hyperedge structures for short, describe how a set of curves is
employed to visualize the connectivity between hyperedge nodes.

The structural fundamentals sketching the connectivity of hyperedges make it possible to lead a
viewer’s eyes quickly to all hyperedge nodes. This particular goal rules out a mere labeling
of hyperedge nodes. Hence, a set of curves is introduced to visually model

the structure of hyperedges. Curves, i.e., the binary connection between hyperedge nodes,

must connect all hyperedge nodes equally (cf. uniformity in Section 2.2.2.1).

The remainder of this section introduces and discusses structures of hyperedge visualizations.

These different ways to connect hyperedge nodes with each other by curves are

derived from basic network topologies like line, ring, mesh, fully connected, star, tree, and

bus.

Line and Ring Structure

A chain of hyperedge nodes is not sufficient to uniformly connect them with each other.
Each node of the chain has two neighbors (one neighbor at the ends of a chain), and
this distinguished connectivity between neighbors implies an order. There is no
justification for any particular sequence among the hyperedge nodes. Consequently, the
line and ring topologies are not applied in this work.

Figure 2.7: A set of curves models a fully connected network on five hyperedge nodes

Fully Connected Structure

A mesh is a partially connected network. Again, there is no justification to choose which

pairs of hyperedge nodes are joined by a curve and which are not. A fully connected

network uniformly connects each pair of hyperedge nodes, as illustrated in Figure 2.7. In

contrast to a mesh, a fully connected structure meets our requirements for hypergraph
visualizations.

Star Structure

The star topology is a centralized hyperedge model that particularly takes the uniform

connectivity into account. A central point (not a node), denoted by crux in the following,

serves as a (network) hub and connects all hyperedge nodes by curves, as illustrated

in Figure 2.8. The position of the crux therefore should be central to hyperedge node

positions. For instance, the barycenter or a weighted barycenter can be used.

The crux is not visualized in a hypergraph drawing. Curves connect the crux with

all nodes of the hyperedge. No particular connection is distinguished by this centralized

structure.
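As a minimal sketch of the star structure, the crux can be computed as the (optionally weighted) barycenter of the hyperedge node positions, and each curve initially drawn as a straight segment from the crux to a node. The function names and the representation of positions as (x, y) tuples are assumptions for illustration.

```python
def barycenter(points, weights=None):
    """(Weighted) barycenter of a list of (x, y) positions."""
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    x = sum(w * p[0] for w, p in zip(weights, points)) / total
    y = sum(w * p[1] for w, p in zip(weights, points)) / total
    return (x, y)

def star_curves(points, weights=None):
    """Initial straight-line curves from the crux to every hyperedge node."""
    crux = barycenter(points, weights)
    return [(crux, p) for p in points]
```

For four nodes on the corners of a square, the crux lies in the center of the square and four curves are produced, one per hyperedge node.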

Tree and Bus Structures

Similar to the line and ring structures of the hyperedge model, a tree and a bus will not

fulfill the requirement for uniformity of node connectivity. Nevertheless, both structures

can be applied to produce more sophisticated hypergraph layouts. An inhomogeneous

distribution of hyperedge nodes in the graph layout with several dense groups of hyperedge

nodes is a proper candidate of the tree or bus structure. Each cluster of hyperedge nodes is

inter-connected in the centralized or fully connected structure with relatively short curves.


Figure 2.8: Model of a centralized hyperedge structure. Curves connect all hyperedge

nodes with the crux (rectangle).

The connectivity between clusters is modeled with a tree or bus structure. For instance,

a bus connects the crux of each cluster of hyperedge nodes.

Discussion

The requirement of uniform connectivity between hyperedge nodes allows to choose between

the centralized and the fully connected structures to visualize hyperedges. The

choice of the hyperedge structure mainly has to focus on the requirement of uniform hyperedge

node connectivity, but not on the preservation of the expressiveness of the given

graph layouts.

A fully connected structure of a hyperedge visualization suffers from the vast number of
curves. A hyperedge ε ∈ E of a hypergraph H with n = |ε| hyperedge nodes is consequently
visualized by n(n−1)/2 curves. The centralized structure of this hyperedge visualizes

n curves and thus is remarkably less complex. Not only the visual complexity, but also

the computational complexity of the layout calculation of hyperedges with the centralized

structure is lower than the complexity of fully connected structures. Nevertheless, both

structures are investigated in the following work and used to evaluate the proposed hypergraph

layout techniques. From a viewer’s perspective, it is arguable which structure is

superior. Both structures qualify for the purpose of hypergraph drawings.
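The difference in the number of curves can be made concrete with a small calculation. The helper names below are hypothetical; the sketch only enumerates the node pairs of the fully connected structure and compares the count with the n curves of the centralized structure.

```python
from itertools import combinations

def fully_connected_curves(nodes):
    """One curve for each unordered pair of hyperedge nodes."""
    return list(combinations(nodes, 2))

def curve_counts(n):
    """Return (fully connected, centralized) curve counts for n hyperedge nodes."""
    return n * (n - 1) // 2, n
```

For the five hyperedge nodes of Figure 2.7, the fully connected structure needs 10 curves while the centralized structure needs 5; for 20 nodes the counts are 190 and 20.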

The centralized structure also has a disadvantage: the star might be not as stable against

slight position changes of graph nodes as the fully connected structure. A displacement

of the position of the crux can also impact the layout. The fully connected structure is

more stable as it has many more curves. Even if some curves are rearranged, the majority
can remain unchanged. However, this criterion for hypergraph visualization is not a focus
of this work.



3 Hyperedge Routing

Routing denotes, in the context of this thesis, the process of finding an optimal path that

is used for the visualization of the curves of the hyperedge structure. In comparison to

the routing performed in networks, the number of paths eligible for curve visualization is

potentially unlimited. Consequently, no graph-theoretic routing algorithm for networks

can be applied here. Routing of curves more precisely denotes the displacement of curves.

Routing techniques move curves to an optimal position regarding the second requirement

introduced in Section 2.2.2. An optimal curve position preserves the expressiveness of the

graph layout by avoiding node occlusion and cluster intersections. The third requirement,

the reduction of the visual complexity, is not a concern of routing techniques, and is thus

handled separately in Chapter 4.

Both hyperedge structures used in this work, the centralized and the fully connected

structure, represent the underlying skeletal structure of a hyperedge visualization. The

input of a routing technique is a hyperedge layout with straight-line curves, as illustrated

in Figures 2.7 and 2.8.

In this Chapter, two techniques for hyperedge routing are presented. The first one, in

Section 3.3, is based on a spatial clustering of the graph layout. Although this approach

might satisfy the requirements of hypergraph visualizations, it is not an optimal solution,

as will be discussed in Section 3.3.4. The conclusions drawn from the first approach

motivate the development of a second, energy-based approach presented in Section 3.4.

The latter technique proves to be more suitable, as concluded in Section 3.5.

3.1 Motivation

The routing of curves of the hyperedge structure is necessary to preserve a layout’s expressiveness.

Any change of the graph layout, i.e., any alteration of node positions, is

prohibited. The goal of hyperedge routing is to move curves to the free space between

nodes. This section introduces two routing techniques that compute paths of curves such

that occlusions of nodes and intersections of clusters are prevented. This in turn can

give curves enough space to spread width-wise and create a hypergraph drawing in the

subset-standard afterwards.

Hyperedge routing techniques have two requirements to produce good hypergraph layouts

with respect to the preservation of layout expressiveness. First, hyperedges must

not occlude nodes, which means that curves cannot be visualized as straight lines. This
avoids a loss of information that a graph layout presents. Second, the hyperedges must
not distort the visualized information of the drawn graph, which requires avoiding intersections of
clusters. Furthermore, an additional restriction of a routing technique is to avoid moving

hyperedges outside of the graph layout area.


3.2 Related Work

Section 2.2.4 already described the approaches of the visualization of hypergraphs by

other authors in general. The routing of edges is a more specific technique that becomes

inevitable if the edges and hyperedges are visualized by curves.

The most notable differences to other existing edge routing approaches are the fixed

graph layout and the requirements to avoid occlusions of nodes that are not part of the

hyperedge of current interest and to avoid cluster intersections.

Often, lines are routed geometrically. For instance, the intersections of boxes and lines

in network diagrams are calculated geometrically. Then, the intersections are minimized
by rearranging boxes and lines within the grid. The minimization of edge crossings, edge
bends, et cetera [51] suffers from high computational complexity. In addition, such techniques

rely on testing the limited amount of discrete positions that a grid layout offers [36].

Besides this geometrical way of routing, there is also the topological way to route

curves [13]. Bazylevych models the available routing area between nodes as triangular

or rectangular faces. Various parameters are required to describe the topology of routes

or channels through faces that can be used for routing. This technique produces arbitrarily

shaped curves. But it is not capable of considering the local neighborhood (beyond direct
neighbor nodes) of a node, which is indispensable to avoid cluster intersections.

Other routing techniques depend on additional meta data of graphs. Holten uses base

points derived from a graph’s hierarchy to route edges [32]. An edge routing technique

that is specific to clustered graphs is proposed in [8]. The latter method determines the

base points of a curve’s route from the visualization of shapes that represent the cluster

bounds. Both approaches are limited to certain types of graphs. In this thesis, no certain

types of graphs are assumed, in order to develop a more general routing approach for

hyperedges.

3.3 Routing Based on Cluster Bounds

In this section, we present an approach to route curves such that no clusters are intersected

(except for the clusters of the curve’s end points), given that we computed a spatial

clustering of the graph in a preprocessing step. Using this approach, the avoidance of

node occlusion is implicitly achieved as well. Afterwards, it remains to solve the problem

of node occlusions within the clusters of the end points of the curves.

The presented curve routing algorithm uses a spatial clustering of the graph layout and

determines cluster bounds. Curves that intersect cluster bounds have to be rearranged

until the intersections are eliminated. Although a clustering is necessary, this section

presents this approach in order to discuss such an obvious curve routing approach that is

based on geometric information.

3.3.1 Clustering of Graph Layouts

The first step is the computation of a spatial clustering. Depending on the position of

nodes, each node of the graph is assigned to one cluster. A spatial clustering assures


that closely placed nodes belong to the same cluster. A density-based spatial clustering

algorithm finds contiguous areas with high density of nodes. The Euclidean distance of

the graph nodes in the layout serves as the distance measure. However, such an algorithm

does not assign each node to a cluster. The handling of outliers is crucial: an outlier is

either assigned to the nearest cluster or it constitutes its own cluster.
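The two options for handling outliers can be sketched as follows. This is an illustration under assumptions not stated in the text: the clustering result is given as a list of integer labels in which -1 marks an outlier, and a hypothetical threshold `max_dist` decides whether an outlier joins the cluster of its nearest clustered node or forms a cluster of its own.

```python
import math

def resolve_outliers(positions, labels, max_dist):
    """Assign every outlier (label -1, an assumed convention) to a cluster.

    An outlier joins the cluster of its nearest clustered node if that node
    is at most max_dist away; otherwise it constitutes its own new cluster.
    """
    resolved = list(labels)
    next_label = max(labels, default=-1) + 1
    clustered = [i for i, lab in enumerate(labels) if lab != -1]
    for i, lab in enumerate(labels):
        if lab != -1:
            continue
        nearest = min(
            clustered,
            key=lambda j: math.dist(positions[i], positions[j]),
            default=None,
        )
        if nearest is not None and math.dist(positions[i], positions[nearest]) <= max_dist:
            resolved[i] = labels[nearest]
        else:
            resolved[i] = next_label
            next_label += 1
    return resolved
```

Either way, every node belongs to exactly one cluster afterwards, which the routing based on cluster bounds relies on.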

While software systems often feature a hierarchy that implies a (hierarchical) graph

clustering, such a graph clustering usually is not compliant to the spatial distribution

of vertices in the graph layout. Energy-based graph layout algorithms, like the LinLog energy
model from Section 2.1.2.1, are used to produce layouts that reflect the graph

clustering [46]. Then, a graph clustering, e.g., derived by a generalization of Mark Newman’s

Modularity measure [43], can be similar to the spatial clustering.

3.3.2 Solid Cluster Bounds

The following paragraphs introduce the rationale of a divide and conquer routing algorithm

that is based on cluster bounds. Three steps remove intersections of curves and clusters.

1. Compute Cluster Bounds Assuming a clustering is available, the first step is the

computation of cluster bounds. They allow to determine whether a point in the graph

layout area is inside or outside of the area occupied by a cluster. The cluster bound is a

polygon that encloses all nodes of a cluster. Several well-known algorithms, e.g., discussed

in [49, Chapter 3] and [9, 17], can compute a minimum bounding box or the convex hull

for a set of nodes of a cluster.

A polygon is preferred over other figures since a polygon is simply represented by a set

of line segments, offers arbitrary accuracy to describe the cluster area, and allows easy

computations of intersections.
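One possible way to obtain such a polygon is the convex hull of the cluster's node positions. The following sketch uses Andrew's monotone chain algorithm, which is one of several well-known options and is chosen here only for illustration; the function names are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of the vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]

def hull_segments(hull):
    """The cluster bound as a closed list of line segments."""
    return [(hull[i], hull[(i + 1) % len(hull)]) for i in range(len(hull))]
```

The resulting list of line segments is the representation of the cluster bound that the subsequent intersection tests operate on.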

2. Detect Intersections To detect cluster intersections and potential node occlusions,

the points of intersections of curves and cluster bounds are calculated. The polygons are

described by a set of line segments. Initially, the curves are also straight line segments.

So the computation of intersections boils down to the computation of intersection of two

line segments in the two-dimensional space. Two points, the entry point $p_{\text{in}} \in \mathbb{R}^2$ and the exit point
$p_{\text{out}} \in \mathbb{R}^2$ of the curve on the cluster bound, are calculated for each intersection.
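A minimal sketch of this step is given below, assuming the cluster bound is a list of polygon vertices and the curve is a single straight segment. Parallel and collinear segments are ignored, and a curve passing exactly through a polygon vertex may be reported twice; a robust implementation would need to handle these cases. All names are hypothetical.

```python
def segment_intersection(p, q, a, b, eps=1e-12):
    """Intersection point of the segments pq and ab, or None if there is none."""
    rx, ry = q[0] - p[0], q[1] - p[1]
    sx, sy = b[0] - a[0], b[1] - a[1]
    denom = rx * sy - ry * sx
    if abs(denom) < eps:
        return None  # parallel or collinear: ignored in this sketch
    t = ((a[0] - p[0]) * sy - (a[1] - p[1]) * sx) / denom
    u = ((a[0] - p[0]) * ry - (a[1] - p[1]) * rx) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (p[0] + t * rx, p[1] + t * ry)
    return None

def entry_exit_points(p, q, polygon):
    """Entry and exit point of the straight curve pq on a cluster bound."""
    hits = []
    n = len(polygon)
    for i in range(n):
        hit = segment_intersection(p, q, polygon[i], polygon[(i + 1) % n])
        if hit is not None:
            hits.append(hit)
    if len(hits) < 2:
        return None
    # Order the hits along the curve, starting at p.
    hits.sort(key=lambda h: (h[0] - p[0]) ** 2 + (h[1] - p[1]) ** 2)
    return hits[0], hits[-1]
```

For a horizontal curve crossing an axis-aligned square, the function returns the points where the curve enters the left side and leaves the right side of the square.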

3. Eliminate Intersections An intersection of a curve and a cluster bound is removed

by rearranging the part of the curve between entry and exit point. The center point c0

between exit and entry points is moved away from the cluster area, into the free space.

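\[
c_0 = p_{\text{in}} + \frac{p_{\text{out}} - p_{\text{in}}}{2}
    = \frac{1}{2} \begin{pmatrix} p^x_{\text{in}} + p^x_{\text{out}} \\ p^y_{\text{in}} + p^y_{\text{out}} \end{pmatrix}
\tag{3.1}
\]

Let m be orthogonal to the direction of the initial curve $p_{\text{out}} - p_{\text{in}}$.

\[
m = \begin{pmatrix} p^y_{\text{in}} - p^y_{\text{out}} \\ p^x_{\text{out}} - p^x_{\text{in}} \end{pmatrix}
\tag{3.2}
\]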

The center point of the curve segment is moved in the direction m or −m, until it is no
longer inside the cluster. The new location of the center point is denoted $c_0'$. The exact
distance of $c_0'$ from the cluster bound is determined by a predefined parameter. The choice
of the direction, i.e., m or −m, depends on which direction promises a shorter detour of
the curve.

After the center point $c_0$ was moved out of the cluster area (to $c_0'$), the curve may
still intersect the cluster. The curve is therefore divided into two parts. The first part
connects one end point of the curve with the displaced center point $c_0'$ by a straight line. The
second part connects $c_0'$ with the other end of the curve. The intersections of both parts of
the curve are computed and eliminated as described above until there are no intersections
between this curve and this cluster left.
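The displacement of the center point can be sketched as follows. Several details are assumptions made for this illustration and are not specified in the text: the cluster bound is a simple polygon, the center point is moved in fixed increments `step`, the parameter `margin` plays the role of the predefined distance to the cluster bound, and the "shorter detour" is approximated by the smaller displacement distance.

```python
import math

def point_in_polygon(pt, polygon):
    """Ray-casting test; points exactly on the boundary may count as inside."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def displace_center(p_in, p_out, polygon, margin, step=0.1, max_steps=10000):
    """Move the center between entry and exit point out of the cluster."""
    # Center point c0 between entry and exit point, cf. Equation (3.1).
    c0 = ((p_in[0] + p_out[0]) / 2.0, (p_in[1] + p_out[1]) / 2.0)
    # Vector orthogonal to the curve, cf. Equation (3.2); only its direction is used.
    m = (p_in[1] - p_out[1], p_out[0] - p_in[0])
    length = math.hypot(m[0], m[1])
    if length == 0.0:
        return c0
    ux, uy = m[0] / length, m[1] / length
    candidates = []
    for sign in (1.0, -1.0):  # try direction m and direction -m
        for k in range(max_steps + 1):
            d = k * step
            c = (c0[0] + sign * d * ux, c0[1] + sign * d * uy)
            if not point_in_polygon(c, polygon):
                d += margin  # keep the predefined distance to the bound
                candidates.append((d, (c0[0] + sign * d * ux, c0[1] + sign * d * uy)))
                break
    # Assumption: the smaller displacement approximates the shorter detour.
    return min(candidates)[1] if candidates else c0

def split_curve(start, end, p_in, p_out, polygon, margin):
    """Divide the curve at the displaced center point into two straight parts."""
    c = displace_center(p_in, p_out, polygon, margin)
    return [(start, c), (c, end)]
```

Applying the intersection test and `split_curve` again to each of the two returned parts would give the recursive divide and conquer scheme described above.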

Aftermath This algorithm assures that curves will not intersect clusters anymore. Since

clusters are not intersected and each node is affiliated with a cluster, no node occlusions

in clusters except for those of the curves’ end points are possible. The algorithm can be

extended to find curve paths with a minimal total detour. All possible paths have to be

computed in advance or a backtracking strategy can be applied.

The distance between the displaced center point of a curve segment and the cluster

bound has to be predefined. However, there are no uniform scale units applicable to all

graph layouts and it is unclear what distance is desirable. Specific graph layout measures

like the average or the minimal node distance may help to set a proper distance between

routed curves and clusters.

3.3.3 Floating Cluster Bounds

The definition of an optimal distance between a routed curve and the cluster bound is

inflexible. It also prevents the production of visually pleasing routes, since the routed

curves will always have the same distance from the cluster bound. Floating bounds are,

in contrast to solid bounds, not explicitly spatially determined. They can be obtained by

the utilization of techniques like implicit surfaces. Implicit surfaces are used to compute

representations of molecular surfaces or splashing water. They were also applied for a

visually simplified representation of vertex clusters [7]. Figure 3.1a depicts the principle

and Figure 3.1b shows an application of implicit surfaces.

The approach using floating cluster bounds considers clusters in the near proximity

of a curve and will move the routed curves to the middle of the free space among two

neighboring clusters. This approach also follows a divide and conquer scheme and consists

of the following two steps:

1. Compute Cluster Bounds A cluster creates an energy field that encloses the entire

cluster and is composed by its generator points in all cluster nodes. A floating cluster

bound is described by the equipotential lines of the energy field as depicted in Figure 3.1a.

A customizable threshold parameter adjusts the influence radii of the energy fields and

consequently the positions of the cluster bounds. The threshold assures the avoidance of

node occlusion as it implies a blocked area that can not be used to route curves.

Figure 3.1: Implicit surfaces in the two-dimensional space, taken from [7]. (a) Overlapping energy fields of implicit surfaces. (b) Implicit surfaces create a visually simplified representation of vertex clusters.

2. Eliminate Intersections Intersections between curves and cluster bounds are not

calculated. A curve is moved from its initial position towards the next accessible valley of

total energy. The curve is therefore halved and the center of the curve is moved into the

direction that promises the largest reduction of the energy. The negative gradient of the

energy of these fields determines the direction a curve is moved.

By this, a curve is not moved across other clusters, because every movement has to

reduce the energy. Finally this approach places curves, more specifically the considered

central point of a curve, in the middle of the valley with low energy between clusters.

Further intersections of the routed curves with the cluster bounds can be eliminated by
applying this approach to the curve segments to the left and to the right of the central
point.

Consequently, there is no need to define absolute distance values as required for the
approach based on solid cluster bounds.
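The movement of a curve's center point along the negative energy gradient can be sketched as follows. The text does not fix a concrete energy function at this point, so the inverse-square field below, the fixed step length, and all names are assumptions chosen purely for illustration.

```python
import math

def energy(pt, nodes, eps=1e-9):
    """Assumed energy field: sum of inverse squared distances to all nodes."""
    return sum(1.0 / ((pt[0] - nx) ** 2 + (pt[1] - ny) ** 2 + eps) for nx, ny in nodes)

def neg_gradient(pt, nodes, eps=1e-9):
    """Negative gradient of the assumed energy field at pt."""
    gx = gy = 0.0
    for nx, ny in nodes:
        dx, dy = pt[0] - nx, pt[1] - ny
        w = 2.0 / (dx * dx + dy * dy + eps) ** 2
        gx += w * dx
        gy += w * dy
    return gx, gy

def descend(pt, nodes, step=0.05, max_iter=1000):
    """Move pt in fixed steps along the negative gradient while the energy drops."""
    for _ in range(max_iter):
        gx, gy = neg_gradient(pt, nodes)
        norm = math.hypot(gx, gy)
        if norm < 1e-12:
            break
        cand = (pt[0] + step * gx / norm, pt[1] + step * gy / norm)
        if energy(cand, nodes) >= energy(pt, nodes):
            break  # every movement has to reduce the energy
        pt = cand
    return pt
```

With two clusters placed symmetrically around the line x = 5, a center point that starts on the axis between them, close to the left cluster, comes to rest near the middle of the valley. A point that starts off this axis would, with purely repulsive fields, be pushed away from both clusters, which the sketch does not restrict.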

3.3.4 Conclusion

A thorough examination of the two approaches of a routing technique based on cluster
bounds leads to the conclusion that neither of them is applicable for the given purpose.
The first algorithm, based on solid cluster bounds, is a geometrical approach. The computation
and elimination of intersections can become fairly complex as it has to consider
many cases, e.g., odd contours of cluster bounds and overlapping cluster bounds. Furthermore,

the need to define fixed geometric values finally makes this algorithm inapplicable

for this purpose.

The approach based on floating cluster bounds remedies the need to define fixed values

and also produces more visually pleasing curve layouts, since neighboring clusters are

considered. However, the placement of curves in the middle between clusters does not

support a restriction of the extension of the curves. Both algorithms can produce very



3 Hyperedge Routing

long curves on inappropriate paths between the clusters if large areas of the graph layout

area are blocked. For example, the surfaces in Figure 3.1a block the entire area between

the objects if the threshold is too high.

Overall, the routing of curves based on floating cluster bounds is preferable over solid

cluster bounds. The concept of floating cluster bounds is now further refined to avoid

the blocking of large areas of the graph layout area. The following section introduces an

energy-based routing technique that similarly uses energy fields, though they originate at

each individual vertex of the graph.

3.4 Energy-Based Routing

The cluster-bound-based routing technique using floating bounds revealed that an energy

field should on the one hand be as fine-grained as possible to avoid a wastage of layout

space and to remedy the problem of overlapping clusters. On the other hand, a method

based on energy fields must also operate in a coarse-grained manner, where a dense group

of nodes creates an energy field by the superposition of their individual energy fields, to

avoid cluster intersections. The energy-based routing technique introduced in this section

is similar to the routing approach that is based on floating bounds, and will extend it

to meet all requirements of hypergraph visualizations.

An energy-based routing technique utilizes energy models, similar to the energy-based

computation of graph layouts such as in [46], to compute the position of curves and thus

to draw hyperedges in fixed graph layouts. It is applicable to both hyperedge structures,

the centralized and the fully connected structure.

The gist of energy-based routing is that nodes create energy fields that repulse curves

and move them to free space of the graph layout area. The arrows in Figure 3.2a depict

such a repulsion. The repulsion impedes the occlusion of nodes by curves. The hypergraph

layout, i.e., the positions of the curves, is computed by energy minimization algorithms

that move curves to positions with (locally) minimal energy. Setting the energy field

strength very high at positions of the repulsing nodes, and still not low in the close

proximity of individual nodes or clusters, energy minimization algorithms are able to

route curves properly.

Individual energy fields are overlapping each other and, according to the superposition

principle, create a compound repulsion by clusters. Energy minimization algorithms that

find energy minima in the graph layout area move curves away from nodes and likewise

push them out of node clusters. Consequently, the intersection of clusters is also prevented

by such a repulsion.

The remainder of this section is organized as follows. In the beginning, a simplified

representation of curves in energy fields is discussed. Curves need to be approximated by

a finite set of points, as depicted in Figure 3.2b, which serve as receptors of forces acting on

curves. Next, an energy model for energy-based routing of curves is introduced in the three

Sections 3.4.2 through 3.4.4. In doing so, the repulsion energy is formalized in a first step.

Then, in compliance with Newton’s third law of motion, i.e., the law of reciprocal actions, an

opposite force that prevents infinite strain of curves is introduced in Section 3.4.3. This

force is illustrated by the wide arrows overlaying the curve in Figure 3.2c. By employing

Figure 3.2: The rationale of energy-based routing. Panels: (a) Repulsion force acting on a curve, (b) Repulsion force acting on dummy nodes, (c) Opposite force. Legend: node repulsion force, routed curve, dummy node attraction.

this opposite force, curves are modeled as elastic bands. Section 3.4.4 combines both forces
and discusses the equilibrium created by this composition. A summary of the
energy-based routing technique concludes this section.

3.4.1 Modeling of Curves

Figure 3.2a depicts the repulsion acting on a curve, more precisely the arrows in this figure

represent direction and magnitude of the repulsion forces. All nodes of the graph, except

for the two hyperedge nodes at the end of the curve, create an energy field.

A problem arises as energy-based layout algorithms are not capable of processing bodies,
i.e., an infinite number of points. Newton already described the motion of bodies by some
suitably chosen points of the body. Accordingly, a curve is approximated by a finite set

of considered points. This approximation of the curve model is necessary to allow the

computation of the impact of forces on curves and the resulting displacement of curves.

Dummy Nodes The distinguished points that are used to approximate a curve are denoted

dummy nodes in the following. They are receptors of the repulsion, do not affect

the remaining layout, and are not visualized later. Dummy nodes are used to compute the

energy at these points and thus to compute the influence of the surrounding graph layout


on this part of the curve. Figure 3.2b depicts the same graph layout and the same curve

from Figure 3.2a, but the repulsion now only acts on the dummy nodes that model this

curve.

3.4.1.1 Curve Model Fidelity and Accuracy

The fidelity of the curve model reflects the number of dummy nodes that model a curve.

The fidelity and thus the accuracy of curves increases with increasing number of dummy

nodes per curve. A homogeneous distribution of dummy nodes on the curves is assumed

as an uneven distribution would not permit the prediction of the accuracy of curves, the

probability to occlude nodes and intersect clusters, and thus the quality of the routing

result.

The fidelity of the curve model and the accuracy of curves correlate with each other.

The accuracy denotes the quality of a modeled curve and the potential routing result in

consequence as it is influenced by the distance between neighboring dummy nodes. A

large distance between neighboring dummy nodes of a curve is imprecise and features low

accuracy and routing quality. In contrast to the accuracy, the fidelity does not consider

the distance between neighboring dummy nodes since it is a measure of the number of

dummy nodes per curve.

A low fidelity of the curve model allows a very fast hypergraph layout computation. A

hypergraph layout with higher fidelity of the curve model needs more computational effort,

but also promises a higher hypergraph layout quality with respect to the requirements.

Only at the positions of the dummy nodes, node occlusions can be definitely prevented.

The following paragraphs introduce three different approaches to define the number of

dummy nodes that are used to model each curve.

Fixed Number of Dummy Nodes If a fixed number |∆| = |∆(c)| of dummy nodes is

used to model each curve c of a hyperedge, then the distance dδ(c) between neighboring

dummy nodes differs for different curves as it depends on the length len(c) of the curve c.

\[ d_\delta(c) = \frac{\mathrm{len}(c)}{|\Delta| + 1} \]

The accuracy of the curves may differ for different curves although they have the same

fidelity. Thus each curve will meet the requirements of hypergraph visualizations differently.

A fixed number of dummy nodes is not appropriate for the requirements of this

work.

Fixed Dummy Node Distance A fixed distance dδ between neighboring dummy nodes

of all curves overcomes this shortcoming. The actual distance dδ(c) slightly varies for each

curve as the curve’s length len(c) is not always divisible without remainder by the integer

number of curve segments |∆(c)| + 1. However, this method ensures an approximately

equal accuracy of each curve.

\[ |\Delta(c)| = \operatorname{rd}\!\left(\frac{\mathrm{len}(c)}{d_\delta}\right) - 1 \]

\[ d_\delta(c) = \frac{\mathrm{len}(c)}{|\Delta(c)| + 1} \cong \text{const} \]

The distance dδ between neighboring dummy nodes of a curve has to be predefined.
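The difference between the two approaches can be made concrete with a minimal sketch. It assumes that rd denotes rounding to the nearest integer and that a curve always keeps at least one segment; the function names and the sample lengths are hypothetical and only serve as an illustration.

```python
import math


def spacing_for_fixed_count(curve_length: float, num_dummy_nodes: int) -> float:
    """Distance between neighboring dummy nodes if every curve gets the same count."""
    return curve_length / (num_dummy_nodes + 1)


def count_for_fixed_distance(curve_length: float, target_distance: float) -> int:
    """Number of dummy nodes so that the spacing stays close to target_distance.

    Assumption: rd rounds half up, and at least one segment is kept.
    """
    segments = max(1, math.floor(curve_length / target_distance + 0.5))
    return segments - 1


for length in (40.0, 100.0, 250.0):
    count = count_for_fixed_distance(length, 20.0)
    print(
        length,
        spacing_for_fixed_count(length, 4),      # fixed number of dummy nodes
        count,                                   # fixed dummy node distance
        spacing_for_fixed_count(length, count),  # resulting actual spacing
    )
```

With four dummy nodes per curve the spacing grows with the curve length, whereas a fixed target distance keeps the actual spacing near the predefined value for every curve.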

Incrementally Increasing Number of Dummy Nodes Using this approach, all curves

always have an equal fidelity. To avoid the specification of fixed values of the aforementioned

approaches and to gain more flexibility, the number of dummy nodes modeling a

curve is increased during the execution of the routing algorithm. Energy minimization

algorithms usually compute the layouts iteratively. So, after a couple of iterations of the

energy minimization algorithm, the number of dummy nodes is increased. Then, an energy

minimization algorithm continues the calculation of the routes using a more complex and

accurate model of curves.

Initially, a straight-line curve connects the two end points, e.g., two hyperedge nodes. A

curve is initially modeled by one dummy node that is placed at the curve’s center between

both end points. Two straight-line curve segments connect the dummy node with the end

points of the curve. An energy minimization algorithm is applied and the dummy nodes

of curves are rearranged. After some iterations, ideally when the dummy nodes have been

moved to a local energy minimum, the number of dummy nodes of each curve is increased.

The introduction of new dummy nodes halves each curve segment. The position of newly

introduced dummy nodes is the center of the end points of the respective curve segment.

Algorithm 3.1 clarifies this procedure. The total number of iterations n of the energy-based
routing algorithm is divided into phases with distinct fidelity of the curve model.
The phases in Algorithm 3.1 are equally sized, i.e., each phase
consists of the same number of iterations.

Input: number of iterations n, final curve model fidelity ff, set of curves curves,
       hyperedge layout p
Output: hyperedge layout p

f ← 0;
for i ← 1 to n do
    if f < ⌈i/n · ff⌉ then
        f ← f + 1;   // increase current fidelity f
        foreach curve in curves do
            foreach curve segment in curve do   // halve each curve segment
                δ ← newDummyNode(center(curve segment));
                p(δ) ← center(curve segment);
    p ← minimizeEnergy(p);

Algorithm 3.1: Energy-based routing of curves in conjunction with an incrementally
increasing curve model fidelity
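The following Python sketch mirrors the structure of Algorithm 3.1 under a few assumptions: a curve is stored as a list of points (end point, dummy nodes, end point), the fidelity is raised whenever it lags behind ⌈i/n · ff⌉, and the energy minimization step is passed in as a callable. The names halve_segments, route, and minimize_energy are hypothetical.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Curve = List[Point]  # end point, dummy nodes ..., end point


def halve_segments(curve: Curve) -> Curve:
    """Insert a new dummy node at the center of every curve segment."""
    refined: Curve = [curve[0]]
    for a, b in zip(curve, curve[1:]):
        refined.append(((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0))
        refined.append(b)
    return refined


def route(
    curves: List[Curve],
    n: int,
    final_fidelity: int,
    minimize_energy: Callable[[List[Curve]], List[Curve]],
) -> List[Curve]:
    """Run n minimization iterations while raising the fidelity phase by phase."""
    fidelity = 0
    for i in range(1, n + 1):
        target = (i * final_fidelity + n - 1) // n  # integer form of ceil(i / n * ff)
        if fidelity < target:
            fidelity += 1
            curves = [halve_segments(c) for c in curves]
        curves = minimize_energy(curves)
    return curves


# Usage with a placeholder minimizer that leaves the layout unchanged.
result = route([[(0.0, 0.0), (8.0, 0.0)]], n=9, final_fidelity=3,
               minimize_energy=lambda cs: cs)
print(len(result[0]) - 2)  # number of dummy nodes of the single curve
```

With n = 9 and a final fidelity of 3, the segments are halved in iterations 1, 4, and 7, so each phase consists of three iterations.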


The incremental approach defines the fidelity f of the curve model as follows. Let the
initial fidelity f = 1 denote that the number |∆f=1(c)| = 1 of dummy nodes per curve is 1.
Due to the assumed homogeneous distribution of dummy nodes on each curve, |∆f(c)| dummy
nodes divide a curve into 2^f equally sized curve segments. The number of dummy nodes
of a curve c that is modeled with a curve model fidelity f is defined as:

\[ |\Delta_f(c)| = 2^f - 1, \qquad d_\delta(c) = \mathrm{len}(c) \cdot 2^{-f} \tag{3.3} \]

The distance dδ(c) between neighboring dummy nodes still depends on the curve length

len(c), but the ability to increase the fidelity f guarantees that a desired level of accuracy

can be obtained for all curves. The growth of the fidelity implies an exponential rise of

the number of dummy nodes.

3.4.1.2 Benefits of an Incremental Approach

In summary, the benefits of incrementally increasing fidelity over the two earlier approaches

for setting the number of dummy nodes are:

• User-defined, theoretically arbitrarily high accuracy, since there are no fixed limitations

specified.

• An early abort promises a quickly available approximation of the hypergraph layout with

lower fidelity.

• High computation effort promises accurate hypergraph layouts with a high probability

to preserve the layout’s expressiveness.

• The fidelity of the curve model can be increased even after a first hyperedge routing

was finished. The routing technique may continue with increased fidelity to further

smooth the paths of the curves.

• As already mentioned, the ability to produce on-line animated visualizations [24] is,

although beyond the scope of this work, very important for future applications of the

introduced routing technique. Then, a user could abort the computation as soon as
the layout meets their personal requirements. If further curves

are added to an already routed hyperedge, then the incremental approach instantly

produces a rough approximation of the updated hyperedge visualization and with

increasing curve model fidelity the curve converges towards its final path.

Consequently, this work utilizes this incremental technique to model curves in energy

fields.

3.4.2 Repulsion

Now that the curve model allows computing the impact of forces on curves, the energy

model for the energy-based curve routing technique is motivated and formalized in

the following sections. This section therefore introduces the node repulsion that avoids

node occlusion and cluster intersections by curves. The following Section 3.4.3 introduces

another energy and both are composed into one consistent energy model in Section 3.4.4.


Each node creates a repulsing energy field centered at the node’s layout position. The

impact of all repulsing energy fields on a particular position of the graph layout is determined

by the sum of the individual energy fields (superposition principle). The energy-based
routing technique therefore yields curves that curl around the nodes and clusters.

Thus, node occlusion and cluster intersections are avoided.

Dummy nodes are repulsed by all nodes of the graph, except for the nodes that are

part of the hyperedge and serve as end points of the respective curve, cf. Figure 3.2b

on page 35. An energy minimization algorithm, e.g., the hierarchical force-based Barnes-

Hut algorithm [10], is used to move curves to an area with low repulsion energy in several

iterations of dummy node displacements. A stable state (equilibrium) is attained when the

system’s total energy is locally minimal and each dummy node experiences an equilibrium

of forces.

3.4.2.1 Formalization of the Repulsion

Initially, curves are straight-line connections without consideration of the graph layout.

Then, a very high repulsion at the position of nodes moves the dummy nodes away from

the nodes. The strength of the energy field diminishes with increasing distance from the

node. Therefore, a theoretically optimal position of a routed curve is as far as possible

away from repulsing nodes; an optimal position of the dummy nodes is characterized by

a minimal sum of repulsion energies.

The repulsion on a dummy node δ, caused by a node v ∈ V of a hypergraph H = (V, E), is
basically determined by the following factors:

• Position pδ of the dummy node δ

• Position pv of the node v

• Weight wv of the node v

These given parameters lead to the specification of a repulsion energy that is suitable
for routing curves accordingly. A node v ∈ V of the underlying graph G = (V, E) of a
hypergraph H = (V, E) creates a repulsion energy field ER(d) centered at pv in a given
graph layout p. The energy ER(d) is described as a function of the distance d from its
source pv.

The routing algorithm calculates the repulsion only at the positions of the dummy nodes.
The distance d = ‖pδ − pv‖ is the Euclidean distance between the source of the repulsion
and the position pδ of a dummy node δ of a curve. The absolute value of the repulsion
energy is characterized by the following requirements:

• The repulsion energy is maximal at the position of a node, i.e., d = 0. It has to

be larger than all other occurring energies to eliminate node occlusion. Defining

|ER(0)| := ∞ is sufficient.

• The repulsion energy decreases with increasing distance from the source of the energy

field.

• The repulsion energy converges to zero at an arbitrarily large distance, i.e.,
lim d→∞ ER(d) = 0.

Figure 3.3: The absolute values of different repulsion energies ER(d) against the distance
d from a repulsing graph node for various repulsion exponents r ∈ {0, −1, −2} (plot of ER over d with one curve each for r = 0, r = −1, and r = −2)

Similar to energy models used for an energy-based computation of graph layouts [21,
46, 35], the repulsion force can be characterized by a function similar to FR ≈ 1/d. Further,
the weight wv of a node v ∈ V influences the magnitude of the repulsion. This makes it
possible to influence the repulsion with respect to particular properties of a vertex, for instance its
degree or the size of its visualization.

The energy-based routing approach utilizes this concept of iterative node displacements.

Nodes repulse curves, or more precisely the dummy nodes of the curve model.

Repulsion Energy The repulsion energy ER(d), caused by a node v ∈ V , on a dummy

node δ of a curve at the position pδ in a given graph layout p is defined as shown in

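Equation 3.4. The distance between node v and dummy node δ is denoted as d, i.e.,
d = ‖pv − pδ‖.

\[ E_R(d) = \begin{cases} -\dfrac{w_v}{r} \cdot \|p_v - p_\delta\|^{r} & \text{if } r < 0 \\[1ex] -w_v \cdot \log \|p_v - p_\delta\| & \text{if } r = 0 \end{cases} \tag{3.4} \]

For r < 0 and a positive weight wv, the factor −wv/r is positive, so the energy is maximal at the node position and decays to zero with increasing distance, as required above.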

The function plots in Figure 3.3 depict the repulsion energy with different repulsion exponents
r ≤ 0 against the distance d = ‖pv − pδ‖ between the repulsing node and the
dummy node. A repulsion exponent is used to adjust the energy field strength of the repulsion.
In the case of r = 0, the logarithm function is applied as it results in a repulsion
force similar to d^(−1).
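The repulsion energy can be evaluated with a few lines of Python. The sketch below is an illustration under stated assumptions: the sign of the r < 0 branch is chosen so that the energy is positive near the node and its derivative matches the force magnitude of Equation 3.6, and the value at d = 0 is set to infinity as required above. The function name and the default values are hypothetical.

```python
import math


def repulsion_energy(d: float, w_v: float = 1.0, r: float = 0.0) -> float:
    """Repulsion energy of a node with weight w_v at distance d (sketch of Eq. 3.4)."""
    if r > 0.0:
        raise ValueError("the repulsion exponent must satisfy r <= 0")
    if d <= 0.0:
        return math.inf  # requirement: unbounded energy at the node position
    if r == 0.0:
        return -w_v * math.log(d)
    return -(w_v / r) * d ** r  # assumed sign: positive for r < 0


for r in (0.0, -1.0, -2.0):
    print(r, [round(repulsion_energy(d, r=r), 3) for d in (0.5, 1.0, 2.0)])
```

For each exponent the printed values decrease with growing distance, which matches the qualitative shape of the curves in Figure 3.3.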

Repulsion Force The repulsion force FR(d) = −∇ER(d) is the negative gradient of the
repulsion energy shown in Equation 3.4. The repulsion force magnitudes FR(d) are plotted
in Figure 3.4.

\[ \vec{F}_R(d) = -\nabla E_R(d) = -w_v \cdot \|p_v - p_\delta\|^{r-1} \cdot \frac{p_v - p_\delta}{\|p_v - p_\delta\|} \tag{3.5} \]

Figure 3.4: The magnitude FR(d) of the repulsion force for various repulsion exponents
r ∈ {0, −1, −2} at the distance d from a repulsing node (plot of FR over d with one curve each for r = 0, r = −1, and r = −2)
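The relation between the repulsion energy and the repulsion force can be checked numerically. The sketch below compares the closed-form force of Equation 3.5 with a central finite-difference approximation of the negative energy gradient, taken with respect to the dummy node position. It assumes the sign convention in which the energy is positive near the node for r < 0; the function names and sample coordinates are hypothetical.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def energy(p_delta: Vec, p_v: Vec, w_v: float, r: float) -> float:
    """Repulsion energy at the dummy node position (assumed sign convention)."""
    d = math.hypot(p_v[0] - p_delta[0], p_v[1] - p_delta[1])
    return -w_v * math.log(d) if r == 0.0 else -(w_v / r) * d ** r


def force_closed_form(p_delta: Vec, p_v: Vec, w_v: float, r: float) -> Vec:
    """-w_v * d^(r-1) * (p_v - p_delta) / d, i.e., a push away from p_v."""
    dx, dy = p_v[0] - p_delta[0], p_v[1] - p_delta[1]
    d = math.hypot(dx, dy)
    scale = -w_v * d ** (r - 2.0)
    return (scale * dx, scale * dy)


def force_numeric(p_delta: Vec, p_v: Vec, w_v: float, r: float, h: float = 1e-6) -> Vec:
    """Negative gradient of the energy, approximated by central differences."""
    x, y = p_delta
    gx = (energy((x + h, y), p_v, w_v, r) - energy((x - h, y), p_v, w_v, r)) / (2.0 * h)
    gy = (energy((x, y + h), p_v, w_v, r) - energy((x, y - h), p_v, w_v, r)) / (2.0 * h)
    return (-gx, -gy)


for r in (0.0, -1.0, -2.0):
    print(r, force_closed_form((2.0, 1.0), (0.0, 0.0), 1.5, r),
          force_numeric((2.0, 1.0), (0.0, 0.0), 1.5, r))
```

Both columns agree up to the discretization error, and every force vector points from the repulsing node towards the dummy node, i.e., away from the node.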

Displacement of Dummy Nodes The magnitude FR of the repulsion force determines
the amplitude of the dummy node displacement in each iteration of an energy
minimization algorithm. It is given by the first derivative of the repulsion energy
at the position pδ of the dummy node.

\[ F_R(d) = E'_R(d) = -w_v \cdot \|p_v - p_\delta\|^{r-1} \tag{3.6} \]

A repulsing node v causes a displacement of wv · ‖pv − pδ‖^(r−1) of the dummy node δ in the
direction of −(pv − pδ), i.e., away from pv. In a two-dimensional space, a dummy node is moved
by ∆x and ∆y in the direction of the x- and y-axis, respectively, which is determined by
the partial derivatives of the repulsion energy:

\[ \begin{aligned} \Delta x &= \frac{\partial E_R}{\partial x} = -w_v \cdot (p_v^x - p_\delta^x) \cdot \|p_v - p_\delta\|^{r-2}, \\ \Delta y &= \frac{\partial E_R}{\partial y} = -w_v \cdot (p_v^y - p_\delta^y) \cdot \|p_v - p_\delta\|^{r-2}. \end{aligned} \tag{3.7} \]

3.4.2.2 Concluding Remarks

The repulsion introduced above meets the requirement to preserve the expressiveness of

the given graph layout. If a curve occludes a node, then the strong repulsion within a short

distance from the node moves the surrounding dummy nodes of the curves away. In order

to achieve this, the fidelity of the curve model must be sufficiently high; a small distance dδ

between neighboring dummy nodes corresponds to a high probability that dummy nodes

of the respective curve segment are moved away by the repulsion force.

Furthermore, the repulsion can also prevent curves from intersecting dense groups of nodes.
The energy minimization algorithm will move curves out of clusters since the
repulsion energy field is stronger inside than outside the cluster.

Ideally, an energy minimization algorithm overcomes the problem of curves trapped in

local energy minima. Practically, there can be many local energy minima within clusters


and it might become more difficult to move curves out of them. However, this repulsion
force, together with the energy model as it is completed in the following sections, provides the
proper concept to fulfill our requirements of hypergraph visualizations. Later, the practical

evaluation of this approach will show the results and experiences collected with example

hypergraphs.

So far, the positions of dummy nodes are solely influenced by the repulsion of nodes.

Theoretically, the repulsion can push the curves infinitely far away, because the displacement

of dummy nodes is not limited yet. The following section will introduce an opposite

force that prevents curves from being routed to the outside of the graph layout area.

3.4.3 Strain of Curves

Newton’s third law, the law of reciprocal actions, allows the creation of a stable state

and, in this particular case, hinders the infinite strain of curves caused by the addition of

a repulsion force. The law states that to every action there is always opposed an equal

reaction. Applied to the energy-based routing of curves, another force that hinders the

strain of curves, i.e., the elongation of curves, is needed.

This section introduces another energy to prevent unlimited strain of curves. It is

added to the repulsion energy of the previous section and also relies on the model of

curves introduced in Section 3.4.1 above.

Elastic Band Model The global minimum of the repulsion energy always lies at
infinity, i.e., as far away as possible from the area occupied by the graph layout. The
strain of curves, i.e., the increase of the curves’ lengths, is not limited during the
process of energy-based routing. However, it is not desired to route curves to the outside
of the graph layout area. Hence, the goal is to model curves as elastic bands. This way,
the repulsion is not able to move curves infinitely far away. From an aesthetic point of view,
no reasonable fixed threshold for the strain can be identified. The energy-based approach
therefore does not rely on a threshold; the elongation of curves depends on the strength of repulsing nodes.

An attraction between neighboring dummy nodes of a curve models a curve as an elastic

band with limited strain. The attraction increases with an increasing distance between

neighboring dummy nodes. An energy minimization algorithm can be utilized to compute

the layout in conjunction with the repulsion. Of course, an optimal layout of a curve that

considers only the attraction is the straight-line connection of the curve’s end points. In

conjunction with the repulsion, an optimal layout will represent a trade-off of both goals.

3.4.3.1 Formalization of the Attraction

Requirements Each dummy node is attracted to its two directly neighboring nodes on

the same curve (see Figure 3.2c on page 35). The outer dummy nodes are attracted to

the immutable positions of the curve end points, respectively. To formally describe the

characteristics of the attraction energy, a class of appropriate functions is again derived

from the following requirements.

Figure 3.5: Different attraction energies EA(d) against the distance d from a neighboring
dummy node for various attraction exponents a ∈ {1, 2, 3} (plot of EA over d with one curve each for a = 1, a = 2, and a = 3)

• The attraction energy is independent of the total curve length len(c). Additionally,

the number of dummy nodes that model the curves must not influence the attraction

strength.

• The attraction energy is independent of the total elongation of curves. The elongation

is the difference between the length of a routed curve and the straight-line

curve. The impact of an elongation depends on the initial curve’s length. A very

long curve should be allowed to increase its length more than a very short one.

• The attraction energy is minimal, if a curve length is minimal, i.e., if the curve is a

straight line.

• The attraction energy tends to infinity for infinite curve length.

• The requirement to avoid node occlusion takes priority over the goal of short curve lengths.

Attraction Energy Taking these requirements into account, the attraction energy EA on
a dummy node δ is a function of the distance d = ‖pn − pδ‖ to its neighbor node n
(either another dummy node or one of the two end points of the curve), and is defined as
follows:

\[ E_A(d) = \frac{1}{a} \cdot \|p_n - p_\delta\|^{a} \tag{3.8} \]

Similar to the repulsion energy from above and to other energy models used to compute

graph layouts, the attraction exponent a allows an adjustment of the attraction energy

field strength. The selection of an attraction exponent a > 0 greater than zero will meet

the given requirements of the attraction. This exponent determines the elasticity of the

band model and the strain potential of the curves respectively. The energy functions for

various exponents are plotted in Figure 3.5.

Attraction Force The attraction force FA = −∇EA is calculated as the negative gradient
of the attraction energy.

\[ F_A(d) = -\nabla E_A(d) = \|p_n - p_\delta\|^{a-1} \tag{3.9} \]

Figure 3.6: The magnitude FA(d) of the attraction force for various attraction exponents
a ∈ {1, 2, 3} at the distance d from a neighboring dummy node (plot of FA over d with one curve each for a = 1, a = 2, and a = 3)

Notice that the lowest possible attraction exponent a = 1 does not only mean that the

attraction energy grows linearly against the distance d, but also stands for a constant

magnitude |FA(d)| = 1 of the attraction force as Figure 3.6 shows. For a = 1, the

attraction force is independent of the distance d, but the composition of attraction and

repulsion energy will in sum show the desired behavior: even a constant attraction will at

one point be stronger than the repulsion.
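A short numerical comparison illustrates this behavior. The sketch assumes a unit node weight and the repulsion exponent r = 0, for which the repulsion force magnitude is wv · d^(r−1) = 1/d; the function names are hypothetical.

```python
def attraction_energy(d: float, a: float = 2.0) -> float:
    """Attraction energy d^a / a between neighboring nodes on a curve (Eq. 3.8)."""
    return d ** a / a


def attraction_force_magnitude(d: float, a: float = 2.0) -> float:
    """Magnitude d^(a-1) of the attraction force (Eq. 3.9)."""
    return d ** (a - 1.0)


def repulsion_force_magnitude(d: float, w_v: float = 1.0, r: float = 0.0) -> float:
    """Magnitude w_v * d^(r-1) of the repulsion force at distance d."""
    return w_v * d ** (r - 1.0)


# Constant attraction (a = 1) against a repulsion that decays with the distance.
for d in (0.25, 1.0, 4.0):
    print(d, attraction_force_magnitude(d, a=1.0), repulsion_force_magnitude(d))
```

In this setting the repulsion dominates for d < 1, both magnitudes coincide at d = 1, and the constant attraction is stronger for every larger distance.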

Displacement of Dummy Nodes The attraction force that acts on movable dummy

nodes leads to a further influence, besides the repulsion, on the positioning of curves.

Here the sole effect of the attraction force FA on a dummy node δ is shown. The direction
and magnitude of the displacement of a dummy node that is caused by a neighboring node
n on the same curve is as follows:

\[ \begin{aligned} \Delta x &= \frac{\partial E_A(d)}{\partial x} = (p_n^x - p_\delta^x) \cdot \|p_n - p_\delta\|^{a-2}, \\ \Delta y &= \frac{\partial E_A(d)}{\partial y} = (p_n^y - p_\delta^y) \cdot \|p_n - p_\delta\|^{a-2}. \end{aligned} \tag{3.10} \]

3.4.3.2 Concluding Remarks

This subsection introduced an attraction between neighboring dummy nodes on a curve

to model the curve as an elastic band. As desired, the attraction prevents an infinite shift

of the curves out of the graph layout area caused by the repulsion. The benefit of such an
energy-based approach is that no absolute values or thresholds have to be specified. For

instance, the specification of a maximum length of a routed curve or a maximum ratio of

the routed curve’s length to the initial curve’s length was avoided.

The combination of repulsion and attraction creates an energy model. The strengths

of both energies determine the locations of the stable position of dummy nodes. The

following section investigates the location of the equilibrium created by the combination

of repulsion and attraction.

3.4.4 Equilibrium

It was shown that an energy-based routing technique necessitates two opposite forces to

create a stable hypergraph layout. However, the positions of the curves are still uncertain.

The stable position can be varied by adjusting the strength of repulsion and attraction

energy. This relocates the positions of minimal energy in the layout area and thus relocates

the optimal positions of dummy nodes.

An equilibrium of a dummy node is a stable state, where the sum of all forces Fδ = 0
at the position pδ of a dummy node δ of a curve c is zero. Equivalently, the sum of total
repulsion and attraction energies is locally minimal. Let H = (V, E) denote the hypergraph
and p the corresponding hypergraph layout. Then the net force Fδ is calculated as follows.

\[ F_\delta = \sum_{v \in V} F_R(p_v - p_\delta) + \sum_{n \in \mathrm{neighbors}(\delta)} F_A(p_n - p_\delta) \tag{3.11} \]

Consequently, the sum of the net forces \(\sum_{\delta \in \nu(H)} F_\delta\) acting on all dummy nodes of a
hyperedge is zero.
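The net force of Equation 3.11 can be sketched as a plain sum over the repulsing nodes and the curve neighbors of a dummy node. The data layout (lists of positions and weights), the default exponents r = 0 and a = 2, and the function name are assumptions made for this illustration; the nodes of the hyperedge itself are expected to be left out of the list of repulsing nodes.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def net_force(
    p_delta: Vec,
    repulsing_nodes: List[Tuple[Vec, float]],  # (position p_v, weight w_v)
    neighbors: List[Vec],                      # neighboring nodes on the same curve
    r: float = 0.0,
    a: float = 2.0,
) -> Vec:
    """Sum of all repulsion and attraction forces acting on one dummy node."""
    fx = fy = 0.0
    for p_v, w_v in repulsing_nodes:
        dx, dy = p_v[0] - p_delta[0], p_v[1] - p_delta[1]
        d = math.hypot(dx, dy)                 # d == 0 would mean a node occlusion
        scale = -w_v * d ** (r - 2.0)          # repulsion points away from p_v
        fx += scale * dx
        fy += scale * dy
    for p_n in neighbors:
        dx, dy = p_n[0] - p_delta[0], p_n[1] - p_delta[1]
        d = math.hypot(dx, dy)
        if d > 0.0:
            scale = d ** (a - 2.0)             # attraction points towards p_n
            fx += scale * dx
            fy += scale * dy
    return (fx, fy)


# One repulsing node below a dummy node that sits between two curve end points.
print(net_force((0.0, 0.0), [((0.0, -1.0), 1.0)], [(-1.0, 0.0), (1.0, 0.0)]))
```

In the usage line the two attractions cancel each other, so only the upward repulsion remains.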

Illustration Figure 3.7 exemplifies an artificial case of a curve modeled by one dummy

node. One node repulses this dummy node. Figure 3.7a depicts the initial configuration

with a straight-line curve that is not routed. The small distance between graph node and dummy

node causes a strong repulsion that is depicted by the thin (red) arrow. The attraction is

minimal since the curve’s length is also minimal. The net force, which affects the dummy

node, is determined by spanning a parallelogram of the individual forces and applying the

principles of vector addition. The widest blue arrow depicts the net force in these figures.

In Figure 3.7a, the repulsion is stronger than the attraction and consequently the net force

moves the dummy node away from the graph node.

An energy minimization algorithm will increase the distance between both mentioned

nodes. After some iterations the dummy node gets to a position where the distance is large

enough to let the attraction either compensate the repulsion (as in Figure 3.7c) or even

Figure 3.7: Towards a stable position of the dummy node on the curve. Panels: (a) Repulsion stronger than attraction, (b) Attraction stronger than repulsion, (c) Equilibrium.

exceed the repulsion. In the latter case, which is shown in Figure 3.7b, the dummy node

was moved too far away from the repulsing node, and the curve’s strain, which is depicted

by the (green) arrows along the curve, becomes too large. As a result, the attraction to

both end points of the curve is stronger than the repulsion and thus the net force brings

the dummy node closer to the repulsing node.

As the displacement of the dummy nodes is proportional to the magnitude of forces

(cf. Equations 3.7 and 3.10), the oscillation of dummy nodes around the position of the

equilibrium finally results in a stable state. The closer a dummy node is to the equilibrium,

the smaller its displacement will be.
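The approach to the equilibrium can be reproduced with a tiny simulation of the scenario in Figure 3.7. The concrete numbers are assumptions for illustration only: the curve end points lie at (−1, 0) and (1, 0), the repulsing node with weight 1 lies at (0, −0.1), the exponents are r = 0 and a = 2, and each iteration moves the dummy node by 0.1 times the net force. By symmetry, only the vertical coordinate y of the dummy node changes.

```python
import math


def net_vertical_force(y: float, y_node: float = -0.1, w_v: float = 1.0) -> float:
    """Vertical net force on a dummy node at (0, y), valid for y > y_node."""
    repulsion = w_v / (y - y_node)  # r = 0: magnitude w_v / d, pushing upward
    attraction = -2.0 * y           # a = 2: pull towards both end points
    return repulsion + attraction


def relax(y: float = 0.0, step: float = 0.1, iterations: int = 200) -> float:
    """Move the dummy node proportionally to the net force in every iteration."""
    for _ in range(iterations):
        y += step * net_vertical_force(y)
    return y


# Closed-form equilibrium: w_v / (y + 0.1) = 2 y  =>  2 y^2 + 0.2 y - 1 = 0.
expected = (-0.2 + math.sqrt(0.04 + 8.0)) / 4.0
print(relax(), expected)
```

Starting from the straight line, the first step overshoots the equilibrium because the repulsion is very strong at the short initial distance. Afterwards the attraction dominates and pulls the dummy node back in ever smaller steps until the net force vanishes.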

3.4.4.1 Adjustment of Repulsion and Attraction

For each dummy node there is a stable position in the graph layout. The placement of the

equilibrium depends on the scaling of the layout since the forces are mainly derived from

the distances between nodes, and the energy and force functions evolve differently against

the distances. We therefore now add further coefficients to the computation of repulsion

and attraction forces, in order to adjust the forces to each other, and balance them in such

a way that the position of the equilibrium does not depend on the graph’s scaling.

To place the equilibrium at a certain point, all energies at this point must be minimal

compared to the close proximity. Equivalently, the sum of all forces is zero at this point.

Therefore, the repulsion and attraction on a dummy node are set to a certain constant

value if the dummy node is at an optimal position regarding repulsion and attraction,

respectively.

At first glance, the strength of repulsion and attraction energy can be influenced by their

respective exponents. Both exponents determine the way the energy field strengths evolve

against the distance from their sources. The plots of the energy functions in Figures 3.3

and 3.5 show how these exponents r and a can influence the slope of the energies at certain

regions. The exponents r and a mainly influence the development of forces against the

distance, whereas a coefficient can change the magnitude of forces. Linear coefficients only

alter the value of the energy, and not the slope of the energy functions.

The remainder of this section adjusts the magnitudes of forces to an equal fixed value,

here it is 1. Obviously, the direction of all forces remains unchanged by the addition of

the following coefficients.

Repulsion Coefficient The repulsion energy as defined in Equation 3.4 already considers

the node weight wv. A further coefficient wδ, which is specific for each dummy node, is

introduced to adjust the repulsion energy ER. In an optimal position of the dummy node,

the coefficient wδ is chosen such that the magnitude of the repulsion force FR on a dummy

node δ is set to 1. Only the magnitudes FR = |FR| are influenced and considered here.

The functions of the distance are abbreviated as E = E(d) and F = F(d).

\[ F_R = -w_v \cdot w_\delta \cdot \|p_v - p_\delta\|^{r-1} \tag{3.12} \]

If the dummy node is placed at an optimal position $p^{o,r}_\delta$ with respect to the repulsion, then the coefficient wδ is derived as follows.

\[
\begin{aligned}
F_R = 1 &= -w_v \cdot w_\delta \cdot \lVert p_v - p^{o,r}_\delta \rVert^{r-1} \\
w_\delta &= \frac{-1}{w_v \cdot \lVert p_v - p^{o,r}_\delta \rVert^{r-1}}
\end{aligned}
\tag{3.13}
\]

The following Section 3.4.4.2 presents a method to find the optimal position $p^{o,r}_\delta$ of a dummy node δ.
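As a minimal sketch of Equations 3.12 and 3.13, the coefficient can be computed for a single node and checked by inserting it back into the force. The function names and the sample values are assumptions made for illustration only; the signs follow the equations exactly as written above.

```python
import math

def repulsion_force_magnitude(w_v, w_delta, p_v, p_delta, r):
    """F_R of Equation 3.12 for one node v and one dummy node."""
    d = math.dist(p_v, p_delta)
    return -w_v * w_delta * d ** (r - 1)

def repulsion_coefficient(w_v, p_v, p_opt, r):
    """w_delta of Equation 3.13, chosen such that F_R = 1 at p_opt."""
    d = math.dist(p_v, p_opt)
    return -1.0 / (w_v * d ** (r - 1))

# Hypothetical values: one node of weight 2 at the origin,
# assumed optimal position 3 units away, exponent r = 0.
w_v, p_v, p_opt, r = 2.0, (0.0, 0.0), (3.0, 0.0), 0
w_delta = repulsion_coefficient(w_v, p_v, p_opt, r)
force_at_optimum = repulsion_force_magnitude(w_v, w_delta, p_v, p_opt, r)
```

Inserting the coefficient back into Equation 3.12 at the assumed optimal position yields a force of 1, which is the normalization this section aims for.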

Attraction Coefficient The attraction is also adjusted. A coefficient wc, which can be considered as the weight of the curve, enhances the attraction force from Equation 3.9 to

\[ F_A = w_c \cdot \lVert p_n - p_\delta \rVert^{a-1}. \tag{3.14} \]

The distance $\lVert p_n - p_\delta \rVert$ between the dummy node δ and one of its neighboring nodes n ∈ neighbors(δ) on the curve is the length of the respective curve segment. The coefficient wc is specific to the particular curve segment as it is basically derived from its length. The attraction force FA is set to 1 for a dummy node δ in its optimal position $p^{o,a}_\delta$ with respect to the attraction.

\[
\begin{aligned}
F_A = 1 &= w_c \cdot \lVert p_n - p^{o,a}_\delta \rVert^{a-1} \\
w_c &= \lVert p_n - p^{o,a}_\delta \rVert^{1-a}
\end{aligned}
\tag{3.15}
\]

Section 3.4.4.3 clarifies the optimal position $p^{o,a}_\delta$ of a dummy node δ in terms of attraction.

3.4.4.2 Optimum in Terms of Repulsion

An optimal position of a dummy node with respect to the repulsion already corresponds to the desired solution of the routing problem. Several optimal positions may exist. For instance, another optimal layout of the graph in Figure 3.7 places the dummy node on the left side of the repelling node. Most energy minimization algorithms will not lift the dummy node over the node, because this would increase the repulsion before it can be minimized. The infinitely distant positions are not considered as optimal positions since a position that is very close to the area of the given graph layout is desired.

For the sake of the adjustment of energies, an approximation of an optimal position is, or

practically has to be, sufficient. An optimal position of a dummy node is characterized by

a minimal repulsion energy in the potential layout area of the curve. The repulsion energy

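$E_R(p_s)$ at a sample point $p_s \in \mathbb{R}^2$ is the sum of the repulsion (as defined by Equation 3.4) caused by all nodes v ∈ V of a hypergraph H = (V, E).

\[
E_R(p_s) = \sum_{v \in V} E_R(\lVert p_v - p_s \rVert) =
\begin{cases}
\sum_{v \in V} -\dfrac{w_v}{r} \cdot \lVert p_v - p_s \rVert^{r} & \text{if } r < 0, \\[1ex]
\sum_{v \in V} -w_v \cdot \log(\lVert p_v - p_s \rVert) & \text{if } r = 0
\end{cases}
\tag{3.16}
\]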

To find a position with minimal energy $\min\{E_R(p_s) \mid s \in S\}$ from a set of sample points S, the total repulsion energy at each sample $p_s$, $s \in S$, is computed according to Equation 3.16. Such a sampling technique finds a position with minimal energy and is elaborated in the next paragraphs.
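The minimum search over a set of samples can be sketched as follows. The code evaluates the repulsion energy of Equation 3.16 (with the sign convention of Table 3.1) at every sample and returns the sample with the lowest value. The regular grid, the skipping of samples that coincide with a node, all function names, and the example layout are assumptions made for illustration and are not taken from the thesis implementation.

```python
import math

def repulsion_energy(p_s, nodes, r):
    """Total repulsion energy at p_s; nodes = [(position, weight), ...]."""
    total = 0.0
    for p_v, w_v in nodes:
        d = math.dist(p_v, p_s)
        if r < 0:
            total += -(w_v / r) * d ** r
        elif r == 0:
            total += -w_v * math.log(d)
        else:
            raise ValueError("only exponents r <= 0 are covered")
    return total

def grid_samples(x_min, y_min, x_max, y_max, n):
    """n x n homogeneously distributed sample points in a rectangle."""
    xs = [x_min + i * (x_max - x_min) / (n - 1) for i in range(n)]
    ys = [y_min + j * (y_max - y_min) / (n - 1) for j in range(n)]
    return [(x, y) for x in xs for y in ys]

def best_sample(samples, nodes, r):
    """Sample with minimal repulsion energy (samples on a node are skipped)."""
    candidates = [s for s in samples
                  if all(math.dist(s, p_v) > 0 for p_v, _ in nodes)]
    return min(candidates, key=lambda s: repulsion_energy(s, nodes, r))

# Hypothetical layout: two nodes of weight 1 and an 11 x 11 sampling grid.
nodes = [((0.0, 0.0), 1.0), ((10.0, 0.0), 1.0)]
samples = grid_samples(0.0, -5.0, 10.0, 5.0, 11)
p_opt = best_sample(samples, nodes, r=-1)
```

The cost of this search grows with the number of samples times the number of nodes, which is why a moderate sample count keeps the effort small.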

Rationale First, the sampling area is limited to a rectangle enclosing the respective curve.

This sampling area represents the potential positions of all dummy nodes of a routed

curve. It would be too restrictive to predict the area of potentially optimal positions for

each dummy node of a curve individually. Therefore, the introduced coefficient wδ is equal

for all dummy nodes of the same curve. The sample point with minimal energy represents

the optimal position of all dummy nodes of the curve.

Figure 3.8 depicts a sampling area of a curve. The length of the curve determines the

dimension of the sampling area. Depending on the visual objectives of routing, the sampling

area can also be extended to a larger area. This will only increase the computational

effort.

Second, after the sampling area of each curve is set, the number, location, and distribution

of the samples have to be arranged. A homogeneous distribution of samples in the

sampling area is used. The positions of the samples can be easily derived once the number of samples is specified. A value between 100 and 1,000 might be an appropriate number |S| of samples. The sampling area in Figure 3.8, for instance, contains 11 × 11 samples.

Third, the repulsion energy $E_R(p_s)$ (see Equation 3.16) caused by all nodes is calculated at each sample point $p_s$. The minimal repulsion energy at the optimal sample point $p^{o,r}_s$ regarding the repulsion (light gray in Figure 3.8) allows the calculation of the repulsion coefficient wδ for all dummy nodes δ ∈ ∆(c) of a curve c.

\[ w_\delta = \sum_{v \in V} \frac{-1}{w_v \cdot \lVert p_v - p^{o,r}_s \rVert^{r-1}} \tag{3.17} \]

3.4.4.3 Optimum in Terms of Attraction

Figure 3.8: Sampling in a curve's potential routing area identifies the optimal position with minimal repulsion energy $E_{min}$ (figure labels: sample point, sampling area, $E_{min} = \sum_{v \in V} E_R$)

Finding optimal positions $p^{o,a}_\delta$ of dummy nodes with respect to the attraction is trivial. The dummy nodes experience minimal attraction if and only if the distance to their neighboring nodes is also minimal (cf. Equations 3.8 and 3.14). As required in Section 3.4.3, the optimum is the straight-line connection of both end points of a curve with homogeneously distributed dummy nodes on it. The depicted curve in Figure 3.8 is already in its optimal position with respect to the attraction. The distance dδ between a dummy node δ and one of its neighbors n on this curve is the ratio of the curve's length len(c) to the number of curve segments |∆(c)| + 1.

\[ d_\delta = \frac{\operatorname{len}(c)}{|\Delta(c)| + 1} = \lVert p_n - p^{o,a}_\delta \rVert \]

The coefficient wc is determined by setting the attraction force to FA(dδ) = 1. The

attraction coefficient is equal for each dummy node of the same curve, because the distance

between neighboring nodes in an optimal case is constant.

\[ w_c = \left( \frac{\operatorname{len}(c)}{|\Delta(c)| + 1} \right)^{1-a} \tag{3.18} \]
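Equation 3.18 can be checked with a few lines of code. The sketch below computes the attraction coefficient from the curve length and the number of dummy nodes and then inserts it into Equation 3.14 at the optimal segment length. The function names, the exponent a = 2, and the curve dimensions are assumptions chosen for illustration.

```python
def attraction_coefficient(curve_length, num_dummy_nodes, a):
    """w_c of Equation 3.18 for a curve with the given length."""
    d_opt = curve_length / (num_dummy_nodes + 1)  # optimal segment length
    return d_opt ** (1 - a)

def attraction_force_magnitude(w_c, d, a):
    """F_A of Equation 3.14 at the neighbor distance d."""
    return w_c * d ** (a - 1)

# Hypothetical curve: length 12, three dummy nodes (four segments), a = 2.
w_c = attraction_coefficient(12.0, 3, a=2)
f_opt = attraction_force_magnitude(w_c, 12.0 / 4, a=2)
```

With these values the optimal segment length is 3, the coefficient is 1/3, and the attraction force at the optimum evaluates to 1.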


3.4.4.4 Concluding Remarks

An equilibrium was already created by the composition of repulsion and attraction. The

focus of this section was the adjustment of the equilibrium to permit the generation of adequate

curve layouts. Therefore, a repulsion and an attraction coefficient were introduced.

The repulsion coefficient wδ represents a weighting of the dummy nodes. The normalization of the repulsion abstracts from concrete distance values between nodes and ensures that a large number of nodes with small node weights repel as strongly as a smaller number of nodes with higher node weights. As an illustrative example, ten nodes, each with a node weight of 1, repel as strongly as one node with a weight of 10.

The sampling method reuses the optimal position $p^{o,r}_\delta$ for all dummy nodes of a curve.

The sampling method only computes the coefficients once. If additionally a moderate

number of sample points is specified, then the overall computational effort of the sampling

method is very low in comparison to the total complexity of the energy-based routing

technique. The number of samples and the size of the sampling area further affect the result's accuracy. Both parameters allow users to trade off accuracy against computational effort.

The attraction coefficient wc represents a weighting of the respective curve and consequently

is equal for each dummy node of the curve. It decouples the attraction from the

actual value of the curve length, because the attraction must not be radically different for

two congruent, but not equal, curves. The computation of wc is also inexpensive, because it is computed only once per curve. The incremental approach to model curves in energy

fields, which was described in Section 3.4.1, requires a recalculation of the attraction

coefficient wc every time after the fidelity of the curve is increased.
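To see why this recalculation is needed, Equation 3.18 can be evaluated before and after a fidelity increase. As a worked illustration, assume that the number of curve segments m = |∆(c)| + 1 doubles while len(c) stays the same (the symbol m and the exact doubling of segments are assumptions made here for illustration):

\[
w_c' = \left( \frac{\operatorname{len}(c)}{2m} \right)^{1-a}
     = 2^{a-1} \cdot \left( \frac{\operatorname{len}(c)}{m} \right)^{1-a}
     = 2^{a-1} \cdot w_c
\]

Only for a = 1 does the coefficient stay the same. For any other exponent, the previous coefficient would no longer set the attraction force to 1 at the optimal segment length.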

3.4.5 Conclusion

Routing avoids node occlusion and cluster intersections. The presented energy-based routing technique reuses the energy-based approach that is also capable of producing graph layouts without any edge routing. Energy-based layout techniques solve an optimization problem; energy minimization algorithms are capable of reducing the total energy of hypergraph layouts. Dummy nodes were introduced to model curves in energy fields in a way that allows the energy's impact on the curve to be computed. Three different strategies to model curves were examined. The incremental doubling of the number of dummy nodes emerged as the preferable method.

Repulsion and Attraction The repulsion caused by nodes is commonly used to compute

graph layouts. The elastic band model is a novel approach to limit the strain of curves by

an attraction force. Energy minimization algorithms are used to minimize the repulsion and attraction energies of all dummy nodes. The dummy nodes are thereby moved to positions with minimal energy, which routes the curves.

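The energies were formalized and combined in an energy model that allows hypergraph layouts to be created and fulfills the given requirement to preserve the expressiveness of layouts. Table 3.1 shows the outcomes of the previous paragraphs in an overview.

Repulsion Energy $E_R$          $-\frac{w_v \cdot w_\delta}{r} \cdot \lVert p_v - p_\delta \rVert^{r}$ for $r < 0$;  $-w_v \cdot w_\delta \cdot \log(\lVert p_v - p_\delta \rVert)$ for $r = 0$
Repulsion Force $F_R$           $-w_v \cdot w_\delta \cdot \lVert p_v - p_\delta \rVert^{r-1} \cdot \frac{p_v - p_\delta}{\lVert p_v - p_\delta \rVert}$
Node Displacement $\Delta x$    $-w_v \cdot w_\delta \cdot (p^x_v - p^x_\delta) \cdot \lVert p_v - p_\delta \rVert^{r-2}$
Node Displacement $\Delta y$    $-w_v \cdot w_\delta \cdot (p^y_v - p^y_\delta) \cdot \lVert p_v - p_\delta \rVert^{r-2}$
Attraction Energy $E_A$         $\frac{w_c}{a} \cdot \lVert p_n - p_\delta \rVert^{a}$
Attraction Force $F_A$          $w_c \cdot \lVert p_n - p_\delta \rVert^{a-1} \cdot \frac{p_n - p_\delta}{\lVert p_n - p_\delta \rVert}$
Node Displacement $\Delta x$    $w_c \cdot (p^x_n - p^x_\delta) \cdot \lVert p_n - p_\delta \rVert^{a-2}$
Node Displacement $\Delta y$    $w_c \cdot (p^y_n - p^y_\delta) \cdot \lVert p_n - p_\delta \rVert^{a-2}$

Table 3.1: Repulsion and attraction in a nutshell

In summary, all energies acting on a dummy node δ ∈ ∆(c) are given in Equation 3.19.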

E(δ) is the composition of individual repulsion energies emitted by all nodes v ∈ V of a

given hypergraph H = (V, E) and the attraction energies emitted by both neighbors of the

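respective curve c ∈ C(ε).

\[
\begin{aligned}
r < 0 \;\Rightarrow\; E(\delta) &= \sum_{n \in \text{neighbors}(\delta)} \frac{w_c}{a} \cdot \lVert p_n - p_\delta \rVert^{a} - \sum_{v \in V} \frac{w_v \cdot w_\delta}{r} \cdot \lVert p_v - p_\delta \rVert^{r} \\
r = 0 \;\Rightarrow\; E(\delta) &= \sum_{n \in \text{neighbors}(\delta)} \frac{w_c}{a} \cdot \lVert p_n - p_\delta \rVert^{a} - \sum_{v \in V} w_v \cdot w_\delta \cdot \log(\lVert p_v - p_\delta \rVert)
\end{aligned}
\tag{3.19}
\]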
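The relation between Equation 3.19 and the node displacements of Table 3.1 can be illustrated in code. The sketch below evaluates the total energy of a dummy node and the summed displacements; the displacement is the negative gradient of the energy with respect to the dummy node position, which can be verified numerically. All function names and the example values (positions, weights, exponents, coefficients) are assumptions made for illustration.

```python
import math

def total_energy(p_delta, neighbors, nodes, w_c, w_delta, a, r):
    """E(delta) as in Equation 3.19; nodes = [(position, weight), ...]."""
    attraction = sum((w_c / a) * math.dist(p_n, p_delta) ** a
                     for p_n in neighbors)
    repulsion = 0.0
    for p_v, w_v in nodes:
        d = math.dist(p_v, p_delta)
        if r < 0:
            repulsion += (w_v * w_delta / r) * d ** r
        else:  # r == 0
            repulsion += w_v * w_delta * math.log(d)
    return attraction - repulsion

def displacement(p_delta, neighbors, nodes, w_c, w_delta, a, r):
    """Sum of the displacements (dx, dy) listed in Table 3.1."""
    dx = dy = 0.0
    for p_v, w_v in nodes:
        scale = -w_v * w_delta * math.dist(p_v, p_delta) ** (r - 2)
        dx += scale * (p_v[0] - p_delta[0])
        dy += scale * (p_v[1] - p_delta[1])
    for p_n in neighbors:
        scale = w_c * math.dist(p_n, p_delta) ** (a - 2)
        dx += scale * (p_n[0] - p_delta[0])
        dy += scale * (p_n[1] - p_delta[1])
    return dx, dy

# Hypothetical configuration: one dummy node, two curve neighbors, two nodes.
p = (1.0, 2.0)
neighbors = [(0.0, 0.0), (4.0, 1.0)]
nodes = [((2.0, 3.5), 1.0), ((0.5, -1.0), 2.0)]
w_c, w_delta, a = 0.5, 0.8, 2
step = displacement(p, neighbors, nodes, w_c, w_delta, a, r=0)
```

Moving the dummy node a small distance along `step` therefore lowers its energy, which is the basic move of a gradient-based energy minimization.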

Equilibrium The establishment of an equilibrium of two forces was not straightforward. Both forces balance each other, or equivalently, the total energy is minimal in an optimal layout. The challenge, which has not been investigated before, is to determine an optimum regarding each energy individually. By introducing the coefficients wδ and wc, it becomes feasible to compute insightful and decent hyperedge layouts that meet the requirement to preserve the expressiveness of layouts. Without both coefficients, it is impossible to consistently create layouts that fulfill this requirement, and a scaling of graph layouts would lead to radically distinct curve layouts, in which nodes are possibly

occluded by curves.

The presented energy-based routing technique clearly meets the expectations. Node

occlusion, cluster intersections and an arbitrary elongation of curves are prevented. The

practical evaluation of energy-based routing is presented in Chapter 5.

Positioning of the Crux So far, the energy-based routing technique assumed fixed positions

of the end points of curves. This assumption is completely valid for hyperedges in the

fully connected hyperedge structure. The crux of hyperedges in the centralized structure

is one end point of all curves that does not necessarily have to be fixed. The position

of the crux, e.g., the barycenter of the hyperedge nodes, can be very disadvantageous to

the hypergraph layout quality if it is inside a cluster. Therefore, the crux of a centralized

hyperedge structure is a movable end point of the curves. Its position changes during the

process of energy-based routing similar to the dummy nodes.
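A possible starting position of such a movable crux is the barycenter mentioned above. The following sketch computes it for a list of two-dimensional node positions; the function name and the coordinates are assumptions for illustration, and the subsequent movement of the crux during routing is not shown.

```python
def barycenter(points):
    """Arithmetic mean of 2D points, e.g. an initial crux position."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

# Hypothetical positions of the three nodes of one hyperedge.
hyperedge_nodes = [(0.0, 0.0), (4.0, 0.0), (2.0, 6.0)]
crux = barycenter(hyperedge_nodes)
```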

3.5 Discussion

A routing attempt that relies on the identification of cluster bounds turned out to be

insufficient for the goal of routing hyperedges. Unlike the previous cluster-bound-based

routing approach, the energy-based approach to route curves is fairly convenient as it does

not require the explicit specification of certain characteristics. Instead, it is customizable

by parameters affecting the layouts indirectly. The repulsion and attraction exponents r

and a, and the repulsion and attraction coefficients wδ and wc determine the actual routes

of curves in the generated hypergraph layouts.

Generality Most curve routing techniques target aesthetic criteria like a reduction of

edge bends or edge crossings. Other routing techniques that were also discussed in the

Sections 2.2.4 and 3.2 about related work are limited to certain classes of graphs. In

contrast to those routing techniques, our energy-based technique does not rely on any

particular method to compute hypergraph layouts and does not require any specific type

of graph or graph layout information (e.g., clustering of nodes or a node hierarchy). Solely

the positions of nodes are necessary to route curves.

The presented energy-based approach is capable of routing curves, i.e., any binary connection of two points (not necessarily nodes). Accordingly, this energy-based routing technique is capable of routing edges in general. Each box-and-line visualization can be extended to utilize this approach to visualize (binary) edges within a graph layout.

Reduced Visual Complexity The repulsion moves curves to paths with low energy. However,

the attraction limits this ability. Still, there is a certain probability that initially

closely placed curves are moved to the same path with low energy in the layout area. Such

a visual bundling of close curves reduces the visual complexity of the entire visualization.

The following chapter will introduce explicit methods to reduce the visual complexity of


hypergraph visualizations. Visual bundling as a side effect of energy-based routing of

curves is also discussed in Section 4.4.2.2 on page 67.

Visualization of Routed Curves The visualization of routed curves is not a primary focus of this work. The dummy nodes of the curve model should serve as a foundation. Either these dummy nodes are connected by straight lines, or splines can be used for smoothing. An interpolation must consider the deviation of the visualized curve from the positions of dummy nodes to avoid node occlusion that might be introduced by such a smoothed visualization. Section 4.7.1 on page 74 will discuss the visualization of curves. The visualizations generated for this work connect neighboring dummy nodes of curves with straight-line curves. With increasing curve model fidelity, the bends of polygonal chains (polylines) are hardly perceptible.


4 Reduction of Visual Complexity

Hyperedges allow several vertices of a graph to be connected simultaneously. The visualization of large hypergraphs, as common in the field of software visualization, involves a large amount of visualized information of the underlying graph and additional hyperedges. The amount of visualized information of a hyperedge grows with an increasing number of hyperedge nodes. The representation of the node connectivity of a hyperedge adds extra information to a hypergraph drawing. To improve hypergraph visualizations in terms of readability and comprehension, the visual complexity of hypergraphs must be reduced.

This chapter introduces two options to reduce the visual complexity of hypergraph drawings. First, the computed hyperedge layout can be rearranged such that close curves are visually bundled together. Second, a simplification of a hypergraph's model can also reduce the complexity of a visualization.

Prerequisites The energy-based routing of hyperedges produces layouts that fulfill the

given requirement to preserve the expressiveness of graph layouts. However, routing does

not affect the visual complexity of hypergraph drawings, because the amount of visualized

information is not altered.

The previous chapters modeled hyperedges by a set of curves connecting all hyperedge

nodes with each other. The visual complexity of hypergraphs corresponds to the amount

of visualized information that presents hyperedges. Curves are the visualized objects that

represent hyperedges. The number of curves is determined by the hyperedge structure

and the number of hyperedge nodes.

The visual complexity of a hyperedge with n hyperedge nodes is caused by n curves using the centralized structure. The fully connected structure needs n(n−1)/2 curves. Thus, the latter structure of hyperedges generally suffers from a higher visual complexity. But, as this chapter will show, the visual complexity of both structures can be reduced.
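The difference between both structures is easy to tabulate. The short sketch below counts the curves of one hyperedge for both structures; the function names are assumptions chosen for illustration.

```python
def curves_centralized(n):
    """One curve from each of the n hyperedge nodes to the crux."""
    return n

def curves_fully_connected(n):
    """One curve for each pair of the n hyperedge nodes."""
    return n * (n - 1) // 2

for n in (3, 5, 7, 10):
    print(n, curves_centralized(n), curves_fully_connected(n))
```

The centralized structure grows linearly, whereas the fully connected structure grows quadratically with the number of hyperedge nodes.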

4.1 Classification of Visual Complexity

The term visual complexity reflects the readability of a drawing and the ability to comprehend the displayed amount of information. The more information of a drawing a viewer can process in a shorter amount of time, the higher the readability of a hypergraph visualization. The amount of visualized information, or more precisely, the amount of information necessary to comprehend the visualized matter, corresponds to the term visual complexity in this work and also corresponds to cognitive load, which is explained in the following.

A higher complexity of a hypergraph drawing impedes the readability, because a viewer cannot easily and quickly perceive a complex structure. A low visual complexity of hypergraph drawings allows a viewer to comprehend the structure of a hyperedge more easily. Two major influences can be identified to describe the complexity of a hypergraph drawing: the amount of information, e.g., the number of curves modeling a hyperedge, and the way of presenting the hypergraphs to a viewer.

Hence, the obvious question is how to estimate and compare the visual complexity of different hypergraph layouts. The cognitive load theory helps to answer this question.

The next section briefly introduces the cognitive load theory. The theory focuses on

the analysis of human learning capabilities and the preparation of the presentation of

information. The theory also describes the limitations of human cognition and considers

the amount of presented information. Afterwards, the theory is applied to describe visual

complexity of hypergraph drawings.

4.1.1 Cognitive Load

The cognitive load theory is founded on Miller's publication [41] from 1956, which suggests that the human capacity for processing information is limited. Since this work, written from the perspective of a cognitive psychologist, cognitive science has assumed that the amount of memorized information is limited, independent of a particular task and of the process someone uses to solve it.

The cognitive load theory deals with the units (chunks) of information that a human can retain in short-term memory before loss. Humans can process only about seven chunks of information (plus or minus two). For instance, most people can remember a seven-digit phone number [41]. The cognitive load theory investigates the human process of learning by understanding the human cognitive skills and provides empirically based guidelines to present information and “optimize intellectual performance” [58].

The cognitive load denotes the information chunks that are processed to solve a problem. The study of a subject written with an unfamiliar vocabulary, for instance, means a higher cognitive load than with a familiar vocabulary. A cognitive overload causes a lower understanding and comprehension performance [57]. According to recent publications from Sweller et al., three types of cognitive load are differentiated, which are briefly described in the following.

Intrinsic Cognitive Load The intrinsic load can be described as the complexity of the content or the difficulty of a task. It can be reduced only by reducing the amount of information, and not by changing its visual representation [56].

Extraneous Cognitive Load The extraneous load denotes the unnecessary information

chunks that should be avoided. A reduction of extraneous cognitive load is the easiest way

to prevent an information overload. For instance, a square is best described visually [37].

In comparison, a verbal description of a square is much more complex and difficult to

understand. Thus, the more efficient visual specification is preferable for this example.


Germane Cognitive Load The germane load illustrates the amount of information that

is necessary to allow a person to construct or acquire a schema [58] from the presented

information. A schema helps to accelerate the cognition and processing of information and

thus increases the human learning performance. This type of load utilizes the remaining

free space of the available working memory.

In contrast to the intrinsic load, the two latter types of cognitive load are manipulable

by the presentation of the information. The intrinsic and the extraneous cognitive loads

are additive.

4.1.2 Utilization of Cognitive Load Theory

The cognitive load theory is applied to hypergraph drawings and allows their visual complexity to be assessed. The intrinsic load of a hypergraph is constituted by the fixed graph layout and the hyperedges that have to connect all hyperedge nodes with each other.

The way of presenting hyperedges, i.e., the way of connecting all hyperedge nodes and

visualizing these connections in particular, reflects the extraneous load. Because both

loads are additive, a reduction of the extraneous load becomes very important if there is

a high intrinsic load, as usual for the visualization of large hypergraphs.

The germane load is not considered as it does not describe or reduce the visual complexity

of hypergraph drawings. A reduction of the visual complexity of hypergraph drawings

can be achieved by a reduction of the intrinsic, of the extraneous cognitive load, or of both

loads altogether.

A viewer’s capability to process the displayed amount of information is highly limited,

since only a few chunks of information, which can be related to each other, are retained

in the working memory. For instance, a viewer might ask which nodes of a graph are

connected by a hyperedge and where those nodes are. These questions are the main

motivation of hypergraph visualizations and their importance was explained in Section 1.2.

It is not necessary to identify individual curves between hyperedge nodes to answer these

questions. It is sufficient to identify the overall connectivity among all hyperedge nodes.

In terms of the cognitive load theory, individual curves are extraneous cognitive load if

there is still any visual connection between all nodes of the hyperedge.

Aggregation of Curves The aggregation of curves simplifies the information that is necessary to visualize the connectivity between hyperedge nodes. Therefore, close curves, which use similar paths in the graph layout, are aggregated to reduce the amount of displayed information. The aggregation comprises two different types. First, the aggregation

of curves that simplifies the hypergraph model reduces the intrinsic load. Second, if the

aggregation of curves bundles curves visually together without changing the underlying

hypergraph structure, then the extraneous load is reduced. In both cases, the total load

and thus the visual complexity is reduced.

The example hyperedge in Figure 4.1 is modeled in the fully connected structure. The left drawing in Figure 4.1a depicts curves by straight-line curves connecting hyperedge nodes. The same hyperedge is shown in Figure 4.1b and still reveals the same connectivity information. All hyperedge nodes are connected with each other by a set of aggregated curves. Thus, the cognitive load is reduced and the clutter caused by the tangle of curves in Figure 4.1a is bypassed by the aggregation of close edges.

Figure 4.1: Reduction of visual complexity of a hyperedge by aggregating curves. (a) Curves of a fully connected hyperedge produce visual clutter. The visual complexity is high and the readability is obviously low. (b) Aggregated curves of the hyperedge in (a) still reveal the same connectivity information. The visual complexity is reduced and the readability is increased.

The aggregation of curves may have an impact on the uniform connectivity between

hyperedge nodes, as the hyperedge structure is modified. This chapter focuses on the presentation

of techniques that promise a reduction of the visual complexity of hypergraphs.

The evaluation in Chapter 5 presents hypergraph drawings and thus addresses the compliance with the first requirement of hypergraph visualizations.

Measures The depicted instance in Figure 4.1 reduces the visual complexity of the hypergraph

drawing by reducing the number of curves from 21 in Figure 4.1a on the left to

8 in Figure 4.1b on the right side. Furthermore, the total edge length is also obviously

reduced by the aggregation. In this particular case, the aggregation of curves reduced the

total edge length by more than 80 percent.

This example demonstrates the reduction of the visual complexity of hypergraphs by

an aggregation of curves. The ratio of the reduced to the initial total length of curves is

an indicator for the reduction of the cognitive load. The length of a curve is determined

by the sum of its curve segment lengths if curves are modeled by the dummy nodes as in

the previous chapter. The length of a curve segment is the Euclidean distance between

the end points of the curve segment. The ratio of lengths is independent of any scaling

of the graph layout and indicates the reduction of the load that visualizes the hyperedge

connectivity.
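The length-based measure described above can be sketched as follows. Curves are given as polylines, i.e., lists of points consisting of the end points and the dummy nodes in between. The function names and the example curves are assumptions made for illustration.

```python
import math

def curve_length(points):
    """Sum of the Euclidean lengths of the segments of a polyline."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def length_ratio(initial_curves, reduced_curves):
    """Ratio of the reduced to the initial total length of curves."""
    initial = sum(curve_length(c) for c in initial_curves)
    reduced = sum(curve_length(c) for c in reduced_curves)
    return reduced / initial

# Hypothetical example: two curves share their first segment, which is
# drawn only once after the aggregation.
initial_curves = [[(0, 0), (4, 0), (4, 3)], [(0, 0), (4, 0), (4, -3)]]
reduced_curves = [[(0, 0), (4, 0)], [(4, 0), (4, 3)], [(4, 0), (4, -3)]]
ratio = length_ratio(initial_curves, reduced_curves)
```

In this example the number of curve parts rises from two to three while the total length drops from 14 to 10, which illustrates why the length ratio and not the curve count is used as the indicator.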

The number of curves indicates the visual complexity only to a limited extent. The

number of curves may increase through curve aggregation if the aggregated parts of curves

are counted, too.

4.2 Techniques to Reduce Visual Complexity

Figure 4.2: Results of curve aggregation techniques introduced in this chapter that reduce the visual complexity of hypergraph drawings. (a) An example hyperedge in centralized structure. (b) Energy-based curve bundling visually bundles curves. (c) An energy threshold aggregates visually bundled curves in the gray marked area. (d) Curve aggregation based on curve clusters. (e) Energy-based widening of curves visually bundles curves and cuts nodes out.

Four techniques are introduced in this chapter. They all reduce the visual complexity of hyperedge drawings by reducing the cognitive load. The hypergraph drawings in Figure 4.2 exemplify the different techniques and depict their effects on hypergraph drawings. Initially, an example hyperedge, which consists of two hyperedge nodes and connects them by two curves to the crux, is given. Figure 4.2a shows that hypergraph without any reduction of its visual complexity.

As mentioned earlier, the aggregation of curves results in two distinct types of load reduction. Table 4.1 classifies the techniques according to this distinction. Model-based aggregation techniques in the right column reduce the intrinsic cognitive load. The extraneous load is reduced by techniques that visually bundle curves.

The energy-based bundling technique adds an attraction between close curves to bundle

them visually together. Figure 4.2b drafts the result of the bundling technique. Both

curves overlap each other. Visually bundled curves can also reduce the visual complexity

as the extraneous load is decreased by a less detailed amount of displayed information.

                          Visual bundling                        Model-based aggregation
Energy-based computation  Energy-based bundling (Section 4.4)    Energy-threshold-based aggregation (Section 4.5)
                          Energy-based widening (Section 4.7)
Geometrical computation   –                                      Cluster-based aggregation (Section 4.6)

Table 4.1: Overview of curve aggregation techniques to reduce the visual complexity of hypergraph drawings

Another energy-based technique, the energy-threshold-based aggregation, reduces the complexity of the hypergraph model. Based on a preset energy threshold, close

and overlapping curves (marked by the gray background in Figure 4.2c) are aggregated

in the hyperedge model to reduce the intrinsic load. As this figure also shows, the curves

of the hyperedge might be bundled with the energy-based bundling technique in advance.

The resulting hypergraph drawing is similar to the result of the following cluster-based

curve aggregation technique (cf. Figure 4.2d).

A cluster-based curve aggregation directly reduces the intrinsic load with a geometrical

approach. The hyperedge structure is simplified by the identification and aggregation of

close parts of adjacent curves. Subsequently, in contrast to energy-based bundling, the

resulting hypergraph drawing as in Figure 4.2d does not completely show the formerly

individual curves.

The widening of curves is a different approach to reduce the visual complexity of hypergraphs. The width of curve visualizations was not discussed before, because curves are usually visualized by line segments with a small but perceptible line width. A significantly higher curve width increases the probability of overlapping curves. Thus, curve widening visually bundles close parts of curves. Consequently, the widening of curves also reduces the extraneous load. Figure 4.2e depicts the widened curves by gray planes enclosing the curves. Using the two orthogonal visual concepts of boxes and lines of graph drawings on the one hand and irregular planes on the other hand may further foster the readability of hypergraph drawings.

4.3 Related Work

Before these techniques are examined, this section discusses related work of graph visualizations

that also combine edges. These contributions were motivated by the reduction

of visual clutter caused by edges. Both approaches, a visual bundling and an aggregation

in the model, were also applied for other graph visualizations. The reduction of visual

complexity of hyperedges was not explicitly examined before.

Hierarchical Edge Bundling Holten et al. investigated the visualization of hierarchical

edge bundles [32] to reduce visual clutter caused by straight-line edges. The edges are


bent and modeled as B-spline curves. Two examples of the layout of bundled edges are

depicted in Figure 4.4.

The hierarchy tree of a hierarchical graph is used to bundle edges visually. By this, the edges are also routed implicitly. Each node of the hierarchy tree is assigned to a position in the graph layout. For instance, the non-leaf nodes P1, P2, P3 in Figure 4.3 of a hierarchy tree are placed at the center of the positions of their children nodes.

An edge (u, v) between two nodes u and v of the base graph is routed through the positions of the parent nodes, which are on the shortest path in the hierarchy tree between the leaves representing u and v.

Figure 4.3: Path of an edge is derived from the graph hierarchy, taken from [33]
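The path derivation described above can be sketched with a parent map of the hierarchy tree. The code below is an illustration of the idea only, not Holten's implementation: the tree structure, the leaf names, the coordinates, and all function names are assumptions. Only the node names P1, P2, P3 are borrowed from the text.

```python
def path_to_root(node, parent):
    """Nodes from the given node up to the root of the hierarchy tree."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def hierarchy_path(u, v, parent):
    """Tree path from leaf u to leaf v via their lowest common ancestor."""
    up = path_to_root(u, parent)
    down = path_to_root(v, parent)
    ancestors_of_u = set(up)
    lca = next(n for n in down if n in ancestors_of_u)
    return up[:up.index(lca) + 1] + down[:down.index(lca)][::-1]

def center(points):
    """Center of a list of 2D positions."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

# Hypothetical hierarchy: root P1 with children P2 and P3, leaves a, b, c.
parent = {"P2": "P1", "P3": "P1", "a": "P2", "b": "P2", "c": "P3"}
position = {"a": (0.0, 0.0), "b": (2.0, 0.0), "c": (6.0, 4.0)}
position["P2"] = center([position["a"], position["b"]])
position["P3"] = center([position["c"]])
position["P1"] = center([position["P2"], position["P3"]])

route = hierarchy_path("a", "c", parent)
control_points = [position[n] for n in route]
```

The resulting list of positions would serve as the control points of the curve that represents the edge (a, c).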

Holten’s method is limited to hierarchical graphs and tree visualization methods. A

hierarchy tree is therefore required. In contrast, the visualization techniques presented in

this thesis can be applied to graphs without hierarchical information, too. In addition,

Holten’s bundling technique also routes edges implicitly, but the major requirements of

the present thesis like the avoidance of node occlusion and cluster intersections are not

considered by Holten et al.

Figure 4.4: Two edge bundling results of Holten’s approach in [32]

Cluster-Based Edge Bundling Balzer and Deussen developed an interactive visualization

of clustered graphs [7]. Implicit surfaces, i.e., adaptive shapes that enclose vertices of a

cluster, represent cluster bounds. Edges are routed and bundled to reduce the amount of

visualized information. The position of the cluster bounds is used to compute base points

that are used to route and bundle edges across different clusters. Figure 3.1b on page 33

also shows a bundling of edges connecting different clusters.

The level-of-detail of the graph visualization depends on the viewpoint. Both the visualization

of cluster bounds and edges are influenced by the distance from the viewpoint.

A close viewer perceives bundled edges more individually and identifies the content of a


Figure 4.5: A Hierarchical Net in software landscapes, i.e., a 2.5-dimensional box-and-line

visualization of a software system, taken from [8]

cluster more clearly. More abstract and solidly bundled edges and opaque cluster bounds

are shown to a distant viewer.

Hierarchical Nets in software landscapes [8] is another edge bundling approach that

relies on the clustering information or the hierarchy of a graph. Figure 4.5 depicts an

example visualization. The vertices are placed on a two-dimensional plane (neglecting the

transparent spheres, which indicate clusters). The net of bundled edges connects vertices

across cluster bounds utilizing the third dimension.

Again, neither of the two cluster-based edge bundling techniques can be applied to

produce hypergraph layouts that fulfill our requirements for hypergraph visualizations.

First, the routes of edges do not respect the remaining graph layout. Node occlusions

are not prevented. Second, these methods are not applicable to solely reduce the visual

complexity of hypergraphs since a clustering of nodes is required. It is also not practicable

to automatically compute a reasonable clustering of any graph in advance, because an

inappropriate clustering would also cause an inappropriate edge bundling.

Flow Maps To conclude this section about edge bundling techniques of other authors, a

final approach is discussed. Flow map layouts [48] aggregate edges hierarchically. A binary

splitting method based on node positions determines the route of the flow. Starting from

a certain point in the layout, the flow splits into two parts; the main part continues to

connect the remaining nodes and an auxiliary part connects the main flow with a close

node. These binary splits connect each node (of a hyperedge). Figure 4.6 shows an

example flow map.

One disadvantage of this method is the limitation to binary splits. Furthermore, a locally

high node density leads to visual clutter. Very little work on the automated flow layout

computation is available. The flow layout method by Phan et al. displaces nodes of the

graph layout to allow the routing of the flow between any pair of nodes [48]. Consequently,

no node occlusion occurs when using this method. However, the graph layout is changed,



Figure 4.6: A flow map taken from [48]


which does not comply with our requirements. Moreover, the flow routing method does

not prevent cluster intersections.

4.4 Energy-Based Curve Bundling

The curve bundling technique visually converges (parts of) close curves. The hyperedge

model, i.e., the set of curves modeling a hyperedge, remains unchanged, but the visual

complexity is reduced. An attraction between curves bundles curves with respect to the

following two aspects.

• Close parts of curves are moved closer. The visual representations of those parts

may overlap each other or are very close.

• Distant parts of curves are not bundled. As those curve parts are not eligible for

bundling, they must not be affected at all.

Bundled curves share a common path in the graph layout. A viewer perceives an aggregated

group of curves and this visual aggregation in turn simplifies the displayed structural

information of the hypergraph.

The path of a bundled curve is roughly in the area between those paths of the individual

curves, whereupon other criteria may further influence the positioning. More precisely,

the goal is to place the bundled curves on a path that optimally fits in the graph layout.

Thus, all bundled curves have to be routed to determine their optimal paths regarding the

graph layout as described in Section 3.1 about the motivation of routing.

Prerequisites The energy-based technique to bundle curves employs an attraction between

curves. Similar to energy-based routing, it is required to model the curves as chains

of dummy nodes. The bundling is independent of any particular hyperedge structure, so

this technique can be applied to the centralized and the fully connected structure. It is not


required to route curves before bundling. Since each change of a dummy node’s position

may result in node occlusion, the bundled curves must be routed after each bundling to

preserve the layout’s expressiveness. The order of routing and bundling is discussed in

Section 4.4.2 below.

Eligibility Curves are eligible for bundling if they are close to each other. This imprecise

requirement expresses the need of an attraction for close parts of curves only. Regarding

the bundling of curves, distant curves must not affect each other. A value that defines the

eligibility of parts of curves for bundling is denoted as a maximum distance dmax between

a pair of dummy nodes of two distinct curves. It must be defined in advance. The dynamic

identification of a proper value for dmax is not discussed here, as this section focuses on

the introduction of the basic principle of this energy-based bundling approach.

4.4.1 Attraction

Energy-based algorithms model curves rather by a set of dummy nodes than by a body.

Thus, the attraction acts between dummy nodes of distinct curves, but not between

dummy nodes of the same curve.

Attraction Energy The attraction energy is characterized by the following requirements.

• The attraction energy of bundled curves is minimal.

• The attraction energy of not bundled curves, which are eligible for bundling, is not

minimal.

• The attraction energy of distant curves, which are not eligible for bundling, is minimal

since the attraction must not influence distant parts of curves.

In comparison to the attraction of the energy-based routing technique, these requirements

demand a more laborious formalization of the attraction.

The first option is the usage of an attraction energy similar to the one that hinders the strain

of curves during energy-based routing. The bundling attraction energy EB of a dummy

node δ1 towards another dummy node δ2 of distinct curves is defined as below. Again,

a > 0 is a customizable exponent, which is not related to the exponents of other energies

in this work.

E_B = \frac{1}{a} \cdot \| p_{\delta_2} - p_{\delta_1} \|^{a} \qquad (4.1)

However, this attraction energy EB is high for very distant pairs of dummy nodes. To

restrict the influence of the attraction between parts of curves that are not eligible for

bundling, the definition of the attraction energy EB(d) has to be limited to the range

0 ≤ d = ‖pδ2 − pδ1‖ ≤ dmax; otherwise it is defined to be already minimal, e.g., zero.


E_B(d) = \begin{cases} \frac{1}{a} \cdot d^{a} & \text{if } d \le d_{max} \\ 0 & \text{else} \end{cases} \qquad (4.2)
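
As a minimal sketch, the restricted energy of Equation 4.2 can be written as a small Python function. The function name and the parameter values in the usage lines are assumptions made for illustration and are not taken from the thesis.

```python
def bundling_energy(d, a, d_max):
    """Attraction energy E_B(d) of Equation 4.2 for a dummy-node distance d."""
    if d < 0 or a <= 0:
        raise ValueError("d must be non-negative and a must be positive")
    # Pairs that are farther apart than d_max are not eligible for bundling,
    # so their energy is defined to be minimal already (zero).
    return d ** a / a if d <= d_max else 0.0


# Illustrative values only: a = 2 and d_max = 5 are arbitrary choices.
print(bundling_energy(3.0, a=2.0, d_max=5.0))
print(bundling_energy(6.0, a=2.0, d_max=5.0))
```

The first call lies within the eligible range and yields a positive energy; the second call exceeds dmax and yields zero, so such a pair has no effect on the bundling.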

A second option is to align the bundling attraction with the attraction force that

hinders the strain of curves (cf. Section 3.4.3.1). Both attraction forces are set against

each other in one combined energy model for concurrent routing and bundling of curves.

Assuming that the bundling attraction is weaker than the strain delimiting attraction of

curves, distant parts of curves (that are not eligible for bundling) are not bundled. This

option is not formalized in the following.

Attraction Force The energy decisively determines the attraction force acting on the

dummy nodes. The end points of the curves are not displaced. The attraction force FB

on a dummy node δ1 is directly derived from the energy defined in Equation 4.2.


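\vec{F}_B = \begin{cases} \| p_{\delta_2} - p_{\delta_1} \|^{a-1} \cdot \frac{p_{\delta_2} - p_{\delta_1}}{\| p_{\delta_2} - p_{\delta_1} \|} & \text{if } \| p_{\delta_2} - p_{\delta_1} \| \le d_{max} \text{ and } \delta_1 \in \Delta(c_1),\ \delta_2 \in \Delta(c_2),\ c_1 \ne c_2 \\ \vec{0} & \text{else} \end{cases} \qquad (4.3)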

A dummy node δ1 of the curve c1 is attracted to a dummy node δ2 of a different curve

c2 ≠ c1. There is a weak force, and thus a small node displacement, if the two dummy

nodes are already close to each other. Consequently, the distance between them tends to

zero. Distant parts of the curves, that are not eligible for bundling, are not bundled. This

is because there is no attraction between dummy nodes that are more distant than dmax.
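
The following Python sketch illustrates how the force of Equation 4.3 could drive one bundling step. It is a simplified illustration and not the implementation of this thesis: the step size, the summation over all dummy nodes of the other curve, and the restriction to interior dummy nodes as attractors are assumptions.

```python
import math


def bundling_force(p1, p2, a, d_max):
    """Attraction force on the dummy node at p1 towards the dummy node at p2.

    Follows Equation 4.3; the caller must pass dummy nodes of distinct curves.
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    d = math.hypot(dx, dy)
    if d == 0.0 or d > d_max:
        # Coinciding nodes need no attraction; distant nodes are not eligible.
        return (0.0, 0.0)
    magnitude = d ** (a - 1)
    return (magnitude * dx / d, magnitude * dy / d)


def bundling_step(curve, other, a, d_max, step=0.1):
    """Displace the interior dummy nodes of `curve` once towards `other`.

    Both curves are lists of (x, y) positions; the first and last entries are
    the end points, which are never displaced. `step` is an assumed step size.
    """
    moved = [curve[0]]
    for p in curve[1:-1]:
        fx = fy = 0.0
        for q in other[1:-1]:
            gx, gy = bundling_force(p, q, a, d_max)
            fx, fy = fx + gx, fy + gy
        moved.append((p[0] + step * fx, p[1] + step * fy))
    moved.append(curve[-1])
    return moved


upper = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
lower = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(bundling_step(lower, upper, a=2.0, d_max=1.5))
```

In this example the single interior dummy node of the lower curve moves towards the upper curve, while both end points stay fixed.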

[Sketch: two curves within the distance dmax, arrows for the attraction force FB, and the resulting bundled curves]

Figure 4.7: Energy-based bundling of two curves eligible for bundling

Figure 4.7 shows a sketch of the attraction acting between two close curves. Direction

and magnitude of the bundling attraction force are depicted by the arrows between dummy

nodes. For simplicity and avoidance of clutter in the figure, each dummy node is only

attracted to one other dummy node. The result of the bundling is two closer, visually bundled

curves.

4.4.2 Order of Energy-Based Routing and Bundling

The order of routing and bundling may impact the produced hypergraph layouts. Therefore,

the differences are briefly outlined in the following paragraphs.

4.4.2.1 Separated Curve Routing and Bundling

Energy-Based Curve Routing First The energy-based routing of curves can radically

separate initially close curves (depicted in Figure 4.8a). As the example visualization in

Figure 4.8b shows, after routing both curves are clearly separated by the central cluster


(a) An example drawing of an unrouted and unbundled hyperedge

(b) The curves of the hyperedge in (a) are routed first

(c) The curves of the hyperedge in (a) are bundled before routing

Figure 4.8: Impact of the order of energy-based routing and bundling on the resulting

hypergraph layout

of repulsing nodes. Consequently, the energy-based approach from above will not bundle

the curves as both are too distant and not eligible for bundling anymore.

After the bundling of curves, all moved curves must be re-routed to avoid both

the occlusion of nodes and the intersection of clusters.

Energy-Based Curve Bundling First Figure 4.8c depicts the result of the example hypergraph

where both curves are bundled before routing. This approach obviously produces a

different layout compared to Figure 4.8b. A re-routing is not necessary for this approach.

Conclusion It is possible to create instances to underline advantages and disadvantages

of both orders. No particular order can be favored because of their layouts. The need of

the first approach to re-route the curves might be a drawback due to its additional time

consumption.

4.4.2.2 Concurrent Curve Routing and Bundling

A further option is the combination of routing and bundling. This section picks up the

statement from the discussion of routing techniques in Section 3.5 that the routing of curves

fosters curve bundling. Hence, the subsequent combination of energy-based routing and

bundling is a reasonable option.

This combination was already discussed above in Section 4.4.1 as a second option to

formalize the attraction energy EB. The combination of both techniques into one consistent

energy model overcomes the need to decide on the order of routing and bundling.


Thus, both goals of routing and bundling are likewise pursued. This approach applied

on the example hypergraph may result in layouts similar to both layouts in Figures 4.8b

and 4.8c.

Energy-Based Routing Fosters Bundling The purpose of routing techniques is to move

curves to paths without hindering nodes. These correspond to paths with (locally) low

total repulsion energy caused by nodes. For instance, an optimal path for routed curves

between two clusters of repulsing nodes is located in-between them. There is a high

probability that two curves, which are initially placed close to this path, will be placed

at this optimal path with low energy after routing. Subsequently, energy-based routing

already assists bundling as close curves are converged. The combination of routing and

bundling thus can support each other to visually bundle curves.

4.4.3 Conclusion

The energy-based bundling technique visually bundles close parts of curves together. However,

it requires an explicit distance-based restriction to decide which curves are eligible

for bundling. Alternatively, energy-based routing and bundling can be combined into one

consistent energy model. Both approaches of the bundling technique reduce the visual

complexity of hypergraphs.

An automated identification of bundled parts of the curves is not possible by this energy-based

bundling technique, because the hypergraph model remains unchanged. A hypergraph

drawing consequently shows the curves individually; the impression of an aggregation

of curves arises by an overlapping of curves.

The following section introduces a technique to aggregate visually bundled curves in the

hyperedge model.

4.5 Energy-Threshold-Based Aggregation

The reduction of the complexity of the hyperedge models might be preferable to a solely

visual bundling of curves. The information of which parts of curves are aggregated can be

useful for further hypergraph processing or the visualization. Furthermore, a simplified

model reduces the complexity of the hypergraph layout computation.

This technique identifies close parts of curves of a given hypergraph layout, aggregates

those parts of the curves and reduces the complexity of the hyperedge structure accordingly.

Each curve creates an energy field with maximum field strength at the curve’s

position. The strength of the energy field decreases with increasing distance from the

curve. The energy fields of close curves overlap each other. Two parts of curves are aggregated

if the sum of both field strengths between both parts is higher than a certain

threshold.

Prerequisites Routed and bundled hypergraph layouts are likely to position curves more

closely. The energy-based routing tends to accumulate curves at paths with locally low


energy and the energy-based bundling technique directly aims at a visual bundling of

curves. Routing or bundling of curves is not required to apply this aggregation technique,

but can be expedient steps in advance.

Threshold-Based Closeness The closeness of curves can be measured by the distances

between them or by the strengths of energy fields. Both measures are similar, as the energy

field strength is a function of the distance, e.g., E ∼ d^x. Energy is the more abstract

measure, as it can be adjusted to the scaling of the graph layout, and it allows different

energies for different curves, e.g., to increase the energy of aggregated curves.

Two points on distinct curves are close to each other, and thus will be aggregated, if

the sum of both energy field strengths between those points is higher than the energy

threshold. Again, it is necessary to model curves by chains of dummy nodes. The energy

field strengths are computed with respect to the positions of the dummy nodes.

4.5.1 Rationale

Once an energy threshold is set, the sum of the energy field strengths Ec1 +Ec2 between two

dummy nodes δ1 ∈ ∆(c1) and δ2 ∈ ∆(c2) of two distinct curves c1 and c2 is calculated.

The energy sum E(ps) = Ec1(d1) + Ec2(d2) at a position ps depends on the distances

d1 = ‖pδ1 − ps‖ and d2 = ‖pδ2 − ps‖ to the dummy nodes, respectively.

The curves are aggregated if and only if the energy sum reaches the energy threshold t,

i.e., E(ps) ≥ t, at all positions ps between the dummy nodes δ1 and δ2. The energy

field strengths against the distance are plotted in Figure 4.9 for the following scenario.

The cross sections of curves are depicted, the horizontal axis reflects the distance between

them. The vertical axis reflects the energy field strength. The two curves c0 and c1

were already aggregated and they both create a stronger energy field than curve c2. The

sum of all field strengths (depicted by the bold plot in this figure) between the curves is

higher than the threshold value (the dotted line). Eventually, the curves c0, c1, and c2 are

aggregated.

A threshold distance dt is derived from the threshold energy t. As the energies can be different

for each curve, this threshold distance dt can be different for different pairs of curves.

This threshold distance dt is derived from the energy fields Ec1 and Ec2 . Consequently,

two parts of the curves c1 and c2 are close if and only if the distance ‖pδ2 − pδ1‖ ≤ dt

between the dummy nodes is not larger than the threshold distance dt.
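
The threshold test can be sketched in Python as follows. The concrete field function is an assumption: the thesis only requires a field that is strongest at the curve and decreases with the distance, so a simple reciprocal decay with a hypothetical strength parameter is used here. The positions between the two dummy nodes are sampled along the straight segment connecting them.

```python
import math


def field_strength(distance, strength=1.0, x=1.0):
    """Hypothetical energy field: maximal at the curve, decreasing with distance."""
    return strength / (1.0 + distance) ** x


def is_aggregated(p1, p2, t, s1=1.0, s2=1.0, samples=50):
    """Check whether the energy sum reaches the threshold t at every sampled
    position between the dummy nodes at p1 and p2.

    s1 and s2 are the (assumed) field strengths of the two curves.
    """
    total = math.dist(p1, p2)
    for k in range(samples + 1):
        d1 = total * k / samples  # distance of the sample to the first node
        d2 = total - d1           # distance of the sample to the second node
        if field_strength(d1, s1) + field_strength(d2, s2) < t:
            return False
    return True


print(is_aggregated((0.0, 0.0), (2.0, 0.0), t=0.9))          # close curves
print(is_aggregated((0.0, 0.0), (4.0, 0.0), t=0.9))          # distant curves
print(is_aggregated((0.0, 0.0), (4.0, 0.0), t=0.9, s1=2.0))  # stronger field
```

The third call mirrors the scenario of Figure 4.9 in spirit: a curve with a stronger field, e.g., an already aggregated one, can still be aggregated with a curve at a distance that would be too large for two curves of ordinary strength.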

The position of an aggregated curve is determined by the remaining graph layout. The

aggregated curves are roughly positioned between the individual curves. The precise

positions are computed by the energy-based routing technique to avoid the introduction

of node occlusion or cluster intersection by the aggregation of curves.

4.5.2 Conclusion

This curve aggregation technique based on an energy threshold makes it possible to simplify the

hypergraph models and thus reduces the visual complexity of hypergraph drawings. The



[Plot: energy field strength E over the distance d for the cross sections of the curves c0, c1, and c2, together with the energy sum and the threshold]

Figure 4.9: The aggregation of close curves based on the preset energy threshold (dotted

horizontal line)

specification of a threshold determines which curves are aggregated, so the choice of a proper

threshold value is crucial.

This aggregation technique can be generally applied to hyperedges. A previous routing

or bundling of curves is optional. But this technique can enhance the energy-based

bundling technique of the previous section, if both techniques are used in sequence. It

bridges the gap between visual bundling and a model-based aggregation of curves by

transforming a visually bundled curve layout into aggregated curves in the model. However,

the cluster-based curve aggregation technique in the subsequent section is less laborious

than this one.

4.6 Cluster-Based Curve Aggregation

The aggregation of curves based on a spatial clustering is another technique to reduce the

visual complexity of hypergraph visualizations. The hypergraph structure is simplified in

the model, which corresponds to a reduction of the intrinsic cognitive load. This technique

first groups close adjacent curves. Then, curves of the same group are aggregated as far

as possible and at a certain point on an aggregated curve, the aggregated curve branches

out to individual curves joining the hyperedge nodes.

Prerequisites In this section, for the sake of simplicity, the centralized hyperedge structure

is assumed unless stated otherwise. Curves connect hyperedge nodes with the crux.

Assuming that the curves are not routed yet, curves are aggregated near their ends at

the crux due to the small distance between them. The absolute distance between two

neighboring curves in a radial setup is not crucial for grouping, but the included angle

implies their eligibility of aggregation.

At the end of this section, it is shown that the cluster-based curve aggregation can

also be applied to the fully connected structure as this technique is not restricted to the

centralized structure.


The remainder of this section introduces a simple prototype algorithm to identify groups

of curves that are going to be aggregated. After this, a method for the calculation of

branch points of aggregated curves is shown in Section 4.6.2. Then, Section 4.6.3 describes

movable curve end points as a consequence of aggregating and branching curves. And

finally, a conclusion of this curve aggregation technique is drawn.

4.6.1 Identification of Curve Groups

The first step towards an aggregation is the identification of groups of close curves. A

polar coordinate system has its origin at the position of the crux. The hyperedge node

positions are described by the azimuth angle (or polar angle θ) and the curve length. The

azimuth angle θ(c) of a curve c is the included angle of the zero degree ray (polar axis)

and the curve.

The plethora of well-investigated clustering algorithms [63] makes it easy to partition

hyperedge nodes. The azimuth angle serves as a primary criterion to cluster the respective

curves. The Euclidean distance between hyperedge nodes and the crux may also influence

the creation of groups, but is not considered here.

Different clustering algorithms are expedient for different purposes. As this work does

not focus on a specific application or criterion of hypergraph visualization, a discussion of

clustering algorithms is outside its scope. The following paragraphs therefore introduce a

simple strategy to group nodes. This strategy is adequate to explain the concept of curve

aggregation and to document a decrease of visual complexity of hypergraphs.


1. The azimuth angle θ(c) of each curve c is computed.

2. The included angle ∆θ(ci, cj) between each pair of distinct curves ci and cj is computed.

∆θ(ci, cj) = |θ(ci) − θ(cj)| (4.4)

3. The pair of curves ci and cj with minimal included angle ∆θ(ci, cj) is aggregated. The

aggregated curve cij is located between both individual curves. The direction of the

aggregated curve is affected by the curve weights w(ci) = len(ci) and w(cj) = len(cj),

which correspond to the curves’ lengths, i.e., the Euclidean distance between the end

points of the curve. The azimuth angle θ(cij) of the aggregated curve is given as

follows.

\theta(c_{ij}) = \theta(c_i) + \frac{\mathrm{len}(c_j)}{\mathrm{len}(c_i) + \mathrm{len}(c_j)} \cdot (\theta(c_j) - \theta(c_i)) \qquad (4.5)

The weight w(cij) = len(ci) + len(cj) of an already aggregated curve cij is the sum

of lengths of all curves aggregated into cij.

4. The curve group cij is added and the two individual curves ci and cj are removed

from the set of curves in the model of the hyperedge structure.

5. Steps 1 through 4 are repeated until there is no further pair of distinct curves with

an included angle less than the maximum angle. The maximum angle ∆θmax is

the termination criterion of this algorithm and specifies which curves are eligible for

aggregation.
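
The grouping steps above can be sketched in Python as follows. This is an illustration under stated assumptions and not the prototype of this thesis: the included angle is assumed to wrap around at 360 degrees, ties between equally small angles are broken arbitrarily, and the azimuth angles and curve lengths in the usage example are hypothetical values chosen to be consistent with the included angles listed in Tables 4.2 and 4.4.

```python
def included_angle(theta_i, theta_j):
    """Included angle in degrees (Equation 4.4), assuming wrap-around at 360."""
    diff = abs(theta_i - theta_j) % 360.0
    return min(diff, 360.0 - diff)


def group_curves(curves, max_angle):
    """Greedy grouping of curves by azimuth angle.

    `curves` maps a curve name to (azimuth in degrees, length).
    Returns a dict mapping each group (tuple of names) to (azimuth, weight).
    """
    groups = {(name,): (theta % 360.0, length)
              for name, (theta, length) in curves.items()}
    while len(groups) > 1:
        keys = list(groups)
        # Steps 1-3: find the pair with the minimal included angle.
        angle, gi, gj = min(
            ((included_angle(groups[a][0], groups[b][0]), a, b)
             for i, a in enumerate(keys) for b in keys[i + 1:]),
            key=lambda item: item[0],
        )
        # Step 5: stop if no pair is below the maximum angle.
        if angle >= max_angle:
            break
        (ti, wi), (tj, wj) = groups[gi], groups[gj]
        # Equation 4.5, using the signed shortest rotation from ti to tj.
        delta = (tj - ti + 180.0) % 360.0 - 180.0
        theta = (ti + wj / (wi + wj) * delta) % 360.0
        # Step 4: replace both curves by the aggregated curve.
        del groups[gi], groups[gj]
        groups[gi + gj] = (theta, wi + wj)
    return groups


# Hypothetical azimuths and lengths (not stated in the thesis).
curves = {"c0": (0, 1), "c1": (30, 2), "c2": (90, 1), "c3": (120, 2),
          "c4": (210, 2), "c5": (270, 1), "c6": (330, 2)}
for group, (theta, weight) in group_curves(curves, max_angle=90).items():
    print(group, round(theta, 1), weight)
```

With these assumed values the sketch ends with three groups whose mutual included angles are 110, 120, and 130 degrees. Because of the arbitrary tie-breaking, the order of the intermediate aggregations may differ from the order documented in the tables.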


(a) Hyperedge without curve aggregation

(b) Aggregated curves for ∆θmax = 90°

Figure 4.10: Visualization of an exemplified hyperedge (both panels show the crux and the hyperedge nodes v0 to v6)

        c0    c1    c2    c3    c4    c5
c6      30    60   120   150   120    60
c5      90   120   180   150    60
c4     150   180   120    90
c3     120    90    30
c2      90    60
c1      30

Table 4.2: Initial included angles (in degrees) between curves of the hyperedge in Figure 4.10a

        c0,6  c1    c2    c3    c4
c5      70   120   180   150    60
c4     130   180   120    90
c3     140    90    30
c2     110    60
c1      50

Table 4.3: Updated included angles (in degrees) after the aggregation of the curves c0 and c6

        c0,6,1  c2,3
c4,5     130    120
c2,3     110

Table 4.4: Result of the grouping algorithm (included angles in degrees)


The example in Figure 4.10a depicts a centralized hyperedge composed of seven curves.

The included angles between curves are shown in Tables 4.2 through 4.4. The bold values

in these tables label the minimal included angle between two curves that are aggregated in

the respective grouping step. Two grouping steps between Table 4.3 and the result in Table 4.4

were omitted.

If the maximum aggregation angle is set to ∆θmax = 90°, then, in compliance with

Table 4.4, the presented grouping algorithm will compute three groups of curves: c0,6,1,

c2,3 and c4,5 as depicted in Figure 4.10b.

4.6.2 Branch Out of Aggregated Curves

In the following, an aggregated curve that emerged from the aggregation of two curves ci

and cj is denoted by cij. After curves are aggregated and the direction of the aggregated

curves is computed, it is necessary to branch out the aggregated curves again. By this,

the respective hyperedge nodes are connected to the aggregated curve. The branch point

should be placed in a way that reduces the visual complexity and allows smooth furcation

of individual curves and the aggregated curve.

A branch point pb is placed in the latter half (that is not adjacent to the crux) of an

aggregated curve cij. A large included angle between two curves ci and cj corresponds

to a short common aggregated path of ci and cj. The maximum common path length of

curves that are aggregated in cij is limited by the minimal length min{len(ci), len(cj)}

of individual curves. The minimal common aggregated path is here set to the half of the

minimal individual curve length.

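p_b = \frac{1}{2} \min\{\mathrm{len}(c_i), \mathrm{len}(c_j)\} + \frac{1}{2} \min\{\mathrm{len}(c_i), \mathrm{len}(c_j)\} \cdot \left(1 - \frac{\Delta\theta(c_i, c_j)}{\Delta\theta_{max}}\right)

    = \min\{\mathrm{len}(c_i), \mathrm{len}(c_j)\} \cdot \left(1 - \frac{\Delta\theta(c_i, c_j)}{2 \cdot \Delta\theta_{max}}\right) \qquad (4.6)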

An aggregated curve, i.e., group of curves, that consists of more than two individual

curves is branched out in almost the same manner. The two longest curves ci and cj of

an aggregated curve are bifurcated first as in Equation 4.6. Then the third longest curve

ck is branched out from the aggregated curve cij. In order to do this, the length of the

aggregated curve cij needs to be known. This length is defined as the distance between the

crux and the latest branch point pb of cij. This procedure is applied until all hyperedge

nodes of this group are connected.
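
For the two-curve case, Equation 4.6 can be sketched as a small Python function. Here pb is interpreted as the distance of the branch point from the crux along the aggregated curve, which is an interpretation rather than a statement of the thesis; the function name and the example values are hypothetical.

```python
def branch_distance(len_i, len_j, angle, max_angle):
    """Distance of the branch point from the crux according to Equation 4.6.

    `angle` is the included angle of the two curves, `max_angle` the maximum
    aggregation angle; both are given in the same unit (e.g., degrees).
    """
    if max_angle <= 0 or not 0 <= angle <= max_angle:
        raise ValueError("expected 0 <= angle <= max_angle and max_angle > 0")
    shortest = min(len_i, len_j)
    # The common path shrinks linearly from the full shortest length
    # (angle 0) to half of it (angle equal to max_angle).
    return shortest * (1.0 - angle / (2.0 * max_angle))


print(branch_distance(4.0, 6.0, angle=0.0, max_angle=90.0))
print(branch_distance(4.0, 6.0, angle=30.0, max_angle=90.0))
print(branch_distance(4.0, 6.0, angle=90.0, max_angle=90.0))
```

The three calls show the range of the common path: the full length of the shorter curve for parallel curves, and half of that length when the included angle reaches the maximum aggregation angle.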

Finally, the branch points are connected to the corresponding hyperedge nodes and to

the other branch points on the path to the crux. The result of curve aggregation applied

to the example above is shown in Figure 4.10b.

4.6.3 Movable Curves

Branch points, as well as the crux, do not consider the given fixed graph layout. Thus,

they may be placed inappropriately as they can occlude nodes, or they could be placed


inside a dense group of nodes. The end points of curves, which are introduced by the

branching method, must also meet the requirements of hypergraph visualizations from

Section 2.2.2.

Hyperedge nodes that are end points of curves are not displaced. The other curve end

points are movable in the layout area. The optimal position of curve end points can

be computed during energy-based routing. The energy model for routing presented in

Section 3.4 can also integrate movable curve ends as dummy nodes. Consequently, an

optimal position of an end point respects the fixed graph layout (cf. node repulsion in

Section 3.4.2) and prevents very long curve lengths (cf. strain of curves in Section 3.4.3).

4.6.4 Conclusion

A prototype algorithm for the aggregation of close curves was introduced. This algorithm

serves as a placeholder for more sophisticated and potentially more adapted grouping algorithms

for visually pleasing and meaningful results. For instance, a hierarchical clustering

algorithm can produce a hyperbolic tree, i.e., a nested aggregation of curves. Nevertheless,

the algorithm demonstrates that curve aggregation is able to reduce the visual

complexity of hypergraph drawings, as the evaluation in Section 5.4.1 will show. Concluding

the example from above, the aggregation of curves reduced the total curve length by

more than 20 percent and increased the number of curves by 2.

The aggregation and branching of curves introduces new curves. The increasing number

of curves and dummy nodes generally increases the computation time needed to generate

hypergraph layouts. At the same time, it increases the quality of approximation of the

curve representation.

Fully-Connected Hyperedge Structures The aggregation of curves is not limited to centralized

hyperedge structures. Other structures may not have a crux, but a subset of

curves is incident as they share a common hyperedge node. The curve aggregation can

be applied to all hyperedge nodes where incident curves can be aggregated.

Certainly, this approach will not aggregate close curves that are not incident at all. So

the centralized structure is preferable for curve aggregation.

4.7 Energy-Based Curve Widening

The visual complexity of hypergraphs can be reduced by bundling curves visually together

or aggregating them in the hypergraph model. The widening of curves is a technique to

bundle curves visually together as wide curves are more likely to overlap each other. This

approach also aims at the reduction of the visual complexity by reducing the extraneous

cognitive load. The intrinsic load is not modified.

As the width of curves in hypergraph drawings increases, the curves will be rather

considered as planes than line segments. A hyperedge is represented by a plane construct

in the unused graph layout area. It is visually clearly separated from the remaining graph,

in particular from ordinary straight-line edges. These different visualization paradigms


of hyperedges and boxes and lines foster the readability of hypergraph visualizations and

still allow binary edges of the underlying graph to be visualized.

Based on the findings of the examined curve routing approaches in Chapter 3, an energy-based

approach is favored over a geometrical approach. Geometrical concepts tend to rely

on the specification of fixed values and spatial computations, which can become fairly

complex as a plethora of occurring instances has to be considered.

This section introduces a transformation of curves to a plane structure by utilizing energy

fields. First, an introduction to the visualization of curves in hypergraph drawings

is given to further motivate a widening of curves. Section 4.7.3 introduces a model of

widened curves that allows their representation in energy fields. Afterwards, Section 4.7.4

outlines the basic principle of this approach that is formalized in Section 4.7.5. A conclusion

of this energy-based widening technique finishes this section.

4.7.1 Visualization of Curves

The visual representation of a curve in a graph drawing is a line segment. The positioning

of curves and the specification of the line width influence their visualization. A brief

digression on the positioning of curves is given first. Then, the line width of visualized

curves is discussed.

Positioning The positioning of curves is based on the positions of the dummy nodes.

Curves may be drawn as a sequence of straight line segments connecting neighboring

dummy nodes. Alternatively, the next two paragraphs show methods for curve fitting,

that can smoothly draw the curves.

Splines may be used to form smooth curves passing the dummy nodes. The de Casteljau

algorithm [28] makes it possible to interpolate a curve between any number of control points. Control

points are given by the dummy nodes and the end points of the curve.
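
As an illustration, the de Casteljau algorithm can be sketched in a few lines of Python. The sketch evaluates one point of the Bézier curve defined by the control points through repeated linear interpolation; the control points in the usage example are hypothetical.

```python
def de_casteljau(control_points, t):
    """Evaluate the Bezier curve of the given control points at t in [0, 1]."""
    points = [tuple(p) for p in control_points]
    if not points:
        raise ValueError("at least one control point is required")
    # Repeatedly interpolate between neighboring points until one remains.
    while len(points) > 1:
        points = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


# Hypothetical curve: two end points and one dummy node as control points.
control = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
print(de_casteljau(control, 0.0))
print(de_casteljau(control, 0.5))
print(de_casteljau(control, 1.0))
```

The curve starts and ends at the end points, but at t = 0.5 it passes below the middle control point, so a curve drawn this way does not run exactly through the dummy node.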

However, splines do not necessarily intersect the control points. They are only required

to go through the end points of a curve. In contrast, so-called interpolating Lagrange

curves intersect all given points. The Aitken algorithm [28] computes interpolating

Lagrange curves, i.e., lines intersecting all dummy nodes and the end points of the curves.
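
A minimal Python sketch of the Aitken scheme is given below. The parameter values assigned to the points are an assumption, since they have to be chosen by the caller (here they are simply 0, 1, 2); the points in the usage example are hypothetical.

```python
def aitken(points, params, t):
    """Evaluate the interpolating Lagrange curve through `points` at t.

    `params` holds one strictly increasing parameter value per point; the
    curve passes through points[i] at t == params[i].
    """
    pts = [tuple(p) for p in points]
    n = len(pts)
    if n == 0 or n != len(params):
        raise ValueError("points and params must be non-empty and equally long")
    for r in range(1, n):
        # Blend neighboring intermediate points over the span params[i..i+r].
        pts = [
            tuple(
                ((params[i + r] - t) * a + (t - params[i]) * b)
                / (params[i + r] - params[i])
                for a, b in zip(pts[i], pts[i + 1])
            )
            for i in range(n - r)
        ]
    return pts[0]


nodes = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
print(aitken(nodes, [0.0, 1.0, 2.0], 1.0))
print(aitken(nodes, [0.0, 1.0, 2.0], 0.5))
```

In contrast to the Bézier curve of the same three points, this curve passes exactly through the middle point at its parameter value.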

Splines and Lagrange curves both tend to deviate more than straight lines connecting

neighboring dummy nodes. Since energy-based routing only guarantees the prevention of

node occlusion at the positions of the dummy nodes, splines are more susceptible to node

occlusion.

Line Width Besides the positioning of curves, their line width is of particular concern in

the remaining section. Each line drawing has a certain width. Curves can be visualized

with marginal line width or by plane constructs with much larger curve width. A wide

curve must still avoid node occlusion to meet the requirements of hyperedge visualizations.

Therefore, if the line width is not minimal, a plane curve has to spare the nodes.

Another benefit of a plane curve representation is a higher ability to recognize characteristic

forms of a hyperedge. An on-line animation of the evolution of a hypergraph may


slightly change the positions or the set of nodes in each step. Characteristic forms, like

a bulge of a wide curve, support the viewer’s navigation and orientation in the hypergraph

drawing. In contrast, lines are not able to form characteristic shapes that are stable

against a slight change of a graph layout in an on-line animation.

Further, the curve width may reflect information about the hypergraph. For instance, a

broad curve is placed between two clusters. In the following step of an on-line animation

of this hypergraph, the two clusters grow by adding further nodes. As a result of this

change, the curve width decreases since the increased repulsive strength of the clusters

hinders the widening more strongly. A viewer of this animation thus can interpret the curve

width as an indicator of density of surrounding nodes.

4.7.2 Prerequisites

Widened curves must also fulfill the requirements of hypergraph visualization. The expressiveness

of the given graph layout is still preserved if the widened curves do not occlude

nodes and, of course, the graph layout must not change.

The width of a visualized curve has to be limited. If there are no graph nodes limiting

the widening of curves, a curve must not be widened infinitely.

4.7.3 Model of a Widened Curve

The definition of repulsion and attraction energies enables the calculation of the layout

of widened curves. Similar to the model of curves, which allows curves to be handled in

energy fields, a model of planes representing widened curves in energy fields is required

first. Afterwards, the energy-based principle and the composition of the energies into one

consistent energy model are examined.

Hull A routed curve is modeled as a chain of dummy nodes between the end points.

These dummy nodes are also the starting point of the model of a widened curve. A widened

curve is basically a line segment visualized with a non-zero line width. It is sufficient to

model the hull, i.e., the boundaries of the plane, to describe a widened curve. The hull

circumferentially encloses the entire curve and is the set of the outermost points of the

widened curve. Figure 4.11a shows a dashed-line hull of a routed curve.

Hull Points Still, the hull of a curve is an infinite set of points. Similar to curves, a

proper approximation is needed to compute the influence of energy fields on the hull and

thus on the body of a widened curve. The concept of dummy nodes is reused to construct

an approximation of the hull. This way, the flexibility of the incremental increase of the

accuracy of curves also holds for the accuracy of corresponding hulls.

Perpendicular vectors on both sides of a curve, originated at the positions of the dummy

nodes, reflect the potential positions of the hull points. Figure 4.11a depicts hull points by

the gray circles placed on the hull. The direction cδ of a curve c at the position of a dummy

node δ is derived from the positions of both neighboring dummy nodes δ1, δ2 ∈ neighbor(δ).


(a) Hull of a routed curve modeled by hull points

(b) Perpendiculars of a routed curve

(c) The result of the transformation from a routed curve to a plane

Figure 4.11: Energy-based widening of routed curves

This complies with the linear Bézier spline between the two control points pδ1 and pδ2, obtained by linear interpolation.

\[ c_\delta = p_{\delta_2} - p_{\delta_1} \]

Then, in two-dimensional space the perpendiculars o1 and o2 at the position of δ are orthogonal to the local direction cδ of the curve.

\[
o_1 = \begin{pmatrix} -c_\delta^{y} \\ c_\delta^{x} \end{pmatrix},
\qquad
o_2 = \begin{pmatrix} c_\delta^{y} \\ -c_\delta^{x} \end{pmatrix}
\tag{4.7}
\]

As the hull points of the hull are part of the hyperedge layout, their positions in the graph

layout are denoted by ph.

Figure 4.12: Acting forces in the process of energy-based widening (labels in the figure: FNR, FCR, FCA, FHA, o1, o2, v)

Figure 4.11b depicts a routed curve and its perpendicular vectors. As the end points of

the curve only have one neighboring dummy node, the direction of the curve at the end is

derived from the end point and its neighboring dummy node. Further, another hull point

and perpendicular are introduced for each end point of a curve. Both are placed on the

extension of the curve’s direction at the end point.
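The construction of the perpendiculars can be sketched as follows. This is a minimal illustration under stated assumptions: the curve is given as a list of positions (end points and dummy nodes in order), the perpendiculars are normalized to unit length, and the additional hull point on the extension of each curve end is left out. The name `perpendiculars` is hypothetical.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def perpendiculars(chain: List[Vec]) -> List[Tuple[Vec, Vec]]:
    """Return the pair (o1, o2) of Equation 4.7 for every point of the chain.

    Inner points use both neighbors to derive the local direction; an end
    point uses itself and its single neighbor. Vectors are normalized here,
    which is an assumption of this sketch.
    """
    result = []
    last = len(chain) - 1
    for i in range(len(chain)):
        prev = chain[max(i - 1, 0)]
        nxt = chain[min(i + 1, last)]
        cx, cy = nxt[0] - prev[0], nxt[1] - prev[1]  # local direction c
        length = math.hypot(cx, cy)
        if length == 0.0:
            raise ValueError("coinciding neighbors give no direction")
        o1 = (-cy / length, cx / length)
        o2 = (cy / length, -cx / length)
        result.append((o1, o2))
    return result
```

For a horizontal chain such as (0, 0), (1, 0), (2, 0), every point obtains the perpendiculars (0, 1) and (0, −1), one for each side of the curve.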

4.7.4 Rationale

The starting point of the widening algorithm is a routed curve, whose line width can be

assumed to be zero. The initial hull is therefore placed on the curve. The hull is “inflated”

by a repulsion. The repulsion pushes the hull equally in all directions away from the curve.

The repulsion of the hull points by the curve is limited to a maximum curve width.

It is crucial that the hull must not occlude nodes of the graph layout. Therefore, the

repulsion of the curve is limited by an opposing node repulsion. Figure 4.11c drafts the

result of this approach. A gray plane represents the area enclosed by the hull of the curve

that spares nodes. Before all energies are formalized, their purposes and requirements are

briefly summarized in the subsequent paragraphs.

Node Repulsion The repulsion of hull points by the curve, which is addressed below, is

the driving force of this energy-based widening technique. However, the node repulsion is

more important for the definition of an energy model, because widened curves must not

occlude graph nodes. Figure 4.12 illustrates the node repulsion, labeled FNR, and

the curve repulsion, labeled FCR, acting on the hull points. The remaining two

forces in Figure 4.12 are introduced below.

The specification of the node repulsion is crucial as it permanently has to be predominant

over the remaining energies to guarantee the prevention of node occlusion. The hull

points are only repulsed by the nodes on the same side of the curve. Otherwise the curve

width would be increased by repulsing graph nodes on the opposite side of the curve.


Consequently, the layout of one side of a curve does not influence the hull on the other

side of the curve.

Again, the accuracy of the hull model determines the quality of the result. A large

distance between neighboring dummy nodes, which also entails a large distance between

neighboring hull points, is more susceptible to node occlusion than a smaller distance.

Figure 4.13: Two hull points are mainly repulsed by the close curve. The node repulsion force FNR is minor and thus does not prevent node occlusion.

Curve Repulsion Hull points are repulsed from the

curve to widen the curve. Each hull point is repulsed

in the direction of the corresponding perpendicular.

In contrast to previous repulsions used in this work,

the curve repulsion must not tend to infinity for

very small distances between dummy node and hull

point. Otherwise, as Figure 4.13 depicts, the curve

repulsion might be stronger than a node repulsion, if

the distance between curve and hull point is significantly

smaller than the distance between node and

hull point. This would consequently permit node

occlusion.

Curve Attraction In order to prevent an infinite curve width, an attraction between the curve and the hull is added. The curve attraction acts in the opposite direction of the curve repulsion. The curve repulsion decreases and the curve attraction increases with increasing distance from the curve, so that the curve attraction starts to dominate the curve repulsion at a certain distance Wc. The specification of curve repulsion and attraction thus determines the maximum curve width Wc of a curve.

Hull Attraction Node repulsion, curve repulsion, and curve attraction are still not sufficient

to produce proper hull layouts. As the node repulsion can be much stronger than

curve attraction and curve repulsion to avoid node occlusion, hull points can considerably deviate

from the perpendiculars. An attraction between neighboring hull points of a curve’s

hull keeps them closer to the position of the perpendiculars and hinders a distortion of the

hull. This hull attraction is similar to the attraction between neighboring dummy nodes

to hinder the strain of curves.

4.7.5 Formalization

Four energies are required to compute the hulls that represent widened curves. After

the clarification of their purposes in the previous section, this section formalizes energies,

forces as functions of the distance, and the iterative displacements of hull points. The

creation of an equilibrium is already considered as the equations are introduced.

4.7.5.1 Node Repulsion

The hull points are repulsed by the nodes on the same side of the curve. There is a strong

repulsion by a node v ∈ V of a hypergraph H = (V, E) in a small distance from the hull


point h. Further, the repulsion diminishes with increasing distance ‖pv − ph‖. Obviously,

the node repulsion energy will only affect the hull point’s position. The graph layout

remains unchanged.

Equilibrium To create a balance of all forces in the widening process, the magnitudes

of the forces are balanced regarding optimal layouts. A similar procedure to balance the

forces of energy-based routing was thoroughly introduced in Section 3.4.4. Hence, the

determination of the coefficients is only briefly justified before they are

used in the following equations.

Nodes are obstacles for the widening of curves. As the width of a curve increases, i.e.,

the hull points are moved on the perpendiculars away from the respective dummy nodes,

the node repulsion must be stronger than the repulsion by the curve (in the close proximity

of a node). This requirement must apply for every possible position of hull points in such

a close proximity of nodes to prevent node occlusion. Otherwise, as Figure 4.13 already

depicted, curves may occlude nodes.

The radius of the area around a node, which must assert a stronger node repulsion than curve repulsion, depends on the distance dh between neighboring hull points of a curve, as illustrated in Figure 4.14. By this, if a node is located between two neighboring perpendiculars, then the distance between the node and the closest hull point is smaller than dh/2 for a certain curve width. Therefore, the node repulsion at a distance dh/2 from the node must always be stronger than the curve repulsion. Because there are no other requirements of an equilibrium so far, the magnitude of the node repulsion force can be defined as 1 at the distance dh/2.

Figure 4.14: Distance dh between two neighboring hull points

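Node Repulsion Energy The repulsion energy ENR acting on a hull point h, caused by a node v ∈ V, adds up to the following.

\[
E_{NR} =
\begin{cases}
-\dfrac{1}{r} \cdot \left(\dfrac{2}{d_h}\right)^{r-1} \cdot \lVert p_v - p_h \rVert^{r} & \text{if } r < 0, \\[1ex]
-\dfrac{d_h}{2} \cdot \log \lVert p_v - p_h \rVert & \text{if } r = 0
\end{cases}
\tag{4.8}
\]

The repulsion exponent r ≤ 0 allows the strength of the repulsion by nodes to be adjusted.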

Node Repulsion Force The force FNR is calculated as the negative gradient of the energy ENR.

\[
F_{NR} = -\left(\frac{2}{d_h}\right)^{r-1} \cdot \lVert p_v - p_h \rVert^{r-1} \cdot \frac{p_v - p_h}{\lVert p_v - p_h \rVert}
\tag{4.9}
\]
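Equation 4.9 can be checked numerically with a short sketch. It assumes two-dimensional positions given as tuples; the function name `node_repulsion` and the default exponent r = 0 are chosen here for illustration only.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def node_repulsion(p_h: Vec, p_v: Vec, d_h: float, r: float = 0.0) -> Vec:
    """Force of Equation 4.9 on hull point h caused by node v (r <= 0).

    d_h is the distance between neighboring hull points of the curve.
    The result points from the node towards the hull point, i.e. away
    from the node.
    """
    dx, dy = p_v[0] - p_h[0], p_v[1] - p_h[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        raise ValueError("hull point coincides with the node")
    # -(2/d_h)^(r-1) * dist^(r-1) * (p_v - p_h) / dist
    scale = -((2.0 / d_h) ** (r - 1.0)) * dist ** (r - 2.0)
    return (scale * dx, scale * dy)
```

With dh = 2, a node at distance dh/2 = 1 yields a force of magnitude 1, as required by the equilibrium; at distance 2 the magnitude drops to 0.5 for r = 0.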


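Hull Point Displacement In each iteration of an energy minimization algorithm, a hull point h is moved by ∆x and ∆y along the x- and y-axis, respectively.

\[
\begin{aligned}
\Delta x &= -\left(\frac{2}{d_h}\right)^{r-1} \cdot \left(p_v^{x} - p_h^{x}\right) \cdot \lVert p_v - p_h \rVert^{r-2} \\
\Delta y &= -\left(\frac{2}{d_h}\right)^{r-1} \cdot \left(p_v^{y} - p_h^{y}\right) \cdot \lVert p_v - p_h \rVert^{r-2}
\end{aligned}
\tag{4.10}
\]

4.7.5.2 Curve Repulsion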

The curve repulsion must take the strength of the node repulsion into account. This way, scenarios of occluded nodes as in Figure 4.13 can be prevented. The curve repulsion is maximal for a distance of zero and decreases with increasing distance. The magnitude of the curve repulsion force acting on a hull point therefore is limited by a certain value F. The value of F, which is the largest magnitude FCR(0) of the curve repulsion force, is equal to the smallest magnitude FNR(dh/2) of the node repulsion force that may occur at possible positions of hull points. The value of F can be different for each curve, as it depends on the distance dh between neighboring hull points.

\[ F = F_{CR}(0) = F_{NR}\!\left(\tfrac{d_h}{2}\right) \tag{4.11} \]

Equilibrium The balance of forces is defined for an optimal layout. That is, regarding curve repulsion, a curve with maximum curve width Wc. The maximum curve width is determined below in conjunction with the curve attraction. A hull point h is in its optimal position p_h^{o,CR} regarding curve repulsion if it has a distance ‖pδ − ph‖ = Wc from the respective dummy node δ on the curve. The magnitude of the curve repulsion force is set to FCR(Wc) = 1 at the optimal position of h.

The curve repulsion force function FCR(d) ∼ d^r against the distance d = ‖pδ − ph‖ between dummy node δ and hull point h has two characteristic points (0, F) and (Wc, 1). Let the curve repulsion energy exponent be r = 0; the two parameters a and b adjust the force accordingly.

\[ F_{CR}(d) = \frac{b}{d + a} \tag{4.12} \]

Both parameters are derived from the maximum curve width Wc of a curve c and the maximum curve repulsion magnitude F by substituting the two characteristic points into Equation 4.12.

\[
a = \frac{W_c}{F - 1},
\qquad
b = \frac{W_c \cdot F}{F - 1} = a \cdot F
\tag{4.13}
\]
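The substitution of the two characteristic points can be verified with a small sketch. It treats F as a free input and assumes F > 1, because Equation 4.13 divides by F − 1; the function names are hypothetical.

```python
from typing import Tuple


def curve_repulsion_parameters(w_c: float, f_max: float) -> Tuple[float, float]:
    """Parameters a and b of Equation 4.13 for maximum width w_c and limit F."""
    if f_max <= 1.0:
        raise ValueError("Equation 4.13 requires F > 1")
    a = w_c / (f_max - 1.0)
    b = a * f_max
    return a, b


def curve_repulsion_magnitude(d: float, a: float, b: float) -> float:
    """Magnitude of the curve repulsion force of Equation 4.12."""
    return b / (d + a)
```

For Wc = 4 and F = 5 the parameters are a = 1 and b = 5, so that the force magnitude is 5 at distance 0 and 1 at distance Wc, which reproduces the points (0, F) and (Wc, 1).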

Curve Repulsion Energy Derived from the parameters a and b, the curve repulsion energy ECR between a dummy node δ and a hull point h is as follows:

\[ E_{CR} = -b \cdot \log\left(\lVert p_\delta - p_h \rVert + a\right) \tag{4.14} \]

In analogy to repulsion energies introduced earlier in this work, ECR in Equation 4.14 corresponds to repulsion energies with a repulsion exponent r = 0. Different curve repulsion exponents require a recalculation of the parameters a and b, because both parameters were computed under the assumption (Equation 4.12) of a curve repulsion force that is similar to the function 1/d.

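Curve Repulsion Force The repulsion force FCR acting on a hull point h in the direction of the respective perpendicular o is:

\[
F_{CR} = -b \cdot \left(\lVert p_\delta - p_h \rVert + a\right)^{-1} \cdot \frac{o}{\lVert o \rVert}
\tag{4.15}
\]

Hull Point Displacement

\[
\begin{aligned}
\Delta x &= -b \cdot \left(p_\delta^{x} - p_h^{x}\right) \cdot \left(\lVert p_\delta - p_h \rVert + a\right)^{-2} \\
\Delta y &= -b \cdot \left(p_\delta^{y} - p_h^{y}\right) \cdot \left(\lVert p_\delta - p_h \rVert + a\right)^{-2}
\end{aligned}
\tag{4.16}
\]

4.7.5.3 Curve Attraction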

Equilibrium The magnitude FCR(Wc) of the curve repulsion force was set to 1 for a

distance Wc between hull point and respective dummy node. Hence, the curve attraction

is characterized by the following two requirements.

• The curve attraction is weaker than the curve repulsion, if the distance is smaller

than the maximum curve width Wc

• The curve attraction is stronger than the curve repulsion, if the distance is larger

than the maximum curve width Wc

Curve Attraction Energy The curve attraction is described as a function similar to ECA ∼ d^a against the distance d = ‖pδ − ph‖ between a hull point h and the respective dummy node δ. The attraction exponent a must be greater than zero.

\[
E_{CA} = \frac{1}{a \cdot W_c^{\,a-1}} \cdot \lVert p_\delta - p_h \rVert^{a}
\tag{4.17}
\]

Curve Attraction Force

\[
F_{CA} = \frac{1}{W_c^{\,a-1}} \cdot \lVert p_\delta - p_h \rVert^{a-1} \cdot \frac{o}{\lVert o \rVert}
\tag{4.18}
\]
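The interplay of curve repulsion (Equation 4.12) and curve attraction (Equation 4.18) can be illustrated by comparing their magnitudes. The sketch below is an assumption-laden example: it uses an attraction exponent of 2 and a repulsion limit F > 1 as free inputs, and the function names are hypothetical.

```python
def curve_repulsion(d: float, w_c: float, f_max: float) -> float:
    """Magnitude b / (d + a) with a and b taken from Equation 4.13."""
    a = w_c / (f_max - 1.0)
    b = a * f_max
    return b / (d + a)


def curve_attraction(d: float, w_c: float, exponent: float = 2.0) -> float:
    """Magnitude (d / w_c)^(exponent - 1) of the force in Equation 4.18."""
    return (d / w_c) ** (exponent - 1.0)


def net_outward(d: float, w_c: float, f_max: float, exponent: float = 2.0) -> float:
    """Positive values widen the curve, negative values narrow it."""
    return curve_repulsion(d, w_c, f_max) - curve_attraction(d, w_c, exponent)
```

With Wc = 4 and F = 5, the net force is positive at d = 2, zero at d = Wc = 4, and negative at d = 8. In the absence of nodes, a hull point therefore comes to rest at the maximum curve width.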


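Hull Point Displacement

\[
\begin{aligned}
\Delta x &= \frac{1}{W_c^{\,a-1}} \cdot \left(p_\delta^{x} - p_h^{x}\right) \cdot \lVert p_\delta - p_h \rVert^{a-2} \\
\Delta y &= \frac{1}{W_c^{\,a-1}} \cdot \left(p_\delta^{y} - p_h^{y}\right) \cdot \lVert p_\delta - p_h \rVert^{a-2}
\end{aligned}
\tag{4.19}
\]

4.7.5.4 Hull Attraction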

Equilibrium Regarding the distortion of a hull, the hull points of a curve’s hull are ideally

homogeneously distributed. However, the node repulsion impedes an even distribution and

distorts the hull.

Figure 4.15: Optimal distance dopt between two hull points is the average distance between their perpendiculars

Optimal Distance of Neighboring Hull Points

The optimal distance dopt between two neighboring

hull points of a hull is the average distance between

the respective perpendiculars. The perpendiculars

of bent curves are not parallel and so the distance

between hull points on the perpendiculars changes

with the actual curve width. Thus, the optimal distance

between two neighboring hull points is the average

distance between the two perpendiculars as

depicted in Figure 4.15.

Let δ1 and δ2 be two neighboring dummy nodes

on a curve c with maximum curve width Wc. The

corresponding normalized perpendicular vectors are

o1 and o2. Then the optimal distance between the two corresponding hull points g and h

is, as depicted in Figure 4.15, determined by:




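\[
d_{opt}(h, g) = \left\lVert \left(p_{\delta_2} + \frac{W_c}{2} \cdot o_2\right) - \left(p_{\delta_1} + \frac{W_c}{2} \cdot o_1\right) \right\rVert
\tag{4.20}
\]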

The magnitude FHA(dopt(g, h)) = 1 of the hull attraction force between neighboring

hull points is defined as 1 for an optimal distance between two neighboring hull points g

and h.
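Equation 4.20 can be evaluated as in the following sketch, which assumes normalized perpendiculars and positions given as tuples; the name `optimal_hull_distance` is chosen for illustration.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def optimal_hull_distance(p_d1: Vec, o1: Vec, p_d2: Vec, o2: Vec,
                          w_c: float) -> float:
    """Distance between the points at half the maximum width w_c on the
    perpendiculars o1 and o2 of two neighboring dummy nodes (Equation 4.20)."""
    half = w_c / 2.0
    gx, gy = p_d1[0] + half * o1[0], p_d1[1] + half * o1[1]
    hx, hy = p_d2[0] + half * o2[0], p_d2[1] + half * o2[1]
    return math.hypot(hx - gx, hy - gy)
```

For parallel perpendiculars, as on a straight curve segment, the result equals the distance between the dummy nodes. For diverging perpendiculars, as on the convex side of a bend, the result is larger.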

Hull Attraction Energy The attraction energy EHA on a hull point h is formalized below. Again, it is adaptable by an attraction exponent a > 0.

\[
E_{HA} = \frac{1}{a \cdot d_{opt}(g, h)^{a-1}} \cdot \lVert p_g - p_h \rVert^{a}
\tag{4.21}
\]

Hull Attraction Force

\[
F_{HA} = \frac{1}{d_{opt}(g, h)^{a-1}} \cdot \lVert p_g - p_h \rVert^{a-1} \cdot \frac{p_g - p_h}{\lVert p_g - p_h \rVert}
\tag{4.22}
\]
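The hull attraction force of Equation 4.22 can be sketched as follows. The attraction exponent of 2 is an assumed default, and returning a zero force for coinciding hull points is a choice made for this sketch only; the function name is hypothetical.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def hull_attraction(p_h: Vec, p_g: Vec, d_opt: float,
                    exponent: float = 2.0) -> Vec:
    """Force of Equation 4.22 pulling hull point h towards its neighbor g."""
    dx, dy = p_g[0] - p_h[0], p_g[1] - p_h[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)  # no direction defined; assumption of this sketch
    # dist^(a-1) / d_opt^(a-1) * (p_g - p_h) / dist
    scale = dist ** (exponent - 2.0) / d_opt ** (exponent - 1.0)
    return (scale * dx, scale * dy)
```

At the optimal distance the force has magnitude 1, and it grows with increasing distance between the two hull points, which corresponds to the intended smoothing effect on the hull.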


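Hull Point Displacement

\[
\begin{aligned}
\Delta x &= \frac{1}{d_{opt}(g, h)^{a-1}} \cdot \left(p_g^{x} - p_h^{x}\right) \cdot \lVert p_g - p_h \rVert^{a-2} \\
\Delta y &= \frac{1}{d_{opt}(g, h)^{a-1}} \cdot \left(p_g^{y} - p_h^{y}\right) \cdot \lVert p_g - p_h \rVert^{a-2}
\end{aligned}
\tag{4.23}
\]

4.7.6 Conclusion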

The model of a hull that is approximated by the hull points is necessary to compute

the widened curves utilizing energies. An energy model for the widening of curves was

introduced. The four energies create an equilibrium of the hull points that is able to

increase the curve width, to avoid node occlusion, and to prevent an infinite curve width.

As the curve widths can be increased, close curves may overlap each other. This promises

a reduction of the visual complexity. Several hypergraph visualizations with widened curves

are shown in the evaluation in Section 5.4.2.

Smoothing of the Hull Besides the prevention of the hull distortion, the hull attraction

additionally evens the surface of the hull. A radical change of the curve width between

neighboring hull points is smoothed, because the attraction of a hull point to its neighbors

grows with increasing distance in between. After several iterations of an energy

minimization algorithm, the hull points settle into a smoother hull surface.

The energy-based widening technique operates on individual curves. Adjacent curves

are not recognized and so this technique is not capable of smoothing or merging adjacent

curves. Nevertheless, such a visual enhancement is not crucial for a reduction of the visual

complexity of hypergraphs and for the cognition of structural information of hyperedges.

Side Effect of the Hull Attraction For the sake of simplicity, the optimal distance

dopt between neighboring hull points was approximated by the average distance between

perpendiculars (cf. Equation 4.20). A routed and thus bent curve consists of convex and

concave curve segments. With increasing curve width the actual hull point distance on

the convex side becomes larger than the calculated optimal distance from Equation 4.20.

Thus, the hull attraction produces a smaller curve width on the convex side of a bent

curve.

As a consequence, an energy minimization algorithm does not increase the curve width

of convex curve segments to the maximum width Wc. This minor effect is not crucial, since

curves are usually not radically bent by an energy-based routing technique. Furthermore,

as usually nodes repulse an expanding hull, a downsizing of the maximal curve width will

not affect the drawing either.

Performance The widening of curves can become fairly expensive. In particular the

computation of the node repulsion involves |V | angle computations for each hull point

of each curve of the hypergraph. The angle between the vector o of the respective

perpendicular and the vector pv − pδ determines the relative position of a node v ∈ V to

the curve.

A remarkable simplification is the restriction to nodes which are close enough to impact

the widening of a curve. This is possible since the potential maximum curve width Wc

has to be defined anyway. A smaller number of considered nodes reduces the number

of angle computations significantly. Furthermore, the set of nodes that is considered for

each hull point can be stored for reuse in all subsequent iterations of an energy minimization

algorithm used to widen the curves. These modifications significantly reduce the

computational effort of an energy-based curve widening.
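Both simplifications can be sketched as follows. The example makes several assumptions: the side of a node is decided by the sign of the dot product of o and pv − pδ (positive if the angle is below 90 degrees), the range is an explicit `cutoff` parameter whose relation to Wc is left open, and the per-hull-point node sets are computed once and stored in a list. All names are hypothetical.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def relevant_nodes(p_delta: Vec, o: Vec, nodes: List[Vec],
                   cutoff: float) -> List[int]:
    """Indices of nodes on the side of the curve that o points to and
    within the distance cutoff of the dummy node at p_delta."""
    selected = []
    for index, (vx, vy) in enumerate(nodes):
        rx, ry = vx - p_delta[0], vy - p_delta[1]
        same_side = rx * o[0] + ry * o[1] > 0.0  # angle below 90 degrees
        in_range = math.hypot(rx, ry) <= cutoff
        if same_side and in_range:
            selected.append(index)
    return selected


def build_node_cache(dummy_positions: List[Vec], perpendiculars: List[Vec],
                     nodes: List[Vec], cutoff: float) -> List[List[int]]:
    """Compute the node set of every hull point once for reuse in all
    subsequent iterations (the graph layout is fixed)."""
    return [relevant_nodes(p, o, nodes, cutoff)
            for p, o in zip(dummy_positions, perpendiculars)]
```

Because the graph layout does not change and the perpendiculars are fixed by the routed curve, the cached sets remain valid throughout the widening.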

If only nodes within a certain range repulse hull points, a radical change of the curve

width among neighboring hull points may occur. This happens if one hull point is within

and another hull point is outside of such a range. Nevertheless, a radical change of the

curve width between neighboring hull points is eliminated by the hull attraction force.

4.8 Discussion

A visualization of hyperedges in graph layouts generates additional cognitive load in a

graph drawing. The reduction of the visual complexity of hyperedge visualizations aims

at retaining the readability of the displayed information within the narrow screen space.

This chapter therefore introduced four techniques to reduce the complexity of hypergraph

visualizations. Since hyperedges are based on curves, these techniques operate on the

curves, too.

Energy-based bundling of curves can be combined with the energy-based routing technique.

The curves are visually bundled together. The resulting hyperedge layouts fulfill

the requirements of hypergraph visualizations and have a reduced visual complexity.

The energy-threshold-based aggregation of curves transfers a visual bundling of curves

into an aggregation of curves in the hyperedge model. The combination of energy-based

bundling and threshold-based aggregation is a two-tiered technique to reduce the complexity

of the hyperedge structures by aggregating curves.

Another technique that reduces the complexity of hyperedge models is the cluster-based

aggregation of curves. The latter technique seems less laborious than the two-tiered

one and promises comparable results. Therefore, the cluster-based aggregation of curves

is preferred for a later prototype implementation and is evaluated in Section 5.4.1

to demonstrate the reduction of the visual complexity of hyperedges.

The widening of curves also bundles close curves visually by altering the visualization

of curves rather than rearranging their paths. The evaluation in Section 5.4.2 will examine

the avoidance of node coverings by the energy-based curve widening technique. The accompanying

hypergraph drawings can demonstrate the reduction of their visual complexity

due to this technique.



5 Evaluation

This chapter evaluates the proposed layout techniques, namely energy-based routing,

energy-based curve widening, and the cluster-based aggregation of curves, and investigates

their abilities to meet the requirements of hypergraph visualizations. For this purpose,

the evaluation utilizes the criteria for hypergraph visualizations that were given in

Section 2.2.3.

Each of the following experiments examines a single hypothesis that was assumed in this

work. Furthermore, and no less important, these experiments prove the suitability of the

proposed layout techniques. This work is accompanied by a prototype implementation that

is briefly introduced in the subsequent Section 5.1. Next, the example hypergraphs used

to evaluate the hypergraph layout techniques are presented in Section 5.2. The remaining

structure of this chapter is derived from the requirements of hypergraph visualizations as

follows.

The proposed layout techniques were designed to meet the given requirements of hypergraph

visualization, in particular

1. the establishment of a uniform visual connectivity between all nodes of a hyperedge,

2. the preservation of the expressiveness of the given graph layout, and

3. the reduction of the visual complexity of hypergraph drawings.

The first requirement is fulfilled by the choice of the hyperedge structure that was briefly

discussed in Section 2.3. The second requirement comprises a fixed graph layout and the

avoidance of node occlusion and cluster intersection. The graph layout was not altered

by any layout technique. The energy-based curve routing technique that hinders node

occlusion and cluster intersection is evaluated by the experiments in Section 5.3. The

reduction of the visual complexity of hypergraph layouts is evaluated in Section 5.4.

According to the definition in Section 2.2.1, the term hypergraph layout denotes the

positions of the nodes and the dummy nodes of a hypergraph. The graph layout is given

or computed separately, and this chapter therefore distinguishes the graph layout, i.e.,

the positions of the nodes, from the positions of hyperedges. Thus, the term hypergraph

layout as used in this chapter excludes the positioning of the nodes.

5.1 Implementation

The prototype implementation is based on the LinLogLayout tool [3]. In its own words,

“LinLogLayout is a simple, easy-to-use open source program (written in Java) for computing

graph drawings, using the LinLog energy models and standard energy models

like Fruchterman-Reingold, and graph clusterings, using the Modularity measure of Mark

Newman. It includes a reusable energy minimizer (spring embedder) class based on the


efficient Barnes-Hut algorithm, and a reusable class for Modularity clustering based on a

multi-scale algorithm.”

Graph Layout Computation The basic concepts of energy-based layout algorithms were

already introduced in Section 2.1.2. The LinLog energy model, i.e., the (a, r) = (1, 0)-energy

model, computes graph layouts that reflect the structure of the graph. Such layouts

place densely connected vertices close and sparsely connected nodes more distant.
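For illustration, the energy of a layout under an (a, r)-energy model can be sketched as follows. This is a simplified reading that assumes the unweighted node-repulsion form with attraction d^a / a along edges and repulsion d^r / r (or ln d for r = 0) between all node pairs. Edge weights and the Barnes-Hut approximation used by LinLogLayout are omitted, and the function name is hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, Hashable, Iterable, Tuple

Vec = Tuple[float, float]


def ar_energy(pos: Dict[Hashable, Vec],
              edges: Iterable[Tuple[Hashable, Hashable]],
              a: float = 1.0, r: float = 0.0) -> float:
    """Energy of a layout; the defaults (a, r) = (1, 0) give LinLog."""

    def dist(u: Hashable, v: Hashable) -> float:
        return math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1])

    def repulsion(d: float) -> float:
        return math.log(d) if r == 0 else d ** r / r

    energy = 0.0
    for u, v in edges:                       # attraction along edges
        energy += dist(u, v) ** a / a
    for u, v in combinations(list(pos), 2):  # repulsion between all pairs
        energy -= repulsion(dist(u, v))
    return energy
```

For two connected nodes at distance 2, the LinLog energy under these assumptions is 2 − ln 2.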

The computation of graph layouts starts with random positions of nodes. The energy

minimization algorithm initiates the layout computation with a PolyPoly energy model

that is less susceptible to local energy minima than the LinLog energy models [35]. In

the progress of the iterative energy minimization, the energy model is gradually changed

to the final (1, 0)-energy model. The number of iterations of the energy minimization

algorithm is chosen very conservatively, i.e., rather high, to produce stable graph layouts

with (locally) minimal energy.

Hypergraph Layout Computation Hypergraph layouts are computed to compare different

parameters, techniques, and settings with each other. The generated layouts are

mainly evaluated based on their energy values. For this, it is essential to use the

same initial graph layout for the computation of the hypergraph layout, as otherwise the

layout energies are not comparable.

The layout of hyperedges is determined by the positions of dummy nodes and optionally

by the positions of hull points. The layout computation utilizes the same iterative energy-minimization

approach as in LinLogLayout to move dummy nodes or hull points. The

energy models of the hyperedge layout techniques do not change during execution.

5.2 Example Hypergraphs

In the experiments presented in Sections 5.3 and 5.4, various hypergraphs are subjected

to the layout techniques to examine their capabilities. The employed graphs depict a software

system, a social network, the world trade, and a pseudo-random clustering structure.

In each of the visualizations of these hypergraphs, nodes of different clusters are assigned

different colors, such that the affiliation of each node to a specific cluster is obvious. This

color coding is especially important when we evaluate the avoidance of cluster intersections.

The size of the nodes in each graph visualization corresponds to their degree in the

underlying graph.

The following paragraphs outline the background and purposes of the example hypergraphs.

Each of the following hypergraphs consists of the original graph and one additional

hyperedge.

ArgoUML Software System

The visualization of software systems is the main motivation of this work. Consequently,

the proposed hypergraph visualization techniques are evaluated using graphs that depict

software systems. ArgoUML [1] is an open source UML modeling tool.


Graph The graph was derived from the source code of ArgoUML. The code was obtained

from the project’s Subversion repository on October 17, 2008. The nodes represent the

software artifacts at the level of Java classes. Method calls, attribute access, and inheritance

are modeled by weighted binary edges between nodes. The graph layout reflects

the hierarchical structure of the software system. This means that the layout allows

each class to be spatially assigned to a package. In addition, packages can be spatially assigned to further

packages of higher levels of the software hierarchy. Such a clustering of nodes is typical

for visualizations of software systems, but not a requirement.

Not all nodes of the graph are connected to the major part of the software system. 67

nodes were not connected at all, and overall 86 nodes were removed to create a connected

graph of 1,434 nodes and 4,372 edges.

Hypergraphs A hyperedge of the ArgoUML graph represents a set of Java classes that

were commonly changed, i.e., classes that were changed between two subsequent revisions

of the Subversion version control system. The scenario that such hyperedges reveal was

already introduced as co-change of software artifacts in Section 1.2.

76 hyperedges, which connect more than two nodes, were derived from the most recent

commits to the repository. Large hyperedges are rare, as commits usually do not

affect many software classes. The frequency of commits decreases with the commit size

(number of changed classes). For instance, there are 26 hyperedges of size 4, 10 hyperedges

of size 6, and one hyperedge of size 13. The largest investigated hyperedge connects 32

nodes.
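The derivation of co-change hyperedges from commits can be sketched as follows. The example assumes that each commit is available as a collection of changed class names and that classes removed from the graph are dropped from a hyperedge; the function name and the class names in the usage example are purely hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set


def cochange_hyperedges(commits: Iterable[Iterable[str]],
                        graph_nodes: Set[str]) -> List[FrozenSet[str]]:
    """One hyperedge per commit that changes more than two classes of the graph."""
    hyperedges = []
    for changed in commits:
        # Assumption: classes that are not nodes of the graph are ignored.
        nodes = frozenset(changed) & graph_nodes
        if len(nodes) > 2:
            hyperedges.append(nodes)
    return hyperedges
```

For example, with the graph nodes A, B, C, and D, a commit that changes A, B, and C yields a hyperedge of size 3, whereas a commit that changes only A and D yields none.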

The hypergraphs, each of them containing one hyperedge, are denoted according to the

pattern “argouml-he76-hn32” to indicate hyperedge number 76 that connects 32 hyperedge

nodes.

Social Network

Social networks were already mentioned as possible applications of hyperedge visualizations,

cf. Section 1.2. This graph models high-tech managers that are employed by the

company “Krackhardt”, and was originally used in [60] as a multi-relational network.

Graph The graph data [2] models the friendship relations between the employees. 147

edges represent the mutual friendship relations between 33 employees that are modeled

by nodes.

Hypergraph A hyperedge visualizes the common friends of two employees, Rick and Tom.

Both are strongly related to the same group of friends, i.e., they have a higher number

of common friends than two employees of different groups would have. The hyperedge

consists of six nodes. The hypergraph is denoted by “Hitech” in the following.


World Trade

Trade graphs represent the economic relations between countries. The relations may

represent the trade volume of the annual import and export among countries and thus these

graphs reveal dependencies between economic systems. The statistical data of world trade

is made available by the World Bank at http://www.worldbank.org/trade (accessed

August 15, 2008).

Graph The graph used in this work represents the import of wares of 66 countries that

are represented by nodes. A weighted and undirected edge between two countries represents

the import trade volumes between both countries in 1999. Thus, the graph models

imported as well as exported trade volumes. The particular graph data [5] was also used

in previous work to discuss the LinLog drawings and it was shown that the graph layout

can represent an arrangement of nodes that is similar to the geographical arrangement of

countries.

Hypergraphs The hyperedges for this graph were derived from an obvious question a

viewer might ask: Which countries are economically strongly connected to a certain country?

Six hyperedges of this type are created: the top 10 and the top 20 countries that are

most strongly connected to each of the USA, Germany, and China. Each of the three countries is highly

connected to geographic neighbors, but is also connected to countries of other continents.

For instance, the top 10 countries related to Germany except for European countries are

located in America, Australia, and Asia. The choice of hyperedges therefore promises

a mixed distribution of connected hyperedge nodes, on the one hand between different

clusters and on the other hand strongly integrated in the cluster of the examined country.

In the following, the mentioned hypergraphs will be denoted by, e.g., “WorldImport1999-GER10”.

The name indicates the country of interest by a three-character country code

and the number of considered trade partners at the end.
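The derivation of such a hyperedge can be sketched as a top-k selection over the weighted edges that are incident to the country of interest. The following Python sketch assumes that "strongly connected" is measured by the edge weight (the trade volume). The function name `top_k_partners`, the edge representation, and the sample weights are hypothetical and serve only as an illustration; they are not the data used in this work.

```python
def top_k_partners(edges, country, k):
    """Return the k countries with the largest edge weight to `country`.

    `edges` maps an unordered pair of country codes (a frozenset of size 2)
    to the weight of the undirected edge between them.
    """
    volumes = {}
    for pair, weight in edges.items():
        if country in pair and len(pair) == 2:
            (partner,) = pair - {country}
            volumes[partner] = weight
    # Sort by descending weight; ties are broken alphabetically for determinism.
    ranked = sorted(volumes, key=lambda c: (-volumes[c], c))
    return ranked[:k]


# Hypothetical toy weights, NOT the World Bank figures.
toy_edges = {
    frozenset({"GER", "FRA"}): 9.0,
    frozenset({"GER", "USA"}): 7.5,
    frozenset({"GER", "CHN"}): 3.0,
    frozenset({"USA", "CHN"}): 8.0,
}

print(top_k_partners(toy_edges, "GER", 2))
```

With real data, `k` would be 10 or 20 and the country code would be one of USA, GER, or CHN.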

Pseudo-Random Graph

The pseudo-random graph offers a clear spatial clustering of nodes in the graph layout.

This specific graph property makes it possible to examine the routing technique that avoids cluster

intersections.

Graph The graph of 400 nodes and 30278 edges clearly conveys 8 clusters having 50

nodes each. The specified probabilities of edges connecting the nodes of the graph as well

as the graph itself are available from the author [4].

Hypergraphs As the graph does not reflect a real-world problem, the hyperedges are user-defined.

The hyperedges connect nodes of different clusters in order to allow a reasonable

evaluation of the reduction of cluster intersections. Three hyperedges of different sizes connect

6, 10, and 16 nodes and are identified by the graph names “8Clusters-6”, “8Clusters-10”,

and “8Clusters-16”, respectively.


5.3 Preservation of Graph Layout Expressiveness

This section presents two experiments to evaluate the energy-based curve routing technique

that was proposed in Section 3.4. This routing technique is evaluated with regard to its

ability to preserve the expressiveness of the given graph layouts. This requirement is

represented by the criteria to avoid node occlusion and cluster intersections. Both criteria

are examined separately in the following Sections 5.3.1 and 5.3.2.

5.3.1 Experiment 1 – Node Occlusion

The first experiment evaluates the energy-based routing technique of curves. The hypothesis

is that energy-based routing improves the hypergraph layout quality, which is

measured by the criteria for hypergraph visualizations from Section 2.2.3, namely node

occlusion, cluster intersections, and the reduction of visual complexity.

Routing of curves does not aim at the reduction of visual complexity. The avoidance

of cluster intersections by the routing technique is examined in the second experiment in

Section 5.3.2.

The first experiment evaluates the remaining criterion of avoiding node occlusion by

examining the energy of hypergraph layouts. It was already stated in Chapter 3 that

the fidelity of the curve model influences the quality of the produced hypergraph layouts.

Thus, the influence of the curve model fidelity is investigated in the second part of this

experiment.

5.3.1.1 Energy-Based Routing Reduces Hyperedge Layout Energy

Hypothesis: Energy-based routing reduces the energy of hypergraph layouts

and thus avoids node occlusion.

The energy-based routing technique was designed to preserve the expressiveness of the

given graph layout and mainly focuses on the avoidance of node occlusion by hyperedges.

Hence, the energy model was established such that node occlusions are penalized with

high energy values. The energy model that is used in the energy-based routing algorithm

determines the total energy of a hypergraph layout. This experiment shows that the

energy of hypergraph layouts is reduced, which corresponds to a reduced likelihood of

node occlusion. The energy-based routing technique is evaluated independently of the other

layout techniques proposed in this work.

A reduction of the energy is only attainable by a reduction of the repulsion energy.

The attraction energy is already minimal for not routed straight curves, since a straight

line is the shortest connection between two points. Thus, the attraction energy is always

increased by rearranging the curves. Consequently, the total hyperedge layout energy is

reduced by a reduction of the repulsion energy that implies an enlargement of the distances

between repulsing nodes and dummy nodes.
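The trade-off described above can be illustrated with a small numerical sketch. The exact energy terms of Table 3.1 are not reproduced in this section, so the sketch assumes a generic polynomial form: an attraction of d^a / a between consecutive points of a curve and a repulsion of -(d^r) / r between graph nodes and dummy nodes, with the exponents a = 2 and r = -1 that are used in the setup of this experiment. The functional form, the function names, and the coordinates are assumptions made for illustration only.

```python
import math


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def attraction_energy(curve, a=2):
    """Assumed attraction term: d^a / a summed over consecutive curve points."""
    return sum(distance(p, q) ** a / a for p, q in zip(curve, curve[1:]))


def repulsion_energy(curve, nodes, r=-1):
    """Assumed repulsion term: -(d^r) / r between graph nodes and dummy nodes."""
    dummy_nodes = curve[1:-1]  # interior points; the end points are hyperedge nodes
    return sum(-(distance(d, n) ** r) / r for d in dummy_nodes for n in nodes)


nodes = [(1.0, -0.1)]                            # one graph node close to the curve
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # not routed: dummy node on the line
bent = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]      # dummy node pushed away from the node

for name, curve in (("straight", straight), ("bent", bent)):
    e_a = attraction_energy(curve)
    e_r = repulsion_energy(curve, nodes)
    print(f"{name}: attraction={e_a:.3f} repulsion={e_r:.3f} total={e_a + e_r:.3f}")
```

In this toy setting the bent curve has a higher attraction energy but a much lower repulsion energy, so its total energy is lower than that of the straight curve.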

The occlusion of nodes cannot be measured directly, because points (the positions
of nodes) are very unlikely to lie exactly on lines (the curve segments). Furthermore, the
occlusion of nodes in a visualization depends on the size of the drawn nodes and the line
width of the drawn curve segments. Therefore, the change of the hyperedge layout energy is
used as a more abstract indicator, which is independent of a particular visualization of
hypergraph layouts.

Definitions

The energy model of the routing technique (which was summarized in Table 3.1) employs
a repulsion of dummy nodes by nodes and an attraction of neighboring dummy nodes of
the same curve. The energy E(p) of a hypergraph layout p is thus calculated as the sum
of the total energies acting on all dummy nodes δ ∈ ∆(ε) of the hyperedge ε. The total
energy E(δ) of a dummy node was given in Equation 3.19 at the end of Section 3.4.5
about energy-based routing.

\[ E(p) = \sum_{\delta \in \Delta(\varepsilon)} E(\delta) \qquad (5.1) \]

As the value of the total hyperedge layout energy varies widely for arbitrary scalings of
layouts, the relative change ϱ of the hyperedge layout energy is computed. It is determined
from the initial energy E0 of a not routed layout and the final energy Er of a routed
layout of a hyperedge ε.

\[ \varrho = \frac{E_r - E_0}{E_0} \qquad (5.2) \]

Setup

The energy of a hyperedge layout is computed before and after routing. The hyperedge

layout energy E0 before routing is computed with the initial layout where hyperedge nodes

are connected by straight-line curves. Then the hypergraph is routed and the energy Er

of the resulting layout is measured. The repulsion and attraction exponents r and a of

the energy model used for routing, as summarized in Table 3.1, were set to r = −1 and

a = 2.

The fidelity f of the curve model is incrementally increased during routing. The final

fidelity in this experiment is set to f = 6, which corresponds to 63 dummy nodes modeling

each curve, and is sufficient to produce decent hypergraph layouts. The energy was

minimized in a total of 180 iterations. 30 iterations for each level of fidelity turned out to

be adequate to compute stable hypergraph layouts of these example graphs.
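The routing schedule described above can be written down in a few lines. The sketch assumes that every fidelity level halves each curve segment, which yields 2^f - 1 dummy nodes per curve; this rule is inferred from the single value stated here (f = 6 corresponds to 63 dummy nodes) and is therefore an assumption. The function names are hypothetical.

```python
def dummy_nodes_per_curve(fidelity):
    """Assumed rule: each fidelity level halves every segment, giving 2^f - 1 dummy nodes."""
    return 2 ** fidelity - 1


def routing_schedule(final_fidelity, iterations_per_level=30):
    """Yield (fidelity, dummy nodes per curve, iterations completed after this level)."""
    for f in range(1, final_fidelity + 1):
        yield f, dummy_nodes_per_curve(f), f * iterations_per_level


for f, n_dummy, done in routing_schedule(6):
    print(f"f = {f}: {n_dummy:2d} dummy nodes per curve, {done:3d} iterations completed")
```

Under this assumption the last level ends with 63 dummy nodes per curve after 180 iterations, which matches the numbers given in the text.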

In summary, the experimental results presented next were obtained using the following

setup:


• Centralized and fully connected hyperedge structure

• Fidelity f = 6 of the curve model

• Each level of fidelity is routed in 30 iterations

• No reduction of visual complexity: no curve aggregation and no curve widening



Graph                           E0          Er     ϱ
8Clusters-6              3,898,986     183,510  −95%
8Clusters-10             6,488,107     281,943  −96%
8Clusters-16            10,260,682     471,213  −95%
Hitech                     221,661       9,463  −96%
WorldImport1999-CHN10   67,616,953   2,759,081  −96%
WorldImport1999-CHN20  100,954,597   4,403,640  −96%
WorldImport1999-GER10   65,463,982   2,641,378  −96%
WorldImport1999-GER20  141,974,221   5,732,665  −96%
WorldImport1999-USA10   66,129,628   2,863,279  −96%
WorldImport1999-USA20  131,226,440   5,607,047  −96%

Table 5.1: Comparison of total energies of initial and routed hypergraph layouts
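As a quick plausibility check, the relative change of Equation 5.2 can be recomputed from a few rows of Table 5.1. The following Python sketch is only an illustration; the function name `relative_change` is hypothetical, and the energy values are copied from the table.

```python
def relative_change(e_initial, e_routed):
    """Equation 5.2: (E_r - E_0) / E_0."""
    return (e_routed - e_initial) / e_initial


# (E_0, E_r) pairs copied from Table 5.1.
table_5_1 = {
    "8Clusters-6": (3_898_986, 183_510),
    "Hitech": (221_661, 9_463),
    "WorldImport1999-GER20": (141_974_221, 5_732_665),
}

for graph, (e0, er) in table_5_1.items():
    print(f"{graph}: {relative_change(e0, er):+.1%}")
```

Rounded to whole percentages, the recomputed values agree with the ϱ column of the table.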

Experimental Results

The energies E0 and Er of initial and routed hyperedge layouts were measured to compare

the quality of the layouts. The energies of two layouts of the same hyperedge are only

comparable if the hyperedge has the same curve model fidelity each time the energy

is determined. Thus, before computing the energy of a hypergraph, the fidelity of the

computed layouts is raised to a common level without changing the layouts. This means

that each curve is modeled by the same number of dummy nodes, and as a result the

energies of different layouts of a hyperedge are comparable. This procedure is explained

in more detail in the following Section 5.3.1.2.
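Raising the fidelity without changing the layout can be pictured as a subdivision of the existing curve segments. The sketch below assumes that a fidelity increment inserts one new dummy node at the midpoint of every segment; this subdivision rule and the function names are assumptions for illustration, not a description of the actual implementation.

```python
import math


def raise_fidelity(curve):
    """Insert the midpoint of every segment; the drawn polyline stays the same."""
    refined = [curve[0]]
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        refined.append(((x0 + x1) / 2, (y0 + y1) / 2))
        refined.append((x1, y1))
    return refined


def polyline_length(curve):
    return sum(math.dist(p, q) for p, q in zip(curve, curve[1:]))


low = [(0.0, 0.0), (2.0, 2.0), (4.0, 0.0)]  # one dummy node between the end points
high = raise_fidelity(raise_fidelity(low))  # raised by two levels

print(len(low) - 2, len(high) - 2)          # number of dummy nodes before and after
print(polyline_length(low), polyline_length(high))
```

Because the new dummy nodes lie on the existing segments, the route is unchanged, while both layouts now use the same number of dummy nodes and their energies become comparable.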

The results of the centralized hyperedge structure are shown in the following. As expected,

the values of hyperedge energies of the fully connected structure are much higher

than those of the centralized structure, since the fully connected structure involves a higher

number of curves. Besides that, the reduction of the hyperedge energies of all example hypergraphs

was similar for both visualization structures.

Table 5.1 shows the hyperedge energies of the initial and the computed hypergraph

layout. The ratio ϱ indicates the relative decrease of the total hyperedge layout energy.

The results of the 76 hypergraphs derived from the ArgoUML software system were

omitted as they would not reveal any further information. On average, the total energy of

these 76 hypergraphs was reduced by 94% ± 0.7%. The standard deviation of only 0.7%

indicates the stability of the energy reduction across all ArgoUML hypergraphs.

Tables 5.2 and 5.3 show the attraction energy EA and repulsion energy ER of the example

layouts separately. The ratios ϱA and ϱR are calculated analogously to ϱ in Equation

5.2. The attraction energy is increased by routing, because the curves are stretched

during routing.

Conclusion

The measurements of the total hyperedge energies of the example hypergraphs in Table 5.1
show that in every case the hypergraph layout energy was clearly reduced by the energy-based
routing technique. The reduction of the repulsion energy is achieved by increasing
the distance between repulsing nodes and the dummy nodes of the curves of the hyperedges.
Assuming that the curve model fidelity is sufficiently high, the reduction of the
hyperedge layout energy entails the avoidance of node occlusion.

Graph                    EA,0     EA,r    ϱA
8Clusters-6             1,754    2,037  +16%
8Clusters-10            3,021    3,653  +21%
8Clusters-16            5,241    6,110  +17%
Hitech                     95      172  +81%
WorldImport1999-CHN10  32,754   41,179  +26%
WorldImport1999-CHN20  69,141   79,280  +15%
WorldImport1999-GER10  37,955   45,717  +20%
WorldImport1999-GER20  76,755   94,153  +23%
WorldImport1999-USA10  33,813   39,804  +18%
WorldImport1999-USA20  71,036   87,447  +23%

Table 5.2: Comparison of attraction energies of initial and routed hypergraph layouts

Graph                         ER,0        ER,r    ϱR
8Clusters-6              3,897,233     181,473  −95%
8Clusters-10             6,485,086     278,290  −96%
8Clusters-16            10,255,441     465,103  −95%
Hitech                     221,566       9,291  −96%
WorldImport1999-CHN10   67,584,200   2,717,902  −96%
WorldImport1999-CHN20  100,885,457   4,324,360  −96%
WorldImport1999-GER10   65,426,026   2,595,661  −96%
WorldImport1999-GER20  141,897,466   5,638,512  −96%
WorldImport1999-USA10   66,095,816   2,823,475  −96%
WorldImport1999-USA20  131,155,404   5,519,600  −96%

Table 5.3: Comparison of repulsion energies of initial and routed hypergraph layouts

The separation of repulsion and attraction in Tables 5.2 and 5.3 reveals that the total

energy is reduced exclusively by the repulsion. The attraction is consequently increased as

the curves were displaced. In total, the reduction of the repulsion is much higher than the

increase of the attraction. For the Hitech hypergraph layout, and also for a few layouts depicting

the ArgoUML software system, the attraction energy increased strongly because the curves were

stretched. Nevertheless, in all instances the total energy was significantly and consistently reduced

by approximately 95%, independent of the hyperedge size.

Furthermore, the energy-based routing technique is examined by an assessment of produced

hypergraph layouts. Figures 5.1 through 5.6 show sequences of visualizations of

routed hypergraph layouts of four example hypergraphs in the centralized and the fully

connected structure. Each sequence of six hypergraph drawings shows the initial setting

and the intermediate results of the energy-based curve routing. For illustrative purposes

the dummy nodes are depicted.

The graph drawings help to illustrate the measured energy values of the experimental

results above, as the meaning of a certain energy value of a hypergraph layout and the

impact of a certain degree of energy reduction is unclear in the number-based representation.


The first drawing of each sequence in Figures 5.1 through 5.6 shows the initial, not

routed, curves of a hyperedge. Then the curve model fidelity is increased every 30 iterations

and consequently the number of dummy nodes increases. The last drawing of each sequence

shows a layout with curve fidelity f = 5. Those figures show that curves are successfully

repulsed by the nodes. Node occlusions by the curves are prevented. Visualizations of

layouts with curve fidelity f = 6 are omitted, because more dummy nodes per curve could

not be distinguished visually and the layout of the curves does not change significantly when

increasing the curve fidelity to f = 6. The latter fact will be demonstrated in the next

experiment in Section 5.3.1.2 and is confirmed by the values in Table 5.4 on page 98.

Another comparison of not routed and routed hypergraph layouts is shown in Figure 5.9

on page 103 by the graph drawings that also reveal the avoidance of cluster intersections.

These layouts show the pseudo-random hypergraph in the centralized structure. The crux

of the structure was moved away from the central (red) cluster as the position of the

barycenter of the hyperedge was not optimal. A closer look at the parts of the layouts

where curves intersect clusters in order to connect hyperedge nodes also reveals that curves

do not occlude the nodes.

Due to the small scale of the shown layouts, the hyperedge nodes were marked with a

black rim to allow a distinction. Routed curves were displaced with respect to the nodes.

It is also observable that close curves were moved to a common optimal path and thus

visually bundled, especially in Figure 5.9f.

The next experiment shows the influence of the curve model fidelity. It also reveals that

the chosen fidelity f = 6 of the experiment is sufficient and proves the stability of these

measurements.

5.3.1.2 Fidelity of Curve Model

Hypothesis: A higher curve model fidelity produces better hypergraph layouts

as it further reduces the hypergraph layout energy.

This experiment shows the impact of the curve fidelity on the hypergraph layout quality

in the context of energy-based routing. The assumption, which was already used in this

thesis, is that a higher curve model fidelity produces better hypergraph layouts than a

curve model with lower fidelity. The hypergraph layout quality is again evaluated with

respect to the requirements of hypergraph visualizations, which is measured by the total

energy of hyperedges. Later experiments on the aggregation and widening of curves will

also consider the impact of the curve model fidelity in their own context.

Setup

The hypergraph layouts are computed with equal settings except for different fidelities of

the curve model.

A different fidelity of the curve model means a different number of dummy nodes modeling
the curves. The energies of hypergraph layouts are only comparable if the layouts are
raised to a common level of fidelity. This principle is illustrated in Figure 5.7. First, the
hypergraph layouts are computed with a certain fidelity of the curve model. Figure 5.7a
depicts the route that was computed with a lower fidelity than in Figure 5.7b. To compare
the energies of both layouts, the lower fidelity is raised to a higher level, at least to
the level of the compared layout. As shown in Figure 5.7c, the layout of curves remains
unchanged with an increase of the curve fidelity.

[Figure 5.1: Energy-based routing of argouml-he21-hn5 in the centralized structure]

[Figure 5.2: Energy-based routing of argouml-he21-hn5 in the fully connected structure]

[Figure 5.3: Energy-based routing of Hitech hypergraph in the centralized structure]

[Figure 5.4: Energy-based routing of Hitech hypergraph in the fully connected structure]

[Figure 5.5: Energy-based routing of WorldImport1999-GER10 hypergraph in the centralized structure]

[Figure 5.6: Energy-based routing of WorldImport1999-GER20 hypergraph in the centralized structure]

[Panels of Figures 5.1 through 5.6: (a) f = 0, no routing; (b) f = 1, after 30 iterations; (c) f = 2, after 60 iterations; (d) f = 3, after 90 iterations; (e) f = 4, after 120 iterations; (f) f = 5, after 150 iterations]

[Figure 5.7: Comparability of hypergraph energies of layouts that are computed with different curve model fidelities. Panels: (a) A routed curve modeled with low fidelity; (b) The same curve as in (a), but routed with a higher fidelity model; (c) The route from (a) is applied to the high fidelity curve model from (b)]

The energy-based routing technique is applied as above. The energy was minimized in

180 iterations as this corresponds to at least 30 iterations for each level of the curve model

fidelity. The setup in this experiment is as follows:

• Centralized and fully connected hyperedge structure

• Curve model fidelity 1 ≤ f ≤ 6

• Energy-based routing in 180 iterations

• No reduction of visual complexity: no curve aggregation and no curve widening

Experimental Results

The energy was computed for various example hypergraphs identified by their names in

the leftmost column of Table 5.4. Similar to the previous experimental results, Table 5.4

shows the reduction of the hyperedge energies of the centralized structure. To reduce the

number of shown results, this table only shows the ratio of change of the layout energy

to the corresponding not routed layout energy. The table’s last row aggregates the measured

results of all ArgoUML hypergraphs. The average reduction of the layout energy and the

standard deviation of this value throughout all ArgoUML hypergraphs summarize these

results.

Again, the measurements of the fully connected structure are omitted in Table 5.4,
because they are very similar to those of the centralized structure. The plots in Figure 5.8
illustrate some additional information about the reduction of the hyperedge energies in
selected example hypergraphs. Each plot depicts the energy E(p) of a computed layout p of
the centralized and the fully connected structure against the different fidelities f. As
centralized structures have fewer curves, their energies are significantly smaller than the
energies of the fully connected structures. So each of the plots in Figure 5.8 separates the
two structures clearly from each other.

Graph                    f = 1  f = 2  f = 3  f = 4  f = 5  f = 6
8Clusters-6               −75%   −88%   −94%   −95%   −95%   −95%
8Clusters-10              −75%   −88%   −94%   −95%   −96%   −96%
8Clusters-16              −75%   −88%   −94%   −95%   −95%   −95%
Hitech                    −76%   −88%   −94%   −96%   −96%   −96%
WorldImport1999-CHN10     −78%   −89%   −95%   −96%   −96%   −96%
WorldImport1999-CHN20     −76%   −88%   −94%   −95%   −96%   −96%
WorldImport1999-GER10     −75%   −89%   −94%   −96%   −96%   −96%
WorldImport1999-GER20     −76%   −88%   −94%   −96%   −96%   −96%
WorldImport1999-USA10     −76%   −89%   −94%   −96%   −96%   −96%
WorldImport1999-USA20     −76%   −88%   −94%   −96%   −96%   −96%
argouml-he* (mean)        −75%   −87%   −93%   −94%   −94%   −94%
argouml-he* (std. dev.)  ±1.4%  ±0.9%  ±0.7%  ±0.7%  ±0.7%  ±0.7%

Table 5.4: Hypergraph layout energy reduction ϱ against increasing curve fidelity f

Conclusion

The second part of this experiment clearly shows that a higher fidelity makes it possible to compute

better hypergraph layouts with regard to the routing of curves. The total energy of the

routed hyperedges based on curves modeled with a higher fidelity is always smaller than, or

at most equal to, the energy of the same hyperedge that was routed with a lower

fidelity.

The table and the plots reveal a significant reduction of the energy with increasing

fidelity of the curve model in the beginning. The layout computations of all example

hypergraphs not shown here reveal the same result: the energy never increased. Due

to space constraints, not all of these results can be shown in this thesis.

The energy plots in Figure 5.8 also prove that the layouts converge to an optimal

layout. It cannot be expected that the employed energy minimization algorithm can

further reduce the energy E(p) of the layouts significantly by increasing the fidelity beyond

f = 6. This observation also indicates that the used curve model fidelity in the first part

of the experiment was sufficient for these examples as the layout quality converges to an

optimum.
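The convergence can also be read off Table 5.4 by looking at the additional reduction that each fidelity increment contributes. The following Python sketch uses three rows copied from Table 5.4; the function name `marginal_gain` is hypothetical.

```python
# Energy reduction in percent for f = 1..6, copied from Table 5.4.
reduction = {
    "8Clusters-6": [-75, -88, -94, -95, -95, -95],
    "Hitech": [-76, -88, -94, -96, -96, -96],
    "WorldImport1999-GER10": [-75, -89, -94, -96, -96, -96],
}


def marginal_gain(values):
    """Additional reduction (percentage points) gained by each fidelity increment."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


for graph, values in reduction.items():
    print(graph, marginal_gain(values))
```

For these rows the gain drops to zero percentage points (at the rounding precision of the table) from f = 4 to f = 5 onwards, which is in line with the convergence visible in the plots.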

5.3.2 Experiment 2 – Cluster Intersection

Hypothesis: Energy-based routing avoids cluster intersections by curves.

The previous experiment proved the capability of the energy-based routing technique to

reduce the hypergraph layout energy. The avoidance of node occlusion is a consequence of

the energy reduction. The second aspect of the preservation of the expressiveness of graph

layouts is the avoidance of cluster intersections. The energy model of the energy-based

routing technique was also designed to avoid the intersection of clusters by curves.

[Figure 5.8: Reduction of the layout energy of example hypergraphs against increasing curve fidelity. Panels: (a) Hitech; (b) WorldImport1999-GER20; (c) argouml-he76-hn32; (d) 8Clusters-10. Each panel plots the layout energy E(p) against the fidelity f (0 to 6) for the centralized and the fully connected structure.]

This experiment investigates the influence of routing on the number of cluster intersections

by curves. The number of cluster intersections, however, strongly depends on a

proper definition of a clustering, i.e., a partitioning of the nodes. A spatial clustering of

the graph layout is preferred over a graph clustering. A graph clustering is independent

of a layout and solely considers the relations between nodes. Nevertheless, the LinLog

energy model, which is used to compute the example graph layouts except for the

ArgoUML hypergraphs, produces graph layouts that reflect the graph clustering [44, 46].

Thus, the graph clustering that is calculated by the LinLogLayout tool can be used to

identify spatial clusters, as the layout will “represent the cluster structure of graphs by

grouping densely connected nodes and separating sparsely connected nodes” [46].

The graph layouts of the ArgoUML graphs were precomputed and represent the hierarchical

structure of the software system. The nodes are placed according to their affiliation

to packages. Different packages are spatially and visually separated. Thus, the hierarchy

of the software system is used to cluster the ArgoUML graph layout.

Both types of clustering, a graph and a spatial clustering, require proper parameters

to configure a threshold that makes it possible to determine groups of nodes. That is, a threshold

specifies whether a node belongs to a certain cluster or not. If the threshold is too high, the

clusters can become very large and might contain large gaps between nodes, which could

be wide enough to accommodate curves. Conversely, if the threshold is too low, there will


be many small clusters and thus cluster intersections are not very likely, as curves can

easily curl around the many small clusters. Furthermore, it is impossible to compute a

clustering that consistently matches human perception. Therefore, this experiment also

examines drawings of the example hypergraphs.

Definitions

A clustering is a partitioning P of the set of graph nodes V of a hypergraph H = (V, E).

A curve intersects a cluster of the hypergraph if there are nodes of this cluster located

on both sides of the curve. The clusters that contain the hyperedge nodes that are end

points of the curve are not considered, as these intersections are inevitable to connect the

hyperedge nodes with each other.

The number of cluster intersections φ of a hyperedge is the sum of cluster intersections

of all curves. If the hyperedge is not routed, φ0 is easily determined, because curves are

straight-line segments. A routed curve is not straight and complicates the measurement

of the number of cluster intersections φr. The relative position of a node to the curve

is determined by the least distant curve segment, i.e., the straight-line segment between

neighboring dummy nodes.
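The definition above can be turned into a small counting procedure. The following Python sketch is one possible reading of it: the side of a node is taken from the sign of the cross product with its least distant curve segment, and a cluster counts as intersected if its nodes lie on both sides. All function names and the toy coordinates are hypothetical, and the sketch is not the measurement code used for the experiments.

```python
def _squared_distance_to_segment(p, a, b):
    """Squared distance from point p to the straight-line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length2 = dx * dx + dy * dy
    t = 0.0 if length2 == 0 else ((px - ax) * dx + (py - ay) * dy) / length2
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    cx, cy = ax + t * dx, ay + t * dy
    return (px - cx) ** 2 + (py - cy) ** 2


def side_of_curve(node, curve):
    """Return +1 or -1 for the side of the least distant segment, 0 if on its line."""
    a, b = min(zip(curve, curve[1:]),
               key=lambda seg: _squared_distance_to_segment(node, *seg))
    cross = (b[0] - a[0]) * (node[1] - a[1]) - (b[1] - a[1]) * (node[0] - a[0])
    return (cross > 0) - (cross < 0)


def count_cluster_intersections(curve, clusters, end_point_clusters):
    """Count clusters with nodes on both sides of the curve.

    Clusters that contain an end point of the curve are skipped, because
    these intersections are inevitable.
    """
    count = 0
    for name, nodes in clusters.items():
        if name in end_point_clusters:
            continue
        sides = {side_of_curve(node, curve) for node in nodes}
        if 1 in sides and -1 in sides:
            count += 1
    return count


# Hypothetical toy layout with three clusters.
clusters = {
    "start": [(0.0, 0.0), (-1.0, 1.0)],
    "end": [(10.0, 0.0), (11.0, -1.0)],
    "middle": [(4.0, 1.0), (6.0, -1.0)],
}
straight = [(0.0, 0.0), (10.0, 0.0)]             # not routed curve
routed = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]   # curve with one dummy node

print(count_cluster_intersections(straight, clusters, {"start", "end"}))
print(count_cluster_intersections(routed, clusters, {"start", "end"}))
```

In the toy layout the straight curve separates the two nodes of the middle cluster, whereas the routed curve passes the cluster on one side.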

Setup

The number of cluster intersections per curve is measured before and after routing. To

produce stable hypergraph layouts, the curve model fidelity is set to f = 6 and the energy

is minimized in 180 iterations. The suitability of these settings was already demonstrated

in the first experiment. Furthermore, the hyperedges are not widened or aggregated.

In summary, the experimental results presented next were obtained using the following

setup:

• Centralized and fully connected hyperedge structure

• Fidelity f = 6 of the curve model

• Each level of fidelity is routed in 30 iterations

• No reduction of visual complexity: no curve aggregation and no curve widening

Experimental Results

Table 5.5 contrasts the number φ0 of cluster intersections of the initial layout with the

number φr of the routed layout for the three pseudo-random hypergraphs. The results

are shown separately for different hyperedge structures. The Hitech and the World Trade

hypergraphs are too small, as each contains only three clusters, and thus no cluster

intersections were measured.

For the ArgoUML example, two clusterings of the underlying graph of ArgoUML were

used. A fine-grained clustering of 40 clusters groups classes with respect to their package

affiliation on the lowest package level. A coarse-grained clustering of 18 clusters is derived

if the second highest package level is chosen. Table 5.6 shows the measurements of

hypergraphs with hyperedges that connect nodes of distinct clusters.

Graph         Structure        φ0  φr
8Clusters-6   centralized       4   0
              fully connected   4   0
8Clusters-10  centralized       7   1
              fully connected   8   1
8Clusters-16  centralized       5   0
              fully connected  29   5

Table 5.5: Cluster intersections before and after energy-based routing

Conclusion

Tables 5.5 and 5.6 verify the reduction of cluster intersections by energy-based routing.

The higher number of curves of a fully connected structure causes more cluster intersections

compared to a centralized structure.

In compliance with software modularity, a change in the software system should not

affect artifacts of multiple subsystems. Consequently, the hypergraphs representing the

co-change of the ArgoUML software system should not be proper test candidates for this

experiment. However, the co-change of classes of the ArgoUML tool turned out to be

scattered among several packages.

The fine-grained clustering features a higher probability of cluster intersections since

there are more clusters that can be intersected between the end points of curves. Both cases

show that the number of initial cluster intersections φ0 was generally reduced by energy-based

routing. However, it is also possible that cluster intersections are introduced if a

path that intersects a cluster has a lower energy than the initial path. In our experiments,

this was the case for the hypergraphs “argouml-he16-hn19”, “argouml-he54-hn23”, and

“argouml-he55-hn11” using the coarse-grained clustering. This observation confirms the

statement from above that a coarse clustering impedes the evaluation of the avoidance of

cluster intersections.
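The three cases named above can be recovered directly from the measurements. The following Python sketch uses the values of the centralized structure with the coarse-grained clustering, copied from Table 5.6, and lists the hypergraphs for which routing introduced cluster intersections; the variable names are hypothetical.

```python
# (phi_0, phi_r) of the centralized structure with the coarse-grained clustering,
# copied from Table 5.6.
coarse_centralized = {
    "argouml-he16-hn19": (0, 1),
    "argouml-he20-hn4": (2, 0),
    "argouml-he21-hn5": (2, 0),
    "argouml-he22-hn12": (0, 0),
    "argouml-he38-hn8": (7, 2),
    "argouml-he43-hn6": (4, 0),
    "argouml-he50-hn5": (0, 0),
    "argouml-he54-hn23": (0, 1),
    "argouml-he55-hn11": (1, 2),
    "argouml-he61-hn8": (0, 0),
    "argouml-he63-hn6": (1, 1),
}

# Hypergraphs for which routing introduced additional cluster intersections.
introduced = sorted(g for g, (phi_0, phi_r) in coarse_centralized.items() if phi_r > phi_0)
total_before = sum(phi_0 for phi_0, _ in coarse_centralized.values())
total_after = sum(phi_r for _, phi_r in coarse_centralized.values())

print(introduced)
print(total_before, total_after)
```

Summed over these eleven hypergraphs, the number of cluster intersections still decreases, even though three individual layouts gain one intersection each.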

Figure 5.9 depicts computed hypergraph drawings of the pseudo-random hypergraphs

and allows the comparison of the not routed and the routed layout in the centralized

structure. These layouts confirm the avoidance of cluster intersections. The clustering is

visualized by the colors of nodes.

The layout in Figure 5.9d could not resolve one cluster intersection. This observation

is consistent with Table 5.5. This case may occur if dummy nodes of the respective curve

are trapped in local energy minima since the nodes of the red cluster repulse the curve

equally from both sides.

The previous experiment in Section 5.3.1.1 proved the avoidance of node occlusion by the

energy-based routing technique. Combined with the capability to avoid the intersection

of clusters by curves, it is proven that the routing technique preserves the expressiveness

(cf. Section 2.2.2.2) of fixed graph layouts. This requirement is fulfilled if hyperedges are

routed using the proposed energy-based technique.

Graph              Structure        coarse φ0  coarse φr  fine φ0  fine φr
argouml-he16-hn19  centralized              0          1        9        7
                   fully connected          7          3      108       44
argouml-he20-hn4   centralized              2          0        4        0
                   fully connected          0          0        4        0
argouml-he21-hn5   centralized              2          0        4        0
                   fully connected          2          0        6        2
argouml-he22-hn12  centralized              0          0        0        0
                   fully connected          3          0        5        0
argouml-he38-hn8   centralized              7          2        7        1
                   fully connected          9          2        9        1
argouml-he43-hn6   centralized              4          0        5        0
                   fully connected          0          0        2        0
argouml-he50-hn5   centralized              0          0        0        0
                   fully connected          3          1        3        1
argouml-he54-hn23  centralized              0          1        2        1
                   fully connected         42         11       90       28
argouml-he55-hn11  centralized              1          2        1        0
                   fully connected         17         10       20       14
argouml-he61-hn8   centralized              0          0        3        1
                   fully connected          1          1        6        1
argouml-he63-hn6   centralized              1          1        6        0
                   fully connected          0          0        5        3

Table 5.6: Cluster intersections before and after energy-based routing of a coarse-grained
and a fine-grained clustering of the ArgoUML software system

5.4 Reduction of Visual Complexity

The remaining requirement of hypergraph visualizations is the reduction of visual complexity

of hypergraphs. Visual complexity is crucial but also difficult to formalize and

to measure. Therefore, the generated hypergraph layouts are of particular importance to

examine whether the proposed techniques succeed in fulfilling this requirement. A model-based

curve aggregation technique and a visual curve bundling technique are evaluated in

Sections 5.4.1 and 5.4.2, respectively.

5.4.1 Experiment 3 – Model-Based Aggregation

Hypothesis: The cluster-based aggregation of curves reduces the visual complexity

of hypergraph layouts.

The visual complexity of hypergraph drawings is reduced by a reduction of the cognitive

load, i.e., the amount of information that a viewer has to cognize in order to comprehend

a hypergraph drawing. As already mentioned in Section 4.6.4 about visual complexity, a

reduction of the total curve length of a hyperedge signifies a reduction of the cognitive load.

This experiment therefore measures the reduction of the total curve length of hyperedges.

[Figure 5.9: Energy-based routing avoids node occlusion and cluster intersection. Panels: (a) 8Clusters-6 without routing; (b) 8Clusters-6 after energy-based routing; (c) 8Clusters-10 without routing; (d) 8Clusters-10 after energy-based routing; (e) 8Clusters-16 without routing; (f) 8Clusters-16 after energy-based routing]

The cluster-based curve aggregation technique is examined in this experiment. This

model-based aggregation is independent of routing, so the hypergraphs are not assumed

to be routed before the technique is applied. The hypergraph drawings at the end of this

experiment will show routed hypergraph layouts.

Definitions

The total curve length λ(ε) of a hyperedge ε is defined in Equation 5.3 below. As the curves
are not routed, the length len(c) of each curve c ∈ C(ε) is determined by the Euclidean
distance between its two end points.

\[
  \lambda(\varepsilon) = \sum_{c \in C(\varepsilon)} \mathrm{len}(c) \qquad (5.3)
\]

After an aggregation of curves, the total curve length λa(ε) is calculated analogously.
The aggregation replaces the aggregated parts of curves by a single curve in the hyperedge
model. Thus, each aggregated path is counted once. Note the distinction between the length
of an aggregated curve and the weight of an aggregated curve; the latter denotes the sum of
the individual curve lengths (cf. Section 4.6.1) and is used by the curve aggregation algorithm.
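The two definitions can be illustrated with a minimal Python sketch. It assumes that an unrouted curve is given by its two end points only; the coordinates, the branching point `branch`, and the function names are hypothetical and serve only to show that a shared path contributes its length once to λa.

```python
import math

Point = tuple[float, float]
Curve = tuple[Point, Point]  # an unrouted curve, given by its two end points


def curve_length(curve: Curve) -> float:
    """Euclidean distance between the two end points of an unrouted curve."""
    (x1, y1), (x2, y2) = curve
    return math.hypot(x2 - x1, y2 - y1)


def total_curve_length(curves: list[Curve]) -> float:
    """Sum of len(c) over all curves c of one hyperedge (Equation 5.3)."""
    return sum(curve_length(c) for c in curves)


# Hypothetical centralized hyperedge: a center connected to three hyperedge nodes.
center = (0.0, 0.0)
nodes = [(6.0, 8.0), (7.0, 4.0), (0.0, 5.0)]
lambda_0 = total_curve_length([(center, n) for n in nodes])

# Hypothetical aggregated model: the curves to the first two nodes share the
# path from the center to a branching point, which is counted only once.
branch = (3.0, 4.0)
aggregated = [
    (center, branch),    # shared path
    (branch, nodes[0]),
    (branch, nodes[1]),
    (center, nodes[2]),  # not aggregated
]
lambda_a = total_curve_length(aggregated)

print(f"lambda_0 = {lambda_0:.2f}, lambda_a = {lambda_a:.2f}")
```

In this constructed example the aggregated total is shorter than the initial total, because the shared path replaces most of two separate curves.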

Setup

Both hyperedge structures were tested regarding the curve aggregation. As the curves

are not routed for the measurement of the total curve lengths of hyperedges, no further

settings of routing and widening have to be specified. The maximum azimuth angle ∆θmax

between curves that limits the aggregation of curves is varied in this experiment.

The experimental results presented next were obtained using the following setup:

• Centralized and fully connected hyperedge structure

• Maximum azimuth angle ∆θmax ∈ {50°, 70°, 90°}
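How an angular threshold can limit aggregation may be sketched as follows. This is not the aggregation algorithm of Section 4.6; it only assumes, for illustration, that two curves leaving the same point are aggregation candidates if the difference of their azimuth angles does not exceed ∆θmax. All function names are hypothetical.

```python
import math


def azimuth(origin, target) -> float:
    """Direction of the line origin -> target in degrees, in [0, 360)."""
    angle = math.degrees(math.atan2(target[1] - origin[1], target[0] - origin[0]))
    return angle % 360.0


def angular_difference(a: float, b: float) -> float:
    """Smallest difference between two directions in degrees, in [0, 180]."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def may_aggregate(origin, target_1, target_2, max_angle_deg: float) -> bool:
    """Assumed criterion: both curves start at origin and diverge by at most max_angle_deg."""
    diff = angular_difference(azimuth(origin, target_1), azimuth(origin, target_2))
    return diff <= max_angle_deg


origin = (0.0, 0.0)
# Two curves that diverge by 45 degrees.
print(may_aggregate(origin, (1.0, 0.0), (1.0, 1.0), 50.0))
print(may_aggregate(origin, (1.0, 0.0), (1.0, 1.0), 30.0))
```

Under this assumed criterion, a larger ∆θmax admits more candidate pairs, which is why the angle is the parameter varied in this experiment.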

Experimental Results

The results presented in the following are shown separately for the centralized and the
fully connected structure. Table 5.7 shows the ratio $\varrho = (\lambda_a - \lambda_0) / \lambda_0$
of the change of the total curve length, i.e., λa − λ0, to the initial (not aggregated) total
curve length λ0 for three different values of the maximum azimuth angle ∆θmax.
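As a small worked example of this ratio, the following Python sketch computes ϱ for made-up lengths. The values 200 and 150 are hypothetical and not taken from the experiments; the function name is likewise an assumption.

```python
def relative_change(lambda_0: float, lambda_a: float) -> float:
    """Ratio of the change of total curve length to the initial total curve length."""
    if lambda_0 <= 0.0:
        raise ValueError("the initial total curve length must be positive")
    return (lambda_a - lambda_0) / lambda_0


# Hypothetical lengths: aggregation shortens the total curve length from 200 to 150.
rho = relative_change(200.0, 150.0)
print(f"{rho:.0%}")
```

A negative value therefore denotes a reduction of the total curve length, which matches the sign convention of the entries in Table 5.7.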

The example hypergraphs shown in this table were selected to cover a variety of
different hyperedge sizes. The omitted hypergraphs show similar results. The total curve
length of hyperedges can be decreased by the aggregation of curves. An increase of the
maximum azimuth angle does not always entail a reduction of the total curve length. A
more detailed investigation of the relative reduction of the total curve length of centralized
hyperedges against differently specified maximum azimuth angles is shown in the plots of
Figure 5.10.

Graph                    Structure         ∆θmax = 50°   ∆θmax = 70°   ∆θmax = 90°
8Clusters-10             centralized       −21%          −24%          −13%
                         fully connected   −48%          −53%          −56%
8Clusters-16             centralized       −43%          −35%          −36%
                         fully connected   −62%          −62%          −59%
Hitech                   centralized       −5%           −11%          −11%
                         fully connected   −50%          −57%          −60%
WorldImport1999-GER10    centralized       −44%          −51%          −53%
                         fully connected   −65%          −67%          −68%
WorldImport1999-GER20    centralized       −40%          −49%          −55%
                         fully connected   −66%          −68%          −60%
argouml-he1-hn4          centralized       −13%          −13%          −15%
                         fully connected   −29%          −32%          −39%
argouml-he37-hn10        centralized       −35%          −36%          −37%
                         fully connected   −55%          −57%          −57%
argouml-he46-hn12        centralized       −43%          −47%          −49%
                         fully connected   −56%          −60%          −59%
argouml-he54-hn23        centralized       −48%          −35%          −39%
                         fully connected   −61%          −58%          −52%
argouml-he75-hn18        centralized       −34%          −40%          −1%
                         fully connected   −58%          −54%          −60%

Table 5.7: Change of total curve lengths by cluster-based curve aggregation

Figure 5.10: Reduction of total curve length against the threshold angle ∆θmax.
Both plots show −ϱ in percent (0 to 75) over ∆θmax in degrees (20 to 100).
Panels: (a) 8Clusters-10, Hitech, WorldImport1999-GER10, WorldImport1999-GER20;
(b) argouml-he1-hn4, argouml-he54-hn23, argouml-he75-hn18.

Conclusion

The aggregation of curves reduces the total curve length. This reduction causes a decline
of the amount of visualized information, i.e., the cognitive load. The relative reduction of
the total curve length of the fully connected structure is consistently higher than that of its
centralized counterpart. This is due to the fact that curves of the fully connected structure
can be aggregated at both ends. Still, this higher reduction of the visual complexity by curve
aggregation does not compensate for the considerably higher number of curves compared to
the centralized structure.

The plots in Figure 5.10 reveal that the maximum azimuth angle does not correlate with
a reduction of the total curve length of hyperedges. A value of the maximum azimuth
angle that is optimal for every hypergraph cannot be determined. The relation between the
angle and the curve length reduction is specific to each hyperedge. An increasing azimuth
angle also implies the possibility to increase the length of the connections between hyperedge
nodes, as the divergence between the shared path and the individual curves increases.

This experiment shows that the choice of a value of the maximum azimuth angle ∆θmax
is not crucial for the reduction of the total curve length of hyperedges. Consequently, it
is left to the viewer's preference to specify this angle.

Figures 5.11 through 5.15 show the produced graph drawings for various values of ∆θmax.

A comparison of routed hypergraph layouts, which were not aggregated, to the layouts

that were aggregated and routed is depicted in the first three Figures 5.11 through 5.13.

The remaining figures depict further visual results of the curve aggregation technique

separated for both hyperedge structures.

The graph drawings of hyperedges in the fully connected structure reveal that the
cluster-based curve aggregation technique can also be applied to this structure, as shown in
Figures 5.15 and 5.12d. However, as the curves are aggregated for each hyperedge node
individually, the incident curves of close hyperedge nodes are placed similarly in the layout,
i.e., the different aggregated paths are close to each other but do not overlap. A mesh of
intersecting curves is created in the central area of an aggregated fully connected hyperedge.
Since this curve aggregation technique only aggregates adjacent curves, it is recommended
to further aggregate non-adjacent curves.

The visual complexity of the depicted hypergraph drawings was reduced. The visualizations
of small (e.g., in Figures 5.14a and 5.14d) and large hyperedges (e.g., in Figures 5.13b
and 5.14b) clearly benefit from the aggregation of curves.

5.4.2 Experiment 4 – Visual Bundling

Hypothesis: The energy-based curve widening technique reduces the visual
complexity of hypergraph layouts and still avoids the occlusion of nodes.

The energy-based curve widening technique increases the line width of the visual representation

of curves. Like ordinary curves, widened curves must not occlude nodes. This

experiment measures node occlusion caused by a widening of curves. Therefore, it is assumed

that the routing of (not widened) curves prevents node occlusion, as the previous

experiment in Section 5.3.1.1 has proven.

Widened curves span planes that are described as polygons in the graph layout area.

The position and shape of a polygon are determined by the positioning of the hull points of

the respective curve. The number of nodes that are occluded by a polygon is a measure for

the quality of layouts produced by the widening technique with respect to the preservation

of the layout’s expressiveness.

The accuracy of the curve model determines the accuracy of the hull, because the
number of hull points depends on the number of dummy nodes that are used to model a
curve. As the hull model is an approximation of the widened curve, node occlusion cannot
always be prevented. In this respect, the approximation of the hull by hull points implies
similar challenges as the approximation of curves by dummy nodes. The quality of layouts
produced by the widening technique depends primarily on the curve model fidelity and the
quality of the layout of routed curves. The curve widening technique with a low curve
model fidelity, i.e., large distances between neighboring hull points, is more susceptible to
node occlusion.

Figure 5.11: Curve aggregation of 8Clusters-16 in the centralized structure.
Panels: (a) no curve aggregation, (b) curve aggregation with ∆θmax = 20°,
(c) curve aggregation with ∆θmax = 50°.

Figure 5.12: Curve aggregation of the Hitech hypergraph in the centralized structure in
(a), (b) and the fully connected structure in (c), (d).
Panels: (a) no curve aggregation, (b) curve aggregation with ∆θmax = 70°,
(c) no curve aggregation, (d) curve aggregation with ∆θmax = 70°.

Figure 5.13: Curve aggregation of the argouml-he16-hn19 hypergraph in the centralized
structure. Panels: (a) no curve aggregation, (b) curve aggregation with ∆θmax = 70°.

Figure 5.14: Curve aggregation of various example hypergraphs in the centralized structure.
Panels: (a) argouml-he12, ∆θmax = 60°, (b) argouml-he54, ∆θmax = 50°,
(c) argouml-he55, ∆θmax = 70°, (d) 8Clusters-6, ∆θmax = 90°,
(e) WorldImport1999-GER10, ∆θmax = 70°, (f) WorldImport1999-USA20, ∆θmax = 30°.

Figure 5.15: Curve aggregation of ArgoUML hypergraphs in the fully connected structure.
Panels: (a) argouml-he21-hn5, ∆θmax = 90°, (b) argouml-he47-hn6, ∆θmax = 90°.

Definitions

The number Ω of nodes that are occluded by a hyperedge is the sum of the number of

nodes that are occluded by the polygons representing the widened curves. The same node

might be occluded by several widened curves. The occlusion of such a node is counted

multiple times, because the widening technique works on each curve individually and the

multiple occlusion of a node is a flaw in each curve widening.

The ratio of actually occluded nodes by a polygon to the number of potentially occluded

nodes by a polygon is denoted by ϱ. In comparison to the absolute number of node

occlusions, this ratio takes the number of nodes in the close proximity of a curve into

account. This close proximity of a curve is determined by the maximum curve width Wc

that was specified to adjust the energies. A potentially occluded node is located in the

polygon of a curve with maximum width.

The ratio ϱ in this experiment is used as an indicator for the error rate of the curve

widening technique. A low value of ϱ means that only a small percentage of the nodes

that lie in the potential area of the widened curve were actually occluded by the produced

polygon.
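The two measures can be sketched in Python as follows. The sketch assumes that each widened curve and each maximum-width curve is available as a simple polygon (a list of hull points), that nodes are treated as points, and that the ratio for a hyperedge is formed from the sums over all of its curves; these are simplifying assumptions, and all names and coordinates are hypothetical.

```python
def point_in_polygon(point, polygon) -> bool:
    """Ray casting (even-odd rule) for a simple polygon given as a list of (x, y)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def occluded_count(nodes, polygon) -> int:
    """Number of nodes that lie inside one polygon."""
    return sum(1 for p in nodes if point_in_polygon(p, polygon))


def occlusion_stats(nodes, widened_polygons, max_width_polygons):
    """Return (omega, rho); a node covered by several polygons is counted several times."""
    omega = sum(occluded_count(nodes, poly) for poly in widened_polygons)
    potential = sum(occluded_count(nodes, poly) for poly in max_width_polygons)
    rho = omega / potential if potential else 0.0
    return omega, rho


# Hypothetical layout: three nodes lie in the maximum-width area of one curve,
# but the widened curve actually covers only one of them.
nodes = [(1.0, 1.0), (3.0, 1.0), (5.0, 1.0), (3.0, 5.0)]
max_width = [(0.0, 0.0), (6.0, 0.0), (6.0, 2.0), (0.0, 2.0)]
widened = [(0.0, 0.5), (2.0, 0.5), (2.0, 1.5), (0.0, 1.5)]

omega, rho = occlusion_stats(nodes, [widened], [max_width])
print(omega, f"{rho:.1%}")
```

The guard for an empty potential area returns a ratio of zero, which corresponds to curves that have no nodes in their proximity.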

Setup

The hypergraph layouts are already routed as in the first experiment. The energy-based
widening algorithm minimized the energy of the hull in 50 iterations, which are more than
adequate to compute stable layouts for these example graphs. The experimental results
presented next were obtained using the following setup:

• Centralized and fully connected hyperedge structure

• Fidelities f ∈ {4, 6, 8} of the curve model

• Energy-based routing in 180 iterations, i.e., each level of fidelity is routed in at least

30 iterations

• Curve widening in 50 iterations

• No curve aggregation

Experimental Results

Table 5.8 shows the number Ω of occluded nodes and the ratio ϱ for several example

hypergraphs. These examples were selected with respect to the size of the hyperedge.

Hypergraphs without node occlusions for any curve model fidelity f are omitted. The

results are separately listed for both hyperedge structures and the curve model fidelities.

Conclusion

This experiment shows that the number of node occlusions is reduced drastically with
increasing curve model fidelity. As the results in Table 5.8 show, node occlusions are
almost entirely prevented with the fidelity f = 8 of the curve model. As expected, the
fidelity of the curve model influences the widening results significantly. A higher curve
model fidelity, which corresponds to a smaller distance between neighboring hull points,
avoids node occlusion. This experiment reveals that the energy-based curve widening
technique requires a higher curve model fidelity to produce hypergraph layouts of very high
quality. In comparison, the energy-based curve routing technique already produced
hypergraph layouts of very high quality with a fidelity of f = 6.

The fully connected structure of hyperedges tends to occlude more nodes than the

centralized structure. The absolute number Ω of node occlusions of the fully connected

structure is higher as this structure features a larger number of curves. The ratio ϱ is

also higher, because the layouts of the ArgoUML software system and the pseudo-random

graph are clearly spatially clustered. Hyperedge nodes are affiliated to clusters and incident

curves are probably routed through these clusters. As the occlusions of nodes by each curve

are counted, the higher number of curves of the fully connected structure also increases

the ratio.

The widening of curves reduces the visual complexity of hyperedges by visually bundling
close curves together. The computed hypergraph drawings in Figure 5.16 demonstrate
the reduction of the cognitive load of hyperedges. The planes cover the tracks of the
close curves and thereby bundle them visually and simplify the displayed structure of
hyperedges. The hypergraph drawings clearly spare the nodes of the graphs, e.g., if curves
intersect clusters to connect hyperedge nodes. Hyperedges in the fully connected structure
show the same effects as the ones in the centralized structure. For instance, Figure 5.16e
shows the fully connected hyperedge of five hyperedge nodes. The curves in this drawing
are visually bundled as the widened curves overlap and thus do not reveal individual curves
anymore.

Graph               Structure          f = 4           f = 6           f = 8
                                       Ω     ϱ         Ω     ϱ         Ω     ϱ
8Clusters-6         centralized        7     6.4%      1     1.0%      0     0.0%
                    fully connected    21    7.9%      6     2.3%      0     0.0%
8Clusters-10        centralized        15    8.6%      3     1.9%      0     0.0%
                    fully connected    66    17.0%     20    5.2%      0     0.0%
8Clusters-16        centralized        19    8.8%      3     1.4%      0     0.0%
                    fully connected    111   28.0%     34    8.7%      1     0.3%
argouml-he13-hn7    centralized        5     13.5%     0     0.0%      0     0.0%
                    fully connected    23    25.8%     6     6.6%      0     0.0%
argouml-he17-hn5    centralized        2     40.0%     0     0.0%      0     0.0%
                    fully connected    1     20.0%     0     0.0%      0     0.0%
argouml-he24-hn6    centralized        17    13.2%     4     3.1%      0     0.0%
                    fully connected    82    24.9%     36    10.5%     4     1.2%
argouml-he38-hn8    centralized        10    18.2%     2     3.7%      0     0.0%
                    fully connected    13    12.9%     4     3.8%      2     1.9%
argouml-he42-hn4    centralized        30    17.1%     5     2.8%      0     0.0%
                    fully connected    55    18.0%     33    10.0%     8     2.5%
argouml-he44-hn4    centralized        1     6.7%      0     0.0%      0     0.0%
                    fully connected    5     25.0%     3     14.3%     0     0.0%
argouml-he45-hn28   centralized        17    13.4%     4     3.2%      0     0.0%
                    fully connected    138   67.6%     52    25.4%     9     4.4%
argouml-he51-hn10   centralized        13    13.7%     4     4.3%      0     0.0%
                    fully connected    106   48.8%     33    14.7%     2     0.9%
argouml-he56-hn8    centralized        0     0.0%      0     0.0%      0     0.0%
                    fully connected    2     14.3%     0     0.0%      0     0.0%
argouml-he6-hn9     centralized        0     0.0%      0     0.0%      0     0.0%
                    fully connected    4     20.0%     0     0.0%      0     0.0%
argouml-he7-hn14    centralized        35    14.0%     6     2.4%      0     0.0%
                    fully connected    252   60.9%     129   30.6%     10    2.4%
argouml-he72-hn7    centralized        20    10.6%     4     2.0%      0     0.0%
                    fully connected    108   30.4%     32    8.6%      2     0.5%
argouml-he73-hn5    centralized        9     12.9%     3     4.2%      0     0.0%
                    fully connected    30    19.7%     9     6.2%      0     0.0%
argouml-he74-hn13   centralized        31    10.4%     4     1.3%      0     0.0%
                    fully connected    197   44.1%     81    17.4%     20    4.3%

Table 5.8: Node occlusion for different curve model fidelities f

5.5 Comprehensive Hypergraph Layouts

After a large number of measurements and graph drawings were discussed to evaluate
the layouts against the individual criteria, this section demonstrates graph drawings that
result from the combination of the proposed techniques. As we are interested in software
visualizations in particular, Figure 5.17 shows six hypergraph layouts of the
ArgoUML software system. Each hyperedge represents the co-change of Java classes and
connects those classes that were changed in one commit.

The visual complexity of the shown hyperedge layouts was first reduced by the
cluster-based curve aggregation technique. Then, the aggregated curves were routed by the
energy-based technique. Finally, the curves were widened at the end of the layout
computation.

The first four drawings in Figures 5.17a through 5.17d show hyperedges that are widely
distributed in the layout area. Usually, co-change visualizations of modularized software
systems are limited to a small part of the layout area and thus look similar to those in
Figures 5.17e and 5.17f.

Figure 5.16: Energy-based curve widening applied to routed hypergraphs.
Panels: (a) 8Clusters-10, (b) 8Clusters-16, (c) argouml-he16-hn19, (d) argouml-he22-hn12,
(e) argouml-he30-hn5, (f) argouml-he47-hn6, (g) argouml-he54-hn23.

Figure 5.17: Hypergraph visualizations of co-change of the ArgoUML software system.
Hyperedges are aggregated, routed, and widened.
Panels: (a) argouml-he12-hn6, centralized, (b) argouml-he12-hn6, fully connected,
(c) argouml-he21-hn5, centralized, (d) argouml-he47-hn6, centralized,
(e) argouml-he68-hn9, centralized, (f) argouml-he11-hn4, fully connected.


6 Summary

In this chapter we first summarize the goals and achievements of this work in Section 6.1.

Then, Section 6.2 lists areas where future work can build upon the findings introduced in

this thesis.

6.1 Conclusions

The visualization of hypergraphs comprises a wide field of applications. With the focus on
the visualization of software systems, hypergraphs make it possible to model n-ary relations
between software artifacts. A visualization of hypergraphs enables a viewer to get an
overview of the relationships between the depicted artifacts.

Since the positions of nodes usually represent a certain property, e.g., the graph clustering,
the graph layout is assumed to be immutable when hyperedges are added. This is
the major contrast to other available hypergraph visualization techniques. Furthermore,
the visualization of hyperedges must not cover nodes in order to maintain the expressiveness
of the entire graph layout. Cluster intersections must also be avoided for the same reason.
The preservation of the expressiveness of the given graph layouts has not been used
before as a requirement for visualizing hypergraphs. Three aspects constitute the
preservation of the expressiveness of graph layouts: an immutable graph layout, the
avoidance of node occlusion, and the avoidance of cluster intersections.

This thesis also focused on the readability of produced hypergraph drawings. Additional

hyperedges in graph drawings increase the cognitive load and may overload the

viewer’s perception. Therefore, this work aimed at hypergraph visualization techniques

that produce layouts with low visual complexity.

Solutions Several techniques for computing hypergraph visualizations that fulfill our
requirements were introduced. None of the techniques alters the given graph layout. First,
the choice of the hyperedge structure influences the visual complexity and readability of
hypergraph drawings, as discussed in Section 2.3. Second, the energy-based routing technique
introduced in Section 3.4 is applied to hyperedges to avoid the occlusion of nodes.
A routing technique based on cluster bounds was found to be inappropriate for meeting the
requirements.

The visual complexity of hypergraph drawings that is caused by hyperedges is reduced
by the techniques described in Chapter 4. They all aim at a reduction of the amount of
visualized objects, i.e., the cognitive load, either by simplifying the hyperedge model or by
visually bundling curves. The first option, the simplification of the model, is independent
of a later visualization. Thus, it does not need to consider the graph layout and is applied
before routing. The second option, the visual bundling of curves, changes the curve
layout and thus must consider the graph layout to fulfill our requirements of hypergraph
visualizations.

Results The experimental evaluation in Chapter 5 proved the avoidance of node occlusion
by the energy-based curve routing technique. An occlusion of nodes cannot be measured
directly. The reduction of the hyperedge layout energy is an appropriate measure of the
avoidance of node occlusion and thus indicates the hypergraph layout quality.

The routing of curves also reduces intersections of clusters. Cluster intersections were
not always prevented, as the dummy nodes of the corresponding curves can get trapped in
local energy minima. This is not a drawback of the defined energy model for routing, but
a shortcoming of the energy minimization algorithm, which was not the focus of this thesis.

As an important outcome of this work, the proposed energy-based routing technique can

also be applied to lay out any binary connection between nodes. The visualization of routed

binary edges of ordinary graphs equally benefits from the avoidance of node occlusions.

Edges visualized by straight-line segments introduce visual clutter that obscures nodes as

illustrated in Figure 2.2b on page 14.

Four techniques to reduce the visual complexity of hypergraph drawings were discussed.
The cluster-based aggregation of curves is preferred over the laborious combination of
the energy-based bundling and the energy-threshold-based aggregation technique.
Consequently, the cluster-based aggregation was implemented and evaluated. Besides the
measurement of the cognitive load by the total curve length of hyperedges, the variety of
example graph drawings in Figures 5.11 through 5.15 demonstrates the achievements of this
approach. The readability of the drawings was significantly increased.

The energy-based widening technique also reduced the visual complexity. The curves
in the hypergraph drawings in Figure 5.16 were not bundled as strongly as by the
cluster-based aggregation. Nevertheless, plane representations of curves may increase the
readability of hypergraph drawings due to the orthogonality of planes to the boxes and lines
of the remaining graph drawing.

Finally, all experiments shown in Sections 5.3 and 5.4 enable us to assess the stability of

the evaluated layout techniques. The measurements consistently lead to the same conclusions,

such as the significant reduction of the hypergraph layout energy for all subjected

example hypergraphs, and the reduction of visual complexity. The hypergraph drawings

confirmed these observations, as the avoidance of node occlusion and cluster intersection

is visually noticeable. The centralized structure of hyperedges is the preferred choice to

visualize hypergraphs as it utilizes less space in the graph layout area and thus significantly

lowers the cognitive load in comparison to the fully connected structure.

In summary, this thesis successfully proposed and evaluated techniques to produce hypergraph

layouts that fully meet the requirements of hypergraph visualization.


6.2 Future Work

This thesis is a first step towards hypergraph visualizations in fixed graph layouts that,

among other things, do not occlude nodes. Several ideas that promise improvement were

not examined in the scope of this thesis. This section documents open approaches that

may serve as a foundation of further research on this topic.

The curve model fidelity is independent of the surrounding graph layout of curves. To

increase the quality of routed layouts, the curve model fidelity can be increased locally

at those parts of the curves that are closely surrounded by nodes. Thus, with minimal

increase of the additional computational effort, the quality of the layouts can be increased.

The routing process is assumed to start with a curve model fidelity of f = 1. A higher

value of f might be reasonable. Properties of the given graph layout, e.g., the minimal

distance between graph nodes, can be used to determine a proper initial value of f.

The assessment of readability and visual complexity of hypergraph drawings was chiefly
done by the author of the present work. However, such highly subjective assessments
depend on the viewer's professional background and her or his knowledge of box-and-line
graph drawings. Therefore, user studies are necessary to substantially support the
statements in this work.

The computational performance of the accompanying prototype implementation of the
energy-based curve routing and the energy-based curve widening technique can be improved.
Currently, each node of the graph repels each dummy node and hull point
individually. The Barnes-Hut algorithm [10] spatially divides the graph layout area into a
tree to approximate a set of distant nodes by one combined node. Only the repulsion of
nodes close to dummy nodes or hull points is computed individually. Thus, the computation
of the node repulsion forces of the energy-based routing and widening techniques can be
accelerated considerably.
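The idea can be sketched in Python with a small quadtree. The sketch is not the prototype's code and does not reproduce the energy model of this thesis: it assumes unit-weight nodes and a repulsion whose magnitude is 1/d for a node at distance d. The opening parameter `theta`, the class `Cell`, and all function names are hypothetical.

```python
import math

MIN_HALF = 1e-9  # cells smaller than this are never split (guards coincident nodes)


class Cell:
    """Square quadtree cell that stores the number and the centroid of its nodes."""

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half
        self.mass = 0.0       # number of graph nodes inside the cell
        self.com_x = self.com_y = 0.0
        self.point = None     # the single node of an occupied leaf
        self.children = None  # four sub-cells once the cell is split

    def _child_for(self, x, y):
        return self.children[(1 if x >= self.cx else 0) + (2 if y >= self.cy else 0)]

    def insert(self, x, y):
        if self.children is not None:
            self._child_for(x, y).insert(x, y)
        elif self.mass == 0.0:
            self.point = (x, y)
        elif self.half > MIN_HALF:
            h = self.half / 2.0
            self.children = [Cell(self.cx + sx * h, self.cy + sy * h, h)
                             for sy in (-1, 1) for sx in (-1, 1)]
            px, py = self.point
            self.point = None
            self._child_for(px, py).insert(px, py)
            self._child_for(x, y).insert(x, y)
        # Update the combined node (count and centroid) of this cell.
        self.com_x = (self.com_x * self.mass + x) / (self.mass + 1.0)
        self.com_y = (self.com_y * self.mass + y) / (self.mass + 1.0)
        self.mass += 1.0

    def repulsion(self, x, y, theta):
        """Approximate repulsion on (x, y); theta = 0 visits every node individually."""
        if self.mass == 0.0:
            return 0.0, 0.0
        dx, dy = x - self.com_x, y - self.com_y
        dist = math.hypot(dx, dy)
        far_away = dist > 0.0 and 2.0 * self.half / dist < theta
        if self.children is None or far_away:
            if dist == 0.0:
                return 0.0, 0.0
            scale = self.mass / (dist * dist)  # magnitude mass / dist
            return scale * dx, scale * dy
        fx = fy = 0.0
        for child in self.children:
            cfx, cfy = child.repulsion(x, y, theta)
            fx, fy = fx + cfx, fy + cfy
        return fx, fy


def build_tree(points):
    """Build a quadtree over the bounding square of the given node positions."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    half = max(max(xs) - min(xs), max(ys) - min(ys)) / 2.0 or 1.0
    root = Cell((min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0, half)
    for x, y in points:
        root.insert(x, y)
    return root


def exact_repulsion(points, x, y):
    """Reference: every node repels the point (x, y) individually."""
    fx = fy = 0.0
    for px, py in points:
        dx, dy = x - px, y - py
        d2 = dx * dx + dy * dy
        if d2 > 0.0:
            fx, fy = fx + dx / d2, fy + dy / d2
    return fx, fy


# Hypothetical cluster of 25 nodes and a distant dummy node.
points = [(2.0 * i + 1.0, 2.0 * j + 1.0) for i in range(5) for j in range(5)]
tree = build_tree(points)
print(exact_repulsion(points, 100.0, 5.0), tree.repulsion(100.0, 5.0, 0.5))
```

For the distant dummy node the whole cluster is replaced by one combined node, so a single evaluation stands in for 25 individual ones at a small loss of accuracy.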

Hypergraph drawings such as the one in Figure 5.9f sometimes show curves that heavily
intersect clusters containing an end point of the respective curve. Such a curve, which
directly connects the hyperedge node of the (light green) cluster in the bottom right corner,
is shown in Figure 5.9f. Visually, it can be desirable to let curves intersect clusters only
minimally to connect a hyperedge node. This effect can be seen on a different curve in the
same Figure 5.9f, namely the one that connects the hyperedge node in the (dark green)
cluster on the right. However, the energy minimization algorithm is not always able to route
curves on a shortest path out of the cluster, as parts of a curve can be trapped in local
energy minima. To remedy this problem, the routing of curves can start using an energy
model with fewer local energy minima, as we proposed in [35]. Another option is to initially
route curves with no or only a weak strain attraction force, because routing a curve on a
shortest path out of the cluster of one of its end points requires a large elongation of a
small segment of the curve.

The stability of hypergraph layouts in terms of predictability, as mentioned in
Section 2.2.3, has not been evaluated yet. As on-line visualizations of the development of
software systems can be of high value for software developers, the layout techniques
introduced in this work have to be investigated with respect to this criterion.

Regarding the hyperedge structures, large hyperedges might benefit from a tree- or
bus-structured hyperedge visualization in conjunction with both the centralized and the fully
connected structures. This combined approach of different hyperedge structures avoids
many long-distance curves and thus increases readability. This approach is similar to the
usage of Steiner tree structures in the subset standard by Bertault and Eades that was
discussed in Section 2.2.4.1. As a first step, this work investigated the visualization of
hyperedges in the centralized and the fully connected structure. In future work, the
techniques proposed in this thesis can be extended to more sophisticated hyperedge
structures to produce even better hypergraph layouts.



Bibliography

[1] ArgoUML, Open Source UML Modeling Tool. http://argouml.tigris.org/ (accessed

August 15, 2008).

[2] Hi-Tech Graph Data. http://vlado.fmf.uni-lj.si/pub/networks/data/ESNA/

hiTech.htm (accessed August 15, 2008).

[3] LinLogLayout Tool. http://www.informatik.tu-cottbus.de/~an/GD/ (accessed

August 15, 2008).

[4] Pseudo Random Graph Data. http://www.informatik.tu-cottbus.de/~an/GD/

ERLinLog/Random.html (accessed August 15, 2008).

[5] World Trade 1999 Graph Data. http://www.informatik.tu-cottbus.de/~an/GD/

ERLinLog/WorldTrade.html (accessed August 15, 2008).

[6] Sanjeev Arora. Approximation Schemes for Geometric NP-Hard Problems: A Survey.

In Ramesh Hariharan, Madhavan Mukund, and V. Vinay, editors, FSTTCS, volume

2245 of Lecture Notes in Computer Science, pages 16–17. Springer, 2001.

[7] M. Balzer and O. Deussen. Level-of-Detail Visualization of Clustered Graph Layouts.

Asia-Pacific Symposium on Visualization (APVIS), 0:133–140, 2007.

[8] Michael Balzer, Andreas Noack, Oliver Deussen, and Claus Lewerentz. Software

Landscapes: Visualizing the Structure of Large Software Systems. In Oliver Deussen,

Charles D. Hansen, Daniel A. Keim, and Dietmar Saupe, editors, VisSym, pages

261–266. Eurographics Association, 2004.

[9] C. Bradford Barber, David P. Dobkin, and Hannu Huhdanpaa. The Quickhull Algorithm

for Convex Hulls. ACM Transactions on Mathematical Software, 22(4):469–483,

1996.

[10] Josh E. Barnes and Piet Hut. A Hierarchical O(N log N) Force-Calculation Algorithm.

Nature, 324(6270):446–449, 1986.

[11] Giuseppe Di Battista, editor. Graph Drawing, 5th International Symposium, GD ’97,

Rome, Italy, September 18-20, 1997, Proceedings, volume 1353 of Lecture Notes in

Computer Science. Springer-Verlag, 1997.

[12] Giuseppe Di Battista, Peter Eades, Roberto Tamassia, and Ioannis G. Tollis. Algorithms
for Drawing Graphs: An Annotated Bibliography. Technical Report 5,
Amsterdam, The Netherlands, 1994.


[13] Roman Petrovych Bazylevych. Principles of Algorithmic Methods of Flexible Connection

Routing. IFAC-Workshop on Computer Control in Discrete Manufacturing,

Prague, 1977.

[14] François Bertault and Peter Eades. Drawing Hypergraphs in the Subset Standard

(Short Demo Paper). In Joe Marks, editor, Graph Drawing, volume 1984 of Lecture

Notes in Computer Science, pages 164–169. Springer, 2000.

[15] Dirk Beyer. Co-Change Visualization. In ICSM (Industrial and Tool Volume), pages

89–92, 2005.

[16] Franz-Josef Brandenburg, editor. Graph Drawing, Symposium on Graph Drawing,

GD ’95, volume 1027 of Lecture Notes in Computer Science. Springer, 1996.

[17] Alex Bykat. Convex Hull of a Finite Set of Points in Two Dimensions. Information

Processing Letters, 7(6):296–298, 1978.

[18] Kenneth L. Calvert, Matthew B. Doar, and Ellen W. Zegura. Modeling Internet

Topology. IEEE Communications Magazine, 35(6):160–163, June 1997.

[19] Bas Cornelissen, Arie van Deursen, Leon Moonen, and Andy Zaidman. Visualizing

Testsuites to Aid in Software Understanding. In CSMR ’07: Proceedings of the 11th

European Conference on Software Maintenance and Reengineering, pages 213–222,

Washington, DC, USA, 2007. IEEE Computer Society.

[20] D. S. Johnson and H. O. Pollak. Hypergraph Planarity and the Complexity of Drawing
Venn Diagrams. Journal of Graph Theory, 11:309–325, 1987. AT&T Bell
Laboratories, Murray Hill, New Jersey; Bell Communications Research, Morristown,
New Jersey.

[21] Ron Davidson and David Harel. Drawing Graphs Nicely Using Simulated Annealing.

ACM Trans. Graph., 15(4):301–331, 1996.

[22] Reinhard Diestel. Graph Theory (Graduate Texts in Mathematics, Third Edition).

Springer, August 2005.

[23] Peter Eades. A Heuristic for Graph Drawing. In Congressus Numerantium, volume 42,

pages 149–160, 1984.

[24] Peter Eades, Robert F. Cohen, and Mao Lin Huang. Online Animated Graph Drawing

for Web Navigation. In Battista [11], pages 330–335.

[25] Jason Eisner, Michael Kornbluh, Gordon Woodhull, Raymond Buse, Samuel Huang,

Constantinos Michael, and George Shafer. Visual Navigation Through Large Directed

Graphs and Hypergraphs. In Proceedings of the IEEE Symposium on Information

Visualization (InfoVis’06), Poster/Demo Session, pages 116–117, Baltimore, October

2006.

[26] Thomas Eschbach, Wolfgang Günther, and Bernd Becker. Orthogonal Hypergraph

Drawing for Improved Visibility. Journal of Graph Algorithms and Applications,

10(2):141–157, 2006.


[27] Leonhard Euler. Solutio Problematis ad Geometriam Situs Pertinentis (The Solution

of a Problem Relating to the Geometry of Position). Commentarii academiae

scientiarum Petropolitanae, 8:128–140, 1741.

[28] Gerald Farin. Curves and Surfaces for CAGD: a Practical Guide. Morgan Kaufmann

Publishers Inc., San Francisco, CA, USA, 2002.

[29] Thomas M. J. Fruchterman and Edward M. Reingold. Graph Drawing by Force-

Directed Placement. Software – Practice & Experience, 21(11):1129–1164, 1991.

[30] Mohammad Ghoniem, Jean-Daniel Fekete, and Philippe Castagliola. A Comparison

of the Readability of Graphs Using Node-Link and Matrix-Based Representations. In

INFOVIS, pages 17–24. IEEE Computer Society, 2004.

[31] Ivan Herman, Guy Melançon, and M. Scott Marshall. Graph Visualization and Navigation

in Information Visualization: A Survey. IEEE Transactions on Visualization

and Computer Graphics, 6(1):24–43, 2000.

[32] Danny Holten. Hierarchical Edge Bundles: Visualization of Adjacency Relations

in Hierarchical Data. IEEE Transactions on Visualization and Computer Graphics,

12(5):741–748, 2006.

[33] Danny Holten, Bas Cornelissen, and Jarke J. van Wijk. Trace Visualization Using

Hierarchical Edge Bundles and Massive Sequence Views. In Proceedings of the

4th International Workshop on Visualizing Software for Understanding and Analysis

(VISSOFT), pages 47–54. IEEE, 2007.

[34] Michael Jünger and Petra Mutzel. Automatisches Layout von Diagrammen. OR

News, 12:5–12, 2001.

[35] Martin Junghans. Avoidance of Local Energy Minima in Energy-Based Graph Drawing.

Brandenburg University of Technology, Cottbus, July 2006. Published in German

only as “Vermeidung lokaler Energieminima beim energie-basierten Graphenzeichnen.”

Contact the author at martin.junghans@ieee.org for a digital copy of the

German version.

[36] Paulis Kikusts and Peteris Rucevskis. Layout Algorithms of Graph-Like Diagrams

for GRADE Windows Graphic Editors. In Brandenburg [16], pages 361–364.

[37] Paul A. Kirschner, John Sweller, and Richard E. Clark. Why Minimal Guidance

During Instruction Does Not Work: An Analysis of the Failure of Constructivist,

Discovery, Problem-Based, Experiential, and Inquiry-Based Teaching. Educational

Psychologist, 41(2):75–86, 2006.

[38] Harri Klemetti, Ismo Lapinleimu, Erkki Mäkinen, and Mika Sieranta. A Programming

Project: Trimming the Spring Algorithm for Drawing Hypergraphs. ACM SIGCSE

Bulletin, 27(3):34–38, 1995.

[39] Corey Kosak, Joe Marks, and Stuart M. Shieber. Automating the Layout of Network

Diagrams with Specified Visual Organization. IEEE Transactions on Systems, Man,

and Cybernetics, 24(3):440–454, 1994.

[40] Erkki Mäkinen. How to Draw a Hypergraph. International Journal of Computer

Mathematics, volume 34, pages 177–185. Taylor and Francis, 1990.

[41] George A. Miller. The Magical Number Seven, Plus or Minus Two: Some Limits on

Our Capacity for Processing Information. The Psychological Review, 63:81–97, 1956.

[42] Paul Mutton, Peter Rodgers, and Jean Flower. Drawing Graphs in Euler Diagrams. In

Alan F. Blackwell, Kim Marriott, and Atsushi Shimojima, editors, Diagrams, volume

2980 of Lecture Notes in Computer Science, pages 66–81. Springer, 2004.

[43] M. E. J. Newman. Analysis of Weighted Networks. Physical Review E, 70:056131,

2004.

[44] Andreas Noack. An Energy Model for Visual Graph Clustering. In Giuseppe Liotta,

editor, Graph Drawing, volume 2912 of Lecture Notes in Computer Science, pages

425–436. Springer-Verlag, 2003.

[45] Andreas Noack. Energy Models for Drawing Clustered Small-World Graphs. Technical

Report 07/03, 2003.

[46] Andreas Noack. Energy Models for Graph Clustering. Journal of Graph Algorithms

and Applications, 11(2):453–480, 2007. Communicated by Peter Eades and Patrick

Healy.

[47] Stephen C. North. Incremental Layout in DynaDAG. In Brandenburg [16], pages

409–418.

[48] Doantam Phan, Ling Xiao, Ron Yeh, Pat Hanrahan, and Terry Winograd. Flow Map

Layout. In INFOVIS ’05: Proceedings of the 2005 IEEE Symposium

on Information Visualization, page 29, Washington, DC, USA, 2005. IEEE Computer

Society.

[49] Franco P. Preparata and Michael I. Shamos. Computational Geometry: An Introduction

(Monographs in Computer Science). Springer, August 1985.

[50] Helen C. Purchase. Which Aesthetic has the Greatest Effect on Human Understanding?

In Battista [11], pages 248–261.

[51] Helen C. Purchase, Robert F. Cohen, and Murray James. Validating Graph Drawing

Aesthetics. In GD ’95: Proceedings of the Symposium on Graph Drawing, pages

435–446, London, UK, 1996. Springer-Verlag.

[52] Aaron Quigley and Peter Eades. FADE: Graph Drawing, Clustering, and Visual

Abstraction. In GD ’00: Proceedings of the 8th International Symposium on Graph

Drawing, pages 197–210, London, UK, 2001. Springer-Verlag.

[53] Manojit Sarkar and Marc H. Brown. Graphical Fisheye Views of Graphs. In CHI

’92: Proceedings of the SIGCHI conference on Human factors in computing systems,

pages 83–91, New York, NY, USA, 1992. ACM.

[54] Wayne P. Stevens, Glenford J. Myers, and Larry L. Constantine. Structured Design.

IBM Systems Journal, 13(2):115–139, 1974.

[55] Kozo Sugiyama, Shojiro Tagawa, and Mitsuhiko Toda. Methods for Visual Understanding

of Hierarchical System Structures. IEEE Trans. Systems, Man and Cybernetics,

11(2):109–125, February 1981.

[56] John Sweller. Some Cognitive Processes and Their Consequences for the Organisation

and Presentation of Information. Australian Journal of Psychology, 45(1):1–8, 1993.

[57] John Sweller. Cognitive Load During Problem Solving: Effects on Learning. Cognitive

Science, 12(2):257–285, 1988.

[58] John Sweller, Jeroen J. G. Van Merrienboer, and Fred G. W. C. Paas. Cognitive

Architecture and Instructional Design. Educational Psychology Review, 10:251–296,

1998.

[59] Roberto Tamassia, Giuseppe Di Battista, and Carlo Batini. Automatic Graph Drawing

and Readability of Diagrams. IEEE Transactions on Systems, Man and Cybernetics,

18(1):61–79, 1988.

[60] Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and

Applications. Cambridge University Press, November 1994.

[61] Douglas B. West. A Question on Notation in Graph Theory.

http://www.math.uiuc.edu/~west/igt/notat.html, 2000. [Online; accessed June 29, 2008].

[62] Douglas B. West. Introduction to Graph Theory (Second Edition). Prentice Hall,

August 2000.

[63] Rui Xu and Donald Wunsch II. Survey of Clustering Algorithms. IEEE Transactions

on Neural Networks, 16(3):645–678, May 2005.

[64] S. H. Yook, H. Jeong, and A.-L. Barabási. Modeling the Internet’s Large-Scale

Topology. Proceedings of the National Academy of Sciences of the USA, 99(21):13382–13386, October 2002.

[65] Ellen W. Zegura, Kenneth L. Calvert, and Michael J. Donahoo. A Quantitative

Comparison of Graph-Based Models for Internet Topology. IEEE/ACM Trans. Netw.,

5(6):770–783, 1997.
