Visualization of Hyperedges in Fixed Graph Layouts

Brandenburg University of Technology Cottbus

Computer Science Department

Software Systems Engineering Research Group

Diploma Thesis

(Diplomarbeit)


Martin Junghans

Matrikel-Nr.: 2203076

October 2008

Adviser: Prof. Dr. Claus Lewerentz

Statutory Declaration

I affirm that I have written this thesis independently and without using any literature or aids other than those stated. All aids and sources used are listed completely in the bibliography, and passages taken verbatim or in substance from these sources are marked as such. This thesis has not previously been submitted in the same or a similar form to any other examination authority, nor has it been published.

Martin Junghans

Cottbus, November 9, 2008

Declaration of Authorship

I certify that the work presented here is, to the best of my knowledge and belief, original and the result of my own investigations, except as acknowledged, and has not been submitted, either in part or whole, for a degree at this or any other University.

Martin Junghans

Cottbus, November 9, 2008


Abstract

Graphs and their visualizations are widely used to communicate the structure of complex data in a formal way. Hypergraphs are well suited to representing real-world data because they can relate multiple objects to each other. However, existing graph drawing techniques lack the ability to embed hyperedges into fixed two-dimensional graph layouts. We use a set of curves to visualize hyperedges and employ an energy-based technique to position them in the layout. By avoiding node occlusion and cluster intersections, we are able to preserve the expressiveness of the given graph layout. Additionally, we investigate techniques to reduce the visual complexity of hypergraph drawings. A comprehensive evaluation using real-world data sets demonstrates the suitability of the proposed hyperedge layout techniques.


Contents

1 Introduction
  1.1 Motivation
  1.2 Applications of Hypergraph Visualization
  1.3 Scope of this Thesis
  1.4 Structure of this Thesis
2 Preliminaries of Hypergraph Visualization
  2.1 Graphs
    2.1.1 Notation
    2.1.2 Energy-Based Graph Layouts
    2.1.3 Readability and Aesthetics of Graph Drawings
  2.2 Hypergraphs
    2.2.1 Notation
    2.2.2 Requirements of Hypergraph Visualizations
    2.2.3 Criteria for Hypergraph Visualizations
    2.2.4 Related Work
  2.3 Hyperedge Visualization Structures
3 Hyperedge Routing
  3.1 Motivation
  3.2 Related Work
  3.3 Routing Based on Cluster Bounds
    3.3.1 Clustering of Graph Layouts
    3.3.2 Solid Cluster Bounds
    3.3.3 Floating Cluster Bounds
    3.3.4 Conclusion
  3.4 Energy-Based Routing
    3.4.1 Modeling of Curves
    3.4.2 Repulsion
    3.4.3 Strain of Curves
    3.4.4 Equilibrium
    3.4.5 Conclusion
  3.5 Discussion
4 Reduction of Visual Complexity
  4.1 Classification of Visual Complexity
    4.1.1 Cognitive Load
    4.1.2 Utilization of Cognitive Load Theory
  4.2 Techniques to Reduce Visual Complexity
  4.3 Related Work
  4.4 Energy-Based Curve Bundling
    4.4.1 Attraction
    4.4.2 Order of Energy-Based Routing and Bundling
    4.4.3 Conclusion
  4.5 Energy-Threshold-Based Aggregation
    4.5.1 Rationale
    4.5.2 Conclusion
  4.6 Cluster-Based Curve Aggregation
    4.6.1 Identification of Curve Groups
    4.6.2 Branch Out of Aggregated Curves
    4.6.3 Movable Curves
    4.6.4 Conclusion
  4.7 Energy-Based Curve Widening
    4.7.1 Visualization of Curves
    4.7.2 Prerequisites
    4.7.3 Model of a Widened Curve
    4.7.4 Rationale
    4.7.5 Formalization
    4.7.6 Conclusion
  4.8 Discussion
5 Evaluation
  5.1 Implementation
  5.2 Example Hypergraphs
  5.3 Preservation of Graph Layout Expressiveness
    5.3.1 Experiment 1 – Node Occlusion
    5.3.2 Experiment 2 – Cluster Intersection
  5.4 Reduction of Visual Complexity
    5.4.1 Experiment 3 – Model-Based Aggregation
    5.4.2 Experiment 4 – Visual Bundling
  5.5 Comprehensive Hypergraph Layouts
6 Summary
  6.1 Conclusions
  6.2 Future Work
Bibliography

1 Introduction

1.1 Motivation

The study of graphs began in 1736, when Leonhard Euler presented his solution to the problem of “The Seven Bridges of Königsberg” [27], which is regarded as the oldest application of graph theory. Since then, graphs have been widely used to represent abstract problems. In many respects, well-investigated graph algorithms have been indispensable for research on the complexity of mathematical problems. Theoretical computer science, for instance, employs graphs to reason about the computability and complexity of problems. If a problem can be represented as a graph and a graph problem that is known to be hard, e.g., the NP-hard [6] traveling salesman problem, can be reduced to it, then the problem is NP-hard as well.

With the advent of the Internet, the resulting growth of computer networking, and the accompanying wide range of internetworking problems, graphs gained even more importance as a model of the topological structure of networks [65], which, among other things, allows the analysis and development of internetworking technologies [18]. To advance the Internet, from the development of strategies for resource reservation to efficient routing protocols [64], it is crucial to capture the Internet’s large-scale topology. Due to the gigantic scale of the Internet, visualizations cannot be used to model Internet traffic or to evaluate the topology of internetworks in general. A viewer’s cognition and the capacity to display a graph visualization are always limited to a small fraction of the real world.

Although not all problems of Internet topologies that relate to graphs and their visualization can be mentioned in this work, the bottom line can be summarized as follows: graphs are used for computer-based analysis and simulation studies [65], whereas visualizations allow a quick cognition of general topological characteristics and of the interconnectedness within a limited extent of reality. The arrangement of vertices and edges in graph visualizations is meaningful because it reveals groupings and relations between vertices. Furthermore, a visualization communicates the displayed data easily and clearly.

For this work, it is essential to distinguish graphs, as a representation of structured data, from their visualization. Numerous applications of graphs, a few of which are listed below, also rely on an adequate visualization for the sake of cognition and communication of the displayed information.

Versatility of Graphs. In business environments, graphs and their visualizations are used in document management systems, in project management (for instance, PERT network charts), and to depict organizational structures. In biology, taxonomies that portray the relations between species and evolutionary trees are applications of graph visualizations. Furthermore, numerous applications in computer science and information technology rely crucially on graphs and (often) on their visual representation [31]. Such applications include:

• Web site maps, browsing history
• Semantic networks, knowledge presentation
• Object-oriented systems, class browsers, data structures (compiler), real-time systems (state-transition diagrams, Petri nets)
• Data flow diagrams, (subroutine-) call graphs, entity relationship diagrams (UML, database structures)
• Logic programming (derivation trees)
• Computer aided design, computer aided modeling
• Integrated circuit creation (very-large-scale integration)

The purpose of graph visualizations is to convey the structure of information. Usually, the visualizations are constrained to binary relationships between vertices. However, real-world graphs that represent realistic scenarios usually relate more than two objects to each other. For this purpose, hyperedges are required to model and visually represent a subset of vertices that is distinguished from the remaining vertices by a certain property. The following section therefore motivates applications of graph visualization that benefit from hyperedges.

1.2 Applications of Hypergraph Visualization

Graph drawings depict a set of objects and the relationships among them. Usually, it is desired to arrange objects based on a similarity metric, according to the cross-linkage between objects, or by using a hierarchy relation if available.

Hypergraphs are a generalization of ordinary graphs: they allow hyperedges that connect several vertices of the graph, and each hyperedge can be treated as a subset of vertices. A definition is given in Section 2.2.1. The visual representation of a hyperedge distinguishes this subset of vertices from the remaining vertices. Abstract use case scenarios of a visualization of hyperedges embedded into box-and-line graph layouts are:

• A hyperedge emphasizes a subset of displayed objects with a certain property.
• A hyperedge visualizes a further grouping criterion of the displayed objects, while the positions of vertices may also reveal another grouping criterion.
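The notion of a hyperedge as a subset of vertices can be made concrete with a small sketch. It assumes a plain set-based representation; the names `Hypergraph`, `members_of`, and `non_members_of` are illustrative and do not come from the thesis, and the formal definition in Section 2.2.1 remains authoritative.

```python
from dataclasses import dataclass, field


@dataclass
class Hypergraph:
    """Illustrative container: vertices plus named hyperedges (vertex subsets)."""
    vertices: set = field(default_factory=set)
    hyperedges: dict = field(default_factory=dict)  # name -> frozenset of vertices

    def add_hyperedge(self, name, members):
        members = frozenset(members)
        # A hyperedge may only refer to vertices that exist in the graph.
        if not members <= self.vertices:
            raise ValueError(f"unknown vertices: {sorted(members - self.vertices)}")
        self.hyperedges[name] = members

    def members_of(self, name):
        """Vertices that the hyperedge distinguishes from the rest."""
        return self.hyperedges[name]

    def non_members_of(self, name):
        """Remaining vertices that do not belong to the hyperedge."""
        return self.vertices - self.hyperedges[name]


h = Hypergraph(vertices={"a", "b", "c", "d"})
h.add_hyperedge("binary", {"a", "b"})       # an ordinary edge is a hyperedge of size two
h.add_hyperedge("tagged", {"a", "c", "d"})  # a hyperedge relating three vertices
```

In this sketch, an ordinary graph is simply the special case in which every hyperedge has exactly two members.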

This section introduces several concrete applications of hypergraph visualizations and demonstrates their usefulness in comparison to visualizations of ordinary graphs. The advantages described below cover only a small part of the possible applications and benefits of well-designed hypergraph visualizations.

Software Visualization

The visualization of software systems is the main motivation of this work. Software artifacts can be highly related to each other. There is a plethora of possible scenarios in which the visualization of hypergraphs assists the software development process. More specifically, it is crucial for engineers to visualize software systems in the software design and quality assurance phases, as it allows them to gain an understanding of the overall structure and the interactions between components.

Software artifacts like packages, classes, methods, or attributes are represented by vertices. Edges may reflect the usage of components, call traces, method calls, or any communication between software artifacts. The computation of a proper graph layout is fundamental, as it determines the readability and usability of a visualization, as Figure 2.2 illustrates.

Generally, a layout that reflects the inclusion hierarchy of the presented software system or the relations among artifacts is favored. It enables engineers to easily and quickly grasp the affiliation of artifacts to modules or the interrelationships of artifacts, respectively. The following scenarios reveal the benefits of an additional visualization of hyperedges, which makes it possible to emphasize relations between more than two artifacts without losing the meaning of the layout.

For the following use case scenarios, it is assumed that graph layouts either represent the hierarchy of the system or reveal the relationships and connectivity among the artifacts. Thus, vertices are positioned close to each other if they belong to the same module or if the represented software artifacts are tightly coupled.

Evaluation of Software Design. Besides the design and development of software, software engineers also have to focus on software quality, e.g., the quality of design. Various techniques are known that support engineers in these important tasks. In a few cases, tools suggest hints or improve the software’s quality automatically after analyzing the source code. More often, however, engineers have to revise the code manually and rely on meaningful visualizations that present the structure of software systems.

Common change, sometimes called co-change, denotes the stability of a group of software artifacts against change. A change of a software system is typically considered to be restricted to a certain (logical) functionality of the software. Hence, software artifacts that are frequently changed together are likely to be logically coupled with regard to object-oriented development paradigms. To reduce the maintenance cost of software, designers and developers aim for low coupling between different modules, i.e., subsets of software artifacts that each implement a certain functionality [54]. Consequently, if artifacts that are frequently changed at the same time are distributed over multiple parts of the software system, this may indicate design flaws.

Analysis tools identify groups with a high co-change value. A visualization that features hyperedges allows software engineers to explore those groups visually and to study the distribution of the artifacts. Only hypergraph visualizations allow a viewer to quickly answer the question “Where are the frequently and simultaneously changed software artifacts?”.

Further, a co-change visualization can also suggest a preferable software modularization, as done by Beyer in [15]. Beyer’s visualization is limited to a distinct coloring of vertices to display hyperedges of co-changed artifacts. In contrast, this thesis aims at a structural representation of hyperedges.


Bug tracking systems are a valuable source of information about co-change. A bug report usually describes one flawed functionality, and the associated bug fix repairs the source code that caused it. Thus, fixing a bug should not involve software artifacts of multiple software subsystems, because a good software design aims at decoupling functional components. Consequently, a hyperedge that connects vertices of multiple software subsystems strongly indicates a design that should be reviewed and possibly improved.
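The review criterion from this scenario can be sketched in a few lines. The sketch assumes that a co-change group (or the set of artifacts touched by one bug fix) is available as a set of artifact names and that a mapping from artifact to subsystem exists; the function names, the threshold parameter, and the sample data are hypothetical.

```python
def subsystems_touched(hyperedge, subsystem_of):
    """Return the subsystems that contain at least one artifact of the hyperedge."""
    return {subsystem_of[artifact] for artifact in hyperedge}


def needs_review(hyperedge, subsystem_of, max_subsystems=1):
    """Flag a co-change hyperedge that spans more subsystems than allowed."""
    return len(subsystems_touched(hyperedge, subsystem_of)) > max_subsystems


# Hypothetical data: artifact -> subsystem, and the artifacts changed by two bug fixes.
subsystem_of = {"Parser": "core", "Lexer": "core", "Dialog": "ui", "Socket": "net"}
fix_local = {"Parser", "Lexer"}                 # stays within one subsystem
fix_scattered = {"Parser", "Dialog", "Socket"}  # spreads over three subsystems
```

Such a check only identifies candidate hyperedges; where the affected artifacts are located in the layout is what the visualization itself is meant to show.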

Abstraction of Subsystems. Software systems often have a layered software architecture, and major components rely on the functionalities of a base system. Several components of the remaining system access the low-level base system or an external library that also offers common functionalities.

In such a scenario, the inner structure of a subsystem is not of particular interest. The subsystem can be hidden to save valuable screen space. The visualization of a hyperedge can connect those software artifacts of the remaining software system that access the hidden subsystem.

The visualization of the hyperedge allows an engineer to answer the question “Which software artifacts access a particular subsystem, or any other software artifact?”. Furthermore, a hyperedge visualization could also answer the question “Where are the software artifacts that access a particular subsystem?”.

A slight variation of this scenario is to visualize the relationships between a subsystem and the remaining system. To this end, the subsystem of interest is collapsed to save screen space in the visualization. A collapsed subsystem joins all contained artifacts into one vertex; edges of contained artifacts are connected to the collapsed vertex instead. A hyperedge connects the collapsed subsystem with those artifacts of the remaining system that are connected to artifacts of the collapsed subsystem. Hence, screen space is saved and the relations between the subsystem and the remaining system are visualized.
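The collapsing step described here can be illustrated with a short sketch. It assumes undirected binary edges given as pairs of artifact names; the function name `collapse_subsystem` and the example data are hypothetical and only serve to make the construction of the hyperedge explicit.

```python
def collapse_subsystem(edges, subsystem, collapsed_name):
    """Collapse `subsystem` into a single vertex.

    Returns the rewritten (undirected) binary edges and the hyperedge that
    connects the collapsed vertex with all outside artifacts adjacent to it.
    """
    new_edges = set()
    hyperedge = {collapsed_name}
    for u, v in edges:
        u_in, v_in = u in subsystem, v in subsystem
        if u_in and v_in:
            continue  # internal edge: hidden inside the collapsed vertex
        if u_in or v_in:
            outside = v if u_in else u
            hyperedge.add(outside)
            new_edges.add(frozenset((collapsed_name, outside)))
        else:
            new_edges.add(frozenset((u, v)))  # edge unrelated to the subsystem
    return new_edges, hyperedge


# Hypothetical system: X and Y form the subsystem that is collapsed into "lib".
edges = [("A", "X"), ("X", "Y"), ("B", "Y"), ("A", "C")]
new_edges, hyperedge = collapse_subsystem(edges, {"X", "Y"}, "lib")
```

Here the artifacts A and B end up in the hyperedge together with the collapsed vertex, while C, which never touches the subsystem, does not.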

Code Coverage. With the advent of agile software development, extreme programming, and test-driven development, the importance of testing has grown steadily. Thorough testing, including the planning and analysis of unit tests, promises high software quality, so there is a demand for complete sets of unit tests that cover the entire implementation. As testing a complex software system involves a huge number of unit test cases, developers have to assess the test coverage to identify untested software artifacts. Moreover, the visualization of test cases aids the comprehension of software systems [19].

The need for visualizations of test coverage comes with the need to visualize relations between more than two artifacts. The visualization of a hyperedge could present the software artifacts covered by a particular test case or test suite. Further, to attain a soundly structured set of unit tests, an engineer may require that each visualized software component is covered by exactly one hyperedge to avoid overlaps. Consequently, hyperedges are a useful supplement in these scenarios.
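The requirement that each component is covered by exactly one hyperedge can be checked mechanically. The following sketch assumes that every test case or test suite is given as a hyperedge, i.e., a set of covered component names; `coverage_report` and the sample names are hypothetical.

```python
def coverage_report(components, test_hyperedges):
    """Return (untested, overlapping) components, each as a sorted list.

    A component is untested if no hyperedge contains it and overlapping
    if more than one hyperedge contains it.
    """
    counts = {c: 0 for c in components}
    for members in test_hyperedges.values():
        for c in members:
            if c in counts:
                counts[c] += 1
    untested = sorted(c for c, n in counts.items() if n == 0)
    overlapping = sorted(c for c, n in counts.items() if n > 1)
    return untested, overlapping


# Hypothetical components and the artifacts covered by two test suites.
components = {"Cart", "Order", "Invoice", "Mailer"}
tests = {
    "test_checkout": {"Cart", "Order"},
    "test_billing": {"Order", "Invoice"},
}
untested, overlapping = coverage_report(components, tests)
```

Both result lists are empty exactly when the hyperedges partition the set of components, which is the situation the engineer in this scenario asks for.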

“Software As A City” Metaphor. A considerable amount of research is conducted on the visualization of software systems as a city. Buildings represent software artifacts. The positioning of vertices in box-and-line visualizations corresponds to the positioning of buildings in the landscape. This vivid metaphor of a city benefits from the local scope of the information displayed to a viewer. As a viewer navigates through a three-dimensional city, only a few artifacts in the close environment are visible. More distant buildings are only visible if they are comparatively tall, which corresponds to larger (major) software artifacts. The limited amount of displayed information makes it possible to focus on a particular subsystem using the entire screen area. Entering a building allows the viewer to explore the inner structure of software artifacts. A neighborhood of nearby buildings implies a grouping, which can represent the modular software design.

A hypergraph may model scenarios as described above. With this metaphor of a city in mind, a hyperedge enriches the city in the form of a body of water that curls around the buildings. The water depicts, for instance, an external library, and software artifacts that access this library also have access to the water through a pier.

Social Networks

Of course, the visualization of hypergraphs can be applied to various fields beyond software engineering. A layout that spatially groups vertices already distinguishes subsets of vertices, as hyperedges do. The additional visualization of hyperedges adds a further dimension to depict grouping information.

Visualizations of social networks reveal groupings of people and their relationships. For instance, a graph may model the students of a university. A layout places vertices according to the students’ field of study such that a viewer easily identifies the students’ affiliations to departments.

The binary edge relation is sufficient to model friendships between students, but not circles of friends. As the friendship relation is not transitive, no statement about circles of friends can be derived from the graph or from its visualization. Therefore, a hypergraph is required to remedy this limitation of binary relations (note that a hyperedge is not the transitive closure of the binary relation). A visualization of hyperedges reveals information about the interaction of students beyond department boundaries and thus gives a more precise prediction about the students’ friendships.
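The remark that a hyperedge is not the transitive closure of the binary relation can be illustrated with a tiny example. The sketch computes the groups induced by the transitive closure of a symmetric friendship relation (its connected components) and compares them with explicitly given circles of friends; the student names and the circles are invented for illustration.

```python
def connected_components(vertices, edges):
    """Groups induced by the transitive closure of a symmetric binary relation."""
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first traversal from `start`
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(neighbors[v] - component)
        seen |= component
        components.append(frozenset(component))
    return set(components)


students = ["Ann", "Ben", "Cem"]
friendships = [("Ann", "Ben"), ("Ben", "Cem")]  # Ann and Cem are not friends
circles = {frozenset({"Ann", "Ben"}), frozenset({"Ben", "Cem"})}  # hyperedges

closure_groups = connected_components(students, friendships)
```

The closure merges all three students into one group, whereas the two circles keep Ann and Cem apart; this information is only available if the circles are modeled as hyperedges.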

Internet Topologies

Another field of application of graph visualization techniques is computer network (or Internet) topologies. Servers are modeled by vertices of a graph, and their positions in the layout may indicate their geographical location. The distance between nodes in a visualization reflects the latency between servers. Edges illustrate network connections and are weighted by their bandwidth. In this scenario, the advantage of using a hypergraph over a weighted graph with binary edges only is the additional potential to distinguish sub-networks or private corporate network structures within the given network topology. Hyperedges connect network nodes of the same sub-network. Since there are also logical sub-networks, i.e., servers might be distributed globally, a visualization without hyperedges is not capable of revealing geographical location, bandwidth, latency, and sub-network memberships at the same time.


Summary

Hypergraphs are more powerful than ordinary graphs that feature binary edges only. A hyperedge represents a certain group of objects with a distinct property. Assuming that the layout already represents grouping information, the additional display of hyperedges is able to visualize a further grouping of vertices with respect to another grouping criterion.

The simplest way to visualize hyperedges is to label nodes according to their affiliation, using different node colors or shapes. Since this approach is not able to answer the questions mentioned in the scenarios above, this work focuses on the visualization of hyperedges as structures that visually connect all vertices of the hyperedge.

The purpose of visualizing hyperedges can be concluded from all previously mentioned use case scenarios: a visualized hyperedge generally allows a viewer to find objects sharing a certain property and to understand the relationship between those objects. We do not aim at hypergraph visualizations that convey the specific value or degree of that property.

1.3 Scope of this Thesis

The goal of this thesis is to present and evaluate techniques to visualize hypergraphs. A graph layout is already given, and the visualization of hyperedges must not affect it. We therefore study the embedding of hyperedges into a given layout. Hyperedges should use the free space of the graph layout plane between the nodes. A hyperedge is represented by a contiguous shape that visually connects all hyperedge nodes equally with each other.

A preliminary depiction of our notion of a hypergraph visualization embedded in a given graph drawing is shown in Figure 1.1. As can be seen in the figure, the hyperedge is laid out in the unused space between the nodes of the graph. In this sketch, the hyperedge corresponds to a plane that connects all hyperedge nodes. In contrast to published hypergraph drawings that use planes to represent hyperedges, the plane in Figure 1.1 is laid out in such a way that it considers the graph layout immediately surrounding the visualized hyperedge. The nodes that are not part of the hyperedge cause recesses in the plane.

We also envision a curve-based hypergraph visualization. Such a hypergraph drawing might be similar to Figure 1.1 if solely the contours of the plane are employed to represent the hyperedge. In contrast to available curve-based hypergraph drawings, our visualization is not based on straight-line curves, because straight-line curves cannot avoid node occlusion if the graph layout is immutable.
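Why a straight connection can fail in a fixed layout is easy to check geometrically. The sketch below assumes, purely for illustration, that nodes are drawn as circles of a common radius; the helper names and coordinates are hypothetical and not taken from the thesis.

```python
import math


def distance_point_to_segment(p, a, b):
    """Euclidean distance from point p to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Parameter of the orthogonal projection of p, clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def occluded_nodes(a, b, other_nodes, radius):
    """Names of nodes whose circular shape is crossed by the segment a-b."""
    return [name for name, center in other_nodes.items()
            if distance_point_to_segment(center, a, b) < radius]


# Two hyperedge nodes and two nodes that do not belong to the hyperedge.
start, end = (0.0, 0.0), (10.0, 0.0)
others = {"n1": (5.0, 0.5), "n2": (5.0, 4.0)}
hit = occluded_nodes(start, end, others, radius=1.0)
```

Because the node positions may not change, the only way to avoid covering `n1` in this example is to bend the connection, which motivates curves that are routed through the free space of the layout.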

Both targeted visualization approaches, i.e., planes and curves, enable a viewer to find the nodes of a hyperedge. The object that represents a hyperedge, either curves or a plane, leads the viewer’s eyes from the hyperedge to all hyperedge nodes.

Readability is a significant concern for the visualization of hypergraphs. The additional display of hyperedges must not introduce visual clutter and must permit a quick cognition of the nodes connected by hyperedges without interfering with the interpretability of the remaining graph.


Figure 1.1: Sketch of our initial notion of a hypergraph visualization

More specifically, this work introduces techniques to visualize hypergraphs in two-dimensional box-and-line drawings. These techniques mainly rely on energy-based layout methods. As the layout of a graph is immutable, the hyperedges are embedded into the two-dimensional graph drawings. Thus, the computation of hypergraph layouts is independent of the computation of the graph layouts. Only the positions of the vertices in the graph layout are required to compute the layout of hyperedges. Consequently, specific types of graphs such as directed or hierarchical graphs do not have to be handled separately.

Different approaches and techniques are introduced, and their practicability is discussed on the basis of their advantages and disadvantages. For the sake of an evaluation, a prototype implementation produces hypergraph visualizations using the presented approaches.

Focus on Software Engineering Scenarios. As motivated in the previous section, multiple tasks of the software development process rely on the visualization of software systems. As this field inspired us to conduct research on hypergraph visualization, scenarios related to software development are potential applications. Hypergraphs make it possible to group software artifacts in their visual representation, either by their position or by their hyperedge connectivity.

The major advantage of hypergraph visualizations can be briefly described as their ability to add another grouping dimension to the visualization. The values of the property that serves as the grouping criterion are not displayed. Similar to a spatial clustering of graphs that easily reveals groups of nodes, the visualization of hyperedges should allow a viewer to recognize nodes as members of a group. The demand for hypergraph visualization in the field of software visualization was already motivated in Section 1.2.


1.4 Structure of this Thesis

This thesis is structured as follows. Chapter 2 first formalizes graphs and hypergraphs in Sections 2.1 and 2.2, respectively. Section 2.2 also introduces three requirements of hypergraph visualizations and infers criteria that are used to evaluate whether these requirements are met. At the end of Chapter 2, structures of hyperedge visualizations are discussed. A proper choice of the structure fulfills the first requirement of hypergraph visualizations.

The main contributions of this thesis are introduced in Chapters 3 and 4, which present layout techniques to fulfill the remaining two requirements of hypergraph visualizations.

Chapter 3 presents two approaches **of** hyperedge rout**in**g. A rout**in**g technique is required

to create hypergraph visualizations that avoid node occlusion and cluster **in**tersections,

which is the second requirement **of** hypergraph visualizations. The first approach routes

hyperedges based on cluster bounds and is presented **in** Section 3.3. The second technique,

an energy-based rout**in**g technique, is elaborated **in** Section 3.4.

The reduction **of** the visual complexity **of** hypergraph draw**in**gs is **in**vestigated **in** Chapter

4. First, the notion **of** visual complexity **of** hypergraph draw**in**gs is clarified. In

Sections 4.4 through 4.7, four techniques are presented that can be used to simplify the

visualized **in**formation **of** hyperedges, either by a simplification **of** the hypergraph model

or by a visual simplification. As a consequence, these techniques improve the readability

**of** hypergraph visualizations by reduc**in**g the visual complexity, which is the third requirement

**of** hypergraph visualizations.

The techniques introduced in this thesis are evaluated in Chapter 5. The experiments demonstrate the capabilities of the proposed layout techniques and illustrate their results with several example graph drawings. They show in practice that hyperedge routing avoids node occlusion and cluster intersections, and that the visual complexity of hyperedges can be reduced by the techniques presented in Chapter 4.


F**in**ally, Chapter 6 concludes this thesis and outl**in**es areas **of** future work.

2 Prelim**in**aries **of** Hypergraph **Visualization**

After giv**in**g an overview **of** the topic and motivat**in**g applications **of** hypergraph visualizations

**in** the previous chapter, this chapter **in**troduces fundamental concepts used for

hypergraph visualizations **in** this thesis. The first part **of** this chapter gives an overview

**of** ord**in**ary graphs. The computation **of** graph layouts and aesthetic criteria **of** graph

draw**in**gs are described **in** Section 2.1.2 and 2.1.3, respectively.

The second part **of** this chapter formalizes hypergraphs **in** Section 2.2.1. Then, Section

2.2.2 clarifies the requirements **of** our notion **of** hypergraph visualization as the aesthetic

criteria **of** graph visualizations are **in**sufficient for this purpose. Based on these requirements,

criteria for evaluat**in**g hypergraph visualizations are derived **in** Section 2.2.3.

A discussion **of** approaches and techniques related to hypergraph visualizations **of** other

authors is given **in** Section 2.2.4.

The last section of this chapter discusses visualization structures of hyperedges. The choice of such a structure makes it possible to fulfill the first requirement of hypergraph visualizations. The hypergraph layout techniques that aim at meeting the remaining requirements are the major contribution of this thesis and are introduced in the subsequent Chapters 3 and 4. We introduce techniques that are capable of producing hypergraph visualizations that comply with the notion, requirements, and constraints stated in the present chapter.

2.1 **Graph**s

This section formalizes graphs. The definitions follow the common notation of the literature on graph theory, for instance [62]. The author of that book also published a discussion of notation in graph theory [61]. Here, the most common notation is applied.

2.1.1 Notation

A finite graph G = (V, E) is a pair of a finite set V(G) of vertices and a set E(G) of edges, each of which joins two vertices of V(G) [22, page 2]. An undirected graph does not observe the direction of edges, whereas a directed graph (digraph) distinguishes the start vertex (tail) from the end vertex (head) of an edge.

Edges join two vertices v1, v2 ∈ V(G). Two vertices v1 and v2 are called adjacent if there is an edge in E(G) that joins them. A directed edge is denoted by (v1, v2) ∈ E(G), so that E(G) ⊆ V(G) × V(G). An undirected edge is denoted by {v1, v2} ∈ E(G), i.e., a two-element subset of V(G) in which the order of vertices is irrelevant. Two edges e1, e2 ∈ E(G) are adjacent if they have a node in common. The incident edges of a node v are the edges {e ∈ E(G) | v ∈ e} that start or end in v. The degree deg(v) of a node v ∈ V(G) denotes the number of incident edges of v.
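The notions of adjacency, incident edges, and degree can be illustrated with a small sketch. It assumes an undirected graph stored as a set of two-element `frozenset` objects; the function names and the example graph are hypothetical and not part of this thesis.

```python
# Undirected graph: a vertex set and a set of two-element frozensets (assumed encoding).
V = {"a", "b", "c", "d"}
E = {frozenset(e) for e in [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]}


def adjacent(u, v, edges):
    """True if an undirected edge {u, v} exists."""
    return frozenset((u, v)) in edges


def incident_edges(v, edges):
    """All edges e with v in e."""
    return {e for e in edges if v in e}


def degree(v, edges):
    """Number of incident edges of v."""
    return len(incident_edges(v, edges))
```

Because the edges are unordered sets, `adjacent("a", "b", E)` and `adjacent("b", "a", E)` give the same answer, which mirrors the definition of an undirected edge.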


A connected graph G is a graph in which any two vertices v1, v2 ∈ V(G) are connected by a path in G. A path is a sequence of vertices from v1 to v2 in which consecutive vertices are adjacent.

A weighted graph G = (V, E, w) assigns a weight to each vertex and edge. The weighting function w : V(G) ∪ E(G) → ℝ, which is the union of a vertex and an edge weighting function, maps vertices and edges to their weight, i.e., a real number. Unweighted graphs can be treated as weighted graphs with the constant weighting w(x) = 1 for all x ∈ V(G) ∪ E(G). The weights of a vertex v ∈ V(G) and an edge e ∈ E(G) are abbreviated by $w_v$ and $w_e$, respectively.

A hierarchical graph H = (V, E, w, T ) is def**in**ed by a base graph and a hierarchy tree.

A base graph is a (weighted) graph (V, E, w) that models the non-hierarchical adjacency

relation between vertices as **in** Figure 2.1a. In addition to a base graph, a hierarchical

graph models parent-child relations between vertices, for **in**stance the **in**clusion relation,

by means **of** an acyclic hierarchy tree T (H). Figure 2.1b depicts the hierarchy tree **of** the

hierarchical graph **in** Figure 2.1a. Vertices V (H) **of** the base graph are the leaves **in** T (H).

By def**in**ition, each leaf has exactly one hierarchical parent. At each level **of** the tree T (H),

**in**ner (non-leaf) nodes **of** the hierarchical tree imply subgraphs **of** the base graph (V, E, w)

with a partition **of** base graph vertices.

(a) A box-and-l**in**e visualization **of** a

hierarchical graph

(b) Hierarchy tree **of** the hierarchical

graph **of** Figure (a)

Figure 2.1: Example hierarchical graph and its hierarchy tree. The hierarchical structure of the graph implies a graph clustering that is marked with gray circles.

The hierarchical graph clustering of a base graph denotes the hierarchical grouping of base graph vertices that is implied by the hierarchy tree, as Figure 2.1 illustrates. Graph clustering does not consider the spatial position of vertices in a layout; it is solely based on the information of a (hierarchical) graph as described above.
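How a hierarchy tree implies a clustering can be sketched as follows. The encoding of the tree as a parent-to-children mapping, the node names, and the function names are assumptions made for illustration only.

```python
# Hierarchy tree as a mapping from inner nodes to their children (hypothetical data).
# Nodes that never appear as a key are leaves, i.e., vertices of the base graph.
tree = {
    "root": ["c1", "c2"],
    "c1": ["v1", "v2"],
    "c2": ["v3", "v4", "v5"],
}


def leaves(node, tree):
    """Base graph vertices below `node`: the leaves of its subtree."""
    children = tree.get(node, [])
    if not children:
        return {node}
    result = set()
    for child in children:
        result |= leaves(child, tree)
    return result


def clustering(tree):
    """Map every inner node to the set of base graph vertices it groups."""
    return {node: leaves(node, tree) for node in tree}
```

The clusters depend only on the tree, not on any vertex position, which matches the statement that graph clustering is independent of the layout.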

The set **of** vertices V (G) and edges E(G) **of** a graph G are denoted by V and E,

respectively, if it is clear from the context which graph is **of** **in**terest. |V | and |E| denote

the number **of** vertices and edges **of** a graph, respectively.

**Visualization** In reference to [22], a typical way to visualize graphs is the box-and-l**in**e

diagram: each vertex is depicted by a dot and an edge is depicted by a l**in**e that jo**in**s two


**of** these dots. Regardless **of** the gray enclos**in**g ellipses, Figure 2.1a depicts a graph **in** a

box-and-l**in**e diagram.

Strictly speaking, a graph drawing is a visualization that assigns further properties like shape, color, and size to graph objects. Box-and-line diagrams, also called straight-line drawings, are used in this work. This allows us to focus on the computation of the graph or hypergraph layout, because it is not relevant how the dots and lines are drawn.

A two-dimensional graph layout $p = (p_v)_{v \in V}$ of a graph G = (V, E) is a vector of vertex positions $p_v \in \mathbb{R}^2$. Abscissa and ordinate of a vertex position $p_v$ are denoted as $p_v^x$ and $p_v^y$, respectively.

A vertex is called node **in** the context **of** a visualization to dist**in**guish a vertex as an

object **of** the graph model from its visualization. The term node is also used to emphasize

that a position **in** the layout was assigned to the correspond**in**g vertex **of** the graph.

2.1.2 Energy-Based **Graph** **Layouts**

This section explains the computation of graph layouts in depth and describes different criteria of readability of graph visualizations. The energy-based graph layout computation is introduced, since these concepts are later applied to compute hypergraph layouts.

The astrophysical N-body problem is similar to the energy-based computation of graph layouts. Analogous to planets in a solar system, forces act on the vertices of a graph. Typically, vertices of a graph repel each other and edges cause an attraction between the joined vertices. Layout algorithms alter the graph layout from initial (possibly arbitrary) vertex positions to a stable configuration in which the forces cancel each other out and create an equilibrium.

The follow**in**g two paragraphs expla**in** the pr**in**ciple **of** layout computation based on

forces or an energy function. By this, the relation and analogy between both approaches

is revealed.

Force-Directed Force-directed methods [29] are generally applied to undirected graphs. Similarly, spring embedder algorithms [23] are mechanical systems that use Hooke’s law to describe forces between vertices. Edges are modeled as springs tied to the corresponding vertices. A graph layout is produced by computing an equilibrium: heuristic algorithms move the vertices over several iterations to a stable, final location. In each iteration, the forces on each vertex are calculated. The net force acting on a vertex is the vector sum of the individual forces acting on it. The direction and magnitude of the net force determine the displacement of each vertex in each iteration of the layout algorithm. Once all vertices have reached stable positions, i.e., the forces acting at every vertex position cancel each other out, the vertices are not moved anymore.
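One iteration of such a method might be sketched as follows. This is a minimal illustration and not the algorithm of [23] or [29]: the spring constant `k`, the rest length `rest`, the inverse-square repulsion with constant `c`, and the fixed `step_size` are all assumptions.

```python
import math


def net_forces(pos, edges, k=1.0, rest=1.0, c=0.1):
    """Net force on every vertex: the vector sum of all individual forces."""
    force = {v: [0.0, 0.0] for v in pos}
    verts = list(pos)
    for i, u in enumerate(verts):
        for v in verts[i + 1:]:
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            d = math.hypot(dx, dy) or 1e-9  # guard against coinciding vertices
            # Positive magnitude pulls u towards v, negative pushes u away.
            magnitude = -c / d**2  # assumed pairwise repulsion
            if frozenset((u, v)) in edges:
                magnitude += k * (d - rest)  # Hooke's law for the spring on edge {u, v}
            fx, fy = magnitude * dx / d, magnitude * dy / d
            force[u][0] += fx
            force[u][1] += fy
            force[v][0] -= fx  # equal and opposite force on v
            force[v][1] -= fy
    return force


def iterate(pos, edges, step_size=0.1):
    """Displace every vertex along its net force."""
    forces = net_forces(pos, edges)
    return {
        v: (x + step_size * forces[v][0], y + step_size * forces[v][1])
        for v, (x, y) in pos.items()
    }
```

Calling `iterate` repeatedly until the displacements become negligible corresponds to reaching the equilibrium described above.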

Energy-Based Energy-based algorithms also move vertices iteratively. An energy function

assigns a value to each layout. This value expresses the quality **of** the graph layout.

A lower energy stands for better layout quality than a higher one. Thus, the computation


**of** graph layouts is equivalent to m**in**imiz**in**g the system’s energy, a common mathematical

optimization problem. S**in**ce the force is the negative gradient **of** the energy, the forces

that determ**in**e the displacement **of** the vertices **in** each iteration are easily derived from

the energy functions. Because of this relation, layout algorithms can be described equivalently by energy equations or by force equations.
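The relation between energy and force stated above can be written out as follows. This is a sketch only: the step size $\eta$ and the iteration index $t$ are assumptions introduced for illustration and are not part of the notation of this thesis.

```latex
F_v(p) = -\nabla_{p_v} E(p),
\qquad
p_v^{(t+1)} = p_v^{(t)} + \eta \, F_v\bigl(p^{(t)}\bigr), \quad \eta > 0
```

Under this reading, a layout is stable when $F_v(p) = 0$ for all $v \in V$, which is exactly the condition that the gradient of the energy vanishes.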

2.1.2.1 Concrete Energy Models for Layout Computation

Energy models specify the type **of** layouts that are computed by energy m**in**imization

algorithms [46]. The computation **of** layouts denotes the calculation and assignment **of**

two- or three-dimensional positions to the vertices **of** the graph. Energy-based graph

draw**in**g algorithms are able to compute **in**terpretable box-and-l**in**e visualizations. For

**in**stance, the distance between nodes may reflect the cluster structure **of** graphs as **in** [46].

It is also possible to meet aesthetic criteria like uniform node distribution or a uniform

edge length. However, these layouts do not reveal **in**formation by the positions **of** nodes.

This section **in**troduces concrete energy models, as similar models serve as a start**in**g

po**in**t **of** the layout techniques used for the energy-based computation **of** hypergraph visualizations.

LinLog Energy Models The idea of LinLog energy models, introduced by Noack in [44], is that all nodes repel each other and adjacent nodes attract each other. The LinLog energy models produce layouts in which densely connected nodes are grouped closely together, whereas unconnected or sparsely connected nodes are separated. In contrast to the model of Fruchterman and Reingold [29], this approach considers clustering criteria to position the nodes. Thus, the layout reveals affiliation information of nodes. The total (node-repulsion) LinLog energy $E_{nr}(p)$ of a graph layout p was defined as

$$E_{nr}(p) = \sum_{\{u,v\} \in E} \|p(u) - p(v)\| - \sum_{\{u,v\} \in V^{(2)}} \ln \|p(u) - p(v)\| \qquad (2.1)$$

Here, $V^{(2)}$ denotes the set of all two-element subsets of V.

As the name of the energy model suggests, the attraction energy (the first sum in Equation 2.1) grows linearly with increasing distance between two adjacent nodes u and v. The repulsion energy (the second sum, including its negative sign) decreases logarithmically with increasing distance between a pair of nodes.

The edge-repulsion LinLog energy $E_{er}(p)$ of a graph layout p further takes the degree of nodes into account. The repulsion energy of each pair of nodes is weighted by the numbers of their incident edges. Thus, a node with a rather high degree repels nearby nodes more strongly, which removes the bias of node-repulsion LinLog energy models towards centering nodes with high degrees in the layout area.

$$E_{er}(p) = \sum_{\{u,v\} \in E} \|p(u) - p(v)\| - \sum_{\{u,v\} \in V^{(2)}} \deg(u) \cdot \deg(v) \cdot \ln \|p(u) - p(v)\| \qquad (2.2)$$
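Equations 2.1 and 2.2 can be evaluated directly for a given layout. The following sketch assumes that the layout is a dictionary from nodes to coordinate tuples and that edges are given as a list of pairs; the function name and the flag `edge_repulsion` are hypothetical. It requires Python 3.8 or later for `math.dist`.

```python
import math
from itertools import combinations


def linlog_energy(pos, edges, edge_repulsion=False):
    """Node-repulsion (Eq. 2.1) or edge-repulsion (Eq. 2.2) LinLog energy."""

    def dist(u, v):
        return math.dist(pos[u], pos[v])

    # Degree of every node, needed only for the edge-repulsion variant.
    deg = {v: 0 for v in pos}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1

    # First sum: attraction grows linearly with the distance of adjacent nodes.
    attraction = sum(dist(u, v) for u, v in edges)

    # Second sum: logarithmic repulsion over all unordered pairs of nodes.
    repulsion = 0.0
    for u, v in combinations(pos, 2):
        weight = deg[u] * deg[v] if edge_repulsion else 1
        repulsion += weight * math.log(dist(u, v))

    return attraction - repulsion
```

Note that the logarithm is undefined for coinciding nodes, so this sketch assumes pairwise distinct positions.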

PolyPoly Energy Models A generalization of LinLog energy models leads to the parameterized PolyPoly energy models. A more thorough investigation of PolyPoly energy models, which exposes their differences and enables a conversion between them, is given in [35]. Attraction and repulsion are both described by polynomials.

$$E_{pp}(p) = \sum_{\{u,v\} \in E} \|p(u) - p(v)\|^{a} - \sum_{\{u,v\} \in V^{(2)}} \deg(u) \cdot \deg(v) \cdot \|p(u) - p(v)\|^{r} \qquad (2.3)$$

The attraction exponent a > 0 and repulsion exponent r ≤ 0 are customizable parameters that make it possible to generate layouts reflecting different criteria. For r = 0, the repulsion is defined as in the edge-repulsion LinLog energy model in Equation 2.2. PolyPoly

energy models are also denoted as (a, r)-energy models. The nomenclature may be extended

to r-L**in**Poly energy models, if a = 1 and r < 0, and to a-PolyLog energy models, if

a > 1 and r = 0. PolyPoly energy models are the generalization **of** L**in**Log energy models,

which correspond to the (1, 0)-energy models. For **in**stance, a (2, −2)-energy model was

also used **in** association with simulated anneal**in**g [21].

2.1.2.2 Global vs. Local Energy M**in**ima

A graph layout with minimal energy has reached a stable state, i.e., further displacements of nodes do not decrease the system’s energy. Equivalently, all forces acting on each node cancel each other out: the net force on every node is zero, and the nodes are consequently not moved anymore. Achieving globally minimal energy is the goal of energy-based graph drawing algorithms.

A local energy minimum is a local minimum of the energy function. If a layout p has locally minimal energy, then every similar graph layout p′ has a higher energy E(p′) > E(p), but both energies, E(p) and E(p′), are potentially far above the globally minimal energy $E_{min}$, i.e., E(p) ≫ $E_{min}$. A graph drawing algorithm with the goal of minimizing the energy of the layout p will move the nodes according to the acting forces. Any similar layout p′ of the next iteration, however, would have a higher energy E(p′) > E(p). Consequently, the nodes are not moved further; the layout p is trapped in a local energy minimum.
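The effect of getting trapped can be reproduced with a one-dimensional stand-in for a layout energy. The function below is an arbitrary polynomial with two minima, chosen for illustration only; it is not one of the energy models of this thesis, and the step size and iteration count are assumptions.

```python
def energy(x):
    """Toy energy with a global minimum near x = -1.30 and a local one near x = 1.13."""
    return x**4 - 3 * x**2 + x


def force(x):
    """The force is the negative gradient of the energy."""
    return -(4 * x**3 - 6 * x + 1)


def minimize(x, step_size=0.01, iterations=2000):
    """Follow the force in small steps until the iteration budget is used up."""
    for _ in range(iterations):
        x += step_size * force(x)
    return x


trapped = minimize(2.0)   # ends in the local minimum
best = minimize(-2.0)     # ends in the global minimum
```

Both runs stop where the force vanishes, but `energy(trapped)` is clearly higher than `energy(best)`: the outcome depends on the initial position, just as a layout algorithm depends on the initial vertex positions.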

LinLog energy models can produce interpretable and readable graph layouts by encoding structural information of the graph into the node positions. However, the computation of these layouts is not straightforward, since LinLog energy models have more local energy minima than, e.g., other PolyPoly energy models, and thus are more susceptible to getting trapped in local energy minima [35]. PolyPoly energy models other than LinLog fortunately have fewer local energy minima, but are less likely to produce layouts as interpretable as those of LinLog energy models. In [35], we evaluate a method that combines the benefits of both: local energy minima are bypassed and the energy minimization algorithms still produce interpretable LinLog layouts. This approach makes it possible to disregard local energy minima for the energy-based graph visualization, as it is capable of overcoming them.

2.1.3 Readability and Aesthetics **of** **Graph** Draw**in**gs

A graph is eligible [31] for visualization if there is an inherent relation among the data elements to be visualized. Data can be represented by vertices of a graph, with edges representing their relations. The graph drawing problem denotes the process of arranging vertices in a drawing area and visualizing edges by curves. A visualization that communicates the displayed data and its structure to a viewer is only as good as the positioning of vertices and edges. Figure 2.2 shows two distinct drawings of the same graph. The layout in Figure 2.2a makes the graph structure recognizable. In contrast, the visual clutter of curves and the even distribution of vertices in the layout of Figure 2.2b do not allow any conclusion to be drawn from the graph drawing.

Figure 2.2: The meaning of layout: Two graph drawings (a) and (b) of the same graph. Taken from [65], “The dangers of visual representation.”

2.1.3.1 Aesthetic Criteria

**Graph** draw**in**gs are a common language to express structures **in** a formal way. A major

quality criterion for graph draw**in**gs is readability, i.e., the ability to capture the mean**in**g

of a drawing [59]. However, it is computationally hard to generate aesthetically pleasing drawings automatically. For instance, the minimization of the number of edge crossings is NP-hard.

Aesthetics describes criteria that aid readability **of** graph draw**in**gs. Examples **of** such

criteria are:

• M**in**imization **of** the number **of** edge cross**in**gs

• M**in**imization **of** the number **of** edge bends

• M**in**imization **of** total or average edge length

• Uniform edge length

• Uniform vertex density

• Avoidance **of** unnecessary squander**in**g **of** space

Each criterion can foster readability. However, the context of the visualization and the viewers’ preferences demand a prioritization of the influence of these criteria. Figure 2.3 depicts the implications of pursuing two different criteria for the visualizations. The graph drawing in Figure 2.3a was drawn with a focus on a minimal number of edge bends. The graph drawing in Figure 2.3b instead minimizes the number of edge crossings. It remains arguable which layout has a higher readability. The evaluation of graph drawings is thus biased towards a certain set and prioritization of criteria. Nevertheless, this freedom of choice makes it possible to fit drawings to specific applications and purposes.

Figure 2.3: Different aesthetic criteria applied on the same graph: (a) minimal number of edge bends, (b) minimal number of edge crossings. Taken from [59], “Conflicts among aesthetics.”

But what characterizes a good graph draw**in**g? Besides the def**in**ition **of** properties,

like planarity and the classification **of** layouts, there are plenty **of** aesthetic rules. Not

all **of** those are without controversy, because very few **of** the f**in**d**in**gs **of** cognitive science

have practical application [31] and only a few usability studies were published. Some very

common aesthetic rules are uniform edge length, m**in**imal number **of** edge cross**in**gs, and

an even distribution **of** vertices. In [50, 51], Purchase shows the importance **of** a reduction

**of** edge cross**in**gs to foster readability **of** graph draw**in**gs, but the m**in**imization **of** edge

bends seems to be less beneficial to readability.

**Graph** layouts are typically computed by layout algorithms. The choice **of** a layout

algorithm implies the set **of** aesthetic criteria that are considered. A taxonomy **of** graph

draw**in**g algorithms regard**in**g their aesthetic criteria was published **in** [59].

The aesthetic criteria **of** graph draw**in**gs serve as a measure for readability **of** graph

draw**in**gs. However, they are not applicable to our notion **of** hypergraph visualizations.

For instance, Figure 2.2 and Figure 2.3 do not consider node occlusions explicitly, which is in accordance with the majority of available publications. Consequently, we have to

def**in**e different requirements and criteria for the visualizations **of** hypergraphs. These are

**in**troduced **in** Sections 2.2.2 and 2.2.3.

2.1.3.2 Limitations **of** **Graph** **Visualization**s

From a viewer’s perspective, a graph visualization should answer the questions “Where am

I?” and “Where is the object **of** **in**terest?” [31]. But there are two general limitations on


**in**formation visualization: the available screen space and the cognition as a human ability.

The latter limit is reached more quickly [52].

Screen space is a valuable and restricted resource. A visualization is not useful **in** terms

**of** readability if there is a shortage **of** screen space. Beside the screen resolution, the most

critical parameter is the amount **of** data to display. Miscellaneous techniques overcome

the problem **of** **in**formation overload, for **in**stance by the specification **of** multiple levels

**of** abstraction. The part **of** **in**terest is shown **in** detail, whereas the rest is aggregated

accord**in**g to a hierarchical structure **of** the graph [52, 24]. A viewer can set the level **of**

detail manually, or the viewer’s perspective is used to determ**in**e the degree **of** abstraction

automatically [7]. Another option to browse graphs are so-called fisheye views [53].

The amount **of** displayed **in**formation, i.e., typically the amount **of** vertices and edges,

ma**in**ly **in**fluences the required size **of** the screen space resource. A large graph size may

**in**volve the follow**in**g consequences:

• Exceed**in**g the display size. Independent **of** any graph draw**in**g algorithm, the screen

space will always be limited.

• Decreas**in**g readability. If the layout is too dense, it may become impossible to

dist**in**guish between vertices and edges. Vertices may even overlap.

• Decreas**in**g usability. Usability is fairly subjective s**in**ce the cognition depends on

human factors, but **in**formation overload generally h**in**ders cognition.

• Decreas**in**g performance **of** draw**in**g algorithms. Time complexity is crucial, as small

changes **of** the graph must not cause complex computations, **in** order to allow user

**in**teractions and on-l**in**e visualizations.

Comprehension and analysis of a graph visualization become harder as the size of the graph grows. However, as illustrated in Figure 2.2a, a visualization may still reveal the overall structure of a large graph. Several visualization techniques try to remedy the information overload, for instance by navigation-dependent abstractions [7] or the use of non-Euclidean geometry. Considering these techniques is beyond the scope of this work.

2.2 Hypergraphs

2.2.1 Notation

A hypergraph H = (V, E) allows edges of any (non-zero) cardinality. A hypergraph is thus a generalization of an ordinary graph as defined above in Section 2.1.1: a graph is a hypergraph with a constant edge cardinality of 2.

A hyperedge is a subset of the vertices of a hypergraph. The set of hyperedges $E \subseteq V(H)^n$ is a subset of the Cartesian product $V(H)^n$ of the set of vertices. The number of vertices of a hyperedge ε ∈ E is denoted by its cardinality |ε| ≥ 1. An unordered hyperedge ε, i.e., a hyperedge in which the order of the vertices is insignificant, is denoted by ε = {v1, v2, ..., vn} with vi ∈ V(H) for 1 ≤ i ≤ n = |ε|. A hyperedge node is a node v ∈ V(H) that is part of a hyperedge, i.e., v ∈ ε for some ε ∈ E(H). This term is used to distinguish hyperedge nodes from nodes of the hypergraph that are not part of a hyperedge.


This work distinguishes hypergraphs from graphs in the following way. The underlying graph G(H) of a hypergraph H denotes the hypergraph without non-binary edges. G(H) comprises only the binary edges, i.e., the edges of cardinality 2, of the hypergraph H. The set of binary edges that are included in the set of hyperedges of a hypergraph is denoted by $\overline{E}(H)$.

$$G(H) = (V(H), \overline{E}(H)), \qquad \overline{E}(H) = E(H) \cap (V(H) \times V(H)) \qquad (2.4)$$
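Extracting the underlying graph amounts to filtering the hyperedges by cardinality. The sketch below assumes unordered hyperedges stored as `frozenset` objects, so the intersection with V(H) × V(H) in Equation 2.4 becomes a test for exactly two elements; the function name and the example data are hypothetical.

```python
def underlying_graph(vertices, hyperedges):
    """Return the vertex set and only those hyperedges that have cardinality 2."""
    binary_edges = {e for e in hyperedges if len(e) == 2}
    return set(vertices), binary_edges


# Example hypergraph with two binary edges and one hyperedge of cardinality 3.
V = {1, 2, 3, 4}
E = {frozenset({1, 2}), frozenset({2, 3, 4}), frozenset({3, 4})}
```

For this example, `underlying_graph(V, E)` keeps all four vertices and the edges {1, 2} and {3, 4}, while the hyperedge {2, 3, 4} is dropped.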

The same distinction applies to the notion of edges and hyperedges. Edges, i.e., binary relations between nodes, are consistently referred to as “edges”, even though the term hyperedges would also comprise binary edges.

Analogously to the extension **of** the notion **of** graphs, the def**in**ition **of** hypergraphs

can be extended to weighted hypergraphs (V, E, w), and weighted hierarchical hypergraphs

(V, E, w, T ).

Terms of Hypergraph Visualization Undirected and weighted hypergraphs are the subject of the present work. The approaches and techniques of hypergraph visualization proposed below rely on the following notions and notations.

The distance between two nodes, or two points p1 and p2 in general, is the Euclidean distance, i.e., the length of the vector p2 − p1, denoted ‖p2 − p1‖.

A curve is the visualization **of** the relation between two nodes. This notion **in**cludes

b**in**ary edges e ∈ E(G) **of** a graph G = (V, E) as well as constituent parts **of** a hyperedge

ε ∈ E(H). A hyperedge ε consists **of** a set **of** curves C(ε). A curve c ∈ C(ε) is the

connection **of** its two end po**in**ts u and v. start(c) and end(c) therefore denote both end

po**in**ts **of** a curve.

A curve c can be represented by a set ∆(c) of distinguished points. Each distinguished point δ ∈ ∆(c) has two neighbors neighbor(δ) = {δ1, δ2 | δ1, δ2 ∈ ∆(c) ∪ {start(c), end(c)}} on the same curve. Near the ends of curves, one neighbor of δ is an end point of the curve. An end point of a curve has exactly one neighboring distinguished point on the same curve. The set of distinguished points of a hyperedge is abbreviated by

$$\Delta(\varepsilon) = \bigcup_{c \in C(\varepsilon)} \Delta(c)$$

These distinguished points ∆(c) divide the curve c into curve segments connecting neighboring points of ∆(c). The number of curve segments, |∆(c)| + 1, is determined by the number of distinguished points of the curve.

The length len(c) of a straight-line curve c is the Euclidean distance between the end points of the curve. Otherwise, the length of a curve is the sum of the lengths of its curve segments, where the length of a curve segment is the Euclidean distance between the end points of the curve segment.
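The definition of len(c) translates into a short computation. The sketch assumes that the distinguished points are given as an ordered list from start(c) to end(c); the function name and argument order are hypothetical. It requires Python 3.8 or later for `math.dist`.

```python
import math


def curve_length(start, distinguished, end):
    """Sum of the Euclidean lengths of the |distinguished| + 1 curve segments."""
    points = [start, *distinguished, end]
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))
```

With an empty list of distinguished points the function returns the distance between the two end points, which is the straight-line case of the definition.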

A two-dimensional hypergraph layout p of a hypergraph H = (V, E) is a vector of positions $p \in \mathbb{R}^2$. The hypergraph layout p comprises the positions of all points necessary to visualize a hypergraph. Consequently, p contains the positions of the end points and distinguished points of each curve and the positions of the vertices of the hypergraph.


2.2.2 Requirements **of** Hypergraph **Visualization**s

This section clearly states the requirements **of** the envisioned hypergraph visualizations,

which were already outl**in**ed above. Every layout and draw**in**g technique presented **in**

this work must meet these conditions. The requirements thus constra**in** the options and

freedom to design layout techniques.

2.2.2.1 Uniformity

Formally, a hyperedge is a set **of** vertices. This work does not impose an order on nodes

**of** a hyperedge. There is also no weight**in**g **of** a particular subset **of** hyperedge nodes.

Consequently, all vertices **of** a hyperedge are equally related with each other. This entails

that, regardless **of** how the hyperedge nodes are **in**terconnected with each other, visualizations

**of** hyperedges must not emphasize any particular part **of** the connectivity **of** the

hyperedge.

2.2.2.2 Preservation **of** Layout Expressiveness

**Graph** draw**in**gs are able to express **in**formation by the positions **of** nodes. Energy-based

algorithms for **in**stance may arrange nodes with respect to connectivity between vertices as

outl**in**ed **in** Section 2.1.2.1. A layout places densely connected vertices close and sparsely

connected nodes more distant [46]. This characteristic allows viewers to perceive the

relationships among vertices due to the layout without display**in**g the b**in**ary edges. Furthermore,

the overall structure **of** a depicted graph is revealed by a group**in**g **of** nodes **in**

the layout, which reflects the graph cluster**in**g. In order to preserve this expressiveness **of**

the graph draw**in**g, hyperedge visualizations must meet the follow**in**g three requirements.

**Fixed** **Graph** Layout The graph layout is immutable. If nodes were displaced by add**in**g

a hyperedge **in**to the graph layout, then the displayed **in**formation might be significantly

distorted. The connectivity between nodes and the graph cluster**in**g **in** particular are not

properly reflected by an altered graph layout.

Consequently, the visualization **of** hyperedges must not alter the layout, because otherwise

the expressiveness **of** a layout would be compromised and the revealed **in**formation

about the graph would be faulty and thus less useful.

Avoidance **of** Node Occlusion Also, a hyperedge must not hide nodes **of** the graph.

Otherwise the displayed **in**formation is **in**complete, and the draw**in**g’s expressiveness is

faulty and may cause mislead**in**g conclusions. This requirement applies to hyperedges

that are visualized as curves as well as to those visualized as planes. The latter case is

obvious as a plane can cover nodes completely. Edges or curves may not completely cover

the visual representation **of** nodes, but s**in**ce a discussion about the l**in**e width **of** curves is

not part **of** this work, curves **of** the hyperedges must not cover nodes either.


Avoidance **of** Cluster Intersections Curves or slim extents **of** a plane can be mis**in**terpreted

as boundaries and thus visually divide the surround**in**g graph layout. An **in**tersected

cluster is visually partitioned by the hyperedge. Consequently, a hyperedge must

not **in**tersect node clusters that do not conta**in** hyperedge nodes.

2.2.2.3 Visual Complexity

A proper hypergraph drawing offers high readability and allows for quick comprehension of the displayed connectivity between hyperedge nodes. A simple structure of the visualized hyperedge avoids information overload: only a minimal amount of information is shown to the viewer.

2.2.3 Criteria for Hypergraph Visualizations

Criteria are directly derived from the requirements of hypergraph visualizations. They measure the compliance of actual hypergraph drawings with the requirements and are used to evaluate techniques that are presented in the subsequent chapters of this work. These criteria for the quality of hypergraph drawings are also used for the evaluation in Chapter 5 at the end of this work.

Node Occlusion The number of nodes occluded by a hyperedge is a direct measure of the layout quality of hypergraphs. It indicates the amount of visual clutter a hyperedge introduces. A high number of occluded nodes implies a high amount of information that a hyperedge hides or distorts and thus implies a low layout quality. The quality of algorithms that compute the layouts of hyperedges is evaluated with respect to their ability to avoid node occlusions.
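The node occlusion criterion can be sketched as a simple count. The following is a minimal illustration, not the evaluation code of this work: it assumes that nodes are drawn as discs of a common radius, that a curve is approximated by a polyline, and that the end nodes of the hyperedge itself are excluded from the count. The names `count_occluded_nodes` and `node_radius` are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection of p, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def count_occluded_nodes(nodes, curve, node_radius, hyperedge_nodes=()):
    """Count nodes whose disc is touched by the polyline `curve`.

    nodes: dict mapping a node name to its (x, y) position (assumed format).
    curve: list of (x, y) points approximating the curve.
    hyperedge_nodes: names of nodes the curve is allowed to touch.
    """
    allowed = set(hyperedge_nodes)
    occluded = 0
    for name, position in nodes.items():
        if name in allowed:
            continue
        if any(point_segment_distance(position, a, b) <= node_radius
               for a, b in zip(curve, curve[1:])):
            occluded += 1
    return occluded
```

A lower count indicates a better hyperedge layout under this criterion; the disc radius is the only tuning parameter of the sketch.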

Cluster Intersections As mentioned above, hyperedges should not intersect dense groups of nodes. A cluster may only be penetrated to connect a hyperedge node that lies within that cluster. A “good” hypergraph layout does not intersect dense groups of nodes without connecting one of them, and a high number of cluster intersections signifies a bad hypergraph layout.

Visual Complexity The visual complexity of a hyperedge is the amount of displayed information required to visualize the hyperedge. A low visual complexity is preferred since it promises higher readability and quicker comprehension of the displayed relations between hyperedge nodes.

The complexity of a hyperedge is measured by the number or extent of visualized objects, for instance the number of curves, the total length of all curves, or the total circumference of a plane. The surface area of a plane hyperedge visualization is not a valid measure, because its size can be highly restricted by the surrounding graph layout. The notion of visual complexity of hypergraph drawings is elaborated in Section 4.1.
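Two of the measures named above, the number of curves and their total length, can be computed directly once curves are approximated by polylines. The sketch below assumes that representation; the function names are illustrative and not taken from this work.

```python
import math


def curve_length(curve):
    """Length of a polyline given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(curve, curve[1:]))


def visual_complexity(curves):
    """Return the curve count and the total curve length of one hyperedge."""
    return {
        "curve_count": len(curves),
        "total_length": sum(curve_length(c) for c in curves),
    }
```

Both values should be as small as possible; how they are weighted against each other is left open here.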


Stability Predictability [47], also referred to as the stability of layouts, is an important criterion to evaluate layout algorithms. It denotes the stability of calculated graph layouts against minor changes of the graph. Repeated layout computations of the same graph should also lead to the same or a very similar result.

This criterion is especially important for on-line animations of the evolution of a system that is depicted by the graph. For example, to animate the evolution of software systems, i.e., to sequentially visualize a set of source code snapshots with smooth transitions in between, it is crucial to avoid radical changes of graph and hypergraph layouts. A small change of the given graph layout, caused by an added node or by some slightly rearranged nodes, should only have a minor impact on the layout of hyperedges.

This criterion might be considered to evaluate hypergraph visualizations, though it is not a primary concern of this work.

2.2.4 Related Work

Hypergraphs are frequently used in the context of information visualization. This is substantiated by a plethora of multifaceted and well-established applications, for instance the relationships in database systems (e.g., entity relationship diagrams), electrical circuits, dynamic programming (e.g., debugging of Dyna), parallel programming, debugging of makefiles, chemical reactions [25] (vertices represent chemical substances and each hyperedge represents a chemical reaction), and theorem proving.

However, little research has been conducted on the visualization of hypergraphs in particular. The following paragraphs describe the work of other authors related to the visualization of hypergraphs and distinguish their conception from the requirements of the present work. First, two general types of hypergraph visualizations are discussed. Then, the outcomes of published research on topics that do not primarily focus on hypergraphs are covered. These instances utilize hypergraphs in their visualizations but differ too much from the goals of this work.

2.2.4.1 Types of Hypergraph Visualizations

Mäkinen distinguished two general types of hypergraph drawings: edge standard and subset standard [40]. Points in two-dimensional space are utilized to represent the vertices of a hyperedge. The edge standard (Figure 2.4a) connects hyperedge nodes either with straight lines or with smooth curves [38]. The subset standard, as in Figure 2.4b, represents each hyperedge by a closed curve that contains all vertices of the hyperedge.

Subset Standard

Figure 2.4: Hypergraph drawings produced by the PATATE system taken from [14]. (a) Edge standard. (b) Subset standard.

Figure 2.5: Two hypergraph drawings in the subset standard from [14] show that this method obviously suffers from the absence of readability. (a) A “simple” hypergraph drawing [14] with fairly poor readability. (b) The readability problem, as in the left figure, is a prevalent issue.

Bertault and Eades investigated hypergraph drawings in the subset standard [14]. Their PATATE system uses force-directed methods to compute the closed curves, optionally as convex hulls of the subset of vertices. Three structures of the hyperedges were proposed. The first one introduces a dummy node that represents the hyperedge and is connected to all vertices of the hyperedge. The second structure is a minimal Euclidean spanning tree that is built upon the vertices of the hyperedges. Lastly, the third option is a Steiner tree. Steiner points are introduced and connected to the vertices of the hyperedge. In all three cases, the underlying structure of a hyperedge is modeled by introduced vertices and binary edges on these vertices. A force-directed method places this underlying structure in the layout.
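To illustrate the second structure, a minimal Euclidean spanning tree over the vertices of one hyperedge can be computed with Prim's algorithm. This is a generic textbook sketch under the assumption that vertices are given as (x, y) positions; it is not the PATATE implementation, and the function name is hypothetical.

```python
import math


def euclidean_mst(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns the tree as a list of (i, j) index pairs into `points`.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known connection to the tree
    best_parent = [-1] * n       # tree vertex realizing that connection
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u))
        # Update the attachment costs of the remaining vertices.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges
```

The resulting n - 1 binary edges form the underlying structure that a layout method would then place; a Steiner tree would additionally introduce extra points and is not covered by this sketch.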

The Steiner tree produces reasonable layouts for small hypergraphs, but fails for a high number of hyperedges [14]. Unfortunately, these methods do not consider the graph layout at all, and thus occlusions of nodes and cluster intersections were not addressed. Figure 2.5 shows two example hypergraph drawings produced with the PATATE system, which were published in their demo paper [14]. Both suffer from visual clutter introduced by only a few hyperedges in these small graphs. In contrast to these figures, our work investigates approaches that focus on readability and the preservation of the graphs’ expressiveness. Neither issue is addressed by Bertault and Eades. Still, the introduction of vertices that represent a hyperedge is also utilized for hypergraph visualizations in the present work.

Euler Diagrams The complexity of computing hypergraph layouts in the subset standard is similar to the complexity of drawing Venn diagrams. Euler diagrams are a generalization of Venn diagrams. Work on hypergraph layouts in the subset standard often focuses on the planarity of hypergraphs [20]. Mäkinen also mainly considered planarity for hyperedge drawings [40]. Mutton et al. enhanced Euler diagrams with hypergraphs [42] in order to compute Euler diagram layouts. This approach also introduces vertices that constitute hyperedges. Each vertex represents a contour of an Euler diagram and is placed by force-directed methods; Figure 2.6 shows a trivial example. The methods aim at a minimization of edge crossings and of the hypergraph’s total edge length in order to increase the aesthetics and layout quality of the Euler diagrams.

Figure 2.6: An Euler diagram enhanced by a hypergraph, taken from [42]

In contrast to Euler diagrams, hypergraph drawings in the subset standard should avoid intersections and occlusions. Further, this approach is limited to straight-line edges (and to some simply shaped curves). But a visualization of hypergraphs with the goal of generality must allow arbitrarily shaped edges.

Edge Standard

Local browsing is an approach to navigate through large or infinite directed hypergraphs where only a small subgraph is visible at any given time [25]. Real-world graphs are usually too large for static graph drawings, and the local browsing approach might be applicable to several drawings. More important for this work is the technique used to visualize hypergraphs. In [25], Eisner et al. also introduce an intermediate vertex that is connected by directed binary edges to each vertex of the hyperedge. Unfortunately, their method has several limitations that make it unsuitable for the general hypergraph drawing techniques introduced in the present work. For example, the layouts are based on Sugiyama [55], which means that the ordinate of a vertex position is chosen such that upward edges are minimized. A top-to-bottom flow, similar to a tree, with edges represented as splines is, however, much too restrictive for a general approach to hypergraph visualization. The directional flow does not allow force-directed layout methods. Furthermore, edge crossings are not prevented and the hyperedges may occlude the remaining vertices of the graph.


A distinction between directed and undirected hypergraphs as in [38] is not investigated in the present work. The given scenarios and applications of hypergraph visualizations rarely benefit from the additional display of the direction.

2.2.4.2 Implicit Research on Hypergraph Visualizations

The computation of graph and hypergraph layouts is addressed in various areas, such as hardware and software design. Also, diagrams of the information systems life cycle, like PERT charts, data flow diagrams, entity relationship diagrams, and database management system model diagrams [59], are laid out as networks.

An overview of literature on this topic is given by Battista et al. in [12]. The hypergraph layouts are either in the straight-line standard (for instance for the visualization of trees) or in the grid standard (also called orthogonalization) [59]. The latter standard allows vertices and edge bends to be placed only on the tracks of the grid. The edges are fragmented into horizontal and vertical straight-line edge segments.

Orthogonal Standard Network diagrams, PERT charts, and entity relationship diagrams are familiar instances of graph layouts that are usually drawn in the orthogonal standard, which is among the most popular standards [34]. Vertices are represented by rectangular blocks and are placed on intersections of the underlying grid. According to Mäkinen, the orthogonal standard is preferred over any other graph standard, since the perceptual organization of symmetric and aligned patterns is preferred over a free positioning [39, page 441].

The layouts of these diagrams are computed considering aesthetic criteria [51] like a minimization of edge bends, edge crossings, total edge length, maximum edge length, or the establishment of a uniform edge length. The occlusion of blocks (by edges or other blocks) is avoided by choosing other edge paths on the grid or by moving the blocks [36]. However, the limitation of available routes and vertex positions is a restriction of the layout that is not admissible for this work. Further, orthogonal layouts tend to consume more screen space than box-and-line visualizations with unrestricted positioning of vertices and edges.

Hypergraphs and Electrical Circuits Due to the way electrical circuits, such as integrated circuits, are fabricated, their design depends on the orthogonal graph standard as well. The design and visualization of circuits rely heavily on hypergraphs. The gates are represented by vertices and interconnected by several wires. Algorithms that are similar to the ones described above for the orthogonal standard are available to route edges (wires) such that edge crossings and bends are minimized. But the design and the visualization (e.g., for teaching and documentation purposes) of electrical circuits pursue different goals. A visualization usually utilizes two dimensions. In contrast, the circuit is designed on a stack of layers in the third dimension. The wires in circuits are mainly routed with respect to efficiency, whereas a visualization targets less confusing wire routes to improve the clarity of drawings, enable comprehension, and increase readability.

Common visualizations align gates to (horizontal) layers and connect them with orthogonal straight lines [26]. Eschbach et al. assume that the horizontal layers are initially at a good position with minimal hyperedge crossings. Then, the relations between the hyperedge nodes of a hyperedge are aggregated to a single track until a local edge crossing optimum is reached.

The aggregation of the connections of a hyperedge reduces the crossings and the occlusion of the remaining graph objects, which will also be useful for this work. However, node occlusion is not prevented in this visualization technique, since the given gates are only placed on a lattice and the edges are geometrically routed with respect to the discrete positions of the lattice.

Matrix Visualizations Matrix visualizations also reveal the relations between vertices of a graph. The advantage of matrix views is their clear and stable layout, but they are less intuitive, especially in comparison to box-and-line diagrams [30].

A thorough comparison of both diagram types included the comparison of node, path, and subset characteristics. Node characteristics, like the degree of nodes, the discovery of outliers, and labeling, were measured by the average time a concrete task took, e.g., finding the most connected node. The number of links, the existence of a path between two nodes, critical paths, and loops are characteristics of paths. Unfortunately, the characteristics of subgraphs were not measured and compared in [30]. The characteristics of subgraphs, such as the group of nodes connected to a node or the identification of densely connected node groups, would be particularly interesting, because a hyperedge corresponds to a subset of nodes.

The conclusion of the work in [30] is that both visualization types are suitable. Matrices are preferable for large graphs. Path-related tasks are difficult on both types, but viewers are usually more familiar with box-and-line visualizations.

A two-dimensional matrix of vertices indicates a binary edge as an entry in the matrix, possibly encoded by color, shade, or a number. The ordering of the vertices may reflect a hierarchy or a clustering of vertices. The representation of hyperedges would require more dimensions, which can be visualized only to a very limited extent, as any visualization nowadays is bound to a maximum of three dimensions.

Two conceivable matrix visualization types for hypergraphs are discussed now. First, each hyperedge is encoded by a certain color (or shade, or number) in a two-dimensional matrix. Any hyperedge cardinality is allowed. However, each vertex is connected to a maximum of one hyperedge at a time. This also includes binary edges, if they are not omitted. This approach can only present very few edges, which might be impractical for real-world hypergraphs. Second, the matrix is visualized in a 2.5-dimensional layout, i.e., the matrix is laid out in the two-dimensional plane as usual. The binary edges between vertices are indicated in the matrix plane. Additional hyperedges are depicted by dummy nodes in the third dimension. The dummy node of each hyperedge then connects each vertex of the hyperedge. The “Hierarchical Net” by Balzer et al. in software landscapes is similar to this approach and illustrated in Figure 4.5 on page 62.

As a consequence, the fairly intuitive box-and-line visualization of graphs and its habitual usage promise a higher readability of the displayed graph information. After all, the communication of structural information of a graph is the main motivation of information visualization, which can be accelerated by a simple and quick cognition of graph drawings. Consequently, matrix views are not studied further in this work.

2.2.4.3 Box-And-Line Drawings of Hypergraphs

The essential problem of all box-and-line visualizations is the clutter of edges and nodes. Edges intersect other edges and occlude vertices of the graph. Viewers have difficulty identifying vertices and concrete relationships between vertices. This dilemma boils down to the finding that too much information is displayed in the two or three dimensions of a box-and-line visualization.

One option is to generate graph layouts that reflect the graph structure by the positions of the vertices. Edges can be omitted if densely connected vertices are placed closely and sparsely connected vertices are separated [44]. Thus, no visual clutter of edges confuses a viewer. Even if it is not clear which pairs of vertices are connected and which are not, a viewer gets a rough impression of the connectivity of the overall graph. In the majority of cases this will meet a viewer’s demands. This reduction of visual complexity by indicating the connectivity by node positions rather than by drawing edges is discussed in various publications [44, 45, 46] by Noack et al.

In summary, the visualization of hypergraphs with box-and-line diagrams without rigorous limitations, e.g., the dependency on an orthogonal standard, does not seem to have been investigated and published yet.

2.3 Hyperedge Visualization Structures

This section represents a first step towards the computation of hypergraph layouts. Unlike a simple labeling of hyperedge nodes, our notion requires that all nodes of a hyperedge are connected equally, either in the edge standard or in the subset standard. The structures of hyperedge visualizations, denoted as hyperedge structures for short, describe how a set of curves is employed to visualize the connectivity between hyperedge nodes.

The structural fundamentals sketching the connectivity of hyperedges make it possible to lead a viewer’s eyes quickly to all hyperedge nodes. This particular goal rules out a mere labeling of hyperedge nodes. Hence, a set of curves is introduced to visually model the structure of hyperedges. Curves, i.e., the binary connections between hyperedge nodes, must connect all hyperedge nodes equally (cf. uniformity in Section 2.2.2.1).

The remainder of this section introduces and discusses structures of hyperedge visualizations. These different ways to connect hyperedge nodes with each other by curves are derived from basic network topologies like line, ring, mesh, fully connected, star, tree, and bus.

Line and Ring Structure

A chain of hyperedge nodes is not sufficient to uniformly connect them with each other. Each node of the chain has two neighbors (one neighbor at the ends of a chain), and thus an order is implied by the distinguished connectivity between neighbors. There is no justification for any particular sequence among the hyperedge nodes. Consequently, the line and ring topologies are not applied in this work.

Figure 2.7: A set of curves models a fully connected network on five hyperedge nodes

Fully Connected Structure

A mesh is a partially connected network. Again, there is no justification to choose which pairs of hyperedge nodes are joined by a curve and which are not. A fully connected network uniformly connects each pair of hyperedge nodes, as illustrated in Figure 2.7. In contrast to a mesh, a fully connected structure meets our requirements for hypergraph visualizations.
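A minimal sketch of the fully connected structure, assuming that hyperedge nodes are given by their (x, y) positions and that each curve is initially a straight line between two end points. The function name is illustrative.

```python
from itertools import combinations


def fully_connected_curves(hyperedge_nodes):
    """Return one straight-line curve (a pair of end points) per node pair."""
    return [(a, b) for a, b in combinations(hyperedge_nodes, 2)]
```

For the five hyperedge nodes of Figure 2.7 this yields ten curves, one for every unordered pair.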

Star Structure

The star topology is a centralized hyperedge model that particularly takes the uniform connectivity into account. A central point (not a node), denoted by crux in the following, serves as a (network) hub and connects all hyperedge nodes by curves, as illustrated in Figure 2.8. The position of the crux should therefore be central to the hyperedge node positions. For instance, the barycenter or a weighted barycenter can be used.

The crux is not visualized in a hypergraph drawing. Curves connect the crux with all nodes of the hyperedge. No particular connection is distinguished by this centralized structure.
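The centralized structure can be sketched in the same way. The code below places the crux at the (optionally weighted) barycenter of the hyperedge node positions and returns one straight-line curve per node. How weights would be chosen is not specified here, so the `weights` parameter is purely an assumption for illustration.

```python
def barycenter(positions, weights=None):
    """Weighted mean of 2D positions; uniform weights if none are given."""
    if weights is None:
        weights = [1.0] * len(positions)
    total = sum(weights)
    x = sum(w * px for w, (px, _) in zip(weights, positions)) / total
    y = sum(w * py for w, (_, py) in zip(weights, positions)) / total
    return (x, y)


def star_curves(positions, weights=None):
    """Return the crux and one straight-line curve from the crux to each node."""
    crux = barycenter(positions, weights)
    return crux, [(crux, p) for p in positions]
```

The crux is only a construction point of the structure; a renderer following the description above would draw the curves but not the crux itself.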

Figure 2.8: Model of a centralized hyperedge structure. Curves connect all hyperedge nodes with the crux (rectangle).

Tree and Bus Structures

Similar to the line and ring structures of the hyperedge model, a tree and a bus will not fulfill the requirement for uniformity of node connectivity. Nevertheless, both structures can be applied to produce more sophisticated hypergraph layouts. An inhomogeneous distribution of hyperedge nodes in the graph layout with several dense groups of hyperedge nodes is a suitable candidate for the tree or bus structure. Each cluster of hyperedge nodes is interconnected in the centralized or fully connected structure with relatively short curves. The connectivity between clusters is modeled with a tree or bus structure. For instance, a bus connects the crux of each cluster of hyperedge nodes.

Discussion

The requirement of uniform connectivity between hyperedge nodes leaves a choice between the centralized and the fully connected structures to visualize hyperedges. The choice of the hyperedge structure mainly has to focus on the requirement of uniform hyperedge node connectivity, but not on the preservation of the expressiveness of the given graph layouts.

A fully connected structure of a hyperedge visualization suffers from a vast number of curves. A hyperedge ε ∈ E of a hypergraph H with n = |ε| hyperedge nodes is consequently visualized by $\frac{n(n-1)}{2}$ curves. The centralized structure of this hyperedge visualizes n curves and is thus remarkably less complex. Not only the visual complexity, but also the computational complexity of the layout calculation of hyperedges with the centralized structure is lower than the complexity of fully connected structures. Nevertheless, both structures are investigated in the following work and used to evaluate the proposed hypergraph layout techniques. From a viewer’s perspective, it is arguable which structure is superior. Both structures qualify for the purpose of hypergraph drawings.
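As a worked example of the two curve counts, consider the five hyperedge nodes of Figure 2.7 and, as an assumed larger case chosen only for illustration, a hyperedge with 20 nodes:

```latex
\[
n = 5:\quad \frac{n(n-1)}{2} = \frac{5 \cdot 4}{2} = 10 \ \text{curves (fully connected)}
\quad\text{vs.}\quad n = 5 \ \text{curves (centralized)}
\]
\[
n = 20:\quad \frac{n(n-1)}{2} = \frac{20 \cdot 19}{2} = 190 \ \text{curves (fully connected)}
\quad\text{vs.}\quad n = 20 \ \text{curves (centralized)}
\]
```

The ratio of the two counts is $(n-1)/2$, so the gap between the structures grows linearly with the cardinality of the hyperedge.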

The centralized structure also has a disadvantage: the star might not be as stable against slight position changes of graph nodes as the fully connected structure. A displacement of the position of the crux can also impact the layout. The fully connected structure is more stable as it has many more curves. Even if some curves are rearranged, the majority can remain unchanged. However, this criterion for hypergraph visualization is not a focus of this work.


3 Hyperedge Routing

Routing denotes, in the context of this thesis, the process of finding an optimal path that is used for the visualization of the curves of the hyperedge structure. In comparison to the routing performed in networks, the number of paths eligible for curve visualization is potentially unlimited. Consequently, no graph-theoretic routing algorithm for networks can be applied here. Routing of curves more precisely denotes the displacement of curves. Routing techniques move curves to an optimal position regarding the second requirement introduced in Section 2.2.2. An optimal curve position preserves the expressiveness of the graph layout by avoiding node occlusion and cluster intersections. The third requirement, the reduction of the visual complexity, is not a concern of routing techniques, and is thus handled separately in Chapter 4.

Both hyperedge structures used in this work, the centralized and the fully connected structure, represent the underlying skeletal structure of a hyperedge visualization. The input of a routing technique is a hyperedge layout with straight-line curves, as illustrated in Figures 2.7 and 2.8.

In this chapter, two techniques for hyperedge routing are presented. The first one, in Section 3.3, is based on a spatial clustering of the graph layout. Although this approach might satisfy the requirements of hypergraph visualizations, it is not an optimal solution, as will be discussed in Section 3.3.4. The conclusions drawn from the first approach motivate the development of a second, energy-based approach presented in Section 3.4. The latter technique proves to be more suitable, as concluded in Section 3.5.

3.1 Motivation

The routing of the curves of the hyperedge structure is indispensable for preserving a layout’s expressiveness. Any change of the graph layout, i.e., any alteration of node positions, is prohibited. The goal of hyperedge routing is to move curves to the free space between nodes. This chapter introduces two routing techniques that compute paths of curves such that occlusions of nodes and intersections of clusters are prevented. This in turn can give curves enough space to spread width-wise and create a hypergraph drawing in the subset standard afterwards.

Hyperedge routing techniques have to meet two requirements to produce good hypergraph layouts with respect to the preservation of layout expressiveness. First, hyperedges must not occlude nodes, which means that curves cannot be visualized as straight lines. This avoids a loss of the information that a graph layout presents. Second, the hyperedges must not distort the visualized information of the drawn graph, which is achieved by avoiding intersections of clusters. Furthermore, an additional restriction of a routing technique is to avoid moving hyperedges outside of the graph layout area.


3.2 Related Work

Section 2.2.4 already described the approaches of other authors to the visualization of hypergraphs in general. The routing of edges is a more specific technique that becomes indispensable if the edges and hyperedges are visualized by curves.

The most notable differences to other existing edge routing approaches are the fixed graph layout and the requirements to avoid occlusions of nodes that are not part of the hyperedge of current interest and to avoid cluster intersections.

Often, lines are routed geometrically. For instance, the intersections of boxes and lines in network diagrams are calculated geometrically. Then, the intersections are minimized by rearranging boxes and lines within the grid. The minimization of edge crossings, edge bends, et cetera [51] suffers from high computational complexity. In addition, such techniques rely on testing the limited number of discrete positions that a grid layout offers [36].

Besides this geometrical way of routing, there is also the topological way to route curves [13]. Bazylevych models the available routing area between nodes as triangular or rectangular faces. Various parameters are required to describe the topology of routes or channels through faces that can be used for routing. This technique produces arbitrarily shaped curves. But it is not capable of considering the local neighborhood (beyond direct neighbor nodes) of a node, which is necessary to avoid cluster intersections.

Other routing techniques depend on additional metadata of graphs. Holten uses base points derived from a graph’s hierarchy to route edges [32]. An edge routing technique that is specific to clustered graphs is proposed in [8]. The latter method determines the base points of a curve’s route from the visualization of shapes that represent the cluster bounds. Both approaches are limited to certain types of graphs. In this thesis, no particular type of graph is assumed, in order to develop a more general routing approach for hyperedges.

3.3 Routing Based on Cluster Bounds

In this section, we present an approach to route curves such that no clusters are intersected (except for the clusters of the curve’s end points), given that we computed a spatial clustering of the graph in a preprocessing step. Using this approach, the avoidance of node occlusion is implicitly achieved as well. Afterwards, it remains to solve the problem of node occlusions within the clusters of the end points of the curves.

The presented curve routing algorithm uses a spatial clustering of the graph layout and determines cluster bounds. Curves that intersect cluster bounds have to be rearranged until the intersections are eliminated. Although a clustering is necessary, this section presents this approach in order to discuss such an obvious curve routing approach that is based on geometric information.

3.3.1 Clustering of Graph Layouts

The first step is the computation of a spatial clustering. Depending on the position of nodes, each node of the graph is assigned to one cluster. A spatial clustering ensures that closely placed nodes belong to the same cluster. A density-based spatial clustering algorithm finds contiguous areas with a high density of nodes. The Euclidean distance of the graph nodes in the layout serves as the distance measure. However, such an algorithm does not necessarily assign every node to a cluster. The handling of outliers is crucial: an outlier is either assigned to the nearest cluster or it constitutes its own cluster.

While s**of**tware systems **of**ten feature a hierarchy that implies a (hierarchical) graph

cluster**in**g, such a graph cluster**in**g usually is not compliant to the spatial distribution

of vertices in the graph layout. Energy-based graph layout algorithms, like the LinLog energy
model from Section 2.1.2.1, are used to produce layouts that reflect the graph

cluster**in**g [46]. Then, a graph cluster**in**g, e.g., derived by a generalization **of** Mark Newman’s

Modularity measure [43], can be similar to the spatial cluster**in**g.

3.3.2 Solid Cluster Bounds

The follow**in**g paragraphs **in**troduce the rationale **of** a divide and conquer rout**in**g algorithm

that is based on cluster bounds. Three steps remove **in**tersections **of** curves and clusters.

1. Compute Cluster Bounds Assum**in**g a cluster**in**g is available, the first step is the

computation **of** cluster bounds. They allow to determ**in**e whether a po**in**t **in** the graph

layout area is **in**side or outside **of** the area occupied by a cluster. The cluster bound is a

polygon that encloses all nodes **of** a cluster. Several well-known algorithms, e.g., discussed

**in** [49, Chapter 3] and [9, 17], can compute a m**in**imum bound**in**g box or the convex hull

for a set **of** nodes **of** a cluster.

A polygon is preferred over other figures s**in**ce a polygon is simply represented by a set

**of** l**in**e segments, **of**fers arbitrary accuracy to describe the cluster area, and allows easy

computations **of** **in**tersections.
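As an illustration of the first step, the following minimal sketch computes a convex hull as a polygonal cluster bound. It uses Andrew's monotone chain, which is one of several well-known hull algorithms and is not necessarily the one used in the cited literature; the names `cross` and `convex_hull` are chosen here for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop points that would create a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


# Node positions of one cluster; the inner node (1, 1) is not part of the bound.
cluster_nodes = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
bound = convex_hull(cluster_nodes)
```

The resulting list of hull vertices directly yields the line segments of the cluster bound by connecting consecutive vertices and closing the polygon.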

2. Detect Intersections To detect cluster **in**tersections and potential node occlusions,

the po**in**ts **of** **in**tersections **of** curves and cluster bounds are calculated. The polygons are

described by a set **of** l**in**e segments. Initially, the curves are also straight l**in**e segments.

So the computation **of** **in**tersections boils down to the computation **of** **in**tersection **of** two

line segments in the two-dimensional space. Two points, the entry point $p_{in} \in \mathbb{R}^2$ and the exit point
$p_{out} \in \mathbb{R}^2$ of the curve on the cluster bound, are calculated for each intersection.
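The intersection test can be sketched as follows. This is a minimal example under simplifying assumptions: parallel and collinear segments are treated as non-intersecting, a curve passing exactly through a polygon vertex may be reported twice, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def segment_intersection(a: Point, b: Point, c: Point, d: Point,
                         eps: float = 1e-12) -> Optional[Point]:
    """Intersection point of segments ab and cd, or None if they do not cross."""
    rx, ry = b[0] - a[0], b[1] - a[1]
    sx, sy = d[0] - c[0], d[1] - c[1]
    denom = rx * sy - ry * sx          # cross product r x s
    if abs(denom) < eps:               # parallel or collinear: ignored in this sketch
        return None
    qx, qy = c[0] - a[0], c[1] - a[1]
    t = (qx * sy - qy * sx) / denom    # parameter along ab
    u = (qx * ry - qy * rx) / denom    # parameter along cd
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return (a[0] + t * rx, a[1] + t * ry)
    return None


def entry_and_exit(start: Point, end: Point,
                   polygon: List[Point]) -> Optional[Tuple[Point, Point]]:
    """Entry and exit point of the straight curve start-end on a cluster bound."""
    hits = []
    n = len(polygon)
    for i in range(n):
        hit = segment_intersection(start, end, polygon[i], polygon[(i + 1) % n])
        if hit is not None:
            hits.append(hit)
    if len(hits) < 2:
        return None
    # Order the hits by their distance from the start of the curve.
    hits.sort(key=lambda h: (h[0] - start[0]) ** 2 + (h[1] - start[1]) ** 2)
    return hits[0], hits[-1]


square = [(1.0, -1.0), (3.0, -1.0), (3.0, 1.0), (1.0, 1.0)]
crossing = entry_and_exit((0.0, 0.0), (4.0, 0.0), square)
```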

3. Elim**in**ate Intersections An **in**tersection **of** a curve and a cluster bound is removed

by rearranging the part of the curve between entry and exit point. The center point $c_0$
between entry and exit points is moved away from the cluster area, into the free space.

$$c_0 = p_{in} + \frac{p_{out} - p_{in}}{2} = \frac{1}{2} \begin{pmatrix} p^x_{in} + p^x_{out} \\ p^y_{in} + p^y_{out} \end{pmatrix} \qquad (3.1)$$

Let $m$ be orthogonal to the direction $p_{out} - p_{in}$ of the initial curve.

$$m = \begin{pmatrix} p^y_{in} - p^y_{out} \\ p^x_{out} - p^x_{in} \end{pmatrix} \qquad (3.2)$$


The center point of the curve segment is moved in the direction $m$ or $-m$ until it is no
longer inside the cluster. The new location of the center point is denoted $c_0'$. The exact
distance of $c_0'$ from the cluster bound is determined by a predefined parameter. The choice
of the direction, i.e., $m$ or $-m$, depends on which direction promises a shorter detour of
the curve.

After the center point $c_0$ has been moved out of the cluster area (to $c_0'$), the curve may
still intersect the cluster. The curve is therefore divided into two parts. The first part
connects one end point of the curve with the displaced center point $c_0'$ by a straight line. The
second part connects $c_0'$ with the other end of the curve. The intersections of both parts of
the curve are computed and eliminated as described above until there are no intersections
between this curve and this cluster left.
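A single displacement step of the center point can be sketched as follows. The sketch makes several assumptions that the text leaves open: the center is moved in small fixed steps, the "shorter detour" is approximated by the smaller displacement along $m$ or $-m$, and `margin` stands for the predefined distance parameter. The recursive treatment of the two resulting curve parts is omitted.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def inside(pt: Point, polygon: List[Point]) -> bool:
    """Ray-casting point-in-polygon test."""
    x, y = pt
    result = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                result = not result
    return result


def displaced_center(p_in: Point, p_out: Point, polygon: List[Point],
                     margin: float, step: float = 0.01) -> Point:
    """Move the center between entry and exit point out of the cluster bound."""
    c0 = ((p_in[0] + p_out[0]) / 2.0, (p_in[1] + p_out[1]) / 2.0)
    mx, my = p_in[1] - p_out[1], p_out[0] - p_in[0]   # orthogonal to p_out - p_in
    length = math.hypot(mx, my)
    if length == 0.0:
        raise ValueError("entry and exit point coincide")
    ux, uy = mx / length, my / length
    best_dist, best_sign = None, 1.0
    for sign in (1.0, -1.0):
        dist = 0.0
        while inside((c0[0] + sign * dist * ux, c0[1] + sign * dist * uy), polygon):
            dist += step
        if best_dist is None or dist < best_dist:
            best_dist, best_sign = dist, sign
    total = best_dist + margin                       # keep the predefined distance
    return (c0[0] + best_sign * total * ux, c0[1] + best_sign * total * uy)


rectangle = [(1.0, -1.0), (3.0, -1.0), (3.0, 3.0), (1.0, 3.0)]
c0_new = displaced_center((1.0, 0.0), (3.0, 0.0), rectangle, margin=0.5)
```

In this example the lower edge of the rectangle is closer to the center than the upper edge, so the center is pushed downwards and ends up roughly 0.5 below the bound.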

Aftermath This algorithm assures that curves will not **in**tersect clusters anymore. S**in**ce

clusters are not **in**tersected and each node is affiliated with a cluster, no node occlusions

**in** clusters except for those **of** the curves’ end po**in**ts are possible. The algorithm can be

extended to f**in**d curve paths with a m**in**imal total detour. All possible paths have to be

computed **in** advance or a backtrack**in**g strategy can be applied.

The distance between the displaced center po**in**t **of** a curve segment and the cluster

bound has to be predef**in**ed. However, there are no uniform scale units applicable to all

graph layouts and it is unclear what distance is desirable. Specific graph layout measures

like the average or the m**in**imal node distance may help to set a proper distance between

routed curves and clusters.

3.3.3 Floating Cluster Bounds

The def**in**ition **of** an optimal distance between a routed curve and the cluster bound is

**in**flexible. It also prevents the production **of** visually pleas**in**g routes, s**in**ce the routed

curves will always have the same distance from the cluster bound. Float**in**g bounds are,

**in** contrast to solid bounds, not explicitly spatially determ**in**ed. They can be obta**in**ed by

the utilization **of** techniques like implicit surfaces. Implicit surfaces are used to compute

representations **of** molecular surfaces or splash**in**g water. They were also applied for a

visually simplified representation **of** vertex clusters [7]. Figure 3.1a depicts the pr**in**ciple

and Figure 3.1b shows an application **of** implicit surfaces.

The approach us**in**g float**in**g cluster bounds considers clusters **in** the near proximity

**of** a curve and will move the routed curves to the middle **of** the free space among two

neighbor**in**g clusters. This approach also follows a divide and conquer scheme and consists

**of** the follow**in**g two steps:

1. Compute Cluster Bounds A cluster creates an energy field that encloses the entire

cluster and is composed by its generator po**in**ts **in** all cluster nodes. A float**in**g cluster

bound is described by the equipotential l**in**es **of** the energy field as depicted **in** Figure 3.1a.

A customizable threshold parameter adjusts the **in**fluence radii **of** the energy fields and

consequently the positions **of** the cluster bounds. The threshold assures the avoidance **of**

node occlusion as it implies a blocked area that can not be used to route curves.

Figure 3.1: Implicit surfaces in the two-dimensional space, taken from [7]. (a) Overlapping energy fields of implicit surfaces. (b) Implicit surfaces create a visually simplified representation of vertex clusters.

2. Elim**in**ate Intersections Intersections between curves and cluster bounds are not

calculated. A curve is moved from its **in**itial position towards the next accessible valley **of**

total energy. The curve is therefore halved and the center **of** the curve is moved **in**to the

direction that promises the largest reduction **of** the energy. The negative gradient **of** the

energy **of** these fields determ**in**es the direction a curve is moved.

By this, a curve is not moved across other clusters, because every movement has to

reduce the energy. F**in**ally this approach places curves, more specifically the considered

central po**in**t **of** a curve, **in** the middle **of** the valley with low energy between clusters.

Further intersections of the routed curves with the cluster bounds can be eliminated by
applying this approach to the curve segments to the left and to the right of the central
point.

Consequently, there is no need to define absolute distance values as required for the
approach based on solid cluster bounds.

3.3.4 Conclusion

A thorough exam**in**ation **of** the two approaches **of** a rout**in**g technique based on cluster

bounds leads to the conclusion that both **of** them are not applicable for the given purpose.

The first algorithm, based on solid cluster bounds, is a geometrical approach. The computation
and elimination of intersections can become fairly complex as it has to consider
many cases, e.g., odd contours of cluster bounds and overlapping cluster bounds. Furthermore,
the need to define fixed geometric values ultimately makes this algorithm inapplicable
for this purpose.

The approach based on float**in**g cluster bounds remedies the need to def**in**e fixed values

and also produces more visually pleasing curve layouts, since neighboring clusters are

considered. However, the placement **of** curves **in** the middle between clusters does not

support a restriction **of** the extension **of** the curves. Both algorithms can produce very


long curves on **in**appropriate paths between the clusters if large areas **of** the graph layout

area are blocked. For example, the surfaces **in** Figure 3.1a block the entire area between

the objects if the threshold is too high.

Overall, the rout**in**g **of** curves based on float**in**g cluster bounds is preferable over solid

cluster bounds. The concept **of** float**in**g cluster bounds is now further ref**in**ed to avoid

the block**in**g **of** large areas **of** the graph layout area. The follow**in**g section **in**troduces an

energy-based routing technique that similarly uses energy fields, though they originate at

each **in**dividual vertex **of** the graph.

3.4 Energy-Based Routing

The cluster-bound-based rout**in**g technique us**in**g float**in**g bounds revealed that an energy

field should on the one hand be as f**in**e-gra**in**ed as possible to avoid a wastage **of** layout

space and to remedy the problem **of** overlapp**in**g clusters. On the other hand, a method

based on energy fields must also operate **in** a coarse-gra**in**ed manner, where a dense group

**of** nodes creates an energy field by the superposition **of** their **in**dividual energy fields, to

avoid cluster **in**tersections. The energy-based rout**in**g technique **in**troduced **in** this section

is similar to the routing approach that is based on floating bounds, and will extend it

to meet all requirements **of** hypergraph visualizations.

An energy-based rout**in**g technique utilizes energy models, similar to the energy-based

computation **of** graph layouts such as **in** [46], to compute the position **of** curves and thus

to draw hyperedges **in** fixed graph layouts. It is applicable to both hyperedge structures,

the centralized and the fully connected structure.

The gist **of** energy-based rout**in**g is that nodes create energy fields that repulse curves

and move them to free space **of** the graph layout area. The arrows **in** Figure 3.2a depict

such a repulsion. The repulsion impedes the occlusion **of** nodes by curves. The hypergraph

layout, i.e., the positions **of** the curves, is computed by energy m**in**imization algorithms

that move curves to positions with (locally) m**in**imal energy. Sett**in**g the energy field

strength very high at positions **of** the repuls**in**g nodes, and still not low **in** the close

proximity **of** **in**dividual nodes or clusters, energy m**in**imization algorithms are able to

route curves properly.

Individual energy fields are overlapp**in**g each other and, accord**in**g to the superposition

pr**in**ciple, create a compound repulsion by clusters. Energy m**in**imization algorithms that

f**in**d energy m**in**ima **in** the graph layout area move curves away from nodes and likewise

push them out **of** node clusters. Consequently, the **in**tersection **of** clusters is also prevented

by such a repulsion.

The rema**in**der **of** this section is organized as follows. In the beg**in**n**in**g, a simplified

representation **of** curves **in** energy fields is discussed. Curves need to be approximated by

a f**in**ite set **of** po**in**ts, as depicted **in** Figure 3.2b, which serve as receptors **of** forces act**in**g on

curves. Next, an energy model for energy-based routing of curves is introduced in the three

Sections 3.4.2 through 3.4.4. In do**in**g so, the repulsion energy is formalized **in** a first step.

Then, **in** compliance to Newton’s third law **of** motion, i.e., the law **of** reciprocal actions, an

opposite force that prevents **in**f**in**ite stra**in** **of** curves is **in**troduced **in** Section 3.4.3. This

force is illustrated by the wide arrows overlaying the curve in Figure 3.2c. By employing
this opposite force, curves are modeled as elastic bands. Section 3.4.4 combines both forces
and discusses the equilibrium created by this composition. A conclusion on this
energy-based routing technique closes this section.

Figure 3.2: The rationale of energy-based routing. (a) Repulsion force acting on a curve. (b) Repulsion force acting on dummy nodes. (c) Opposite force. Legend: node repulsion force, routed curve, dummy node attraction.

3.4.1 Modeling of Curves

Figure 3.2a depicts the repulsion act**in**g on a curve, more precisely the arrows **in** this figure

represent direction and magnitude **of** the repulsion forces. All nodes **of** the graph, except

for the two hyperedge nodes at the end **of** the curve, create an energy field.

A problem arises because energy-based layout algorithms cannot process bodies,
i.e., an infinite number of points. Newton already described the motion of bodies by
suitably chosen points of the body. Accordingly, a curve is approximated by a finite set
of considered points. This approximation of the curve model is necessary to allow the
computation of the impact of forces on curves and the resulting displacement of curves.

Dummy Nodes The dist**in**guished po**in**ts that are used to approximate a curve are denoted

dummy nodes **in** the follow**in**g. They are receptors **of** the repulsion, do not affect

the rema**in****in**g layout, and are not visualized later. Dummy nodes are used to compute the

energy at these po**in**ts and thus to compute the **in**fluence **of** the surround**in**g graph layout


on this part **of** the curve. Figure 3.2b depicts the same graph layout and the same curve

from Figure 3.2a, but the repulsion now only acts on the dummy nodes that model this

curve.

3.4.1.1 Curve Model Fidelity and Accuracy

The fidelity **of** the curve model reflects the number **of** dummy nodes that model a curve.

The fidelity and thus the accuracy **of** curves **in**creases with **in**creas**in**g number **of** dummy

nodes per curve. A homogeneous distribution **of** dummy nodes on the curves is assumed

as an uneven distribution would not permit the prediction **of** the accuracy **of** curves, the

probability to occlude nodes and **in**tersect clusters, and thus the quality **of** the rout**in**g

result.

The fidelity **of** the curve model and the accuracy **of** curves correlate with each other.

The accuracy denotes the quality **of** a modeled curve and the potential rout**in**g result **in**

consequence as it is **in**fluenced by the distance between neighbor**in**g dummy nodes. A

large distance between neighbor**in**g dummy nodes **of** a curve is imprecise and features low

accuracy and rout**in**g quality. In contrast to the accuracy, the fidelity does not consider

the distance between neighbor**in**g dummy nodes s**in**ce it is a measure **of** the number **of**

dummy nodes per curve.

A low fidelity **of** the curve model allows a very fast hypergraph layout computation. A

hypergraph layout with higher fidelity **of** the curve model needs more computational effort,

but also promises a higher hypergraph layout quality with respect to the requirements.

Only at the positions **of** the dummy nodes, node occlusions can be def**in**itely prevented.

The follow**in**g paragraphs **in**troduce three different approaches to def**in**e the number **of**

dummy nodes that are used to model each curve.

Fixed Number of Dummy Nodes If a fixed number $|\Delta| = |\Delta(c)|$ of dummy nodes is
used to model each curve $c$ of a hyperedge, then the distance $d_\delta(c)$ between neighboring
dummy nodes differs for different curves as it depends on the length $\mathrm{len}(c)$ of the curve $c$.

$$d_\delta(c) = \frac{\mathrm{len}(c)}{|\Delta| + 1}$$

The accuracy **of** the curves may differ for different curves although they have the same

fidelity. Thus each curve will meet the requirements **of** hypergraph visualizations differently.

A fixed number **of** dummy nodes is not appropriate for the requirements **of** this

work.

Fixed Dummy Node Distance A fixed distance $d_\delta$ between neighboring dummy nodes
of all curves overcomes this shortcoming. The actual distance $d_\delta(c)$ varies slightly for each
curve as the curve's length $\mathrm{len}(c)$ is not always divisible without remainder by the integer
number of curve segments $|\Delta(c)| + 1$. However, this method ensures an approximately
equal accuracy of each curve.

$$|\Delta(c)| = \mathrm{rd}\left(\frac{\mathrm{len}(c)}{d_\delta}\right) - 1$$

$$d_\delta(c) = \frac{\mathrm{len}(c)}{|\Delta(c)| + 1} \cong \mathrm{const}$$

The distance $d_\delta$ between neighboring dummy nodes of a curve has to be predefined.
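The two schemes described so far can be compared numerically with a small sketch. It assumes that rd denotes rounding to the nearest integer (half rounded up) and clamps the number of dummy nodes at zero for very short curves; both choices are assumptions made for this example, as are the function names.

```python
import math


def spacing_fixed_count(length: float, count: int) -> float:
    """Distance between neighboring dummy nodes for a fixed number of dummy nodes."""
    return length / (count + 1)


def count_fixed_distance(length: float, target: float) -> int:
    """Number of dummy nodes for a desired distance `target` between them."""
    rounded = math.floor(length / target + 0.5)   # rd(.) assumed to round half up
    return max(0, rounded - 1)                    # clamp is an assumption of this sketch


def spacing_fixed_distance(length: float, target: float) -> float:
    """Actual distance between neighboring dummy nodes, close to `target`."""
    return length / (count_fixed_distance(length, target) + 1)


# Fixed number: a short and a long curve get very different spacings.
short_spacing = spacing_fixed_count(10.0, 4)
long_spacing = spacing_fixed_count(100.0, 4)

# Fixed distance: both curves get a spacing close to the target of 3.0.
short_actual = spacing_fixed_distance(10.0, 3.0)
long_actual = spacing_fixed_distance(100.0, 3.0)
```

With four dummy nodes per curve the spacing grows tenfold with the curve length, whereas the fixed-distance scheme keeps the spacing within a narrow range around the target.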

Incrementally Increas**in**g Number **of** Dummy Nodes Us**in**g this approach, all curves

always have an equal fidelity. To avoid the specification **of** fixed values **of** the aforementioned

approaches and to ga**in** more flexibility, the number **of** dummy nodes model**in**g a

curve is **in**creased dur**in**g the execution **of** the rout**in**g algorithm. Energy m**in**imization

algorithms usually compute the layouts iteratively. So, after a couple **of** iterations **of** the

energy m**in**imization algorithm, the number **of** dummy nodes is **in**creased. Then, an energy

m**in**imization algorithm cont**in**ues the calculation **of** the routes us**in**g a more complex and

accurate model **of** curves.

Initially, a straight-l**in**e curve connects the two end po**in**ts, e.g., two hyperedge nodes. A

curve is **in**itially modeled by one dummy node that is placed at the curve’s center between

both end po**in**ts. Two straight-l**in**e curve segments connect the dummy node with the end

po**in**ts **of** the curve. An energy m**in**imization algorithm is applied and the dummy nodes

of curves are rearranged. After some iterations, ideally when the dummy nodes have been
moved to local energy minima, the number of dummy nodes of each curve is increased.

The **in**troduction **of** new dummy nodes halves each curve segment. The position **of** newly

**in**troduced dummy nodes is the center **of** the end po**in**ts **of** the respective curve segment.

Algorithm 3.1 clarifies this procedure. The total number of iterations n of the energy-based
routing algorithm is divided into phases with distinct fidelity of the curve model.
The phases in Algorithm 3.1 are equally sized, i.e., each phase
consists of the same number of iterations.

Input: number of iterations n, final curve model fidelity f_f, set of curves curves,
       hyperedge layout p
Output: hyperedge layout p

f ← 0;
for i ← 1 to n do
    if f < ⌈(i / n) · f_f⌉ then
        f ← f + 1;                                  // increase current fidelity f
        foreach curve in curves do
            foreach curve segment in curve do       // halve each curve segment
                δ ← newDummyNode(center(curve segment));
                p(δ) ← center(curve segment);
    p ← minimizeEnergy(p);

Algorithm 3.1: Energy-based routing of curves in conjunction with an incrementally
increasing curve model fidelity
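The control flow of Algorithm 3.1 can be sketched in Python as follows. The sketch represents a curve as a list of points (end point, dummy nodes, end point) and takes the energy minimization as a caller-supplied function, since its details are introduced later; both are assumptions of this example. The fidelity is raised whenever it falls below $\lceil i \cdot f_f / n \rceil$, so that each of the $f_f$ equally sized phases starts with a refinement.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Curve = List[Point]   # [end point, dummy nodes..., end point]


def refine(curve: Curve) -> Curve:
    """Halve every curve segment by inserting a dummy node at its center."""
    refined = [curve[0]]
    for a, b in zip(curve, curve[1:]):
        refined.append(((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0))
        refined.append(b)
    return refined


def route(curves: List[Curve], n: int, final_fidelity: int,
          minimize_energy: Callable[[List[Curve]], List[Curve]]) -> List[Curve]:
    """Iterate n times, raising the curve model fidelity phase by phase."""
    f = 0
    for i in range(1, n + 1):
        target = (i * final_fidelity + n - 1) // n   # integer ceil(i * f_f / n)
        if f < target:
            f += 1                                   # increase current fidelity f
            curves = [refine(c) for c in curves]
        curves = minimize_energy(curves)             # one minimization iteration
    return curves


# Placeholder minimizer that leaves all dummy nodes where they are.
def identity(cs: List[Curve]) -> List[Curve]:
    return cs


routed = route([[(0.0, 0.0), (8.0, 0.0)]], n=6, final_fidelity=3,
               minimize_energy=identity)
```

With the placeholder minimizer, the single curve ends up with $2^3 - 1 = 7$ evenly spaced dummy nodes between its two end points.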


The incremental approach defines the fidelity $f$ of the curve model as follows. Let the
initial fidelity $f = 1$ denote that the number $|\Delta_{f=1}(c)| = 1$ of dummy nodes per curve is 1.
Due to the assumed homogeneous distribution of dummy nodes on each curve, $|\Delta_f(c)|$ dummy
nodes divide a curve into $2^f$ equally sized curve segments. The number of dummy nodes
of a curve $c$ that is modeled with a curve model fidelity $f$ is defined as:

$$|\Delta_f(c)| = 2^f - 1, \qquad d_\delta(c) = \mathrm{len}(c) \cdot 2^{-f} \qquad (3.3)$$

The distance dδ(c) between neighbor**in**g dummy nodes still depends on the curve length

len(c), but the ability to **in**crease the fidelity f guarantees that a desired level **of** accuracy

can be obta**in**ed for all curves. The growth **of** the fidelity implies an exponential rise **of**

the number **of** dummy nodes.
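The relation between fidelity, number of dummy nodes, and dummy node distance can be made concrete with a short sketch. The helper `fidelity_for_distance`, which searches for the smallest fidelity that reaches a desired dummy node distance, is a hypothetical addition for illustration and not part of the routing algorithm itself.

```python
def dummy_node_count(f: int) -> int:
    """Number of dummy nodes of a curve modeled with fidelity f."""
    return 2 ** f - 1


def dummy_node_distance(length: float, f: int) -> float:
    """Distance between neighboring dummy nodes for curve length and fidelity f."""
    return length * 2.0 ** (-f)


def fidelity_for_distance(length: float, max_distance: float) -> int:
    """Smallest fidelity whose dummy node distance does not exceed max_distance."""
    if max_distance <= 0.0:
        raise ValueError("max_distance must be positive")
    f = 1
    while dummy_node_distance(length, f) > max_distance:
        f += 1
    return f


needed = fidelity_for_distance(100.0, 5.0)
nodes = dummy_node_count(needed)
```

For a curve of length 100 and a desired distance of at most 5, a fidelity of 5 is needed, which already requires 31 dummy nodes. This illustrates the exponential growth mentioned above.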

3.4.1.2 Benefits of an Incremental Approach

In summary, the benefits **of** **in**crementally **in**creas**in**g fidelity over the two earlier approaches

for sett**in**g the number **of** dummy nodes are:

• User-def**in**ed, theoretically arbitrarily high accuracy, s**in**ce there are no fixed limitations

specified.

• An early abort promises a quickly available approximation **of** hypergraph layout with

lower fidelity.

• High computation effort promises accurate hypergraph layouts with a high probability

to preserve the layout’s expressiveness.

• The fidelity **of** the curve model can be **in**creased even after a first hyperedge rout**in**g

was f**in**ished. The rout**in**g technique may cont**in**ue with **in**creased fidelity to further

smooth the paths **of** the curves.

• As already mentioned, the ability to produce on-l**in**e animated visualizations [24] is,

although beyond the scope **of** this work, very important for future applications **of** the

introduced routing technique. Then, a user could abort the computation as soon as
the layout meets the user's requirements. If further curves

are added to an already routed hyperedge, then the **in**cremental approach **in**stantly

produces a rough approximation **of** the updated hyperedge visualization and with

**in**creas**in**g curve model fidelity the curve converges towards its f**in**al path.

Consequently, this work utilizes this **in**cremental technique to model curves **in** energy

fields.

3.4.2 Repulsion

Now that the curve model allows to compute the impact **of** forces on curves, the energy

model for the energy-based curve rout**in**g technique is motivated and formalized **in**

the follow**in**g sections. This section therefore **in**troduces the node repulsion that avoids

node occlusion and cluster **in**tersections by curves. The follow**in**g Section 3.4.3 **in**troduces

another energy and both are composed **in**to one consistent energy model **in** Section 3.4.4.


Each node creates a repuls**in**g energy field centered at the node’s layout position. The

impact **of** all repuls**in**g energy fields on a particular position **of** the graph layout is determ**in**ed

by the sum **of** the **in**dividual energy fields (superposition pr**in**ciple). The result

**of** the energy-based rout**in**g technique are curves curl**in**g around the nodes and clusters.

Thus, node occlusion and cluster **in**tersections are avoided.

Dummy nodes are repulsed by all nodes **of** the graph, except for the nodes that are

part **of** the hyperedge and serve as end po**in**ts **of** the respective curve, cf. Figure 3.2b

on page 35. An energy m**in**imization algorithm, e.g., the hierarchical force-based Barnes-

Hut algorithm [10], is used to move curves to an area with low repulsion energy **in** several

iterations **of** dummy node displacements. A stable state (equilibrium) is atta**in**ed when the

system’s total energy is locally m**in**imal and each dummy node experiences an equilibrium

**of** forces.

3.4.2.1 Formalization of the Repulsion

Initially, curves are straight-l**in**e connections without consideration **of** the graph layout.

Then, a very high repulsion at the position **of** nodes moves the dummy nodes away from

the nodes. The strength **of** the energy field dim**in**ishes with **in**creas**in**g distance from the

node. Therefore, a theoretically optimal position of a routed curve is as far as possible

away from repuls**in**g nodes; an optimal position **of** the dummy nodes is characterized by

a m**in**imal sum **of** repulsion energies.

The repulsion on a dummy node $\delta$, caused by a node $v \in V$ of a graph $H = (V, E)$, is
basically determined by the following factors:

• Position $p_\delta$ of the dummy node $\delta$
• Position $p_v$ of the node $v$
• Weight $w_v$ of the node $v$

These given parameters lead to the specification of a repulsion energy that is suitable
for routing curves accordingly. A node $v \in V$ of the underlying graph $G = (V, E)$ of a
hypergraph $H = (V, E)$ creates a repulsion energy field $E_R(d)$ centered at $p_v$ in a given
graph layout $p$. The energy $E_R(d)$ is described as a function of the distance $d$ from its
source $p_v$.

The routing algorithm calculates the repulsion only at the positions of the dummy nodes.
The distance $d = \|p_\delta - p_v\|$ is the Euclidean distance between the source of the repulsion
and the position $p_\delta$ of a dummy node $\delta$ of a curve. The absolute value of the repulsion
energy is characterized by the following requirements:

• The repulsion energy is maximal at the position of a node, i.e., $d = 0$. It has to
be larger than all other occurring energies to eliminate node occlusion. Defining
$|E_R(0)| := \infty$ is sufficient.
• The repulsion energy decreases with increasing distance from the source of the energy
field.
• The repulsion energy converges to zero at an arbitrarily large distance, i.e.,
$\lim_{d \to \infty} E_R(d) = 0$.

Figure 3.3: The absolute values of different repulsion energies $E_R(d)$ against the distance
$d$ from a repulsing graph node for various repulsion exponents $r \in \{0, -1, -2\}$

Similar to energy models used for an energy-based computation of graph layouts [21,
46, 35], the repulsion force can be characterized by a function similar to $F_R \approx \frac{1}{d}$. Further,
the weight $w_v$ of a node $v \in V$ influences the magnitude of the repulsion. This allows
the repulsion to be adjusted with respect to particular properties of a vertex, for instance its
degree or the size of its visualization.

The energy-based rout**in**g approach utilizes this concept **of** iterative node displacements.

Nodes repulse curves, or more precisely the dummy nodes **of** the curve model.

Repulsion Energy The repulsion energy $E_R(d)$, caused by a node $v \in V$, on a dummy
node $\delta$ of a curve at the position $p_\delta$ in a given graph layout $p$ is defined as shown in
Equation 3.4. The distance between node $v$ and dummy node $\delta$ is denoted as $d$, i.e.,
$d = \|p_v - p_\delta\|$.

$$E_R(d) = \begin{cases} -\dfrac{w_v}{r} \cdot \|p_v - p_\delta\|^r & \text{if } r < 0 \\ -w_v \cdot \log\left(\|p_v - p_\delta\|\right) & \text{if } r = 0 \end{cases} \qquad (3.4)$$

The function plots in Figure 3.3 depict the repulsion energy with different repulsion exponents
$r \le 0$ against the distance $d = \|p_v - p_\delta\|$ between the repulsing node and the
dummy node. A repulsion exponent is used to adjust the energy field strength of the repulsion.
In the case of $r = 0$, the logarithm function is applied as it results in a repulsion
force similar to $d^{-1}$.
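Equation 3.4 translates directly into a small function. The sketch below is illustrative only; the function name and the default arguments are assumptions, and the value at $d = 0$ is set to infinity in line with the requirement $|E_R(0)| := \infty$.

```python
import math


def repulsion_energy(d: float, weight: float = 1.0, r: float = 0.0) -> float:
    """Repulsion energy of Equation 3.4 at distance d for repulsion exponent r <= 0."""
    if r > 0.0:
        raise ValueError("the repulsion exponent r must not be positive")
    if d < 0.0:
        raise ValueError("the distance d must not be negative")
    if d == 0.0:
        return math.inf                    # |E_R(0)| := infinity
    if r == 0.0:
        return -weight * math.log(d)       # logarithmic case
    return -(weight / r) * d ** r          # power-law case, r < 0


# For r = -1 the energy decays like 1/d.
samples = [repulsion_energy(d, weight=1.0, r=-1.0) for d in (0.5, 1.0, 2.0, 4.0)]
```

The sampled values decrease monotonically with the distance, as required of the repulsion energy.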

Repulsion Force The repulsion force $F_R(d) = -\nabla E_R(d)$ is the negative gradient of the
repulsion energy shown in Equation 3.4. The repulsion force magnitudes $F_R(d)$ are plotted
in Figure 3.4.

$$F_R(d) = -\nabla E_R(d) = -w_v \cdot \|p_v - p_\delta\|^{r-1} \cdot \frac{p_v - p_\delta}{\|p_v - p_\delta\|} \qquad (3.5)$$

Figure 3.4: The magnitude $F_R(d)$ of the repulsion force for various repulsion exponents
$r \in \{0, -1, -2\}$ at the distance $d$ from a repulsing node

Displacement of Dummy Nodes The magnitude $F_R$ of the repulsion force correlates
with the amplitude of the dummy node displacement in each iteration of an energy
minimization algorithm and is determined by the first derivative of the repulsion energy
at the position $p_\delta$ of the dummy node.

$$F_R(d) = E_R'(d) = -w_v \cdot \|p_v - p_\delta\|^{r-1} \qquad (3.6)$$
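The force vector of Equation 3.5 and the superposition of the forces of several nodes can be sketched as follows. The function names are hypothetical, and the exact way in which an energy minimization algorithm turns the summed force into a displacement (step size, damping) is not modeled here.

```python
import math
from typing import Iterable, Tuple

Vec = Tuple[float, float]


def repulsion_force(p_delta: Vec, p_v: Vec, weight: float = 1.0,
                    r: float = 0.0) -> Vec:
    """Force vector on a dummy node at p_delta caused by a node at p_v (Eq. 3.5)."""
    dx, dy = p_v[0] - p_delta[0], p_v[1] - p_delta[1]
    d = math.hypot(dx, dy)
    if d == 0.0:
        raise ValueError("dummy node coincides with the repulsing node")
    # -w_v * d^(r-1) times the unit vector (p_v - p_delta) / d
    scale = -weight * d ** (r - 2.0)
    return (scale * dx, scale * dy)


def total_repulsion(p_delta: Vec, nodes: Iterable[Tuple[Vec, float]],
                    r: float = 0.0) -> Vec:
    """Superposition of the repulsion forces of all (position, weight) pairs."""
    fx = fy = 0.0
    for p_v, weight in nodes:
        gx, gy = repulsion_force(p_delta, p_v, weight, r)
        fx += gx
        fy += gy
    return (fx, fy)


# A dummy node to the right of a single node is pushed further to the right.
single = repulsion_force((1.0, 0.0), (0.0, 0.0), weight=1.0, r=0.0)

# Two equal nodes on opposite sides cancel each other out.
balanced = total_repulsion((0.0, 0.0), [((1.0, 0.0), 1.0), ((-1.0, 0.0), 1.0)])
```

The second case shows an equilibrium of forces: the dummy node rests exactly between two equally weighted nodes.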

A repulsing node $v$ causes a displacement $w_v \cdot \|p_v - p_\delta\|^{r-1}$ of the dummy node $\delta$ in the
direction of $-(p_v - p_\delta)$ away from $p_v$. In a two-dimensional space, a dummy node is moved
by $\Delta x$ and $\Delta y$ in the direction of the x- and y-axis, respectively, which is determined by
the partial derivatives of the repulsion energy:

$$\Delta x = \frac{\partial E_R}{\partial x} = -w_v \cdot (p^x_v - p^x_\delta) \cdot \|p_v - p_\delta\|^{r-2}, \qquad \Delta y = \frac{\partial E_R}{\partial y} = -w_v \cdot (p^y_v - p^y_\delta) \cdot \|p_v - p_\delta\|^{r-2}. \qquad (3.7)$$

3.4.2.2 Concluding Remarks

The repulsion **in**troduced above meets the requirement to preserve the expressiveness **of**

the given graph layout. If a curve occludes a node, then the strong repulsion with**in** a short

distance from the node moves the surround**in**g dummy nodes **of** the curves away. In order

to achieve this, the fidelity **of** the curve model must be sufficiently high; a small distance dδ

between neighbor**in**g dummy nodes corresponds to a high probability that dummy nodes

**of** the respective curve segment are moved away by the repulsion force.

Furthermore, the repulsion can also hinder curves from intersecting dense groups of nodes.

The energy m**in**imization algorithm will move curves out **of** clusters s**in**ce there is a higher

repulsion energy field strength **in**side than outside the cluster.

Ideally, an energy m**in**imization algorithm overcomes the problem **of** curves trapped **in**

local energy m**in**ima. Practically, there can be many local energy m**in**ima with**in** clusters


and it might become more difficult to move curves out of them. However, this repulsion
force and the energy model, as completed in the following paragraphs, provide the
proper concept to fulfill our requirements of hypergraph visualizations. Later, the practical

evaluation **of** this approach will show the results and experiences collected with example

hypergraphs.

So far, the positions **of** dummy nodes are solely **in**fluenced by the repulsion **of** nodes.

Theoretically, the repulsion can push the curves **in**f**in**itely far away, because the displacement

**of** dummy nodes is not limited yet. The follow**in**g section will **in**troduce an opposite

force that prevents curves from be**in**g routed to the outside **of** the graph layout area.

3.4.3 Strain of Curves

Newton’s third law, the law **of** reciprocal actions, allows the creation **of** a stable state

and, **in** this particular case, h**in**ders the **in**f**in**ite stra**in** **of** curves caused by the addition **of**

a repulsion force. The law states that to every action there is always opposed an equal

reaction. Applied to the energy-based rout**in**g **of** curves, another force that h**in**ders the

stra**in** **of** curves, i.e., the elongation **of** curves, is needed.

This section **in**troduces another energy to prevent unlimited stra**in** **of** curves. It is

added to the repulsion energy **of** the previous section and also relies on the model **of**

curves **in**troduced **in** Section 3.4.1 above.

Elastic Band Model The global minimum of the repulsion energy always lies at
infinity, i.e., as far away as possible from the area occupied by the graph layout. The
strain of curves, i.e., the increase of the curves' lengths, is not limited during the
process of energy-based routing. However, it is not desired to route curves outside
of the graph layout area. Hence, the goal is to model curves as elastic bands. This way,
the repulsion is not able to move curves ad infinitum. In terms of aesthetics, no
reasonable strain threshold can be identified. The energy-based approach does not rely
on a threshold; the elongation of curves depends on the strength of the repulsing nodes.

An attraction between neighboring dummy nodes of a curve models a curve as an elastic
band with limited strain. The attraction increases with an increasing distance between
neighboring dummy nodes. An energy minimization algorithm can be utilized to compute
the layout in conjunction with the repulsion. Of course, an optimal layout of a curve that
considers only the attraction is the straight-line connection of the curve's end points. In
conjunction with the repulsion, an optimal layout will represent a trade-off between both goals.

3.4.3.1 Formalization of the Attraction

Requirements Each dummy node is attracted to its two directly neighboring nodes on
the same curve (see Figure 3.2c on page 35). The outer dummy nodes are attracted to
the immutable positions of the curve end points, respectively. To formally describe the
characteristics of the attraction energy, a class of appropriate functions is again derived
from the following requirements.

Figure 3.5: Different attraction energies $E_A(d)$ against the distance $d$ from a neighboring
dummy node for various attraction exponents $a \in \{1, 2, 3\}$ (vertical axis: $E_A$; horizontal axis: $d$)

• The attraction energy is independent of the total curve length $\mathrm{len}(c)$. Additionally,
the number of dummy nodes that model the curves must not influence the attraction
strength.

• The attraction energy is independent of the total elongation of curves. The elongation
is the difference between the length of a routed curve and the length of the straight-line
curve. The impact of an elongation depends on the initial curve's length. A very
long curve should be allowed to increase its length more than a very short one.

• The attraction energy is minimal if the curve length is minimal, i.e., if the curve is a
straight line.

• The attraction energy tends to infinity for infinite curve length.

• The requirement to avoid node occlusion takes priority over the aim of short curve lengths.

Attraction Energy Taking these requirements into account, the attraction energy $E_A$ on
a dummy node $\delta$ is a function of the distance $d = \lVert p_n - p_\delta \rVert$ to its neighbor node $n$
(either another dummy node or one of the two end points of the curve), and is defined as
follows:

\[ E_A(d) = \frac{1}{a} \cdot \lVert p_n - p_\delta \rVert^{a} \tag{3.8} \]

Similar to the repulsion energy from above and to other energy models used to compute
graph layouts, the attraction exponent $a$ allows an adjustment of the attraction energy
field strength. Any attraction exponent $a > 0$ meets
the given requirements of the attraction. This exponent determines the elasticity of the
band model and thus the strain potential of the curves. The energy functions for
various exponents are plotted in Figure 3.5.

Attraction Force The attraction force $F_A = -\nabla E_A$ is calculated as the negative gradient
of the attraction energy.

\[ F_A(d) = -\nabla E_A(d) = \lVert p_n - p_\delta \rVert^{a-1} \tag{3.9} \]

Figure 3.6: The magnitude $F_A(d)$ of the attraction force for various attraction exponents
$a \in \{1, 2, 3\}$ at the distance $d$ from a neighboring dummy node (vertical axis: $F_A$; horizontal axis: $d$)

Notice that the lowest possible attraction exponent $a = 1$ does not only mean that the
attraction energy grows linearly with the distance $d$, but also implies a constant
magnitude $|F_A(d)| = 1$ of the attraction force, as Figure 3.6 shows. For $a = 1$, the
attraction force is independent of the distance $d$, but the composition of attraction and
repulsion energy will in sum show the desired behavior: even a constant attraction will at
some point be stronger than the repulsion.
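As a minimal numerical sketch of Equations 3.8 and 3.9 (not part of the original work), the attraction energy and the attraction force could be evaluated as follows. The function names are hypothetical, points are assumed to be 2D tuples, and the force vector is assumed to point from the dummy node towards its neighbor, as listed in Table 3.1.

```python
import math


def attraction_energy(p_n, p_delta, a):
    """Attraction energy E_A = (1/a) * ||p_n - p_delta||^a (Equation 3.8)."""
    d = math.dist(p_n, p_delta)
    return d ** a / a


def attraction_force(p_n, p_delta, a):
    """Attraction force vector with magnitude d^(a-1) (Equation 3.9).

    Assumption: the vector points from the dummy node towards neighbor n.
    """
    dx, dy = p_n[0] - p_delta[0], p_n[1] - p_delta[1]
    d = math.hypot(dx, dy)
    if d == 0.0:
        return (0.0, 0.0)  # coinciding points: no direction, no force
    scale = d ** (a - 2)  # d^(a-1) times the unit vector (dx, dy) / d
    return (dx * scale, dy * scale)


def force_magnitude(p_n, p_delta, a):
    """Magnitude |F_A(d)| of the attraction force."""
    return math.hypot(*attraction_force(p_n, p_delta, a))
```

For a neighbor at distance 5, the exponent a = 1 yields a force of magnitude 1 regardless of the distance, whereas a = 2 yields a magnitude equal to the distance itself.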

Displacement of Dummy Nodes The attraction force that acts on movable dummy
nodes is, besides the repulsion, a further influence on the positioning of curves.
Here, only the effect of the attraction force $F_A$ on a dummy node $\delta$ is shown. The direction
and magnitude of the displacement of a dummy node that is caused by a neighboring node
$n$ on the same curve are as follows:

\[
\begin{aligned}
\Delta x &= \frac{\partial E_A(d)}{\partial x} = (p_n^x - p_\delta^x) \cdot \lVert p_n - p_\delta \rVert^{a-2} \\
\Delta y &= \frac{\partial E_A(d)}{\partial y} = (p_n^y - p_\delta^y) \cdot \lVert p_n - p_\delta \rVert^{a-2}
\end{aligned}
\tag{3.10}
\]

3.4.3.2 Concluding Remarks

This subsection introduced an attraction between neighboring dummy nodes on a curve
to model the curve as an elastic band. As desired, the attraction prevents the repulsion
from shifting the curves infinitely far out of the graph layout area. The benefit of such an
energy-based approach is that no absolute values or thresholds have to be specified. For
instance, the specification of a maximum length of a routed curve or a maximum ratio of
the routed curve's length to the initial curve's length was avoided.

The combination of repulsion and attraction creates an energy model. The strengths
of both energies determine the locations of the stable positions of dummy nodes. The
following section investigates the location of the equilibrium created by the combination
of repulsion and attraction.


3.4.4 Equilibrium


It was shown that an energy-based routing technique requires two opposing forces to
create a stable hypergraph layout. However, the positions of the curves are still uncertain.
The stable position can be varied by adjusting the strength of the repulsion and attraction
energies. This relocates the positions of minimal energy in the layout area and thus relocates
the optimal positions of dummy nodes.

An equilibrium of a dummy node is a stable state in which the sum of all forces $F_\delta$
at the position $p_\delta$ of a dummy node $\delta$ of a curve $c$ is zero, i.e., $F_\delta = 0$. Equivalently, the sum of total
repulsion and attraction energies is locally minimal. Let $H = (V, E)$ denote the hypergraph
and $p$ the corresponding hypergraph layout. Then the net force $F_\delta$ is calculated as follows.

\[ F_\delta = \sum_{v \in V} F_R(p_v - p_\delta) + \sum_{n \in \mathrm{neighbors}(\delta)} F_A(p_n - p_\delta) \tag{3.11} \]

Consequently, the sum of the net forces $\sum_{\delta \in \nu(H)} F_\delta$ acting on all dummy nodes of a
hyperedge is zero.

Illustration Figure 3.7 exemplifies an artificial case of a curve modeled by one dummy
node. One node repulses this dummy node. Figure 3.7a depicts the initial configuration
with a straight-line curve that is not routed. The small distance between graph node and dummy
node causes a strong repulsion that is depicted by the thin (red) arrow. The attraction is
minimal since the curve's length is also minimal. The net force, which affects the dummy
node, is determined by spanning a parallelogram of the individual forces and applying the
principles of vector addition. The widest blue arrow depicts the net force in these figures.
In Figure 3.7a, the repulsion is stronger than the attraction and consequently the net force
moves the dummy node away from the graph node.

An energy minimization algorithm will increase the distance between both mentioned
nodes. After some iterations the dummy node gets to a position where the distance is large
enough to let the attraction either compensate the repulsion (as in Figure 3.7c) or even
exceed the repulsion. In the latter case, which is shown in Figure 3.7b, the dummy node
was moved too far away from the repulsing node, and the curve's strain, which is depicted
by the (green) arrows along the curve, becomes too large. As a result, the attraction to
both end points of the curve is stronger than the repulsion and thus the net force brings
the dummy node closer to the repulsing node.

Figure 3.7: Towards a stable position of the dummy node on the curve. Panels: (a) Repulsion
stronger than attraction, (b) Attraction stronger than repulsion, (c) Equilibrium

As the displacement of the dummy nodes is proportional to the magnitude of forces
(cf. Equations 3.7 and 3.10), the oscillation of dummy nodes around the position of the
equilibrium finally results in a stable state. The closer a dummy node is to the equilibrium,
the smaller its displacement will be.
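The artificial case of Figure 3.7 can be reproduced with a small simulation. The following is only a sketch under stated assumptions: a plain fixed-step descent stands in for the energy minimization algorithm, all weights are set to 1, the exponents r = -1 and a = 2 as well as the coordinates are chosen for illustration, and the function names are hypothetical. The force directions follow Table 3.1.

```python
import math


def repulsion_force(p_v, p_delta, r, w=1.0):
    """Force of magnitude w * d^(r-1) pushing the dummy node away from node v.

    Assumption: the dummy node never coincides with a node (d > 0).
    """
    dx, dy = p_v[0] - p_delta[0], p_v[1] - p_delta[1]
    d = math.hypot(dx, dy)
    scale = -w * d ** (r - 2)
    return (dx * scale, dy * scale)


def attraction_force(p_n, p_delta, a, w_c=1.0):
    """Force of magnitude w_c * d^(a-1) pulling the dummy node towards neighbor n."""
    dx, dy = p_n[0] - p_delta[0], p_n[1] - p_delta[1]
    d = math.hypot(dx, dy)
    scale = w_c * d ** (a - 2) if d > 0.0 else 0.0
    return (dx * scale, dy * scale)


def net_force(p_delta, nodes, neighbors, r, a):
    """Net force on a dummy node as in Equation 3.11."""
    fx = fy = 0.0
    for p_v in nodes:
        f = repulsion_force(p_v, p_delta, r)
        fx, fy = fx + f[0], fy + f[1]
    for p_n in neighbors:
        f = attraction_force(p_n, p_delta, a)
        fx, fy = fx + f[0], fy + f[1]
    return (fx, fy)


def relax(p_delta, nodes, neighbors, r=-1.0, a=2.0, step=0.05, iterations=500):
    """Move the dummy node along the net force; the displacement is
    proportional to the force and shrinks near the equilibrium."""
    x, y = p_delta
    for _ in range(iterations):
        fx, fy = net_force((x, y), nodes, neighbors, r, a)
        x, y = x + step * fx, y + step * fy
    return (x, y)


end_points = [(-1.0, 0.0), (1.0, 0.0)]  # fixed end points of the curve
graph_node = [(0.0, -0.1)]              # one repulsing node close to the curve
equilibrium = relax((0.0, 0.0), graph_node, end_points)
```

In this configuration the first step overshoots because the repulsion is very strong close to the node (as in Figure 3.7b); afterwards the attraction pulls the dummy node back until the net force vanishes.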

3.4.4.1 Adjustment of Repulsion and Attraction

For each dummy node there is a stable position in the graph layout. The placement of the
equilibrium depends on the scaling of the layout since the forces are mainly derived from
the distances between nodes, and the energy and force functions evolve differently with
the distances. We therefore now add further coefficients to the computation of repulsion
and attraction forces, in order to adjust the forces to each other, and balance them in such
a way that the position of the equilibrium does not depend on the graph's scaling.

To place the equilibrium at a certain point, the total energy at this point must be minimal
compared to its close proximity. Equivalently, the sum of all forces is zero at this point.
Therefore, the repulsion and attraction on a dummy node are set to a certain constant
value if the dummy node is at an optimal position regarding repulsion and attraction,
respectively.

At first glance, the strength of repulsion and attraction energy can be influenced by their
respective exponents. Both exponents determine the way the energy field strengths evolve
with the distance from their sources. The plots of the energy functions in Figures 3.3
and 3.5 show how these exponents $r$ and $a$ can influence the slope of the energies in certain
regions. The exponents $r$ and $a$ mainly influence the development of forces with the
distance, whereas a coefficient can change the magnitude of forces. Linear coefficients only
scale the value of the energy, and do not change the shape of the energy functions.

The remainder of this section adjusts the magnitudes of forces to an equal fixed value,
which is 1 here. Obviously, the direction of all forces remains unchanged by the addition of
the following coefficients.

Repulsion Coefficient The repulsion energy as defined in Equation 3.4 already considers
the node weight $w_v$. A further coefficient $w_\delta$, which is specific for each dummy node, is
introduced to adjust the repulsion energy $E_R$. In an optimal position of the dummy node,
the coefficient $w_\delta$ is chosen such that the magnitude of the repulsion force $F_R$ on a dummy
node $\delta$ is set to 1. Only the magnitudes $F_R = |F_R|$ are influenced and considered here.
The functions of the distance are abbreviated as $E = E(d)$ and $F = F(d)$.

\[ F_R = -w_v \cdot w_\delta \cdot \lVert p_v - p_\delta \rVert^{r-1} \tag{3.12} \]

If the dummy node is placed at an optimal position $p_\delta^{o,r}$ with respect to the repulsion,
then the coefficient $w_\delta$ is derived as follows.

\[
\begin{aligned}
F_R = 1 &= -w_v \cdot w_\delta \cdot \lVert p_v - p_\delta^{o,r} \rVert^{r-1} \\
w_\delta &= \frac{-1}{w_v \cdot \lVert p_v - p_\delta^{o,r} \rVert^{r-1}}
\end{aligned}
\tag{3.13}
\]

The following Section 3.4.4.2 presents a method to find the optimal position $p_\delta^{o,r}$ of a
dummy node $\delta$.

Attraction Coefficient The attraction is also adjusted. A coefficient $w_c$, which can be
considered as the weight of the curve, extends the attraction force from Equation 3.9 to

\[ F_A = w_c \cdot \lVert p_n - p_\delta \rVert^{a-1}. \tag{3.14} \]

The distance $\lVert p_n - p_\delta \rVert$ between the dummy node $\delta$ and one of its neighboring nodes
$n \in \mathrm{neighbors}(\delta)$ on the curve is the length of the respective curve segment. The coefficient
$w_c$ is specific to the particular curve segment as it is basically derived from its length. The
attraction force $F_A$ is set to 1 for a dummy node $\delta$ in its optimal position $p_\delta^{o,a}$ with respect
to the attraction.

\[
\begin{aligned}
F_A = 1 &= w_c \cdot \lVert p_n - p_\delta^{o,a} \rVert^{a-1} \\
w_c &= \lVert p_n - p_\delta^{o,a} \rVert^{1-a}
\end{aligned}
\tag{3.15}
\]

Section 3.4.4.3 clarifies the optimal position $p_\delta^{o,a}$ of a dummy node $\delta$ in terms of attraction.

3.4.4.2 Optimum in Terms of Repulsion

An optimal position of a dummy node with respect to the repulsion already corresponds to
the desired solution of the routing problem. Several optimal positions
are possible. For instance, another optimal layout of the graph in Figure 3.7 places the
dummy node on the left side of the repulsing node. Most energy minimization algorithms
will not lift the dummy node over the node, because this would increase the repulsion
before it can be minimized. The infinitely distant positions are not considered as optimal
positions since a position that is very close to the area of the given graph layout is desired.

For the sake of the adjustment of energies, an approximation of an optimal position is, or
practically has to be, sufficient. An optimal position of a dummy node is characterized by
a minimal repulsion energy in the potential layout area of the curve. The repulsion energy
$E_R(p_s)$ at a sample point $p_s \in \mathbb{R}^2$ is the sum of the repulsion (as defined by Equation 3.4)
caused by all nodes $v \in V$ of a hypergraph $H = (V, E)$.

\[
E_R(p_s) = \sum_{v \in V} E_R(p_v - p_s) =
\begin{cases}
-\sum_{v \in V} \frac{w_v}{r} \cdot \lVert p_v - p_s \rVert^{r} & \text{if } r < 0, \\
\sum_{v \in V} -w_v \cdot \log\left(\lVert p_v - p_s \rVert\right) & \text{if } r = 0
\end{cases}
\tag{3.16}
\]

To find a position with minimal energy $\min\{E_R(p_s) \mid s \in S\}$ from a set of sample points
$S$, the total repulsion energy at each sample $p_s$, $s \in S$, is computed according to Equation 3.16.
Such a sampling technique finds a position with minimal energy and is elaborated in the
next paragraphs.

Rationale First, the sampling area is limited to a rectangle enclosing the respective curve.
This sampling area represents the potential positions of all dummy nodes of a routed
curve. It would be too restrictive to predict the area of potentially optimal positions for
each dummy node of a curve individually. Therefore, the introduced coefficient $w_\delta$ is equal
for all dummy nodes of the same curve. The sample point with minimal energy represents
the optimal position of all dummy nodes of the curve.

Figure 3.8 depicts a sampling area of a curve. The length of the curve determines the
dimension of the sampling area. Depending on the visual objectives of routing, the sampling
area can also be extended to a larger area. This only increases the computational
effort.

Second, after the sampling area of each curve is set, the number, location, and distribution
of the samples have to be arranged. A homogeneous distribution of samples in the
sampling area is used. The positions of the samples can be easily derived once the number
of samples is specified. A value between 100 and 1,000 might be an appropriate number
$|S|$ of samples. The sampling area in Figure 3.8, for instance, contains $11 \times 11$ samples.

Third, the repulsion energy $E_R(p_s)$ (see Equation 3.16) caused by all nodes is calculated
at each sample point $p_s$. The minimal repulsion energy at the optimal sample point $p_s^{o,r}$
regarding the repulsion (light gray in Figure 3.8) allows the calculation of the repulsion
coefficient $w_\delta$ for all dummy nodes $\delta \in \Delta(c)$ of a curve $c$.

\[ w_\delta = \frac{-1}{\sum_{v \in V} w_v \cdot \lVert p_v - p_s^{o,r} \rVert^{r-1}} \tag{3.17} \]

3.4.4.3 Optimum in Terms of Attraction

Figure 3.8: Sampling in a curve's potential routing area identifies the optimal position
with minimal repulsion energy $E_{\min} = \sum_{v \in V} E_R$ (labels in the figure: sample point, sampling area)

Finding optimal positions $p_\delta^{o,a}$ of dummy nodes with respect to the attraction is trivial.
The dummy nodes experience minimal attraction if and only if the distance to their neighboring
nodes is also minimal (cf. Equations 3.8 and 3.14). As required in Section 3.4.3, the
optimum is the straight-line connection of both end points of a curve with homogeneously
distributed dummy nodes on it. The depicted curve in Figure 3.8 is already in its optimal
position with respect to the attraction. The distance $d_\delta$ between a dummy node $\delta$ and
one of its neighbors $n$ on this curve is the ratio of the curve's length $\mathrm{len}(c)$ to the number
of curve segments $|\Delta(c)| + 1$.

\[ d_\delta = \frac{\mathrm{len}(c)}{|\Delta(c)| + 1} = \lVert p_n - p_\delta^{o,a} \rVert \]

The coefficient $w_c$ is determined by setting the attraction force to $F_A(d_\delta) = 1$. The
attraction coefficient is equal for each dummy node of the same curve, because the distance
between neighboring nodes in an optimal case is constant.

\[ w_c = \left( \frac{\mathrm{len}(c)}{|\Delta(c)| + 1} \right)^{1-a} \tag{3.18} \]
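To illustrate how Equation 3.18 decouples the attraction from the absolute curve length, the following sketch (with hypothetical function names and example values that are not taken from the thesis) computes the coefficient for two curves that differ only by a scaling factor of 100 and evaluates the adjusted attraction force of Equation 3.14 at the optimal segment length.

```python
def attraction_coefficient(curve_length, num_dummy_nodes, a):
    """w_c = (len(c) / (|Delta(c)| + 1))^(1 - a), see Equation 3.18."""
    segment_length = curve_length / (num_dummy_nodes + 1)
    return segment_length ** (1 - a)


def adjusted_attraction(d, w_c, a):
    """Magnitude F_A = w_c * d^(a - 1) of the adjusted attraction, Equation 3.14."""
    return w_c * d ** (a - 1)


# Two curves of the same shape at different scales, each with 3 dummy nodes.
forces = []
for length in (8.0, 800.0):
    w_c = attraction_coefficient(length, num_dummy_nodes=3, a=2)
    optimal_segment = length / 4  # straight line, homogeneously distributed nodes
    forces.append(adjusted_attraction(optimal_segment, w_c, a=2))
```

Both entries of `forces` are 1 up to floating-point rounding, i.e., the attraction at the optimum is the same for the short and the long curve.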


3.4.4.4 Concluding Remarks

An equilibrium was already created by the composition of repulsion and attraction. The
focus of this section was the adjustment of the equilibrium to permit the generation of adequate
curve layouts. Therefore, a repulsion and an attraction coefficient were introduced.

The repulsion coefficient $w_\delta$ represents a weighting of the dummy nodes. The normalization
of the repulsion abstracts from concrete distance values between nodes and ensures
that a large number of nodes with a small node weight repulse as strongly as a smaller
number of nodes with higher node weights. An illustrative example is that ten nodes, each
with a node weight of 1, are as repulsing as one node with a weight of 10.

The sampling method reuses the optimal position $p_\delta^{o,r}$ for all dummy nodes of a curve.
The sampling method computes the coefficients only once. If, additionally, a moderate
number of sample points is specified, then the overall computational effort of the sampling
method is very low in comparison to the total complexity of the energy-based routing
technique. The number of samples and the size of the sampling area further affect the
result's accuracy. Both parameters allow users to trade off accuracy against computational
effort.
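The sampling procedure of Section 3.4.4.2 could be sketched as follows. This is not the implementation used for this work: the sampling area is assumed to be a square centered on the curve's midpoint with a side length equal to the straight-line curve length, nodes are assumed to be given as (x, y, weight) triples, all names are hypothetical, and only the magnitude of the coefficient from Equation 3.17 is returned (its sign convention is left to the equations above).

```python
import math


def repulsion_energy(p_s, nodes, r):
    """Total repulsion energy at sample point p_s (Equation 3.16).

    nodes is a list of (x, y, w_v) triples; r <= 0 is the repulsion exponent.
    """
    total = 0.0
    for x, y, w_v in nodes:
        d = math.hypot(x - p_s[0], y - p_s[1])
        if d == 0.0:
            return math.inf  # a sample on top of a node is never optimal
        if r == 0:
            total += -w_v * math.log(d)
        else:
            total += -(w_v / r) * d ** r
    return total


def sample_grid(center, side, per_axis):
    """Homogeneous per_axis x per_axis grid over a square area (per_axis >= 2)."""
    x0, y0 = center[0] - side / 2, center[1] - side / 2
    h = side / (per_axis - 1)
    return [(x0 + i * h, y0 + j * h)
            for i in range(per_axis) for j in range(per_axis)]


def repulsion_coefficient(start, end, nodes, r, per_axis=11):
    """Return the sample with minimal repulsion energy and the magnitude of
    the coefficient that normalizes the summed repulsion force there to 1."""
    center = ((start[0] + end[0]) / 2, (start[1] + end[1]) / 2)
    side = math.dist(start, end)
    samples = sample_grid(center, side, per_axis)
    best = min(samples, key=lambda s: repulsion_energy(s, nodes, r))
    total = sum(w_v * math.hypot(x - best[0], y - best[1]) ** (r - 1)
                for x, y, w_v in nodes)
    return best, 1.0 / total
```

With the default of 11 samples per axis, the grid matches the 11 × 11 samples mentioned for Figure 3.8. Ten nodes of weight 1 at the same position yield the same coefficient as a single node of weight 10, which mirrors the normalization property described above.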

The attraction coefficient $w_c$ represents a weighting of the respective curve and consequently
is equal for each dummy node of the curve. It decouples the attraction from the
actual value of the curve length, because the attraction must not be radically different for
two curves that have the same shape but a different size. The computation of $w_c$ is also inexpensive, because
it is computed only once per curve. The incremental approach to model curves in energy
fields, which was described in Section 3.4.1, requires a recalculation of the attraction
coefficient $w_c$ every time the fidelity of the curve is increased.

3.4.5 Conclusion

Routing avoids node occlusion and cluster intersections. The presented energy-based routing
technique reuses the energy-based approach that is also capable of producing graph
layouts without any edge routing. Energy-based layout techniques solve an optimization
problem; energy minimization algorithms are capable of reducing the total energy of hypergraph
layouts. Dummy nodes were introduced to model curves in energy fields in a way
that makes it possible to compute the energy's impact on the curve. Three different strategies to
model curves were examined. The incremental doubling of the number of dummy nodes
emerged as the preferable method.

Repulsion and Attraction The repulsion caused by nodes is commonly used to compute
graph layouts. The elastic band model is a novel approach to limit the strain of curves by
an attraction force. Energy minimization algorithms are used to minimize the repulsion
and attraction energies of all dummy nodes. To this end, the dummy nodes are moved to
positions with minimal energy, and thereby the curves are routed.

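The energies were formalized and combined in an energy model that makes it possible to create
hypergraph layouts and fulfills the given requirement to preserve the expressiveness of
layouts. Table 3.1 shows the outcomes of the previous paragraphs in an overview.

Repulsion Energy $E_R$ ($r < 0$):    $-\frac{w_v \cdot w_\delta}{r} \cdot \lVert p_v - p_\delta \rVert^{r}$
Repulsion Energy $E_R$ ($r = 0$):    $-w_v \cdot w_\delta \cdot \log\left(\lVert p_v - p_\delta \rVert\right)$
Repulsion Force $F_R$:               $-w_v \cdot w_\delta \cdot \lVert p_v - p_\delta \rVert^{r-1} \cdot \frac{p_v - p_\delta}{\lVert p_v - p_\delta \rVert}$
Node Displacement $\Delta x$:        $-w_v \cdot w_\delta \cdot (p_v^x - p_\delta^x) \cdot \lVert p_v - p_\delta \rVert^{r-2}$
Node Displacement $\Delta y$:        $-w_v \cdot w_\delta \cdot (p_v^y - p_\delta^y) \cdot \lVert p_v - p_\delta \rVert^{r-2}$
Attraction Energy $E_A$:             $\frac{w_c}{a} \cdot \lVert p_n - p_\delta \rVert^{a}$
Attraction Force $F_A$:              $w_c \cdot \lVert p_n - p_\delta \rVert^{a-1} \cdot \frac{p_n - p_\delta}{\lVert p_n - p_\delta \rVert}$
Node Displacement $\Delta x$:        $w_c \cdot (p_n^x - p_\delta^x) \cdot \lVert p_n - p_\delta \rVert^{a-2}$
Node Displacement $\Delta y$:        $w_c \cdot (p_n^y - p_\delta^y) \cdot \lVert p_n - p_\delta \rVert^{a-2}$

Table 3.1: Repulsion and attraction in a nutshell

In summary, all energies acting on a dummy node $\delta \in \Delta(c)$ are given in Equation 3.19.
$E(\delta)$ is the composition of individual repulsion energies emitted by all nodes $v \in V$ of a
given hypergraph $H = (V, E)$ and the attraction energies emitted by both neighbors of the
respective curve $c \in C(\varepsilon)$.

\[
\begin{aligned}
r < 0 \;\Rightarrow\; E(\delta) &= \sum_{n \in \mathrm{neighbors}(\delta)} \frac{w_c}{a} \cdot \lVert p_n - p_\delta \rVert^{a} - \sum_{v \in V} \frac{w_v \cdot w_\delta}{r} \cdot \lVert p_v - p_\delta \rVert^{r} \\
r = 0 \;\Rightarrow\; E(\delta) &= \sum_{n \in \mathrm{neighbors}(\delta)} \frac{w_c}{a} \cdot \lVert p_n - p_\delta \rVert^{a} - \sum_{v \in V} w_v \cdot w_\delta \cdot \log\left(\lVert p_v - p_\delta \rVert\right)
\end{aligned}
\tag{3.19}
\]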
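Equation 3.19 translates directly into a small function. The sketch below is illustrative only: the function name, the representation of nodes as (x, y, weight) triples, and the default coefficients of 1 are assumptions, and the dummy node is assumed not to coincide with any node.

```python
import math


def total_energy(p_delta, nodes, neighbors, r, a, w_delta=1.0, w_c=1.0):
    """Total energy E(delta) of a dummy node as in Equation 3.19.

    nodes: list of (x, y, w_v) triples of repulsing nodes.
    neighbors: list of (x, y) positions of the neighbors on the curve.
    r <= 0 is the repulsion exponent, a > 0 the attraction exponent.
    """
    energy = 0.0
    # Attraction energies emitted by the neighbors on the same curve.
    for x, y in neighbors:
        d = math.hypot(x - p_delta[0], y - p_delta[1])
        energy += (w_c / a) * d ** a
    # Repulsion energies emitted by all nodes of the hypergraph.
    for x, y, w_v in nodes:
        d = math.hypot(x - p_delta[0], y - p_delta[1])
        if r == 0:
            energy -= w_v * w_delta * math.log(d)
        else:
            energy -= (w_v * w_delta / r) * d ** r
    return energy
```

For a curve with end points (-1, 0) and (1, 0), a single node of weight 1 at (0, -0.1), r = -1, and a = 2, the energy along the vertical axis reduces to 1 + y² + 1/(y + 0.1), which has its minimum where attraction and repulsion balance.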

Equilibrium The establishment of an equilibrium of two forces was not straightforward.
Obviously, both forces balance each other or, equivalently, the total energy is minimal in
an optimal layout. The challenge, which has not been investigated before, is to determine
an optimum regarding both energies individually. By introducing the coefficients $w_\delta$ and
$w_c$, it becomes feasible to compute insightful and decent hyperedge layouts that meet
the requirement to preserve the expressiveness of layouts. Without both coefficients, it
is impossible to consistently create layouts that fulfill this requirement, and a scaling of
graph layouts would lead to radically distinct curve layouts, in which nodes are possibly
occluded by curves.

The presented energy-based routing technique meets the expectations. Node
occlusion, cluster intersections, and an arbitrary elongation of curves are prevented. The
practical evaluation of energy-based routing is presented in Chapter 5.

Positioning of the Crux So far, the energy-based routing technique assumed fixed positions
of the end points of curves. This assumption is completely valid for hyperedges in the
fully connected hyperedge structure. The crux of hyperedges in the centralized structure
is one end point of all curves that does not necessarily have to be fixed. The position
of the crux, e.g., the barycenter of the hyperedge nodes, can be very disadvantageous to
the hypergraph layout quality if it is inside a cluster. Therefore, the crux of a centralized
hyperedge structure is a movable end point of the curves. Its position changes during the
process of energy-based routing, similar to the positions of the dummy nodes.

3.5 Discussion

A routing attempt that relies on the identification of cluster bounds turned out to be
insufficient for the goal of routing hyperedges. Unlike the previous cluster-bound-based
routing approach, the energy-based approach to route curves is fairly convenient as it does
not require the explicit specification of certain characteristics. Instead, it is customizable
by parameters affecting the layouts indirectly. The repulsion and attraction exponents $r$
and $a$, and the repulsion and attraction coefficients $w_\delta$ and $w_c$ determine the actual routes
of curves in the generated hypergraph layouts.

Generality Most curve routing techniques target aesthetic criteria like a reduction of
edge bends or edge crossings. Other routing techniques, which were also discussed in
Sections 2.2.4 and 3.2 about related work, are limited to certain classes of graphs. In
contrast to those routing techniques, our energy-based technique does not rely on any
particular method to compute hypergraph layouts and does not require any specific type
of graph or graph layout information (e.g., clustering of nodes or a node hierarchy). Solely
the positions of nodes are necessary to route curves.

The presented energy-based approach is capable of routing curves, i.e., any binary connection
of two points (not even necessarily nodes). Accordingly, this energy-based routing
technique is capable of routing edges in general. Each box-and-line visualization can be
extended to utilize this approach to visualize (binary) edges within a graph layout.

Reduced Visual Complexity The repulsion moves curves to paths with low energy. However,
the attraction limits this ability. Still, there is a certain probability that initially
closely placed curves are moved to the same path with low energy in the layout area. Such
a visual bundling of close curves reduces the visual complexity of the entire visualization.
The following chapter will introduce explicit methods to reduce the visual complexity of
hypergraph visualizations. Visual bundling as a side effect of energy-based routing of
curves is also discussed in Section 4.4.2.2 on page 67.

Visualization of Routed Curves The visualization of routed curves is not the primary focus
of this work. The dummy nodes of the curve model should serve as a foundation.
Either these dummy nodes are connected by straight lines, or splines can be used
for smoothing. An interpolation must consider the deviation of the visualized curve from the
positions of dummy nodes to avoid node occlusion that might be introduced by such a
smoothed visualization. Section 4.7.1 on page 74 will discuss the visualization of curves.

The visualizations generated for this work connect neighboring dummy nodes of curves
with straight-line segments. With increasing curve model fidelity, the bends of the polygonal
chains (polylines) are hardly perceptible.


4 Reduction of Visual Complexity

Hyperedges make it possible to connect several vertices of a graph simultaneously. The visualization
of large hypergraphs, as common in the field of software visualization, involves a large
amount of visualized information of the underlying graph and additional hyperedges.

The amount of visualized information of a hyperedge grows with an increasing number
of hyperedge nodes. The representation of the node connectivity of a hyperedge adds extra
information to a hypergraph drawing. To improve hypergraph visualizations in terms of
readability and comprehension, the visual complexity of hypergraphs must be reduced.

This chapter introduces two options to reduce the visual complexity of hypergraph
drawings. First, the computed hyperedge layout can be rearranged such that close curves
are visually bundled together. Second, a simplification of a hypergraph's model can also
reduce the complexity of a visualization.

Prerequisites The energy-based routing of hyperedges produces layouts that fulfill the
given requirement to preserve the expressiveness of graph layouts. However, routing does
not affect the visual complexity of hypergraph drawings, because the amount of visualized
information is not altered.

The previous chapters modeled hyperedges by a set of curves connecting all hyperedge
nodes with each other. The visual complexity of hypergraphs corresponds to the amount
of visualized information that presents hyperedges. Curves are the visualized objects that
represent hyperedges. The number of curves is determined by the hyperedge structure
and the number of hyperedge nodes.

The visual complexity of a hyperedge with $n$ hyperedge nodes is caused by $n$ curves
when using the centralized structure. The fully connected structure needs $\frac{n(n-1)}{2}$ curves. Thus,
the latter structure of hyperedges generally suffers from a higher visual complexity. But, as this
chapter will show, the visual complexity of both structures can be reduced.
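The difference between both structures is easy to quantify. The following sketch simply evaluates the two curve counts stated above; the function name and the structure labels are hypothetical.

```python
def curve_count(n, structure):
    """Number of curves needed to draw a hyperedge with n hyperedge nodes."""
    if structure == "centralized":
        return n  # one curve from each hyperedge node to the crux
    if structure == "fully_connected":
        return n * (n - 1) // 2  # one curve per pair of hyperedge nodes
    raise ValueError(f"unknown hyperedge structure: {structure!r}")


# Linear versus quadratic growth of the number of curves.
counts = {n: (curve_count(n, "centralized"), curve_count(n, "fully_connected"))
          for n in (3, 5, 10, 20)}
```

For ten hyperedge nodes, for example, the centralized structure needs 10 curves, whereas the fully connected structure needs 45.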

4.1 Classification of Visual Complexity

The term visual complexity reflects the readability of a drawing and the ability to comprehend
the displayed amount of information. The more information of a drawing a viewer
can process in a shorter amount of time, the higher the readability of a hypergraph
visualization is. The amount of visualized information, or more precisely, the amount of
information necessary to comprehend the visualized matter, corresponds to the term
visual complexity in this work and also corresponds to cognitive load, which is explained
in the following.


A higher complexity of a hypergraph drawing impedes the readability, because a viewer
cannot easily and quickly grasp a complex structure. A low visual complexity of hypergraph
drawings allows a viewer to comprehend the structure of a hyperedge more easily.
Two major influences can be identified to describe the complexity of a hypergraph drawing:
the amount of information, e.g., the number of curves modeling a hyperedge, and the
way of presenting the hypergraphs to a viewer.

Hence, the obvious question is how to estimate and compare the visual complexity of
different hypergraph layouts. The cognitive load theory helps to answer this question.
The next section briefly introduces the cognitive load theory. The theory focuses on
the analysis of human learning capabilities and the preparation of the presentation of
information. The theory also describes the limitations of human cognition and considers
the amount of presented information. Afterwards, the theory is applied to describe the visual
complexity of hypergraph drawings.

4.1.1 Cognitive Load

The cognitive load theory is founded on Miller's publication [41] from 1956, which suggests
that the human capacity for processing information is limited. Since this work, written from the
perspective of a cognitive psychologist, cognitive science has assumed that, independent of a
particular task and also independent of the process someone uses to solve a task, the
amount of memorized information is limited.

The cognitive load theory deals with the units (chunks) of information that a human
can retain in short-term memory before loss. Humans are able to process about
seven chunks of information (plus or minus two), but not more. For instance, most people can remember
a seven-digit phone number [41]. The cognitive load theory investigates the human process
of learning by understanding the human cognitive skills and provides empirically based
guidelines to present information and “optimize intellectual performance” [58].

The cognitive load denom**in**ates the **in**formation chunks that are processed to solve a

problem. The study **of** a subject written with an unfamiliar vocabulary, for **in**stance,

means a higher cognitive load than with a familiar vocabulary. A cognitive overload

causes a lower understand**in**g and comprehension performance [57]. Accord**in**g to recent

publications from Sweller et al., three types **of** cognitive load are differentiated, and are

briefly described **in** the follow**in**g.

Intr**in**sic Cognitive Load The **in**tr**in**sic load can be described as the complexity **of** the

content or the difficulty **of** a task. It is solely reduced by the amount **of** **in**formation, and

not by its visual representation [56].

Extraneous Cognitive Load The extraneous load denotes the unnecessary **in**formation

chunks that should be avoided. A reduction **of** extraneous cognitive load is the easiest way

to prevent an **in**formation overload. For **in**stance, a square is best described visually [37].

In comparison, a verbal description **of** a square is much more complex and difficult to

understand. Thus, the more efficient visual specification is preferable for this example.


Germane Cognitive Load The germane load illustrates the amount **of** **in**formation that

is necessary to allow a person to construct or acquire a schema [58] from the presented

**in**formation. A schema helps to accelerate the cognition and process**in**g **of** **in**formation and

thus **in**creases the human learn**in**g performance. This type **of** load utilizes the rema**in****in**g

free space **of** the available work**in**g memory.

In contrast to the **in**tr**in**sic load, the two latter types **of** cognitive load are manipulable

by the presentation **of** the **in**formation. The **in**tr**in**sic and the extraneous cognitive loads

are additive.

4.1.2 Utilization **of** Cognitive Load Theory

The cognitive load theory can be applied to hypergraph drawings to assess their visual complexity. The intrinsic load of a hypergraph is constituted by the fixed graph layout and the hyperedges that have to connect all hyperedge nodes with each other.

The way **of** present**in**g hyperedges, i.e., the way **of** connect**in**g all hyperedge nodes and

visualiz**in**g these connections **in** particular, reflects the extraneous load. Because both

loads are additive, a reduction **of** the extraneous load becomes very important if there is

a high **in**tr**in**sic load, as usual for the visualization **of** large hypergraphs.

The germane load is not considered as it does not describe or reduce the visual complexity

**of** hypergraph draw**in**gs. A reduction **of** the visual complexity **of** hypergraph draw**in**gs

can be achieved by a reduction **of** the **in**tr**in**sic, **of** the extraneous cognitive load, or **of** both

loads altogether.

A viewer’s capability to process the displayed amount **of** **in**formation is highly limited,

s**in**ce only a few chunks **of** **in**formation, which can be related to each other, are reta**in**ed

**in** the work**in**g memory. For **in**stance, a viewer might ask which nodes **of** a graph are

connected by a hyperedge and where those nodes are. These questions are the ma**in**

motivation **of** hypergraph visualizations and their importance was expla**in**ed **in** Section 1.2.

It is not necessary to identify **in**dividual curves between hyperedge nodes to answer these

questions. It is sufficient to identify the overall connectivity among all hyperedge nodes.

In terms **of** the cognitive load theory, **in**dividual curves are extraneous cognitive load if

there is still any visual connection between all nodes **of** the hyperedge.

Aggregation of Curves The aggregation of curves simplifies the information that is necessary to visualize the connectivity between hyperedge nodes. Therefore, close curves, which use similar paths in the graph layout, are aggregated to reduce the amount of displayed information. The aggregation comprises two different types. First, an aggregation of curves that simplifies the hypergraph model reduces the intrinsic load. Second, an aggregation that bundles curves visually without changing the underlying hypergraph structure reduces the extraneous load. In both cases, the total load and thus the visual complexity is reduced.

The example hyperedge in Figure 4.1 is modeled in the fully connected structure. The left drawing in Figure 4.1a depicts curves by straight-line curves connecting hyperedge nodes. The same hyperedge is shown in Figure 4.1b and still reveals the same connectivity information. All hyperedge nodes are connected with each other by a set of aggregated curves. Thus, the cognitive load is reduced and the clutter caused by the tangle of curves in Figure 4.1a is bypassed by the aggregation of close edges.

(a) Curves of a fully connected hyperedge produce visual clutter. The visual complexity is high and the readability is obviously low.

(b) Aggregated curves of the hyperedge in (a) still reveal the same connectivity information. The visual complexity is reduced and the readability is increased.

Figure 4.1: Reduction of visual complexity of a hyperedge by aggregating curves

The aggregation **of** curves may have an impact on the uniform connectivity between

hyperedge nodes, as the hyperedge structure is modified. This chapter focuses on the presentation

**of** techniques that promise a reduction **of** the visual complexity **of** hypergraphs.

The evaluation **in** Section 5 presents hypergraph draw**in**gs and will thus allow to address

the compliance with the first requirement **of** hypergraph visualizations.

Measures The depicted **in**stance **in** Figure 4.1 reduces the visual complexity **of** the hypergraph

draw**in**g by reduc**in**g the number **of** curves from 21 **in** Figure 4.1a on the left to

8 **in** Figure 4.1b on the right side. Furthermore, the total edge length is also obviously

reduced by the aggregation. In this particular case, the aggregation **of** curves reduced the

total edge length by more than 80 percent.

This example demonstrates the reduction **of** the visual complexity **of** hypergraphs by

an aggregation **of** curves. The ratio **of** the reduced to the **in**itial total length **of** curves is

an **in**dicator for the reduction **of** the cognitive load. The length **of** a curve is determ**in**ed

by the sum **of** its curve segment lengths if curves are modeled by the dummy nodes as **in**

the previous chapter. The length **of** a curve segment is the Euclidean distance between

the end po**in**ts **of** the curve segment. The ratio **of** lengths is **in**dependent **of** any scal**in**g

**of** the graph layout and **in**dicates the reduction **of** the load that visualizes the hyperedge

connectivity.
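The length measure described above can be sketched in a few lines. This is a minimal illustration, assuming that a curve is given as a list of 2D points (end points and dummy nodes in order); the function names and the example coordinates are hypothetical and not taken from the thesis.

```python
import math

def curve_length(points):
    """Sum of the Euclidean lengths of all segments of one curve."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def total_length(curves):
    """Total length of a set of curves."""
    return sum(curve_length(c) for c in curves)

def length_ratio(initial_curves, reduced_curves):
    """Ratio of the reduced to the initial total curve length."""
    return total_length(reduced_curves) / total_length(initial_curves)

# Hypothetical coordinates: three straight curves before, two curves after.
initial = [[(0, 0), (4, 0)], [(4, 0), (4, 3)], [(0, 0), (4, 3)]]  # lengths 4, 3, 5
reduced = [[(0, 0), (4, 0)], [(4, 0), (4, 3)]]                    # lengths 4, 3

ratio = length_ratio(initial, reduced)  # 7 / 12
```

Because numerator and denominator scale by the same factor, the ratio does not change when all coordinates of the layout are multiplied by a constant.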

The number **of** curves **in**dicates the visual complexity only to a limited extent. The

number **of** curves may **in**crease through curve aggregation if the aggregated parts **of** curves

are counted, too.

4.2 Techniques to Reduce Visual Complexity

(a) An example hyperedge in centralized structure

(b) Energy-based curve bundling visually bundles curves

(c) An energy threshold aggregates visually bundled curves in the gray marked area

(d) Curve aggregation based on curve clusters

(e) Energy-based widening of curves visually bundles curves and cuts nodes out

Figure 4.2: Results of curve aggregation techniques introduced in this chapter that reduce the visual complexity of hypergraph drawings

Four techniques are **in**troduced **in** this chapter. They all allow to reduce the visual complexity

**of** hyperedge draw**in**gs by reduc**in**g the cognitive load. The hypergraph draw**in**gs

**in** Figure 4.2 exemplify different techniques and depict their effects on hypergraph draw**in**gs.

Initially, an example hyperedge, which consists **of** two hyperedge nodes and connects

them by two curves to the crux, is given. Figure 4.2a shows that hypergraph without any

reduction **of** its visual complexity.

As mentioned earlier, the aggregation **of** curves results **in** two dist**in**ct types **of** load

reduction. Table 4.1 allocates the techniques accord**in**g to this dist**in**ction. Model-based

aggregation techniques **in** the right column reduce the **in**tr**in**sic cognitive load. The extraneous

load is reduced by techniques that visually bundle curves.

The energy-based bundl**in**g technique adds an attraction between close curves to bundle

them visually together. Figure 4.2b drafts the result **of** the bundl**in**g technique. Both

curves overlap each other. Visually bundled curves can also reduce the visual complexity

as the extraneous load is decreased by a less detailed amount **of** displayed **in**formation.

                          Visual bundling                       Model-based aggregation
Energy-based computation  Energy-based bundling (Section 4.4)   Energy-threshold-based aggregation (Section 4.5)
                          Energy-based widening (Section 4.7)
Geometrical computation   –                                     Cluster-based aggregation (Section 4.6)

Table 4.1: Overview of curve aggregation techniques to reduce the visual complexity of hypergraph drawings

Another energy-based technique, the energy-threshold-based aggregation, allows to reduce

the complexity **of** the hypergraph model. Based on a preset energy threshold, close

and overlapp**in**g curves (marked by the gray background **in** Figure 4.2c) are aggregated

**in** the hyperedge model to reduce the **in**tr**in**sic load. As this figure also shows, the curves

**of** the hyperedge might be bundled with the energy-based bundl**in**g technique **in** advance.

The result**in**g hypergraph draw**in**g is similar to the result **of** the follow**in**g cluster-based

curve aggregation technique (cf. Figure 4.2d).

A cluster-based curve aggregation directly reduces the **in**tr**in**sic load with a geometrical

approach. The hyperedge structure is simplified by the identification and aggregation **of**

close parts **of** adjacent curves. Subsequently, **in** contrast to energy-based bundl**in**g, the

result**in**g hypergraph draw**in**g as **in** Figure 4.2d does not completely show the formerly

**in**dividual curves.

The widen**in**g **of** curves is a different approach to reduce the visual complexity **of** hypergraphs.

The width **of** curve visualizations was not discussed before, because curves

are usually visualized by l**in**e segments with a cognizable t**in**y l**in**e width. A significantly

higher curve width **in**creases the probability **of** overlapp**in**g curves. Thus, curve widen**in**g

visually bundles close parts **of** curves. Consequently, the widen**in**g **of** curves also reduces

the extraneous load. Figure 4.2e depicts the widened curves by gray planes enclos**in**g the

curves. Us**in**g the two orthogonal visual concepts **of** boxes and l**in**es **of** graph draw**in**gs

on the one hand and irregular planes on the other hand may further foster readability **of**

hypergraph draw**in**gs.

4.3 Related Work

Before these techniques are exam**in**ed, this section discusses related work **of** graph visualizations

that also comb**in**e edges. These contributions were motivated by the reduction

**of** visual clutter caused by edges. Both approaches, a visual bundl**in**g and an aggregation

**in** the model, were also applied for other graph visualizations. The reduction **of** visual

complexity **of** hyperedges was not explicitly exam**in**ed before.

Hierarchical Edge Bundl**in**g Holten et al. **in**vestigated the visualization **of** hierarchical

edge bundles [32] to reduce visual clutter caused by straight-l**in**e edges. The edges are


bent and modeled as B-spl**in**e curves. Two examples **of** the layout **of** bundled edges are

depicted **in** Figure 4.4.

The hierarchy tree of a hierarchical graph is used to bundle edges visually. By this, the edges are also routed implicitly. Each node of the hierarchy tree is assigned to a position in the graph layout. For instance, the non-leaf nodes P1, P2, P3 in Figure 4.3 of a hierarchy tree are placed at the center of the positions of their children nodes.

An edge (u, v) between two nodes u and v of the base graph is routed through the positions of the parent nodes, which are on the shortest path in the hierarchy tree between the leaves representing u and v.

Figure 4.3: Path **of** an edge is

derived from the graph hierarchy,

taken from [33]
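The path derivation can be illustrated with a small sketch. It assumes that the hierarchy tree is stored as a child-to-parent map; the tree shape, the leaf names, and the function names are hypothetical and only reuse the labels P1, P2, P3 for illustration. The sketch returns the tree nodes on the shortest path between two leaves, and the inner nodes of that path are the ones whose positions the edge is routed through.

```python
def path_to_root(node, parent):
    """List of nodes from the given node up to the root of the tree."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def hierarchy_route(u, v, parent):
    """Nodes on the shortest tree path between the leaves u and v."""
    up = path_to_root(u, parent)
    down = path_to_root(v, parent)
    ancestors_of_u = set(up)
    # Lowest common ancestor: first ancestor of v that is also an ancestor of u.
    lca = next(n for n in down if n in ancestors_of_u)
    return up[:up.index(lca) + 1] + down[:down.index(lca)][::-1]

# Hypothetical hierarchy: P1 is the root, P2 and P3 are its children.
parent = {"a": "P2", "b": "P2", "c": "P3", "P2": "P1", "P3": "P1"}

route = hierarchy_route("a", "c", parent)  # via P2, P1, P3
```

Edges between leaves of the same subtree share only the lower part of the tree (here, a and b meet at P2), which is why edges with similar end points end up on similar routes.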

Holten’s method is limited to hierarchical graphs and tree visualization methods. A

hierarchy tree is therefore required. In contrast, the visualization techniques presented **in**

this thesis can be applied to graphs without hierarchical **in**formation, too. In addition,

Holten’s bundl**in**g technique also routes edges implicitly, but the major requirements **of**

the present thesis like the avoidance **of** node occlusion and cluster **in**tersections are not

considered by Holten et al.

Figure 4.4: Two edge bundl**in**g results **of** Holten’s approach **in** [32]

Cluster-Based Edge Bundl**in**g Balzer and Deussen developed an **in**teractive visualization

**of** clustered graphs [7]. Implicit surfaces, i.e., adaptive shapes that enclose vertices **of** a

cluster, represent cluster bounds. Edges are routed and bundled to reduce the amount **of**

visualized **in**formation. The position **of** the cluster bounds is used to compute base po**in**ts

that are used to route and bundle edges across different clusters. Figure 3.1b on page 33

also shows a bundl**in**g **of** edges connect**in**g different clusters.

The level-of-detail of the graph visualization depends on the viewpoint. Both the visualization of cluster bounds and edges are influenced by the distance from the viewpoint. A close viewer cognizes bundled edges more individually and identifies the content of a cluster more clearly. More abstract and solidly bundled edges and opaque cluster bounds are shown to a distant viewer.

Figure 4.5: A Hierarchical Net in software landscapes, i.e., a 2.5-dimensional box-and-line visualization of a software system, taken from [8]

Hierarchical Nets **in** s**of**tware landscapes [8] is another edge bundl**in**g approach that

relies on the cluster**in**g **in**formation or the hierarchy **of** a graph. Figure 4.5 depicts an

example visualization. The vertices are placed on a two-dimensional plane (neglect**in**g the

transparent spheres, which **in**dicate clusters). The net **of** bundled edges connects vertices

across cluster bounds utiliz**in**g the third dimension.

Aga**in**, both mentioned cluster-based edge bundl**in**g techniques could not be applied to

produce hypergraph layouts, which fulfill our requirements **of** hypergraph visualizations.

First, the routes **of** edges do not respect the rema**in****in**g graph layout. Node occlusions

are not prevented. Second, these methods are not applicable to solely reduce the visual

complexity **of** hypergraphs s**in**ce a cluster**in**g **of** nodes is required. It is also not practicable

to automatically compute a reasonable cluster**in**g **of** any graph **in** advance, because an

**in**appropriate cluster**in**g would also cause an **in**appropriate edge bundl**in**g.

Flow Maps To conclude this section about edge bundl**in**g techniques **of** other authors, a

f**in**al approach is discussed. Flow map layouts [48] aggregate edges hierarchically. A b**in**ary

splitt**in**g method based on node positions determ**in**es the route **of** the flow. Start**in**g from

a certa**in** po**in**t **in** the layout, the flow splits **in**to two parts; the ma**in** part cont**in**ues to

connect the rema**in****in**g nodes and an auxiliary part connects the ma**in** flow with a close

node. These b**in**ary splits connect each node (**of** a hyperedge). Figure 4.6 shows an

example flow map.

One disadvantage of this method is the limitation to binary splits. Furthermore, a locally high node density leads to visual clutter. Very little work on the automated flow layout computation is available. The flow layout method by Phan et al. displaces nodes of the graph layout to allow the routing of the flow between any pair of nodes [48]. Consequently, no node occlusion occurs when using this method. However, the graph layout is changed, which does not comply with our requirements. Moreover, the flow routing method does not prevent cluster intersections.

Figure 4.6: A flow map taken from [48]

4.4 Energy-Based Curve Bundl**in**g

The curve bundling technique visually converges (parts of) close curves. The hyperedge model, i.e., the set of curves modeling a hyperedge, remains unchanged, but the visual complexity is reduced. An attraction between curves bundles curves with respect to the following two aspects.

• Close parts **of** curves are moved closer. The visual representations **of** those parts

may overlap each other or are very close.

• Distant parts **of** curves are not bundled. As those curve parts are not eligible for

bundl**in**g, they must not be affected at all.

Bundled curves share a common path **in** the graph layout. A viewer perceives an aggregated

group **of** curves and this visual aggregation **in** turn simplifies the displayed structural

**in**formation **of** the hypergraph.

The path **of** a bundled curve is roughly **in** the area between those paths **of** the **in**dividual

curves, whereupon other criteria may further **in**fluence the position**in**g. More precisely,

the goal is to place the bundled curves on a path that optimally fits **in** the graph layout.

Thus, all bundled curves have to be routed to determ**in**e their optimal paths regard**in**g the

graph layout as described **in** Section 3.1 about the motivation **of** rout**in**g.

Prerequisites The energy-based technique to bundle curves employs an attraction between

curves. Similar to energy-based rout**in**g, it is required to model the curves as cha**in**s

**of** dummy nodes. The bundl**in**g is **in**dependent **of** any particular hyperedge structure, so

this technique can be applied to the centralized and the fully connected structure. It is not


required to route curves before bundl**in**g. S**in**ce each change **of** a dummy node’s position

may result **in** node occlusion, the bundled curves must be routed after each bundl**in**g to

preserve the layout’s expressiveness. The order **of** rout**in**g and bundl**in**g is discussed **in**

Section 4.4.2 below.

Eligibility Curves are eligible for bundl**in**g if they are close to each other. This imprecise

requirement expresses the need **of** an attraction for close parts **of** curves only. Regard**in**g

the bundl**in**g **of** curves, distant curves must not affect each other. A value that def**in**es the

eligibility **of** parts **of** curves for bundl**in**g is denoted as a maximum distance dmax between

a pair **of** dummy nodes **of** two dist**in**ct curves. It must be def**in**ed **in** advance. The dynamic

identification **of** a proper value for dmax is not discussed here, as this section focuses on

the **in**troduction **of** the basic pr**in**ciple **of** this energy-based bundl**in**g approach.
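The eligibility test can be sketched as a plain distance check. The following lines assume that each curve is a list of 2D points whose first and last entries are the fixed end points and whose inner entries are the dummy nodes; this representation and all names are assumptions for illustration only.

```python
import math
from itertools import combinations

def eligible_pairs(curves, d_max):
    """Pairs ((i, k), (j, l)) of dummy nodes on distinct curves i and j
    whose distance does not exceed d_max."""
    pairs = []
    for (i, ci), (j, cj) in combinations(enumerate(curves), 2):
        for k, p in enumerate(ci[1:-1], start=1):      # skip the end points
            for l, q in enumerate(cj[1:-1], start=1):
                if math.dist(p, q) <= d_max:
                    pairs.append(((i, k), (j, l)))
    return pairs

# Hypothetical layout: only the first dummy nodes of both curves are close.
curves = [
    [(0, 0), (1, 1), (2, 1), (3, 0)],
    [(0, 2), (1, 1.5), (2, 3), (3, 2)],
]
pairs = eligible_pairs(curves, d_max=1.0)
```

The check compares all dummy nodes of all pairs of curves, which is quadratic in the number of dummy nodes; a spatial index could reduce this cost but is not part of the sketch.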

4.4.1 Attraction

Energy-based algorithms model curves rather by a set **of** dummy nodes than by a body.

Thus, the attraction acts between dummy nodes **of** dist**in**ct curves, but not between

dummy nodes **of** the same curve.

Attraction Energy The attraction energy is characterized by the follow**in**g requirements.

• The attraction energy **of** bundled curves is m**in**imal.

• The attraction energy **of** not bundled curves, which are eligible for bundl**in**g, is not

m**in**imal.

• The attraction energy of distant curves, which are not eligible for bundling, is minimal since the attraction must not influence distant parts of curves.

In comparison to the attraction **of** the energy-based rout**in**g technique, these requirements

demand a more laborious formalization **of** the attraction.

The first option is the usage of an attraction energy similar to the one that hinders strain of curves during energy-based routing. The bundling attraction energy EB of a dummy node δ1 towards another dummy node δ2 of a distinct curve is defined as below. Again, a > 0 is a customizable exponent, which is not related to the exponents of other energies in this work.

$$E_B = \frac{1}{a} \cdot \lVert p_{\delta_2} - p_{\delta_1} \rVert^{a} \tag{4.1}$$

However, this attraction energy EB is high for very distant pairs of dummy nodes. To restrict the influence of the attraction between parts of curves that are not eligible for bundling, the definition of the attraction energy EB(d) has to be limited to the range $0 \le d = \lVert p_{\delta_2} - p_{\delta_1} \rVert \le d_{\max}$ and otherwise it is defined to be already minimal, e.g., zero.

$$E_B(d) = \begin{cases} \frac{1}{a} \cdot d^{a} & \text{if } d \le d_{\max} \\ 0 & \text{else} \end{cases} \tag{4.2}$$
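As a small illustration, the restricted energy of Equation 4.2 can be evaluated as follows. The function name and the default values for the exponent a and the maximum distance dmax are arbitrary choices for this sketch.

```python
import math

def bundling_energy(p1, p2, a=2.0, d_max=1.0):
    """Attraction energy between two dummy nodes at positions p1 and p2
    that belong to distinct curves (restricted to distances up to d_max)."""
    d = math.dist(p1, p2)
    if d <= d_max:
        return d ** a / a
    return 0.0  # pairs farther apart than d_max are treated as already minimal

near = bundling_energy((0.0, 0.0), (0.5, 0.0))     # eligible pair
far = bundling_energy((0.0, 0.0), (3.0, 0.0))      # not eligible
bundled = bundling_energy((1.0, 1.0), (1.0, 1.0))  # coinciding dummy nodes
```

The energy is zero both for coinciding and for distant dummy nodes and positive only for eligible pairs that are not yet bundled, which matches the three requirements listed above.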

A second option is to combine the bundling attraction with the attraction force that hinders the strain of curves (cf. Section 3.4.3.1). Both attraction forces are set against each other in one combined energy model for concurrent routing and bundling of curves. Assuming that the bundling attraction is weaker than the strain delimiting attraction of curves, distant parts of curves (that are not eligible for bundling) are not bundled. This option is not formalized in the following.

Attraction Force The energy decisively determ**in**es the attraction force act**in**g on the

dummy nodes. The end po**in**ts **of** the curves are not displaced. The attraction force FB

on a dummy node δ1 is directly derived from the energy def**in**ed **in** Equation 4.2.

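$$F_B = \begin{cases} \lVert p_{\delta_2} - p_{\delta_1} \rVert^{a-1} \cdot \dfrac{p_{\delta_2} - p_{\delta_1}}{\lVert p_{\delta_2} - p_{\delta_1} \rVert} & \text{if } \lVert p_{\delta_2} - p_{\delta_1} \rVert \le d_{\max} \text{ and } \delta_1 \in \Delta(c_1),\ \delta_2 \in \Delta(c_2),\ c_1 \neq c_2 \\ 0 & \text{else} \end{cases} \tag{4.3}$$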

A dummy node δ1 of the curve c1 is attracted to a dummy node δ2 of a different curve c2 ≠ c1. There is a weak force, and thus a small node displacement, if the two dummy nodes are already close to each other. Consequently, the distance between them tends to zero. Distant parts of the curves, which are not eligible for bundling, are not bundled. This is because there is no attraction between dummy nodes that are more distant than dmax.
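The force of Equation 4.3 and its effect on a layout can be sketched as follows. Several details are assumptions made only for this illustration: curves are lists of 2D points with fixed end points at both ends, each dummy node is moved by a fixed fraction (step) of the summed force, and coinciding dummy nodes, for which the direction in Equation 4.3 is undefined, receive no force.

```python
import math

def bundling_force(p1, p2, a=2.0, d_max=1.0):
    """Force on the dummy node at p1 caused by a dummy node at p2
    of a different curve."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    d = math.hypot(dx, dy)
    if d == 0.0 or d > d_max:  # coinciding (assumption) or not eligible
        return (0.0, 0.0)
    magnitude = d ** (a - 1)
    return (magnitude * dx / d, magnitude * dy / d)

def bundling_step(curves, a=2.0, d_max=1.0, step=0.1):
    """Move every dummy node along the sum of the forces exerted by the
    dummy nodes of all other curves; end points stay in place."""
    moved = []
    for i, curve in enumerate(curves):
        new_curve = [curve[0]]
        for p in curve[1:-1]:
            fx = fy = 0.0
            for j, other in enumerate(curves):
                if j == i:
                    continue  # no attraction within the same curve
                for q in other[1:-1]:
                    gx, gy = bundling_force(p, q, a, d_max)
                    fx += gx
                    fy += gy
            new_curve.append((p[0] + step * fx, p[1] + step * fy))
        new_curve.append(curve[-1])
        moved.append(new_curve)
    return moved

# Two hypothetical parallel curves with one dummy node each, 0.5 apart.
curves = [[(0, 0), (1, 0), (2, 0)], [(0, 0.5), (1, 0.5), (2, 0.5)]]
moved = bundling_step(curves)
```

Repeating the step shrinks the gap between eligible dummy nodes further. As stated in the prerequisites, the moved curves still have to be routed afterwards, because the displacement itself does not take node occlusion into account.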

Figure 4.7: Energy-based bundling of two curves eligible for bundling (labels in the sketch: attraction force FB, maximum distance dmax, bundled curves)

Figure 4.7 shows a sketch **of** the attraction act**in**g between two close curves. Direction

and magnitude **of** the bundl**in**g attraction force are depicted by the arrows between dummy

nodes. For simplicity and to avoid clutter in the figure, each dummy node is only attracted to one other dummy node. The bundling result is two closer, visually bundled curves.

4.4.2 Order **of** Energy-Based Rout**in**g and Bundl**in**g

The order **of** rout**in**g and bundl**in**g may impact the produced hypergraph layouts. Therefore,

the differences are briefly outl**in**ed **in** the follow**in**g paragraphs.

4.4.2.1 Separated Curve Rout**in**g and Bundl**in**g

Energy-Based Curve Routing First The energy-based routing of curves can radically separate initially close curves (depicted in Figure 4.8a). As the example visualization in Figure 4.8b shows, after routing both curves are clearly separated by the central cluster of repulsing nodes. Consequently, the energy-based approach from above will not bundle the curves as both are too distant and not eligible for bundling anymore.

(a) An example drawing of a not routed and not bundled hyperedge

(b) The curves of the hyperedge in (a) are routed first

(c) The curves of the hyperedge in (a) are bundled before routing

Figure 4.8: Impact of the order of energy-based routing and bundling on the resulting hypergraph layout

After the bundling of curves, all moved curves must be re-routed to ensure that both the occlusion of nodes and the intersection of clusters are avoided.

Energy-Based Curve Bundl**in**g First Figure 4.8c depicts the result **of** the example hypergraph

where both curves are bundled before rout**in**g. This approach obviously produces a

different layout compared to Figure 4.8b. A re-rout**in**g is not necessary for this approach.

Conclusion It is possible to create **in**stances to underl**in**e advantages and disadvantages

**of** both orders. No particular order can be favored because **of** their layouts. The need **of**

the first approach to re-route the curves might be a drawback due to its additional time

consumption.

4.4.2.2 Concurrent Curve Rout**in**g and Bundl**in**g

A further option is the comb**in**ation **of** rout**in**g and bundl**in**g. This section picks up the

statement from the discussion **of** rout**in**g techniques **in** Section 3.5 that the rout**in**g **of** curves

fosters curve bundl**in**g. Hence, the subsequent comb**in**ation **of** energy-based rout**in**g and

bundl**in**g is a reasonable option.

This comb**in**ation was already discussed above **in** Section 4.4.1 as a second option to

formalize the attraction energy EB. The comb**in**ation **of** both techniques **in**to one consistent

energy model overcomes the need to decide on the order **of** rout**in**g and bundl**in**g.


Thus, both goals **of** rout**in**g and bundl**in**g are likewise pursued. This approach applied

on the example hypergraph may result **in** layouts similar to both layouts **in** Figures 4.8b

and 4.8c.

Energy-Based Rout**in**g Fosters Bundl**in**g The purpose **of** rout**in**g techniques is to move

curves to paths without h**in**der**in**g nodes. These correspond to paths with (locally) low

total repulsion energy caused by nodes. For **in**stance, an optimal path for routed curves

between two clusters **of** repuls**in**g nodes is located **in**-between them. There is a high

probability that two curves, which are **in**itially placed close to this path, will be placed

at this optimal path with low energy after rout**in**g. Subsequently, energy-based rout**in**g

already assists bundl**in**g as close curves are converged. The comb**in**ation **of** rout**in**g and

bundl**in**g thus can support each other to visually bundle curves.

4.4.3 Conclusion

The energy-based bundl**in**g technique visually bundles close parts **of** curves together. However,

it requires an explicit distance-based restriction to decide which curves are eligible

for bundl**in**g. Alternatively, energy-based rout**in**g and bundl**in**g can be comb**in**ed **in**to one

consistent energy model. Both approaches **of** the bundl**in**g technique reduce the visual

complexity **of** hypergraphs.

An automated identification of bundled parts of the curves is not possible by this energy-based bundling technique, because the hypergraph model remains unchanged. A hypergraph drawing consequently shows the curves individually; the impression of an aggregation of curves arises by an overlapping of curves.

The follow**in**g section **in**troduces a technique to aggregate visually bundled curves **in** the

hyperedge model.

4.5 Energy-Threshold-Based Aggregation

The reduction **of** the complexity **of** the hyperedge models might be preferable to a solely

visual bundl**in**g **of** curves. The **in**formation **of** which parts **of** curves are aggregated can be

useful for further hypergraph process**in**g or the visualization. Furthermore, a simplified

model reduces the complexity **of** the hypergraph layout computation.

This technique identifies close parts **of** curves **of** a given hypergraph layout, aggregates

those parts **of** the curves and reduces the complexity **of** the hyperedge structure accord**in**gly.

Each curve creates an energy field with maximum field strength at the curve’s position. The strength of the energy field decreases with increasing distance from the curve. The energy fields of close curves overlap each other. Two parts of curves are aggregated if the sum of both field strengths between both parts is higher than a certain threshold.

Prerequisites Routed and bundled hypergraph layouts are likely to position curves more closely. The energy-based routing tends to accumulate curves at paths with locally low energy and the energy-based bundling technique directly aims at a visual bundling of curves. Routing or bundling of curves is not required to apply this aggregation technique, but can be an expedient step in advance.

Threshold-Based Closeness The closeness of curves can be measured by the distances between them or by the strengths of energy fields. Both measures are similar as the energy field strength is a function of the distance, e.g., $E \sim d^{x}$. Energy is a more abstract measure, as it can be adjusted to the scaling of the graph layout, and it allows different energies for different curves, e.g., to increase the energy of aggregated curves.

Two po**in**ts on dist**in**ct curves are close to each other, and thus will be aggregated, if

the sum **of** both energy field strengths between those po**in**ts is higher than the energy

threshold. Aga**in**, it is necessary to model curves by cha**in**s **of** dummy nodes. The energy

field strengths are computed with respect to the positions **of** the dummy nodes.

4.5.1 Rationale

Once an energy threshold is set, the sum of the energy field strengths $E_{c_1} + E_{c_2}$ between two dummy nodes δ1 ∈ ∆(c1) and δ2 ∈ ∆(c2) of two distinct curves c1 and c2 is calculated. The energy sum $E(p_s) = E_{c_1}(d_1) + E_{c_2}(d_2)$ at a position $p_s$ depends on the distances $d_1 = \lVert p_{\delta_1} - p_s \rVert$ and $d_2 = \lVert p_{\delta_2} - p_s \rVert$ to the dummy nodes, respectively.

The curves are aggregated if and only if the energy sum reaches the energy threshold t, i.e., $E(p_s) \ge t$, at all positions $p_s$ between the dummy nodes δ1 and δ2. The energy field strengths against the distance are plotted in Figure 4.9 for the following scenario. The cross sections of curves are depicted; the horizontal axis reflects the distance between them and the vertical axis reflects the energy field strength. The two curves c0 and c1 were already aggregated and they both create a stronger energy field than curve c2. The sum of all field strengths (depicted by the bold plot in this figure) between the curves is higher than the threshold value (the dotted line). Eventually, the curves c0, c1, and c2 are aggregated.

A threshold distance d_t is derived from the threshold energy t and the energy fields E_c1
and E_c2. As the energies can differ for each curve, the threshold distance d_t can differ
for different pairs of curves. Consequently, two parts of the curves c1 and c2 are close
if and only if the distance ‖p_δ2 − p_δ1‖ ≤ d_t between the dummy nodes is not larger than
the threshold distance d_t.
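The aggregation criterion can be illustrated with a minimal Python sketch. It assumes a hypothetical field shape E_c(d) = s_c / (1 + d) with a per-curve strength s_c, which the text does not prescribe, and it approximates "all positions between the dummy nodes" by sampling the straight segment between them. The function names and the sample count are illustrative only.

```python
import math


def field(strength, d):
    """Hypothetical energy field that decays with distance d (assumed form)."""
    return strength / (1.0 + d)


def close_enough(p1, p2, s1, s2, t, samples=11):
    """Check E(p_s) = E_c1(d1) + E_c2(d2) >= t at sampled positions p_s.

    p1, p2 are the positions of the dummy nodes delta_1 and delta_2,
    s1, s2 the (assumed) field strengths of their curves, t the threshold.
    """
    for k in range(samples):
        u = k / (samples - 1)
        ps = (p1[0] + u * (p2[0] - p1[0]), p1[1] + u * (p2[1] - p1[1]))
        d1 = math.dist(ps, p1)
        d2 = math.dist(ps, p2)
        if field(s1, d1) + field(s2, d2) < t:
            return False
    return True
```

With these assumptions, a stronger field on one curve, e.g., an already aggregated curve, lets the pair pass a threshold that two weaker curves at the same distance would fail.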

The position of an aggregated curve is determined by the remaining graph layout. The
aggregated curves are roughly positioned between the individual curves. The precise
positions are computed by the energy-based routing technique to avoid the introduction
of node occlusion or cluster intersection by the aggregation of curves.

4.5.2 Conclusion

This curve aggregation technique based on an energy threshold simplifies the
hypergraph model and thus reduces the visual complexity of hypergraph drawings. The
specification of a threshold determines which curves are aggregated, so the choice of a proper
threshold value is crucial.

Figure 4.9: The aggregation of close curves based on the preset energy threshold (dotted
horizontal line). The plot shows the cross section of the curves c0, c1, and c2 along the
distance axis d, their energy field strengths E, the energy sum, and the threshold.

This aggregation technique can be applied to hyperedges in general. A previous routing
or bundling of curves is optional. However, the technique can enhance the energy-based
bundling technique of the previous section if both techniques are used in sequence: it
bridges the gap between visual bundling and a model-based aggregation of curves by
transforming a visually bundled curve layout into aggregated curves in the model. The
cluster-based curve aggregation technique in the subsequent section is, however, less
laborious than this one.

4.6 Cluster-Based Curve Aggregation

The aggregation of curves based on spatial clustering is another technique to reduce the
visual complexity of hypergraph visualizations. The hypergraph structure is simplified in
the model, which corresponds to a reduction of the intrinsic cognitive load. This technique
first groups curves that are close to each other. Then, curves of the same group are
aggregated as far as possible; at a certain point, the aggregated curve branches out into
individual curves that join the hyperedge nodes.

Prerequisites. In this section, for the sake of simplicity, the centralized hyperedge structure
is assumed unless stated otherwise. Curves connect hyperedge nodes with the crux.
Assuming that the curves are not routed yet, curves are aggregated near their ends at
the crux due to the small distance between them. The absolute distance between two
neighboring curves in a radial setup is not crucial for grouping; instead, the included angle
determines their eligibility for aggregation.

At the end of this section, it is shown that the cluster-based curve aggregation can
also be applied to the fully connected structure, as the technique is not restricted to the
centralized structure.


4 Reduction of Visual Complexity

The remainder of this section first introduces a simple prototype algorithm to identify groups
of curves that are to be aggregated. Section 4.6.2 then shows a method for calculating the
branch points of aggregated curves, and Section 4.6.3 describes movable curve end points
as a consequence of aggregating and branching curves. Finally, a conclusion on this curve
aggregation technique is drawn.

4.6.1 Identification of Curve Groups

The first step towards an aggregation is the identification of groups of close curves. A
polar coordinate system has its origin at the position of the crux. The hyperedge node
positions are described by the azimuth angle (or polar angle) θ and the curve length. The
azimuth angle θ(c) of a curve c is the included angle between the zero-degree ray (polar axis)
and the curve.
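As a small illustration of this polar description, the following Python sketch computes the azimuth angle and the length of an unrouted curve from the positions of the crux and a hyperedge node. The function name and the convention of reporting angles in degrees within [0, 360) are assumptions made for this example.

```python
import math


def azimuth_and_length(crux, node):
    """Return (azimuth in degrees within [0, 360), Euclidean length).

    crux and node are (x, y) positions; the polar axis is assumed to be
    the positive x-axis through the crux.
    """
    dx = node[0] - crux[0]
    dy = node[1] - crux[1]
    theta = math.degrees(math.atan2(dy, dx)) % 360.0
    return theta, math.hypot(dx, dy)
```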

The plethora of well-investigated clustering algorithms [63] makes it easy to partition
hyperedge nodes. The azimuth angle serves as the primary criterion to cluster the respective
curves. The Euclidean distance between hyperedge nodes and the crux may also influence
the creation of groups, but it is not considered here.

Different clustering algorithms are expedient for different purposes. As this work does
not focus on a specific application or criterion of hypergraph visualization, a discussion of
clustering algorithms is outside its scope. The following paragraphs therefore introduce a
simple strategy to group nodes. This strategy is adequate to explain the concept of curve
aggregation and to document a decrease in the visual complexity of hypergraphs.


1. The azimuth angle θ(c) of each curve c is computed.

2. The included angle ∆θ(ci, cj) between each pair of distinct curves ci and cj is computed.

   \[ \Delta\theta(c_i, c_j) = |\theta(c_i) - \theta(c_j)| \tag{4.4} \]

3. The pair of curves ci and cj with minimal included angle ∆θ(ci, cj) is aggregated. The
   aggregated curve cij is located between both individual curves. The direction of the
   aggregated curve is affected by the curve weights w(ci) = len(ci) and w(cj) = len(cj),
   which correspond to the curves' lengths, i.e., the Euclidean distance between the end
   points of each curve. The azimuth angle θ(cij) of the aggregated curve is given as
   follows.

   \[ \theta(c_{ij}) = \theta(c_i) + \frac{\mathrm{len}(c_j)}{\mathrm{len}(c_i) + \mathrm{len}(c_j)} \cdot \bigl(\theta(c_j) - \theta(c_i)\bigr) \tag{4.5} \]

   The weight w(cij) = len(ci) + len(cj) of an already aggregated curve cij is the sum
   of the lengths of all curves aggregated into cij.

4. The curve group cij is added and the two individual curves ci and cj are removed
   from the set of curves in the model of the hyperedge structure.

5. Steps 1 through 4 are repeated until there is no further pair of distinct curves with
   an included angle less than the maximum angle. The maximum angle ∆θmax is
   the termination criterion of this algorithm and specifies which curves are eligible for
   aggregation.
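The five steps can be sketched in Python as follows. This is a minimal illustration rather than the thesis implementation: the data layout (a dictionary of curve names mapped to azimuth and length), the function names, and the tie-breaking by iteration order are assumptions. In addition, the included angle is wrapped around 360 degrees, which Equation 4.4 does not state explicitly but which the values in Table 4.2 suggest.

```python
def included_angle(theta_a, theta_b):
    """Included angle in degrees, wrapped into [0, 180] (wrap-around assumed)."""
    diff = abs(theta_a - theta_b) % 360.0
    return min(diff, 360.0 - diff)


def group_curves(curves, max_angle):
    """Greedy grouping of curves by azimuth angle.

    curves: dict mapping a curve name to (azimuth in degrees, length).
    Returns a dict mapping tuples of curve names to (azimuth, weight).
    """
    groups = {(name,): (theta % 360.0, length)
              for name, (theta, length) in curves.items()}
    while len(groups) > 1:
        keys = list(groups)
        # Step 2: included angles of all pairs of distinct curves or groups.
        pairs = [(included_angle(groups[a][0], groups[b][0]), a, b)
                 for i, a in enumerate(keys) for b in keys[i + 1:]]
        # Step 3: pick the pair with the minimal included angle.
        angle, a, b = min(pairs, key=lambda item: item[0])
        if angle >= max_angle:  # Step 5: termination criterion
            break
        theta_a, w_a = groups.pop(a)
        theta_b, w_b = groups.pop(b)
        # Signed shortest rotation from theta_a to theta_b.
        delta = (theta_b - theta_a + 180.0) % 360.0 - 180.0
        theta = (theta_a + w_b / (w_a + w_b) * delta) % 360.0  # Equation 4.5
        groups[a + b] = (theta, w_a + w_b)  # Step 4: replace the pair
    return groups
```

For example, `group_curves({"c0": (0.0, 1.0), "c1": (30.0, 2.0), "c2": (200.0, 1.0)}, 90.0)` merges c0 and c1 into one group whose azimuth lies closer to the longer curve c1 and leaves c2 on its own.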

Figure 4.10: Visualization of an exemplified hyperedge with the crux and the hyperedge
nodes v0 to v6. (a) Hyperedge without curve aggregation. (b) Aggregated curves for ∆θmax = 90°.

        c0    c1    c2    c3    c4    c5
c6      30    60   120   150   120    60
c5      90   120   180   150    60
c4     150   180   120    90
c3     120    90    30
c2      90    60
c1      30

Table 4.2: Initial included angles (in degrees) between curves of the hyperedge in Figure 4.10a

        c0,6  c1    c2    c3    c4
c5      70   120   180   150    60
c4     130   180   120    90
c3     140    90    30
c2     110    60
c1      50

Table 4.3: Updated included angles (in degrees) after the aggregation of the curves c0 and c6

        c0,6,1  c2,3
c4,5    130     120
c2,3    110

Table 4.4: Result of the grouping algorithm (included angles in degrees)


The example in Figure 4.10a depicts a centralized hyperedge composed of seven curves.
The included angles between curves are shown in Tables 4.2 through 4.4. The bold values
in these tables mark the minimal included angle between the two curves that are aggregated in
the respective grouping step. Two grouping steps between Table 4.3 and the result in Table 4.4
were omitted.

If the maximum aggregation angle is set to ∆θmax = 90°, then, in accordance with
Table 4.4, the presented grouping algorithm computes three groups of curves, c0,6,1,
c2,3, and c4,5, as depicted in Figure 4.10b.

4.6.2 Branch Out of Aggregated Curves

In the following, an aggregated curve that emerged from the aggregation of two curves ci
and cj is denoted by cij. After curves are aggregated and the direction of the aggregated
curves is computed, it is necessary to branch out the aggregated curves again so that
the respective hyperedge nodes are connected to the aggregated curve. The branch point
should be placed in a way that reduces the visual complexity and allows a smooth furcation
of the aggregated curve into individual curves.

A branch point p_b is placed on the latter half of an aggregated curve cij, i.e., the half
that is not adjacent to the crux. A large included angle between two curves ci and cj
corresponds to a short common aggregated path of ci and cj. The maximum common path
length of the curves that are aggregated in cij is limited by the length min{len(ci), len(cj)}
of the shorter individual curve. The minimal common aggregated path is here set to half
of this length.

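\[
\begin{aligned}
p_b &= \tfrac{1}{2} \min\{\mathrm{len}(c_i), \mathrm{len}(c_j)\} + \tfrac{1}{2} \min\{\mathrm{len}(c_i), \mathrm{len}(c_j)\} \cdot \left(1 - \frac{\Delta\theta(c_i, c_j)}{\Delta\theta_{max}}\right) \\
    &= \min\{\mathrm{len}(c_i), \mathrm{len}(c_j)\} \cdot \left(1 - \frac{\Delta\theta(c_i, c_j)}{2 \cdot \Delta\theta_{max}}\right)
\end{aligned}
\tag{4.6}
\]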
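Equation 4.6 translates directly into code. The following Python sketch returns the distance of the branch point from the crux along the aggregated curve; the function and parameter names are chosen for this example only.

```python
def branch_point_distance(len_i, len_j, delta_theta, delta_theta_max):
    """Distance of the branch point p_b from the crux (Equation 4.6).

    len_i, len_j: lengths of the two curves being aggregated.
    delta_theta: their included angle; delta_theta_max: the maximum angle.
    """
    shortest = min(len_i, len_j)
    return shortest * (1.0 - delta_theta / (2.0 * delta_theta_max))
```

For an included angle of zero the common path spans the whole shorter curve; for an included angle equal to ∆θmax it shrinks to half of the shorter curve, which matches the minimal common path stated above.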

An aggregated curve, i.e., a group of curves, that consists of more than two individual
curves is branched out in almost the same manner. The two longest curves ci and cj of
an aggregated curve are bifurcated first according to Equation 4.6. Then the third longest
curve ck is branched out from the aggregated curve cij. To do this, the length of the
aggregated curve cij needs to be known. This length is defined as the distance between the
crux and the latest branch point p_b of cij. This procedure is repeated until all hyperedge
nodes of the group are connected.

Finally, the branch points are connected to the corresponding hyperedge nodes and to
the other branch points on the path to the crux. The result of curve aggregation applied
to the example above is shown in Figure 4.10b.

4.6.3 Movable Curves

Branch points, as well as the crux, do not take the given fixed graph layout into account.
Thus, they may be placed inappropriately: they can occlude nodes, or they could be placed
inside a dense group of nodes. The end points of curves that are introduced by the
branching method must also meet the requirements of hypergraph visualizations from
Section 2.2.2.

Hyperedge nodes that are end points of curves are not displaced. The other curve end
points are movable in the layout area. The optimal position of these curve end points can
be computed during energy-based routing. The energy model for routing presented in
Section 3.4 can also integrate movable curve ends as dummy nodes. Consequently, an
optimal position of an end point respects the fixed graph layout (cf. node repulsion in
Section 3.4.2) and prevents excessively long curves (cf. strain of curves in Section 3.4.3).

4.6.4 Conclusion

A prototype algorithm for the aggregation of close curves was introduced. This algorithm
serves as a placeholder for more sophisticated and potentially better adapted grouping
algorithms that yield visually pleasing and meaningful results. For instance, a hierarchical
clustering algorithm can produce a hyperbolic tree, i.e., a nested aggregation of curves.
Nevertheless, the algorithm demonstrates that curve aggregation is able to reduce the visual
complexity of hypergraph drawings, as the evaluation in Section 5.4.1 shows. To conclude
the example from above, the aggregation of curves reduced the total curve length by
more than 20 percent and increased the number of curves by 2.

The aggregation and branching of curves introduces new curves. The increasing number
of curves and dummy nodes generally increases the computation time needed to generate
hypergraph layouts. At the same time, it increases the quality of approximation of the
curve representation.

Fully-Connected Hyperedge Structures. The aggregation of curves is not limited to centralized
hyperedge structures. Other structures may not have a crux, but a subset of their
curves is incident, as these curves share a common hyperedge node. The curve aggregation
can be applied to all hyperedge nodes where incident curves can be aggregated.

This approach will certainly not aggregate close curves that are not incident at all. The
centralized structure is therefore preferable for curve aggregation.

4.7 Energy-Based Curve Widening

The visual complexity of hypergraphs can be reduced by bundling curves visually
or by aggregating them in the hypergraph model. The widening of curves is a technique to
bundle curves visually, as wide curves are more likely to overlap each other. This
approach also aims at reducing the visual complexity by reducing the extraneous
cognitive load. The intrinsic load is not modified.

As the width of curves in hypergraph drawings increases, the curves are perceived as
planes rather than line segments. A hyperedge is represented by a plane construct
in the unused graph layout area. It is visually clearly separated from the remaining graph,
in particular from ordinary straight-line edges. These different visualization paradigms
of hyperedges on the one hand and boxes and lines on the other foster the readability of
hypergraph visualizations and still allow binary edges of the underlying graph to be visualized.

Based on the findings of the examined curve routing approaches in Chapter 3, an energy-based
approach is favored over a geometrical approach. Geometrical concepts tend to rely
on the specification of fixed values and spatial computations, which can become fairly
complex as a plethora of possible cases has to be considered.

This section introduces a transformation of curves into a plane structure by utilizing energy
fields. First, an introduction to the visualization of curves in hypergraph drawings
is given to further motivate a widening of curves. Section 4.7.3 introduces a model of
widened curves that allows their representation in energy fields. Afterwards, Section 4.7.4
outlines the basic principle of this approach, which is formalized in Section 4.7.5. A conclusion
on this energy-based widening technique closes this section.

4.7.1 Visualization of Curves

The visual representation of a curve in a graph drawing is a line segment. The positioning
of curves and the specification of the line width influence their visualization. A brief
digression on the positioning of curves is given first. Then, the line width of visualized
curves is discussed.

Positioning. The positioning of curves is based on the positions of the dummy nodes.
Curves may be drawn as a sequence of straight line segments connecting neighboring
dummy nodes. Alternatively, the next two paragraphs present curve fitting methods
that draw the curves smoothly.

Splines may be used to form smooth curves along the dummy nodes. The de Casteljau
algorithm [28] computes such a curve for any number of control points. The control
points are given by the dummy nodes and the end points of the curve.

However, splines do not necessarily pass through the control points. They are only required
to go through the end points of a curve. In contrast, so-called interpolating Lagrange
curves pass through all given points. The Aitken algorithm [28] computes interpolating
Lagrange curves, i.e., lines passing through all dummy nodes and the end points of the curves.
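The difference between the two curve fitting methods can be made concrete with a short Python sketch. It shows the de Casteljau evaluation of a Bézier curve and the Aitken evaluation of an interpolating Lagrange curve for two-dimensional points. The function names and the explicit parameter values passed to `aitken` are assumptions of this example; the text does not specify how the dummy nodes are parameterized.

```python
def de_casteljau(points, u):
    """Evaluate the Bezier curve with the given control points at u in [0, 1]."""
    pts = [tuple(p) for p in points]
    while len(pts) > 1:
        # Repeated linear interpolation between neighboring points.
        pts = [((1 - u) * x0 + u * x1, (1 - u) * y0 + u * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]


def aitken(points, params, t):
    """Evaluate the Lagrange curve through points[i] at parameter params[i].

    params must be strictly increasing (assumed parameterization).
    """
    pts = [tuple(p) for p in points]
    n = len(pts)
    for r in range(1, n):
        pts = [tuple(((params[i + r] - t) * a + (t - params[i]) * b)
                     / (params[i + r] - params[i])
                     for a, b in zip(pts[i], pts[i + 1]))
               for i in range(n - r)]
    return pts[0]
```

For the control points (0, 0), (1, 2), (2, 0), the Bézier curve at u = 0.5 only reaches (1, 1), whereas the Lagrange curve with parameters 0, 1, 2 passes exactly through the middle point (1, 2) at t = 1.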

Splines and Lagrange curves both tend to deviate more than straight lines connecting
neighboring dummy nodes. Since energy-based routing only guarantees the prevention of
node occlusion at the positions of the dummy nodes, splines are more susceptible to node
occlusion.

Line Width. Besides the positioning of curves, their line width is of particular concern in
the remainder of this section. Each line drawing has a certain width. Curves can be visualized
with marginal line width or by plane constructs with a much larger curve width. A wide
curve must still avoid node occlusion to meet the requirements of hyperedge visualizations.
Therefore, if the line width is not minimal, a plane curve has to spare the nodes.

Another benefit of a plane curve representation is that characteristic forms of a hyperedge
are easier to recognize. An on-line animation of the evolution of a hypergraph may
slightly change the positions or the set of nodes in each step. Characteristic forms, like
a bulge of a wide curve, support the viewer's navigation and orientation in the hypergraph
drawing. In contrast, lines are not able to form characteristic shapes that are stable
against a slight change of a graph layout in an on-line animation.

Further, the curve width may reflect information about the hypergraph. For instance,
suppose a broad curve is placed between two clusters. In the following step of an on-line
animation of this hypergraph, the two clusters grow as further nodes are added. As a result
of this change, the curve width decreases, since the increased repulsive strength of the
clusters hinders the widening more strongly. A viewer of this animation can thus interpret
the curve width as an indicator of the density of the surrounding nodes.

4.7.2 Prerequisites

Widened curves must also fulfill the requirements of hypergraph visualization. The expressiveness
of the given graph layout is preserved if the widened curves do not occlude
nodes and, of course, the graph layout does not change.

The width of a visualized curve has to be limited. Even if there are no graph nodes limiting
the widening of a curve, the curve must not be widened infinitely.

4.7.3 Model of a Widened Curve

The definition of repulsion and attraction energies enables the calculation of the layout
of widened curves. Similar to the model of curves, which allows curves to be handled in
energy fields, a model of planes representing widened curves in energy fields is required
first. Afterwards, the energy-based principle and the composition of the energies into one
consistent energy model are examined.

Hull. A routed curve is modeled as a chain of dummy nodes between the end points.
These dummy nodes are also the starting point of the model of a widened curve. A widened
curve is basically a line segment visualized with a non-zero line width. It is sufficient to
model the hull, i.e., the boundary of the plane, to describe a widened curve. The hull
circumferentially encloses the entire curve and is the set of the outermost points of the
widened curve. Figure 4.11a shows the hull of a routed curve as a dashed line.

Hull Points. Still, the hull of a curve is an infinite set of points. Similar to curves, a
proper approximation is needed to compute the influence of energy fields on the hull and
thus on the body of a widened curve. The concept of dummy nodes is reused to construct
an approximation of the hull. This way, the ability to raise the accuracy of curves
incrementally also holds for the accuracy of the corresponding hulls.

Perpendicular vectors on both sides of a curve, originating at the positions of the dummy
nodes, reflect the potential positions of the hull points. Figure 4.11a depicts hull points as
gray circles placed on the hull. The direction c_δ of a curve c at the position of a dummy
node δ is derived from the positions of both neighboring dummy nodes δ1, δ2 ∈ neighbor(δ).
This complies with the linear Bézier spline between the two control points p_δ1 and p_δ2,
obtained by linear interpolation.

\[ c_\delta = p_{\delta_2} - p_{\delta_1} \]

Figure 4.11: Energy-based widening of routed curves. (a) Hull of a routed curve modeled by
hull points. (b) Perpendiculars of a routed curve. (c) The result of the transformation from
a routed curve to a plane.

Then, in two-dimensional space, the perpendiculars o1 and o2 at the position of δ are
orthogonal to the local direction c_δ of the curve.

\[ o_1 = \begin{pmatrix} -c_\delta^y \\ c_\delta^x \end{pmatrix}, \qquad o_2 = \begin{pmatrix} c_\delta^y \\ -c_\delta^x \end{pmatrix} \tag{4.7} \]

As the hull points are part of the hyperedge layout, their positions in the graph
layout are denoted by p_h.
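The construction of the perpendiculars can be sketched in Python as follows. The sketch takes a curve as a list of positions (end points and dummy nodes in curve order), derives the local direction from both neighbors, and applies Equation 4.7. At the end points it falls back to the end point and its single neighbor. The function name and the list-based data layout are assumptions, and the additional hull point on the extension of the curve at each end point is omitted for brevity.

```python
def perpendiculars(chain):
    """Return a list of (o1, o2) perpendicular pairs, one per chain position.

    chain: list of (x, y) positions of the end points and dummy nodes.
    """
    result = []
    last = len(chain) - 1
    for k in range(len(chain)):
        prev_p = chain[max(k - 1, 0)]      # the end point itself at k == 0
        next_p = chain[min(k + 1, last)]   # the end point itself at k == last
        cx = next_p[0] - prev_p[0]         # local direction c_delta
        cy = next_p[1] - prev_p[1]
        result.append(((-cy, cx), (cy, -cx)))  # Equation 4.7
    return result
```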

Figure 4.12: Acting forces in the process of energy-based widening. The figure labels the
forces FNR, FCR, FCA, and FHA acting on the hull points on the perpendiculars o1 and o2
near a node v.

Figure 4.11b depicts a routed curve and its perpendicular vectors. As each end point of
the curve has only one neighboring dummy node, the direction of the curve at its end is
derived from the end point and its neighboring dummy node. Further, another hull point
and perpendicular are introduced for each end point of a curve. Both are placed on the
extension of the curve's direction at the end point.

4.7.4 Rationale

The starting point of the widening algorithm is a routed curve, whose line width can be
assumed to be zero. The initial hull is therefore placed on the curve. The hull is “inflated”
by a repulsion, which pushes the hull equally in all directions away from the curve.
The repulsion of the hull points by the curve is limited to a maximum curve width.

It is crucial that the hull does not occlude nodes of the graph layout. Therefore, the
repulsion of the curve is limited by an opposing node repulsion. Figure 4.11c sketches the
result of this approach. A gray plane represents the area enclosed by the hull of the curve,
which spares the nodes. Before all energies are formalized, their purposes and requirements
are briefly summarized in the subsequent paragraphs.

Node Repulsion. The repulsion of hull points by the curve, which is addressed below, is
the driving force of this energy-based widening technique. However, the node repulsion is
more important for the definition of an energy model, because widened curves must not
occlude graph nodes. Figure 4.12 illustrates the node repulsion, labeled FNR, and
the curve repulsion, labeled FCR, acting on the hull points. The remaining two
forces in Figure 4.12 are introduced below.

The specification of the node repulsion is crucial, as it has to be permanently predominant
over the remaining energies to guarantee the prevention of node occlusion. The hull
points are only repulsed by the nodes on the same side of the curve. Otherwise, the curve
width would be increased by the repulsion of graph nodes on the opposite side of the curve.
Consequently, the layout on one side of a curve does not influence the hull on the other
side of the curve.

Again, the accuracy of the hull model determines the quality of the result. A large
distance between neighboring dummy nodes, which also entails a large distance between
neighboring hull points, is more susceptible to node occlusion than a smaller distance.

Figure 4.13: Two hull points are mainly repulsed by the close curve. The node repulsion
force FNR is minor and thus does not prevent node occlusion.

Curve Repulsion. Hull points are repulsed from the curve to widen the curve. Each hull
point is repulsed in the direction of the corresponding perpendicular. In contrast to previous
repulsions used in this work, the curve repulsion must not tend to infinity for very small
distances between dummy node and hull point. Otherwise, as Figure 4.13 depicts, the curve
repulsion might be stronger than a node repulsion if the distance between curve and hull
point is significantly smaller than the distance between node and hull point. This would
consequently permit node occlusion.

Curve Attraction. In order to prevent an infinite curve width, an attraction between the
curve and the hull is added. The curve attraction acts in the opposite direction of the
curve repulsion. The curve repulsion decreases and the curve attraction increases
with increasing distance from the curve, so the curve attraction starts to dominate the curve
repulsion at a certain distance Wc. The specification of curve repulsion and attraction thus
determines the maximum curve width Wc of a curve.

Hull Attraction. Node repulsion, curve repulsion, and curve attraction are still not sufficient
to produce proper hull layouts. As the node repulsion can be much stronger than
curve attraction and curve repulsion in order to avoid node occlusion, hull points can deviate
strongly from the perpendiculars. An attraction between neighboring hull points of a curve's
hull keeps them closer to the perpendiculars and hinders a distortion of the
hull. This hull attraction is similar to the attraction between neighboring dummy nodes
that hinders the strain of curves.

4.7.5 Formalization

Four energies are required to compute the hulls that represent widened curves. After
the clarification of their purposes in the previous section, this section formalizes the energies,
the forces as functions of the distance, and the iterative displacements of hull points. The
creation of an equilibrium is already considered as the equations are introduced.

4.7.5.1 Node Repulsion

The hull points are repulsed by the nodes on the same side of the curve. A node v ∈ V
of a hypergraph H = (V, E) at a small distance from the hull point h exerts a strong
repulsion. The repulsion diminishes with increasing distance ‖p_v − p_h‖. The node
repulsion energy only affects the position of the hull point; the graph layout
remains unchanged.

Equilibrium. To create a balance of all forces in the widening process, the magnitudes
of the forces are balanced with regard to optimal layouts. A similar procedure to balance the
forces of energy-based routing was thoroughly introduced in Section 3.4.4. Hence, the
coefficients used in the following equations are only briefly justified.

Nodes are obstacles for the widening of curves. As the width of a curve increases, i.e.,
the hull points are moved along the perpendiculars away from the respective dummy nodes,
the node repulsion must be stronger than the repulsion by the curve in the close proximity
of a node. This requirement must hold for every possible position of hull points in such
a close proximity of nodes to prevent node occlusion. Otherwise, as Figure 4.13 already
depicted, curves may occlude nodes.

The radius of the area around a node in which the node repulsion must be stronger than
the curve repulsion depends on the distance d_h between neighboring hull points of a curve,
as illustrated in Figure 4.14. If a node is located between two neighboring perpendiculars,
then the distance between the node and the closest hull point is smaller than d_h/2 for a
certain curve width. Therefore, the node repulsion at a distance d_h/2 from the node must
always be stronger than the curve repulsion. Because there are no other requirements on
an equilibrium so far, the magnitude of the node repulsion force can be defined as 1 at the
distance d_h/2.

Figure 4.14: Distance d_h between two neighboring hull points

Node Repulsion Energy. The repulsion energy ENR acting on a hull point h, caused by
a node v ∈ V, is as follows.

\[
E_{NR} =
\begin{cases}
-\dfrac{1}{r} \cdot \left(\dfrac{2}{d_h}\right)^{r-1} \cdot \lVert p_v - p_h \rVert^{r} & \text{if } r < 0, \\[2ex]
-\dfrac{d_h}{2} \cdot \log \lVert p_v - p_h \rVert & \text{if } r = 0
\end{cases}
\tag{4.8}
\]

The repulsion exponent r ≤ 0 adjusts the strength of the repulsion by nodes.

Node Repulsion Force. The force FNR is calculated as the negative gradient of the energy
ENR.

\[ F_{NR} = -\left(\frac{2}{d_h}\right)^{r-1} \cdot \lVert p_v - p_h \rVert^{r-1} \cdot \frac{p_v - p_h}{\lVert p_v - p_h \rVert} \tag{4.9} \]
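The node repulsion force can be evaluated per component, which is the form of Equation 4.10. The following Python sketch is a minimal illustration with assumed function and parameter names. It does not handle the degenerate case in which the hull point coincides with the node.

```python
import math


def node_repulsion(p_v, p_h, d_h, r=0.0):
    """Force on hull point h exerted by node v (Equations 4.9 and 4.10).

    p_v, p_h: (x, y) positions of the node and the hull point.
    d_h: distance between neighboring hull points; r <= 0: repulsion exponent.
    """
    dx = p_v[0] - p_h[0]
    dy = p_v[1] - p_h[1]
    dist = math.hypot(dx, dy)
    scale = -((2.0 / d_h) ** (r - 1.0)) * dist ** (r - 2.0)
    return scale * dx, scale * dy
```

The negative sign makes the force point away from the node, and its magnitude is 1 at the distance d_h/2, as required by the equilibrium.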


Hull Point Displacement. In each iteration of an energy minimization algorithm, a hull
point h is moved by ∆x and ∆y along the x- and y-axis, respectively.

\[
\begin{aligned}
\Delta x &= -\left(\frac{2}{d_h}\right)^{r-1} \cdot (p_v^x - p_h^x) \cdot \lVert p_v - p_h \rVert^{r-2} \\
\Delta y &= -\left(\frac{2}{d_h}\right)^{r-1} \cdot (p_v^y - p_h^y) \cdot \lVert p_v - p_h \rVert^{r-2}
\end{aligned}
\tag{4.10}
\]

4.7.5.2 Curve Repulsion

The curve repulsion must take the strength of the node repulsion into account. This way,
scenarios of occluded nodes as in Figure 4.13 can be prevented. The curve
repulsion is maximal at a distance of zero and decreases with increasing distance. The
magnitude of the curve repulsion force acting on a hull point is therefore limited by a
certain value F. The value of F, which is the largest magnitude FCR(0) of the curve
repulsion force, is equal to the smallest magnitude FNR(d_h/2) of the node repulsion force
that may occur at possible positions of hull points. The value of F can be different for
each curve, as it depends on the distance d_h between neighboring hull points.

\[ F = F_{CR}(0) = F_{NR}\!\left(\tfrac{d_h}{2}\right) \tag{4.11} \]

Equilibrium. The balance of forces is defined for an optimal layout, that is, regarding
curve repulsion, for a curve with maximum curve width Wc. The maximum curve width
is determined below in conjunction with the curve attraction. A hull point h is in its
optimal position p_h^{o,CR} regarding curve repulsion if it has a distance ‖p_δ − p_h‖ = Wc
from the respective dummy node δ on the curve. The magnitude of the curve repulsion force is
set to FCR(Wc) = 1 at the optimal position of h.

The curve repulsion force function FCR(d) ∼ d^r of the distance d = ‖p_δ − p_h‖
between dummy node δ and hull point h has two characteristic points, (0, F) and (Wc, 1).
Let the curve repulsion energy exponent be r = 0; the two parameters a and b then adjust
the force accordingly.

\[ F_{CR}(d) = \frac{b}{d + a} \tag{4.12} \]

Both parameters are derived from the maximum curve width Wc of a curve c and the
maximum curve repulsion magnitude F by substituting the two characteristic points into
Equation 4.12.

\[ a = \frac{W_c}{F - 1}, \qquad b = \frac{W_c \cdot F}{F - 1} = a \cdot F \tag{4.13} \]
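The parameters of Equation 4.13 can be checked numerically against the two characteristic points (0, F) and (Wc, 1). The following Python sketch uses assumed function names and requires F > 1, since the denominator F − 1 would otherwise vanish or change sign; this restriction is a precondition of the example and is not stated in the text.

```python
def curve_repulsion_params(w_c, f_max):
    """Parameters a and b of F_CR(d) = b / (d + a) (Equation 4.13).

    w_c: maximum curve width W_c; f_max: maximum repulsion magnitude F.
    """
    if f_max <= 1.0:
        raise ValueError("this sketch assumes F > 1")
    a = w_c / (f_max - 1.0)
    b = a * f_max
    return a, b


def curve_repulsion_magnitude(d, a, b):
    """Magnitude of the curve repulsion force at distance d (Equation 4.12)."""
    return b / (d + a)
```

For Wc = 4 and F = 3 this yields a = 2 and b = 6, so the force magnitude is 3 at distance 0 and 1 at distance Wc.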

Curve Repulsion Energy. Derived from the parameters a and b, the curve repulsion energy
ECR between a dummy node δ and a hull point h is as follows:

\[ E_{CR} = -b \cdot \log\bigl(\lVert p_\delta - p_h \rVert + a\bigr) \tag{4.14} \]

In analogy to the repulsion energies introduced earlier in this work, ECR in Equation 4.14
corresponds to a repulsion energy with a repulsion exponent r = 0. Different curve repulsion
exponents require a recalculation of the parameters a and b, because both parameters
were computed under the assumption (Equation 4.12) of a curve repulsion force that is
similar to the function 1/d.

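Curve Repulsion Force The repulsion force $F_{CR}$ acting on a hull point $h$ in the direction of the respective perpendicular $o$ is:

$$F_{CR} = -b \cdot \left(\lVert p_\delta - p_h \rVert + a\right)^{-1} \cdot \frac{o}{\lVert o \rVert} \tag{4.15}$$

Hull Point Displacement

$$\begin{aligned}
\Delta x &= -b \cdot (p_\delta^x - p_h^x) \cdot \left(\lVert p_\delta - p_h \rVert + a\right)^{-2} \\
\Delta y &= -b \cdot (p_\delta^y - p_h^y) \cdot \left(\lVert p_\delta - p_h \rVert + a\right)^{-2}
\end{aligned} \tag{4.16}$$

4.7.5.3 Curve Attraction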

Equilibrium The magnitude of the curve repulsion force was set to $F_{CR}(W_c) = 1$ for a distance $W_c$ between a hull point and its respective dummy node. Hence, the curve attraction is characterized by the following two requirements.

• The curve attraction is weaker than the curve repulsion if the distance is smaller than the maximum curve width $W_c$.

• The curve attraction is stronger than the curve repulsion if the distance is larger than the maximum curve width $W_c$.

Curve Attraction Energy The curve attraction is described as a function similar to $E_{CA} \sim d^{a}$ against the distance $d = \lVert p_\delta - p_h \rVert$ between a hull point $h$ and the respective dummy node $\delta$. The attraction exponent $a$ must be greater than zero.

$$E_{CA} = \frac{1}{a \cdot W_c^{a-1}} \cdot \lVert p_\delta - p_h \rVert^{a} \tag{4.17}$$

Curve Attraction Force

$$F_{CA} = \frac{1}{W_c^{a-1}} \cdot \lVert p_\delta - p_h \rVert^{a-1} \cdot \frac{o}{\lVert o \rVert} \tag{4.18}$$
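The two requirements on the curve attraction stated above can be checked numerically. The following Python sketch compares the force magnitudes from Equations 4.12 and 4.18 at distances below, at, and above $W_c$. The function names and the sample values ($W_c = 20$, $F = 5$, attraction exponent $a = 2$) are assumptions chosen for illustration only.

```python
def repulsion_magnitude(d, w_c, f_max):
    """F_CR(d) = b / (d + a), with a and b from Equation 4.13."""
    a = w_c / (f_max - 1.0)
    b = a * f_max
    return b / (d + a)


def attraction_magnitude(d, w_c, exponent):
    """F_CA(d) = d^(a-1) / W_c^(a-1) for attraction exponent a (Equation 4.18)."""
    return (d / w_c) ** (exponent - 1.0)


# Illustrative values (assumptions): W_c = 20, F = 5, attraction exponent a = 2.
W_C, F_MAX, A_EXP = 20.0, 5.0, 2.0

for d in (5.0, 20.0, 40.0):
    rep = repulsion_magnitude(d, W_C, F_MAX)
    att = attraction_magnitude(d, W_C, A_EXP)
    print(f"d={d:5.1f}  repulsion={rep:.3f}  attraction={att:.3f}")
```

For these sample values both magnitudes meet at $d = W_c$, where each equals 1. The sketch does not examine other attraction exponents.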

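Hull Point Displacement

$$\begin{aligned}
\Delta x &= \frac{1}{W_c^{a-1}} \cdot (p_\delta^x - p_h^x) \cdot \lVert p_\delta - p_h \rVert^{a-2} \\
\Delta y &= \frac{1}{W_c^{a-1}} \cdot (p_\delta^y - p_h^y) \cdot \lVert p_\delta - p_h \rVert^{a-2}
\end{aligned} \tag{4.19}$$

4.7.5.4 Hull Attraction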

Equilibrium Regarding the distortion of a hull, the hull points of a curve’s hull are ideally homogeneously distributed. However, the node repulsion impedes an even distribution and distorts the hull.

Figure 4.15: Optimal distance $d_{opt}$ between two hull points is the average distance between their perpendiculars

Optimal Distance of Neighboring Hull Points The optimal distance $d_{opt}$ between two neighboring hull points of a hull is the average distance between the respective perpendiculars. The perpendiculars of bent curves are not parallel, so the distance between hull points on the perpendiculars changes with the actual curve width. Thus, the optimal distance between two neighboring hull points is the average distance between the two perpendiculars as depicted in Figure 4.15.

Let $\delta_1$ and $\delta_2$ be two neighboring dummy nodes on a curve $c$ with maximum curve width $W_c$. The corresponding normalized perpendicular vectors are $o_1$ and $o_2$. Then the optimal distance between the two corresponding hull points $g$ and $h$ is, as depicted in Figure 4.15, determined by:

$$d_{opt}(h, g) = \left\lVert \left(p_{\delta_2} + \frac{W_c}{2} \cdot o_2\right) - \left(p_{\delta_1} + \frac{W_c}{2} \cdot o_1\right) \right\rVert \tag{4.20}$$
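To make Equation 4.20 concrete, the following Python sketch evaluates the optimal distance for a straight and for a bent curve segment. The function name, the tuple representation of points, and all coordinates are illustrative assumptions.

```python
import math


def optimal_hull_distance(p_d1, o1, p_d2, o2, w_c):
    """Distance between the points at half the maximum width W_c on the
    perpendiculars of two neighboring dummy nodes (Equation 4.20).

    p_d1, p_d2: (x, y) positions of the dummy nodes.
    o1, o2:     normalized perpendicular vectors at those dummy nodes.
    """
    half = w_c / 2.0
    gx, gy = p_d1[0] + half * o1[0], p_d1[1] + half * o1[1]
    hx, hy = p_d2[0] + half * o2[0], p_d2[1] + half * o2[1]
    return math.hypot(hx - gx, hy - gy)


# Straight segment: parallel perpendiculars, so d_opt equals the dummy node spacing.
d_straight = optimal_hull_distance((0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (0.0, 1.0), w_c=8.0)

# Bent segment, convex side: the perpendiculars diverge, so d_opt exceeds the spacing.
s = math.sqrt(0.5)
d_convex = optimal_hull_distance((0.0, 0.0), (-s, s), (10.0, 0.0), (s, s), w_c=8.0)
```

The second case illustrates why the perpendiculars of bent curves matter: the same dummy node spacing leads to a larger optimal hull point distance once the perpendiculars are no longer parallel.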

The magnitude of the hull attraction force between neighboring hull points is defined as $F_{HA}(d_{opt}(g, h)) = 1$ for an optimal distance between two neighboring hull points $g$ and $h$.

Hull Attraction Energy The attraction energy $E_{HA}$ on a hull point $h$ is formalized below. Again, it is adaptable by an attraction exponent $a > 0$.

$$E_{HA} = \frac{1}{a \cdot d_{opt}(g, h)^{a-1}} \cdot \lVert p_g - p_h \rVert^{a} \tag{4.21}$$

Hull Attraction Force

$$F_{HA} = \frac{1}{d_{opt}(g, h)^{a-1}} \cdot \lVert p_g - p_h \rVert^{a-1} \cdot \frac{p_g - p_h}{\lVert p_g - p_h \rVert} \tag{4.22}$$

Hull Point Displacement

$$\begin{aligned}
\Delta x &= \frac{1}{d_{opt}(g, h)^{a-1}} \cdot (p_g^x - p_h^x) \cdot \lVert p_g - p_h \rVert^{a-2} \\
\Delta y &= \frac{1}{d_{opt}(g, h)^{a-1}} \cdot (p_g^y - p_h^y) \cdot \lVert p_g - p_h \rVert^{a-2}
\end{aligned} \tag{4.23}$$

4.7.6 Conclusion

The model of a hull that is approximated by the hull points is necessary to compute the widened curves using energies. An energy model for the widening of curves was introduced. The four energies create an equilibrium of the hull points that is able to increase the curve width, to avoid node occlusion, and to prevent an infinite curve width. As the curve widths can be increased, close curves may overlap each other. This promises a reduction of the visual complexity. Several hypergraph visualizations with widened curves are shown in the evaluation in Section 5.4.2.

Smoothing of the Hull Besides preventing hull distortion, the hull attraction additionally evens the surface of the hull. A radical change of the curve width between neighboring hull points is smoothed, because the attraction of hull points to their neighbors grows with increasing distance between them. After several iterations of an energy minimization algorithm, the hull points settle into a smoother hull surface.

The energy-based widening technique operates on individual curves. Adjacent curves are not recognized, so this technique is not capable of smoothing or merging adjacent curves. Nevertheless, such a visual enhancement is not crucial for a reduction of the visual complexity of hypergraphs or for the cognition of structural information of hyperedges.

Side Effect of the Hull Attraction For the sake of simplicity, the optimal distance $d_{opt}$ between neighboring hull points was approximated by the average distance between perpendiculars (cf. Equation 4.20). A routed and thus bent curve consists of convex and concave curve segments. With increasing curve width, the actual hull point distance on the convex side becomes larger than the calculated optimal distance from Equation 4.20. Thus, the hull attraction produces a smaller curve width on the convex side of a bent curve.

As a consequence, an energy minimization algorithm does not increase the curve width of convex curve segments to the maximum width $W_c$. This minor effect is not crucial, since curves are usually not radically bent by an energy-based routing technique. Furthermore, as nodes usually repulse an expanding hull, a reduction of the maximal curve width will not affect the drawing either.

Performance The widening of curves can become fairly expensive. In particular, the computation of the node repulsion involves $|V|$ angle computations for each hull point of each curve of the hypergraph. The angle between the vector $o$ of the respective perpendicular and the vector $p_v - p_\delta$ determines the relative position of a node $v \in V$ to the curve.

A remarkable simplification is the restriction to nodes which are close enough to impact the widening of a curve. This is possible since the potential maximum curve width $W_c$ has to be defined anyway. A smaller number of considered nodes reduces the number of angle computations significantly. Furthermore, the set of nodes that is considered for each hull point can be stored for reuse in all subsequent iterations of an energy minimization algorithm used to widen the curves. These modifications significantly reduce the computational effort of an energy-based curve widening.
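The restriction to nearby nodes and the reuse of the stored node sets can be sketched as follows. This is a minimal illustration under several assumptions: the cutoff `radius` is a free parameter (the text only states that it can be derived from $W_c$), the anchor of a hull point is taken to be the position of its dummy node, and the anchors are assumed not to move during widening. The function names are hypothetical and not part of the prototype.

```python
import math


def nearby_nodes(anchor, node_positions, radius):
    """Indices of nodes within `radius` of the anchor position.

    A plain linear scan; a spatial index could replace it for large graphs.
    """
    ax, ay = anchor
    return [i for i, (x, y) in enumerate(node_positions)
            if math.hypot(x - ax, y - ay) <= radius]


def build_node_cache(anchors, node_positions, radius):
    """Compute the candidate node set once per hull point anchor.

    Because the graph layout is fixed, the cached sets stay valid in all
    later iterations as long as the anchors do not move.
    """
    return {k: nearby_nodes(anchor, node_positions, radius)
            for k, anchor in enumerate(anchors)}


# Hypothetical data: three fixed nodes and two hull point anchors.
nodes = [(0.0, 0.0), (3.0, 4.0), (50.0, 50.0)]
anchors = [(0.0, 1.0), (48.0, 50.0)]
cache = build_node_cache(anchors, nodes, radius=6.0)
```

In each iteration, the node repulsion for hull point `k` would then only loop over `cache[k]` instead of over all of $V$.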

If only nodes within a certain range repulse hull points, a radical change of the curve width among neighboring hull points may occur. This happens if one hull point is within and another hull point is outside of such a range. Nevertheless, a radical change of the curve width between neighboring hull points is eliminated by the hull attraction force.

4.8 Discussion

A visualization of hyperedges in graph layouts generates additional cognitive load in a graph drawing. The reduction of the visual complexity of hyperedge visualizations aims at retaining the readability of the displayed information within the narrow screen space. This chapter therefore introduced four techniques to reduce the complexity of hypergraph visualizations. Since hyperedges are based on curves, these techniques operate on the curves, too.

Energy-based bundling of curves can be combined with the energy-based routing technique. The curves are visually bundled together. The resulting hyperedge layouts fulfill the requirements of hypergraph visualizations and have a reduced visual complexity.

The energy-threshold-based aggregation of curves transfers a visual bundling of curves into an aggregation of curves in the hyperedge model. The combination of energy-based bundling and threshold-based aggregation is a two-tiered technique to reduce the complexity of the hyperedge structures by aggregating curves.

Another technique that reduces the complexity of hyperedge models is the cluster-based aggregation of curves. This technique seems less laborious than the two-tiered one and promises comparable results. Therefore, the cluster-based aggregation of curves is preferred for a later prototype implementation and is evaluated in Section 5.4.1 to demonstrate the reduction of the visual complexity of hyperedges.

The widening of curves also bundles close curves visually by altering the visualization of curves rather than rearranging their paths. The evaluation in Section 5.4.2 examines how well the energy-based curve widening technique avoids covering nodes. The hypergraph drawings presented there illustrate the reduction of visual complexity achieved by this technique.


5 Evaluation

This chapter evaluates the proposed layout techniques, namely energy-based routing, energy-based curve widening, and the cluster-based aggregation of curves, and investigates their abilities to meet the requirements of hypergraph visualizations. For this purpose, the evaluation utilizes the criteria for hypergraph visualizations that were given in Section 2.2.3.

Each of the following experiments examines a single hypothesis that was assumed in this work. Furthermore, and no less important, these experiments demonstrate the suitability of the proposed layout techniques. This work is accompanied by a prototype implementation that is briefly introduced in the subsequent Section 5.1. Next, the example hypergraphs used to evaluate the hypergraph layout techniques are presented in Section 5.2. The remaining structure of this chapter is derived from the requirements of hypergraph visualizations as follows.

The proposed layout techniques were designed to meet the given requirements of hypergraph visualization, in particular

1. the establishment of a uniform visual connectivity between all nodes of a hyperedge,

2. the preservation of the expressiveness of the given graph layout, and

3. the reduction of the visual complexity of hypergraph drawings.

The first requirement is fulfilled by the choice of the hyperedge structure that was briefly discussed in Section 2.3. The second requirement comprises a fixed graph layout and the avoidance of node occlusion and cluster intersection. The graph layout was not altered by any layout technique. The energy-based curve routing technique that hinders node occlusion and cluster intersection is evaluated by the experiments in Section 5.3. The reduction of the visual complexity of hypergraph layouts is evaluated in Section 5.4.

According to the definition in Section 2.2.1, the term hypergraph layout denotes the positions of the nodes and the dummy nodes of a hypergraph. The graph layout is given or computed separately, and this chapter therefore distinguishes the graph layout, i.e., the positions of the nodes, from the positions of hyperedges. Thus, the term hypergraph layout as used in this chapter excludes the positioning of the nodes.

5.1 Implementation

The prototype implementation is based on the LinLogLayout tool [3]. In its own words, “LinLogLayout is a simple, easy-to-use open source program (written in Java) for computing graph drawings, using the LinLog energy models and standard energy models like Fruchterman-Reingold, and graph clusterings, using the Modularity measure of Mark Newman. It includes a reusable energy minimizer (spring embedder) class based on the efficient Barnes-Hut algorithm, and a reusable class for Modularity clustering based on a multi-scale algorithm.”

Graph Layout Computation The basic concepts of energy-based layout algorithms were already introduced in Section 2.1.2. The LinLog energy model, i.e., the $(a, r) = (1, 0)$-energy model, computes graph layouts that reflect the structure of the graph. Such layouts place densely connected nodes close together and sparsely connected nodes farther apart.

The computation of graph layouts starts with random positions of nodes. The energy minimization algorithm initiates the layout computation with a PolyPoly energy model that is less susceptible to local energy minima than the LinLog energy models [35]. In the course of the iterative energy minimization, the energy model is gradually changed to the final $(1, 0)$-energy model. The number of iterations of the energy minimization algorithm is chosen very conservatively, i.e., rather high, to produce stable graph layouts with (locally) minimal energy.

Hypergraph Layout Computation Hypergraph layouts are computed to compare different parameters, techniques, and settings with each other. The generated layouts are mainly evaluated based on their energy values. For this, it is essential to use the same initial graph layout for the computation of the hypergraph layout, as otherwise the layout energies are not comparable.

The layout of hyperedges is determined by the positions of dummy nodes and optionally by the positions of hull points. The layout computation utilizes the same iterative energy-minimization approach as in LinLogLayout to move dummy nodes or hull points. The energy models of the hyperedge layout techniques do not change during execution.

5.2 Example Hypergraphs

In the experiments presented in Sections 5.3 and 5.4, various hypergraphs are subjected to the layout techniques to examine their capabilities. The employed graphs depict a software system, a social network, the world trade, and a pseudo-random clustering structure.

In each of the visualizations of these hypergraphs, nodes of different clusters are assigned different colors, such that the affiliation of each node to a specific cluster is obvious. This color coding is especially important when we evaluate the avoidance of cluster intersections. The size of the nodes in each graph visualization corresponds to their degree in the underlying graph.

The following paragraphs outline the background and purposes of the example hypergraphs. Each of the following hypergraphs consists of the original graph and one additional hyperedge.

ArgoUML Software System

The visualization of software systems is the main motivation of this work. Consequently, the proposed hypergraph visualization techniques are evaluated using graphs that depict software systems. ArgoUML [1] is an open source UML modeling tool.

Graph The graph was derived from the source code of ArgoUML. The code was obtained from the project’s Subversion repository on October 17, 2008. The nodes represent the software artifacts at the level of Java classes. Method calls, attribute access, and inheritance are modeled by weighted binary edges between nodes. The graph layout reflects the hierarchical structure of the software system. This means that the layout allows each class to be spatially assigned to a package. In addition, packages can be spatially assigned to packages at higher levels of the software hierarchy. Such a clustering of nodes is typical for visualizations of software systems, but not a requirement.

Not all nodes of the graph are connected to the major part of the software system. 67 nodes were not connected at all, and overall 86 nodes were removed to create a connected graph of 1,434 nodes and 4,372 edges.

Hypergraphs A hyperedge of the ArgoUML graph represents a set of Java classes that were changed together, i.e., classes that were changed between two subsequent revisions of the Subversion version control system. The scenario that such hyperedges reveal was already introduced as co-change of software artifacts in Section 1.2.

76 hyperedges, each connecting more than two nodes, were derived from the most recent commits to the repository. Large hyperedges are rare, as commits usually do not affect many software classes. The frequency of commits decreases with the commit size (number of changed classes). For instance, there are 26 hyperedges of size 4, 10 hyperedges of size 6, and one hyperedge of size 13. The largest investigated hyperedge connects 32 nodes.
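The derivation of co-change hyperedges from commits can be sketched in a few lines of Python. This is an illustration under assumptions, not the procedure used for the experiments: the input format, the function name, and the choice to drop classes that are not nodes of the connected graph are all hypothetical, and the class names are made up.

```python
def hyperedges_from_commits(commits, graph_nodes, min_size=3):
    """Turn each commit into a hyperedge over the classes it changed.

    commits:     iterable of collections of class names, most recent first.
    graph_nodes: set of class names that are nodes of the (connected) graph.
    min_size:    3 keeps only hyperedges that connect more than two nodes.
    """
    hyperedges = []
    for changed in commits:
        # Assumption: classes outside the graph are ignored.
        edge = frozenset(changed) & graph_nodes
        if len(edge) >= min_size:
            hyperedges.append(edge)
    return hyperedges


# Hypothetical commit data for illustration.
graph_nodes = {"A", "B", "C", "D", "E"}
commits = [
    ["A", "B", "C"],        # kept: three classes
    ["A", "B"],             # dropped: only two classes
    ["C", "D", "E", "X"],   # kept after removing X, which is not in the graph
]
edges = hyperedges_from_commits(commits, graph_nodes)
sizes = sorted(len(e) for e in edges)
```

A histogram over `sizes` would give the kind of size distribution reported above.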

The hypergraphs, each of them containing one hyperedge, are denoted according to the pattern “argouml-he76-hn32” to indicate hyperedge number 76 that connects 32 hyperedge nodes.

Social Network

Social networks were already mentioned as possible applications of hyperedge visualizations, cf. Section 1.2. This graph models high-tech managers that are employed by the company “Krackhardt”, and was originally used in [60] as a multi-relational network.

Graph The graph data [2] models the friendship relations between the employees. 147 edges represent the mutual friendship relations between 33 employees that are modeled by nodes.

Hypergraph A hyperedge visualizes the common friends of two employees, Rick and Tom. Both are strongly related to the same group of friends, i.e., they have a higher number of common friends than two employees of different groups would have. The hyperedge consists of six nodes. The hypergraph is denoted by “Hitech” in the following.


World Trade

Trade graphs represent the economic relations between countries. The relations may represent the trade volume of the annual import and export among countries, and thus these graphs reveal dependencies between economic systems. The statistical data of world trade is made available by the World Bank at http://www.worldbank.org/trade (accessed August 15, 2008).

Graph The graph used in this work represents the import of goods of 66 countries that are represented by nodes. A weighted and undirected edge between two countries represents the import trade volumes between both countries in 1999. Thus, the graph models imported as well as exported trade volumes. The particular graph data [5] was also used in previous work to discuss the LinLog drawings, and it was shown that the graph layout can represent an arrangement of nodes that is similar to the geographical arrangement of countries.

Hypergraphs The hyperedges for this graph were derived from an obvious question a viewer might ask: Which countries are economically strongly connected to a certain country? Six hyperedges of this type were created: the top 10 and the top 20 countries that are highly connected to the USA, Germany, and China, respectively. Each of the three countries is highly connected to geographic neighbors, but is also connected to countries of other continents. For instance, the top 10 countries related to Germany, apart from European countries, are located in America, Australia, and Asia. The choice of hyperedges therefore promises a mixed distribution of connected hyperedge nodes, on the one hand between different clusters and on the other hand strongly integrated in the cluster of the examined country.

In the following, the mentioned hypergraphs will be denoted by, e.g., “WorldImport1999-GER10”. The name indicates the country of interest by a three-character country code and the number of considered trade partners at the end.
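The selection of the most strongly connected trade partners can be sketched as follows. The data structure, the function name, the tie-breaking rule, and the decision to include the examined country itself in the hyperedge are assumptions made for illustration. The weights are invented and not taken from the World Bank data.

```python
def top_partner_hyperedge(country, trade_volume, k):
    """Hyperedge of `country` plus its k partners with the largest volume.

    trade_volume maps frozenset({a, b}) to the weight of the undirected edge.
    Including the examined country itself in the hyperedge is an assumption.
    """
    partners = {}
    for pair, volume in trade_volume.items():
        if country in pair and len(pair) == 2:
            (other,) = pair - {country}
            partners[other] = volume
    # Sort by descending volume; ties are broken alphabetically (assumption).
    ranked = sorted(partners, key=lambda c: (-partners[c], c))
    return {country, *ranked[:k]}


# Hypothetical weights for illustration.
volumes = {
    frozenset({"GER", "FRA"}): 90.0,
    frozenset({"GER", "USA"}): 70.0,
    frozenset({"GER", "CHN"}): 40.0,
    frozenset({"USA", "CHN"}): 80.0,
}
edge = top_partner_hyperedge("GER", volumes, k=2)
```

With `k=10` or `k=20` on the real edge weights, the same selection would yield hyperedges of the kind named “WorldImport1999-GER10” and “WorldImport1999-GER20”.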

Pseudo-Random Graph

The pseudo-random graph offers a clear spatial clustering of nodes in the graph layout. This specific graph property makes it possible to examine the routing technique that avoids cluster intersections.

Graph The graph of 400 nodes and 30,278 edges clearly conveys 8 clusters having 50 nodes each. The specified probabilities of edges connecting the nodes of the graph as well as the graph itself are available from the author [4].

Hypergraphs As the graph does not reflect a real-world problem, the hyperedges are user-defined. The hyperedges connect nodes of different clusters in order to allow a reasonable evaluation of the reduction of cluster intersections. Three hyperedges of different sizes connect 6, 10, and 16 nodes and are identified by the graph names “8Clusters-6”, “8Clusters-10”, and “8Clusters-16”, respectively.


5.3 Preservation of Graph Layout Expressiveness

This section presents two experiments to evaluate the energy-based curve routing technique that was proposed in Section 3.4. This routing technique is evaluated with regard to its ability to preserve the expressiveness of the given graph layouts. This requirement is represented by the criteria to avoid node occlusion and cluster intersections. Both criteria are examined separately in the following Sections 5.3.1 and 5.3.2.

5.3.1 Experiment 1 – Node Occlusion

The first experiment evaluates the energy-based routing technique of curves. The hypothesis is that energy-based routing improves the hypergraph layout quality, which is measured by the criteria for hypergraph visualizations from Section 2.2.3, namely node occlusion, cluster intersections, and the reduction of visual complexity. Routing of curves does not aim at the reduction of visual complexity. The avoidance of cluster intersections by the routing technique is examined in the second experiment in Section 5.3.2.

The first experiment evaluates the remaining criterion of avoiding node occlusion by examining the energy of hypergraph layouts. It was already stated in Chapter 3 that the fidelity of the curve model influences the quality of the produced hypergraph layouts. Thus, the influence of the curve model fidelity is investigated in the second part of this experiment.

5.3.1.1 Energy-Based Routing Reduces Hyperedge Layout Energy

Hypothesis: Energy-based routing reduces the energy of hypergraph layouts and thus avoids node occlusion.

The energy-based routing technique was designed to preserve the expressiveness of the given graph layout and mainly focuses on the avoidance of node occlusion by hyperedges. Hence, the energy model was established such that node occlusions are penalized with high energy values. The energy model that is used in the energy-based routing algorithm determines the total energy of a hypergraph layout. This experiment shows that the energy of hypergraph layouts is reduced, which corresponds to a reduced likelihood of node occlusion. The energy-based routing technique is evaluated independently of the other layout techniques proposed in this work.

A reduction of the energy is only attainable by a reduction of the repulsion energy. The attraction energy is already minimal for unrouted straight curves, since a straight line is the shortest connection between two points. Thus, the attraction energy is always increased by rearranging the curves. Consequently, the total hyperedge layout energy is reduced by a reduction of the repulsion energy, which implies an enlargement of the distances between repulsing nodes and dummy nodes.

The occlusion of nodes cannot be measured directly, because points (the positions of nodes) are very unlikely to intersect lines (the curve segments). Furthermore, the occlusion of nodes in a visualization depends on the size of drawn nodes and the line width of drawn curve segments. Therefore, a change of the hyperedge layout energy is a more abstract indicator, which is independent of a particular visualization of hypergraph layouts.

Definitions

The energy model of the routing technique (which was summarized in Table 3.1) employs a repulsion of dummy nodes by nodes and an attraction of neighboring dummy nodes of the same curve. The energy $E(p)$ of a hypergraph layout $p$ is thus calculated as the sum of the total energies of all dummy nodes $\delta \in \Delta(\varepsilon)$ of the hyperedge $\varepsilon$. The total energy $E(\delta)$ of a dummy node was given in Equation 3.19 at the end of Section 3.4.5 about energy-based routing.

$$E(p) = \sum_{\delta \in \Delta(\varepsilon)} E(\delta) \tag{5.1}$$

As the value of the total hyperedge layout energy varies widely for arbitrary scalings of layouts, the relative change $\varrho$ of the hyperedge layout energy is computed. The change of energy of a hyperedge layout is determined based on the initial energy $E_0$ of an unrouted layout and the final energy $E_r$ of a routed layout of a hyperedge $\varepsilon$.

$$\varrho = \frac{E_r - E_0}{E_0} \tag{5.2}$$

Setup

The energy of a hyperedge layout is computed before and after routing. The hyperedge layout energy $E_0$ before routing is computed with the initial layout where hyperedge nodes are connected by straight-line curves. Then the hypergraph is routed and the energy $E_r$ of the resulting layout is measured. The repulsion and attraction exponents $r$ and $a$ of the energy model used for routing, as summarized in Table 3.1, were set to $r = -1$ and $a = 2$.

The fidelity $f$ of the curve model is incrementally increased during routing. The final fidelity in this experiment is set to $f = 6$, which corresponds to 63 dummy nodes modeling each curve, and is sufficient to produce decent hypergraph layouts. The energy was minimized in a total of 180 iterations. 30 iterations for each level of fidelity turned out to be adequate to compute stable hypergraph layouts of these example graphs.
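The iteration budget can be reproduced with a small Python sketch. Two points are assumptions: the relation $2^f - 1$ between fidelity and dummy nodes per curve, which is chosen only because it matches the stated 63 dummy nodes for $f = 6$, and the start of the schedule at fidelity level 1. The function names are hypothetical.

```python
def dummy_nodes_per_curve(fidelity):
    """Assumed relation 2**f - 1; it reproduces the 63 dummy nodes stated for f = 6."""
    return 2 ** fidelity - 1


def routing_schedule(final_fidelity, iterations_per_level):
    """List of (fidelity, dummy nodes per curve, iterations) for each level.

    Assumption: the schedule starts at fidelity level 1.
    """
    return [(f, dummy_nodes_per_curve(f), iterations_per_level)
            for f in range(1, final_fidelity + 1)]


schedule = routing_schedule(final_fidelity=6, iterations_per_level=30)
total_iterations = sum(iters for _, _, iters in schedule)
```

Under these assumptions, six levels of 30 iterations each account for the total of 180 iterations.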

In summary, the experimental results presented next were obtained using the following setup:

• Centralized and fully connected hyperedge structure

• Fidelity $f = 6$ of the curve model

• Each level of fidelity is routed in 30 iterations

• No reduction of visual complexity: no curve aggregation and no curve widening

Graph                     E_0            E_r          ϱ
8Clusters-6               3,898,986      183,510      −95%
8Clusters-10              6,488,107      281,943      −96%
8Clusters-16              10,260,682     471,213      −95%
Hitech                    221,661        9,463        −96%
WorldImport1999-CHN10     67,616,953     2,759,081    −96%
WorldImport1999-CHN20     100,954,597    4,403,640    −96%
WorldImport1999-GER10     65,463,982     2,641,378    −96%
WorldImport1999-GER20     141,974,221    5,732,665    −96%
WorldImport1999-USA10     66,129,628     2,863,279    −96%
WorldImport1999-USA20     131,226,440    5,607,047    −96%

Table 5.1: Comparison of total energies of initial and routed hypergraph layouts

Experimental Results

The energies $E_0$ and $E_r$ of initial and routed hyperedge layouts were measured to compare the quality of the layouts. The energies of two layouts of the same hyperedge are only comparable if the hyperedge has the same curve model fidelity each time the energy is determined. Thus, before computing the energy of a hypergraph, the fidelity of the computed layouts is raised to a common level without changing the layouts. This means that each curve is modeled by the same number of dummy nodes, and as a result the energies of different layouts of a hyperedge are comparable. This procedure is explained in more detail in the following Section 5.3.1.2.

The results of the centralized hyperedge structure are shown in the following. As expected, the values of hyperedge energies of the fully connected structure are much higher than those of the centralized structure, since the fully connected structure involves a higher number of curves. Besides that, the reduction of the hyperedge energies of all example hypergraphs was similar for both visualization structures.

Table 5.1 shows the hyperedge energies of the initial and the computed hypergraph layout. The ratio $\varrho$ indicates the relative decrease of the total hyperedge layout energy.
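The ratio from Equation 5.2 can be recomputed from the tabulated energies. The following Python sketch does so for a subset of the rows of Table 5.1. The function and variable names are illustrative, and the rounding to whole percent is an assumption about how the table values were formatted.

```python
def relative_energy_change(e_initial, e_routed):
    """Relative change of the hyperedge layout energy (Equation 5.2)."""
    return (e_routed - e_initial) / e_initial


# (E_0, E_r) pairs copied from Table 5.1.
table_5_1 = {
    "8Clusters-6": (3_898_986, 183_510),
    "8Clusters-10": (6_488_107, 281_943),
    "8Clusters-16": (10_260_682, 471_213),
    "Hitech": (221_661, 9_463),
    "WorldImport1999-GER10": (65_463_982, 2_641_378),
}

# Relative change in whole percent (rounding is an assumption).
changes = {name: round(100 * relative_energy_change(e0, er))
           for name, (e0, er) in table_5_1.items()}
```

Because the ratio is normalized by $E_0$, it stays comparable across layouts whose absolute energies differ by orders of magnitude.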

The results of the 76 hypergraphs derived from the ArgoUML software system were omitted as they would not reveal any further information. On average, the total energy of these 76 hypergraphs decreased by 94% ± 0.7%. The standard deviation of only 0.7% indicates the stability of the energy reduction across all ArgoUML hypergraphs.

Tables 5.2 and 5.3 show the attraction energy $E_A$ and repulsion energy $E_R$ of the example layouts separately. The ratios $\varrho_A$ and $\varrho_R$ are calculated analogously to $\varrho$ in Equation 5.2. The attraction energy is increased by routing, because the curves are stretched during routing.

Graph                     E_A,0      E_A,r      ϱ_A
8Clusters-6               1,754      2,037      +16%
8Clusters-10              3,021      3,653      +21%
8Clusters-16              5,241      6,110      +17%
Hitech                    95         172        +81%
WorldImport1999-CHN10     32,754     41,179     +26%
WorldImport1999-CHN20     69,141     79,280     +15%
WorldImport1999-GER10     37,955     45,717     +20%
WorldImport1999-GER20     76,755     94,153     +23%
WorldImport1999-USA10     33,813     39,804     +18%
WorldImport1999-USA20     71,036     87,447     +23%

Table 5.2: Comparison of attraction energies of initial and routed hypergraph layouts

Graph                     E_R,0          E_R,r        ϱ_R
8Clusters-6               3,897,233      181,473      −95%
8Clusters-10              6,485,086      278,290      −96%
8Clusters-16              10,255,441     465,103      −95%
Hitech                    221,566        9,291        −96%
WorldImport1999-CHN10     67,584,200     2,717,902    −96%
WorldImport1999-CHN20     100,885,457    4,324,360    −96%
WorldImport1999-GER10     65,426,026     2,595,661    −96%
WorldImport1999-GER20     141,897,466    5,638,512    −96%
WorldImport1999-USA10     66,095,816     2,823,475    −96%
WorldImport1999-USA20     131,155,404    5,519,600    −96%

Table 5.3: Comparison of repulsion energies of initial and routed hypergraph layouts

Conclusion

The measurements of the total hyperedge energies of the example hypergraphs in Table 5.1 show that in every case the hypergraph layout energy was clearly reduced by the energy-based routing technique. The reduction of the repulsion energy is achieved by increasing the distance between repulsing nodes and the dummy nodes of the curves of the hyperedges. Assuming that the curve model fidelity is sufficiently high, the reduction of the hyperedge layout energy entails the avoidance of node occlusion.

The separation of repulsion and attraction in Tables 5.2 and 5.3 reveals that the total
energy is reduced exclusively by the repulsion. The attraction is consequently increased, as
the curves were displaced. In total, the reduction of the repulsion is much higher than the
increase of the attraction. The Hitech hypergraph layout, and also a few layouts depicting
the ArgoUML software system, strongly increased the attraction energy by stretching the
curves. Nevertheless, in all instances the total energy was significantly and steadily reduced
by approximately 95%, independent of the hyperedge size.
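The interplay of the two energy terms can be retraced with a few lines of Python. The following is only a sketch: it assumes that the ratios are the relative change (routed − initial) / initial and that the total energy is the plain sum of attraction and repulsion energy. The function and key names are hypothetical; the numbers are taken from Tables 5.2 and 5.3.

```python
def change_ratio(initial: float, routed: float) -> float:
    """Relative change (routed - initial) / initial of an energy value."""
    return (routed - initial) / initial


# Attraction (EA) and repulsion (ER) energies before (0) and after (r) routing,
# copied from Tables 5.2 and 5.3.
layouts = {
    "8Clusters-6": {"EA0": 1_754, "EAr": 2_037, "ER0": 3_897_233, "ERr": 181_473},
    "Hitech": {"EA0": 95, "EAr": 172, "ER0": 221_566, "ERr": 9_291},
}


def summarize(e: dict) -> tuple[float, float, float]:
    """Return (rho_A, rho_R, rho_total) for one layout."""
    rho_a = change_ratio(e["EA0"], e["EAr"])
    rho_r = change_ratio(e["ER0"], e["ERr"])
    # Assumption: total energy = attraction energy + repulsion energy.
    rho_total = change_ratio(e["EA0"] + e["ER0"], e["EAr"] + e["ERr"])
    return rho_a, rho_r, rho_total


for name, energies in layouts.items():
    rho_a, rho_r, rho_total = summarize(energies)
    print(f"{name}: attraction {rho_a:+.0%}, repulsion {rho_r:+.0%}, total {rho_total:+.0%}")
```

Under these assumptions, even the strong relative increase of the Hitech attraction energy barely affects the total, because the repulsion energy is several orders of magnitude larger than the attraction energy.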

Furthermore, the energy-based routing technique is examined by an assessment of the produced
hypergraph layouts. Figures 5.1 through 5.6 show sequences of visualizations of
routed hypergraph layouts of four example hypergraphs in the centralized and the fully
connected structure. Each sequence of six hypergraph drawings shows the initial setting
and the intermediate results of the energy-based curve routing. For illustrative purposes,
the dummy nodes are depicted.

The graph drawings help to illustrate the measured energy values of the experimental
results above, as the meaning of a certain energy value of a hypergraph layout and the
impact of a certain degree of energy reduction are unclear in the number-based representation.


The first drawing of each sequence in Figures 5.1 through 5.6 shows the initial, unrouted
curves of a hyperedge. Then the curve model fidelity is increased every 30 iterations
and consequently the number of dummy nodes increases. The last drawing of each sequence
shows a layout with curve fidelity f = 5. These figures show that curves are successfully
repulsed by the nodes. Node occlusions by the curves are prevented. Visualizations of
layouts with curve fidelity f = 6 are omitted, because more dummy nodes per curve could
not be distinguished visually and the layout of the curves does not change significantly when
increasing the curve fidelity to f = 6. The latter fact will be demonstrated in the next
experiment in Section 5.3.1.2 and is confirmed by the values in Table 5.4.

Another comparison of unrouted and routed hypergraph layouts is shown in Figure 5.9
by graph drawings that also reveal the avoidance of cluster intersections.
These layouts show the pseudo-random hypergraphs in the centralized structure. The crux
of the structure was moved away from the central (red) cluster, as the position of the
barycenter of the hyperedge was not optimal. A closer look at the parts of the layouts
where curves intersect clusters in order to connect hyperedge nodes also reveals that curves
do not occlude the nodes.

Due to the small scale of the shown layouts, the hyperedge nodes were marked with a
black rim to allow a distinction. Routed curves were displaced with respect to the nodes.
It is also observable that close curves were moved to a common optimal path and thus
visually bundled, especially in Figure 5.9f.

The next experiment shows the influence of the curve model fidelity. It also reveals that
the chosen fidelity f = 6 of the experiment is sufficient and proves the stability of these
measurements.

5.3.1.2 Fidelity of Curve Model

Hypothesis: A higher curve model fidelity produces better hypergraph layouts

as it further reduces the hypergraph layout energy.

This experiment shows the impact of the curve fidelity on the hypergraph layout quality
in the context of energy-based routing. The assumption, which was already used in this
thesis, is that a higher curve model fidelity produces better hypergraph layouts than a
curve model with lower fidelity. The hypergraph layout quality is again evaluated with
respect to the requirements of hypergraph visualizations, which is measured by the total
energy of hyperedges. Later experiments on the aggregation and widening of curves will
also consider the impact of the curve model fidelity in their own context.

Setup

The hypergraph layouts are computed with equal settings except for different fidelities of
the curve model.

Figures 5.1 through 5.6 each consist of six panels: (a) f = 0, no routing; (b) f = 1, after
30 iterations; (c) f = 2, after 60 iterations; (d) f = 3, after 90 iterations; (e) f = 4,
after 120 iterations; (f) f = 5, after 150 iterations.

Figure 5.1: Energy-based routing of argouml-he21-hn5 in the centralized structure

Figure 5.2: Energy-based routing of argouml-he21-hn5 in the fully connected structure

Figure 5.3: Energy-based routing of Hitech hypergraph in the centralized structure

Figure 5.4: Energy-based routing of Hitech hypergraph in the fully connected structure

Figure 5.5: Energy-based routing of WorldImport1999-GER10 hypergraph in the centralized structure

Figure 5.6: Energy-based routing of WorldImport1999-GER20 hypergraph in the centralized structure

A different fidelity of the curve model means a different number of dummy nodes modeling
the curves. The energies of hypergraph layouts are only comparable if the layouts are
raised to a common level of fidelity. This principle is illustrated in Figure 5.7. First, the
hypergraph layouts are computed with a certain fidelity of the curve model. Figure 5.7a
depicts the route that was computed with a lower fidelity than in Figure 5.7b. To compare
the energies of both layouts, the lower fidelity is raised to a higher level, at least to
the level of the compared layout. As shown in Figure 5.7c, the layout of curves remains
unchanged with an increase of the curve fidelity.

Figure 5.7: Comparability of hypergraph energies of layouts that are computed with different
curve model fidelities. Panels: (a) a routed curve modeled with low fidelity; (b) the same
curve as in (a), but routed with a higher fidelity model; (c) the route from (a) applied to
the high fidelity curve model from (b).
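Raising a layout to a common fidelity level without changing its route can be sketched as follows. The refinement rule is not restated in this section, so the sketch assumes that a curve is a polyline of end points and dummy nodes and that each fidelity level inserts one dummy node at the midpoint of every segment. The function names are hypothetical.

```python
import math

Point = tuple[float, float]


def raise_fidelity(curve: list[Point], levels: int = 1) -> list[Point]:
    """Insert a dummy node at the midpoint of every segment, `levels` times.

    Assumption: midpoint subdivision; the new nodes lie on the old route,
    so the shape of the curve is not altered.
    """
    for _ in range(levels):
        refined = [curve[0]]
        for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
            refined.append(((x0 + x1) / 2, (y0 + y1) / 2))
            refined.append((x1, y1))
        curve = refined
    return curve


def polyline_length(curve: list[Point]) -> float:
    """Sum of the straight-line segment lengths of the curve."""
    return sum(math.dist(p, q) for p, q in zip(curve, curve[1:]))


# A route computed with low fidelity: two end points and one dummy node.
low = [(0.0, 0.0), (2.0, 2.0), (4.0, 0.0)]
high = raise_fidelity(low, levels=2)

print(len(low), len(high))
print(polyline_length(low), polyline_length(high))
```

Because every original node is kept and the inserted nodes lie on the existing segments, both curves describe the same route. Only the number of dummy nodes that enter the energy computation differs, which is what makes the two energies comparable.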

The energy-based routing technique is applied as above. The energy was minimized in
180 iterations, as this corresponds to at least 30 iterations for each level of the curve model
fidelity. The setup in this experiment is as follows:

• Centralized and fully connected hyperedge structure

• Curve model fidelity 1 ≤ f ≤ 6

• Energy-based routing in 180 iterations

• No reduction of visual complexity: no curve aggregation and no curve widening
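The relation between the iteration count and the fidelity level in this setup can be written down as a small schedule. This is a sketch under assumptions: the fidelity is taken to start at 1, to rise by one level every 30 iterations, and to stay at the target fidelity for all remaining iterations. The function name and the parameter `per_level` are hypothetical.

```python
def fidelity_at(iteration: int, max_fidelity: int, per_level: int = 30) -> int:
    """Fidelity level used in the given 0-based iteration.

    Assumption: one level per `per_level` iterations, capped at `max_fidelity`.
    """
    return min(max_fidelity, iteration // per_level + 1)


# Schedule for the highest examined fidelity f = 6 over 180 iterations.
schedule = [fidelity_at(i, max_fidelity=6) for i in range(180)]
for f in range(1, 7):
    print(f"f = {f}: {schedule.count(f)} iterations")
```

With a lower target fidelity, the same 180 iterations leave more than 30 iterations for the final level, which matches the statement that each level receives at least 30 iterations.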

Experimental Results

The energy was computed for various example hypergraphs identified by their names in
the leftmost column of Table 5.4. Similar to the previous experimental results, Table 5.4
shows the reduction of the hyperedge energies of the centralized structure. To reduce the
number of shown results, this table only shows the ratio of the change of the layout energy
to the corresponding unrouted layout energy. The table’s last row aggregates the measured
results of all ArgoUML hypergraphs. The average reduction of the layout energy and the
standard deviation of this value throughout all ArgoUML hypergraphs summarize these
results.

Again, the measurements of the fully connected structure are omitted in Table 5.4,
because they are very similar to those of the centralized structure. The plots in Figure 5.8
illustrate some additional information about the reduction of the hyperedge energies in selected
example hypergraphs. Each plot depicts the energy E(p) of a computed layout p of
the centralized and the fully connected structure against the different fidelities f. As
centralized structures have fewer curves, their energies are significantly smaller than the
energies of the fully connected structures. So each of the plots in Figure 5.8 separates the
two structures clearly from each other.

Graph                     f = 1   f = 2   f = 3   f = 4   f = 5   f = 6
8Clusters-6                −75%    −88%    −94%    −95%    −95%    −95%
8Clusters-10               −75%    −88%    −94%    −95%    −96%    −96%
8Clusters-16               −75%    −88%    −94%    −95%    −95%    −95%
Hitech                     −76%    −88%    −94%    −96%    −96%    −96%
WorldImport1999-CHN10      −78%    −89%    −95%    −96%    −96%    −96%
WorldImport1999-CHN20      −76%    −88%    −94%    −95%    −96%    −96%
WorldImport1999-GER10      −75%    −89%    −94%    −96%    −96%    −96%
WorldImport1999-GER20      −76%    −88%    −94%    −96%    −96%    −96%
WorldImport1999-USA10      −76%    −89%    −94%    −96%    −96%    −96%
WorldImport1999-USA20      −76%    −88%    −94%    −96%    −96%    −96%
argouml-he* (mean)         −75%    −87%    −93%    −94%    −94%    −94%
argouml-he* (std. dev.)   ±1.4%   ±0.9%   ±0.7%   ±0.7%   ±0.7%   ±0.7%

Table 5.4: Hypergraph layout energy reduction ϱ against increasing curve fidelity f

Conclusion

The second part of this experiment clearly proves that a higher fidelity allows better
hypergraph layouts to be computed with regard to the routing of curves. The total energy of the
routed hyperedges based on curves modeled with a higher fidelity is always smaller than, or
at most equal to, the energy of the same hyperedge that was routed with a lower
fidelity.

The table and the plots reveal a significant reduction of the energy with increasing
fidelity of the curve model in the beginning. The layout computations of all example
hypergraphs not shown here reveal the same result. The energy was never increased. Due
to space constraints, not all of the results that support this can be shown in this thesis.

The energy plots in Figure 5.8 also prove that the layouts converge to an optimal
layout. It cannot be expected that the employed energy minimization algorithm can
further reduce the energy E(p) of the layouts significantly by increasing the fidelity beyond
f = 6. This observation also indicates that the curve model fidelity used in the first part
of the experiment was sufficient for these examples, as the layout quality converges to an
optimum.
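The convergence argument can be retraced on the rounded ratios of Table 5.4. The sketch below uses three rows of that table and checks that the ratio never rises with the fidelity and that the additional gain per level shrinks to zero. The helper names are hypothetical, and the check operates on the rounded percentages only, not on the underlying energies.

```python
# Energy change in percent for f = 1..6, copied from three rows of Table 5.4.
table_5_4 = {
    "8Clusters-6": [-75, -88, -94, -95, -95, -95],
    "Hitech": [-76, -88, -94, -96, -96, -96],
    "WorldImport1999-GER20": [-76, -88, -94, -96, -96, -96],
}


def never_increases(ratios: list[int]) -> bool:
    """True if each fidelity level is at least as good as the previous one."""
    return all(later <= earlier for earlier, later in zip(ratios, ratios[1:]))


def gains(ratios: list[int]) -> list[int]:
    """Additional reduction (in percentage points) gained per fidelity step."""
    return [earlier - later for earlier, later in zip(ratios, ratios[1:])]


for name, ratios in table_5_4.items():
    print(name, never_increases(ratios), gains(ratios))
```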

5.3.2 Experiment 2 – Cluster Intersection

Hypothesis: Energy-based routing avoids cluster intersections by curves.

The previous experiment proved the capability of the energy-based routing technique to
reduce the hypergraph layout energy. The avoidance of node occlusion is a consequence of
the energy reduction. The second aspect of the preservation of the expressiveness of graph
layouts is the avoidance of cluster intersections. The energy model of the energy-based
routing technique was also designed to avoid the intersection of clusters by curves.

Figure 5.8: Reduction of the layout energy of example hypergraphs against increasing
curve fidelity. Panels: (a) Hitech; (b) WorldImport1999-GER20; (c) argouml-he76-hn32;
(d) 8Clusters-10. Each panel plots the energy E(p) against the fidelity f (0 to 6) for the
centralized and the fully connected structure.

This experiment investigates the influence of routing on the number of cluster intersections
by curves. The number of cluster intersections, however, strongly depends on a
proper definition of a clustering, i.e., a partitioning of the nodes. A spatial clustering of
the graph layout is preferred over a graph clustering. A graph clustering is independent
of a layout and solely considers the relations between nodes. Nevertheless, the LinLog
energy model, which is used to compute the example graph layouts except for the
ArgoUML hypergraphs, produces graph layouts that reflect the graph clustering [44, 46].
Thus, the graph clustering that is calculated by the LinLogLayout tool can be used to
identify spatial clusters, as the layout will “represent the cluster structure of graphs by
grouping densely connected nodes and separating sparsely connected nodes” [46].

The graph layouts of the ArgoUML graphs were precomputed and represent the hierarchical
structure of the software system. The nodes are placed according to their affiliation
to packages. Different packages are spatially and visually separated. Thus, the hierarchy
of the software system is used to cluster the ArgoUML graph layout.

Both types of clustering, a graph and a spatial clustering, require proper parameters
to configure a threshold that determines the groups of nodes. That is, a threshold
specifies whether a node belongs to a certain cluster or not. If the threshold is too high, the
clusters can become very large and might contain large gaps between nodes, which could
be wide enough to accommodate curves. Conversely, if the threshold is too low, there will
be many small clusters and thus cluster intersections are not very likely, as curves can
easily curl around the many small clusters. Furthermore, it is impossible to compute a
clustering that complies with human intuition throughout. Therefore, this experiment also
examines drawings of the example hypergraphs.

Definitions

A clustering is a partitioning P of the set of graph nodes V of a hypergraph H = (V, E).
A curve intersects a cluster of the hypergraph if there are nodes of this cluster located
on both sides of the curve. The clusters that contain the hyperedge nodes that are end
points of the curve are not considered, as these intersections are inevitable to connect the
hyperedge nodes with each other.

The number of cluster intersections φ of a hyperedge is the sum of the cluster intersections
of all its curves. If the hyperedge is not routed, φ_0 is easily determined, because curves are
straight-line segments. A routed curve is not straight, which complicates the measurement
of the number of cluster intersections φ_r. The relative position of a node to the curve
is determined by the least distant curve segment, i.e., the straight-line segment between
neighboring dummy nodes.
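The definitions above translate into a short counting procedure. The following is a minimal sketch, not the implementation used for the measurements: a curve is assumed to be a polyline of end points and dummy nodes, the side of a node is taken from the sign of a cross product with its nearest segment, and nodes lying exactly on that segment's line are treated as being on neither side. All function and parameter names are hypothetical.

```python
import math


def _dist_to_segment(p, a, b):
    """Euclidean distance from point p to the straight-line segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.dist(p, a)
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))


def side_of_curve(p, curve):
    """Return +1 or -1 for the side of p relative to its nearest segment, 0 if on its line."""
    a, b = min(zip(curve, curve[1:]), key=lambda s: _dist_to_segment(p, s[0], s[1]))
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return (cross > 0) - (cross < 0)


def count_cluster_intersections(curve, positions, cluster_of, end_nodes):
    """Number of clusters that have nodes on both sides of the curve.

    Clusters containing an end node of the curve are skipped, as in the definition.
    """
    skipped = {cluster_of[n] for n in end_nodes}
    sides = {}
    for node, pos in positions.items():
        cluster = cluster_of[node]
        if cluster not in skipped:
            sides.setdefault(cluster, set()).add(side_of_curve(pos, curve))
    return sum(1 for s in sides.values() if {-1, 1} <= s)


positions = {"u": (0, 0), "v": (10, 0), "c1": (4, 1), "c2": (4, -1), "d1": (3, 2), "d2": (7, 3)}
cluster_of = {"u": "A", "v": "B", "c1": "C", "c2": "C", "d1": "D", "d2": "D"}

straight = [(0, 0), (10, 0)]          # unrouted curve between u and v
routed = [(0, 0), (5, -2), (10, 0)]   # routed curve with one dummy node

phi_0 = count_cluster_intersections(straight, positions, cluster_of, ["u", "v"])
phi_r = count_cluster_intersections(routed, positions, cluster_of, ["u", "v"])
print(phi_0, phi_r)
```

In this example the straight curve separates the two nodes of cluster C, while the routed curve passes below both of them, so the intersection disappears. The number φ of a hyperedge would be obtained by summing this count over all of its curves.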

Setup

The number of cluster intersections per curve is measured before and after routing. To
produce stable hypergraph layouts, the curve model fidelity is set to f = 6 and the energy
is minimized in 180 iterations. The suitability of these settings was already demonstrated
in the first experiment. Furthermore, the hyperedges are not widened or aggregated.

In summary, the experimental results presented next were obtained using the following
setup:

• Centralized and fully connected hyperedge structure

• Fidelity f = 6 of the curve model

• Each level of fidelity is routed in 30 iterations

• No reduction of visual complexity: no curve aggregation and no curve widening

Experimental Results

Table 5.5 contrasts the number φ_0 of cluster intersections of the initial layout with the
number φ_r of the routed layout for the three pseudo-random hypergraphs. The results
are shown separately for the different hyperedge structures. The Hitech and the World Trade
hypergraphs are too small, as they contain only three clusters each, and thus no cluster
intersections were measured.

For the ArgoUML example, two clusterings of the underlying graph of ArgoUML were
used. A fine-grained clustering of 40 clusters groups classes with respect to their package
affiliation on the lowest package level. A coarse-grained clustering of 18 clusters is derived
if the second highest package level is chosen. Table 5.6 shows the measurements of
hypergraphs with hyperedges that connect nodes of distinct clusters.

Graph           Structure          φ_0   φ_r
8Clusters-6     centralized          4     0
                fully connected      4     0
8Clusters-10    centralized          7     1
                fully connected      8     1
8Clusters-16    centralized          5     0
                fully connected     29     5

Table 5.5: Cluster intersections before and after energy-based routing

Conclusion

Tables 5.5 and 5.6 verify the reduction of cluster intersections by energy-based routing.
The higher number of curves of a fully connected structure causes more cluster intersections
compared to a centralized structure.

In compliance with software modularity, a change in the software system should not
affect artifacts of multiple subsystems. Consequently, the hypergraphs representing the
co-change of the ArgoUML software system should not be proper test candidates for this
experiment. However, the co-change of classes of the ArgoUML tool turned out to be
scattered among several packages.

The fine-grained clustering features a higher probability of cluster intersections, since
there are more clusters that can be intersected between the end points of curves. Both cases
show that the number of initial cluster intersections φ_0 was generally reduced by energy-based
routing. However, it is also possible that cluster intersections are introduced if a
path that intersects a cluster has a lower energy than the initial path. In our experiments,
this was the case for the hypergraphs “argouml-he16-hn19”, “argouml-he54-hn23”, and
“argouml-he55-hn11” using the coarse-grained clustering. This observation confirms the
statement from above that a coarse clustering impedes the evaluation of the avoidance of
cluster intersections.

Figure 5.9 depicts computed hypergraph drawings of the pseudo-random hypergraphs
and allows the comparison of the unrouted and the routed layout in the centralized
structure. These layouts confirm the avoidance of cluster intersections. The clustering is
visualized by the colors of nodes.

The layout in Figure 5.9d could not resolve one cluster intersection. This observation
is consistent with Table 5.5. This case may occur if dummy nodes of the respective curve
are trapped in local energy minima, since the nodes of the red cluster repulse the curve
equally from both sides.

The previous experiment in Section 5.3.1.1 proved the avoidance of node occlusion by the
energy-based routing technique. Combined with the capability to avoid the intersection
of clusters by curves, it is proven that the routing technique preserves the expressiveness
(cf. Section 2.2.2.2) of fixed graph layouts. This requirement is fulfilled if hyperedges are
routed using the proposed energy-based technique.

                                         coarse-grained    fine-grained
                                         clustering        clustering
Graph                Structure           φ_0    φ_r        φ_0    φ_r
argouml-he16-hn19    centralized           0      1          9      7
                     fully connected       7      3        108     44
argouml-he20-hn4     centralized           2      0          4      0
                     fully connected       0      0          4      0
argouml-he21-hn5     centralized           2      0          4      0
                     fully connected       2      0          6      2
argouml-he22-hn12    centralized           0      0          0      0
                     fully connected       3      0          5      0
argouml-he38-hn8     centralized           7      2          7      1
                     fully connected       9      2          9      1
argouml-he43-hn6     centralized           4      0          5      0
                     fully connected       0      0          2      0
argouml-he50-hn5     centralized           0      0          0      0
                     fully connected       3      1          3      1
argouml-he54-hn23    centralized           0      1          2      1
                     fully connected      42     11         90     28
argouml-he55-hn11    centralized           1      2          1      0
                     fully connected      17     10         20     14
argouml-he61-hn8     centralized           0      0          3      1
                     fully connected       1      1          6      1
argouml-he63-hn6     centralized           1      1          6      0
                     fully connected       0      0          5      3

Table 5.6: Cluster intersections before and after energy-based routing of a coarse-grained
and a fine-grained clustering of the ArgoUML software system

5.4 Reduction of Visual Complexity

The remaining requirement of hypergraph visualizations is the reduction of the visual
complexity of hypergraphs. Visual complexity is crucial but also difficult to formalize and
to measure. Therefore, the generated hypergraph layouts are of particular importance to
examine whether the proposed techniques succeed in fulfilling this requirement. A model-based
curve aggregation technique and a visual curve bundling technique are evaluated in
Sections 5.4.1 and 5.4.2, respectively.

5.4.1 Experiment 3 – Model-Based Aggregation

Hypothesis: The cluster-based aggregation of curves reduces the visual complexity
of hypergraph layouts.

The visual complexity of hypergraph drawings is reduced by a reduction of the cognitive
load, i.e., the amount of information that a viewer has to process in order to comprehend
a hypergraph drawing. As already mentioned in Section 4.6.4 about visual complexity, a
reduction of the total curve length of a hyperedge signifies a reduction of the cognitive load.
This experiment therefore measures the reduction of the total curve length of hyperedges.

Figure 5.9: Energy-based routing avoids node occlusion and cluster intersection. Panels:
(a) 8Clusters-6 without routing; (b) 8Clusters-6 after energy-based routing;
(c) 8Clusters-10 without routing; (d) 8Clusters-10 after energy-based routing;
(e) 8Clusters-16 without routing; (f) 8Clusters-16 after energy-based routing.

The cluster-based curve aggregation technique is examined in this experiment. This
model-based aggregation is independent of routing, so the hypergraphs are not assumed
to be routed before the technique is applied. The hypergraph drawings at the end of this
experiment will show routed hypergraph layouts.

Definitions

The total curve length λ(ε) of a hyperedge ε is defined in Equation 5.3 below. As the curves
are not routed, the length len(c) of each curve c ∈ C(ε) is determined by the Euclidean
distance between its two end points.

$$\lambda(\varepsilon) = \sum_{c \in C(\varepsilon)} \operatorname{len}(c) \tag{5.3}$$

After an aggregation of curves, the total curve length λ_a(ε) is calculated analogously.
The aggregation replaces aggregated parts of curves by a single curve in the hyperedge
model. Thus, each aggregated path is counted once. Note the distinction between the length
of an aggregated curve and the weight of an aggregated curve, which denotes the sum of
the individual curve lengths (cf. Section 4.6.1) for the curve aggregation algorithm.
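The two length measures can be sketched in Python as follows. The data layout is an assumption made for illustration: an unrouted curve is a pair of end points, and an aggregated hyperedge is given as one path (a list of points) per original curve, where paths may share segments. Shared segments are counted once, as stated above. The function names and the example coordinates are hypothetical.

```python
import math


def total_curve_length(curves):
    """lambda(eps): sum of the Euclidean end-to-end lengths of unrouted curves."""
    return sum(math.dist(a, b) for a, b in curves)


def aggregated_curve_length(paths):
    """lambda_a(eps): every distinct segment is counted once across all paths."""
    segments = set()
    for path in paths:
        for a, b in zip(path, path[1:]):
            if a != b:
                # Sort the end points so that a-b and b-a denote the same segment.
                segments.add(tuple(sorted((a, b))))
    return sum(math.dist(a, b) for a, b in segments)


# Centralized hyperedge: a center at the origin and two nearby hyperedge nodes.
center = (0.0, 0.0)
curves = [(center, (10.0, 1.0)), (center, (10.0, -1.0))]

# After aggregation, both curves share the trunk from the center to (8, 0).
paths = [
    [center, (8.0, 0.0), (10.0, 1.0)],
    [center, (8.0, 0.0), (10.0, -1.0)],
]

lam_0 = total_curve_length(curves)
lam_a = aggregated_curve_length(paths)
rho = (lam_a - lam_0) / lam_0
print(f"lambda_0 = {lam_0:.2f}, lambda_a = {lam_a:.2f}, rho = {rho:+.0%}")
```

The weight of the shared trunk in the sense of Section 4.6.1 would instead add up the individual curve lengths, so it is deliberately not what this sketch computes.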

Setup

Both hyperedge structures were tested regarding the curve aggregation. As the curves
are not routed for the measurement of the total curve lengths of hyperedges, no further
settings of routing and widening have to be specified. The maximum azimuth angle Δθ_max
between curves that limits the aggregation of curves is varied in this experiment.

The experimental results presented next were obtained using the following setup:

• Centralized and fully connected hyperedge structure

• Maximum azimuth angle Δθ_max ∈ {50°, 70°, 90°}

Experimental Results

The results presented in the following are shown for the centralized and the fully connected
structure separately. Table 5.7 shows the ratio ϱ = (λ_a − λ_0) / λ_0 of the change of the
total curve length, i.e., λ_a − λ_0, to the initial (not aggregated) total curve length λ_0 for
three different values of the maximum azimuth angle Δθ_max.

The example hypergraphs shown in this table are selected in order to show a variety of
different hyperedge sizes. The omitted hypergraphs show similar results. The total curve
length of hyperedges can be decreased by the aggregation of curves. An increase of the
maximum azimuth angle does not always entail a reduction of the total curve length. A
more detailed investigation of the relative reduction of the total curve length of centralized
hyperedges against differently specified maximum azimuth angles is shown in the plots of
Figure 5.10.

Graph                    Structure          Δθ_max = 50°   Δθ_max = 70°   Δθ_max = 90°
8Clusters-10             centralized            −21%           −24%           −13%
                         fully connected        −48%           −53%           −56%
8Clusters-16             centralized            −43%           −35%           −36%
                         fully connected        −62%           −62%           −59%
Hitech                   centralized             −5%           −11%           −11%
                         fully connected        −50%           −57%           −60%
WorldImport1999-GER10    centralized            −44%           −51%           −53%
                         fully connected        −65%           −67%           −68%
WorldImport1999-GER20    centralized            −40%           −49%           −55%
                         fully connected        −66%           −68%           −60%
argouml-he1-hn4          centralized            −13%           −13%           −15%
                         fully connected        −29%           −32%           −39%
argouml-he37-hn10        centralized            −35%           −36%           −37%
                         fully connected        −55%           −57%           −57%
argouml-he46-hn12        centralized            −43%           −47%           −49%
                         fully connected        −56%           −60%           −59%
argouml-he54-hn23        centralized            −48%           −35%           −39%
                         fully connected        −61%           −58%           −52%
argouml-he75-hn18        centralized            −34%           −40%            −1%
                         fully connected        −58%           −54%           −60%

Table 5.7: Change of total curve lengths by cluster-based curve aggregation

Figure 5.10: Reduction of total curve length against the threshold angle Δθ_max. Both panels
plot −ϱ in percent against Δθ_max in degrees (20 to 100): (a) 8Clusters-10, Hitech,
WorldImport1999-GER10, and WorldImport1999-GER20; (b) argouml-he1-hn4,
argouml-he54-hn23, and argouml-he75-hn18.

Conclusion

The aggregation of curves reduces the total curve length. This reduction causes a decline
of the amount of visualized information, the cognitive load. The relative reduction of the
total curve length of fully connected structures is consistently higher than that of their
centralized counterparts. This is because curves of the fully connected structure can be
aggregated at both ends. Still, this higher reduction of the visual complexity by curve
aggregation does not compensate for the considerably higher number of curves compared to
the centralized structure.

The plots in Figure 5.10 reveal that the maximum azimuth angle does not correlate with
a reduction of the total curve length of hyperedges. A value of the maximum azimuth
angle that is optimal for any hypergraph cannot be determined. The relation between the angle
and the curve length reduction is specific to each hyperedge. An increasing azimuth angle
also implies the possibility to increase the length of the connections between hyperedge
nodes, as the divergence of the shared path and the individual curves increases.

This experiment proves that the choice of a value of the maximum azimuth angle Δθ_max
is not crucial for the reduction of the total curve length of hyperedges. Consequently, it
is left to the viewer’s preference to specify this angle.

Figures 5.11 through 5.15 show the produced graph drawings for various values of Δθ_max.
A comparison of routed hypergraph layouts that were not aggregated with layouts
that were aggregated and routed is depicted in Figures 5.11 through 5.13.
The remaining figures depict further visual results of the curve aggregation technique,
separated for both hyperedge structures.

The graph drawings of hyperedges in the fully connected structure reveal that the
cluster-based curve aggregation technique can also be applied to this structure, as in
Figures 5.15 and 5.12d. But as the curves are aggregated for each hyperedge node individually,
the incident curves of close hyperedge nodes are similarly placed in the layout, i.e.,
the different aggregated paths are close to each other, but do not overlap. A mesh of
intersecting curves is created in the central area of an aggregated fully connected hyperedge.
Since this curve aggregation technique only aggregates adjacent edges, it is recommended
to further aggregate non-adjacent curves.

The visual complexity of the depicted hypergraph drawings was reduced. The visualizations
of small (e.g., in Figures 5.14a and 5.14d) and large hyperedges (e.g., in Figures 5.13b
and 5.14b) clearly benefit from the aggregation of curves.

5.4.2 Experiment 4 – Visual Bundling

Hypothesis: The energy-based curve widening technique reduces the visual
complexity of hypergraph layouts and still avoids the occlusion of nodes.

The energy-based curve widening technique increases the line width of the visual representation
of curves. Like ordinary curves, widened curves must not occlude nodes. This
experiment measures the node occlusion caused by a widening of curves. To this end, it is
assumed that the routing of (not widened) curves prevents node occlusion, as the previous
experiment in Section 5.3.1.1 has proven.

Widened curves cover areas that are described as polygons in the graph layout area.
The position and shape of a polygon are determined by the positioning of the hull points of
the respective curve. The number of nodes that are occluded by a polygon is a measure for
the quality of layouts produced by the widening technique with respect to the preservation
of the layout’s expressiveness.

The accuracy of the curve model determines the accuracy of the hull, because the
number of hull points depends on the number of dummy nodes that are used to model a
curve. As the hull model is an approximation of the widened curve, node occlusion cannot
always be prevented. In this respect, the approximation of the hull by hull points implies
similar challenges as the approximation of curves by dummy nodes. The quality of layouts
produced by the widening technique basically depends on the curve model fidelity and the
quality of the layout of routed curves. The curve widening technique with a low curve
model fidelity, i.e., large distances between neighboring hull points, is more susceptible to
node occlusion.

Figure 5.11: Curve aggregation of 8Clusters-16 in the centralized structure. Panels: (a) no
curve aggregation; (b) curve aggregation with Δθ_max = 20°; (c) curve aggregation with
Δθ_max = 50°.

Figure 5.12: Curve aggregation of the Hitech hypergraph in the centralized structure in
(a), (b) and the fully connected structure in (c), (d). Panels: (a) no curve aggregation;
(b) curve aggregation with Δθ_max = 70°; (c) no curve aggregation; (d) curve aggregation
with Δθ_max = 70°.

Figure 5.13: Curve aggregation of the argouml-he16-hn19 hypergraph in the centralized
structure. Panels: (a) no curve aggregation; (b) curve aggregation with Δθ_max = 70°.

Figure 5.14: Curve aggregation of various example hypergraphs in the centralized structure.
Panels: (a) argouml-he12, Δθ_max = 60°; (b) argouml-he54, Δθ_max = 50°; (c) argouml-he55,
Δθ_max = 70°; (d) 8Clusters-6, Δθ_max = 90°; (e) WorldImport1999-GER10, Δθ_max = 70°;
(f) WorldImport1999-USA20, Δθ_max = 30°.

Figure 5.15: Curve aggregation of ArgoUML hypergraphs in the fully connected structure.
Panels: (a) argouml-he21-hn5, Δθ_max = 90°; (b) argouml-he47-hn6, Δθ_max = 90°.

Definitions

The number Ω of nodes that are occluded by a hyperedge is the sum of the numbers of nodes that are occluded by the polygons representing its widened curves. The same node might be occluded by several widened curves. The occlusion of such a node is counted multiple times, because the widening technique works on each curve individually and the multiple occlusion of a node is a flaw in each curve widening.

The ratio of the number of nodes actually occluded by a polygon to the number of nodes potentially occluded by a polygon is denoted by ϱ. In contrast to the absolute number of node occlusions, this ratio takes the number of nodes in the close proximity of a curve into account. The close proximity of a curve is determined by the maximum curve width Wc that was specified to adjust the energies. A potentially occluded node is a node located in the polygon of a curve with maximum width.

The ratio ϱ is used in this experiment as an indicator for the error rate of the curve widening technique. A low value of ϱ means that only a small percentage of the nodes that lie in the potential area of the widened curve were actually occluded by the produced polygon.
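The two measures can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the thesis: nodes are treated as points, each widened curve is given as a simple polygon, the polygon of the same curve at maximum width Wc is supplied separately, and ϱ for a hyperedge is formed from the counts summed over all of its curves. The names `point_in_polygon` and `occlusion_metrics` are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Polygon = Sequence[Point]


def point_in_polygon(p: Point, polygon: Polygon) -> bool:
    """Even-odd ray casting test; boundary points get no special treatment."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does the edge cross the horizontal line through p?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def occlusion_metrics(
    nodes: Sequence[Point],
    widened: Sequence[Polygon],
    max_width: Sequence[Polygon],
) -> Tuple[int, float]:
    """Return (omega, rho) for one hyperedge.

    widened[i] is the polygon produced for curve i, max_width[i] the polygon
    of the same curve at maximum width (assumed inputs). A node occluded by
    several curves is counted once per curve, as in the definition of omega.
    """
    omega = 0
    potential = 0
    for poly, max_poly in zip(widened, max_width):
        omega += sum(point_in_polygon(n, poly) for n in nodes)
        potential += sum(point_in_polygon(n, max_poly) for n in nodes)
    rho = omega / potential if potential else 0.0
    return omega, rho
```

With one node inside the produced polygon and two nodes inside the maximum-width polygon, the sketch reports Ω = 1 and ϱ = 0.5.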

Setup

The hypergraph layouts are already routed as in the first experiment. The energy-based widening algorithm minimized the energy of the hull in 50 iterations, which is more than adequate to compute stable layouts for these example graphs. The experimental results presented next were obtained using the following setup:

• Centralized and fully connected hyperedge structure

• Fidelities f ∈ {4, 6, 8} of the curve model

• Energy-based routing in 180 iterations, i.e., each level of fidelity is routed in at least 30 iterations

• Curve widening in 50 iterations

• No curve aggregation

Experimental Results

Table 5.8 shows the number Ω of occluded nodes and the ratio ϱ for several example hypergraphs. These examples were selected with respect to the size of the hyperedge. Hypergraphs without node occlusions for any curve model fidelity f are omitted. The results are listed separately for both hyperedge structures and the curve model fidelities.

Conclusion

This experiment shows that the number of node occlusions drops sharply with increasing curve model fidelity. As the results in Table 5.8 show, node occlusions are almost entirely prevented with the fidelity f = 8 of the curve model. The fidelity of the curve model influences, as expected, the widening results significantly. A higher curve fidelity, which corresponds to a smaller distance between neighboring hull points, avoids node occlusion. This experiment reveals that the energy-based curve widening technique requires a higher curve model fidelity to produce hypergraph layouts of very high quality. In comparison, the energy-based curve routing technique already produced hypergraph layouts of very high quality with a fidelity f = 6.

The fully connected structure of hyperedges tends to occlude more nodes than the centralized structure. The absolute number Ω of node occlusions of the fully connected structure is higher, as this structure features a larger number of curves. The ratio ϱ is also higher in most cases, because the layouts of the ArgoUML software system and the pseudo-random graph are clearly spatially clustered. Hyperedge nodes are affiliated to clusters and incident curves are probably routed through these clusters. As the occlusions of nodes by each curve are counted, the higher number of curves of the fully connected structure also increases the ratio.

The widening of curves reduces the visual complexity of hyperedges by visually bundling close curves together. The computed hypergraph drawings in Figure 5.16 demonstrate the reduction of the cognitive load of hyperedges. The planes cover the tracks of the close curves and thereby bundle them visually and simplify the displayed structure of hyperedges. The hypergraph drawings clearly spare the nodes of the graphs, e.g., if curves intersect clusters to connect hyperedge nodes. Hyperedges in the fully connected structure show the same effects as the ones in the centralized structure. For instance, Figure 5.16e shows the fully connected hyperedge of five hyperedge nodes. The curves in this drawing are visually bundled as the widened curves overlap and thus do not reveal individual curves anymore.

| Graph | Structure | Ω (f = 4) | ϱ (f = 4) | Ω (f = 6) | ϱ (f = 6) | Ω (f = 8) | ϱ (f = 8) |
|---|---|---|---|---|---|---|---|
| 8Clusters-6 | centralized | 7 | 6.4% | 1 | 1.0% | 0 | 0.0% |
| 8Clusters-6 | fully connected | 21 | 7.9% | 6 | 2.3% | 0 | 0.0% |
| 8Clusters-10 | centralized | 15 | 8.6% | 3 | 1.9% | 0 | 0.0% |
| 8Clusters-10 | fully connected | 66 | 17.0% | 20 | 5.2% | 0 | 0.0% |
| 8Clusters-16 | centralized | 19 | 8.8% | 3 | 1.4% | 0 | 0.0% |
| 8Clusters-16 | fully connected | 111 | 28.0% | 34 | 8.7% | 1 | 0.3% |
| argouml-he13-hn7 | centralized | 5 | 13.5% | 0 | 0.0% | 0 | 0.0% |
| argouml-he13-hn7 | fully connected | 23 | 25.8% | 6 | 6.6% | 0 | 0.0% |
| argouml-he17-hn5 | centralized | 2 | 40.0% | 0 | 0.0% | 0 | 0.0% |
| argouml-he17-hn5 | fully connected | 1 | 20.0% | 0 | 0.0% | 0 | 0.0% |
| argouml-he24-hn6 | centralized | 17 | 13.2% | 4 | 3.1% | 0 | 0.0% |
| argouml-he24-hn6 | fully connected | 82 | 24.9% | 36 | 10.5% | 4 | 1.2% |
| argouml-he38-hn8 | centralized | 10 | 18.2% | 2 | 3.7% | 0 | 0.0% |
| argouml-he38-hn8 | fully connected | 13 | 12.9% | 4 | 3.8% | 2 | 1.9% |
| argouml-he42-hn4 | centralized | 30 | 17.1% | 5 | 2.8% | 0 | 0.0% |
| argouml-he42-hn4 | fully connected | 55 | 18.0% | 33 | 10.0% | 8 | 2.5% |
| argouml-he44-hn4 | centralized | 1 | 6.7% | 0 | 0.0% | 0 | 0.0% |
| argouml-he44-hn4 | fully connected | 5 | 25.0% | 3 | 14.3% | 0 | 0.0% |
| argouml-he45-hn28 | centralized | 17 | 13.4% | 4 | 3.2% | 0 | 0.0% |
| argouml-he45-hn28 | fully connected | 138 | 67.6% | 52 | 25.4% | 9 | 4.4% |
| argouml-he51-hn10 | centralized | 13 | 13.7% | 4 | 4.3% | 0 | 0.0% |
| argouml-he51-hn10 | fully connected | 106 | 48.8% | 33 | 14.7% | 2 | 0.9% |
| argouml-he56-hn8 | centralized | 0 | 0.0% | 0 | 0.0% | 0 | 0.0% |
| argouml-he56-hn8 | fully connected | 2 | 14.3% | 0 | 0.0% | 0 | 0.0% |
| argouml-he6-hn9 | centralized | 0 | 0.0% | 0 | 0.0% | 0 | 0.0% |
| argouml-he6-hn9 | fully connected | 4 | 20.0% | 0 | 0.0% | 0 | 0.0% |
| argouml-he7-hn14 | centralized | 35 | 14.0% | 6 | 2.4% | 0 | 0.0% |
| argouml-he7-hn14 | fully connected | 252 | 60.9% | 129 | 30.6% | 10 | 2.4% |
| argouml-he72-hn7 | centralized | 20 | 10.6% | 4 | 2.0% | 0 | 0.0% |
| argouml-he72-hn7 | fully connected | 108 | 30.4% | 32 | 8.6% | 2 | 0.5% |
| argouml-he73-hn5 | centralized | 9 | 12.9% | 3 | 4.2% | 0 | 0.0% |
| argouml-he73-hn5 | fully connected | 30 | 19.7% | 9 | 6.2% | 0 | 0.0% |
| argouml-he74-hn13 | centralized | 31 | 10.4% | 4 | 1.3% | 0 | 0.0% |
| argouml-he74-hn13 | fully connected | 197 | 44.1% | 81 | 17.4% | 20 | 4.3% |

Table 5.8: Node occlusion for different curve model fidelities f
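As a cross-check of the trend described in this conclusion, the Ω columns of Table 5.8 can be summed per structure and fidelity. The following Python sketch only restates the Ω values from the table; the names `OMEGA` and `totals` are hypothetical and not part of the thesis.

```python
# Each row: graph, (omega at f=4, f=6, f=8) for the centralized structure,
# and the same triple for the fully connected structure (from Table 5.8).
OMEGA = [
    ("8Clusters-6", (7, 1, 0), (21, 6, 0)),
    ("8Clusters-10", (15, 3, 0), (66, 20, 0)),
    ("8Clusters-16", (19, 3, 0), (111, 34, 1)),
    ("argouml-he13-hn7", (5, 0, 0), (23, 6, 0)),
    ("argouml-he17-hn5", (2, 0, 0), (1, 0, 0)),
    ("argouml-he24-hn6", (17, 4, 0), (82, 36, 4)),
    ("argouml-he38-hn8", (10, 2, 0), (13, 4, 2)),
    ("argouml-he42-hn4", (30, 5, 0), (55, 33, 8)),
    ("argouml-he44-hn4", (1, 0, 0), (5, 3, 0)),
    ("argouml-he45-hn28", (17, 4, 0), (138, 52, 9)),
    ("argouml-he51-hn10", (13, 4, 0), (106, 33, 2)),
    ("argouml-he56-hn8", (0, 0, 0), (2, 0, 0)),
    ("argouml-he6-hn9", (0, 0, 0), (4, 0, 0)),
    ("argouml-he7-hn14", (35, 6, 0), (252, 129, 10)),
    ("argouml-he72-hn7", (20, 4, 0), (108, 32, 2)),
    ("argouml-he73-hn5", (9, 3, 0), (30, 9, 0)),
    ("argouml-he74-hn13", (31, 4, 0), (197, 81, 20)),
]


def totals(structure: str) -> tuple:
    """Sum omega over all listed graphs for f = 4, 6, 8."""
    column = 1 if structure == "centralized" else 2
    rows = [row[column] for row in OMEGA]
    return tuple(sum(values) for values in zip(*rows))


if __name__ == "__main__":
    for structure in ("centralized", "fully connected"):
        print(structure, totals(structure))
```

Over the listed examples, the centralized structure accumulates 231, 43, and 0 occlusions for f = 4, 6, and 8, and the fully connected structure 1214, 478, and 58. These sums are plain arithmetic on the table and carry no statistical weight beyond it.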

5.5 Comprehensive Hypergraph Layouts

After a large number of measurements and graph drawings were discussed to evaluate the layouts against the individual criteria, this section demonstrates graph drawings that result from the combination of the proposed techniques. As we are interested in software visualizations in particular, Figure 5.17 shows six hypergraph layouts of the ArgoUML software system. Each hyperedge represents the co-change of Java classes and connects those classes that were changed in one commit.

The visual complexity of the shown hyperedge layouts was first reduced by the cluster-based curve aggregation technique. Then, the aggregated curves were routed by the energy-based technique. Additionally, the curves were widened at the end of the layout computation.

The first four drawings in Figures 5.17a through 5.17d show hyperedges that are widely distributed in the layout area. Usually, co-change visualizations of modularized software systems are limited to a small part of the layout area and thus look similar to those in Figures 5.17e and 5.17f.

Figure 5.16: Energy-based curve widening applied to routed hypergraphs. (a) 8Clusters-10; (b) 8Clusters-16; (c) argouml-he16-hn19; (d) argouml-he22-hn12; (e) argouml-he30-hn5; (f) argouml-he47-hn6; (g) argouml-he54-hn23.

Figure 5.17: Hypergraph visualizations of co-change of the ArgoUML software system. Hyperedges are aggregated, routed, and widened. (a) argouml-he12-hn6, centralized; (b) argouml-he12-hn6, fully connected; (c) argouml-he21-hn5, centralized; (d) argouml-he47-hn6, centralized; (e) argouml-he68-hn9, centralized; (f) argouml-he11-hn4, fully connected.

6 Summary

In this chapter we first summarize the goals and achievements of this work in Section 6.1. Then, Section 6.2 lists areas where future work can build upon the findings introduced in this thesis.

6.1 Conclusions

The visualization of hypergraphs has a wide field of applications. With the focus on the visualization of software systems, hypergraphs allow n-ary relations between software artifacts to be modeled. A visualization of hypergraphs enables a viewer to get an overview of the relationships between the depicted artifacts.

Since the positions of nodes usually represent a certain property, e.g., the graph clustering, the graph layout is assumed to be immutable when hyperedges are added. This is the major contrast to other available hypergraph visualization techniques. Furthermore, the visualization of hyperedges must not cover nodes, in order to maintain the expressiveness of the entire graph layout. Cluster intersections must also be avoided for the same reason. The preservation of the expressiveness of the given graph layouts was not used before as a requirement to visualize hypergraphs. Three aspects constitute the preservation of the expressiveness of graph layouts: an immutable graph layout, the avoidance of node occlusion, and the avoidance of cluster intersections.

This thesis also focused on the readability of the produced hypergraph drawings. Additional hyperedges in graph drawings increase the cognitive load and may overload the viewer’s perception. Therefore, this work aimed at hypergraph visualization techniques that produce layouts with low visual complexity.

Solutions. Several techniques that compute hypergraph visualizations and fulfill our requirements were introduced. None of the techniques alters the given graph layout. First, the choice of the hyperedge structure influences the visual complexity and readability of hypergraph drawings, as discussed in Section 2.3. Second, the energy-based routing technique introduced in Section 3.4 is applied to hyperedges to avoid the occlusion of nodes. A routing technique based on cluster bounds was found to be inappropriate for meeting the requirements.

The visual complexity of hypergraph drawings that is caused by hyperedges is reduced by the techniques described in Chapter 4. They all aim at a reduction of the number of visualized objects, i.e., the cognitive load, either by simplifying the hyperedge model or by visually bundling curves. The first option, the simplification of the model, is independent of a later visualization. Thus, it does not need to consider the graph layout and is applied before routing. The second option, the visual bundling of curves, changes the curve layout and thus must consider the graph layout to fulfill our requirements of hypergraph visualizations.

Results. The experimental evaluation in Chapter 5 demonstrated the avoidance of node occlusion by the energy-based curve routing technique. The occlusion of nodes cannot be measured directly. The reduction of hyperedge layout energy is an appropriate measure of the avoidance of node occlusion and thus indicates the hypergraph layout quality.

The routing of curves also reduces intersections of clusters. Cluster intersections were not always prevented, as the dummy nodes of the corresponding curves can get trapped in local energy minima. This is not a drawback of the defined energy model for routing, but a shortcoming of the energy minimization algorithm, which was not the focus of this thesis.

As an important outcome of this work, the proposed energy-based routing technique can also be applied to lay out any binary connection between nodes. The visualization of routed binary edges of ordinary graphs equally benefits from the avoidance of node occlusions. Edges visualized by straight-line segments introduce visual clutter that obscures nodes, as illustrated in Figure 2.2b on page 14.

Four techniques to reduce the visual complexity of hypergraph drawings were discussed. The cluster-based aggregation of curves is preferred over the laborious combination of the energy-based bundling and the energy-threshold-based aggregation technique. Consequently, the cluster-based aggregation was implemented and evaluated. Besides the measurement of the cognitive load by the total curve length of hyperedges, the variety of example graph drawings in Figures 5.11 through 5.15 demonstrates the achievements of this approach. The readability of the drawings was significantly increased.

The energy-based widening technique also reduced the visual complexity. The curves in the hypergraph drawings in Figure 5.16 were not bundled as strongly as by the cluster-based aggregation. Nevertheless, plane representations of curves may increase the readability of hypergraph drawings due to the orthogonality of planes to the boxes and lines of the remaining graph drawing.

Finally, all experiments shown in Sections 5.3 and 5.4 enable us to assess the stability of the evaluated layout techniques. The measurements consistently lead to the same conclusions, such as the significant reduction of the hypergraph layout energy for all examined example hypergraphs, and the reduction of visual complexity. The hypergraph drawings confirmed these observations, as the avoidance of node occlusion and cluster intersection is visually noticeable. The centralized structure of hyperedges is the preferred choice to visualize hypergraphs, as it utilizes less space in the graph layout area and thus significantly lowers the cognitive load in comparison to the fully connected structure.

In summary, this thesis proposed and evaluated techniques that produce hypergraph layouts meeting the stated requirements of hypergraph visualization.


6.2 Future Work

This thesis is a first step towards hypergraph visualizations in fixed graph layouts that, among other things, do not occlude nodes. Several ideas that promise improvement were not examined in the scope of this thesis. This section documents open approaches that may serve as a foundation of further research on this topic.

The curve model fidelity is independent of the graph layout surrounding the curves. To increase the quality of routed layouts, the curve model fidelity can be increased locally at those parts of the curves that are closely surrounded by nodes. Thus, the quality of the layouts can be increased with only a minimal increase in computational effort.
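One possible reading of this idea is sketched below in Python. It is an illustration under assumptions, not a technique evaluated in this thesis: a curve is a polyline of dummy nodes, graph nodes are points, and a segment counts as closely surrounded when some node lies within a hypothetical distance `radius` of it. Such a segment receives one additional dummy node at its midpoint.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def dist_point_segment(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the segment from a to b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def refine_locally(
    curve: Sequence[Point], nodes: Sequence[Point], radius: float
) -> List[Point]:
    """Insert a midpoint dummy node on each segment that has a node nearby.

    Segments far from all nodes are left unchanged, so the fidelity only
    grows where the curve is closely surrounded by nodes.
    """
    refined = [curve[0]]
    for a, b in zip(curve, curve[1:]):
        if any(dist_point_segment(n, a, b) < radius for n in nodes):
            refined.append(((a[0] + b[0]) / 2, (a[1] + b[1]) / 2))
        refined.append(b)
    return refined
```

Applying `refine_locally` repeatedly would raise the local fidelity step by step; how the inserted dummy nodes interact with the energy model is left open here.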

The routing process is assumed to start with a curve model fidelity of f = 1. A higher value of f might be reasonable. Properties of the given graph layout, e.g., the minimal distance between graph nodes, can be used to determine a proper initial value of f.

The assessment of readability and visual complexity of hypergraph drawings was chiefly done by the author of the present work. However, such highly subjective assessments depend on the viewer’s professional background and her or his knowledge of box-and-line graph drawings. Therefore, user studies are necessary to substantiate the statements in this work.

The computational performance of the accompanying prototype implementation of the energy-based curve routing and the energy-based curve widening technique can be improved. Currently, each node of the graph repels each dummy node and hull point individually. The Barnes-Hut algorithm [10] spatially divides the graph layout area into a tree to approximate a set of distant nodes by one combined node. Only the repulsion of nodes close to dummy nodes or hull points is still computed individually. Thus the computation of the node repulsion forces of the energy-based routing and widening techniques can be considerably simplified.
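The approximation can be sketched in Python as follows. This is a minimal illustration and not the prototype implementation: it assumes point-shaped graph nodes of unit weight and a repulsion force of magnitude 1/d at distance d, which need not match the energy model of this thesis. The names `QuadTree`, `build_tree`, `exact_repulsion`, `MAX_DEPTH`, and the opening parameter `theta` are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
MAX_DEPTH = 32  # stops subdivision for coincident nodes


class QuadTree:
    """Square cell that stores the number and the coordinate sums of its nodes."""

    def __init__(self, cx: float, cy: float, half: float):
        self.cx, self.cy, self.half = cx, cy, half
        self.count = 0
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.point: Optional[Point] = None
        self.children: Optional[List["QuadTree"]] = None

    def _child_for(self, p: Point) -> "QuadTree":
        index = (2 if p[0] >= self.cx else 0) + (1 if p[1] >= self.cy else 0)
        return self.children[index]

    def insert(self, p: Point, depth: int = 0) -> None:
        self.count += 1
        self.sum_x += p[0]
        self.sum_y += p[1]
        if self.count == 1:
            self.point = p
            return
        if depth >= MAX_DEPTH:
            return  # keep very close nodes merged in this leaf
        if self.children is None:
            h = self.half / 2
            self.children = [
                QuadTree(self.cx + sx * h, self.cy + sy * h, h)
                for sx in (-1, 1) for sy in (-1, 1)
            ]
            old, self.point = self.point, None
            self._child_for(old).insert(old, depth + 1)
        self._child_for(p).insert(p, depth + 1)

    def repulsion(self, q: Point, theta: float = 0.5) -> Tuple[float, float]:
        """Approximate repulsion on q (a dummy node or hull point)."""
        if self.count == 0:
            return (0.0, 0.0)
        dx = q[0] - self.sum_x / self.count
        dy = q[1] - self.sum_y / self.count
        dist = math.hypot(dx, dy)
        # A leaf, or a cell that is small relative to its distance from q,
        # is replaced by one combined node at the centroid of its nodes.
        if self.children is None or (dist > 0 and 2 * self.half / dist < theta):
            if dist == 0:
                return (0.0, 0.0)
            scale = self.count / (dist * dist)
            return (dx * scale, dy * scale)
        fx = fy = 0.0
        for child in self.children:
            cfx, cfy = child.repulsion(q, theta)
            fx += cfx
            fy += cfy
        return (fx, fy)


def build_tree(nodes: Sequence[Point]) -> QuadTree:
    xs = [x for x, _ in nodes]
    ys = [y for _, y in nodes]
    half = max(max(xs) - min(xs), max(ys) - min(ys)) / 2 or 1.0
    root = QuadTree((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2, half)
    for p in nodes:
        root.insert(p)
    return root


def exact_repulsion(nodes: Sequence[Point], q: Point) -> Tuple[float, float]:
    """Reference: every node repels q individually."""
    fx = fy = 0.0
    for x, y in nodes:
        dx, dy = q[0] - x, q[1] - y
        d2 = dx * dx + dy * dy
        if d2 > 0:
            fx += dx / d2
            fy += dy / d2
    return (fx, fy)
```

The tree is built once per iteration from the fixed graph nodes and then queried for every dummy node and hull point. Smaller values of `theta` open more cells and approach the individual computation, whereas larger values trade accuracy for speed.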

Hypergraph drawings such as the one in Figure 5.9f sometimes show curves that extensively intersect clusters containing an end point of the respective curve. An example is the curve in Figure 5.9f that directly connects the hyperedge node of the (light green) cluster in the bottom right corner. Visually, it can be desirable to let curves only minimally intersect clusters to connect a hyperedge node. This effect can be seen on a different curve in the same Figure 5.9f, the one that connects the hyperedge node in the (dark green) cluster on the right. However, the energy minimization algorithm is not always able to route curves on a shortest path out of the cluster, as parts of a curve can be trapped in local energy minima. To remedy this problem, the routing of curves can start using an energy model with fewer local energy minima, as we proposed in [35]. Another option is to initially route curves with no or only a weak strain attraction force, because routing a curve on a shortest path out of the cluster of one of its end points requires a large elongation of a small segment of the curve.

The stability of hypergraph layouts in terms of predictability, as mentioned in Section 2.2.3, has not been evaluated yet. As on-line visualizations of the development of software systems can be of high value for software developers, the layout techniques introduced in this work have to be investigated with respect to this criterion.

Regarding the hyperedge structures, large hyperedges might benefit from a tree- or bus-structured hyperedge visualization in conjunction with both the centralized and the fully connected structures. This combination of different hyperedge structures avoids many long-distance curves and thus increases readability. The approach is similar to the usage of Steiner tree structures in the subset standard by Bertault and Eades that was discussed in Section 2.2.4.1. As a first step, the visualization of hyperedges in the centralized and the fully connected structures had to be investigated, which was done in this work. In future work, the techniques proposed in this thesis can be extended to more sophisticated hyperedge structures to produce even better hypergraph layouts.


Bibliography

[1] ArgoUML, Open Source UML Modeling Tool. http://argouml.tigris.org/ (accessed August 15, 2008).

[2] Hi-Tech Graph Data. http://vlado.fmf.uni-lj.si/pub/networks/data/ESNA/hiTech.htm (accessed August 15, 2008).

[3] LinLogLayout Tool. http://www.informatik.tu-cottbus.de/~an/GD/ (accessed August 15, 2008).

[4] Pseudo Random Graph Data. http://www.informatik.tu-cottbus.de/~an/GD/ERLinLog/Random.html (accessed August 15, 2008).

[5] World Trade 1999 Graph Data. http://www.informatik.tu-cottbus.de/~an/GD/ERLinLog/WorldTrade.html (accessed August 15, 2008).

[6] Sanjeev Arora. Approximation Schemes for Geometric NP-Hard Problems: A Survey. In Ramesh Hariharan, Madhavan Mukund, and V. Vinay, editors, FSTTCS, volume 2245 of Lecture Notes in Computer Science, pages 16–17. Springer, 2001.

[7] M. Balzer and O. Deussen. Level-of-Detail Visualization of Clustered Graph Layouts. Asia-Pacific Symposium on Visualization (APVIS), 0:133–140, 2007.

[8] Michael Balzer, Andreas Noack, Oliver Deussen, and Claus Lewerentz. Software Landscapes: Visualizing the Structure of Large Software Systems. In Oliver Deussen, Charles D. Hansen, Daniel A. Keim, and Dietmar Saupe, editors, VisSym, pages 261–266. Eurographics Association, 2004.

[9] C. Bradford Barber, David P. Dobkin, and Hannu Huhdanpaa. The Quickhull Algorithm for Convex Hulls. ACM Transactions on Mathematical Software, 22(4):469–483, 1996.

[10] Josh E. Barnes and Piet Hut. A Hierarchical O(N log N) Force-Calculation Algorithm. Nature, 324(6270):446–449, 1986.

[11] Giuseppe Di Battista, editor. Graph Drawing, 5th International Symposium, GD ’97, Rome, Italy, September 18-20, 1997, Proceedings, volume 1353 of Lecture Notes in Computer Science. Springer-Verlag, 1997.

[12] Giuseppe Di Battista, Peter Eades, Roberto Tamassia, and Ioannis G. Tollis. Algorithms for Drawing Graphs: An Annotated Bibliography. Technical Report 5, Amsterdam, The Netherlands, 1994.


[13] Roman Petrovych Bazylevych. Principles of Algorithmic Methods of Flexible Connection Routing. IFAC-Workshop on Computer Control in Discrete Manufacturing, Prague, 1977.

[14] François Bertault and Peter Eades. Drawing Hypergraphs in the Subset Standard (Short Demo Paper). In Joe Marks, editor, Graph Drawing, volume 1984 of Lecture Notes in Computer Science, pages 164–169. Springer, 2000.

[15] Dirk Beyer. Co-Change Visualization. In ICSM (Industrial and Tool Volume), pages 89–92, 2005.

[16] Franz-Josef Brandenburg, editor. Graph Drawing, Symposium on Graph Drawing, GD ’95, volume 1027 of Lecture Notes in Computer Science. Springer, 1996.

[17] Alex Bykat. Convex Hull of a Finite Set of Points in Two Dimensions. Information Processing Letters, 7(6):296–298, 1978.

[18] Kenneth L. Calvert, Matthew B. Doar, and Ellen W. Zegura. Modeling Internet Topology. IEEE Communications Magazine, 35(6):160–163, June 1997.

[19] Bas Cornelissen, Arie van Deursen, Leon Moonen, and Andy Zaidman. Visualizing Testsuites to Aid in Software Understanding. In CSMR ’07: Proceedings of the 11th European Conference on Software Maintenance and Reengineering, pages 213–222, Washington, DC, USA, 2007. IEEE Computer Society.

[20] D. S. Johnson and H. O. Pollak. Hypergraph Planarity and the Complexity of Drawing Venn Diagrams. In Journal of Graph Theory, volume 11, pages 309–325. AT&T Bell Laboratories Murray Hill, New Jersey; Bell Communications Research Morristown, New Jersey, 1987.

[21] Ron Davidson and David Harel. Drawing Graphs Nicely Using Simulated Annealing. ACM Trans. Graph., 15(4):301–331, 1996.

[22] Reinhard Diestel. Graph Theory (Graduate Texts in Mathematics, Third Edition). Springer, August 2005.

[23] Peter Eades. A Heuristic for Graph Drawing. In Congressus Numerantium, volume 42, pages 149–160, 1984.

[24] Peter Eades, Robert F. Cohen, and Mao Lin Huang. Online Animated Graph Drawing for Web Navigation. In Battista [11], pages 330–335.

[25] Jason Eisner, Michael Kornbluh, Gordon Woodhull, Raymond Buse, Samuel Huang, Constantinos Michael, and George Shafer. Visual Navigation Through Large Directed Graphs and Hypergraphs. In Proceedings of the IEEE Symposium on Information Visualization (InfoVis’06), Poster/Demo Session, pages 116–117, Baltimore, October 2006.

[26] Thomas Eschbach, Wolfgang Günther, and Bernd Becker. Orthogonal Hypergraph Drawing for Improved Visibility. Journal of Graph Algorithms and Applications, 10(2):141–157, 2006.


[27] Leonhard Euler. Solutio Problematis ad Geometriam Situs Pertinentis (The Solution of a Problem Relating to the Geometry of Position). Commentarii academiae scientiarum Petropolitanae, 8:128–140, 1741.

[28] Gerald Farin. Curves and Surfaces for CAGD: a Practical Guide. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2002.

[29] Thomas M. J. Fruchterman and Edward M. Reingold. Graph Drawing by Force-Directed Placement. Software – Practice & Experience, 21(11):1129–1164, 1991.

[30] Mohammad Ghoniem, Jean-Daniel Fekete, and Philippe Castagliola. A Comparison of the Readability of Graphs Using Node-Link and Matrix-Based Representations. In INFOVIS, pages 17–24. IEEE Computer Society, 2004.

[31] Ivan Herman, Guy Melancon, and M. Scott Marshall. Graph Visualization and Navigation in Information Visualization: A Survey. IEEE Transactions on Visualization and Computer Graphics, 6(1):24–43, 2000.

[32] Danny Holten. Hierarchical Edge Bundles: Visualization of Adjacency Relations in Hierarchical Data. IEEE Transactions on Visualization and Computer Graphics, 12(5):741–748, 2006.

[33] Danny Holten, Bas Cornelissen, and Jarke J. van Wijk. Trace Visualization Using Hierarchical Edge Bundles and Massive Sequence Views. In Proceedings of the 4th International Workshop on Visualizing Software for Understanding and Analysis (VISSOFT), pages 47–54. IEEE, 2007.

[34] Michael Jünger and Petra Mutzel. Automatisches Layout von Diagrammen. OR News, 12:5–12, 2001.

[35] Martin Junghans. Avoidance of Local Energy Minima in Energy-Based Graph Drawing. Brandenburg University of Technology, Cottbus, July 2006. Published in German only as “Vermeidung lokaler Energieminima beim energie-basierten Graphenzeichnen.” Contact the author at martin.junghans@ieee.org for a digital copy of the German version.

[36] Paulis Kikusts and Peteris Rucevskis. Layout Algorithms of Graph-Like Diagrams for GRADE Windows Graphic Editors. In Brandenburg [16], pages 361–364.

[37] Paul A. Kirschner, John Sweller, and Richard E. Clark. Why Minimal Guidance During Instruction Does Not Work: An Analysis of the Failure of Constructivist, Discovery, Problem-Based, Experiential, and Inquiry-Based Teaching. Educational Psychologist, 41(2):75–86, 2006.

[38] Harri Klemetti, Ismo Lapinleimu, Erkki Mäkinen, and Mika Sieranta. A Programming Project: Trimming the Spring Algorithm for Drawing Hypergraphs. ACM SIGCSE Bulletin, 27(3):34–38, 1995.

[39] Corey Kosak, Joe Marks, and Stuart M. Shieber. Automating the Layout of Network Diagrams with Specified Visual Organization. IEEE Transactions on Systems, Man, and Cybernetics, 24(24):440–454, 1994.


[40] Erkki Mäkinen. How to Draw a Hypergraph. In Taylor and Francis, editors, International Journal of Computer Mathematics, volume 34, pages 177–185, 1990.

[41] George A. Miller. The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information. The Psychological Review, 63:81–97, 1956.

[42] Paul Mutton, Peter Rodgers, and Jean Flower. Drawing Graphs in Euler Diagrams. In Alan F. Blackwell, Kim Marriott, and Atsushi Shimojima, editors, Diagrams, volume 2980 of Lecture Notes in Computer Science, pages 66–81. Springer, 2004.

[43] M. E. J. Newman. Analysis of Weighted Networks. Physical Review E, 70:056131, 2004.

[44] Andreas Noack. An Energy Model for Visual Graph Clustering. In Giuseppe Liotta, editor, Graph Drawing, volume 2912 of Lecture Notes in Computer Science, pages 425–436. Springer-Verlag, 2003.

[45] Andreas Noack. Energy Models for Drawing Clustered Small-World Graphs. Technical Report 07/03, 2003.

[46] Andreas Noack. Energy Models for Graph Clustering. Journal of Graph Algorithms and Applications, 11(2):453–480, 2007. Communicated by Peter Eades and Patrick Healy.

[47] Stephen C. North. Incremental Layout in DynaDAG. In Brandenburg [16], pages 409–418.

[48] Doantam Phan, Ling Xiao, Ron Yeh, Pat Hanrahan, and Terry Winograd. Flow Map Layout. In INFOVIS ’05: Proceedings of the 2005 IEEE Symposium on Information Visualization, page 29, Washington, DC, USA, 2005. IEEE Computer Society.

[49] Franco P. Preparata and Michael I. Shamos. Computational Geometry: An Introduction (Monographs in Computer Science). Springer, August 1985.

[50] Helen C. Purchase. Which Aesthetic has the Greatest Effect on Human Understanding? In Battista [11], pages 248–261.

[51] Helen C. Purchase, Robert F. Cohen, and Murray James. Validating Graph Drawing Aesthetics. In GD ’95: Proceedings of the Symposium on Graph Drawing, pages 435–446, London, UK, 1996. Springer-Verlag.

[52] Aaron Quigley and Peter Eades. FADE: Graph Drawing, Clustering, and Visual Abstraction. In GD ’00: Proceedings of the 8th International Symposium on Graph Drawing, pages 197–210, London, UK, 2001. Springer-Verlag.

[53] Manojit Sarkar and Marc H. Brown. Graphical Fisheye Views of Graphs. In CHI ’92: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 83–91, New York, NY, USA, 1992. ACM.


[54] Wayne P. Stevens, Glenford J. Myers, and Larry L. Constantine. Structured Design. IBM Systems Journal, 13(2):115–139, 1974.

[55] Kozo Sugiyama, Shojiro Tagawa, and Mitsuhiko Toda. Methods for Visual Understanding of Hierarchical System Structures. IEEE Trans. Systems, Man and Cybernetics, 11(2):109–125, February 1981.

[56] J. Sweller. Some Cognitive Processes and Their Consequences for the Organisation and Presentation of Information. Australian Journal of Psychology, 45(1):1–8, 1993.

[57] John Sweller. Cognitive Load During Problem Solving: Effects on Learning. Cognitive Science, 12(2):257–285, 1988.

[58] John Sweller, Jeroen J. G. Van Merrienboer, and Fred G. W. C. Paas. Cognitive Architecture and Instructional Design. Educational Psychology Review, 10:251–296, 1998.

[59] Roberto Tamassia, Giuseppe Di Battista, and Carlo Batini. Automatic Graph Drawing and Readability of Diagrams. IEEE Transactions on Systems, Man and Cybernetics, 18(1):61–79, 1988.

[60] Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, November 1994.

[61] Douglas B. West. A Question on Notation in Graph Theory. http://www.math.uiuc.edu/~west/igt/notat.html, 2000. [Online; accessed June 29, 2008].

[62] Douglas B. West. Introduction to Graph Theory (Second Edition). Prentice Hall, August 2000.

[63] Rui Xu and Donald Wunsch II. Survey of Clustering Algorithms. IEEE Transactions on Neural Networks, 16(3):645–678, May 2005.

[64] S. H. Yook, H. Jeong, and A. L. Barabasi. Modeling the Internet’s Large-Scale Topology. Proc Natl Acad Sci U S A, 99(21):13382–13386, October 2002.

[65] Ellen W. Zegura, Kenneth L. Calvert, and Michael J. Donahoo. A Quantitative Comparison of Graph-Based Models for Internet Topology. IEEE/ACM Trans. Netw., 5(6):770–783, 1997.
