
POSTER COMPENDIUM

IEEE Conference on Visualization
IEEE Symposium on Information Visualization

Visualization Poster Chairs:
David Laidlaw, Kwan-Liu Ma, Han-Wei Shen

Information Visualization Poster Chairs:
Alan Keahey, Matt Ward

Information Visualization Contest Chairs:
Jean-Daniel Fekete, Catherine Plaisant

Seattle, Washington
October 19-24, 2003


VIS 2003 Posters

Table of Contents

Message from chairs .......... 1
David Laidlaw, Kwan-Liu Ma, Han-Wei Shen

The Canopy Database Project: Component-Driven Database Design and Visualization for Ecologists .......... 2
Judy Bayard Cushing, Nalini Nadkarni, Mike Ficker and Youngmi Kim

Iterative Watersheds and Fuzzy Tumor Visualization .......... 4
Matei Mancas and Bernard Gosselin

gViz – Visualization Middleware for e-Science .......... 6
Jason Wood and Ken Brodlie

Rapid 3D Insect Model Reconstruction from Minimal 2D Image Set .......... 8
Gregory Buron and Geoffrey Matthews

Visualizing the Elementary Cellular Automata Rule Space .......... 10
Rodrigo A. Obando

GLOD: A Geometric Level of Detail System at the OpenGL API Level .......... 12
Jonathan Cohen, David Luebke, Nathaniel Duca and Brenden Schubert

Subjective Usefulness of CAVE and Fish Tank VR Display Systems for a Scientific Visualization Application .......... 14
Cagatay Demiralp, David H. Laidlaw, Cullen Jackson, Daniel Keefe and Song Zhang

Visual Exploration of Measured Data in Automotive Engineering .......... 16
Andreas Disch, Michael Munchhofen, Dirk Zeckzer and Ralf Klein

Free Form Deformation for Biomedical Applications .......... 18
Shane Blackett, David Bullivant and Peter Hunter

Geo Pixel Bar Charts .......... 20
Ming C. Hao, Daniel A. Keim, Umeshwar Dayal, Joern Schneidewind, Peter Wright

Line Rendering Primitive .......... 22
Keen-Hon Wong, Xin Ouyang and Tiow-Seng Tan

Visual Exploration of Association Rules .......... 24
Li Yang

Multi Level Control of Cognitive Characters in Virtual Environments .......... 26
Peter Dannenmann, Henning Barthel and Hans Hagen

HistoScale: An Efficient Approach for Computing Pseudo-Cartograms .......... 28
Daniel A. Keim, Christian Panse, Matthias Schafer, Mike Sips and Stephen C. North

A Volume Rendering Extension for the OpenSG Scene Graph API .......... 30
Thomas Klein, Manfred Weiler and Thomas Ertl

KMVQL: a Graphical User Interface for Boolean Query Specification and Query Result Visualization .......... 32
Jiwen Huo and William B. Cowan

Visualization of 2-manifold eversions .......... 34
M. Langer

Prefetching in Visual Simulation .......... 36
Chu-Ming Ng, Cam-Thach Nguyen, Dinh-Nguyen Tran, Shin-We Yeow and Tiow-Seng Tan


Collaborative Volume Visualization Using VTK .......... 38
Anastasia Valerievna Mironova

The Challenge of Missing and Uncertain Data .......... 40
Cyntrica Eaton, Catherine Plaisant and Terence Drizd

The Open Volume Library for Processing Volumetric Data .......... 42
Sarang Lakare and Arie Kaufman

A Parallel Coordinates Interface for Exploratory Volume Visualization .......... 44
Simeon Potts, Melanie Tory and Torsten Möller

How ReV4D Helps Biologists to Study the Effects of Anticancerous Drugs on Living Cells .......... 46
Eric Bittar, Aassif Benassarou, Laurent Lucas, Emmanuel Elias, Pavel Tchelidze, Dominique Ploton and Marie-Françoise O'Donohue

Visualizing the Interaction Between Two Proteins .......... 48
Nicolas Ray, Xavier Cavin and Bernard Maigret

Photorealistic Image Based Objects from Uncalibrated Images .......... 50
Miguel Sainz, Renato Pajarola and Antonio Susin

Segmentation of Vector Field Using Green Function and Normalized Cut .......... 52
Hongyu Li

DStrips: Dynamic Triangle Strips for Real-Time Mesh Simplification and Rendering .......... 54
Michael Shafae and Renato Pajarola

Interactive Visualization of Time-Resolved Contrast-Enhanced Magnetic Resonance Angiography (CE-MRA) .......... 56
Ethan Brodsky and Walter Block

Using CavePainting to Create Scientific Visualizations .......... 58
David B. Karelitz, Daniel F. Keefe and David H. Laidlaw

3D Visualization of Ecological Networks on the WWW .......... 60
Ilmi Yoon, Rich Williams, Eli Levine, Sanghyuk Yoon, Jennifer Dunne and Neo Martinez

Streaming Media Within the Collaborative Scientific Visualization Environment Framework .......... 62
Brian James Mullen

Visualization of Geo-Physical Mass Flow Simulations .......... 64
Navneeth Subramanian, T. Kesavadas and Abani Patra

INFOVIS 2003 Posters

Message from chairs .......... 67
Alan Keahey, Matt Ward

Axes-Based Visualizations for Time Series Data .......... 68
Christian Tominski, James Abello, Heidrun Schumann

Visualising Large Hierarchically Structured Document Repositories with InfoSky .......... 70
Keith Andrews, Wolfgang Kienreich, Vedran Sabol, Michael Granitzer

An XML Toolkit for an Information Visualization Software Repository .......... 72
Jason Baumgartner, Katy Börner, Nathan J. Deckard, Nihar Sheth

Trend Analysis in Large Timeseries of High-Throughput Screening Data Using a Distortion-Oriented Lens with Semantic Zooming .......... 74
Dominique Brodbeck, Luc Girardin


Interacting with Transit-Stub Network Visualizations .......... 76
James R. Eagan, John Stasko and Ellen Zegura

MVisualizer: A Visual Tool for Analysis of Medical Data .......... 78
Nils Erichson, Göran Zachrisson

The InfoVis Toolkit .......... 80
Jean-Daniel Fekete

Overlaying Graph Links on Treemaps .......... 82
Jean-Daniel Fekete, David Wang, Niem Dang, Aleks Aris, Catherine Plaisant

Semantic Navigation in Complex Graphs .......... 84
Amy Karlson, Christine Piatko, John Gersh

Business Impact Visualization .......... 86
Ming C. Hao, Daniel A. Keim, Umeshwar Dayal, Fabio Casati, Joern Schneidewind

Visualization for Periodic Population Movement between Distinct Localities .......... 88
Alexander Haubold

PolyPlane: An Implementation of a New Layout Algorithm for Trees in Three Dimensions .......... 90
Seok-Hee Hong, Tom Murtagh

Displaying English Grammatical Structures .......... 92
Pourang Irani, Yong Shi

VistaClara: An Interactive Visualization for Microarray Data Exploration .......... 94
Robert Kincaid

Linking Scientific and Information Visualization with Interactive 3D Scatterplots .......... 96
Robert Kosara, Gerald N. Sahling, Helwig Hauser

Enlightenment: An Integrated Visualization and Analysis Tool for Drug Discovery .......... 98
Christopher E. Mueller

3D ThemeRiver .......... 100
Peter Imrich, Klaus Mueller, Dan Imre, Alla Zelenyuk, Wei Zhu

A Hardware-Accelerated Rubbersheet Focus + Context Technique for Radial Dendrograms .......... 102
Peter Imrich, Klaus Mueller, Dan Imre, Alla Zelenyuk, Wei Zhu

Visualizations in the ReMail Prototype .......... 104
Steven L. Rohall

Interactive Symbolic Visualization of Semi-automatic Theorem Proving .......... 106
Chandrajit Bajaj, Shashank Khandelwal, J. Moore, Vinay Siddavanahalli

FROTH: A Force-directed Representation of Tree Hierarchies .......... 108
Lisong Sun, Steve Smith, Thomas Preston Caudell

PaintingClass: Interactive Construction, Visualization and Exploration of Decision Trees .......... 110
Soon Tee Teoh, Kwan-Liu Ma

Evaluation of Spike Train Analysis using Visualization .......... 112
Martin A. Walter, Liz J. Stuart, Roman Borisyuk

Tree3D: A System for Temporal and Comparative Analysis of Phylogenetic Trees .......... 114
Eric A. Wernert, Donald K. Berry, John N. Huffman, Craig A. Stewart


INFOVIS 2003 Contest

Message from chairs .......... 117
Jean-Daniel Fekete, Catherine Plaisant

TreeJuxtaposer InfoVis Contest Entry .......... 118
James Slack, Tamara Munzner, François Guimbretière

Zoomology: Comparing Two Large Hierarchical Trees .......... 120
Jin Young Hong, Jonathan D'Andries, Mark Richman, Maryann Westfall

Visualization of Trees as Highly Compressed Tables with InfoZoom .......... 122
Michael Spenke and Christian Beilken

EVAT: Environment for Visualisation and Analysis of Trees .......... 124
David Auber, Maylis Delest, J. Philippe Domenger, Pascal Ferraro, Robert Strandh

Comparison of multiple taxonomic hierarchies using TaxoNote .......... 126
David R. Morse, Nozomi Ytow, David McL. Roberts, Akira Sato

Treemap, Radial Tree, and 3D Tree Visualizations .......... 128
Nihar Sheth, Katy Börner, Jason Baumgartner, Ketan Mane, Eric Wernert


VIS 2003 Posters

To foster greater interaction among attendees and to provide a forum for discussing exciting ongoing visualization research, a poster program was added to the Visualization 2002 Conference's technical program. Such a poster program offers a unique opportunity to showcase work-in-progress, student projects, or non-traditional visualization research. Following last year's success, we have put together a program consisting of 32 posters, which focus on work that has produced new or exciting ideas or findings; some are case studies in areas of science, engineering, and medicine.

This year, we have made several changes. First, the posters are listed in the hardcopy Final Program of the Conference. Second, all posters are exhibited during all three days of the main Conference. Third, in addition to an interactive session, a preview session has been added. Because of the large number of posters this year, the preview session will help attendees identify the posters of interest. The casual setting of the interactive session will allow the presenters to have one-on-one dialogue with attendees and to better control the pace and level of the presentations.

We would like to express our sincere thanks to the people who submitted abstracts and to those who assisted in our selection process.

Co-Chairs:
David Laidlaw, Brown University, USA
Kwan-Liu Ma, University of California at Davis, USA
Han-Wei Shen, Ohio State University, USA


The Canopy Database Project: Component-Driven Database Design and Visualization for Ecologists¹

Judy Bayard Cushing, Nalini Nadkarni, Mike Ficker, Youngmi Kim
The Evergreen State College, Olympia WA 98502 USA

Introduction. Solving ecology problems such as global warming, decreasing biodiversity, and depletion of natural resources will require increased data sharing and data mining². This in turn will require better data infrastructure, informatics and analysis tools than are now available. Investments are being made in needed data warehouses for ecology³, though problems are far from solved, in particular attaining adequate data documentation. Integrating database technology early in the research process would make this metadata provision easier, but barriers to database use by ecologists are numerous. The Canopy Database Project is experimenting with database components for commonly used spatial (structural) data descriptions in one ecology discipline (forest canopy research). While using domain-specific components for generating databases will make using databases easier, other productivity gains would have to be evident before researchers use such tools. We have identified easier data visualization as a possibly effective reward, and our visualization program CanopyView, developed with VTK⁴, takes as input databases designed from those components and produces visualizations specific to structural aspects of the ecology study.

Ecology Research Workflow, Database Generation and Data Visualization. We have articulated an ecology research workflow, and postulate that data visualization will be particularly useful at the data verification, analysis, publication and data mining phases. While databases would be most helpful if generated at the study design phase, using components to generate field databases at any stage could increase researcher productivity if other tools work from those components. The following figure conceptualizes how researchers might use conceptual components to design field databases. Given three real-world canopy entities (stem, branch and branch foliage), and given several spatial or structural conceptualizations that correspond to commonly measured variables for each, a researcher selects those that best match his or her research objectives. DataBank uses the selected components to generate SQL, from which a database (currently MS Access) is generated and to which additional observations can be added.

[Figure: component choices for the three canopy entities. Stem Model: upright linear; upright cylinder, DBH; upright cone, DBH; stepped upright cylinder. Branch Length Measurement: branch length perpendicular to stem; branch length along branch. Branch Foliage Model: foliage start/stop; foliage inner, mid, outer; foliage length and width.]

CanopyView is a visualization application that generates interactive scenes of ecological entities at the tree-level and plot-level using the same predefined data structures (aka database components or templates) used by DataBank to generate field databases. CanopyView uses an ecological field database (generated by DataBank and usually in MS Access) as its primary data source. The following figure shows scenes generated by CanopyView for several of our sample field data sets.



[Figure: example CanopyView scenes. Canopy airspace (blue) overlaid with stems at Martha Creek in southern Washington; surface area density map of the canopy of an eastern deciduous forest at SERC; dwarf mistletoe infection ratings in an old-growth Douglas-fir forest, Washington; polyline representations of Castanea crenata, Japan; full stem reconstructions at the Trout Creek site in southern Washington state.]

To the best of our knowledge, CanopyView is unique in that it produces visualizations directly of field data. Other visualization aids we have seen are either map-based or are essentially visual representations of statistical analyses⁵. While those are essential, sometimes the scale of an ecological study, such as for within-tree structure, does not lend itself to a map-based first-cut visualization. Furthermore, our researchers have found that visualization of raw data contributes to their understanding of the data for data validation and discovery. CanopyView is implemented using the Visualization Toolkit (VTK) and Java. The following figure shows the underlying software architecture for DataBank and CanopyView.

[Figure: DataBank and CanopyView software architecture. An Internet browser (IE 5+, Netscape 6+) talks to a web server (Apache) running Enhydra middleware, which fronts the DataBank backend (Java); data lives in SQL Server and MS Access field databases, and visualization is handled by the Visualization Toolkit (VTK).]
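The abstract contains no code, but for readers unfamiliar with VTK, a minimal pipeline of the kind such a tool is assembled from might look like the sketch below. This is illustrative only: CanopyView itself uses VTK's Java bindings, and the stem dimensions here are invented.

```cpp
// Illustrative sketch, not CanopyView source: render one "upright cylinder"
// stem with a standard VTK pipeline (source -> mapper -> actor -> renderer).
#include <vtkCylinderSource.h>
#include <vtkPolyDataMapper.h>
#include <vtkActor.h>
#include <vtkRenderer.h>
#include <vtkRenderWindow.h>
#include <vtkRenderWindowInteractor.h>

int main() {
  vtkCylinderSource* stem = vtkCylinderSource::New();
  stem->SetRadius(0.15);  // invented value; in practice derived from DBH in the field DB
  stem->SetHeight(12.0);  // invented value; measured stem height

  vtkPolyDataMapper* mapper = vtkPolyDataMapper::New();
  mapper->SetInputConnection(stem->GetOutputPort());

  vtkActor* actor = vtkActor::New();
  actor->SetMapper(mapper);

  vtkRenderer* renderer = vtkRenderer::New();
  renderer->AddActor(actor);

  vtkRenderWindow* window = vtkRenderWindow::New();
  window->AddRenderer(renderer);

  vtkRenderWindowInteractor* interactor = vtkRenderWindowInteractor::New();
  interactor->SetRenderWindow(window);

  window->Render();
  interactor->Start();  // interactive scene, as CanopyView presents to the ecologist
  return 0;             // (cleanup of VTK objects omitted for brevity)
}
```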

Findings. We conclude that using components for field database design is feasible. Furthermore, databases thus developed can be used with a companion visualization application to generate scenes easily by end users. However, conceptualization of the components requires time and collaboration between ecologists and computer scientists; we are considering cost-benefit tradeoffs. VTK was a significant productivity aid in developing the visualization application.

Acknowledgements. We thank ecologists B. Bond, R. Dial, G. Parker, D. Shaw, S. Sillett, and A. Sumida; Long Term Ecological Research information managers J. Brunt, D. Henshaw, N. Kaplan, E. Menendez, K. Ramsey, S. Stafford, K. Vanderbilt, J. Walsh; and computer scientists D. Maier and L. Delcambre for valuable field data and ideas. Former project staff Erik Ordway, Steve Rentmeester and Bram Svoboda, and Starling Consulting, made significant contributions.

A demonstration of CanopyView will take place at the Visualization 2003 Conference.

¹ This work is funded by the National Science Foundation, BIR 93-07771, 96-30316, 99-75510.

² See reports of two NSF, USGS and NASA workshops to establish a computer science research agenda for biodiversity and ecosystem informatics, http://www.evergreen.edu/bdei. The ecoinformatics web site also provides good references: http://ecoinformatics.org/.

³ See the National Science Foundation Long Term Ecological Research repositories, http://lternet.edu/.

⁴ W. Schroeder, K. Martin, B. Lorensen, The Visualization Toolkit, Prentice Hall, 1998. See also http://www.kitware.com.

⁵ J.J. Helly, Visualization of Ecological and Environmental Data, in W.K. Michener, J.H. Porter and S.G. Stafford, eds., Data and Information Management in the Ecological Sciences, LTER Network Office, University of New Mexico, Albuquerque, New Mexico, 1998, pp. 89-94.


Iterative Watersheds and Fuzzy Tumor Visualization

Matei Mancas and Bernard Gosselin

[The body text and figures of this abstract are illegible in the source scan; only the results table below is recoverable.]

                                Tumor 1   Tumor 2   Tumor 3   Tumor 4   Total Average
Average difference at level 1    6.7 %     2.4 %     3.3 %     7.0 %       4.85 %
Variance at level 1              2.6       0.4       1.1       1.3         1.35
Average difference at level 2    7.4 %     4.0 %     4.0 %     6.8 %       5.55 %
Variance at level 2              2.0       1.8       2.0       1.0         1.7


gViz – Visualization Middleware for e-Science

Jason Wood and Ken Brodlie
School of Computing, University of Leeds

Jeremy Walton
NAG Ltd

"E-science is about global collaboration in science and the next generation of infrastructure that will enable it." – John Taylor, UK Research Councils

Visualization is a key component of e-Science, allowing insight to be gained into the large datasets generated either by simulation – such as in computational fluid dynamics – or by measurement – such as in medical imaging. The gViz project is a major part of the UK e-Science research programme, aiming to provide today's e-Scientist with visualization software that works within modern Grid environments.

Grid-enabling Current Visualization Systems

A major part of gViz is the Grid-enabling of existing visualization systems, so that scientists can migrate their work seamlessly to Grid computing environments – without changing their mode of working. In particular, we have extended a widely used visualization system, IRIS Explorer from NAG Ltd. This is a Modular Visualization Environment, in which a user builds an application by connecting modules in a dataflow network. Our extension allows this network to span a set of Grid resources, so that user interface modules execute on the scientist's desktop, but computationally intensive modules are launched securely on remote servers using Globus middleware. Moreover, a number of scientists at different locations can join in a collaborative visualization session. An independent server process (the COVISA server) manages the collaborative session.

[Figure: Grid-enabled IRIS Explorer. Modules in the dataflow pipeline execute on different Grid resources.]

[Figure: Collaborative IRIS Explorer. Geographically separated research teams collaborate across the network.]


Grid-enabled Computational Steering

A special focus of the gViz project is computational steering. This proves to be an extremely useful way of working for the very large simulations that are now possible in Grid-based applications. Visualization runs in tandem with simulation, and the scientist can amend the controlling parameters of the simulation as it executes. The gViz Computational Steering Library allows scientists to link their simulation code with a visualization system of choice. The library can operate in a Web Services context: simulation details are registered with the Web Service, and at any later time user interface components retrieve this information, so that the visualization system can connect to the simulation and steer its progress.
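To make the steering model concrete, here is a minimal sketch of a simulation loop that polls for steered parameters each timestep and publishes its state to the visualization. All names are hypothetical stand-ins invented for illustration; they are not the actual gViz library API.

```cpp
#include <cstdio>

// Hypothetical stand-ins for the steering library; real gViz calls differ.
struct SteerParams { double windDirDeg; bool stop; };

static int tick = 0;
SteerParams pollSteering() {            // pretend the e-scientist changes the wind, then stops
  return { tick < 2 ? 45.0 : 90.0, ++tick > 4 };
}
void publishState(double plumeExtent) { // a visualization front end would consume this
  std::printf("plume extent: %.2f\n", plumeExtent);
}

int main() {
  double plume = 0.0;
  for (SteerParams p = pollSteering(); !p.stop; p = pollSteering()) {
    plume += 0.01 * p.windDirDeg;  // toy "dispersion" step under the current wind
    publishState(plume);           // observed immediately, enabling what-if steering
  }
  return 0;
}
```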

We are demonstrating this through an environmental application, where we simulate the dispersion of a toxic chemical under the action of the wind. The simulation runs on a remote Grid compute resource, but the scientist connects to the simulation at any time to monitor progress, or perform 'what-if' scenarios, such as a change of wind direction.

[Figure: Computational steering. IRIS Explorer is used as a front-end visualization system, connected to the simulation through the gViz computational steering library. The wind direction is steered by the e-scientist and the resulting effect on the pollution plume is immediately observed.]

Other aspects of gViz include the study of the use of XML languages for visualization; the Grid-enabling of pV3; and the development of novel geometry compression – important for any distributed application.

Partners in the project are: the Universities of Leeds, Oxford and Oxford Brookes; CLRC Rutherford Appleton Laboratory; NAG Ltd; IBM UK; and Streamline Computing.

Further information at: http://www.visualization.leeds.ac.uk/gViz


Rapid 3D Insect Model Reconstruction from Minimal 2D Image Set

Gregory Buron and Geoffrey Matthews
Western Washington University

Abstract

We present a method of easily creating a three-dimensional model of an insect from a small set of two-dimensional digital images, taken in a laboratory from known angles. A reconstructed model of this type can be used for purposes of identification or education. Using these images, along with some simplifying assumptions about the standard construction of an insect, it is straightforward to rapidly create a simple but accurate virtual insect. Insect taxonomy is a dying art, and it is hoped that the creation of a virtual collection will help the development of this skill.

Insects are collected from a local watershed and preserved for identification and research purposes by the Institute of Watershed Studies. Digital photographs are taken of the insects from various angles. These digital photographs are then pre-processed for use in the modeling program. The pre-processing step creates a mask of the original photograph in which the pixels are separated into two distinct areas, "insect body" and "not insect body". The insect body portions are colored green (or another color that the algorithm recognizes as insect body), and the non-body portions are colored black. This is done for each image in the image set that is to be used in the image registration.
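As a rough sketch of the masking step just described (assuming pure green marks "insect body"; the thresholds and RGB buffer layout are invented for illustration, not taken from the paper):

```cpp
#include <cstdint>
#include <vector>

struct RGB { std::uint8_t r, g, b; };

// Build a binary body mask: 1 where a pixel was painted green ("insect body")
// during pre-processing, 0 elsewhere. Thresholds are illustrative only.
std::vector<std::uint8_t> bodyMask(const std::vector<RGB>& image) {
  std::vector<std::uint8_t> mask(image.size());
  for (std::size_t i = 0; i < image.size(); ++i) {
    const RGB& p = image[i];
    bool isBody = p.g > 200 && p.r < 64 && p.b < 64;  // "green enough"
    mask[i] = isBody ? 1 : 0;
  }
  return mask;
}
```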

The images and models shown in this paper and used to demonstrate the insect reconstruction process are of Calineuria californica, a species of stonefly found at a local watershed.

Figure 1a: Side view of Calineuria californica.
Figure 1b: Top view of Calineuria californica.

The photographs are also pre-processed for identification of leg segments. Each endpoint for the leg segments is assigned a unique color that the algorithm will recognize as that particular segment endpoint. The algorithm then coalesces these endpoints from each of the images to generate a location in three-dimensional space for that segment endpoint.

Figure 2a: Side view of the pre-processed image mask for Calineuria californica.
Figure 2b: Top view of the pre-processed image mask for Calineuria californica.


The pre-processed images are then used as input to a program developed to create the three-dimensional model from such images. The program uses the Visualization Toolkit© with Java© bindings to display the insect model. ImageJ© was used for the image processing in the model reconstruction process. Pre-processing of images was done with Adobe Photoshop©.

The model shown in Figure 3 comprises the surface and leg structures created using the masks shown in Figures 2a and 2b. In addition to the geometry created for the insect parts, the model can also be textured using the original (not pre-processed) images to give the model a more realistic appearance. The texturing process is straightforward for a pre-made texture: texture coordinates for the body are generated and used to map the texture to the body. Figure 4 shows a portion of the model with a texture map applied to the body of the insect.

Figure 3: Solid facet representation of an insect model of Calineuria californica created with the InsectModeler program with body and leg points registered. This model has 219 columns and 64 radial points. The leg segments are scaled spheres that are aligned to the lines created by the leg segment end points.

Figure 4: A texture map applied to the insect body from the original image of the insect.

The goal for this project is to create a simple and effective means for biologists and environmental scientists to create models of insects for identification purposes, as well as an educational tool for biology students. One goal is to create a user-friendly interface that will allow users to interactively define insect data bounds instead of relying on image pre-processing. Also, more options for leg geometry besides simple deformed spheres would go a long way toward creating a more realistic model.

Other improvements intended for this project are the creation of other body features found on insects, such as the ability to add antennae, tails, and perhaps even wing structures to the model. All of these features would be included or excluded at the user's request.

Acknowledgements

Many thanks to Robin Matthews and Joan Vandersypen of the Institute for Watershed Studies at Western Washington University for their help and input.


Interactive Poster: Visualizing the Elementary Cellular Automata Rule Space

Rodrigo A. Obando
Fairfield University (e-mail: RObando@mail.fairfield.edu)

Keywords: Cellular Automata Rule Space, 3D Visualization.

1 Introduction

Cellular automata are simple systems that can produce complex behavior and are ideal for the study of a great variety of topics such as thermodynamics [Hunter and Corsten 1991], biological systems [A Brass and Else 1994], landscape change [Itami 1988], etc. These automata may be defined for one, two, or more dimensions, as well as for k cell values and for a neighborhood of size r. The elementary cellular automata are the simplest of the spaces that produce interesting behavior. This rule space is one-dimensional with k = 2 and r = 1. Given these parameters there are 256 elementary rules. Each rule may be encoded in a byte where each individual bit indicates the action of the rule for a given combination of the input. In the case of elementary rules, the input is the cell to be updated (central cell) and the two neighboring cells, one to the left and one to the right of the central cell.

A one-dimensional cellular automaton is used to update a row of cells. Each cell is updated independently, and all cells are updated simultaneously: each step in the evolution of the automaton updates all the cells in the row. The original row of cells is the input to the automaton. A sketch of one such update step is given below.
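For concreteness, here is a minimal sketch of one synchronous update step under the byte encoding described above. The periodic boundary condition is an assumption of the sketch, not specified in the abstract.

```cpp
#include <cstdint>
#include <vector>

// One synchronous step of an elementary CA (k = 2, r = 1): bit i of the rule
// byte gives the new cell value for neighborhood (left, center, right) = i.
std::vector<int> step(const std::vector<int>& row, std::uint8_t rule) {
  const std::size_t n = row.size();
  std::vector<int> next(n);
  for (std::size_t i = 0; i < n; ++i) {
    int left  = row[(i + n - 1) % n];   // periodic boundary (assumed)
    int right = row[(i + 1) % n];
    int idx   = (left << 2) | (row[i] << 1) | right;
    next[i]   = (rule >> idx) & 1;      // e.g. rule = 30 for Rule 30
  }
  return next;
}
```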

Each one of these rules produces a different behavior given either a simple input or a disordered input. A classification of the various behaviors given a disordered input was given by Wolfram [Wolfram 1994] as classes 1, 2, 3, and 4. Class 1 relates to an automaton that evolves to a homogeneous state. Class 2 evolves simple separated periodic structures. Class 3 evolves into chaotic aperiodic patterns. Finally, class 4 generates complex patterns of localized structures.

Even though these rules are simple and deterministic, there has been no way to know the class of behavior from the rule itself until it is evolved. This is not a real problem in the elementary rule space, since there is a relatively small number of rules in it, and an extensive survey of their behavior has been done already. The problem arises when the neighborhood is expanded, such as for r = 2, where the number of rules becomes 2^32 and an exhaustive analysis becomes extremely time consuming. This problem is aggravated even more for k = 3 and r = 1, where the number of rules is 3^27.

A better understanding of the distribution of these classes is required for the exploration of these big spaces. This was also set forth by Wolfram [Wolfram 1994] in his open question: How is different behavior distributed in the space of cellular automaton rules? This requires the definition of rule properties that would allow a partition of the rule space in a way that highlights the distribution of these behaviors.

Figure 1: 3D Rendering of Rule 30.

2 Visualization of a cellular automaton rule

The encoding of the cellular automaton rules in digital form does not lend itself to an appreciation of their dynamics. The following is a proposed new visualization of the elementary cellular automaton rules. Each bit in the rule is represented in 3D space by a triangle with vertices v0, v1, v2:

v0 = a_-1 a_0 a_1 → {a_1 · 1.0, a_0 · 1.0, a_-1 · 1.0}
v1 = b_(a_-1 a_0 a_1) → {0.5, b_(a_-1 a_0 a_1) · 0.5 + 0.25, 0.5}
v2 = a_0 → {0.5, a_0 · 1.0, 0.5}

where a_0 is the cell to be updated, a_-1 is the cell to the left, a_1 is the cell to the right, and b_(a_-1 a_0 a_1) is the rule bit indexed by that neighborhood. The binary code for Rule 30 is 00011110, and its 3D rendering is shown in Figure 1. This 3D representation allows the definition of the following properties.

3 Properties of the Cellular Automaton Rules

The following properties are defined for k = 2 and r = 1, but can also be similarly defined for other rule spaces. Assume a rule is encoded in binary as:

Rule = b7 b6 b5 b4 b3 b2 b1 b0

p0 = b5 b4 b1 b0 (Primitive 0)
p1 = b7 b6 b3 b2 (Primitive 1)
c0 = Count(bx = 0) for bx ∈ p0 (Crossings from 0)
c1 = Count(bx = 1) for bx ∈ p1 (Crossings from 1)
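Transcribed literally into code (the bit-packing order within p0 and p1 is an implementation choice of this sketch, not specified above):

```cpp
#include <cstdint>

struct RuleProps { unsigned p0, p1, c0, c1; };

inline unsigned bit(std::uint8_t rule, int i) { return (rule >> i) & 1u; }

// Compute the primitives and crossing counts exactly as defined above.
RuleProps props(std::uint8_t rule) {
  RuleProps rp;
  rp.p0 = (bit(rule, 5) << 3) | (bit(rule, 4) << 2) | (bit(rule, 1) << 1) | bit(rule, 0);
  rp.p1 = (bit(rule, 7) << 3) | (bit(rule, 6) << 2) | (bit(rule, 3) << 1) | bit(rule, 2);
  rp.c0 = rp.c1 = 0;
  for (int i = 0; i < 4; ++i) {
    rp.c0 += 1u - ((rp.p0 >> i) & 1u);  // c0: bits of p0 equal to 0
    rp.c1 += (rp.p1 >> i) & 1u;         // c1: bits of p1 equal to 1
  }
  return rp;
}
```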


Figure 2: Cycles induced by the Twist operator.

Figure 3: Example of the Twist operator.

p0 represents the four bits rendered on the bottom of the 3D visualization; p1 represents the four bits on the top. c0 represents the triangles that cross from the bottom to the top, and c1 represents the triangles that cross from the top to the bottom.

The 3D representation can be subjected to geometric transformations to obtain new rules based on an initial rule. A useful transformation is presented next.

4 Twist Operator

The twist operator acts on the primitives p0 and p1 but preserves c0 and c1. It rotates the 3D rule representation about the Y-axis. The twisting of a rule generates another rule; this may be a different rule or it may be the same rule. There are three types of cycles that are generated by multiple applications of this operator; they have degrees 1, 2, and 4, as shown in Figure 2. An example showing the twisting of Rule 110 is shown in Figure 3. The application of this operator generates a partition of the rule space.

5 Rule Space Partition

We partition the rule space in 3D space using the primitives p0 and p1, along with c0 and c1 and the cycles generated by the twist operator. All possible primitives p0 are aligned along the x-axis, ordered by the values of their corresponding c0. Primitives p1 are aligned along the y-axis, ordered by c1. The degree of the cycle where the given rule resides determines the height, or position along the z-axis. A planar view of this visualization is shown in Figure 4.

Figure 4: Rule Space Partition.

When the rule space is partitioned in this way, it is immediately evident that clusters are formed that contain rules of the same class. This is the first time that such a display has been reported. The structure of the rule space that is revealed allows for a deeper study of how behavior is distributed, and perhaps the discovery of the properties that cause it.

References

A BRASS, R. K. G., AND ELSE, K. J. 1994. A cellular automata model for helper T cell subset polarization in chronic and acute infection. Journal of Theoretical Biology 166, 2, 189.

HUNTER, AND CORSTEN, M. J. 1991. Determinism and thermodynamics: Ising cellular automata. Physical Review A 43, 6, 3190.

ITAMI, R. 1988. Cellular worlds: models for dynamic conceptions of landscape. Landscape Architecture (July), 52–57.

WOLFRAM, S. 1994. Cellular Automata and Complexity: Collected Papers. Addison-Wesley.


GLOD: A Geometric Level of Detail System at the OpenGL API Level

Jonathan Cohen*, David Luebke+, Nathaniel Duca*, Brenden Schubert+
*Johns Hopkins University  +University of Virginia

1 INTRODUCTION

Level of detail (LOD) techniques are widely used today among interactive 3D graphics applications, such as CAD design, scientific visualization, virtual environments, and gaming, allowing applications to trade off visual fidelity for interactive performance. Many excellent algorithms exist for LOD generation as well as for LOD management [Luebke 2003]. However, no widely accepted programming model has emerged as a standard for incorporating LOD into programs.

Existing tools generally fall into two categories: mesh simplifiers and scene graph toolkits. Mesh simplifiers address the LOD generation problem, taking a complex object and producing simpler LODs, but they do not attempt to address LOD management at all. Scene graphs such as OpenGL Performer [Rohlf 1994] perform LOD management, but go to the opposite extreme; they provide heavyweight "all or nothing" solutions that lump LOD in with myriad other aspects of an interactive computer graphics system, constraining the form of the overall application.

In this poster we present GLOD, a tool for geometric level of detail that provides a full LOD pipeline in a lightweight and flexible application programmer's interface (API). This API is a powerful, extendible, yet easy-to-use LOD system, supporting discrete, continuous, and view-dependent LOD, multiple simplification algorithms, and multiple adaptation modes. GLOD is not a scene graph system; instead, it is an API integrated with OpenGL, an existing and popular low-level rendering API. With this formulation, we start to think of geometric level of detail as a fundamental component of the graphics pipeline, much like mipmapping is a fundamental component for controlling detail of texture images. The system itself should be an excellent tool for interactive visualization applications written using OpenGL.

2 GLOD API

Our design goals for the GLOD API (see Figure 3) focus on providing a lightweight model for the creation, management, and rendering of geometry. To maximize its appeal to multiple audiences, GLOD should be fast, extensible to different LOD algorithms, and easy to integrate into existing applications. Furthermore, it should allow incremental adoption rather than locking developers into all pieces of the GLOD framework. To accomplish these goals, the GLOD API is tightly integrated with the industry-standard OpenGL API, so our design decisions are guided as if GLOD were a component of OpenGL.

[Figure 1: The GLOD object and dataflow model.]

The data handled by GLOD is organized into three principal units: patches, objects, and groups. A patch is the principal unit of rendering. A patch is specified to GLOD using the OpenGL vertex array interface. Drawing a patch is much like drawing a vertex array, the chief difference being that what you get is an LOD of the original arrays. The application may change rendering state, such as bound textures, on a per-patch basis at the time of rendering; GLOD does not interfere with rendering state.

An object is the principal unit of LOD generation. The application designates one or more patches as an object before initiating the LOD generation process. Thus multiple patches may be simplified together into crack-free levels of detail. GLOD also supports memory-efficient instancing of objects to provide efficient LOD management for applications which render objects in multiple locations.

A group is the principal unit of LOD management. An application places one or more objects into a group. At each frame, GLOD adapts the LOD of all patches of all objects in each group according to the specified adaptation mode and current OpenGL viewing matrices.

The GLOD pipeline is designed to allow flexible motion of data into and out of it as desired by the application, as illustrated in Figure 1. The original geometry is specified as patches using the vertex array mechanism. The application can then set a number of per-patch and per-object LOD generation parameters to determine how the LOD hierarchy is constructed. For example, parameters may be used to select a simplification operator, error metric, hierarchy type (e.g. discrete, continuous, view-dependent), importance values, etc. A special hierarchy type allows the programmer to manually build discrete hierarchies from a set of existing LODs. An entire hierarchy may be read back by the application to save it to disk, allowing it to be re-used in a later execution without regenerating it. Group parameters specify management modes such as the error mode (object-space or screen-space), adaptation mode (error threshold or triangle budget), morphing parameters, etc. After adapting a group, the individual adapted patches may be read back, again through the vertex array mechanism. The application can store these vertex arrays, pass them to OpenGL for rendering, etc. This complete set of data paths allows applications to incrementally adopt GLOD.
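Based on the calls listed in Figure 3, a typical adoption path might look like the sketch below. The hierarchy-format constant, exact signatures, and surrounding OpenGL setup are assumptions of this sketch, not taken from the GLOD headers.

```cpp
// Hedged sketch of GLOD usage per Figure 3; constants and signatures assumed.
const GLuint GROUP = 1, OBJ = 1, PATCH = 0;

void buildOnce(GLsizei numIndices, const GLuint* indices) {
  glodNewGroup(GROUP);
  glodNewObject(OBJ, GROUP, GLOD_CONTINUOUS);     // assumed format constant
  // Geometry arrives through the already-bound OpenGL vertex arrays:
  glodInsertElements(OBJ, PATCH, GL_TRIANGLES, numIndices,
                     GL_UNSIGNED_INT, (void*)indices,
                     /*level=*/0, /*error=*/0.0f); // typically 0 (see Figure 3)
  glodBuildObject(OBJ);                            // construct the LOD hierarchy
}

void drawFrame() {
  // After setting the OpenGL viewing matrices for this frame:
  glodBindAdaptXform(OBJ);   // capture matrices for adaptation (not drawing)
  glodAdaptGroup(GROUP);     // adapt all objects per the group's ADAPT_MODE
  glodDrawPatch(OBJ, PATCH); // draw the adapted patch
}
```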

3 DISCUSSION

We have currently limited the scope of GLOD to filtering geometric detail without interfering with rendering state. This has several benefits. The application may safely employ complex rendering algorithms, including multi-pass algorithms, as well as custom vertex and fragment programs. For example, applications can use normal-mapped LODs without difficulty in GLOD. Many user-defined vertex program parameters can pass through GLOD filtering. However, this is not applicable for all vertex programs. Also, our non-interference policy makes some forms of LODs, such as textured impostors, difficult to support because they require us to change rendering state.

[Figure 2: Bunny rendered in GLOD using a multipass rendering algorithm, demonstrating GLOD's policy of non-interference with the underlying graphics system.]

At the time of this writing, a pre-release version of the GLOD system is available from our web site: http://www.cs.jhu.edu/~graphics/GLOD

The current implementation supports both discrete and view-dependent hierarchy formats, several simplification operators, error threshold and triangle budget adaptation modes, etc. We hope that this open-source system will provide a viable and convenient pathway for level of detail research to migrate from the research lab to full deployment. With a wide array of simplification algorithms, hierarchical data representations, and management policies in their hands, all available through the setting of a few parameters, application developers will have tremendous power to select the implementations that meet their needs.

REFERENCES

Luebke, D., M. Reddy, J. Cohen, A. Varshney, B. Watson, and R. Huebner. Level of Detail for 3D Graphics. Morgan Kaufmann, 2003.

Rohlf, J. and J. Helman. IRIS Performer: A High Performance Multiprocessing Toolkit for Real-Time 3D Graphics. Proceedings of SIGGRAPH 94, July 24-29, pp. 381-395.

Figure 3: The GLOD API

glodNewGroup(grpname); glodDeleteGroup(grpname);
    Create a group to contain and manage objects. Deleting a group deletes all its objects.

glodNewObject(objname, grpname, format);
    Create an object for a particular hierarchy format and place it in the named group.

glodInsertArrays(objname, patchname, mode, first, count, level, error);
glodInsertElements(objname, patchname, mode, count, type, indices, level, error);
    Put a patch into an object using vertex arrays. Level and error can be used to load an LOD generated elsewhere into a discrete hierarchy, but are typically set to 0.

glodBuildObject(objname);
    Complete an object and convert it to a hierarchy in the selected output format.

glodInstanceObject(objname, instname, grpname);
    Instantiate an existing object by sharing its geometry hierarchy data, and place it into a group.

glodDeleteObject(objname);
    Delete an object (which removes it from its group).

glodBindAdaptXform(objname);
    Capture an object's viewing parameters for adapting (not drawing – GLOD does not change the OpenGL transformation state).

glodAdaptGroup(grpname);
    Adapt LOD for all the objects in a group according to the group's ADAPT_MODE.

glodDrawPatch(objname, patchname);
    Draw one patch of an object.

glodFillArrays(objname, patchname, first);
glodFillElements(objname, patchname, type, elements);
    Read back the current adapted object into vertex arrays.

glodGetObject(objname, data);
glodLoadObject(objname, data);
    Read back an object's hierarchy so it may be saved and later reloaded to GLOD.


Subjective Usefulness of CAVE and Fish Tank VR Display Systems for a Scientific Visualization Application

Çağatay Demiralp, David H. Laidlaw, Cullen Jackson, Daniel Keefe, Song Zhang
{cad, dhl, cj, dfk, sz}@cs.brown.edu
Computer Science Department, Brown University, Providence, RI

1 Introduction

The scientific visualization community increasingly uses VR display systems, but useful interaction paradigms for these systems are still an active research subject. It can be helpful to know the relative merits of different VR systems for different applications and tasks. In this paper, we report on the subjective usefulness of two virtual reality (VR) display systems, a CAVE and a Fish Tank VR display, for a scientific visualization application (see Figure 1). We conducted an anecdotal study to learn five domain-expert users' impressions about the relative usefulness of the two VR systems for their purposes in using the application. Most of the users preferred the Fish Tank display because of perceived display resolution, crispness, brightness, and more comfortable use. However, they found the larger scale of objects, expanded field of view, and suitability for gestural expressions and natural interaction in the CAVE more useful.

The term "Fish Tank VR" describes desktop systems that display a stereo image of a 3D scene, viewed on a monitor using a perspective projection coupled to the head position of the observer [Ware et al. 1993]. A CAVE is a room-size, immersive VR display environment where the stereoscopic view of the virtual world is generated according to the user's head position and orientation [Cruz-Neira et al. 1993]. A sketch of the head-coupled projection common to both is given below.
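This sketch is ours, not the paper's: each frame, the view frustum is recomputed from the tracked head position relative to the physical screen, here modeled as the z = 0 plane, with all quantities in the same physical units.

```cpp
#include <GL/gl.h>

// Head-coupled, off-axis projection: screen extents (left, right, bottom, top)
// lie in the plane z = 0; (hx, hy, hz) is the tracked head position, hz > 0
// being its distance from the screen.
void headCoupledProjection(double left, double right, double bottom, double top,
                           double hx, double hy, double hz,
                           double zNear, double zFar) {
  double s = zNear / hz;  // scale screen-plane extents back to the near plane
  glMatrixMode(GL_PROJECTION);
  glLoadIdentity();
  glFrustum((left - hx) * s, (right - hx) * s,
            (bottom - hy) * s, (top - hy) * s, zNear, zFar);
  glTranslated(-hx, -hy, -hz);  // place the eye at the tracked head position
}
```

For stereo, the same computation is done once per eye, with the head position offset by half the interocular distance.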

Some related work compares Fish Tank VR displays with head-mounted stereo displays (HMDs) and conventional desktop displays. In [Ware et al. 1993; Arthur et al. 1993], the authors compare Fish Tank VR with an HMD and conventional desktop systems. [Pausch et al. 1997] showed that HMDs can improve performance, compared to conventional desktop systems, in a generic search task when the target is not present. However, a later study showed that these findings do not apply to desktop VR; Fish Tank VR and desktop VR have a significant advantage over HMD VR in performing a generic search task [Robertson et al. 1997]. [Bowman et al. 2001] compared an HMD with Tabletop (workbench) and CAVE systems for search and rotation tasks, respectively. They found that HMD users performed significantly better than CAVE users for a natural rotation task. For a difficult search task, they also showed that subjects perform differently depending on which display they encountered first.

Bowman and his colleagues' work shares similar motivations to ours. We go beyond their work with a direct comparison of CAVE and Fish Tank VR platforms. Also, most previous studies have evaluated VR systems by looking at user performance for a few generic tasks, such as rotation and visual search, on experiment-specific, simple applications. For most real visualization applications it may be difficult to reduce the interactions to a set of simple, generic tasks. Consequently, it is not clear how well the results of these studies apply to real visualization applications. This point is elucidated in a recent study that presented the importance of application-specific user studies using tasks that reflect end users' needs [Swan II et al. 2003]. In this study, the authors compare user performance for an application-specific task across desktop, CAVE, workbench and display wall platforms. They found that the users performed tasks fastest using the desktop and slowest using the workbench. They have a good discussion of the tradeoff between application-specific and generic user studies, stressing the value of application-context based user studies using high-level tasks.

[Figure 1: The visualization application running in the CAVE (left image) and on the Fish Tank VR display (right image).]

We chose to perform an anecdotal study for two specific reasons. First, we believe application-oriented user studies using the domain-expert user's scientific hypothesis-testing process as a task to be evaluated can be complementary to user studies that utilize generic tasks and experiment-specific applications. Second, we wanted to gain insights for designing future quantitative studies to compare user performance in CAVEs and on Fish Tank VRs.

2 Methods

Diffusion tensor magnetic resonance imaging (DT-MRI) is a new imaging modality with the potential to measure fiber-tract trajectories in fibrous soft tissues such as nerves and muscles. Our application visualizes DT-MRI brain data as 3D streamtube and streamsurface geometries in conjunction with 2D T2-weighted MRI sections. It is based on the work of Zhang et al. [Zhang et al. 2001]. We have the application running both in a CAVE and on a Fish Tank display. Five domain-expert users were asked to use it both in the CAVE and on the Fish Tank display. Our expert user pool consisted of one neuroradiologist; one neurosurgeon; one computer science graduate student with an undergraduate degree in neuroscience; one biologist; and one doctor, also a medical school instructor, with an undergraduate degree in computer science. Four of the users were male and one was female. Two of the users started with the Fish Tank version of the application and the rest with the CAVE version.

Each user had their own task (or scientific hypothesis to be tested), which they described to us. They were asked to compare the platforms with respect to their purposes. They did so by talking to us while using the application. Most often we offered counterarguments, which helped to expose the reasoning behind the users' observations. The users were then asked to give an overall preference for one of the two VR systems.

3 Results

Overall, one user preferred the CAVE and four preferred the Fish Tank VR display. We summarize the users' comments on the relative advantages of the CAVE and Fish Tank VR systems below.

Comments on advantages of the CAVE:

- Has bigger models; one can see more
- Has a larger field of view
- More suitable for gestural expression and natural interaction
- Possible to walk around

On the Fish Tank VR display:

- Has sharper and crisper images
- Conveys more information; relationships between the structures are easier to see
- Feels more comfortable and non-claustrophobic, and sitting is better than standing
- Works better for collaboration, especially with two people
- Pointing to objects on the screen is easier
- More time-efficient to use; doctors prefer to work-and-go
- Would work better for telemedicine-like collaboration
- More intuitive for surgery planning, because doctors are used to working with real or smaller brain sizes

Our first user was a neurosurgeon; he had used the application before. He uses DT-MRI data to study obsessive-compulsive disorder (OCD) patients and was particularly interested in studying changes that occur after radiation surgery, which ablates an important white matter region. He wanted to see the relation between the neuro-fiber connectivity and linear diffusion (streamtubes) in the brain. He strongly preferred using the Fish Tank VR and did not find any relative advantages in the CAVE.

Our second user was a biologist who was also trying to see correlations between white matter structure and linear diffusion in the brain. His interests were not confined to a specific anatomical region. He was the only user who preferred the CAVE over the Fish Tank display.

Our third user was a doctor and a medical school instructor with an undergraduate degree in computer science. She evaluated the application from teaching and learning perspectives.

Our fourth user was a computer science graduate student with an undergraduate degree in neuroscience. He looked at the application to see correlations between white matter structures and linear diffusion in the brain, similar to our second user. He said that he preferred the Fish Tank VR because the 2D sections have higher resolution and the models look crisper on the screen, which helped him see the correlations easily.

Our last user was a neuroradiologist working on MS (multiple sclerosis) disease. He wanted to see the 3D course of neurofibers along the corpus callosum. He was able to see what he was looking for on both platforms.

All users also found the 2D sections to be very helpful on both platforms. They said they were familiar with looking at 2D sections, which help them to correlate and orient the 3D geometries representing diffusion with the brain anatomy.

4 Discussion

The higher perceived display resolution, crispness, brightness, and more comfortable use were considered useful on the Fish Tank VR. On the other hand, users found the larger scale of objects, expanded field of view, and potential use of gestural and natural interaction useful in the CAVE. We believe that each of these factors is worth investigating in order to quantify their effects on user performance. Some of these factors have already been studied quantitatively: for example, Kasik et al. recently showed the positive effect of a crisp display on user performance [Kasik et al. 2002].

We still believe that application-oriented user studies, which use the domain-expert user's hypothesis-testing process as the task to be evaluated, can be complementary to user studies that evaluate generic task performance on experiment-specific, simple applications. However, this approach is difficult to implement: first, one needs many application-oriented studies to find meaningful patterns and generalize them; second, finding enough expert users with similar hypotheses can be very difficult.

In light of the experience we gained through this study, we hypothesize that Fish Tank VR displays are preferable over CAVEs for exocentric tasks, as they physically separate the user's reference frame from the application's. As an initial attempt to test this hypothesis, we will conduct a formal quantitative user study comparing user performance between the CAVE and Fish Tank VR on an exocentric search task in a simple, experiment-specific application. However, we will also place greater emphasis on the task's relevance to real visualization applications.

5 Summary

We presented results from an anecdotal user study with five domain-expert users. They used a scientific visualization application both in a CAVE and on a Fish Tank VR platform. While the higher perceived display resolution, crispness, brightness, and more comfortable use were considered useful on the Fish Tank VR, users found the larger scale of objects, expanded field of view, and potential use of gestural and natural interaction useful in the CAVE. Overall, one user preferred the CAVE and four users preferred the Fish Tank VR.

References

ARTHUR, K. W., BOOTH, K. S., AND WARE, C. 1993. Evaluating 3D task performance for fish tank virtual worlds. ACM Trans. Inf. Syst. 11, 239–265.

BOWMAN, D. A., DATEY, A., FAROOQ, U., RYU, Y. S., AND VASNAIK, O. 2001. Empirical comparisons of virtual environment displays. Tech. rep. TR-01-19, Virginia Tech Dept. of Computer Science.

CRUZ-NEIRA, C., SANDIN, D. J., AND DEFANTI, T. A. 1993. Surround-screen projection-based virtual reality: the design and implementation of the CAVE. In Proceedings of the 20th annual conference on Computer graphics and interactive techniques, ACM Press, 135–142.

KASIK, D. J., TROY, J. J., AMOROSI, S. R., MURRAY, M. O., AND SWAMY, S. N. 2002. Evaluating graphics displays for complex 3D models. IEEE Comput. Graph. Appl. 22, 56–64.

PAUSCH, R., PROFFITT, D., AND WILLIAMS, G. 1997. Quantifying immersion in virtual reality. In Proceedings of the 24th annual conference on Computer graphics and interactive techniques, ACM Press/Addison-Wesley Publishing Co., 13–18.

ROBERTSON, G., CZERWINSKI, M., AND VAN DANTZICH, M. 1997. Immersion in desktop virtual reality. In Proceedings of the 10th annual ACM symposium on User interface software and technology, ACM Press, 11–19.

SWAN II, J. E., GABBARD, J. L., HIX, D., SCHULMAN, R. S., AND KIM, K. P. 2003. A comparative study of user performance in a map-based virtual environment. In Proceedings of IEEE Virtual Reality 2003, 259–266.

WARE, C., ARTHUR, K., AND BOOTH, K. S. 1993. Fish tank virtual reality. In Proceedings of the conference on Human factors in computing systems, Addison-Wesley Longman Publishing Co., Inc., 37–42.

ZHANG, S., DEMİRALP, Ç., KEEFE, D., DASILVA, M., LAIDLAW, D. H., GREENBERG, B. D., BASSER, P., PIERPAOLI, C., CHIOCCA, E., AND DEISBOECK, T. 2001. An immersive virtual environment for DT-MRI volume visualization applications: a case study. In Proceedings of the conference on Visualization 2001, IEEE Computer Society Press, 437–440.


Visual Exploration of Measured Data in Automotive Engineering

Andreas Disch, Michael Münchhofen, Dirk Zeckzer
ProCAEss GmbH, Landau, Germany
{A.Disch,M.Muenchhofen,D.Zeckzer}@procaess.com

Ralf Klein
IVS, DFKI GmbH, Kaiserslautern, Germany
Ralf.Klein@dfki.de

Abstract

The automotive industry demands visual support for verifying the quality of its products from the design phase to the manufacturing phase. This implies the need for tools for measurement planning, programming measuring devices, managing measurement data, and visually exploring the measurement results. To simplify and accelerate quality control in the process chain, integrating such tools in a platform-independent framework is crucial. We present eMMA (enhanced Measure Management Application), a client/server system that integrates measurement planning, data management, and simple as well as sophisticated visual exploration tools in a single framework.

1 Introduction

To ensure the quality of the fabrication process and of the manufactured products, workpieces are measured using a coordinate measuring machine. Measurement plans are based on the CAD models, which are usually stored in Product Data Management (PDM) or Product Lifecycle Management (PLM) systems. Both kinds of systems are based on a database and also store documents related to the CAD data.

The process chain of quality assurance is made up of different, partly complex steps, which are characterized by loosely coupled software and nonuniform modi operandi. We have developed eMMA to integrate these different procedures and the necessary software into a single tool. This also gives us the ability to integrate new visualization types for the generation of evaluation reports.

We have designed a modular system that can be easily extended with a wider spectrum of analysis algorithms, report styles, etc. It is already in practical use in the automotive industry but is, of course, not restricted to car production; it can be used in any mechanical engineering or production business.

2 System Overview

The main areas of our system eMMA are Measurement Plans and Report Templates, Online Evaluation, and the creation and printing of Measurement Reports. We describe these areas in the subsequent sections.

2.1 Measurement Plans and Report Templates

The whole system is centred around the MDM (Measure Data Management) database, which stores assembly hierarchies along with measurement plans, measuring data, report definitions, evaluation definitions, references to the PDM system, etc.

Figure 1 shows the MDM tree on the left side and an information panel on the right side, which displays information about the currently selected node. After selecting the menu item for editing a report template and either choosing an existing template or starting the definition of a new one, the main window looks like Figure 2.

Figure 1: The eMMA main window displaying a tree of product types, component parts, and measurement plans stored in the MDM database

The structure of the current template is displayed in the left panel, where the user can add, edit, or remove report views, or move features from one view to another. The right panel shows the main image of the currently selected view. A viewing editor allows the user to pan, rotate, and zoom the view on the geometry and to take snapshots, which are stored with the current view.

Figure 2: The definition of a report template organized in several report views (pages) with report features attached to them

In the online evaluation module we have implemented several different views on the measured data of quality features of a selected assembly, to meet the different needs of an evaluator.

From the main window (see Figure 1) the user gets to the evaluation module by first selecting a measurement plan in the MDM data tree and then choosing the Evaluation action. This switches to the evaluation module, where the user can either run an online evaluation with the default report template or first open a settings dialog to make more specific selections.

When the online evaluation is complete, we list all evaluated quality features with their nominal data and the computed error values for each measuring in a table. Errors that are outside the tolerance bounds are coloured red. We also display the main image of the currently active report view.

Figure 3: Online evaluation of a component showing the error values for each quality feature and each measuring, as well as a graphical representation of the workpiece

To aid the user in finding the measured quality features in the picture on the right side, we compute and render labels pointing to the features' locations (as in Figure 2).

One of the other possible types of online evaluation that can be started by right-clicking on the table is the Cpk online evaluation shown in Figure 4. By moving the vertical edges of the blue area horizontally, the user can deselect an interval of measurings from being used for the computation of the values listed on the left. The user may also right-click the points to select or deselect a single measuring.

Figure 4: Cpk online evaluation of a round hole: hashed-out measuring results are discarded for the Cpk computation
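For reference, Cpk is the standard two-sided process capability index, min((USL − μ)/(3σ), (μ − LSL)/(3σ)). A minimal sketch of the computation behind this view (the function and the sample data are ours, not eMMA's):

    import statistics

    def cpk(values, lsl, usl):
        # Capability index from the mean and sample standard deviation
        # of the selected measurings and the lower/upper tolerance bounds.
        mu = statistics.mean(values)
        sigma = statistics.stdev(values)
        return min((usl - mu) / (3 * sigma), (mu - lsl) / (3 * sigma))

    # Deselecting an interval of measurings, as in Figure 4:
    measurings = [10.02, 10.05, 9.98, 10.01, 10.40, 10.03]
    selected = measurings[:4] + measurings[5:]  # discard the hashed-out run
    print(cpk(selected, lsl=9.9, usl=10.1))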

Very similar to the Cpk online evaluation is the analysis tool, which also opens a frame showing a trend chart for each evaluated dimension and a table with some statistical data (see the lower trend chart window in Figure 5). We offer this tool as a convenience for users who do not need the actual Cpk computation function. When the mouse hovers over points representing measuring results, we show tool tips that reveal an identifier of the measuring process, the measured value, and the colour-coded error value.

Figure 5: A collection of several online evaluation functions

2.2 Measurement Reports

Besides the different types of online evaluation within eMMA, we also allow the user to generate PDF files with customizable layout schemes. In a first step we implemented an export of evaluation and report data to an XML file, which was then transformed by XSL stylesheets into a PDF file. These XSL stylesheets actually define the report style and can easily be exchanged to allow any kind of report.

Currently, we are working on a way to create PDF files directly from our internal data structures. This will improve the performance of generating standard reports, while the XML interface still provides an easy way to integrate user-specific plug-ins.
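As an illustration of this XML-to-PDF pipeline (a sketch only: the evaluation schema, element names, and stylesheet file are hypothetical, not eMMA's actual format), the exported XML can be transformed with a standard XSLT processor, and the resulting XSL-FO is then handed to an XSL-FO formatter for PDF output:

    from lxml import etree

    # Hypothetical exported evaluation data.
    report_xml = etree.XML(
        b'<evaluation part="door-panel">'
        b'<feature id="F1" nominal="10.0" measured="10.4" tolerance="0.2"/>'
        b'</evaluation>')

    # The XSL stylesheet defines the report style; swapping in a
    # different stylesheet changes the report layout.
    transform = etree.XSLT(etree.parse("report-style.xsl"))
    fo_document = transform(report_xml)  # XSL-FO result tree
    with open("report.fo", "wb") as f:
        f.write(etree.tostring(fo_document, pretty_print=True))
    # report.fo would then be passed to an XSL-FO formatter to produce the PDF.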

3 Conclusions

We have presented an integrated system providing visual support to meet the needs of the manufacturing industry for quality control throughout the whole product lifecycle. We have combined tools for managing measurement plans and the results of measurings with tools for the visual exploration of the measuring results.

Compared to the conventional method of using a loose collection of tools, our integrated solution eMMA is a decisive improvement in today's quality control workflow. We provide the means for a robust process chain without the risk of data inconsistencies. Besides the incorporation of any report style, a further advantage is that users need to learn only one user interface and do not need to switch between different applications. This leads to an accelerated quality control process and efficiently aids in improving product quality.


Free Form Deformation for Biomedical Applications

Shane Blackett, David Bullivant, Peter Hunter
Bioengineering Institute, The University of Auckland, New Zealand
http://www.bioeng.auckland.ac.nz

Free form deformation is a useful technique for the customisation and specification of anatomical finite element models.

Introduction

The IUPS Physiome Project is a worldwide effort to provide a computational framework for understanding human physiology. Working towards this goal, finite element models have been created for many parts of the human anatomy, and the use of free form deformation is integral to model creation, customisation and visualisation.

Free form deformation has been described in computer graphics applications for a number of years (Sederberg and Parry 1986), and direct free form deformation introduced the concept of using a least squares minimisation (Hsu et al. 1992).

The whole organ models that have been developed generally incorporate cubic Hermite finite elements, providing a C1 continuous description of geometry with a relatively small number of elements. They are used to calculate mechanics, electrical excitation and embedded vessel fluid flow. Software developed at the Bioengineering Institute (CMISS, http://www.cmiss.org) is used for computation and visualisation.

Most of the applications of free form deformation at the Bioengineering Institute employ a similar process. Identifiable common points are selected on an existing model and on the target dataset, either manually or with some image processing. The objects are aligned as solid bodies, then the existing model is embedded in a host mesh, which is usually a small number of tricubic Hermite elements. A least squares fit is performed to find the nodal positions and derivatives of the host mesh which minimise the distances between the model and target points. The target points can be weighted differently, and Sobolev smoothing can be applied to each of the degrees of freedom of the host mesh.
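A minimal sketch of this fitting step, simplified in two ways that are ours alone: a single trilinear host cell stands in for the tricubic Hermite host mesh, and the per-point weighting and Sobolev smoothing terms are omitted:

    import numpy as np

    def trilinear_weights(p):
        # Weights of a point p in [0,1]^3 with respect to the
        # eight corner nodes of a unit host cell.
        x, y, z = p
        return np.array([((1 - x) if i == 0 else x) *
                         ((1 - y) if j == 0 else y) *
                         ((1 - z) if k == 0 else z)
                         for i in (0, 1) for j in (0, 1) for k in (0, 1)])

    def fit_host_cell(model_points, target_points):
        # Least squares positions of the eight host nodes so that the
        # embedded model points land as close as possible to their targets.
        W = np.array([trilinear_weights(p) for p in model_points])   # n x 8
        X, *_ = np.linalg.lstsq(W, np.array(target_points), rcond=None)
        return X  # 8 x 3; W @ X gives the deformed model points

Embedding every vertex of the model (not just the landmark points) in the fitted host cell then carries the whole model through the deformation.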

Figure 1 (a) The initial model geometry and a host mesh which contains it. (b) Model point and target point pairs are specified. (c) A close-up illustration from (b). (d) The fitted geometry showing the deformed host mesh, the deformed model and the residual vectors where the target points were not matched exactly. (Legend: host mesh, model mesh, model points, target points.)

Heart Fibres

In cardiac tissue there is a definite fibre direction, and these fibres are coupled into sheets, giving the myocardium very anisotropic material behaviour. The fibre alignment varies throughout the heart wall. These important fibre and sheet directions were carefully measured by hand for a single heart (Nielsen et al. 1991). To enable mechanics solutions to be generated on other heart models, it is important to have a representation of this fibre field, but the effort required to acquire another set of fibre alignments has been prohibitive. By using free form deformation, the existing fibre field can be transferred from the existing hand-measured models to other ventricular models.

Model Specification

A detailed model of each bone, muscle and ligament around the knee joint has been developed from the Visible Human data set (Fernandez et al.). To facilitate patient-specific analysis of the stresses in the knee, a model that is customised to that patient's geometry is required. To create this, a single scan is obtained and free form deformation is then used to align the existing detailed model with the scan. Similarly, the Bioengineering Institute's lung models are customised by free form deformation to specific lung geometry segmented from scans.

Figure 2 (a) Generic model of the femur. (b) Cloud of scanned data from a particular patient. (c) Fitted femur in deformed host mesh. (d) Close-up of the fitted femur (red).

Facial Animation Performance

Facial animation requires models which represent the dynamics of the skin surface. By acquiring detailed motion capture of a particular performance, these dynamics can be reproduced digitally. By using free form deformation to provide a mapping between two different neutral faces, a dynamic performance can be transferred through the same mapping, allowing dynamics captured from one animation to be transferred to any number of other models.

Figure 3 (a) A generic standard model showing a smile. (b) Free form deformation was used to transfer the dynamics, including the smile, to a significantly different shaped face.

Fernandez, J. W., Mithraratne, S., Thrupp, M. H., Tawhai, M. H. & Hunter, P. J. 'Anatomically based geometric modelling of the musculo-skeletal system and other organs'. To appear in Biomechanics and Modelling in Mechanobiology.

Nielsen, P. M. F., LeGrice, I. J., Smaill, B. H. & Hunter, P. J. (1991) 'Mathematical model of geometry and fibrous structure of the heart', Am. J. Physiol. Heart Circ. Physiol. 260(29), H1365–H1378.

Sederberg, T. W. & Parry, S. R. (1986) 'Free-Form Deformation of Solid Geometric Models', ACM Computer Graphics (SIGGRAPH 86 Conference Proceedings) 20(4), 151–160.


Geo Pixel Bar Charts

Ming C. Hao (HP Research Labs, Palo Alto, CA), Daniel A. Keim (University of Constance, Germany), Umeshwar Dayal (HP Research Labs, Palo Alto, CA), Joern Schneidewind (HP Research Labs, Palo Alto, CA), Peter Wright (HP Finance, Atlanta, GA)

1 Introduction

The automation of activities in almost all areas, including business, engineering, science and government, produces an ever increasing stream of data. Even simple transactions of everyday life, like credit card payments or telephone calls, are logged by computers. Most of these transactions have a spatial location attribute, like the source and destination of a telephone call or the location of a credit card payment [Keim and Herrmann 1998]. This data is collected because it is a potential source of valuable information. For business analysts, for example, it is important to know the sales amount for a certain product or the customer behaviour for geographical regions such as the states of a country. In this poster we combine the ability of Pixel Bar Charts [Keim et al. 2002] and interactive maps for visualising multidimensional data belonging to certain geographical regions. The user can choose a geographical region on an interactive map and analyse the collected data associated with this region using the Pixel Bar Chart technique. The advantage of Geo Pixel Bar Charts is that the underlying data can be partitioned into geographical regions while, at the same time, all data items of each of the regions can be visualised without aggregation.

2 Geo Pixel Bar Chart System

The Geo Pixel Bar Charts system consists of an interactive map and Pixel Bar Charts. The user can choose geographical regions of a map by clicking the corresponding polygon on the interactive map. Once the user has selected a region on the map, a Pixel Bar Chart for the underlying data of this region is computed. The user then selects different dimensions of the categorical data for the partitioning into bars. Then the user navigates over pixels within the bars to analyze detailed information of individual data records. Figure 1 illustrates the interaction capabilities between the map and Pixel Bar Charts.

Figure 1: Screenshot of the Geo Pixel Bar Chart System

2.1 Interactive map

An interactive map is used in the Geo Pixel Bar Chart System. The user interacts with the map either by clicking on a single geographical region on the map to start Pixel Bar Charts for the underlying data of this region, or by starting Pixel Bar Charts for the global map. A feature of the map is the visualization of additional statistical or business attributes, like population density, income, or sales amount, expressed by the color of the map regions. The map is connected to the Pixel Bar Charts: if the user selects an item in the Pixel Bar Charts, the region in the map corresponding to the spatial attribute of the data item is highlighted.

2.2 Pixel Bar Charts

Pixel bar charts are derived from regular bar charts. The basic idea of a pixel bar chart is to present the data values directly instead of aggregating them into a few summary values. The approach is to represent each data item (e.g., an invoice) by a single pixel in the bar chart. The detailed information of one attribute of each data item is encoded in the pixel color and can be accessed and displayed as needed. To arrange the pixels within the bars, one or two attributes are used to separate the data into bars, and two additional attributes impose an ordering within the bars. Pixel Bar Charts thus realize a visualization in which one pixel corresponds to one data item and can therefore be used to present large amounts of detailed information.

Figure 2: Basic Idea of Space Filling Pixel Bar Charts

In the Geo Pixel Bar Chart system, Space-Filling Pixel Bar Charts are used in order to increase the number of displayable data values on the available screen space. The basic idea is to use equal-height instead of equal-width bar charts, as shown in Figure 2.
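A minimal sketch of this pixel placement (our own simplification of the layout: one partition attribute, pixels stacked column by column in y-order, and equal-height bars whose width grows with the number of records):

    import numpy as np

    def pixel_bar_chart(records, partition_key, order_key, color_key, height=200):
        # One pixel per record. All bars share the same height, so a bar
        # containing more records simply becomes wider (space-filling variant).
        bars = {}
        for r in records:                         # separate the data into bars
            bars.setdefault(r[partition_key], []).append(r)
        columns = []
        for value in sorted(bars):
            items = sorted(bars[value], key=lambda r: r[order_key])  # y-ordering
            width = -(-len(items) // height)      # ceiling division
            bar = np.full((height, width), np.nan)
            for idx, r in enumerate(items):
                bar[idx % height, idx // height] = r[color_key]  # pixel colour value
            columns.append(bar)
        return np.hstack(columns)                 # height x total-width image

    # e.g. pixel_bar_chart(invoices, 'day', 'dollar_amount', 'dollar_amount')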

3 Applications

The Geo Pixel Bar Chart technique has been applied to sales analysis and Internet usage analysis at Hewlett Packard Laboratories. These applications show the wide applicability and usefulness of Geo Pixel Bar Charts.

3.1 Sales Analysis

The rapid growth of business on the Internet has led to the availability of large volumes of data. Business research efforts have been focused on how to turn raw data into actionable knowledge. In order to find and retain customers, business analysts need to improve their sales quality based on prior information. For sales analysis, sales specialists would like to discover new patterns and relationships in the invoice data. Common questions are 'What is the sales growth rate in recent months?', 'Which product has the most sales?', and 'Where do the sales come from?'. With Geo Pixel Bar Charts it is easy to explore all sales for a geographical region and obtain additional information from the Pixel Bar Chart visualization.

Figure 3: Geo Pixel Bar Charts for California: the partition attribute is day, the ordering attribute is dollar amount

Figure 3 shows an interactive geographical map. The highlighted yellow region (California) corresponds to the region the user has clicked. The pixel bar chart represents the underlying data for this region. In this example, a data file with sales transactions for a certain company is used. The pixel bar chart shows the sales growth rate for the state of California over 37 days. Each pixel in the pixel bar chart corresponds to a customer invoice. Within the bars, the pixels are ordered by dollar amount. The color of the map regions represents the population density: blue regions have a high density and red regions have a low density. Thus an analyst might be more interested in highly populated areas than in areas with low population density.

3.2 Internet Usage Analysis

Geo Pixel Bar Charts have also been used to analyze Internet duration times at Hewlett Packard Laboratories. A system analyst can use the visualization to rapidly discover event patterns in order to manage the Internet configuration. With the interactive map, the analyst may be able to find server locations which might be the source of Internet traffic problems, or can locate geographical regions with high Internet traffic duration times. This application also demonstrates that Geo Pixel Bar Charts are interactive in two ways. The user can click on a region in the interactive map to start a Pixel Bar Chart for this region. The user can also start with a Pixel Bar Chart and explore the geographical location of each of the data items in the chart by clicking on the data item; if the data item has a spatial location attribute, this location is highlighted on the interactive map. To map the logged IP addresses to geographical locations, a geo-locator database is employed.

Figure 4: Location of IP addresses: if the user clicks on a data item, the geographical location of this item is highlighted as a circle on the map. The color of this circle corresponds to the color of the item. Most web traffic occurs between hours 9 and 17.

Figure 4 presents an Internet access log file visualized by Geo Pixel Bar Charts. The data items contained in the Pixel Bar Charts correspond to web transactions. The partitioning attribute is the hour of the day and the y-ordering attribute is the duration time. If a web request exceeds a threshold duration time (100 ms), the corresponding data item is colored red. If the analyst clicks on a data item in the Pixel Bar Chart, the corresponding geographical location is highlighted on the map. This makes it easy to find regions with high duration times or locations with a high volume of web requests.

4 Conclusion

This poster presents a new interactive visualization technique called Geo Pixel Bar Charts, which combines the advantages of interactive maps and Pixel Bar Charts. Further research will focus on the improvement of the new technique and on the use of distortion techniques, like cartograms, instead of normal maps.

References

KEIM, D. A., AND HERRMANN, A. 1998. The gridfit algorithm: An efficient and effective approach for visualizing large amounts of spatial data. In Proc. Visualization '98, Research Triangle Park, NC, 181–188, 531.

KEIM, D. A., HAO, M. C., DAYAL, U., AND HSU, M. 2002. Pixel bar charts: A visualization technique for very large multi-attribute data sets. Visualization, San Diego, 2001; extended version in IEEE Transactions on Visualization and Computer Graphics 7, 2002.


Line Rendering Primitive

Keen-Hon Wong, Xin Ouyang, Tiow-Seng Tan
School of Computing, National University of Singapore
3 Science Drive 2, Singapore 117543, Republic of Singapore
Emails: keenhon@hotmail.com, ouyangxi@comp.nus.edu.sg, tants@comp.nus.edu.sg

1. MOTIVATION

3D models are becoming ever more detailed as the number of triangles representing these models increases. As a surface gets a more detailed representation, more triangles of smaller sizes are used. When these small triangles are rendered on screen, each may cover only a few pixels. As such, the point was proposed as an alternative primitive [2].

We observe that there is a gap between the two known primitives of point and triangle in representing and rendering 3D models or surfaces; that is, the line, in particular the anti-aliased line, has yet to be studied. Our work is also motivated by the observation that while the point is suitable for surfaces with high complexity and irregularity, and the triangle for regular surfaces, the line is suitable for surfaces with regularity along one dimension (such as a cylindrical surface). Figure 2 shows an arm bone where lines, or a hybrid of lines and points, can represent it concisely, as most parts of its surface have regularity along one dimension. Without the line primitive, a surface with regularity along one dimension may need to be represented unfavourably by thin or fat triangles, many smaller triangles, or many points. Another view of this motivation is the compression that lines can provide: given an arbitrary set of points, it may be possible to come up with heuristics to construct a set of lines which represents a larger set of points while maintaining a certain error measure.

From another viewpoint, having the line primitive, one can construct a model and its different levels of detail (LODs) with a continuous spectrum of primitives. From the intra-primitive perspective, for a polyline, the error measure of its LOD as another polyline can be formulated in some straightforward way. From the inter-primitive perspective, a primitive can be replaced with another primitive depending on the position of the viewpoint: if a triangle is far away, it projects as a small triangle, which can be represented by a line and subsequently by a point. We note that the line primitive can be adaptive in that the same line is used to represent near or far surfaces; but if a set of points is used in place of a line, then to maintain surface continuity, the number of points needed when the surface is near is greater than the number needed when the surface is far away.

In this work, we formulate the line primitive as another representation and rendering alternative. It extends the anti-aliasing theory in texture mapping [1] to render anti-aliased 3D line models. Our work on rendering lines also uses the elliptical weighted average (EWA) resampling filter [3, 4, 5].

Figure 1: Opaque, transparent and textured line models rendered by our approximation method.

2. DESCRIPTION OF THE IDEA

Our work started with finding a solution for the resampling filter of lines. We attempted to find a closed form solution, since there exists a closed form expression for the integration of Gaussian points along a line, as a Gaussian line, using the error function erf(x). We, however, arrived at an expression which, in general, cannot be integrated in closed form. Instead, we have designed a good approximation to render anti-aliased opaque, transparent and textured 3D line models; see Figures 1 and 3.
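The closed form alluded to here is the classical Gaussian integral. As a sketch in our own notation: accumulating a one-dimensional Gaussian kernel of width σ along a segment from t0 to t1 gives

\[
\int_{t_0}^{t_1} e^{-\frac{(x-t)^2}{2\sigma^2}}\, dt
= \sigma\sqrt{\tfrac{\pi}{2}}\left[\operatorname{erf}\!\left(\frac{x-t_0}{\sigma\sqrt{2}}\right)
- \operatorname{erf}\!\left(\frac{x-t_1}{\sigma\sqrt{2}}\right)\right],
\]

which is consistent with the authors' remark that a closed form exists along the line itself, while the full screen-space expression after the perspective and EWA warps does not reduce to erf and therefore requires an approximation.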

A simple view of our approximation idea is to linearly interpolate between two EWA resampling filters computed at the line segment endpoints. More specifically, our approximation is based on the analysis of texture mapping theory with consideration of the properties of perspective mapping and Gaussian convolution. The following is a brief description of the process to render a line; all rendered lines are then blended as in [5].

Ellipse equations for the two EWA resampling filters at both endpoints of a line are computed. These equations are then used to compute the tangent lines (1, 2 and 3, 4) connecting the two ellipses, and vertices 5 to 18, as shown in Figure 4. These 18 vertices are then used in the mapping of the Gaussian influences while minimizing possible distortion. The Gaussian influences are pre-computed from a Gaussian line of length l > 2r, where r is the cutoff radius of a unit Gaussian kernel.


The pre-computed influence is then sliced into a side texture and a middle texture, which are used for the endpoint portions and the middle portion respectively. Next, the line is coloured or mapped with texture images.

Figure 2: Arm bone represented by different primitives.

Figure 4: Texturing lines' contents with influence textures.

3. RESULTS

We implemented a software pipeline in C/C++ and conducted experiments for both the line and point primitives. The line and point models we used were converted from triangle models. In the experiments we compare the quality and performance of rendering different models using points, lines, and a hybrid of points and lines. We also scrutinize the results of linearly interpolated texture mapping. A video of the rendering results is at http://www.comp.nus.edu.sg/~tants/line.html.

From our experiments, the numerical results (the image difference between a rendered line model and a point model) show that the rendered line and point models have the same quality. Indeed, no significant difference is detected visually between each pair of images from a line model and its corresponding point model. Based on our experiments, the estimated cost of rendering a line is equal to that of rendering 4.3 points.

We also created hybrid models (combinations of lines and points). In conjunction with the result in the last paragraph, we find that the optimum hybrid model should convert to points those lines whose length is less than the maximum distance covered by 4 points. We intend to perform more experiments on different platforms to better understand the tradeoff between lines and points.

As for texture-mapped line models, shorter lines cause a smaller interpolation error than longer lines. Additionally, due to the perspective mapping and the linearity of the lines' texture coordinate interpolation, the problem of texture mapping error is indeed worsened for lines oriented in the viewing direction. Our preliminary investigation shows the cost of rendering a textured line to be equivalent to rendering about 5 points. An example of a textured model is the face model shown in Figure 1.

Figure 3: Opaque (top) and transparent (bottom) anti-aliased checkerboard line models. Each line is painted with only one color, and lines of different colors are separately laid in checker boxes.

4. CONCLUDING REMARKS

We are currently investigating ways to convert raw data (for example, 3D scanned points) directly to lines, and also tools to support hybrid modeling using any combination of points, lines and triangles. To increase the rendering performance of the primitive, we are also looking into implementing the approximation using existing graphics hardware acceleration. Other possible research includes new data structures and algorithms to support inter-primitive and intra-primitive level-of-detail models.

References

[1] P. Heckbert. Fundamentals of Texture Mapping and Image Warping. Master's Thesis, University of California, Berkeley, June 1989.

[2] M. Levoy and T. Whitted. The Use of Points as a Display Primitive. Technical Report TR 85-022, University of North Carolina at Chapel Hill, 1985.

[3] H. Pfister, M. Zwicker, J. van Baar, M. Gross. Surfels: Surface Elements as Rendering Primitives. In Proc. of SIGGRAPH 2000, pp. 335–342, July 2000.

[4] L. Ren, H. Pfister, M. Zwicker. Object Space EWA Surface Splatting: A Hardware Accelerated Approach to High Quality Point Rendering. In Proc. of Eurographics 2002, pp. 461–470, September 2002.

[5] M. Zwicker, H. Pfister, J. van Baar, M. H. Gross. Surface Splatting. In Proc. of SIGGRAPH 2001, pp. 371–378, July 2001.


Visual Exploration of Association Rules

Li Yang (e-mail: li.yang@wmich.edu)
Department of Computer Science, Western Michigan University

Frequent itemsets and association rules [Agrawal and Srikant 1994] are difficult to visualize. This is because they are defined on elements of the power set of a set of items and reflect the many-to-many relationships among the items. In the absence of an effective technique for visualizing many-to-many relationships, association rules pose fundamental challenges to information visualization.

We begin by defining a few terms. An itemset is a set of items. A transaction supports an itemset if the transaction contains all items in the itemset. The support of an itemset A, support(A), is defined as the percentage of transactions that support A. The support of a rule A → B is defined as support(A ∪ B). The confidence of the rule A → B is defined as support(A ∪ B)/support(A). An item group is a transitive closure of items in a set of frequent itemsets or association rules. Mining generalized association rules with an item taxonomy was proposed in [Srikant and Agrawal 1995]. An example item taxonomy tree that organizes the items {a, b, c, d} is shown in Figure 1. A transaction T supports an item a if a ∈ T or a is an ancestor of some item in T under the item taxonomy. A transaction T supports an itemset A if T supports every item in A. An ancestor itemset Â of A is obtained by replacing one or more items in A with their ancestors; A is then called a descendent itemset of Â.

Frequent itemsets are downward closed according to the subset relationship and the ancestor relationship. Let I be the set of all items and IT be an item taxonomy on I. Let P(I) denote the power set of I. Define the generalized power set GP(I, IT) as GP(I, IT) = P(I) ∪ {ancestor itemsets of A | ∀A ∈ P(I)}; that is, GP(I, IT) contains all possible itemsets and their ancestor itemsets. Define a partial order ⪯ as: (1) A ⪯ B if A ⊆ B; (2) Â ⪯ A for any ancestor itemset Â of A. Then ⟨GP(I, IT), ⪯⟩ is a lattice. It is easy to verify that support(A) ≥ support(B) if A ⪯ B. Therefore, there is a border in ⟨GP(I, IT), ⪯⟩ which separates the frequent itemsets from the infrequent ones. Figure 2 shows an example support border in the generalized lattice ⟨GP(I, IT), ⪯⟩ on the items I = {a, b, c, d} under the item taxonomy of Figure 1. We use straight lines to denote subset relationships and arcs to denote ancestor relationships.

An item taxonomy tree can be partly displayed in a visualization, beginning from its root and stopping at any internal nodes. An itemset is called displayable if all items in the itemset are shown in the displayed taxonomy tree. The displayable property is downward closed in the generalized itemset lattice ⟨GP, ⪯⟩. Therefore, we now have two borders in ⟨GP, ⪯⟩: one border separates the frequent itemsets from the infrequent ones; the other separates the displayable itemsets from the non-displayable ones. For example, assume that the item taxonomy tree in Figure 1 is partly displayed so that only the items c, d, e, f are visible and items a and b are invisible; this specifies a border of displayable itemsets, which is also shown in Figure 2.

We can design a visualization method so that only non-redundant displayable frequent itemsets are displayed. Here non-redundant means that the frequent itemset is not implied by any other displayed frequent itemsets. In the lattice ⟨GP, ⪯⟩, the non-redundant displayable frequent itemsets must reside on the border of the intersection of the frequent itemsets and the displayable itemsets. Taking Figure 2 as an example, ec and ed are two such itemsets on this border and should be visualized; the other displayable frequent itemsets are implied by these two itemsets.

Figure 1: A simple item taxonomy tree (the items a, b, c, d under internal nodes e and f).

Figure 2: The lattice ⟨GP(I, IT), ⪯⟩, showing the support border and the displayable itemset border.

Association rules generated from a frequent itemset also have a closure property. Let A be a frequent itemset and B ⊆ A; then B → (A − B) is an association rule if the support support(B) does not exceed support(A)/minconf, where minconf is the user-defined minimum confidence. This means that the association rules generated from a frequent itemset are upward closed according to their LHSs in the sub-lattice formed by the frequent itemset using the subset relationship as the partial order. In other words, if a → bc is a valid rule, then ab → c and ac → b are valid rules that pass the same support and the same confidence tests. Furthermore, a → b, a → c, a → b̂c and a → bĉ are also valid rules. We have developed algorithms for generating displayable frequent itemsets and for generating association rules that are not implied by any other rules.
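A minimal sketch of rule generation under this closure property (our own straightforward implementation of the stated confidence condition, reporting only LHS-minimal rules; not necessarily the authors' algorithm):

    from itertools import combinations

    def minimal_rules(A, supp, minconf):
        # Rules B -> A-B from frequent itemset A. A rule is valid iff
        # supp[B] <= supp[A] / minconf; if B is valid, every superset of B
        # yields an implied rule, so only LHS-minimal rules are reported.
        # supp maps frozensets to support values.
        A = frozenset(A)
        minimal, rules = [], []
        for k in range(1, len(A)):
            for lhs in map(frozenset, combinations(A, k)):
                if any(m <= lhs for m in minimal):
                    continue                         # implied by a smaller LHS
                if supp[lhs] <= supp[A] / minconf:   # confidence test
                    minimal.append(lhs)
                    rules.append((set(lhs), set(A - lhs)))
        return rules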

Parallel coordinates have often been used [Inselberg 1990] to visualize relational records. We propose to use them to visualize data with variable lengths, such as frequent itemsets and association rules. Figure 3(a) illustrates the visualization of three frequent itemsets adbe, cdb and fg as polygonal lines. Items are arranged by item groups, so that items belonging to the same group are displayed together. In this way, the polygonal lines are organized into "horizontal bands" and never intersect with each other. Figure 3(b) illustrates the visualization of an association rule ab → cd. An association rule is visualized as one polygonal line for its LHS, followed by an arrow connecting another polygonal line for its RHS. This method provides a way to support the closure properties: subsets of displayed frequent itemsets are implied to be frequent, and ab → cd implies that abc → d, abd → c, ab → c and ab → d are all valid rules. If two or more itemsets or rules have parts in common, for example adbe and cdb in Figure 3(a), we can use cubic Bezier curves instead of polygonal lines to distinguish one from the other. Two example rules, ab → ce and db → ce, are visualized in Figure 4 using Bezier curves.

Figure 3: Visualizing (a) frequent itemsets and (b) an association rule on parallel coordinates over the items a–g.

Figure 4: Visualizing association rules using Bezier curves.
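A minimal sketch of the curve evaluation (de Casteljau's scheme; the choice of interior control points, offset per itemset so that curves sharing an axis point separate visually, is our guess rather than the paper's exact construction):

    def cubic_bezier(p0, p1, p2, p3, steps=32):
        # Sample a cubic Bezier curve between two parallel-coordinate
        # anchors p0 and p3, with interior control points p1 and p2.
        def lerp(a, b, t):
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        points = []
        for i in range(steps + 1):
            t = i / steps
            q0, q1, q2 = lerp(p0, p1, t), lerp(p1, p2, t), lerp(p2, p3, t)
            r0, r1 = lerp(q0, q1, t), lerp(q1, q2, t)
            points.append(lerp(r0, r1, t))
        return points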

We demonstrate our method using supermarket transaction data from IBM DB2 Intelligent Miner as the test data. The data contain 80 items, which are the leaf nodes of a 4-level taxonomy tree. 496 frequent itemsets are discovered when the minimum support is set to 5%. The visualization begins by displaying the [Root] nodes. As the user clicks on a node and expands the item taxonomy tree, frequent itemsets are displayed. Figure 5 visualizes frequent itemsets on a partly shown taxonomy tree. The color of the name of each item or item category represents its support. Displayable frequent itemsets are visualized as smooth connections of Bezier curves, and the color of a curve represents the support value of the corresponding itemset. The user can select an itemset by clicking anywhere on its curve segments. The selected itemset and its implied itemsets are then printed out with their support values.

Figure 5: Frequent itemsets drawn on the selected items.

Figure 6: Association rules drawn on the selected items.

Figure 6 shows a visualization of the discovered association rules when the minimum support is set to 5% and the minimum confidence is set to 50%. Association rules are aligned according to where the RHSs separate from the LHSs. In this example, the left two coordinates represent the LHSs of the rules and the right two coordinates represent the RHSs. The support of a rule is represented by line width; the confidence of a rule is represented by color. All these visualizations also support panning and zooming.

The fundamental problem in the visualization of frequent itemsets and association rules is that there is a long border of frequent itemsets in the generalized itemset lattice, and there is no visual technique directly applicable to displaying many-to-many relationships. We have overcome this problem by using an expandable item taxonomy tree to organize the items. Basically, this introduces another border, which separates the displayable itemsets from the non-displayable ones. Only those frequent itemsets that are on this border are displayed. By changing this border through expanding or shrinking the display of the item taxonomy tree, we selectively visualize the frequent itemsets and association rules that we are interested in.

References

Agrawal, R., and Srikant, R. 1994. Fast algorithms for mining association rules. In Proc. 20th Int. Conf. Very Large Data Bases (VLDB'94), 207–216.

Inselberg, A. 1990. Parallel coordinates: A tool for visualizing multi-dimensional geometry. In Proc. 1st IEEE Conf. Visualization, 361–375.

Srikant, R., and Agrawal, R. 1995. Mining generalized association rules. In Proc. 21st Int. Conf. Very Large Data Bases (VLDB'95), 407–419.


Multi Level Control of Cognitive Characters in Virtual Environments

Peter Dannenmann, Henning Barthel, Hans Hagen
German Research Center for Artificial Intelligence (DFKI)
e-mail: {dannenmann, barthel, hagen}@dfki.uni-kl.de

Abstract

We present our approach for a general component-based animation framework for autonomous cognitive characters. In this ongoing project we develop a working platform for autonomous characters in dynamic virtual environments, where users can define high-level goals and virtual characters determine appropriate actions based on specific domain knowledge and AI techniques. The user is also allowed to overrule a character's decision and force it to execute different actions. Motion sequences implied by the character's actions are created by adapting reference motions provided by a motion database.

CR Categories: I.2.0 [Artificial Intelligence]: General—Cognitive Characters; I.3.7 [Computer Graphics]: Three-dimensional Graphics and Realism—Animation

Keywords: character animation, cognitive characters, animation framework

Introduction

In recent years, the animation and simulation of human characters in virtual environments has become ever more important in numerous areas of application. Besides movie, gaming and advertising companies, the manufacturing industry has also discovered the benefit of integrating virtual humans into its development processes.

Although today's commercially available animation packages provide advanced tools like key-frame editors, inverse kinematics, etc., the creation of high-quality animations is still expensive and especially dependent on skilled artists, animators or programmers. The design of computer animations is, moreover, an enormously creative process, and quite a lot of modification will occur until a generally accepted result has been achieved. That means that, until a final version of an animation has been generated, a lot of refinements concerning the construction of characters and environments, as well as their number, type and distribution, have to be handled. All this requires significant manual animator intervention.

Therefore, providing a higher degree of automation in this process is vital. Incorporating Artificial Intelligence technology within animation generation tools and procedures can efficiently support this task. Of particular interest are flexibility and adaptability of the virtual character's behavior with respect to changing roles or a dynamically changing environment, as well as techniques for reusing already existing motion sequences.


Animating Cognitive Virtual Characters

Over the decades, computer animation has evolved from a purely geometry-based manipulation technique to a more powerful simulation of models that includes physical principles (see e.g. [Watt and Watt 1992], [Kokkevis et al. 1996], [Sun and Metaxas 2000]). The fundamental motivation behind this entire endeavor is the automation of a variety of difficult animation tasks, which especially include the creation of realistically looking and moving virtual characters. Traditional approaches to meeting those requirements were to employ highly skilled human animators using the labor-intensive keyframing technique.

After the inclusion of physical principles in the animation generation process, the next substantial progress was the introduction of behavioral animation techniques ([Brogan et al. 1998], [Cavazza et al. 1998], [Chen et al. 2001]).

Adding a cognitive layer on top of the behavioral level (see e.g. [Funge 1999]) allowed the characters to act quite autonomously and react quite flexibly to changes in their environment. However, the direct interaction of the characters with dynamic environments, as well as permitting control of the characters on all three levels in parallel (i.e. on the direct control level, on the behavioral level and on the cognitive level), are still challenging subjects of research.

The CONTACT Framework

In the project CONTACT [Barthel et al. 2003], topics related to the generation and automatic adaptation of low-level (key-frame) motions of virtual characters, as well as topics related to the multi-level directability of autonomous characters, are investigated. The resulting system will enable the user to define animations on a high-level basis, mainly by specifying a goal for a virtual character. Based on the given domain knowledge, the character will automatically work out an appropriate action plan. Additionally, the user will be allowed to overrule the character's decision and, for example, force the character to fall back on some predefined behavior. When an action implies the movement of a character, the corresponding motion sequence will be created by automatically adapting reference motions provided by a motion database.

The generation of animations within the CONTACT framework is an iterative process. Starting from a given description of the cognitive characters' environment and situation, the characters initially plan their actions. After the planning is completed, the plans are executed by animating the characters. Due to the dynamic nature of the environment, changes (possibly caused by the virtual characters, by moving environment objects, or by user interaction) may occur that make the current plans obsolete. This causes the characters involved to perform a re-planning that takes the changed environment into consideration.

Based on the actual plan, the animation is generated by combining motion sequences of atomic actions. These atomic actions are stored within a motion database, adapted to the executing character's anthropometry, and then combined into the animation sequence (see Figure 1).

The generation of the cognitive characters' action plans is based on Funge's work ([Funge 1998]) on the Cognitive Modeling Language (CML), which in turn has its foundation in the Situation Calculus. This permits us to describe an actor's possible atomic actions with their corresponding preconditions, to determine the actions' applicability, and their effect axioms.

Figure 1: The animation generation cycle. (The diagram connects a Cognitive Character component (Java), an Animation Control component (Java), an Environment component (Java) built from an XML environment description, and a Path Computation component (C++), exchanging animation plans, primitive/atomic actions, events/interrupts/control signals, and data; user interaction feeds into the cycle.)

Figure 2: Atomic actions and related atomic motion sequences. (Each atomic action in the CML action description, with its preconditions and effect axioms, is linked to an atomic motion sequence in the motion database.)

The atomic actions can be mere sensing actions, or they can imply some movement of the character. While the atomic actions are simply stored in the CML description of the character's capabilities, the corresponding atomic motions are stored within a motion database (see figure 2). This enables us to develop the character's behavioral description separately from its implementation as reference motion sequences, while still maintaining the link between the logical (CML) action description and its implementation.

On the basis of the given CML description of the character's possible atomic actions, we use Funge's tool to generate the actual plan from the dynamic environment properties. This plan is finally used for generating the animation sequence. For a smooth integration into our framework, we have enhanced the perception capabilities of Funge's concept by giving the planning component access to the environment representation via the animation control component (see figure 1). This approach decouples the planning component from the environment representation and permits the realization of dynamic environments that can be manipulated in various ways, e.g. by the definition of dynamic properties of the environment objects or by user interaction. Additionally, as mentioned above, the virtual characters themselves can also change their environment during plan execution.

The geometry and structure of the virtual characters in our framework follows the humanoid animation specification H-Anim 1.1, which gives a standard way of representing humanoids in VRML97, independent of any operating system and computer hardware. In order to be independent of VRML, we use an XML-based description of virtual humans which can easily be generated from any H-Anim compliant model. This XML description is read in using software components realizing the H-Anim hierarchy. The character can then be visualized and manipulated by converting the XML description into a Java3D scene.

In order to realize a dynamic environment, we decoupled the planning component from the environment representation. In the CONTACT framework this is realized by using XML to describe the environment as a hierarchy of objects, each with appropriate attributes and properties like "weight", "isMovable", "doorIsOpen", etc. Utilizing appropriate software components, this environment description can be read, modified, and queried at runtime by other components, enabling interaction between the environment and the actors.
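To make this concrete, a minimal sketch of such a queryable object hierarchy is given below. All names are illustrative and not taken from the CONTACT implementation (whose components are realized in Java; the sketch is generic C++):

```cpp
// Hypothetical sketch of the object hierarchy encoded by the XML environment
// description; other components query and modify properties at runtime.
#include <map>
#include <memory>
#include <string>
#include <vector>

struct EnvObject {
    std::string name;                                  // e.g. "door_01"
    std::map<std::string, std::string> properties;     // e.g. "isMovable" -> "true"
    std::vector<std::unique_ptr<EnvObject>> children;  // hierarchy of sub-objects
};

// Example query: does this object carry a boolean property set to "true"?
bool getBoolProperty(const EnvObject& obj, const std::string& key) {
    auto it = obj.properties.find(key);
    return it != obj.properties.end() && it->second == "true";
}
```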

An XML description of the environment is generated by using an XML editor to build a hierarchy of environment objects and to alter the attributes and properties of each object. For the geometric representation of each object, standard modeling tools or already existing geometry can be used.

Conclusion

In this paper we presented the design and architecture of a general animation and simulation framework for cognitive, human-like, autonomous characters. The system, which we are currently working on, enables the user to define animations on a high-level basis, mainly by specifying a goal for the virtual character. Based on the given domain knowledge, the character automatically works out a sequence of appropriate actions in dynamic environments. In addition, the user is allowed to overrule the character's decision and, for example, force the character to fall back on some predefined behavior. The resulting animations are computed automatically by combining reference motions available from a motion database.

References

BARTHEL, H., DANNENMANN, P., AND HAGEN, H. 2003. Towards a general framework for animating cognitive characters. In Proceedings of IASTED Visualization, Imaging, and Image Processing (VIIP 03).

BROGAN, D. C., METOYER, R. A., AND HODGINS, J. K. 1998. Dynamically simulated characters in virtual environments. IEEE Computer Graphics and Applications 15, 5 (September/October), 58–69.

CAVAZZA, M., EARNSHAW, R., MAGNENAT-THALMANN, N., AND THALMANN, D. 1998. Motion control of virtual humans. IEEE Computer Graphics and Applications 15, 5 (September/October), 24–31.

CHEN, L., BECHKOUM, K., AND CLAPWORTHY, G. 2001. A logical approach to high-level agent control. In Proceedings of the 5th International Conference on Autonomous Agents, 1–8.

FUNGE, J. D. 1998. Making Them Behave: Cognitive Modeling for Computer Animation. PhD thesis, Department of Computer Science, University of Toronto, Toronto, Canada.

FUNGE, J. D. 1999. AI for Games and Animation: The Cognitive Modeling Approach. A. K. Peters Ltd.

KOKKEVIS, E., METAXAS, D., AND BADLER, N. 1996. User controlled physics-based animation for articulated figures. In Proceedings of Computer Animation 1996.

SUN, H. C., AND METAXAS, D. 2000. Animation of human locomotion using sagittal elevation angles. In Proceedings of the 8th Pacific Conference on Computer Graphics and Applications.

WATT, A., AND WATT, M. 1992. Advanced Animation and Rendering Techniques. Addison Wesley.


HistoScale: An Efficient Approach for Computing Pseudo-Cartograms

Daniel A. Keim, Christian Panse, Matthias Schäfer, Mike Sips
University of Konstanz, Germany
{keim,panse,schaefer,sips}@informatik.uni-konstanz.de

Stephen C. North
AT&T Shannon Laboratory, Florham Park, NJ, USA
north@research.att.com

1 Motivation

Nowadays, two types of maps, the so-called Thematic Map and the Choropleth Map, are used in cartography and GIS systems. Thematic maps are used to emphasize the spatial distribution of one or more geographic attributes. Popular thematic maps are the choropleth maps (Greek: choro = area, pleth = value), in which enumeration or data collection units are shaded to represent different magnitudes of a variable; the statistical values are often encoded as colored regions on these maps. On both types of maps, high values are often concentrated in densely populated areas, and low statistical values are spread out over sparsely populated areas. These maps therefore tend to highlight patterns in large areas, which may, however, be of low importance. A cartogram can then be seen as a generalization of a familiar land-covering choropleth map. According to this interpretation, an arbitrary parameter vector gives the intended sizes of the cartogram's regions; that is, a familiar land-covering choropleth map is simply a cartogram whose region sizes are proportional to the land area. In addition to the classical applications mentioned above, a key motivation for cartograms as a general information visualization technique is to have a method for trading off shape and area adjustments. Pseudo-cartograms provide an efficient and convenient approximation of cartograms, since a complete computation of cartograms is expensive. In this poster, we propose an efficient method called HistoScale to compute pseudo-cartograms.

2 HistoScale Approach

The basic idea of the HistoScale method is to distort the map regions along the two Euclidean dimensions x and y. The distortion depends on two parameters: the number of data items geographically located in a map area, and the area covered by that map region on the underlying familiar land-covering map. The distortion operations can be performed efficiently by computing a histogram with a given number of bins in each of the two Euclidean dimensions x and y, to determine the distribution of the geo-spatial data items in these dimensions. The two histograms are independent of each other, i.e., they can be computed in any order. The two consecutive distortion operations in the dimensions x and y realize a grid placed on the familiar land-covering map. The number of histogram bins can be chosen by the user; for a practicable visualization we suggest 256 histogram bins for both histograms.
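As an illustration, a minimal sketch of this first step, assuming the data items have already been projected to map coordinates (the function and parameter names are ours, not from the HistoScale implementation):

```cpp
// Minimal sketch: one of the two independent histograms (here over x);
// the y histogram is computed in exactly the same way.
#include <vector>

std::vector<int> histogram(const std::vector<double>& coords,
                           double lo, double hi, int bins = 256) {
    std::vector<int> counts(bins, 0);
    for (double c : coords) {
        int b = static_cast<int>((c - lo) / (hi - lo) * bins);
        if (b == bins) --b;                  // put the maximum onto the last bin
        if (b >= 0 && b < bins) ++counts[b];
    }
    return counts;
}
```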

Each histogram bin covers an area on the underlying familiar land-covering map. To determine this area, the HistoScale method computes the upper and the lower point of intersection with the underlying map. The minimal bounding box containing the points of intersection and the preceding bin approximates the covered area for each histogram bin. The next step is to rescale the minimal bounding box of each histogram bin and, at the same time, the associated map regions in such a way that our HistoScale method fulfills the cartogram condition, i.e., the covered map area corresponds to the number of geographically located data items in the map region. The area covered by the minimal bounding box is determined by its width (equal for all histogram bins) and its height H_MBB (different for each histogram bin). We therefore compute new widths for each of the minimal bounding boxes while the heights remain unmodified. The new widths can be determined using the following formula:

$$\left|\overrightarrow{hb'_{i-1}\,hb'_{i}}\right| \;=\; \frac{\sum_{j=1}^{|DB|}\bigl|\,p=(x_j,y_j)\in hb_i\,\bigr|}{\bigl|\overrightarrow{up_i\,lp_i}\bigr|}\cdot\frac{\sum_{k=1}^{|hb|}(A_{MBB})_k}{\sum_{k=1}^{|hb|}(A'_{MBB})_k}$$

with

$$\forall i\in\{1,\dots,|hb|\}:\quad (A_{MBB})_i=\bigl|\overrightarrow{hb_{i-1}\,hb_i}\bigr|\cdot\bigl|\overrightarrow{up_i\,lp_i}\bigr|,\qquad (A'_{MBB})_i=\sum_{j=1}^{|DB|}\bigl|\,p=(x_j,y_j)\in hb_i\,\bigr|$$

where (A'_MBB)_i is the new area of the minimal bounding box MBB of each histogram bin, hb = {hb_1, ..., hb_m} are the end points of all histogram bins, and lp, up are the lower and upper points of intersection. To compute the new boundaries efficiently, our HistoScale algorithm only needs to compute the new end points of each histogram bin. The original and new end points of each histogram bin are stored in an array in ascending order.

Figure 1: Time comparison of cartogram algorithms (VisualPoints, Tobler pseudo-cartogram, Kocmoud and House, CartoDraw interactive, CartoDraw automatic, CartoDraw with HistoScale, HistoScale) on a logarithmic scale from 10^1 to 10^5 seconds; a 120 MHz Intel CPU is assumed for computing the US state cartograms.

After rescaling the map regions, our HistoScale algorithm computes the new coordinates of the map polygon mesh. The basic idea is to determine, for each polygon node, the original histogram bin in which this polygon node is geographically located. The search for this bin can be done in logarithmic time using binary search.
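A sketch of this per-node lookup-and-rescale step in one dimension, assuming both end-point arrays are sorted in ascending order as described (illustrative code, not the original implementation):

```cpp
// Map one polygon-node coordinate from the original map to the rescaled map:
// binary-search the bin containing x, then interpolate between its end points.
#include <algorithm>
#include <vector>

double rescaleCoordinate(double x,
                         const std::vector<double>& oldEnds,  // original bin end points
                         const std::vector<double>& newEnds)  // rescaled bin end points
{
    auto it = std::upper_bound(oldEnds.begin(), oldEnds.end(), x);  // O(log |hb|)
    std::size_t i = it - oldEnds.begin();
    if (i == 0 || i >= oldEnds.size()) return x;   // outside the map: leave unchanged
    double t = (x - oldEnds[i - 1]) / (oldEnds[i] - oldEnds[i - 1]);
    return newEnds[i - 1] + t * (newEnds[i] - newEnds[i - 1]);
}
```

Applying this function to the x and y coordinates of every polygon node yields the distorted pseudo-cartogram mesh.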

3 Application and Evaluation

The resulting output maps are referred to as pseudo-cartograms, since they are only approximations to the true cartogram solution. On the other hand, our approach generates interesting maps and good solutions in a least-squares sense. The computation of pseudo-cartograms using our HistoScale algorithm can be done in real time (see Figure 1). Due to this runtime behavior, HistoScale can also be used as a preprocessing step for other cartogram algorithms: Figure 1 shows that the computation time of the CartoDraw algorithm can be reduced without losing any quality. Figure 2 shows several interesting applications of our HistoScale algorithm. The world population pseudo-cartogram clearly shows that China and India are the most populated world regions. This fact has, for example, an important influence on the evolution of epidemics such as SARS, as unknown epidemics in such areas can be dangerous for the whole world population. The USA pseudo-cartogram clearly shows the two most populated areas, New York City and Los Angeles County.

References

Christopher J. Kocmoud and Donald H. House. Continuous cartogram construction. In IEEE Visualization, Research Triangle Park, NC, pages 197–204, 1998.

Daniel A. Keim, Stephen C. North, Christian Panse, and Jörn Schneidewind. Visualizing geographic information: VisualPoints vs CartoDraw. Palgrave Macmillan – Information Visualization, 2(1):58–67, March 2003.

W. R. Tobler. Pseudo-cartograms. The American Cartographer, 13(1):43–50, 1986.

Figure 2: Application examples. (a) World population pseudo-cartogram. (b) World SARS pseudo-cartogram (gray indicates countries with SARS cases). (c) US election cartogram: the area of the states corresponds to population and the color to the percentage of the votes; a bipolar colormap (Gore vs. Bush) shows which candidate won each state. (d) NY State pseudo-cartogram with texture mapping.


A Volume Rendering Extension for the OpenSG Scene Graph API

Thomas Klein, Manfred Weiler, Thomas Ertl
Institute of Visualization and Interactive Systems, University of Stuttgart
Universitätsstr. 38, 70569 Stuttgart, Germany; E-mail: {klein, weiler, ertl}@vis.uni-stuttgart.de

Abstract

We present the current state of our ongoing work on a simple-to-use, extensible, and cross-platform volume rendering library. The primary target of our framework is interactive scientific visualization, but volumetric effects are also desirable in other fields of computer graphics, e.g. virtual reality applications. The framework we present is based on texture-based direct volume rendering. We apply the concept of volume shaders and demonstrate their usefulness in terms of flexibility, extensibility, and adaptation to new or different graphics hardware. Our framework is based on the OpenSG scene graph API, which is designed especially with multi-threading and cluster rendering in mind; it is therefore very easy to integrate volumetric visualizations into powerful virtual reality systems.

Keywords: texture-based direct volume rendering, scene graph API, virtual reality

1 Introduction

Visualization of volumetric data is of utmost interest not only in the field of scientific visualization but also in other areas of computer graphics, like virtual reality or computer animation. Although some scene graph APIs that support volumetric objects are already available [1, 4], their solutions mostly lack support for the easy development of platform-independent and extensible applications. Our extension to the OpenSG scene graph library [2] is especially focused on these issues. The major design goals were: platform independence, extensibility, flexibility, usability, and seamless integration into the existing scene graph system.

Our work is closely related to SGI's OpenGL Volumizer [3]. However, our implementation is not limited to SGI hardware, especially since, with the OpenGL Volumizer 2.x releases, SGI canceled support for graphics systems other than InfiniteReality. Instead, we support a wide range of platforms and graphics adapters, from low-cost PC hardware like the NVIDIA GeForce to high-end visualization systems such as the SGI Onyx family.

The framework we present is built upon the OpenSG scene graph API, a real-time rendering system especially designed for use in multi-threaded, multi-pipe, and cluster-rendering environments. OpenSG is a freely available open source project written in C++ that uses OpenGL as its low-level graphics library. It provides an extensive set of scene graph nodes based on multithreading-aware container classes, thus allowing the easy development of multi-threaded virtual reality systems. Another advantage of OpenSG is its portability: it is known to work on many different platforms, including Linux, Irix, Solaris, and Microsoft Windows. Our volume rendering extension originates from the OpenSG PLUS project [2], funded by the German Ministry for Research and Education (BMBF), in which nine German research institutions (universities and independent research groups) cooperate in the development of important basic technology for OpenSG. This includes support for very large scenes, higher-level primitives like subdivision surfaces, and high-level shading on contemporary graphics hardware.

2 Implementation

The volume rendering extension we have implemented provides a special volume rendering node for the scene graph, consisting of several modules that provide the basic infrastructure, and a couple of volume shaders encapsulating the actual mapping algorithm. We use a texture-based direct volume rendering algorithm [5] employing either 3D or 2D texture maps, depending on the capabilities of the underlying graphics hardware. Fig. 1 shows the internal structure of the volume node and the interaction between the different modules.

Figure 1: The modular design of the volume node, showing the interactions among the renderer, slicer (slice data), clipper (clipped slices), shader (render slice, per-vertex data), and texture manager (register textures, activate brick).

The renderer module is the controlling instance responsible for steering the whole rendering process. It initiates the generation of slice polygons (either viewport-parallel or axis-aligned, depending on the available texture targets) by the slicer, and calls a shader module that renders the resulting slices with an appropriate OpenGL setup and the textures supplied by the texture manager. In 3D texture mode, the renderer and the texture management modules are also responsible for volume bricking, since texture memory is always a scarce resource with respect to the ever-growing size of volume data sets. The volume is split into bricks or tiles which completely fit into the available texture memory. The renderer ensures that the bricks are rendered in back-to-front order, with the texture manager providing the suitable texture maps. Special care is taken of textures that cannot be bricked but have to stay resident in texture memory, e.g. textures used for dependent lookups or transfer function tables.

The shader module differs from the other modules shown in Fig. 1 in that it can be exchanged by means of a plug-in concept. This makes it possible to change the visualization algorithm by simply replacing the shader module. The shader is responsible for registering the volume data as a texture with the appropriate format. It is also possible for a shader to specify an arbitrary number of per-vertex attributes which will be linearly interpolated along the edges of the slice geometry. Because the volume node only implements the infrastructure needed for rendering, and the actual OpenGL setup is done by the pluggable shader modules, new volume rendering algorithms or hardware-specific implementations can easily be handled by providing customized shader objects. In this context the shader functionality also provides an abstraction layer, separating the desired rendering effect from the available hardware support.
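The plug-in concept can be pictured as a small abstract interface. The following is a purely illustrative sketch; the class and method names are ours and do not reproduce the actual API of the extension:

```cpp
// Illustrative sketch of a pluggable volume shader interface.
struct VolumeData;    // voxel data supplied by the application
struct SlicePolygon;  // clipped slice geometry with per-vertex attributes

class VolumeShader {
public:
    virtual ~VolumeShader() {}
    // Upload the volume with the texture format this shader requires.
    virtual void registerTextures(const VolumeData& volume) = 0;
    // Configure OpenGL state (programs, blending) before slices are drawn.
    virtual void activate() = 0;
    // Draw one slice polygon; per-vertex attributes are interpolated along edges.
    virtual void renderSlice(const SlicePolygon& slice) = 0;
    virtual void deactivate() = 0;
};
```

Swapping the concrete subclass then changes the visualization algorithm without touching the slicing, clipping, or texture-management infrastructure.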


Figure 2: Different shading modes for the same volume data set. On the left, an appropriate transfer function is applied. The second image shows an iso-surface diffusely lit by 6 differently colored light sources. The third shows the same iso-surface lit by 3 light sources with diffuse and specular contributions. The last image depicts how a volume can be modified by a geometrically defined clip object.

A major problem in volume visualization is occlusion; therefore, it is often desirable to remove parts of the volume in order to reveal interior structures. A transfer function alone is often not sufficient to achieve that goal while at the same time emphasizing the structures one is interested in. To bypass this limitation, volume clipping can be used, which removes parts of the volume given by one or more clip geometries. The clip geometries, closed manifolds specified by geometry nodes in the scene graph, can be interactively assigned to a volume. Clipping is implemented as slice clipping, which means that we do not render the complete slice polygons but only the sections that, depending on the user-selected clipping mode, lie either inside or outside the clip objects. These sections are computed by intersecting the triangulated clip geometries with the volume slices using a fast incremental Sutherland-Hodgman-like algorithm that exploits the coherence between the contours on successive slices. Afterwards, the clipped slice polygons are determined by tessellation based on those polylines and the clipping mode. Because clipping is done completely in software, it does not interfere with the shader concept. Note that clipping based on tagged clip textures [6] could easily be implemented as a special shader module.
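For reference, the basic Sutherland-Hodgman step of clipping a slice polygon against a single plane looks as follows; this is a generic textbook sketch, whereas the incremental variant used in the framework additionally exploits slice-to-slice coherence:

```cpp
// Clip a slice polygon against one plane; the half-space n·p + d >= 0 is kept.
#include <cstddef>
#include <vector>

struct Vec3  { double x, y, z; };
struct Plane { Vec3 n; double d; };

static double signedDist(const Plane& pl, const Vec3& p) {
    return pl.n.x * p.x + pl.n.y * p.y + pl.n.z * p.z + pl.d;
}

std::vector<Vec3> clipAgainstPlane(const std::vector<Vec3>& poly, const Plane& pl) {
    std::vector<Vec3> out;
    for (std::size_t i = 0; i < poly.size(); ++i) {
        const Vec3& a = poly[i];
        const Vec3& b = poly[(i + 1) % poly.size()];
        double da = signedDist(pl, a), db = signedDist(pl, b);
        if (da >= 0.0) out.push_back(a);          // keep vertices on the inside
        if ((da >= 0.0) != (db >= 0.0)) {         // edge crosses the plane:
            double t = da / (da - db);            // insert the intersection point
            out.push_back({a.x + t * (b.x - a.x),
                           a.y + t * (b.y - a.y),
                           a.z + t * (b.z - a.z)});
        }
    }
    return out;
}
```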

3 Results

In this section we present some example images, generated using a simple interactive volume viewer application built upon the previously described framework. In order to demonstrate the applicability of the volume shader concept, we show example images using different volume shader modules.

The first example, the semi-transparent rendering shown in the leftmost image of Fig. 2, was generated by rendering a 256×256×128 voxel data set using our color table shader. This shader implements the most common approach in direct volume rendering: the mapping of data values to color and opacity using a tabulated transfer function. The image shown was generated on an SGI Onyx4 UltimateVision visualization system using a fragment program that realizes a post-shading transfer function by means of a dependent texture lookup. Second, we present an example of the extraction of iso-surfaces from volume data sets. We implemented a shader module that renders shaded iso-surfaces with the algorithm introduced in [7], slightly enhanced regarding the lighting computation. Both shaders adapt to the capabilities of the available graphics hardware by selecting an optimal OpenGL setup with respect to image quality and performance. The second and third images in Fig. 2 show two examples of illuminated iso-surfaces from the aforementioned data set, rendered on an NVIDIA GeForce FX. They differ in the number and properties of the applied light sources: the first was illuminated by six purely diffuse lights, while in the second, three lights with both specular and diffuse contributions were used. The last image in the row of Fig. 2 shows an example of a clipped volume: a cylindrical geometry is applied as a clip object to unveil the interior of the skull, which is rendered as a specularly lit iso-surface.

4 Conclusion

In this document we have briefly described an extension of the OpenSG scene graph with a framework for texture-based direct volume rendering. Volumetric objects can be included in any OpenSG scene. The framework fits seamlessly into the existing scene graph structure of OpenSG, thus enabling the application programmer to use volumetric effects without any additional effort. This in particular includes parallel rendering applications in a cluster environment, as we will demonstrate with a simple setup using four PCs to drive a large stereographic rear-projection display system. The framework hides the intricate tasks of texture management, slice generation, and volume clipping from developers, reducing their work to providing the data and selecting the right shader to achieve the desired effect. Additionally, using the hardware abstraction layer provided by the volume shader concept, it is easy to realize new shaders to support different graphics adapters or to adapt an existing shader to use new features of upcoming graphics chip generations. We have demonstrated the usefulness of this concept with examples of shaders encapsulating different volume rendering techniques, e.g. iso-surfaces.

References

[1] OpenRM, http://openrm.sourceforge.net/.
[2] OpenSG, http://www.opensg.org/.
[3] SGI OpenGL Volumizer, http://www.sgi.com/software/volumizer/.
[4] TGS, Open Inventor VolumeViz extension, http://www.tgs.com/.
[5] C. Rezk-Salama, K. Engel, M. Bauer, G. Greiner, and T. Ertl. Interactive Volume Rendering on Standard PC Graphics Hardware Using Multi-Textures and Multi-Stage-Rasterization. In Eurographics/SIGGRAPH Workshop on Graphics Hardware '00, pages 109–118, 147, 2000.
[6] D. Weiskopf, K. Engel, and T. Ertl. Volume Clipping via Per-Fragment Operations in Texture-Based Volume Visualization. In Proceedings of IEEE Visualization '02, pages 93–100, 2002.
[7] R. Westermann and T. Ertl. Efficiently using graphics hardware in volume rendering applications. In Computer Graphics (SIGGRAPH 98 Proceedings), pages 169–177, 1998.
Proceedings), pages 169–177, 1998.


Interactive Poster: KMVQL: A Graphical User Interface for Boolean Query Specification and Query Result Visualization

Jiwen Huo, William B. Cowan
School of Computer Science, University of Waterloo
jhuo@cgl.uwaterloo.ca, wbcowan@cgl.uwaterloo.ca

1. Introduction

Information is being created and becoming available in ever-growing quantities [1]. Users face an information overload problem and require tools to help them explore this vast universe of information in a structured way.

In information exploration, users specify terms of interest joined by query language operators. Boolean logic is commonly exploited in query languages, but it has been shown that users have difficulty formulating Boolean queries and analyzing the query results [1, 2].

In this poster, we present a technique called KMVQL (Karnaugh Map-based Visual Query Language), a visualization method based on Karnaugh maps. It can be used as a visual query language and as a visualization tool that shows the relationship between query terms and data sets.

2. Karnaugh Map

A Karnaugh map [3] (K-Map), originally proposed by Maurice Karnaugh, is a two-dimensional tabular layout of a truth table. It represents each combination of the n input variables as one cell of a table, making the simplification of Boolean expressions easy and intuitive.

Using a K-Map, specifying a Boolean query amounts to selecting cells in the K-Map. The K-Map is therefore a useful component for designing visual query languages.

But as the number of input variables increases, the size of a K-Map grows exponentially, making it difficult to understand and use. To alleviate this problem, KMVQL uses color coding to enhance the K-Map display.

Figure 1: K-Map with three variables. In this example, there are four selected cells surrounded by three circles; the expression reduces to BC + AC + AB.

3. KMVQL

Figure 2: The four components of KMVQL.

KMVQL incorporates dynamic query [4] techniques in the form of K-Maps. There are four basic components of KMVQL: the data source, an attribute value control window, a K-Map control window, and the final visualization.

The attribute value control window contains a set of selectors (sliders, radio buttons, check boxes, etc.) used to specify limits for the query terms. Each selector is assigned a unique color and has a check box associated with it. If a check box is checked, the associated attribute is used as a query term.

The K-Map control window displays an enhanced K-Map which is used to specify the Boolean structure of the query and to provide an intermediate visualization of the data items. The number of query terms equals the number of selected attributes in the value control window; the color of the tabs corresponds to the color of the selector check boxes. The data items that meet specific query terms are displayed in the corresponding cells of the K-Map. This display shows the contribution of each query term to the query results.

Of necessity, the attribute value control, the K-Map control, and the final visualization are tightly coupled. The K-Map control acts as middleware joining the other components. In traditional dynamic query systems no such middleware exists, and the resulting query is limited to the conjunction of predetermined selectors. With the K-Map control, arbitrary Boolean queries can be formulated easily.

4. Query Formulation with KMVQL

To specify a query, users find the cells related to their information need and select them in the K-Map. The resulting query is the disjunction of the Boolean queries associated with the selected cells.
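This evaluation rule is simple to state in code. The following sketch is our own illustration, not KMVQL source; note that the Gray-code layout of the on-screen K-Map only affects the display, not the evaluation:

```cpp
// An item matches the query iff the cell indexed by its query-term truth
// values is among the cells selected in the K-Map (disjunction over cells).
#include <cstddef>
#include <set>
#include <vector>

bool matchesQuery(const std::vector<bool>& termValues,      // truth value per query term
                  const std::set<unsigned>& selectedCells)  // cells picked in the K-Map
{
    unsigned cell = 0;
    for (std::size_t i = 0; i < termValues.size(); ++i)
        if (termValues[i]) cell |= 1u << i;                 // one bit per query term
    return selectedCells.count(cell) != 0;
}
```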

With KMVQL, users can construct multiple K-Maps and store them for further use. There are two approaches to constructing hierarchical queries: 1) the stored K-Maps can be added to the value control window, and their output queries can be used as query terms in the new K-Map; 2) the data items that match the query represented by a K-Map can be used as the data source of a new K-Map. These two approaches can be combined. In this way, users are relieved of using raw logical operators and parentheses when specifying queries.

5. Conclusion

KMVQL is a new method that can be used as a visualization tool and a visual query language. In KMVQL, dynamic query techniques are incorporated using K-Maps, which allow users to specify Boolean queries graphically by interacting with a direct-manipulation visual interface. It also visualizes context information for query results and provides a partial ordering of the results.

Future work will involve user studies to test the usability and effectiveness of KMVQL, extensions to deal with fuzzy logic and vector space queries, multiple visualization methods, and more tools for user-specified visualization.

References

[1] Spoerri, A., "InfoCrystal: A Visual Tool For Information Retrieval", Ph.D. Thesis, MIT, 1995.
[2] Young, D. and Shneiderman, B., "A graphical filter/flow model for Boolean queries: An implementation and experiment", Journal of the American Society for Information Science, 44(6):327–339, July 1993.
[3] Karnaugh, M., "The Map Method for Synthesis of Combinational Logic Circuits", AIEE, Vol. 72, 1953, 593–599.
[4] Shneiderman, B., "Dynamic Queries for Visual Information Seeking", IEEE Software, 11(6):70–77, 1994.


Visualization of 2-manifold eversions

M. Langer
McGill University

Abstract

The visualization shows step by step how a two-dimensional torus without a disk and a pretzel without a disk can be turned inside out in R³ by a continuous topological operation.

CR Categories: I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism; I.3.5 [Computer Graphics]: Computational Geometry and Object Modeling

Keywords: eversion, visualization, 2-manifold, torus, pretzel

1 Introduction

Let T²(r₁, r₂) be a two-dimensional torus embedded in R³, with radius r₁ of the minimal parallel and radius r₂ of the meridian. Then the following statement holds: T²(r₁, r₂) is homeomorphic to the torus T²(r₂, r₁) obtained from the initial torus by turning it inside out. The proof can be deduced from Smale's theorem on the eversion of the sphere.

Analogously, the torus without a disk D, T²(r₁, r₂) \ D, is homeomorphic to the everted T²(r₂, r₁) \ D.

2 Eversion of a torus without a disk

We construct the torus without a disk as a two-sided, front-and-back coloured 3D object within "Amorphium" (Play Inc.) and turn it inside out using the topological tools of the program.


3 Eversion of a pretzel without a disk

4 Conclusion

The presented visualizations help one to grasp the structure of the torus and the pretzel (spheres with 1 and 2 handles, respectively) and lead to an understanding of the eversion idea for any manifold from the class of smooth orientable 2-manifolds in R³ (spheres with handles).

Many interesting mathematical objects are difficult to visualize. Even simple visualizations can sometimes be helpful for finding new theorems as well as new proof ideas for known theorems. The software "Amorphium" helps to do this. With the help of this program we can analyse complicated surfaces with many self-intersections; select the areas with "correct normals" and the areas with "inverted normals" of the surfaces; verify their connectedness; and apply various operations which do not change the topology of the object, such as "Stretch", "Bend", "Shear", "Smooth", etc.

Acknowledgements

Many thanks for discussions to H.-Ch. Hege (Konrad-Zuse-Zentrum für Informationstechnik Berlin), to P. Pushkar (Moscow Independent University and Toronto University), and to the Moscow Centre for Continuous Mathematical Education.

References

SMALE, S. 1958. A Classification of Immersions of the Two-Sphere. In Trans. Amer. Math. Soc. 90, 281–290.

FRANCIS, G. K. 1987. A Topological Picturebook. Springer.


Prefetching in Visual Simulation

Chu-Ming Ng+, Cam-Thach Nguyen+, Dinh-Nguyen Tran+, Shin-We Yeow*, Tiow-Seng Tan+
+National University of Singapore   *G Element Pte Ltd
Emails: {ngchumin | nguyenca | trandinh}@comp.nus.edu.sg, shinwe@gelement.com, tants@comp.nus.edu.sg

1 Motivation

This project examines the problem of visual simulation of a virtual environment that is too large to fit into the main memory of a PC. We broadly classify the problem into three subproblems: render, query, and prefetch, which correspond, respectively, to processing data to be displayed, identifying and organizing data to be retrieved, and retrieving (identified) data into main memory in anticipation of the need to render them in the near future. Unlike the first two subproblems, there is little existing work that treats the prefetch subproblem in detail. Some existing applications adopt advanced data indexing and layout in an application-specific way but leave the operating system to do the actual fetching (paging) of data at runtime (see, for example, [LiPa02]). There are also approaches that use speculative prefetching based on the viewer's current position and velocity to prefetch data needed for future frames (see, for example, [CGLL98]). These are sometimes coupled with sophisticated occlusion culling techniques that reduce the amount of geometry that needs to be fetched from disk (see, for example, [VaMa02]). On the whole, the focus of current approaches is on solving the render and query subproblems, and it is not clear how these methods can provide specific quality-of-service guarantees with respect to page fault rates. The general lack of quantitative work on the prefetch subproblem underlies our motivation to study it in detail.

2 Prefetching Issues

The main objective of any prefetching mechanism is to ensure that any data needed for processing at any time during the visual simulation are already loaded into memory. Failure to maintain this objective results in page faults. The aim of any prefetching mechanism is thus to minimize the number of page faults for a given operating environment and system configuration. At any time t_i, for the observer O, let S_i be the amount of data in the main memory M, and F the needed set of data in its current viewing frustum. Then, prefetching wishes:

F ⊆ S_i.  (1)

Also, S_i ⊆ M.  (2)

To maintain equation (1), one must perform some prefetching starting at some later time t, to obtain S_j by time t_j. While prefetching is ongoing, O continues its movement. Then, S_i must be large enough to fulfill equation (1) until S_j is available, i.e.

F ⊆ S_i for all times up to t_j,  (3)

and the data to be fetched (i.e. those in S_j but not in S_i) must not be larger than the amount of data that can be fetched from disk to main memory from t until t_j:

S_j − S_i ≤ H(t_j − t)  (4)

where H is the system data transfer rate (see Figure 1). When equation (4) is not achieved by a prefetching request, we call it a scheme failure. A scheme failure at time t_j need not result in a page fault at time t_j, as the pages that are yet to be fetched may not be needed yet. Though scheme failures may be tolerable, they leave no guarantee of system performance and thus should be avoided. On the other hand, a page fault is always the result of a scheme failure. As such, we restate the aim of a prefetching mechanism as minimizing the number of scheme failures.
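In these terms, a scheme failure is a direct check of equation (4). A minimal sketch, with data amounts in bytes and H given as the system's transfer function:

```cpp
// A prefetch issued at time t and due at time tj is a scheme failure iff the
// complement S_j - S_i exceeds what the disk can deliver within tj - t.
#include <functional>

bool isSchemeFailure(double complementBytes, double t, double tj,
                     const std::function<double(double)>& H) {
    return complementBytes > H(tj - t);
}
```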

Figure 1. At time t, the prefetching mechanism decides to start prefetching to obtain S_j. As part of S_j is the same as that of S_i, fetching is only needed for the part in S_j − S_i.

3 Prefetching Schemes

For the purpose of analysing prefetching schemes, we make the following assumptions. First, we assume a 2D map with uniform data density ρ. Though this is unlikely in practice, one choice is to set the density to the highest density of the map. This is reasonable because, in the worst case, O can spend all its time moving in the highest-density part of the map. Second, we assume that a prefetching scheme maintains the same size of S_j each time it calculates S_j. Third, at any time there is at most one outstanding prefetching request; that is, no prefetching thread is initiated until the ongoing prefetching has completed. If not, the analysis can be modified as if there were only one pending prefetching request.

Shapes of Prefetch Region. We consider two shapes of prefetch region. First, we have the fan shape, as shown in Figure 2. Suppose the motion of O is governed by its maximum speed ν, and its view direction can change with a maximum angular speed ω. Then the calculation of S_j at location r_j (see also Figure 1) is such that the shortest time τ for the frustum of O to touch B_j, starting at time t, is the same in all directions for the given ν and ω. This τ is the amount of time for which the system has enough data to run, until time t + τ, without fetching more data. Second, we have the circle shape, as shown in Figure 3, which extends the fan shape with extra data to make it a circle. The reason to consider the circle shape is that it eliminates the contribution of rotation to S_j − S_i, resulting in a smaller amount of data to be fetched each time. It does, however, require much more memory.

Figure 2. Fan shape; the center shaded triangle is the current viewing frustum of O.

Figure 3. Circle shape extended from the fan shape; the extended part completes the fan to a circle.


One way to categorize prefetching schemes is to examine their decisions on (a) whether to trigger prefetching at the current time t and (b) if so, the amount of data to be fetched, while honoring equations (1) to (4). For a pre-determined shape of S_i, both decisions depend on a single factor: the distance of the current frustum F to the boundary B_i of S_i. The reasoning is as follows: the nearer the distance, the less time is available for prefetching before O moves out of B_i, possibly resulting in a page fault, and the larger the amount of data in S_j − S_i to fetch; see again Figure 1 for an illustration. We analyse the following two prefetching schemes.

Spatial Prefetching Scheme. The spatial prefetching scheme employs a closed curve as a threshold boundary b_i, as follows. The system does not fetch data until the current frustum touches the threshold boundary b_i. When it does, at time t, it defines a new reference point r_j at the observer location to calculate B_j, so as to fetch S_j − S_i and to set the new threshold boundary b_j. It can be argued that the best spatial prefetching scheme S, supporting the largest ρ, is one whose reference point to threshold boundary, and threshold boundary to outer boundary, are both a time τ away, for a total of 2τ of data contained in S.

Temporal Prefetching Scheme. For the above spatial prefetching scheme S, the "busiest" situation is when, each time it finishes a prefetch, the frustum again touches the threshold boundary and thus immediately initiates a new prefetch, and so on. In this case, S prefetches every τ interval. In the same spirit as the "busiest" situation of S, a temporal prefetching scheme T does its prefetching at a regular interval of τ. To initiate a prefetch at time t, it also sets the new reference point r_j at the current location of the observer to calculate B_j, so as to fetch S_j − S_i, where S_i has sufficient data to enable computation until t + τ, and S_j will contain enough data to enable computation from t + τ until t + 2τ. The scheme does not set any threshold boundary, but is implemented with a system interrupt at regular intervals of τ to trigger prefetching.

4 Relationship between τ and ρ

This section presents a relationship between the data density ρ and the amount of prefetching time τ available for the temporal prefetching scheme T. The result also applies to the spatial prefetching scheme S since, as discussed in the last section, S converges to T in the worst case. To obtain this relationship, we first study the maximum complement S_j − S_i of Figure 1 using simple geometry. Let l denote the distance of the far plane from the observer. We have:

(a) For the fan shape:

$$\frac{S_j - S_i}{\rho} = \left(4\nu^2 + 2\omega\nu l\right)\tau^2 + \left(\nu l + \frac{\omega l^2}{2}\right)\tau$$

(b) For the circle shape:

$$\frac{S_j - S_i}{\rho} = \left(\pi - 2\cos^{-1}\frac{\nu\tau}{2(2\nu\tau + l)}\right)(2\nu\tau + l)^2 + \nu\tau\sqrt{(2\nu\tau + l)^2 - \left(\frac{\nu\tau}{2}\right)^2}$$

In the ideal case, where the disk performance can be approximated as a linear function H(τ) = K(τ − ε), with K and ε constants, we can substitute the above into equation (4) to plot Figure 4. Because (S_j − S_i)/ρ is a quadratic function of τ while H(τ) is linear, a large τ may result in bad performance (a small supported ρ). On the other hand, if τ is too small, hard-disk overhead contributes a large percentage of the transfer time, also resulting in bad performance. In other words, there is a range of suitable τ that yields good performance. This is confirmed by the experiments discussed in the next section.
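A small numeric sketch of this trade-off, assuming the linear disk model H(τ) = K(τ − ε) and the circle-shape complement derived above (all names and constants are illustrative):

```cpp
// Complement area (S_j - S_i) / rho for the circle shape, formula (b) above.
// v: maximum speed, l: far-plane distance, tau: prefetch interval.
#include <cmath>

double circleComplementArea(double v, double l, double tau) {
    const double pi = std::acos(-1.0);
    double R = 2.0 * v * tau + l;   // radius of the circle-shaped prefetch region
    double d = v * tau;             // maximum observer displacement within tau
    return (pi - 2.0 * std::acos(d / (2.0 * R))) * R * R
         + d * std::sqrt(R * R - 0.25 * d * d);
}

// From equation (4) with S_j - S_i = rho * area: rho <= H(tau) / area.
double maxSupportedDensity(double v, double l, double tau, double K, double eps) {
    return K * (tau - eps) / circleComplementArea(v, l, tau);
}
```

Plotting maxSupportedDensity over τ reproduces the hump-shaped curve of Figure 4: the quadratic area term dominates for large τ, the overhead ε for small τ.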

Figure 4. The density ρ that can be supported, as a function of τ.

5 Experiment on Terrain Walkthrough

In our experiments, terrain data are used, as it is easy to create terrain datasets of different densities on a 2D map. Data are stored in a grid of cells. We use the temporal prefetching scheme, implemented as a thread separate from the other threads, such as the rendering thread. To realise the worst-case situation, we force the observer to run on a "tricky path" where the amount of data to be prefetched is maximal each time a prefetch is performed. We have run experiments on four densities, ranging from 75 to 112.5 Kbytes per cell, with the chosen parameters indicated in the graph. Our preliminary experimental results conform to the theoretical prediction outlined in the previous section; that is, there is a good range of τ with a small number of scheme failures.

Figure 5. Experimental results: number of scheme failures against τ (0.3 to 3.9 seconds), for densities of 75, 87.5, 100, and 112.5 KB/cell, with v = 400 m/sec, ω = 6°/sec, l = 1 km, and α = 30°.

6 Concluding Remarks

Our work aims to supplement the meager pool of knowledge on understanding prefetching quantitatively. With this understanding, one can incorporate other practical considerations when building prefetching systems to meet further challenges in real applications. We intend to do further experimentation on different platforms. There is also much further work: one possible direction is to incorporate this understanding into building a practical prefetching system as mentioned above. Such a practical system may support certain path predictions, "urgent" queue prefetching for page faults, selective memory release when prefetching new data, special data organizations that incorporate LOD and occlusion [BaPa03], etc.

References

[BaPa03] X. Bao and R. Pajarola. "LOD-based Clustering Techniques for Optimizing Large-scale Terrain Storage and Visualization", Proc. SPIE Conference on Visualization and Data Analysis, 2003.
[CGLL98] H. Chim, M. Green, W. Lau, H. Leong and A. Si. "On Caching and Prefetching of Virtual Objects in Distributed Virtual Environments", Proc. ACM Multimedia, pp. 171–180, 1998.
[LiPa02] P. Lindstrom and V. Pascucci. "Terrain Simplification Simplified: A General Framework for View-Dependent Out-of-Core Visualization", IEEE Transactions on Visualization and Computer Graphics, 8(3), July–September 2002, pp. 239–254.
[VaMa02] G. Varadhan and D. Manocha. "Out of Core Rendering of Massive Geometric Environments", Proc. IEEE Visualization, pp. 69–76, 2002.


Interactive Poster: Collaborative Volume Visualization Using VTK

Anastasia Valerievna Mironova
Department of Mathematical Sciences
University of Alaska Anchorage, Anchorage, Alaska
anastasia_mironova@hotmail.com

1. INTRODUCTION

The purpose of this interactive poster is to present the results of the "Collaborative Volume Visualization Using VTK" project, funded by the Alaska Experimental Program to Stimulate Competitive Research (EPSCoR) of the University of Alaska system.

The objective of the "Collaborative Volume Visualization Using VTK" project was the integration of additional three-dimensional visualization techniques, and the expansion of the allowable file formats, for the Collaborative Scientific Visualization Environment (CSVE).

2. COLLABORATIVE VOLUME VISUALIZATION ENVIRONMENT (CSVE)

CSVE is a basic collaborative scientific visualization environment, developed under a National Science Foundation (NSF) MRI grant (0215583) and an NSF REU supplement to that grant during FY2002; primary investigator: Dr. Patrick O'Leary.

CSVE allows any group of scientists on a network, sharing the same interface and visualizations, to explore simulations of different scientific and natural processes, to interactively roam and zoom an array of time-dependent data, and to interact in other ways, e.g. using a chat utility, whiteboard, streaming audio, streaming media, or the graphics screen, just as if sitting together in front of the same workstation.

Figure 1: The CSVE showing a three-dimensional volume dataset of a human head inside the visualization frame, with the primary application bar on top and two other graphical user interface frames showing the participants currently in the session (top) and the visualization components present in the scene (bottom).

CSVE was demonstrated in a prototype collaborative scientific visualization of time-dependent data sets. The example presented in Figure 1 is a visualization of a three-dimensional volume dataset of a human head within the environment.

CSVE is a client/server network application developed using the Java programming language. The server allows scientists to administer a scientific database that stores scientific data and user information, and handles session creation.

Figure 2: The internal static architecture of the CSVE server.

The client provides a desktop with several internal frames that can be viewed as a workbench for collaborative scientific visualization. The internal frames make the collaborative visualization and communication utilities available.

3. COLLABORATIVE VISUALIZATION

Building upon the collaborative visualization environment described above, the overall objective of the "Collaborative Volume Visualization Using VTK" project was to enhance the visualization graphics capabilities of the described system by researching and implementing additional three-dimensional scientific visualization techniques, using the powerful Visualization Toolkit (VTK) graphics system, for volume scalar and vector data sets, and to expand the acceptable data formats.

At the time of writing, the following volume visualization techniques have been implemented for three-dimensional file viewing:

• Creating isosurface objects with custom parameters;
• Creating isosurface objects with preset properties and a custom contour value;
• Creating custom cross section objects of the volume data.

The process of creating isosurface objects has been supplied with a graphical user interface that enables the user to set and change the parameters of such objects. Specifically, these parameters are the following: contour value, RGB color, specular lighting, specular power, transparency, and ambient parameter. The interface for creating isosurfaces with preset parameters requires the user to only select a contour value and the type of the desired isosurface; each type is associated with a specific set of material properties for the isosurface object.
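For flavor, the parameters listed above map naturally onto a standard VTK contouring pipeline. The sketch below uses current VTK C++ API calls with illustrative values; it is not code from the CSVE itself, which is written in Java (VTK also provides Java bindings with the same method names):

```cpp
// Sketch: build an isosurface actor from a volume source with user parameters.
#include <vtkActor.h>
#include <vtkAlgorithmOutput.h>
#include <vtkContourFilter.h>
#include <vtkPolyDataMapper.h>
#include <vtkProperty.h>

vtkActor* makeIsosurface(vtkAlgorithmOutput* volumeOutput, double contourValue) {
    vtkContourFilter* contour = vtkContourFilter::New();
    contour->SetInputConnection(volumeOutput);
    contour->SetValue(0, contourValue);      // user-chosen contour value

    vtkPolyDataMapper* mapper = vtkPolyDataMapper::New();
    mapper->SetInputConnection(contour->GetOutputPort());
    mapper->ScalarVisibilityOff();           // color comes from the actor property

    vtkActor* actor = vtkActor::New();
    actor->SetMapper(mapper);
    vtkProperty* prop = actor->GetProperty();
    prop->SetColor(1.0, 0.8, 0.7);           // RGB color (illustrative)
    prop->SetSpecular(0.3);                  // specular lighting
    prop->SetSpecularPower(20.0);            // specular power
    prop->SetOpacity(0.9);                   // transparency
    prop->SetAmbient(0.1);                   // ambient parameter
    return actor;
}
```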

Figure 3: Creating isosurface objects on a three-dimensional dataset of a human head, and the GUI components for creating preset (left) and custom (right) isosurface objects.

Custom cross sections represent a color map on a rectangular object. The creation of such objects, as mentioned above, is another tool implemented in the CSVE. The cross section objects can be translated along the X, Y, and Z axes, and the user can customize the cross section extent, scalar range values, color range, hue range, and saturation range.

Figure 4: Custom cross section objects on a three-dimensional dataset of a human head, and the GUI components for creating this type of object.

Any of the above components, along with the white box outline, can, once created, be conveniently modified or completely removed from the scene via the "Manage Components" frame, which has also been added as a tool for working with three-dimensional data sets.

The timeframe of the "Collaborative Volume Visualization Using VTK" project is June through August 2003. Consequently, the project will still be in progress for one month after this paper has been submitted; additional techniques and enhancements are therefore expected in the final version.

4. INTERACTIVE DEMONSTRATION

Besides the prepared poster, the CSVE itself will be available on several networked laptop computers for testing by conference attendees during the poster session; the presenters will provide all necessary equipment.

5. CONCLUSION

The "Collaborative Volume Visualization Using VTK" project is still in progress; however, considerable enhancements have already been added to the Collaborative Scientific Visualization Environment. The CSVE is now capable of serving as a collaborative tool not only for viewing .3DS files but also for exploring three-dimensional volume data. Creating isosurface objects and cross section objects are the two main tools implemented for this type of data session. The collaborative nature of these visualizations allows any of the created components to be easily modified or deleted from the scene by any participant in a session.

These enhancements make the CSVE a more powerful tool for collaborative visualization.



The Challenge of Missing and Uncertain Data (Poster)

Cyntrica Eaton and Catherine Plaisant
Human-Computer Interaction Lab
University of Maryland, College Park
College Park, MD 20742
ceaton@cs.umd.edu, plaisant@cs.umd.edu

Terence Drizd
National Center for Health Statistics
3311 Toledo Road
Hyattsville, Maryland 20782
tad2@cdc.gov

1. Abstract

Although clear recognition of missing and uncertain data is essential for accurate data analysis, most visualization techniques do not adequately support these significant data set attributes. After reviewing the sources of missing and uncertain data, we propose three categories of visualization techniques based on the impact that missing data has on the display. Finally, we propose a set of general techniques that can be used to handle missing and uncertain data.

2. Introduction

Information visualization presents an interesting paradox. While visual perception can be highly effective in the recognition of trends, patterns, and outliers, the conclusions drawn from such observations are only as accurate as the visualizations allow them to be. Therefore, to preserve the integrity of the data exploration process, it is important to design visualization techniques that render data as accurately as possible and do not introduce misleading patterns. While this is an issue on a broader level, poor handling of missing values and data confidences is one specific aspect of data visualization that can negatively influence the quality of data interpretation. Most available tools (especially research tools) cannot handle missing data and simply crash. The literature on visualization applications often reports on how the raw data has been preprocessed to “fill in the blanks” or extrapolate data, but users cannot see that the data was altered. Only rarely do tools attempt to make users aware of the presence of missing or uncertain information, e.g. [1,2,3].

We reviewed the sources of missing and low-confidence data and propose a classification of visualization techniques based on the impact missing data has on the display and on how likely users are to notice the existence of missing and uncertain data. Finally, we propose a list of techniques that can be used to handle missing and uncertain data.

3. Sources of Missing and Uncertain Data

Because so many visualization tools work with data that can be represented in tabular form, we define a missing data point as an empty table cell. Generally, missing data results from the tools and procedures used during experimentation and from constraints placed on the publication of results, e.g. uncollected data, redefined data categories, data source confidentiality protection, and non-applicable multivariate combinations. Given these intrinsic collection- and presentation-related causes, avoiding missing values is nearly impossible, and the amount of missing data is likely to grow in proportion to the size of the set.

In most current visualization applications, however, missing data is either omitted from the display space or presented in such a way that it is indistinguishable from valid data. Consider the graph shown in Figure 1 as an example. Although the first three


data points were actually missing, the preprocessing of the data filled the empty cells with zeros. Users are likely to interpret the diagram as showing values that are low and stable, then increase sharply. This bias is likely to occur even if users are aware of the preprocessing that took place.

Figure 1: Missing data encoded as zero values can be misinterpreted.

Confidence values are largely dependent upon the parameters of the experimentation process. Statistical sampling, sample size issues, flawed experimentation, and data estimation can all contribute to low confidence. While the missing data problem is more obvious, in that a cell in a data set is actually empty, the confidence problem may be even more difficult to detect. The confidence interval may not be included in the data at all (it doubles the size of the dataset), it may be difficult to present visually, and it may be difficult for some users to comprehend.

4. Classification of Visualization Techniques

We found three types of techniques with respect to the impact missing data has on the display. All visualizations use graphic objects to represent data items, and the position of those graphic objects on the display can be: 1) dedicated to the data item independently of the attribute values, 2) entirely a function of attribute values, or 3) a function of the item's attribute values and the values of neighboring items.

An example of the first category (“dedicated”) is a line graph in which the graphic object representing a data value is a dot with a dedicated X location. Other values in the data set have minimal influence on the graphic object; at most, the minimum and maximum values affect axis calibration. Choropleth maps and techniques relying on ordering can fall in this category. For this type of visualization, if the data is missing and no object is displayed at the corresponding X position, the absence of data should be easily detected, since users will be expecting to see a data point for each of the ordered values in the set (Figure 2).

Figure 2: Voids can accurately signal missing data.

An example of the second category (“attribute dependent”) is a scatter plot. In a scatter plot, the position, color, and size of an object are based on the data item's attribute values. If a data item is missing, there is nothing in the display that clearly indicates a missing data value.

Figure 3: Voids can go undetected or bias the display (legend: Agree, Disagree, Indifferent, Missing Data).

Examples of the third category (“neighbor dependent”) are pie charts or Treemaps. Here, the size and placement of the wedge or box representing the data item is a function of both the data item's attribute values and those of neighboring items. If a data item is missing, simply omitting it from the display space will not only go unnoticed but will also bias the appearance of other items. This is characteristic of all space-filling techniques. In contrast, the first two categories can be called neighbor-independent techniques.

Hybrid cases can be found. For example, with parallel coordinates, a missing data item will go unnoticed (the position of the line is entirely dependent on attribute values), but a missing attribute value might be noticed, as the location for that attribute is dedicated and the line can be drawn broken or connected to a separate location for missing values.

5. Possible Solutions and Directions

For both the neighbor-dependent and neighbor-independent models, there are primarily three data visualization enhancements that could provide an effective indication of missing data and confidence intervals: 1) dedicated visual attributes, 2) annotation, and 3) animation. Dedicating visual attributes involves associating color, texture, shape, or any combination of these with data point appearance in order to indicate missing points or confidence ranges. Annotation, on the other hand, would allow users to gain further insight into missing and unreliable data through text or graphic information presented outside the scope of data point appearance. Lastly, animation can provide a series of display transitions that allow users to view several different perspectives in a short period of time. Animation can be helpful in adding and eliminating missing data clues based on the preferences and/or intentions of the user. For example, users may initially be interested in observing the missing data points, yet eventually hide missing point indicators as their exploration goals change. Overall, the most effective way of using any of these enhancements is largely dependent upon the nature of the visualization paradigm.

For the visualizations in the first (“dedicated”) category, solutions abound, as even a void can be noticed. Designers can use dedicated graphic attributes such as a special color or style to display an extrapolated value (e.g. a gray dot or a dotted line). They can also use annotation with a textual or graphic icon, since there is dedicated space for it on the display, or they can use animation to first show only the available data and then show the addition of the estimated data, possibly with a warning to users about the reason for the missing data. Similar techniques can be used to represent the uncertainty of the data. The color can become less intense with uncertainty, boxes or range bars can annotate the display, or animation can illustrate the possible variations of the display for min and max values. While hatching and color ranges are both reasonably sound dedicated visual attributes that could be used to indicate associated confidence values, they could also be used to indicate the reasons why a given data point is unreliable. In either case, a particular hatching scheme or intensity would be mapped to a confidence value or a confidence influence and then incorporated into the display space to alert users accordingly. As stated before, these attributes should be carefully incorporated to ensure that the visualization does not become distorted, confounded, or ambiguous.
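As an illustration of the first of these ideas (ours, not taken from the poster), mapping a confidence value to color intensity can be as simple as blending a data point's color toward a neutral gray:

```cpp
#include <algorithm>

struct RGB { double r, g, b; };

// Blend a data point's color toward neutral gray as confidence drops, so
// low-confidence points stay visible but visually recede.
RGB FadeWithConfidence(RGB base, double confidence)
{
  double c = std::clamp(confidence, 0.0, 1.0);
  const double gray = 0.8;  // illustrative choice of neutral value
  return { base.r * c + gray * (1.0 - c),
           base.g * c + gray * (1.0 - c),
           base.b * c + gray * (1.0 - c) };
}
```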

For the second category of visualizations (“attribute dependent”), designers have to rely on annotations to represent missing values. For example, the number of missing items can be indicated on the side of the display, possibly with a list of names or partial representations when available. Hybrid cases exist where the data item is missing only some of its attribute values; for example, the X value can be known but the Y value missing, so specific annotation areas representing the partial data can be dedicated on the side. Ironically, this category of visualization suffers from the opposite problem: data may sometimes appear to be missing when in fact the graphic object is hidden by another one. For uncertainty, dedicated graphic attributes, annotation, or animation can be used. Data elements that vibrate, such that more stable data points indicate more confident measures, might also give users the insight needed to judge the dependability of data point values. Finally, methods like direct manipulation could provide the ability to filter data points on demand based on user-defined confidence thresholds.

Neighbor-dependent visualizations are much more difficult to deal with, as missing data is more likely to bias the interpretation of the rest of the data. Even choosing a default value for missing data has a significant impact on the display. Annotation is likely to be useful. Animation is intriguing: in a Treemap, for example, the data can first be shown without size coding, with a dedicated color marking the extent of the missing data, and then animated to a size-coded Treemap in which the missing data is indicated only by a small, fixed-size region of the same color. The classic use of annotation for marking uncertainty (error bars) is a challenge for neighbor-dependent techniques. Animating uncertainty is also a challenge, as elements interact with each other. Through direct manipulation, however, data analysts could be given the ability to filter the display space based on a user-provided range of confidence.

6. Conclusion

Dealing with missing and uncertain data is a challenge for information visualization. We hope that our general classification of visualizations and techniques will help us build effective prototypes that can be further tested to develop guidelines for designers.

7. Acknowledgement

This research was supported in part by the National Center for Health Statistics and NSF EIA 0129978.

8. References

1. MacEachren, A. M., Brewer, C. A., and Pickle, L. 1998. Visualizing georeferenced data: Representing reliability of health statistics. Environment and Planning A, 30, 1547-1561.

2. Twiddy, R., Cavallo, J., and Shiri, S. 1994. Restorer: A visualization technique for handling missing data. In IEEE Visualization '94, 212-216.

3. Olston, C., and Mackinlay, J. 2002. Visualizing data with bounded uncertainty. In Proceedings of the IEEE Symposium on Information Visualization, 37-40.


The Open Volume Library for Processing Volumetric Data

Sarang Lakare and Arie Kaufman†
Center for Visual Computing (CVC) and Department of Computer Science
Stony Brook University, Stony Brook, NY 11790
†{lsarang,ari}@cs.sunysb.edu

We present the Open Volume Library (OpenVL) for processing volumetric data and as a framework for collaboration and shared software development of volumetric data applications. OpenVL provides comprehensive low-level volume access functionality suitable for implementing algorithms and applications that deal with volumetric data, such as accessing a voxel neighborhood, performing operations at a given voxel, or interpolating data values at non-grid locations. We present OpenVL as a standard platform for collaboration in the community, achieved through an extensive plugin framework built into OpenVL. Scientists and researchers can implement their work as OpenVL plugins, which others in the community can easily use.

1 Introduction

Many scientific disciplines, ranging from the biomedical to the seismic sciences, deal with volumetric data. However, even with the widespread use of such data, there is no standard open source library for handling it. Most currently available systems, such as VTK [Schroeder et al. 1996; Schroeder et al. 1998], VolVis [Avila et al. 1994], AVS [Upson et al. 1989], OpenDX (formerly Data Explorer) [IBM 1991], and Khoros [Konstantinides and Rasure 1994], mainly provide high-level functionality for visualizing the data. Most of them do not provide low-level volume access functionality or a framework for handling volumetric data. Some libraries, such as ITK [Ins 2002] and ImLib3D [Bosc and Vik 2002], which were developed at the same time as OpenVL, do provide some low-level access functionality, but they lack support for multiple data layouts and a dynamic plugin framework, which we feel is critical for flexibility, extensibility, and ease of use.

The main motivation for our work is the lack of a standard framework for working with volumetric datasets. Any researcher or developer intending to work with volumetric data has to build tools that provide the basic functionality needed to access the data. OpenVL [Lakare and Kaufman 2003] is a framework that allows users to concentrate on algorithm development and implementation rather than on low-level volume access issues. It also makes the code more manageable, less prone to errors, and more readable.

We present OpenVL as a standard platform for collaboration in the community. We want to encourage the sharing of algorithm implementations to maximize code reuse and minimize duplication of effort. The OpenVL framework provides comprehensive support for plugins, which are dynamic modules capable of performing a specific task. This allows researchers and developers to provide their algorithm implementations as OpenVL plugins, which others can easily incorporate into their own code. For example, a plugin may provide volume subdivision, a region-grow capability, or an implementation of newly published work. As these plugins are used by other


users, it is likely that they will be optimized and improved. As a result, all OpenVL users will have access to the most optimized implementations of the various algorithms.

Figure 1: Overview of OpenVL.

The goals of this paper are to highlight the work done on OpenVL since it was introduced in a previous paper [Lakare and Kaufman 2003], to present the current state of OpenVL, and to encourage discussion and collaboration to define its future.

2 Highlights

The OpenVL design has the following important properties:

Modular: OpenVL is modular. Almost everything in OpenVL is implemented as a plugin, which makes it very easy to add and remove functionality.

Extensible: OpenVL is designed to be extensible. All the functionality provided by OpenVL can be extended by implementing additional plugins. These plugins can be provided by third parties and need not be part of OpenVL; their functionality becomes immediately available to all OpenVL-enabled applications.

High performance: Every part of OpenVL is implemented to provide maximum performance. The OpenVL design allows users to trade flexibility against performance in places where added flexibility can lead to reduced performance.

Ease of use: The various APIs used in OpenVL are designed to be as simple as possible. All APIs are documented, and reference documentation is always available on the OpenVL website. The use of plugins allows users to employ algorithms implemented by others without knowing the internals of the implementation.

Open source: We strongly believe in the fundamentals of open source. The entire source code for OpenVL is freely available on the Internet from the OpenVL website. The development of OpenVL is open and contributions are encouraged. The source code is managed with CVS, which allows parallel development.


3 OpenVL Overview

Figure 1 shows an overview of OpenVL. The user application is at the highest level and makes use of the various OpenVL components. The main components of OpenVL are:

• Volume: Stores the volumetric data in various layouts and provides access to the data.

• Volume File Input/Output: Loads volumetric data from user files into the Volume component and writes the data in the Volume component back to user files.

• Volume Processing: A framework for implementing volume processing algorithms. By volume processing we mean any task that can be performed on a volume, including image processing tasks or even volume rendering.

• Plugin Manager: Responsible for managing the OpenVL plugins. It also provides a trader interface through which applications query and request plugins (a sketch of this trader idea follows the list). Plugins are loaded on demand, which considerably reduces memory usage when a large number of plugins are installed.

• Utility Classes: A collection of classes commonly needed when working with volumetric datasets.

• GUI Widgets: Provide user interface components for various functionality offered by OpenVL.
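Purely as an illustration of how the Plugin Manager's trader interface might be used, consider the following sketch. The names and signatures here (Volume, Plugin, PluginManager, request) are ours, not OpenVL's actual API; a real manager would load shared-object plugins on demand, while this stub keeps an in-process registry so the sketch stays self-contained.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

struct Volume { /* voxel storage in one of several layouts */ };

struct Plugin {
  virtual ~Plugin() = default;
  virtual void run(Volume& v) = 0;   // e.g. region grow, subdivision, I/O
};

class PluginManager {
  std::map<std::string, std::function<std::unique_ptr<Plugin>()>> factories_;
public:
  void advertise(const std::string& capability,
                 std::function<std::unique_ptr<Plugin>()> make) {
    factories_[capability] = std::move(make);
  }
  // Trader interface: the application asks for a capability by name and
  // the plugin is instantiated (in OpenVL, loaded) only on demand.
  std::unique_ptr<Plugin> request(const std::string& capability) {
    auto it = factories_.find(capability);
    return it == factories_.end() ? nullptr : it->second();
  }
};
```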

4 Implementation

We have implemented the OpenVL library in standard C++. Our current development is on the Linux operating system and uses the GNU C++ compiler. The implementation of OpenVL pursues three goals:

• Fast: Since the major part of OpenVL operates at a very low level (voxel access), speed is a central concern.

• Ease of use: Our implementation focuses on making the library easy to use.

• Hiding templates: One important goal of our library is to hide C++ templates from the user as much as possible while making extensive use of them internally. This allows an efficient and flexible implementation of application code.

The implementation of OpenVL uses modern C++ techniques such as templates, partial template specialization, and code inlining, resulting in a high-performance and flexible library. The library is built as a shared library that applications can link to dynamically.

Almost everything in the library is built as plugins, which are binary shared object files that can be loaded and used at run time. This allows the functionality provided by OpenVL to be extended dynamically, making the library extensible. Since all plugins are simple files, they can be added or removed to control the functionality provided by OpenVL, giving OpenVL a modular structure.

All the APIs in OpenVL are clean, simple, and well documented, and reference documentation is always available on the OpenVL website. This makes the library easy to learn and use. To provide a clean and simple API, we hide the internal use of C++ templates from the user. This also has the advantage of controlling the size of the library: with extensive use of templates, the run-time size of the library could grow exponentially.
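One common way to realize this template hiding, shown here purely as an illustrative sketch and not as OpenVL's actual classes, is to expose a non-template base class and keep the templated storage internal:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// The user-facing handle is a plain (non-template) base class...
class VolumeBase {
public:
  virtual ~VolumeBase() = default;
  virtual double voxelAsDouble(std::size_t i) const = 0;
};

// ...while the voxel type lives in a templated class kept inside the
// library, so its instantiations never appear in application code.
template <typename T>
class TypedVolume : public VolumeBase {
  std::vector<T> data_;
public:
  explicit TypedVolume(std::size_t n) : data_(n) {}
  double voxelAsDouble(std::size_t i) const override {
    return static_cast<double>(data_[i]);
  }
};

// A factory selects the instantiation; callers only ever see VolumeBase.
std::unique_ptr<VolumeBase> MakeUCharVolume(std::size_t n) {
  return std::make_unique<TypedVolume<unsigned char>>(n);
}
```

The library then controls which instantiations exist, which is one way the binary size can be kept in check.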

OpenVL supports multiple file formats for storing volumetric data in files. This is achieved through plugins: each file format has a plugin that provides the input/output functionality for that format. Since these plugins are dynamic, existing applications using OpenVL can make use of new plugins at run time, without a recompile.

5 Future Work

Our next goal with OpenVL is to provide as much functionality as possible. This will include implementing plugins for various volume processing tasks, file formats, and data layouts. We also aim to add more utility classes to the library.

In the future, we would also like to extend OpenVL with a volume rendering framework analogous to the volume processing framework we have now. For this, we would like to add a volume rendering API and a volume modeling API with plugin support for different rendering engines and modeling methods, respectively.

6 Acknowledgements

This work has been partially supported by ONR grant N000140110034, NIH grant CA82402, NSF grant CCR-0306438, and a NYSTAR grant. The authors wish to thank Manjushree Lakare, Klaus Mueller, Suzanne Yoakum-Stover, and Susan Frank for their help, encouragement, and discussions. We also thank Sourceforge.net for providing the CVS code repository, mailing lists, and initial WWW hosting for our project. The OpenVL website can currently be accessed at http://openvl.sourceforge.net. More information about OpenVL can be found at http://www.cs.sunysb.edu/~vislab/projects/openvl.

References

AVILA, R., HE, T., HONG, L., KAUFMAN, A., PFISTER, H., SILVA, C., SOBIERAJSKI, L., AND WANG, S. 1994. VolVis: A Diversified Volume Visualization System. In Proc. of IEEE Visualization '94, 31-38.

BOSC, M., AND VIK, T. 2002. The ImLib3D website. http://imlib3d.sourceforge.net.

IBM CORP. 1991. Data Explorer Reference Manual. Armonk, NY, USA.

INSIGHT CONSORTIUM. 2002. The Insight Segmentation and Registration Toolkit (ITK) Website. http://www.itk.org.

KONSTANTINIDES, K., AND RASURE, J. 1994. The Khoros Software Development Environment for Image and Signal Processing. IEEE Transactions on Image Processing 3 (May), 243-252.

LAKARE, S., AND KAUFMAN, A. 2003. OpenVL - The Open Volume Library. In Proceedings of the Eurographics/IEEE TVCG Workshop on Volume Graphics, 69-78.

SCHROEDER, W. J., MARTIN, K. M., AND LORENSEN, W. E. 1996. The Design and Implementation of an Object-Oriented Toolkit for 3D Graphics and Visualization. In Proc. of IEEE Visualization '96, 93-100.

SCHROEDER, W., MARTIN, K., AND LORENSEN, B. 1998. The Visualization Toolkit, 2nd ed. Prentice Hall.

UPSON, C., FAULHABER, T., KAMINS, D., SCHLEGEL, D., LAIDLAW, D., VROOM, F., GURWITZ, R., AND VAN DAM, A. 1989. The Application Visualization System: A Computational Environment for Scientific Visualization. IEEE Computer Graphics and Applications 9, 4 (July), 30-42.


A Parallel Coordinates Interface for Exploratory Volume Visualization

Simeon Potts, Melanie Tory, Torsten Möller
Graphics, Usability, and Visualization (GrUVi) Lab
School of Computing Science, Simon Fraser University
sgpotts@sfu.ca, {mktory, torsten}@cs.sfu.ca

1. Introduction

Volume data exploration and analysis are important tasks in many visualization domains, including medical imaging and computational fluid flow simulation. However, these tasks can be quite challenging, because effective volume rendering interfaces have not been established. With traditional volume rendering interfaces, understanding the space of available parameters, keeping track of what you have done, and undoing operations to return to previous states are particularly difficult.

Parallel coordinates [Inselberg, 1990] is a graphing technique for multi-dimensional data points that is used for finding correlations and other interesting features in a set of observations. A parallel coordinates graph consists of one vertical axis per variable, with data points plotted as a series of line segments connecting the values of the individual components. We apply the parallel coordinates layout to the parameter space used for volumetric rendering, where the variables include camera orientation, transfer functions for colour and opacity, zoom and translation of the view, a volumetric data file, and a rendering technique. Many other parameters are possible, limited only by what the chosen set of rendering techniques supports; cutting plane position and orientation, light placement, and shading coefficients are a few additional examples. By organizing visualization parameters in a parallel coordinates layout, all parameters are explicitly represented, clearly illustrating the space of available options for volume rendering.
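For readers unfamiliar with the technique, the layout itself reduces to a small computation. The following sketch is ours, not the authors' code (which handles categorical parameter nodes rather than numeric values); it shows the classic numeric case:

```cpp
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// One observation becomes a polyline: its i-th vertex lies on axis i at a
// height proportional to the normalized value of variable i.
std::vector<Pt> Polyline(const std::vector<double>& values,
                         const std::vector<double>& mins,
                         const std::vector<double>& maxs,
                         double axisSpacing, double axisHeight)
{
  std::vector<Pt> line;
  for (std::size_t i = 0; i < values.size(); ++i) {
    double t = (values[i] - mins[i]) / (maxs[i] - mins[i]);  // 0..1
    line.push_back({ axisSpacing * static_cast<double>(i), t * axisHeight });
  }
  return line;  // consecutive points are joined by line segments
}
```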

Figure 1: Diagram of our parallel coordinates interface, illustrating how different rendering techniques might be compared.

2. The Parallel Coordinates Interface

We have developed an application that uses parallel coordinates as an interface for volume rendering. Our interface is illustrated in


Fig. 1. Instances of the various parameters are placed as consecutive nodes on the axis designated for them. Nodes can be connected together to form a set of rendering parameters by dragging the mouse across the axes, or optionally by clicking on the nodes individually. The resulting image is placed in the history view (the scrollable blue pane in Fig. 1) and visually connected to the parameters with a line (the coloured polylines in Fig. 1). Particularly interesting images can be copied to a favourites view (the scrollable green pane in Fig. 1). In the interface, additional windows are used for the parameter editors, a trash browser, larger high-resolution renderings, and spreadsheet-like tables of images similar to [Jankun-Kelly and Ma, 2001]. These additional windows are illustrated in Figures 2 and 3.


Figure 2: Some auxiliary windows used in our interface. 1. Colour and opacity transfer function editors. 2. Zoom and translation editors; the editor on the right has a checkbox selected to make it interactive, updating the image as the user drags the mouse to zoom and translate the view. 3. A trash browser window. 4. A larger rendering window that allows the user to save a PPM image.

We included features that we believe make parallel coordinates a powerful tool for visualization. The axes can be rearranged into any order, and nodes on the axes can be moved by the user into any desired position. Axes are equipped with scroll bars to handle large numbers of nodes and a trash container to manage discarded nodes. The history view allows a user to go back to any previous set of parameters, from which they can continue exploring. In the parallel coordinates view, a user can drag the polyline from one node to another on the same axis to create a parameter set identical to the original except for one parameter value, enabling them to see and compare the effect of the change on the rendered image. In the same way, a particular set of parameters can easily be applied to an entire set of data files, or a set of parameters can be rendered by several different renderers. Additionally, we have


included a version of the spreadsheet interface described by Jankun-Kelly and Ma [Jankun-Kelly and Ma, 2001], implementing the same basic features (however, we did not include a scripting language or session management). This tabular layout tool is an extension to our interface and is illustrated in Fig. 3. Finally, the set-up of nodes and the history can be saved and re-created in a later session. In this way, the interface can be used as a teaching tool, where an expert user can construct an ideal set of parameters for a particular application and/or record an efficient exploration of a data set for others to examine and learn from.

3. Results and Implications for Visualization

Figure 3: Diagram of a table. 1. Checkboxes determine the axes used for the rows and columns of the table, and a new table is created with a button press. 2. The table after clicking a render button, displaying all possible combinations of colour and opacity transfer functions from the row and column axes. 3. The user added a thumbnail of the table and the image from the first row, second column of the table to the history.

The parallel coordinates layout facilitates operations that could otherwise be quite difficult. Understanding the space of possible parameters requires only a glance at the parallel axes. Similarly, users can glance at a polyline to easily understand the set of parameters that produced a particular image. Keeping track of which combinations have been tried and going back to previous states is possible by scrolling through the history bar. Finally, the effects of parameters (e.g., different transfer functions, rendering techniques, or data sets) can be compared side by side in the history bar, the favourites bar, or within a table.

The parallel coordinates interface we have described here can be applied to scientific visualization tasks across many domains. We envision a broad impact of this tool on research that relies on the visualization of large or complex data sets. A user evaluation comparing the parallel coordinates interface to a spreadsheet-style interface and to a more traditional interface for volume rendering was carried out (for a detailed discussion of the parallel coordinates interface and the user study results, see [Potts et al., 2003]). Results of the evaluation suggest that, of the three interfaces, the parallel coordinates interface offered the best understanding of the parameter space available to volume rendering, and was generally the easiest interface to use for changing parameters. The tabular layout feature was considered a useful addition for image comparisons.

We would like to extend our interface to include a broader parameter space made possible through rendering tools such as VTK [Schroeder, 1998], or through renderers that can handle time-varying or multi-modal data (data that includes multiple time steps or overlapping measurements).

4. Acknowledgements

Funding for this project was provided by NSERC and the British Columbia Advanced Systems Institute (BC ASI). Renderings were produced by vuVolume, a rendering suite developed in the Graphics, Usability and Visualization (GrUVi) lab at Simon Fraser University. We would like to thank T.J. Jankun-Kelly and Kwan-Liu Ma for providing us with the source code for their spreadsheet interface, which provided the basis of our original discussions.

References

INSELBERG, A., DIMSDALE, B. 1990. Parallel coordinates: A tool for visualizing multidimensional geometry. Proc. IEEE Visualization, 361-378.

JANKUN-KELLY, T.J., MA, K. 2001. Visualization exploration and encapsulation via a spreadsheet-like interface. IEEE Transactions on Visualization and Computer Graphics, 7, 3, 275-287.

POTTS, S., TORY, M., MÖLLER, T. 2003. A Parallel Coordinates Interface for Exploratory Volume Visualization. Technical Report SFU-CMPT-08/03-TR2003-05, School of Computing Science, Simon Fraser University, August 2003.

SCHROEDER, W., MARTIN, K., LORENSEN, W. 1998. The Visualization Toolkit, 2nd ed. Prentice Hall PTR: New Jersey.


How ReV4D Helps Biologists Study the Effects of Anti-cancerous Drugs on Living Cells

Eric BITTAR 1, Aassif BENASSAROU 1, Laurent LUCAS 1, Emmanuel ELIAS 2, Pavel TCHELIDZE 2, Dominique PLOTON 2 and Marie-Françoise O’DONOHUE 2

1: LERI Laboratory, EA2618; 2: MéDIAN Unit, UMR CNRS 6142; University of Reims Champagne-Ardenne, France.

1. INTRODUCTION

We present a collaborative work between cellular biologists of the MéDIAN lab and computer science researchers of the LERI lab. Cell biologists of the MéDIAN group have applied recent developments in genetic engineering to obtain cell lines expressing fusion proteins composed of a protein of interest, UBF (Upstream Binding Factor), combined with an auto-fluorescent protein, GFP (Green Fluorescent Protein). These cells may thus be observed with a confocal microscope, leading to 4D images. The data in this study record the evolution of UBF proteins under the action of an anti-cancerous drug, actinomycin D. We show that, whereas the images are difficult to study with conventional visualization tools, the “Reconstruction and Visualization 4D” tool (ReV4D) developed in the LERI lab is helpful for modeling, analyzing, and visualizing the evolving phenomena occurring in the living cells. ReV4D combines a time-space deformable model with volume rendering methods.

2. MATERIAL AND METHODS

Human cancerous cells were grown on glass coverslips and transfected with the corresponding vectors. Twenty-four hours after transfection, a coverslip was mounted in a perfusion chamber equipped with a heat controller. Images were acquired with a Biorad 1024 confocal microscope equipped with a 63x, 1.4 NA plan apochromat objective. Acquisition conditions were optimized to acquire one z-series (containing 40 optical sections) every 5 minutes over long periods of time, each lasting 8 to 10 hours. In the present work, we studied the effects of a drug that inhibits rRNA transcription, actinomycin D, on the reorganization of nucleolar sites containing GFP-tagged UBF protein (UBF-GFP). After a 30-minute period without drug, the cell culture was perfused with a solution containing 50 ng/ml of drug for 2 hours. Then, the medium without drug was perfused for the next 5 hours and 30 minutes. As a result, 100 z-series were collected for each cell during one experiment.

3. 3D APPROACH

The classical approach in such a study is to reduce the dimensionality of the data by slicing (for example, considering a 3D volume for a fixed t value) or projection, in order to obtain 3D volumes. We have used this method: for each of the 100 z-stacks, the 40 optical sections of {x, y, z} data are combined with the Maximum Intensity Projection method to obtain one projection image along the z-axis. These 100 images are then stacked to create a new 3D volume. By applying a surface rendering method to this data with the Amira 3.0 software [5] (see Figure 1), the z-projections of


the different structures appear as cylinders, which may show the changes of given structures over time (for example, the fusion of two structures). The contours of the projections (x and y axes) are shown over time (z-axis). The transparent yellow cylinder corresponds to the nucleus of the cell. Within the latter, green and red cylinders show the evolution of UBF spots over time. It appears that the fusion of the two red spots occurs 2 hours after the beginning of the experiment. One limitation of this mode of visualization is that only structures not localized on the same z-axis can be observed. It thus appears necessary to investigate the true three-dimensional trajectories over time.

Figure 1: Surface visualization of a stack of 2D+t projections.
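As a hedged illustration of the projection step described above (ours, not the authors' code; the array sizes and 8-bit voxel type are placeholders), a Maximum Intensity Projection along z reduces to a running maximum per pixel:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Collapse an nx x ny x nz stack to one nx x ny image by keeping the
// brightest voxel along z.
std::vector<std::uint8_t> MipAlongZ(const std::vector<std::uint8_t>& stack,
                                    int nx, int ny, int nz)
{
  std::vector<std::uint8_t> out(static_cast<std::size_t>(nx) * ny, 0);
  for (int z = 0; z < nz; ++z)
    for (int y = 0; y < ny; ++y)
      for (int x = 0; x < nx; ++x) {
        std::size_t src = (static_cast<std::size_t>(z) * ny + y) * nx + x;
        std::size_t dst = static_cast<std::size_t>(y) * nx + x;
        out[dst] = std::max(out[dst], stack[src]);
      }
  return out;  // the 100 projections are then stacked along a new axis
}
```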

4. ReV4D

4.1 4D Deformable Surface Model

A temporal deformable model is well suited to extracting the evolution of the objects, while exploiting speed and shape coherence in the reconstruction process. Our model, introduced in 3D by Lachaud and Montanvert [3] and transposed to 4D by Benassarou et al. [1], is called the δ-snake. It owes its name to the δ parameter that governs its structure. It is a four-dimensional deformable surface that is able to change its topology over time. It is based on an oriented, constrained triangular mesh governed by distance rules that ensure the regularity of the mesh; if those rules are violated, specific operations are applied until regularity is recovered. The δ-snake evolves according to a usual energy-minimizing scheme. The energy combines an external term and an internal one. We calculate the internal energy as the composition of a surface tension term and an object-shape term, thus taking into account both deformations and rigid movements of each object. We define the external energy as a local attractor to the desired level in the volume. In this scheme, the 4D deformable model reconstructs the 4D data from one 3D volume to the next while maintaining space-time coherence, mimicking the evolution of the biological objects.

Figure 2: Space-time surface model and trajectories at five different times during actinomycin D treatment: (a) t = 1h, (b) t = 1h 40min, (c) t = 2h 5min, (d) t = 2h 10min, (e) t = 2h 25min.
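A single gradient-descent step for one mesh vertex under such a scheme might look like the following sketch. The weights, time step, and external attractor are illustrative placeholders, not the actual δ-snake implementation.

```cpp
// One update for a single mesh vertex: surface tension pulls it toward
// the centroid of its neighbors, and the external term pulls it toward
// the desired intensity level in the volume.
struct Vec3 { double x, y, z; };

Vec3 SnakeStep(const Vec3& v, const Vec3& neighborCentroid,
               const Vec3& towardLevel,   // external attractor direction
               double wInt, double wExt, double dt)
{
  return { v.x + dt * (wInt * (neighborCentroid.x - v.x) + wExt * towardLevel.x),
           v.y + dt * (wInt * (neighborCentroid.y - v.y) + wExt * towardLevel.y),
           v.z + dt * (wInt * (neighborCentroid.z - v.z) + wExt * towardLevel.z) };
}
```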

4.2 DVR and Symbolic Representations

As shown in Figure 3, to reinforce spatial understanding, the visualization is completed with additional information: the direct volume rendering of the data sliced in time [2][4], the numerical value of the volume of each nucleolus (represented in a small tag connected to the bounding box of the object), and the trajectories. Indeed, thanks to our space-time model, we compute the graph of the objects' evolution, which enables us to compute the trajectory of each reconstructed object and to maintain it through topological events.

The trajectories are enhanced by modifying the radius of the cylinders according to the volume of the objects. As also shown in Figure 3, time may be treated as a spatial dimension, leading for example to a {x, y, z+t} 3D representation. This mode has the advantage of better presenting the data variations over time. It can be compared with the projection presented in Section 3, but it has the advantage of not altering the data during processing. It produces a representation of the evolution of the object's center of mass.

Figure 3: Trajectories of the spots between 1h 20min and 2h 20min with time mapped to the z-axis. The radii represent the volume of the spots. The surface of the spots at t = 2h 20min is represented, as well as the volume rendering of the nucleolus.

4.3 Results

The extraction of the 4D surface takes about 5 minutes for the 100 volumes on a Pentium IV 1.2 GHz PC. Once this computation is finished, the user visualizes the information at an interactive rate with the help of an nVIDIA GeForce4 Ti 4200 graphics accelerator. The spots colored red in Figure 1 correspond to the spots at the bottom of Figures 2 and 3. We can see that the fusion of the spots is visible both in the surface and in the trajectory; it occurs between 2h 5min and 2h 10min from the start of the experiment (Figures 2-b and 2-c).

5. CONCLUSIONS AND PERSPECTIVES

ReV4D brings a space-time approach that allows describing, understanding, and showing the complexity of the phenomena that take place within living cells during a drug treatment. Significant events like spot fusion are identified and localized, both in time and in space. ReV4D computes a graph of the evolution of the connected components, which is represented by the trajectories. The visualization runs at an interactive rate and combines surface and volume rendering, as well as quantitative information. We are currently generalizing the method to take into account the global displacements of the nucleoli.

References

[1] A. Benassarou, J. De Freitas-Caires, E. Bittar and L. Lucas. An Integrated Framework to Analyze and Visualize the Evolution of Multiple Topology Changing Objects in 4D Image Datasets. Proc. Vision Modeling and Visualization 2002, Erlangen, Germany, pp. 147-154, 2002.

[2] J. Kniss, G. Kindlmann, and C. Hansen. Interactive volume rendering using multi-dimensional transfer functions and direct manipulation widgets. In IEEE Visualization Proceedings 2001, pages 255-262, 2001.

[3] J-O. Lachaud and A. Montanvert. Deformable Meshes with Automated Topology Changes for Coarse-to-fine 3D Surface Extraction. Medical Image Analysis, 3(2):187-207, 1999.

[4] C. Rezk-Salama, K. Engel, M. Bauer, G. Greiner and T. Ertl. Interactive volume rendering on standard PC graphics hardware using multi-textures and multi-stage rasterization. Siggraph & Eurographics Workshop on Graphics Hardware 2000, 2000.

[5] http://www.tgs.com


Interactive poster: visualizing the interaction between two proteins

Nicolas Ray, Xavier Cavin
Inria Lorraine / Isa, France

Bernard Maigret
CNRS / LCTN, France

Introduction

Protein docking is a fundamental biological process that links two proteins in order to change their properties. The link is defined by a set of forces between two large areas of the protein boundaries. These forces can be classified into two categories:

• The Van der Waals (VdW) forces, corresponding to the geometrical matching of the molecular surfaces [1].

• Other forces, including hydrogen bonds, induction, hydrophobic effects, dielectric effects, etc.

Two docked proteins are very close to each other due to the VdW forces, which makes the phenomenon difficult to understand using classical molecular visualization. We present a way to focus on the most interesting area: the interface between the proteins. Visualizing the interface is useful both for understanding the process via co-crystallized proteins and for estimating the quality of a docking simulation result. The interface may be defined by a surface that separates the two proteins. The geometry of the surface is induced by the VdW forces, while the other forces can be represented by attributes mapped onto the surface. We present a very fast algorithm that extracts the interface surface.

Moreover, the result of a rigid docking simulation can be improved by exploiting the flexibility of the residues. We show how the interface surface geometry and attributes can be updated in real time when the user interactively moves the residues. In this way, expert knowledge can be intuitively introduced into the process to enhance the quality of the docking.

Interface extraction

The interface can be defined as the zero isosurface of a “distance to molecule” function defined as follows:

dist(X) = dist_to_protein_A(X) - dist_to_protein_B(X)

While classical approaches extract this iso-surface using a greedy algorithm [2], we propose to speed up the process using a Delaunay tetrahedrization. The Delaunay tetrahedrization is computed by CGAL [3] using all atoms as vertices (see Figure 1). The interface is then extracted using a marching tetrahedra algorithm (see Figure 2).
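The per-edge step of such a marching tetrahedra pass can be sketched as follows (an illustration of the technique, not the authors' code): an interface vertex is placed at the linear zero crossing of dist(X) along a Delaunay edge whose endpoints belong to different proteins.

```cpp
// Place an interface vertex on a Delaunay edge (p, q): fp = dist(p) and
// fq = dist(q) have opposite signs, so the zero crossing is found by
// linear interpolation along the edge.
struct Vec3 { double x, y, z; };

Vec3 ZeroCrossing(const Vec3& p, double fp, const Vec3& q, double fq)
{
  double t = fp / (fp - fq);  // lies in (0, 1) when the signs differ
  return { p.x + t * (q.x - p.x),
           p.y + t * (q.y - p.y),
           p.z + t * (q.z - p.z) };
}
```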

As illustrated in Figure 3, slicing the surface along the tetrahedrization edges makes it possible to interactively move the interface between the molecular surfaces of the proteins.

Mapping attributes

Several attributes characterizing the potential interactions, both qualitatively and quantitatively, can be mapped onto the interface. Recall that each vertex of the interface lies on an edge of the tetrahedrization, joining a pair of atoms belonging to each protein. The Delaunay tetrahedrization ensures that these atoms are the closest ones to the interface vertex. This property makes it very easy to extract local information about docking possibilities around each vertex of the interface.

In our experiments (see Figure 4), one quantitative attribute and one qualitative attribute have been tested:

• The distance to the proteins.

• The kind of potential residue interaction: hydrogen bond, hydrophobic contact, Pi...X, Pi...Pi, same charge, and opposite charge, represented by symbolic colors.

As in the MolSurfer application, electrostatic potential and hydrophobicity can also be used as attributes.

Interactive modifications

The interface extraction presented above is very fast (about 1 second), but not fast enough for interactive surface extraction. The most time-consuming step is the tetrahedrization algorithm, whose complexity is O(n log n). Fortunately, it is possible to dynamically remove and insert vertices in such a tetrahedrization. The interface can therefore be updated in real time when a small part of the protein (such as a residue) is moved: at each frame, each vertex of the residue is removed from the tetrahedrization and re-inserted at its new position, and the new interface is then extracted. The whole process takes less than 0.1 second.
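Since the tetrahedrization is computed with CGAL, whose Delaunay_triangulation_3 supports dynamic vertex insertion and removal, the update loop could look roughly like the sketch below; the surrounding data structures (atom handles, position list) are placeholders.

```cpp
#include <CGAL/Exact_predicates_inexact_constructions_kernel.h>
#include <CGAL/Delaunay_triangulation_3.h>
#include <cstddef>
#include <vector>

using K  = CGAL::Exact_predicates_inexact_constructions_kernel;
using DT = CGAL::Delaunay_triangulation_3<K>;

// When a residue moves, remove its atoms from the tetrahedrization and
// re-insert them at their new positions; the interface is then
// re-extracted on the affected cells.
void MoveResidue(DT& dt, std::vector<DT::Vertex_handle>& atoms,
                 const std::vector<K::Point_3>& newPositions)
{
  for (std::size_t i = 0; i < atoms.size(); ++i) {
    dt.remove(atoms[i]);
    atoms[i] = dt.insert(newPositions[i]);
  }
}
```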


Acknowledgments

This work was supported by the ARC Docking of Inria. Thanks to CGAL for the Delaunay tetrahedrization code.

References

[1] B. Lee and F. Richards. The interpretation of protein structures: Estimation of static accessibility. J. of Molecular Biology, 55:379-400, 1971.

[2] R. R. Gabdoulline and R. C. Wade. Analytically defined surfaces to analyze molecular interaction properties. J. Mol. Graph., 14:341-353, 1996.

[3] A. Fabri, G.-J. Giezeman, L. Kettner, S. Schirra, S. Schönherr. On the design of CGAL, the computational geometry algorithms library. Software - Practice and Experience, Vol. 30, 1167-1202, 2000.

[4] http://www.embl-heidelberg.de/~gabdoull/ads/imap/

Figure 1: Delaunay tetrahedrization. Figure 2: Interface extraction.

Figure 3: Slicing the interface. Left: the interface is snapped to the first protein. Middle: the interface is equidistant to both protein surfaces. Right: the interface is snapped to the second protein.

Figure 4: Mapping attributes. Left: interface. Middle: distance map. Right: kind of residue interaction.



Photorealistic Image Based Objects from Uncalibrated Images

Miguel Sainz* (Computer Graphics Lab, Information and Computer Science, University of California, Irvine)
Renato Pajarola† (Computer Graphics Lab, Information and Computer Science, University of California, Irvine)
Antonio Susin‡ (Dynamic Simulation Lab, Applied Mathematics Dept., Polytechnical University of Catalonia)

*e-mail: msainz@ics.uci.edu   †e-mail: pajarola@acm.org   ‡e-mail: toni.susin@upc.es

Abstract

In this paper we present a complete pipeline for image based modeling of objects using a camcorder or digital camera. Our system takes an uncalibrated sequence of images recorded around a scene, automatically selects a subset of keyframes, and then recovers the underlying 3D structure and camera path. The following step is a volumetric scene reconstruction performed using a hardware accelerated voxel carving approach. From the recovered voxelized volume we obtain the depth images for each of the reference views and triangulate them following a restricted quadtree meshing scheme. During rendering, we use a highly optimized approach to combine the available information from multiple overlapping reference images, generating a full 3D photo-realistic reconstruction. The final reconstructed models can be rendered very efficiently in real time, making them well suited to enriching the content of large virtual environments.

CR Categories: I.3.5 [Computer Graphics]: Computational Geometry and Object Modeling; I.4.8 [Image Processing and Computer Vision]: Scene Analysis.

Keywords: Volumetric reconstruction, voxel carving, hardware acceleration, overlapping textures, image-based rendering, multiresolution modeling, level-of-detail, hardware accelerated blending.

1 Introduction

In this paper we present a complete pipeline for extracting a 3D volumetric representation of an object from a set of uncalibrated images taken with a digital camera or handheld camcorder. This reconstruction is then processed into a projective texture-mapped depth mesh model description, and we provide an efficient rendering algorithm that obtains high quality images in real time.

Since the very beginning of computer technology, the possibility of reproducing the real world for simulation purposes has been a primary goal of researchers. The growth of computer graphics technology has generated an important demand for more complex and realistic 3D content. However, even though the supporting tools for complex 3D model creation are more powerful (but also more expensive and difficult to use), obtaining realistic models is still difficult and time consuming.

In recent years, Image Based Modeling and Rendering techniques have demonstrated the advantage of using real image data to greatly improve rendering quality. New rendering algorithms have been presented that reach photo-realistic quality at interactive speeds when rendering 3D models based on digital images of physical objects and some geometric information (i.e. a geometric proxy). While these methods have emphasized rendering speed and quality, they generally require extensive preprocessing in order to obtain well calibrated images and geometric approximations of


the target objects. Moreover, most of these algorithms rely heavily on user interaction for the camera calibration and image registration steps, or require expensive equipment such as calibrated gantries and 3D scanners.

2 Pipeline Description

Our goal is to extract 3D geometry and appearance information of the target objects in the scene, based on given camera locations and their respective images. Different approaches, such as photogrammetry, stereo vision, and contour and/or shadow analysis techniques, work with similar assumptions. Figure 1 illustrates the block diagram of the proposed pipeline for image based 3D model reconstruction.

Figure 1: Image Based Modeling pipeline.<br />

The complete pipeline starts with an initial calibration process of the images themselves, in order to recover the camera internal and external parameters. The next step in the pipeline is a scene reconstruction to obtain a complete model representation that can be used to render novel views of the object. Depending on the chosen representation for the model, solutions ranging from point-based approaches to complete 3D textured meshes are available in the literature ([Pollefeys 1999], [Sainz 2003]). We propose a novel model representation that consists of a set of textured depth meshes obtained from a voxelized reconstruction and that uses the images as overlapping texture maps. During rendering, our approach efficiently combines all the images as view-dependent projected texture maps, obtaining photorealistic renderings at interactive speeds.

2.1 Camera Calibration

The first step of the pipeline consists of recovering the 3D geometry of a scene from the 2D projections of measurements obtained from the digital images of multiple reference views, taking into account the motion of the camera. The proposed calibration approach [Sainz et al. 2003] is based on a divide-and-conquer strategy that automatically fragments the original sequence into subsequences; in each of them, a set of key-frames is selected and calibrated up to a scale factor, recovering both the camera parameters and the structure of the scene. When the different subsequences have been successfully calibrated, a merging process groups them into a single set of cameras and reconstructed features of the scene. A final nonlinear optimization is performed to reduce the overall 2D re-projection error.

2.2 Volumetric Scene Reconstruction

In order to reconstruct the volume occupied by the object in the scene, we have improved the approach presented in [Sainz et al. 2002], which is based on carving a bounding volume using a color-similarity criterion. The algorithm is designed to use hardware-accelerated features of the video card. Moreover, the data structures have been highly optimized in order to minimize run-time memory usage. Additional techniques such as hardware projective texture mapping and shadow maps are used to avoid redundant calculations.
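The carving criterion itself is compact: a voxel survives only if the reference images that see it observe sufficiently similar colors there. The following minimal CPU sketch illustrates one plausible form of such a color-similarity test; the types and the toy per-view color lookup are our own stand-ins, and this is not the hardware-accelerated implementation of [Sainz et al. 2002], which also handles visibility with shadow maps.

    #include <cmath>
    #include <cstdio>
    #include <functional>
    #include <vector>

    // Minimal CPU sketch of a color-similarity carving test. Everything here
    // is illustrative: a real system projects each voxel into the calibrated
    // images and resolves visibility before comparing colors.
    struct Color { float r, g, b; };

    // Each "view" is modeled as a function from a voxel index to the color
    // its image observes there (a stand-in for projection + texture lookup).
    using View = std::function<Color(int)>;

    // A voxel survives carving if every pair of observed colors is within a
    // Euclidean RGB distance threshold (one possible similarity criterion).
    bool photoConsistent(int voxel, const std::vector<View>& views, float thresh) {
        for (size_t i = 0; i < views.size(); ++i)
            for (size_t j = i + 1; j < views.size(); ++j) {
                Color a = views[i](voxel), b = views[j](voxel);
                float d = std::sqrt((a.r - b.r) * (a.r - b.r) +
                                    (a.g - b.g) * (a.g - b.g) +
                                    (a.b - b.b) * (a.b - b.b));
                if (d > thresh) return false;  // views disagree: carve it away
            }
        return true;
    }

    int main() {
        // Two toy views that agree on voxel 0 and disagree on voxel 1.
        std::vector<View> views = {
            [](int v) { return v == 0 ? Color{1, 0, 0} : Color{0, 1, 0}; },
            [](int v) { return v == 0 ? Color{1, 0, 0} : Color{0, 0, 1}; },
        };
        for (int v = 0; v < 2; ++v)
            std::printf("voxel %d: %s\n", v,
                        photoConsistent(v, views, 0.1f) ? "kept" : "carved");
    }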

2.3 Object Modeling

The final representation of the reconstructed object is based on an efficient depth-image representation and warping technique called DMesh ([Pajarola et al. 2003]), which is based on a piece-wise linear approximation of the reference depth-images as textured and simplified triangle meshes. During rendering, the algorithm selects the reference views closest to the novel viewpoint, renders them, and combines the result using a per-pixel weighted sum of the respective contributions, obtaining the final colored image. This weighted sum and the corresponding final normalization are achieved in real time using the programmability of current GPUs.
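The per-pixel combination can be pictured in software as follows, assuming each selected reference view has already been rendered into a buffer of colors with per-pixel weights; the real system performs the same weighted sum and normalization on the GPU, and the weighting values used here are only stand-ins.

    #include <cstdio>
    #include <vector>

    // Software sketch of DMesh-style per-pixel blending: each selected
    // reference view contributes a color and a weight at every pixel, and
    // the final image is the normalized weighted sum.
    struct Pixel { float r, g, b, w; };  // view color and its blending weight

    std::vector<Pixel> blendViews(const std::vector<std::vector<Pixel>>& views,
                                  size_t numPixels) {
        std::vector<Pixel> out(numPixels, Pixel{0, 0, 0, 0});
        for (const auto& view : views)
            for (size_t p = 0; p < numPixels; ++p) {
                out[p].r += view[p].r * view[p].w;  // accumulate weighted color
                out[p].g += view[p].g * view[p].w;
                out[p].b += view[p].b * view[p].w;
                out[p].w += view[p].w;              // accumulate total weight
            }
        for (auto& px : out)                        // final normalization pass
            if (px.w > 0) { px.r /= px.w; px.g /= px.w; px.b /= px.w; }
        return out;
    }

    int main() {
        std::vector<std::vector<Pixel>> views = {
            {{1, 0, 0, 0.75f}},  // closest reference view, weight 0.75
            {{0, 0, 1, 0.25f}},  // farther view, weight 0.25
        };
        Pixel p = blendViews(views, 1)[0];
        std::printf("blended pixel: %.2f %.2f %.2f\n", p.r, p.g, p.b);
    }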

2.4 Results

We present the monster dataset (see Figure 2), which consists of a set of 16 still images of 1024x1024 pixels each, taken of an object on a turntable. A manual tracking process of the fiducials on the surface of the object is performed to obtain the proper calibration of the images using our approach.

The volumetric reconstruction starts with an initial volume of 250x225x170 (9,562,500 voxels) and, using five manually selected frames from the initial set, produces in 43 iterations and 3.5 minutes a final reconstruction of 1,349,548 voxels (14% of the initial volume). The final DMesh model presents an average of 36,000 faces per reference view, and the views are rendered and combined at 400 frames per second.

References

PAJAROLA, R., SAINZ, M., AND MENG, Y. 2003. DMesh: Fast depth-image meshing and warping. International Journal of Image and Graphics (IJIG), (to appear).

POLLEFEYS, M. 1999. Self-calibration and metric 3D reconstruction from uncalibrated image sequences. PhD thesis, K.U. Leuven.

SAINZ, M., BAGHERZADEH, N., AND SUSIN, A. 2002. Hardware accelerated voxel carving. In Proceedings of the 1st Ibero-American Symposium in Computer Graphics, 289–297.

SAINZ, M., BAGHERZADEH, N., AND SUSIN, A. 2003. Camera calibration of long image sequences with the presence of occlusions. In Proceedings of the IEEE International Conference on Image Processing.

SAINZ, M. 2003. 3D Modeling from Images and Video Streams. PhD thesis, University of California, Irvine.


Figure 2: From top to bottom: the original 16 frames of the monster dataset; the calibrated camera path; the reconstructed volume without coloring; and a novel rendered view of the object using the DMesh approach.


DStrips: Dynamic Triangle Strips for Real-Time Mesh Simplification and Rendering

Michael Shafae and Renato Pajarola
Computer Graphics Lab, School of Information & Computer Science
University of California, Irvine
mshafae@ics.uci.edu, pajarola@acm.org

1 Motivation

Multiresolution modelling techniques are important to cope with the increasingly complex polygonal models available today, such as high-resolution isosurfaces, large terrains, and complex digitized shapes [10]. Large triangle meshes are difficult to render at interactive frame rates due to the large number of vertices to be processed by the graphics hardware. Level-of-detail (LOD) based visualization techniques [7] allow rendering the same object using triangle meshes of variable complexity. Thus, the number of processed vertices is adjusted according to the object's relative position and importance in the rendered scene. Many mesh simplification and multiresolution triangulation methods [5], [8], [4], [11], [12] have been developed to create different LODs, sequences of LOD-meshes, and hierarchical triangulations for LOD-based rendering. Although reducing the amount of geometry sent to the graphics pipeline yields a performance gain, a further optimization can be achieved by the use of optimized rendering primitives, such as triangle strips.

Triangle strips have been used extensively for static mesh representations since their widespread availability through tools such as the classic tomesh.c program [1], Stripe [6], and the more recent NVIDIA NVTriStrip tools [3], [2]. However, such triangle strip representations and generation techniques are not practical for a multiresolution triangle mesh. The problem of representing the stripped mesh and maintaining the coherency of the triangle strips is compounded when used with LOD-meshes. In view-dependent meshing methods the underlying mesh is in a constant state of flux between view positions. This poses a significant hurdle for current triangle strip generation techniques for two core reasons. First, triangle strip generation techniques tend to require too much CPU time and memory to be practical for interactive view-dependent triangle mesh visualization. Second, most triangle strip generation techniques focus on producing optimized strips, not on managing the strips in light of continuous changes to the mesh; that is, for each new view position a new stripification must be computed. Our approach, on the other hand, manages triangle strips in such a way that the entire stripification never has to be reconstructed. Instead, it grows triangle strips, shrinks triangle strips, or recomputes triangle strips only for small patches when necessary.

Figure 1. Example of dynamically generated triangle strips of a view-dependently simplified LOD-mesh. Individual triangle strips are pseudo-colored for better distinction (15,548 triangles represented by 3,432 triangle strips).
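The grow/shrink/merge management just described implies only light bookkeeping per mesh update. The sketch below illustrates the idea with strips stored as sequences of face ids that can grow or shrink at either end and be spliced onto a neighbor; this is a deliberate simplification of our own, since the actual DStrips structures operate on a half-edge mesh representation.

    #include <cstdio>
    #include <deque>

    // Sketch of dynamic strip management: a strip that can grow or shrink at
    // either end as the LOD-mesh changes, so the stripification never has to
    // be rebuilt from scratch. Faces are identified by integer ids here.
    struct Strip {
        std::deque<int> faces;  // ordered face ids forming the triangle strip

        void growFront(int f) { faces.push_front(f); }
        void growBack(int f)  { faces.push_back(f); }

        // Shrink: drop faces removed by an edge collapse from either end.
        void shrinkFront() { if (!faces.empty()) faces.pop_front(); }
        void shrinkBack()  { if (!faces.empty()) faces.pop_back(); }

        // Merge a neighboring strip onto this one's tail.
        void merge(Strip& other) {
            for (int f : other.faces) faces.push_back(f);
            other.faces.clear();
        }
    };

    int main() {
        Strip a, b;
        a.growBack(0); a.growBack(1);  // strips grow as faces appear
        b.growBack(2); b.growBack(3);
        a.merge(b);                    // two strips join into one
        a.shrinkFront();               // a face vanished under an edge collapse
        std::printf("strip length: %zu\n", a.faces.size());  // prints 3
    }

Removals in a strip interior are where partial re-stripping of a small patch, rather than an endpoint operation, would be required.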

In this short paper and poster, DStrips is presented. DStrips is a simple yet effective algorithm and data structure for real-time triangle strip generation, representation, and rendering of LOD-meshes. The implementation presented in this paper is built on a LOD-mesh using progressive edge collapse and vertex split operations [9] based on a half-edge representation of the triangle mesh connectivity [13]. However, DStrips is not tightly coupled to any one particular LOD-mesh. DStrips is easily adapted to any LOD-mesh so long as the mesh provides a mapping from an edge to its associated faces and vice versa, and the edges of a face maintain a consistent ordering and orientation in the LOD-mesh. Figure 1 presents an example screenshot of pseudo-colored triangle strips of a view-dependently simplified LOD-mesh that were generated by DStrips.

2 Innovative Aspects

Unlike other LOD-meshes with some form of triangle stripping support, DStrips does not merely shorten the initially computed triangle strips. Rather, DStrips dynamically shrinks, grows, merges, and partially recomputes strips. Table 1 briefly compares other approaches that couple triangle strips with an LOD-mesh.

Name                       Algorithm    Stripification  Strip Management
DStrips                    Online       Dynamic         Shorten, Grow, Merge, Partial Re-Strip
Tunneling                  Online       Dynamic         Repair & Merge (Tunneling Operation)
Stripe                     Offline      Static          Not Applicable
Skip Strips (Stripe)       Pre-Process  Static          Resize Pre-Computed Strips
Multiresolution △ Strips   Pre-Process  Static          Resize Pre-Computed Strips

Table 1. A comparison of triangle stripification techniques. Note that a clear distinction can be drawn between the techniques which dynamically manage the triangle strips and those which shorten pre-computed triangle strips.

To illustrate the novelty of our approach, experiments were performed on a Sun Microsystems Ultra 60 workstation with dual 450 MHz UltraSPARC II CPUs and an Expert3D graphics card. Table 2 shows the sizes of the different models we used for testing DStrips.

Table 2 also shows the average number of faces, LOD updates, and triangle strips encountered each frame. The time to perform the edge collapse and vertex split updates each frame is also recorded, since it is independent of the rendering mode. The average number of triangle strips per frame is given for the three stripping configurations: adjacency stripping, greedy stripping allowing swap operations, and greedy stripping without swap operations (strictly left-right). One can see from Table 2 that adjacency stripping generates fewer strips than greedy stripping, in particular if strict left-right alternation is enforced.

Model   # Faces   # Vertices   # △ Drawn   # Updates   Update Time   # Strips (ADJ / GS / GNS)
happy   100,000   49,794       54,784      358         3 ms          7,006 / 8,127 / 12,143
horse    96,966   48,485       39,584      519         4 ms          5,008 / 5,428 / 7,808
phone   165,963   83,044       60,291      498         5 ms          7,272 / 7,904 / 11,382

Table 2. The model's name, total number of triangle faces, total number of vertices, per-frame average numbers of rendered triangles, LOD-mesh updates, and time to perform mesh updates. The average number of triangle strips is divided into adjacency stripping (ADJ) as well as greedy stripping with swap (GS) and without swap operations (GNS).

3 Conclusion

DStrips is a simple and efficient method to dynamically generate triangle strips for real-time level-of-detail (LOD) meshing and rendering. Built on top of a widely used LOD-mesh framework based on a half-edge hierarchical multiresolution triangulation, DStrips provides efficient data structures and algorithms to compute a mesh stripification and to manage it dynamically through strip grow and shrink operations, strip-savvy mesh updates, and partial re-stripping of the LOD-mesh.

References

[1] K. Akeley, P. Haeberli, and D. Burns. The tomesh.c program. Technical Report SGI Developer's Toolbox CD, Silicon Graphics, 1990.
[2] Curtis Beeson and Joe Demer. NVTriStrip v1.1. Software available via Internet web site, November 2000. http://developer.nvidia.com/view.asp?IO=nvtristrip v1 1.
[3] Curtis Beeson and Joe Demer. NVTriStrip, library version. Software available via Internet web site, January 2002. http://developer.nvidia.com/view.asp?IO=nvtristrip library.
[4] Paolo Cignoni, Claudio Montani, and Roberto Scopigno. A comparison of mesh simplification algorithms. Computers & Graphics, 22(1):37–54, 1998.
[5] Leila De Floriani and Enrico Puppo. Hierarchical triangulation for multiresolution surface description. ACM Transactions on Graphics, 14(4):363–411, 1995.
[6] F. Evans, S. Skiena, and A. Varshney. Optimizing triangle strips for fast rendering. In Proceedings IEEE Visualization 96, pages 319–326. Computer Society Press, 1996.
[7] T. Funkhouser and C. Sequin. Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments. In Proceedings SIGGRAPH 93, pages 247–254. ACM SIGGRAPH, 1993.
[8] Paul S. Heckbert and Michael Garland. Survey of polygonal surface simplification algorithms. SIGGRAPH 97 Course Notes 25, 1997.
[9] Hugues Hoppe. Progressive meshes. In Proceedings SIGGRAPH 96, pages 99–108. ACM SIGGRAPH, 1996.
[10] Marc Levoy, Kari Pulli, Brian Curless, Szymon Rusinkiewicz, David Koller, Lucas Pereira, Matt Ginzton, Sean Anderson, James Davis, Jeremy Ginsberg, Jonathan Shade, and Duane Fulk. The digital Michelangelo project: 3D scanning of large statues. In Proceedings SIGGRAPH 2000, pages 131–144. ACM SIGGRAPH, 2000.
[11] Peter Lindstrom and Greg Turk. Evaluation of memoryless simplification. IEEE Transactions on Visualization and Computer Graphics, 5(2):98–115, April–June 1999.
[12] David P. Luebke. A developer's survey of polygonal simplification algorithms. IEEE Computer Graphics & Applications, 21(3):24–35, May/June 2001.
[13] Kevin Weiler. Edge-based data structures for solid modeling in curved-surface environments. IEEE Computer Graphics & Applications, 5(1):21–40, January 1985.


Interactive Visualization of Time-Resolved Contrast-Enhanced Magnetic Resonance Angiography (CE-MRA)

Ethan Brodsky, Electrical and Computer Engineering
Walter Block, Biomedical Engineering and Medical Physics
University of Wisconsin-Madison, Madison, Wisconsin 53706
e-mail: {ethan/block}@mr.radiology.wisc.edu

Abstract

Time-resolved Magnetic Resonance Angiography (MRA) provides time-varying 3D datasets, or 4D data, that demonstrate the vascular anatomy and general flow patterns within the body. A single exam can generate 15-30 time frames of 256x256x256 images. Current commercial PACS workstations are expensive and do not provide the visualization tools necessary for interpreting this data. They are also ill suited to the types of image analysis often required for research applications. We introduce an interactive OpenGL-based tool for visualizing these datasets. It offers maximum intensity projections (MIPs) with arbitrary cut-planes and viewing angles. It allows rapid switching between time frames or datasets and rendering of multiple datasets simultaneously in different colors.

CR Categories: I.3.3 [Computer Graphics]: Picture/Image Generation—Viewing algorithms; I.3.4: Graphics Utilities—Application packages, graphics packages; I.3.6: Methodology and Techniques—Interaction techniques

Keywords: 4D visualization, volume rendering, medical imaging, MIP, angiography, MRI, MRA

1 Introduction

Magnetic Resonance Angiography (MRA) has been limited by long scan times. While x-ray fluoroscopy and MRA both work by injecting a vascular contrast agent and imaging its passage over time, they have very different properties. X-ray fluoroscopy is capable of 2D imaging at very high frame rates, while the clinically accepted technique for MRA (3DFT acquisition) produces a single 3D image at a given time after injection. The high frame rate in x-ray fluoroscopy provides added diagnostic confidence, as the radiologist can watch injected contrast flow from the arteries to the veins. Having only a single image limits the effectiveness of MRA for complex flow patterns (such as dissections or retrograde filling). It also makes it critical to time the scan to get good arterial signal without venous contamination, which is difficult for patients with delayed filling (as is the case with aortic aneurysm or stenosis).

A 3D undersampled projection-reconstruction acquisition, VIPR, allows for large speedup factors and makes time-resolved 3D MRA possible [1]. VIPR exams have high isotropic spatial resolution and good temporal resolution over a large field-of-view. A 30-60 second scan generates a dataset with spatial resolution of 256x256x256 or greater and 15-30 time frames. At each point, the scan characterizes the concentration of the contrast agent in blood or tissue. Time-resolved exams eliminate scan timing concerns, ease diagnosis for complex flow patterns, and produce useful images at several stages of arterial filling.

Unfortunately, the visualization tools available from MRI scanner manufacturers and PACS vendors are ill suited to dealing with these time-resolved datasets. The commercial visualization tools offer radiologists the capability of doing Multi-Planar Volume Reformats (MPVR) to analyze 3D volumes, but are designed to work with a single dataset from a single scan. Additionally, the commercial tools run on expensive, specialized workstations. These workstations are centrally located in radiology reading rooms and their availability is limited.

There is a need for fast, simple visualization tools that are adept at working with multiple datasets and operate on inexpensive desktop workstations running Linux/X or Microsoft Windows. The tools must do maximum intensity projections (MIPs) through a volume, with arbitrary cut-planes and viewpoints. They must provide the ability to rapidly switch between time frames (maintaining the same viewpoint and cut-planes) for looking at time-varying properties, and to switch between entire datasets for comparing various acquisition and reconstruction techniques under development. They must have the ability to render multiple datasets simultaneously in different colors, with the capability of doing simple per-voxel arithmetic operations. Finally, they must be able to generate still images and movies.

We have developed an OpenGL-based application to satisfy these requirements. It is written in C and uses GLUT for all user-interface interactions, so it is easily portable between Linux/X and Microsoft Windows.

2 Methods

CE-MRA exams are usually interpreted using MIPs through a volume of interest. The MIP operation enhances the visibility of high-contrast vessels over the background tissue.

The tool uses a four-pane user interface (Figure 1), with three small windows showing orthogonal “ortho-navigation” slices (axial, coronal, and sagittal) to guide the user, and a single large window showing the 3D MIP.

The MIP is constructed in video hardware using the GL_MAX blending operation [2]. The volume is represented to the video card as a collection of 2D textures of slices along three orthogonal planes. The entire 3D volume is rendered using the slice set nearest orthogonal to the viewpoint. It is also possible to use a single 3D texture to represent the entire volume and render using slices orthogonal to the viewpoint, but large 3D textures are not supported on all video cards.
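In OpenGL terms, the MIP compositing pass reduces to drawing the textured slice polygons with the maximum blend equation. The fragment below is a minimal sketch of that pass against the standard GL API, assuming a current GL context, GL_MAX support (EXT_blend_minmax, promoted in OpenGL 1.2), and slice textures uploaded beforehand; it is an illustration, not the tool's source code.

    #include <GL/gl.h>

    // Sketch of the MIP compositing pass: with GL_MAX as the blend equation,
    // each textured slice polygon contributes max(src, dst) per pixel, so the
    // framebuffer accumulates the maximum intensity along the view direction.
    void renderMIP(const GLuint* sliceTex, int numSlices) {
        glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
        glClear(GL_COLOR_BUFFER_BIT);
        glEnable(GL_TEXTURE_2D);
        glEnable(GL_BLEND);
        glBlendEquation(GL_MAX);          // keep the brightest sample per pixel
        float denom = (numSlices > 1) ? float(numSlices - 1) : 1.0f;
        for (int i = 0; i < numSlices; ++i) {
            glBindTexture(GL_TEXTURE_2D, sliceTex[i]);
            float z = -1.0f + 2.0f * i / denom;  // slice depth within the volume
            glBegin(GL_QUADS);
            glTexCoord2f(0, 0); glVertex3f(-1, -1, z);
            glTexCoord2f(1, 0); glVertex3f( 1, -1, z);
            glTexCoord2f(1, 1); glVertex3f( 1,  1, z);
            glTexCoord2f(0, 1); glVertex3f(-1,  1, z);
            glEnd();
        }
        glBlendEquation(GL_FUNC_ADD);     // restore the default blend equation
    }

Because the max operation is order-independent, the slices can be drawn in any order, which is part of what makes per-axis 2D texture stacks practical here.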

The volume to be rendered is bounded using a set of user-controlled cut-planes, which are specified graphically on one of the ortho-navigation windows. The viewpoint can be locked to remain perpendicular to the cut-planes, or can be unlocked to allow viewing from arbitrary angles.

The software is capable of displaying either a single dataset or multiple datasets simultaneously in different colors. Multiple colors can be used to show arteries in red and venous vasculature in blue, or to show the false lumen of a dissected aorta in a different color, easing the assessment of whether vessels come off the true or false lumen.

3 Hardware and Performance

The tool can run on a single x86-based workstation with a consumer-level video card. Current work has been done on a dual-processor P3-800 workstation with 1 GB of memory and an ASUS V7700 GeForce2 GTS video card with 64 MB of video memory.

A single 256x256x256 volume requires 48 MB of memory, as it must be stored three times, with slices along each axis (256x256x256 voxels at one byte per voxel is 16 MB per copy). However, only one third of this data is used to render a frame, since only a single set of slices is used at any one time. Swapping new texture sets into video memory (necessary when rotating the viewpoint through certain angles) is a relatively fast operation that takes less than ¼ second. With adequate video memory to store two full texture sets, the delay could potentially be eliminated in most cases by anticipating texture switches and prefetching the necessary set.
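Selecting which of the three slice stacks to draw, and which to prefetch, reduces to finding the volume axis most nearly parallel to the view direction. The sketch below is our own illustration of that selection, not the tool's code; the runner-up axis is the natural prefetch candidate mentioned above.

    #include <cmath>
    #include <cstdio>

    // Sketch of slice-set selection for axis-aligned 2D-texture volume
    // rendering: draw the stack whose axis is most parallel to the view
    // direction; the runner-up stack is the natural prefetch candidate.
    int bestSliceAxis(float vx, float vy, float vz, int* secondBest = nullptr) {
        float mag[3] = { std::fabs(vx), std::fabs(vy), std::fabs(vz) };
        int best = 0;
        for (int a = 1; a < 3; ++a)
            if (mag[a] > mag[best]) best = a;
        if (secondBest) {
            int other = (best == 0) ? 1 : 0;
            for (int a = 0; a < 3; ++a)
                if (a != best && mag[a] > mag[other]) other = a;
            *secondBest = other;  // texture set worth prefetching next
        }
        return best;  // 0 = x slices, 1 = y slices, 2 = z slices
    }

    int main() {
        int next;
        int axis = bestSliceAxis(0.2f, 0.9f, 0.4f, &next);
        std::printf("draw axis %d, prefetch axis %d\n", axis, next);  // 1, 2
    }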

Performance depends on the extent of the volume rendered (determined by the cut-planes) and the size of the displayed image (determined by the viewing distance). Both of these are pixel fill-rate limitations. Polygon transform requirements are determined by the number of datasets rendered simultaneously and thus remain relatively constant.

When rendering to fill the full viewport, the raw data is magnified by a factor of three. Rendering the full volume over the full viewport gives a frame rate of 3 fps. With thinner slabs rendered over the full viewport, the frame rate can be as high as 60 fps. Rendering without magnification leads to significant improvements in frame rate, with 10 fps for full-volume MIPs and 70 fps for thin slabs. Frame rates during typical use generally range from 6-20 fps. It is anticipated that higher-performance video cards will lead to far higher frame rates.

4 Future Work and Conclusions

The tool is designed to be flexible and easily extensible. It has already been extended to support 256x256x1024 full-body angiograms generated by moving-table acquisition techniques. Switching to a single 3D texture per volume, instead of collections of 2D textures, should reduce memory usage and allow additional datasets to be simultaneously loaded into memory.

The tool is well suited to acceleration using parallel-execution methods. 4D cluster visualization methods being developed in conjunction with this project support distribution of a time sequence across nodes in a cluster. Interactive view manipulations are performed in parallel on all nodes, and reading back the rendered results sequentially from each node produces a high frame-rate animation. This approach is especially useful as higher-resolution datasets and more complex rendering algorithms increase memory requirements and reduce frame rates on desktop workstations.

The tool can also be easily extended to support rendering in stereo for viewing with 3D glasses, easing interpretation of complex 3D structures.

Time-resolved imaging offers additional information that is very useful clinically. However, without a tool designed to take advantage of the time-resolved data, radiologists often examine only a single time frame to make diagnoses. We plan to assess the use of our tool in interpreting challenging cases such as aortic stent follow-up with endoleak characterization and the analysis of dissections and anomalous portal or pulmonary flow patterns.


We have developed a tool that allows for interactive 3D visualization on inexpensive desktop workstations. It compares favorably with tools available on commercial PACS workstations, achieving similar or higher frame rates with similar image quality at a far lower cost. It has proved useful for our research applications, and its ability to work with time-resolved information has great clinical potential.

Acknowledgements

This work was funded by the Whitaker Foundation and NIH R01 EB002075.

References

[1] Barger AV, et al. Time-resolved contrast-enhanced imaging with isotropic resolution and broad coverage using an undersampled 3D projection trajectory. Magn Reson Med 48(2):297-305 (2002).
[2] Tom McReynolds. Advanced Graphics Programming Techniques Using OpenGL. ACM SIGGRAPH 98 Course. [Course notes online at http://www.sgi.com/software/opengl/advanced98/notes/notes.html]

Figure 1: The user interface features three small “ortho-navigation” windows showing single slices to assist the user in selecting cut-planes, and a large window showing the rendered 3D MIP.

Figure 2: Color can be a useful aid in interpreting complex structures and flow patterns. (a) Dissected vessels are split into two lumens: a “true lumen” with good blood flow, and a “false lumen” with poor flow. Here a late time frame is shown in green, to assist in identifying the false lumen and vessels branching from it. (b) The portal venous system can have complex anomalous flow patterns. Showing a late frame in blue eases distinguishing between arterial and venous flow. (c) and (d) These images show the entire vasculature of the brain and abdomen, with the arteries (early frame) in red and veins (late frame) in blue.



Using CavePainting to Create Scientific Visualizations

David B. Karelitz∗, Daniel F. Keefe†, David H. Laidlaw‡
Brown University

∗ e-mail: dbk@cs.brown.edu
† e-mail: dfk@cs.brown.edu
‡ e-mail: dhl@cs.brown.edu

Figure 1: We extended CavePainting, a system for drawing in VR, to aid design tasks in the scientific visualization domain by allowing designers to easily preview designs in an immersive environment. This figure contains one prototype of a particle designed to show pressure as the width of the head, and velocity as the position of the tentacles, with faster particles having more streamlined tentacles. The legend used to generate this image is shown in Figure 2.

Abstract

We present an application of a virtual reality (VR) tool to the problem of creating scientific visualizations in VR. Our tool allows a designer to prototype a visualization prior to implementation. The system evolved from CavePainting [Keefe et al. 2001], which allows artists to draw in VR. We introduce the concept of using an interactive legend to link a visualization design to the visualization data. As opposed to existing methods of visualization design, our method enables the researcher to quickly experiment with multiple visualization designs without having to code each one. We applied this system to the visualization of coronary artery flow data.

1 Introduction

According to Senay and Ignatius, “The primary objective in data visualization is to gain insight into an information space by mapping data onto graphical primitives” [Senay and Ignatius 1994]. The first step in this process is often a quick sketch of the elements of the visualization. When designing visualizations for VR, sketching on paper does not capture the immersive nature of VR. Furthermore, implementing each design often takes hours or even days as visualization styles are coded, examined, and evaluated.


Figure 2: Legends are used to link drawn icons to data. Velocity Magnitude, the data type represented by this legend, is shown just below the green line. User-drawn icons are added above the line, and a preview of the final icons or particles is shown below it.

The goal of our system is to reduce the iteration time for designing a visualization to a few minutes. We accomplish this by allowing an artist to sketch a visualization in VR and then apply that sketch to the actual visualization data. The end result is a hastened research cycle; each design can be implemented and evaluated in a matter of minutes.

The CavePainting system allows users to draw 3D forms directly in virtual reality using a six degree-of-freedom tracker. The user manipulates a brush to generate a stroke of color and texture. These strokes can be edited and combined into compound strokes.

2 Motivation

The existing artery application visualizes pulsatile fluid flow using particles [Sobel et al. 2002]. The particles showed only the path a particle would take through the flow; however, simply looking at the path of a particle was not enough to give a comprehensive image of the flow. The flow is characterized by multiple values at each point, so the main problem became how to show multiple values with a single particle.

The traditional method of designing particles is to sketch some designs on paper, implement them, and then evaluate them. The main problem with designing VR images on paper is that a paper design does not fully characterize what the resulting visualization will look like in a VR environment. For example, choosing colors for VR is difficult to do on paper, since the projected colors are often dim and unsaturated. Furthermore, any design on paper is still a 2D design, and 3D designs on a 2D medium may be problematic when viewed in immersive 3D. Our system operates between the paper design and the actual implementation, and thus provides a medium in which to easily test a design. Paper designs are still useful as starting points, but refining a base design can proceed much faster with our system than with the design and implementation cycle normally employed. Using our system, a researcher can take a paper design, sketch the design in 3D, and immediately view the final result.

Figure 3: This example visualization shows bird icons that change wingspan in response to velocity, and color in response to pressure. This snapshot was taken at a low pressure point.

3 Our Approach

Legends are used to combine CavePainting strokes with the data being visualized. CavePainting strokes added to the legend indicate how the final visualization element changes in response to a particular data type. The lower portion of the legend shows miniature versions of the final visualization element. There is one legend per data attribute visualized.
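Conceptually, such a legend behaves like a keyframe mapping: user-drawn icons sit at sample values of one data attribute, and intermediate data values interpolate the icon parameters between the nearest samples. The sketch below illustrates that idea for a single scalar icon parameter; the types and values are hypothetical stand-ins, not the CavePainting implementation.

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    // Illustration of an interactive legend as a keyframe mapping: each entry
    // pairs a data value with an icon parameter (say, wingspan or tentacle
    // streamlining), and in-between data values interpolate linearly.
    struct LegendEntry { float dataValue; float iconParam; };

    float lookup(std::vector<LegendEntry> legend, float v) {
        std::sort(legend.begin(), legend.end(),
                  [](const LegendEntry& a, const LegendEntry& b) {
                      return a.dataValue < b.dataValue;
                  });
        if (v <= legend.front().dataValue) return legend.front().iconParam;
        if (v >= legend.back().dataValue)  return legend.back().iconParam;
        for (size_t i = 1; i < legend.size(); ++i)
            if (v <= legend[i].dataValue) {
                float t = (v - legend[i - 1].dataValue) /
                          (legend[i].dataValue - legend[i - 1].dataValue);
                return legend[i - 1].iconParam +
                       t * (legend[i].iconParam - legend[i - 1].iconParam);
            }
        return legend.back().iconParam;
    }

    int main() {
        // Hypothetical legend: slow particles folded (0.0), fast streamlined (1.0).
        std::vector<LegendEntry> wingspan = {{0.f, 0.f}, {2.f, 1.f}};
        std::printf("icon param at velocity 0.5: %.2f\n", lookup(wingspan, 0.5f));
    }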

The previous system used for visualizing artery flow data requires each different visual style to be explicitly coded; as a result, adding new styles or more information to the visualization often takes days or weeks. With our system, the researcher is able to sketch the legend for a visualization and see the result almost instantly.

4 Results

We used the system to design particles for the visualization of the artery data. The CavePainting system excels at creating organic forms, so we chose organic creatures, fish and birds, as a basis for the particle design. Both particles were created to simultaneously show two data types: velocity and pressure. Overall, it took about half an hour to generate each visualization.

The first particles were squid with trailing tentacles, as seen in Figure 1. Speed was mapped to the shape of the tentacles: contracted tentacles signify a slower velocity and streamlined tentacles signify a faster velocity; pressure was mapped to the size of the squid's head.

The second set of particles was modeled after birds, as seen in Figure 3. The birds show velocity as the shape of the wings, outstretched for fast and folded in for slow; they show pressure as color, with green as high pressure and blue as low pressure. One legend used to create the bird icons is shown in Figure 4.


Figure 4: This legend was used to map speed to wingspan. The user-drawn icons appear above the line; below it are some samples of the final particles.

5 Conclusion

Designing visualizations for multi-valued, time-varying data is a very hard problem requiring many iterations of the design, implementation, and critique cycle. Furthermore, designs are traditionally done on paper, and not in the target medium. This works for some types of visualizations, but is much less effective when the final medium is an immersive display. Paper simply cannot capture the nuances of an immersive display as well as a design done in the target medium.

Our system provides the designer with a tool to quickly judge how well a particular design will work in the target environment. It is not designed to replace paper designs, as it still takes longer to draw a design in our system than on paper, but it does allow the designer to preview a design before the costly step of coding it. As long as implementing a design is the costly step in completing a visualization, every effort should be made to reduce the number of times a design is implemented; our system is one step towards the goal of reducing the number of implementations to just one.

References

KEEFE, D., ACEVEDO, D., MOSCOVICH, T., LAIDLAW, D., AND LAVIOLA, J. 2001. CavePainting: A fully immersive 3D artistic medium and interactive experience. In Proceedings of the 2001 Symposium on Interactive 3D Graphics, 85–93.

SENAY, H., AND IGNATIUS, E. 1994. A knowledge-based system for visualization design. In IEEE Computer Graphics and Applications, 36–47.

SOBEL, J., FORSBERG, A., ZELEZNIK, R., LAIDLAW, D. H., PIVKIN, I., KARNIADAKIS, G., AND RICHARDSON, P. 2002. Particle flurries for 3D pulsatile flow visualization. In Computer Graphics and Applications, pending publication.


3D VISUALIZATION OF ECOLOGICAL NETWORKS ON THE WWW

Ilmi Yoon 1, Rich Williams 2, Eli Levine 1, Sanghyuk Yoon 1, Jennifer Dunne 3, Neo Martinez 4

We present web-based information technology being developed to improve the quality of ecological network studies and to promote collaboration among researchers worldwide through 3D visualizations of ecological networks on the WWW. Important design issues are (1) developing a flexible and efficient data format to handle diverse ecological data for storage and analysis, (2) developing intuitive 3D visualization of complex ecological network data, and (3) developing a component-based architecture of analysis and 3D presentation tools on the WWW. The 3D network visualization algorithms include variable node and link sizes, placement according to node connectivity and trophic levels, and visualization of other node and link properties in ecological network (food web) data. The flexible architecture includes an XML application design, FoodWebML, pipelining of computational components, and a flexible 3D presentation format on the WWW according to user preference (VRML, Java applet, or plug-in).

1. INTRODUCTION

The need for interactive 3D visualization of food webs

In ecology, the study of complex networks (food webs: who eats whom among species within a habitat) is central to researchers' efforts to understand large systems of dynamically interacting components, including their stability, functioning, and dynamics [Strogatz01][Williams00]. Especially for complex networks, visualization and simulation allow concepts to be explored more clearly and compellingly. 3D visualization is particularly valuable because 2D visualizations of such complex networks are usually overwhelmed with too much information, becoming cluttered and visually confusing. Also, 3D visualization embraces users' intuitive connection between physical quantities and visible volumes, which are spatially more compact than 2D areas. In addition, the interactive manipulation of food-web properties and visualization characteristics helps researchers gain new insights into the structure and dynamics of complex food webs.

The need for a web portal

An accessible repository of information on species' biology and inter-species interactions is strongly desired by experts of particular subsystems who wish to explore the broader ecological context surrounding their focal systems. In order to build such a repository, it is important to promote the participation and collaboration of ecologists worldwide; hence, a web-based interface (web portal) is a natural choice for such a system, since it provides familiar and consistent, uniform access from anywhere in the world.

2. ARCHITECTURE DESIGN

The architecture is designed with two important issues in mind. Field scientists and ecologists usually use their own formats to keep data, since each scientist has a slightly different interest in each species. Users also have different preferences in browsing and manipulating 3D content on the web. In addition, reusability is always important. To address these issues, we designed an XML application, FoodWebML, for a flexible database and 3D visualizations on the WWW, and a pipeline architecture that supports the flexibility of FoodWebML.

FoodWebML (FWML)

FoodWebML is an XML application that flexibly and efficiently handles diverse food-web network data formats and visualization information, and can easily be translated into different formats such as VRML or data for Shockwave or Java applets. FoodWebML stores pure data and optional visualization information that can be calculated upon request and then stored back into the FoodWebML for future reuse, thus saving significant computation time. FoodWebML handles pure data and visualization data together efficiently, but with a clean distinction between them.

FoodWebML allows a wide range of food web data to be flexibly represented, including various parameters that are associated with nodes and links. Parameters that describe nodes include taxonomic and functional similarities, body sizes, and other bioenergetic parameters [Williams01]. In addition to describing individual species and the links between them, FoodWebML allows the representation of system-wide properties, such as environmental parameters.

FoodWebML is also designed to handle hierarchical aggregation of nodes. Network data can deliver more meaningful information by embedding its hierarchical information. The number of nodes can be systematically reduced by taxonomic or functional similarity aggregation [Martinez96]. This can hide overwhelming degrees of complexity while giving additional information in a very intuitive way [Kundu]. FoodWebML allows the definition of different types of aggregations using group and level elements. The process of aggregation can be visualized using animations of collapsing nodes into a higher-level node. The animation is available at http://unicorn.sfsu.edu:5080/wow.

Visualization Pipeline Design

To support several visualization formats in WWW browsers according to user preference with minimal overhead, a flexible architecture such as a pipeline architecture becomes essential. The pipeline architecture uses component-based modular implementation and a simple interface for organizing such components to configure a pipeline as needed. This highly flexible pipeline architecture allows users to easily configure a new pipeline for specifically desired analyses and visualizations. Figure 1 presents the current WWW architecture. The WWW user interface allows users to choose one food-web data set from the database and then choose from a range of visualization options to configure the pipeline.

Fig. 1 – Architecture Design. [Diagram: a client web browser with a VRML player exchanges requests (RQ) and responses (RS) with a Tomcat application server; request forwarding (RQF) passes through a FoodWeb selection servlet, a visualization option selection servlet, and processing servlets and JSP pages with an XSL VRML translator; middleware pipeline components (Format Converter, Trophic Level Calculator, Connectivity Calculator, Visual Node Calculator) operate over the FWML database system of food web data sets.]

Visualization Algorithms

To provide intuitive 3D visualizations, we developed effective algorithms especially suitable for food-web data and packaged them into components such as the Visual Node Calculator and the Trophic Level Calculator (Fig. 1). Node placement is one of the most critical aspects of 3D network visualization [Graham00]; we currently use several important parameters (the trophic level of the species; generality, the number of prey that the species consumes; vulnerability, the number of predators that consume the species in question; and connectivity, the total number of predators and prey of the species in question) to place nodes, or groups of organisms, in the three-dimensional space (Fig. 2).

Fig. 2 – FoodWeb3D visualization of Little Rock Lake, Wisconsin. FoodWeb3D users visualize the structure and nonlinear dynamics of empirical and model food webs, rotate the image, highlight food chain paths, delete species, and adjust the color and size of the nodes and links.
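The placement parameters listed above are simple functions of the directed link structure. The sketch below computes generality, vulnerability, and connectivity from a list of predator-prey links; it is our own illustration of the definitions, with a hypothetical link representation, not FoodWeb3D code (trophic level, which depends recursively on the prey's levels, is omitted).

    #include <cstdio>
    #include <utility>
    #include <vector>

    // Sketch of the per-node placement parameters: for each species,
    // generality = number of prey, vulnerability = number of predators,
    // connectivity = both combined. Links are (predator, prey) index pairs.
    struct NodeMetrics { int generality = 0, vulnerability = 0; };

    std::vector<NodeMetrics> computeMetrics(
            int numSpecies, const std::vector<std::pair<int, int>>& links) {
        std::vector<NodeMetrics> m(numSpecies);
        for (const auto& [predator, prey] : links) {
            m[predator].generality++;   // predator consumes one more prey
            m[prey].vulnerability++;    // prey has one more predator
        }
        return m;
    }

    int main() {
        // Tiny 3-species chain: species 2 eats 1, species 1 eats 0.
        auto m = computeMetrics(3, {{2, 1}, {1, 0}});
        for (int s = 0; s < 3; ++s)
            std::printf("species %d: generality %d, vulnerability %d, connectivity %d\n",
                        s, m[s].generality, m[s].vulnerability,
                        m[s].generality + m[s].vulnerability);
    }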

3. CONCLUSION

This work, funded by the NSF Biological Databases and Information program, aims to develop visualization tools and to facilitate easy access to food web data through the WWW. The current implementation is focused on prototyping and studying the flexibility and performance of the system design of the pipelining, FoodWebML, VRML, and related database components. The current implementation allows users to choose one food-web data set from the database and then choose from a range of visualization options. Upon requests from the client, WOW servlets invoke JSP pages to retrieve a FoodWebML file from the XML database system (Xindice) and process the selected options to create VRML/Shockwave data on the fly. Table 1 shows the size of FoodWebML at different stages, generation times, and the numbers of nodes and links when executed on a relatively slow PC (Pentium III, 833 MHz with 128 MByte main memory).

Data          Nodes   Links   FWML with visual node info [KB]   Visual node generation time [sec]   VRML [KB]   VRML generation time [sec]
Grass         75      113     107                               3.0                                 198         15
Broom         154     370     285                               7.9                                 517         19
Elverde       156     1510    715                               27.7                                1,676       148
Little Rock   181     2375    1,101                             42.5                                2,575       250

Table 1 – Data sizes and related execution times. FWML stands for FoodWebML. Performance was measured on a PC with a Pentium III, 833 MHz and 128 MByte main memory. (Grass: Grasslands in England and Wales; Broom: Scotch Broom – Cytisus scoparius; Elverde: El Verde Rainforest; Little Rock: Little Rock Lake)

REFERENCES

[Graham00] Graham, M., et al. 2000. A comparison of set-based and graph-based visualisations of overlapping classification hierarchies. In Proceedings of the Working Conference on Advanced Visual Interfaces.
[Kundu] Kundu, K., et al. Three-dimensional visualization of hierarchical task network plans.
[Martinez96] Martinez, N. D. 1996. Defining and measuring functional aspects of biodiversity. Pages 114-148 in Biodiversity.
[Strogatz01] Strogatz, S. H. 2001. Exploring complex networks. Nature.
[Williams00] Williams, R. J., et al. 2000. Simple rules yield complex food webs. Nature 404:180-183.
[Williams01] Williams, R. J., et al. 2001. Stabilization of chaotic and non-permanent food web dynamics. Santa Fe Institute Working Paper.


Poster and Interactive Demonstration: Streaming Media Within the Collaborative Scientific Visualization Environment Framework

Brian James Mullen
University of Alaska, Anchorage
Department of Mathematical Sciences, Computer Science
asbvm@uaa.alaska.edu

1 Introduction

CSVE, a basic collaborative scientific visualization environment, was developed under a National Science Foundation (NSF) MRI grant and an NSF REU Supplement to the grant, 0215583, during FY2002. CSVE was demonstrated in a prototype collaborative scientific visualization of a time-dependent two-dimensional oil reservoir simulation.

CSVE allows any number of scientists to explore the simulation for oil reservoir sweep realizations, to interactively roam and zoom an array of time-dependent data sets, and to interact in other ways. Groups of scientists at remote workstations share the user interface and visualizations. The oil reservoir simulation is one of many examples of how the CSVE collaborative scientific visualization environment can be used.

Figure 1. CO2 is pumped into an oil reservoir through an injection well, displacing the oil towards a production well.

CSVE is a basic collaborative scientific visualization environment that allows any number of scientists to explore scientific data and to interact in other ways. Groups of scientists at remote workstations share the user interface and visualizations. CSVE is a client/server network application. The server allows scientists to administer a scientific database that stores scientific data, user information, and session creation.

The client provides a desktop with several internal frames that can be viewed as a workbench for collaborative scientific visualization. The internal frames make available collaborative visualization and communication utilities.

Key to these utilities for collaboration is a user interface allowing for streaming media.


Figure 2. The current collaborative offerings of the CSVE client.

2 Streaming Media Within the CSVE

An important aspect of a collaborative scientific visualization environment is communication, whether the source is audio or visual. Providing a video channel adds or improves the ability to show understanding, forecast responses, give non-verbal information, enhance verbal descriptions, manage pauses, and express attitudes [Isaacs, Tang, 1993].

The streaming media aspect of the CSVE was developed using the Java Media Framework (JMF) to help realize the benefits mentioned above.

Video support is provided under the JMF for cameras using Video for Windows, as well as through V4L drivers on the Linux operating system. XJPEG and YUV capture formats are transmitted as JPEG and H.263, respectively.

Audio support is provided via Java using DirectSound for Windows as well as JavaSound for both Windows and Linux. Regardless of the local format chosen, audio is streamed using the DVI (ADPCM) format at 8000 Hz, mono. This ensures the highest quality at the lowest bandwidth.

The ability to stream media files is also provided within the CSVE. JMF currently supports the following file types: AIFF, AU, AVI, GSM, MIDI, MPEG, QuickTime, RMF, and WAV. Other file types, such as MP3, are supported in a platform-dependent manner.

This application uses the JMF API for the Real-time Transport Protocol (RTP) for multicasting and multi-unicasting media in the collaborative session. Via the server, ports for the streaming media are allocated, tracked, and released depending on the originator of the media who is providing the streamed source.


Figure 3. Local and remote streaming audio frames, which allow you to monitor your own audio transmission as well as receive other users' audio within the collaborative session.

Figure 4. Local and remote streaming video frames allow you to use and receive non-verbal communication, a key to collaboration.

Additionally, desktop capture is provided to allow users to collaborate regardless of the application they are using.

Figure 5. Collaboration is more than just talking and watching. The local and remote desktop capture frames allow you to see what others are working on within the collaborative session.

Figure 6. User interface: your panel allows you to set what you want to make available; other panels provide access to what other users are streaming.


The user interface provides a simple means not only to access your own media, but also to easily see what others have made available for you to receive.

3 Conclusion

The initial offering of the CSVE contained only a simple chat program for communication and a simple user panel to see who is in the session.

The inclusion of these media streaming tools within the Collaborative Scientific Visualization Environment marks them as innovative, greatly extends the functionality of the CSVE, and markedly increases the efficiency of a session. It is these media streaming tools that really put the collaborative in CSVE.

Collaboration can be done with a simple chat program, but one could never accomplish as much as with the tools provided by this CSVE version. There are many visual and audio cues humans require for communication. These tools bring on-line collaboration a bit closer to face-to-face and bring a bit more reality to an on-line environment.

References

ISAACS, E.A. AND TANG, J.C. 1993. What video can and can't do for collaboration: A case study. In Proceedings ACM Multimedia, Anaheim, CA: ACM, 199-206.

MACEDONIA, M.R. AND BRUTZMAN, D.P. 1994. MBone provides audio and video across the Internet. In Computer, IEEE Computer Society, Vol. 27, No. 4, 30-36.

PANG, A. AND WITTENBRINK, C. 1997. Collaborative 3D visualization with CSpray. In IEEE Computer Graphics and Applications, 17(2), pp. 32-41.

UPSON, C., FAULHABER, T., KAMINS, D., LAIDLAW, D., SCHLEGAL, D., AND VROOM, J. 1989. The Application Visualization System: A computational environment for scientific visualization. In IEEE Computer Graphics and Applications, 9(4):30-42.

VINOD, A., BAJAJ, C., SCHIKORE, D., AND SCHIKORE, M. 1994. Distributed and collaborative visualization. In Computer, 27(7), pp. 37-43.

WOOD, J., WRIGHT, H., AND BRODLIE, K. 1997. CSCV - Computer support for collaborative visualization. In: EARNSHAW, R., VINCE, J., JONES, H. (Eds.), Visualization & Modeling. London, UK: Academic Press, p. 13-25.


Visualization of Geo-Physical Mass Flow Simulations

Navneeth Subramanian, T. Kesavadas and Abani Patra
Virtual Reality Lab
Dept. of Mechanical and Aerospace Engineering
State University of New York at Buffalo

Introduction

An interdisciplinary team from the departments of geology, geography, mathematics, and mechanical engineering at the University at Buffalo has been pursuing an ongoing effort to model and simulate geo-physical mass flows at volcanoes [1]. The system consists of a parallel adaptive finite volume code [2] for simulation of geo-physical mass flows, which takes as its input Digital Elevation Models of the area of interest, and a customized visualization module for displaying and communicating the results of the simulation. This visualization module is in turn integrated with terrain data and imagery (satellite/aerial photos) for appreciation of the possible hazards.

Incremental Updated, Adaptive meshing and Level of Detail (LOD):<br />

The principal difficulties in the visualization are a) the large size of the dataset (hundreds of MB per time step of the simulation and thousands of time steps), and b) the requirement that meaningful visualization be produced on both high-end and limited hardware resources, e.g. an SGI ONYX and simple desktop PCs. To effectively manage these huge datasets, we take advantage of two main features of the simulation data: 1) only small parts of the complete flow change from time step to time step, hence only a small subset of the full visualization needs to be updated as each time step's data is displayed, and 2) the adaptive triangulation used by the simulation can be reused for visualization, avoiding the cost of re-tessellation. To handle the gigabyte-size simulation output data we use a dynamic linked-list-based data structure.
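To illustrate the incremental-update idea, the following minimal sketch (written in Java for brevity; the actual application is C++ with Open Inventor, and all names here are invented) keeps, for each time step, a linked list holding only the cells whose pile height changed, so that advancing to the next time step touches only a small subset of the mesh:

    import java.util.LinkedList;
    import java.util.List;

    // One changed cell at one time step.
    final class CellDelta {
        final int cellId;        // index into the simulation mesh
        final float pileHeight;  // new pile height for this cell
        CellDelta(int cellId, float pileHeight) {
            this.cellId = cellId;
            this.pileHeight = pileHeight;
        }
    }

    // A time step stores only its changes, not the full field.
    final class TimeStep {
        final List<CellDelta> changes = new LinkedList<>();
    }

    final class FlowSequence {
        private final float[] pileHeights;                     // current state, one value per cell
        private final LinkedList<TimeStep> steps = new LinkedList<>();

        FlowSequence(int cellCount) { pileHeights = new float[cellCount]; }

        void append(TimeStep step) { steps.add(step); }

        // Advance the visualization: only the changed cells are updated,
        // so only those cells need to be re-sent to the renderer.
        void apply(TimeStep step) {
            for (CellDelta d : step.changes) {
                pileHeights[d.cellId] = d.pileHeight;
            }
        }
    }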

The initial spatial partitioning offered by the computational mesh generated from the Digital Elevation Model (DEM) of the terrain lends itself to a coarse level of detail for rendering. The solution-adaptive meshing used subsequently automatically refines the triangulation in areas where the flow has interesting features. Since we reuse the triangulation from the simulation, this automatically provides a level of detail in the visualization based on flow features. This can dramatically improve the quality of the visualization without imposing impossible requirements on the visualization hardware.

Implementation and Conclusions:

Each vertex of the dataset has an associated pile height vector over n time steps (50-1000). These pile heights and the associated velocities at the time steps are visualized as contour maps (see Figures 1 and 2).

Figure 1a: DEM of the Colima volcano (Mexico). Figure 1b: System schematic.

Due to efficiency considerations, the simulation starts off with a coarse-resolution DEM; using a solution-adaptive technique, the DEM and computational grid are refined in the region of flow and unrefined elsewhere. An interesting problem we faced in this connection was the overlay of the flow data (pile height) over the initial DEM used for the simulation.



Figure 2: Color ramping of pile height, showing the course of a potential volcanic flow on Tahoma.

If the pile height data were approximated to the vertices of the coarse LOD, considerable loss of data would result (see Figure 3). However, if the flow data were directly overlaid on the coarse DEM, discernible discontinuities in the topography would result. To overcome this problem, we delete all data under the region of flow in the coarse initial DEM, merge the flow data with this new DEM, and re-triangulate the boundary region between the flow and the coarse DEM to achieve the result shown in Figure 2.

Figure 3: Merging the flow data (contour map of flow) with the coarse mesh of the topography.

The satellite imagery of the volcano site is then overlaid with the simulation data to allow the user to appreciate the location of the flow with respect to geographical landmarks and to aid in disaster management (Figure 4).

Figure 4: (Left) Overlay of flow data, shown in red, with satellite imagery of the terrain. (Right) Overlay of satellite imagery to allow appreciation of the geographical location of the flow.

The application was written in C++ using the Open Inventor graphics library. It has been tested on data for the volcano sites of Colima, Tahoma and Mammoth, with the simulations running on 1-4 processors of a 64-processor SGI Origin 3800. The simulation code has been structured so that the output generated by each processor is in a separate file, allowing the visualization code to parse each section of the data on a separate thread and build the (octree-based) spatial partition of the terrain independently of the other threads. Presently, the application is being used both on desktop Linux boxes (Pentium 4 class, NVIDIA GeForce2 graphics cards) and on a 4-processor SGI ONYX2 in an immersive environment (ImmersaDesk). These tests have shown that the framework is generic and extensible.
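The one-file-per-processor layout maps naturally onto one parsing thread per file. A rough Java sketch of that pattern follows (the authors' code is C++; the parsing and octree construction are only stubbed here):

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    final class ParallelLoader {
        void load(List<File> perProcessorFiles) throws InterruptedException {
            List<Thread> workers = new ArrayList<>();
            for (File f : perProcessorFiles) {
                Thread t = new Thread(() -> parseSection(f));
                workers.add(t);
                t.start();
            }
            for (Thread t : workers) t.join(); // wait until all sections are in
        }

        // Stub: read one processor's section of the simulation output and
        // insert its terrain patches into a thread-local octree.
        private void parseSection(File f) {
            try (BufferedReader r = new BufferedReader(new FileReader(f))) {
                String line;
                while ((line = r.readLine()) != null) {
                    // ... decode vertices and pile heights, build octree nodes ...
                }
            } catch (IOException e) {
                throw new RuntimeException("failed to parse " + f, e);
            }
        }
    }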



References:

[1] Sheridan, M.F., Bloebaum, C.L., Kesavadas, T., Patra, A.K., and Winer, E., 2002. Visualization and Communication in Risk Management of Landslides. In C.A. Brebbia (ed.), Risk Analysis III, WIT Press, Southampton, pp. 691-701.

[2] Patra, A.K., Bauer, A.C., Nichita, C., Pitman, E.B., Sheridan, M.F., Webber, A., Rupp, B., Stinton, A., Bursik, M. Parallel Adaptive Numerical Simulation of Dry Avalanches over Natural Terrain. J. Volcanology and Geophysical Research, to appear.



INFOVIS 2003 Posters

InfoVis 2003, the IEEE Symposium on Information Visualization 2003, is now in its ninth year. It continues to be held in conjunction with the IEEE Visualization 2003 conference, and this year we are distributing a combined InfoVis/Vis Posters Compendium at the conference.

We are delighted that the Interactive Poster category, first introduced two years ago, has elicited a strong response from the research community. This year we received 32 poster submissions, of which 24 were accepted. All submissions were reviewed by both of the posters co-chairs.

The Interactive Posters category includes both traditional posters and demonstrations of interactive systems, either live on a laptop or through video. We encouraged both submissions of original unpublished work and submissions showcasing systems of interest to the information visualization community that have been presented in other venues.

This year there will be a rapid-fire Posters Preview, where each poster author will have two minutes to pique the interest of audience members, who can later see the full poster and discuss the work with the authors at length during the poster session. The poster session is co-located with Monday night's symposium reception. In addition, the posters will be on display over the full course of the symposium.

We gratefully acknowledge the support of Microsoft Research, whose Conference Management Toolkit was a wonderful help in managing the submissions. We thank the other organizers of both InfoVis and Vis, including the Vis Conference Chairs Jim Thomas and Hanspeter Pfister, the Vis Local Arrangements Chair Dave Kasik, the local Vis Program Chair Pak Chung Wong, the Vis Publications Chair Torsten Möller and the InfoVis Publication Chair Sheelagh Carpendale for shepherding the creation of this compendium. We also thank the InfoVis Steering Committee, and most importantly the contributors, for their support of the symposium.

InfoVis 2003 Posters Co-Chairs:
Alan Keahey, Visintuit, USA
Matt Ward, Worcester Polytechnic Institute, USA



Interactive Poster: Axes-Based Visualizations for Time Series Data

Christian Tominski, Institute for Computer Graphics, University of Rostock, ct@informatik.uni-rostock.de
James Abello, DIMACS, Rutgers University, abello@dimacs.rutgers.edu
Heidrun Schumann, Institute for Computer Graphics, University of Rostock, schumann@informatik.uni-rostock.de

Abstract

In the analysis of multidimensional time series data, questions involving extremal events, trends and patterns play an increasingly important role in several applications. We focus on the use of axes-based visualizations (similar to Parallel or Star Coordinates) to aid in the analysis of multidimensional data sets. We present two novel radial visual arrangements of axes, the TimeWheel and the MultiComb. They are implemented as part of an interactive framework called VisAxes. We report our early experiences with these novel design patterns.

1 Introduction

Visualization of multidimensional time-series data is a challenging fundamental problem. One of the tasks at hand is to answer questions involving special events such as large data fluctuations, stock market shocks, risk management and large insurance claims.

For representing a limited number of time steps and a limited number of time-dependent variables, conventional time plots are commonly used [Har96]. Parallel Coordinates [Ins98] and Star Coordinates [Ric95] have been used as effective data exploration tools. They can be termed axes-based visualization techniques. Their advantage is that they constitute a lossless projection of n-dimensional space onto 2D space. Since these techniques differ depending on the way the axes are mapped onto the screen and on the level of axes interactivity, our aim was to develop a flexible framework, called VisAxes, to support the creation and evaluation of a variety of axes arrangements. VisAxes maps the time series into different radial axes arrangements in the display and provides support for a variety of navigation operations. We introduce two novel radial arrangements, the TimeWheel and the MultiComb, as promising designs for the representation and visualization of multiple data plots.

2 The Framework VisAxes

2.1 Design Criteria

A variety of design criteria had to be met by our framework:

• emphasis on axes representing time,
• consideration of multidimensional data analysis,
• integration of common time plots, since they are easy to understand, and
• realization of a high degree of interactivity to allow efficient data exploration.

Conceptually, it is important to separate the design of an individual axis from the arrangement of all the axes on the screen. We focus in this work on radial arrangements of interactive axes with special emphasis on the temporal ones.

2.2 Axes Design & Arrangement

The design and the scale of an axis depend strongly on the type of data that is being mapped onto the axis (i.e. nominal, ordinal, discrete, or continuous data). In axes-based visualizations each axis is associated with a data set variable. Usually, axes are scaled from the associated variable's minimum value to its maximum. Our framework offers three basic interactive axes. They are applicable to a variety of data sets and can be used in different combinations according to several interaction needs. They are:

• the scroll axis,
• the hierarchical axis, and
• the focus within context axis.

The scroll axis (see Figure 1, left) is mainly of use with variables that have a large number of associated values. It combines a dimension with a slider that can be interactively moved (positioned) on the axis and narrowed or widened, allowing a user to choose a section of interest within the variable's domain.

The second type of axis, the hierarchical axis (see Figure 1, middle), is motivated by [AK02]. It is applicable in the case of hierarchically structured variables. Here the axis is first divided into segments according to the number of nodes in the root level of the hierarchy. Select interactions can be used either to open up more child segments, or to subsume child segments back into a single (parent) segment.

The third type of axis is the focus within context axis (see Figure 1, right). It is of use when a mapping of the entire variable's range is necessary. The focus within context axis is scaled non-uniformly: we apply one of the known magnification transformation functions [Kea98] to the mapping procedure. By doing so, we provide a more detailed view of the data (the focus) without losing the overall view that is provided as the context.

Figure 1 Left: Differently scrolled axes for a variable with minimum value -100 and maximum value 200. The slider's width and location determine the scale of the axis, affecting the range of mapped values. Middle: A hierarchical time axis after several steps of interaction. Blue, green, and red frames identify currently visible segments. Right: A non-uniformly scaled focus within context axis combined with a plot of a single variable.

Axes arrangement is a non-trivial task. It has a major impact on the expressiveness and effectiveness of the visualization. Therefore, we distinguish between independent variables (i.e. time) and dependent variables (i.e. time-dependent variables). This distinction suggests treating temporal axes in a special manner in order to emphasize their special role. In the following paragraphs we present several radial axes arrangements which meet our design criteria. Each axis can be any of the presented axis types and has an associated specific color. Furthermore, addition and removal of axes is allowed during the visualization.
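Before turning to the concrete arrangements, here is a minimal sketch of the focus within context mapping described above: a piecewise-linear (bifocal-style) magnification function, illustrative only and not necessarily the exact transformation used in VisAxes:

    // Maps a data value to a normalized axis position in [0, 1]:
    // linear inside and outside the focus interval, but with the focus
    // interval given a disproportionate share of the axis length.
    final class FocusContextAxis {
        private final double min, max;   // full range of the variable (context)
        private final double f0, f1;     // focus interval, min <= f0 < f1 <= max
        private final double focusShare; // fraction of axis length for the focus, e.g. 0.6

        FocusContextAxis(double min, double max, double f0, double f1, double focusShare) {
            this.min = min; this.max = max;
            this.f0 = f0; this.f1 = f1;
            this.focusShare = focusShare;
        }

        double toAxis(double v) {
            double ctxShare = (1.0 - focusShare) / 2.0; // context on each side
            if (v < f0) {
                return ctxShare * (v - min) / (f0 - min);
            } else if (v <= f1) {
                return ctxShare + focusShare * (v - f0) / (f1 - f0);
            } else {
                return ctxShare + focusShare + ctxShare * (v - f1) / (max - f1);
            }
        }
    }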


2.3 The TimeWheel

Focusing on the time axis was the main aim when designing the TimeWheel. Therefore, the basic idea of the TimeWheel technique is to present the time axis in the center of the display, and to circularly arrange the other axes around it (see Figure 2). Similar to Parallel Coordinates, a single colored line segment makes a connection between a time value and the corresponding variable's value. From each time value a colored line segment is drawn to each variable axis on the display. By doing so, the dependency on time can be visualized.

Figure 2 A TimeWheel. Six variable axes are arranged circularly around an exposed, centered time axis. Colored lines connect time values with the corresponding variable values; color intensity is reduced for lines to axes nearly perpendicular to the time axis.

The relations between time and the other variables' values can be explored most efficiently when the dependent variable axis is laid out parallel to the time axis. Interactive rotation of the TimeWheel is provided so that a user can move his or her axes of interest into such a position without visual discontinuities. When an axis is perpendicular to the time axis, its visual analysis is very difficult. To alleviate this difficulty we use angle-dependent color fading to hide lines drawn between such axes and the time axis (see Figure 3).
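A minimal sketch of such angle-dependent fading (the formula is our reading of the idea, not necessarily the authors' exact code): the line's opacity follows the absolute cosine of the angle between the variable axis and the time axis, so parallel axes keep full intensity and perpendicular ones fade out.

    import java.awt.Color;

    final class AngleFading {
        // angleRad: angle between the variable axis and the time axis.
        static Color fade(Color base, double angleRad) {
            double weight = Math.abs(Math.cos(angleRad)); // 1 parallel, 0 perpendicular
            int alpha = (int) Math.round(255 * weight);
            return new Color(base.getRed(), base.getGreen(), base.getBlue(), alpha);
        }
    }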

Additionally, these axes are presented at a lower degree of detail by shortening their lengths. The use of different axis lengths can be viewed in this case as an example of the focus within context approach. By using color fading and length adjustment we avoid overcrowded displays and reduce clutter. Users familiar with Parallel Coordinates (the time axis can be arranged vertically as well) will see the TimeWheel as an enhancement of particular use for browsing time-dependent data sets.

Figure 3 Screenshot of a TimeWheel featuring color fading and axis length adjustment.

2.4 The MultiComb

Since common time plots are very efficient for the visualization of a single time-dependent variable, our aim for the MultiComb was to make use of this fact for the analysis of multivariate data. The basic idea (inspired by [AK02]) is to arrange the time plots of different variables (one plot for each variable) circularly on the display (see Figure 4). There are two possibilities when arranging the plots: in one case the variable axes extend outwards from the center of the display, and in the second case the time axes extend radially. To avoid overlapping plots, the axes do not start at the center of the display. In this way, the center area can be used to present additional information (e.g. a spike glyph for value comparison or an aggregated view of "past" values).

Figure 4 Two MultiCombs. On the left, time axes extend from the center; the center area displays an aggregation view. The figure on the right shows time axes arranged circularly, and the center area contains a spike glyph representing the different variable values that correspond to a chosen time value.

3 Conclusion

Inventing useful design patterns for multidimensional time-dependent data is a very challenging undertaking. For the visualization of such data we suggested two novel radial arrangements of axes, the TimeWheel and the MultiComb. These radial arrangements, in conjunction with our interactive axes (scroll, hierarchical, and focus within context), offer an interesting alternative to more conventional embeddings.

The presented techniques have been implemented in an object-oriented and Internet-capable framework called VisAxes. The framework can be used for easy creation and evaluation of different axes arrangements.

References

[AK02] Abello, J.; Korn, J.: MGV: A System for Visualizing Massive Multidigraphs. IEEE Transactions on Visualization and Computer Graphics, Vol. 8, No. 1, 2002, pp. 21-38.

[Har96] Harris, R.L.: Information Graphics: A Comprehensive Illustrated Reference. Atlanta, Georgia: Management Graphics, 1996.

[Ins98] Inselberg, A.: A survey of parallel coordinates. In Hege, H.-C.; Polthier, K. (eds.): Mathematical Visualization, Heidelberg: Springer Verlag, 1998, pp. 167-179.

[Kea98] Keahey, T.A.: The Generalized Detail-In-Context Problem. Proceedings of the IEEE Symposium on Information Visualization, Los Alamitos: IEEE Computer Society, 1998, pp. 44-51.

[Ric95] Richards, L.G.: Applications of Engineering Visualization to Analysis and Design. In Gallagher, R.S. (ed.): Computer Visualization. Boca Raton: CRC Press, 1995, pp. 267-289.


Interactive Poster: Visualising Large Hierarchically Structured Document Repositories with InfoSky

Keith Andrews (kandrews@iicm.edu), Graz University of Technology
Wolfgang Kienreich (wkien@know-center.at), Know-Center Graz
Vedran Sabol (vsabol@know-center.at), Know-Center Graz
Michael Granitzer (mgrani@know-center.at), Know-Center Graz

Abstract

InfoSky is an interactive system for the exploration of large, hierarchically structured document collections. InfoSky employs a planar graphical representation with variable magnification, like a real-world telescope.

The hierarchical structure is reflected using recursive subdivision into Voronoi polygons. At each level of the hierarchy, documents and subcollections are positioned according to the similarity of their content using a force-directed placement technique.

Documents are assumed to have significant textual content, which can be extracted with specialised tools. The hierarchical structure is exploited for greater performance: force-directed placement is applied recursively at each level on the objects at that level rather than on the whole corpus.

CR Categories: H.5.2 [Information Systems]: Information Interfaces and Presentation—User Interfaces; I.7.0 [Computing Methodologies]: Document and Text Processing—General

Keywords: information visualisation, classification hierarchy, document repository, force-directed placement, Voronoi subdivision.

1 Introduction

InfoSky is an interactive system for the exploration of large, hierarchically structured document collections. InfoSky combines both a traditional tree browser and a new telescope view of a zooming galaxy of stars. The telescope view provides a planar graphical representation with variable magnification, like a real-world telescope. Queries can be performed, and the search results are highlighted in context in the galaxy visualisation.

InfoSky assumes that documents are already organised in a hierarchy of collections and sub-collections, called the collection hierarchy. Both documents and collections can be members of more than one parent collection, but cycles are explicitly disallowed, a structure sometimes known as a directed acyclic graph. The collection hierarchy might, for example, be a classification scheme or taxonomy, manually maintained by editorial staff. The collection hierarchy could also be created or generated (semi-)automatically. Documents are assumed to have significant textual content, which can be extracted with specialised tools. Documents are typically plain text, PDF, HTML, or Word documents, but may also include spreadsheets and many other formats.

In the galaxy, documents are visualised as stars and similar documents form clusters of stars. Collections are visualised as polygons bounding clusters and stars, resembling the boundaries of constellations in the night sky. Collections featuring similar content are placed close to each other, as far as the hierarchical structure allows. Empty areas remain where documents are hidden due to access right restrictions, and resemble the dark nebulae found quite frequently within real galaxies. Figure 1 shows the original prototype of InfoSky.

Figure 1: The original prototype of InfoSky, as used in the comparative study.

2 InfoSky Implementation

InfoSky is implemented as a client-server system in Java. On the server side, galaxy geometry is created and stored for a particular hierarchically structured document corpus. On the client side, the subset of the galaxy visible to a particular user is visualised and made explorable to the user.

The galactic geometry is generated from the underlying repository recursively from top to bottom:

1. At each level, the centroids of any subcollections are positioned according to their similarity with each other using a force-directed similarity placement algorithm.

2. A polygonal area is calculated around each subcollection centroid using modified, weighted Voronoi diagrams [Okabe et al. 2000, pg. 128]. The size of each polygon is related to the total number of documents and collections contained in that subcollection (at all lower levels).

3. Finally, documents contained in the collection at this level are positioned using the similarity placement algorithm as points within a synthetic "Stars" collection.

When positioning subcollection centroids and documents at a particular level, the centroids of sibling collections are used as static influence factors, drawing objects towards the most appropriate sibling.

Figure 2: The revised version of InfoSky, modified after feedback from the user studies.
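In outline, the recursion reads as follows (a Java sketch; the helper methods stand in for the force-directed placement and weighted Voronoi algorithms cited above and are not InfoSky's actual API):

    import java.util.List;

    final class DocCollection {
        List<DocCollection> subcollections;
        List<Object> documents;
    }

    final class GalaxyBuilder {
        void build(DocCollection c) {
            // 1. Position subcollection centroids by content similarity,
            //    with sibling centroids acting as static attractors.
            placeBySimilarity(c.subcollections);
            // 2. Bound each subcollection with a modified, weighted Voronoi
            //    polygon sized by its total document and collection count.
            computeWeightedVoronoi(c.subcollections);
            // 3. Place this collection's own documents as stars.
            placeDocuments(c.documents);
            // Recurse top-down: each level is laid out only on the objects
            // at that level, which is what keeps the approach fast.
            for (DocCollection sub : c.subcollections) {
                build(sub);
            }
        }

        private void placeBySimilarity(List<DocCollection> subs) { /* force-directed placement */ }
        private void computeWeightedVoronoi(List<DocCollection> subs) { /* polygon computation */ }
        private void placeDocuments(List<Object> docs) { /* star placement */ }
    }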

3 User Testing

Two user studies were carried out with a dataset consisting of approximately 100,000 German-language news articles from the Süddeutsche Zeitung. The articles are manually classified thematically by the newspaper's editorial staff into around 9,000 collections and subcollections, up to 15 levels deep.

A thinking-aloud test with 5 users was performed for design feedback. A small formal experiment with 8 users in a counterbalanced design was run to establish a baseline comparison between the InfoSky telescope browser alone and the InfoSky tree browser alone. On average, the tree browser alone performed better than the telescope browser alone for each of the tasks tested. This is at least partly due to the users' much greater familiarity with a Windows-style explorer.

We have not yet tested the complete InfoSky armoury of synchronised tree browser, telescope browser, and search in context against other methods of exploring large hierarchical document collections. Nor have we tested tasks involving finding related or similar documents or subcollections, something the telescope metaphor should be well suited to. As development proceeds, we believe that the InfoSky prototype will constitute a step towards practical, user-oriented, visual exploration of large, hierarchically structured document repositories.

Figure 2 shows the modified version of InfoSky after user testing.

4 Related Work

Systems such as Bead [Chalmers 1993] and SPIRE [Thomas et al. 2001] map documents from a high-dimensional term space to a lower-dimensional display space, whilst preserving the high-dimensional distances as far as possible, but they operate on flat document repositories and do not take advantage of hierarchical structure. Systems such as the Hyperbolic Browser [Lamping et al. 1995] and Information Pyramids [Andrews et al. 1997] visualise large hierarchical structures, but make no explicit use of document content and subcollection similarities. CyberGeo Maps [Holmquist et al. 1998] use a stars and galaxy metaphor similar to InfoSky, but the hierarchy is simply laid out in concentric rings around the root. WebMap's InternetMap [WebMap 2002] visualises hierarchical categories of web sites recursively as multi-faceted shapes, but there is no correspondence between the local view at each level and the global view.

5 Concluding Remarks

This poster presents InfoSky, a system for the interactive visualisation and exploration of large, hierarchically structured document repositories. With its telescope and galaxy metaphors, we believe that the InfoSky prototype will constitute a step towards practical, user-oriented, visual exploration of large, hierarchically structured document repositories. Readers are referred to detailed descriptions of both InfoSky and the user study in [Andrews et al. 2002]. It is intended to give a live demo of InfoSky at the symposium.

References

ANDREWS, K., WOLTE, J., AND PICHLER, M. 1997. Information pyramids: A new approach to visualising large hierarchies. In IEEE Visualization '97, Late Breaking Hot Topics Proc., 49-52.

ANDREWS, K., KIENREICH, W., SABOL, V., BECKER, J., DROSCHL, G., KAPPE, F., GRANITZER, M., AUER, P., AND TOCHTERMANN, K. 2002. The InfoSky visual explorer: Exploiting hierarchical structure and document similarities. Information Visualization 1, 3/4 (Dec.), 166-181.

CHALMERS, M. 1993. Using a landscape metaphor to represent a corpus of documents. In Spatial Information Theory, Proc. COSIT '93, Springer LNCS 716, 377-390.

HOLMQUIST, L. E., FAGRELL, H., AND BUSSO, R. 1998. Navigating cyberspace with CyberGeo maps. In Proc. of Information Systems Research Seminar in Scandinavia (IRIS 21).

LAMPING, J., RAO, R., AND PIROLLI, P. 1995. A focus+context technique based on hyperbolic geometry for visualizing large hierarchies. In Proc. CHI '95, ACM, 401-408.

OKABE, A., BOOTS, B., SUGIHARA, K., AND CHIU, S. N. 2000. Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, second ed. Wiley.

THOMAS, J., COWLEY, P., KUCHAR, O., NOWELL, L., THOMSON, J., AND WONG, P. C. 2001. Discovering knowledge through visual analysis. Journal of Universal Computer Science 7, 6 (June), 517-529.

WEBMAP, 2002. WebMap. http://www.webmap.com/.


Interactive Poster: An XML Toolkit for an Information Visualization Software Repository

Jason Baumgartner*, Katy Börner, Nathan J. Deckard, Nihar Sheth
Indiana University, SLIS, Bloomington IN 47405, USA
* jlbaumga@indiana.edu

Introduction

In (Baumgartner & Börner, 2002) we motivated the need for, and introduced the beginnings of, a general software repository supporting education and research in information visualization (Börner & Zhou, 2001). This poster describes the general architecture of the XML toolkit and reviews the currently available data analysis, layout and interaction algorithms as well as their interplay. Last but not least, it describes how new code can be integrated.

XML Toolkit Architecture

The unified toolkit architecture aims to provide a flexible infrastructure in which multiple data analysis and information visualization (IV) algorithms can be incorporated and combined. This structure allows concurrent visualization of, and interaction with, the same datasets accessed through standard model interfaces. The supported models include the TreeModel, TableModel, and ListModel, which are part of the standard Java edition (J2SE), along with the MatrixModel and NetworkModel, which are additional interfaces supported in this framework.

A persistence factory is utilized to provide a general and interchangeable layer for persisting and restoring these various data models. The persistence target could be an object database, a flat file, an XML datastore, etc.

The implemented persistence layer is an XML-based interchange format that is used to unify data input, interchange, and output formats. The factory and interface classes allow all software packages to implement and use a defined XML schema set that is hidden away in the persistence layer of the toolkit. This ensures that software packages can be easily interchanged, compared, and combined through the models that are generated, instead of through algorithm-by-algorithm direct use of the XML structure. Also, simple configurations of the XML input format suffice to use different algorithms in a wide variety of applications, as they may produce different model types that are supported by different IV algorithms. Finally, all the Java-based IV algorithms can be run in stand-alone mode, as an applet or as an application.


Figure 1: General architecture of the XML toolkit.

The general structure of the IV repository XML toolkit, depicted in Figure 1, relies on the use of factory and interface classes to interact with the various data analysis algorithms and to instantiate and populate the various visualization algorithms. Each algorithm class must implement at least one of the model interfaces for its internal data model in order to be registered with the toolkit. The XML data and the interfaced objects are managed through the persistence layer and the model interfaces, which control access to the data and its population into the objects.
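As a concrete, hypothetical illustration of this separation, the sketch below shows an algorithm working purely against the standard J2SE TreeModel interface while a factory handles persistence; the PersistenceFactory name and its methods are invented here for illustration and are not the toolkit's actual API:

    import javax.swing.tree.TreeModel;

    // Hypothetical factory interface; the real toolkit's class names may differ.
    interface PersistenceFactory {
        TreeModel restoreTree(String source);             // e.g. from an XML datastore
        void persistTree(TreeModel model, String target);
    }

    final class RadialTreeAlgorithm {
        void run(PersistenceFactory persistence) {
            // The algorithm never touches the XML schema directly: it works
            // against the model interface and lets the factory handle I/O.
            TreeModel data = persistence.restoreTree("hierarchy.xml");
            layout(data);
            persistence.persistTree(data, "hierarchy-out.xml");
        }

        private void layout(TreeModel data) { /* radial tree layout over the model */ }
    }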

Figure 2: Different visualizations of TreeModel data: JTree, TreeMap, Hyperbolic Tree, and Radial Tree.


Figure 2 shows visualizations generated by algorithms that support a TreeModel for their data management.

Integrating New Code

To integrate code, a contribution must either build one of the supported model types (ListModel, TableModel, TreeModel, MatrixModel, or NetworkModel) or use one or more of these models for an algorithm's data representation. A developer therefore either builds their code directly against at least one of these model types, or programs a wrapper that converts their own data structure to one of the interfaces.
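A minimal sketch of such a wrapper (our example, not toolkit code): an existing custom node structure is exposed through the standard J2SE TreeModel interface, so that the toolkit's tree visualizations can consume it unchanged.

    import javax.swing.event.TreeModelListener;
    import javax.swing.tree.TreeModel;
    import javax.swing.tree.TreePath;
    import java.util.ArrayList;
    import java.util.List;

    // A pre-existing custom data structure.
    final class MyNode {
        final String name;
        final List<MyNode> children = new ArrayList<>();
        MyNode(String name) { this.name = name; }
    }

    // The wrapper: adapts MyNode to the standard TreeModel interface.
    final class MyNodeTreeModel implements TreeModel {
        private final MyNode root;

        MyNodeTreeModel(MyNode root) { this.root = root; }

        @Override public Object getRoot() { return root; }
        @Override public Object getChild(Object parent, int index) {
            return ((MyNode) parent).children.get(index);
        }
        @Override public int getChildCount(Object parent) {
            return ((MyNode) parent).children.size();
        }
        @Override public boolean isLeaf(Object node) {
            return ((MyNode) node).children.isEmpty();
        }
        @Override public int getIndexOfChild(Object parent, Object child) {
            return ((MyNode) parent).children.indexOf(child);
        }
        // The wrapped structure is treated as read-only in this sketch.
        @Override public void valueForPathChanged(TreePath path, Object newValue) { }
        @Override public void addTreeModelListener(TreeModelListener l) { }
        @Override public void removeTreeModelListener(TreeModelListener l) { }
    }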

There is also an interface for processing formatting options, called IVFormat, which generalizes common node and edge format options (background, foreground, font, size, etc.). The toolkit can work without the IVFormat interface, as defaults are populated for all formatting options.

Outlook

The toolkit is available for non-commercial purposes. The following issues are all under current development or planned for continued development:

• Finalize the design and development of a simple graphical user interface for the application layer of the toolkit, to more easily interact with data analysis and IV code pieces.

• Continue to incorporate other algorithms into the toolkit where licensing allows.

• Allow for a dynamic lookup of classes that can be integrated with the toolkit via Java reflection over a working directory and/or a set of jar files.

• Save interaction data, such as manipulation changes in the data, the state of the visualization, etc., which could be advantageous for comparing visualizations of different data sets, among other uses.

• Employ metadata schemas, such as the Dublin Core and the Resource Description Framework (RDF), to provide an interoperable way to represent meaning with data. The current schema for the persistence layer provides both a data description and a view description for the various node-link structures. The resource description in the schema defines a tag set related to the Dublin Core tag set, so it can be easily transformed to straight Dublin Core. Therefore the use of RDF and Dublin Core can be interchanged via XSLT transformations to the schema. This will allow generalization to an RDF / Dublin Core representation and back to the schema of the toolkit. The schema set will be registered with the Open Archives Initiative (OAI) protocol (Lagoze & Sompel, 2001) to allow the greatest interoperability of the data. Furthermore, the existing schemas are all centered on exclusive document description, vector graphic markup, geographical layout, etc. The focus of the initial implementation will be on standard model structures and at this time will not include geographical layout representations found in most geographical information systems (GIS). The versioning of schemas will allow for extension of other schemas, e.g. scalable vector graphics (SVG), and future versions that could directly support items like GIS.

• Provide user documentation, JavaDoc, and a workshop on how to use the toolkit, including how to implement algorithms that work with the general data model interfaces.

We hope that the information visualization community will adopt this toolkit to create a central data-and-code repository for IV research and education. We believe the proposed architecture is flexible enough to facilitate easy sharing, comparison, and evaluation of existing and new IV algorithms. Its wide adoption will help the community to collectively understand the underlying issues of differing visualizations and to pool together existing and future IV efforts.

Acknowledgements

We are grateful to the students taking the IV class at Indiana University in Spring 2001, 2002, and 2003. They have provided invaluable input into the design and usage of the toolkit.

Todd Holloway, Ketan Mane, Sriram Raghuraman, Nihar Sanghvi, Sidharth Thakur, Yin Wu, Ning Yu, and Hui Zhang contributed to the integration of diverse software packages into the repository.

Ben Shneiderman, Matthew Chalmers, Michael Berry, Jon Kleinberg, Teuvo Kohonen and their respective research groups generously contributed source code to the repository.

References

Baumgartner, J., & Börner, K. (2002). Towards an XML Toolkit for a Software Repository Supporting Information Visualization Education. Paper presented at the IEEE Information Visualization Conference, Boston, MA.

Börner, K., & Zhou, Y. (2001, July 25-27). A Software Repository for Education and Research in Information Visualization. Paper presented at the Fifth International Conference on Information Visualisation, London, England: IEEE Press, pp. 257-262.

Lagoze, C., & Sompel, H. V. (2001). The Open Archives Initiative: Building a low-barrier interoperability framework. Paper presented at the First ACM+IEEE Joint Conference on Digital Libraries, Portland, Oregon, USA: ACM Press.


Interactive Poster: Trend Analysis in Large Timeseries of High-Throughput Screening Data Using a Distortion-Oriented Lens with Semantic Zooming

Dominique Brodbeck, Macrofocus GmbH, dominique.brodbeck@macrofocus.com
Luc Girardin, Macrofocus GmbH, luc.girardin@macrofocus.com

Abstract

We present a design study that shows how information visualization techniques and information design principles are used to interactively analyze trends in large amounts of raw data from high-throughput screening experiments. The tool summarizes trends in the data both in space and time, through the use of distortion-oriented magnification as well as semantic zooming. Careful choice of visual representations allows an information-rich yet easily interpretable display of all the data and statistical indicators in a single view. It is used commercially for quality control of measurements in the drug discovery process.

1. Introduction

High-throughput screening is a technique used in the drug discovery process to find lead candidates for further biological screening and pharmacological testing. Biological targets are thereby tested against large chemical compound libraries, and the intensity (e.g. fluorescence) of the chemical reactions with all the compounds is measured. Typical libraries contain 100'000 to 1 million compounds. Several hundred of them are filled into the wells of a microtiter plate and are brought into contact with the target substance. All the reactions in the wells then take place and are measured in parallel at the same time. This is repeated sequentially with as many plates as it takes to test all the compounds. The processing of such an assay is performed automatically by a robot in several screening runs and stretches over hours or days.

For subsequent data analysis, we therefore have to deal with on the order of 10² measurements per plate, for 10³ plates, leading to a total of 10⁵ to 10⁶ values. In a first step, the quality of the raw data needs to be assessed in terms of signal strength, background noise, and other effects introduced by changes in the environment during the course of the measurements. The result of this quality control leads to the elimination of bad plates and serves as input for the choice of normalization and correction modes. After this assessment, the data is normalized and corrected, and the timing information discarded. Time is only an artefact of the measuring process and not relevant for the identification of lead candidates.

In the following we describe a tool, named TrendDisplay, that supports the quality control process for raw high-throughput screening data. It solves the problem of representing and evaluating large amounts of time-dependent measured data. In particular, our design objectives were:

• show the trend of the raw data for all the wells across a plate
• show the trend of the raw data over time, on different time scales
• provide comparison with additional derived statistical values (signal to noise ratio, standard deviation, etc.)
• allow masking of plates based on thresholding of any combination of derived values
• industrial-strength information design and ease-of-use

Figure 1: TrendDisplay showing trends both across space and time of, in this example, 230'400 measurement values, revealing saturation effects, time-dependent drift, as well as outliers. A bifocal lens with semantic zooming allows quick access to and investigation of temporal discontinuities and anomalies. Derived values such as standard deviation (blue) or number of inhibitor reactions (green) are plotted in the top panel. Thresholds can be set interactively for the active plot (bold blue line) to visually define masking criteria.



2. TrendDisplay

TrendDisplay is composed of two panels: the main panel at the bottom shows all the measured values in one view, and the top panel shows various derived statistical values (Figure 1). The two panels share the same timeline (x-axis) along which the plates are positioned according to when they were measured. The background shading (light/dark) highlights the boundaries of the individual screening runs that make up the whole assay. The time axis at the top shows date and start time for each screening run, whereas the axis at the bottom shows their respective duration. The time gaps between screening runs are removed, to keep the representation contiguous and to save screen space. In addition to this "relative" time mode, the axis can also be switched to show the sequence number of the plates only.

An individual plate is represented as a perceptually linear greyscale density distribution of all the measured values that it contains. In order to avoid the visual activation of empty space between plates, the density distributions are drawn in such a way that they appear as a contiguous band along the horizontal direction, i.e. each individual band is connected to its neighbors to the left and right. However, we do insert a break for large gaps, in order to prevent the bands from becoming overly asymmetric. This makes it easy to spot places with highly irregular time stamp distributions.

To cope with the large number of plates and to provide access to details on different time scales, we make use of a distortion-oriented magnification technique, namely a bifocal lens [Apperley et al. 1982]. The lens can be opened and its position manipulated by using the two handles at the bottom of the display. Alternatively, an area of interest can be chosen by rubberbanding the desired interval directly in the display, or by double-clicking on a screening run, in which case the lens boundary is positioned at the boundaries of the screening run.

There are various ways to represent a set of measured values and their statistical characteristics, each with their own properties. We therefore implemented the lens as a semantic zoom [Bederson and Hollan 1994], choosing the appropriate representation depending on the amount of available screen space per plate at a certain magnification factor. There are four different levels of detail (from lowest to highest magnification): greyscale density distributions, thin box plots [Tufte 1983], box plots plus individual outliers, and bar histograms (Figure 2). The magnification factor inside the lens is controlled by the zoom slider just below the lens position controls.
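The decision itself can be as simple as a cascade over the pixels available per plate, as in this sketch (the cut-off values are invented for illustration):

    // Choose a plate representation from the available screen space per
    // plate at the current magnification factor.
    enum PlateView { DENSITY_BAND, THIN_BOX_PLOT, BOX_PLOT_WITH_OUTLIERS, BAR_HISTOGRAM }

    final class SemanticZoom {
        static PlateView choose(double pixelsPerPlate) {
            if (pixelsPerPlate < 4)  return PlateView.DENSITY_BAND;
            if (pixelsPerPlate < 12) return PlateView.THIN_BOX_PLOT;
            if (pixelsPerPlate < 40) return PlateView.BOX_PLOT_WITH_OUTLIERS;
            return PlateView.BAR_HISTOGRAAM_PLACEHOLDER == null ? null : PlateView.BAR_HISTOGRAM;
        }
        private static final PlateView BAR_HISTOGRAAM_PLACEHOLDER = null; // unused guard removed below
    }

    // Simplified, equivalent form:
    //     if (pixelsPerPlate >= 40) return PlateView.BAR_HISTOGRAM;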

Figure 2: The four different levels of detail: density distributions, thin box plots, box plots plus outliers, bar histograms (left). Brushing and linking: plates can be masked or marked without losing the representation of the underlying data.


Both panels can also be magnified in the vertical direction independently, by using the range sliders on the right side of the panels or by rubberbanding the desired interval directly in the display. The vertical magnification is implemented as a standard linear zoom, because the y-axis represents a physical scale on which metric comparisons need to be performed, and where geometric distortions would lead to misinterpretations. We use gesture recognition to automatically detect whether the desired rubberband interval should be applied to the vertical or horizontal direction, freeing users from having to learn special keystrokes. All zooming and lens-positioning transitions are smoothly animated, to guarantee object constancy and avoid change blindness effects.

In addition to the measured reaction signals, there are several control signals (e.g. neutral reaction signal) and various derived statistical values that need to be visualized and correlated with the compound data. Selected control signals can be overlaid directly over the density distributions in the form of a line plot. In the upper panel, any number of derived statistical values can be plotted. We use different plotting styles that are optimized for the different time scales. Outside the lens, values are plotted in histogram style, to avoid aliasing problems caused by quasi-vertical lines. Inside the lens, values are represented as black dots that are connected by straight lines.

If multiple derived statistics are selected concurrently, then they are overplotted in the same panel on different layers. Each of them is equipped with its own adjustable coordinate system, so that users can freely scale and shift the plots in the vertical direction in order to arrange or overlay them appropriately. In addition, there is an upper and a lower threshold for each of the derived statistics that can be set to visually define certain masking criteria (e.g. mask all plates whose standard deviation is above r). Thresholds are represented by semi-transparent "curtains" that extend into the panel from the top and bottom.
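Conceptually, the masking test is a simple predicate over the derived statistics, as in this sketch (the names and array layout are ours, not TrendDisplay's):

    // A plate is masked when any selected derived statistic falls outside
    // its lower/upper threshold "curtains". Arrays are indexed by statistic.
    final class ThresholdMask {
        static boolean masked(double[] stats, double[] lower, double[] upper) {
            for (int i = 0; i < stats.length; i++) {
                if (stats[i] < lower[i] || stats[i] > upper[i]) {
                    return true;
                }
            }
            return false;
        }
    }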

TrendDisplay supports brushing and linking. Plates can be selected, marked, or masked, which is indicated by different coloring in the main panel and by little flags in the status strip along the bottom of the display (Figure 2).

3. Conclusion

TrendDisplay is embedded as a component in a comprehensive data analysis suite for biotechnology applications. It receives enthusiastic feedback from customers and enjoys commercial success. We envision similar applications of the approach and techniques described here in timeseries-heavy areas such as finance, event scheduling, or project management.

4. References

APPERLEY, M.D., TZAVARAS, I. AND SPENCE, R. 1982. A Bifocal Display Technique for Data Presentation. In Proceedings of Eurographics '82, Conference of the European Association for Computer Graphics, pp. 27-43.

BEDERSON, B. B. AND HOLLAN, J. D. 1994. Pad++: A Zooming Graphical Interface for Exploring Alternate Interface Physics. In Proceedings of UIST '94, ACM Symposium on User Interface Software and Technology, Marina del Rey, CA, pp. 17-26.

TUFTE, E. R. 1983. The Visual Display of Quantitative Information. Graphics Press, Cheshire, Connecticut.


Interacting with Transit-Stub Network Visualizations

James R. Eagan (eaganj@cc.gatech.edu), John Stasko (stasko@cc.gatech.edu), and Ellen Zegura (ewz@cc.gatech.edu)
GVU Center, College of Computing, Georgia Institute of Technology, Atlanta, GA 30332

Abstract

Real-world data networks are large, making them difficult to analyze. Thus, analysts often generate network models of a more tractable scale to perform simulations and analyses, but even these models need to be fairly large. Because these networks do not directly correspond to any particular network, it is often difficult for the user to construct a mental model of the network. We present a network model visualization system developed with networking researchers to help improve the design and analysis of these topologies. In particular, this system supports manipulation of the network layout based on hierarchical information; a novel display technique to reduce clutter around transit routers; and the mixture of manual and automatic interaction in the layout phase.

CR Categories: H.5.2 [Information Systems]: Information Interfaces and Presentation—User Interfaces

Keywords: network visualization, graph layout, graph manipulation

1 Introduction

Because of the scale of real-world networks, networking researchers typically use network models of a more manageable scale on which to perform analyses. Tools such as the Georgia Tech Internet Topology Modeler (GT-ITM) [Calvert et al. 1997] generate pseudo-random network topologies on which researchers can perform their analyses. These networks are pseudo-random in the sense that they are randomly generated within the constraints of various properties that have been identified as existing in many real-world networks. One limitation of these systems is that the output of the model generator is an abstract description of a network; the leading feature request for GT-ITM is "How can I see what this topology looks like?"

To aid in the analysis of these network models, we created the NetVizor system, a tool designed to visually display the network models generated by GT-ITM. In designing NetVizor, we met with networking researchers to identify the tasks and peculiarities of the particular problems they address when looking at network topologies. One problem in particular is the generation of a suitable layout for a network.

To help address the problem of graph layout, we propose a general method of attack that mixes automatic layout algorithms with manual interaction. Another problem that our networking participants face is the publication of generated models. As such, the aesthetics of the layout are important to convey the structure of the topology adequately. To help the user refine the layout, we take advantage of the hierarchical nature of real-world networks and use hierarchy information to aid in the manipulation of the layout of nodes and domains in the visualization. Lastly, we introduce a "fudge factor" in the visualization that adds virtual aggregate edges to reduce clutter around transit domains. We discuss these three techniques in more detail in the next few sections.

2 Related Work

Although the network topologies we are working with are not general graphs, work in the field of graph layout is relevant, and a great deal of work has gone into this field [Battista et al. 1999]. We leverage this existing work, focusing instead on the application of these techniques to this particular network layout problem.

The Nicheworks system [Wills 1999] and the H3 browser [Munzner 1997] operate on arbitrary graphs, but do not provide explicit support for hierarchical or nested graphs like the ones generated by GT-ITM. The layouts generated by Nicheworks are primarily static with respect to manual repositioning of the nodes within the graph. The H3 browser supports good interaction with the graph, but the layout is fixed in its hyperbolic space: the user changes perspective on the graph rather than how everything is laid out.

The GraphVisualizer3D (GV3D) system [Ware et al. 1997] and the HINTS system [do Nascimento and Eades 2001] each involve the user in the layout process. In GV3D, the user plays a post-hoc cleanup role in the layout process. In the HINTS system, the user provides hints about the structure of the graph to improve the performance of the automatic layout algorithm for the purpose of generating a better layout. No emphasis is placed on improving the user's understanding of the structure of the topology.

Nam [Estrin et al. 1999], the network animator, provides an animation of a network trace, but has very rudimentary layout and interaction capabilities; its focus lies on the animation of trace data. Tools such as the Extended Nam Editor [nam 2003] provide more robust editing capabilities.

3 Transit-Stub Models

The models generated by GT-ITM follow the transit-stub model of networks. In this model, nodes, which represent routers on the network, are organized into logical domains, or collections of nodes. Nodes within a domain tend to be fairly interconnected within the domain, but rarely connect to nodes outside of the domain. Domains themselves are then classified into two types: transit domains and stub domains. Nodes in a stub domain are typically an endpoint in a network flow: network traffic either originates at or is destined for a node in a stub domain. Nodes in transit domains are typically intermediate in a network flow: traffic is typically just passing through. For example, one of UUnet's backbone routers would be in a transit domain, while a router at the local ISP would be in a stub domain.
be in a stub domain.


Figure 1: (a) Traditional graph view; (b) spurred graph view.

4 Manual-Automatic Hybrid Layout

We suspect that mixing manual interaction with automatic layout can help the user of the system forge a stronger mental understanding of the structure of the model topology. This aid is particularly important in this case because the topologies being presented do not directly correspond to any existing real-world network. By letting the user do some of the work, he or she can better understand the process that is taking place and the structure of the network; by doing most of the work automatically, the system can keep the task from becoming too tedious. Thus, the user can "sketch out" a high-level overview of the layout, while the system fills in the details.

When loading a new topology, the system presents the user with three options: lay out the network automatically; lay out the network manually; or lay out the network using a mixture of the two. In the last case, the user is presented with a blank canvas and a list of the domains in the network. The user then assigns a position to each transit domain in the network; as a position is defined, the system runs an automatic layout algorithm on the stub domains that peer with that transit domain and on all of the nodes within each of those domains. For a 2000-node topology, the manual component of the layout process typically consists of laying out 10-15 transit domains.
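The division of labor can be summarized in a short sketch (class and method names are ours, not NetVizor's): the user supplies a position for each transit domain, and the system immediately lays out the stub domains that peer with it.

    import java.awt.geom.Point2D;
    import java.util.List;
    import java.util.Map;

    final class HybridLayout {
        private final Map<String, List<String>> stubPeers; // transit domain -> stub domains

        HybridLayout(Map<String, List<String>> stubPeers) {
            this.stubPeers = stubPeers;
        }

        // Called whenever the user drops a transit domain on the canvas.
        void onTransitPlaced(String transitDomain, Point2D position) {
            place(transitDomain, position);
            for (String stub : stubPeers.get(transitDomain)) {
                autoLayout(stub, position); // the system fills in the details
            }
        }

        private void place(String domain, Point2D p) { /* record the position */ }

        private void autoLayout(String stub, Point2D near) {
            /* run an automatic layout for the stub domain and the nodes
               it contains, anchored near the transit domain's position */
        }
    }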

5 Aggregation Spurs<br />

Typically, many stub domains connect to a single transit node in<br />

a transit domain, with many other stub domains connecting to the<br />

other nodes within the transit domain. When drawn on the screen,<br />

this creates a “ball of string” as many edges converge on a small area of the screen. To help combat this problem, we introduce a virtual aggregation edge, which we call a “spur”, to the network.

Each spur draws a transit node outside of the domain and creates a<br />

larger area for all of the stub peers of a transit domain to converge<br />

upon (see Figure 1).

6 Hierarchical Manipulation<br />

We take advantage of the hierarchical nature of the transit-stub<br />

model when manipulating the layout of the graph. When the user<br />

drags a node on the screen, its position is constrained within the<br />


domain it is in. When a domain is moved on the screen, all of the<br />

nodes within the domain move with it, as the user would expect.<br />

When the user changes the position of a transit domain, however,<br />

all of the stub domains that peer with it move as well, in addition<br />

to the nodes within the domains. Thus, one reposition of the transit<br />

domain can move the entire group of domains associated with that<br />

domain, as the user would typically wish to do. Similarly, when the<br />

user adjusts the position of one of the spurs, all of the domains that<br />

peer with that node are repositioned.<br />
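In the same illustrative vocabulary (our names, not the system's), the hierarchical drag semantics reduce to a short sketch:

class HierarchicalDrag {
    // Moving a transit domain drags its peering stub domains along with it;
    // moving any domain moves the nodes it contains.
    void moveDomain(Domain d, double dx, double dy) {
        translate(d, dx, dy);
        if (d.type == DomainType.TRANSIT) {
            for (Domain stub : d.peers) translate(stub, dx, dy);
        }
    }
    // Dragging a single node is clamped to the bounds of its own domain.
    void moveNode(RouterNode n, double dx, double dy) {
        translateClampedToDomain(n, dx, dy);
    }
    void translate(Domain d, double dx, double dy) { /* move domain and its nodes */ }
    void translateClampedToDomain(RouterNode n, double dx, double dy) { /* constrained move */ }
}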

References<br />

BATTISTA, G. D., EADES, P., TAMASSIA, R., AND TOLLIS, I. G.<br />

1999. Graph Drawing — Algorithms for the Visualization of<br />

Graphs. Prentice Hall.<br />

CALVERT, K., DOAR, M., AND ZEGURA, E. W. 1997. Modeling<br />

Internet topology. IEEE Communications Magazine (June).

DO NASCIMENTO, H. A. D., AND EADES, P. 2001. A system for<br />

graph clustering based on user hints. In Pan-Sydney Workshop<br />

on Visual Information Processing.<br />

ESTRIN, D., HANDLEY, M., HEIDEMANN, J., MCCANNE, S.,<br />

XU, Y., AND YU, H. 1999. Network visualization with the<br />

vint network animator nam. Tech. Rep. 99-703, University of<br />

Southern California.<br />

MUNZNER, T. 1997. H3: Laying out large directed graphs in 3D hyperbolic space. In IEEE Symposium on Information Visualization,

2–10.<br />

NAM, 2003. Extended Nam Editor.

WARE, C., FRANCK, G., PARKHI, M., AND DUDLEY, T. 1997.<br />

Layout for visualizing large software structures in 3D. In Visual '97,

Second International Conference on Visual Information<br />

Systems, 215–225.<br />

WILLS, G. J. 1999. Nicheworks — interactive visualization of very<br />

large graphs. Journal of Computational and Graphical Statistics<br />

8, 2, 190–212.


MVisualizer: A Visual Tool for Exploring Clinical Data

Nils Erichson<br />

School of Computer Science and Engineering

Chalmers University of Technology<br />

SE-412 96 Gothenburg, Sweden<br />

d97nix@dtek.chalmers.se<br />

This paper describes MVisualizer, a visual tool to help clinicians<br />

visualize and explore large sets of clinical data. The application<br />

has been developed through a user-centric process to maximize its<br />

usability with regard to non-computer scientists. MVisualizer uses<br />

a drag-and-drop-based interaction method to allow the user to move<br />

sets of data between views that offer different visualizations. User<br />

tests indicate that this interaction method is well suited to the task.<br />

Keywords: information visualization, medical informatics, user-centric design

1 Introduction

As health care becomes more computerized, an abundance of clinical<br />

data is becoming available to doctors and medical researchers.<br />

To make the most of this data, it needs to be made accessible<br />

to interested parties in a way that is understandable and easy to<br />

explore. Thus arises the need for information visualization [McCormick et al. 1987; Valdés-Pérez 1999]. A traditional problem in

this field is that it has mainly been developed by computer scientists<br />

without cooperation from end users [Sakas and Bono 1996].<br />

Because of this, development of end-user applications for medical<br />

information visualization needs to be done with the user in mind.<br />

Since 1995, clinicians at the Clinic of Oral Medicine at the<br />

Sahlgrenska Academy, Gothenburg University have been collecting<br />

patient data in a knowledge base based on a definitional formal<br />

model [Falkman and Torgersson 2002] as part of the MedView<br />

project [Ali et al. 2000]. This knowledge base needs to be made<br />

accessible to researchers and clinicians in a way that is easy to use.<br />

Previous attempts to create tools for visualization of the MedView knowledge base have been made [Falkman 2001]. However, clinicians have generally received these earlier tools with little enthusiasm because the concepts they present were perceived as too complicated or too abstract. This underscores the need for a visualization tool that is accessible to clinicians.

2 Design

The user’s goal is to explore the data to find similarities and connections<br />

in the patient data, both between different examinations<br />

and within different aspects of single examinations. In medical research,<br />

“half the battle is finding the right question to ask”. Thus,<br />

the new application has been designed to allow the user to view<br />

as much information as is desired at a given time. This led to the<br />

choice of a window-based interface.<br />

The new application has been developed in close collaboration<br />

with the users, through frequent communication and testing, to ensure<br />

that the result is a usable, “hands-on” visualization tool.<br />


Göran Zachrisson<br />

School of Electrical Engineering<br />

Chalmers University of Technology<br />

SE-412 96 Gothenburg, Sweden<br />

e8gz@etek.chalmers.se<br />

3 MVisualizer

Figure 1: MVisualizer in action. The Data Group “Kvinnor” (females)<br />

has been selected, which results in a global selection of all

elements that represent women across all of the views.<br />

MVisualizer is a graphical tool for visualization and exploration<br />

of clinical data. The application presents a window-based interface<br />

which uses a drag-and-drop interaction method to encourage the<br />

user to move data around and examine it in different ways.

The user transfers patient data (through drag-and-drop operations)<br />

into one or more views, where different types of views present<br />

different visualizations of the data. This method of moving data between<br />

different types of views was pioneered in Visage [Roth et al.<br />

1996].<br />

Each data element belongs to a Data Group. The purpose of<br />

putting elements into Data Groups is to create a conceptual “bookmark”<br />

grouping of these elements. What such a grouping represents<br />

is entirely up to the user. Examples of Data Groups might<br />

range from the simple (males vs. females) to the complex (e.g., female industry workers aged 25-40 who smoke more than 10 cigarettes

per week). Each Data Group is assigned a unique color, which creates<br />

a visual cue to help the user differentiate between elements<br />

from different sets of data when viewing them together, either by

using multiple views or by combining two or more data sets into<br />

the same view.<br />

All views support manual (graphical) selection of a subset of<br />

data elements, which can then be dragged and dropped into another<br />

view or another Data Group. In cases where manual selection does<br />

not provide enough detail to make the desired selection, a tool for<br />

making selections through dynamic queries is provided as well.<br />

The second visual cue to differentiate between elements is global<br />

selection. When an element is selected in one view, it is selected<br />

(highlighted) in all other views as well (figure 1). This allows the<br />

user to quickly see which elements correspond to each other in different<br />

views without having to group them.
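Global selection amounts to a selection model shared by all views; the minimal Java sketch below illustrates the idea with plain listeners. The class names are ours, not MVisualizer's.

import java.util.ArrayList;
import java.util.List;

class SelectionModel {
    interface View { void selectionChanged(List<Integer> selectedIds); }

    private final List<View> views = new ArrayList<>();
    private List<Integer> selection = new ArrayList<>();

    void register(View v) { views.add(v); }

    // Selecting elements in one view highlights them in every registered view.
    void select(List<Integer> elementIds) {
        selection = new ArrayList<>(elementIds);
        for (View v : views) v.selectionChanged(selection);
    }
}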


When visualizing a large knowledge base, the value domain can<br />

become very large. There are hundreds of different kinds of fruit to<br />

be allergic to, different brands of medicines that contain the same<br />

active substance, etc. This can make analysis difficult if the diversity

of the values becomes too high to observe trends in the data, especially<br />

when the information in the knowledge base is more detailed<br />

than what the user considers to be relevant. This can also make<br />

certain views appear congested, which places a high cognitive load<br />

on the user. This problem has been solved by letting the user create<br />

aggregations which unify similar values under a superset value<br />

(figure 2). For example, allergies to oranges, lemons or kiwi fruits<br />

can be unified under a single allergy type named “citrus fruits”, and<br />

all brands of pain killers that contain Ibuprofen can be unified under<br />

the value “Ibuprofen”. Aggregations are created in a graphical<br />

editor, and can be stored in a library for later re-use.<br />

Figure 2: Aggregation: The right bar chart contains the same elements<br />

as the left, but with a more unified value domain through<br />

application of an aggregation.<br />
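The aggregation mechanism is essentially a many-to-one mapping from detailed values to superset values. Here is a minimal sketch, with names of our own choosing, mirroring the citrus-fruit example:

import java.util.HashMap;
import java.util.Map;

class Aggregation {
    private final Map<String, String> toSuperset = new HashMap<>();

    void unify(String superset, String... values) {
        for (String v : values) toSuperset.put(v, superset);
    }

    // Unmapped values pass through unchanged.
    String apply(String value) { return toSuperset.getOrDefault(value, value); }
}

// Usage:
//   Aggregation a = new Aggregation();
//   a.unify("citrus fruits", "orange", "lemon", "kiwi");
//   a.apply("lemon");   // -> "citrus fruits"
//   a.apply("peanut");  // -> "peanut"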

To increase the usefulness of the application to clinicians, related<br />

MedView software components such as automatic journal<br />

generation, a photo browser (figure 3) and a simple statistics view<br />

have been integrated into MVisualizer. This increases their usability, as MVisualizer’s drag-and-drop interaction method can be applied to these components. For example, dragging

images from the Photo View to a graph view creates a graph of the<br />

examinations that the images belong to.<br />

For further analysis, data can be exported to Microsoft Excel<br />

format, or to a text-based format for use in statistical tools such as

SPSS.<br />

4 Results

Initial testing and user feedback have so far been promising. The

users that have tested the application find it more appealing and<br />

accessible than the previous attempts mentioned in section 1. It<br />

also seems to have a shorter learning curve, as most users can start<br />

using the application after a brief (10-15 minute) demonstration.<br />

This suggests that the described interaction method is well suited to<br />

this type of application.<br />

The application is currently in use by clinicians at the Clinic of<br />

Oral Medicine at the Sahlgrenska Academy, Gothenburg University.<br />

So far use of the application has led to some interesting discoveries.<br />

One example can be seen in figure 4, where a view displays a<br />

stacked bar chart of all patients diagnosed with Oral Lichen Planus.<br />

To the far right, we can see that there is an overrepresentation of<br />

patients taking Östrogen (estrogen) in this group.<br />


Figure 3: The Photo View: Images are draggable, and represent the<br />

examinations that the images belong to.<br />

Figure 4: A practical result of MVisualizer in use. Note the overrepresentation<br />

of patients taking Östrogen (estrogen).<br />

References

ALI, Y., FALKMAN, G., HALLNÄS, L., JONTELL, M., NAZARI, N., AND<br />

TORGERSSON, O. 2000. MedView—design and adaption of an interactive<br />

system for oral medicine. In Medical Infobahn for Europe: Proceedings<br />

of MIE2000 and GMDS2000, IOS Press.<br />

FALKMAN, G., AND TORGERSSON, O. 2002. Knowledge acquisition and<br />

modeling in clinical information systems: A case study. In Proceedings<br />

of the 13th International Conference on Knowledge Engineering<br />

and Knowledge Management, EKAW 2002, Springer-Verlag, vol. 2473<br />

of LNAI, 96–101.<br />

FALKMAN, G. 2001. Information visualization in clinical odontology. Artificial<br />

Intelligence in Medicine 22, 2, 133–158.<br />

MCCORMICK, B., DEFANTI, T. A., AND BROWN, M. D. 1987. Visualization<br />

in scientific computing. ACM SIGGRAPH Computer Graphics

21, 6.<br />

ROTH, S., LUCAS, P., SENN, J., GOMBERG, C., BURKS, M., STROFFOLINO, P., KOLOJEJCHICK, J., AND DUNMIRE, C. 1996. Visage: A

user interface environment for exploring information. In Proceedings of<br />

Information Visualization, IEEE, 3–12.

SAKAS, G., AND BONO, P. 1996. Medical visualization. Computers &

Graphics: Special Issue on Medical Visualization 20, 6, 759–762.<br />

VALDÉS-PÉREZ, R. E. 1999. Principles of human-computer collaboration<br />

for knowledge discovery in science. Artificial Intelligence 107, 2, 335–<br />

346.


Interactive Poster: The InfoVis Toolkit
Jean-Daniel Fekete
INRIA Futurs & Laboratoire de Recherche en Informatique (LRI)
Bat 490, Université Paris-Sud
91405 ORSAY, FRANCE
Jean-Daniel.Fekete@inria.fr
Abstract

The InfoVis Toolkit is designed to support the creation, extension<br />

and integration of advanced 2D Information Visualization<br />

components into interactive Java Swing applications. The InfoVis<br />

Toolkit provides specific data structures to achieve a fast<br />

action/feedback loop required by dynamic queries. It comes with<br />

a large set of components such as range sliders and tailored<br />

control panels to control and configure the visualizations.<br />

Supported data structures currently include tables, trees and<br />

graphs. Supported visualizations include scatter plots, time series,<br />

Treemaps, node-link diagrams for trees and graphs and adjacency<br />

matrix for graphs. All visualizations can use fisheye lenses and<br />

dynamic labeling. The InfoVis Toolkit supports hardware<br />

acceleration when used with Agile2D, an OpenGL-based<br />

implementation of the Java Graphics API, resulting in speedup

factors of 10 to 200.<br />

1 Introduction<br />

Figure 1: Examples of Scatter Plot, Treemap and Graph Visualizations Built with the InfoVis Toolkit<br />

Despite their well-understood potential, information visualization applications are difficult to implement. They require a set of components and mechanisms, such as range sliders, fisheye lenses and dynamic queries, that are not available in, or not well supported by, traditional GUI toolkits.

The InfoVis Toolkit has been designed to quickly specialize<br />

existing information visualization techniques to specific<br />

applications, to design and test new visualization techniques and<br />

to experiment with new uses of visual attributes such as<br />

transparency and color gradients [2]. The InfoVis Toolkit's key features are:

• Generic data structures suited to visualization;<br />

• Specific algorithms to visualize these data structures;<br />

• Mechanisms and components to perform direct<br />

manipulations on the visualizations;<br />

• Mechanisms and components to perform well-known generic<br />

information visualization tasks;<br />

• Components to perform labeling and spatial deformation.<br />


2 Structure of the InfoVis Toolkit<br />

The InfoVis Toolkit is a Java library and software architecture<br />

organized in five main parts (Figure 2): tables, columns,<br />

visualizations, components and input/ output. It brings together<br />

several ideas from different domains and assembles them in a<br />

consistent framework, similar to [1,2] but using the Java/Swing libraries instead of C++/OpenGL, which is more difficult to learn and use.

The InfoVis Toolkit provides a unified underlying data structure<br />

based on tables. Representing data structures with tables reduces the memory footprint and improves performance, compared with the ad-hoc

data structures used by other specialized InfoVis applications.<br />

Any data structure can easily be implemented on top of tables and<br />

accessed using an object-oriented interface for ease of<br />

programming.<br />

A table is a list of named columns plus metadata and user data. A<br />

column manages rows of elements of a homogeneous type, e.g. integers, floating-point numbers or strings. The elements are indexed, so

columns are usually implemented with primitive arrays. Some<br />

rows can be undefined. This mechanism is important because in<br />

real data sets, values may be missing. Allowing undefined<br />

elements is also very useful for representing general data<br />

structures such as XML elements with attributes.<br />

Columns also support the following features:<br />

• they contain metadata, e.g. to express that an integer column<br />

contains categorical or numerical values;

• they can trigger notifications when their content is modified;<br />

• they support formatting for input and output so, for example,<br />

dates can be stored in columns of the “long integer” data type

and still appear as dates when read or displayed.<br />
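A minimal sketch of such a column follows; it assumes nothing about the toolkit's real classes and simply illustrates the layout: a primitive array for the values, a BitSet marking defined rows, and a metadata map.

import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

class IntColumn {
    final String name;
    final Map<String, Object> metadata = new HashMap<>();  // e.g. "categorical" -> true
    private int[] values = new int[16];                    // primitive array backing store
    private final BitSet defined = new BitSet();           // marks rows that hold a value

    IntColumn(String name) { this.name = name; }

    void set(int row, int v) {
        if (row >= values.length) values = Arrays.copyOf(values, Math.max(row + 1, 2 * values.length));
        values[row] = v;
        defined.set(row);
    }

    boolean isDefined(int row) { return defined.get(row); }
    int get(int row) { return values[row]; }  // callers check isDefined(row) first
}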

Layout algorithms are encapsulated into Visualization<br />

components that map data structures into visual shapes.<br />

Visualizations natively support dynamic labeling [3] and fisheye<br />

views.<br />

The InfoVis Toolkit currently supports three concrete data<br />

structures: tables, trees and graphs. For each data structure, its<br />

core supports several visualizations: time series and scatter plots<br />

for tables, node-link diagrams and treemaps for trees, node-link<br />

diagrams and adjacency matrices for graphs.



Figure 2: Internal structure of the InfoVis Toolkit.<br />

Squares represent data structures whereas ellipses<br />

represent functions.<br />

3 Extending the Toolkit
Creating a new visualization technique such as the Icicle Tree

(Figure 3a) requires 50 lines of Java code. Adding direct<br />

manipulation to Icicle trees for interactive clustering requires 18<br />

additional lines of Java. Dynamic queries, dynamic labeling and<br />

fisheye views are immediately operational on this new<br />

visualization. Yet, all interactions can be tailored. Visualizations<br />

such as the Icicle tree can easily be used as a component, e.g. for<br />

controlling the clustering and permutations of a graph visualized<br />

as a matrix (Figure 3b).<br />

For greater flexibility, the toolkit creates most of its interactive<br />

components through “factory” objects, simplifying the integration<br />

of new components or new styles of interactions. For example,<br />

replacing the range sliders provided by the toolkit for performing dynamic queries with brushing histograms [4] only involves

registering the brushing histogram class as the default interactive<br />

component in the “dynamic query factory”.<br />
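The abstract does not show the factory API itself, so the following registration sketch is hypothetical; it only illustrates the pattern being described.

class DynamicQueryFactory {
    private static Class<?> defaultComponent = RangeSlider.class;
    static void setDefaultComponent(Class<?> c) { defaultComponent = c; }
    static Class<?> getDefaultComponent() { return defaultComponent; }
}

class RangeSlider { /* the toolkit's default dynamic query control */ }
class BrushingHistogram { /* the replacement interaction [4] */ }

// One registration call swaps the interaction style everywhere:
//   DynamicQueryFactory.setDefaultComponent(BrushingHistogram.class);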

One of the aims of the InfoVis Toolkit is to simplify the

implementation of new techniques. The toolkit comes with a large<br />

and growing set of examples of visualization techniques selected<br />

from papers at conferences such as InfoVis, UIST and CHI. These

implementations are useful complements to the articles for<br />

pedagogical and technical purposes.<br />

4 Performance<br />

Java graphics is notoriously slow. To overcome that problem, the<br />

InfoVis Toolkit has been designed to use hardware acceleration<br />

provided by the Agile2D 1 system when available. Agile2D is an<br />

implementation of Java graphics relying on the OpenGL library<br />

that offers hardware accelerated graphics when a hardware<br />

accelerated board is available. Visualizations still work without<br />

Agile2D but the acceleration factor offered by hardware support<br />

can reach 200 for time series and is typically around 10 to 100

for other visualization techniques, opening the toolkit to larger<br />

data sets or more sophisticated rendering techniques such as<br />

transparency, color gradients or textures with a decent redisplay<br />

speed.<br />

1 Agile2D has been designed by Jon Meyer and Ben Bederson at the<br />

University of Maryland and improved by the author to expose accelerated<br />

graphics in a portable way (see www.cs.umd.edu/hcil/agile2d).


Figure 3: a) An irregular icicle tree; b) icicle trees as components for a clustered graph showing a web site with 600 documents.

5 Conclusion<br />

The InfoVis Toolkit is distributed as free software under a liberal<br />

license (QPL) in the hope that the Information Visualization<br />

community will adopt it as a workbench for implementing new<br />

ideas within an already rich toolkit. It is available at:<br />

http://www.lri.fr/~fekete/InfovisToolkit and is currently used by<br />

several research projects in domains including biology,<br />

cartography and trace analysis. It has also proved very efficient<br />

for student projects, both in terms of development time and shared<br />

experience.<br />

We are continuing the development of the InfoVis Toolkit and are<br />

looking forward to improvements and feedback from the<br />

information visualization community.

References<br />

1. BOSCH, R., STOLTE, C., TANG, D., GERTH, J., ROSENBLUM, M. AND HANRAHAN, P., Rivet: A Flexible Environment for Computer Systems Visualization, Computer Graphics 34(1), February 2000, pp. 68-73.

2. FEKETE, J.-D. AND PLAISANT, C. Interactive Information Visualization of a Million Items. In Proceedings of IEEE Symposium on Information Visualization 2002, Boston, October 2002, pp. 117-124.

3. FEKETE, J.-D., AND PLAISANT, C. Excentric labeling: Dynamic<br />

neighborhood labeling for data visualization. In Proc. of CHI<br />

'99 ACM Press, May 1999, pp. 512-519.<br />

4. LI, Q., BAO, X., SONG, C., ZHANG, J., NORTH, C. Dynamic query<br />

sliders vs. brushing histograms, in CHI '03 extended abstracts<br />

on Human factors in computer systems Ft. Lauderdale, Florida,<br />

USA.


Interactive Poster: Overlaying Graph Links on Treemaps
Jean-Daniel Fekete *, David Wang, Niem Dang, Aleks Aris, Catherine Plaisant ‡
INRIA Futurs/LRI; HCIL, University of Maryland
Abstract

Every graph can be decomposed into a tree structure plus a set of<br />

remaining edges. We describe a visualization technique that<br />

displays the tree structure as a Treemap and the remaining edges<br />

as curved links overlaid on the Treemap. Link curves are<br />

designed to show where the link starts and where it ends without<br />

requiring an explicit arrow that would clutter the already dense<br />

visualization. This technique is effective for visualizing structures<br />

where the underlying tree has some meaning, such as Web sites or<br />

XML documents with cross-references. Graphic attributes of the<br />

links – such as color or thickness – can be used to represent<br />

attributes of the edges. Users can choose to see all links at once or<br />

only the links to and from the node or branch under the cursor.<br />

CR Categories: H.5.2 [User Interfaces], E.1 [Data Structures]: Graphs and Networks

Keywords: Information Visualization, Treemaps, Bézier Curves.<br />

1 Introduction<br />

The general problem of graph drawing and network visualization<br />

is notoriously difficult. Instead of tackling it directly, we present<br />

a method that starts from a decomposition of the graph into a tree<br />

and a set of remaining edges. This decomposition can always be<br />

done but some data structures are easier and more meaningful<br />

when decomposed that way. For example, a Web site is almost<br />

always organized in a hierarchical file system corresponding to a<br />

meaningful hierarchical organization. An XML document with<br />

cross references (e.g. a table of contents, footnotes, and index<br />

entries) can also naturally be decomposed into an XML tree plus<br />

the cross-references.

Our method uses Treemaps [1] for visualizing the tree structure<br />

and overlays links to represent the remaining edges (Figure 1).<br />

We initially used straight lines connecting the source and<br />

destination item centers but the results were very cluttered due to

the superposition of lines [2]. There have been several attempts at<br />

simplifying the representation of links on node-link diagrams.<br />

Becker et al. proposed half-lines for that purpose [3], using a<br />

straight line from the source but stopping it halfway to the<br />

destination, avoiding drawing arrowheads. We have designed a<br />

novel method for drawing the links using a curved representation<br />

where the offset of curvature indicates the direction of the link.<br />

--------------------------------------------<br />

* bat 490, Université Paris-Sud, F91405 ORSAY Cedex, FRANCE,

Jean-Daniel.Fekete@inria.fr<br />

‡ UMIACS-HCIL, A.V. Williams Building, University of Maryland, College Park, MD 20742, U.S.A. plaisant@cs.umd.edu


Figure 1: Directory structure of a Web site visualized as<br />

a Treemap with external links overlaid as curves. Blue<br />

curves are HTML links, red curves are image links.<br />

The curved link is modeled using a quadratic Bézier curve

(Figure 2a). The first and last points are placed at the middle of<br />

the source and target regions. The second point is placed at half the source-to-destination distance from the first point, on a line forming an angle of 60 degrees with the source-destination line (Figure 2b).

a)<br />

b)<br />

Figure 2: (a) The three control points of a quadratic

Bézier curve and (b) the computation of the second point.
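In code, the construction reads roughly as follows; this is a sketch based on our reading of the description, using the standard java.awt.geom classes, not the authors' implementation.

import java.awt.geom.Point2D;
import java.awt.geom.QuadCurve2D;

final class CurvedLink {
    // Quadratic Bezier whose middle control point lies at half the
    // source-destination distance from the source, rotated 60 degrees
    // off the source-destination line.
    static QuadCurve2D curve(Point2D src, Point2D dst) {
        double dx = dst.getX() - src.getX();
        double dy = dst.getY() - src.getY();
        double cos = Math.cos(Math.toRadians(60));
        double sin = Math.sin(Math.toRadians(60));
        double cx = src.getX() + 0.5 * (dx * cos - dy * sin);  // rotate the half-vector
        double cy = src.getY() + 0.5 * (dx * sin + dy * cos);
        return new QuadCurve2D.Double(src.getX(), src.getY(), cx, cy, dst.getX(), dst.getY());
    }
}

Because the control point is anchored to the source, the curve from A to B differs from the curve from B to A, which is what keeps mutual links distinguishable.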


Using this method, the curve is not symmetrical but shifted<br />

towards the source and this shift is easy to recognize visually.<br />

When two items reference each other, the two curved links remain<br />

clearly distinguishable and are not occluded. Figure 3 shows<br />

several HTML pages pointing at each other, where links are

easy to follow. Links can also be colored depending on attributes<br />

associated with the edges. We have experimented with colors but<br />

line width could also be used within reasonable limits.<br />

2 Interaction<br />

The links’ visibility can be static or dynamic. By default, the

visibility is static: all the links are shown. This setup is useful as<br />

an overview but can clutter the treemap representation, making<br />

item labels hard to read for example. When users want to focus<br />

on the tree structure or on a specific region of the visualization,<br />

they can select the dynamic visibility of links. This setup only<br />

shows links starting from or arriving at items that have been<br />

selected (nodes or branches), when a selection exists. Otherwise,<br />

it tracks the mouse and shows links starting from and arriving at<br />

the item under the pointer. This last setup is useful for<br />

dynamically exploring a visualization and watching connections<br />

between areas of interests.<br />

The curved links visualization has been integrated into the latest<br />

version of the University of Maryland “Treemap 4.1” [5]. The<br />

data consists of a Treemap data file and a link file specifying the<br />

non-hierarchical links to be visualized. Treemap 4 can display<br />

data with a fixed or variable depth hierarchy or, if the data does not include a fixed hierarchy, users can interactively create a hierarchy using the new “flexible hierarchy” feature of Treemap 4. When users load the link file, the links are visualized over the

Treemap visualization of this dataset (see Figure 4). Treemap<br />

implements several treemap layouts [2] and allows for dynamic<br />

queries based on attributes associated with the nodes as well as<br />

attributes computed from the data topology such as depth or<br />

degree of nodes. Treemap also allows users to select nodes or

branches and hide them, which can be useful to hide nodes that<br />

have a large number of links (e.g. the company logo) and make<br />

the rest of the display more usable.<br />

3 Conclusion and Future Work<br />

Some graphs can be meaningfully visualized as an underlying

tree structure with overlaid links. For these graphs, we present the<br />

tree structure using a Treemap layout and overlay the edges as<br />

curved links. This graph visualization is therefore an<br />

enhancement of a tree visualization and is simpler to control and<br />

understand than general-purpose network visualization systems.

We could generalize the idea and provide a tool that would<br />

transform any graph into a tree and a remaining set of edges.<br />

There are many algorithms to perform a tree extraction from a<br />

graph.<br />

One limit of our current implementation is that the Treemap<br />

program is meant to visualize trees and doesn’t currently perform<br />

dynamic queries or assign link visual attributes from edge<br />

attributes. This would belong to a more general graph<br />

visualization system. We are currently integrating this technique<br />

into the InfoVis Toolkit [6], which can visualize trees as well as<br />

graphs and will be able to integrate those features more easily.


Figure 4: Details of six HTML files visualized in a<br />

Treemap with cross links without occlusions.<br />

Figure 3: Integration of the curved links inside the<br />

Treemap system<br />

Acknowledgments<br />

This work has been supported by ChevronTexaco.<br />

References<br />

[1] JOHNSON, B. AND SHNEIDERMAN, B. Tree-maps: A space-filling<br />

approach to the visualization of hierarchical information<br />

structures, Proc. IEEE Visualization '91 (1991), 284-291, IEEE, Piscataway, NJ

[2] BEDERSON, B.B., SHNEIDERMAN, B., AND WATTENBERG, M Ordered<br />

and Quantum Treemaps: Making Effective Use of 2D Space to<br />

Display Hierarchies, ACM Transactions on Graphics (TOG),<br />

21(4), October 2002, 833-854.

[3] BECKER, R., EICK, S., AND WILKS, A. Visualizing Network Data. IEEE Transactions on Visualization and Computer Graphics, vol. 1, no. 1, March 1995.

[4] WANG, D., Graph Visualization: An Enhancement to the

Treemap Data Visualization Tool, University of Maryland<br />

InfoVis class project report,<br />

http://www.cs.umd.edu/class/spring2002/cmsc838f/Project/treemap.pdf<br />

[5] Treemap 4.1, http://www.cs.umd.edu/hcil/treemap<br />

[6] The InfoVis Toolkit, http://www.lri.fr/~fekete/InfovisToolkit


Interactive Poster: Semantic Navigation in Complex Graphs
Amy Karlson, Christine Piatko, John Gersh
The Johns Hopkins University Applied Physics Laboratory
{Amy.Karlson, Christine.Piatko, John.Gersh}@jhuapl.edu
Abstract

We are investigating new interactive graph visualization<br />

techniques to support effective navigation of a complex graph and<br />

to form semantic neighborhoods via dynamic queries. We are also<br />

investigating issues of depicting and highlighting neighborhoods<br />

within the context of the full graph. Finally, we are investigating<br />

methods to control and display the intersection of multiple<br />

neighborhoods, using image layer metaphors.<br />

CR Categories: H.5.2 [Information Interfaces and Presentation]:<br />

User Interfaces – Graphical user interfaces; H.1.2 [User/Machine<br />

Systems]: Human Factors; I.3.6 [Computer Graphics]:

Methodology and Techniques – Interaction Techniques<br />

Keywords: Information Visualization, Information Analysis,<br />

Dynamic Query, Visual Query.<br />

1 Introduction<br />

We are exploring methods for information navigation and<br />

depiction in complex graphs from task, interaction, and<br />

visualization perspectives. We have implemented an initial<br />

method to specify and depict a semantic neighborhood, a set of<br />

entities dispersed throughout a graph that are related by the<br />

semantics of a user task or inquiry. This set may not be “close” in<br />

terms of the graph’s topology; it should nevertheless be depicted<br />

in a way that highlights this user-determined relationship. We<br />

hypothesize (based in part on subject matter expert interviews)<br />

that it is important to maintain a relatively constant layout for the<br />

underlying graph; the layout can be an important contributor to<br />

the user’s mental model of the problem space. Performing a<br />

global re-layout of the graph to bring semantic neighbors closer to<br />

each other could be detrimental to that model. We chose instead to<br />

use a constrained, dynamic query approach to support this<br />

process; dynamic query interfaces have proven successful in<br />

allowing users to quickly filter through unwanted information in<br />

complex data sets [Shneiderman 1994]. Additionally, our dynamic<br />

query interface records a functional definition of the<br />

neighborhood.<br />

We are focusing on tasks related to intelligence analysis: “Let me<br />

see who else was at this meeting.”; “Let me follow the transaction<br />

chain: who gave the money to the person who gave the money to<br />

the person who bought the explosives.” Our scenario development<br />

is being supported by interviews with subject matter experts.<br />

2 Related work<br />

Many effective graph visualization techniques have been<br />

developed to examine large graphs, for example by using fisheye<br />

or hyperbolic lens approaches [Herman et al. 2000] [Pirolli et al.<br />

2003], visualizing multiple semantic contexts, allowing dynamic<br />

user modification of degree of interest and weight functions [Pu et<br />

al. 2003], or successive query refinement [Janecek et al. 2002].<br />

These developments have concentrated on using distortion to<br />


clarify the view of a particular section of the graph. Systems<br />

supporting edge-following have focused mainly on trees or strict<br />

hierarchies [Grosjean et al. 2002]. Much practical work continues<br />

to be done in support of link analysis tasks [Clearforest 2003]<br />

[Visual Analytics 2003]. This large body of related work,<br />

however, has not focused on interactively defining semantic<br />

neighbors in an arbitrarily connected graph with rich node and

edge attributes, and on visualizing the resulting neighborhood in<br />

context.<br />

3 Semantic Navigation<br />

Our current method for semantic navigation starts with the user<br />

selecting a node of interest. The user is then presented with a list<br />

of the edge types connected to that node and the node types one<br />

hop away. The user can then select relationship and entity types<br />

from the list; the associated edges and nodes on the graph<br />

highlight and increase in size to inform the user that they have<br />

been included in the semantic neighborhood. Once the new<br />

entities have been included, users can repeat the process with<br />

respect to the added nodes. They can continue to do so until the<br />

ultimate set of nodes satisfies a meaningful relationship to the<br />

source node according to the user's task. In this way, users can<br />

dynamically explore semantic paths within the context of the<br />

parent graph to generate a semantic neighborhood of related<br />

entities.<br />

In our intelligence-analysis example, we initially show an entire<br />

graph of information about terrorist activities, hiding most details,<br />

but providing a structural frame of reference. The analyst chooses<br />

a node of interest. This entity is added to the navigation interface<br />

as a tabbed pane, populated with a single column of checkboxes<br />

indicating the relationships (edge types) and entities (node types)<br />

to which the entity is directly attached (Figure 3). Each tabbed<br />

pane represents a means of navigating to and defining a set of<br />

entities that are meaningfully related to the source entity. As the<br />

analyst selects a relationship or entity type from the list, edges of<br />

that type are highlighted within the graph and the associated<br />

entities scale, visually announcing the location of the relationship<br />

in the graph and defining the semantic neighborhood. For<br />

example, the analyst selects a meeting as a source node, and<br />

checks “People” from the list of directly connected entity types to<br />

include all people who were associated (attended, organized, etc.)<br />

with that meeting. Alternatively, the analyst could select only<br />

“Attendee” from the list of relationship types to restrict the set of<br />

interest. The associated relationships and entities on the graph<br />

highlight and scale to indicate that they have been included in the<br />

semantic neighborhood. In addition, a new column of node and<br />

edge types associated with the newly added entities is displayed<br />

and updated dynamically. This new column represents the<br />

aggregation of entity and relationship types associated with any of<br />

the new neighborhood nodes. The user can then repeat the process<br />

by selecting from the newly generated type list, and can continue<br />

to do so until the ultimate set of entities satisfies a meaningful<br />

relationship to the source entity (Figure 1). For example, the final<br />

neighborhood might be “actions planned by organizations<br />

associated with people at this meeting,” or “individuals associated<br />

with events attended by people at this meeting.” The analyst has<br />

the option at this point to hide the intervening paths from source


node to the semantic neighbors, replacing them with single edges<br />

representing this new neighbor relationship (e.g., “potential co-conspirators”)

(Figure 2). Note the added value of displaying the<br />

neighborhood members in the global graph context: two clusters<br />

of potential co-conspirators appear in distinct regions of the graph.<br />

We envision users performing semantic navigation with a<br />

significant portion of the entire graph in view. As the user defines<br />

a semantic neighborhood, the participating entities and<br />

relationships are scaled and highlighted for visibility, effectively<br />

creating a detailed foreground of semantic neighborhoods against<br />

the less detailed graph background. We further distinguish<br />

foreground from background by supporting independent control<br />

over the visual depiction of individual neighborhoods as well as<br />

the underlying global graph context.<br />

4 Representing Multiple Neighborhoods<br />

Some analysis tasks involve finding common members of<br />

different neighborhoods; such discoveries can produce important<br />

analytical “Aha’s.” Considering each neighborhood as a “layer” in<br />

the graph, we have demonstrated mechanisms for distinguishing<br />

distinct neighborhoods from one another, handling overlap among<br />

neighborhoods, and manipulating neighborhoods independently,<br />

including hiding/showing a neighborhood or its intermediate<br />

entities, and controlling the foreground/background transparency<br />

of a neighborhood or the graph context to manage visual<br />

complexity. Our initial model of layer control is similar to image<br />

editing metaphors (Figure 4).<br />

5 Future Work<br />

We are continuing to develop our dynamic query mechanisms to<br />

specify neighborhoods by other means, e.g., through dynamic<br />

queries constraining values of entity or relationship attributes,<br />

rather than just their types. We also are continuing our<br />

investigation of methods to effectively display neighborhoods<br />

preserving context, such as fisheye techniques to perform local re-layout

of scaled nodes to avoid overlap [Storey et al. 1999].<br />

6 Acknowledgements<br />

Tom Sawyer Software Corporation’s Graph Editor Toolkit API<br />

provided the graph drawing, layout and pan/zoom foundation for<br />

our development and evaluation software environment. Thanks to<br />

Dan Haught of TrackingTheThreat.com for donating his terrorist<br />

network data.<br />

References<br />

CLEARFOREST 2003. Turning Unstructured Data Overload into a

Competitive Advantage. White paper. http://www.clearforest.com.<br />

GROSJEAN, J., PLAISANT, C., BEDERSON, B. 2002. SpaceTree: Supporting<br />

Exploration in Large Node-Link Trees, Design Evolution and Empirical Evaluation. Proceedings of IEEE Symposium on Information

Visualization, 57–64.<br />

HERMAN, I., MELANCON, G., MARSHALL, M. 2000. Graph visualization and navigation in information visualization: A survey. IEEE Transactions on Visualization and Computer Graphics, 6(1), 24-43.

JANECEK, P., PU, P. 2002. A Framework for Designing Fisheye Views to<br />

Support Multiple Semantic Contexts. In International Conference on<br />

Advanced Visual Interfaces (AVI '02), ACM Press.<br />

PIROLLI, P., CARD, S. K., VAN DER WEGE, M. M. 2003. The effects of<br />

information scent on visual search in the hyperbolic tree browser. ACM<br />

Transactions on <strong>Computer</strong>-Human Interaction (TOCHI), 10(1), 20-53.<br />

PU, P., JANECEK, P. 2003. Visual Interfaces for Opportunistic Information<br />

Seeking. To appear in the 10th International Conference on Human-Computer Interaction (HCII '03).


SHNEIDERMAN, B. 1994. Dynamic queries for visual information seeking.<br />

IEEE Software, 11(6), 70-77.

STOREY, M-A. D., FRACCHIA, D., MULLER, H. A. 1999. Customizing a<br />

Fisheye View Algorithm to Preserve the Mental Map. Journal of Visual<br />

Languages and Computing, 10(3), 245-267.<br />

VISUAL ANALYTICS 2003. How to Catch a Thief.<br />

http://www.visualanalytics.com/whitepaper/.<br />

Figure 1. Graph depiction of “individuals associated with events<br />

attended by people at this meeting.”<br />

Figure 2. Graph depiction of neighborhood links.<br />

Figure 3. Neighborhood navigation and definition interface.<br />

Figure 4. Independent neighborhood depiction control.


Interactive Poster: Business Impact Visualization
Ming C. Hao, Daniel A. Keim*, Umeshwar Dayal, Fabio Casati, Joern Schneidewind
(ming_hao, umeshwar_dayal, fabio_casati, joern.schneidewind@hp.com)
Hewlett Packard Research Laboratories
1. Motivation

Recent research efforts have focused on how to transform<br />

business operation data, as logged by the IT infrastructure, into<br />

valuable business intelligence information. The goal of<br />

Business Impact Analysis is to improve the management of<br />

complex, large-scale IT infrastructures and optimize their<br />

operations by quickly and easily identifying problems and their<br />

causes.<br />

A number of business-oriented visualization techniques have been developed, such as the SeeSoft line representation technique [1] used for visualizing Y2K program changes, ILOG JViews used for analyzing workflow processes,

E_BizInsights used for web path analysis, and parallel<br />

coordinates [2] used for correlations. All these methods aim at<br />

reducing the time to turn business data into information, which<br />

in turn reduces the business decision-making time.<br />

2. Our Approach<br />

In this poster, we present a new technique for interactively<br />

visualizing business intelligence, called VisImpact. The basic

idea of this technique is to visually analyze relationships<br />

between the most important operation parameters and to map<br />

the parameters into business impact visualization. The<br />

component architecture is as follows:<br />

• Use business impact visualization to analyze<br />

relationships between operation parameters and<br />

business process flow.<br />

• Use event occurrence visualization to observe the<br />

business operations occurrence sequence and its<br />

consequences.<br />

2.1. Business Impact Visualization<br />

VisImpact transforms multiple business attributes to nodes,<br />

with lines between nodes on a circle representing a business<br />

case. Five different attributes are used (see the sketch after this list):

• Source: for partitioning the left side of the circle<br />

• Intermediate: for partitioning the center axis of the<br />

circle<br />

• Destination: for partitioning the right side of the<br />

circle<br />

• Color: using colored lines for specific business<br />

metrics, such as response time, violation level, or<br />

dollar amount<br />

• Time: for event occurrence sequences<br />
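The sketch below shows one plausible way to place the three node classes on the circle; the geometry is our reading of the figures, not the authors' published code.

import java.awt.geom.Point2D;

final class ImpactLayout {
    // rank in [0, count) orders nodes (e.g. by duration time) top to bottom.
    static Point2D source(int rank, int count, double r) {        // left arc
        return onArc(rank, count, r, Math.PI / 2, 3 * Math.PI / 2);
    }
    static Point2D destination(int rank, int count, double r) {   // right arc
        return onArc(rank, count, r, Math.PI / 2, -Math.PI / 2);
    }
    static Point2D intermediate(int rank, int count, double r) {  // vertical center axis
        double y = -r + 2 * r * (rank + 0.5) / count;
        return new Point2D.Double(0, y);
    }
    private static Point2D onArc(int rank, int count, double r, double from, double to) {
        double t = from + (to - from) * (rank + 0.5) / count;
        return new Point2D.Double(r * Math.cos(t), -r * Math.sin(t));
    }
}

A business case is then drawn as a colored polyline from its source point through its intermediate point to its destination point.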

2.2 Event Occurrence Visualization<br />

The event occurrence visualization shows a collection of<br />

business process instances and their source and destination<br />

relationships over time. This visualization is displayed when a<br />

user drills down from a business parameter.<br />

* University of Constance, Germany, keim@informatik.uni-konstanz.de.<br />


3. Applications<br />

We have experimented with VisImpact for business process<br />

analysis: service contract analysis, business operation analysis,<br />

and SARS disease analysis at HP Research Laboratories.<br />

3.1 Service Contract Analysis<br />

Business contracts typically contain SLAs (Service Level<br />

Agreements) that define what service should be delivered with<br />

a certain quality and within a specified time period. One of the<br />

common questions business managers ask is whether business

operations are fulfilling contracts, and which contract has been<br />

violated. Figure 1 shows a business contract impact<br />

visualization example.<br />

As illustrated in Figure 1A, the source nodes are Customers.<br />

The intermediate nodes are Providers. The destination nodes<br />

are Suppliers. The color shows the average violation level. The<br />

width of a line represents the number of SLAs in a contract.<br />

The contracts with the highest violation levels are 1 and 5 (colored brown). The contracts with the lowest violation levels are 4, 7, and 8 (colored yellow). Contract 3 is violated (it exceeds the threshold) and is colored red for quick identification.


Figure 1A: Business Impact Visualization<br />

The event occurrence visualization is employed to observe the<br />

sequence of violation occurrences over time. Figure 1B<br />

illustrates the first SLA 1 violation occurrence happening at<br />

10:31:22, 1/15/03. This violation from a Supplier Assembly<br />

caused other violations of SLA 2 and SLA 3 at 8:31:22,<br />

2/18/03, and 12:31:22, 3/11/03. Both SLAs 2 and 3 are the<br />

service agreements of Contract 3 made between Customer B<br />

and Provider PC. As a result, Contract 3 is violated (color red,<br />

shown in Figure 1A).<br />


Figure 1B: Event Occurrence Visualization<br />

3.2 Business Operation Impact Analysis<br />

The VisImpact system has been applied to explore business<br />

process duration time. Figure 2A illustrates 63,544 business<br />

process instances, related to actual processes executed within<br />

HP. The source nodes are the days (1-7) and reside on the left<br />

of the circle. The intermediate nodes are the hours of a day (0-<br />

23) and reside in the middle axis of the circle. The destination<br />

nodes are the types of operation such as Travel, Payments,<br />

Personnel, Reimbursements, and Purchasing, and reside on the<br />

right side of the circle. The linked lines represent the<br />

connections between the nodes. The color represents the<br />

duration time. For fast identification, nodes are ordered by<br />

duration time from top to bottom on the circle. The analyst<br />

clicks on a node to show relationships with other parameters<br />

(i.e. day, hour, client) as illustrated in Figure 2B-2D.<br />

Figure 2A: Business Operation Impact Visualization<br />


Figures 2A-2D show the following:

• Within the five different business clients, Personnel has<br />

the largest number of business instances (more lines) as<br />

shown in Figure 2A.<br />

• Day 7 has the shortest duration times (most green lines),

except a few Personnel business instances with a high<br />

duration time (color burgundy) as shown in Figure 2B<br />

when the user focuses on day 7.<br />

• Overall a large number of business process instances<br />

achieve a good duration time (yellow and green) except<br />

for the 14th hour, as shown in Figure 2C.


• Purchasing has long duration times as shown in Figure 2D<br />

(it takes 10-12 days to complete an operation, colored<br />

burgundy), in contrast to Travel, which has a short duration time (< 1 day, yellow and green).

Figure 2B: 7 th Day Figure 2C: 14 th Hour Figure 2D: Purchasing<br />

3.3 SARS Disease Analysis<br />

Figure 3: SARS: How the SARS disease infected the world<br />

VisImpact is not limited to the visualization of business data. It<br />

has been applied to visualize medical data, like the spreading of<br />

the global SARS disease. Figure 3 illustrates a simple technique<br />

to visualize the medical impact on people and countries of an<br />

infectious disease. The source node represents Dr. Liu. He

infected 12 people with SARS in a hotel in Hong Kong. The<br />

intermediate nodes represent these 12 people. The destination nodes show how the disease spread across the world. Each node

represents an infected person. The labels for the destination<br />

nodes show the country where the infected people come from.<br />

4. Conclusion<br />

In this poster, we develop a new approach for enabling analysts<br />

to interactively visualize business operation flows and<br />

correlations. Future work will link multiple business impact<br />

visualizations together.

References<br />

[1] Stephen G. Eick et al.: ‘SeeSoft: a tool for visualizing line-oriented software statistics’, IEEE Transactions on Software Engineering, November 1992.

[2] Inselberg A., Dimsdale B.: `Parallel Coordinates: A Tool<br />

for Visualizing Multi-Dimensional Geometry’,<br />

Proc. Visualization '90, San Francisco, CA, 1990.


Interactive Poster:<br />

Visualization for Periodic Population Movement between Distinct Localities
Alexander Haubold *
Department of Computer Science
Columbia University
Abstract

We present a new visualization method to summarize and<br />

present periodic population movement between distinct<br />

locations, such as floors, buildings, cities, or the like. In the<br />

specific case of this paper, we have chosen to focus on student<br />

movement between college dormitories on the Columbia<br />

University campus. The visual information is presented to the<br />

information analyst in the form of an interactive geographical<br />

map, in which specific temporal periods as well as individual<br />

buildings can be singled out for detailed data exploration. The<br />

navigational interface has been designed specifically for a geographical setting.

Keywords: geo-visualization, migration, movement, population,<br />

information visualization, mapping, cartographic visualization<br />

1 Introduction<br />

Visualization of large, highly dimensional data sets is an<br />

essential form of data analysis that applies to

every field of information analysis. It lends itself especially well<br />

to mapping the temporal movement of people between distinct<br />

localities in a well-defined area, such as a university campus, a<br />

city, country, or the world. Generic visualizations of this type<br />

are widely used in historical [1] and socio-economic [2]<br />

contexts. In the context of this paper, Columbia’s University Residence Halls (URH) administration sought a tool to visually

evaluate the movement of students between dormitories on<br />

campus; they provided the needs and questions that informed the<br />

visual display we have developed. While the raw data is<br />

available to the administration, it has never been used for<br />

analytical purposes, because there exists no tool that quickly<br />

discerns the data for useful results.<br />

Throughout the design we have tried to pick visual attributes to<br />

draw attention to the things analysts cared about the most. This<br />

breaks down into two distinct approaches: 1. purely visual<br />

techniques, and 2. metaphorical, mnemonic associations<br />

between images and their representation, creating a type of<br />

visual semantic.<br />

2 Interface Design<br />

The visualization interface features a two-dimensional<br />

interactive geographical map of city blocks and buildings that<br />

--------------------------------------------<br />

* e-mail: ah297@columbia.edu<br />


Figure 1. Interface. City blocks and data-irrelevant buildings appear in background colors; data-relevant buildings and relocation arcs each assume a specific color value, while their

saturation changes according to user input and user interest.<br />

Movable relocation summary cards present detailed numerical<br />

data for each selected building.<br />

assume different states, as well as directed “relocation” arcs that<br />

represent relocations between two locations (Figure 1). In<br />

populating the map, we have paid careful attention to something<br />

we call a “contrast budget” as well as the order in which<br />

graphical components are placed on the map. A minimal portion<br />

of contrast has been set aside to manage the information<br />

provided by the system, while the larger portion is used to<br />

manage the viewer’s input. We also use hue and saturation to<br />

distinguish different types of visual representations. City blocks<br />

and data-irrelevant buildings have been colored in a low contrast<br />

and close to monochrome value, as their role is to merely<br />

provide spatial context. Buildings associated with data are

emphasized in a separate color. As buildings become more<br />

interesting to the analyst (as evidenced by their being selected,<br />

armed, or selected and armed) the saturation changes<br />

exponentially to reflect the attention the viewer has given to the<br />

object.<br />

Relocation arcs follow a similar trend in increasing contrast<br />

versus increasing importance, and are additionally distinguished<br />

by their placement and mode of appearance. Links that are not<br />

associated with armed or selected buildings generally appear on

the same background level as city blocks and irrelevant<br />

buildings. Furthermore, only links of substantial relocations are


Figure 2. Relocation Links. Left: Straight lines; Middle:<br />

Symmetric arcs; Right: Spiral-shaped arcs homing in onto<br />

target object.<br />

shown in the background, the threshold of which can be changed<br />

interactively. Arc thickness is adjusted logarithmically to the<br />

number of relocations in order to preserve the distinctiveness of<br />

the arcs. As buildings and their associated relocations become<br />

more interesting for the viewer, the arcs move into a position<br />

closer to the foreground, while at the same time assuming more<br />

saturated values.<br />
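A logarithmic thickness mapping of this kind could look as follows; the minimum and maximum widths are assumptions chosen for illustration.

```python
import math

# Minimal sketch of the logarithmic arc-thickness mapping described
# above. MIN_WIDTH and MAX_WIDTH are illustrative assumptions.
MIN_WIDTH, MAX_WIDTH = 1.0, 8.0

def arc_width(relocations: int, max_relocations: int) -> float:
    """Scale arc thickness with log(count) so heavily used links
    grow slowly and remain visually distinct."""
    if relocations <= 0:
        return 0.0
    t = math.log1p(relocations) / math.log1p(max_relocations)
    return MIN_WIDTH + t * (MAX_WIDTH - MIN_WIDTH)
```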

Relocation links between two given buildings appear in a clockwise directed fashion, while the spiral-shaped arcs sharply home in on the target object. The design of the relocation arcs went through several iterations (Figure 2). First, simple straight lines were curved concavely to visually separate the relocation links from the inherently boxy building nodes; by this means we distinguish nodes from links as early as possible in visual processing. In a second step, we changed the profile of the curve from a simple circular arc to one with an ever-increasing curvature. This makes it easier to visually distinguish the beginning from the end of an arc, and complements the use of a directed arrow.
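The poster does not give the exact curve; one plausible parameterization of a spiral arc whose curvature grows toward the target is sketched below (the winding amount is an assumed parameter).

```python
import math

def spiral_arc(src, dst, points=32, winding=0.8):
    """Sample a spiral-shaped arc from src to dst. The path sweeps
    `winding` radians around dst while its radius shrinks to zero,
    so curvature increases as the arc homes in on the target."""
    sx, sy = src
    dx, dy = dst
    base = math.atan2(sy - dy, sx - dx)      # direction dst -> src
    dist = math.hypot(sx - dx, sy - dy)
    pts = []
    for i in range(points + 1):
        t = i / points
        angle = base + winding * t           # angular sweep grows with t
        r = dist * (1 - t)                   # radius shrinks to 0 at dst
        pts.append((dx + r * math.cos(angle), dy + r * math.sin(angle)))
    return pts
```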

3 Interface Tools

A two-sided time slider, a more general version of which was introduced in [3], presents the distinct time periods over which relocation data exists, and allows the viewer to specify a lower and an upper bound for displaying relocations during a particular period (Figure 3). The left, right, and middle arrows can be moved independently to increase the lower bound, increase the upper bound, and move a constant time period in either direction, respectively. Below the time line is a histogram of total relocations for each time period, where the values are adjusted logarithmically. Given the reduced space for the histogram, a linear scale would single out only the periods with high activity, resulting in a too sparsely populated histogram.

For each selected building, a relocation summary card appears in the interface, giving a numerical summary of the relocation data over the selected time period. This card can be moved freely within the interface and pinned onto the map like a Post-it note. A similar details-on-demand method utilizing "info cards" was first used in [4].

4 Data Model and Interoperability

The interface is not restricted to the specific data presented herein. In a one-time pre-processing step (Figure 4), a bitmapped geographical map is automatically vectorized, resulting in a list of polygons with corresponding fill-color values. A second text file enumerates each color and maps colors to building names. A third file enumerates relocation matrices for each time period. These three text files serve as the input to the visualization interface. Using this data model, any geographical area can be presented in the visualization tool, including cities and building floor plans, as well as material of a non-geographical nature.

Figure 3. Time slider with embedded histogram.

Figure 4. The relocation visualization is generated using a polygonized bitmap, a color-to-building map, and a periodic relocation matrix file.

5 Conclusion

We have developed an information visualization method and a practical tool to aid in analyzing periodic movement between buildings (or other entities) within a defined spatial region. Using different conceptual layers, the information is presented to viewers in a passive overview, while interactive tools let them filter out buildings and associated relocations of interest. As this is work in progress, we are further exploring which visual attributes are best suited for the purposes of visualization and interaction.

Acknowledgements

Discussions with W. Bradford Paley were the source of the spiral arcs and color choices, and were fruitful in helping to keep the visual representations driven by the needs and expectations of the analysts rather than just the structure of the data.

References

[1] http://www.arts.auckland.ac.nz/online/history103/images/imperialism-migration.jpg

[2] HANSEN, K. A. 1997. Geographical Mobility: March 1995 to March 1996. In Current Population Reports, U.S. Bureau of the Census, November 1997, P20-497.

[3] AHLBERG, C. AND SHNEIDERMAN, B. 1994. Visual Information Seeking: Tight Coupling of Dynamic Query Filters with Starfield Displays. In Human Factors in Computing Systems: Proceedings of the CHI '94 Conference. New York: ACM.

[4] SHNEIDERMAN, B. 1998. Designing the User Interface, Third Edition. Addison Wesley Longman, Plate B4(c).


PolyPlane: An Implementation of a New Layout Algorithm for Trees in Three Dimensions

Seok-Hee Hong*    Tom Murtagh†
School of Information Technologies, The University of Sydney
* e-mail: shhong@it.usyd.edu.au
† e-mail: tfm@it.usyd.edu.au

Abstract

This poster describes an implementation of a new layout algorithm for trees in three dimensions.

CR Categories: I.3.7 [Computing Methodologies]: Computer Graphics—Three-Dimensional Graphics and Realism

Keywords: tree layout, three dimensions

Introduction

The tree is one of the most common relational structures, and many applications can be modeled as trees. Examples include family trees, hierarchical information, DFS (depth-first search) trees of Web graphs, and phylogenetic trees.

Recently, Hong and Murtagh gave a new linear-time algorithm for drawing trees in three dimensions; this poster describes the implementation of that algorithm.

The Algorithm

The algorithm of Hong and Murtagh uses the concept of subplanes, on which sets of subtrees are laid out. The subplanes are defined using regular polytopes for easy navigation; the polytopes include the pyramid, the prism, and the Platonic solids.

The algorithm is very flexible and easy to implement. Further, it runs in linear time for a given partitioning of subtrees; however, finding the best balanced partitioning is an NP-hard problem.

Figure 1 shows an example of a layout of a tree with 6929 nodes. Here, we use the icosahedron to define 30 subplanes.
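Since optimal balanced partitioning is NP-hard, a practical implementation needs some heuristic for distributing subtrees over the subplanes. The poster does not say which one is used; the sketch below shows one simple greedy possibility (largest subtree onto the currently lightest subplane), purely for illustration.

```python
import heapq

# Minimal sketch (assumed heuristic, not the authors' method):
# greedily assign each subtree to the least-loaded subplane.
def partition_subtrees(subtree_sizes, num_subplanes):
    """Return a list mapping each subtree index to a subplane index."""
    heap = [(0, p) for p in range(num_subplanes)]   # (load, subplane)
    heapq.heapify(heap)
    assignment = [0] * len(subtree_sizes)
    # Placing the largest subtrees first improves balance.
    for idx in sorted(range(len(subtree_sizes)),
                      key=lambda i: -subtree_sizes[i]):
        load, plane = heapq.heappop(heap)
        assignment[idx] = plane
        heapq.heappush(heap, (load + subtree_sizes[idx], plane))
    return assignment
```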

The System

We implemented the new layout algorithm of Hong and Murtagh as part of the system 3DTreeDraw. The system provides simple zoom-in and zoom-out functions, as well as rotation of the 3D drawing. This rotation function is sufficient for navigation, as the subplanes are defined using regular polytopes, which make the drawing easy to navigate. The system also provides a function to save the result as a BMP file.

We used randomly generated data sets, from a few hundred up to a hundred thousand nodes. The experimental results show that the system produces nice layouts of trees with up to ten thousand nodes.

Figure 1: Example output of the algorithm.

Figure 2 shows an example of a tree with 2982 nodes, using a regular 3-gon pyramid polytope with 3 subplanes. Figure 3 shows a tree with 8613 nodes, using a regular 3-gon prism polytope with 6 subplanes. Figure 4 shows an example of a tree with 483 nodes, using the icosahedron polytope.

One can define more subplanes to improve resolution. Figure 5 shows a tree with 139681 nodes, using a variation of the dodecahedron and icosahedron polytopes with 90 subplanes.

We also used real-world data. Figure 6 shows a home directory with 1385 nodes, using the icosahedron polytope with 30 subplanes. Figure 7 shows a DFS tree of the School of IT, University of Sydney website, with 4485 nodes, using the cube polytope with 12 subplanes.

Conclusion and Future Work

The algorithm is flexible, as one can choose the polytope for one's own purpose. For example, for rooted trees the pyramid is most suitable, while for dense trees with small diameter and nodes of high degree the prism or one of the Platonic solids may be preferred.

Future work includes evaluation of this new metaphor using human experiments, and the implementation of good navigation methods.


Figure 2: Drawing of a tree with 2982 nodes drawn with the pyramid polytope (3 subplanes).

Figure 3: Drawing of a tree with 8613 nodes drawn with the prism polytope (6 subplanes).

Figure 4: Drawing of a tree with 483 nodes drawn with the icosahedron polytope.

Figure 5: Drawing of a tree with 139681 nodes drawn with the dodecahedron and icosahedron polytopes (90 subplanes).

Figure 6: Drawing of a home directory with 1385 nodes drawn with the icosahedron polytope (30 subplanes).

Figure 7: Drawing of a DFS tree of the School of IT website with 4485 nodes drawn with the cube polytope (12 subplanes).


Interactive Poster: Displaying English Grammatical Structures

Pourang Irani, University of Manitoba, Department of Computer Science, irani@cs.umanitoba.ca
Yong Shi, University of Manitoba, Department of Computer Science, yongshi@cs.umanitoba.ca

ABSTRACT

This report describes ongoing work focused on designing a technique for visually representing English grammatical structures. A challenge in representing grammatical structures is to adequately display the linear as well as the hierarchical nature of sentences. As our starting point we have adopted a radial space-filling technique based on Clark's etymological chart of the 19th century. Clark devised this chart for the purpose of instructing students in English grammar. We have automated the chart with basic visual features and interaction techniques. We report the results of a preliminary evaluation suggesting that subjects are better able to identify the parts of a sentence after minimal training with the interactive visualization system.

Keywords

Visualizing English sentences, language structure visualization, radial space-filling visualization.

1. INTRODUCTION

As part of the writing process, the writer needs to know how to recognize complete thoughts and to vary sentence structures accordingly to reflect them. Understanding the structure of, and the various relationships between, components in a sentence facilitates coherent writing. Many grammarians and English instructors hold that analyzing a sentence and portraying its structure with a consistent visual scheme can be helpful—both for language beginners and for those trying to make sense of the language at any level [3]. This is especially true for language learners who tend to be visual learners. One approach to better learning and understanding grammatical structures is to use diagrams.

Several types of diagramming notations have been developed for capturing and representing structures in English grammar. Among these are Clark's diagrams [2], syntactic trees [1], and Kellogg-Reed diagrams [4]. In Clark's diagrams, words, phrases, and sentences are classified according to their roles and their relations to each other. Clark's diagrams are hierarchical in that the first stage decomposes the parts into the appropriate structural units (subject, verb, noun, etc.). At a lower level, each unit is broken down into its various components. The elements are visually depicted by showing each unit as an outlined oval shape, and connections between units as lines or appendices. Syntactic trees provide a hierarchical representation of sentence structures. At the bottom level, leaf nodes contain each atomic unit of the sentence. Above each leaf node in the tree, the specific role played by each atomic unit in the sentence is presented; these could be nouns, pronouns, prepositional phrases, adverbs, etc. In a recursive fashion, the role of each unit (compound or atomic) is depicted as a node of the tree. The most widely used form of sentence visualization was developed by Brainerd Kellogg and Alonzo Reed and is known as the Kellogg-Reed diagram. In Kellogg-Reed diagrams, a sentence is divided into its component parts using solid and dashed lines, the most important cut being between the subject and the predicate. Horizontal lines are used for key structural elements, such as subject, verb, and direct object. Modifiers are placed on a diagonal bar under the key elements they modify. Several hierarchies can also result from sentences that contain compound elements. Overall, these notations are weak at representing the different types of relationships and semantics used in English grammatical structures. It is important to clearly reveal these relationships in order to allow the student to fully grasp the grammatical concepts. While these representations are complete, they are disjoint and do not provide a unified classification of the various types of possible sentence structures. As a result, they may not help the learner who is unaware of the range of sentence constructs in the language.

Figure 1. Kellogg-Reed diagram for the sentence "The genial summer days have come".

The inherent structure of these representations is either linear (as in the case of Kellogg-Reed diagrams) or hierarchical (syntax trees). We hypothesized that adopting a representation that is at the same time hierarchical and linear would facilitate the analysis of sentences into their constituents.

2. CLARK'S ETYMOLOGICAL CHART

An alternative to providing separate and disjoint diagrams for the various forms and patterns of sentences is to create a compact representation. The representation needs to depict the linear as well as the hierarchical construction of sentences in order to give the learner a stronger view of the sentence. Such a compact representation was proposed by Clark [2] in the 19th century and is known as Clark's etymological chart. While Clark's terminology is in places antiquated, the chart is compact and provides the learner with a concise representation of the various functional elements that can be part of a sentence. We have implemented this chart as the starting point over which all other visualization and interaction features are developed. A remarkable feature of Clark's representation is a compactness that allows the entire system of grammatical constituents of sentence patterns to be depicted. Figure 2 shows our implemented version of Clark's chart with the various elements of a sentence.

Clark's chart uses a radial display technique similar to that used by Sunburst [5]. While Sunburst is designed to display any form of hierarchy, Clark's chart imposes a strict ordering of the constituent nodes based on the sentence being represented. At the center of the chart is the root node representing the entire sentence. At the next level, the chart contains two nodes, one representing the principal parts and the other the adjuncts, or qualifiers, of the elements in the sentence. The principal part is further decomposed into nodes representing the subject, the predicate, and the object of the sentence. The adjuncts are separated into primary and secondary, the former qualifying elements within the principal part of the sentence, the latter qualifying elements within the primary adjuncts. At deeper levels in the hierarchy, the various functions that constituent elements can take are depicted. For example, a subject can be represented by a word, a phrase, or another sentence. In turn, a word can be either a noun or a pronoun; a noun can be either proper or common, in the masculine or the feminine gender, and finally in the singular or plural form.

We have adopted Clark's chart as the base representation and have augmented it with perceptual and interactive elements (Figure 2a). We use color as the primary perceptual feature for highlighting the various components of a sentence: a color highlights all constituent elements of a sentence part through the subtree of the hierarchy. A common problem affecting radial displays is the layout of the text. To facilitate text readability, we implemented automatic smooth zooming, whereby the chart is rotated to position the node of interest in a vertical, readable orientation (Figure 2b).
position (Figure 2.b).<br />

To initially validate the effectiveness of the radial chart for<br />

language structures, we conducted a preliminary evaluation. Six<br />

computer science students from the University of Manitoba<br />

participated in the evaluation. None were familiar with any<br />

sentence diagramming methods. A pre-training evaluation was<br />

conducted to determine the students’ ability for parsing sentences<br />

into their components. All six subjects demonstrated a low and<br />

equal performance rate. To perform the evaluation we included a<br />

range of simple and complex sentences in the tool. By selecting a<br />

particular sentence its visual representation would get highlighted<br />

in the chart. Students were given time to familiarize themselves<br />

with the tool by selecting the various sentences and viewing their<br />

structure in the chart (lasted 20 minutes). The experiment then<br />

consisted of displaying a sentence and presenting the subject with<br />

a range of possible structures to choose from within the chart. The<br />

subject was then asked to select the visual representation that best<br />

suited the sentence. All subjects scored higher in the post-training<br />

evaluation after using the tool. These results provide a hint at the<br />

potential benefits that the chart may afford.<br />

3. FUTURE WORK AND CONCLUSION

In this poster we discuss the automation of Clark's etymological chart for the purpose of helping learners decipher sentence structures and their parts. The space-filling radial representation was evaluated, and the results showed that subjects were better able to break sentences into their constituents using the visual aid.

An objective of a visual tool for depicting sentence structure is to facilitate learning and self-correction of grammatical errors. Self-correcting tools exist in editors such as MS Word; however, these methods merely hint at possible sentence errors without offering much recourse to a possible solution. Our future work will consist of further developing the tool to aid learners in identifying and possibly self-correcting grammatical errors. We will additionally augment the tool with focus+context techniques such as those discussed in [5], which will allow users to manipulate the chart to extract the information vital to their tasks.

Figure 2. Representation of Clark's etymological chart to highlight sentence structure using color and to facilitate interaction using automatic zooming. (a) Augmenting Clark's etymological chart with visual features such as color to display sentence structures. (b) Automatic zooming rotates the radial display to align the text with the user's node of interest; here the user clicked on the Subject node to bring it into focus, and then on the Predicate node.

4. REFERENCES

[1] Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge: MIT Press.

[2] Clark, S.W. (1853). A Practical Grammar: in Which Words, Phrases, and Sentences are Classified According to their Offices, and their Relations to Each Other. New York: A. S. Barnes & Co.

[3] Pinker, S. (1989). Learnability and Cognition: The Acquisition of Argument Structure. Boston, MA: The MIT Press.

[4] Reed, A. and Kellogg, B. (1878). Elementary English Grammar. New York: Clark & Maynard.

[5] Stasko, J. and Zhang, E. (2000). Focus+Context Display and Navigation Techniques for Enhancing Radial, Space-Filling Hierarchy Visualizations. Proc. of the IEEE Symposium on Information Visualization, 57-65.


Interactive Poster: VistaClara: An Interactive Visualization for Microarray Data Exploration

Robert Kincaid
Agilent Technologies
robert_kincaid@agilent.com

Abstract

VistaClara is a unique implementation of a permutation matrix designed specifically for exploratory microarray data analysis. The software supports incorporating supplemental data, which permits visually searching for patterns in microarray data that correlate with other types of relevant measurements or classifications. While the software supports traditional heatmap visualizations, an alternative view uses size as well as color to visually represent experimental values. Large data sets are navigated effectively using well-known overview+detail principles. Methods to computationally sort rows or columns by similarity allow more efficient searching for relevant patterns in very large data sets. Combined, these techniques make it possible to perform efficient interactive visual explorations of microarray data that are not possible with current tools.

Keywords: microarray analysis, information visualization, permutation matrix, reorderable matrix, overview+detail, bioinformatics, gene expression

1 Introduction

Microarray data is frequently analyzed in tabular form. This is particularly true of recent experiments that search for insight into cancer and other diseases. Typically, many biological samples are measured using individual microarray experiments, and a resulting matrix of gene vs. experiment is constructed. The problem then becomes one of finding those genes whose expression patterns correlate highly with the disease or disease classes being studied. Considerable effort has been made to find computational techniques for classification or clustering of such data sets [1]. However, viewing such matrices is generally relegated to generic spreadsheet applications and static visualizations.

An area of interest in information visualization is interactively manipulating matrix-organized data via spreadsheet-like applications. The goal of such software is to provide interactive mechanisms that enable visual pattern discernment. In analogy to woodworking, Rao refers to this as looking for the "grain" in information [2]. Unlike the more rigorous computational approaches typically used in bioinformatics, this form of visual data mining utilizes the highly developed pattern recognition abilities of human visual perception.

VistaClara applies this exploratory style of information visualization to the problem of microarray analysis. It takes as a starting point the traditional heatmap visualization commonly used to display gene expression data, and extends this to a fully interactive permutation matrix supporting both column and row rearrangement. This is an important capability for analyzing microarray data, since correlations are likely to occur between groups of genes as well as between groups of samples.

While the permutation matrix has been applied previously (e.g. VisuLab [3], TableLens [4], Siirtola [5]), no previous implementation has been specifically designed for the unique characteristics of multi-experiment analysis of microarray data. Bertin pointed out that meaningful permutation operations become difficult with very large data sets [6]. VistaClara implements a number of additional features designed to facilitate the interactive manipulation of the large data sets typical of microarray studies.

Figure 1. A VistaClara view of melanoma data using an "ink blob" representation. Data is sorted by Pearson row similarity to the expression pattern of the gene Melan-A. Red indicates up-regulation (ratios > 1), while green indicates down-regulation (ratios < 1).


Extreme ratios "saturate" the color scale, which makes significant fold increases/decreases readily apparent.

Row and column permutations are particularly advantageous for microarray data, since we typically expect to find correlations between samples (columns) as well as between gene expression profiles (rows). However, due to the size of typical data sets, manual rearrangement is impractical. Moreover, these correlations are usually confounded with various noise contributions to the data, which often make simple row and column sorts ineffective as permutation operations.

VistaClara implements an intuitive extension of simple sorting: rows can be sorted using measures of similarity between entire rows of microarray data. A given row of interest is chosen, and the remaining rows are ordered by their similarity to the chosen row. Similarity is computed using either the Euclidean distance or the Pearson correlation coefficient; currently, only gene expression data is considered in the calculation. Column sorting by similarity is also supported. These similarity sorts can be performed almost as quickly as a standard single-row or single-column sort, thereby retaining the benefits of a highly interactive permutation operation. As shown in the next section, meaningful correlations can be effectively extracted from large, complex data sets this way.
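The core of such a similarity sort is straightforward; the sketch below is a minimal NumPy rendering of the idea (assuming NaN-free data in a plain 2D array; VistaClara's actual implementation is not shown in the poster).

```python
import numpy as np

# Minimal sketch of similarity sorting: reorder rows by their
# similarity to a chosen reference row.
def similarity_order(data: np.ndarray, row: int, metric: str = "pearson"):
    """Return row indices sorted from most to least similar to `row`."""
    ref = data[row]
    if metric == "pearson":
        # Higher correlation = more similar, so sort descending.
        scores = np.array([np.corrcoef(ref, r)[0, 1] for r in data])
        return np.argsort(-scores)
    if metric == "euclidean":
        # Smaller distance = more similar, so sort ascending.
        scores = np.linalg.norm(data - ref, axis=1)
        return np.argsort(scores)
    raise ValueError(f"unknown metric: {metric}")

# Usage: reordered = data[similarity_order(data, row_of_interest)]
```

Note that with the Pearson metric, the most anti-correlated rows naturally end up at the bottom of the ordering, which matches the behavior exploited in the Results section below.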

Following overview+detail principles [7], we provide an overview display of the entire data set in the form of a dynamic heatmap, seen in the leftmost panel of Fig. 1. As rows and columns are rearranged, the overview is updated to reflect the change and any emerging correlations that might be visible beyond the tabular view.

While difficult to make out in the figure, a blue rectangle in the overview outlines the position and range of the visible tabular view as the user scrolls the display. This provides further context and navigation orientation for the user. It also exposes the striking observation of how small a slice of the total microarray data is generally viewed in standard tabular spreadsheet visualizations.

3 Results

To demonstrate VistaClara in a typical use case, we examined the gene expression data from two previous studies with known results [8,9] in order to show that VistaClara can find similar correlations. Our intent is not to reproduce the more rigorous results exactly; instead, we wish to show that, making reasonable assumptions about what should be interesting, VistaClara manipulations can quickly reveal a qualitatively similar result of biological relevance. A user will typically explore the data interactively in search of previously unknown correlations or relationships in the data. The unstructured nature of a typical session of this type is difficult to convey in printed form, but these examples should at least demonstrate the potential of such operations and visual pattern finding.

We first examined Bittner's microarray data [8], consisting of 8067 cDNA measurements for each of 31 patient samples (250,077 ratio measurements). Using computational techniques, Bittner et al. singled out 22 cDNA clones as being highly discriminating for one class of melanoma. We chose Melan-A as a gene of interest, as it is associated with melanoma [10] and might reasonably be chosen in the absence of Bittner's results. Rows were interactively sorted by similarity using Pearson coefficients as a distance measure (Fig. 1). Within the first 21 rows we find 9 of the discriminating genes reported by Bittner. Within the first 40 rows we find all 11 of the 22 discriminating genes reported by Bittner that have expression profiles similar to Melan-A. Based on our distance measure, the most distant rows consist of the patterns most anti-correlated with Melan-A; the last 44 rows of this data set contain 7 of the previously reported discriminating genes anti-correlated with Melan-A. Further, we visually find good correlation with the two classes of melanoma found by Bittner. Using only simple user interface manipulations and visual pattern finding, we are able to reproduce results qualitatively similar to those of more exact computational methods.

We have also obtained comparable results from analyzing data from Luo et al. [9]. This data set includes gene expression differences between tissues representing human prostate cancer and benign prostatic hyperplasia (BPH).

4 Conclusion

Our preliminary experiments with VistaClara indicate that it is a useful and powerful tool for exploring data from microarray experiments. It is possible to manipulate large heterogeneous data sets consisting of multiple microarray experiments and relevant supplemental annotations and data, enabling visual searching for biologically meaningful patterns. Testing with melanoma and prostate data confirms that it is possible to obtain qualitative insights via interactive matrix permutations, and that these results are qualitatively similar to those of more rigorous computational methods.

While VistaClara was designed for microarray analysis, features such as similarity sorting and color-encoded ink blobs can be readily applied to other data types as well as other forms of visualization.

References

[1] D. Slonim, "From patterns to pathways: gene expression data analysis comes of age," Nature Genetics, Vol. 32 supplement, pp. 502-508, 2002.

[2] R. Rao, "See & Go Manifesto," Interactions, Vol. 6, No. 5, pp. 64-ff, 1999.

[3] C. Schmid and H. Hinterberger, "Comparative multivariate visualization across conceptually different graphic displays," Proc. of SSDBM '94, pp. 42-51, 1994.

[4] R. Rao and S. Card, "Table lens: Merging graphical and symbolic representations in an interactive focus plus context visualization for tabular information," Proc. of ACM Conf. on Human Factors in Comp. Systems (CHI '94), pp. 318-322, 1994.

[5] H. Siirtola, "Interaction with the Reorderable Matrix," Proc. Internat. Conf. on Information Visualization, pp. 272-277, 1999.

[6] J. Bertin, Graphics and Graphic Information Processing, deGruyter, New York, 1981.

[7] S. Card, J. Mackinlay, and B. Shneiderman, Readings in Information Visualization, Morgan Kaufmann, 1999.

[8] M. Bittner et al., "Molecular classification of cutaneous malignant melanoma by gene expression profiling," Nature, Vol. 406, pp. 536-540, 2000.

[9] J. Luo et al., "Human prostate cancer and benign prostatic hyperplasia: molecular dissection by gene expression profiling," Cancer Res., Vol. 61, pp. 4683-4688, 2001.

[10] Y. Kawakami et al., "Cloning of the Gene Coding for a Shared Human Melanoma Antigen Recognized by Autologous T Cells Infiltrating into Tumor," Proc. Natl. Acad. Sci. USA, Vol. 91, pp. 3515-3519, 1994.


Interactive Poster: Linking Scientific and Information Visualization with Interactive 3D Scatterplots

Robert Kosara, Gerald N. Sahling, Helwig Hauser
VRVis Research Center, Vienna, Austria
http://www.VRVis.at/vis/
Kosara@VRVis.at, niki.sahling@paradigma.net, Hauser@VRVis.at

Abstract

3D scatterplots are an extension of the ubiquitous 2D scatterplot that is conceptually simple but has so far proved hard (if not impossible) to use in practice. By combining them with a state-of-the-art volume renderer, multiple views, and interaction between these views, 3D scatterplots become usable and, in fact, useful.

Not only do 3D scatterplots show complex data, they can also show the structure of the object under investigation. Thus, they provide a link between feature space and the actual object. Brushing reveals connections between parts and features that are otherwise hard to find. This link works not only from feature space to the spatial display but also vice versa, which gives the user more ways to explore the data.

Keywords: Scientific Visualization, Information Visualization, Scatterplots, Interaction

1 Introduction

Scientific visualization (SciVis) is usually considered separate and independent from information visualization (InfoVis). But there are many applications whose data is typical of both fields, e.g., flow data with many dimensions. In such cases, it is beneficial to combine both so that the data can be handled more easily.

Scatterplots are a ubiquitous visualization method used in many applications. They can not only show abstract data dimensions very effectively, but also provide a crude image of an object if fed with the right data (i.e., point coordinates).

In this paper, 3D scatterplots are presented as a way to link scientific and information visualization: by using concepts and methods from both, integrating them with common interactions, and providing an image of the data that cannot be attained with only one of the parts. Rendering and interaction are also fast, because state-of-the-art volume rendering software is used for displaying the scatterplots.

2 Related Work

3D scatterplots have been proposed before, even using volume rendering [1]. But the resolution of the data there (20x50x50) was very coarse, and because the data bins were displayed in a very fuzzy way, structures in the data were very hard to see. Interaction and speed also seemed to be lacking.

Using multiple linked views is one of the key ideas for combining scientific and information visualization. One very good example of this is WEAVE [2], which allows the user to see different views like scatterplots, histograms, and a 3D rendering of an object, and to brush in the 2D displays.

Voxelplot uses RTVR [3], a very fast Java library for interactive direct volume rendering.

3 Voxelplot

Voxelplot is an implementation of 3D scatterplots based on RTVR. Each data point is mapped to one voxel in three-dimensional visualization space depending on its values on the selected axes.

Voxelplot usually shows four 3D scatterplots, which can be linked by the user. Linking can encompass view parameters (orientation, zoom) as well as brushing information.

The user can display different dimensions from a dataset, and also select a function and a range for mapping the whole value range of each dimension into the 256 different values the volume renderer can handle.
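The 256-value mapping is essentially a quantization step. A minimal sketch is shown below; the particular mapping functions offered (linear and logarithmic) are our assumptions for illustration, since the poster only says that a function and range can be selected.

```python
import numpy as np

# Minimal sketch: map a data dimension into the renderer's 256 values.
def quantize(values, lo, hi, func="linear"):
    """Map values clipped to [lo, hi] onto integer voxel values 0..255
    (assumes lo < hi)."""
    v = np.clip(np.asarray(values, dtype=float), lo, hi)
    if func == "log":
        v, lo, hi = np.log1p(v - lo), 0.0, np.log1p(hi - lo)
    t = (v - lo) / (hi - lo)
    return np.round(t * 255).astype(np.uint8)
```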

In any of the scatterplots, the user can brush points, which are then labeled as interesting. Unlike systems such as WEAVE, brushing can be done in any view, making interaction more flexible. By being able to brush the physical structure of the object, different hypotheses can be tested than when only the features can be brushed.

In addition to a range brush, which consists of sliders that allow the user to specify the boundaries of the brush in any number of dimensions, we have implemented a beam brush. A beam brush brushes all points inside a cylinder that lies perpendicular to the viewing plane and whose radius the user can select. Using different logical combinations of brushes (AND, OR, etc.), the user can build any complex brush quite easily from a number of beams.

Another brush, useful for "pure" InfoVis applications, is the cluster brush. It uses the results of a clustering algorithm (which are part of the data set) to allow the user to brush whole clusters.
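Geometrically, the beam brush is a point-in-cylinder test. The sketch below shows the underlying math under assumed conventions (a unit view direction and a click point on the beam axis); it is an illustration, not Voxelplot's actual code.

```python
import numpy as np

# Minimal sketch of a beam brush: select every point whose offset
# perpendicular to the view direction is within `radius` of the beam.
def beam_brush(points, view_dir, click_point, radius):
    """points: (N, 3) array; view_dir: vector perpendicular to the
    viewing plane; click_point: a 3D point on the beam's axis.
    Returns a boolean mask of brushed points."""
    view_dir = view_dir / np.linalg.norm(view_dir)
    rel = points - click_point
    along = rel @ view_dir                      # component on the axis
    perp = rel - np.outer(along, view_dir)      # offset from the axis
    return np.linalg.norm(perp, axis=1) <= radius
```

Logical combinations (AND, OR) then reduce to combining the boolean masks of several beams.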

4 Results

This section describes some results we obtained in working with a flow dataset from a catalytic converter simulation. The data set consisted of 9600 data points and 15 dimensions, among them the 3D coordinates of each data point, a velocity vector, pressure, turbulent kinetic energy (tkenergy), etc.

Generally, there are three questions the user wants to answer in scientific visualization: Where are data of a certain characteristic? What other features do these data have? And what characteristics are present in a certain part of the object? The first and last questions lead from information to scientific visualization, and vice versa, while the second question can be answered with InfoVis alone.

Selecting low-pressure areas in parameter space (Figure 1a) shows where in the object these areas are (Figure 1b). From there, the analysis can be refined, e.g. by brushing one of the structures that are obviously present in the parameter space (this is all done using beam brushes). When this is done, it turns out that they correspond to different parts of the catalytic converter (Figure 1c/d).

Figure 1: Examples of segmenting the catalytic converter data set in parameter space. The axes in parameter space are pressure (red axis and color), velocity (green), and tkenergy (blue). (a) Selecting the low-pressure areas in parameter space (b) shows where the pressure is low in the physical object. (c) The lower part of the "spoon" (d) represents the converter monolith.

We need to be able to brush from feature space to the spatial view as well as the other way around to be sure that our analysis is correct. Brushing a structure in feature space and seeing a part of the converter being brushed in the spatial view still leaves the possibility that there are points in this part that are not part of the brushed structure in feature space (and simply were not visible between the brushed points). So to verify that the feature-space structure indeed corresponds exactly to that part of the converter, it is necessary to brush in the spatial view and look for brushed points outside the structure that was originally brushed.

This analysis brought to light the structure of the multi-block simulation, in which different parts of the catalytic converter are treated differently and the grids also differ. The gaps between the features of adjacent parts of the grid suggest that a higher resolution could be useful, and that more care should be taken at the interface between the parts to make the transitions smoother.

The results are difficult to characterize, because the discovered structures are complex. But this demonstrates how powerful the method is: even highly complex structures that are only discriminable in 3D can be found and separated.

5 Conclusions, Future Work

We have shown that information and scientific visualization can be integrated seamlessly and very flexibly through the use of a common method: interactive 3D scatterplots.

The combination of methods and ideas from these two different fields also makes efficient work with high-dimensional data possible and useful to engineers. 3D scatterplots can also deal with data sets that are usually considered large in information visualization (over one million data points).

More work combining scientific and information visualization should be done, and we believe that this will happen more and more often. InfoVis can act as a support for scientific visualization, and techniques from SciVis can be used in InfoVis.

Undoubtedly, this work is only a first step, and a lot of work remains to be done. Perhaps the most important next step is to provide more depth cues to the user, like perspective projection and stereo viewing. In addition, more use could be made of the possibilities of volume rendering, like better use of transparency, edge enhancement, MIP (maximum intensity projection), isosurfacing, etc.

Acknowledgements

This work was done in the scope of the basic research on visualization (http://www.VRVis.at/vis/) at the VRVis Research Center in Vienna, Austria (http://www.VRVis.at/), which is funded by the Austrian research program Kplus. The dataset is courtesy of AVL List GmbH, Graz, Austria.

References

[1] Barry G. Becker. Volume rendering for relational data. In IEEE Symposium on Information Visualization (InfoVis '97), pages 87-91. IEEE, October 1997.

[2] D. L. Gresh, B. E. Rogowitz, R. L. Winslow, D. F. Scollan, and C. K. Yung. WEAVE: A system for visually linking 3-D and statistical visualizations, applied to cardiac simulation and measurement data. In Proceedings Visualization 2000, pages 489-492. IEEE, October 2000.

[3] Lukas Mroz and Helwig Hauser. RTVR - a flexible Java library for interactive volume rendering. In IEEE Visualization '01 (VIS '01), pages 279-286. IEEE, 2001.


Interactive Poster: Enlightenment: An Integrated Visualization and Analysis Tool for Drug Discovery

Christopher E. Mueller
Array BioPharma, 3200 Walnut St., Boulder, CO 80301
cmueller@arraybiopharma.com

Abstract

Commercial software tools for interpreting analytical chemistry data provide basic views but offer few domain-specific enhancements for exploring the data. Gaining an understanding of the results for an individual compound and for a large set of compounds requires examining multiple data sets in multiple applications for each compound. In this poster, we present Enlightenment, a new tool that takes the traditional look and feel of an analytical application and significantly enhances the utility of its visualizations. Using Enlightenment, analytical chemists can review large sets of compounds quickly and explore the data from a single, unified interface. Enlightenment demonstrates how applying domain knowledge can enhance the usefulness of traditional displays.

Keywords: Visualization, Chromatography, HPLC, Mass Spec, High Throughput Synthesis

1 High Throughput Synthesis

High-throughput synthesis is the process of using combinatorial chemistry to create large numbers of related but diverse compounds quickly. The main vessel for handling compounds is a plate: wells arrayed in an m x n matrix, where m x n is typically 12 x 8, yielding 96 wells.

To confirm that the correct products have been created, each plate is analyzed using a high-performance liquid chromatography (HPLC) instrument with UV and mass spectrometric (MS) detection, to confirm purity and identity, respectively. An algorithm is applied to the data to make the first determination of whether or not the compound was created properly. These results are then reviewed by an analytical chemist, who either confirms or amends them. Interpreting the results algorithmically is non-trivial and often produces incorrect results, requiring human intervention to determine whether a compound passes or fails.

The manual process consists of using a collection of vendor-supplied tools to explore the data, each task requiring a separate application: one for viewing the plate and algorithmic results, one for viewing raw data for each well, one for viewing compound structures, and a spreadsheet for tracking observations. Finnigan's Xcalibur/Discovery [1,2] and Waters' OpenLynx [3] system are examples of such commercial systems.

2 Enlightenment

Enlightenment provides a unified interface to all plate, structure, and analytical data. It applies information visualization techniques to help the analytical chemist understand results quickly and to increase the data density of the visualizations. When data exploration is required, a series of data-aware, linked plots allows the chemist to drill down into the data from a single application.

Figure 1. Enlightenment.

Enlightenment is designed to be immediately familiar to analytical chemists, but provides a more information-rich view of the data than commercially available tools. The main views integrated into the UI are the plate view, with its linked tree and compound structure views, and an analytical data view that shows the processing results, linked to plots of the raw data.

3 Plate View

The plate view in commercial applications displays a grid of color-coded circles, one per well, with the color denoting the status of the well. By default, Finnigan's Discovery Browser [2] uses four colors denoting pass (green), found but not pure (yellow), pure but not found (pink), and fail (red). However, other data items exist that can be displayed at the well level to give the chemist a better idea of what is happening in the plate. It is often the case that the chemist will step through each well to acquire these, just to get a better view of the big picture.

Enlightenment uses the Finnigan color scheme to maintain familiarity, but replaces pink with blue, since some displays made it hard to distinguish pink from red. The intensity of the colors was also adjusted, using the guidelines in [4, p. 164], so that no single color stands out. Enlightenment uses overlays and size to clearly show three extra dimensions of data: HPLC signal strength, channel used, and percent BPI (MS signal strength). These values are typically used to understand problems with a plate, and are only available through analysis of multiple plots per well in commercial applications.

Figure 2. Icons, colors and overlays.

Signal strength is illustrated by the size of the circle: smaller for low signals and larger for signals that are too strong. Size alone was hard to distinguish on small displays, so a "noisy" border was added to give the appearance of a deviant signal.

Selected channel and percent BPI use overlays to highlight cases that occur infrequently. Generally, channel 1 is selected and the BPI is 100%. If a different channel was used, the channel's number is overlaid in the upper left corner of the well. If the BPI is below a threshold (e.g. 80%), a bar appears on the left edge of the well, its height relative to the BPI. By using the overlays only in these cases, wells that exhibit these behaviors stand out.
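Taken together, the well glyph encodes status, signal strength, channel, and BPI at once. The sketch below summarizes this encoding; the status colors follow the poster, while the numeric thresholds, radii, and field names are illustrative assumptions.

```python
from dataclasses import dataclass

# Status colors follow the poster; thresholds and sizes are assumed.
STATUS_COLORS = {"pass": "green", "found_not_pure": "yellow",
                 "pure_not_found": "blue", "fail": "red"}

@dataclass
class WellGlyph:
    color: str
    radius: float          # encodes HPLC signal strength
    noisy_border: bool     # flags a deviant (too low/high) signal
    channel_label: str     # overlay shown only when channel != 1
    bpi_bar: float         # overlay bar height, only when BPI < 0.8

def make_glyph(status, signal, channel, bpi, sig_lo=0.2, sig_hi=0.9):
    """Map one well's data (signal and bpi normalized to [0, 1])
    onto the glyph attributes described above."""
    deviant = signal < sig_lo or signal > sig_hi
    return WellGlyph(
        color=STATUS_COLORS[status],
        radius=4.0 + 6.0 * min(signal, 1.0),
        noisy_border=deviant,
        channel_label=str(channel) if channel != 1 else "",
        bpi_bar=bpi if bpi < 0.8 else 0.0,
    )
```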

Enlightenment's plate view uses different levels of detail (LODs) to display more or less information about each well, depending on the audience. For instance, business development staff can select an LOD that displays only green/red to determine which compounds can be sold, whereas an analytical chemist would select the most detailed LOD.

The plate view is linked to a tree view that displays detailed information for each compound, and to a structure view that displays the structure of the selected compound (Figure 1, top row). The analytical views are also linked to the selected well.

4 Analytical Results View

The analytical results views are located beneath the plate view (Figure 1, bottom three rows). There are four different channels of analytical information used to characterize a compound, three of which are displayed by default. Applying the concept of multiples in space and time [5], each channel has an identical results view and a set of plots. Because the results view is linked to the plate view, changing the status of a well in the results view also changes the color and overlays for that well in the plate view.

5 Analytical Plot Views

HPLC and MS data are represented by line and stick plots, respectively. HPLC data consists of a time-series trace with distinct peaks. Each peak corresponds to some amount of material passing through the detector, and comparing peak areas gives the purity of each peak. Each peak has a start and an end point, and the MS data is sub-sampled to show data in the range of each peak. Selecting a peak in an HPLC trace displays the corresponding sub-sample of the MS data in the MS plot. MS plots show the mass-to-charge (m/z) ratio on the x-axis and relative intensity on the y-axis.

Figure 3. Chromatogram and mass spec plots.

Applying the principle of maximizing data ink [6], the HPLC and MS plots were redesigned to display more information than the simple scientific plots used in commercial tools. The axes on all plots were replaced with range-frame axes with carefully selected tick marks.

Signal strength is important for HPLC traces; too low or too strong a signal leads to incorrect purity results. The y-axis range-frame starts at the minimum good value and ends at the maximum observed value. If the signal is low, a single tick mark with no axis denotes the maximum value (Figure 1, middle plot). Thus, a quick glance can tell a chemist whether the signal was strong enough for proper evaluation. Signals that are too strong lead to obviously distorted traces and have no special marking. Often, all data prior to a certain time is excluded from analysis. The x-axis range-frame spans only the time range used in processing and includes a single tick mark showing the time of the currently selected peak. Labels on the peaks denote the purity of each peak. If the target compound was found for a given peak, its mass is displayed alongside the purity value.

For an MS intensity to be useful, it should be above 20%. This is displayed by the y-axis range-frame on the MS plot, which spans 20-100%. The x-axis range-frame spans the entire length of the plot, with ticks at either end displaying the minimum and maximum m/z values. Sticks are labeled with their m/z value.

The peaks in the HPLC plot are dynamically linked to the MS plot. Changing the endpoints of a peak, or drawing a new peak, sub-samples the MS data in real time to display the mass spec for the new peak.
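The real-time sub-sampling amounts to selecting the MS scans whose retention times fall inside the peak's time window. The sketch below illustrates this under an assumed data layout (one scan per retention time on a shared m/z grid, combined by averaging); it is not the product's actual code.

```python
import numpy as np

# Minimal sketch of linking an HPLC peak to its MS sub-sample.
def ms_for_peak(scan_times, scans, t_start, t_end):
    """scan_times: (N,) retention times; scans: (N, M) intensities
    over a shared m/z grid. Returns the mean spectrum of the scans
    acquired within the peak's [t_start, t_end] window."""
    mask = (scan_times >= t_start) & (scan_times <= t_end)
    if not mask.any():
        return np.zeros(scans.shape[1])
    return scans[mask].mean(axis=0)
```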

All plots feature interactive panning, zooming, and arbitrary value picking. Zooming is accomplished by drawing a rectangular region around a plot area to define the new view, or by scrolling the ends of the PanBar controls (Figure 4). PanBars are similar to Spotfire's range sliders [7] and allow both panning and zooming. Originally, only the PanBars were available for zooming, but user feedback led to the addition of the zoom box and a button in the lower-left corner of the view that zooms out completely. If no mouse button is pressed, the current x/y value below the mouse cursor is displayed in the status bar in data coordinates.

Figure 4. PanBars and zoom controls.

6 Conclusions

Enlightenment is similar to commercial analytical chemistry applications. However, careful analysis of the domain and of chemists' usage patterns has led to several enhancements. By combining the functionality of multiple applications into one, we have eliminated redundant features and provided better linking among views. Using information visualization techniques, the views build on familiar displays but show significantly more information, allowing chemists to draw conclusions more effectively.

References

[1] Finnigan (2000). Xcalibur 1.2. [Software]

[2] Finnigan (2000). Xcalibur Discovery Browser 1.2. [Software]

[3] Waters (2003). OpenLynx Application Manager - Processing & Reporting (Retrieved June 17, 2003). www.waters.com.

[4] Kosslyn, S. M. (1994). Elements of Graph Design. US: W. H. Freeman and Company.

[5] Tufte, E. R. (2002). Visual Explanations. Conn: Graphics Press.

[6] Tufte, E. R. (2001, 2nd Ed.). The Visual Display of Quantitative Information. Conn: Graphics Press.

[7] Spotfire, Inc (2001). Spotfire DecisionSite 6.3.0.349 [Software]


Interactive Poster: 3D ThemeRiver

Peter Imrich (1), Klaus Mueller (1), Dan Imre (3), Alla Zelenyuk (3), Wei Zhu (2)
(1) Center for Visual Computing, Computer Science, Stony Brook University
(2) Applied Mathematics and Statistics, Stony Brook University
(3) Environmental Sciences, Brookhaven National Laboratory
email: {imrich, mueller}@cs.sunysb.edu, {imre, alla}@bnl.gov, zhu@ams.sunysb.edu

Abstract

A limitation of the existing ThemeRiver [1] paradigm is that only one attribute can be displayed per theme. In this poster, we present a 3D extension that enables us to display two attributes of each variable in the data stream. We further describe a technique to construct the Bezier surface that satisfies the ThemeRiver requirements, such as boundedness and preservation of local extrema.

1 Introduction

The ThemeRiver visualization traditionally displays different variables as distinctly colored data streams. The streams usually flow along the time axis, and their widths reflect the attribute of a particular stream at a particular point in time. This attribute can be anything worth investigating, such as the time fluctuations of different company stock values, ranging from simple distributions to more complex variables. The main advantage of a ThemeRiver visualization is that it portrays different data groups simultaneously, revealing their co-variance and showing how they behave together. An example of a 2D ThemeRiver visualization is shown in Fig. 1.

The 3D counterpart that we propose in this paper extends this idea and maps a second attribute, such as the revenue of the companies, to the height of the streams. Thus, the x-axis represents time, the y-axis the stock price, and the z-axis the company revenue. In short, 3D ThemeRiver is naturally suited to exhibit any sequential ternary covariate trends, mapping one quantity as width and another as height. It is suited to correlating data episodes and environment.

2 Construction<br />

Our 3D ThemeRiver is represented by a composite Bezier surface.<br />

Hence, the entire following discussion will revolve around<br />

the placement of the Bezier control points so that the resulting surface<br />

truly reflects the underlying data.<br />

A very important property of the correct surface is that it needs<br />

to preserve the extreme points in the dataset. This constraint is also<br />

maintained for spline curves in the original 2D ThemeRiver application.<br />

In other words, it is undesirable to violate local maxima or<br />

Figure 1: Traditional 2D ThemeRiver view of a few select dot-com company stocks in the period January 1999 - April 2002.

Interactive Poster: 3D ThemeRiver<br />

Peter Imrich 1 Klaus Mueller 1 Dan Imre 3 Alla Zelenyuk 3 Wei Zhu 2<br />

1 Center for Visual Computing, Computer Science, Stony Brook University

2 Applied Mathematics and Statistics, Stony Brook University<br />

3 Environmental Sciences, Brookhaven National Laboratory<br />

email: {imrich, mueller}@cs.sunysb.edu, {imre, alla}@bnl.gov,<br />

zhu@ams.sunysb.edu<br />

Figure 2: Bezier curves and surfaces around extreme points. (a) Correctly interpolated data points; (b) the same curve with inflections and incorrect maxima; (c) top view of a Bezier surface that violates width extremeness (the lower boundary of the red stream contains two inflections); (d) the same surface viewed in profile, showing an incorrect peak.

minima, and it is important to control surface inflections. A rule of<br />

thumb in this situation is to find a surface that does not overshoot<br />

its four corner points of any of its Bezier patches. Similarly, the<br />

curvature of stream boundaries has to preserve the same extreme<br />

points. This is illustrated in Fig. 2.<br />

Another more obvious requirement for the final surface is that<br />

it should be smooth. This can be achieved by placing the neighboring<br />

control points of adjacent patches into a co-planar configuration,<br />

preferably forming a parallelogram. This achieves C² continuity.

To satisfy the above, we represent each stream interval by two<br />

Bezier patches. The lower patch shares a boundary with the lower<br />

neighboring stream, and the upper patch shares a boundary with<br />

the upper neighboring stream. The center of the stream lies on the edge

shared by these two patches. This way, the stream boundaries as<br />

well as the stream troughs do not violate the previously stated<br />

ThemeRiver constraints. Both lie on edges of the two patches.<br />

(Recall that a Bezier patch passes through its four corner control<br />

points and only approximates the rest.)

Given a height field, the procedure that constructs the<br />

ThemeRiver Bezier surface is as follows.

• Generate the boundary points - the height of a boundary can simply<br />

be a linear interpolation of heights of adjacent stream centers.<br />

• Stack and center the data streams.<br />

• Compute the placement of control points. The corner points are<br />

directly given by the data. Points along the edges can be determined<br />

solely from the positions among the corner control points<br />

along the same edges. Finally, the diagonal points only depend on<br />

the local slopes and their displacement from their closest corner<br />

point. The slopes at each corner point are designed either to preserve local extrema (the slopes in these cases are zero in whatever direction the extremum occurs) or to blend the overall slopes of neighboring patches (overall slopes can be estimated by looking only at the positions of the corner points of the involved patches). A sketch of this placement follows.
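To make the control-point placement concrete, here is a minimal sketch of the one-dimensional (curve) case in Python; the surface case applies the same rule along both parameter directions. This is our illustration, not the authors' code: the function names, the end-point treatment, and the dt/3 inner-point displacement (the standard Hermite-to-Bezier conversion) are our assumptions.

```python
import numpy as np

def corner_slopes(heights):
    """Tangent slopes at the data points of one boundary curve: zero at
    local extrema (so the spline cannot overshoot a maximum or minimum),
    otherwise a blend (average) of the two neighboring segment slopes."""
    slopes = np.zeros(len(heights))
    for i in range(1, len(heights) - 1):
        left = heights[i] - heights[i - 1]
        right = heights[i + 1] - heights[i]
        # A sign change marks a local extremum: force a flat tangent there.
        slopes[i] = 0.0 if left * right <= 0 else 0.5 * (left + right)
    return slopes  # end points keep slope 0 (an assumption)

def bezier_segments(heights, dt=1.0):
    """Cubic Bezier control values for each time interval: the corner
    points come directly from the data, and the two inner control points
    are displaced from the corners along the corner slopes by dt/3, so a
    zero slope at an extremum keeps the curve flat there."""
    s = corner_slopes(heights)
    return [(heights[i],
             heights[i] + s[i] * dt / 3.0,
             heights[i + 1] - s[i + 1] * dt / 3.0,
             heights[i + 1])
            for i in range(len(heights) - 1)]
```

For the surface, the analogous recipe gives the edge control points from the corner points along each edge, and the diagonal (interior) points from the local slopes, exactly as the list above describes.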

3 Domain Application<br />

Our particular application deals with the survey and analysis of<br />

a large collection of millions of digitized aerosol particle spectra.<br />

Our data comprise a 450-bin molecular mass spectrum for each of millions of individual particles, along with their total mass, a time stamp, and a score of environment variables such as humidity and ozone concentration. All of these make up a 500-D feature

vector for each particle.<br />

Our 3D ThemeRiver is part of a comprehensive data mining<br />

and data clustering package for aerosol data that we have developed<br />

at BNL. Atmospheric scientists use the 3D ThemeRiver<br />

application to visualize time-variant or other environmental trends<br />

in context to the data clusters. An example of such an interactive<br />

display is shown in Fig. 3. The display is linked to the classification<br />

engine and display, and scientists can interactively modify the<br />

variables and streams displayed.<br />

Figure 3: A 3D ThemeRiver visualization of 17 organic clusters.<br />

Width encodes the overall cluster distributions (the magnitude of each cluster) and height encodes the incidence of zinc.

4 Comparison of 3D ThemeRiver with other<br />

approaches<br />

As an experiment we compared the performance of our 3D<br />

ThemeRiver approach with a modified 2D version that also<br />

attempts to incorporate a second variable. Basically, we wanted to<br />

study if we can actually gain from the 3D extension, or if a modified<br />

2D version would have performed just as well. In the 2D version,<br />

we assigned each stream a constant hue and saturation, but<br />

varied stream brightness in the same way we raised and lowered<br />

the landscape in the 3D version. The results of this experiment are<br />

shown in Fig. 4. There, 12 clusters are mapped across time, with<br />

width being mapped to their overall distribution and with height or<br />

brightness tracking the incidence of iron. The modified 2D version<br />

(Fig. 4a) highlights well the regions abundant in iron; however, it loses the visual separation of streams in zones lacking particles with this element. The stream distinctions are slightly improved when the brightness range is clipped between 0.25 and 1 (see Fig. 4b).

Nonetheless, this modification compromises the strength of highlights.<br />

The 3D ThemeRiver preserves colors and reflects the<br />

changes of iron occurrence on the z-axis. Interactive navigation of<br />

this scene is able to accentuate the depth diversity even more. In<br />

this respect a fully 3D extension to ThemeRiver appears superior<br />

to these other approaches.<br />

5 Navigation<br />

3D navigation greatly enhances visual understanding. Our

3D ThemeRiver can be rotated, translated, scaled and box-zoomed<br />


Figure 4: Comparison of the 3D approach to a 2D HSV<br />

approach: (a) 2D HSV ThemeRiver, (b) 3D ThemeRiver.<br />

all in real time, facilitated by commodity graphics hardware. In<br />

addition, the user also has the flexibility to move the light source around the scene to emphasize different geometric aspects of the flow. Shadows add further depth cues. Fig. 3 shows a navigable 3D ThemeRiver visualization, in which 17 organic streams depict their overall distribution (width) and the occurrence of zinc (height).

6 Future Work<br />

There are several potential areas for further research and<br />

improvements of this prototype tool. We would like to investigate<br />

ways to enrich 3D ThemeRiver to visualize more attributes per<br />

stream. One way would be to provide a CD player-like interface to

animate over the set of attributes. Another, quite intriguing, strategy<br />

would be to employ the concept of spectral volume rendering<br />

[2] to provide a set of “metameric lamps” to be used for highlighting<br />

different combinations of stream attributes on the fly.<br />

Acknowledgments<br />

We thank the Center for Data Intensive Computing (CDIC) at<br />

Brookhaven National Lab for their generous support of part of this<br />

work.<br />

References<br />

[1] S. Havre, E. Hetzler, P. Whitney, and L. Nowell, “ThemeRiver: Visualizing Thematic Changes in Large Document Collections,” IEEE Trans. Visualization and Computer Graphics, vol. 8, no. 1, pp. 9-20, 2002.
[2] S. Bergner, T. Möller, M. Drew, G. Finlayson, “Interactive spectral volume rendering,” IEEE Visualization 2002, pp. 101-108, 2002.


Interactive Poster: A Hardware-Accelerated Rubbersheet Focus + Context<br />

Technique for Radial Dendrograms<br />

Peter Imrich 1 Klaus Mueller 1 Dan Imre 3 Alla Zelenyuk 3 Wei Zhu 2
1 Center for Visual Computing, Computer Science, Stony Brook University
2 Applied Mathematics and Statistics, Stony Brook University
3 Environmental Sciences, Brookhaven National Laboratory
email: {imrich, mueller}@cs.sunysb.edu, {imre, alla}@bnl.gov, zhu@ams.sunysb.edu

Abstract

Previous focus+context techniques for radial dendrograms only allow users to either stretch the display along the radius or the angle. In this poster, we present an interactive, hardware-accelerated rubbersheet-like technique that allows users to perform both operations simultaneously.

1 Introduction

Dendrograms are a popular visualization method for illustrating<br />

the outcome of decision tree-type clustering in statistics. Most<br />

commonly, dendrograms are drawn in a Cartesian layout, as an upright<br />

tree. However, this layout does not make good use of space: it is sparse towards the root and crowded towards the leaf nodes (see

Fig. 1). The spacing between nodes at different levels in the hierarchy<br />

is not uniform, which is due to the shrinking number of nodes<br />

from bottom to top. For this reason, long, wide-spanning connecting<br />

lines are needed to merge nodes at higher levels. A better layout<br />

in this respect is the polar or radial layout, where leaf nodes are<br />

located on the outer ring and the root is located in the center, as a<br />

focal point. A more uniform node spacing results, leading to a better<br />

utilization of space and resulting in a better illustration of the<br />

class relationships. Recently, Barlow and Neville [1] presented an<br />

empirical user study for tree layouts (with less than 200 leaves) in<br />

which they compare some of the major schemes: organizational<br />

chart (a standard drawing of a tree), tree ring (basically a pie chart<br />

of circular segments), icicle plot (the Cartesian version of the tree

ring), and tree map. According to the measured performance<br />

within a group of 15 users, the three former methods yielded similar<br />

results, with the icicle plot having a slight advantage. However,<br />

given the much larger number of leaves in our case (1000 and<br />

more) and the fact that the tree ring is the most compact of the<br />

three winning configurations, a radial layout seemed to be the most<br />

favorable one for our purposes (see Fig. 2). Radial graph layouts<br />

that illustrate hierarchical relationships are very popular, and for<br />

the special application of dendrograms, we know only of one other<br />

application using a radial layout, the recent one by Kreuseler and

Schumann [3]. In their implementation, the radii of the circles onto<br />

which nodes can be placed are quantized into a number of levels.<br />

The radius at which a (non-leaf) node is placed is a measure of the<br />

dissimilarity among its child-nodes, and a linear mapping is used to<br />

relate dissimilarity to radius. Leaf nodes, on the other hand, are<br />

always placed onto the circle one level below that of the parent<br />

node, while the root node is always at the center of the radial layout.<br />

Figure 1: Example of a large dendrogram, drawn in Cartesian layout as an upright tree.

Context and focus is provided by mapping the radial dendrogram onto a hemisphere, which can be rolled to expand interesting

hemisphere regions in the center of projection. A number of radial<br />

layout techniques for hierarchies with a fixed root node are<br />

described by Wilson and Bergeron [6]. They show techniques that<br />

achieve (i) an equi-spaced grouping of leaf nodes on the outer-most circle; (ii) an equi-spaced grouping of inner nodes on the inner circles; (iii) a layout in which leaves are spaced on the outer-most circle with respect to their value range; and (iv) a density-based layout.

Although these layout techniques, and their hybrids, provide<br />

some level of flexibility, they are still somewhat static. As mentioned<br />

before, users often would like to focus on certain portions of<br />

the display, while compressing others, without losing context.<br />

Fisheye lenses [5] and hyperbolic zooming [4] have been proposed<br />

to provide these capabilities. In the context of tree rings Yang,<br />

Ward and Rundensteiner [7] have proposed a system in which<br />

users may either perform a radial zoom (i.e., expand the width of one or more adjacent rings while reducing others) or a polar zoom (i.e., expand the arc angle of some adjacent segments while reducing

others). Users can perform these operations by pinning down<br />

one ring or arc segment and dragging another. A limiting factor<br />

here is that users cannot perform both operations simultaneously,<br />

which can be awkward in certain instances. To address this shortcoming,<br />

our application generalizes these concepts by allowing<br />

arbitrary warps of the dendrogram domain, i.e. we allow radial and<br />

polar zooms simultaneously.<br />

2 Radial Dendrogram Preliminaries<br />

In our implementation, the radii of the circles onto which<br />

nodes can be placed are quantized into a number of levels. The<br />

radius at which a (non-leaf) node is placed is a measure of the dissimilarity<br />

among its child-nodes, and a logarithmic mapping

is used to relate dissimilarity to radius. Leaf nodes, on the other<br />

hand, are always placed onto the circle one level below that of the<br />

parent node, while the root node is always at the center of the radial<br />

layout.<br />
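As a small illustration of this layout rule, a non-leaf node's quantized radius level could be computed as follows. This is our own sketch under assumptions the poster does not spell out (the log1p form of the mapping and the level orientation):

```python
import math

def radius_level(dissimilarity, max_dissimilarity, num_levels):
    """Snap a non-leaf node's dissimilarity onto one of the concentric
    levels: a logarithmic mapping normalizes it to [0, 1], and more
    dissimilar groups land closer to the center, where the root sits
    (level 0 = center, num_levels - 1 = outer ring)."""
    t = math.log1p(dissimilarity) / math.log1p(max_dissimilarity)
    return round((1.0 - t) * (num_levels - 1))

def leaf_level(parent_level, num_levels):
    """Leaf nodes always sit one level outside their parent node."""
    return min(parent_level + 1, num_levels - 1)
```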

The above layout process only depends on two user inputs: (i)<br />

the desired number of distinct concentric levels (used to reduce the<br />

number of nodes and arcs with respect to the resolution of the distance<br />

metric), and (ii) the desired minimum size of visible nodes

(used to reduce the number of nodes with respect to their population<br />

density). However, we should note that the drawn dendrogram<br />

is by no means static. At any time, the user can re-specify and re-compute

the tree layout globally as well as locally by manually expanding<br />

and collapsing individual nodes and polar zones. We have<br />

found that these two user-driven node reduction features make it<br />

possible to present clusterings composed of thousands of nodes in a space-efficient manner. Edges are colored using a rainbow colormap

to indicate the number of data elements they carry. More<br />

details on this aspect of our application can be obtained from [2].<br />

3 Rubbersheet Context + Focus Technique<br />

In addition to the dynamic layout, our dendrogram has the flexibility<br />

of non-linear, rubber sheet-like zooming. Here we have<br />

aimed to provide a focus+context scheme that is in good accordance<br />

with the polar layout of our graph. There were a number of<br />

zooming operations that our users found important: (i) enlarge certain<br />

levels of the hierarchy on a global scale, (ii) enlarge a subtree,<br />

possibly all the way from the leaves to the root, and (iii) zoom into<br />

a certain area and gradually reveal more local detail. We have<br />

achieved this by allowing users to select an arbitrary arc segment<br />

of interest, via specifying two anchor points located on opposite<br />

ends of the arc segment’s diagonal via mouse clicks. The specified<br />

arc segment then expands and shrinks, responding to the mouse<br />

motion, while the rest of the dendrogram deforms by opposite, proportional<br />

transformations. An example is illustrated in Fig. 2,<br />

where Fig. 2a shows an unzoomed dendrogram, and Fig. 2b shows<br />

the same hierarchy with a user-specified (green) arc segment,<br />

whose outer edge is being compressed towards the center, and<br />

whose right edge (looking towards the center) is being pulled further<br />

to the right. This has the effect of globally expanding the lower<br />

(leaf) level of the hierarchy, as well as locally expanding the subtree<br />

captured in the arc segment’s center.<br />

A recalculation of the dendrogram layout at interactive speeds<br />

would be infeasible for dense hierarchies. Instead, we achieve the<br />

real-time speed of this operation by exploiting the texture mapping<br />


Figure 2: Rubbersheet zooming: (a) unwarped dendrogram, (b)<br />

a user-specified arc segment is angularly expanded and radially<br />

shrunk. Notice the radial distortion of the dendrogram’s polar

rings.<br />


facilities present on even low-end computers. Upon activation, the<br />

dendrogram is first captured into an image and then texture-mapped onto a radial polygonal mesh. As the user drags the mouse, the polygonal mesh deforms, which consequently warps the texture.

However, the layout is recalculated each time the distortion angle or radius exceeds a predefined threshold (we use 10° and 10% of maxRad, respectively). This layout refresh prevents the well-known pixelization artifacts of overly distorted textures. In

addition, leaf nodes formerly collapsed into a common polar zone<br />

or node are also optionally uncollapsed in this layout process (or<br />

re-collapsed upon compression). The entire process is virtually<br />

transparent to the user and enables the warping of dendrograms of<br />

almost arbitrary complexity at constant effort, as afforded by the<br />

hardware. A look-ahead mechanism could be implemented that<br />

computes new anticipated layouts based on the current warping<br />

activity of the user.<br />
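The warp itself can be read as two independent piecewise-linear remaps, one in angle and one in radius, applied to every vertex of the textured mesh; the hardware interpolates in between. The sketch below is our reading of that idea with hypothetical names, not the paper's actual mesh code:

```python
def stretch(x, lo, hi, new_lo, new_hi, total):
    """Piecewise-linear remap of one polar coordinate on [0, total]:
    the selected interval [lo, hi] maps onto [new_lo, new_hi], and the
    two outside intervals are compressed or expanded proportionally
    (assumes 0 < lo < hi < total)."""
    if x < lo:
        return x * new_lo / lo
    if x <= hi:
        return new_lo + (x - lo) * (new_hi - new_lo) / (hi - lo)
    return new_hi + (x - hi) * (total - new_hi) / (total - hi)

def warp_vertex(theta, r, ang_sel, ang_new, rad_sel, rad_new,
                two_pi=6.283185307179586, r_max=1.0):
    """Apply the angular and the radial zoom simultaneously to one
    mesh vertex (theta, r); *_sel is the selected interval and *_new
    is where the mouse drag has moved it."""
    return (stretch(theta, ang_sel[0], ang_sel[1], ang_new[0], ang_new[1], two_pi),
            stretch(r, rad_sel[0], rad_sel[1], rad_new[0], rad_new[1], r_max))
```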

Arc-segment-based zooming of 2D polar space has two independent<br />

degrees of freedom: angular and radial. These two modes,<br />

while conceptually separate, define fundamentally the same operation<br />

for the texture mapping hardware. Their simultaneous integration<br />

gives the dendrogram an elastic, rubber-like feel and allows a<br />

compact, flexible, and elegant form of focus+context. It is fundamentally<br />

different from fisheye or hyperbolic zooms or the rolling<br />

sphere approach of [4]. The former are not specifically designed to<br />

work with polar graphs, while the latter does not provide the global<br />

enlargement of certain hierarchy levels. Finally, our rubbersheet<br />

approach differs from, and is perhaps more useful for our purposes than, the method outlined in [7], since it allows both polar

and radial zoom to be performed simultaneously.<br />

4 Conclusions and Future Work<br />

The application presented in this paper combines several different<br />

techniques to support data mining and survey with visual<br />

tools. Our rubbersheet technique adds the much needed versatile<br />

focus + context to the existing features of our interactive dendrogram<br />

application, for example, adjustable level of detail and manual<br />

subtree addition, removal, and migration.<br />

Acknowledgments<br />

We thank the Center for Data Intensive Computing (CDIC) at<br />

Brookhaven National Lab for their generous support of part of this<br />

work.<br />

References<br />

[1] T. Barlow and P. Neville, “A comparison of 2D visualization of hierarchies,” Information Visualization 2001, pp. 131-138.
[2] P. Imrich, K. Mueller, R. Mugno, D. Imre, A. Zelenyuk, and W. Zhu, “Interactive Poster: Visual Data Mining with the Interactive Dendrogram,” IEEE Information Visualization Symposium, poster session. Available at http://www.cs.sunysb.edu/~mueller/research.
[3] M. Kreuseler and H. Schumann, “A flexible approach for visual data mining,” IEEE Trans. Visualization and Computer Graphics, vol. 8, no. 1, pp. 39-51, 2002.
[4] T. Munzner, “Exploring Large Graphs in 3D Hyperbolic Space,” IEEE Computer Graphics & Applications, vol. 18, no. 4, pp. 18-23, 1998.
[5] M. Sarkar and M. Brown, “Graphical fisheye views,” Communications of the ACM, vol. 37, no. 12, pp. 73-84, 1994.
[6] R. Wilson, R. Bergeron, “Dynamic hierarchy specification and visualization,” Information Visualization 1999, pp. 65-72.
[7] J. Yang, M. Ward, and E. Rundensteiner, “InterRing: An interactive tool for visually navigating and manipulating hierarchical structures,” IEEE 2002 Symposium on Information Visualization, pp. 77-92, 2002.


Interactive Poster: Visualizations in the ReMail Prototype<br />

Abstract<br />

Over the past several years, the Collaborative User Experience<br />

research group, in conjunction with the Lotus Software division of<br />

IBM, has been investigating how people use email and how we<br />

might design and build a better email system. In this<br />

demonstration, we will show a prototype email client developed<br />

as part of a larger project on “reinventing email.” Among other<br />

new features, this prototype incorporates novel visualizations of<br />

the documents within mail databases to aid understanding and<br />

navigation. The visualizations include a thread map for<br />

navigating among related messages, a correspondent map for<br />

highlighting the senders of messages, and a message map which<br />

shows message relationships within a folder. Our goal in<br />

developing this prototype is to gather user experience data as<br />

people try these visualizations and others on their own email.<br />

Keywords: electronic mail, information visualization.<br />

1 Motivation<br />

Electronic mail has become the most widely used business<br />

productivity application. However, people increasingly feel<br />

frustrated by their email. They are overwhelmed by the volume<br />

(receiving hundreds of messages a day is not atypical [Levitt<br />

2000]), lose important items (folders fail to help people find and<br />

recall messages [Whittaker and Sidner 1996]), and feel pressure to<br />

respond quickly (often within seconds [Jackson et al. 2003]).<br />

Though email usage has changed, our email clients largely have<br />

not [Ducheneaut and Bellotti 2001]. As our reliance on email for<br />

performing an increasing number of daily activities grows, the<br />

situation promises to worsen.<br />

2 The Prototype<br />

To address these problems, our research group has been<br />

investigating electronic mail. In addition to user studies and

design mockups, we have implemented several prototype email<br />

clients [Rohall and Gruen 2002]. Figure 1 shows a portion of our<br />

latest prototype highlighting three novel visualizations: the<br />

thread map (2.1), the correspondent map (2.2), and the message<br />

map (2.3). The thread map is shown as part of the message’s<br />

summary information, along with its subject and recipients. The<br />

correspondent and the message maps are shown as separate<br />

panels. Since the prototype is built as a set of plugins for the<br />

Eclipse environment, the correspondent and message maps can be<br />

easily rearranged and resized (or not shown at all). This flexibility allows us to explore their use in various combinations.

Steven L. Rohall
IBM T.J.Watson Research Center
One Rogers Street
Cambridge, MA 02142, USA
steven_rohall@us.ibm.com

These three visualizations are described in more detail below.<br />

Figure 1. Overview of the ReMail Prototype<br />

Even though the visualizations are implemented as separate<br />

plugins, they are integrated when used in the prototype. Selecting<br />

a message in one will select that message in the others as well as<br />

open it in the preview window. Other visualizations are easily<br />

incorporated in this architecture.<br />

2.1 Thread Map<br />

Email threads are groups of replies that, directly or indirectly, are<br />

responses to an initial email message. Ideally, e-mail threads can<br />

reduce the perceived volume of mail in users’ inboxes, enhance<br />

awareness of others’ contributions on a topic, and minimize lost<br />

messages by clustering related e-mail.<br />

Figure 2. Thread map visualization<br />

Our prototype supports threads of email messages by providing a<br />

visualization of the thread tree when any message is selected<br />

[Kerr 2003; Rohall et al. 2001] (Figure 2). (The prototype<br />

currently supports a subset of the functionality described by Kerr<br />

[2003].) The currently selected node is highlighted with blue.<br />

Unread messages are indicated with a bold, black border.<br />

Messages that the user has sent are filled with orange.<br />

The oldest message is drawn on the left and newer messages are<br />

added to the right. No attempt is made to indicate an accurate<br />

time scale for the message nodes; instead, nodes are evenly<br />

spaced simply indicating which messages are more recent. This


design maintains compactness with either deep or wide trees<br />

while clearly displaying chronological order (which users have<br />

told us is important).<br />

Hovering over a node in the tree view provides summary<br />

information for that message including the first line of the<br />

message body. Clicking on a node causes that message to be<br />

opened in the preview pane. The ability to view messages by<br />

clicking on the thread map has proven especially useful when a<br />

thread spans several days or weeks and not all of its messages are<br />

visible in the list view.<br />

2.2 Correspondent Map<br />

Our research has shown that the sender of an email message is<br />

one of the most important factors in determining the order in<br />

which people will process their messages. The correspondent<br />

map groups the messages in a folder by sender (Figure 3).<br />

Senders are also grouped by their domain. Within a domain,<br />

people are ordered by the number of messages they have sent,<br />

senders of more messages being shown first. If there are many<br />

senders (and/or the correspondent map view is reduced in size),<br />

the rectangles representing senders are reduced in size and may<br />

only show first names or initials. The color of the rectangles<br />

indicates the age of the most recent message from that sender.<br />

Selecting the “unanswered” check box grays out the rectangles of<br />

those to whom the user has sent mail more recently than they’ve<br />

received it (i.e., those people the user has already answered).<br />

Figure 3. Correspondent map visualization<br />

By displaying people from whom mail has been received more<br />

recently than they’ve been sent mail, the user can determine<br />

which people are owed a message, either as a response to an<br />

important message (a heavy correspondent with a dark blue or<br />

black box) or as a way of keeping in touch with an old friend<br />

(indicated with a light blue box). Conversely, the gray<br />

“unanswered” boxes may indicate that the user is owed a response<br />

and that reminders may need to be sent to those individuals.<br />
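In other words, the graying rule reduces to a per-sender comparison of two timestamps. A minimal sketch of that decision (names and types are ours, not ReMail's):

```python
from datetime import datetime
from typing import Optional

def correspondent_state(last_received: datetime,
                        last_sent: Optional[datetime]) -> str:
    """'answered' senders (the user mailed them after their last
    message) are the ones grayed out by the "unanswered" check box;
    'owed' senders still await a reply from the user."""
    if last_sent is not None and last_sent > last_received:
        return "answered"
    return "owed"
```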

Selecting a sender rectangle pops up a menu with the subjects of<br />

the most recent messages (up to five) from that sender. Selecting<br />

a subject line displays that message to the user. In addition,<br />

dragging a subject line to another sender’s rectangle serves to<br />

forward that message to the indicated individual.<br />

2.3 Message Map<br />

Our most recent work is a message map which provides a visual<br />

representation of the messages in a folder (Figure 4). Messages<br />

are displayed in chronological order. Similar to the correspondent<br />

map, the color of the message rectangles gets lighter as the<br />

messages they represent get older; unread messages are indicated<br />

with a black border. Rectangles that are drawn as light gray<br />

“ghosts” are ones that have been deselected due to a user-specified search (e.g., in Figure 4, a search was issued for the word “urgent”; messages that do not match are drawn light gray).

The selected message (also shown in the client’s preview<br />

window) is drawn with a blue border; other messages in the same<br />

thread are drawn with a light blue border. Messages with the<br />

same sender are filled with orange. Finally, the “dog ear” on<br />

some rectangles indicates a message that the user has authored.<br />

Figure 4. Message map visualization (preliminary version)<br />

This visualization allows the user to quickly see relationships<br />

among messages within a folder: which satisfy a query, are in the<br />

same thread, are by the same author, and were sent by the user.<br />

3 Acknowledgements<br />

Many people in the Collaborative User Experience research group<br />

have contributed to this project. Martin Wattenberg, Bernard<br />

Kerr, and intern Suzanne Minassian in particular worked on the<br />

visualizations which have been described. Others in the group<br />

who have been instrumental in the prototyping effort include<br />

Robert Armes, Kushal Dave, Dan Gruen, Paul Moody, Bob<br />

Stachel, Eric Wilcox, and intern Jennifer Liu.<br />

References<br />

DUCHENEAUT, N. and V. BELLOTTI 2001. “E-mail as Habitat,” Interactions, 8(5), September-October 2001, ACM, pp. 30-38.
JACKSON, T.W., R. DAWSON, and D. WILSON 2003. “Understanding Email Interaction Increases Organizational Productivity,” Communications of the ACM, 46(8), August 2003, pp. 80-84.
KERR, B. 2003. “THREAD ARCS: An Email Thread Visualization,” Proceedings of the IEEE Symposium on Information Visualization, Seattle, WA, October 19-21, 2003.
LEVITT, M. 2000. “Email Usage Forecast and Analysis, 2000-2005,” IDC Report #W23011, September 2000.
ROHALL, S.L. and D. GRUEN 2002. “ReMail: A Reinvented Email Prototype,” demonstration, Conference Supplement for CSCW 2002, New Orleans, LA, November 16-20, 2002, pp. 119-122.
ROHALL, S.L., D. GRUEN, P. MOODY, and S. KELLERMAN 2001. “Email Visualizations to Aid Communications,” Late Breaking, Hot Topic Proceedings of the IEEE Symposium on Information Visualization, San Diego, CA, October 22-23, 2001, pp. 12-15.
WHITTAKER, S. and C. SIDNER 1996. “Email Overload: Exploring Personal Information Management of Email,” Proceedings of CHI’96, Vancouver, B.C., April 13-18, 1996, pp. 276-283.


Interactive Symbolic Visualization of Semi-Automatic Theorem Proving

Chandrajit Bajaj, Shashank Khandelwal, J Moore, Vinay Siddavanahalli

Center for Computational Visualization,<br />

Department of Computer Sciences and Institute for Computational Engineering & Sciences,
University of Texas, Austin Texas 78712
email: {bajaj, shrew, moore, skvinay}@cs.utexas.edu

Abstract

We present an interactive visualization environment for semiautomatic<br />

theorem provers in an attempt to help users better steer<br />

their theorem proving process. The augmented theorem proving<br />

environment provides synchronized multi-resolution textual and<br />

graphical views and direct navigation of large expressions or proof<br />

trees from either of the twin interfaces. We identify three levels<br />

of the proof process at which synchronized multi-resolution textual<br />

and graphical visualizations enhance user understanding.<br />

1 Introduction

User interaction with theorem provers remains mostly text based.<br />

When a proof attempt fails, the user needs to diagnose the problem<br />

and then come up with new theorems, lemmas or hints to continue.<br />

This requires a thorough understanding of each proof attempt. Theorem<br />

provers typically generate large amounts (megabytes) of text<br />

during proof attempts, making intermediate expression navigation

in the proof process a significant challenge. A command line text<br />

interface is used with most theorem provers. Pretty printing and text<br />

based primitives like searching are the main tools available to help<br />

reduce or manage visual complexity. The challenge is to integrate<br />

text-based interfaces with synchronized graphical visualization to<br />

speed comprehension and interaction. We identify three levels at<br />

which the command line interface could be augmented with synchronized<br />

graphical visualization for enhanced user understanding:<br />

1. The overall proof attempt can be visualized by a graph of the<br />

theorems used during a particular proof attempt.<br />

2. The structure of the proof can be visualized, by displaying the<br />

subgoals created at each step and indicating which subgoals<br />

could be proved or not.<br />

3. Examining failed subgoals is critical towards understanding<br />

why a proof attempt failed. Following the progress of similar<br />

subgoals through the proof attempt is useful. Graphical visualization<br />

would help quickly identify similar subgoals and<br />

their locations within the overall proof.<br />

We choose ACL2 [Kaufmann and Moore 1997], an industrial-strength theorem prover, for our case study.

2 Related Work

We provide a couple of relevant references here to previous work<br />

done in the visualization of output from theorem provers. The remaining<br />

are cited in the full version of the paper [Bajaj et al. 2003].<br />

The paper [Thiry et al. 1992] discusses the need and requirements for a user-friendly interface to theorem provers, but does not visualize the

information inherent in the proof process as a way to understand the<br />

proof attempt. In [Goguen 1999], we see an attempt to use visualization<br />

to understand the structure of proofs, and a complete system<br />

for developing a user interface. Their system is designed for readers<br />

of proofs (as opposed to specifiers or provers). Web pages explain<br />

each proof with links to background material and tutorials. Their<br />

system is designed with distributed collaboration in mind. Our system<br />

is designed to be used by theorem provers, working alone.<br />

Figure 1: Proof tree visualization. Three different time steps during<br />

a proof attempt are shown in clockwise order.<br />

3 Visualization at the Three Levels

We provide details and justifications for the visualizations at each<br />

level of the hierarchy.<br />

Level 1: The overall proof attempt. A theorem is proved by using

previously verified theorems and lemmas. These verified theorems<br />

and lemmas are used as a knowledge base for the theorem prover.<br />

By looking at the theorems and lemmas used in the proof of a previously<br />

verified theorem, a user may gain insight on how to steer<br />

a current proof attempt. The theorems and lemmas used during<br />

a proof attempt, when arranged to show inter-dependency, form a<br />

directed acyclic graph. This can be visualized using a simple node-link

diagram.<br />

Level 2: The structure of the proof. Most proof attempts have a tree-structured approach, with the main theorem being proved as the root of

the tree. A theorem prover either proves/disproves the theorem, or


Figure 2: Multi-view text and graphical visualization of a proof<br />

attempt. The text windows on the left contain the contents of some<br />

nodes from the proof tree on the right. The screen shot is from the<br />

proof of the proposition that the reverse of the reverse of a list is the<br />

list itself (given certain conditions and definitions).<br />

divides the theorem into subgoals. Each of these subgoals is then<br />

tackled in an order determined by the particular theorem proving<br />

system. ACL2 tends to use depth-first search. See Figure 1.

We provide a synchronized multi-view representation of the<br />

proof tree to the user. Since proof attempts tend to be large, taking<br />

possibly hours to finish, users prefer being given synchronized<br />

feedback in both textual and graphical views of the current state of<br />

the proof. We use a variant of the cone tree algorithm [Carriere and<br />

Kazman 1995] to render, annotate and provide interactive navigation<br />

for our trees. ACL2 has a model in which the subgoals can be<br />

reduced using generalization, induction, simplification, etc. These<br />

actions are limited and distinct and can be visualized by the current<br />

node’s color, as shown in figure 2.<br />

Level 3: Examining failed subgoals. The main hurdle in finding out why

the theorem prover could not prove a theorem is understanding the<br />

critical node at which the theorem prover failed or deviated from the<br />

expected path. The expression trees of formulas at a subnode can be<br />

visualized as a 2D tree. In theorem proving, larger proof attempts<br />

are cumbersome to follow. From one goal to another, the theorem<br />

prover performs some actions, modifying the expressions at each<br />

stage. In order to follow changes, pattern matching can be applied<br />

to the expressions (after suitably representing them as trees).<br />

The tree matching algorithm we use is similar to the recursive algorithm<br />

presented by [Hoffmann and O’Donnell 1982]. Our heuristics<br />

for matching are domain specific. A Lisp expression E can be<br />

represented as a function symbol F operating on a set of parameters P1, P2, ..., Pn. In a binary tree representation, the left child of the

root node contains the function symbol F. The parameters are then<br />

the left children of all the nodes of the right side path from the root<br />

to a leaf. Two trees which have different function symbols result in<br />

a low match. The match is also proportional to the distance from<br />

the root of the differences between the trees. A permutation in the<br />

parameters of a function symbol results in a high match.<br />
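A minimal sketch of such a similarity heuristic, operating directly on s-expressions written as nested tuples, follows. It is our own formulation for illustration; the actual implementation works on the binary-tree encoding described above.

```python
def match(e1, e2, depth=0, decay=0.5):
    """Heuristic similarity in [0, 1] between two Lisp expressions,
    given as atoms or nested tuples (F, P1, ..., Pn): mismatches near
    the root score low, the penalty decays with depth (so differences
    far from the root still yield a high match), and a greedy best
    pairing of parameters tolerates permuted arguments."""
    if not (isinstance(e1, tuple) and isinstance(e2, tuple)):
        return 1.0 if e1 == e2 else 1.0 - decay ** depth
    if e1[0] != e2[0]:               # different function symbols
        return 1.0 - decay ** depth  # 0.0 when the roots differ
    params1, params2 = list(e1[1:]), list(e2[1:])
    score = 0.0
    for p in params1:
        if not params2:
            break
        best = max(params2, key=lambda q: match(p, q, depth + 1, decay))
        score += match(p, best, depth + 1, decay)
        params2.remove(best)
    n = max(len(e1), len(e2)) - 1
    return score / n if n > 0 else 1.0
```

For example, match(("rev", ("rev", "x")), ("rev", "x")) scores well above a pair whose head symbols already differ, which is what lets similar subgoals be highlighted across proof steps.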

The visualization of the results from the pattern matching has<br />

been implemented in both text and graphics. In figure 3 we see two<br />

sets of texts. A sub expression from column 2 is matched with the<br />

entire expression on the right. The font color indicates how similar<br />

an expression is to the search expression. Unselected text is light<br />

gray, while selected text is black. The results from pattern matching<br />

are shown by varying the font color from bright red (high match)<br />

to dark red (low match). The graphical expression visualization<br />


Figure 3: A synchronized view of text and graphics visualizations<br />

from level 3. Pattern matching of expressions from a proof: A composition<br />

of screen shots from our implementation.<br />

interface also shows the same results (the first column in figure 3).<br />

The unselected sections are gray, while selections are cyan. Again,<br />

bright to dark red is used to show high to low matches between the<br />

patterns. The third tree in the left column is a zoomed-in view of<br />

the outlined box in the second tree.<br />

4 Conclusions

We have presented some details of our interactive visualization environment<br />

for semi-automatic theorem provers. Further details are<br />

available from the full version of the paper [Bajaj et al. 2003], (with<br />

system animations), from our Symbolic Visualization web page<br />

(http://www.ices.utexas.edu/CCV/projects/VisualEyes/SymbVis/).<br />

Acknowledgments

We are grateful to Robert Krug for writing the socket code that<br />

helps us communicate with ACL2. Research supported in part by<br />

grants from NSF CCR-9988357 and ACI 9982297.

References

BAJAJ, C., KHANDELWAL, S., MOORE, J., AND SIDDAVANA-<br />

HALLI, V. 2003. Interactive symbolic visualization of semiautomatic<br />

theorem proving. In CS and ICES Technical Report.<br />

CARRIERE, J., AND KAZMAN, R. 1995. Interacting with huge<br />

hierarchies: Beyond cone trees. In Proceedings of IEEE Information

Visualization, 74–78.<br />

GOGUEN, J. A. 1999. Social and semiotic analyses for theorem<br />

prover user interface design. Formal Aspects of Computing 11,<br />

3, 272–301.<br />

HOFFMANN, C. M., AND O’DONNELL, M. J. 1982. Pattern<br />

matching in trees. Journal of the ACM (JACM) 29, 1, 68–95.<br />

KAUFMANN, M., AND MOORE, J. S. 1997. An industrial strength<br />

theorem prover for a logic based on common Lisp. Transactions<br />

on Software Engineering 23, 4, 203–213.<br />

THIRY, L., BERTOT, Y., AND KAHN, G. 1992. Real theorem<br />

provers deserve real user-interfaces. In proceedings of the fifth<br />

ACM SIGSOFT symposium on Software Development Environments,<br />

ACM Press, 120–129.


FROTH: A Low Complexity Recursive Force-Directed Tree Layout Algorithm Based on the Lennard-Jones Potential

Lisong Sun<br />

Department of Computer Science

The University of New Mexico<br />

Albuquerque, NM 87131<br />


FROTH is the implementation of a low complexity force-directed<br />

tree layout algorithm based on the Lennard-Jones potential. The recursive<br />

method lays out sub-trees as small disks contained in their<br />

parent disk. Inside each disk, children disks are dynamically laid<br />

out using a new force directed simulation. Unlike most other force<br />

directed layout methods which run in quadratic time for each simulation<br />

step, this algorithm runs in O(n^((m+1)/m)) time per step for a tree with n nodes, depth m, and all nodes having a uniform number

of children. The layout uses space efficiently and reflects both<br />

global structure and local detail. The method supports runtime insertion<br />

and deletion. Both operations and the evolving process are<br />

rendered with smooth animation to preserve visual continuity. The<br />

method can be used to monitor in real time, visualize, and analyze a wide variety of data that has a rooted tree structure; e.g., internet hosts can be laid out by domain name (DNS) hierarchies. Figure 1 is an example of how FROTH is used to visualize a DNS tree.

Steve Smith<br />

Los Alamos National Laboratory<br />

Los Alamos, NM 87545<br />


Thomas Preston Caudell<br />

Department of Electrical and Computer Engineering

The University of New Mexico<br />

Albuquerque, NM 87131<br />


Keywords: FROTH, Graph Layout, Tree Layout, Lennard-Jones<br />

Potential, Force-Directed Simulation, Tree Hierarchy<br />

Figure 1. Layout Result of 8,000 Internet Host Names

Partially funded by Los Alamos National Laboratory and the Center for High Performance Computing at the University of New Mexico.

Introduction

Two decades ago, Peter Eades proposed a graph layout heuristic [Eades 1984] which is called the “Spring Embedder” algorithm. As described later in a review [Battista et al. 1994], edges are replaced

by springs and vertexes are replaced by rings that connect<br />

the springs. A layout can be found by simulating the dynamics of<br />

such a physical system. This method and other methods, which involve<br />

similar simulations to compute the layout, are called “Force<br />

Directed” algorithms.<br />

Because of the underlying analogy to a physical system, the force<br />

directed graph layout methods tend to meet various aesthetic standards,<br />

such as efficient space filling, uniform edge length, symmetry<br />

and the capability of rendering the layout process with smooth<br />

animation. Force directed layout methods commonly have computational<br />

scaling problems. When there are more than a few thousand<br />

vertexes in the graph, the running time of the layout computation<br />

can become unacceptable. This is caused by the fact that in each<br />

step of the simulation, the repulsive force between each pair of unconnected<br />

vertexes needs to be computed, costing a running time of O((V² - E)/2). This complexity is hard to escape for general graphs

with no hierarchical structure. In this paper, we focus on a special<br />

but common type of graph, the tree.<br />

There are several conventions for drawing trees[Battista et al.<br />

1994][Eades et al. 1993], for example, the classic planar straightline<br />

drawing convention (Figure 2a), which represents nodes as dots<br />

and edges as straight lines connecting the two nodes, and the containment<br />

convention (Figure 2b), which represents nodes as squares<br />

or disks with children nodes contained inside parent nodes. The first<br />

convention is the most commonly used in tree layout and drawing<br />

algorithms. On the other hand, the containment convention may<br />

be used to increase the efficiency of space filling and reducing the<br />

visual complexity.<br />

Figure 2a. Straight-Line Drawing Convention
Figure 2b. Containment Convention


The algorithm presented here follows the containment convention<br />

and makes use of force-directed simulation. The quadratic running time is avoided by applying a divide-and-conquer approach.

By dividing the tree into sub-trees, only siblings of the same parent<br />

need to interact during the simulation, allowing the decomposition<br />

of the general tree layout problem into nested, independent subproblems.<br />

In the next section, a recursive layout algorithm utilizing<br />

a novel force simulation is introduced which has a much decreased<br />

complexity.<br />

The Method

This method works with data that has a rooted tree structure. Each node of the tree is represented as a disk. Each child node is contained inside its parent node, which is represented as a larger disk. Each node recursively contains all of its proper descendants and all

the nodes are contained inside the disk that represents the root node.<br />

Inside each node, the positions of its children are determined<br />

by simulating a physical system. In the simulation, each node is<br />

treated as a particle. The mass of the particle is proportional to the<br />

size of the sub-tree rooted in the node. Each node has a position<br />

which is relative to the center of its parent disk. The position will<br />

be updated after each step of simulation. Each node has a relativeradius<br />

which equals the square root of the size of the sub-tree rooted<br />

in the node. The relative-radius will be used to plot the current disk<br />

inside its parent and it is updated whenever an insertion or deletion<br />

occurs in the sub-tree. Each node also has a children-radius which<br />

is the minimum radius to fully contain all its children disks. It is<br />

computed as the maximum, over all children, of the distance from the current node's center to a child disk's center plus that child disk's relative-radius. The children-radius will be used to

compute the scale factor when rendering the children of the current<br />

node and is updated after each step of the simulation.<br />

Two types of forces are computed in this physical system. One of<br />

them is the force derived from the Lennard-Jones potential [Silbey and Alberty 2001]. This force exists between each pair of the contained siblings and ensures that they will not overlap upon each other. The normal form of the Lennard-Jones potential is shown below.

φ_LJ(r) = 4ε [ (σ/r)^12 - (σ/r)^6 ]

The second force is a central-tendency radial force which pulls all the children toward the center of the parent disk. A simple illustration of the system is given in Figure 3.

Figure 3. Two Types of Forces in the Simulation (the central-tendency radial force and the Lennard-Jones force)

In each simulation step, all of the forces exerted on each particle<br />

are vectorially summed. The particle will move in the direction of

the total force. The displacement is proportional to the total force<br />

divided by the mass of the particle. A maximum displacement is<br />

used to keep the particles from moving too far in each step. Since<br />

the potential field generated by such a particle system is very complicated,<br />

some particles will start to vibrate at various frequencies.

To reduce the vibrations, a simple filter is applied on the trajectory<br />

of each particle. There is no momentum term in the force equation<br />

which means kinetic energy is totally “damped”.<br />
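A minimal sketch of one such damped step for the children of a single disk, in Python. The code and the parameter values (k_center, max_disp, and scaling σ by the two relative-radii so disks repel on contact) are our assumptions, not the FROTH source:

```python
import numpy as np

def lj_force(delta, sigma, epsilon=1.0):
    """Force derived from phi_LJ(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6):
    strongly repulsive when the disks overlap, mildly attractive at
    larger separations."""
    r = np.linalg.norm(delta)
    mag = 24.0 * epsilon * (2.0 * (sigma / r) ** 12 - (sigma / r) ** 6) / r
    return mag * delta / r

def step(positions, radii, masses, k_center=0.1, max_disp=0.05):
    """One fully damped simulation step for the siblings inside one
    parent disk: central-tendency pull plus pairwise Lennard-Jones
    forces, displacement proportional to force / mass, capped at
    max_disp to keep particles from moving too far."""
    forces = -k_center * positions              # pull toward parent center
    for i in range(len(positions)):
        for j in range(len(positions)):
            if i != j:
                forces[i] += lj_force(positions[i] - positions[j],
                                      sigma=radii[i] + radii[j])
    disp = forces / masses[:, None]             # no momentum: fully damped
    norm = np.linalg.norm(disp, axis=1, keepdims=True)
    disp = disp * np.minimum(1.0, max_disp / np.maximum(norm, 1e-12))
    return positions + disp
```

Because only siblings of the same parent interact, each step costs only the square of the local branching factor per parent rather than the square of the whole node count, which is the source of the reduced complexity quoted in the abstract.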


Once the method for laying out one node’s children is defined,<br />

the whole tree can then be laid out recursively starting with the<br />

root. The rendering was done inside the UNM virtual reality development<br />

environment “Flatland” [Caudell 2003] using OpenGL.

Similar to the simulation process, the rendering is done recursively.<br />

Results

The complexity analysis and the results from a set of running-time tests can be found in our previous publication [Sun et al. 2003]. Since internet domain names have a natural hierarchical tree structure, this data is well suited for the proposed algorithm. Figure 1 shows the layout result of a tree with 8,000 internet host names.

The diameters of the disks reflect the number of descendants they<br />

have which gives the visual information about the structure of the<br />

tree. Detail inside sub-trees can be obtained by zooming into the<br />

figure. This is especially helpful as trees grow large. Figure 4 shows<br />

the detail of a sub-tree located in upper left corner of Figure 1.<br />

Figure 4. Zoomed In Detail of a Sub-Tree In Figure 1.<br />

Conclusions and Future Work

In this paper, a new tree layout method is introduced. The method is fast because of its divide-and-conquer nature. The design is simple because of its recursive layout and rendering mechanism. The visual effect is smooth because of its underlying simulation

process.<br />

More interactive functionalities will be developed in the future.<br />

Mechanisms will be implemented to increase the frame rate.<br />

References

BATTISTA, G. D., EADES, P., TAMASSIA, R., AND TOLLIS, I. G. 1994. Algorithms for drawing graphs: An annotated bibliography. Computational Geometry: Theory and Applications 4, 5, 235–282.

CAUDELL, T. P., 2003. http://www.hpcerc.unm.edu/homunculus.<br />

EADES, P., LIN, T., AND LIN, X. 1993. Two tree drawing conventions.<br />

International Journal of Computational Geometry and<br />

Applications 3, 2, 133–153.<br />

EADES, P. 1984. A heuristic for graph drawing. Congressus Numerantium<br />

42, 149–160.<br />

SILBEY, R. J., AND ALBERTY, R. A. 2001. Physical Chemistry,<br />

3 ed. J. Wiley and Sons, Inc.<br />

SUN, L., SMITH, S., AND CAUDELL, T. P. 2003. A low complexity<br />

recursive force-directed tree layout algorithm based on<br />

the lennard-jones potential. Tech. Rep. EECE-TR-03-001, The<br />

University of New Mexico.


PaintingClass: Interactive Construction, Visualization and<br />

Exploration of Decision Trees<br />

Soon Tee Teoh<br />

Department of Computer Science

University of California, Davis<br />

teoh@cs.ucdavis.edu<br />

1. INTRODUCTION<br />

Classification of multi-dimensional data is one of the major challenges<br />

in data mining. In a classification problem, each object is<br />

defined by its attribute values in multi-dimensional space, and furthermore<br />

each object belongs to one class among a set of classes.<br />

The task is to predict, for each object whose attribute values are<br />

known but whose class is unknown, which class the object belongs<br />

to. Typically, a classification system is first trained with a set of<br />

data whose attribute values and classes are both known. Once the<br />

system has built a model based on the training, it is used to assign<br />

a class to each object.<br />

A decision tree classifier first constructs a decision tree by repeatedly<br />

partitioning the dataset into disjoint subsets. One class<br />

is assigned to each leaf of the decision tree. Most classification<br />

systems, including most decision tree classifiers, are designed for<br />

minimal user intervention. More recently, a few classifiers have<br />

incorporated visualization and user interaction to guide the classification<br />

process. On one hand, visual classification makes use of<br />

human pattern recognition and domain knowledge. On the other<br />

hand, visualization gives the user increased confidence and understanding<br />

of the data [1, 3].<br />

This poster presents PaintingClass to the Information Visualization<br />

community. PaintingClass is a complete user-directed decision

tree construction and exploration system, which we presented in the<br />

Data Mining (DM) and Knowledge Discovery in Databases (KDD)<br />

community first in [6], then in [7].<br />

In [6], we proposed StarClass, a new interactive visual classification<br />

technique. StarClass allows users to visualize multi-dimensional<br />

data by projecting each data object to a point on 2-D display space<br />

using Star Coordinates [5]. When a satisfactory projection has been<br />

found, the user partitions the display into disjoint regions; each region<br />

becomes a node on the decision tree. This process is repeated<br />

for each node in the tree until the user is satisfied with the tree and<br />

wishes to perform no more partitioning.<br />

On the foundations of StarClass, we developed PaintingClass [7].<br />

In PaintingClass, we designed a new decision tree exploration mechanism,<br />

to give users understanding of the decision tree as well as<br />

the underlying multi-dimensional data. This is important to the<br />


Kwan-Liu Ma<br />

Department of Computer Science

University of California, Davis<br />

ma@cs.ucdavis.edu<br />

user-directed decision tree construction process as users need to efficiently<br />

navigate the decision tree to grow the tree. Furthermore,<br />

PaintingClass extends the technique proposed in StarClass so that

datasets with categorical attributes can also be classified. This is<br />

useful because many real-world applications use data containing<br />

both numerical and categorical attributes.<br />

These features make PaintingClass an effective data mining tool. PaintingClass extends the traditional role of decision trees in classification to take on the additional role of identifying patterns, structure and characteristics of the dataset via visualization and exploration. This paradigm is a major contribution of PaintingClass. In the poster, we show some examples of knowledge gained from the visual exploration of decision trees. We also show the effectiveness of PaintingClass in classifying some benchmark datasets by comparing its accuracy with that of other classification methods.

2. DECISION TREE EXPLORATION

As is typical of decision tree methods, PaintingClass starts by accepting a set of training data. The attribute values and class of each object in the training set are known. At the root of the PaintingClass decision tree, every object in the training set is projected and displayed visually. In PaintingClass, each non-terminal node in the decision tree is associated with a projection, which is a definite mapping from multi-dimensional space onto the two-dimensional display. The user creates a projection that best separates the data objects belonging to different classes. The user can create a projection either by editing the axes of a Star Coordinates [5] projection for numerical attributes or by using a Parallel Coordinates [4] projection for categorical attributes.

Each projection is then partitioned by the user into regions by painting. In a Star Coordinates projection, a suitable projection separating objects of different classes is found by moving the axes around until the result is satisfactory. When the user is satisfied with a projection, the user specifies regions by "painting" over the display with the mouse. In a Parallel Coordinates projection, all intervals initially belong to the blue region. The user clicks on an interval to change it to red. An object belongs to the red region if, for every dimension with at least one red interval, the object has an attribute value equal to a red interval; the object belongs to the blue region otherwise.
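The red/blue rule just stated translates directly into a membership test. Below is a minimal sketch (our own illustrative code, not the PaintingClass implementation); red_intervals maps each dimension that has at least one red interval to the set of values painted red on that axis.

    def region_of(obj, red_intervals):
        # An object is red iff, on every dimension that has red intervals,
        # its attribute value falls in one of that dimension's red intervals.
        for dim, red_values in red_intervals.items():
            if obj[dim] not in red_values:
                return "blue"
        return "red"

    obj = {"colour": "green", "shape": "round", "size": "small"}
    red = {"colour": {"green", "red"}, "size": {"small"}}
    print(region_of(obj, red))  # -> "red"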

Next, for each region in the projection, the user can choose to re-project it, forming a new node. In other words, the user creates a projection for this new node in a way that best separates the data objects in the region leading to this node. For every new node formed, the user has the option of partitioning its associated projection into regions. The user recursively creates new projections/nodes until a satisfactory decision tree has been constructed. Each projection thus corresponds to a non-terminal node in the decision tree, and each un-projected region thus corresponds to a terminal node. In this way, for each non-root node, only the objects projecting onto the chain of regions leading to the node are projected and displayed.

Table 1: Accuracy of PaintingClass compared with algorithmic approaches and the visual approach PBC.

              Algorithmic             Visual
            CART   C4    SLIQ    PBC    PaintingClass
Satimage    85.3   85.2  86.3    83.5   85.3
Segment     84.9   95.9  94.6    94.8   95.2
Shuttle     99.9   99.9  99.9    99.9   99.9
Australian  85.3   84.4  84.9    82.7   84.7

Table 2: Accuracy of PaintingClass compared with other classification methods.

            CBA    C4.5  FID    Fuzzy   PaintingClass
Australian  85.0   82.6  58.0   88.9    84.7
adult       84.2   85.4  23.6   85.9    85.1
diabetes    74.4   73.8  62.0   77.6    74.6

In the classification step, each object to be classified is projected starting from the root of the decision tree, following the region-projection edges down to an un-projected region, which is a terminal node (i.e., a leaf) of the decision tree. The class with the most training set objects projecting to this terminal region is predicted for the object.
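The classification step can be summarised in a few lines. The sketch below is a schematic reading of the procedure just described, with hypothetical names: re-projected regions carry a function that routes an object to one of their sub-regions, and un-projected (terminal) regions remember the classes of the training objects that landed in them.

    from collections import Counter

    class Region:
        # A painted region: re-projected regions have `project`, a function
        # mapping an object to one of their sub-regions; terminal regions
        # keep the classes of the training objects that project onto them.
        def __init__(self, project=None, training_classes=()):
            self.project = project
            self.training_classes = list(training_classes)

    def classify(root, obj):
        region = root
        while region.project is not None:   # follow region-projection edges
            region = region.project(obj)
        # Predict the class with the most training objects in this leaf region.
        return Counter(region.training_classes).most_common(1)[0][0]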

Decision tree visualization and exploration is important for two mutually complementary reasons. First, to build a decision tree effectively and efficiently, it is crucial to be able to navigate through the decision tree quickly to find nodes that need to be further partitioned. Second, exploration of the decision tree aids the understanding of the tree and of the data being classified. From the visualization, the user gains helpful knowledge about the particular dataset, and can decide more effectively how to further partition the tree.

The PaintingClass decision tree visualization makes use of the focus+context concept, focusing on one particular projection, called the "current projection", which is given the most screen space and shown in the upper right corner. The rest of the tree is shown as context to give the user a sense of where the current node is within the decision tree. Nodes that are close to the node in focus are allocated more space, because they are more relevant and the user is more likely to be interested in their contents. The ancestors (up to and including the root) of the node in focus are drawn in a line towards the left. The line including the focus node and all its ancestors is called the ancestor line. Except when both parent and child are in the ancestor line, the children of each node are drawn directly below it, in accordance with traditional tree drawing conventions. This layout is simple to understand, intuitive, and immediately portrays the shape and structure of the decision tree being visualized. The lower-left of the screen is not used by the decision tree, and so is utilized to show an auxiliary display (see Figure 1).

For exploration purposes, interactivity is of utmost importance. PaintingClass allows the user to navigate the tree easily by changing the node in focus. This is done either by clicking on the arrow in the upper right corner of a projection to bring it into focus, or by clicking on the arrow in the lower left corner of a projection in the ancestor line to bring it out of focus. In the latter case, the parent of the projection brought out of focus becomes the new focus.

PaintingClass counts the number of objects of each class mapping to each terminal region (i.e., each leaf of the decision tree). The class with the largest number of objects mapping to a terminal region is elected as the region's "expected class". During classification, any object which finally projects to the region is predicted to be of that class. Table 1 compares the accuracy of PaintingClass against the accuracy of popular classifiers on some benchmark datasets.

Figure 1: PaintingClass decision tree visualization. Top-right: decision tree. Bottom-left: auxiliary display using parallel coordinates, an alternate view of the data.

The experimental results show that PaintingClass performs well compared with the other methods. We believe that PaintingClass is an effective classification and decision tree exploration tool. We hope that visualization will find increasing use in data mining.

3. REFERENCES

[1] M. Ankerst, M. Ester, and H.-P. Kriegel. Towards an Effective Cooperation of the User and the Computer for Classification. Proc. 6th Intl. Conf. on Knowledge Discovery and Data Mining (KDD '00), 2000.
[2] A. Buja and Y.-S. Lee. Data Mining Criteria for Tree-Based Regression and Classification. Proc. 7th Intl. Conf. on Knowledge Discovery and Data Mining (KDD '01), 2001.
[3] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth. The KDD Process for Extracting Useful Knowledge from Volumes of Data. Communications of the ACM 39, 11, 1996.
[4] A. Inselberg. The Plane with Parallel Coordinates. The Visual Computer, Special Issue on Computational Geometry, vol. 1, pp. 69-91, 1985.
[5] E. Kandogan. Visualizing Multi-Dimensional Clusters, Trends, and Outliers using Star Coordinates. Proc. ACM SIGKDD '01, pp. 107-116, 2001.
[6] S.T. Teoh and K.-L. Ma. StarClass: Interactive Visual Classification Using Star Coordinates. Proc. 3rd SIAM Intl. Conf. on Data Mining (SDM '03), 2003.
[7] S.T. Teoh and K.-L. Ma. PaintingClass: Interactive Construction, Visualization and Exploration of Decision Trees. Proc. 9th Intl. Conf. on Knowledge Discovery and Data Mining (KDD '03), 2003.


Evaluation of Spike Train Analysis using Visualization

Martin A. Walter* (mwalter@plymouth.ac.uk), Liz J. Stuart* (lstuart@plymouth.ac.uk), Roman Borisyuk*† (rborisyuk@plymouth.ac.uk)

* Centre for Neural and Adaptive Systems, School of Computing, University of Plymouth, Plymouth, Devon, UK
† Institute of Mathematical Problems in Biology, Russian Academy of Sciences, Pushchino, Moscow Region 142 290, Russia

1 Introduction

Many unresolved issues in the field of neuroscience depend upon the comprehension of vast quantities of neural data. Exploration of information processing within the nervous system depends upon the comprehension of this data. This research focuses on simultaneously recorded multi-dimensional spike train data used in the investigation of the principle of synchronisation of neural activity [Borisyuk and Borisyuk 1997]. In order to "mine" this data for inherent information, these data sets require thorough and diverse analysis. Since these data sets are large, conventional means of analysis, such as cross-correlograms, are insufficient on their own. Thus, advanced techniques are being developed that exploit new and traditional analysis methods for larger data sets. This paper presents the initial evaluation of a method for analysing relatively large numbers of spike trains, based upon the cross-correlogram, called the Correlation Grid [Walter et al. 2003].

2 The VISA Tool

An initial prototype of the visualization tool was presented at the Neural Coding Workshop in 2001 [Stuart et al. 2002(a)]. This tool, called VISA (Visualization of Inter-Spike Associations), supports the analysis of multidimensional spike train data. It supported the use of the gravity transformation algorithm [Gerstein and Aertsen 1985] and the display of its output data using parallel coordinates [Inselberg and Dimsdale 1990]. Additionally, these parallel coordinates could be animated over time [Stuart et al. 2002(b)].

Much software development in the area of Information Visualization is now designed upon the much-cited "Information Seeking Mantra" introduced by Shneiderman [1996]. This mantra states the basic requirements of any useful information visualization system as "Overview first, zoom and filter, then details on demand", and it has been adopted as the fundamental premise upon which the VISA tool has been designed. The main aim of this tool is to enable users to view their data at different levels of detail, from abstract representations of the complete data set to specific representations that enable inspection of individual data items. The latest version of VISA includes additional numerical methods and visualization algorithms, including the Correlation Grid [Walter et al. 2003] and cluster analysis.

3 The Cross-Correlogram

A cross-correlogram is used to visually represent the synchrony between the spike trains of two neurons. This representation is plotted as a histogram and represents the spiking activity of one neuron, designated the 'target' neuron, with respect to a second neuron, designated the 'reference' neuron.

The cross-correlogram is analysed for the existence of 'significant' peaks as defined by Brillinger [1979]. The height, the position (with respect to zero) and the number of peaks all help to determine the type and number of connections, if any, between the neurons.

A cross-correlogram with a significant peak after zero indicates that the target neuron has a tendency to spike after the reference neuron. In contrast, a significant peak before zero would indicate that the reference neuron tends to spike after the target neuron. A peak at or near zero indicates that the neurons tend to spike coincidently.
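For readers unfamiliar with the construction, a cross-correlogram is simply a histogram of spike-time differences. The sketch below is our own naive Python illustration, not the VISA code; the Brillinger normalisation and significance test are omitted. It bins the lag between every reference/target spike pair that falls within the analysis window.

    def cross_correlogram(reference, target, bin_size=3.0, window=100):
        # reference, target: spike times in ms.
        # bin_size: bin width in ms; window: number of bins each side of zero.
        # Returns 2*window + 1 counts of lags (target minus reference).
        half = window * bin_size
        counts = [0] * (2 * window + 1)
        for r in reference:              # naive O(|reference| * |target|) loop,
            for t in target:             # adequate for short recordings
                lag = t - r
                if -half <= lag <= half:
                    counts[int((lag + half) // bin_size)] += 1
        return counts

    hist = cross_correlogram([10.0, 55.0, 90.0], [13.0, 58.0, 95.0])
    print(hist[100:103])  # bins at and just after zero lag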

4 The Correlation Grid

The Correlation Grid presents users with an overview of the cross-correlogram results for a number of spike trains. For a given dataset of n spike trains, all unique cross-correlograms are generated for specified correlation parameters of bin and window size. Subsequently, the cross-correlograms are normalised using the Brillinger [1979] method. Finally, the results of these cross-correlograms are displayed as an n-by-n grid of grey-scale cells, representing the individual correlations between all pairs of spike trains. The grid encodes the 'height' of the largest peak in each cross-correlogram. The peaks are coded from white, representing no peak, to black, representing the largest peak in the grid.
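The grey-scale encoding is a simple rescaling of peak heights over the whole grid. A minimal sketch of the construction follows (illustrative code; peak_height stands in for the Brillinger-normalised largest-peak computation, which we do not reproduce here).

    def correlation_grid(trains, peak_height):
        # Build an n-by-n matrix of largest-peak heights, then rescale so that
        # 0.0 renders as white (no peak) and 1.0 as black (largest peak in grid).
        n = len(trains)
        grid = [[peak_height(trains[i], trains[j]) for j in range(n)]
                for i in range(n)]
        top = max(max(row) for row in grid)
        if top == 0.0:
            return grid                   # no peaks anywhere: all white
        return [[h / top for h in row] for row in grid]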

The user can select whether to view 'all peaks' or just significant peaks. Significant peaks are those that lie outside the Brillinger confidence interval specified for the grid. In addition, the individual cross-correlograms can be viewed by simply selecting the corresponding cell in the grid.

The identification of groups, or clusters, of correlations is key to understanding the relationships between the underlying neurons. It is possible to identify these clusters visually; however, this does not scale easily to larger datasets.

To aid the identification of correlation clusters, a statistical cluster analysis method has been implemented. This method uses the height of the most significant peak, if any exists, of each cross-correlogram to build a dendrogram for the Correlation Grid, which in turn is used to generate the initial display order of the spike trains.
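One plausible realisation of this ordering step (our sketch using SciPy; the linkage method is our assumption, as the paper does not state which is used) treats one minus the normalised peak height as a distance and reads the display order off the dendrogram leaves:

    import numpy as np
    from scipy.cluster.hierarchy import leaves_list, linkage
    from scipy.spatial.distance import squareform

    def display_order(grid):
        # grid: symmetric n-by-n matrix of normalised peak heights in [0, 1].
        d = 1.0 - np.asarray(grid, dtype=float)  # strong correlation = small distance
        np.fill_diagonal(d, 0.0)                 # a train is at distance 0 from itself
        z = linkage(squareform(d, checks=False), method="average")
        return leaves_list(z).tolist()           # dendrogram leaf order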

5 Correlation Grid Trial

Note that all spike trains used for experimentation were generated using an enhanced Integrate and Fire generator defined by Borisyuk and Borisyuk [1997]. For this example, a dataset of fifteen spike trains was generated, over 2000 ms, for the assembly of neurons shown in Figure 5-1.

Figure 5-1: The assembly of 15 neurons


A Correlation Grid for this data was generated, with a bin size of 3 ms and a window size of 100 bins, and cluster analysis was applied. The Grid is shown in Figure 5-2.

Figure 5-2: Correlation Grid of the spike train data in Figure 5-1, with bin size 3 ms and window size 100

On initial inspection of the Grid, three groups of spike trains are identifiable: the first containing spike trains 1, 5, 6, 8 and 9; the second containing spike trains 2, 4, 7, 13, 15 and 11; and the third containing spike trains 3, 10, 12 and 14. Based upon experimental experience, it is possible to infer that there are two separate neuronal assemblies, corresponding to the spike trains in the first and second groups, and several separate, unrelated neurons, the third group.

On closer inspection of the first group, it is possible to see that the correlations between spike train 5 and trains 1, 6, 8 and 9 are stronger (darker grey) than those in the rest of this group. Note that auto-correlations, the correlation of a spike train with itself, are shown in the grid and are generally the darkest greys of the whole grid.

From this it is possible to hypothesize that neuron 5 connects to neurons 1, 6, 8 and 9, and that the correlations between spike trains 1, 6, 8 and 9 are due to these connections rather than to separate connections. Moreover, this hypothesis can be confirmed by inspecting the cross-correlograms of these spike trains. Figure 5-3 shows the correlation between spike trains (a) 5 and 6, (b) 1 and 5 and (c) 1 and 6.

Figure 5-3: Cross-correlograms from the grid shown in Figure 5-2 for spike trains (a) 5 and 6, (b) 1 and 5 and (c) 1 and 6

In Figure 5-3 (a), observe that the correlation between spike trains 5 and 6 is high. Additionally, this correlation peak is to the right of zero, indicating a "positive" delay for the correlation. This shows that the connection is from neuron 5 to neuron 6, as neuron 6 has a tendency to spike after neuron 5. Likewise, there is a strong correlation between neurons 1 and 5; see Figure 5-3 (b). However, this correlation has a "negative" delay, indicating a connection from 5 to 1. In contrast to these correlations, the correlation between spike trains 1 and 6, see Figure 5-3 (c), is relatively weak (note the differing scales) and its peak is around zero. This indicates that the two neurons have a tendency to spike at the same time, and thus that they are likely to have the same input. In this scenario, it is highly likely that they both have connections from neuron 5. The correlations for the other neuron pairs in the first group are similar to these examples. Therefore, it is possible to deduce the neuronal configuration of the first group as that shown in Figure 5-1.

A similar, more extended, yet equally successful approach was taken to the analysis of the second group. However, due to space limitations, this analysis is not included here.

From these initial trials, the Grid appears to be useful. Its overview, filtering and sorting functionality, in addition to the details of individual cross-correlations, makes it possible to "recover" the assembly of neurons from the spike train dataset. This is very significant to our users, as many neurophysiologists record "real" data in their laboratories that requires the type of in-depth analysis presented in this paper.

6 Future Work

The work presented in this paper is part of a visualisation project led by Dr. Liz Stuart, called Visualisation of Inter-Spike Associations [VISA]. Specifically, this paper has presented the preliminary evaluation of an information visualisation method developed for the analysis of large neural assemblies using cross-correlation. Significant development of this visualization method, particularly with respect to user interaction, is planned. Further empirical testing is planned and underway.

7 References

BORISYUK, R.M., AND BORISYUK, G.N., 1997. Information coding on the basis of synchronisation of neuronal activity. BioSystems 40, 3-10.
BRILLINGER, D.R., 1979. Confidence intervals for the crosscovariance function. Selecta Statistica Canadiana, V, pages 1-16.
GERSTEIN, G.L., AND AERTSEN, A.M., 1985. Representation of cooperative firing activity among simultaneously recorded neurons. Journal of Neurophysiology 54(6), 1513-1528.
INSELBERG, A., AND DIMSDALE, B., 1990. Parallel coordinates: a tool for visualising multidimensional geometry. In: Proceedings of Visualization '90, pp. 361-378.
SHNEIDERMAN, B., 1996. The eyes have it: a task by data type taxonomy for information visualizations. Proc. IEEE Symposium on Visual Languages, VL, pages 336-343.
STUART, L., WALTER, M., AND BORISYUK, R., 2002 (a). Visualization of synchronous firing in multi-dimensional spike trains. BioSystems 67, 265-279.
STUART, L., WALTER, M., AND BORISYUK, R., 2002 (b). Visualisation of neurophysiological data. Presented at the 8th IEEE International Conference on Information Visualization.
VISA: Visualization of Inter-Spike Association project website, http://www.plymouth.ac.uk/infovis/
WALTER, M., STUART, L., AND BORISYUK, R., 2003. Spike train correlation visualization. In: Proceedings of the 7th International Conference on Information Visualization, pp. 555-560.


Interactive Poster: Tree3D – A System for Temporal and Comparative Analysis of Phylogenetic Trees

Eric A. Wernert, Donald K. Berry, John N. Huffman, Craig A. Stewart
University Information Technology Services, Indiana University, Bloomington, IN
{ewernert, dkberry, jnhuffma, stewart}@indiana.edu

Abstract

This paper describes a novel system for comparative visualization of phylogenetic trees that result from computational analyses of genetic sequence data. The system makes judicious use of 3D graphics to present a spatial arrangement of 2D trees in a 3D environment. It also supports interactive selection, highlighting, and linking of common nodes between trees. This presentation method scales well to multiple trees and facilitates understanding of evolving computations at a high level while also permitting interactive manipulation and analysis at a detailed level.

1. Introduction

A phylogenetic tree is a depiction of a hypothesized course of evolution representing relationships among a set of species. The development of an objective means for inferring phylogenetic trees is an important problem in evolutionary theory and bioinformatics. Advances in the availability of DNA sequence data and in mathematical methods have given rise to a number of computational techniques for analyzing relationships among organisms, genes, and gene products (referred to here as taxa). Creating visual representations of phylogenetic trees is a natural way to understand the relationships among the taxa, and a number of useful tree-drawing programs are readily available. However, even modest increases in the number of organisms and the length of sequences result in a combinatorial explosion in the number of plausible trees and significant growth in the computational requirements of these algorithms. Most existing tree-drawing tools are poorly suited to helping researchers understand the computational nature of their analyses or to aiding them in conducting visual comparisons of their results.

Biologists and information technology professionals at Indiana University have been actively involved in the development of scalable parallel implementations of maximum likelihood methods based on the fastDNAml code [4]. In conjunction with this research, we have developed a visualization system called Tree3D that addresses two distinct needs of our users: (1) the need to visualize the progress of an ongoing phylogenetic computation, and (2) the need to visually compare the final candidate trees resulting from different initial conditions (bootstraps) of the computation.

2. Related Work

Methods and tools for visualizing phylogenetic trees have received a growing amount of attention in the past several years. There have been several notable implementations that address some important aspects of the overall problem. TreeWiz was one of the first systems to provide interactive drawing of very large trees (several tens of thousands of nodes) and facilitates zooming by spawning additional windows [3]. TreeJuxtaposer provides an even greater level of scalability (over 100,000 nodes) and provides focus+context navigation of the entire data set in a single window. It also facilitates the side-by-side comparison of small numbers of trees [2]. A system by Amenta and Klingner provides a novel method for coarse comparison of large numbers of trees by computing a distance metric between trees and transforming each into a point space [1].

In contrast with these systems, the nature of our data and methods necessitated a visualization technique that would allow us to greatly scale the number of trees that could be viewed concurrently. (The number of nodes in any given tree would be relatively small: less than 200.) We also needed to retain a standard rectilinear tree representation that would allow users to examine and interact with individual taxa or subtrees.

3. The Tree3D System

To address the specific requirements of our problem, we designed the Tree3D system to make judicious use of 3D graphics to present multiple 2D planar tree representations in a single, unified view. The system provides functionality analogous to (although not identical to) the established 2D information visualization methods of focus+context and brushing & linking. The following sections describe the key interactive analysis features of the system as well as some important usability considerations.

3.1 Analysis Features

3D Layout. Depending on the number of trees and the nature of the task, users may select between several available layouts, including side-by-side for direct comparison of two trees, a radial arrangement for small numbers of trees (8 or fewer), and a linear arrangement for larger numbers of trees or for temporal sequences of trees representing an evolving computation. We have found the linear layout to be the most general and extensible for our applications.

Node and Subtree Tracing. The most important task of the analysis process is the identification of common nodes between the different trees. Users can interactively select individual nodes or subtrees in any of the trees, and the system will locate the corresponding nodes in the other trees and will highlight and graphically link them in a selected color.

Branch Swapping. This feature allows the user to interactively pivot a subtree in order to distinguish trees that are topologically different from those that only appear different because of reversed branch orderings.

Length Measurements. Users have the option of encoding branch lengths (evolutionary distance) in the trees. If encoded, users can interactively measure the distance between any two nodes in a tree.
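Measuring the distance between two nodes reduces to summing branch lengths along the path through their lowest common ancestor. A minimal sketch (our own illustrative code, not the Tree3D implementation), with the tree stored as parent pointers and per-branch lengths:

    def distance(parent, length, a, b):
        # parent[n]: n's parent (None at the root); length[n]: length of the
        # branch from parent[n] down to n.
        to_root = {}
        d, n = 0.0, a
        while n is not None:        # cumulative distance from a to each ancestor
            to_root[n] = d
            d += length.get(n, 0.0)
            n = parent[n]
        d, n = 0.0, b
        while n not in to_root:     # climb from b until an ancestor of a is hit
            d += length.get(n, 0.0)
            n = parent[n]
        return d + to_root[n]       # meet at the lowest common ancestor

    parent = {"A": "r", "B": "x", "x": "r", "r": None}
    length = {"A": 2.0, "B": 1.0, "x": 0.5}
    print(distance(parent, length, "A", "B"))  # 2.0 + 1.0 + 0.5 = 3.5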

3.2 Usability Features

In addition to the analysis features, special attention was given to critical aspects of the user interface to make interaction with the 3D representation easier for novice users.


3D Navigation. By default, users are presented with an orthographic view with simple widgets for panning, zooming, and rotation. Users may also select from several canonical camera positions to view the entire data set, or may choose a focused view of any specific tree. We have found that orthographic top and side views are reminiscent of parallel coordinate plots and are highly effective and intuitive overview presentations for users. Experienced users can opt for full 3D camera interaction, perspective views, and stereographic display.

Viewing Parameters. Users can control the location of clipping planes to restrict the view to a subset of the trees. Likewise, the transparency of the tree planes can be adjusted to control how much of the other trees is visible through frontal views.

Temporal Decay. Temporal displays of running computations may generate thousands of trees, of which only the most recent several dozen are of particular interest. The system supports an option for automatic culling of older trees along with automatic viewpoint updating.

Level of Detail and Text Labels. In order to maintain interactive frame rates, the system utilizes traditional 3D level-of-detail methods based on camera distance metrics. In addition, we carefully employ 2D (screen space) text to guarantee visibility of globally important labels while modeling less essential labels as 3D (distance scaled) text.

4. Conclusions and Ongoing Work

The Tree3D system makes careful use of 3D graphics to permit interactive viewing, tracing, and manipulation of multiple 2D tree structures. It has proven to be an effective system for real-time visualization of running phylogenetic computations as well as for detailed, off-line comparison of candidate trees. The original system was implemented in 1998 using Open Inventor and accepted data for binary trees in the Newick format. We are currently revising the system to work on multiple platforms and to support more general tree topologies using XML encoding. In addition, we are enhancing the scalability and usability of the system by leveraging texture-based rendering and by incorporating multiple coordinated viewports. These enhancements will enable us to conduct more rigorous explorations into the performance and effectiveness of the system.

Acknowledgements

Portions of this work were made possible through the Indiana Genomics Initiative (INGEN) of Indiana University. INGEN is supported in part by the Lilly Endowment, Inc.

References

[1] Amenta, N., and Klingner, J. (2002) Case study: visualizing sets of evolutionary trees. In Proceedings of InfoVis 2002.
[2] Munzner, T., Guimbretière, F., Tasiran, S., Zhang, L., and Zhou, Y. (2003) TreeJuxtaposer: scalable tree comparison using focus+context with guaranteed visibility. In Proceedings of ACM SIGGRAPH 2003, 453-462.
[3] Rost, U., and Bornberg-Bauer, E. (2002) TreeWiz: interactive exploration of huge trees. Bioinformatics 18, 1, 109-114.
[4] Stewart, C. A., Hart, D., Berry, D. K., Olsen, G. J., Wernert, E., and Fischer, W. (2001) Parallel implementation and performance of fastDNAml – a program for maximum likelihood phylogenetic inference. In Proceedings of SuperComputing 2001.

Figure 1. Left: 2D view of a phylogenetic tree with branch length encoding and depth-based coloring. Right: the tree embedded in a 3D perspective view as part of an evolving computation. Note the proportional temporal encoding and timing labels in the depth dimension.

Figure 2. An evolving computation (without temporal or branch-length encoding) where tree size increases as taxa are added. Left: an oblique view with taxa of interest selected and traced. Right: a top-down view showing how selected taxa maintain or change depth over time.

Figure 3. An orthogonal side view of a comparison of ten bootstrap candidate trees. Note how the selected subgroups of taxa are maintained between trees, although with different orderings.

Figure 4. Alternate layout schemes. Left: a side-by-side comparison of two trees. Right: top-down view of a cylindrical arrangement of 7 trees.


INFOVIS 2003 Contest

The InfoVis 2003 Contest is a new submission category. Our goal in organizing the contest was to initiate the development of benchmarks for information visualization, establish a forum to promote evaluation methods and focus the evaluation community, and create an interesting new event at the conference.

The topic of the 2003 contest was the Pair Wise Comparison of Trees. We provided three sets of trees: phylogenies (about 60 nodes), biological classifications (200,000 nodes) and file systems (70,000 nodes with many attributes). See www.cs.umd.edu/hcil/IV03contest for more details. Contest participants had to show how their tools could help analyze these datasets. They had to demonstrate how users would accomplish the specified tasks. Some tasks were generic tree analysis tasks, others were application specific. Many tasks were high-level and open-ended, others low-level. Although we aimed to be fairly exhaustive in our list of tasks and dataset types, we also encouraged partial answers, because we were interested in seeing both versatile tools and narrowly focused ones illustrating interesting techniques. Authors were asked to provide a two-page summary, a video illustrating the interactive technique used, and a webpage providing details about how the tasks could be accomplished.

We received eight submissions originating from six countries. We selected three entries ranked "First Place" and three ranked "Second Place". Reviewing the submissions made it clear that the analysis of the datasets was challenging and that participants had worked hard to prepare their submissions. Some submissions originated from communities that might not ordinarily have participated in the conference, but had tools that addressed the particular tasks we had chosen. Those authors will bring different perspectives to the conference, and we expect that the intellectual exchange that takes place during the conference will increase the quality of future tools.

We want to thank Cynthia Parr, a biologist at the University of Maryland, and Elie Dassa from the Institut Pasteur in Paris for helping us create the datasets. We also thank Anita Komlodi from UMBC and Cyndy Parr for participating in the review process.

The contest is only a first step. The revised materials of the contestants are now available in the InfoVis repository hosted at the University of Maryland (www.cs.umd.edu/hcil/InfovisRepository). We encourage information visualization practitioners to continue exploring the datasets and tasks and to submit their analyses and results to the repository as well, allowing a more comprehensive review of the techniques available.

This first year has been encouraging. We hope that the Information Visualization community will continue to develop benchmark datasets and tasks, and grow the repository of techniques and results. We look forward to this new event at the conference and to hearing your feedback and suggestions.

InfoVis 2003 Contest Co-Chairs:
Catherine Plaisant, HCIL, University of Maryland, USA
Jean-Daniel Fekete, INRIA, France


TreeJuxtaposer InfoVis Contest Entry

James Slack (jslack@cs.ubc.ca) and Tamara Munzner (tmm@cs.ubc.ca), University of British Columbia
François Guimbretière (francois@cs.umd.edu), University of Maryland

1 Abstract

TreeJuxtaposer is a tool for interactive side-by-side tree comparison. Its two key innovations are the automatic marking of topological structural differences and the guaranteed visibility of marked items. It uses the AccordionDrawer approach for layout and navigation, a multifocus global Focus+Context approach where stretching one part of the tree or screen causes the rest to shrink, and vice versa. Progressive rendering guarantees immediate interactive response even for large trees.

2 Introduction

We showcase TreeJuxtaposer in our contest entry, a system recently created by one of the authors for the visual comparison of large trees [Munzner et al. 2003]. Our target audience was biologists comparing phylogenetic trees, so we were delighted by the topic choice of this year's inaugural InfoVis contest. Although our tool is specifically tuned for the needs of evolutionary biologists, we have asserted that it is applicable in a wide variety of domains. We were pleased to back up this assertion with strong results for many of the file system log data questions.

The TreeJuxtaposer [Munzner et al. 2003] interface is built around navigation by growing and shrinking areas. It also supports very fast querying by mousing or keyboard across a dense visual representation of the tree. We compute the "best corresponding node" from one tree to another, and use this information both for linked highlighting and for determining exact areas of structural difference to be marked.
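One way to picture the "best corresponding node" computation is to score candidate nodes by the overlap of their leaf sets and take the best match. The sketch below is our own schematic reading under that assumption; see [Munzner et al. 2003] for the actual algorithm and its efficient approximation.

    def leaf_sets(tree, node, acc):
        # tree: node -> list of children; a node absent from tree is a leaf.
        kids = tree.get(node, [])
        acc[node] = ({node} if not kids
                     else set().union(*(leaf_sets(tree, k, acc) for k in kids)))
        return acc[node]

    def best_corresponding(tree_a, root_a, tree_b, root_b, node):
        # Match `node` (from tree A) to the node of tree B whose leaf set has
        # the highest Jaccard similarity with node's leaf set.
        la, lb = {}, {}
        leaf_sets(tree_a, root_a, la)
        leaf_sets(tree_b, root_b, lb)
        target = la[node]
        return max(lb, key=lambda n: len(target & lb[n]) / len(target | lb[n]))

    tree_a = {"r": ["x", "c"], "x": ["a", "b"]}
    tree_b = {"s": ["y", "d"], "y": ["a", "b"]}
    print(best_corresponding(tree_a, "r", tree_b, "s", "x"))  # -> "y"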

3 Strengths

Our system has many strengths. From the first startup image alone, we can immediately answer many questions because we explicitly mark the exact places where structural differences occur. We can instantly characterize whether changes include additions or deletions to the leaves, based on whether the red difference marks occur in the leaves (additions/deletions) or solely in the interior (existing nodes moved around rather than added or deleted). For example, we immediately saw that the classification dataset changes are almost all additions and deletions, and that the phylogenetic dataset changes are all the result of moves, with no adds/deletes.

Linked highlighting is also a powerful feature when interacting with the trees, especially in conjunction with our design decision to use a very dense visual representation of the tree and to support extremely fast mouse-over pop-up highlighting (the latter using front-buffer drawing tricks to avoid requiring a full redraw). The video shows how simply moving the mouse around the screen for a few seconds imparts a great deal of information about the structure at both high and low ranks. When the trees are quite different, the pop-up highlight on the other side skitters about a great deal. For similar trees, the linked highlight is more sedentary.


The core navigation paradigm is growing and shrinking areas, allowing multiple focal areas to support inspection of multiple spots within a single tree. A particularly powerful feature is the ability to simultaneously grow or shrink every item in a marked group. Linked navigation is heavily used, with users usually watching the corresponding areas grow and shrink in "slave" mode while interacting with a "master" tree. Although we do support unlinked navigation, we note that it is used only rarely.

Guaranteed visibility of marked areas is one of the major reasons for our success at the contest tasks. For instance, the incremental search is useful even for the full classification dataset of 200K nodes, because a marked leaf is visible even from the global overview level. Guaranteed visibility is extremely helpful for comparison tasks, because the alternative is exhaustive search. Without guaranteed visibility, it is hard to know when to stop hunting for marks; marks could lie outside the viewport, be occluded by other objects, or, even if these two constraints are met, be invisible simply because they are culled when they project to less than one pixel of screen area. We conjecture that guaranteed visibility dramatically shortens the time required for exploration and analysis. However, we have no empirical proof, because we did not test a second person on the tasks using a version of TreeJuxtaposer with guaranteed visibility disabled.

Marking is a very quick way to understand structural differences, and we use it extensively to answer the contest questions. The incremental search function is a marking approach heavily used in our answers, because it shows the results situated in their usual context rather than out of context. The incremental search extension provides fine control of TreeJuxtaposer that did not exist in the previous work [Munzner et al. 2003], since it allows users to search for nodes by name. See, for example, how Figure 1 shows all nodes found with "dolphin" in their common name. The partial matching provides the power to seek any node by substring matching and visually represents the found nodes with guaranteed visibility and negligible run-time or start-up overhead. In addition to marking nodes known by name, the searching interface allows users to browse through the search results and selectively mark nodes from the search, in case any undesirable search result occurs. Another improvement over the previous paper is a change to the progressive rendering algorithm to use multiple seeds for the rendering queue rather than starting only with the focus cell of the last user interaction. We now add the first few items from each of the marked groups to the queue when starting a frame, so it is easier to maintain context when interacting with large datasets.

4 Weaknesses

One major weakness is that we make no attempt to handle attributes, so we leave several questions unanswered. If we had had the time to spare, we could have implemented an interface where various attributes could be manipulated: marked with colors, and grown or shrunk. The internal infrastructure of TreeJuxtaposer would easily support this functionality, since it would use the same underlying mechanism as our current interface that allows interactive manipulation of groups. Although we already have the infrastructure and the required parser would be straightforward, it would not be trivial to create a usable user interface for this sort of exploration.

Figure 1: Result of an incremental search query on "dolphin" in classif A, common names, with all results grown

Although TreeJuxtaposer is very powerful for a fairly large set of tasks, it is not a flexible or general-purpose system. For instance, we do not support editing at all. Another weakness is the current lack of undo support or history tracking.

We were able to load and interact in real time with a single large classification tree of two hundred thousand nodes. However, we were not quite able to load both huge trees at once for side-by-side comparison. (We ran out of memory: an unfortunate Java limitation is that on 32-bit machines the heap size cannot grow past 1.8 GB.) We thus answer all of the classification comparison questions for the Mammalia subtree only.

5 Contest Results and Discussion

5.1 Phylogenies

The trees in the phylogeny tasks were handled easily by TreeJuxtaposer. The structural differences were no problem for the difference computations, and TreeJuxtaposer noted that no leaf nodes were added or deleted between the sample trees provided. The input order of the nodes did not affect the final matching of TreeJuxtaposer, only the top-to-bottom drawing order.
only the top-to-bottom drawing order.<br />

We found that the structural differences in the internal nodes are<br />

varied; some subtrees we chose to mark in phylo A ABC were very<br />

spread out as forests in phylo B IM while other subtrees only differed<br />

by a slight perturbation in structure. The largest subtree we<br />

were able to find in the unmodified trees (we did not use the property<br />

of these trees being unrooted) was five levels deep.<br />

5.2 Classification Trees

Unlike the trees in the phylogeny tasks, the classification trees had more lower-level differences in structure, while larger subtrees (such as Rodentia) were not classified differently between the trees. The classification differences were mostly additions and deletions (classif B had 7717 more leaf nodes than classif A; each tree has over two hundred thousand nodes), but some other types of structure change, such as the one in Figure 2, were also noticed.

The differences when comparing the Latin versus the common naming conventions in the classification trees were also quite interesting. The common names were not consistent and produced many differences (most of both trees were marked different, which did not provide useful information), while the Latin names provided better insight into the subtree changes in the overall tree structure. The interactive mouse navigation and browsing features of TreeJuxtaposer are easy enough to use to find any animal, given knowledge of basic animal physiology.

Figure 2: Structure movement shown by marked subtrees: classif A on the right and classif B on the left

5.3 File System Logs

We were able to concurrently manipulate all four logs after reducing each log to the /projects/hcil directory, as well as to do pair-wise comparisons on each of the full trees. There were fewest differences between logs A and logs B, so they were the most interesting to compare in a pair-wise manner. Each of the differences noticed in the four-way comparison was examined in detail.

Determining which directory contained the largest number of files (leaf nodes) was easy with these data files, since a few main directories in the file structure dominated the leaf counts. Immediately after loading a log file, the biggest directories (users, class) pop out with their leaf nodes fanning out on the right side of the tree; this puts visually attractive large gaps between the big directories and their smaller neighbors.

6 Conclusions

TreeJuxtaposer is useful for automated and interactive tree comparison. The simplicity of the interface and the fluidity of the interaction allow users to concentrate on the more interesting tasks, such as the ones provided by this contest. TreeJuxtaposer is flexible enough to handle many different types of tree structures as well as to compare several trees side by side. Although the current toolset for TreeJuxtaposer lacks utilities for full attribute analysis, it is clear that interface modifications could provide an attribute-capable comparison tool on the infrastructure that we have provided.

References

MUNZNER, T., GUIMBRETIÈRE, F., TASIRAN, S., ZHANG, L., AND ZHOU, Y. 2003. TreeJuxtaposer: Scalable tree comparison using Focus+Context with guaranteed visibility. In Proc. SIGGRAPH 2003.


Zoomology: Comparing Two Large Hierarchical Trees

Jin Young Hong, Jonathan D'Andries, Mark Richman, Maryann Westfall
College of Computing, Georgia Institute of Technology

ABSTRACT

Zoomology compares two classification datasets. In our solution the two trees are merged into a single overview, which unfolds top to bottom and left to right. Color represents rank, and the width of a classification node corresponds to the number of its descendants. Matched twin detail windows allow similarities and differences to be compared as the user navigates the hierarchies via a zoomable interface.

THE ZOOMOLOGY APPROACH

Our approach is a hybrid of several known techniques built within an overview and detail framework. The overview provides the "big picture" while the detail view explores substructures and nodes within the hierarchies.

Overview

Zoomology's overview is intended to show structure, show navigation in context, and indicate regions of difference. This is accomplished in a single view by constructing the union of the two trees. Both trees are mapped to a space-filling, multitree representation of the structure, as proposed by Furnas and Zacks [2]. Because the data sets are so similar (we found them nearly 90% the same), drawing both hierarchies side by side unnecessarily repeats most of the structure. Instead, we draw the union of both trees with areas of change marked in white. As shown in Figure 1, this makes changed regions immediately identifiable.

Figure 1. Zoomology's space-filling overview of the multitree. White areas show difference between the trees. The path of dots shows navigation in the detail view.

In the overview, a given node is allocated a percentage of available space based on the number of nodes beneath it in the hierarchy. This guarantees that those portions of the tree that contain the most nodes are allocated the most space and that the smallest portions will still be drawn in at least one pixel. Dots drawn on top of the overview show the path of all nodes traversed to the detail views.
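This allocation rule is easy to state precisely. The sketch below uses our own hypothetical names, not Zoomology's source: it splits a strip of pixels among a node's children in proportion to subtree size, with a one-pixel floor. A full layout would recurse into each child's strip.

    def subtree_size(tree, node):
        # Number of nodes beneath `node`, including itself.
        return 1 + sum(subtree_size(tree, k) for k in tree.get(node, []))

    def allocate(tree, node, width, min_px=1):
        # Share `width` pixels among the children of `node` in proportion to
        # their subtree sizes, never dropping below min_px per child.
        kids = tree.get(node, [])
        if not kids:
            return {}
        sizes = {k: subtree_size(tree, k) for k in kids}
        total = sum(sizes.values())
        alloc = {k: max(min_px, int(width * s / total)) for k, s in sizes.items()}
        # Hand any rounding remainder to the largest child (a sketch-level fix;
        # it assumes width is comfortably larger than the number of children).
        alloc[max(kids, key=sizes.get)] += width - sum(alloc.values())
        return alloc

    tree = {"root": ["kingdomA", "kingdomB"], "kingdomA": ["p1", "p2", "p3"]}
    print(allocate(tree, "root", 800))  # kingdomA gets roughly 4/5 of the strip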

Detail

Zoomology's detail view simulates a top-down navigation of the trees. Figure 2 depicts an immediate comparison of the two data sets as seen from the root node. The yellow circle represents kingdom Animalia, each deep red circle within represents a phylum, and the circles inside each phylum represent its children, color-coded by rank.

Figure 2. Zoomology's detail view comparison (data set A on the left and data set B on the right)

In the detail view, spatial encoding distinguishes levels. Root nodes enclose child nodes, which then enclose their own children. As the user zooms into the next level, the current level outgrows the screen, revealing the next. Zooming out shrinks the current level and shows the prior one.

To represent twenty different ranks with easily distinguishable colors, we assigned the same base color to all levels within a major rank and varied subrankings by differing their tints. As children rarely reside more than two ranks from their parent, few colors are needed in the detail view, avoiding visual clutter.

Navigation in Zoomology is like flying through a tunnel. The user picks a direction and zooms towards more detail in the selected region. By default, zooming in one tree zooms the other to the corresponding location. However, the windows can be unlinked to explore a single dataset.

Figure 3: Zooming into the details.


Differences between the hierarchies are highlighted by position and border color. To facilitate comparison, enough space is allocated in each detail window to render the union of all nodes visible in both trees. Nodes existing in one tree but absent in the other are drawn with a white border. An empty space in a particular position indicates a node that is absent in that tree but present in the other. In this way, a unique node is highlighted in the correct tree and conspicuously absent in the other.

The Smart Legend

Centered between the detail windows is the "smart" legend, which maps all ranks to their associated colors and also records path data. The path marked in white to the left of the legend bar shows the nodes traversed en route to the view in the left detail window (data set A), and the right path shows the route to the right window (data set B). Path information helps identify the current level in the tree, the nodes traversed, and the difference between paths in the datasets. In the overview and legend, different shapes are used to mark the navigation paths of the different databases.

Interaction of Overview and Detail

Interaction between the overview and detail views enhances Zoomology's usefulness. As users click on regions of interest in the overview, the detail view smoothly pans and zooms to that area. Context and navigation history, missing from the detail views, are provided both in the legend and in the overview. The legend shows the types of ranks traversed, and the overview shows the specific location within the hierarchy. This combination of overview, detail, and legend helps overcome the limitations of each view by itself.

Sample Tasks

• Explore differences between the hierarchies: Zoomology promotes ad-hoc exploration of differences between datasets. A white area represents change in the overview. Clicking in the area zooms the detail windows to that location in each tree. White also represents change in the detail windows. A white circle around any node indicates that it either does not exist in the other dataset, or that it exists in another location. Each node encloses its own children, and a white border around any of these "grandchild" nodes indicates difference in the same manner. Clicking on the name of any node will locate it in the other tree. The path to each is shown in the overview, and the levels of all ancestor nodes in each dataset are marked in the smart legend.

• Find Spirulida in both trees and show its genealogy: The user selects "Latin Name" and enters "Spirulida" or selects it with the alphaslider. The active detail window zooms to the named node, its path is shown in the overview, and the levels of its ancestor nodes are marked in the smart legend. If a white circle surrounds the node, clicking its name will locate it and mark its path in the other dataset.

• Locate all turtles: The user selects "common name" and enters "turtle." A window appears showing all nodes with turtle as part of their common name, and the location of each is marked in black on the overview.

RELATED WORK

Our framework is similar to Pad++ [1]. Zoomology exploits zooming techniques employed in GVis, a tool for visualizing genome data [3]. The commercial product Grokker [4] uses a similar circular-container zooming methodology, but it is targeted at more general data sets.

Limitations

Future work on Zoomology could address some of its current limitations:

• Intermediate Overview: Allowing the user to enlarge parts of the tree structure would ease some of the problems we have seen at the overview's lower levels. It would allow accurate mapping of size and color for lower-level nodes, and allow the user to select one of these for the detail view. Areas of difference between trees could be colored to indicate rank and bordered to represent the tree of origin. An intermediate overview would allow the user to identify all nodes that share a common name and could aid comparison of subtrees.

• Qualitative Differences: There is no way to discern between minor changes, such as the insertion of a single node, and major revisions, such as changes throughout an entire branch. There is no indication of qualitative vs. quantitative change.

• Multilevel Details: The detail view cannot compare substructures spanning multiple levels.

• Other Contest Tasks: Our solution would not apply to a dataset where change is marked by link length. However, it could easily serve as the basis of a system to examine file system differences.

REFERENCES

1. Bederson, B., Hollan, J., Perlin, K., Meyer, J., Bacon, D., and Furnas, G. (1996). Pad++: a zoomable graphical sketchpad for exploring alternate interface physics. Journal of Visual Languages and Computing, 7, 3-31.
2. Furnas, G. and Zacks, J. (1994). Multitrees: enriching and reusing hierarchical structure. Proceedings of the CHI 1994 Conference on Human Factors in Computing Systems, 330-336.
3. Hong, J., Shaw, C., and Ribarsky, W. (2003). GVis: a scalable visualization framework for genomic data. Georgia Institute of Technology, unpublished.
4. Groxis Inc. (2003). Grokker 1.0. http://www.groxis.com/cgi-bin/grok/


Visualization of Trees as Highly Compressed Tables with InfoZoom

Michael Spenke* and Christian Beilken†
FIT – Fraunhofer Institute for Applied Information Technology
* e-mail: michael.spenke@fit.fraunhofer.de
† e-mail: christian.beilken@fit.fraunhofer.de

Abstract

This paper describes the application of our data analysis tool InfoZoom to the tree-structured data supplied for the InfoVis 2003 Contest. InfoZoom was not especially designed for the analysis of trees, but is a general tool for the visualization and exploration of tabular databases. Nevertheless, it is well suited for the analysis and pairwise comparison of trees.

CR Categories: H.5.2 [Information Interfaces and Presentation]: User Interfaces – Graphical User Interfaces (GUI).

Keywords: Information visualization, interactive data exploration, user interfaces for databases.

1 InfoZoom as a Tree Browser

InfoZoom displays database relations in tables with attributes as rows and objects as columns. Therefore, we had to transform the XML trees to a tabular representation. Each leaf of the tree, i.e. each animal, constitutes a column of the table. The path from a leaf to the root is stored in the attributes (rows) of the table. Since we display both of the trees A and B side by side, our table contains more than 300,000 columns.

The basic concept of InfoZoom is to compress even such large tables by reducing the column width until all columns fit on the screen (Figure 1). The column width here is about 0.002 pixels. Special techniques are used to make such highly compressed tables readable. The most important is that neighboring cells with identical values are combined into one larger cell. Because there are 150,000 adjacent columns with the value A for the attribute Tree, A is displayed only once in a large cell. The width of a cell indicates the number of consecutive objects with this value. Therefore, we can conclude from Figure 1 that the trees A and B have roughly the same number of leaves.
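The cell-merging step is essentially a run-length encoding of each attribute row. A minimal Python sketch, assuming a row is just a sequence of values in column order (our own illustration, not InfoZoom’s internals):

    from itertools import groupby

    def merge_cells(row_values):
        # Adjacent identical values collapse into one (value, count) cell;
        # the count determines the rendered width of the merged cell.
        return [(value, sum(1 for _ in run)) for value, run in groupby(row_values)]

    # 150,000 columns of tree A followed by 150,000 of tree B collapse
    # into just two readable cells for the attribute Tree.
    row = ['A'] * 150_000 + ['B'] * 150_000
    print(merge_cells(row))  # [('A', 150000), ('B', 150000)]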

Figure 1. The two animal trees as a table

We can also observe that the Arthropoda mainly consist of Crustacea and Hexapoda, and that nearly all Insecta are Pterygota. The two trees look quite similar at this level of detail. However, the two cells for Chordata have different sizes: obviously, there are more Chordates in tree B than in A. So it is a good idea to take a closer look at the Chordates. Pressing the zoom-in button or double-clicking one of the marked cells leads to an animated zoom on the Chordates: the black cells grow while the other cells in this line slowly disappear. After another zoom on Mammalia and Reptilia, the result shown in Figure 2 is reached.

Figure 2. The two trees contain different mammals and reptiles

We can see that there are more mammals and reptiles in tree B. The group of 4,582 Sauria is completely missing in A. On the other hand, the subclass Theria and the infraclass Eutheria are missing in B. By further zooms, e.g. into the Mammals, we can now analyze the differences in more detail.


2 Systematic Analysis of the Differences

Like the formula cells in a spreadsheet program, derived summary attributes (such as sum, count, list, average, maximum, etc.) can be defined, which are automatically updated by InfoZoom whenever necessary.

Figure 3. Which animals can be found in both trees

In Figure 3 we have defined a derived attribute Count(Tree) per Latin Name. It shows that most animals can be found in both trees. However, 12,789 animals appear in only one of the trees. We zoom in on these by double-clicking the marked cell and get a result similar to that in Figure 2.
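Such a derived attribute amounts to a group-by count. A minimal sketch, assuming the table has been reduced to (Latin Name, Tree) pairs (a hypothetical layout for illustration, not InfoZoom’s API):

    from collections import Counter

    rows = [('Flota flabelligera', 'A'), ('Flota flabelligera', 'B'),
            ('Apus apus', 'A')]

    # Count(Tree) per Latin Name: in how many trees does each animal occur?
    count_tree = Counter(name for name, _tree in set(rows))

    only_one_tree = [name for name, c in count_tree.items() if c == 1]
    print(only_one_tree)  # ['Apus apus']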

Next we want to find all animals which exist in both trees, but with a different classification. Therefore, we define a new derived attribute Latin Path as the concatenation of all levels of the classification and the Latin Name. We get pathnames like

Annelida/Polychaeta/Palpata/Fauveliopsidae/Flota/Flota flabelligera

Now we can determine where there are two different pathnames for one Latin Name (Figure 4).
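Finding the differently classified animals then reduces to counting distinct paths per name. A minimal sketch under the same hypothetical table layout as above (the Apus paths are illustrative; the underlying two-phyla case is reported below):

    from collections import defaultdict

    rows = [  # (Latin Name, Latin Path) pairs drawn from both trees
        ('Flota flabelligera',
         'Annelida/Polychaeta/Palpata/Fauveliopsidae/Flota/Flota flabelligera'),
        ('Flota flabelligera',
         'Annelida/Polychaeta/Palpata/Fauveliopsidae/Flota/Flota flabelligera'),
        ('Apus apus', 'Chordata/Aves/Apodiformes/Apodidae/Apus/Apus apus'),
        ('Apus apus', 'Arthropoda/Branchiopoda/Notostraca/Triopsidae/Apus/Apus apus'),
    ]

    paths_per_name = defaultdict(set)
    for name, path in rows:
        paths_per_name[name].add(path)

    # Animals that occur with two different classifications:
    reclassified = {n for n, paths in paths_per_name.items() if len(paths) > 1}
    print(reclassified)  # {'Apus apus'}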

Figure 4. Which animals are differently classified in each tree

After a zoom on the marked cell, only the 7,488 animals with two different paths remain visible. The result is shown in Figure 5.

Figure 5. Some sub- and infraclasses are used only in tree A

As we can see, the main reason for different paths is that some subclasses and infraclasses are not used in tree B at all.

Using the derived attribute Count(Phylum) per Latin Name, we detected that the 17 animals of the genus Apus even belong to two different phyla, namely Chordata in A but Arthropoda in B! Also, 3,429 birds are classified in different families.

A full-text search in some or all attributes is also possible. In Figure 6 the result of the search for “horse” in tree A is displayed. Matching values are highlighted at many different levels of the tree. An automatic zoom operation has already been performed on all animals where at least one cell is marked. This corresponds to a disjunction like

Common Name in {American horsemussel, ..., velvet horse crab}
or Phylum = horsehair worms
or Class = horseshoe crabs
or Suborder = seahorses
or Family in {horse crabs, horsefish, horses, seahorses}
or Genus in {horses, redhorses, seahorses}
or Species in {northern horsemussel, shorthead redhorse}

3 Conclusion

Even though InfoZoom is a general tool for database analysis, it turned out to be well suited for the analysis of trees: there is a natural mapping from trees to the stacked table cells of InfoZoom. The zoom mechanism allows the user to focus on any subset of the tree. In large trees, cells are often too small to read, but this complies with the zoom metaphor: small objects may be invisible from a distance.

A small weakness of InfoZoom is that it cannot directly import the XML files; we had to write a simple transformation program.

4 Related Work

The TableLens [Rao and Card 1994] is the only approach we know of that also uses the basic idea of compressing database tables until they completely fit on the screen. While InfoZoom displays each record in a column, in TableLens each row contains a record. Therefore, the TableLens cannot use the technique of uniting adjacent cells with identical values, which is vital for making textual values readable.

Figure 6. Result of a full-text search for “horse”

References

RAO, R. AND CARD, S. K. 1994. The Table Lens: Merging Graphical and Symbolic Representations in an Interactive Focus+Context Visualization for Tabular Information. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (Boston, MA, Apr 24-28, 1994), 318-322.

SPENKE, M. AND BEILKEN, C. 2000. InfoZoom – Analysing Formula One racing results with an interactive data mining and visualization tool. In Data Mining II, Ebecken, N. (Ed.), 455-464.

SPENKE, M. 2001. Visualization and interactive analysis of blood parameters with InfoZoom. Artificial Intelligence in Medicine 22, 2, 159-172.

http://www.humanIT.de – The InfoZoom home page; a free test version of InfoZoom can be obtained there.

http://www.fit.fraunhofer.de/~cici/InfoVis2003/Index.htm – This web page contains demo videos and the analysis of the file system data of the InfoVis 2003 Contest.


EVAT: Environment for Visualization and Analysis of Trees

David Auber*, Maylis Delest†, J.-Philippe Domenger‡, Pascal Ferraro§ and Robert Strandh+
LaBRI UMR 5800 – Université Bordeaux 1

Abstract

This paper presents a piece of software for the visualization of and navigation in trees. It supports operations such as comparing trees and finding common subtrees, presenting common structures in the same colors. All implemented tools are strongly based on intrinsic combinatorial parameters, with as few references as possible to syntactical data.

CR Categories: D.2.2 [Software Engineering]: Tools and Techniques – User interfaces. G.2.1 [Discrete Mathematics]: Combinatorics – Combinatorial algorithms.

Keywords: Trees, analysis, combinatorics, visualization

1 Introduction

Exploration of trees is an important domain for information visualization. EVAT is designed for exploring one tree or comparing two or more trees, keeping a view on each tree analyzed within the session. Three sets of data were proposed for the contest and, for each of them, EVAT helps answer the user’s questions. Helping means that it shows similarity using colors, or allows filtering on values in order to extract a selection view.

In this filtering process, the focus+context technique is applied in relation with the drawing: the coordinates are recomputed taking into account the selected data.

EVAT allows the importation of XML files and also the direct importation of a file system. The whole session can be saved in EVAT format.

It runs under Linux Redhat 9.1 using the Qt library. It is useful, but not mandatory, to have 512 MB of memory and a graphics card with OpenGL 3D acceleration.

EVAT manages trees of up to 550,000 nodes using 140 MB of memory. Thus it is possible to deal with several huge trees during the same session.

In the following sections, this note presents the background tools, the menus and ergonomic feel, and the main tools for filtering. Some examples are shown. Then, in a last paragraph, some strong and weak points of the software are discussed.

2 Background tools

In this work, we have used the Tulip framework [2], which can be downloaded at www.tulip-software.org. The main relevant features of this software are:

• a powerful kernel in terms of time and memory complexity,

• extensibility by plugins without recompiling,

• the possibility to map textures and colors on edges and nodes without losing performance,

• easy management of clusters of clusters.

--------------------------------------------
* e-mail: auber@labri.fr
† e-mail: maylis@labri.fr
‡ e-mail: domenger@labri.fr
§ e-mail: ferraro@labri.fr
+ e-mail: strandh@labri.fr
Address: 351, Cours de la Libération, 33405 Talence Cedex, France

Searching for analogous subtrees in large trees (more than 300 nodes) is done by our unpublished heuristic based on Strahler numbers [1], which we call the fast algorithm. For smaller trees, the Zhang algorithm [7] is used. For two subtrees, similar nodes are displayed with the same color.
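For background, the Strahler number used by the fast algorithm can be computed bottom-up in linear time. A minimal sketch of that classical computation (our own illustration of the underlying parameter, not the unpublished EVAT heuristic built on top of it):

    def strahler(children, node):
        # Tree given as a dict node -> list of children. Leaves get 1;
        # an internal node gets k+1 if at least two children share the
        # maximum value k, and the maximum child value otherwise.
        kids = children.get(node, [])
        if not kids:
            return 1
        values = sorted((strahler(children, c) for c in kids), reverse=True)
        if len(values) > 1 and values[0] == values[1]:
            return values[0] + 1
        return values[0]

    # A complete binary tree of depth 2 has Strahler number 3.
    tree = {'r': ['a', 'b'], 'a': ['a1', 'a2'], 'b': ['b1', 'b2']}
    print(strahler(tree, 'r'))  # 3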

Managing the colors of nodes in the interface is done by mapping attribute values on RGB or HSV values, or on size values. Two methods are proposed:

• linear mapping,

• distribution mapping [4].

The color mapping on HSV is done on the hue component and fits the rainbow scale. That means:

• pink is associated with the highest value,

• yellow is associated with the lowest one.

Coloring edges is done by interpolating the colors of the two extremities. It is also possible to map one attribute on the size of the nodes.
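A minimal sketch of the linear variant, mapping an attribute range onto a hue interval and coloring edges from their extremities. The exact hue endpoints are our assumption (yellow near 60°, pink near 320°), chosen only to match the rainbow scale described above:

    import colorsys

    H_LOW, H_HIGH = 60 / 360, 320 / 360  # assumed yellow-to-pink hue interval

    def linear_hue(value, vmin, vmax):
        # Linearly map an attribute value to an RGB color via the hue.
        t = (value - vmin) / (vmax - vmin) if vmax > vmin else 0.0
        return colorsys.hsv_to_rgb(H_LOW + t * (H_HIGH - H_LOW), 1.0, 1.0)

    def edge_color(rgb_a, rgb_b):
        # Color an edge by interpolating the colors of its two extremities.
        return tuple((a + b) / 2 for a, b in zip(rgb_a, rgb_b))

    lo, hi = linear_hue(0, 0, 100), linear_hue(100, 0, 100)
    print(edge_color(lo, hi))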

3 Ergonomy and menus

A view of the software is given in Figure 1. During a session, the user can open several graphs or create subgraphs. Each one is displayed in the left window (overview window). The active graphs are those displayed in the upper right window (visualization window). In the lower right window (task window), four main tasks are reachable:

• Visualization contains all the functionalities for setting the display: drawing the tree [3, 5], and the color, size and shape of the nodes.

• Search allows the user to select nodes according to the values of the node attributes. In the advanced mode, multiple selections using Boolean operators can be made.

• Detailed Data allows the user to inspect all the attribute values of a node by clicking it, or all the values of a given attribute in a tree.

• Comparison can be used if two visualization windows exist. It proposes the two algorithms quoted in the previous paragraph.

When several active windows are set up, EVAT automatically provides the tool associated with each one (see Figure 2). Moreover, each task which is done on one window is done on the other.

4 Examples

Here we briefly describe some tasks that can be carried out using EVAT.

For the file system data, it is possible to analyze an attribute, for example the size associated with a node. Figure 1 shows the result after the following operations:

• map the attribute on the node size of the drawing using a linear mapping,

• map the attribute on the node color using HSV and the uniform algorithm,

• select the jpg files,

• zoom in on the tree.

For two trees, it is possible to identify potential common subtrees. In Figure 3, left and right subtrees with the same color are probably similar. This is done using the fast algorithm; it also applies to a single tree. In Figure 4, the result of the algorithm applied to the logs_A contest file is displayed. The exploration with EVAT shows that some subtrees (the blue or violet ones) were duplications of directories.

5 Weak and Strong points

The strongest point of EVAT is that it can answer most of the questions on trees (file system, phylogenetic, classification, ...) in real time. All algorithms (except one) have complexity less than n log(n) in memory and time. The algorithm that maps trees node by node has complexity n³, but EVAT does not run that process if the number of nodes is greater than 300. Nevertheless, the whole software is based on combinatorial properties and thus does not help in finding relations based on the lexical meaning of the nodes. EVAT does not manage history, but successive views can be kept and saved in Tulip format in order to work on the data later.

References

[1] D. AUBER, Using Strahler numbers for real time visual exploration of huge graphs, In Proceedings of the International Conference on Computer Vision and Graphics 2002, Zakopane, 56-69.

[2] D. AUBER, 2002. Outils de visualisation de grandes structures de données, PhD thesis, LaBRI, Université Bordeaux 1.

[3] S. GRIVET, D. AUBER, J.-P. DOMENGER, G. MELANÇON, Bubble Tree Drawing Algorithm, Preprint LaBRI, 2003.

[4] I. HERMAN, M. MARSHALL, G. MÉLANÇON, Density Functions for Visual Attributes and Effective Partitioning in Graph Visualization, In Proceedings of the IEEE Symposium on Information Visualization 2000, IEEE Computer Society, 49-56.

[5] E. M. REINGOLD, J. S. TILFORD, Tidier Drawings of Trees, IEEE Transactions on Software Engineering, 1981, 7(2), 223-228.

[6] A. N. STRAHLER, Hypsometric analysis of erosional topography, Bulletin of the Geological Society of America, 1952, 63, 1117-1142.

[7] K. ZHANG, A constrained edit distance between unordered labeled trees, Algorithmica, 1996, 15, 205-222.

Figure 1. A view of EVAT.
Figure 2. A view of multiple visualization windows.
Figure 3. Common subtrees between the logs_A and logs_B files.
Figure 4. Similar subtrees in the logs_A file.


Comparison of multiple taxonomic hierarchies using TaxoNote

David R. Morse 1, The Open University, United Kingdom
Nozomi Ytow 2, University of Tsukuba, Japan
David McL. Roberts 3, The Natural History Museum, United Kingdom
Akira Sato 4, University of Tsukuba, Japan

1 e-mail: d.r.morse@open.ac.uk; 2 e-mail: nozomi@biol.tsukuba.ac.jp; 3 e-mail: dmr@nhm.ac.uk; 4 e-mail: akira@cc.tsukuba.ac.jp

Abstract

In this paper we describe TaxoNote Comparator, a tool for visualising and comparing multiple classification hierarchies. In order to align the hierarchies, the Comparator creates an integrated hierarchy containing all the taxa in the hierarchies to be compared, so that alignment of the hierarchies can be maintained. A table of assignments reports the taxonomic names that are common to all hierarchies and the differences between them, which facilitates structural comparisons between the hierarchies.

CR Categories: I.3.6 [Computer Graphics]: Methodology and Techniques – Graphics data structures and data types; J.3 [Life and medical sciences]: Biology and genetics

Keywords: taxonomy, nomenclature, visualisation, rough set theory, formal concept analysis.

1. Introduction

Recent work on modelling taxonomic names and their relationships has highlighted the need to capture the multiple names and hierarchies that exist in nomenclature. A number of projects have considered this problem, including Nomencurator [Ytow et al. 2001] and Prometheus [Pullan et al. 2000]. Data models incorporating multiple hierarchies are crucial in facilitating the effective integration of biodiversity data from diverse sources, since multiple and overlapping taxonomic concepts must be tracked, as well as the names that have been applied to these concepts. Equally important are visualisations which permit the comparison and exploration of several hierarchies simultaneously.

In this paper we will describe an extension to our previous work on the Nomencurator data model [Ytow et al. 2001] by giving an overview of the visualisation and comparison tools within TaxoNote. TaxoNote (short for Taxonomist’s Notebook) is a graphical user interface to the Nomencurator data structures.

2. Hierarchy visualisation and comparison

The TaxoNote Comparator hierarchy visualisation and comparison tool is shown in Figure 1. The display is divided into three parts:

• A Query panel can be used to search the displayed hierarchies for particular taxonomic names, by text entry.


• A Hierarchy Comparison panel shows the two hierarchies that are being compared (centre and right) and an ‘integrated view’ (left) where the hierarchies have been merged into one composite hierarchy. An additional pane would be added for each further hierarchy being compared by the application. The Hierarchy Comparison panel provides a list of siblings and children of a taxon. It also captures the parent taxon and the path to the hierarchical root. These may not be displayed if there are many siblings or children of a node, in which case a pop-up panel gives a short summary of the path to the root.

• An Assignment Table at the bottom shows various alternative views of where the names that appear in the hierarchies are assigned. It contains information on the parent taxon and the potential equivalence of taxon concepts, depending on its mode. While the Hierarchy Comparison panel gives a top-down oriented view, the Assignment Table gives a bottom-up oriented view.

Figure 1. The TaxoNote Comparator hierarchy visualisation and comparison tool.

2.1. The Query Panel

In large data sets, efficient search tools are necessary to focus the display and the user’s attention on the area of interest. Fields additional to the taxon name are included as potential query fields in order to refine the search. These search fields are metadata which are important in modelling multiple taxonomic hierarchies, since they allow the user to compare, distinguish between and reconcile different taxonomic opinions of the taxon concepts that are linked to the same taxonomic name.

2.2. The Hierarchy Comparison Panel

In Figure 1, we prefixed all names with an abbreviated form of the taxonomic rank as an aid to navigation and comparison. We chose an indented representation for the hierarchies because this is familiar to taxonomists and to most computer users through applications such as Microsoft Explorer. As with that interface, additional levels of the hierarchy can be expanded and contracted at will. While other representations such as hyperbolic trees and treemaps [Bederson et al. 2002; Graham and Kennedy 2001] may have a higher information density, it is important that the names retain their visibility and readability at all times. The hierarchies and integrated view can be scrolled in concert by holding down the middle mouse button while any of the hierarchy display panes is scrolled. This facilitates the search for a particular taxon and the structural comparison of the different hierarchies.

2.2.1 Alignment of taxonomic names

Core to the alignment problem is establishing the BCN (Best Corresponding Node; see [Munzner et al. 2003]). Ideally, corresponding nodes would represent equivalent taxonomic concepts. Unfortunately, the taxonomic concept itself is extremely difficult to pin down [Ytow et al. 2001] and is approximated in one of two ways: either by consideration of the objects (taxa or specimens) included in the concept [Pullan et al. 2000; Munzner et al. 2003], or by analysis of the attributes of the taxon, i.e. the shared characters of the group. The former method is very sensitive to the contained set being incomplete for any reason, and data for the latter method are rarely available. Other proxy measures of the taxon concept have to be combined to establish the BCN. These include the hierarchical position (the parent list) and the included objects (the child list), but interpreted in a flexible manner, where positive matching counts for more than missing data, and where absence of conflict counts in favour and conflict against. This set of relationships is subtle and is currently being explored using rough set approximations and formal concept analysis [Yao et al. 1997].
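To make the flexible interpretation concrete, one plausible scoring scheme weighs shared entries in the parent and child lists positively, penalizes genuine conflicts, and leaves missing data neutral. The weights and functions below are our own illustrative sketch, not TaxoNote’s algorithm:

    def bcn_score(node_a, node_b, names_b):
        # node_a, node_b: dicts with 'parents' (ancestor names) and
        # 'children' (included objects); names_b: all names known to tree B.
        score = 0.0
        for key in ('parents', 'children'):
            a, b = set(node_a[key]), set(node_b[key])
            score += 2.0 * len(a & b)      # positive matches count for most
            conflicts = (a - b) & names_b  # present in tree B, but elsewhere
            score -= 1.0 * len(conflicts)  # conflict counts against
            # names unknown to tree B are missing data: no extra penalty
        return score

    def best_corresponding_node(node, candidates, names_b):
        # The BCN is the candidate with the highest correspondence score.
        return max(candidates, key=lambda c: bcn_score(node, c, names_b))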

In order to align the two hierarchies, and to maintain their alignment while the display panels are scrolled, a consensus hierarchy is constructed from the source hierarchies that are being compared. This is shown in the left-hand pane in Figure 1 as the Integrated View. In the Hierarchy Comparison panel, rows which are aligned have the same names in the same hierarchical position in both hierarchies (e.g. family Phocoenidae in Figure 1). Rows which are not aligned are indicative of names missing from one hierarchy, perhaps because they are newly created (e.g. family Iniidae), or of names whose hierarchical position has changed from one hierarchy to the other (e.g. genus Lipotes). The necessary inclusion of duplicates of a name has the potential to be a way of indicating regions of difference between trees. Indeed, an estimate of the number of incompatible views can be obtained by simply counting the number of duplicate names in the Integrated View.

Construction of the consensus hierarchy requires the establishment of the BCN for each taxon in the Integrated View. Hierarchies proposed by different taxonomists are likely to embrace different taxon concepts that may or may not have the same name. Therefore, establishing node equivalence is not trivial, and we are still working on algorithms for constructing the composite hierarchy that is shown in the Integrated View.

2.3. The Assignment Table

The bottom panel contains the Assignment Table, which consists of a number of organised lists whose purpose is to allow the user to explore the differences and commonalities between taxon concepts in the hierarchies. The table is structured into columns, one for each hierarchy pane. The primary taxon is given on the left, underneath the integrated view, while the parent taxon is listed underneath the appropriate hierarchical pane. Tabs at the bottom of the Assignment Table allow the user to see those taxa which are missing from one set or the other (‘Missing taxa’ tab), while those taxa with different positions are summarised under the ‘Different taxa’ tab. Other forms of difference are given on the ‘Inconsistent taxa’ and ‘Synonyms’ tabs. Finally, those nodes in common are listed under the ‘Common taxa’ tab.

One use of the Assignment Table is illustrated under the ‘Missing taxa’ tab by the species Acomys cineraseus (in Mammals A) and Acomys cinerasceus (in Mammals B), which looks like a spelling error either in the original publication or in the data preparation.

3. The InfoVis 2003 Contest Data Sets

It is our contention that no one tool can solve all hierarchical data visualisation problems. We have chosen to address one particular type of data – classification hierarchies – which may be characterised as non-quantitative data. Our approach would need significant additions in order for it to perform well at visualising hierarchically arranged quantitative data, data which is often well suited to visualisation using treemaps [Bederson et al. 2002]. Such additions to our system could include colour-coded glyphs or bars alongside, or in place of, the text labels.

Classification hierarchies are unusual in that the names in the hierarchies should be unique. The appearance of the same name in different places is indicative of homonymy and is of interest to taxonomists as an area that requires taxonomic revision. In contrast, file system hierarchies are replete with duplicated names. Files called ‘index.html’ abound in websites – the file logs_A_03-02-01.xml records 3356 occurrences of this file, for example.

In classification hierarchies, the name is just that, because of the assumption that taxonomic names in a hierarchy are unique. The position of the name in the hierarchy – the rank – gives extra information about the name. In contrast, in a file system hierarchy, the name consists of the path to the file in addition to the actual file name. While components of the path may give additional information about the file, this interpretation is not as strong as the rank in taxonomy. Clearly, very different visualisation techniques are required in order to navigate and compare hierarchies with such different properties.

References

BEDERSON, B. B., SHNEIDERMAN, B. AND WATTENBERG, M. 2002. Ordered and quantum treemaps: Making effective use of 2D space to display hierarchies. ACM Transactions on Graphics 21, 4, 833-854.

GANTER, B. AND WILLE, R. 1999. Formal Concept Analysis: Mathematical Foundations. Springer-Verlag.

GRAHAM, M. AND KENNEDY, J. 2001. Combining linking & focusing techniques for a multiple hierarchy visualisation. In Fifth International Conference on Information Visualisation, IEEE Computer Society Press, 425-432.

MUNZNER, T., GUIMBRETIÈRE, F., TASIRAN, S., ZHANG, L. AND ZHOU, Y. 2003. TreeJuxtaposer: Scalable Tree Comparison using Focus+Context with Guaranteed Visibility. In ACM SIGGRAPH, ACM Press.

PULLAN, M. R., WATSON, M. F., KENNEDY, J. B., RAGUENAUD, C. AND HYAM, R. 2000. The Prometheus Taxonomic Model: a practical approach to representing multiple classifications. Taxon 49, 1, 55-75.

YTOW, N., MORSE, D. R. AND ROBERTS, D. M. 2001. Nomencurator: a nomenclatural history model to handle multiple taxonomic views. Biological Journal of the Linnean Society 73, 1, 81-98.


Treemap, Radial Tree, and 3D Tree Visualizations

Nihar Sheth
School of Informatics, Indiana University, Bloomington
nisheth@indiana.edu

Katy Börner, Jason Baumgartner and Ketan Mane
School of Library and Information Science, Indiana University, Bloomington
{katy, jlbaumga, kmane}@indiana.edu

Eric Wernert
Computer Science Department, Indiana University, Bloomington
ewernert@cs.indiana.edu

1 Introduction

This paper presents and discusses data analysis and visualization results as part of our InfoVis 2003 contest submission. Two in-house developed approaches – a coupled dual radial tree layout interface and a three-dimensional tree viewer (Stewart, Hart et al. 2001) – were employed to compare phylogenetic trees. An in-house developed radial tree layout was applied to visualize tree topology. The treemap algorithm (Bederson, Shneiderman et al. 2002), developed at HCIL, University of Maryland, was utilized to visualize tree attributes.

2 Phylogenies

The test data comprised two small binary phylogenetic (evolutionary) trees. The task required the design of an interactive tool that supports the alignment of the tree topologies. The trees are unrooted and leaf-labeled. Hence, the mapping between leaf nodes is straightforward, while the mapping between internal nodes is not obvious. Two solutions are proposed.

The first tool is a 3D tree viewer developed at the Advanced Visualization Laboratory, Indiana University (Stewart, Hart et al. 2001). It visualizes the hierarchical structure of both trees and interconnects matching leaf nodes by straight, color-coded lines. Matching sub-trees are color coded as well to support the interactive alignment process; see Figure 1.

Figure 1: 3D tree viewer showing homology between two phylogenies

Users can search for specific node labels and interactively align trees by changing the layout of the trees. By exploiting the third dimension, more than two trees can be visualized and compared.

Individual taxa or groups of taxa can be traced across multiple trees. The viewer uses the Open Inventor graphics API to generate the 3D visualization.

The second tool uses two tightly coupled radial tree visualizations to support the semi-automatic alignment of trees. The tool takes a correspondence matrix for two phylogenies as input. This matrix can, for example, be computed using the popular consensus tree approach (Adams 1972), which establishes mappings of intermediate nodes according to the similarity between their respective leaf sets.
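One simple way to obtain such a matrix is to score every pair of internal nodes by the overlap of their leaf sets, e.g. with the Jaccard coefficient. This is our condensed illustration of the general idea, not the exact consensus computation of Adams (1972):

    def leaf_set(children, node):
        # Leaves below `node` in a tree given as a dict node -> children.
        kids = children.get(node, [])
        if not kids:
            return {node}
        return set().union(*(leaf_set(children, c) for c in kids))

    def correspondence(tree_a, internal_a, tree_b, internal_b):
        # Jaccard similarity of leaf sets for every internal-node pair.
        return {
            (u, v): len(leaf_set(tree_a, u) & leaf_set(tree_b, v))
                    / len(leaf_set(tree_a, u) | leaf_set(tree_b, v))
            for u in internal_a for v in internal_b
        }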

The layout algorithm generates two radial tree visualizations – one for each phylogenetic tree (see Figure 2) – and two control panels (see Figure 3).

Figure 2: Dual radial tree viewer showing the aligned topologies of two phylogenies

The two radial tree visualizations are tightly coupled. A pop-up menu enables users to automatically align the trees or to query the ‘other’ tree for best matching nodes. Automatic alignment selects the node in the ‘other’ tree that best matches the currently selected node and moves the selected and best matching nodes into the middle of the display. The search for best matching nodes uses the correspondence matrix to determine nodes that are embedded in a topology similar to that of the selected node, and colors them black. Both tree views have the full radial tree functionality (search, details on demand, etc.) explained in the next section. The program was implemented in Java and runs as an applet or application.

3 Classification

This task required the visualization of very large trees with large fan-outs. The radial tree visualization introduced in the previous section was applied to visualize the tree structure and to support search and browsing. The radial tree is a focus and context technique developed at Indiana University and is available in the Information Visualization Repository¹. A query submission interface was added to support search. An interface snapshot showing the “Mammal” sub-tree is given in Figure 3.

Browsing a radial tree is very similar to browsing a hyperbolic tree. Upon selection, a node is moved to the center of the tree and the surrounding tree is rearranged accordingly. A slow-in, slow-out animation technique (instead of a straight linear transition) was used to provide visual constancy and to reduce disorientation.
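Slow-in, slow-out animation is typically realized by running the transition through an ease-in-out curve. A minimal sketch using the standard cubic smoothstep (an assumed choice; the authors’ exact easing function is not stated):

    def smoothstep(t):
        # Cubic ease-in-out: slow at both ends, fastest in the middle.
        return t * t * (3 - 2 * t)

    def animate(start_pos, end_pos, frames=30):
        # Yield interpolated positions for a node moving to the center.
        for i in range(frames + 1):
            s = smoothstep(i / frames)
            yield tuple(a + s * (b - a) for a, b in zip(start_pos, end_pos))

    for pos in animate((120.0, 80.0), (0.0, 0.0)):
        pass  # redraw the layout with the node at `pos` each frame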

The layout shown in Figure 3 provides labels for the center and first-level nodes exclusively, to avoid clutter. Each rank category of the classification hierarchy, such as phylum, order, family, etc., is color coded differently to provide navigational cues to the user. For example, family nodes in Figure 3 are shown in pink, order nodes in orange, and species in green.

The path from the node at the center to the root node (Mammal) of the entire data set is highlighted in red to aid the user in navigating the hierarchy. Node details can be requested on demand.

Figure 3: Radial tree viewer displaying the classification data

The left panel provides a button that can be used to restore the original tree layout with the tree root in the center. The slider lets users change the number of displayed levels. Selecting the animation check box leads to a smooth animation of tree layout changes. Users can search for the Latin or common names of nodes. Search terms are entered in the query field; regular expression matching is also supported. Matching nodes are marked black, providing instant visual feedback to the user. In addition, matching nodes are displayed in a list. Selecting a list item moves this particular node to the center of the display and aligns the tree accordingly.

In sum, browsing results and search results are shown in the context of the particular tree structure.

4 File System and Usage Logs

The third task required the determination and visualization of topological or attribute value changes in large trees. Below we discuss a visualization that aims to show the attribute value changes for name, hit counts, title and the other attributes provided in the log files.

¹ http://iv.slis.indiana.edu

The treemap algorithm (Bederson, Shneiderman et al. 2002), implemented at the Human-Computer Interaction Laboratory (HCIL) at the University of Maryland, was identified as the tool that would serve the task requirements. Figure 4 shows the data using the squarified partitioning method. Each directory/file is represented by a square label. The size, color, and label of each square can be used to represent three attributes of this directory/file. For example, in Figure 4 the color encodes the creation time of the directories/files: green coloration is given to older pages, blue to entities that were created most recently. This gives a very good overview of the creation time of the structure at a glance. The nesting of the directory squares corresponds to the original directory hierarchy.
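For reference, squarified partitioning greedily fills rows of rectangles so that their aspect ratios stay close to 1. A condensed sketch of the standard algorithm (our own simplified version, not the HCIL implementation; areas must be sorted in decreasing order and sum to w*h):

    def worst(row, side):
        # Worst aspect ratio in a row of areas laid along an edge of length `side`.
        s = sum(row)
        return max(max(side * side * a / (s * s), (s * s) / (side * side * a))
                   for a in row)

    def squarify(areas, x, y, w, h):
        rects, areas = [], list(areas)
        while areas:
            side = min(w, h)
            row = [areas.pop(0)]
            # Grow the row while doing so does not worsen the aspect ratios.
            while areas and worst(row + [areas[0]], side) <= worst(row, side):
                row.append(areas.pop(0))
            thickness, offset = sum(row) / side, 0.0
            for a in row:
                length = a / thickness
                if w >= h:  # lay the row vertically along the left edge
                    rects.append((x, y + offset, thickness, length))
                else:       # lay the row horizontally along the top edge
                    rects.append((x + offset, y, length, thickness))
                offset += length
            if w >= h:
                x, w = x + thickness, w - thickness
            else:
                y, h = y + thickness, h - thickness
        return rects

    print(squarify([6, 6, 4, 3, 2, 2, 1], 0.0, 0.0, 6.0, 4.0))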

Figure 4: Treemap visualization showing the file system of the Computer Science Department at the University of Maryland

Placing the mouse pointer over a square brings up a panel with additional information such as file name, size, and hit count. If the user clicks on a square that represents a directory, the current treemap visualization is replaced by a zoomed-in version of the selected square. Multiple combinations of individual attribute values can be visualized. The user can search for directories, compute comparative statistics on hit counts, and visually identify files or folders with similar attribute values.

A web page accompanying this submission, with large-scale versions of the interfaces as well as animation sequences showing them in interaction, is at http://ella.slis.indiana.edu/~kmane/katy/iv_contest/webpage/.

5 References

Adams, E. N. (1972). “Consensus techniques and the comparison of taxonomic trees.” Systematic Zoology 21: 390-397.

Bederson, B. B., B. Shneiderman, et al. (2002). “Ordered and Quantum Treemaps: Making Effective Use of 2D Space to Display Hierarchies.” ACM Transactions on Graphics (TOG) 21(4): 833-854.

Stewart, C. A., D. Hart, et al. (2001). Parallel implementation and performance of fastDNAml – a program for maximum likelihood phylogenetic inference. Supercomputing Conference, Denver, CO.
