
DG04 project presentation - Inforum


User Interfaces for Information Retrieval on the WWW

Gary Marchionini
University of North Carolina at Chapel Hill
march@ils.unc.edu

INFORUM 2005
Prague
May 24-27, 2005


Message

• On the WWW, the user interface is the librarian.
• HCI and IR are related fields with strong traditions, and both have been energized by the WWW.
• The intersection of these fields offers interesting new opportunities for high-impact R&D.
• Integrating human and system interaction is the main design challenge: syminforosis (people continuously engaged with meaningful information).

Gary Marchionini, UNC-Chapel Hill INFORUM 2005


Outline

• IR and the WWW
• HCI and the WWW
• HCIR rooted in system development
• Examples
  – Open Video
  – Relation Browser
• Challenges and Opportunities



Content-Centered Retrieval as Matching Document Representations to Query Representations

[Diagram. Labels: Document Space, Sample, Surrogates (Terms, Vectors, etc.), Match Algorithm, Surrogates (Query Form A, Query Form B, etc.), Sample, Query Space]

A powerful paradigm that has driven IR R&D for half a century. The evaluation metric is the effectiveness of the match (e.g., recall and precision).
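The matching paradigm and its evaluation metrics can be made concrete with a small sketch. This is an illustration only, not the method of any system named in the talk: it assumes bag-of-words term vectors as the surrogates and cosine similarity as the match algorithm, and the function names and the threshold value are hypothetical.

```python
from collections import Counter
from math import sqrt


def to_vector(text):
    """Build a term-frequency surrogate (bag of words) for a text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, docs, threshold=0.1):
    """Return ids of documents whose surrogate matches the query surrogate.

    The threshold is an arbitrary illustrative cutoff.
    """
    q = to_vector(query)
    return {doc_id for doc_id, text in docs.items()
            if cosine(q, to_vector(text)) > threshold}


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up three-document collection for illustration.
docs = {
    "d1": "coal energy prices",
    "d2": "natural gas prices",
    "d3": "video library",
}
retrieved = retrieve("coal prices", docs)
precision, recall = precision_recall(retrieved, relevant={"d1"})
```

With this toy collection the query retrieves d1 and d2; if only d1 is judged relevant, recall is perfect but precision is one half, which is the kind of trade-off the two metrics are meant to expose.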



WWW Content Trend

• Content Features (queries too)
  – Not only text
    • Statistics, images, music, code, streams, biochemical
  – Multimedia, multilingual
  – Dynamic
    • Temporal (e.g., blogs, wikis, sensor streams)
    • Conditional (e.g., computed links, recommendations)
• Content Relationships
  – Hyperlinks, new metadata, aggregations
  – Digital Libraries, personal collections
• Content acquires history



Responses to Content Trend

• Link analysis
• Multiple sources of evidence (fusion)
  – Authors’ words (e.g., full text IR)
  – Indexer/abstractor words (e.g., OPACs)
  – Authors’ citations/links (e.g., ISI, Google)
  – Readers’ search paths (e.g., recommenders, opinion miners)
  – Machine generated features and relationships
• Two key challenges:
  – What new relationships can we leverage (human and machine)?
  – How can we integrate multiple sources of evidence?
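One simple way to picture the integration question is score fusion. The slide does not name a method; the sketch below assumes a weighted sum of min-max normalized scores (in the spirit of CombSUM-style fusion), and the source names, scores, and function names are made up for illustration.

```python
def normalize(scores):
    """Min-max normalize one evidence source's scores into [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All documents tie within this source.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(sources, weights=None):
    """Rank documents by the weighted sum of their normalized scores.

    sources: {source_name: {doc_id: raw_score}}
    weights: optional {source_name: weight}; defaults to equal weights.
    """
    weights = weights or {name: 1.0 for name in sources}
    fused = {}
    for name, scores in sources.items():
        for doc, s in normalize(scores).items():
            fused[doc] = fused.get(doc, 0.0) + weights[name] * s
    # Highest fused score first; ties broken by document id.
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical evidence: text-match scores and inbound link counts.
sources = {
    "full_text": {"a": 2.0, "b": 1.0, "c": 0.0},
    "links": {"a": 10, "b": 30, "c": 20},
}
ranking = fuse(sources)
```

Here document b ranks first even though a has the best text match, because the link evidence favors b. Choosing the weights, and deciding whether a sum is the right combination at all, is exactly the open question the slide raises.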



Installed User Base Trend

• Technical advances and technical literacy allow us to leverage information seeker intelligence
  – Rather than depending solely on matching algorithms, focus on the flow of representations and actions in situ as people think with these new tools and information resources
• Web and TV remotes have legitimized browsing as human-controlled information seeking
• To leverage human intelligence and effort, people must assume responsibilities: beyond the two-word, single query
• Aim at understanding rather than retrieval



Responses to People Trend

• Adapt techniques to WWW
  – Relevance feedback
  – Query expansion
  – User modeling/profiles, SDI services
• Recommender systems
  – Explicit and implicit models
• Capture everything (e.g., Lifebits)
• User Interfaces
  – Dynamic queries
  – Agile views
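As a reminder of what relevance feedback and query expansion look like in practice, here is a minimal sketch of the classic Rocchio update. The slide does not specify a technique, so this choice is an assumption; queries and documents are term-weight dictionaries, and the alpha, beta, and gamma values are illustrative defaults rather than recommendations.

```python
def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an expanded query as a {term: weight} dict.

    new_query = alpha * query
                + beta  * mean(relevant document vectors)
                - gamma * mean(non-relevant document vectors)
    Terms whose weight ends up at or below zero are dropped.
    """
    new_query = {term: alpha * w for term, w in query.items()}
    for docs, factor in ((relevant, beta), (nonrelevant, -gamma)):
        for doc in docs:
            for term, w in doc.items():
                new_query[term] = new_query.get(term, 0.0) + factor * w / len(docs)
    return {term: w for term, w in new_query.items() if w > 0}


# The user marks one document as relevant and one as not relevant.
expanded = rocchio(
    query={"energy": 1.0},
    relevant=[{"energy": 1.0, "coal": 1.0}],
    nonrelevant=[{"energy": 1.0, "video": 1.0}],
)
```

The expanded query gains the term "coal" from the relevant document, while "video", which appears only in the rejected document, ends with a negative weight and is dropped.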



An Expanded Model:

Think of IR from the perspective of an active human who has information needs, information skills, and powerful IR resources, and who is situated in global and local connected communities, all of which evolve over time.



Human-Computer Communication Model of HCI

[Diagram. Human side labels: Brain, Mind, mental models (task, system), hands, voice, etc., eyes, ears, etc., CONCEPTUAL INTERFACE (Rules). Channel labels: Language, Rate, Half Duplex, etc., NOISE, Commands, Menus, Direct Manipulation, etc. Computer side labels: Hardware, Software, conceptual models (task, user), display, speaker, etc., keyboard, mouse, etc., CONCEPTUAL INTERFACE (Rules).]

A user-oriented model that has driven R&D. Evaluation is based on user time, accuracy, and satisfaction.



HCI WWW Trends

• First decade of WWW as great equalizer (existing users get impoverished, but we admit MANY more people)
• Universal access
• Platform independence (lots of devices)
• Enhanced browsers, specialized browsers
• Interface Servers
• Social awareness (user is not alone)



HCIR

• Trend toward getting people closer to the information they need
  – Closer to the backend
  – Closer to the meaning
• Increasing responsibility as well as control
• More demanding and knowledgeable installed base
• Ubiquity, digital libraries, e-commerce as extended memories and tools (personal and shared)



HCIR: Bringing User Closer to World

[Diagram. The document/query matching diagram (Document Space, Sample, Surrogates, Terms, Vectors, Match Algorithm, Query Form A, Query Form B, Query Space) with additional labels: P, C, Rules, Structures, Context, Labels, Help, Start/Stop, World]


Key Challenges

• Linking conceptual interface to system backend
  – metadata generation
  – alternative representations and control mechanisms
• Raising user literacy and involvement
  – Engaging without insulting or annoying
• Moving beyond retrieval to understanding
  – context



Two examples of getting people involved in continuous decision making and interaction with information resources: dynamic queries and the agile views interaction framework, instantiated in Open Video and Relation Browser



Open Video Example

www.open-video.org

• Open-access digital library of digital video for education and research
• 2600+ video segments: MPEG-1, MPEG-2, MPEG-4, QuickTime
• Multiple visual surrogates
• Agile Views Design Framework
  – Different types of views
    • Overviews, previews, shared views
  – Multiple examples of views
  – Dynamic control mechanisms



Alternative Overviews of Result Sets



Alternative Previews for a Specific Video Segment



Relation Browser Example

www.idl.ils.unc.edu/rave

• A general purpose dynamic query interface for databases with a small number of facets (~10) and a small number of categories in each facet (~10).
• Easy to look ahead (overviews and previews)
• Couples interactive partitioning/exploration with string query
• Semi-automatic category generation and webpage classification
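The look-ahead behavior described above can be sketched in a few lines. This is a hypothetical illustration, not the Relation Browser's actual implementation: the record layout, facet names, and function names are assumptions. The idea is that every facet selection, mouse-over, or keystroke recomputes how the matching records distribute across the categories of every facet.

```python
from collections import Counter


def matches(record, selections, title_query=""):
    """True if the record has every selected facet value and its title
    contains the typed string (case-insensitive substring match)."""
    facets_ok = all(record.get(facet) == category
                    for facet, category in selections.items())
    return facets_ok and title_query.lower() in record["title"].lower()


def preview_counts(records, facets, selections, title_query=""):
    """For each facet, count how the currently matching records
    distribute across that facet's categories."""
    current = [r for r in records if matches(r, selections, title_query)]
    return {facet: Counter(r[facet] for r in current) for facet in facets}


# Made-up records loosely modeled on an energy statistics website.
records = [
    {"title": "Coal prices", "fuel": "Coal", "sector": "Industrial"},
    {"title": "Coal for household heating", "fuel": "Coal", "sector": "Residential"},
    {"title": "Natural gas in housing", "fuel": "Natural Gas", "sector": "Residential"},
    {"title": "Natural gas pipelines", "fuel": "Natural Gas", "sector": "Industrial"},
]
facets = ["fuel", "sector"]

# Mousing over "Coal": preview the distribution without committing to it.
coal_preview = preview_counts(records, facets, {"fuel": "Coal"})

# Selecting "Natural Gas" and typing 'hous' in the title field.
gas_hous = preview_counts(records, facets, {"fuel": "Natural Gas"}, "hous")
```

A mouse-over and a click can call the same function; the only difference is whether the interface keeps the selection afterwards. With roughly ten facets of ten categories each, the counts are small enough to recompute on every interaction, provided the metadata is available on the client side.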



Relation Browser Start State for Energy Information Admin Website



Mousing over “Coal” under the “Fuel type” category reveals the distribution of coal-related web pages across the other categories



Click on Natural Gas and Mouse over Residential Sector



RB++ showing ‘hous’ typed in title field



Some Interaction Principles and Caveats in These Examples

• Principles
  – Look ahead without penalty
  – Minimize scrolling and clicking
  – Alternative ways to slice and dice
  – Closely couple search, browse, and examine
  – Continuous engagement: useful attractors
  – Treasures to surface
• Caveats
  – Scalability (getting metadata to client side)
  – Metadata crucial
    • We are working on automatically creating partitions
  – Increasing expectations about useful results (answers!)



Long Term Paradigm: Information Interaction as Core Life Process

The examples represent early ways to get the information seeker more involved in the information seeking process; there is plenty more to do. As with eating, we have varying expectations, invest different levels of effort, and use diverse and ubiquitous infrastructures. The key challenge is to span boundaries between cyberinfrastructure and the ‘real’ world.

[Diagram labels: Cyberinfrastructure; Physical and Intellectual Reality]



Coda

• Our hopes that we can create systems (solutions) that ‘do’ IR for us are unreasonable.
• Our expectations that people can find and understand information without thinking and investing effort are unreasonable.
• We aim to develop ‘systems’ that involve people and machines continuously learning and changing together. Google would not work as well next month if there were not a large group of employees tuning the system and adding new spam filters, and crawlers continuously checking pages and links.



Thank You!

Questions and Discussion

march@ils.unc.edu
