

These studies do not produce conclusive results, though they seem to cast doubt on the connections between popular metrics (as they have been used for web search evaluation) and user satisfaction. Therefore, the need for novel methods of metric evaluation has been emphasized (Mandl 2010).

Methodology

We attempt to provide a comparison of three popular explicit evaluation metrics in their relationship to user satisfaction. That is to say, we attempt to test whether and how well (M)AP and (n)DCG¹ indicate users' explicitly stated preferences. While there is no absolute standard against which to measure evaluation metrics, we consider user preference between two result lists to be a useful start. From the point of view of a search engine developer, the most interesting question to be answered by a metric is whether a given algorithm is better than another. This other might be a previous version of the algorithm, a competing search engine, or just a baseline value. Additional questions might concern the confidence in the preference statement or the size of the difference between the algorithms. The most direct way to gather user preference is to obtain explicit judgments; this directness is needed to ensure that the standard we are measuring metrics against is not itself biased by an intervening layer of theory. While a direct comparison of two result sets is unusual (and might be considered "unnatural" for search behaviour), we think it nevertheless provides a closer reflection of actual user preference than other methods.
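To make the two metrics concrete, the following is a minimal per-query sketch of AP and nDCG with binary relevance judgments and a log2 discount; the function names and the discount base are our assumptions, as the paper does not specify its exact implementation.

```python
import math

def average_precision(rels):
    """AP for a single query: the mean of precision@k over all ranks k
    that hold a relevant result. `rels` is a ranked list of binary
    relevance judgments (1 = relevant, 0 = not relevant)."""
    hits, precision_sum = 0, 0.0
    for k, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            precision_sum += hits / k
    return precision_sum / hits if hits else 0.0

def ndcg(rels):
    """nDCG for a single query: DCG of the list divided by the DCG of
    its ideal (descending) reordering, so the score falls into [0, 1]."""
    def dcg(gains):
        return sum(g / math.log2(i + 1) for i, g in enumerate(gains, start=1))
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal else 0.0

# Hypothetical judgments: relevant results at ranks 1, 3 and 4 of five.
ranking = [1, 0, 1, 1, 0]
print(round(average_precision(ranking), 3))  # (1/1 + 2/3 + 3/4) / 3 = 0.806
print(round(ndcg(ranking), 3))               # 1.931 / 2.131 = 0.906
```

Because both scores are computed per query and normalized, two result lists for the same query can be compared directly, which is exactly the preference question the study poses.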

For the study, the help of 31 first-year Information Science students was enlisted. They were required to enter queries they were interested in, as well as a detailed statement of their information need. For every query, the top 50 results were fetched from a major web search engine. From these, two result lists were constructed: one contained the results in their original order, while the ordering of the other was completely randomized. The users were then confronted, again through a web interface, with different types of judgments.
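As a sketch of the list construction described above (the function name, seed handling, and placeholder results are our own; the study's actual tooling is not described):

```python
import random

def build_judgment_pair(results, seed=None):
    """Construct the two lists shown to a participant: the engine's
    original ranking and a fully randomized permutation of the same
    results. Name and seed parameter are illustrative."""
    original = list(results)
    randomized = list(results)
    random.Random(seed).shuffle(randomized)
    return original, randomized

top50 = [f"result_{i}" for i in range(1, 51)]  # stand-in for fetched results
original_list, random_list = build_judgment_pair(top50, seed=42)
```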

¹ As we calculate the metrics on a per-query basis, nDCG is analogous to DCG while being easier to compare, as it falls into the usual [0, 1] range. Also, MAP for a single query is obviously equal to AP. For convenience, we will speak of MAP in all contexts.
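For reference, a common formulation of the normalization the footnote relies on (the log2 discount is one standard choice, not necessarily the one used in the original study):

\[
\mathrm{DCG}@k = \sum_{i=1}^{k} \frac{rel_i}{\log_2(i+1)},
\qquad
\mathrm{nDCG}@k = \frac{\mathrm{DCG}@k}{\mathrm{IDCG}@k},
\]

where \(\mathrm{IDCG}@k\) is the DCG@k of the ideally reordered list; since no ranking can exceed the ideal one, nDCG always lies in [0, 1].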
