Information und Wissen: global, sozial und frei? - Fachhochschule ...

Weitere Magazine

Empfehlungen

Info

32 Pavel Sirotkin A comparison of metrics and cut-off values suggests that in different circumstances, different metrics might be appropriate. MAP performs quite poorly at small cut-offs, but emerges as the best metric at 10. Precision never outperforms nDCG, but (at least at the earlier ranks) comes close enough for the difference to be minimal. In absolute terms, the maximum PIR reached is 0.84 (nDCG@7-8). Discussion We would like to point out that search engine evaluation is just a small part of IR evaluation and, moreover, the type of performance we have attempted to capture is just one of many possible aspects of search engine quality. Lewandowski and Hochstötter (2007) propose a four-way quality framework including index quality, quality of the results, quality of search features and usability. The pure evaluation of organic, web page based result lists (as opposed to paid content or “universal search” features) is itself only a minimalistic subset of “quality of the results”. However, the evaluated content is still an important and arguably even crucial part of a search engine’s results. Also, our test subjects obviously did not constitute a representative sample of search engine users. While we look forward to studies with more diverse raters, the group is hardly less heterogeneous than those of most comparable studies. Our results lead to some conclusions of practical importance. As an increasing cut-off value does not necessarily lead to a better approximation of user preferences, it might be a good idea to divert some resources from rating queries deeper to rating more queries. This has been found to provide higher significance (Sanderson and Zobel 2005); our results suggest that, rather than being a trade-off, exchanging depth for width can be doubly effective. It may even be sensible to reduce the cut-off to as low as 4, since it means cutting the effort in half while losing about 15% of information as measured by PIR. A possible explanation for the decrease of prediction quality is that users hardly look at documents beyond a certain rank (Hotchkiss et al. 2005), in which case any later difference in result quality is not reflected in actual user preferences. It would also explain why precision is the most and MAP the
Predicting user preferences 33 least affected, since the former has no and the latter a high discounting factor for later results. Regarding individual metrics, nDCG was shown to perform best in most circumstances. In the best case, it correctly predicted 84% of user preferences. MAP might be employed if one explicitly desires to take into account later results, even if their relevance may not be important to the user. While precision performs considerably well, the present study has not found a situation where it would be the most useful metric. The absolute PIR values we report may well be overestimations, as discussed in the Evaluation section. On the other hand, the preference judgments obtained were binary. We might assume that, given degrees of preference, we would find strong preferences easier to identify by considering document ratings. While metrics are often compared on their ability to distinguish between entities of relatively close quality, from the practical point of view, it is crucial for a metric to reliably pick out large differences, since those are the instances where the most improvements can be made. However, these conjectures await further research to confirm or disprove them. Finally, our evaluation might have a value beyond its immediate results. We think that choosing an explicit, praxis-based standard for evaluating evaluation can help distinguish between the multitudes of available metrics. Particularly, a measure like PIR can be more practical than correlation measures often employed in such studies. Rather than indicating whether a given metric reflects a preference tendency, it can tell for what ratio of queries we would provide better results by using each metric to simulate preference judgments. Conclusions and future work A measure of a metric’s ability to predict user satisfaction across queries was introduced. We used this measure, the Preference Identification Ratio (PIR), to provide estimates for the some common relevance metrics. (n)DCG was found to perform best, indicating the preferred result lists for up to 84% of queries. MAP provided good judgments at higher cut-off values, while precision did well without ever being the most informative metric. We also showed that search engine evaluations might be performed in a more signifi-
Seite 1 und 2: Griesbaum, Mandl, Womser-Hacker (Hr
Seite 3 und 4: Joachim Griesbaum, Thomas Mandl, Ch
Seite 5 und 6: Inhaltsverzeichnis Veranstalter & T
Seite 7 und 8: Session 6: Multimedia 221 Peter Sch
Seite 9 und 10: Session 11: E-Learning / Social Med
Seite 11 und 12: Veranstalter & Tagungsteam Hochschu
Seite 13 und 14: Programmkomitee Internationales Sym
Seite 15 und 16: When Music comes into Play - Überl
Seite 17 und 18: Vorwort 17 Vorwort Das 12. Internat
Seite 19 und 20: Vorwort 19 Abstracts der Keynotes
Seite 21 und 22: Information Retrieval: Technology,
Seite 23 und 24: Information Retrieval: Technology,
Seite 25 und 26: Predicting user preferences 25 case
Seite 27 und 28: Predicting user preferences 27 othe
Seite 29 und 30: Predicting user preferences 29 Firs
Seite 31: Predicting user preferences 31 Figu
Seite 35 und 36: Predicting user preferences 35 Jär
Seite 37 und 38: Usefulness Evaluation on Visualizat
Seite 47 und 48: Vergleich von IR-Systemkonfiguratio
Seite 63 und 64: Komponenten-basierte Metadatenschem
Seite 75 und 76: Toolbasierte Datendokumentation in
Seite 83 und 84:
Toolbasierte Datendokumentation in
Seite 85 und 86:
Nachhaltige Dokumentation virtuelle
Seite 87 und 88:
Seite 89 und 90:
Seite 91 und 92:
Seite 93 und 94:
Seite 95 und 96:
Seite 97 und 98:
Seite 99 und 100:
Konferenz-Tweets 99 Weise entsteht
Seite 101 und 102:
Konferenz-Tweets 101 Tabelle 1: Üb
Seite 103 und 104:
Konferenz-Tweets 103 Konferenz bezi
Seite 105 und 106:
Konferenz-Tweets 105 sentliche Frag
Seite 107 und 108:
Konferenz-Tweets 107 Eine intensive
Seite 109 und 110:
Konferenz-Tweets 109 4 Fazit und Au
Seite 111 und 112:
Social Bookmarking als Werkzeug fü
Seite 113 und 114:
Seite 115 und 116:
Seite 117 und 118:
Seite 119 und 120:
Seite 121 und 122:
Seite 123 und 124:
T-Index als Stabilitätsindikator f
Seite 125 und 126:
Seite 127 und 128:
Seite 129 und 130:
Seite 131 und 132:
Seite 133 und 134:
Seite 135 und 136:
Seite 137 und 138:
A data model for cross-domain data
Seite 139 und 140:
Seite 141 und 142:
Seite 143 und 144:
Seite 145 und 146:
Seite 147 und 148:
Seite 149 und 150:
Wissenschaftliche Zeitschriften im
Seite 151 und 152:
Seite 153 und 154:
Seite 155 und 156:
Seite 157 und 158:
Seite 159 und 160:
Seite 161 und 162:
Abdeckung erziehungswissenschaftlic
Seite 163 und 164:
Seite 165 und 166:
Seite 167 und 168:
Seite 169 und 170:
Seite 171 und 172:
Seite 173 und 174:
Constructing Topic-specific Search
Seite 175 und 176:
Seite 177 und 178:
Seite 179 und 180:
Seite 181 und 182:
Seite 183 und 184:
Seite 185 und 186:
Applying Science Models for Search
Seite 187 und 188:
Seite 189 und 190:
Seite 191 und 192:
Seite 193 und 194:
Seite 195 und 196:
Seite 197 und 198:
Spezielle Anforderungen bei d. Eval
Seite 199 und 200:
Seite 201 und 202:
Seite 203 und 204:
Seite 205 und 206:
Seite 207 und 208:
Seite 209 und 210:
Entwicklung einer Benutzeroberfläc
Seite 211 und 212:
Seite 213 und 214:
Seite 215 und 216:
Seite 217 und 218:
Seite 219 und 220:
Seite 221 und 222:
Seite 223 und 224:
Effects of real, media and presenta
Seite 225 und 226:
Seite 227 und 228:
Seite 229 und 230:
Seite 231 und 232:
Seite 233 und 234:
Seite 235 und 236:
Ein erweiterbares Tool zur Annotati
Seite 237 und 238:
Seite 239 und 240:
Seite 241 und 242:
Seite 243 und 244:
Seite 245 und 246:
Seite 247 und 248:
AV-Portal für wissenschaftliche Fi
Seite 249 und 250:
Seite 251 und 252:
Seite 253 und 254:
Seite 255 und 256:
Seite 257 und 258:
Significant properties digitaler Ob
Seite 259 und 260:
Seite 261 und 262:
Seite 263 und 264:
Seite 265 und 266:
Seite 267 und 268:
Seite 269 und 270:
Seite 271 und 272:
A survey of internet searching skil
Seite 273 und 274:
Seite 275 und 276:
Seite 277 und 278:
Seite 279 und 280:
Seite 281 und 282:
Seite 283 und 284:
Seite 285 und 286:
Seite 287 und 288:
Kontextspezifische Erhebung von auf
Seite 289 und 290:
Seite 291 und 292:
Seite 293 und 294:
Seite 295 und 296:
Seite 297 und 298:
Seite 299 und 300:
Evaluation von Summarizing-Systemen
Seite 301 und 302:
Seite 303 und 304:
Seite 305 und 306:
Seite 307 und 308:
Seite 309 und 310:
Bedarf an Informationsspezialisten
Seite 311 und 312:
Seite 313 und 314:
Seite 315 und 316:
Seite 317 und 318:
Seite 319 und 320:
Seite 321 und 322:
Seite 323 und 324:
Mining qualitative data on human in
Seite 325 und 326:
Mining qualitative data on human in
Seite 327 und 328:
The Social Persona Approach 327 Abs
Seite 329 und 330:
The Social Persona Approach 329 The
Seite 331 und 332:
The Social Persona Approach 331 Fig
Seite 333 und 334:
„Mobile Tagging“: Konzeption un
Seite 335 und 336:
Seite 337 und 338:
Seite 339 und 340:
Seite 341 und 342:
Seite 343 und 344:
Seite 345 und 346:
User Interface Prototyping 345 User
Seite 347 und 348:
User Interface Prototyping 347 tige
Seite 349 und 350:
User Interface Prototyping 349 Deta
Seite 351 und 352:
User Interface Prototyping 351 Tabe
Seite 353 und 354:
User Interface Prototyping 353 Tabe
Seite 355 und 356:
User Interface Prototyping 355 Ausb
Seite 357 und 358:
Analyse und Evaluierung der Nutzung
Seite 359 und 360:
Seite 361 und 362:
Seite 363 und 364:
Seite 365 und 366:
Seite 367 und 368:
Seite 369 und 370:
Online-Beratungsk. für die Auswahl
Seite 371 und 372:
Seite 373 und 374:
Seite 375 und 376:
Seite 377 und 378:
Use, but verify 377 information abo
Seite 379 und 380:
Use, but verify 379 their fruits ye
Seite 381 und 382:
Problems and prospects of implement
Seite 383 und 384:
Seite 385 und 386:
Seite 387 und 388:
Seite 389 und 390:
Domänenübergreifende Phrasenextra
Seite 391 und 392:
Domänenübergreifende Phrasenextra
Seite 393 und 394:
Content Analysis in der Mathematik:
Seite 395 und 396:
Seite 397 und 398:
Seite 399 und 400:
Seite 401 und 402:
Seite 403 und 404:
Seite 405 und 406:
Das Konzept der Informativität 405
Seite 407 und 408:
Seite 409 und 410:
Seite 411 und 412:
Identification Systems Adoption in
Seite 413 und 414:
Seite 415 und 416:
Seite 417 und 418:
Seite 419 und 420:
Seite 421 und 422:
Seite 423 und 424:
Virtuelle Forschungsumgebungen 423
Seite 425 und 426:
Seite 427 und 428:
Seite 429 und 430:
Seite 431 und 432:
Seite 433 und 434:
Seite 435 und 436:
Der Streit um die Regelung des Zwei
Seite 437 und 438:
Seite 439 und 440:
Seite 441 und 442:
Seite 443 und 444:
Seite 445 und 446:
Seite 447 und 448:
Seite 449 und 450:
Seite 451 und 452:
Seite 453 und 454:
Seite 455 und 456:
Integrating industrial partners int
Seite 457 und 458:
Seite 459 und 460:
Seite 461 und 462:
Seite 463 und 464:
Seite 465 und 466:
Seite 467 und 468:
E-Learningkurs Globalisierung 467 E
Seite 469 und 470:
E-Learningkurs Globalisierung 469 B
Seite 471 und 472:
E-Learningkurs Globalisierung 471 f
Seite 473 und 474:
E-Learningkurs Globalisierung 473 b
Seite 475 und 476:
E-Learningkurs Globalisierung 475 A
Seite 477 und 478:
E-Learningkurs Globalisierung 477 N
Seite 479 und 480:
Social-Media-Marketing im Hochschul
Seite 481 und 482:
Seite 483 und 484:
Seite 485 und 486:
Seite 487 und 488:
Seite 489 und 490:
Seite 491 und 492:
Seite 493 und 494:
Seite 495 und 496:
Nutzungsanalyse des Deutschen Bildu
Seite 497 und 498:
A landmark in biomedical informatio
Seite 499 und 500:
3D-Modelle in bibliothekarischen An
Seite 501 und 502:
First Aid for Information Chaos in
Seite 503 und 504:
Multilingual Interface Usage 503 Mu
Seite 505 und 506:
Fassettierte Suche in Benutzeroberf
Seite 507 und 508:
Ordnung im Weltwissen 507 Zusammenf
Seite 509 und 510:
Die European Psychology Publication
Seite 511 und 512:
IUWIS (Infrastruktur Urheberrecht i
Seite 513 und 514:
Die Kulturgüterdatenbank der Regio
Seite 515 und 516:
Die Kulturgüterdatenbank der Regio
Seite 517 und 518:
TagTree: Exploring Tag-Based Naviga
Seite 519 und 520:
Link server aggregation with BEACON
Seite 521 und 522:
Link server aggregation with BEACON
Seite 523 und 524:
Wissenschaft trifft Praxis 523 Prax
Seite 525 und 526:
Wissenschaft trifft Praxis 525 Prax
Seite 527 und 528:
Wissenschaft trifft Praxis 527 der
Seite 529 und 530:
Mittendrin statt nur dabei 529 Stud
Seite 531 und 532:
Mittendrin statt nur dabei 531 Nadj
Seite 533 und 534:
Mittendrin statt nur dabei 533
Seite 535 und 536:
K.-M. Behr: Kreativer Umgang mit Co
Alle anzeigen

Information und Wissen: global, sozial und frei? - Fachhochschule ...

Sie wollen auch ein ePaper? Erhöhen Sie die Reichweite Ihrer Titel.

Template löschen?

Als Template speichern?