General Computer Science 320201 GenCS I & II Lecture ... - Kwarc

engine extracts salient features from the documents and stores them in a special data structure (usually tree-like) that can be queried instead of the documents themselves. In the prevalent information retrieval algorithms, the salient feature is a word frequency vector.

Searching for Documents Efficiently: Indexing

Problem: We cannot search the WWWeb linearly (even with $10^6$ computers: ≥ $10^{15}$ B).
Idea: Write an “index” and search that instead (like the index in a book).
Definition 569 Search engine indexing analyzes data and stores key/data pairs in a special data structure (the search index) to facilitate efficient and accurate information retrieval.
Idea: Use the words of a document as index (multiword index). The key for a document is the vector of word frequencies.
[Figure: documents D1 = (t1,1, t1,2, t1,3) and D2 = (t2,1, t2,2, t2,3) drawn as vectors in a space with axes term 1, term 2, term 3]
©: Michael Kohlhase 375

Note: The word frequency vectors used in the “vector space model” for information retrieval are very high-dimensional; the dimension is the number of distinct words in the document corpus. Millions of dimensions are usual. However, linguistic methods like “stemming” (reducing words to word stems) are used to bring down the number of words in practice.

Once an answer set has been determined, the results have to be sorted so that they can be presented to the user. As users have a limited attention span (they will look at three to eight results at most before refining a query), it is important to rank the results so that the hits that contain information relevant to the user’s information need appear early. This is a very difficult problem, as it involves guessing the intentions and information context of users, to which the search engine has no access.

Ranking Search Hits: e.g. Google’s Pagerank

Problem: There are many hits, need to sort them by some criterion (e.g. importance).
Idea: A web site is important, . . . if many other web pages hyperlink to it.
Refinement: . . . , if many important web pages hyperlink to it.
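
The word-frequency index from the indexing discussion above can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the function names (`tokenize`, `build_index`, `search`), the sample documents, and the scoring by summed word frequency are all hypothetical choices, and no stemming is performed.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words (no stemming)."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """Build a search index mapping each word to {doc_id: frequency}.

    `docs` is a dict from document id to document text.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for word, count in Counter(tokenize(text)).items():
            index[word][doc_id] = count
    return index


def search(index, query):
    """Return the ids of documents containing every query word.

    Hits are ordered by the summed frequency of the query words
    (an assumed, very crude relevance score), ties broken by id.
    """
    words = tokenize(query)
    if not words:
        return []
    postings = [index.get(word, {}) for word in words]
    hits = set(postings[0])
    for posting in postings[1:]:
        hits &= set(posting)
    scores = {doc: sum(p[doc] for p in postings) for doc in hits}
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


# Hypothetical three-document corpus.
docs = {
    "D1": "the index of a book lists words",
    "D2": "a search engine builds an index of the web; the index is queried",
    "D3": "linear search over the web is too slow",
}
index = build_index(docs)
```

A query such as `search(index, "index")` only consults the index entry for the word and never rescans the document texts, which is the point of indexing.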
Definition 570 Let A be a web page that is hyperlinked from web pages $S_1, \ldots, S_n$, then

$$PR(A) = 1 - d + d\left(\frac{PR(S_1)}{C(S_1)} + \cdots + \frac{PR(S_n)}{C(S_n)}\right)$$

where $C(W)$ is the number of links in a page $W$ and $d = 0.85$.

Getting the ranking right is a determining factor for the success of a search engine. In fact, the early success of Google was based on the pagerank algorithm discussed above (and the fact that they figured out a revenue stream using text ads to monetize searches).

The final step for a web search engine is answer composition, at least if the answer is addressed to a human user. The main task here is to assemble those information fragments that the user needs to determine whether the hit described contains information relevant to the respective information need.

Answer Composition in Search Engines

Answers: To present the search results we need to address:
  Hits and their context
  format conversion
  caching
Advertising: to finance the service
  advertiser can buy search terms
  ads correspond to search interest
  advertiser pays by click.

Due to the gigantic size of the Internet, search engines are extremely resource-hungry web applications. The precise figures about the computational resources of the large internet companies are well-kept trade secrets, but the following figure should give an intuition of the scales involved.
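
Definition 570 defines PR(A) in terms of the ranks of the pages linking to A, so the ranks are usually approximated by repeated substitution. The following is a minimal sketch of that idea, not a description of how any production search engine computes it. The function name `pagerank`, the starting value 1.0, the fixed iteration count, and the sample link graph are assumptions made for illustration; each page is also assumed to list every link target at most once.

```python
def pagerank(links, d=0.85, iterations=100):
    """Approximate PR from Definition 570 by repeated substitution.

    `links` maps each page to the list of pages it hyperlinks to
    (assumed free of duplicates), so C(W) is len(links[W]).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = {page: 1.0 for page in pages}  # assumed starting value
    for _ in range(iterations):
        new_pr = {}
        for a in pages:
            # Sum PR(S)/C(S) over all pages S that hyperlink to A.
            incoming = sum(
                pr[s] / len(links[s])
                for s in pages
                if a in links.get(s, ())
            )
            new_pr[a] = 1 - d + d * incoming
        pr = new_pr
    return pr


# Hypothetical link graph: A and B both link to C, and C links back to A.
ranks = pagerank({"A": ["C"], "B": ["C"], "C": ["A"]})
```

In this sample graph, C is linked from two pages and ends up with the highest rank; A inherits most of C's rank through the single link from C; B, which nobody links to, stays at the minimum value 1 − d.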