28.11.2016 Views

Recommender System with Apache Spark Apache Kafka and Elasticsearch

ApacheBigDataEU16-NPentreath

ApacheBigDataEU16-NPentreath

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Scoring<br />

Locality Sensitive Hashing<br />

Performance<br />

• LSH hashes each input vector into L “hash<br />

tables”. Each table contains a “hash signature”<br />

created by applying k hash functions.<br />

• St<strong>and</strong>ard for cosine similarity is Sign R<strong>and</strong>om<br />

Projections<br />

• At indexing time, create a “bucket” by combining<br />

hash table id <strong>and</strong> hash signature<br />

• Store buckets as part of item model metadata<br />

• At scoring time, filter c<strong>and</strong>idate set using term<br />

filter on buckets of query item<br />

• Tune LSH parameters to trade off speed /<br />

accuracy<br />

• LSH coming soon to <strong>Spark</strong> ML – SPARK-5992

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!