Using Solr search in a Dot Net environment - University of Essex


MATTHEW BRUMPTON
SENIOR SYSTEMS AND APPLICATIONS DEVELOPER
UK DATA ARCHIVE
UNIVERSITY OF ESSEX

DevCon1
12 APRIL 2013


Overview of today’s talk

……………………………………………………………………………………………………………………………….……………………………..

• UK Data Archive and the UK Data Service

• What is Solr?

• Why Solr?

• Life before Solr

• Current applications

• Architecture

  • Application

  • Server

• Working with Solr

• Acknowledgements

• Q&A



The UK Data Archive and the UK Data Service

……………………………………………………………………………………………………………………………….……………………………..

• Based at the University of Essex since 1967
• Curator of the largest collection of digital data in the social sciences and humanities in the UK
• See data-archive.ac.uk for more details
• Makes this collection available via the new UK Data Service
• UK Data Service also provides value-added services for UK Census data, government surveys and beyond
• UK Data Service includes the Universities of Essex, Manchester (Mimas, CCSR), Leeds, Southampton and Edinburgh (Edina), and University College London
• See ukdataservice.ac.uk for more details



What is Solr?

……………………………………………………………………………………………………………………………….……………………………..

• Open-source solution, largely supported by the Apache Software Foundation
• Written in Java
• Solr is built on Lucene
  • Lucene itself is just an indexing and search library
• Capable of indexing billions of items in a clustered environment
• Features include:
  • Full-text search
  • Faceted search
  • Highlighting
  • Rich document handling
  • Distributed search (SolrCloud)
  • Highly scalable
  • NoSQL
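As a rough illustration of two of the features listed above, faceted search and highlighting are both switched on through ordinary query parameters (`facet`, `facet.field`, `hl`, `hl.fl`). The sketch below only builds the query string. The facet field names `Subject` and `Year` are hypothetical, and `CommonDescription` is borrowed from the example query later in this talk, so treat them as assumptions about the schema.

```python
from urllib.parse import urlencode


def faceted_highlight_params(text, facet_fields, highlight_fields):
    """Build a Solr query string that asks for facet counts and highlighting."""
    params = [("q", text), ("facet", "true")]
    # One facet.field parameter per field whose values should be counted.
    params += [("facet.field", name) for name in facet_fields]
    # hl.fl takes a comma-separated list of fields to highlight matches in.
    params += [("hl", "true"), ("hl.fl", ",".join(highlight_fields))]
    return urlencode(params)


# Hypothetical field names, for illustration only.
query_string = faceted_highlight_params(
    "census", ["Subject", "Year"], ["CommonDescription"]
)
print(query_string)
```

The resulting string would be appended to a core's `/select?` URL.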



Why Solr?

……………………………………………………………………………………………………………………………….……………………………..

• Alternatives
  • Microsoft FAST (SharePoint 2010)
  • HP Autonomy (TNA)
  • Elasticsearch (also built on Lucene)
• Large community of users (http://wiki.apache.org/solr/PublicServers)
  • British Library
  • The Guardian
  • Ticketmaster
• Scalability
  • Capable of indexing billions of items in a clustered environment
• Performance
  • Can search millions of records in milliseconds
• Low cost
  • No purchasing costs



Life before Solr

……………………………………………………………………………………………………………………………….……………………………..

[Diagram: the many separate search and browse interfaces in use before Solr. Box labels recoverable from the slide, in slide order:]

• ESDS Qualidata search interface
• ESDS International search interface
• ESDS Data catalogue (SEARCH)
• ESDS Government Survey Finder
• Major Studies (BROWSE)
• Subject Headings (BROWSE)
• New releases (BROWSE)
• HASSET Subject Headings
• Subject Headings (BROWSE)
• Thematic pages (BROWSE)
• Comparable indicators (Long)
• ESDS Longitudinal search interface
• ESDS Government search interface
• Comparable geography (Long)
• ESDS Qualidata free text search interface
• ESDS Government Variable Search
• Variable Search
• ESDS Data Catalogue
• CESSDA catalogue (SEARCH, DATA)
• ESDS Government: publications citing ESDS International data
• ESDS International: publications citing ESDS International data
• ESDS Longitudinal: publications citing ESDS Longitudinal surveys
• Survey Question Bank (SEARCH)
• RELU-DSS (SEARCH)
• Census data catalogue (SEARCH)
• Nesstar (SEARCH, Data exploration)
• Quali Online (SEARCH, Data exploration)
• UKDA-Store (SEARCH)
• SDS (SEARCH)
• HDS (SEARCH)



Current applications

……………………………………………………………………………………………………………………………….……………………………..

• We now have one architecture that supports all our search interfaces
• The following applications have been built over the last 2 years:
  • UGEO - spatial content of studies
    http://geo.data-archive.ac.uk/
  • RELU - research data, publications and outputs
    http://relu.data-archive.ac.uk/explore-data/search-browse
  • Discover – UK Data Service data collections, support guides, case studies and related publications
    http://discover.ukdataservice.ac.uk
  • Variable search - variables and questions from survey datasets
    http://discover.ukdataservice.ac.uk/variables
  • UK Data Service website search
    http://ukdataservice.ac.uk/web-search.aspx



Application Architecture

……………………………………………………………………………………………………………………………….……………………………..

• Application
  • Umbraco / ASP.NET MVC
  • HTML
  • jQuery
• Business / Data Access
  • .NET libraries
  • UKDA.Search.Library
  • SolrNet
• Solr Cloud
  • Solr 4.1
  • Lucene
  • JVM 7
  • Tomcat 7
• Data
  • MS SQL
  • .NET 4 Console
  • Entity Framework / WebAPI
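The slide does not say how data reaches Solr from the data layer. One plausible reading is that a console job reads rows from the database and posts them to Solr's update handler as JSON. The sketch below shows that idea in Python rather than .NET, and only builds the HTTP request without sending it. The `/update?commit=true` path with a JSON content type is standard in Solr 4.x, though some configurations expose `/update/json` instead; the host, the core name and the row contents are reused from or invented alongside the later query example and are assumptions.

```python
import json
from urllib.request import Request


def build_update_request(base_url, core, rows):
    """Build (but do not send) a POST that adds documents to a Solr core."""
    # Solr accepts a JSON array of documents on its update handler.
    body = json.dumps(rows).encode("utf-8")
    return Request(
        f"{base_url}/{core}/update?commit=true",  # commit so the documents become searchable
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical rows, standing in for records read from the database.
rows = [
    {"id": "1", "CommonTitle": "Example study", "Date": "2013-04-12T00:00:00Z"},
    {"id": "2", "CommonTitle": "Another example study", "Date": "2012-01-01T00:00:00Z"},
]
request = build_update_request("http://dasolrc3:8983/solr", "Catalogue", rows)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a running Solr server.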



Server Architecture

……………………………………………………………………………………………………………………………….……………………………..

[Diagram: server architecture. The image is not recoverable from the slide text.]


Working with Solr

……………………………………………………………………………………………………………………………….……………………………..

• Sending a query to Solr
  http://dasolrc3:8983/solr/Catalogue/select?
    q="Computer+program"&
    sort=Date+desc&
    fl=CommonTitle%2C+CommonLink%2C+CommonDescription%2C+Date
• Solr responds with a JSON, XML or CSV result
• Built-in admin panel
  • http://dasolrc3:8983/solr/
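The query above can be assembled and its JSON response unpacked with a few lines of code. This is a minimal Python sketch rather than the .NET code used in production: `build_select_url` and `extract_docs` are hypothetical helper names, the `wt` parameter selects the response format, and the sample response is invented to show the standard `response.docs` layout of Solr's JSON output.

```python
import json
from urllib.parse import urlencode


def build_select_url(base_url, core, query, sort=None, fields=None, wt="json"):
    """Build a URL for a core's /select handler, escaping every parameter."""
    params = {"q": query}
    if sort:
        params["sort"] = sort
    if fields:
        params["fl"] = ",".join(fields)  # fl limits which fields are returned
    params["wt"] = wt  # response writer: json, xml or csv
    return f"{base_url}/{core}/select?{urlencode(params)}"


def extract_docs(response_text):
    """Return the list of matching documents from a Solr JSON response."""
    return json.loads(response_text)["response"]["docs"]


url = build_select_url(
    "http://dasolrc3:8983/solr",
    "Catalogue",
    '"Computer program"',
    sort="Date desc",
    fields=["CommonTitle", "CommonLink", "CommonDescription", "Date"],
)
print(url)

# Invented sample response, for illustration only.
SAMPLE_RESPONSE = """
{"responseHeader": {"status": 0},
 "response": {"numFound": 1, "start": 0,
              "docs": [{"CommonTitle": "Example study", "Date": "2013-04-12T00:00:00Z"}]}}
"""
for doc in extract_docs(SAMPLE_RESPONSE):
    print(doc["CommonTitle"], doc["Date"])
```

Letting `urlencode` do the escaping means the quotation marks around the phrase are sent as `%22` instead of being typed raw into the URL.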



Demo

……………………………………………………………………………………………………………………………….……………………………..

1. Code-behind request – stack trace

• HTML form
• Request: MVC controller (HTTP POST)
• UKDA.Search.Library
• SolrNet
• UKDA.Search.Library
• HTTP response
• Solr server stack: Tomcat 7, Java 7, Solr 4.2, Lucene

2. Ajax request – stack trace

• Request: GET
• Response: XML
• jQuery
• Request: REST MVC web service (HTTP POST)
• UKDA.Search.Library
• SolrNet
• UKDA.Search.Library
• Response: JSON
• jQuery
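The layering in both stack traces (web endpoint, then search library, then Solr client, and back again) can be sketched in a few lines. This is a Python analogue written for illustration, not the UKDA.Search.Library or SolrNet code: `SearchLibrary`, `search_endpoint` and `fake_solr` are hypothetical names, and the Solr call is replaced by a stub so the example runs without a server.

```python
import json


class SearchLibrary:
    """Stand-in for the search library layer between the web tier and Solr."""

    def __init__(self, solr_client):
        # solr_client is any callable that takes a dict of Solr parameters
        # and returns the parsed response (the role SolrNet plays in .NET).
        self._solr = solr_client

    def search(self, text, rows=10):
        raw = self._solr({"q": text, "rows": rows, "wt": "json"})
        return raw["response"]["docs"]


def search_endpoint(library, form):
    """Stand-in for the web service action: posted form in, JSON string out."""
    text = form.get("q", "").strip()
    if not text:
        return json.dumps({"results": [], "error": "empty query"})
    return json.dumps({"results": library.search(text)})


def fake_solr(params):
    """Stub Solr client that returns one invented document."""
    title = "Example study about " + params["q"]
    return {"response": {"numFound": 1, "docs": [{"CommonTitle": title}]}}


library = SearchLibrary(fake_solr)
print(search_endpoint(library, {"q": "census"}))
```

Keeping the Solr client behind the library means the web tier never builds Solr queries itself, which matches the way every request in the traces passes through the library on the way in and on the way out.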



Acknowledgements

……………………………………………………………………………………………………………………………….……………………………..

• The architecture shown was built with the input of the following people:
  • Project managers
    • Lucy Bell
    • Jack Kneeshaw
    • David Hall
    • Van Den Eynden
    • Tom Ensom
  • Solr
    • Oscar Dovao
  • .NET
    • Jonathan Sexton
    • Steve Warin
    • Sidharth Balakrishnan
    • John Payne
    • Darren Bell
    • Raju Golla
    • Sirisha Kakarla
    • Nic Dragos
    • Erkan Bostanci
    • Bayar Menzat



Thanks for listening

……………………………………………………………………………………………………………………………….……………………………..

• Any Questions?



CONTACT

……………………………………………………………………………………………………………………………….……………………………..

Matthew Brumpton

UK DATA ARCHIVE

UNIVERSITY OF ESSEX

WIVENHOE PARK

COLCHESTER

ESSEX CO4 3SQ

………………………………………

T +44 (0)1206 872001

E mbrump@data-archive.ac.uk

W data-archive.ac.uk
