12.09.2013 Views

Programme booklet (pdf)


Abstract
34

Nauze, Fabrice
Q-go

Clustering customer questions

CLIN 21 – CONFERENCE PROGRAMME

Q-go’s natural language search technology powers the search box of many corporate websites. Its NLP technology allows customers to ask questions in their own words and returns a small set of relevant answers. Hundreds of millions of questions have already been processed and answered with Q-go’s solution, providing us with a mine of data. In order to improve our knowledge of what customers are asking and to help further refine our core systems, Q-go needs a way to automatically cluster relevant queries from large sets of customer questions.

To achieve this goal, we tested several standard clustering methods on sets of customer questions. The talk is outlined as follows.

First, we will explain the specific challenges one faces when clustering customer questions (very short queries, typos, etc.). We will then present the clustering algorithms that have been tested (among others k-Means, GAAC hierarchical clustering, and mini-batch k-Means). Third, we will outline two different types of heuristics: the first is used to improve the quality of the vector representations feeding the clustering algorithms, the second to overcome the curse of dimensionality. Finally, the different methods will be evaluated and compared with respect to processing speed and intrinsic clustering quality (as well as practical usefulness).
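As a rough illustration of the kind of pipeline described above, the following minimal sketch clusters a handful of short, typo-prone questions with a cosine-based (spherical) k-Means. It is not Q-go's implementation. The character trigram representation, the farthest-first seeding, the function names, and the sample questions are all assumptions chosen for this example.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Sparse bag of character n-grams (assumed representation, tolerant of small typos)."""
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def normalize(vec):
    """Scale a sparse vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {key: v / norm for key, v in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two sparse unit vectors."""
    return sum(v * b.get(key, 0.0) for key, v in a.items())


def centroid(members):
    """Unit-length mean direction of a non-empty list of sparse vectors."""
    total = {}
    for vec in members:
        for key, v in vec.items():
            total[key] = total.get(key, 0.0) + v
    return normalize(total)


def farthest_first(vectors, k):
    """Deterministic seeding (an assumption): start at vectors[0], then keep adding
    the vector whose best similarity to the seeds chosen so far is lowest."""
    seeds = [vectors[0]]
    while len(seeds) < k:
        seeds.append(min(vectors, key=lambda v: max(cosine(v, s) for s in seeds)))
    return seeds


def kmeans(vectors, k, max_iter=20):
    """Spherical k-Means: assign each vector to its most similar centroid, then recompute."""
    centroids = farthest_first(vectors, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = centroid(members)
    return labels


# Hypothetical customer questions, including typos ("pasword", "ordr").
questions = [
    "how do i reset my password",
    "how do i reset my pasword",
    "reset password please",
    "where is my order",
    "where is my ordr",
    "track my order status",
]
vectors = [normalize(char_ngrams(q)) for q in questions]
labels = kmeans(vectors, k=2)
for label, question in zip(labels, questions):
    print(label, question)
```

Character n-grams are used here only because they let a misspelled query share most of its features with the correct spelling. Word-level or weighted representations would plug into the same `kmeans` function unchanged.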

Corresponding author: fabrice.nauze@q-go.com
