
Atelier Visualisation et extraction de connaissances - Irisa



CA has also been used successfully for content-based image analysis (Pham et al., 2009a) and image retrieval (Pham et al., 2009b).

3 Interactive exploration of telecommunication data

3.1 Preprocessing

Due to the nature of telecommunication data (some customers never make a call, others use a single service), we have proposed some criteria for constructing the contingency table.

- Data are collected every three months. A customer who needs a service has to use it repeatedly over several months before a usage habit forms, and this repeated use also reflects the customer relationship. A period of 3 months is sufficient for such a habit to form.

- Customers who use fewer than 3 services are ignored. These customers are easy to analyze directly by focusing on the services they use.

- Only services that are used at least 3 times are selected.

After this step, we have a contingency table crossing customers and services. The element f_ij of this table represents how often customer i uses service j.
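
The preprocessing criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the input format (a list of `(customer, service)` usage records for one three-month period), the function name `build_contingency_table`, the order in which the two filters are applied, and the reading of "used at least 3 times" as a total over the remaining customers are all assumptions.

```python
from collections import Counter, defaultdict

def build_contingency_table(records, min_services=3, min_uses=3):
    """Build the customers x services table f_ij from usage records.

    records: iterable of (customer, service) pairs, one pair per use
             (hypothetical input format, one three-month period).
    """
    # Count how often each customer uses each service.
    counts = defaultdict(Counter)
    for customer, service in records:
        counts[customer][service] += 1

    # Criterion: ignore customers who use fewer than min_services services.
    counts = {c: s for c, s in counts.items() if len(s) >= min_services}

    # Criterion: keep only services used at least min_uses times
    # (assumption: total uses over the remaining customers).
    totals = Counter()
    for per_service in counts.values():
        totals.update(per_service)
    services = sorted(sv for sv, n in totals.items() if n >= min_uses)

    customers = sorted(counts)
    # A Counter returns 0 for a service the customer never used.
    table = [[counts[c][sv] for sv in services] for c in customers]
    return customers, services, table

# Toy data with made-up customer and service names.
records = (
    [("alice", "voice")] * 4 + [("alice", "sms")] * 2 + [("alice", "intl")]
    + [("bob", "voice")] * 2 + [("bob", "sms")] + [("bob", "data")]
    + [("carol", "voice")] * 5
)
customers, services, table = build_contingency_table(records)
```

Here `carol` is dropped because she uses a single service, and the rarely used services `intl` and `data` are dropped by the second criterion.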

3.2 Projection on the factorial plane

In CA, customers and services are displayed on the same plane. Here, a “customer” point is displayed as a red circle and a “service” point as a blue circle. The user can select one customer (or service), or a group of them, by pointing at it.

To help the user with interpretation, the label of each service is displayed beside the corresponding “service” point. This immediately gives a general summary of the group of services, e.g. international calls.
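
For readers who want to reproduce the joint display, the coordinates of customers and services on a common plane can be obtained from a singular value decomposition of the standardized residuals of the contingency table. The sketch below uses this standard formulation of CA with NumPy; it is not taken from the paper, the function name `correspondence_analysis` and the toy table are hypothetical, and it assumes that no row or column of the table sums to zero (which the preprocessing criteria are meant to guarantee).

```python
import numpy as np

def correspondence_analysis(table):
    """Return row/column principal coordinates, eigenvalues and masses."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses (customers)
    c = P.sum(axis=0)                    # column masses (services)
    expected = np.outer(r, c)            # independence model
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sing, Vt = np.linalg.svd(S, full_matrices=False)
    eigenvalues = sing ** 2              # inertia carried by each axis
    # Principal coordinates: rows and columns share the same axes,
    # so both can be drawn on the same plane.
    row_coords = (U / np.sqrt(r)[:, None]) * sing
    col_coords = (Vt.T / np.sqrt(c)[:, None]) * sing
    return row_coords, col_coords, eigenvalues, r, c

# Toy table: 4 customers (rows) x 3 services (columns), made-up counts.
table = [[10, 2, 0],
         [8, 1, 1],
         [1, 9, 3],
         [0, 7, 6]]
rows, cols, eig, r, c = correspondence_analysis(table)

# Coordinates on the first factorial plane (axes 1 and 2).
customer_points = rows[:, :2]   # would be drawn as red circles
service_points = cols[:, :2]    # would be drawn as blue circles
```

Any plotting library can then draw `customer_points` and `service_points` on the same axes; choosing a different pair of columns corresponds to changing the displayed plane.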

To focus on the interesting customers and/or services, we display only those whose contributions to inertia are high, usually 2 or 3 times the average contribution. Other customers/services are not displayed. Since the total inertia on one axis is equal to the eigenvalue associated with this axis, the threshold is easy to determine. The user can also change the displayed axes (and hence the plane) in order to discover more groups of customers/services.
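
The contribution filter can be illustrated as follows. This is a minimal sketch assuming the usual CA definition, in which the contribution of point i to an axis is its mass times its squared coordinate divided by the eigenvalue of that axis; the function name `high_contribution_points`, the parameter `factor`, and the toy numbers are hypothetical.

```python
def high_contribution_points(masses, coords, factor=2.0):
    """Select points whose contribution to one axis is at least
    `factor` times the average contribution.

    masses: mass of each point (row or column profile weight)
    coords: principal coordinate of each point on the chosen axis
    """
    # Inertia on the axis; in CA this equals the axis eigenvalue.
    eigenvalue = sum(m * x * x for m, x in zip(masses, coords))
    # Relative contributions; they sum to 1 over all points.
    contributions = [m * x * x / eigenvalue for m, x in zip(masses, coords)]
    # Because contributions sum to 1, the average is simply 1 / n.
    average = 1.0 / len(contributions)
    keep = [i for i, ctr in enumerate(contributions)
            if ctr >= factor * average]
    return keep, contributions

# Toy axis with five equally weighted points (made-up values).
masses = [0.2, 0.2, 0.2, 0.2, 0.2]
coords = [1.2, -0.3, -0.3, -0.3, -0.3]
keep, contributions = high_contribution_points(masses, coords, factor=2.0)
```

In this toy case only the first point passes the threshold; the others would be hidden from the display.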

The displayed axes can be chosen using the quality of representation of the customer/service variables. The quality of representation of a projected point i (service i) on axis j is the squared cosine of the angle between the original point i and its projection on axis j. A squared cosine close to 1 means that the projected point is close to its original position.
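
As an illustration, the squared cosines of a point can be computed from its principal coordinates on all axes, since the squared distance of the point to the origin is the sum of its squared coordinates. The sketch below also picks the pair of axes on which a given point is best represented; this selection rule, the function names `squared_cosines` and `best_plane`, and the toy coordinates are assumptions made for the example and are not described in the source.

```python
def squared_cosines(coords):
    """Squared cosine of a point with each axis.

    coords: principal coordinates of ONE point on all axes.
    Assumes the point is not located exactly at the origin.
    """
    dist2 = sum(x * x for x in coords)   # squared distance to the origin
    return [x * x / dist2 for x in coords]

def best_plane(coords):
    """Return the two axis indices with the highest squared cosines."""
    cos2 = squared_cosines(coords)
    ranked = sorted(range(len(cos2)), key=lambda k: cos2[k], reverse=True)
    return tuple(sorted(ranked[:2])), cos2

# Made-up coordinates of one service on four axes.
coords = [0.1, 0.6, 0.0, 0.3]
plane, cos2 = best_plane(coords)
# Quality of representation in the chosen plane: sum of the two cos^2.
quality = cos2[plane[0]] + cos2[plane[1]]
```

A `quality` value close to 1 indicates that the point is almost entirely described by the selected plane.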

3.3 Hierarchical CA by visualization

We introduce here a method that allows us to discover hierarchical groups of services. The method is based on an interesting property of CA: the simultaneous representation of rows (customers) and columns (services). When CA projects customers and services on the same plane, their association is interpreted through the distance (in the plane) between them. “Service” points lying around a “customer” point or a group of “customer” points indicate a strong relation between these customers and services. We can apply another CA on these

F. Poulet, B. Le Grand: 9e Atelier Visualisation et Extraction de Connaissances 15
