08.04.2013 Views

Extraction and Integration of MovieLens and IMDb Data - APMD


4.2. Integrated schema

Verónika Peralta

The integrated schema consists of 52 tables describing movies, the companies and persons related to movies, and the users who evaluated movies. Figure 23 shows the tables of the integrated schema, their primary keys (underlined attributes) and foreign keys (arrows between tables). Additional dotted lines relate some tables to a fictitious (not implemented) relation that describes persons. Shadow tables were used to build the referential but cannot be queried. Table 13 describes each table, its attributes and constraints.

I_Movies: join between IMDb and MovieLens
Attributes:
− MovieId: Numeric(4); MovieLens’ id
− TitleMovieLens: String(100)
− TitleImdb: String(250)
Constraints:
− Primary key: MovieId
− Unique: TitleImdb

I_Actors: master of actors
Attributes:
− Actor: String(75)
− MovieQuantity: Numeric(3); the number of movies played in
Constraints:
− Primary key: Actor

I_Actresses: master of actresses
Attributes:
− Actress: String(75)
− MovieQuantity: Numeric(3); the number of movies played in
Constraints:
− Primary key: Actress

I_Ages: age intervals
Attributes:
− AgeId: Numeric(2)
− MinAge: Numeric(2)
− MaxAge: Numeric(2)
Constraints:
− Primary key: AgeId

I_Biographies: biographies of actors, actresses, directors, producers and other people involved in movies
Attributes:
− Name: String(70)
− RealName: String(220)
− Birth: String(130); date and place of birth
− Decease: String(160); date, place and cause of death
− Height: String(15)
Constraints:
− Primary key: Name

I_Budgets: master of budget intervals
Attributes:
− BudgetUSD: Numeric(15,2); start of an interval (internal use)
Constraints:
− Primary key: BudgetUSD

I_Colors: master of colors
Attributes:
− Color: String(20)
Constraints:
− Primary key: Color

I_Countries: master of countries, providing several aggregation criteria and international country codes
Attributes:
− Country: String(40)
− LongName: String(110)
− DomainCode: String(2); internet domain
− ISO2Code: String(2); ISO 3166-1 alpha-2 code
− ISO3Code: String(3); ISO 3166-1 alpha-3 code
− UNnumericalCode: Numeric(3); United Nations country code
− IsCurrent: Numeric(1); 1 for current countries, 0 for old ones
− IsSovereign: Numeric(1); 1 for sovereign UN nations, 2 for sovereign non-UN nations, 3 for sovereign non-recognized nations, 4 for dependent territories and 5 for areas of special sovereignty
− Sovereign: String(40); name of sovereign nation (current country in the case of old countries)
− Continent: String(20)
− SecondaryContinent: String(20); for countries having territories in two continents (the one with the largest area is taken as the main continent)
− Area: Numeric(8)
− Inhabitants: Numeric(10)
Constraints:
− Primary key: Country
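
The table descriptions above can be turned into DDL fairly directly. The following is only a minimal sketch, using Python's standard sqlite3 module with an in-memory database. It covers three of the tables and, for I_Countries, only a subset of its attributes. Several things here are assumptions rather than part of the original report: the choice of SQLite, the mapping of Numeric(n) to INTEGER and String(n) to VARCHAR(n) (SQLite does not enforce the declared lengths), the CHECK constraints derived from the documented value ranges of IsCurrent and IsSovereign, the helper name build_schema, and the placeholder rows used to exercise the constraints.

```python
import sqlite3

# DDL transcribed from the attribute and constraint lists above.
# Type mapping and CHECK constraints are assumptions, not part of the report.
DDL = """
CREATE TABLE I_Movies (
    MovieId        INTEGER PRIMARY KEY,   -- Numeric(4); MovieLens id
    TitleMovieLens VARCHAR(100),
    TitleImdb      VARCHAR(250) UNIQUE    -- Unique: TitleImdb
);
CREATE TABLE I_Ages (
    AgeId  INTEGER PRIMARY KEY,           -- Numeric(2)
    MinAge INTEGER,
    MaxAge INTEGER
);
CREATE TABLE I_Countries (                -- subset of the documented attributes
    Country     VARCHAR(40) PRIMARY KEY,
    ISO2Code    VARCHAR(2),
    ISO3Code    VARCHAR(3),
    IsCurrent   INTEGER CHECK (IsCurrent IN (0, 1)),
    IsSovereign INTEGER CHECK (IsSovereign BETWEEN 1 AND 5),
    Continent   VARCHAR(20)
);
"""


def build_schema():
    """Create the sketched tables in a fresh in-memory database (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


conn = build_schema()

# Placeholder rows, invented for illustration only.
conn.execute(
    "INSERT INTO I_Movies VALUES (1, 'Example Movie (2000)', 'Example Movie (2000)')"
)

# A second movie with the same IMDb title violates the UNIQUE constraint.
try:
    conn.execute(
        "INSERT INTO I_Movies VALUES (2, 'Other Title (2000)', 'Example Movie (2000)')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# An IsSovereign code outside the documented 1..5 range violates the CHECK.
try:
    conn.execute(
        "INSERT INTO I_Countries VALUES ('Exampleland', 'XX', 'XXX', 1, 9, 'Europe')"
    )
    bad_code_rejected = False
except sqlite3.IntegrityError:
    bad_code_rejected = True

movie_count = conn.execute("SELECT COUNT(*) FROM I_Movies").fetchone()[0]
```

Declaring TitleImdb as UNIQUE alongside the MovieId primary key reflects the role of I_Movies as the join between the two sources: each MovieLens id is expected to map to at most one IMDb title, and vice versa.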
