Extraction and Integration of MovieLens and IMDb Data - APMD
Extraction and Integration of MovieLens and IMDb Data – Technical Report
I_MovieRatings
Global ratings on movies (not differentiated per user)
− MovieId: Numeric(4)
− Distribution: String(10); unknown meaning
− Votes: Numeric(8,2); quantity of votes (decimals are always zero)
− Rating: Number(4,2), value between 0 and 10
Primary key: MovieId
Foreign keys:
− MovieId: ref. I_Movies

I_MovieReleaseDates
Dates of different releases
− MovieId: Numeric(4)
− Country: String(40)
− ReleaseDate: String(20)
− ReleaseYear: Numeric(4)
− ReleaseInfo: String(90); further info (release city/festival)
Primary key: MovieId, Country, ReleaseDate
Foreign keys:
− MovieId: ref. I_Movies
− ReleaseYear: ref. I_Years
− Country: ref. I_Countries

I_MovieRunningTimes
Running times of movies
− MovieId: Numeric(4)
− Country: String(40)
− Duration: Numeric(5)
− RunningInfo: String(75); further info (version, episodes, media, …)
Primary key: MovieId, Country
Foreign keys:
− MovieId: ref. I_Movies
− Country: ref. I_Countries

I_MovieSounds
Sound mix of movies
− MovieId: Numeric(4)
− SoundMix: String(40)
Primary key: MovieId, SoundMix
Foreign keys:
− MovieId: ref. I_Movies
− SoundMix: ref. I_Sounds

I_MovieTypes
Types of movies
− MovieId: Numeric(4)
− Type: String(2); {C=cinema, TV=television, V=video, VG=video game, S=series, M=miniseries}
Primary key: MovieId
Foreign keys:
− MovieId: ref. I_Movies
− Type: ref. I_Types

I_MovieWriters
Writers of movies
− MovieId: Numeric(4)
− Writer: String(45)
− WriterInfo: String(110); further info (role, awards, …)
Primary key: MovieId, Writer
Foreign keys:
− MovieId: ref. I_Movies
− Writer: ref. I_Writers

I_MovieYears
Years of movies
− MovieId: Numeric(4)
− Year: Numeric(4)
− YearInfo: String(30); further info (shot years)
Primary key: MovieId, Year
Foreign keys:
− MovieId: ref. I_Movies
− Year: ref. I_Years

I_Occupations
User occupations
− OccupationId: Numeric(2)
− Occupation: String(25)
Primary key: OccupationId

I_Producers
Master of producers
− Producer: String(60)
− MovieQuantity: Numeric(3); the number of produced movies
Primary key: Producer

I_Ratings
Master of ratings
− Rating: Number(4,2), value between 0 and 10 (internal use, for classifying ratings)
Primary key: Rating

I_Revenues
Master of revenue intervals
− RevenueUSD: Numeric(15,2); start of an interval (internal use)
Primary key: RevenueUSD

I_Sounds
Master of sounds
− SoundMix: String(40)
Primary key: SoundMix

I_Types
Types of movies
− Type: String(2); {C=cinema, TV=television, V=video, VG=video game, S=series, M=miniseries}
− TypeDescription: String(10)
Primary key: Type

I_UserRatings
User ratings on movies
− UserId: Numeric(4)
− MovieId: Numeric(4)
− Rating: Number(1), value in {1,2,3,4,5}
− Timestamp: Numeric(9); represents milliseconds from an initial time
Primary key: UserId, MovieId
Foreign keys:
− UserId: ref. I_Users
− MovieId: ref. I_Movies
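
To make the key definitions above concrete, the following is a minimal sketch of how a few of these tables (I_Types, I_MovieTypes, I_Sounds, I_MovieSounds) might be declared, using Python's built-in sqlite3 module. The choice of SQLite is an assumption made for illustration only; the report does not state which database system hosts the integrated schema. I_Movies is not described in this part of the table, so the one-column stub below is hypothetical, as are the helper names build_schema and try_insert. SQLite does not enforce lengths such as String(2) or Numeric(4), so the column types here are approximations.

```python
import sqlite3

# Illustrative DDL for four of the tables listed above.
# I_Movies is a hypothetical one-column stub so the foreign keys resolve.
DDL = """
CREATE TABLE I_Movies (MovieId INTEGER PRIMARY KEY);  -- stub (assumed)

CREATE TABLE I_Types (
    Type            TEXT PRIMARY KEY
                    CHECK (Type IN ('C', 'TV', 'V', 'VG', 'S', 'M')),
    TypeDescription TEXT
);

-- Primary key is MovieId alone: each movie has exactly one type.
CREATE TABLE I_MovieTypes (
    MovieId INTEGER PRIMARY KEY REFERENCES I_Movies (MovieId),
    Type    TEXT REFERENCES I_Types (Type)
);

CREATE TABLE I_Sounds (SoundMix TEXT PRIMARY KEY);

-- Composite primary key: a movie may have several sound mixes.
CREATE TABLE I_MovieSounds (
    MovieId  INTEGER REFERENCES I_Movies (MovieId),
    SoundMix TEXT REFERENCES I_Sounds (SoundMix),
    PRIMARY KEY (MovieId, SoundMix)
);
"""


def build_schema() -> sqlite3.Connection:
    """Create the sketch schema in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign-key checks off unless enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn


def try_insert(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Run one INSERT; return False if a key or check constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this sketch, inserting a row into I_MovieTypes succeeds only when its MovieId exists in I_Movies and its Type exists in I_Types, and a second type for the same movie is rejected by the single-column primary key. I_MovieSounds, by contrast, accepts several rows per movie as long as each (MovieId, SoundMix) pair is unique.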