02.11.2014 Views

Deliverable D3.1 First batch of resources ... - CESAR project

Deliverable D3.1 First batch of resources ... - CESAR project

Deliverable D3.1 First batch of resources ... - CESAR project

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Contract no. 271022<br />

Table <strong>of</strong> Contents<br />

Introduction ............................................................................................................................ 6<br />

1 HASRIL <strong>resources</strong> ............................................................................................................. 6<br />

1.1 Szeged Corpus ......................................................................................................................... 6<br />

1.2 Szeged Treebank ..................................................................................................................... 7<br />

1.3 Szeged Named Entity Recognition Corpus ............................................................................. 9<br />

1.4 Hungarian WordNet .............................................................................................................. 10<br />

1.5 Hungarian Webcorpus ........................................................................................................... 11<br />

1.6 Hunglish Corpus .................................................................................................................... 12<br />

1.7 morphdb.hu ............................................................................................................................ 14<br />

2 BME-TMIT <strong>resources</strong> ..................................................................................................... 15<br />

2.1 Mindentudás Speech Corpus ................................................................................................. 15<br />

2.2 Word level speech database for Hungarian ........................................................................... 16<br />

2.3 Hungarian BABEL ................................................................................................................ 23<br />

2.4 Hungarian Broadcast News Database.................................................................................... 27<br />

2.5 Sound Gesture Database ........................................................................................................ 32<br />

2.6 Hungarian Speech Emotion Database ................................................................................... 36<br />

3 FFZG <strong>resources</strong> ................................................................................................................ 41<br />

3.1 Croatian National Corpus ...................................................................................................... 41<br />

3.2 Croatian Morphological Lexicon........................................................................................... 44<br />

3.3 Croatian-English Parallel Corpus .......................................................................................... 47<br />

3.4 Croatian Lemmatisation Server ............................................................................................. 50<br />

3.5 Croatian Valency Lexicon ..................................................................................................... 53<br />

4 IPIPAN <strong>resources</strong>............................................................................................................. 56<br />

4.1 Polish Sejm Corpus ............................................................................................................... 56<br />

4.2 PoliMorf Inflectional Dictionary ........................................................................................... 63<br />

4.3 Polish WordNet ..................................................................................................................... 67<br />

4.4 Polish Named Entity Recognition Tool ................................................................................. 71<br />

4.5 1 million subcorpus <strong>of</strong> National Corpus <strong>of</strong> Polish ................................................................ 76<br />

4.6 Polish Named Entity Resources ............................................................................................ 82<br />

4.7 LUNA.PL Corpus .................................................................................................................. 85<br />

4.8 LUNA-WOZ.PL Corpus ....................................................................................................... 88<br />

5 ULodz <strong>resources</strong> ............................................................................................................... 90<br />

5.1 PELCRA Polish-English parallel corpora (CC-BY) ............................................................. 90<br />

5.2 PELCRA Polish-English parallel corpora (CC-BY-NC) .................................................... 101<br />

5.3 PELCRA Polish spoken corpus (CC-BY-NC) .................................................................... 107<br />

4

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!