ACE Data Overview

Author: unknown    Date: 2004-11-30

Project Specifications: ACE Data Overview

Fields: Corpus; Data (words); Tasks and Languages; Annotation Task Definition; Sources; Phase; Evaluations; Availability
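
Corpus: ACE-Pilot
Data (words): 15K Pilot
Tasks and Languages: Entities-Pilot: English
Annotation Task Definition: Entity Detection and Tracking: Phase1 v2.2
Sources: TDT-2, Newspaper
Phase: ACE Pilot
Evaluations: May 2000; November 2000
Availability: Currently ACE/TIDES only; slated for future publication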
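
Corpus: ACE-1
Data (words):
  180K Train
  45K Eval
Tasks and Languages: Entities: English
Annotation Task Definition: Entity Detection and Tracking: Phase1 v2.2
Sources: TDT-2, Newspaper
Phase: ACE PHASE1
Evaluations: February 2002
Availability: Currently ACE/TIDES only; slated for future publication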

Corpus: ACE-2
Data (words):
  180K Train (from ACE-1 Train)
  45K Dev/Test (from ACE-1 Eval)
  45K Eval (new)
Tasks and Languages: Entities (revised treatment), Relations: English
Annotation Task Definition:
  EDT Annotation Guidelines V2.5
  RDC Annotation Guidelines V3.6
Sources: TDT-2, Newspaper
Phase: ACE PHASE2
Evaluations: September 2002
Availability: LDC Publication LDC2003T11
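
Corpus: ACE-2 EELD Supplement
Data (words):
  30K Train (new)
  20K Eval (new)
Tasks and Languages: Entities, Relations: English
Annotation Task Definition:
  EDT Annotation Guidelines v2.5
  RDC Annotation Guidelines V3.6
Sources: RCK domain
Phase: EELD
Evaluations: September 2002
Availability: EELD/ACE only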

Corpus: ACE2003 Training Data
Data (words):
  100K/lang Train (new)
  50K/lang Eval (new)
Tasks and Languages:
  Entities: English, Chinese, Arabic
  Relations: English, Chinese
Annotation Task Definition:
  EDT Annotation Guidelines v2.5
  RDC Annotation Guidelines V3.6
Sources: TDT-4
Phase: ACE PHASE2
Evaluations: TIDES Extraction, September 2003
Availability: LDC Publication LDC2004T09

Corpus: ACE 2004 Pilot Data
Data (words): 25K English Pilot (new)
Tasks and Languages: English: Entities, Relations, Events
Annotation Task Definition:
  English Entity Guidelines V4.2.6
  English Linking Guidelines V3.0
  English Relations Guidelines V4.3.2
  English Events Guidelines V2.0
Phase: Spring 2004 Mid-Course Correction Workshop
Evaluations: 2004 Pilot Study, February 2004
Availability: ACE2004 Pilot Corpus: LDC2004E03; contact the LDC

Corpus: ACE2004 Training Data
Data (words):
  150K Train/lang (new)
  50K Eval/lang
Tasks and Languages:
  Entities: English, Chinese, Arabic
  Relations: English, Chinese, Arabic
Annotation Task Definition:
  English Entity Guidelines V4.2.6
  English Linking Guidelines V3.0
  English Relations Guidelines V4.3.2
  Chinese Entity Guidelines V4.2.4
  Chinese Linking Guidelines V2.0
  Chinese Relations Guidelines V4.3
  Arabic Entity Guidelines V4.2.3
  Arabic Linking Guidelines V1.0
  Arabic Relations Guidelines V4.3
Sources: TDT-4; Chinese Treebank; Arabic Treebank; Switchboard; Fisher
Phase: ACE PHASE3
Evaluations: ACE Program/TIDES Extraction, September 2004
Availability: ACE/TIDES sites can obtain the following corpora by contacting LDC:
  ACE/TIDES Extraction 2004 Training Data: LDC2004E17, LDC2005T09
  ACE/TIDES Extraction 2004 Training Data - Consistency Study: LDC2004E39
  ACE/TIDES Extraction 2004 Evaluation/Test Data: LDC2004E51
  ACE/TIDES Extraction 2004 Evaluation Data - Consistency Study: LDC2004E40

Corpus: ACE 2005
Data (words):
  Training Data (new):
    English: 260K words
    Chinese: 308K characters (205K words)
    Arabic: 100K words
  Evaluation Data (new):
    English, Chinese and Arabic: 50K words
Tasks and Languages: Entities, Relations, Events: English, Chinese, Arabic
Annotation Task Definition:
  English-Entities-Guidelines_v5.6.1.pdf
  English-Values-Guidelines_v1.2.4.pdf
  English-Relations-Guidelines_v5.8.3.pdf
  English-Events-Guidelines_v5.4.3.pdf
  English-TimestampingGuidelines_v3.pdf
  English-TIMEX2-Guidelines_v0.1.pdf
  Chinese-Entities-Guidelines_v5.5.pdf
  Chinese-Values-Guidelines_v1.1.2.pdf
  Chinese-Relations-Guidelines_v5.5.1.pdf
  Chinese-Events-Guidelines_v5.5.1.pdf
  Chinese-TIMEX2-Guideline-Summary_v1.2.pdf
  Chinese-Timestamping-Guidelines_v2.pdf
  Arabic-Entities-Guidelines_v5.3.3.pdf
  Arabic-Values-Guidelines_v1.2.3.pdf
  Arabic-Relations-Guidelines_v5.3.4.pdf
  Arabic-Events-Guidelines_v5.4.4.pdf
Sources: Newswire; Broadcast News; Broadcast Conversation; Weblogs; Web Forums; English Fisher Telephone Transcripts
Phase: ACE 2005
Evaluations: November 2005
Availability: ACE sites can obtain the following corpus by contacting LDC:
  ACE 2005 Multilingual Training Data V6.0: LDC2005E18
