The objective of the ACE Program is to develop extraction technology to support automatic processing of source language data (in the form of natural text, and as text derived from ASR and OCR). This includes classification, filtering, and selection based on the language content of the source data, i.e., based on the meaning conveyed by the data. Thus the ACE program requires the development of technologies that automatically detect and characterize this meaning. The ACE research objectives are viewed as the detection and characterization of Entities, Relations, and Events.
The Linguistic Data Consortium (LDC) develops annotation guidelines, corpora and other linguistic resources to support the ACE Program. Some of these resources have been developed in cooperation with the TIDES Program, in support of TIDES Extraction evaluations.
ACE annotators tag broadcast transcripts, newswire and newspaper data in English, Chinese and Arabic, producing both training and test data for common research task evaluations. There are three primary ACE annotation tasks corresponding to the three research objectives: Entity Detection and Tracking (EDT), Relation Detection and Characterization (RDC), and Event Detection and Characterization (EDC). A fourth annotation task, Entity Linking (LNK), groups all references to a single entity and all its properties together into a Composite Entity.
Entity Detection and Tracking (EDT) is the core annotation task, providing the foundation for all remaining tasks. The current ACE task identifies seven types of entities: Person, Organization, Location, Facility, Weapon, Vehicle and Geo-Political Entity (GPE). Each type is further divided into subtypes (for instance, Organization subtypes include Government, Commercial, Educational, Non-profit and Other). Annotators tag all mentions of each entity within a document, whether named, nominal or pronominal. For every mention, the annotator identifies the maximal extent of the string that represents the entity, and labels the head of each mention. Nested mentions are also captured. Each entity is classified according to its type and subtype. Each entity mention is further tagged according to its class: specific, generic, attributive, negatively quantified or underspecified. During the LNK annotation task, annotators review the entire document to group mentions of the same entity together; they also label cases of metonymy, where the name of one entity is used to refer to another entity (or entities) related to it.
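The EDT description above can be pictured as a small data model. The following Python sketch is only an illustration and is not the ACE annotation format: the class names, field names, use of character offsets, the subtype labels and the example sentence are all assumptions made here.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple

Span = Tuple[int, int]  # assumed convention: character offsets [start, end)


class EntityType(Enum):
    PERSON = "Person"
    ORGANIZATION = "Organization"
    LOCATION = "Location"
    FACILITY = "Facility"
    WEAPON = "Weapon"
    VEHICLE = "Vehicle"
    GPE = "Geo-Political Entity"


class MentionLevel(Enum):
    NAMED = "named"
    NOMINAL = "nominal"
    PRONOMINAL = "pronominal"


class MentionClass(Enum):
    SPECIFIC = "specific"
    GENERIC = "generic"
    ATTRIBUTIVE = "attributive"
    NEGATIVELY_QUANTIFIED = "negatively quantified"
    UNDERSPECIFIED = "underspecified"


@dataclass
class Mention:
    extent: Span            # maximal extent of the string for the entity
    head: Span              # head of the mention, inside the extent
    level: MentionLevel
    mention_class: MentionClass


@dataclass
class Entity:
    entity_id: str          # hypothetical identifier scheme
    entity_type: EntityType
    subtype: str
    mentions: List[Mention] = field(default_factory=list)


def find_span(text: str, sub: str) -> Span:
    """Return the offsets of the first occurrence of `sub` in `text`."""
    start = text.find(sub)
    if start < 0:
        raise ValueError(f"{sub!r} not found")
    return (start, start + len(sub))


def span_text(text: str, span: Span) -> str:
    return text[span[0]:span[1]]


# Invented example sentence; the subtype labels are placeholders.
text = "The president of Acme Corp. said he would resign."

person = Entity("E1", EntityType.PERSON, "Unspecified", [
    Mention(find_span(text, "The president of Acme Corp."),
            find_span(text, "president"),
            MentionLevel.NOMINAL, MentionClass.SPECIFIC),
    Mention(find_span(text, "he"), find_span(text, "he"),
            MentionLevel.PRONOMINAL, MentionClass.SPECIFIC),
])

# Nested mention: "Acme Corp." lies inside the extent of the first mention.
org = Entity("E2", EntityType.ORGANIZATION, "Commercial", [
    Mention(find_span(text, "Acme Corp."), find_span(text, "Acme Corp."),
            MentionLevel.NAMED, MentionClass.SPECIFIC),
])
```

Grouping both mentions under one `Entity` object mirrors, in a loose way, the step in which annotators group mentions of the same entity together; metonymy labels are left out of this sketch.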
Relation Detection and Characterization (RDC) involves the identification of relations between entities. This task was added in Phase 2 of ACE. The current definition of RDC targets physical relations including Located, Near and Part-Whole; social/personal relations including Business, Family and Other; a range of employment or membership relations; relations between artifacts and agents (including ownership); affiliation-type relations like ethnicity; relationships between persons and GPEs like citizenship; and finally discourse relations. For every relation, annotators identify two primary arguments (namely, the two ACE entities that are linked) as well as the relation's temporal attributes. Relations that are supported by explicit textual evidence are distinguished from those that depend on contextual inference on the part of the reader.
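As a rough illustration of the RDC description, a relation can be thought of as a typed link between two entities, with a flag for explicit versus inferred support. The record layout, field names and entity identifiers in this Python sketch are assumptions made for the example, not the ACE file format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Relation:
    relation_type: str               # e.g. "Located", "Near", "Part-Whole"
    arg1: str                        # id of the first linked entity (hypothetical ids)
    arg2: str                        # id of the second linked entity
    explicit: bool                   # True: explicit textual evidence;
                                     # False: depends on contextual inference
    temporal: Optional[str] = None   # free-text temporal attribute, if any


def explicit_only(relations: List[Relation]) -> List[Relation]:
    """Keep only relations supported by explicit textual evidence."""
    return [r for r in relations if r.explicit]


# Invented example records.
relations = [
    Relation("Located", "E1", "E2", explicit=True),
    Relation("Family", "E3", "E4", explicit=False),
]
explicit_relations = explicit_only(relations)
```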
ACE Phase Three adds a new challenge: Event Detection and Characterization (EDC). In EDC, annotators identify and characterize five types of events in which EDT entities participate. Targeted types include Interaction, Movement, Transfer, Creation and Destruction events. Annotators tag the textual mention or anchor for each event, and categorize it by type and subtype. They further identify event arguments (agent, object, source and target) and attributes (temporal, locative as well as others like instrument or purpose) according to a type-specific template.
In future phases of ACE, annotators will identify additional event types and will characterize relations between events.
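The idea of filling event arguments according to a type-specific template can be sketched as below. This is a hedged illustration only: the text does not say which roles each event type allows, so `EXAMPLE_TEMPLATE` is invented for the example, as are the field names, the subtype placeholder, the entity ids and the sample sentence.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, FrozenSet, List, Tuple


class EventType(Enum):
    INTERACTION = "Interaction"
    MOVEMENT = "Movement"
    TRANSFER = "Transfer"
    CREATION = "Creation"
    DESTRUCTION = "Destruction"


@dataclass
class Event:
    event_type: EventType
    subtype: str
    anchor: Tuple[int, int]          # character offsets of the event anchor
    arguments: Dict[str, str] = field(default_factory=dict)   # role -> entity id
    attributes: Dict[str, str] = field(default_factory=dict)  # e.g. temporal, locative


Template = Dict[EventType, FrozenSet[str]]

# HYPOTHETICAL template: the role sets per type are made up for illustration.
EXAMPLE_TEMPLATE: Template = {
    EventType.MOVEMENT: frozenset({"agent", "object", "source", "target"}),
    EventType.CREATION: frozenset({"agent", "object"}),
}


def invalid_roles(event: Event, template: Template) -> List[str]:
    """Return, sorted, the argument roles the template does not allow."""
    allowed = template.get(event.event_type, frozenset())
    return sorted(role for role in event.arguments if role not in allowed)


# Invented example sentence and entity ids.
text = "The team traveled from Paris to Rome."
start = text.find("traveled")

move = Event(EventType.MOVEMENT, "Unspecified", (start, start + len("traveled")),
             arguments={"agent": "E1", "source": "E2", "target": "E3"},
             attributes={"temporal": "unknown"})

bad = Event(EventType.CREATION, "Unspecified", (0, 0),
            arguments={"agent": "E1", "source": "E2"})
```

Keeping the template as a plain mapping passed into `invalid_roles` means that a different set of role constraints can be swapped in without changing the `Event` class.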

