Bibliography on Automated Text Categorization

Author: unknown  Date: 2004-12-09

This is a large online bibliography on automated text categorization (ATC). You can view it, download it as a single file (ASCII text in BibTeX format), or access the fully searchable online version.

Overview

ATC is the activity of automatically building, by means of machine learning techniques, automated text classifiers, i.e., systems capable of assigning a text document to one or more thematic categories (or labels) from a predefined set. The following article contains a very comprehensive survey of the state of the art in ATC (see entry [Sebastiani02] in the bibliography): Fabrizio Sebastiani, Machine Learning in Automated Text Categorization, ACM Computing Surveys, 34(1):1-47, 2002.
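
As a rough illustration of the definition above, the following is a minimal sketch of a learned text classifier: a multinomial Naive Bayes model with add-one smoothing, written in plain Python. It is not taken from the bibliography or from the survey. The toy training set, the category names, and the function names (tokenize, train, classify) are all assumptions made for the example, and the sketch assigns exactly one category per document, whereas ATC in general allows several.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()


def train(labeled_docs):
    """Collect multinomial Naive Bayes counts from (text, category) pairs."""
    doc_counts = Counter()               # number of training documents per category
    word_counts = defaultdict(Counter)   # word frequencies per category
    vocab = set()                        # every word seen during training
    for text, category in labeled_docs:
        doc_counts[category] += 1
        for word in tokenize(text):
            word_counts[category][word] += 1
            vocab.add(word)
    return doc_counts, word_counts, vocab


def classify(model, text):
    """Return the category with the highest log posterior for the text."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        # Log prior: fraction of training documents in this category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical toy training set: (document text, thematic category).
TRAINING = [
    ("stocks fell as markets closed lower", "finance"),
    ("the bank raised interest rates", "finance"),
    ("the team won the match in extra time", "sports"),
    ("the striker scored two goals", "sports"),
]

model = train(TRAINING)
print(classify(model, "interest rates and markets"))
print(classify(model, "goals scored in the match"))
```

The classifier is built entirely from labeled examples rather than hand-written rules, which is the sense of "by means of machine learning techniques" in the definition. A multi-label variant could, for instance, train one binary classifier per category.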

Scope

In general, only references specific to ATC are considered pertinent to this bibliography. In particular, references that are considered pertinent are:

  • publications that discuss novel ATC methods, novel experimentation with previously known methods, or new resources for ATC experimentation;
  • publications that discuss applications of ATC (e.g., automated indexing for Boolean IR systems, filtering, etc.).

References that are not considered pertinent are:

  • publications that discuss techniques that are in principle useful for ATC (e.g., machine learning techniques, information retrieval techniques) but do not explicitly discuss their application to ATC;
  • publications that discuss related topics sometimes confused with ATC; these include (but are not limited to) text clustering (i.e., text classification by unsupervised learning) and text indexing;
  • technical reports and workshop papers. Only papers that have been formally published (i.e., in conferences and journals) are included in the bibliography, to keep it from growing unmanageably and to exclude material bound to become obsolete.

Updates

Please do send me new references, as well as corrections and additions (e.g., missing URLs and abstracts) to the existing ones. I'm routinely monitoring major conferences and journals several times a year, but there will always be articles that I unfortunately overlook, so please help me keep the bibliography as current and complete as possible.

Concerning URLs from which online copies of the papers can be downloaded: where possible, I have included URLs with unrestricted access (e.g., the authors' home pages). When no such URL was available, a URL with restricted access is sometimes provided instead (e.g., the ACM Digital Library or the IEEE Computer Society Digital Library, which are accessible to subscribers only). In that case, if you know of a URL with unrestricted access from which the paper is also available, please let me know and I will update the link.

Historical notes

This bibliography was originally created by Fabrizio Sebastiani.