
STING: A Statistical Information Grid Approach to Spatial D

Author: unknown  Date: 2004-12-04

In general, spatial data mining, or knowledge discovery in spatial databases, is the extraction of implicit knowledge, spatial relations, and interesting characteristics and patterns that are not explicitly represented in the databases. These techniques can play an important role in understanding spatial data and in capturing intrinsic relationships between spatial and nonspatial data. Moreover, the discovered relationships can be used to present data concisely and to reorganize spatial databases to accommodate data semantics and achieve high performance. Spatial data mining has wide applications in many fields, including geographic information systems (GIS), image database exploration, and medical imaging [Che97, Fay96a, Fay96b, Kop96a, Kop96b].

The amount of spatial data obtained from satellites, medical imagery, and other sources has been growing tremendously in recent years. A crucial challenge in spatial data mining is the efficiency of the mining algorithms, given the often huge amount of spatial data and the complexity of spatial data types and spatial access methods. In this paper, we introduce a new STatistical INformation Grid-based method (STING) to efficiently process many common “region-oriented” queries on a set of points. Region-oriented queries are defined more precisely later; informally, they ask for the selection of regions satisfying certain conditions on density, total area, etc.

This paper is organized as follows. We first discuss related work in Section 2. We propose our statistical information grid hierarchical structure and discuss the query types it can support in Sections 3 and 4, respectively. The general algorithm, as well as a detailed example of processing a query, is given in Section 5. We analyze the complexity of our algorithm in Section 6. In Section 7, we analyze the quality of STING’s result and propose a sufficient condition under which STING is guaranteed to return the correct result. The limiting behavior of STING is discussed in Section 8, and in Section 9 we analyze the performance of our method. Finally, we offer our conclusions in Section 10.
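To make the informal notion of a region-oriented query concrete, the following is a minimal sketch of a density query over a flat, uniform grid. It is not the STING algorithm itself: the hierarchical structure and its stored statistics are defined in later sections and are not reproduced here. The function name `dense_cells`, its parameters, and the sample points are all assumptions made for illustration.

```python
from collections import Counter


def dense_cells(points, cell_size, min_density):
    """Return the grid cells whose point density is at least min_density.

    points      -- iterable of (x, y) coordinates
    cell_size   -- side length of each square grid cell (hypothetical parameter)
    min_density -- threshold in points per unit area (hypothetical parameter)

    The result maps a cell index (column, row) to that cell's density.
    """
    # Count how many points fall into each cell of a uniform grid.
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    area = cell_size * cell_size
    # Keep only the cells that satisfy the density condition.
    return {
        cell: n / area
        for cell, n in counts.items()
        if n / area >= min_density
    }


# Illustrative data only: three points share cell (0, 0), two cells hold one each.
points = [(0.1, 0.2), (0.4, 0.9), (0.8, 0.3), (2.5, 2.5), (5.1, 0.2)]
print(dense_cells(points, cell_size=1.0, min_density=2.0))
```

This flat version visits every point once per query. A grid that keeps per-cell summaries, as the method introduced here does, can instead answer such a query from the stored statistics.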


[Lu93] proposed two generalization-based algorithms: a spatial-data-dominant algorithm and a non-spatial-data-dominant algorithm. Both require that a generalization hierarchy be given explicitly by experts or generated automatically. However, such a hierarchy may not exist, or the hierarchy given by the experts may not be entirely appropriate in some cases. The quality of the mined characteristics depends heavily on the structure of the hierarchy. Moreover, the computational complexity is O(N log N), where N is the number of spatial objects. Given these disadvantages, there have been efforts to find algorithms that do not require a generalization hierarchy, that is, algorithms that can discover characteristics directly from the data. This is the motivation for applying clustering analysis in spatial data mining, where it is used to identify regions occupied by points satisfying specified conditions.
