ACM's KDD is the premier annual international forum where data mining researchers and practitioners from academia, industry, and government share their ideas, research results, and experiences. This year's event, KDD 2008, was held at the Loews Lake Las Vegas resort, and Jeff Bergman and I attended. Details of the program can be found at http://www.kdd2008.com/program.html, and a summary follows.
9:00 am - 5:00 pm
Full Day Workshop W1 - ADKDD'08
Full Day Workshop W2 - WEBKDD'08
Full Day Workshop W3 - Sensor-KDD
Full Day Workshop W4 - PinKDD'08
Full Day Workshop W5 - SNA-KDD
Full Day Workshop W13 - Multimedia Data Mining
9:00 am - 12:00 pm
Half Day Workshop W6 - KDD CUP and Mining Medical data
Half Day Workshop W7 - Multiple Information Sources
Half Day Workshop W11 - BIOKDD08
Half Day Workshop W12 - Mining for Business Applications
9:00 am - 12:00 pm
Tutorial - Mining Massive RFID, Trajectory, and Traffic Data Sets
Tutorial - Predictive Modeling with Social Networks
Tutorial - Mining Uncertain and Probabilistic Data: Problems, Challenges, Methods, and Applications
Tutorial - Detecting Clusters in Moderate-to-High Dimensional Data: Subspace Clustering, Pattern-based Clustering, and Correlation Clustering
2:00 pm - 5:30 pm
Half Day Workshop W8 - Large Scale Recommender Systems and Netflix Prize
Half Day Workshop W10 - Mining using Matrices and Tensors
2:00 pm - 5:00 pm
Tutorial - Blogosphere: Research Issues, Applications, and Tools
Tutorial - Graph Mining and Graph Kernels
Tutorial - Applied Text Mining
6:15 pm - 6:45 pm : Award Presentations
6:45 pm - 7:30 pm : Innovation Award Talk
Day 1 was very informative and provided a good learning experience. The program included several full-day workshops and tutorials; the tutorials and their presenters are listed below.
· J. Han, J. Lee, H. Gonzalez, X. Li, "Mining Massive RFID, Trajectory, and Traffic Data Sets"
Jiawei Han, Jae-Gil Lee, Hector Gonzalez, Xiaolei Li
Department of Computer Science, University of Illinois at Urbana-Champaign
· J. Neville, F. Provost, "Predictive Modeling with Social Networks"
Jennifer Neville, Purdue University
Foster Provost, New York University
· J. Pei, M. Hua, Y. Tao, X. Lin, "Mining Uncertain and Probabilistic Data: Problems, Challenges, Methods, and Applications"
Jian Pei, Simon Fraser University, Canada
Ming Hua, Simon Fraser University, Canada
Yufei Tao, The Chinese University of Hong Kong
Xuemin Lin, The University of New South Wales, Australia
· H. Kriegel, P. Kröger, A. Zimek, "Detecting Clusters in Moderate-to-High Dimensional Data: Subspace Clustering, Pattern-based Clustering, and Correlation Clustering"
Hans-Peter Kriegel, Peer Kröger, and Arthur Zimek
Institute for Informatics, Ludwig-Maximilians-Universität München, Germany
· H. Liu and N. Agarwal, "Blogosphere: Research Issues, Applications, and Tools"
Huan Liu, Arizona State University
Nitin Agarwal, Arizona State University
· R. Feldman, L. Ungar, "Applied Text Mining"
Since social networking was the prominent theme at the conference, I decided to get a head start by attending the half-day tutorial on "Predictive Modeling with Social Networks" by Jennifer Neville and Foster Provost. The abstract from the tutorial is as follows.
Recently there has been a surge of interest in methods for analyzing complex social networks: from communication networks, to friendship networks, to professional and organizational networks. The dependencies among linked entities in the networks present an opportunity to improve inference about properties of individuals, as birds of a feather do indeed flock together. For example, when deciding how to market a product to people in MySpace or Facebook, it may be helpful to consider whether a person's friends are likely to purchase the product.
This tutorial will explore the unique opportunities and challenges for modeling social network data. We will begin with a description of the problem setting, including examples of various applications of social network mining (e.g., marketing, fraud detection). We will then present a number of characteristics of social network data that differentiate it from traditional inference and learning settings, and outline the resulting opportunities for significantly improved inference and learning. We will discuss specific techniques for capitalizing on each of the opportunities in statistical models, and outline both methodological issues and potential modeling pathologies that are unique to network data. We will give links to the recent literature to guide study, and present results demonstrating the effectiveness of the techniques.
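As a rough illustration of the marketing example in the abstract (this is not code from the tutorial), the sketch below scores a person by the fraction of their friends who are already known to have purchased a product. The function name `neighbor_vote`, the toy network, and the labels are all hypothetical and chosen only for illustration; real relational models handle missing labels, link weights, and dependencies between predictions far more carefully.

```python
from typing import Dict, List, Optional


def neighbor_vote(
    friends: Dict[str, List[str]],
    purchased: Dict[str, bool],
    person: str,
) -> Optional[float]:
    """Return the fraction of a person's labeled friends who purchased.

    Friends without a known label are ignored. Returns None when the
    person has no friends with a known label.
    """
    known = [purchased[f] for f in friends.get(person, []) if f in purchased]
    if not known:
        return None
    # True counts as 1 and False as 0, so this is the share of buyers.
    return sum(known) / len(known)


# Hypothetical toy network: names and labels are made up for illustration.
friends = {
    "ann": ["bob", "cat", "dan"],
    "eve": ["dan"],
    "zed": [],
}
purchased = {"bob": True, "cat": True, "dan": False}

score_ann = neighbor_vote(friends, purchased, "ann")  # two of three friends purchased
score_eve = neighbor_vote(friends, purchased, "eve")  # her only friend did not purchase
score_zed = neighbor_vote(friends, purchased, "zed")  # no labeled friends to vote
```

Under this simple assumption, "ann" would be a better marketing target than "eve", while "zed" cannot be scored from the network at all and would need some other source of information.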