What features are new in Oracle Data Mining 10g?
What mining capabilities does Oracle Data Mining support?
Oracle Data Mining provides programmatic access to six data mining algorithms embedded in Oracle Database. Data mining algorithms are machine-learning techniques for analyzing data for specific categories of problems. Different algorithms excel at different types of analysis.
Classification: Oracle Data Mining's Classification algorithms can predict binary or multi-class outcomes. In binary problems, each record either will or will not exhibit the modeled behavior. For example, a model could be built to predict whether a customer will churn or remain loyal. These algorithms can also make predictions for multi-class problems, where there are several possible outcomes. For example, a model could be built to predict which class of service each prospect will prefer.
Regression: In its simplest form, regression fits a line to the data. This line becomes a predictive model when the value of the dependent variable is not known; its value is predicted by the point on the line that corresponds to the values of the independent variables for that record. Oracle Data Mining provides both linear and non-linear regression models.
Clustering: Oracle Data Mining provides two Clustering algorithms for the segmentation of individuals (cases) in a dataset. Both the enhanced version of K-Means and the proprietary O-Cluster algorithm produce clusters with descriptive rules for membership. Clusters are organized hierarchically, with histograms provided for each attribute in a cluster.
Attribute Importance: Attribute Importance measures the predictive power of each attribute in classifying the target values and produces a list of attributes ranked by relative importance. This information can be used to reduce the size of the input data, increasing the speed of mining tasks.
Algorithm options: Minimum Description Length (MDL)
Feature Extraction: ODM Feature Extraction creates a new set of features by decomposing the original data. Feature extraction lets you describe the data with a number of features far smaller than the number of original dimensions (attributes). A feature is a combination of attributes in the data that is of special interest and captures important characteristics of the data. Some applications of feature extraction are latent semantic analysis, data compression, data decomposition and projection, and pattern recognition. Feature extraction can also be used to enhance the speed and effectiveness of supervised learning. For example, feature extraction can be used to extract the themes of a document collection, where documents are represented by a set of keywords and their frequencies. Each theme (feature) is represented by a combination of keywords. The documents in the collection can then be expressed in terms of the discovered themes.
Algorithm options: Non-negative Matrix Factorization (NMF)
Text Mining: Text mining is conventional data mining done using "text features." Text features are usually keywords, frequencies of words, or other document-derived features. Once you derive text features, you mine them just as you would any other data. Some of the applications for text mining include
Does Oracle Data Mining support RAC and the GRID?
Yes. ODM runs on RAC and takes advantage of it in two ways. First, individual jobs (mining tasks) can be distributed in parallel across RAC nodes automatically. Second, several of the algorithms execute in parallel to the extent that the database processes queries in parallel using RAC. Several of the algorithms, e.g., Naive Bayes (NB) and Adaptive Bayes Network (ABN), are written as SQL queries. When the database executes these queries, they automatically leverage RAC according to standard database parallelism. K-Means and O-Cluster do not leverage parallelism for model build, but do leverage parallelism for scoring. The new 10g algorithms, SVM and NMF, are written as C table functions and are not yet parallel.
Does Oracle Data Mining support PMML?
Can ODM import PMML models from other vendors?
A JDeveloper extension can be downloaded from the ODMr page of OTN and added to the JDeveloper 10g environment to access the Java code associated with the data mining operations of ODMr.
For example, a model is built and applied in ODMr, and the Java program that applies the model is generated by JDeveloper. Where can this program execute, and what is the normal execution environment?
Does Oracle Data Mining support neural networks?
Although ODM does not support neural networks explicitly, Support Vector Machines (SVM) in 10g can be applied to the same class of problems as neural networks and, in many cases, with better results. SVMs work with high-dimensional data, tend to generalize better, and are easier to tune and train. They are less prone to overfitting and require no early stopping or voting. Generally, SVM kernels map to activation functions and support vectors to nodes. |
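The mapping of kernels to activation functions and support vectors to nodes can be made concrete with a small sketch. The following Python example is illustrative only and does not use any ODM API: it assumes a sigmoid (tanh) kernel, and all numeric values, function names, and parameter names are hypothetical. It evaluates an SVM decision function and a one-hidden-layer network built from the same numbers, with one hidden node per support vector, and shows that the two produce the same value.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def svm_decision(x, support_vectors, coeffs, bias, gamma, c):
    """SVM decision value with a sigmoid kernel K(s, x) = tanh(gamma * s.x + c)."""
    return sum(
        a * math.tanh(gamma * dot(sv, x) + c)
        for sv, a in zip(support_vectors, coeffs)
    ) + bias


def network_output(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One-hidden-layer network with tanh activations and a linear output node."""
    hidden = [
        math.tanh(dot(w, x) + b)
        for w, b in zip(hidden_weights, hidden_biases)
    ]
    return dot(output_weights, hidden) + output_bias


# Hypothetical trained SVM (values chosen only for illustration).
support_vectors = [[1.0, 2.0], [-1.0, 0.5], [0.0, -1.5]]
coeffs = [0.7, -1.2, 0.4]  # alpha_i * y_i for each support vector
bias = 0.1
gamma, c = 0.5, -0.2

# Equivalent network: each support vector becomes one hidden node.
hidden_weights = [[gamma * v for v in sv] for sv in support_vectors]
hidden_biases = [c] * len(support_vectors)

x = [0.3, -0.8]
svm_value = svm_decision(x, support_vectors, coeffs, bias, gamma, c)
net_value = network_output(x, hidden_weights, hidden_biases, coeffs, bias)

print("SVM decision value:", svm_value)
print("Network output:    ", net_value)
```

The two values agree up to floating-point rounding. The correspondence shown here is exact only for the sigmoid kernel; other kernels, such as Gaussian, correspond to other kinds of hidden units.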
||||||||||
| Graphical Interface
What is the Oracle Data Miner? |
||||||||||
| Interfaces
What programming interfaces does Oracle Data Mining provide?
See Appendix A of the ODM Concepts Guide.
When will Oracle Data Mining support the Java Data Mining (JSR-73) standard interface?
Oracle Data Mining will provide a JDM interface in the next available release, which is currently 10gR2.
What will happen to the 10g Java interface?
The ODM Java interface in 10gR1 will be replaced by the Java Data Mining (JSR-73) standard interface in 10gR2. As such, the 10gR1 Java interface is being desupported in 10gR1 in preparation for the new interface.
Why is the 10g Java interface being desupported?
The goal is to avoid having two Java interfaces in the product, which would add code complexity, support, and maintenance burden and cause confusion in the marketplace and among customers and developers. We also want to avoid having one of the interfaces (ODM Java) not be interoperable with the other two (JDM and PL/SQL). JDM and PL/SQL will be fully interoperable. |
||||||||||
| Migration
Can users of ODM 9i migrate to the ODM 10g PL/SQL interface?
The PL/SQL interface is new in 10g and uses a different repository from the one used by the 9i and 10g Java interface. As such, ODM 9i models can only be migrated to the ODM 10g Java repository. Models created via the Java interface are not interoperable with the PL/SQL interface in 10g. Users of the PL/SQL interface need to redefine the Java objects as required for the PL/SQL interface and rebuild models, or recreate results. ODM 9i customers can migrate models created with the Java interface to ODM 10g and continue to use the Java interface. However, this Java interface will be replaced with the JDM interface in 10gR2.
What is the migration strategy for users of the 10gR1 Java interface in 10gR2?
The 10gR1 ODM Java API will be desupported in 12 months. This is motivated by the Oracle-led Java Data Mining (JDM) standard (JSR-73), which will be available in 10gR2, replacing the current Java API. As such, the current Java API will not exist in any form in 10gR2. The JDM API will be implemented as a layer on top of the 10g ODM PL/SQL API. This has the benefit of interoperability between the Java and PL/SQL interfaces. With 10gR1, the Java API and PL/SQL API are not interoperable, i.e., a model created in Java cannot be used in PL/SQL and vice versa. This resulted from our efforts to integrate data mining more tightly with the core RDBMS. At present, there is no utility to migrate Java-produced models to PL/SQL models. This is due to significant changes in the metadata and automated transformations, which make such a migration rather intractable. In general, applications periodically need to refresh their models, as models grow stale over time. In converting to 10gR2, applications would have to change code to use the JDM or PL/SQL APIs and rebuild their models.
To mitigate the migration problems, applications new to 10g can use the PL/SQL API, as this will be supported in 10gR2 along with model migration. Later, the user can determine whether the application should be converted to the new JDM API, where PL/SQL-generated models will still be usable. For ODM 9i applications, users can continue to use the 10gR1 ODM Java API, which supports migration from ODM 9i; however, moving to 10gR2 will require code changes and rebuilding of models, as noted above. If Java is the only option and a customer does not want to rework applications, the customer can wait until 10gR2 and JDM. |
||||||||||
| System
What platforms does Oracle Data Mining run on?
All platforms supported by Oracle, including Windows, Solaris, HP-UX, IBM AIX, Compaq Tru64, and Linux.
What are the system requirements to run Oracle Data Mining?
Oracle Data Mining runs in the Oracle 10g Database on all supported platforms. Oracle Partitioning is recommended for large data mining problems. |

