The Use of Java in Machine Learning

Open Source Machine Learning Tools in Java

There are a number of Machine Learning tools written in Java, but the following are the most popular with the ML community. These GPL tools are widely adopted for Computer Science courses in Machine Learning at universities around the world (refer to the Resources section for download links).

  • WEKA (Waikato Environment for Knowledge Analysis): This package is by far the best of all Java Machine Learning tools available on the Internet. The majority of the methods discussed in this article are implemented in WEKA. Because it is an ongoing project, expect to see new ML methods added to its API.
  • YALE (Yet Another Learning Environment): YALE is an environment for Machine Learning experiments. Experiments can be made up of a large number of arbitrarily nestable operators and their setup is described by XML files. YALE is used for both research and real-world learning tasks. The set of operators in YALE includes:
    • ML methods such as support vector machines for regression and classification, decision tree learners, clustering algorithms, and a wrapper to all Weka classifiers (learners) and clusterers.
    • Feature selection and generation, such as forward selection, backward elimination, and several genetic algorithms.
    • Data preprocessing.
    • Performance evaluation such as cross-validation and other evaluation schemes, several performance criteria for classification and regression, operators for parameter optimization in enclosed operators or operator chains, and operators for logging and presenting results.
    • Flexible operators for data input and output, support of flexible experimental arrangements, and usage of (optional) meta information on data.
  • MLJ (Machine Learning Tools in Java): MLJ contains learning algorithms such as the ID3 and C4.5 decision tree learners, Naive Bayes, and a genetic algorithm, along with wrappers for feature selection. WEKA 3 interfaces are in development.
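
To make the cross-validation evaluation mentioned in the YALE operator list more concrete, here is a minimal sketch in plain Java. It does not use the WEKA, YALE, or MLJ APIs. The class CrossValidationSketch, its methods makeFolds, majorityLabel, and crossValidate, and the toy label array are all hypothetical names invented for this illustration, and a trivial majority-class baseline stands in for a real learner.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only: not part of WEKA, YALE, or MLJ.
class CrossValidationSketch {

    /** Shuffles the indices 0..n-1 and deals them into k folds of near-equal size. */
    static List<List<Integer>> makeFolds(int n, int k, long seed) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("need 2 <= k <= n");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));
        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    /** Stand-in "learner": returns the most common binary label (ties go to 1). */
    static int majorityLabel(int[] labels, List<Integer> trainIdx) {
        int ones = 0;
        for (int i : trainIdx) {
            if (labels[i] == 1) {
                ones++;
            }
        }
        return (2 * ones >= trainIdx.size()) ? 1 : 0;
    }

    /** Trains on k-1 folds, tests on the held-out fold, and averages the accuracy. */
    static double crossValidate(int[] labels, int k, long seed) {
        List<List<Integer>> folds = makeFolds(labels.length, k, seed);
        double accuracySum = 0.0;
        for (int f = 0; f < k; f++) {
            List<Integer> train = new ArrayList<>();
            for (int g = 0; g < k; g++) {
                if (g != f) {
                    train.addAll(folds.get(g));
                }
            }
            int predicted = majorityLabel(labels, train);
            int correct = 0;
            for (int i : folds.get(f)) {
                if (labels[i] == predicted) {
                    correct++;
                }
            }
            accuracySum += (double) correct / folds.get(f).size();
        }
        return accuracySum / k;
    }

    public static void main(String[] args) {
        int[] labels = {1, 1, 1, 1, 1, 1, 1, 0, 0, 0}; // toy data: 7 positives, 3 negatives
        System.out.println("Mean accuracy: " + crossValidate(labels, 5, 42L));
    }
}
```

In a real experiment the majority-class baseline would be replaced by an actual learner, and the tools listed above provide their own evaluation facilities, so a hand-rolled loop like this one is mainly useful for understanding what those facilities do.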

The Philosophical Debate Goes On

Bill Joy, the chief scientist at Sun Microsystems, along with other computer scientists, predicts a future in which machines dominate humans and humans become an endangered species. In his view, this will happen because humans keep accelerating the invention of new technologies that will one day turn against us, and he warns that the pace of new technology should be slowed down. These comments are from his article in Wired Magazine, found here: http://www.wired.com/wired/archive/8.04/joy.html. A theologian suggests that while the academic community does weigh broader ethical considerations, Silicon Valley has a narrower point of view. The theologian said, "It's much more about building a company that can go IPO (Initial Public Offering) as fast as possible," and noted a Gold Rush mentality that produces a very short-term view of the world.

Bill Joy realized that the big advances in information technology come not from the work of computer scientists, computer architects, or electrical engineers, but from that of physical scientists (physicists). It was physicists who invented the semiconductors and transistors that are components of every electronic gadget on the planet. Computers were first developed for use by physicists in scientific calculations. The top current research topic among physicists is Quantum Computers, based on the theory of Quantum Physics, an idea first proposed by Physics Nobel Prize winner Richard Feynman of Caltech in a talk at MIT in 1981 (published in 1982). If Quantum Computers are achieved within the next 40 years, the nightmare of machines dominating humans will be on the horizon. For some problems, what current (conventional) computers would need billions of years to compute could, in principle, take a Quantum Computer only seconds because of quantum parallelism. If you have an expert system with, say, more than 10,000 rules, you will notice slow performance in the system. Increasing the number of rules further will probably crash the system. An expert system deployed on a Quantum Computer could think fast, even when the number of rules is in the millions.

Do the experts think that machines will ever acquire intelligence? Not so, according to leading physicist and mathematician Roger Penrose of Oxford University, UK. Penrose is well known in mathematics and physics circles for his theoretical work on extending the theory of Relativity and combining it with Quantum Physics, an area now known as Quantum Gravity (a specialized branch of physics). He worked with Stephen Hawking in the field of cosmology in the late 1960s. In his book The Emperor's New Mind, he argues that machines will never understand the theory of knowledge and the existence of reality, nor acquire consciousness. As an aside, the mathematics used in AI or Machine Learning is not nearly as complex as the mathematics used in Relativity and Quantum Physics. The arguments that Professor Penrose uses in his book are rigorous theoretical proofs of the kind commonly found in mathematical induction.

I believe that it is best to leave this debate to computer scientists, philosophers, mathematicians, and physicists. The job of a software developer is to produce software applications that benefit human society now.


Resources

Downloads

Links

Books

  • Machine Learning by Tom M. Mitchell, pub: McGraw-Hill.
  • Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations by Ian H. Witten and Eibe Frank, pub: Morgan Kaufmann.
  • Learning and Soft Computing: Support Vector Machines, Neural Networks, and Fuzzy Logic Models by Vojislav Kecman, pub: MIT Press.
  • Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence by J.-S. R. Jang, C.-T. Sun, and E. Mizutani, pub: Prentice Hall.
  • Principles of Data Mining (Adaptive Computation and Machine Learning) by David J. Hand, Heikki Mannila, and Padhraic Smyth, pub: MIT Press.
  • An Introduction to Support Vector Machines and Other Kernel-based Learning Methods by Nello Cristianini and John Shawe-Taylor, pub: Cambridge University Press.
  • Bioinformatics: The Machine Learning Approach (Adaptive Computation and Machine Learning), 2nd Edition, by Pierre Baldi and Søren Brunak, pub: MIT Press.

About the Author

Sione Palu has developed software for publishing systems, imaging, Web applications, and symbolic computer algebra systems for secondary education. Palu graduated from the University of Auckland, New Zealand, with a science degree (B.Sc.) in mathematics and computing. His interests involve the application of Java and mathematics in the fields of mathematical modelling and simulations, symbolic AI and soft computing, numerical analysis, image processing, wavelets, digital signal processing, control systems, and computational finance.
