Why Not Data Warehouse Appliances?

In my book, it's time to stop thinking of data warehouse appliances (including those powered by column-store databases) as experimental devices for pioneers and performance nuts. Having personally interviewed more than a handful of appliance customers, my sense is that we're on the cusp of a broad adoption phase. Will these devices simply complement conventional data warehouses as the foundation for data marts and non-mission-critical apps? Or will they also start replacing conventional enterprise data warehouses (EDWs)? I haven't heard many solid arguments against the appliance approach.

Just last week I scored an interview with appliance customer Reliance Communications, which is the Verizon or AT&T of India. The company has some 40 million customers and a growth rate that would be the envy of any executive; Reliance is adding some 1.5 million customers per month, thanks in large measure to India's growing economic strength and emerging middle class.

So how is Reliance coping with 1 billion new call data records each day swelling the company's 40-terabyte data warehouse? After exploring the field of data warehouse appliances in early 2007, Reliance implemented a 60-terabyte Greenplum appliance last summer, and it now has another 120-terabyte Greenplum implementation in the works. All 180 terabytes of capacity will be dedicated to call data records, which have to be kept around for 13 months for compliance reasons. Queries typically involve vast quantities of data.
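For a rough sense of scale, here is a back-of-the-envelope sketch in Python that turns the figures above into a storage budget per call data record. It is my own arithmetic, not a number from Reliance or Greenplum. The 30-day month, the decimal terabyte, and the function names are assumptions for illustration, and the sketch ignores compression, indexes, mirroring, and growth in daily volume.

```python
# Figures quoted in the article.
RECORDS_PER_DAY = 1_000_000_000   # new call data records per day
RETENTION_MONTHS = 13             # compliance retention window
CAPACITY_TB = 60 + 120            # existing plus planned Greenplum capacity

# Assumptions made for this sketch (not from the article).
DAYS_PER_MONTH = 30               # simplified month length
BYTES_PER_TB = 10**12             # decimal terabytes


def retained_records(records_per_day, months, days_per_month=DAYS_PER_MONTH):
    """Records on hand once the retention window is full, at a flat daily rate."""
    return records_per_day * months * days_per_month


def byte_budget_per_record(capacity_tb, records):
    """Raw capacity divided evenly across all retained records."""
    return capacity_tb * BYTES_PER_TB / records


records = retained_records(RECORDS_PER_DAY, RETENTION_MONTHS)
budget = byte_budget_per_record(CAPACITY_TB, records)

print(f"Retained records: {records:,}")
print(f"Budget per record: {budget:.0f} bytes")
```

Under those assumptions, 13 months of records at the current daily rate comes to about 390 billion rows, which leaves a budget of roughly 460 bytes of raw capacity per record. With 1.5 million new customers a month, the daily volume is unlikely to stay flat, so treat that as an upper bound on headroom.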

"Greenplum was really new technology for us, so we wanted to start with the CDRs," says Raj Joshi, Vice President of Decision Support Systems. "Access to CDRs is not very frequent, but they need to go in a big database, and we wanted to address our biggest problem first."

The advantages of the appliance route? "I can't comment on our final costs, but the savings were substantial," says Joshi. "As far as performance goes, it's about three to five times faster [than our old warehouse], so the queries that were taking a couple of hours now take 30 minutes."

I've talked to a number of other companies with DW appliance deployments:

The New York Stock Exchange has multiple EDWs on Netezza Appliances.

Private equity firm Arsenal Capital Partners and its Sermatech business unit have an EDW on HP's Neoview, and HP points to about a dozen other customers that have gone public, including WalMart.

Trade Doubler, a European Web marketing firm, is using InfoBright's Brighthouse appliance to analyze Web clickstreams (a case study I have yet to write up).

Corporate Express, the office supply giant, is running a data mart on Netezza. Executive Matt Schwartz described the deployment as a "go-fast sports car" as compared with a family sedan, suggesting that the maturity and versatility of conventional databases still appeal.

Yes, all of these customers proceeded with caution, knowing that DW appliances aren't the proven way, but best practices are emerging quickly.

True, not all appliance vendors currently offer the depth and breadth in supporting data integration and data quality software that the database incumbents can offer, but third-party vendors are quickly stepping in and IBM, for one, has joined the appliance market.

Point taken, not all appliances can handle mixed query loads or vast numbers of users, but several can, and these ranks will surely grow with maturity.

The bottom line is that these and many other appliance customers are getting great performance and they are spending less money. And then there's Teradata, which has been selling and succeeding with appliances for years, even if they weren't calling them that.

So my question is, what are the arguments against DW appliances? I'm sure there are other cases to be made, but I'm just not hearing them. Point me to a credible white paper!

In the absence of a strong case against appliances, I have to believe that only maturity and product diversity stand between the data warehouse market as we know it today and one dominated by appliances.
