Increasing the number of tiers has one major risk: too much data may have to be moved across the network. This is most serious if large data volumes must be moved to clients, particularly over a wide area network (WAN) or the Internet. There are basically two solutions to the problem: doing the processing close to the data, and caching.
Most OLAP servers now permit large scale data selections and sorting tasks to be performed dynamically on the server so that only data required for display is sent to the client. This can be non-trivial, because the server may be required to perform an on-the-fly calculation, and then rank the results. But if this must be done across the members of a very large dimension (perhaps with millions of members), there is really no other way. Vendors such as Hyperion Solutions, Pilot and Oracle have implemented sophisticated data selection tools to facilitate this. One vendor, MicroStrategy, tried to perform almost all multidimensional calculations and selections within a relational database, thus ensuring that only the minimum amount of data ever left the server; this made it particularly well-equipped to deal with complex rankings of multi-million member dimensions, though at the cost of requiring a very large database server. One product that is vulnerable in this regard is Microsoft Analysis Services 2000, which does a high proportion of on-the-fly calculations, usually on the client. The 2005 release abandons this approach, and the product is now server-centric.
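To make the idea concrete, here is a minimal Python sketch of server-side ranking: the measure is calculated on the fly for every member of a large dimension, but only the top few rows would ever be sent to the client. It is an illustration under stated assumptions, not how any of the products named above work; the function name top_n_members, the sample data and the margin calculation are all hypothetical.

```python
import heapq
from typing import Callable, Iterable, List, Tuple


def top_n_members(
    members: Iterable[str],
    measure: Callable[[str], float],
    n: int,
) -> List[Tuple[str, float]]:
    """Calculate a measure for every member and return only the n largest."""
    # The generator computes each value on the fly; heapq.nlargest keeps
    # at most n candidates in memory while it scans the whole dimension.
    scored = ((member, measure(member)) for member in members)
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])


# Hypothetical data: a 100,000-member customer dimension with two base measures.
sales = {f"cust{i}": float(i % 1000) for i in range(100_000)}
costs = {f"cust{i}": float(i % 700) for i in range(100_000)}


def margin(member: str) -> float:
    # The on-the-fly calculation: margin is derived, not stored.
    return sales[member] - costs[member]


# Only these ten rows would need to cross the network to the client.
result = top_n_members(sales.keys(), margin, 10)
```

The alternative, shipping all 100,000 members to the client and ranking them there, is exactly the data movement that the server-centric approach avoids.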
In any multi-tier architecture (apart from the thinnest clients), it is usually beneficial to maintain a multidimensional data cache on the client and a shared cache on the mid-tier server. This can greatly improve performance and reduce network load, because data that has already been sent across the network can be re-used if required. A surprising number of OLAP products have weak technology in this area. For example, most Unix OLAP servers are not multi-threaded, and do not maintain shared data caches. Among the guilty are Pilot, Holos, SAS and all the ROLAP vendors. Many vendors have weak client caching, with the ROLAP vendors again being the biggest culprits.
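The two cache levels described above can be sketched in Python as follows. This is a simplified illustration, not any vendor's design: the class names SharedCache and ClientCache, the cell key layout and the stand-in fake_olap_server function are all assumptions, and cache invalidation when the underlying data changes is deliberately left out.

```python
from typing import Callable, Dict, Tuple

# One member per dimension, e.g. ("2005", "Europe", "Sales").
CellKey = Tuple[str, ...]


class SharedCache:
    """Mid-tier cache shared by all clients (hypothetical name)."""

    def __init__(self, fetch_from_server: Callable[[CellKey], float]) -> None:
        self._fetch = fetch_from_server
        self._cells: Dict[CellKey, float] = {}
        self.server_requests = 0  # how often the OLAP server was really asked

    def get(self, key: CellKey) -> float:
        if key not in self._cells:
            self.server_requests += 1
            self._cells[key] = self._fetch(key)
        return self._cells[key]


class ClientCache:
    """Per-client cache that falls back to the shared mid-tier cache."""

    def __init__(self, shared: SharedCache) -> None:
        self._shared = shared
        self._cells: Dict[CellKey, float] = {}
        self.network_requests = 0  # how often this client crossed the network

    def get(self, key: CellKey) -> float:
        if key not in self._cells:
            self.network_requests += 1
            self._cells[key] = self._shared.get(key)
        return self._cells[key]


def fake_olap_server(key: CellKey) -> float:
    # Stand-in for an expensive server calculation (assumption for the demo).
    return float(len("".join(key)))


shared = SharedCache(fake_olap_server)
alice = ClientCache(shared)
bob = ClientCache(shared)
key = ("2005", "Europe", "Sales")

alice.get(key)  # crosses the network and reaches the OLAP server
alice.get(key)  # answered from the client cache, no network traffic
bob.get(key)    # crosses the network, but the mid-tier cache answers
```

In this run each client makes one network request and the OLAP server is asked only once, even though the cell was read three times.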
Web OLAP
All OLAP products can now be deployed to Web browsers. It should come as no surprise that the OLAP vendors have found almost as many different ways of implementing Web support as they have of providing multi-tier architectures.
The simplest and earliest approach is static Web publishing. In this case, reports can be generated in the form of HTML documents, and stored as conventional, linked pages. This is a very simple architecture that works well, because it is a natural approach to Web publishing. The only issue is to ensure that the reports are indexed into the intranet search engine. This can always be done manually, but the former Information Advantage was the first vendor to provide facilities to automate the process, even for dynamically generated reports (so the template is stored, rather than a finished, static report). However, this was dropped in the later Eureka product.
The static approach denies browser users the interactive analysis that all OLAP users demand, so most OLAP vendors have had to move on to provide additional facilities, using a Web OLAP server (see Figure 3). The simplest approach is to implement simple interactive reporting using standard HTML, and the ROLAP vendors were the first to do this. This approach uses tables for formatting and GIFs for charts, and is how most conventional Web publishing works. However, simple scripts allow users to perform basic manipulations like dimension rotations, drill downs and dimension member selections. This approach soon becomes clumsy, because HTML does not support direct manipulation of screen elements, and downloaded GIF/JPEG/PNG files are slow, of a fixed pixel size and cannot be manipulated once downloaded.
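As a rough illustration of the simple HTML approach, the Python sketch below renders a small cross-tab as an HTML table on the server and re-renders it after a dimension rotation. Every user action therefore means a round trip and a new page, which is why the approach soon feels clumsy. The function names rotate and render_crosstab and the sample figures are hypothetical; real products of the period used their own server-side scripting.

```python
from html import escape
from typing import Dict, List, Tuple

# Cell values keyed by (row member, column member).
Cells = Dict[Tuple[str, str], float]


def rotate(cells: Cells) -> Cells:
    """Swap the row and column dimensions (a dimension rotation)."""
    return {(col, row): value for (row, col), value in cells.items()}


def render_crosstab(cells: Cells, rows: List[str], cols: List[str]) -> str:
    """Render the grid as a plain HTML table; the browser only displays it."""
    header = "<tr><th></th>" + "".join(f"<th>{escape(c)}</th>" for c in cols) + "</tr>"
    body = []
    for r in rows:
        # Missing cells are shown as 0 in this sketch.
        tds = "".join(f"<td>{cells.get((r, c), 0.0):.0f}</td>" for c in cols)
        body.append(f"<tr><th>{escape(r)}</th>{tds}</tr>")
    return "<table>" + header + "".join(body) + "</table>"


# Hypothetical data: regions by quarters.
cells = {
    ("Europe", "Q1"): 120.0, ("Europe", "Q2"): 135.0,
    ("Asia", "Q1"): 90.0, ("Asia", "Q2"): 110.0,
}
regions, quarters = ["Europe", "Asia"], ["Q1", "Q2"]

page = render_crosstab(cells, regions, quarters)
# After the user clicks "rotate", the server builds and sends a whole new page.
rotated_page = render_crosstab(rotate(cells), quarters, regions)
```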
Once vendors move beyond this point, their paths diverge. Some (like Business Objects, Oracle, Comshare, Panorama and Hyperion) added Java applets to supplement the conventional HTML. These are typically used for charting, dimension selection and possibly the tabular display. Some vendors, such as Databeacon, are committed to 100 percent Java. As more functionality is added to these Java applets, the time to download them increases correspondingly and sometimes makes them too large for use on the public Internet. Even today, large Java applets tend also to have performance problems, which is why many vendors such as Oracle, AlphaBlox (now IBM), WhiteLight, Hyperion, arcplan and MicroStrategy have moved away from client-side Java.
In each case, a Web OLAP server is still required. In effect, the Web OLAP server and Web browser have collectively replaced the single conventional client, so this architecture has one more tier than the conventional client/server architecture it might be replacing; if both are used together, there are arguably two more tiers. Depending on how you count, you can easily identify five tiers when Web deployment is added to a three-tier architecture. In the Microsoft environment, Active Server Pages (ASP) are often used to implement this additional layer.
Other vendors like the former Brio Technology, Business Objects and DecisionWorks went down the plug-in route, though this is no longer popular. Brio 6 delivered most of the analysis (but not the query) functionality in a series of plug-ins for different browsers on different platforms. These delivered uncompromised human factors by providing the top two layers from Figure 1 in the plug-in, but at the cost of requiring a relatively large plug-in to be stored locally and sometimes updated.
A number of vendors, such as Informix, MicroStrategy, Temtec, ProClarity and Sagent, went for the ActiveX approach instead. This delivers the greatest browser functionality, but at the theoretical cost of some platform portability (of course, nearly all OLAP browsers run on the Windows platform anyway). To support ActiveX with Netscape Navigator, these products normally need a plug-in, but with Internet Explorer this is not necessary. ActiveX also raises security issues in Internet (as opposed to intranet) deployments. This approach is also now little used.
Lately, there has been a move away from both ActiveX and client-side Java because neither is ideal for use on the Internet (as opposed to intranets and extranets). This is mainly because of security concerns, because both ActiveX and Java can introduce viruses and cause other problems when running locally (so they are often blocked by firewalls). They are also both very platform dependent.
With DHTML and XML, it is now possible to provide a richer thin-client environment than previously, with no requirement to download more than style sheets and XML or HTML documents containing JavaScript. This architecture does not provide the richness or performance of a full client/server or ActiveX architecture, but is usually flexible enough for casual consumers of reports. The big advantage is that no software need be installed and maintained on local machines, which is a major benefit with large-scale deployments. It is not just the issue of having to install and occasionally update local software, but of ensuring that it is compatible with other local applications, operating systems versions, virus checkers, firewalls, drivers and peripherals. This is hard enough within a single large organization, but is almost impossible for extranet and Internet applications.
The latest approach is to use .Net, which delivers most of the capabilities of ActiveX, without requiring the installation of local programs. However, it is still platform and browser specific.
Summary
OLAP vendors have succeeded in delivering their usual bewildering plethora of options in their client/server architectures, just as they do in their calculation, data storage and multidimensional structure strategies. The result is that buyers can choose from any number of tiers, from as few as one to as many as five. As usual, no single number is perfect, and the right choice depends on the number of concurrent users, their geographic spread, the data volumes and the calculation complexity. Even the simplest architectures have merit, and one should not assume that more tiers are necessarily needed for the best solution.
Most products only allow a subset of the options, and the vendors demonstrate their usual enthusiasm for producing white papers that purport to ‘prove’ that only their options are suitable for most users. Seen individually, these might be convincing, but collectively, they contradict each other.
The Web has done nothing to simplify things. It makes possible the delivery of simple OLAP analyses to wider groups of users, but no miracles have occurred. The browser versions of OLAP clients usually deliver less functionality and worse human factors than the conventional client/server versions. Also, for any given bandwidth, they are almost always slower than well-designed client/server architectures. This means that Web clients can supplement, but not replace, conventional clients, so sites must expect to support both types of client. Some product designers have tried to make this easier by allowing the same reports to be delivered via both routes; even though they will not look the same, at least they do not need to be redeveloped. But many other vendors require separate Web report pages to be developed. This is an additional development cost that is easily overlooked.
The conclusion must be that OLAP client/server architecture options are not only confusing today, but will become even more so in the future.


