In theory, there could be many ways of assembling multiple layers on clients and servers, but certain approaches are more popular than others. Vendors use terms like ‘two-tier’ and ‘three-tier’ to describe their architectures, but this can be confusing because the exact split of tasks varies by product. It can also confuse people who are used to the different way in which these terms are used with relational client/server applications.
Any product will have a bottleneck somewhere, and it will be this that limits performance. As shown in Figure 2, a well-balanced architecture will attempt to ensure that all the potential bottlenecks are about equally loaded, so that no single component becomes overloaded too quickly. If one component does become overloaded early, the other components will be wastefully under-used. The mapping of the logical application layers onto the tiers of a client/server architecture plays a big part in determining where the bottlenecks occur, and how easily they can be overcome.
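The balancing argument can be illustrated with a minimal sketch. It assumes each query places a fixed service demand (seconds of busy time) on each component; the component names and all the figures are hypothetical and not taken from any product. Under that assumption, the component with the largest demand saturates first and caps throughput, while the others sit partly idle.

```python
def utilization(demands, queries_per_second):
    """Busy fraction of each component: arrival rate x service demand."""
    return {name: queries_per_second * d for name, d in demands.items()}


def bottleneck(demands):
    """Return the component that saturates first and the throughput ceiling it imposes."""
    name = max(demands, key=demands.get)
    return name, 1.0 / demands[name]


# Hypothetical seconds of work per query; both layouts do 0.26 s of work in total.
unbalanced = {"client": 0.01, "network": 0.02, "olap_server": 0.20, "rdbms": 0.03}
balanced = {"client": 0.06, "network": 0.065, "olap_server": 0.07, "rdbms": 0.065}

if __name__ == "__main__":
    for label, demands in (("unbalanced", unbalanced), ("balanced", balanced)):
        name, ceiling = bottleneck(demands)
        print(f"{label}: bottleneck={name}, max throughput={ceiling:.1f} queries/s")
```

Both hypothetical layouts do the same total work per query, but the balanced one reaches a much higher throughput ceiling because no single component is loaded far more heavily than the rest.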
A single-tier product would be one where all five layers are on a stand-alone PC. This would most probably be in an environment where data was downloaded from a networked server, for subsequent analysis by a mobile user. It may also include examples where users have entered their own data, perhaps as part of a budgeting or planning process, which they will subsequently send back to the server. Many single-tier products do all processing in RAM, which can make them very fast. Desktop OLAPs like PowerPlay usually work this way, and single-user versions of many other client/server products like Express, Essbase, Gentia and iTM1 are also available. Local cubes (.cub files) produced from Microsoft Analysis Services work the same way. iTM1 can also work like a desktop OLAP, but in other respects, including functionality and capacity, it is more like a large-scale OLAP product.
A two-tier product would have the lowest levels on a server, with the upper levels on a client. Different products choose to have the split between the client and server at different points.
A shared file architecture keeps only the lowest level on the server, and we doubt that this should even be regarded as a client/server architecture. Nevertheless, it has some attractions, including simplicity. It also scales well for large numbers of users doing simple things, but less well for large amounts of data. It often causes problems if used over a wide area network, because too much data has to be transmitted across the network.
Many MOLAP products like Essbase, TM1 and PowerPlay Enterprise Server in client/server mode have a two-tier architecture, with at least the lowest three layers being on the multidimensional database server. Unlike in two-tier relational applications, the server may perform most of the application’s functionality. This is a very efficient architecture, as it does most of the multidimensional processing very close to where the data is stored, and the data management and calculation software are essentially the same thing. This integration has many benefits, including a minimal network load. The bottleneck may be the processing or data management capabilities of the server.
Some ROLAP products are also two-tier. With MicroStrategy releases up to 6.n, almost all of the processing was done by the RDBMS, but other two-tier ROLAPs tend to do some of the multidimensional processing in the client. The original MicroStrategy approach minimized the network load and did not require high-powered PCs, but it put a very large load on the RDBMS, because SQL is not an efficient language for expressing multidimensional calculations. Multi-pass SQL was required for all but the most trivial queries, and this meant that the database server was usually the bottleneck. Not surprisingly, MicroStrategy eventually dropped this architecture, and MicroStrategy 7 and 8 spread the processing more evenly across the tiers, with more sophisticated data caching at each level. The other approach puts less load on the database server, but more on the network and the client PCs, and one or the other of these will become the bottleneck.
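To give a feel for why multi-pass SQL loads the database server, the sketch below computes a simple multidimensional measure (each product's share of its region's sales) in three passes, using temporary tables. It is only an illustration under stated assumptions: the `sales` table, its columns and the function name are hypothetical, SQLite stands in for the RDBMS, and this is not the SQL that any ROLAP product actually generates.

```python
import sqlite3


def product_share_by_region(rows):
    """Compute each product's share of its region's total using three SQL passes."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

    # Pass 1: aggregate the fact rows to the (region, product) level.
    con.execute(
        "CREATE TEMP TABLE pass1 AS "
        "SELECT region, product, SUM(amount) AS amt "
        "FROM sales GROUP BY region, product"
    )
    # Pass 2: aggregate again to the region level to get the denominators.
    con.execute(
        "CREATE TEMP TABLE pass2 AS "
        "SELECT region, SUM(amt) AS total FROM pass1 GROUP BY region"
    )
    # Pass 3: join the two intermediate results to derive the ratio.
    result = con.execute(
        "SELECT p1.region, p1.product, p1.amt / p2.total "
        "FROM pass1 p1 JOIN pass2 p2 ON p1.region = p2.region "
        "ORDER BY p1.region, p1.product"
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    demo = [("East", "A", 30.0), ("East", "B", 70.0),
            ("West", "A", 50.0), ("West", "A", 50.0)]
    for row in product_share_by_region(demo):
        print(row)
```

Even this small calculation needs two intermediate result sets and a join, all executed by the database; a multidimensional engine would typically treat the same ratio as a single calculated member.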
Hybrid and some ROLAP products (now including MicroStrategy 7/8) have a three-tier architecture. These products place the lower two layers on an RDBMS server, and hybrid products may also have a multidimensional database server (whose database could appear as a fourth tier). In most cases, the mid-tier application server does most of the processing, with the client concentrating on the display and perhaps some simple ad hoc calculations. The former Information Advantage’s approach was particularly server-centric, with most processing performed by the mid-tier, relatively little done by the RDBMS, and none by the client, which was used purely for the top (GUI) layer. This approach risks a bottleneck in either the mid-tier server or in the data link between the lower two tiers. To circumvent this bottleneck, both tiers are sometimes placed on the same physical server, so that the logical three-tier architecture becomes a physical two-tier set-up.
There are several other flavors of three-tier architecture. The original IBM DB2 OLAP Server (with relational storage selected) used Essbase as the mid-tier engine, and this performed most of the functions of all three of the intermediate layers shown in Figure 1. This is an unusual and very inefficient architecture, as it uses a multidimensional engine to build and maintain the relational star schema. Not surprisingly, IBM eventually abandoned it. Another company to adopt an approach along these lines was Relational Matters (now acquired by Cognos), whose DecisionStream aggregation engine builds and maintains the summary tables in a ROLAP schema, but it does not provide the client software. In both these cases, the bottleneck is likely to come between the bottom two tiers.
Microsoft Analysis Services goes the opposite way, and is the only three-tier OLAP able (in ROLAP mode) to distribute calculations for even a single query across all three tiers. This makes maximum use of the total available distributed horsepower of the database and application servers as well as the clients. However, it runs the risk, in certain cases, of transmitting too much data to the client PCs, which do almost all of the on-the-fly calculations apart from aggregations. To reduce this risk, it attempts to locate all calculations involving a lot of dimension members on the server. One snag with this approach is that it has a very ‘thick’ client, with significant data and metadata caches on the client machine. The local software to be installed is also large. Overall, Analysis Services in ROLAP mode does not perform well, so it is usually used in MOLAP mode.


