Though the purpose of a data warehouse is to exploit information, most companies seek to limit the information that a data warehouse user can access.
There are many very good and very bad reasons to restrict, or secure, the information in a data warehouse. All too often this decision is made well after the warehouse is designed because the requirements for security seldom surface during pilots, prototypes or even implementations.
数据挖掘研究院
The following article outlines some of the primary issues and decision points of implementing a relational on-line analytical processing data warehouse with data security. It is intended to raise concerns that you may have underestimated the effort or price of implementing a secured data warehouse and make you question the level of security you are planning.
Understanding, and Accepting, the Purpose of your Warehouse
A solid understanding of the purpose and use of your data warehouse is a requirement before any security plan can be designed. Be honest. If the majority of your users are going to be viewing standard reports, even though the main "selling" point of the warehouse was data analysis, the security plan should be biased toward standardized reporting. The impact of security will last far longer than the initial warehouse proposal. 数据挖掘工具
For the sake of this article we are going to address the three fundamental types of data warehouses.
数据挖掘论坛
Analytical
These are the data warehouses that you read about. They enjoy significant mind share during the planning and selling of a data warehouse.
数据挖掘研究院
However, industry practitioners will agree that 80 percent of the data warehouses are built for standardized reporting. In most companies there may be a dozen individuals who understand the information and can be effective at analyzing and implementing plans based on an analytical data warehouse. 数据挖掘交友
While these users are of primary concern, they are a small percentage of the overall data warehouse market. 数据挖掘工具
Security or anything that restricts access to information has no place in an analytical warehouse.
数据挖掘交友
Standardized reporting
Standardized reporting has been a goal of most companies long before the term data warehouse was coined. In these cases, the data warehouse has simply become the enabling technology. 数据挖掘工具
Security is almost always mandated. These warehouses will have widespread usage throughout an organization and are the most sensitive to performance degradation.
Data Homogenization/Consolidation
These data warehouses combine multiple sources of information with a primary goal of integrating the information. This is one of the goals of every data warehouse. However, organizations with mature decision support systems that span many databases and systems may choose to integrate that data into one source as the reporting and application demands outstrip their ability to "code" the functionality. 数据挖掘实验室
The complexity of security increases as different sources of information are combined. Financial information will have a different security plan than sales and inventory. Different areas of the organization may have their own security plan. 数据挖掘实验室
If the warehouse is to be successful there can only be one security plan for each type of information.
数据挖掘研究院
Application vs. Database Level Security
There are two schools of thought regarding the appropriate place for security. 数据挖掘论坛
One logical place for security to be implemented is in the application. Security is integrated into the application and can be specific to the data accessed by the application as well as the functions of the application.
The other place for security is in the data warehouse. Security is thereby provided consistently to all applications and has a single point of maintenance.
数据挖掘工具
What are the Decision Points?
- Is there more than one application accessing the warehouse?
- Is the warehouse accessible via standard methods such as ODBC?
- Are there more than 100 users?
- Are there more than four dimensions?
- Are there multiple hierarchies within the same dimension?
If you answered yes to any of the above questions, then implementing security at the data warehouse level is appropriate. The following sections relate specifically to designing and implementing security in a relational database with a dimensional database design. 数据挖掘工具
The Security Table
The security table contains each attribute which is to be secured along with a user identification that relates the person logged into the warehouse to the attributes.
数据挖掘工具
For each user, the table contains the values of each of the attributes that the user is permitted to access. If the warehouse is secured at the lowest levels of granularity, this table can become quite large. It can be the largest table in the data warehouse.
数据挖掘实验室
This table or tables become the basis for all security views, roles or partitions that provide the physical implementation of the security plan. 数据挖掘研究院
Security vs. Value
The purpose of a data warehouse is to unlock and exploit a company′s information. This is my information, and my peers throughout the company should not be able to access it. Theory and reality. 数据挖掘交友
The value of the information is protected by implementing security and lessened by restricting access. By carefully matching the purpose of the warehouse with the possibilities of the information, the appropriate level of security can be determined. 数据挖掘工具
Performance
Performance will suffer. We have observed a 20 to 500 percent increase in query run times. The most important issue to consider is the threshold at which the usage of the warehouse drops.
数据挖掘论坛
A restrictive security plan will incur the greatest penalty with the worst performance for the least restricted users. A loose security plan will provide the best performance with the most restricted users having the worst performance. 数据挖掘实验室
Usefulness
One of the side benefits of securing the dimension in a data warehouse is the increase in usability. Your level of security only presents you with information that is permitted. List boxes and tree controls are manageable in size. Performance navigating the front-end application during query preparation may be improved over an unrestricted warehouse.
Four Types of Data Security
There are actually three definable types of data warehouse security and a fourth type that is a conglomeration of all three. If you can′t describe any of the first three types without getting a "yeah, but...," you fall into the conglomeration category.
数据挖掘论坛
Individual
Example: A client can view only information about themselves.
If your warehouse is accessible by individuals outside your company this is the most common type of security. 数据挖掘工具
Group
Example: Information for a division can be viewed by anyone in the Division. One Division cannot view information from another Division.
For an internal data warehouse this is the least restrictive security plan. It is also the rarest.
Hierarchical
Example: A person at any level can view only their assignment and the assignments of their direct reports down to the sales representative level.
This is most often used in sales and marketing warehouses or in single subject area data marts. It provides an acceptable impact to performance and meshes with the corporate cultures of most companies. 数据挖掘研究院
Conglomeration
Example: A person at any level can view details of their assignment, the assignments of their direct reports and summarized corporate level data.
数据挖掘论坛
For a secured enterprise data warehouse, this is the security plan. The level of complexity varies widely, and there may be more than one plan per warehouse.
These warehouses are the ones that require analysis of the feasibility and maintainability of the security plan.
Design and Construction
Schema Implications
The design and specification of your security plan must be completed before you finalize the design of your data warehouse. Since the most common methods of implementing security involve database views, roles and partitions, the underlying database must be designed in concert with these methods.
数据挖掘工具
Dimensional modeling techniques in a secured warehouse will lean heavily toward a snowflake design and will have limited aggregation tables. Fact columns or rows may need to be redundant in order to provide information such as a company summary.
The list of items to consider is extensive. 数据挖掘工具
Feasibility and Maintenance
Maintaining security in a database is difficult. Organizational changes can require rebuilding the entire security plan. Security views can degrade the performance beyond acceptable limits. Applications and interfaces may be required to administer the rights and levels of security.
数据挖掘工具
Securing a data warehouse can become a larger effort than building the data warehouse. Ensure that this reality is interjected into the design of your security plan.
数据挖掘工具
Reporting Appropriateness
One of the more subtle nuances of implementing a secured warehouse is providing the recipient with a report of the information that is not there.
数据挖掘研究院
For example, a sales representative has five stores from a single chain. The demographic market that the sales representative is in has seven stores. When the sales representative prints a market report by chain, only the five stores will be included in the total. The report, at first glance, will lead the observer to the wrong conclusion. 数据挖掘实验室
The report needs to include information about the missing components (i.e., five of seven stores). Each report needs to be correct and appropriate when viewed in isolation. 数据挖掘研究院
Summary
This article touched on some of the main issues and concerns involving data warehouse security. Most companies will elect to secure their data warehouses. In fact, most corporate cultures are biased toward protecting rather than exploiting information. Fortunately this is in alignment with the primary use of a data warehouse and current technologies enable the implementation of a feasible and maintainable security plan.
数据挖掘研究院