Reema Thareja
Data Warehousing
- Paperback
Data Warehousing is designed as a textbook for an introductory course on Data Warehousing for students of Computer Science & Engineering (BE/BTech), computer applications (BCA/MCA) and computer science (B.Sc). It covers the fundamentals of Data Warehousing and aims to give readers a sound knowledge of how to create and manage a Data Warehouse. The book introduces the features and architecture of a Data Warehouse, followed by a detailed study of Business Requirements and Dimensional Modelling. It then discusses the components of a Data Warehouse, leading up to the core of the subject: building and maintaining a Data Warehouse. This is followed by an overview of planning and project management, testing and growth, and finally Data Warehouse solutions and the latest trends in the field. The book closes with a broad overview of the related field of Data Mining. The text is supported by numerous examples that illustrate the concepts, and it contains review questions and other end-of-chapter exercises to test students' understanding. The book also carries a running case study that brings out the practical aspects of the subject, helping students master the basics and apply them to real-life scenarios.
Note: This item can only be delivered to an address in Germany.
Product details
- Publisher: Hurst & Co.
- Pages: 456
- Publication date: 15 June 2009
- Language: English
- Dimensions: 239mm x 183mm x 25mm
- Weight: 699g
- ISBN-13: 9780195699616
- ISBN-10: 0195699610
- Item no.: 26886020
- Manufacturer information
- Libri GmbH
- Europaallee 1
- 36244 Bad Hersfeld
- 06621 890
Reema Thareja was until recently an IT Lecturer at the Institute of Information and Technology, an affiliate of GGS Indraprastha University, New Delhi. She completed her MCA at the same university and specializes in Programming Languages, OS, DBMS, Multimedia and Web Technologies.
1: The Compelling Need for Data Warehousing
Learning Objective
Case Study
1.1 A Short Historical Note
1.2 Need for Data Warehousing
1.2.1 Increasing Demand for Strategic Information
1.2.2 The Information Crisis
1.2.3 Inability of Past Decision Support System
1.2.4 Presence of Better Technology
1.2.5 Expectations from the New Kind of Decision Support System
1.2.6 Operational Vs Decisional Support System
1.3 Data Warehouse Defined
1.3.1 What can a Data Warehouse Do?
1.3.2 What can a Data Warehouse not Do?
1.3.3 What is a Data Warehouse - an Environment or a Product?
1.3.4 A Blend of Many Technologies
1.4 Data Warehouse Users
1.4.1 Why do they want Information?
1.5 Benefits of Data Warehousing
1.5.1 Tangible Benefits
1.6 Concerns in Data Warehousing
1.6.1 Nothing is for free
Summary
Review Questions
2: Data Warehouse: Defining Features
Learning Objectives
Case Study
2.1 Introduction
2.2 Features of a Data Warehouse
2.2.1 Subject Oriented Data
2.2.2 Integrated Data
2.2.2.1 Data Cleansing
2.2.2.2 Data Transformation
2.2.2.3 Non-Volatile Data
2.2.2.4 Time Variant Data
2.3 Data Granularity
2.3.1 Benefits of Data Granularity
2.3.2 Data Granularity - Pros and Cons
2.3.3 Dual Levels of Data Granularity
2.4 The Information Flow Mechanism
2.5 Metadata
2.5.1 Role of Metadata
2.5.2 Classification of Metadata
2.5.3 Metadata is the Nerve Centre of the Data Warehouse
2.5.4 Metadata Management
2.6 Two Classes of Data
2.7 Life Cycle of Data
2.7.1 What is Data Velocity?
2.7.2 Moving Data from One Medium to Another
2.7.3 Inverted Data Warehouse
2.8 Can Data Move from Data Warehouse to the Operational Systems?
2.8.1 Direct Access Mode
2.8.2 Indirect Access Mode
Summary
Review Questions
3: Physical Architecture of a Data Warehouse and Data Mart Issues
Learning Objectives
Case Study
3.1 Introduction
3.2 Distinguishing Characteristics of Data Warehouse Architecture
3.3 Data Warehouse Architectural Goals
3.4 Data Warehouse Architecture
3.4.1 Pros and Cons of Data Warehouse Architecture
3.4.2 The Two Tier Architecture
3.4.3 The Three Tier Architecture
3.4.4 The Four Tier Architecture
3.4.5 Three Tier Versus Two Tier Architecture
3.4.6 Architecture Considerations and Challenges
3.4.7 Interfacing
3.5 Data Warehouse and Data Marts
3.6 Issues in Building Data Marts
3.6.1 A Change of Approaches
3.6.2 How Are Data Warehouse Different From Data Marts
3.6.3 Reasons for Creating Data Marts
3.6.4 Advantages of Building a Data Mart
3.6.5 Limitations of Building a Data Mart
3.7 Building Data Marts
3.8 Other Data Mart Issues
3.8.1 Types of Data Marts Based on Underlying DBMS
3.8.2 Loading of Data Marts
3.8.2.1 The Types of Data Marts to Load
3.8.2.2 Loading Temporal Data Marts
3.8.2.3 Loading of Non-Temporal Data Marts
3.8.3 Metadata for a Data Mart
3.8.4 Maintenance of a Data Mart
3.8.5 Nature of Data in a Data Mart
3.8.6 Software Components of a Data Mart
3.8.7 Performance Issues
3.8.8 Monitoring Requirements for a Data Mart
3.8.9 Security In A Data Mart
3.8.10 Structure of a Data Mart
3.9 Reasons for Increased Popularity of Data Marts
3.10 Can We Have the Data Warehouse and Data Marts on the Same Processor?
3.11 Pushing and Pulling Data
Summary
Review Questions
4: Gathering the Business Requirements
Learning Objective
Case Study
4.1 Introduction
4.2 Determining the End User Requirements
4.2.1 Business Objectives
4.2.2 Business Queries
4.2.3 Determining the Functional Requirements
4.2.4 Information Infrastructure Environment
4.2.5 The Data Quality Levels
4.3 Requirements Gathering Methods
4.3.1 Interviews
4.3.2 JAD Methodology
4.3.3 Review of Existing Documentation
4.3.4 Brainstorming
4.3.5 Questionnaires
4.3.6 Where to Stop?
4.4 Requirements Analysis
4.4.1 Requirements Definition Document
4.5 Gathering Requirements for a Data Warehouse Project
4.6 Dimensional Analysis
4.6.1 Business Dimensions
4.6.2 Dimension Hierarchies/Categories
4.6.3 Facts or Metrics
4.6.4 Example
4.7 Information Package Diagram
4.7.1 What Information does an IPD contain?
4.7.2 Example
4.7.3 Reason for Forming IPD
Summary
Review Questions
5: Planning and Project Management In A Data Warehouse
Learning Objective
Case Study
5.1 The Project Management Principles
5.1.1 Key Considerations
5.1.2 The Ideal Approach
5.2 Data Warehouse Readiness Assessment
5.2.1 Bad Performance Indicators
5.2.2 Indications for a Successful Data Warehouse Project
5.3 The Data Warehouse Project Team
5.3.1 Key Roles
5.3.2 User Involvement
5.4 Planning for the Data Warehouse
5.4.1 Gathering the Business Requirements
5.4.2 Gaining Support for the Project
5.5 The Data Warehouse Project Plan
5.6 Economic Feasibility Analysis
5.6.1 Costs and Benefits of the System
5.6.2 Economic Feasibility Measures
5.6.3 Justifying the New System
5.7 Planning For a Data Warehouse Server
5.7.1 SMP
5.7.2 Clusters
5.7.3 MPP
5.7.4 ccNUMA
5.8 Capacity Planning
5.8.1 Estimating the Load
5.8.2 Estimating the CPU Bandwidth
5.8.3 Estimating the Memory
5.8.4 Estimating the Disk
5.9 Selecting the Operating System for the Data Warehouse
5.10 Selecting the Database Software
5.10.1 Difference between General DBMS and Data Warehouse DBMS
5.10.2 How to Choose?
5.11 Selection of Tools
5.11.1 Information Delivery Tools
5.11.1.1 The Tool Selection Technique
5.11.1.2 Criteria for Selecting the Information Delivery Tool
5.11.2 Query Tools
5.11.3 Browser Tools
5.11.4 Metadata Tools
5.11.5 Data Quality Tools
Summary
Review Questions
6: Data Warehouse Schema
6.1 Introduction
6.2 Building the Fact Tables and Dimension Tables
6.2.1 The Traditional Approach
6.3 Dimensional Modeling
6.3.1 Data Warehouse Modeling Vs Operational Database Modeling
6.3.2 Dimensional Model Vs ER Model
6.3.3 The Need for Dimension Model
6.3.4 Features of a Good Dimensional Model
6.4 The Star Schema
6.4.1 How Does a Query Execute?
6.4.2 Example
6.4.3 Pros and Cons of the Star Schema
6.5 The Snowflake Schema
6.5.1 The Technique
6.5.2 Example
6.5.3 Is Snowflaking Really Helpful?
6.5.4 Pros and Cons of the Snowflake Schema
6.6 Aggregate Tables
6.6.1 Need for Building Aggregate Fact Tables
6.6.2 Limitations of Aggregate Tables
6.7 Fact Constellation Schema or Families of Star
6.7.1 Pre-requisite for a Fact Constellation Schema
6.7.2 Pros and Cons of Fact Constellation Schema
6.8 Strengths of Dimensional Modeling
6.9 Data Warehouse and the Data Model
Summary
Review Questions
7: Fact Tables and Dimension Tables: Miscellaneous Issues
Learning Objective
Case Study
7.1 Characteristics of a Dimension Table
7.2 Characteristics of a Fact Table
7.3 The Factless Fact Table
7.4 Updates To Dimension Tables
7.4.1 Slowly Changing Dimensions
7.4.1.1 Type 1 Changes
7.4.1.2 Type 2 Changes
7.4.1.3 Type 3 Changes
7.4.1.4 Example
7.5 Cyclicity of Data - Wrinkle of Time
7.6 Other Types of Dimension Tables
7.6.1 Large Dimension Tables
7.6.2 Rapidly Changing or Large Slowly Changing Dimensions
7.6.3 Junk Dimensions
7.7 Keys in the Data Warehouse Schema
7.7.1 Primary Keys
7.7.2 Surrogate Keys
7.7.3 Foreign Keys
7.8 Enhancing the Data Warehouse Performance
7.8.1 Table Compression
7.8.2 Parallel Execution
7.8.3 Table Partitioning
7.8.3.1 The Partitioning Technique
7.8.3.2 Advantages of Partitioning
7.8.4 Data Clustering
7.8.5 Data Summarization
7.8.6 Bypassing the Referential Integrity Checks
7.8.7 Indexing the Data Warehouse
7.9 Data Warehousing and the Technology
Summary
Review Questions
8: The ETL Process
Learning Objective
Case Study
8.1 Introduction
8.1.1 Challenges in ETL Functions
8.2 Data Extraction
8.2.1 Identification of Data Sources
8.2.2 Extracting Data for Data Warehouse Refreshing
8.2.2.1 Immediate Data Extraction Technique
8.2.2.2 Deferred Data Extraction Technique
8.2.2.3 Evaluation of Extraction Techniques
8.2.3 Managing Reference Tables in a Data Warehouse
8.3 Data Transformation
8.3.1 Tasks Involved in Data Transformation
8.3.2 Role of Data Transformation Process
8.4 Data Loading
8.4.1 Techniques of Data Loading
8.4.2 When should we go for Data Update rather than Data Refresh?
8.4.3 Loading the Fact Tables and Dimension Tables
8.5 Data Quality
8.5.1 The Need for Data Quality
8.5.2 Categories of Errors Which Affect Data Quality
8.5.2.1 Incomplete Errors
8.5.2.2 Incorrect Errors
8.5.2.3 Incomprehensibility Errors
8.5.2.4 Inconsistency Errors
8.5.3 Issues in Data Cleansing
8.5.4 Conclusion about Data Quality
Summary
Review Questions
9: Testing, Growth and Maintenance Of Data Warehouse
Learning Objective
Case Study
9.1 Data Warehouse Design Review
9.1.1 Contents of a Typical Design Review
9.2 Developing the Data Warehouse Iteratively
9.3 Testing
9.3.1 Testing the Data Warehouse
9.3.2 Developing the Test Plan
9.3.3 Testing the Backup and Recovery Processes
9.3.4 Testing the Data Warehouse Environment
9.3.5 Testing the Database
9.3.6 Logging of Test Results
9.4 Monitoring the Data Warehouse
9.4.1 Why Are Statistics Monitored?
9.5 Tuning the Data Warehouse
9.5.1 Tuning the Data Load
9.5.2 Tuning Queries
9.6 The Feedback Loop
Summary
Review Questions
10: OLAP in the Data Warehouse
Learning Objective
Case Study
10.1 Need for Online Analytical Processing
10.1.1 Multi Dimensional Analysis
10.1.2 Fast Access and Powerful Calculations
10.2 OLAP
10.2.1 OLAP Defined
10.2.2 OLAP is a Data Warehouse Tool
10.3 OLAP and Multidimensional Analysis
10.3.1 The Multi-Dimensional Logical Data Model
10.3.2 Multi Dimensional Model's Users
10.3.3 The Multi Dimensional Structure
10.3.4 Multi-Dimensional Operations
10.3.5 The Business Need
10.4 OLAP Functions
10.4.1 Dimensional Analysis
10.4.2 Hypercubes
10.4.3 OLAP Operations in Multidimensional Data Model
10.5 OLAP Applications
10.5.1 Integrating OLAP with GIS
10.6 OLAP Models
10.6.1 MOLAP
10.6.2 ROLAP
10.6.3 HOLAP
10.6.4 DOLAP
10.6.5 OLAP Survey
10.6.6 OLAP Trends
10.7 OLAP Design Considerations
10.8 OLAP Tools and Products
10.8.1 Report Scheduling and Sharing
10.8.2 Ad hoc Reporting
10.8.3 OLAP Customization
10.8.4 The Human Angle
10.9 Existing OLAP Tools
10.9.1 Spreadsheet OLAP Clients
10.9.2 Other OLAP Clients
10.9.3 Embedded OLAP
10.10 Data Design
10.11 Administration and Performance
10.12 OLAP Platforms
Summary
Review Questions
11: Overview of Building and Maintaining A Data Warehouse
Learning Objective
Case Study
11.1 Problem Definition
11.2 Critical Success Factors
11.3 Requirement Analysis
11.4 Planning for the Data Warehouse
11.4.1 Project Staff
11.4.2 Project Plan
11.4.3 Outsourcing Vs Custom Planning
11.4.4 Detailed Project Plan
11.5 Data Warehouse Design Stage
11.5.1 Design the Dimensional Model
11.5.2 Develop the Architecture
11.5.3 Design for Update and Expansion
11.5.4 Design the Relational Database and OLAP Cubes
11.5.5 Decisions in Design
11.5.6 Detail Design
11.5.7 Other Design Considerations
11.6 Building and Implementing Data Marts
11.7 Building Data Warehouse
11.7.1 Test and Deploy the System
11.7.2 Transition to Production
11.7.3 User Training and Support
11.7.3.1 The Success Factors of a Training Program
11.7.3.2 Issues in User Support
11.8 Backup and Recovery
11.9 Establish the Data Quality Framework
11.9.1 Data Purification Process
11.10 Security Issues in a Data Warehouse
11.11 Operating the Data Warehouse
11.11.1 Day-to-Day Operations of the Data Warehouse
11.11.2 Administering the Data Warehouse
11.11.3 Overnight Processing
11.12 Recipe for a Successful Data Warehouse
11.13 Data Warehouse Pitfalls
Summary
Review Questions
12: Data Mining Basics
Learning Objective
Case Study
12.1 Introduction
12.1.1 What Is Data Mining
12.1.2 Foundation of Data Mining
12.1.3 An Analogy
12.1.4 What Can Be Discovered
12.1.5 What Type of Data Can Be Mined
12.2 Architecture of Data Mining System
12.3 The KDD Process
12.4 Integrating Data Mining and the Data Warehouse
12.4.1 KDD versus Data Mining
12.4.2 DBMS versus Data Mining
12.4.3 OLAP versus Data Mining
12.5 Related Areas of Data Mining
12.6 Data Mining Techniques
12.6.1 Association Rule Mining
12.6.2 Decision Trees
12.6.3 Clustering Analysis
12.6.4 Memory Based Reasoning
12.6.5 Genetic Algorithm
12.6.6 Neural Networks
12.6.7 Outlier Analysis
Summary
Review Questions
13: Moving into Data Mining
Learning Objective
Case Study
13.1 Introduction
13.2 How Do We Categorize Data Mining System
13.3 Is all that is Discovered Interesting and Useful
13.4 Applications of Data Mining
13.4.1 Benefits of Data Mining
13.4.2 Data Mining For Retail Industry
13.4.3 Data Mining For Telecommunication Industry
13.4.4 Data Mining For Banking and Finance
13.4.5 Data Mining For Biomedical and DNA Data Analysis
13.4.6 Data Mining For Customer Retention
13.4.7 Data Mining For Targeted Marketing
13.4.8 Data Mining For Customer Relationship Management
13.5 Other Data Mining Application Areas
13.6 Advantages and Disadvantages of Data Mining
13.7 Web Mining
13.7.1 Web Content Mining
13.7.2 Web Structure Mining
13.7.3 Web Usage Mining
13.8 Text Mining
13.9 Temporal Data Mining
13.10 Sequence Mining
13.11 Time Series Analysis
13.12 Spatial Data Mining
13.13 Issues and Challenges in Data Mining
13.14 Current Trends Affecting Data Mining
Summary
Review Questions
14: Trends In Data Warehousing
Learning Objective
Case Study
14.1 Introduction
14.2 Data Warehouse Solutions
14.2.1 Data Warehouse Implementation Alternatives
14.2.2 Host-Based Data Warehouses
14.2.2.1 Single Host Based Data Warehouses
14.2.2.2 Host Based Single Stage (LAN)-Based Data Warehouses
14.2.3 LAN-Based Workgroup Data Warehouses
14.2.4 Multistage Data Warehouses
14.2.5 Stationary Data Warehouses
14.3 Web Enabled Data Warehouse
14.3.1 Using the Web for Information Delivery
14.3.2 Expectations from the Web as an Information Delivery Medium
14.3.3 Super Growth Problem
14.3.4 Data Webhouse Prominent Features
14.3.5 The Need for Data Webhouse
14.3.6 The Data Webhouse Architecture
14.3.7 Similarities with Traditional Data Warehouses
14.3.8 Building Clickstream Data Webhouse
14.3.9 The Granularity Manager
14.3.10 Challenges in the Clickstream Data Webhouse Lifecycle
14.4 Distributed Data Warehouses
14.4.1 Advantages of Distributed Data Warehousing
14.4.2 Distributed versus Centralized Warehouse
14.5 The Virtual Data Warehouse
14.5.1 Why to Go For a Virtual Data Warehouse
14.5.2 Problems with a Virtual Data Warehouse
14.5.3 Advantages of Using a Virtual Data Warehouse
14.6 Data Warehouse and the ODS
14.7 Integration of Data Warehousing with other Technologies
14.7.1 Data Warehousing and ERP
14.7.1.1 Integrating ERP and Data Warehouse
14.7.1.2 Issues in Integrating ERP with Data Warehousing
14.7.1.3 Common Misconceptions about DW and ERP
14.7.1.4 Conclusion
14.7.2 Data Warehousing and Knowledge Management
14.7.3 Data Warehousing and EIS
14.7.3.1 Executive Information System
14.7.3.2 Data Warehouse as a Basis for EIS
14.7.4 Data Warehousing and CRM
14.7.4.1 Active Data Warehousing
14.8 Trends in Data Warehousing
14.8.1 Multiple Data Types
14.8.2 Data Visualization
14.8.3 Parallel Processing
14.8.4 Agent Technology
14.9 Data Warehouse Futures
Summary
Review Questions
Appendix
Glossary
Learning Objective
Case Study
1.1 A Short Historical Note
1.2 Need for Data Warehousing
1.2.1 Increasing Demand for Strategic Information
1.2.2 The Information Crisis
1.2.3 Inability of Past Decision Support System
1.2.4 Presence of Better Technology
1.2.5 Expectations from the New Kind of Decision Support System
1.2.6 Operational Vs Decisional Support System
1.3 Data Warehouse Defined
1.3.1 What can a Data Warehouse Do?
1.3.2 What Data Warehouse cannot do?
1.3.3 What is a Data Warehouse- an Environment or a Product?
1.3.4 A Blend of Many Technologies
1.4 Data Warehouse Users
1.4.1 Why do they want Information?
1.5 Benefits of Data Warehousing
1.5.1 Tangible Benefits
1.6 Concerns in Data Warehousing
1.6.1 Nothing is for free
Summary
Review Questions
2: Data Warehouse: Defining Features
Learning Objectives
Case Study
2.1 Introduction
2.2 Features of a Data Warehouse
2.2.1 Subject Oriented Data
2.2.2 Integrated Data
2.2.2.1 Data Cleansing
2.2.2.2 Data Transformation
2.2.2.3 Non-Volatile Data
2.2.2.4 Time Variant Data
2.3 Data Granularity
2.3.1 Benefits of Data Granularity
2.3.2 Data granularity - Pros and Cons
2.3.3 Dual Levels of Data Granularity
2.4 The Information Flow Mechanism
2.5 Metadata
2.5.1 Role of Metadata
2.5.2 Classification of Metadata
2.5.3 Metadata is the Nerve Centre of the Data Warehouse
2.5.4 Metadata Management
2.6 Two Classes of Data
2.7 Life Cycle of Data
2.7.1 What is Data Velocity?
2.7.2 Moving Data from One Medium to Another
2.7.3 Inverted Data Warehouse
2.8 Can Data Move from Data Warehouse to the Operational Systems?
2.8.1 Direct Access Mode
2.8.2 Indirect Access Mode
Summary
Review Questions
3: Physical Architecture of a Data Warehouse and Data Mart Issues
Learning Objectives
Case Study
3.1 Introduction
3.2 Distinguishing Characteristics of Data Warehouse Architecture
3.3 Data Warehouse Architectural Goals
3.4 Data Warehouse Architecture
3.4.1 Pros and Cons of Data Warehouse Architecture
3.4.2 The Two Tier Architecture
3.4.3 The Three Tier Architecture
3.4.4 The Four Tier Architecture
3.4.5 Three Tier Versus Two Tier Architecture
3.4.6 Architecture Considerations and Challenges
3.4.7 Interfacing
3.5 Data Warehouse and Data Marts
3.6 Issues in Building Data Marts
3.6.1 A Change of Approaches
3.6.2 How Are Data Warehouse Different From Data Marts
3.6.3 Reasons for Creating Data Marts
3.6.4 Advantages of Building a Data Mart
3.6.5 Limitations of Building a Data Mart
3.7 Building Data Marts
3.8 Other Data Mart Issues
3.8.1 Types of Data Marts Based on Underlying DBMS
3.8.2 Loading of Data Marts
3.8.2.1 The Types of Data Marts to Load
3.8.2.2 Loading Temporal Data Marts
3.8.2.3 Loading of Non- Temporal Data Marts
3.8.3 Metadata for a Data Mart
3.8.4 Maintenance of a Data mart
3.8.5 Nature of data in a Data Mart
3.8.6 Software Components of a Data Mart
3.8.7 Performance Issues
3.8.8 Monitoring Requirements for a Data Mart
3.8.9 Security In A Data Mart
3.8.10 Structure of a Data Mart
3.9 Reasons for Increased Popularity of Data Marts
3.10 Can We Have the Data Warehouse and Data Marts on the Same Processor?
3.11 Pushing and Pulling Data
Summary
Review Questions
4: Gathering the Business Requirements
Learning Objective
Case Study
4.1 Introduction
4.2 Determining the End User Requirements
4.2.1 Business Objectives
4.2.2 Business Queries
4.2.3 Determining the Functional Requirements
4.2.4 Information Infrastructure Environment
4.2.5 The Data Quality Levels
4.3 Requirements Gathering Methods
4.3.1 Interviews
4.3.2 JAD Methodology
4.3.3 Review of Existing Documentation
4.3.4 Brainstorming
4.3.5 Questionnaires
4.3.6 Where to Stop?
4.4 Requirements Analysis
4.4.1 Requirements Definition Document
4.5 Gathering Requirements for a Data Warehouse Project
4.6 Dimensional Analysis
4.6.1 Business Dimensions
4.6.2 Dimension Hierarchies/Categories
4.6.3 Facts or Metrics
4.6.4 Example
4.7 Information Package Diagram
4.7.1 What Information does an IPD contain?
4.7.2 Example
4.7.3 Reason for Forming IPD
Summary
Review questions
5: Planning and Project Management In A Data Warehouse
Learning Objective
Case Study
5.1 The Project Management Principles
5.1.1 Key Considerations
5.1.2 The Ideal Approach
5.2 Data Warehouse Readiness Assessment
5.2.1 Bad Performance Indicators
5.2.2 Indications for a Successful Data Warehouse Project
5.3 The Data Warehouse Project Team
5.3.1 Key Roles
5.3.2 User Involvement
5.4 Planning for the Data Warehouse
5.4.1 Gathering the Business Requirements
5.4.2 Gaining Support for the Project
5.5 The Data Warehouse Project Plan
5.6 Economic Feasibility Analysis
5.6.1 Costs and Benefits of the System
5.6.2 Economic Feasibility Measures
5.6.3 Justifying the New System
5.7 Planning For a Data Warehouse Server
5.7.1 SMP
5.7.2 Clusters
5.7.3 MMP
5.7.4 ccNUMA
5.8 Capacity Planning
5.8.1 Estimating the Load
5.8.2 Estimating the CPU Bandwidth
5.8.3 Estimating the Memory
5.8.4 Estimating the Disk
5.9 Selecting the Operating System for the Data Warehouse
5.10 Selecting the Database Software
5.10.1 Difference between General DBMS and Data Warehouse DBMS
5.10.2 How to Choose?
5.11 Selection of Tools
5.11.1 Information Delivery Tools
5.11.1.1 The Tool Selection Technique
5.11.1.2 Criteria for Selecting the Information Delivery Tool
5.11.2 Query Tools
5.11.3 Browser Tools
5.11.4 Metadata Tools
5.15.5 Data Quality Tools
Summary
Review Questions
6: Data Warehouse Schema
6.1 Introduction
6.2 Building the Fact Tables and Dimension Tables
6.2.1 The Traditional Approach
6.3 Dimensional Modeling
6.3.1 Data Warehouse Modeling Vs Operational Database Modeling
6.3.2 Dimensional Model Vs ER Model
6.3.3 The Need for Dimension Model
6.3.4 Features of a Good Dimensional Model
6.4 The Star Schema
6.4.1 How Does a Query Execute?
6.4.2 Example
6.4.3 Pros and Cons of the Star Schema
6.5 The Snowflake Schema
6.5.1 The Technique
6.5.2 Example
6.5.3 Is Snowflaking Really Helpful?
6.5.4 Pros and Cons of the Snowflake Schema
6.6 Aggregate Tables
6.6.1 Need for Building Aggregate Fact Tables
6.6.2 Limitations of Aggregate Tables
6.7 Fact Constellation Schema or Families of Star
6.7.1 Pre-requisite for a Fact Constellation Schema
6.7.2 Pros and Cons of Fact Constellation Schema
6.8 Strengths of Dimensional Modeling
6.9 Data Warehouse and the Data Model
Summary
Review Questions
7: Fact Tables and Dimension Tables: Miscellaneous Issues
Learning Objective
Case Study
7.1 Characteristics of a Dimension Table
7.2 Characteristics of a Fact Table
7.3 The Factless Fact Table
7.4 Updates To Dimension Tables
7.4.1 Slowly Changing Dimensions
7.4.1.1 Type 1 Changes
7.4.1.2 Type 2 Changes
7.4.1.3 Type 3 Changes
7.4.1.4 Example
7.5 Cyclicity of Data - Wrinkle of Time
7.6 Other Types of Dimension Tables
7.6.1 Large Dimension Tables
7.6.2 Rapidly Changing or Large Slowly Changing Dimensions
7.6.3 Junk Dimensions
7.7 Keys in the Data Warehouse Schema
7.7.1 Primary Keys
7.7.2 Surrogate Keys
7.7.3 Foreign Keys
7.8 Enhancing the Data Warehouse Performance
7.8.1 Table Compression
7.8.2 Parallel Execution
7.8.3 Table Partitioning
7.8.3.1 The Partitioning Technique
7.8.3.2 Advantages of Partitioning
7.8.4 Data Clustering
7.8.5 Data Summarization
7.8.6 Bypassing the Referential Integrity Checks
7.8.7 Indexing the Data Warehouse
7.9 Data Warehousing and the Technology
Summary
Review Questions
8: THE ETL PROCESS
Learning Objective
Case Study
8.1 Introduction
8.1.1 Challenges in ETL Functions
8.2 Data Extraction
8.2.1 Identification of Data Sources
8.2.2 Extracting Data for Data Warehouse Refreshing
8.2.2.1 Immediate Data Extraction Technique
8.2.2.2 Deferred Data Extraction Technique
8.2.2.3 Evaluation of Extraction Techniques
8.2.3 Managing Reference Tables in a Data Warehouse
8.3 Data Transformation
8.3.1 Tasks Involved in Data Transformation
8.3.2 Role of Data Transformation Process
8.4 Data Loading
8.4.1 Techniques of Data Loading
8.4.2 When should we go for Data Update rather than Data Refresh?
8.4.3 Loading the Fact Tables and Dimension Tables
8.5 Data Quality
8.5.1 The Need for Data Quality
8.5.2 Categories of Errors Which Effect data Quality
8.5.2.1 Incomplete Errors
8.5.2.2 Incorrect Errors
8.5.2.3 Incomprehensibility Errors
8.5.2.4 Inconsistency Errors
8.5.3 Issues in Data Cleansing
8.5.4 Conclusion about Data Quality
Summary
Review Questions
9: Testing, Growth and Maintenance Of Data Warehouse
Learning Objective
Case Study
9.1 Data Warehouse Design Review
9.1.1 Contents of a Typical Design Review
9.2 Developing the Data Warehouse Iteratively
9.3 Testing
9.3.1 Testing the Data Warehouse
9.3.2 Developing the Test Plan
9.3.3 Testing the Backup and Recovery Processes
9.3.4 Testing the Data Warehouse Environment
9.3.5 Testing the Database
9.3.6 Logging of Test Results
9.4 Monitoring the Data Warehouse
9.4.1 Why Are Statistics Monitored?
9.5 Tuning the Data Warehouse
9.5.1 Tuning the Data Load
9.5.2 Tuning Queries
9.6 The Feedback Loop
Summary
Review Questions
10: OLAP in the Data Warehouse
Learning Objective
Case Study
10.1 Need for Online Analytical Processing
10.1.1 Multi Dimensional Analysis
10.1.2 Fast Access and Powerful Calculations
10.2 OLAP
10.2.1 OLAP Defined
10.2.2 OLAP is a Data Warehouse Tool
10.3 OLAP and Multidimensional Analysis
10.3.1 The Multi-Dimensional Logical Data Model
10.3.2 Multi Dimensional Model's Users
10.3.3 The Multi Dimensional Structure
10.3.4 Multi- Dimensional Operations
10.3.5 The Business Need
10.4 OLAP Functions
10.4.1 Dimensional Analysis
10.4.2 Hypercubes
10.4.3 OLAP Operations in Multidimensional Data Model
10.5 OLAP Applications
10.5.1 Integrating OLAP with GIS
10.6 OLAP Models
10.6.1 MOLAP
10.6.2 ROLAP
10.6.3 HOLAP
10.6.4 DOLAP
10.6.5 OLAP Survey
10.6.6 OLAP Trends
10.7 OLAP Design Considerations
10.8 OLAP Tools and Products
10.8.1 Report Scheduling and Sharing
10.8.2 Ad hoc Reporting
10.8.3 OLAP Customization
10.8.4 The Human Angle
10.9 Existing OLAP Tools
10.9.1 Spreadsheet OLAP Clients
10.9.2 Other OLAP Clients
10.9.3 Embedded OLAP
10.10 Data Design
10.10 Administration and Performance
10.11 OLAP Platforms
Summary
Review Questions
11: Overview of Building and Maintaining A Data Warehouse
Learning Objective
Case Study
11.1 Problem Definition
11.2 Critical Success Factors
11.3 Requirement Analysis
11.4 Planning for the Data Warehouse
11.4.1 Project Staff
11.4.2 Project Plan
11.4.3 Outsourcing Vs Custom Planning
11.4.4 Detailed Project Plan
11.5 Data Warehouse Design Stage
11.5.1 Design the Dimensional Model
11.5.2 Develop the Architecture
11.5.3 Design for Update and Expansion
11.5.4 Design the Relational Database and OLAP Cubes
11.5.5 Decisions in Design
11.5.6 Detail Design
11.5.7 Other Design Considerations
11.6 Building and Implementing Data Marts
11.7 Building Data Warehouse
11.7.1 Test and Deploy the System
11.7.2 Transition to Production
11.7.3 User Training and Support
11.7.3.1 The Success Factors of a Training Program
11.7.3.2 Issues in User Support
11.8 Backup and Recovery
11.9 Establish the Data Quality Framework
11.9.1 Data Purification Process
11.10 Security Issues in a Data Warehouse
11.11 Operating the Data Warehouse
11.11.1 Day-to-Day Operations of the Data Warehouse
11.11.2 Administering the Data Warehouse
11.11.3 Overnight Processing
11.12 Recipe for a Successful Data Warehouse
11.13 Data Warehouse Pitfalls
Summary
Review Questions
12: Data Mining Basics
Learning Objective
Case Study
12.1 Introduction
12.1.1 What Is Data Mining
12.1.2 Foundation of Data Mining
12.1.3 An Analogy
12.1.4 What Can Be Discovered
12.1.5 What Type of Data Can Be Mined
12.2 Architecture of Data Mining System
12.3 The KDD Process
12.4 Integrating Data Mining and the Data Warehouse
12.4.1 KDD versus Data Mining
12.4.2 DBMS versus Data Mining
12.4.3 OLAP versus Data Mining
12.5 Related Areas of Data Mining
12.6 Data Mining Techniques
12.6.1 Association Rule Mining
12.6.2 Decision Tress
12.6.3 Clustering Analysis
12.6.4 Memory Based Reasoning
12.6.5 Genetic Algorithm
12.6.6 Neural networks
12.6.7 Outlier Analysis
Summary
Review Questions
13: Moving into Data Mining
Learning Objective
Case Study
13.1 Introduction
13.2 How Do We Categorize Data Mining System
13.3 Is all that is Discovered Interesting and Useful
13.4 Applications of Data Mining
13.4.1 Benefits of Data Mining
13.4.2 Data Mining For Retail Industry
13.4.3 Data Mining For Telecommunication Industry
13.4.4 Data Mining For Banking and Finance
13.4.5 Data Mining For Biomedical and DNA Data Analysis
13.4.6 Data Mining For Customer Retention
13.4.7 Data Mining For Targeted Marketing
13.4.8 Data Mining For Customer Relationship Management
13.5 Other Data Mining Application Areas
13.6 Advantages and Disadvantages of Data Mining
13.7 Web Mining
13.7.1 Web Content Mining
13.7.2 Web Structure Mining
13.7.3 Web Usage Mining
13.8 Text Mining
13.9 Temporal Data Mining
13.10 Sequence Mining
13.11 Time Series Analysis
13.12 Spatial Data Mining
13.13 Issues and Challenges in Data Mining
13.14 Current Trends Affecting Data Mining
Summary
Review Questions
14: Trends In Data Warehousing
Learning Objective
Case Study
14.1 Introduction
14.2 Data Warehouse Solutions
14.2.1 Data Warehouse Implementation Alternatives
14.2.2 Host-Based Data Warehouses
14.2.2.1 Single Host-Based Data Warehouses
14.2.2.2 Host-Based Single Stage (LAN)-Based Data Warehouses
14.2.3 LAN-Based Workgroup Data Warehouses
14.2.4 Multistage Data Warehouses
14.2.5 Stationary Data Warehouses
14.3 Web Enabled Data Warehouse
14.3.1 Using the Web for Information Delivery
14.3.2 Expectations from the Web as an Information Delivery Medium
14.3.3 Super Growth Problem
14.3.4 Data Webhouse Prominent Features
14.3.5 The Need for Data Webhouse
14.3.6 The Data Webhouse Architecture
14.3.7 Similarities with Traditional Data Warehouses
14.3.8 Building Clickstream Data Webhouse
14.3.9 The Granularity Manager
14.3.10 Challenges in the Clickstream Data Webhouse Lifecycle
14.4 Distributed Data Warehouses
14.4.1 Advantages of Distributed Data Warehousing
14.4.2 Distributed versus Centralized Warehouse
14.5 The Virtual Data Warehouse
14.5.1 Why to Go For a Virtual Data Warehouse
14.5.2 Problems with a Virtual Data Warehouse
14.5.3 Advantages of Using a Virtual Data Warehouse
14.6 Data Warehouse and the ODS
14.7 Integration of Data Warehousing with other Technologies
14.7.1 Data Warehousing and ERP
14.7.1.1 Integrating ERP and Data Warehouse
14.7.1.2 Issues in Integrating ERP with Data Warehousing
14.7.1.3 Common Misconceptions about DW and ERP
14.7.1.4 Conclusion
14.7.2 Data Warehousing and Knowledge Management
14.7.3 Data Warehousing and EIS
14.7.3.1 Executive Information System
14.7.3.2 Data Warehouse as a Basis for EIS
14.7.4 Data Warehousing and CRM
14.7.4.1 Active Data Warehousing
14.8 Trends in Data Warehousing
14.8.1 Multiple Data Types
14.8.2 Data Visualization
14.8.3 Parallel Processing
14.8.4 Agent Technology
14.9 Data Warehouse Futures
Summary
Review Questions
Appendix
Glossary