DataBase Development and Implementation Lec19 - OLAP and Data Mining, Study notes of Database Management Systems (DBMS)

This document covers Business Intelligence technologies, OLAP benchmarks, examples of OLAP applications in various functional areas, key features of OLAP applications, and Hybrid OLAP (HOLAP).

Typology: Study notes

2010/2011

Uploaded on 09/08/2011

rossi46

DBDI/ Lecture 19

OLAP & Data Mining

Dr. Ala Al-Zobaidie

The slides are based on the textbooks: Database Systems by Thomas Connolly & Carolyn Begg (4th ed.), and Introduction to Data Mining by Pang-Ning Tan et al., 2006.

30/05/2007 DBDI / OLAP & DM 2

Objectives

  • The relationship between OLAP and data warehousing (DW).
  • Key features of OLAP applications.
  • Codd's 12 rules for OLAP tools.
  • Main categories of OLAP tools.
  • OLAP extensions to the SQL standard.

Business Intelligence Technologies

  • Access tools are needed to provide advanced analytical capabilities.
  • Two main types:
    • Online Analytical Processing (OLAP)
    • Data Mining (DM)
  • An environment that includes a DW with tools (e.g. OLAP, DM) → Business Intelligence (BI) technologies

OLAP Benchmarks

  • Measuring OLAP performance for common business operations.
  • Just-in-time (JIT) information for effective decision-making.
  • Benchmark metric called AQM (Analytical Queries per Minute).
  • Results include both the DB schema & required code.
  • Data model flexibility.
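
As a rough illustration of the AQM metric named above, the sketch below divides the number of analytical queries by the total elapsed time in minutes. The function name and the sample figures are hypothetical, and counting load and computation time in the elapsed time is an assumption about how the metric is reported, not something stated on the slide.

```python
def analytical_queries_per_minute(query_count, load_seconds, compute_seconds, query_seconds):
    """Return AQM: queries processed per minute of total elapsed time.

    Assumption: elapsed time covers data loading, pre-computation and
    query execution, so fast queries cannot hide a slow load.
    """
    total_seconds = load_seconds + compute_seconds + query_seconds
    if total_seconds <= 0:
        raise ValueError("total elapsed time must be positive")
    return query_count / (total_seconds / 60.0)


# Hypothetical run: 1,200 queries, 10 min load, 5 min compute, 15 min querying.
aqm = analytical_queries_per_minute(1200, 600, 300, 900)
print(f"AQM = {aqm:.1f}")
```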

Examples of OLAP applications in various functional areas

Key Features of OLAP Applications

  • Multi-dimensional views of data
    • a 'realistic' business model
    • flexible access to corporate data
    • all dimensions are treated equally
  • Support for complex calculations
    • mechanisms that support powerful, declarative computational methods
  • Time intelligence
    • performance is judged over time, using the time hierarchy
    • date and period comparisons (e.g. this month versus last month)
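
The time-intelligence feature above can be sketched as a period comparison. This is a minimal illustration, assuming monthly revenue keyed by (year, month); the figures and function names are hypothetical and do not come from the lecture.

```python
# Hypothetical monthly revenue keyed by (year, month); values are illustrative.
revenue = {
    (2006, 4): 100.0,
    (2006, 5): 110.0,
    (2007, 4): 120.0,
    (2007, 5): 132.0,
}


def previous_month(year, month):
    """Step one period back along the month level of the time hierarchy."""
    return (year - 1, 12) if month == 1 else (year, month - 1)


def percent_change(current, baseline):
    """Relative change of current against baseline, in percent."""
    return (current - baseline) / baseline * 100.0


def compare_periods(data, year, month):
    """Compare a month with the previous month and the same month last year."""
    current = data[(year, month)]
    return {
        "vs_previous_month": percent_change(current, data[previous_month(year, month)]),
        "vs_same_month_last_year": percent_change(current, data[(year - 1, month)]),
    }


def year_total(data, year):
    """Roll the months up to the year level of the hierarchy."""
    return sum(value for (y, _), value in data.items() if y == year)


result = compare_periods(revenue, 2007, 5)
print(result)
print(year_total(revenue, 2007))
```

Time gets its own helpers here because, unlike other dimensions, its members have a natural order ("previous month") that comparisons depend on.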

OLAP Benefits

  • Increased productivity of end-users.
  • Reduced backlog of applications development for IT staff.
  • Retention of organizational control over the integrity of corporate data.
  • Reduced query drag and network traffic on OLTP systems or on the data warehouse.
  • Improved potential revenue and profitability.

Multi-dimensional Data as Three-field table versus Two-dimensional Matrix

Multi-dimensional Data as 4-field Table versus 3-dimensional Cube

[Figure: three-dimensional cube with axes Region, Month and Product]
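
The two representations named in the slide titles above can be illustrated with a small sketch: the same facts held as flat records and as a structure addressed by dimension coordinates. The sample data and the function names are hypothetical.

```python
# Hypothetical sales facts as a three-field table: (product, region, revenue).
three_field_table = [
    ("Laptop", "London", 10),
    ("Laptop", "Paris", 7),
    ("Phone", "London", 4),
    ("Phone", "Paris", 12),
]

# Adding a month gives a four-field table: (product, region, month, revenue).
four_field_table = [
    ("Laptop", "London", "Jan", 6),
    ("Laptop", "London", "Feb", 4),
    ("Phone", "Paris", "Jan", 12),
]


def to_matrix(rows):
    """Pivot (row_key, column_key, value) records into a two-dimensional matrix."""
    row_keys = sorted({r for r, _, _ in rows})
    col_keys = sorted({c for _, c, _ in rows})
    matrix = [[0] * len(col_keys) for _ in row_keys]
    for r, c, value in rows:
        matrix[row_keys.index(r)][col_keys.index(c)] = value
    return row_keys, col_keys, matrix


def to_cube(rows):
    """Key each value by its (product, region, month) coordinates."""
    return {(p, r, m): value for p, r, m, value in rows}


products, regions, matrix = to_matrix(three_field_table)
cube = to_cube(four_field_table)

print(regions)
for product, row in zip(products, matrix):
    print(product, row)
print(cube[("Laptop", "London", "Feb")])
```

In the matrix and cube forms a cell is reached directly from its coordinates, whereas the flat table has to be searched for the matching record.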

OLAP Tools & Codd’s Rules for OLAP Systems

  • Available OLAP tools in marketplace.
  • Codd’s Rules:
    1. Multi-dimensional conceptual view
    2. Transparency
    3. Accessibility
    4. Consistent reporting performance
    5. Client-server architecture
    6. Generic dimensionality
    7. Dynamic sparse matrix handling
    8. Multi-user support
    9. Unrestricted cross-dimensional operations
    10. Intuitive data manipulation
    11. Flexible reporting
    12. Unlimited dimensions and aggregation levels

Categories of OLAP Tools

  • OLAP tools are categorized according to the architecture used to store and process multi-dimensional data.
  • There are four main categories:
    • Multi-dimensional OLAP (MOLAP)
    • Relational OLAP (ROLAP)
    • Hybrid OLAP (HOLAP)
    • Desktop OLAP (DOLAP)

Multi-dimensional OLAP (MOLAP) & Typical Architecture

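  • Specialized data structures & multi-dimensional DBMSs (MDDBMSs).
  • Data is aggregated and stored according to predicted usage.
  • Array technology & efficient storage techniques (sparse data management).
  • Excellent performance when the data is used as designed.
  • Traditionally tightly coupled with the application and presentation layers.
  • Newer tools separate OLAP from the data structures through published APIs.
  • Development issues
    • Limited representation ability
    • Limited navigation and analysis ability
    • Requires specialized skills & advanced tools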
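
A minimal sketch of the array-plus-aggregates idea behind MOLAP follows. It is not the storage layout of any particular MDDBMS: the dimension members, the extra "ALL" position used for totals, and the function names are all assumptions made for illustration.

```python
# Dimension members map to array positions; an extra "ALL" member holds totals.
PRODUCTS = ["Laptop", "Phone"]
REGIONS = ["North", "South"]
TOTAL = "ALL"


def build_molap_array(facts):
    """Load facts into a dense 2-D array and pre-compute the aggregates."""
    p_index = {name: i for i, name in enumerate(PRODUCTS + [TOTAL])}
    r_index = {name: i for i, name in enumerate(REGIONS + [TOTAL])}
    cells = [[0] * len(r_index) for _ in p_index]
    for product, region, amount in facts:
        # Update the base cell and every aggregate cell that covers it.
        for p in (product, TOTAL):
            for r in (region, TOTAL):
                cells[p_index[p]][r_index[r]] += amount
    return cells, p_index, r_index


def lookup(cube, product=TOTAL, region=TOTAL):
    """Answer a query with a single array access; no scan or join is needed."""
    cells, p_index, r_index = cube
    return cells[p_index[product]][r_index[region]]


# Hypothetical facts: (product, region, units sold).
facts = [("Laptop", "North", 5), ("Laptop", "South", 3), ("Phone", "North", 2)]
cube = build_molap_array(facts)

print(lookup(cube, product="Laptop"))  # total for one product
print(lookup(cube, region="North"))    # total for one region
print(lookup(cube))                    # grand total
```

Pre-computing the totals at load time is what makes lookups fast, and it is also why the structure suits only the queries it was designed for.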

Data Mining

From Pang-Ning Tan et al., Introduction to Data Mining, 2006

Origins of Data Mining

  • Draws ideas from machine learning/AI, pattern recognition, statistics, and database systems
  • Traditional techniques may be unsuitable due to
    • Enormity of data
    • High dimensionality of data
    • Heterogeneous, distributed nature of data

[Figure: Data Mining at the intersection of Machine Learning/Pattern Recognition, Statistics/AI, and Database systems]

Examples of Applications of Data Mining

  • Retail / Marketing
  • Banking
  • Insurance
  • Medicine

Criteria for Selecting a Data Mining Tool

  • Suitability for certain input data types
  • Transparency of the mining output
  • Tolerance of missing variable values
  • Level of accuracy possible
  • Ability to handle large volumes of data.

Data Mining Operations and Associated Techniques

Data Mining Operations

  • Prediction Methods
    • Use some variables to predict unknown or future values of other variables.
  • Description Methods
    • Find human-interpretable patterns that describe the data.

Data Mining Operations

  • Predictive
    • Classification
    • Regression
    • Deviation Detection
  • Descriptive
    • Clustering
    • Association Rule Discovery
    • Sequential Pattern Discovery

Predictive / Classification

Training set (Refund and Marital Status are categorical, Taxable Income is continuous, Cheat is the class)

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Test set

Refund  Marital Status  Taxable Income  Cheat
No      Single          75K             ?
Yes     Married         50K             ?
No      Married         150K            ?
Yes     Divorced        90K             ?
No      Single          40K             ?
No      Married         80K             ?

[Figure: the training set is used to learn a classifier (the model), which is then applied to the test set]

Predictive / Classification using Tree Induction
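
As a sketch of what a learned tree might look like for the Refund / Marital Status / Taxable Income records of the classification example, the hand-written tree below is one that fits all ten training records. It is not produced by an induction algorithm, and the 80K split point is an assumption chosen to fit the data, not necessarily the tree shown in the lecture.

```python
# Training records from the slide: (refund, marital_status, income_k, cheat).
TRAINING_SET = [
    ("Yes", "Single", 125, "No"),
    ("No", "Married", 100, "No"),
    ("No", "Single", 70, "No"),
    ("Yes", "Married", 120, "No"),
    ("No", "Divorced", 95, "Yes"),
    ("No", "Married", 60, "No"),
    ("Yes", "Divorced", 220, "No"),
    ("No", "Single", 85, "Yes"),
    ("No", "Married", 75, "No"),
    ("No", "Single", 90, "Yes"),
]

# Test records from the slide; the class label is unknown.
TEST_SET = [
    ("No", "Single", 75),
    ("Yes", "Married", 50),
    ("No", "Married", 150),
    ("Yes", "Divorced", 90),
    ("No", "Single", 40),
    ("No", "Married", 80),
]


def classify(refund, marital_status, income_k):
    """Walk a hand-built decision tree and return the predicted Cheat label."""
    if refund == "Yes":
        return "No"
    if marital_status == "Married":
        return "No"
    # Single or Divorced with no refund: split on income (assumed 80K threshold).
    return "Yes" if income_k >= 80 else "No"


def training_accuracy():
    """Fraction of training records the tree labels correctly."""
    hits = sum(classify(r, m, i) == label for r, m, i, label in TRAINING_SET)
    return hits / len(TRAINING_SET)


predictions = [classify(*record) for record in TEST_SET]
print(training_accuracy())
print(predictions)
```

A perfect fit on ten training records says little about accuracy on unseen data, which is why the model is judged on a separate test set.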

Predictive / Regression (Value Prediction)

  • Predicts the value of a continuous-valued variable from the values of other variables
  • Assumes a linear or nonlinear model of dependency
  • Widely studied in the statistics and neural network fields
  • Linear regression works well with linear data but is sensitive to the presence of outliers
  • Statistical measurements are adequate for linear models
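
The sensitivity to outliers noted above can be seen with a minimal least-squares sketch. The data points are hypothetical, and ordinary least squares is used here only as the simplest linear model.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Predict a continuous value from a fitted (slope, intercept) pair."""
    slope, intercept = model
    return slope * x + intercept


xs = [1, 2, 3, 4, 5]
clean_ys = [2, 4, 6, 8, 10]      # perfectly linear: y = 2x
outlier_ys = [2, 4, 6, 8, 30]    # same data with one outlier at x = 5

clean_model = fit_line(xs, clean_ys)
outlier_model = fit_line(xs, outlier_ys)

print(clean_model, predict(clean_model, 6))
print(outlier_model, predict(outlier_model, 6))
```

A single outlying point changes the fitted slope substantially, because squared errors give large deviations a disproportionate weight.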

Predictive / Classification using Neural Induction

Predictive / Deviation Detection

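  • Detects significant deviations from normal behavior (outliers)
  • A relatively new operation in commercial data mining tools
  • Often a source of true discovery, because outliers deviate from a previously known expectation or norm
  • Can be performed with various techniques (e.g. statistics, visualization) or as a by-product of other data mining operations
  • Many applications, e.g. fraud detection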
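
One simple statistical way to flag deviations is sketched below. It uses the median and the median absolute deviation (MAD) rather than any technique named in the lecture; the 0.6745 scaling constant and the 3.5 threshold are a common rule of thumb, and the sample values are hypothetical.

```python
from statistics import median


def find_deviations(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The score is 0.6745 * (value - median) / MAD, which stays stable even
    when the outliers themselves would distort a mean or standard deviation.
    """
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # No spread around the median: anything off the median is a deviation.
        return [v for v in values if v != center]
    return [v for v in values if abs(0.6745 * (v - center) / mad) > threshold]


# Hypothetical daily transaction amounts with one unusual value.
amounts = [10, 12, 11, 13, 12, 11, 95]
flagged = find_deviations(amounts)
print(flagged)
```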

Summary

  • Relationship between OLAP and DW.
  • Key features of OLAP applications.
  • 12 rules for OLAP tools.
  • Main categories of OLAP tools.
  • OLAP extensions to the SQL standard.
  • Data mining concept and process.
  • Techniques associated with the data mining operations.
  • Characteristics of data mining tools.
  • The relationship between data mining and data warehousing.
  • Challenges in Data Mining