This course is not scheduled.
Overview
| Course code | DWC52AU | Delivery type | Classroom |
|---|---|---|---|
| Duration | 2.0 days | Course type | Public or Private on-site |
| Public price | AUD $1,700.00 ex GST; AUD $1,870.00 inc GST | Future price (for classes starting on or after 30 Jun 2013) | AUD $1,800.00 ex GST; AUD $1,980.00 inc GST |
This course is a subset of the course InfoSphere Warehouse 9 Components (DW352). It is designed to give you in-depth knowledge of the data mining and unstructured text analysis components of InfoSphere Warehouse. You will first be given a foundation in data mining, and then learn about the types of data mining algorithms supported by InfoSphere Warehouse. Exercises allow you to create mining flows that invoke these algorithms. Much of the data in data warehouses resides in free-form text. Unstructured text analysis allows you to create dictionaries and extract relevant data from text fields, which can then be used to enhance a data mining run or to add dimensions to a star schema.
Audience
The course is for students who will be using the data mining and unstructured text analysis components of InfoSphere Warehouse.
Prerequisites
You should be familiar with DB2 and have attended InfoSphere Warehouse 9 - SQL Warehouse Tool and Administration Console (DWA52). Attendance at InfoSphere Warehouse 9 - Cubing Services (DWB52) would be a benefit, but is not required.
Objectives
- Describe the different data mining algorithms supported by InfoSphere Intelligent Miner
- Create mining flows that will create data mining models and score those models against new data
- Extract data from an unstructured text field in order to enhance a data mining run or create additional dimensions for a star schema
Course outline
A Data Mining Foundation
- Define data mining
- Distinguish between verification-driven and discovery-driven analysis
- Discuss where data mining can be applied
- Describe the key elements for a successful data mining project
- Describe the purposes and uses of a data mining process
- State six steps in a data mining process
An Introduction to InfoSphere Intelligent Miner
- Describe the components of InfoSphere Intelligent Miner
- List the different model types supported by InfoSphere Intelligent Miner Modeling
- Describe how InfoSphere Intelligent Miner Scoring is used
- Explain how to inspect your data using different distributions: Univariate, Bivariate, and Multivariate
- Describe how to execute a mining flow
- Discuss how to generate a Java class from a mining flow
InfoSphere Intelligent Miner Supported Mining Techniques
- Describe the Cluster function used in InfoSphere Intelligent Miner Modeling
- Describe the Classification function used in InfoSphere Intelligent Miner Modeling
- Describe the Regression function used in InfoSphere Intelligent Miner Modeling
- Describe the Associations function used in InfoSphere Intelligent Miner Modeling
- Describe the Sequential Rule function used in InfoSphere Intelligent Miner Modeling
- Describe the Time Series function used in InfoSphere Intelligent Miner Modeling
Unstructured Text Analytics
- Describe the regular expression extraction capabilities of InfoSphere Warehouse
- Describe how the frequent terms analysis capabilities of the Design Studio can aid in creating a dictionary
- Describe how list-based information extraction can be used to enhance a data mining run
Agenda
Day 1
- Welcome
- Unit 1: A Data Mining Foundation
- Unit 2: An Introduction to InfoSphere Intelligent Miner
- Unit 3: InfoSphere Intelligent Miner Supported Mining Techniques
- Topic 1: Clustering Functions
- Exercise for Clustering
- Topic 2: Predictive Models
Day 2
- Unit 3: InfoSphere Intelligent Miner Supported Mining Techniques (continued)
- Exercise for Prediction
- Topic 3: Associations and Sequential Rule
- Exercise for Associations and Sequential Rule
- Unit 4: Unstructured Text Analysis
- Exercise for Unstructured Text Analysis
Remarks
Curriculum Relationship:
- InfoSphere Warehouse 9 Components (DW352)
- InfoSphere Warehouse 9 - SQL Warehouse Tool and Administration Console (DWA52)
- InfoSphere Warehouse 9 - Cubing Services (DWB52)
- Alphablox Essentials and Blox Builder (DW314)
- Managing Workloads for DB2 LUW and InfoSphere Warehouse (DW322)
- DB2 UDB Multi Partition Database Administration Workshop for UNIX (CF241)
- DB2 UDB Multi Partition Environment for Single Partition DBAs (CG241)
Practical Work:
This course is structured in a lecture/lab format. The hands-on sessions form a vital and integral part of the course.
October 2009
V6.1