Course description: How to Analyze Data (Using InfoSphere Information Analyzer) - Web Based Training

Overview

Course code: 1I002
Skill level: Basic
Duration: 5.0 hours
Delivery type: Web Based Training
Course type: Public only
Public price: USD $340.00 plus tax

Note: This is a self-paced online course. The Guided Tour Product Simulation requires no installation; everything runs in your IE 5.0+ browser. Please DO NOT make travel arrangements for this course. After you receive confirmation that you are registered, just follow the instructions to access the course.

This browser-based product simulation lets you "learn by doing" in a case-study scenario in which you solve a real-life business problem. No installation is needed, just a browser and a passion to learn.

By the end of this training you'll have a better understanding of this InfoSphere product's terminology, an overview of its architecture, and a better understanding of how it can be implemented.

You can learn more about the FlexLearning library:

http://www-304.ibm.com/jct03001c/services/learning/ites.wss/us/en?pageType=page&c=a0011797

USA IBM Training Registrar:

e-mail: iiseduc@us.ibm.com

Before the Guided Tour Product Simulation begins, this course starts by explaining the intent of the tutorial. Then we'll attempt to answer the question "What is data analysis?" We'll discuss some use cases, and then we'll discuss what it means to understand information from a process standpoint. In doing so, we'll try to answer questions such as "What is it that you are trying to understand?", "What are the core scenarios for understanding data?", "What are you looking at when you go through source system analysis?", "What are you trying to discover?", and "What are you trying to deliver?"

We'll then look at basic practices in detail. Specifically, we'll discuss the integrity of data. We'll do this by assessing and analyzing:

Metadata Integrity (How well the descriptions of the data can be understood)

Domain Integrity (Looking at discrete chunks of data, for example Name, Credit Rating, or Creation Date are domains of data, and then assessing whether the elements that you have within a given domain are well understood, complete, and valid). This section discusses in detail the seven data classifications that Information Analyzer infers:

  • Identifiers
  • Indicators
  • Codes
  • Quantifiers
  • Dates/Times
  • Text
  • Unknowns
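
To make the idea of inferring a classification from the values in a column more concrete, here is a minimal sketch in Python. The rules and thresholds in the hypothetical `classify_column` function are illustrative assumptions only; they are not how Information Analyzer infers these classes, and a column of unique free text would be mislabeled as an identifier by this simple version.

```python
from datetime import datetime


def _is_number(text):
    """Return True if the text parses as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def _is_date(text, formats=("%Y-%m-%d", "%m/%d/%Y")):
    """Return True if the text matches one of a few assumed date formats."""
    for fmt in formats:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False


def classify_column(values):
    """Guess a data class for a column of string values (illustrative rules)."""
    present = [v.strip() for v in values if v is not None and v.strip()]
    if not present:
        return "Unknown"          # nothing to analyze
    distinct = set(present)
    if all(_is_date(v) for v in present):
        return "Date/Time"
    if len(distinct) <= 2:
        return "Indicator"        # e.g. Y/N or 0/1 flags
    if len(distinct) == len(present):
        return "Identifier"       # every value is unique
    if len(distinct) <= 10 and 2 * len(distinct) <= len(present):
        return "Code"             # small, frequently repeated value set
    if all(_is_number(v) for v in present):
        return "Quantifier"       # numeric measurements or amounts
    return "Text"


print(classify_column(["Y", "N", "Y", "Y"]))                  # Indicator
print(classify_column(["1001", "1002", "1003", "1004"]))      # Identifier
print(classify_column(["CA", "NY", "TX", "CA", "NY", "CA"]))  # Code
```

The order of the checks matters in this sketch: dates are tested first so that a column of unique dates is not reported as an identifier.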

Structural Integrity (Looking at how well-defined a key is: looking at a table or a file and finding clear identification in it, and potentially across tables, thus ensuring no duplicated information). Here, we'll also look at how well-defined pieces of data are with regard to length, data type, and other things that might affect ETL data processes.
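
As a rough illustration of a structural check, the Python sketch below looks for duplicated key values and reports the longest value observed in a column. The table, column names, and helper functions (`duplicate_keys`, `max_observed_length`) are hypothetical and are not part of Information Analyzer.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return each key value that occurs more than once, with its count."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def max_observed_length(rows, column):
    """Return the longest string length seen in a column (0 if empty)."""
    return max(
        (len(str(row[column])) for row in rows if row[column] is not None),
        default=0,
    )


# Hypothetical sample data: customer_id is supposed to be the key.
customers = [
    {"customer_id": 1, "state": "CA", "name": "Ann"},
    {"customer_id": 2, "state": "NY", "name": "Bob"},
    {"customer_id": 2, "state": "NY", "name": "Robert"},
]

print(duplicate_keys(customers, ["customer_id"]))  # {(2,): 2}
print(max_observed_length(customers, "name"))      # 6
```

An empty result from `duplicate_keys` suggests the chosen columns could serve as a key for the sampled rows; the observed length can be compared with the declared column length before an ETL load.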

Relational Integrity (Looking at two things. First, asking "How well do keys support each other, and are key relationships maintained across tables?", which is known as referential integrity. Second, looking at the consistency and potential redundancy of information from one source to another. For example, if you have a State Code in one table, is it consistent with State Codes in other tables: do they have the same abbreviations, the same set of values, and the same format?)
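
Both relational checks can be sketched in a few lines of Python. The example below assumes two hypothetical tables, a reference list of state codes and an orders table that refers to it; the function names are illustrative and do not correspond to any Information Analyzer API.

```python
def orphaned_values(child_rows, child_column, parent_rows, parent_column):
    """Return child values that have no matching value in the parent table."""
    parent_keys = {row[parent_column] for row in parent_rows}
    return sorted(
        {row[child_column] for row in child_rows
         if row[child_column] not in parent_keys}
    )


def compare_domains(rows_a, column_a, rows_b, column_b):
    """Compare the sets of values found in two columns."""
    a = {row[column_a] for row in rows_a}
    b = {row[column_b] for row in rows_b}
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "shared": sorted(a & b),
    }


# Hypothetical sample data.
states = [{"code": "CA"}, {"code": "NY"}]
orders = [
    {"order_id": 1, "state": "CA"},
    {"order_id": 2, "state": "ny"},  # same state, different format
    {"order_id": 3, "state": "TX"},  # not in the reference table
]

print(orphaned_values(orders, "state", states, "code"))  # ['TX', 'ny']
print(compare_domains(orders, "state", states, "code"))
```

Here `ny` is reported as an orphan because the comparison is case-sensitive, which is one way a format inconsistency between sources can surface during analysis.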

Finally, we'll wrap things up by talking about next steps. For example, once some analysis has been performed, how do you proceed? What additional analytic techniques can be brought to bear? From a project perspective, what best practices are available to you from the IPS Services group at IBM? Then we'll talk about how analysis might tie into a data integration project's life cycle, and how you might treat things from a broader project perspective within a methodology framework.

This tutorial is not going to teach you how to use the IBM Information Analyzer product. Instead, the intent of this tutorial is to teach you basic data profiling practices and show you how to go about understanding information through data profiling and data analysis.

Training Paths that reference this course are:

  • InfoSphere Overview - FlexLearning Library
  • Information Analysis – Business Analyst/Data Steward
  • Information Analysis – Data Quality Architect

Special note

IBM Education Advantage Program Eligibility

  • Yes - IBM Education Pack - Online account

Audience

This basic course is for:

  • Business Analysts
  • Data Analysts
  • ETL Developers
  • Database Administrators
  • Data Architects
  • Integration/Migration Team Members
  • Project Leads and Managers
  • IBM Information Analyzer Users
  • Data Stewards trying to better understand data
  • IT Professionals determining optimization of system resources
  • Managers interested in data compliance and governance

Prerequisites

You should have:

  • Familiarity with relational databases (Understand basic terminology, including the idea of a table and a column of information, or alternately a file and a field of data. If it's a database, understand that it has some sort of key or identifier, and that keys may form relationships between multiple tables.)
  • Some exposure to IBM Information Analyzer (IA), its navigation, and its procedural flow (This training focuses on what to do with the results of a previous profiling of the data). The exposure to IA might have come from taking the IA FlexLearning course (entitled "Introduction to Information Analyzer"), an Instructor Led Training course, or a review of the documentation.

Familiarity with SQL and familiarity with programming tools are not required.

Skills taught

  • Understand the basic scenarios for data profiling  
  • Understand how to analyze data at several levels
  • Determine the integrity of data
  • Use Information Analyzer to learn-by-doing

Course outline

  • Beginning Data Analysis
  • Metadata Integrity
  • Domain Integrity
  • Data Classification
  • Structural Integrity
  • Relational Integrity
  • Analysis in the Project Lifecycle
  • Performing all of the above in a Guided Tour Product Simulation  

Machine requirements

Mandatory: Learners must have screen resolution set to a minimum of 1024 x 768 (1200 x 1400 is recommended) and an IE 5.0+ browser.
