The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a methodology for running and documenting the enterprise data mining process. It is technology-independent. This module introduces the methodology in general; each of the following modules corresponds to one phase of CRISP-DM.
We must first be able to identify useful data mining goals in our business. In this module you learn about the most common data mining goals, such as regression and classification.
Since data is the key ingredient in any data mining process, we must make sure it is of good quality. This module explains how the concept of data quality differs between data mining and other business intelligence processes.
Modeling is the actual process of building mathematical models based on the data. You will get an overview of the different modeling techniques, such as decision trees, neural networks, logistic regression, association rules and more.
Before a model can be used in production, we must be sure it is good enough. This can be checked with statistical measurements, but human inspection can be important as well.
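As an illustration of what such statistical measurements can look like, here is a minimal Python sketch that compares a classifier's predictions with the known outcomes and derives accuracy, precision and recall. It is not taken from the course material, which works with SQL Server tooling; the function names and the churn labels (1 = customer left, 0 = customer stayed) are hypothetical and chosen only for this example.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1          # predicted churn, customer did churn
        elif p == positive:
            fp += 1          # predicted churn, customer stayed
        elif a == positive:
            fn += 1          # predicted stay, customer churned
        else:
            tn += 1          # predicted stay, customer stayed
    return tp, fp, fn, tn


def evaluate(actual, predicted, positive=1):
    """Return accuracy, precision and recall as a dictionary."""
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical hold-out data: known outcomes versus model predictions.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(actual, predicted)
print(metrics)
```

Numbers like these only tell part of the story: whether a given precision or recall is good enough depends on the business cost of each kind of error, which is where human inspection comes in.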
For most of this training we use Visual Studio and Management Studio to build our models, but the applications that query these models will be different. In this module we first show the Excel add-in for creating and consuming data mining models. Then we illustrate the integration with Reporting Services. We end by showing how .NET programmers can build applications on top of these models as well.
The business world is full of uncertainties. Nobody knows which customers are going to switch to a competitor or how sales will evolve over the coming months. This is why companies create models that help them tame this uncertainty. Data mining (or predictive analytics) is one of the techniques that help companies build such models, which can then support decision makers in their daily jobs. But data mining can do more than that: data quality control, data cleansing, social media analysis and more. The list of machine learning applications is nearly endless.
The goal of this course is twofold:
This course is intended for people with no prior data mining knowledge who want to understand when data mining can be used and how to use it with SQL Server Analysis Services. The target audience consists of BI developers who plan to develop data mining solutions, as well as project managers who need to understand the key aspects of building a data mining solution.
Prior knowledge of Analysis Services Multidimensional is not needed, but we assume familiarity with relational databases. Small parts of this course use Excel, .NET coding skills and Reporting Services; at least a passive knowledge of these technologies is useful for following the whole course.