The modern data warehouse
The cloud requires us to reconsider some of the choices made for on-premises data handling. This module
introduces the different Azure services that can be used for data processing and compares them to the
traditional on-premises data stack. It also provides a brief introduction to Azure and the Azure portal.
- From traditional to modern data warehouse
- Lambda architecture
- Overview of Big Data related Azure services
- Getting started with the Azure Portal
- LAB: Navigating the Azure Portal
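The Lambda architecture mentioned above combines a batch layer (periodic, complete recomputation) with a speed layer (live events) that are merged at query time. The idea can be sketched with a toy example; all names and numbers here are made up for illustration only:

```python
# Toy sketch of a Lambda architecture serving layer (illustrative only).
# In Azure, batch_view might be precomputed by a nightly batch job and
# speed_view maintained by a streaming engine such as Stream Analytics.
batch_view = {"sensor-1": 100, "sensor-2": 250}  # event counts up to the last batch run
speed_view = {"sensor-1": 3}                     # live events since that run

def query(sensor_id: str) -> int:
    """Merge the (stale but complete) batch view with the (fresh) speed view."""
    return batch_view.get(sensor_id, 0) + speed_view.get(sensor_id, 0)

print(query("sensor-1"))  # batch count plus live events
```

The serving layer stays simple because the batch layer periodically overwrites the batch view, after which the speed view is reset.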
Storing data in Azure
This module discusses the different types of storage available in Azure Storage as well as Data Lake
Storage. Some of the tools to load and manage files in Azure Storage and Data Lake Storage are discussed as well.
- Introduction to Azure Blob Storage
- Compare Azure Data Lake Storage Gen 2 with traditional blob storage
- Tools for uploading data
- Storage Explorer, AZCopy, PolyBase
- LAB: Uploading data into Azure Storage
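Of the upload tools listed above, AzCopy is the command-line option. A typical upload with AzCopy v10 might look like the following sketch; the storage account and container names are hypothetical, and the SAS token placeholder must be replaced with one generated in the portal or Storage Explorer:

```shell
# Upload a local file to a blob container with AzCopy v10.
# 'mystorageaccount' and 'mycontainer' are hypothetical names.
azcopy copy './sales.csv' 'https://mystorageaccount.blob.core.windows.net/mycontainer/sales.csv?<SAS-token>'
```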
Introducing Azure Data Factory
When data is stored and analysed on-premises, you typically use ETL tools such as SQL Server
Integration Services. But what if the data is stored in the Azure cloud? Then you can use Azure Data
Factory, the cloud-based ETL service. First we need to get used to the terminology, then we can start
creating the proper objects in the portal.
- Data Factory V2 terminology
- Setting up a Data Factory with Git support
- Exploring the Data Factory portal
- Creating Linked Services and Datasets
- Copying data with the Data Factory wizard
- LAB: Migrating data with Data Factory Wizard
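Behind the portal UI, Linked Services and Datasets are just JSON documents. A minimal sketch of a blob storage Linked Service might look like this; the name is hypothetical and the connection string is elided:

```json
{
  "name": "LS_AzureBlobStorage",
  "properties": {
    "type": "AzureBlobStorage",
    "typeProperties": {
      "connectionString": "..."
    }
  }
}
```

Datasets then reference a Linked Service by name, which is why Linked Services are created first.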
Authoring pipelines in Azure Data Factory
This module dives into the process of building a Data Factory pipeline from scratch. The most common activities
are illustrated. The module also focuses on how to work with variables and parameters to make the pipelines
more dynamic.
- Adding activities to the pipeline
- Working with Expressions
- Variables and Parameters
- Debugging a pipeline
- LAB: Authoring and debugging an ADF pipeline
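To give a feel for expressions and parameters: a dynamic folder path in a dataset or activity setting is written as a Data Factory expression. The sketch below builds a date-partitioned path; the parameter name baseFolder is hypothetical:

```json
{
  "folderPath": {
    "value": "@concat(pipeline().parameters.baseFolder, '/', formatDateTime(utcnow(), 'yyyy/MM/dd'))",
    "type": "Expression"
  }
}
```

Expressions start with @ and are evaluated at run time, which is what makes parameterized pipelines reusable across environments.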
Creating Data Flows in Data Factory
With Data Flows, data can be transformed without the need to learn another tool (such as Databricks or
Spark). Both Mapping Data Flows and Wrangling Data Flows are covered.
- From ELT to ETL
- Creating Data Factory (Mapping) Data flows
- Exploring Wrangling Data Flows
- LAB: Transforming data with a Data flow
Data Factory Integration Runtimes
Data Factory needs Integration Runtimes to control where the code executes. This module walks you through the
three types of Integration Runtimes: Azure, SSIS and self-hosted runtimes.
- Integration runtime overview
- Controlling the Azure Integration Runtime
- Setting up self-hosted Integration Runtimes
- Lift and shift SSIS packages in Data Factory
Deploying and monitoring Data Factory pipelines
Once development has finished, the pipelines need to be deployed and scheduled for execution. Monitoring the
deployed pipelines for failures, errors or just performance is another crucial topic discussed in this module.
- Adding triggers to pipelines
- Deploying pipelines
- Monitoring pipeline executions
- Restarting failed pipelines
- LAB: Monitoring pipeline runs
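Triggers, like the other Data Factory objects, are defined in JSON. A sketch of a schedule trigger that runs a pipeline once a day might look like this; the trigger and pipeline names are hypothetical:

```json
{
  "name": "TRG_Daily",
  "properties": {
    "type": "ScheduleTrigger",
    "typeProperties": {
      "recurrence": {
        "frequency": "Day",
        "interval": 1,
        "startTime": "2024-01-01T02:00:00Z",
        "timeZone": "UTC"
      }
    },
    "pipelines": [
      {
        "pipelineReference": {
          "referenceName": "PL_CopySales",
          "type": "PipelineReference"
        }
      }
    ]
  }
}
```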
Azure SQL Database
An easy way to create a business intelligence solution in the cloud is by taking SQL Server -- familiar to
BI developers -- and running it in the cloud. Backups and high availability happen automatically, and we can use
all the skills and tools we used on a local SQL Server on this cloud-based solution as well.
- Provisioning an Azure SQL Database
- Migrating an on-premises Data Warehouse to Azure SQL Database
- Ingesting Azure Blob Storage data
- Working with Columnstore Indexes
- LAB: Using Azure SQL Databases
In this training, the modern data warehouse approach to handling any volume of both cloud-based and
on-premises data is explained in detail. First, students see how to set up an Azure Data Lake and ingest
data with Azure Data Factory. Then students learn how to cleanse the data and prepare it for analysis
with Azure Synapse Analytics and Azure Databricks. The Lambda architecture (with a focus on both batch
data and a speed layer where live events are processed) is discussed as well, and the speed layer is
illustrated with Azure Stream Analytics. In the end, participants have hands-on experience with the most
common Azure services to load, store and process data in the cloud.
This course targets developers and administrators who are considering migrating existing
data solutions to the Microsoft Azure cloud. Some familiarity with relational database
systems such as SQL Server is helpful. Prior knowledge of Azure is not required.