
Getting Insights in Microsoft Dataverse with Azure Synapse Analytics

Duration: 2 days
Course code: udvsams

Upcoming Sessions

Date: currently not scheduled

Format: Classroom

Price: 0€

Subscribe to waiting list

Interested in a private company training? Request it here.

Microsoft Dataverse and the modern data warehouse

The cloud requires reconsidering some of the choices made for on-premises data handling. This module introduces the concepts of the data lake and the data lakehouse. It also introduces the different Azure services that can be used for data processing and compares them to the traditional on-premises data stack.

  • From traditional to modern data warehouse
  • Comparing data warehouse with data lake
  • Lambda architecture
  • Overview of Big Data related Azure services

An introduction to Azure Storage Accounts

This module discusses the different types of storage accounts available in Azure Storage. It also covers some of the tools for loading and managing files in an Azure Storage account; a short upload sketch follows the topic list below.

  • Introduction to Azure Blob Storage
  • Introduction to Azure Data Lake Storage Gen 2
  • Compare Azure Data Lake Storage Gen 2 with traditional Azure Blob Storage
  • Tools for uploading data
  • Storage Explorer, AzCopy
  • LAB: Uploading data into Azure Storage
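
As an illustration of programmatic uploads, the following sketch pushes a local CSV file into an Azure Data Lake Storage Gen 2 container using the azure-storage-file-datalake Python SDK. The account name, container and file paths are placeholders, and authentication with DefaultAzureCredential assumes you are signed in with the Azure CLI or have environment credentials configured; treat it as a sketch, not the course lab solution.

    # Minimal upload sketch for Azure Data Lake Storage Gen 2 (placeholder names).
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    ACCOUNT_URL = "https://<your-storage-account>.dfs.core.windows.net"  # placeholder
    CONTAINER = "raw"                                                    # placeholder
    LOCAL_FILE = "sales.csv"                                             # placeholder

    # Authenticate with Azure AD (works after 'az login' or with a managed identity).
    service = DataLakeServiceClient(account_url=ACCOUNT_URL,
                                    credential=DefaultAzureCredential())

    filesystem = service.get_file_system_client(file_system=CONTAINER)
    file_client = filesystem.get_file_client(f"landing/{LOCAL_FILE}")

    # Upload the local file, overwriting any existing file at that path.
    with open(LOCAL_FILE, "rb") as data:
        file_client.upload_data(data, overwrite=True)

    print("Uploaded", LOCAL_FILE, "to", file_client.path_name)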

Overview of Azure Synapse Analytics and Azure Synapse Link for Dataverse

Synapse Analytics is the cornerstone service for the data engineer. It encompasses pipelines to copy data, Spark and SQL to transform and query data, Data Explorer for near real-time analysis and data exploration, and Power BI for reporting. This module provides a brief introduction to this service. You will see how to configure Azure Synapse Link for Microsoft Dataverse to ingest your Microsoft Dataverse data in close to real time into a data lake attached to Azure Synapse, as well as more advanced options such as partitioning the data during ingestion. A small sketch of the folder layout that Synapse Link writes to the data lake follows the topic list below.

  • The different components of Synapse Analytics
  • Provisioning a Synapse Analytics workspace
  • Navigating Synapse Analytics Studio
  • Configuring a Synapse Link for Microsoft Dataverse
  • Manage which Dataverse tables are exported to the Synapse workspace
  • Configure advanced settings in Synapse Link for Microsoft Dataverse
  • Monitor your Azure Synapse Link
  • LAB: Provision a Synapse Analytics workspace
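
As a small illustration of what Azure Synapse Link for Dataverse actually writes to the lake, the sketch below lists the contents of the storage container the link fills: one folder of partitioned CSV files per exported table, plus a model.json metadata file. The account and container names are placeholders; the real container name depends on your Dataverse environment.

    # List what Azure Synapse Link for Dataverse has exported to the lake (sketch).
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    ACCOUNT_URL = "https://<your-storage-account>.dfs.core.windows.net"  # placeholder
    CONTAINER = "dataverse-<environment>"                                # placeholder

    service = DataLakeServiceClient(account_url=ACCOUNT_URL,
                                    credential=DefaultAzureCredential())
    filesystem = service.get_file_system_client(file_system=CONTAINER)

    # Expect one folder per exported Dataverse table (e.g. 'account', 'contact')
    # holding CSV files, plus a model.json file describing the schema.
    for path in filesystem.get_paths(recursive=False):
        kind = "folder" if path.is_directory else "file"
        print(f"{kind:6} {path.name}")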

Transforming Microsoft Dataverse data with Data Flow Pipelines

With Data Flows, data can be validated and transformed without having to learn yet another tool (such as Databricks or Spark). Using Data Flows you can transform and combine the ingested Microsoft Dataverse data into a business-ready format; a sketch of triggering such a pipeline from code follows the topic list below.

  • Introducing Azure Synapse Analytics Pipelines
  • Pipeline terminology
  • Creating Pipelines, Linked Services and Datasets
  • From ELT to ETL
  • Creating Data Flows to validate and transform data
  • Transforming ingested Microsoft Dataverse data using Data Flows
  • Executing Data Flows using Pipelines
  • LAB: Transforming data with a Data Flow
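
Once a pipeline that wraps your Data Flow exists in the workspace, it can also be started from code. The sketch below uses the azure-synapse-artifacts Python package to trigger a pipeline run and poll its status; the workspace endpoint and pipeline name are placeholders, and the exact operation names are an assumption based on that SDK, so verify them against your installed version rather than treating this as a definitive implementation.

    # Trigger a Synapse pipeline run and poll its status (sketch; names are placeholders).
    import time
    from azure.identity import DefaultAzureCredential
    from azure.synapse.artifacts import ArtifactsClient

    ENDPOINT = "https://<your-workspace>.dev.azuresynapse.net"  # placeholder
    PIPELINE = "TransformDataverseAccounts"                     # hypothetical pipeline name

    client = ArtifactsClient(credential=DefaultAzureCredential(), endpoint=ENDPOINT)

    # Start the pipeline that executes the Data Flow.
    run = client.pipeline.create_pipeline_run(PIPELINE)
    print("Started run", run.run_id)

    # Poll until the run reaches a terminal state.
    while True:
        status = client.pipeline_run.get_pipeline_run(run.run_id).status
        print("Status:", status)
        if status in ("Succeeded", "Failed", "Cancelled"):
            break
        time.sleep(30)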

Using Synapse SQL Pools to transform and query Dataverse data

Once data has been loaded into the data lake, the next step is to cleanse the data, pre-aggregate it and perform other steps to make it accessible to reporting and analytical tools. Depending on the transformations required and the skills of the data engineer, the SQL dialect common to the Microsoft data stack (T-SQL) can play an important role in this. You will learn about the concept of external tables and how to create and configure them. Once you have mastered external tables, you will see that Azure Synapse Link for Dataverse automatically creates them for the Dataverse tables you choose to replicate. A small query sketch follows the topic list below.

  • Provisioned versus Serverless Synapse Analytics SQL Pools
  • Creating and accessing databases
  • Using OPENROWSET for data access
  • Creating External Tables
  • Access near real-time data from Microsoft Dataverse
  • Access snapshot data from Microsoft Dataverse
  • LAB: Querying data via Azure Synapse Analytics Serverless databases
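
To give an idea of what querying the lake through the serverless SQL pool looks like, the sketch below runs an OPENROWSET query from Python over the CSV files that Synapse Link exports. The serverless endpoint, storage account, container and table folder are placeholders, and the snippet assumes the pyodbc package, an ODBC Driver for SQL Server and an Azure AD login are available on your machine.

    # Query Dataverse CSV files in the lake through the serverless SQL pool (sketch).
    import pyodbc

    # Placeholder endpoint; serverless endpoints end in '-ondemand.sql.azuresynapse.net'.
    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=<your-workspace>-ondemand.sql.azuresynapse.net;"
        "Database=master;"
        "Authentication=ActiveDirectoryInteractive;"
        "UID=<your-user>@<your-tenant>;"
    )

    # OPENROWSET reads the files in place; the URL points at the folder that
    # Synapse Link for Dataverse fills with CSV files for the 'account' table.
    sql = """
    SELECT TOP 10 *
    FROM OPENROWSET(
            BULK 'https://<your-storage-account>.dfs.core.windows.net/dataverse-<environment>/account/*.csv',
            FORMAT = 'CSV',
            PARSER_VERSION = '2.0'
         ) AS accounts;
    """

    for row in conn.cursor().execute(sql):
        print(row)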

Accessing Dataverse data using Synapse Analytics Spark and Python

Spark doesn't have a proprietary data storage option, but consumes and produces regular files stored in Azure Storage. This module covers how to access and manipulate data stored in the Synapse Analytics data lake or in other Azure Storage locations. Apache Spark for Azure Synapse also comes with a Common Data Model (CDM) connector that allows you to easily read and transform the Dataverse data that is ingested into the data lake; a short PySpark sketch follows the topic list below.

  • Introduction to the Spark framework
  • Spark cluster setup
  • Connecting to Azure Blob Storage and Azure Data Lake Storage Gen 2
  • Processing data using Spark DataFrames in Python
  • Using Spark SQL
  • Processing data from Microsoft Dataverse using the CDM connector
  • LAB: Processing data on a Spark cluster
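
As an illustration, the sketch below reads a Dataverse table from the lake in a Synapse notebook using the Spark CDM connector. The storage account, container and entity names are placeholders, and the option names are assumptions based on the spark-cdm-connector; verify them against the connector version installed in your workspace.

    # Read a Dataverse table via the Spark CDM connector in a Synapse notebook (sketch).
    # In a Synapse notebook the 'spark' session already exists; the builder below is only
    # needed when running elsewhere.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    accounts = (
        spark.read.format("com.microsoft.cdm")
             .option("storage", "<your-storage-account>.dfs.core.windows.net")  # placeholder
             .option("manifestPath", "dataverse-<environment>/model.json")      # placeholder
             .option("entity", "account")                                       # Dataverse table
             .load()
    )

    # Simple transformation: keep a few columns and count accounts per state.
    summary = (
        accounts.select("name", "statecode")
                .groupBy("statecode")
                .count()
    )
    summary.show()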

Handling large volumes of data requires different skills: one must master storage options, tools to upload data efficiently, handling failed uploads, and converting data into a format appropriate for reporting and analysis. In the Microsoft Azure stack, Synapse Analytics is the cornerstone service for the data engineer. It encompasses pipelines to copy data, Spark and SQL to transform and query data, Data Explorer for near real-time analysis and data exploration, and Power BI for reporting. Microsoft Dataverse securely stores and manages the data used by business applications. Because Microsoft Dataverse holds critical business data, you will almost always want to load this data into a data lake. This is where Azure Synapse Link for Dataverse comes in: a managed service that ingests Dataverse data in close to real time into a data lake. Once the data lands in the data lake, you can use the services provided by Azure Synapse Analytics to transform and cleanse the data and build either a logical or a physical data warehouse on top of it.

This training teaches how to use Synapse Analytics to design, build and maintain a modern data lake architecture. The training also includes a few other Azure services that come in handy when working with Synapse Analytics, such as Azure Key Vault for handling authentication secrets, Azure SQL Database for dealing with smaller datasets and Azure Databricks as an improved Spark engine.

This course focuses on developers and administrators who are considering migrating existing data solutions to the Microsoft Azure cloud, or who are starting to design new data-oriented solutions in the Azure cloud. Some familiarity with relational database systems such as SQL Server is handy. Prior knowledge of Azure is not required.

Contact Us
  • Address:
    U2U nv/sa
    Z.1. Researchpark 110
    1731 Zellik (Brussels)
    BELGIUM
  • Phone: +32 2 466 00 16
  • Email: info@u2u.be
  • Monday - Friday: 9:00 - 17:00
    Saturday - Sunday: Closed