Analyzing Big Data with Microsoft R

Duration: 3 days
Training code: ms20773

Microsoft R Server and R Client

Explain how Microsoft R Server and Microsoft R Client work.

Topics:

  • What is Microsoft R Server
  • Using Microsoft R Client
  • The ScaleR functions

Exploring Big Data

At the end of this module the student will be able to use R Client with R Server to explore big data held in different data stores.

Topics:

  • Understanding ScaleR data sources
  • Reading data into an XDF object
  • Summarizing data in an XDF object

Visualizing Big Data

Explain how to visualize data by using graphs and plots.

Topics:

  • Visualizing in-memory data
  • Visualizing big data

Processing Big Data

Explain how to transform and clean big data sets.

Topics:

  • Transforming big data
  • Managing datasets

Parallelizing Analysis Operations

Explain how to implement options for splitting analysis jobs into parallel tasks.

Topics:

  • Using the RxLocalParallel compute context with rxExec
  • Using the RevoPemaR package

Creating and Evaluating Regression Models

Explain how to build and evaluate regression models generated from big data.

Topics:

  • Clustering big data
  • Generating regression models and making predictions

Creating and Evaluating Partitioning Models

Explain how to create and score partitioning models generated from big data.

Topics:

  • Creating partitioning models based on decision trees
  • Testing partitioning models by making and comparing predictions

Processing Big Data in SQL Server and Hadoop

Explain how to process big data with R in SQL Server and Hadoop environments.

Topics:

  • Using R in SQL Server
  • Using Hadoop Map/Reduce
  • Using Hadoop Spark

The main purpose of the course is to give students the ability to use Microsoft R Server to create and run an analysis on a large dataset, and to show how to use it in big data environments such as a Hadoop or Spark cluster or a SQL Server database.

The primary audience for this course is people who wish to analyze large datasets within a big data environment.

The secondary audience is developers who need to integrate R analyses into their solutions.

© 2018 U2U All rights reserved.