Introduction to Azure Synapse Analytics

Azure Synapse Analytics is one of the most comprehensive services in the Microsoft Azure cloud. If you work as a data engineer, or plan to move into that role, it is one of the most interesting topics you can work on.

According to Microsoft, Synapse Analytics helps customers use their data more efficiently, productively, quickly and securely by unifying information from all data sources, repositories and big data analytics systems.

Businesses can also benefit from technologies such as artificial intelligence and data analytics. These technologies help professionals better understand the weather, help workers handle menial tasks more easily, and let search engines recognize destinations as users type their search terms.

Additionally, Azure Synapse Analytics is built to support a growing DevOps strategy, in which operations and development staff work closely together to build and run services that perform better throughout their lifecycle.

Properties of Azure Synapse

• Diversified analytics services for unprecedented time to insight

• Real-time data stream processing from millions of IoT devices

• Enterprise-level analytics delivered as a service

• Apply ML algorithms to every intelligent application

• Expand the insights you can discover from all your data.

• Break down data barriers and analyze data from operational and business apps with Azure Synapse Link.

• Protect your data with the industry's most advanced security and privacy features.

Azure Synapse Architecture

Azure Synapse architecture consists of four components:

· Synapse SQL: Complete T-SQL based analytics

    · Dedicated SQL pool (pay per DWU provisioned) [1 to 60 nodes]

    · Serverless SQL pool (pay per TB processed; compute scales automatically)

· Synapse Spark: Deeply integrated Apache Spark [1 to 200 nodes]

· Synapse Pipelines: Hybrid data integration - ETL

· Studio: Unified user experience

Synapse SQL

Synapse SQL lets you run T-SQL based analytics in your Synapse workspace. It has two consumption models: dedicated and serverless. For the dedicated model, use dedicated SQL pools; a workspace can have any number of them. For the serverless model, use the serverless SQL pool; every workspace has exactly one.
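To make the difference between the two consumption models concrete, here is a minimal Python sketch of how each one is billed. The function names and both prices are hypothetical placeholders chosen for illustration, and billing the dedicated pool per 100 DWU per hour is an assumption of this sketch; check the Azure pricing page for real rates.

```python
def serverless_cost(tb_processed: float, price_per_tb: float) -> float:
    """Serverless model: you pay for the data each query processes."""
    return tb_processed * price_per_tb


def dedicated_cost(dwu: int, hours_provisioned: float,
                   price_per_100_dwu_hour: float) -> float:
    """Dedicated model: you pay for the DWUs you provision, for as long
    as the pool is running, whether or not queries are executing."""
    return (dwu / 100) * hours_provisioned * price_per_100_dwu_hour


# Hypothetical prices, NOT real Azure rates.
PRICE_PER_TB = 5.0
PRICE_PER_100_DWU_HOUR = 1.5

# Ad hoc exploration that scans 2 TB in total.
adhoc = serverless_cost(2.0, PRICE_PER_TB)

# A 200 DWU warehouse kept online for an 8-hour working day.
warehouse = dedicated_cost(200, 8, PRICE_PER_100_DWU_HOUR)

print(f"serverless: {adhoc:.2f}, dedicated: {warehouse:.2f}")
```

With these made-up numbers, occasional exploratory queries come out cheaper on the serverless pool, while a steady, predictable workload is what the provisioned model is designed for.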

A dedicated SQL pool gives you provisioned compute with a maximum of 60 compute nodes and one control node that runs the MPP engine. You can scale it as required by changing the data warehouse unit (DWU) configuration. You can find more details on data warehouse units at: learn.microsoft.com/en-us/azure/synapse-ana..
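The effect of scaling can be sketched in a few lines of Python. This assumes the fixed count of 60 distributions described in Microsoft's dedicated SQL pool architecture documentation (a detail not stated in this article); the function name is hypothetical, and the sketch only covers node counts that divide 60 evenly.

```python
TOTAL_DISTRIBUTIONS = 60  # assumption taken from Microsoft's MPP architecture docs


def distributions_per_node(compute_nodes: int) -> int:
    """Return how many distributions each compute node works on."""
    if not 1 <= compute_nodes <= 60:
        raise ValueError("a dedicated SQL pool has between 1 and 60 compute nodes")
    if TOTAL_DISTRIBUTIONS % compute_nodes != 0:
        raise ValueError("this sketch only handles node counts that divide 60 evenly")
    return TOTAL_DISTRIBUTIONS // compute_nodes


for nodes in (1, 6, 60):
    print(nodes, "node(s):", distributions_per_node(nodes), "distributions each")
```

Scaling up adds compute nodes, so each node handles fewer distributions and a query has more compute working in parallel.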

asa1.png

Apache Spark for Synapse

To use Spark analytics, create and use an Apache Spark pool in your Synapse workspace. When you start using a Spark pool, the workspace creates a Spark session to handle the resources associated with that session. There are two ways to use Spark within Synapse:

· Spark notebooks for data science and data engineering, using Scala, PySpark, C#, and Spark SQL

· Spark job definitions for running batch Spark jobs using jar files.
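As an illustration of the notebook route, the following is a minimal PySpark sketch. The sample readings, column names, and function names are hypothetical; it assumes the `spark` session object that Synapse notebooks make available once the pool has started, and it imports PySpark lazily so that the sample data helper can be used without Spark installed.

```python
def sample_readings():
    """Hypothetical IoT-style readings, used only for illustration."""
    return [("sensor-a", 20.0), ("sensor-a", 22.0), ("sensor-b", 18.5)]


def average_by_device(spark, rows):
    """Build a DataFrame from the rows and average the temperature per device."""
    from pyspark.sql import functions as F  # lazy import: needs a Spark runtime

    df = spark.createDataFrame(rows, schema=["device", "temperature"])
    return df.groupBy("device").agg(
        F.avg("temperature").alias("avg_temperature")
    )


# In a Synapse notebook cell, where `spark` is already defined:
# average_by_device(spark, sample_readings()).show()
```

The same logic could be packaged and submitted as a Spark job definition when it needs to run as an unattended batch job rather than interactively.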

Synapse Pipelines

Pipelines are how Azure Synapse provides Data Integration - allowing you to move data between services and orchestrate activities.

· Pipelines are logical groupings of activities that perform a task together.

· Activities define the actions a pipeline performs on data, such as copying data or running a notebook or a SQL script.

· Data flows are a specific kind of activity that provides a no-code experience for data transformation and uses Synapse Spark under the covers.

· Linked service: a connection to a data store or compute resource that the Azure Synapse Analytics workspace can integrate with.

· Trigger: starts a pipeline run. A pipeline can be run manually or automatically (schedule, tumbling window, or event-based).

· Integration dataset: a named view of data that simply references the data used as inputs and outputs in an activity. It belongs to a linked service.
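The relationships between these building blocks can be sketched as a small Python model. This is a conceptual illustration only, not the Synapse SDK or its JSON format: every class, field, and activity name below is hypothetical, and the dependency ordering is an assumed stand-in for how a pipeline orchestrates its activities.

```python
from dataclasses import dataclass, field


@dataclass
class LinkedService:
    name: str
    kind: str  # for example a data store or a compute resource


@dataclass
class Dataset:
    name: str
    linked_service: LinkedService  # a dataset belongs to a linked service


@dataclass
class Activity:
    name: str
    kind: str  # for example "copy", "notebook", "sql_script"
    depends_on: list = field(default_factory=list)


@dataclass
class Pipeline:
    name: str
    activities: list

    def run_order(self):
        """Return activity names so each runs after the ones it depends on."""
        ordered, done = [], set()
        pending = list(self.activities)
        while pending:
            ready = [a for a in pending if all(d in done for d in a.depends_on)]
            if not ready:
                raise ValueError("circular or unknown dependency")
            for activity in ready:
                ordered.append(activity.name)
                done.add(activity.name)
            pending = [a for a in pending if a.name not in done]
        return ordered


lake = LinkedService("datalake", "storage")
raw_sales = Dataset("raw_sales", lake)

pipeline = Pipeline("daily_sales", [
    Activity("load_to_sql", "sql_script", depends_on=["clean_sales"]),
    Activity("copy_sales", "copy"),
    Activity("clean_sales", "notebook", depends_on=["copy_sales"]),
])

print(pipeline.run_order())
```

A trigger would then be whatever starts the pipeline run, either a manual call or a schedule, tumbling window, or event.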

asa2.jpg

Synapse Studio

Synapse Studio provides workspaces for data preparation, data management, data exploration, data warehousing, big data, and AI tasks. Data engineers can manage data pipelines using a code-free visual environment.

To learn more about Azure Synapse, data engineering, and the Azure cloud, visit our official site: mentorstag.com/s/store/courses/description/..
