
Most Recent
pchristinami
Microsoft Employee

Microsoft Fabric is an end-to-end analytics and data platform designed for enterprises that require a unified solution. It encompasses data movement, processing, ingestion, transformation, real-time event routing, and report building. It offers a comprehensive suite of services including Data Engineering, Data Factory, Data Science, Real-Time Analytics, Data Warehouse, and Databases. 

 

fabric-architecture.png

 

As a Power BI user, you now have access to a full suite of enhanced capabilities through its integration into a comprehensive SaaS platform, giving you even more tools to explore and utilize. In this blog article, we will focus on Fabric Data Pipelines. 

 

Fabric data pipelines are designed to simplify and automate the process of moving, transforming, and loading data into your desired destination. They provide a streamlined approach to handling data workflows, allowing you to focus on what you do best—analyzing data and driving insights. 

 

In this blog post, we'll explore how you can start using Fabric Data Pipelines as a Power BI user who wants to take full advantage of Microsoft Fabric. And if you have never used Power BI before but still want to start with pipelines, you are in the right place!

 

Here are some use cases that may be familiar to you and that this blog series will try to address:

 

  • You have heard of OneLake and want to understand how you can use it to build your BI reports  
  • You want to move and transform data from your company’s legacy systems to a modern analytics platform where Power BI can access them easily  
  • You have heard about some activities in data pipelines like semantic model refresh or notifications in Teams/Outlook that would be useful in your project 

 

If any of these seem interesting to you, let’s get hands-on! 

 

Let’s start with the basics: 

 

➡️What kind of license do I need to start experimenting with Fabric Data Pipelines?  

 

There are different options available as of December 2024: 

 

  1. You have access to a Power BI Premium capacity and your organization has allowed users to create Fabric items: you can easily start creating Fabric items like data pipelines.
  2. You have access to a Fabric capacity.
  3. You can start a Fabric trial as described here.

Any of these options will work, so let's go directly to powerbi.com!

 

➡️What are the important Fabric terms I should know before jumping into Fabric Data Pipelines?

 

Fabric is a unified analytics platform, and given the different experiences it offers, from Data Engineering to Real-Time Analytics, it's hard to master every component.

 

In this section, we want to highlight some important items you should be aware of before starting your pipeline development in the context of this blog series:

 

  • OneLake: A single, unified, logical data lake for your whole organization. A data lake stores and processes large volumes of data from various sources. Like OneDrive, OneLake comes automatically with every Microsoft Fabric tenant and is designed to be the single place for all your analytics data.
  • Lakehouse: Microsoft Fabric Lakehouse is a data architecture platform for storing, managing, and analyzing structured and unstructured data in a single location. It's a flexible and scalable solution that allows organizations to handle large volumes of data using various tools and frameworks to process and analyze that data. Delta Lake is the unified table format: all Fabric experiences generate and consume Delta Lake tables, driving interoperability and a unified product experience. Delta Lake tables produced by one compute engine, such as Synapse Data Warehouse or Synapse Spark, can be consumed by any other engine, such as Power BI (see the sketch after this list). When you ingest data into Fabric, Fabric stores it as Delta tables by default.
  • Warehouse: A lake-centric data warehouse built on an enterprise-grade distributed processing engine that enables industry-leading performance at scale while eliminating the need for configuration and management.
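To make the Delta Lake interoperability point concrete, here is a minimal PySpark sketch, assuming it runs in a Fabric notebook attached to a lakehouse (where the spark session is predefined); the table and column names are illustrative:

```python
from pyspark.sql import functions as F

# Minimal sketch, assuming a Fabric notebook attached to a lakehouse,
# where `spark` is predefined. Table and column names are illustrative.
df = spark.createDataFrame([(1, "green"), (2, "yellow")], ["trip_id", "taxi_type"])

# Saving as a managed table stores it as a Delta table in the lakehouse.
df.write.format("delta").mode("overwrite").saveAsTable("sample_trips")

# Any other Fabric engine (Warehouse, Power BI via Direct Lake) can now read
# the same Delta table; from Spark it is simply another managed table.
spark.table("sample_trips").groupBy("taxi_type").count().show()
```

Because the table is stored in the open Delta format, the same data is immediately queryable from the lakehouse's SQL analytics endpoint or a Power BI semantic model without making a copy.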

 

In the following steps, we will use a lakehouse as the destination of our data pipeline. We will create the lakehouse as part of the pipeline-building process, so there is no need to create it now. As a Power BI user, you are already familiar with the Power BI interface. Now with Fabric, if you go to powerbi.com, you can easily switch between experiences from the bottom left and navigate to Data Factory:

 

Landing Page.png

 

 

 

➡️How can I create my first pipeline? 

 

For the following steps, you can either create items in ‘My workspace’ or create a new workspace to host your new Fabric items. If you create a new workspace, make sure to check the advanced settings while creating it and choose the right capacity. 

Workspaces continue to be the collaborative environment where you can manage reports, semantic models, and so on, but they can now also include Fabric items like Dataflows Gen2, data pipelines, and more.

 

We will start by creating a pipeline item called SamplePipeline: 

 

Data pipeline.png

 

 

You can see two options: start with a blank canvas, or start with guidance. Given that we are new to data pipelines as Power BI users, we will start with guidance, and more specifically the Copy Data Assistant.

 

Copy data assistant.png

 

 

As mentioned before, we can use data pipelines to ingest data at scale and schedule data workflows. The Copy Data Assistant helps us in this process through a step-by-step experience of selecting our source and destination. In later blog articles, we will explore more advanced options!
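For the curious, the assistant ultimately produces an ordinary pipeline definition with a single Copy activity. The sketch below, written as a Python dict for readability, is an illustrative approximation of its shape, not the authoritative schema; the source and sink type names are deliberately left as placeholders:

```python
import json

# Illustrative approximation of the definition the Copy Data Assistant
# generates behind the scenes: one Copy activity pairing a source with a
# sink. Property names are abbreviated assumptions; the real definition
# carries many more settings.
pipeline = {
    "name": "SamplePipeline",
    "properties": {
        "activities": [
            {
                "name": "Copy data",
                "type": "Copy",
                "typeProperties": {
                    "source": {"type": "..."},  # the NYC Taxi - Green sample
                    "sink": {"type": "..."},    # the destination lakehouse table
                },
            }
        ]
    },
}
print(json.dumps(pipeline, indent=2))
```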

 

Let's see now how the Copy Data Assistant guides us through the process of creating our first pipeline:

 

  • We start by selecting our data source, which in our case is the NYC Taxi – Green sample data: NYC Taxi - Green.png

     

 

  • We see a preview of the data and click Next: Data Preview.png

     

 

  • We land the data in a new lakehouse that we create on the spot: Destination Lakehouse.png

     

 New Lakehouse.png

 

  • We leave the default options on the next page; a new table will be created with the following column mappings: Column Mappings.png

     

 

  • We click Save and Run: Review and save.png

     

 

  • We are redirected to the pipeline canvas, where the Copy activity has been automatically inserted by the Copy Data Assistant to complete the data movement. The pipeline run completes successfully after a few minutes:

Successful pipeline.png

 

 

As you can see, the output pane includes information about the pipeline run, which you can also export. You can click on the activity name to get more information about the pipeline run:

 

Pipeline run details.png

 

You can now go to the BronzeLakehouse and verify that the table has been created. Just go to the workspace on the left and choose the lakehouse:

 

Lakehouse navigation.png

 

Table in Lakehouse.png
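If you prefer code over clicks, you can run the same check from a Fabric notebook attached to the BronzeLakehouse; the table name below is illustrative, so use whatever name the Copy Data Assistant created for you:

```python
# Run in a Fabric notebook attached to BronzeLakehouse; `spark` is predefined.
# Replace the table name with the one the Copy Data Assistant created.
spark.sql("SHOW TABLES").show()
spark.sql("SELECT COUNT(*) AS row_count FROM nyc_taxi_green").show()
```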

 

 

Congratulations, you just created your first pipeline that lands data in a lakehouse. 

 

From this point, you can transform the data with Spark notebooks, add more data to your lakehouse, and much more. And, most important for a Power BI user starting out with Fabric: you can now use Direct Lake!
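As a small taste of that next step, here is an illustrative PySpark sketch of a notebook transformation over the ingested table; the table and column names are assumptions based on the NYC Taxi – Green schema, so adjust them to what you see in your lakehouse:

```python
from pyspark.sql import functions as F

# Illustrative sketch of a notebook transformation over the ingested table.
# Table and column names are assumptions based on the NYC Taxi - Green schema.
trips = spark.table("nyc_taxi_green")

cleaned = (
    trips
    # Derive trip duration in minutes from the pickup/dropoff timestamps.
    .withColumn(
        "trip_minutes",
        (F.unix_timestamp("lpepDropoffDatetime")
         - F.unix_timestamp("lpepPickupDatetime")) / 60,
    )
    # Keep only plausible trips.
    .filter(F.col("trip_minutes") > 0)
)

# Save the result as a new Delta table that Power BI can read via Direct Lake.
cleaned.write.format("delta").mode("overwrite").saveAsTable("nyc_taxi_green_cleaned")
```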

 

Stay tuned for the next parts where we explore more advanced features of Fabric Data Pipelines.

 

 

sean_ms
Microsoft Employee

In this blog we discuss how to use the expression language to handle referencing a field that may or may not exist at runtime, i.e., a non-existent property.

Read more...

sean_ms
Microsoft Employee

This short blog details a common scenario we saw in Azure Data Factory, where we wanted to ignore zero-byte (empty) files landing in our storage accounts. In this blog we show you how to achieve this functionality with data pipeline storage event triggers (preview) and provide references to the properties and schemas of the Event Grid topics, which will empower you to specify the filters you need to be successful.

Read more...

Pragati11
Super User

banner.jpg

 

For the last few days, I have been working on the Contoso Sales data to create a Power BI report as part of my learning. Currently, I am using the default ready-to-go Power BI data model provided by Microsoft, which can be found here. As Microsoft Fabric is the new tech buzz, I thought: why don't I get this data into the Fabric environment somehow?

Read more...

KevinChant
Most Valuable Professional

In this post I want to cover my initial tests of the Data Factory Testing Framework, which is a unit testing framework you can use to test Microsoft Fabric Data Pipelines.

 

I wanted to cover this framework since I mentioned it in a previous post about unit tests on Microsoft Fabric items.

Read more...