eyeballkid
Regular Visitor

Moving CRM Data from Bronze to Silver lakehouses

Hi all,

I'm relatively new to Fabric and I'm trying to build out a POC to demonstrate a requirement for taking data from our Dynamics 365 environments and other systems, populating a lakehouse, and moving the data through a medallion architecture.

 

I have set myself up with a new workspace and provisioned bronze and silver lakehouses. In the bronze lakehouse I have created a shortcut to some CRM data in an ADLS Gen2 storage container that was populated using Azure Synapse Link, all of which has been pretty straightforward!

 

eyeballkid_0-1730907858858.png

 

What I am struggling with is then moving that raw data into the Silver lakehouse as Delta tables. The CSV files don't have any column headers, and I can't seem to figure out a way to get them.

 

In Azure Synapse I can create a Dataflow that connects to the data lake database, reads the files in table format, and lets me process them as I wish.

 

eyeballkid_1-1730908323719.png

eyeballkid_2-1730908406070.png

 

Is there a way of doing this in Fabric?

 

Many Thanks!

 

 

3 ACCEPTED SOLUTIONS
spencer_sa
Resolver III

In Fabric, the Pipeline Copy Data activity has both a Dataverse and a Dynamics CRM connector.

spencer_sa_0-1730928114085.png

You'll probably need to set up a gateway (but if you've got Synapse talking to D365, this shouldn't be a hardship).

spencer_sa_1-1730928230525.png

 


AndyDDC
Most Valuable Professional

Hi @eyeballkid, the problem here is that, as part of Synapse Link, the process sets up a lake database in Synapse and synchronises the Dynamics table and column metadata, which casts structure over the exported CSV files. If in Fabric you shortcut to and query these CSV files directly, there is no metadata in the CSV files themselves.
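That said, the metadata isn't lost entirely: Synapse Link also writes a CDM model.json file alongside the CSV exports, which holds the table and column definitions. As a hedged sketch (assuming the standard model.json layout at the root of your shortcut, here Files/dataverse, and a hypothetical "account" entity), you could rebuild the column names in a notebook like this:

import json

# Read the CDM model.json that Synapse Link writes next to the CSV exports.
# The path and entity name below are assumptions; adjust to your shortcut layout.
raw = spark.read.text("Files/dataverse/model.json", wholetext=True).first()[0]
model = json.loads(raw)

# Pull the column names for one entity, e.g. the hypothetical "account" table.
entity = next(e for e in model["entities"] if e["name"] == "account")
col_names = [attr["name"] for attr in entity["attributes"]]

# Apply the names to the header-less CSV export for that entity.
df = (spark.read.format("csv")
      .option("header", "false")
      .load("Files/dataverse/account/*.csv")
      .toDF(*col_names))
display(df)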

 

Is there a possibility of using Fabric Link instead? This will create a lakehouse in a workspace and synchronise the metadata: https://learn.microsoft.com/en-us/power-apps/maker/data-platform/azure-synapse-link-view-in-fabric

 

Be warned that the data itself is exported as Delta Parquet and stored in Dataverse database storage.


v-huijiey-msft
Community Support

Hi @eyeballkid ,

 

Thanks for the replies from spencer_sa and AndyDDC.

 

I have a suggestion for your “csv file doesn't have any column headers” problem.

 

First, open a notebook:

vhuijieymsft_0-1730947987572.png

 

I can reproduce your csv file without any column headers.

vhuijieymsft_1-1730947987580.png

 

Please note that the column headers currently contain the first row of the data. To correct this, you need to modify the first line of the code as follows:

 

df = spark.read.format("csv").option("header", "false").load("Files/orders/2019.csv")
display(df)

 

vhuijieymsft_2-1730948053722.png

 

Column names have now been changed to _c0, _c1, etc.

 

Descriptive column names help you make sense of your data. To create meaningful column names, you need to define a schema with data types, which requires importing the standard Spark SQL types. Replace the existing code with the following:

 

from pyspark.sql.types import *

orderSchema = StructType([
    StructField("SalesOrderNumber", StringType()),
    StructField("SalesOrderLineNumber", IntegerType()),
    StructField("OrderDate", DateType()),
    StructField("CustomerName", StringType()),
    StructField("Email", StringType()),
    StructField("Item", StringType()),
    StructField("Quantity", IntegerType()),
    StructField("UnitPrice", FloatType()),
    StructField("Tax", FloatType())
])

df = spark.read.format("csv").schema(orderSchema).load("Files/orders/2019.csv")
display(df)

 

vhuijieymsft_3-1730948085895.png
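If you would rather not declare a full schema, a lighter-weight alternative (a sketch assuming you know the column order; every column stays a string until you cast it) is to rename the positional _c0, _c1, ... columns in one go with toDF:

# Hedged alternative: rename positional columns instead of declaring a schema.
# Column names are the ones from the order example above.
df = (spark.read.format("csv")
      .option("header", "false")
      .load("Files/orders/2019.csv")
      .toDF("SalesOrderNumber", "SalesOrderLineNumber", "OrderDate",
            "CustomerName", "Email", "Item", "Quantity", "UnitPrice", "Tax"))
display(df)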

 

Save it as a delta table using the following code:

 

df.write.format("delta").saveAsTable("test")

 

vhuijieymsft_4-1730948113366.png

 

The final display looks like the screenshot below, and the table is now ready to use.

vhuijieymsft_5-1730948113374.png
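Since the goal is to land the table in the Silver lakehouse, note that saveAsTable writes to the notebook's default lakehouse. As a sketch, you can instead target the Silver lakehouse explicitly through its OneLake path (the workspace and lakehouse names below are placeholders for your own):

# Hedged sketch: write the Delta table straight into the Silver lakehouse's
# Tables area; replace <workspace> and <SilverLakehouse> with your own names.
silver_path = ("abfss://<workspace>@onelake.dfs.fabric.microsoft.com/"
               "<SilverLakehouse>.Lakehouse/Tables/orders")
df.write.format("delta").mode("overwrite").save(silver_path)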

 

In Fabric you can also create a dataflow that reads CSV files and processes them as needed.

vhuijieymsft_6-1730948129007.png

 

You can specify column names and data types for CSV files, and use dataflows to transform raw data as needed.

vhuijieymsft_7-1730948129015.png

 

After the data transformation is complete, you can use the dataflow's output settings to load the table into Silver Lakehouse.

vhuijieymsft_8-1730948140568.png
vhuijieymsft_9-1730948140570.png

 

After publishing, you can go to the target lakehouse and see that the table has been loaded successfully.

vhuijieymsft_0-1730948249639.png

 

You can also schedule the dataflow to refresh at specified intervals, automating the movement of data from the bronze lakehouse to the silver lakehouse.

vhuijieymsft_1-1730948258456.png

 

If you have any other questions please feel free to contact me.

 

Best Regards,
Yang
Community Support Team

 

If any post helps, please consider accepting it as the solution to help other members find it more quickly.
If I misunderstand your needs or you still have problems, please feel free to let us know. Thanks a lot!


4 REPLIES
eyeballkid
Regular Visitor

Many thanks to everyone for all the really helpful replies. For the purposes of my testing I went straight to the Dataverse, which provided me with column headers, but there will be instances where I need to create my own schemas, so that's very useful as well.

