jose_henrique
Frequent Visitor

Load data from ADLS to Lakehouse using Notebook

So, here's my situation:
In Dynamics 365 F&O, I have an active Export to Data Lake action that saves the data into folders in my storage: the data itself as a headerless CSV file, plus a JSON with the metadata (column name and column type).


In Databricks, I used to run a notebook with this code:

 

from pyspark.sql.functions import *
from pyspark.sql.types import *
from pyspark.sql import Window
import json

def get_cdm_files(directory_path):
  """Recursively list every .cdm.json file under directory_path."""
  cdm_files = []
  files_to_treat = dbutils.fs.ls(directory_path)
  while files_to_treat:
    path = files_to_treat.pop(0).path
    if path.endswith('/'):
      # Folder: queue its contents for listing.
      files_to_treat += dbutils.fs.ls(path)
    elif path.endswith('.cdm.json'):
      cdm_files.append(path)
  return cdm_files
allcdm = get_cdm_files("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/name.operations.dynamics.com/Tables/")
 
def leTabelaDynamics(abfsspath):
  # Read the column names from the CDM metadata file (via the DBFS mount),
  # then apply them to the headerless CSV folder that sits next to it.
  jsonpath = abfsspath.replace("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/", "/dbfs/mnt/dynamics/")
  with open(jsonpath) as f:
    header = json.load(f)
  colss = [item["name"] for item in header['definitions'][0]['hasAttributes']]

  return spark.read.csv(abfsspath.replace(".cdm.json", "/"), multiLine=True, inferSchema=True, enforceSchema=True).toDF(*colss)
 
contratos = \
[
  "InventTable",
  "EcoResProductTranslation",
  "EcoResProductCategory",
  "EcoResCategory",
  "AgreementHeader",
  "AgreementClassification",
  "SalesLine",
  "CustInvoiceTrans",
  "CustInvoiceJour",
  "CustInvoiceJour_BR",
  "SalesPurchOperationType_BR",
  "FiscalDocument_BR",
  "PurchLine",
  "VendInvoiceTrans",
  "VendInvoiceJour",
  "InventDim",
  "AgreementLine",
  "VendTable",
  "DirPartyTable",
  "CustTable",
  "CustInvoiceTrans",
  "PurchTable_BR"
]

estoque = \
[
  "InventSumHistorical",
  "InventBatch",
  "InventDim",
  "InventSum",
  "InventTable",
  "EcoResProductTranslation",
  "EcoResProductCategory",
  "EcoResCategory"
]

recebiveis = \
[
  "FiscalEstablishment_BR"
  , "DimensionAttributeValueSet"
  , "DimensionAttributeAalueSetItem"
  , "DimensionAttributeValue"
  , "DimensionAttribute"
  , "DirPartyLocation"
  , "CustLedger"
  , "CustSettlement"
  , "CustTrans"
  , "CustTransOpen"
]

perdaseganhos = \
[
    "dimensionattributevalueset"
  , "generaljournalaccountentry"
  , "generaljournalentry"
  , "UserInfo"
]

todas = (estoque
  + ["InventTransferTable", "InventTransferLine"]
  + ["UNITOFMEASURE", "UnitOfMeasureConversion"]
  + ["INVENTTABLE", "DIMENSIONATTRIBUTEVALUESET", "DIMENSIONATTRIBUTEVALUESETITEM",
     "DIMENSIONATTRIBUTEVALUE", "DIMENSIONATTRIBUTE", "DIMENSIONFINANCIALTAG"]
  + ["PurchLine", "VendInvoiceTrans", "VendInvoiceJour", "AgreementHeader",
     "AgreementLine", "AgreementClassification"]
  + perdaseganhos + recebiveis + contratos
  + ["PurchTable", "EcoResProductVariantDimensionValue", "EcoResConfiguration"])
 
successes = []
fails = []

# Exclude a few custom entities that should not be ingested.
try:
  allcdm.remove("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/name.operations.dynamics.com/Tables/Custom/INVENTSUMHISTORICAL.cdm.json")
  allcdm.remove("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/name.operations.dynamics.com/Tables/Custom/UNITOFMEASURE.cdm.json")
  allcdm.remove("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/name.operations.dynamics.com/Tables/Custom/UNITOFMEASURECONVERSION.cdm.json")
except ValueError:
  pass
 
todas_lower = [x.lower() for x in todas]

for item in allcdm:
  # Skip the manifest and resolved CDM files; keep only per-table metadata.
  if "manifest" not in item and "resolved" not in item:
    tname = item.split("/")[-1].replace(".cdm.json", "").lower()
    if tname in todas_lower:

      try:
        dftemp = leTabelaDynamics(item)

        # Drop any previous version of the table before overwriting it.
        try:
          spark.sql("drop table main.dynamics." + tname)
        except Exception:
          pass

        dftemp.write.mode("overwrite").saveAsTable("main.dynamics." + tname)
        successes.append(item)
        print("success: " + item.replace("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/name.operations.dynamics.com/Tables/", ""))

      except Exception:
        print("error on " + item)
        fails.append(item)

You can see this code is quite hard to maintain.
So, what I was planning is:

1- to make this code easier to maintain, without hard-coding all the tables
2- instead of sending the tables to Databricks, save them in a lakehouse

I tried to adapt the code to a Fabric notebook (dbutils doesn't work on Fabric, by the way), but without success. Even when I modified all the code and got it to run, it didn't ingest the tables into my lakehouse.
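For reference, this is the direction I was attempting for the listing step: a minimal sketch that assumes mssparkutils (Fabric's notebook utilities) as the stand-in for dbutils, with the same placeholder storage URL as above. The /dbfs mount read in leTabelaDynamics would also need replacing.

from notebookutils import mssparkutils

def get_cdm_files(directory_path):
  """Recursively list every .cdm.json file under directory_path."""
  cdm_files = []
  to_visit = mssparkutils.fs.ls(directory_path)
  while to_visit:
    entry = to_visit.pop(0)
    if entry.isDir:
      # Folder: queue its contents for listing.
      to_visit += mssparkutils.fs.ls(entry.path)
    elif entry.path.endswith('.cdm.json'):
      cdm_files.append(entry.path)
  return cdm_files

allcdm = get_cdm_files("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/name.operations.dynamics.com/Tables/")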
 
Has anyone tried something like this, or can anyone help me with this case?
4 REPLIES
DennesTorres
Impactful Individual

Hi,

Did you try creating a shortcut to the ADLS in the Files area of the lakehouse and then reading the data as if it were local to the lakehouse, converting it from Files to tables?
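For example, something in this direction, a minimal sketch that assumes a shortcut named "dynamics" was created in the Files area pointing at the Tables folder, and that the lakehouse is attached as the notebook's default (the InventTable paths and table name are just an illustration):

import json
from notebookutils import mssparkutils

# Read the column names from the CDM metadata, reached through the shortcut.
meta = json.loads(mssparkutils.fs.head("Files/dynamics/Custom/InventTable.cdm.json", 1024 * 1024))
cols = [a["name"] for a in meta["definitions"][0]["hasAttributes"]]

# Read the headerless CSV folder next to it and save it as a lakehouse table.
df = spark.read.csv("Files/dynamics/Custom/InventTable/", multiLine=True, inferSchema=True).toDF(*cols)
df.write.mode("overwrite").saveAsTable("inventtable")

saveAsTable writes a Delta table into the lakehouse Tables area, so it also becomes queryable from the SQL endpoint.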

Kind Regards,

 

Dennes

So, what I did was connect the files and then transform them using Dataflow Gen 2 (because of the headerless CSV and the separate header file). But now I'm facing another issue: I can't combine two CSV files (because they're headerless).
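One notebook-side way around this, a minimal sketch reusing the hypothetical "dynamics" shortcut from the previous reply (the SalesLine paths are illustrative): a single folder read combines every headerless part file, and the column names come from the CDM JSON rather than from the files, so there is no manual combine step.

import json
from notebookutils import mssparkutils

# Column names come from the CDM metadata, not from the (headerless) files.
meta = json.loads(mssparkutils.fs.head("Files/dynamics/Custom/SalesLine.cdm.json", 1024 * 1024))
cols = [a["name"] for a in meta["definitions"][0]["hasAttributes"]]

# One folder read picks up every CSV part file, so no manual combine is needed.
df = spark.read.csv("Files/dynamics/Custom/SalesLine/", multiLine=True).toDF(*cols)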

Hi @jose_henrique 

 

Apologies for the delay in response.

Thanks for using Microsoft Fabric platform and posting your question here.

Just wanted to check whether your issue got resolved. Please do let us know in case of any further queries.

Hi @jose_henrique 

 

Following up to see whether your issue got resolved. Please do let us know in case of any further queries.
