jose_henrique
Frequent Visitor

Load data from ADLS to Lakehouse using Notebook

So, here's my situation:
In Dynamics 365 F&O, I have an active Export to Data Lake action that saves, in folders in my storage, the data itself as a headerless CSV file plus a JSON file with the metadata (column names and column types).

In Databricks, I used to run a notebook with this code:

 

from pyspark.sql.functions import *
from pyspark.sql.types import *
from pyspark.sql import Window
import json

def get_cdm_files(directory_path):
  """Recursively list the paths of all .cdm.json files under directory_path."""
  cdm_files = []
  files_to_treat = dbutils.fs.ls(directory_path)
  while files_to_treat:
    path = files_to_treat.pop(0).path
    if path.endswith('/'):
      # Directory: queue its children for traversal
      files_to_treat += dbutils.fs.ls(path)
    elif path.endswith('.cdm.json'):
      cdm_files.append(path)
  return cdm_files

allcdm = get_cdm_files("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/name.operations.dynamics.com/Tables/")

def leTabelaDynamics(abfsspath):
  # Read the CDM metadata file through the DBFS mount to get the column names
  jsonpath = abfsspath
  f = open(jsonpath.replace("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/", "/dbfs/mnt/dynamics/"))
  heade = json.load(f)
  colss = []
  for item in heade['definitions'][0]['hasAttributes']:
    colss.append(item["name"])

  # Read the headerless CSV folder next to the metadata and apply the column names
  return spark.read.csv(abfsspath.replace(".cdm.json", "/"), multiLine=True, inferSchema=True, enforceSchema=True).toDF(*colss)
 
contratos = \
[
  "InventTable",
  "EcoResProductTranslation",
  "EcoResProductCategory",
  "EcoResCategory",
  "AgreementHeader",
  "AgreementClassification",
  "SalesLine",
  "CustInvoiceTrans",
  "CustInvoiceJour",
  "CustInvoiceJour_BR",
  "SalesPurchOperationType_BR",
  "FiscalDocument_BR",
  "PurchLine",
  "VendInvoiceTrans",
  "VendInvoiceJour",
  "InventDim",
  "AgreementLine",
  "VendTable",
  "DirPartyTable",
  "CustTable",
  "CustInvoiceTrans",
  "PurchTable_BR"
]

estoque = \
[
  "InventSumHistorical",
  "InventBatch",
  "InventDim",
  "InventSum",
  "InventTable",
  "EcoResProductTranslation",
  "EcoResProductCategory",
  "EcoResCategory"
]

recebiveis = \
[
  "FiscalEstablishment_BR"
  , "DimensionAttributeValueSet"
  , "DimensionAttributeAalueSetItem"
  , "DimensionAttributeValue"
  , "DimensionAttribute"
  , "DirPartyLocation"
  , "CustLedger"
  , "CustSettlement"
  , "CustTrans"
  , "CustTransOpen"
]

perdaseganhos = \
[
    "dimensionattributevalueset"
  , "generaljournalaccountentry"
  , "generaljournalentry"
  , "UserInfo"
]

todas = (
    estoque
  + ["InventTransferTable", "InventTransferLine"]
  + ["UNITOFMEASURE", "UnitOfMeasureConversion"]
  + ["INVENTTABLE", "DIMENSIONATTRIBUTEVALUESET", "DIMENSIONATTRIBUTEVALUESETITEM", "DIMENSIONATTRIBUTEVALUE", "DIMENSIONATTRIBUTE", "DIMENSIONFINANCIALTAG"]
  + ["PurchLine", "VendInvoiceTrans", "VendInvoiceJour", "AgreementHeader", "AgreementLine", "AgreementClassification"]
  + perdaseganhos
  + recebiveis
  + contratos
  + ["PurchTable", "EcoResProductVariantDimensionValue", "EcoResConfiguration"]
)
 
success = []
fails = []

# These three tables are handled separately, so drop them from the list if present
try:
  allcdm.remove("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/name.operations.dynamics.com/Tables/Custom/INVENTSUMHISTORICAL.cdm.json")
  allcdm.remove("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/name.operations.dynamics.com/Tables/Custom/UNITOFMEASURE.cdm.json")
  allcdm.remove("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/name.operations.dynamics.com/Tables/Custom/UNITOFMEASURECONVERSION.cdm.json")
except ValueError:
  pass

for item in allcdm:
  # Skip manifest and resolved entity files; keep only the entity metadata
  if "manifest" not in item and "resolved" not in item:
    tname = item.split("/")[-1].replace(".cdm.json", "").lower()
    if tname in [x.lower() for x in todas]:

      try:
        dftemp = leTabelaDynamics(item)

        # Recreate the target table from scratch
        try:
          spark.sql("drop table main.dynamics." + tname)
        except Exception:
          pass

        dftemp.write.mode("overwrite").saveAsTable("main.dynamics." + tname)
        success.append(item)
        print("success: " + item.replace("abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/name.operations.dynamics.com/Tables/", ""))

      except Exception:
        print("error on " + item)
        fails.append(item)

As you can see, this code is very hard to maintain.
So, what I was planning is:

1- to make this code easier to maintain, without hard-coding all the tables
2- instead of sending the tables to Databricks, save them to the lakehouse

I tried to adapt the code to Fabric's notebooks (dbutils doesn't work on Fabric, by the way), but without success.
Even when I modified the whole code and got it running, it didn't ingest the tables into my lakehouse.
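For reference, here is a minimal sketch of the kind of adaptation I mean (assuming mssparkutils takes the place of dbutils, that Fabric can reach the storage account, and that the notebook has a default lakehouse attached; the storage URL and paths are the same placeholders as above):

from notebookutils import mssparkutils
import json

BASE = "abfss://dynamics365-financeandoperations@datalakename.dfs.core.windows.net/name.operations.dynamics.com/Tables/"

def get_cdm_files(directory_path):
  """Recursively list all .cdm.json files, with mssparkutils instead of dbutils."""
  cdm_files = []
  files_to_treat = mssparkutils.fs.ls(directory_path)
  while files_to_treat:
    info = files_to_treat.pop(0)
    if info.isDir:
      files_to_treat += mssparkutils.fs.ls(info.path)
    elif info.path.endswith('.cdm.json'):
      cdm_files.append(info.path)
  return cdm_files

def read_dynamics_table(cdm_json_path):
  """Read the headerless CSV folder next to a .cdm.json and apply its column names."""
  # wholeTextFiles reads the metadata file straight from abfss, no mount needed
  raw = spark.sparkContext.wholeTextFiles(cdm_json_path).collect()[0][1]
  cols = [a["name"] for a in json.loads(raw)["definitions"][0]["hasAttributes"]]
  return spark.read.csv(cdm_json_path.replace(".cdm.json", "/"), multiLine=True, inferSchema=True).toDF(*cols)

for item in get_cdm_files(BASE):
  if "manifest" not in item and "resolved" not in item:
    tname = item.split("/")[-1].replace(".cdm.json", "").lower()
    # With a default lakehouse attached, saveAsTable writes a Delta table into its Tables area
    read_dynamics_table(item).write.mode("overwrite").format("delta").saveAsTable(tname)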
 
Has anyone tried something like this, or can anyone help me with this case?
4 REPLIES
DennesTorres
Impactful Individual

Hi,

Did you try creating a shortcut to the ADLS in the Files area of the lakehouse and then reading the data as if it were local to the lakehouse, converting it from Files to tables?
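For example, a minimal sketch, assuming a shortcut named dynamics was created under the lakehouse's Files area pointing at the ADLS Tables folder (the shortcut and table names here are hypothetical):

import json

# Relative Files/... paths resolve against the default lakehouse,
# so the shortcut's contents read as if they were local
cdm_json = "Files/dynamics/Custom/InventTable.cdm.json"

# Small single file, so collecting the lines preserves their order
raw = "\n".join(r.value for r in spark.read.text(cdm_json).collect())
cols = [a["name"] for a in json.loads(raw)["definitions"][0]["hasAttributes"]]

df = spark.read.csv(cdm_json.replace(".cdm.json", "/"), multiLine=True, inferSchema=True).toDF(*cols)

# A managed Delta table shows up under the lakehouse's Tables area
df.write.mode("overwrite").format("delta").saveAsTable("inventtable")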

Kind Regards,

 

Dennes

So, what I did was connect the files and then transform them using Dataflow Gen2 (because of the headerless CSV and the separate header file). But now I'm facing another issue: I can't combine 2 CSV files, because they're headerless. A notebook-side workaround sketch is below.
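One possible workaround in a notebook (a sketch, assuming both files share the same layout; the column names would come from the .cdm.json metadata, and the file and column names here are hypothetical):

# Headerless files can't be aligned by header, so name the columns first
cols = ["RECID", "ITEMID", "DATAAREAID"]  # hypothetical, taken from the metadata

df1 = spark.read.csv("Files/dynamics/InventTable/part1.csv", multiLine=True).toDF(*cols)
df2 = spark.read.csv("Files/dynamics/InventTable/part2.csv", multiLine=True).toDF(*cols)

combined = df1.unionByName(df2)

Pointing spark.read.csv at the folder instead of the individual files would also read and combine all the part files in one pass.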

Hi @jose_henrique 

 

Apologies for the delay in response.

Thanks for using Microsoft Fabric platform and posting your question here.

Just wanted to check whether your issue got resolved; please do let us know in case of any further queries.

Hi @jose_henrique 

 

Following up to see whether your issue got resolved; please do let us know in case of any further queries.
