Hey all,
I've loaded my sources and my medallion stages into their own lakehouses, both for scalability and to keep things tidy from the get-go. I'm now in a position where it would be REALLY useful to be able to, within a notebook, pull data in from a source, run some SQL on it, and load it into a layer. A pretty standard, simple transformation step.
I can add those lakehouses to my notebook so they're both there, but I can't get data from any lakehouse that isn't the default for that notebook. If we can't get data from it, why can we add it? Is this something that's coming soon, or am I doing something wrong?
When I select 'load data' from a table in a lakehouse that isn't the notebook's default, it auto-populates the cell, starting with the comment:
I went through the same situation and found an answer in "Spark SQL can't read lakehouse tables in a schema with an uppercase name". You need to put the following code at the beginning of your notebook:
spark.conf.set("spark.sql.caseSensitive", "true")
PS: you could also try an absolute approach and load from the correct lakehouse by using its abfss path:
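from delta.tables import DeltaTable
# Placeholders: paste the abfss path from the lakehouse properties and set your own table name
lakehousePath = "abfss://yourpathhere"
tableName = "yourtablenamehere"
deltaTablePath = f"{lakehousePath}/Tables/{tableName}"
# Load the Delta table directly by path, so the notebook's default lakehouse doesn't matter
deltaTable = DeltaTable.forPath(spark, deltaTablePath)
deltaTable.toDF().show()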
You can get the abfss path from the lakehouse: click the triple dots on the Tables node in the left menu and open Properties. If you don't see Properties, make sure you are in lakehouse mode and not SQL mode.
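For the original use case (read from a source lakehouse, run some SQL, write to a layer), the path-based approach could be sketched roughly as below. This is only a sketch under assumptions, not a confirmed Fabric recipe: `onelake_table_path` and `copy_with_sql` are hypothetical helpers, the workspace, lakehouse and table names are placeholders, and the path layout is an assumed OneLake form that you should check against the path shown in Properties. `spark` is the session the notebook already provides.

```python
# Assumed OneLake endpoint; verify it against the abfss path shown in Properties.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_table_path(workspace: str, lakehouse: str, table: str) -> str:
    """Build an abfss path to a Delta table (hypothetical helper, assumed layout)."""
    return f"abfss://{workspace}@{ONELAKE_HOST}/{lakehouse}.Lakehouse/Tables/{table}"


def copy_with_sql(spark, source_path: str, target_path: str, query: str, view: str = "src") -> None:
    """Read a Delta table by path, transform it with SQL, and write the result by path."""
    df = spark.read.format("delta").load(source_path)
    # Register a temp view so the SQL can refer to the data by the view name.
    df.createOrReplaceTempView(view)
    result = spark.sql(query)
    # Overwrites the target table; switch to "append" if that suits the layer better.
    result.write.format("delta").mode("overwrite").save(target_path)


# Example call inside a notebook (all names are placeholders):
# copy_with_sql(
#     spark,
#     onelake_table_path("MyWorkspace", "Bronze", "orders"),
#     onelake_table_path("MyWorkspace", "Silver", "orders_clean"),
#     "SELECT * FROM src WHERE order_id IS NOT NULL",
# )
```

If the built path doesn't resolve, copying the exact abfss path from Properties, as described above, is the safer route.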
From our experience, you can access any lakehouse and warehouse in the same workspace. Cross-workspace access will require shortcuts.
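One way to try this from a notebook is to qualify the table with the lakehouse name in Spark SQL. The snippet below is a sketch under that assumption: `qualified_name` and `select_all` are hypothetical helpers, "Bronze" and "orders" are placeholder names, and a schema-enabled lakehouse may need an extra schema part in the name.

```python
def qualified_name(lakehouse: str, table: str) -> str:
    """Return a backtick-quoted `lakehouse`.`table` identifier for Spark SQL."""
    def quote(part: str) -> str:
        # Spark SQL escapes a backtick inside a quoted identifier by doubling it.
        return "`" + part.replace("`", "``") + "`"
    return f"{quote(lakehouse)}.{quote(table)}"


def select_all(lakehouse: str, table: str) -> str:
    """Build a simple SELECT against a table in a named lakehouse."""
    return f"SELECT * FROM {qualified_name(lakehouse, table)}"


# In a notebook (placeholder names):
# df = spark.sql(select_all("Bronze", "orders"))
```

Quoting the parts keeps names with mixed case or spaces intact in the query text.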
The medallion architecture suggests having different zones within a lakehouse.
If I understand your issue, you could create the file structure (zones) within the workspace and in the same lakehouse.
Having a single lakehouse with all layers contained within it goes against the access and security suggestions. Individual lakehouses in their own workspaces allow for role-based / workspace access permissions, keeping bronze safe, silver accessible to data scientists, and gold accessible to reporting analysts.
The solution I was given for this by Microsoft was to create shortcuts in the active lakehouse for any data I want to read in. Writing out can be done from anywhere.