I configured a Lakehouse and made a shortcut pointing to an existing ADL2 storage account.
To do so, I used "Get Data -> New Shortcut" from the top menu. I then went through the connection wizard - I used an access key for testing since neither using a service principal nor my org login would work ("Invalid credentials").
In the end, I had two entries added to the "Tables" node of my Lakehouse, and an error popped up telling me that my data cannot be used as tables and that I'd have to move it to "Files".
So I clicked the tables in question and selected "Move to files". All it did was come back with an error, telling me that tables cannot be moved.
Well, I thought, then let me add the same ADL2 shortcut directly to the "Files" node by right-clicking it and selecting "New shortcut". This worked, I had my files referenced.
Next, I tried to get rid of the invalid tables-shortcuts and selected "Delete". Next thing I notice: my source files had been deleted from my ADL2!
Q1: Given that shortcuts are explained as "pointers" to the original data, I wonder if this is expected behavior? I certainly would have expected to only have the "Table" sub-nodes removed without affecting my underlying raw data.
Q2: Once I had the shortcuts correctly configured under "Files" I right clicked them and selected "Load to tables". When I browse OneLake, I find that a set of parquet files has now been created. This means my data has been replicated. Isn't one of the (promised) benefits of shortcuts that there's only one source of truth for my data? So if I were to modify something in my original ADL2, would this be reflected in the tables being generated?
Just back from vacation, hence the delayed response.
Thanks for the link to the docs about when referenced data gets deleted and when it doesn't. My problem was that the shortcut ended up under "Tables", where it shouldn't be. I was then unable to delete the shortcut object (this must be a bug). So I deleted the folders inside it, and after that I was able to delete the shortcut object as well. Reading the docs explains why my data disappeared from the storage account. I must say, the way this is handled without further warnings is what I would call a suboptimal user experience.
My data in storage is in Parquet format (but not Delta). I understand that when I reference it via a shortcut in "Files", no data is replicated, but it is replicated when I load it into tables. Follow-up questions:
1. Is there anything I can do with the data (query it?) while it is under "Files" only?
2. If I decide to load it into tables and I change the original source in my storage account, will the tables reflect these changes automatically?
3. If my data were in delta format, would I then be able to shortcut it directly into "Tables"?
1. I think you can connect to the files using a Notebook, Pipeline, Dataflow Gen2, Power BI Desktop, etc. Basically any tool that can work with Parquet files. In my understanding, the Files section is like a data lake.
You can also try creating an external (unmanaged) table in the Tables section that uses the shortcut in Files as its data source. I don't have experience with external tables, but from what I've read it should automatically show the updated data from the source.
(Maybe you can also make a shortcut in the Tables section that references the shortcut in the Files section. This sounds like something that won't work, but you could try it and see.)
2. No, with the Load to Tables option the table will not be updated automatically. Load to Tables creates a copy of your data at the moment you click the button. After that, the copy is not refreshed automatically.
If you make an external table, then I think it will be updated automatically when the data in the source is updated.
(If you succeed in making a shortcut in the Tables section that points to the folder in the Files section, it should also be updated automatically.)
3. Yes, that should work.
This is my understanding. Great if someone can confirm or correct 😃
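To make point 1 and the external-table idea a bit more concrete, here is a minimal sketch of how this might look in a Fabric notebook. It is an illustration under assumptions, not something confirmed in this thread: the workspace, lakehouse, folder, and table names are invented, the helper functions are hypothetical, and `spark` is assumed to be the session that a Fabric notebook provides. Whether such an external table is visible outside Spark is something you would need to verify yourself.

```python
def onelake_files_path(workspace: str, lakehouse: str, relative_path: str) -> str:
    """Build an ABFS URI for a folder under the Files section of a lakehouse."""
    relative_path = relative_path.strip("/")
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Files/{relative_path}"
    )


def external_table_ddl(table_name: str, location: str) -> str:
    """Spark SQL that registers an unmanaged (external) table over Parquet files."""
    return (
        f"CREATE TABLE IF NOT EXISTS {table_name} "
        f"USING PARQUET LOCATION '{location}'"
    )


def query_shortcut_files(spark, location: str, table_name: str):
    """Read the files in place, then register an external table over them."""
    # Reads through the shortcut; the Parquet files stay in the source account.
    df = spark.read.parquet(location)
    # The table only records the location; it does not copy the data.
    spark.sql(external_table_ddl(table_name, location))
    return df


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    path = onelake_files_path("MyWorkspace", "MyLakehouse", "adls_shortcut/sales")
    print(path)
    print(external_table_ddl("sales_external", path))
```

Because the external table only stores a location, new files in the source folder should become visible to queries without a reload, although Spark may cache file listings, so running `REFRESH TABLE sales_external` first may be necessary.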
Q2:
I know my colleague was able to connect data in ADLS to a Fabric Lakehouse table using shortcuts. It was live data. He didn't use Load to Tables.
What file type is your data in ADLS?
If your data is already in Delta table format in ADLS, it should be usable as a Fabric Lakehouse table via a shortcut without any extra work.
I can imagine 3 alternative paths I would try; maybe some of them will also work with other file types:
I haven't tried all of these, but they are the options I would try (starting from the top of the list).
I have never used an external table, but the concept sounds like something that could be used for this.
If anyone has any comments on these suggestions, please share, because I would also like to learn 😃
Hi @Krumelur. As far as Q1 is concerned, please refer to this doc, specifically the "Deleting content referenced by a shortcut" section.
OneLake shortcuts - Microsoft Fabric | Microsoft Learn
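As I read that section, the rule is: deleting the shortcut object itself leaves the target untouched, but deleting a file or folder inside a shortcut is passed through to the target (provided you have delete permission there). Below is a toy in-memory model of that rule. The classes are purely illustrative and have nothing to do with the real OneLake API.

```python
class StorageAccount:
    """Stand-in for the ADLS Gen2 account: just a set of file paths."""

    def __init__(self, paths):
        self.paths = set(paths)


class Shortcut:
    """A pointer to a folder (prefix) in a storage account. Holds no data itself."""

    def __init__(self, target, prefix):
        self.target = target
        self.prefix = prefix.rstrip("/") + "/"

    def list_files(self):
        return sorted(p for p in self.target.paths if p.startswith(self.prefix))

    def delete_inside(self, path):
        """Deleting content through the shortcut removes it from the target."""
        if not path.startswith(self.prefix):
            raise ValueError(f"{path} is not under {self.prefix}")
        self.target.paths.discard(path)


class Lakehouse:
    def __init__(self):
        self.shortcuts = {}

    def add_shortcut(self, name, shortcut):
        self.shortcuts[name] = shortcut

    def delete_shortcut(self, name):
        """Deleting the shortcut object removes only the pointer."""
        del self.shortcuts[name]


if __name__ == "__main__":
    adls = StorageAccount({"raw/a.parquet", "raw/b.parquet"})
    lakehouse = Lakehouse()
    lakehouse.add_shortcut("raw_data", Shortcut(adls, "raw"))

    # Deleting a file inside the shortcut hits the source account.
    lakehouse.shortcuts["raw_data"].delete_inside("raw/a.parquet")
    print(sorted(adls.paths))  # ['raw/b.parquet']

    # Deleting the shortcut itself leaves the source account alone.
    lakehouse.delete_shortcut("raw_data")
    print(sorted(adls.paths))  # ['raw/b.parquet']
```

This matches what happened in the original post: the folders inside the shortcut were deleted first, and that is the operation that reached the storage account.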
For Q2: when you "Load to Tables", it actually copies your Files data into the Tables section as Delta tables, which then enables benefits such as Direct Lake. The shortcut is, as you say, a pointer to the source files, but you can then choose to "Load to Tables", which reads the shortcut data and writes it to Delta tables in the Lakehouse.
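In notebook terms, "Load to Tables" behaves roughly like reading the Parquet files and writing them back out as a managed Delta table. The sketch below is an approximation under assumptions, not what the button actually runs: `load_to_table` is a hypothetical helper, `spark` is assumed to be the session of a Fabric notebook with the lakehouse attached as its default, and the path and table name are invented.

```python
def load_to_table(spark, source_path, table_name, mode="overwrite"):
    """Copy Parquet files from the Files section into a managed Delta table.

    The result is a snapshot: later changes in the source are not picked up
    until this function is run again.
    """
    if mode not in ("overwrite", "append"):
        raise ValueError("mode must be 'overwrite' or 'append'")
    # Read the source files through the shortcut.
    df = spark.read.parquet(source_path)
    # Write a physical copy in Delta format under the Tables section.
    df.write.format("delta").mode(mode).saveAsTable(table_name)
    return df


# In a Fabric notebook (hypothetical names):
# load_to_table(spark, "Files/adls_shortcut/sales", "sales")
```

To keep such a table in step with the source, the load would have to be re-run, for example on a schedule.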
Hi @Krumelur
We haven’t heard from you since the last response and are just checking back to see whether you have a resolution yet. If not, please respond with more details and we will try to help.
Thanks