support openrowset/polybase

c_mettler · ‎06-08-2023

Data providers typically deliver data in formats such as parquet, JSON, or CSV. In traditional data warehouse scenarios, these files often land in a persistent staging area. Many organize these staging folders in a structured manner such as year/month/week/day. Currently, these files need to be converted to delta format with an additional workload before they can be consumed in the data warehouse.

The support of openrowset/create external table, as with parquet file globbing, would greatly simplify the load and transform to final delta files in the warehouse workload without the need for additional workloads/datalflow/pipelines complexity.

admin176 · ‎08-04-2023

we use this pattern extensively at work, ADF, azure storage, staging table in sql server, our data engineer dismissed Fabric because of the lack of this feature.

vasu_n · ‎01-24-2024

Hello,

Currently we are serving many BI workloads(Power BI Reports, Excel Tables, etc) through Synapse Serverless SQL openrowset with data from datalake in different formats and from different layers (in medallion architecture).

We are trying to move to Fabric now, but all our data is in parquet format in ADLS2 and existing pipelines are also ingesting the data into ADLS2 (external data, export setupts, etc).

Shortcut is good but it can't be served to BI through fabric without loading it into delta tables. So there needs another ETL to load the ADLS2 data into Delta Tables through spark ETL which is killing mosquito with hammer.

Hope to see OPENROWSET in fabric soon.

Thanks.

vasu_n · ‎03-13-2024

Hello Admin, is this being considered?

Thanks.

a_portoles · ‎08-22-2024

He use OpenRowset in all our projects.

Allow us

Implement data Virtualization, and avoid Staging tables
Grant access to data that is in the LZ but not in the User SQL tables with Views_LZ

Important to have

The ALDS / OneLake must be under firewall ==> Private endpoint compatible (**)
Use SAS or Acces Keys . . .., But compatible with AAD (User MI or System MI) (**)
Most of the Cases we use Authentication SQL <==> Storage: Delegation / Impersonate
In some cases for audit purpolses we need use Pass-Through Authentication, thus the AAD SQL user will appear on ADLS logs
The T-SQL Openrowset must be able to compile on Azure Data studio & Azure DevOps, currently is not able to do it and block all CI/CD progress (**)
Format files (FMT) are really obsolete, need maintenance and are not part of Database Project / GIT, need to find a 2024 replacement (**)
We hope that Select disctinct (col_1) [for Dim Tables] On parquet will be faster than on CSV, but sometimes is not (**)
This Synapse "sugar" is very appreciatee: '*/*/*/myfile*.parquet'
Currently select from openrowset INTO #Temptable is not supported (**)
All the others: AGG pushdown, Metadata, Vertipaq / V-Order, PurView , , . . . future??

(**) Hugo Queiroz knows about these issues and some of them are ACK by Azure support

Jugi · ‎08-22-2024

I think this idea asks for a similar approach https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=6bd2453c-3eed-ee11-a73d-000d3a7c5b9a

Sanjay_Rathod1 · ‎08-22-2024

Hello Jovan Popovic

Am using synapse serverless pool for converting JSON files on the fly to rows and columns

So openrowset is definitely required in Fabric

Thanks

Sanjay

Maxime_Garneau_ · ‎10-30-2024

I agree with folks; this is a must-have feature in a lakehouse: the ability to easily separate the physical layer from the abstract layer without constantly copying or materializing data.

Use case examples:

Rapidly profile and explore raw data (CSV, JSON, etc.) in the lakehouse/files using T-SQL/SQL OPENROWSET in SQL ENDPOINT.
MVVM style design pattern: Implement an interface (view) on top of raw data files (CSV, JSON, etc.) in SQL ENDPOINT using OPENROWSET to facilitate downstream data flows/ETL
de-serialize and standardize json schema easily in a view using OPENROWSET and OPENJSON
expose data in a "fast track mode" to consumer

So basically, just reply with the Synapse Serverless feature. 😉

Furthermore, it says that this idea is in "Planned" status, so can we get an ETA on it? This information could be very helpful in preparing our migration plan from Synapse to Fabric (or not).

Thanks! 🙂

CharlesWebbMSFT · ‎02-21-2025

Folks, please feel free to share more details about the scenarios this would unblock for you.

fbcideas_migusr · ‎02-25-2025

jovanpop-msft · ‎07-31-2025

OPENROWSET is available in Fabric DW and can read CSV, Parquet, and JSONL files. With the views on OPENROWSET you can reference external files.

If you need external tables vote for separate item: Support external tables for parquet/csv in Fabric ... - Microsoft Fabric Community