Manu002
Frequent Visitor

service principal vs own identity in data pipeline

Hi Experts,

I'm a member of a workspace and I'm building a data pipeline. It is meant to pull data from an Azure SQL database into a Fabric Lakehouse.

I want to use a service principal to avoid pipeline failures caused by an employee leaving the organization, or any other issue my own Entra ID might run into, and for better security in general. But to use the service principal auth method, we have to connect to the Lakehouse through its SQL endpoint connection string. Additionally, if the SQL endpoint connection is used, the sink Lakehouse doesn't accept the nvarchar(max) data type.

 

So I'm thinking of using a direct Lakehouse connection instead. My question is below.

 

Assume I've created the Lakehouse, and I connect to it directly in the pipeline. Does this use my own identity to connect to the Lakehouse, or the workspace identity? If I'm separated from the org, will the pipeline fail because my ID is removed from the workspace?

 

Thanks in advance

3 ACCEPTED SOLUTIONS
tayloramy
Community Champion

Hi @Manu002,

 

You are thinking about exactly the right trade-offs. In Fabric pipelines you do not have to bind the pipeline's runtime to your personal identity. You can use either a workspace identity (managed service principal) or your own service principal in the cloud connection(s). 
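
For illustration, here's a minimal sketch (one of several ways, not the official pipeline mechanism) of what service-principal authentication against a Lakehouse SQL analytics endpoint looks like from Python with pyodbc, assuming ODBC Driver 18 is installed; every &lt;placeholder&gt; is a hypothetical value you'd substitute:

```python
import pyodbc

# Service-principal auth against the Lakehouse SQL analytics endpoint.
# All <placeholders> are hypothetical; keep the secret in a vault, not in code.
conn_str = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<your-endpoint>.datawarehouse.fabric.microsoft.com;"
    "DATABASE=<your-lakehouse>;"
    "UID=<app-client-id>;"                  # the app registration's client ID
    "PWD=<client-secret>;"
    "Authentication=ActiveDirectoryServicePrincipal;"
    "Encrypt=yes;"
)

with pyodbc.connect(conn_str) as conn:
    print(conn.cursor().execute("SELECT 1").fetchone())
```

Roughly the same client ID / secret pair is what a cloud connection configured for service principal authentication stores for you.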

 

When you first connect to the Lakehouse in a pipeline, you need to create a connection, which is then saved. This connection uses your credentials, so if you were to leave the org, yes, things would start to fail.

 

The way I have handled this is by using a service account (not a service principal). My service user is, for all intents and purposes, just another user account, but it is not tied to my identity, so if I leave my org it will remain and other members of my team will still have access to it.

 

If you found this helpful, consider giving some Kudos. If I answered your question or solved your problem, mark this post as the solution.


anilgavhane
Resolver IV

@Manu002 

Using your own identity to connect directly to the Lakehouse in a pipeline means the pipeline relies on your personal access. If you're removed from the workspace or organization, the pipeline will likely fail.

Using a service principal avoids this risk and is better for long-term stability and security. However, it requires using the SQL endpoint, which may have limitations like not supporting nvarchar(max) in the sink.

If possible, consider using parameters or staging data to work around SQL endpoint constraints, or use your identity temporarily while planning a service principal setup for production.
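
As a concrete (purely hypothetical) illustration of the casting/staging idea: the query you paste into the Copy activity's source can CAST the wide columns down to types the SQL endpoint sink accepts; the table and column names below are made up:

```python
# A hypothetical source query for the Copy activity. The CASTs keep the
# copied columns within string types the SQL endpoint sink accepts
# (varchar rather than nvarchar); values longer than the target length
# are silently truncated, so size the varchar with care.
source_query = """
SELECT
    Id,
    CAST(Title AS varchar(400))  AS Title,   -- was nvarchar(400)
    CAST(Body  AS varchar(8000)) AS Body     -- was nvarchar(max)
FROM dbo.Articles;
"""
```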


v-veshwara-msft
Community Support

Hi @Manu002 ,

Thank you for sharing your question in Microsoft Fabric Community.

As @tayloramy and @anilgavhane noted, using your own identity for a direct Lakehouse connection will tie the pipeline to your personal access and may cause failures if your account is removed from the workspace or organization.

 

For scenarios where you want the pipeline to continue running independently of individual accounts, creating a dedicated service account to connect to the Lakehouse, as @tayloramy mentioned, is a practical approach. This account can be used across pipelines and is not tied to a specific employee, allowing continuity even if team members change. Please ensure that the service account has the required permissions in the workspace and Lakehouse to avoid access issues.

 

Regarding the SQL endpoint, it is correct that nchar and nvarchar are not supported when writing through the endpoint, because Parquet does not have equivalent Unicode types. The recommended alternatives are char or varchar, though varchar with UTF-8 collation may consume more storage than nvarchar.

Reference: Data Types in Fabric Data Warehouse - Microsoft Fabric | Microsoft Learn

 

To handle this limitation, you can cast or truncate the column to a supported type before writing via the SQL endpoint, use a staging table, or continue using the direct Lakehouse connection with the dedicated service account. This ensures the pipeline can handle large text columns while remaining independent of individual users.
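
If you'd rather sidestep the SQL endpoint entirely, a rough sketch of the direct-write path from a Fabric notebook attached to the Lakehouse might look like the following; the JDBC options and every &lt;placeholder&gt; are assumptions (the service principal also needs access to the Azure SQL database), not a prescribed setup:

```python
# Rough sketch, assuming a Fabric notebook attached to the target Lakehouse
# (which provides the `spark` session) and the Microsoft SQL Server JDBC
# driver on the classpath. All <placeholders> are hypothetical.
jdbc_url = "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>;encrypt=true"

df = (spark.read.format("jdbc")
      .option("url", jdbc_url)
      .option("dbtable", "dbo.Articles")                         # hypothetical table
      .option("user", "<app-client-id>")                         # service principal
      .option("password", "<client-secret>")
      .option("authentication", "ActiveDirectoryServicePrincipal")
      .load())

# Spark's StringType has no length cap, so nvarchar(max) columns survive
# intact; the result lands as a Delta table in the Lakehouse without ever
# touching the SQL endpoint.
df.write.mode("overwrite").saveAsTable("articles")
```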

Similar discussions: Solved: Re: Copy Data to Fabric Warehouse – Handling nvarc... - Microsoft Fabric Community
8000 Character limit SQL Endpoint For Lakehouse Fa... - Microsoft Fabric Community

Thanks to @tayloramy and @anilgavhane for your suggestions.

 

 

Hope this helps. Please reach out for further assistance.

Thank you.


5 REPLIES
v-veshwara-msft
Community Support

Hi @Manu002 ,

Just checking in to see if your query is resolved and if any of the responses were helpful.
Otherwise, feel free to reach out for further assistance.

Thank you.

v-veshwara-msft
Community Support

Hi @Manu002 ,
Just wanted to check if the responses provided were helpful. If further assistance is needed, please reach out.
Thank you.

