In a previous post, we explored how to use the Presidio library with PySpark on Microsoft Fabric to detect and anonymize PII. While powerful, that approach requires managing external dependencies and custom logic. In this follow-up, we’ll explore a more native and streamlined alternative: the Fabric AI functions library.
With just a few lines of code, Fabric’s built-in AI functions like ai.extract and ai.generate_response allow you to identify and redact PII directly within your data pipelines—no external libraries required.
Fabric AI functions, powered by sophisticated language models, offer several advantages:
Let’s dive into how the Fabric AI functions library enables these capabilities.
The Fabric AI functions library is designed to be both intuitive and powerful, exposing a rich set of APIs for text analysis and transformation. Two functions are particularly relevant for privacy workflows: ai.extract, which pulls entities matching a set of labels out of a text column, and ai.generate_response, which returns model-generated text for a custom prompt.
For complete documentation, check out the AI functions overview page. Now let's look at each function in detail with practical examples.
Suppose you receive a dataset containing free-form customer feedback, and you want to flag all records containing PII. With Presidio, you would configure recognizers and patterns for each entity type; with Fabric AI functions, the process is more dynamic.
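# Required import for the df.ai functions on Spark DataFrames
from synapse.ml.spark.aifunc.DataFrameExtensions import AIFunctions
from synapse.ml.services.openai import OpenAIDefaults

# Set the default model deployment used by the AI functions
defaults = OpenAIDefaults()
defaults.set_deployment_name("gpt-4o-mini")

# Sample free-form text containing names, emails, an address, and phone numbers
data = [
    ("Contact John Doe at john.doe@example.com or 555-123-4567.",),
    ("Jane Smith, 123 Main St, NY, can be reached at jane.smith@email.com.",),
    ("Call 800-555-6789 for support.",),
]
df = spark.createDataFrame(data, ["text"])

# Extract the requested entity types from each row of the "text" column
pii_extracted = df.ai.extract(
    input_col="text",
    labels=["PERSON", "EMAIL_ADDRESS", "PHONE_NUMBER"],
)
display(pii_extracted)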
The output looks like this:
[Figure: a Fabric notebook Python cell showing the results of executing the ai.extract function]
With the PII entities identified, you could take it a step further and use built-in PySpark regular expression functions to redact or mask those values in the original text. Next, let’s look at an alternative approach that accomplishes this in a single step.
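To make the regex follow-up mentioned above more concrete, here is a minimal sketch of the redaction logic in plain Python. It assumes the extracted entity values are available as a list of strings; the exact shape of the ai.extract output is not shown here, and the helper name `redact_entities` is hypothetical. In a Fabric notebook, the same idea could be wrapped in a PySpark UDF or expressed with `regexp_replace`.

```python
import re

def redact_entities(text, entities, placeholder="[REDACTED]"):
    """Replace each extracted entity value found in text with a placeholder.

    Hypothetical helper: `entities` is assumed to be a list of strings
    previously extracted from `text`.
    """
    # Drop empty values and sort longest first, so that a longer value
    # is preferred over a shorter value it contains.
    values = sorted({e for e in entities if e}, key=len, reverse=True)
    if not values:
        return text
    # Escape each value so characters like "." are matched literally.
    pattern = re.compile("|".join(re.escape(v) for v in values))
    return pattern.sub(placeholder, text)

sample = "Contact John Doe at john.doe@example.com or 555-123-4567."
found = ["John Doe", "john.doe@example.com", "555-123-4567"]
redacted = redact_entities(sample, found)
print(redacted)
```

Matching on the literal extracted values keeps the step deterministic, but it only redacts what the extraction step actually returned, so missed entities remain in the text.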
Let’s say you want not only to detect, but also to redact sensitive information in text. With ai.generate_response, you can prompt the AI to both identify and redact PII in one step.
Here’s an example using the same sample data we defined above:
redaction_prompt = """Redact all PII from this text.
Input: {text}
Expected output: The original text but with PII replaced with [REDACTED] and no extra words added.
"""
redacted_df = df.ai.generate_response(
    prompt=redaction_prompt,
    is_prompt_template=True,
    output_col="redacted_text",
)
display(redacted_df)
The result would look like this:
[Figure: a Fabric notebook Python cell showing the results of executing the ai.generate_response function]
This approach is very flexible. You can customize the LLM instructions for anonymization, pseudonymization, or data masking, such as replacing names with generic labels (“NAME_1”) or random values.
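As a rough illustration of that flexibility, the sketch below builds prompt templates for different privacy modes. The helper `build_prompt`, the mode names, and the instruction wording are all assumptions made for this example, not part of the AI functions library; only the {text} placeholder and the overall prompt layout follow the redaction prompt shown earlier.

```python
# Hypothetical instructions per mode; the wording is illustrative only.
INSTRUCTIONS = {
    "redact": "Replace every piece of PII in this text with [REDACTED].",
    "pseudonymize": (
        "Replace each person's name in this text with a generic label such as "
        "NAME_1, NAME_2, reusing the same label for the same person."
    ),
    "mask": "Mask every piece of PII in this text by replacing its characters with asterisks.",
}

def build_prompt(mode):
    """Return a prompt template that contains the {text} column placeholder."""
    if mode not in INSTRUCTIONS:
        raise ValueError(f"unknown mode: {mode}")
    # Plain concatenation (no f-string or str.format) keeps the literal
    # {text} placeholder intact for the prompt template.
    return (
        INSTRUCTIONS[mode] + "\n"
        "Input: {text}\n"
        "Expected output: The original text with only the PII changed and no extra words added.\n"
    )

pseudonymize_prompt = build_prompt("pseudonymize")
print(pseudonymize_prompt)
```

A template built this way could then be passed to the same call used above, for example df.ai.generate_response(prompt=pseudonymize_prompt, is_prompt_template=True, output_col="pseudonymized_text"), where the output column name is again just an example.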
Microsoft Fabric’s AI functions library offers a powerful, low-code alternative to traditional PII detection and anonymization approaches like Presidio. By leveraging ai.extract and ai.generate_response, data teams can build privacy-preserving workflows with minimal setup, all within the Fabric ecosystem. This approach not only simplifies development but also aligns with privacy-by-design principles, supporting compliance while preserving data utility for analytics and AI applications.
For more details on AI functions, check out the Microsoft Fabric documentation.