Tonic Textual brings advanced entity detection, redaction, and synthesis for unstructured text data directly into the Fabric ecosystem, so organizations can unlock datasets that were previously off limits due to privacy concerns and use them for responsible, compliant AI/ML development.
With Tonic Textual integrated into the Fabric user experience, teams can prepare office documents, PDFs and images containing sensitive text for AI and machine learning tasks—all while protecting privacy and maintaining compliance with regulations such as HIPAA and GDPR.
Manually de-identifying large volumes of sensitive data is an overwhelming task, and often an unachievable one. Each document must be reviewed line by line to locate private information, which makes in-house solutions both unsustainable and error-prone, especially as data volumes grow and compliance requirements evolve.
Tonic Textual for Microsoft Fabric eliminates that burden. By combining Fabric’s lake-centric architecture and governance with Tonic’s AI-driven de-identification engine, users can easily and automatically identify and protect sensitive entities—such as names, dates, and medical or financial identifiers—without moving data out of Fabric.
The result: privacy-preserving datasets that are immediately ready for downstream workflows, including model training, generative AI workloads, and AI agents.
For example, a healthcare organization using Tonic Textual in Microsoft Fabric can safely process and transform its unstructured electronic health record (EHR) data directly in OneLake. Textual automatically detects and anonymizes sensitive entities while maintaining the integrity of clinical language, preserving data utility for downstream analytics, ML training, and AI.
This enables data scientists, clinicians, and business users to collaborate confidently, knowing that sensitive data is protected and never leaves their secure Fabric environment.
Step 1: Add the Tonic Textual workload
From your Fabric console, navigate to Workloads and select the Tonic Textual workload to add it to your workspace. Once added, you will have access to the Textual UI directly from your Microsoft Fabric console.
Refer to the documentation to learn more about adding workloads.
Step 2: Create a Tonic Textual item and configure your input and output location
To use Textual, first open your workspace and select 'New Item', then 'Tonic Textual' from the list of available items.
Next, choose the OneLake Lakehouse containing the files you want to process. Then choose a target folder where the sanitized files will be saved.
Step 3: Scan your files for sensitive text
Select the specific files or entire folders of files containing sensitive data you want to sanitize. In this example, we have scanned two files from the folder ‘Patient Data’. On the right-hand side, you can see the status of the job indicating multiple detections of first and last names.
Step 4: Configure your de-identification preference
After reviewing the initial analysis of sensitive text identified in your documents, decide what action to take. You can combine redaction with synthesis (replacing a value with a true-to-life substitute, e.g., 'John' becomes 'William'), or leave certain entities untouched. In this example, we use 'Bulk Edit' to automatically redact all of the sensitive entities.
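To make the three choices concrete, here is a minimal Python sketch of a policy that assigns one action to each entity type. It is not Tonic Textual's engine or API: the regex detectors, the `POLICY` mapping, the `[TYPE]` placeholder format, and the substitute names are all assumptions made for illustration. A real detector would be model-based rather than a handful of patterns.

```python
import re

# Hypothetical toy detectors; a production engine would use trained NER models.
DETECTORS = {
    "NAME_GIVEN": re.compile(r"\b(John|Mary)\b"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

# Hypothetical policy: one action per entity type.
POLICY = {"NAME_GIVEN": "synthesize", "DATE": "leave", "PHONE": "redact"}

# Fixed substitutes keep this sketch deterministic; real synthesis would
# generate realistic values instead of using a lookup table.
SUBSTITUTES = {"John": "William", "Mary": "Alice"}


def deidentify(text, policy=POLICY):
    """Apply the configured action to every detected entity in text."""
    for entity_type, pattern in DETECTORS.items():
        # Entity types missing from the policy fall back to redaction.
        action = policy.get(entity_type, "redact")
        placeholder = f"[{entity_type}]"
        if action == "leave":
            continue
        if action == "redact":
            text = pattern.sub(placeholder, text)
        elif action == "synthesize":
            # Unknown values are redacted rather than passed through.
            text = pattern.sub(
                lambda m: SUBSTITUTES.get(m.group(0), placeholder), text
            )
        else:
            raise ValueError(f"unknown action: {action}")
    return text


print(deidentify("John called 555-123-4567 on 2024-01-15."))
# prints: William called [PHONE] on 2024-01-15.
```

Defaulting unlisted entity types to redaction is a deliberate choice in this sketch: a gap in the policy then hides a value instead of leaking it.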
Step 5: Access your sanitized files
Once the de-identification job is complete, your sanitized files are available in the destination folder that you chose in Step 2. These files are essentially replicas of the originals, but with entities redacted or replaced according to your de-identification strategy. Your original files remain unaltered in their source Lakehouse, and the sanitized versions are ready for downstream use. With this data unlocked, you can use Azure AI Foundry to build AI agents in Microsoft Copilot Studio, enable search using Azure AI Search, or train your own ML models using Azure Machine Learning.
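A quick sanity check after the job is to confirm that every source file has a counterpart in the destination folder. The Python sketch below assumes that sanitized files keep the same relative path and file name as the originals, which this post does not confirm. The helper name `missing_sanitized` is hypothetical, and so are the folder paths in the usage note.

```python
from pathlib import Path


def missing_sanitized(source_dir, dest_dir):
    """Return relative paths of source files with no same-named file under dest_dir."""
    source, dest = Path(source_dir), Path(dest_dir)
    missing = []
    for path in sorted(source.rglob("*")):
        if path.is_file():
            rel = path.relative_to(source)
            # Assumption: the sanitized copy mirrors the original's relative path.
            if not (dest / rel).is_file():
                missing.append(rel.as_posix())
    return missing
```

In a Fabric notebook with the Lakehouse attached as the default, a call might look like `missing_sanitized("/lakehouse/default/Files/Patient Data", "/lakehouse/default/Files/Sanitized")`, where "Sanitized" is a placeholder for whatever target folder you chose. An empty list means every source file has a counterpart.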
Visit Unlock unstructured data with Tonic Textual on Microsoft Fabric to learn more about this integration.