kirah2128
Helper II

OCR using Tesseract in Fabric

Dear All,

 

Were you able to install tesseract in Fabric?

We want to analyze data from PDFs, but not every PDF is "smart" (text-searchable) and some are low quality. One solution we found is to convert each PDF to images and use Tesseract to perform OCR on them.

 

Is there existing documentation that can help me install and use it in Fabric?

 

My error is this: TesseractNotFoundError: tesseract is not installed or it's not in your PATH. See README file for more information.

 

 

1 ACCEPTED SOLUTION
kirah2128
Helper II

I have found another solution that will help you: once the PDF has been converted to images, EasyOCR can handle the image-to-text (OCR) step.

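%pip install easyocr

import easyocr

# Load the English recognition model (downloaded on first use)
reader = easyocr.Reader(['en'])

# Each detection is a (bounding_box, text, confidence) tuple
result = reader.readtext('/lakehouse/default/Files/img_ocr/page_1.png')

# Keep only the text of each detection and join it into one string
all_text = ' '.join(detection[1] for detection in result)
print("Joined Text:")
print(all_text)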

This solution works in Fabric.
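The post mentions converting the PDF to images but only shows the OCR step, so here is a rough sketch of the whole flow. It assumes PyMuPDF (imported as `fitz`) can be installed in the notebook with `%pip install pymupdf`, which the thread does not confirm. The function names, the `dpi` value, and the `min_confidence` threshold are all hypothetical and only there for illustration.

```python
from pathlib import Path


def render_pdf_pages(pdf_path, out_dir, dpi=200):
    """Render every page of a PDF to a PNG file and return the image paths."""
    # Imported here so the text helper below works without PyMuPDF installed.
    import fitz  # PyMuPDF (assumption: installable in the Fabric session)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            # page.number is zero-based, so add 1 to get page_1.png, page_2.png, ...
            target = out / f"page_{page.number + 1}.png"
            page.get_pixmap(dpi=dpi).save(str(target))
            paths.append(str(target))
    return paths


def join_text(detections, min_confidence=0.0):
    """Join EasyOCR detections (bounding_box, text, confidence) into one string."""
    return " ".join(
        text for _bbox, text, confidence in detections
        if confidence >= min_confidence
    )


def ocr_pdf(pdf_path, out_dir, min_confidence=0.0):
    """Run EasyOCR on each rendered page and return one string per page."""
    import easyocr

    reader = easyocr.Reader(["en"])
    return [
        join_text(reader.readtext(image_path), min_confidence)
        for image_path in render_pdf_pages(pdf_path, out_dir)
    ]
```

It could be called as `ocr_pdf('/lakehouse/default/Files/my_scan.pdf', '/lakehouse/default/Files/img_ocr')`, where `my_scan.pdf` is a made-up file name. Raising `min_confidence` drops low-confidence detections, which may help with poor scans at the cost of losing some text.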


7 REPLIES 7
Hi @kirah2128 

 

Glad that your query was resolved. Please continue using the Fabric Community for help with any of your queries.

v-cboorla-msft
Microsoft Employee

Hi @kirah2128 

 

Thanks for asking and for using the Fabric Community.

At this time, we are reaching out to the internal team to get some help on this. We will update you once we hear back from them.

 

Appreciate your patience.
Thanks

I appreciate your support in escalating this to the internal team. It would really help us.

Hi @kirah2128 

 

Apologies for the delay in response.
Following up to see whether you have a resolution for your issue yet.
If you do, please share it with the community, as it can be helpful to others.
If you have not found a resolution, please go ahead and raise a support ticket to reach our support team: https://support.fabric.microsoft.com/support

Please provide the ticket number here so that we can keep an eye on it.

 

Thank you.

Hi, I still haven't received a response from the community.

Hoping the internal team can assist us with this.

 

Hi @kirah2128 

 

Thanks for your response.

Apologies for the issue you are facing.

Python's pytesseract library needs the Linux-compiled tesseract-ocr binary and other Linux dependencies. These are not currently available in Fabric.
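Since pytesseract is only a wrapper around the tesseract-ocr binary, a notebook can check for that binary before calling it instead of hitting TesseractNotFoundError. The following is a minimal sketch using only the standard library; the function names and the fallback choice are hypothetical, not an official Fabric recommendation.

```python
import shutil


def tesseract_available(binary="tesseract"):
    """Return True if the given executable can be found on PATH."""
    return shutil.which(binary) is not None


def choose_ocr_backend():
    """Pick pytesseract only when its binary exists, otherwise fall back to EasyOCR."""
    return "pytesseract" if tesseract_available() else "easyocr"


if __name__ == "__main__":
    print("tesseract on PATH:", tesseract_available())
    print("OCR backend to use:", choose_ocr_backend())
```

Based on the reply above, the check would be expected to come back negative in a Fabric session, so the EasyOCR branch is the one that would apply there.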

 

Appreciate your patience
