Dear All,
Were you able to install tesseract in Fabric?
We want to analyze data from PDFs, but not every PDF has a usable text layer and some are low-quality scans. One approach we found is to convert each PDF page to an image and run OCR on it with Tesseract.
Is there an existing document that can help me install and use it in Fabric?
My error is this: TesseractNotFoundError: tesseract is not installed or it's not in your PATH. See README file for more information.
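I have found another solution that will help you convert from PDF to image and from image to text with OCR. It uses EasyOCR, so the pytesseract package and the tesseract binary are not needed for this step:
%pip install easyocr
import easyocr
# Load the English recognition model (model files are downloaded on first use)
reader = easyocr.Reader(['en'])
# readtext returns a list of (bounding_box, text, confidence) tuples
result = reader.readtext('/lakehouse/default/Files/img_ocr/page_1.png')
# Keep only the recognized text of each detection and join it into one string
all_text = ' '.join(detection[1] for detection in result)
print("Joined Text:")
print(all_text)
This is the solution that works on Fabric.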
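The post mentions converting the PDF to images but only shows the OCR step. Below is a minimal sketch of the conversion, assuming the PyMuPDF package (imported as `fitz`) can be installed in the notebook with `%pip install pymupdf`. The thread does not say which library was actually used, and the function names and the `dpi` default are hypothetical.

```python
from pathlib import Path


def page_image_path(out_dir: str, page_number: int) -> Path:
    """Build the output path for a 1-based page number, e.g. page_1.png."""
    return Path(out_dir) / f"page_{page_number}.png"


def pdf_to_images(pdf_path: str, out_dir: str, dpi: int = 200) -> list:
    """Render every page of a PDF to a PNG file and return the written paths."""
    # Assumption: PyMuPDF is installed. It is imported lazily so that
    # page_image_path can be used even when the package is missing.
    import fitz

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    written = []
    with fitz.open(pdf_path) as doc:
        for index, page in enumerate(doc, start=1):
            # Rasterize the page; a higher dpi usually helps OCR on small text
            pixmap = page.get_pixmap(dpi=dpi)
            target = page_image_path(out_dir, index)
            pixmap.save(str(target))
            written.append(target)
    return written
```

With a PDF stored in the lakehouse, a call such as `pdf_to_images('/lakehouse/default/Files/input.pdf', '/lakehouse/default/Files/img_ocr')` (the input file name is made up) would produce `page_1.png`, `page_2.png`, and so on, which can then be passed to `reader.readtext` one by one.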
Hi @kirah2128
Glad that your query got resolved. Please continue using Fabric Community for any help regarding your queries.
Hi @kirah2128
Thanks for the question and for using the Fabric Community.
At this time, we are reaching out to the internal team to get some help on this. We will update you once we hear back from them.
Appreciate your patience.
Thanks
I appreciate your support in escalating this to the internal team. It would really help us.
Hi @kirah2128
Apologies for the delay in response.
I am following up to see whether you have a resolution for your issue yet.
If you do have a resolution, please share it with the community, as it can be helpful to others.
If you have not found a resolution, please go ahead and raise a support ticket to reach our support team: https://support.fabric.microsoft.com/support
Please provide the ticket number here so that we can keep an eye on it.
Thank you.
Hi, I still haven't received a response from the community.
I'm hoping the internal team can assist us with this.
Hi @kirah2128
Thanks for your response.
Apologies for the issue you are facing.
Python's pytesseract library is only a wrapper: it needs the Linux-compiled tesseract-ocr binary and other Linux dependencies. These are not currently available in Fabric.
I appreciate your patience.
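To see why pytesseract raises TesseractNotFoundError, it can help to check whether the binary is visible to the notebook at all. The sketch below uses only the standard library; the helper names are hypothetical, and it assumes pytesseract is looking for its default command name, `tesseract`.

```python
import shutil


def find_tesseract(binary_name: str = "tesseract"):
    """Return the full path of the binary if it is on PATH, else None."""
    return shutil.which(binary_name)


def describe_tesseract(binary_name: str = "tesseract") -> str:
    """Summarize in one line whether the binary can be found on PATH."""
    path = find_tesseract(binary_name)
    if path is None:
        return (f"{binary_name} not found on PATH; "
                "pytesseract would raise TesseractNotFoundError")
    return f"{binary_name} found at {path}"


print(describe_tesseract())
```

If this reports that the binary is not found, installing the pytesseract package with pip will not help, because pip only installs the Python wrapper and not the tesseract-ocr executable.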