
Reading tabular data from pdf

Hey! I've never worked with the OCR and AI Framework, so I have a question: is it possible to read the data shown in the attached picture and save it to an Item List? If so, which components would I need? The PDF already has a text layer, so there is no need to run OCR. Any help would be much appreciated. Thank you, Martin

Hi Martin,

I haven't used the OCR framework. Yes, there are things I don't know. :)

If you really need this, one option is to use Azure Cognitive Services (Form Recognizer) to extract the table data via REST:

Enhanced Table Extraction from documents with Form Recognizer
https://techcommunity.microsoft.com/t5/ai-cognitive-services-blog/enhanced-table-extraction-from-documents-with-form-recognizer/ba-p/2058011

Form Recognizer layout model
https://docs.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-layout

Supported languages
https://docs.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/language-support

Pricing (with a free tier subscription, only the first two pages are processed)
https://azure.microsoft.com/en-us/pricing/details/form-recognizer/

REST API
https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1/operations/AnalyzeLayoutAsync
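To give a rough idea of what the REST call looks like, here is a minimal Python sketch, assuming the v2.1 Layout API behaves as described in the docs linked above: you POST the PDF, read the `Operation-Location` response header, and poll that URL until the status is `succeeded`. The function names `analyze_layout` and `tables_to_rows` are my own, and the endpoint and key are placeholders for your own Azure resource. I haven't run this against a WEBCON process, so treat it as a starting point only.

```python
import json
import time
import urllib.request


def analyze_layout(endpoint, key, pdf_bytes, poll_seconds=2.0, max_polls=30):
    """Submit a PDF to the v2.1 Layout API and poll until the result is ready.

    `endpoint` and `key` are placeholders for your own Form Recognizer resource.
    """
    url = endpoint.rstrip("/") + "/formrecognizer/v2.1/layout/analyze"
    submit = urllib.request.Request(
        url,
        data=pdf_bytes,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/pdf",
        },
    )
    # The service accepts the document and returns the URL of the result
    # in the Operation-Location header.
    with urllib.request.urlopen(submit) as response:
        result_url = response.headers["Operation-Location"]

    poll = urllib.request.Request(
        result_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    for _ in range(max_polls):
        with urllib.request.urlopen(poll) as response:
            body = json.load(response)
        if body.get("status") == "succeeded":
            return body
        if body.get("status") == "failed":
            raise RuntimeError("Layout analysis failed")
        time.sleep(poll_seconds)
    raise TimeoutError("Layout analysis did not finish in time")


def tables_to_rows(result):
    """Turn the tables in a Layout result into lists of rows of cell text."""
    tables = []
    for page in result.get("analyzeResult", {}).get("pageResults", []):
        for table in page.get("tables", []):
            # Start with an empty grid, then place each cell by its indexes.
            grid = [[""] * table["columns"] for _ in range(table["rows"])]
            for cell in table["cells"]:
                grid[cell["rowIndex"]][cell["columnIndex"]] = cell["text"]
            tables.append(grid)
    return tables
```

Each inner list returned by `tables_to_rows` is one table row, which is roughly the shape you would then map to the rows of an Item List. Merged cells (row or column spans) are not handled here; the text simply lands in the top-left cell of the span.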

Best regards,
Daniel
