
Reading tabular data from pdf

MVP

Hey! I've never worked with the OCR and AI Framework, so here is my question: is it possible to read the data shown in the attached picture and save it to an Item List? If yes, which components would I need? The PDF already has a text layer, so there is no need to run OCR. Any help would be much appreciated.

Thank you,
Martin

MVP

Hi Martin,

I haven't used the OCR myself; yes, there are things I don't know. :)

If you really need this, one option could be to use Azure Cognitive Services (Form Recognizer) to extract the table data via REST:

Enhanced Table Extraction from documents with Form Recognizer
https://techcommunity.microsoft.com/t5/ai-cognitive-services-blog/enhanced-table-extraction-from-documents-with-form-recognizer/ba-p/2058011

Form Recognizer layout model
https://docs.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-layout

Supported languages
https://docs.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/language-support

Pricing (with a free tier subscription, only the first two pages are processed)
https://azure.microsoft.com/en-us/pricing/details/form-recognizer/

REST API
https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1/operations/AnalyzeLayoutAsync
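To give an idea of what the REST call could look like, here is a minimal Python sketch against the v2.1 Layout endpoint linked above. It is untested against a real subscription; the function names `analyze_layout` and `tables_to_rows`, the polling interval, and the `endpoint` and `key` values are my own assumptions, and mapping the resulting rows into an Item List is left out because that depends on your setup.

```python
import json
import time
import urllib.request


def analyze_layout(pdf_bytes, endpoint, key, poll_seconds=2.0, max_polls=30):
    """Submit a PDF to the v2.1 Layout API and poll until it finishes.

    `endpoint` and `key` come from your own Form Recognizer resource
    (placeholders here, not real values).
    """
    url = endpoint.rstrip("/") + "/formrecognizer/v2.1/layout/analyze"
    submit = urllib.request.Request(
        url,
        data=pdf_bytes,
        method="POST",
        headers={
            "Content-Type": "application/pdf",
            "Ocp-Apim-Subscription-Key": key,
        },
    )
    # The service answers 202 Accepted and returns the result URL in a header.
    with urllib.request.urlopen(submit) as response:
        result_url = response.headers["Operation-Location"]

    poll = urllib.request.Request(
        result_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    for _ in range(max_polls):
        with urllib.request.urlopen(poll) as response:
            body = json.load(response)
        if body.get("status") in ("succeeded", "failed"):
            return body
        time.sleep(poll_seconds)
    raise TimeoutError("Layout analysis did not finish in time")


def tables_to_rows(result):
    """Turn the analyze result into one list of rows per detected table."""
    tables = []
    for page in result.get("analyzeResult", {}).get("pageResults", []):
        for table in page.get("tables", []):
            # Start with an empty grid, then fill in the cells that exist.
            grid = [[""] * table["columns"] for _ in range(table["rows"])]
            for cell in table["cells"]:
                grid[cell["rowIndex"]][cell["columnIndex"]] = cell["text"]
            tables.append(grid)
    return tables
```

Usage would be roughly `tables_to_rows(analyze_layout(open("file.pdf", "rb").read(), endpoint, key))`, which gives you plain lists of cell texts that you could then loop over to build the list entries.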

Best regards,
Daniel