Reading tabular data from PDF

04.06.2022

11:14

Hey!

I've never worked with the OCR and AI Framework, so I'm asking here: is it possible to read the data shown in the attached picture and save it to an Item List? If so, which components would I need? The PDF already has a text layer, so there is no need to run OCR.

Any help would be much appreciated.

Thank you,
Martin

08.06.2022

23:03

Hi Martin,

I haven't used the OCR myself. Yes, there are things I don't know. :)

If you really need this, one option would be to use Azure Cognitive Services to extract the table data via REST:

Enhanced Table Extraction from documents with Form Recognizer
https://techcommunity.microsoft.com/t5/ai-cognitive-services-blog/enhanced-table-extraction-from-documents-with-form-recognizer/ba-p/2058011

Form Recognizer layout model
https://docs.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-layout

Supported languages
https://docs.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/language-support

Pricing (with a free-tier subscription, only the first two pages are processed)
https://azure.microsoft.com/en-us/pricing/details/form-recognizer/

REST API
https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1/operations/AnalyzeLayoutAsync
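
To give an idea of what the REST call involves, here is a minimal Python sketch, not a tested integration. It assumes the v2.1 Layout endpoint linked above: you POST the PDF, read the `Operation-Location` response header, and poll that URL until the status is `succeeded`. The function names (`analyze_layout`, `extract_tables`, `table_to_rows`) and the `endpoint` and `key` parameters are my own placeholders; how the resulting rows end up in an Item List depends on your framework and is not covered here.

```python
import json
import time
import urllib.request


def table_to_rows(table):
    """Turn one v2.1 table object into a list of rows (lists of cell text)."""
    grid = [["" for _ in range(table["columns"])] for _ in range(table["rows"])]
    for cell in table["cells"]:
        # Merged cells (rowSpan/columnSpan) are only written to their top-left slot.
        grid[cell["rowIndex"]][cell["columnIndex"]] = cell["text"]
    return grid


def extract_tables(analyze_result):
    """Collect every table on every page of an analyzeResult object."""
    tables = []
    for page in analyze_result.get("pageResults", []):
        for table in page.get("tables", []):
            tables.append(table_to_rows(table))
    return tables


def analyze_layout(endpoint, key, pdf_bytes, poll_seconds=2.0, max_polls=30):
    """Submit a PDF to the Layout API and wait for the result.

    endpoint and key come from your own Azure resource (placeholders here).
    """
    url = endpoint.rstrip("/") + "/formrecognizer/v2.1/layout/analyze"
    submit = urllib.request.Request(
        url,
        data=pdf_bytes,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/pdf",
        },
    )
    # The service answers 202 Accepted and returns the URL to poll in a header.
    with urllib.request.urlopen(submit) as response:
        operation_url = response.headers["Operation-Location"]

    for _ in range(max_polls):
        poll = urllib.request.Request(
            operation_url, headers={"Ocp-Apim-Subscription-Key": key}
        )
        with urllib.request.urlopen(poll) as response:
            body = json.load(response)
        if body["status"] == "succeeded":
            return body["analyzeResult"]
        if body["status"] == "failed":
            raise RuntimeError("Layout analysis failed: " + json.dumps(body))
        time.sleep(poll_seconds)
    raise TimeoutError("Layout analysis did not finish in time")
```

Usage would look roughly like `rows = extract_tables(analyze_layout(endpoint, key, open("invoice.pdf", "rb").read()))`, where `invoice.pdf` is just an example file name. Each table comes back as a list of rows, which should be straightforward to loop over when filling a list.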

Best regards,
Daniel

26.07.2022

10:49

In reply to: Daniel Krüger (Cosmo Consult)

Hey Daniel,

Thank you for the links. The Form Recognizer does a perfect job extracting the data from a table. Also the pricing looks reasonable.

Again, thank you so much!

Cheers,
Martin
