Java - Extract textual data from a PDF


I'm using a Java program to extract textual data from a PDF.

When I use this type of PDF, I have no problem:

[image]

But when I use this type, the extraction is not performed:

[image]

Does anyone have an idea how to resolve this problem?

Try using iText 7 with the following code:

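    import java.io.File;
    import com.itextpdf.kernel.pdf.PdfDocument;
    import com.itextpdf.kernel.pdf.PdfReader;
    import com.itextpdf.kernel.pdf.canvas.parser.PdfTextExtractor;

    File inputFile = new File("path_to_your_pdf");
    PdfDocument pdfDocument = new PdfDocument(new PdfReader(inputFile));
    // Pages are numbered from 1; this extracts the text of the first page only.
    String text = PdfTextExtractor.getTextFromPage(pdfDocument.getPage(1));
    pdfDocument.close();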

Then let me know what the output is, and whether it corresponds to what you'd expect.

As @mkl points out, this may be the difference between extracting form fields or not. In that case, links to the PDFs would be appreciated, as would your code.

But you can of course extract both using iText.
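To illustrate what "extracting both" could look like on the consuming side, here is a minimal sketch that merges page text and form-field values into one plain-text result. It deliberately uses only the Java standard library: the page text is assumed to come from the `PdfTextExtractor` call above, and the field map is assumed to have been filled from the PDF's form fields (for instance through the `PdfAcroForm` class in iText 7's forms module, which is not shown here). The class name `PdfTextReport`, the method `combine`, and the sample field are hypothetical names chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PdfTextReport {

    // Merges already-extracted page text and form-field values into one string.
    // Blank page text is skipped, so a form-only PDF still yields its field values.
    static String combine(String pageText, Map<String, String> fieldValues) {
        StringBuilder sb = new StringBuilder();
        if (pageText != null && !pageText.trim().isEmpty()) {
            sb.append(pageText.trim()).append('\n');
        }
        for (Map.Entry<String, String> entry : fieldValues.entrySet()) {
            sb.append(entry.getKey()).append(": ").append(entry.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for real extraction results.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "Jane Doe");
        System.out.print(combine("Some page text", fields));
    }
}
```

If the page text comes back empty while the field map does not, that would support the suspicion that the second PDF stores its visible text in form fields rather than in the page content.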

Reading material:

