Java - Extract textual data from a PDF


I'm using a Java program to extract textual data from a PDF.

When I use this type of PDF, I have no problem:

[image]

But when I use this type, the extraction is not performed:

[image]

Does anyone have an idea how to resolve this problem?

Try using iText 7 with the following code:

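    import java.io.File;
    import com.itextpdf.kernel.pdf.PdfDocument;
    import com.itextpdf.kernel.pdf.PdfReader;
    import com.itextpdf.kernel.pdf.canvas.parser.PdfTextExtractor;

    File inputFile = new File("path_to_your_pdf");
    PdfDocument pdfDocument = new PdfDocument(new PdfReader(inputFile));
    String text = PdfTextExtractor.getTextFromPage(pdfDocument.getPage(1)); // pages are 1-based
    pdfDocument.close();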

Then let me know what the output is, and whether it corresponds to what you'd expect.

As @mkl points out, this may be the difference between extracting form fields or not. In any case, links to the PDFs would be appreciated, as would your code.

But you can of course extract both using iText.
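To illustrate the distinction, here is a minimal sketch that reads both the page text and the form field values from the same document. It assumes iText 7.x with the `kernel` and `forms` modules on the classpath; the class name `PdfTextAndFields` and its two helper methods are made up for this example, and the form field accessors may be named differently in later major versions. Form field values are normally stored in the document's AcroForm rather than in the page content, which would explain why plain text extraction can come back empty for a fillable PDF.

```java
import com.itextpdf.forms.PdfAcroForm;
import com.itextpdf.forms.fields.PdfFormField;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.canvas.parser.PdfTextExtractor;

import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of iText.
public class PdfTextAndFields {

    /** Concatenates the page-content text of every page (pages are 1-based). */
    static String extractPageText(PdfDocument pdf) {
        StringBuilder text = new StringBuilder();
        for (int page = 1; page <= pdf.getNumberOfPages(); page++) {
            text.append(PdfTextExtractor.getTextFromPage(pdf.getPage(page))).append('\n');
        }
        return text.toString();
    }

    /** Collects form field names and values; returns an empty map if there is no AcroForm. */
    static Map<String, String> extractFormFields(PdfDocument pdf) {
        Map<String, String> values = new LinkedHashMap<>();
        // false: do not create an AcroForm if the document has none
        PdfAcroForm form = PdfAcroForm.getAcroForm(pdf, false);
        if (form != null) {
            for (Map.Entry<String, PdfFormField> entry : form.getFormFields().entrySet()) {
                values.put(entry.getKey(), entry.getValue().getValueAsString());
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // "path_to_your_pdf" is a placeholder, as in the snippet above
        String path = args.length > 0 ? args[0] : "path_to_your_pdf";
        try (PdfDocument pdf = new PdfDocument(new PdfReader(path))) {
            System.out.println("Page text:\n" + extractPageText(pdf));
            System.out.println("Form fields: " + extractFormFields(pdf));
        }
    }
}
```

If the page text is empty but the field map is not, the visible text in the PDF most likely lives in form fields.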


