python - PDF Parsers and PyPI versus GitHub Solutions


I am aware that parsing a PDF is an awful time sink, and that most of the back end of PDFs is in binary. However, since I'm going to try anyway, I have some questions about the parser information I'm finding online.

First, the PyPI version of pdfminer hasn't been updated since March of 2014. I know there are stable versions and development versions, but more than 3 years of quiet made me look for a complete replacement. After looking, I can't find a parser that is recommended more than pdfminer. (Please correct me if I'm wrong.)
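For anyone who wants to check release dates rather than take my word for it, here is a minimal sketch. It assumes the JSON metadata that PyPI serves at `https://pypi.org/pypi/<package>/json`, where `releases` maps each version to a list of uploaded files and each file carries an `upload_time` string. The helper names `latest_upload` and `fetch_pypi_metadata` are my own, not part of any library.

```python
import json
from datetime import datetime
from urllib.request import urlopen


def latest_upload(payload):
    """Return (version, upload datetime) for the newest uploaded file, or None."""
    newest = None
    for version, files in payload.get("releases", {}).items():
        for f in files:
            # Assumed format: "YYYY-MM-DDTHH:MM:SS" with no timezone suffix.
            when = datetime.strptime(f["upload_time"], "%Y-%m-%dT%H:%M:%S")
            if newest is None or when > newest[1]:
                newest = (version, when)
    return newest


def fetch_pypi_metadata(package):
    """Download the JSON metadata PyPI publishes for a package (needs network)."""
    with urlopen(f"https://pypi.org/pypi/{package}/json") as resp:
        return json.load(resp)


# Usage (not run here because it requires network access):
# print(latest_upload(fetch_pypi_metadata("pdfminer")))
```

Keeping the parsing separate from the download means the date logic can be checked against a hand-written dictionary without touching the network.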

Second, I have used PyPDF2 for a while. It works fine, but I'd prefer to have image manipulation capabilities and line and page numbers returned, and I've sunk time into word separation without strong results.

Third, I looked at several versions of pdfminer on GitHub that are still seeing contributions. The most promising version I'm finding is the one by euske; however, I haven't used many of the larger GitHub programs. Are there ways to tell if it is a mature/well-done project before making a substantial time investment?
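To make the maturity question concrete, here is a rough sketch of the kind of signals I could imagine comparing across forks. It assumes the repository object returned by the GitHub REST API at `https://api.github.com/repos/<owner>/<repo>`, which includes fields such as `pushed_at`, `stargazers_count`, `forks_count`, `open_issues_count`, and `archived`. The function name `summarize_repo` is hypothetical, and these numbers are only a starting point, not a verdict on quality.

```python
from datetime import datetime, timezone


def summarize_repo(repo, now=None):
    """Condense a GitHub repository JSON object into a few activity signals."""
    now = now or datetime.now(timezone.utc)
    # Assumed format for "pushed_at": "YYYY-MM-DDTHH:MM:SSZ" (UTC).
    pushed = datetime.strptime(repo["pushed_at"], "%Y-%m-%dT%H:%M:%SZ")
    pushed = pushed.replace(tzinfo=timezone.utc)
    return {
        "days_since_last_push": (now - pushed).days,
        "stars": repo["stargazers_count"],
        "forks": repo["forks_count"],
        # On GitHub this count includes open pull requests as well as issues.
        "open_issues": repo["open_issues_count"],
        "archived": repo.get("archived", False),
    }
```

The `now` parameter exists only so the result is reproducible when comparing saved snapshots; leaving it out uses the current time.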

