Reading an Excel file stored in GCS from Google Cloud Dataflow


Currently, Google Cloud Dataflow does not support reading Excel files from Google Cloud Storage.

As a workaround, I tried the following.

I am trying to use the App Engine tools to read the Excel file, and I used Apache POI to try to convert the Excel file to a CSV file.

Below is the code sample I used:

    GcsService gcsService = GcsServiceFactory.createGcsService();
    GcsFilename fileName = new GcsFilename("testbucket", "test1.xlsx");
    GcsInputChannel readChannel = gcsService.openPrefetchingReadChannel(fileName, 0, BUFFER_SIZE);
    InputStream inputStream = Channels.newInputStream(readChannel);

Then I used Apache POI to read the InputStream:

    XSSFWorkbook workbook = new XSSFWorkbook(inputStream);
    // Local-file variant: XSSFWorkbook workbook = new XSSFWorkbook(new FileInputStream(inputFile));
    XSSFSheet sheet = workbook.getSheetAt(0);

But I received the error below:

    Aug 17, 2017 6:58:35 PM com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl waitForFetch
    WARNING: PrefetchingGcsInputChannelImpl [filename=GcsFilename(testbucket, test1.xlsx), blockSizeBytes=2048, closed=false, eofHit=false, length=-1, fetchPosition=0, pendingFetch=com.google.common.util.concurrent.Futures$ImmediateFailedFuture@7770f470, retryParams=RetryParams [requestTimeoutMillis=30000, requestTimeoutRetryFactor=1.2, maxRequestTimeout=60000, retryMinAttempts=3, retryMaxAttempts=6, initialRetryDelayMillis=1000, maxRetryDelayMillis=32000, retryDelayBackoffFactor=2.0, totalRetryPeriodMillis=50000]]: IOException fetching block
    java.util.concurrent.ExecutionException: java.io.IOException: java.lang.NullPointerException
        at com.google.common.util.concurrent.Futures$ImmediateFailedFuture.get(Futures.java:234)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl.waitForFetch(PrefetchingGcsInputChannelImpl.java:152)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl.access$000(PrefetchingGcsInputChannelImpl.java:43)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl$1.call(PrefetchingGcsInputChannelImpl.java:136)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl$1.call(PrefetchingGcsInputChannelImpl.java:134)
        at com.google.appengine.tools.cloudstorage.RetryHelper.doRetry(RetryHelper.java:108)
        at com.google.appengine.tools.cloudstorage.RetryHelper.runWithRetries(RetryHelper.java:166)
        at com.google.appengine.tools.cloudstorage.RetryHelper.runWithRetries(RetryHelper.java:156)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl.waitForFetchWithRetry(PrefetchingGcsInputChannelImpl.java:134)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl.read(PrefetchingGcsInputChannelImpl.java:212)
        at sun.nio.ch.ChannelInputStream.read(Unknown Source)
        at sun.nio.ch.ChannelInputStream.read(Unknown Source)
        at sun.nio.ch.ChannelInputStream.read(Unknown Source)
        at java.io.FilterInputStream.read(Unknown Source)
        at java.io.PushbackInputStream.read(Unknown Source)
        at java.util.zip.ZipInputStream.readFully(Unknown Source)
        at java.util.zip.ZipInputStream.readLOC(Unknown Source)
        at java.util.zip.ZipInputStream.getNextEntry(Unknown Source)
        at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:51)
        at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:83)
        at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:267)
        at org.apache.poi.util.PackageHelper.open(PackageHelper.java:39)
        at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:204)
        at chalel.paratchalel.main(paratchalel.java:102)
    Caused by: java.io.IOException: java.lang.NullPointerException
        at com.google.appengine.tools.cloudstorage.dev.LocalRawGcsService$BlobStorageAdapter.getInstance(LocalRawGcsService.java:186)
        at com.google.appengine.tools.cloudstorage.dev.LocalRawGcsService$BlobStorageAdapter.access$000(LocalRawGcsService.java:109)
        at com.google.appengine.tools.cloudstorage.dev.LocalRawGcsService.ensureInitialized(LocalRawGcsService.java:194)
        at com.google.appengine.tools.cloudstorage.dev.LocalRawGcsService.readObjectAsync(LocalRawGcsService.java:432)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl.requestBlock(PrefetchingGcsInputChannelImpl.java:107)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl.<init>(PrefetchingGcsInputChannelImpl.java:88)
        at com.google.appengine.tools.cloudstorage.GcsServiceImpl.openPrefetchingReadChannel(GcsServiceImpl.java:126)
        at chalel.paratchalel.main(paratchalel.java:91)
    Caused by: java.lang.NullPointerException
        at com.google.appengine.tools.cloudstorage.dev.LocalRawGcsService$BlobStorageAdapter.<init>(LocalRawGcsService.java:123)
        at com.google.appengine.tools.cloudstorage.dev.LocalRawGcsService$BlobStorageAdapter.getInstance(LocalRawGcsService.java:184)
        ... 7 more

    Aug 17, 2017 6:58:35 PM com.google.appengine.tools.cloudstorage.RetryHelper doRetry
    INFO: RetryHelper(44.11 ms, 1 attempts, com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl$1@7bedc48a): Attempt #1 failed [java.io.IOException: PrefetchingGcsInputChannelImpl [filename=GcsFilename(testbucket, test1.xlsx), blockSizeBytes=2048, closed=false, eofHit=false, length=-1, fetchPosition=0, pendingFetch=com.google.common.util.concurrent.Futures$ImmediateFailedFuture@77f1baf5, retryParams=RetryParams [requestTimeoutMillis=30000, requestTimeoutRetryFactor=1.2, maxRequestTimeout=60000, retryMinAttempts=3, retryMaxAttempts=6, initialRetryDelayMillis=1000, maxRetryDelayMillis=32000, retryDelayBackoffFactor=2.0, totalRetryPeriodMillis=50000]]: Prefetch failed, prefetching again], sleeping 1146 ms
    Aug 17, 2017 6:58:36 PM com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl waitForFetch
    WARNING: PrefetchingGcsInputChannelImpl [filename=GcsFilename(testbucket, test1.xlsx), blockSizeBytes=2048, closed=false, eofHit=false, length=-1, fetchPosition=0, pendingFetch=com.google.common.util.concurrent.Futures$ImmediateFailedFuture@77f1baf5, retryParams=RetryParams [requestTimeoutMillis=30000, requestTimeoutRetryFactor=1.2, maxRequestTimeout=60000, retryMinAttempts=3, retryMaxAttempts=6, initialRetryDelayMillis=1000, maxRetryDelayMillis=32000, retryDelayBackoffFactor=2.0, totalRetryPeriodMillis=50000]]: IOException fetching block
    java.util.concurrent.ExecutionException: java.io.IOException: java.lang.NullPointerException
        at com.google.common.util.concurrent.Futures$ImmediateFailedFuture.get(Futures.java:234)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl.waitForFetch(PrefetchingGcsInputChannelImpl.java:152)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl.access$000(PrefetchingGcsInputChannelImpl.java:43)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl$1.call(PrefetchingGcsInputChannelImpl.java:136)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl$1.call(PrefetchingGcsInputChannelImpl.java:134)
        at com.google.appengine.tools.cloudstorage.RetryHelper.doRetry(RetryHelper.java:108)
        at com.google.appengine.tools.cloudstorage.RetryHelper.runWithRetries(RetryHelper.java:166)
        at com.google.appengine.tools.cloudstorage.RetryHelper.runWithRetries(RetryHelper.java:156)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl.waitForFetchWithRetry(PrefetchingGcsInputChannelImpl.java:134)
        at com.google.appengine.tools.cloudstorage.PrefetchingGcsInputChannelImpl.read(PrefetchingGcsInputChannelImpl.java:212)
        at sun.nio.ch.ChannelInputStream.read(Unknown Source)
        at sun.nio.ch.ChannelInputStream.read(Unknown Source)
        at sun.nio.ch.ChannelInputStream.read(Unknown Source)
        at java.io.FilterInputStream.read(Unknown Source)
        at java.io.PushbackInputStream.read(Unknown Source)
        at java.util.zip.ZipInputStream.readFully(Unknown Source)
        at java.util.zip.ZipInputStream.readLOC(Unknown Source)
        at java.util.zip.ZipInputStream.getNextEntry(Unknown Source)
        at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:51)
        at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:83)
        at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:267)
        at org.apache.poi.util.PackageHelper.open(PackageHelper.java:39)
        at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:204)
        at chalel.paratchalel.main(paratchalel.java:102)
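The trace above fails inside `ZipInputStream.getNextEntry` while POI is still opening the package, and the innermost cause is raised in `LocalRawGcsService`, which sits in the client library's `dev` package. That suggests the read from the channel is failing before any Excel parsing happens. To separate stream problems from POI problems, a small standard-library check along these lines may help. It is only a sketch: the class and method names (`XlsxStreamCheck`, `wrap`, `looksLikeZip`) are hypothetical, and it relies only on the fact that an .xlsx file is a ZIP archive.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper, not part of POI or the GCS client library.
final class XlsxStreamCheck {
    // ZIP local file header signature: 'P' 'K' 0x03 0x04.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Wraps a raw stream so that the header bytes can be pushed back after the check.
    static PushbackInputStream wrap(InputStream raw) {
        return new PushbackInputStream(raw, ZIP_MAGIC.length);
    }

    // Returns true if the stream starts with the ZIP signature.
    // Any bytes read are pushed back, so the stream can still be handed to a parser.
    static boolean looksLikeZip(PushbackInputStream in) throws IOException {
        byte[] header = new byte[ZIP_MAGIC.length];
        int read = 0;
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                break; // end of stream before a full header was available
            }
            read += n;
        }
        if (read > 0) {
            in.unread(header, 0, read);
        }
        if (read < header.length) {
            return false;
        }
        for (int i = 0; i < header.length; i++) {
            if (header[i] != ZIP_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

If `looksLikeZip(XlsxStreamCheck.wrap(inputStream))` itself throws the same `IOException`, the problem lies in how the stream is obtained, not in the workbook or in POI.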

Note: I have added the dependencies below to pom.xml:

    <!-- https://mvnrepository.com/artifact/org.apache.poi/poi-ooxml -->
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>3.9</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/com.google.appengine.tools/appengine-gcs-client -->
    <dependency>
        <groupId>com.google.appengine.tools</groupId>
        <artifactId>appengine-gcs-client</artifactId>
        <version>0.6</version>
    </dependency>

What could be the issue?
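For the Excel-to-CSV conversion step mentioned in the question, the CSV side can be handled separately from how the workbook is read. The following is a minimal sketch using only the standard library. It assumes the cell values of each row have already been extracted as strings (for example with POI), and the class and method names (`CsvRows`, `escape`, `toCsvLine`, `toCsv`) are hypothetical. It applies the usual CSV quoting convention: fields containing commas, quotes, or line breaks are wrapped in double quotes, and embedded quotes are doubled.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for turning already-extracted cell strings into CSV text.
final class CsvRows {
    // Quotes a single field when it contains a comma, a double quote, or a line break.
    static String escape(String cell) {
        if (cell == null) {
            return ""; // treat a missing cell as an empty field
        }
        boolean needsQuotes = cell.contains(",") || cell.contains("\"")
                || cell.contains("\n") || cell.contains("\r");
        if (!needsQuotes) {
            return cell;
        }
        return "\"" + cell.replace("\"", "\"\"") + "\"";
    }

    // Joins the escaped fields of one row with commas.
    static String toCsvLine(List<String> cells) {
        return cells.stream().map(CsvRows::escape).collect(Collectors.joining(","));
    }

    // Joins all rows with CRLF line endings.
    static String toCsv(List<List<String>> rows) {
        return rows.stream().map(CsvRows::toCsvLine).collect(Collectors.joining("\r\n"));
    }
}
```

Keeping this step free of POI types means it can be tested without a workbook, and the same code works whether the rows come from an `XSSFSheet` or from some other reader.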

