Downloading a file from the internet with Python


I'm trying to retrieve CSV data from a website through a link.

When downloaded manually, synop.201708.csv.gz is in fact a CSV wrongly named .gz, and it weighs 2233 KB.

When running this code:

    import urllib

    file_date = '201708'
    file_url = "https://donneespubliques.meteofrance.fr/donnees_libres/txt/synop/archive/synop.{}.csv.gz".format(file_date)
    output_file_name = "{}.csv.gz".format(file_date)

    print "downloading {} {}".format(file_url, output_file_name)
    urllib.urlretrieve(file_url, output_file_name)

I'm getting a corrupted ~361 KB file.

Any ideas why?

What seems to be happening is that the Météo-France site is misusing the Content-Encoding header. The website reports that it is serving a gzip file (Content-Type: application/x-gzip) and that it is encoding it in gzip format for the transfer (Content-Encoding: x-gzip). It is also saying that the page is an attachment, which should be saved under its normal name (Content-Disposition: attachment).
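One way to check this yourself is to look at the response headers. The following is only a minimal sketch, assuming Python 3 (where the HTTP client lives in urllib.request) and assuming the server answers HEAD requests; fetch_headers and claims_double_gzip are hypothetical helper names, not part of any library.

```python
import urllib.request


def fetch_headers(url):
    """Send a HEAD request and return the response headers (needs network)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return response.headers


def claims_double_gzip(headers):
    """True if the headers describe a gzip file that is gzip-encoded again."""
    content_type = (headers.get("Content-Type") or "").lower()
    content_encoding = (headers.get("Content-Encoding") or "").lower()
    return "gzip" in content_type and content_encoding in ("gzip", "x-gzip")
```

Passing the result of fetch_headers(file_url) to claims_double_gzip should return True if the server still sends the headers described above.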

In a vacuum, this would make sense (to a degree; compressing an already compressed file is useless): the server serves a gzip file and compresses it again for transport. Upon receipt, the browser undoes the transport compression and saves the original gzip file. Here, it decompresses the stream, but since the file wasn't compressed a second time, this doesn't work as expected.
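A possible workaround is to undo the compression yourself, since urllib hands back the response body as sent and does not apply Content-Encoding. This is a minimal sketch, assuming Python 3 and assuming the body is a single plain gzip stream; gunzip_if_needed and download_csv are hypothetical names introduced for illustration.

```python
import gzip
import urllib.request

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def gunzip_if_needed(data):
    """Return decompressed bytes if data is gzip-compressed, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def download_csv(file_url, output_file_name):
    """Fetch file_url and save the decompressed CSV to output_file_name."""
    # urllib does not decode Content-Encoding, so this is the raw gzip body.
    with urllib.request.urlopen(file_url) as response:
        raw = response.read()
    with open(output_file_name, "wb") as out:
        out.write(gunzip_if_needed(raw))
```

Calling download_csv(file_url, "synop.201708.csv") with the URL from the question should then write the plain CSV, matching what the browser saves. The magic-byte check is there so the function still works if the server ever starts sending the CSV uncompressed.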

