python - ParserError with pandas read_csv


August 15, 2010

I'm trying to read a txt file that has a different number of columns per row. Here's the beginning of the file:

60381 6
1 0.270 0.30 0.30 0.70 0.70
4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988
2 0.078 0.30 0.30 0.70 0.70
5.387 5.312 5.338 4.463 4.675 4.275 4.238 3.562 3.175 3.925 4.950 4.762
6 0.241 0.30 0.60 0.70 0.40
3.700 3.200 2.738 2.325 1.250 0.975 1.175 1.950 2.488 3.613 3.987 3.950
7 0.357 0.30 0.60 0.70 0.40
1.212 1.125 1.050 0.950 0.663 0.488 0.425 0.512 0.637 0.900 1.112 1.188
8 0.031 0.30 0.70 0.70 0.30
0.225 0.213 0.200 0.175 0.200 0.213 0.375 0.887 0.975 0.512 0.262 0.262
10 0.022 0.30 0.80 0.70 0.20
0.712 0.700 0.738 0.550 0.513 0.688 0.613 0.600 0.850 0.812 0.800 0.775
60382 5
6 0.197 0.30 0.60 0.70 0.40
3.700 3.200 2.738 2.325 1.250 0.975 1.175 1.950 2.488 3.613 3.987 3.950
7 0.413 0.30 0.60 0.70 0.40
1.212 1.125 1.050 0.950 0.663 0.488 0.425 0.512 0.637 0.900 1.112 1.188
8 0.016 0.30 0.70 0.70 0.30
0.225 0.213 0.200 0.175 0.200 0.213 0.375 0.887 0.975 0.512 0.262 0.262
10 0.111 0.30 0.80 0.70 0.20
0.712 0.700 0.738 0.550 0.513 0.688 0.613 0.600 0.850 0.812 0.800 0.775
11 0.263 0.30 0.50 0.70 0.50
1.812 1.388 1.087 0.825 0.538 0.400 0.338 0.400 0.500 0.925 0.962 1.100

I've tried using pandas read_csv to read it:

import pandas as pd
data = pd.read_csv('./myfile.txt', header=None, sep='\s')

which gives the following error:

ParserError: Expected 6 fields in line 3, saw 12. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

However, my file doesn't have a multi-char delimiter or quotation marks. I've also tried a solution found in this forum, which suggested using:

data = pd.read_csv(open('./myfile.txt', 'r'), header=None, encoding='utf-8', engine='c')

Although this solves the error above, the result I'm presented with does not use the space as the column delimiter, and the output has only 1 column.

How should I read this file in order to get a column for each value? I don't mind if there are NaN values to fill the rest.

If you've managed to get the data into a single column, you can use Series.str.split() to work around the issue.

Here is an example with the sample data you provided (you can use a string or a regex as the delimiter in split()):

df[0].str.split(' ', expand=True)

       0      1      2      3      4      5      6      7      8      9   \
0  0.270   0.30   0.30   0.70   0.70   None   None   None   None   None
1  4.988  4.988  4.988  4.988  4.988  4.988  4.988  4.988  4.988  4.988
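For a version that runs end to end, here is a minimal sketch of the same workaround. It assumes the file is read with the default comma separator (so each line ends up in one cell) and uses io.StringIO with the first three sample lines in place of './myfile.txt'. The names sample, wide and numeric are only illustrative, and the final pd.to_numeric step is an addition, not part of the original answer.

```python
import io

import pandas as pd

# First three lines of the sample data; io.StringIO stands in for the real file.
sample = (
    "60381 6\n"
    "1 0.270 0.30 0.30 0.70 0.70\n"
    "4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988\n"
)

# The data contains no commas, so the default separator leaves each line in one cell.
df = pd.read_csv(io.StringIO(sample), header=None)

# Split every cell on whitespace; shorter rows are padded with missing values.
wide = df[0].str.split(expand=True)

# Convert the string cells to numbers; the padding becomes NaN.
numeric = wide.apply(pd.to_numeric)
print(numeric)
```

Calling str.split() without a delimiter splits on any run of whitespace, which is a little more forgiving than ' ' if the real file contains tabs or repeated spaces.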

If you do this, you might as well create the DataFrame with pd.DataFrame(open(...).readlines()) or something like that, since you don't get any benefit from read_csv(), and the file isn't a standard CSV file.

# f is a StringIO of the sample data, to simulate a file
df = pd.DataFrame(line.strip().split(' ') for line in f)

        0      1      2      3      4      5      6      7      8      9   \
0   60381      6   None   None   None   None   None   None   None   None
1       1  0.270   0.30   0.30   0.70   0.70   None   None   None   None
2   4.988  4.988  4.988  4.988  4.988  4.988  4.988  4.988  4.988  4.988
3       2  0.078   0.30   0.30   0.70   0.70   None   None   None   None
4   5.387  5.312  5.338  4.463  4.675  4.275  4.238  3.562  3.175  3.925

Of course, you can also fix the input file by making sure every line contains the same number of columns, which will solve the ParserError issue.
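One way to do that fix in memory rather than by editing the file is sketched below. This is only an illustration under some assumptions: the fields are separated by whitespace, padding short lines with the token NaN is acceptable (read_csv treats it as a missing value by default), and the names raw, rows, width and padded are hypothetical. io.StringIO again stands in for the real file.

```python
import io

import pandas as pd

# First three lines of the sample data, standing in for the contents of the file.
raw = (
    "60381 6\n"
    "1 0.270 0.30 0.30 0.70 0.70\n"
    "4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988\n"
)

# Split each non-empty line into fields and find the widest row.
rows = [line.split() for line in raw.splitlines() if line.strip()]
width = max(len(fields) for fields in rows)

# Pad the shorter rows with "NaN" so every line has the same number of fields.
padded = "\n".join(" ".join(fields + ["NaN"] * (width - len(fields))) for fields in rows)

# Every line now has `width` fields, so read_csv no longer hits the ParserError.
data = pd.read_csv(io.StringIO(padded), header=None, sep=" ")
print(data.shape)
```

Because the padding is recognised as missing data, every column comes back as a numeric dtype with NaN filling the gaps, which matches what the question asks for.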
