Writing .txt to .csv Excel columns in Python
I have a rather large text file with multiple columns that I must convert to a 15-column .csv file to be read in Excel. The logic for parsing the fields I need is written out below, but I'm having trouble writing it to .csv.
    columns = [
        'transactn_nbr', 'record_nbr', 'sequence_or_pic_nbr', 'cr_db', 'rt_nbr',
        'account_nbr', 'rsn_cod', 'item_amount', 'item_serial', 'chn_ind',
        'reason_descr', 'seq2', 'archive_date', 'archive_time', 'on_us_ind'
    ]

    # a, b, c (split delimiters), lines, and rtnbr are not shown in this excerpt
    for line in in_file:
        values = line.split()
        if 'print date:' in line:
            dtevalue = line.split(a, 1)[-1].split(b)[0]
            lines.append(dtevalue)
        elif 'print time:' in line:
            timevalue = line.split(c, 1)[-1].split(b)[0]
            lines.append(timevalue)
        elif (len(values) >= 4 and values[3] == 'c'
                and len(values[2]) >= 2 and values[2][:2] == '41'):
            print(values)
        elif (len(values) >= 5 and values[3] == 'd' and values[4] in rtnbr):
            on_us = '1'
        else:
            on_us = '0'

    print(lines[0])
    print(lines[1])
I have tried the csv module with the parsed rows written in 12 columns, but could not find a way to write the date and time (parsed separately) in the columns after each row. I am also looking at the pandas package, but I have only seen ways to extract patterns, which wouldn't work with the established parsing criteria.
Is there a way to write to csv using the above criteria? Or do I have to scrap it and rewrite the code within a specific package? Any help is appreciated.
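One way to approach this with only the csv module is to collect the matching rows first and append the date and time when writing, once the whole file has been read. The sketch below is a minimal illustration under several assumptions: the function name `report_to_csv` and the parameter `on_us_rt_nbrs` are hypothetical, the date and time are taken as everything after the `print date:` / `print time:` labels (the question's actual delimiters `a`, `b`, `c` are not shown), `chn_ind` is left blank, and `on_us_ind` is derived from `rt_nbr`.

```python
import csv
import io

COLUMNS = [
    "transactn_nbr", "record_nbr", "sequence_or_pic_nbr", "cr_db", "rt_nbr",
    "account_nbr", "rsn_cod", "item_amount", "item_serial", "chn_ind",
    "reason_descr", "seq2", "archive_date", "archive_time", "on_us_ind",
]


def report_to_csv(in_file, out_file, on_us_rt_nbrs=frozenset()):
    """Collect matching rows, then write them with date/time appended."""
    archive_date = archive_time = ""
    rows = []
    for line in in_file:
        values = line.split()
        if "print date:" in line:
            archive_date = line.split("print date:", 1)[1].strip()
        elif "print time:" in line:
            archive_time = line.split("print time:", 1)[1].strip()
        elif len(values) >= 10 and values[3] == "c" and values[2][:2] == "41":
            on_us = "1" if values[4] in on_us_rt_nbrs else "0"
            # Nine leading fields, blank chn_ind, reason text, seq2, on_us
            rows.append(
                values[:9] + ["", " ".join(values[9:]), values[2][:2], on_us]
            )

    writer = csv.writer(out_file)
    writer.writerow(COLUMNS)
    for row in rows:
        # Date and time are known only after the full pass, so slot them
        # in here, between seq2 and on_us_ind.
        writer.writerow(row[:12] + [archive_date, archive_time] + row[12:])
    return len(rows)


# Hypothetical input: two rows from the sample plus assumed date/time lines
sample = io.StringIO(
    "print date: 3-aug-17\n"
    "34,615 207 4100223726 c 538196620 9866597322 10 $645.49 00 credit\n"
    "34,616 207 4100223727 d 022000046 8891636675 31 $645.49 111583 dr on-\n"
    "print time: 11:15:51\n"
)
out = io.StringIO(newline="")
count = report_to_csv(sample, out)
print(out.getvalue())
```

With a real file, `in_file` and `out_file` would be opened with `open(...)`, passing `newline=""` for the output file as the csv module documentation recommends.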
Edit: text file sample:
* start ******************************************************************************************************************** start *
* start ******************************************************************************************************************** start *
* start ******************************************************************************************************************** start *
1--------------------
1antecr09 chek dpck_r_009 transit extract sub-system current date = 08/03/2017 journal report page 1
process date = id = 022000046-mnt file header = h080320171115
+____________________________________________________________________________________________________________________________________
r t sequence cr bt rsn item item chn user reaso
nbr nbr or pic nbr db nbr nbr cod amount serial ind .......field.. descr
5,556 01 7450282689 c 538196640 9835177743 15 $9,064.81 00 credit
5,557 01 7450282690 d 031301422 362313705 38 $592.35 43431 dr cr
5,558 01 7450282691 d 021309379 601298839 38 $1,491.04 44896 dr cr
5,559 01 7450282692 d 071108834 176885 38 $6,688.00 1454 dr cr
5,560 01 7450282693 d 031309123 1390001566241 38 $293.42 6878 dr cr
--------------------
34,615 207 4100223726 c 538196620 9866597322 10 $645.49 00 credit
34,616 207 4100223727 d 022000046 8891636675 31 $645.49 111583 dr on-
--------------------
34,617 208 4100223728 c 538196620 11701364 10 $756.19 00 credit
34,618 208 4100223729 d 071923828 00 54 $305.31 11384597 bad ac
34,619 208 4100223730 d 071923828 35110011 30 $450.88 10913052 6 dr sel
--------------------
Desired output: I am only looking at lines that contain a seq starting with 41 (as in the code above) and that contain c.
1293 83834 4100225908 c 538196620 9860890913 10 161.5 0 credit 41 3-aug-17 11:15:51
1294 83838 4100225911 c 538196620 25715845 10 138 0 credit 41 3-aug-17 11:15:51
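Look at the `pandas` package, more specifically the `DataFrame` class. With a little cleverness you ought to be able to read your table in using `pandas.read_table()`, which returns a DataFrame that you can output to csv with `to_csv()`, effectively a 2-line solution. You'll need to look at the docs to find the parameters you'll need to read your table format, but it should be a little easier than doing it manually.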
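As a rough illustration of that suggestion, the sketch below pre-filters the report lines with the question's criteria and hands only those lines to `pandas.read_table()`. Several things here are assumptions rather than part of the original post: the inline `report` text, the ten-name column list for the whitespace-separated fields, and the hard-coded date and time (which the question parses separately). Because the raw report has ragged rows and header lines, the pre-filter is what makes a plain whitespace separator workable.

```python
import io

import pandas as pd

# Hypothetical excerpt of the report, using rows from the sample above
report = """\
34,615 207 4100223726 c 538196620 9866597322 10 $645.49 00 credit
34,616 207 4100223727 d 022000046 8891636675 31 $645.49 111583 dr on-
34,617 208 4100223728 c 538196620 11701364 10 $756.19 00 credit
"""

# Names for the ten whitespace-separated fields of a credit row (assumed)
names = [
    "transactn_nbr", "record_nbr", "sequence_or_pic_nbr", "cr_db", "rt_nbr",
    "account_nbr", "rsn_cod", "item_amount", "item_serial", "reason_descr",
]


def is_credit_41(line):
    """Question's criteria: cr_db is 'c' and the sequence starts with 41."""
    values = line.split()
    return len(values) == 10 and values[3] == "c" and values[2][:2] == "41"


credit_lines = [ln for ln in report.splitlines() if is_credit_41(ln)]

# dtype=str keeps values such as "00" and "34,615" exactly as written
df = pd.read_table(
    io.StringIO("\n".join(credit_lines)),
    sep=r"\s+",
    header=None,
    names=names,
    dtype=str,
)

# Columns derived or parsed separately are added to every row at once
df["seq2"] = df["sequence_or_pic_nbr"].str[:2]
df["archive_date"] = "3-aug-17"   # assumed value, parsed elsewhere
df["archive_time"] = "11:15:51"   # assumed value, parsed elsewhere

csv_text = df.to_csv(index=False)  # pass a file path instead to write to disk
print(csv_text)
```

Assigning a scalar to a new column broadcasts it to every row, which sidesteps the problem of attaching the separately parsed date and time to each record.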