python - Pandas dataframe cannot convert columns datatype from object to string for further operation
This is working code that downloads an Excel file from a website. It takes 40 seconds.
Once you run the code, you will notice that the key1, key2, and key3 columns have object dtypes. I cleaned the dataframe so that key1 and key2 contain only alphanumeric values, but pandas still keeps them as object dtype. I need to concatenate key1 and key2 (as in MS Excel) to create a separate column called deviceid. I realize I cannot join the two columns since they are object dtypes. How do I convert them to string so I can create the new column?
    import pandas as pd
    import urllib.request
    import time

    start = time.time()
    url = "https://www.misoenergy.org/library/repository/market%20reports/20170816_da_bcsf.xls"
    cnstsfxls = urllib.request.urlopen(url)
    xlsf = pd.ExcelFile(cnstsfxls)
    dfsf = xlsf.parse("sheet1", skiprows=3)
    # drop the last row of the sheet
    dfsf.drop(dfsf.index[len(dfsf)-1], inplace=True)
    # drop rows whose device type is 'un' or 'unknown'
    dfsf.drop(dfsf[dfsf['device type'] == 'un'].index, inplace=True)
    dfsf.drop(dfsf[dfsf['device type'] == 'unknown'].index, inplace=True)
    # drop columns that are not needed
    dfsf.drop(['constraint name', 'contingency name', 'constraint type', 'flowgate name'], axis=1, inplace=True)
    end = time.time()
    print("the entire process took - ", end-start, " seconds.")
I may be missing the point here. If you want to construct a column where, for example, deviceid = rch417 when key1 = rch and key2 = 417, then dfsf['deviceid'] = dfsf['key1'] + dfsf['key2'] should work fine even though both columns are of type object.
Try this:
    # check value types
    dfsf.dtypes
    # add the desired column
    dfsf['deviceid'] = dfsf['key1'] + dfsf['key2']
    # inspect the columns of interest
    keep = ['key1', 'key2', 'deviceid']
    df_keys = dfsf[keep]
    print(df_keys.dtypes)
    print(df_keys.head())
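If the plain + raises a TypeError, one possible cause is that some cells in a key column are numbers rather than text. Below is a minimal sketch of casting both columns to str before joining them. The two-row frame and its values are made up for illustration and are not taken from the downloaded file.

```python
import pandas as pd

# Hypothetical sample data standing in for the downloaded sheet.
# key2 mixes an int and a str, so pandas stores it with dtype object.
df = pd.DataFrame({"key1": ["rch", "abc"], "key2": [417, "12b"]})
print(df.dtypes)

# Cast each column to str first, so every cell is text before
# the element-wise concatenation.
df["deviceid"] = df["key1"].astype(str) + df["key2"].astype(str)
print(df)
```

When every cell in both columns is already a Python string, the cast should change nothing and the plain + from the answer is enough.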