Python SciKitLearn and Pandas categoric data

July 15, 2014

I'm working on a multivariable regression from a CSV, predicting crop performance based on multiple factors. Some of my columns are numerical and meaningful. Others are numerical and categorical, or strings and categorical (for instance, crop variety, or plot code, or whatever). How do I teach Python to use them? I've found the one-hot encoder (http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.onehotencoder.html#sklearn.preprocessing.onehotencoder) but don't understand how to apply it here.

My code so far:

    import pandas as pd
    import statsmodels.api as sm
    from sklearn.preprocessing import StandardScaler

    df = pd.read_csv('filepath.csv')

    # drop rows where the label is missing
    df.drop(df[df['labeleddatacolumn'].isnull()].index.tolist(), inplace=True)

    scale = StandardScaler()

    pd.options.mode.chained_assignment = None  # default='warn'
    x = df[['inputcolumn1', 'inputcolumn2', ...,'inputcolumn20']]
    y = df['labeleddatacolumn']

    # .values is used here because as_matrix() was removed in pandas 1.0
    x[['inputcolumn1', 'inputcolumn2', ...,'inputcolumn20']] = scale.fit_transform(x[['inputcolumn1', 'inputcolumn2', ...,'inputcolumn20']].values)

    #print (x)

    est = sm.OLS(y, x).fit()

    est.summary()

You can use the get_dummies function that pandas provides to convert the categorical values.

Something like this:

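    # numerical columns and the label are kept as they are;
    # each categorical column is expanded into 0/1 indicator columns
    predictor = pd.concat([data.get(['numerical_column_1', 'numerical_column_2', 'label']),
                           pd.get_dummies(data['categorical_column1'], prefix='categorical_col1'),
                           pd.get_dummies(data['categorical_column2'], prefix='categorical_col2')],
                          axis=1)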
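To see what get_dummies actually produces, here is a small sketch on made-up data. The frame and the column names "rainfall" and "variety" are hypothetical, just for illustration; dtype=float is passed so the indicator columns come out as plain numbers regardless of pandas version.

```python
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
data = pd.DataFrame({
    "rainfall": [10.0, 12.5, 9.0, 11.0],
    "variety": ["A", "B", "A", "C"],
})

# One indicator column per distinct value, named <prefix>_<value>.
dummies = pd.get_dummies(data["variety"], prefix="variety", dtype=float)

print(dummies.columns.tolist())  # ['variety_A', 'variety_B', 'variety_C']
print(dummies)
```

Each row has exactly one 1.0 among the variety_* columns, which is what lets a regression treat the category as a set of numeric inputs.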

Then get the outcome/label column by doing:

    outcome = predictor['label']
    del predictor['label']

Then call the model on the data by doing:

    est = sm.OLS(outcome, predictor).fit()
