python - Sklearn | LinearRegression | Fit


I'm having a few issues with the LinearRegression algorithm in scikit-learn. I have trawled through the forums and Googled a lot, but for some reason I haven't managed to get past the error. I'm using Python 3.5.

Below is what I've attempted, but I keep getting a ValueError: "Found input variables with inconsistent numbers of samples: [403, 174]"

x = df[["impressions", "clicks", "eligible_impressions", "measureable_impressions", "viewable_impressions"]].values
y = df["total_conversions"].values.reshape(-1, 1)

print("the shape of x {}".format(x.shape))
print("the shape of y {}".format(y.shape))

the shape of x (577, 5)
the shape of y (577, 1)

x_train, y_train, x_test, y_test = train_test_split(x, y, test_size=0.3, random_state = 42)
linreg = LinearRegression()
linreg.fit(x_train, y_train)
y_pred = linreg.predict(x_test)
print(y_pred)

print("the shape of x_train {}".format(x_train.shape))
print("the shape of y_train {}".format(y_train.shape))
print("the shape of x_test {}".format(x_test.shape))
print("the shape of y_test {}".format(y_test.shape))

the shape of x_train (403, 5)
the shape of y_train (174, 5)
the shape of x_test (403, 1)
the shape of y_test (174, 1)

Am I missing something glaringly obvious?

Any help would be appreciated.

Kind regards, Adrian

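It looks like your train and test sets contain different numbers of rows for x and y. That is because you're storing the return values of train_test_split() in the incorrect order: the function returns the train and test parts of the first array, followed by the train and test parts of the second array.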

change this

x_train, y_train, x_test, y_test = train_test_split(x, y, test_size=0.3, random_state = 42) 

to this

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state = 42) 
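To check the fix end to end, here is a minimal sketch that uses random synthetic data in place of the question's DataFrame (the 577 x 5 array and the linear target are assumptions made purely for illustration, not the original data). It unpacks the splits in the corrected order and asserts that the row counts agree before fitting:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the question's data (assumption): 577 rows, 5 features.
rng = np.random.default_rng(42)
x = rng.random((577, 5))
y = (x @ np.array([1.0, 2.0, 3.0, 4.0, 5.0])).reshape(-1, 1)

# train_test_split returns the splits of each input array in turn:
# (x_train, x_test) first, then (y_train, y_test).
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=42
)

# Features and targets must have matching row counts within each split.
assert x_train.shape[0] == y_train.shape[0]
assert x_test.shape[0] == y_test.shape[0]

linreg = LinearRegression()
linreg.fit(x_train, y_train)
y_pred = linreg.predict(x_test)

print("x_train {}, y_train {}".format(x_train.shape, y_train.shape))
print("x_test {}, y_test {}".format(x_test.shape, y_test.shape))
print("y_pred {}".format(y_pred.shape))
```

With this ordering, x_train and y_train both have 403 rows and x_test and y_test both have 174, so fit() no longer raises the ValueError.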
