python - Machine learning: how to use the past 20 rows as an input for X for each Y value
I have some simple machine learning code here:

    # load dataset
    dataframe = pandas.read_csv("usdjpy,5.csv", header=None)
    dataset = dataframe.values
    X = dataset[:, 0:59]
    Y = dataset[:, 59]
    # fit Dense Keras model
    model.fit(X, Y, validation_data=(x, y_test), epochs=150, batch_size=10)

My X values are 59 features, with the 60th column being my Y value, a simple 1 or 0 classification label.
Considering that I am using financial data, I would like to look back at the past 20 X values in order to predict the Y value.
So how could I make my algorithm use the past 20 rows as an input for X for each Y value?
I'm relatively new to machine learning and have spent a lot of time looking online for a solution to my problem, yet I could not find anything as simple as my case.
Any ideas?
This is typically done with recurrent neural networks (RNNs), which retain some memory of the previous input when the next input is received. That's a very brief explanation of what goes on, but there are plenty of sources on the internet to help you better understand how they work.
Let's break this down in a simple example. Let's say you have 5 samples and 5 features of data, and you want to stagger the data by 2 rows instead of 20. Here is your data (assuming 1 stock, with the oldest price value first). You can think of each row as a day of the week.
    import numpy as np

    ar = np.random.randint(10, 100, (5, 5))

    [[43, 79, 67, 20, 13],  #<---Monday---
     [80, 86, 78, 76, 71],  #<---Tuesday---
     [35, 23, 62, 31, 59],  #<---Wednesday---
     [67, 53, 92, 80, 15],  #<---Thursday---
     [60, 20, 10, 45, 47]]  #<---Friday---

To use an LSTM in Keras, your data needs to be 3-D, versus the 2-D structure it has now, and the notation for the dimensions is (samples, timesteps, features). Currently you only have (samples, features), so you need to augment the data.
    a2 = np.concatenate([ar[x:x+2, :] for x in range(ar.shape[0] - 1)])
    a2 = a2.reshape(4, 2, 5)

    [[[43, 79, 67, 20, 13],   # see Monday first
      [80, 86, 78, 76, 71]],  # see Tuesday second ---> predict value set for Tuesday
     [[80, 86, 78, 76, 71],   # see Tuesday first
      [35, 23, 62, 31, 59]],  # see Wednesday second ---> predict value set for Wednesday
     [[35, 23, 62, 31, 59],   # see Wednesday values first
      [67, 53, 92, 80, 15]],  # see Thursday values second ---> predict value set for Thursday
     [[67, 53, 92, 80, 15],   # and so on
      [60, 20, 10, 45, 47]]]

Notice how the data is staggered and 3-dimensional. Now just make an LSTM network. Y remains 2-D since this is a many-to-one structure; however, you need to clip the first value, because the first row has no earlier row to pair with.
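Before building the network, it may help to see the staggering and the label clipping done together for an arbitrary lookback. The following is only a minimal sketch: `make_windows` is a hypothetical helper name, the 0/1 labels are made up for illustration, and plain Python lists are used so it runs without any libraries.

```python
def make_windows(rows, labels, lookback):
    """Pair each label with the `lookback` rows ending at that label's row."""
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")
    if lookback < 1 or lookback > len(rows):
        raise ValueError("lookback must be between 1 and len(rows)")
    windows, targets = [], []
    # The first complete window ends at index lookback - 1, so the first
    # lookback - 1 labels have no full history and are dropped ("clipped").
    for end in range(lookback - 1, len(rows)):
        start = end - lookback + 1
        windows.append([list(row) for row in rows[start:end + 1]])
        targets.append(labels[end])
    return windows, targets


# The 5x5 example data, with hypothetical 0/1 labels (one per day).
ar = [[43, 79, 67, 20, 13],
      [80, 86, 78, 76, 71],
      [35, 23, 62, 31, 59],
      [67, 53, 92, 80, 15],
      [60, 20, 10, 45, 47]]
y = [0, 1, 0, 1, 1]

windows, targets = make_windows(ar, y, lookback=2)
print(len(windows), len(windows[0]), len(windows[0][0]))  # 4 2 5
print(targets)  # [1, 0, 1, 1]
```

With `lookback=20` on your own data, `n` rows would give `n - 19` windows, and wrapping the result in `np.array(windows)` should yield the (samples, timesteps, features) array that the LSTM expects.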
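    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    model = Sequential()
    model.add(LSTM(hidden_dims, input_shape=(a2.shape[1], a2.shape[2])))  # hidden_dims: number of LSTM units, your choice
    model.add(Dense(1))

This is just a brief example to get you moving. There are many different setups that will work (including not using an RNN); you need to find the correct one for your data.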
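As one illustration of a setup that does not use an RNN, each window of past rows can be flattened into a single long feature vector, so the input stays 2-D and an ordinary Dense model can consume it. This is a sketch under that assumption, not a recommendation for your data; `flatten_windows` is a hypothetical helper name.

```python
def flatten_windows(rows, lookback):
    """Turn each run of `lookback` consecutive rows into one flat feature vector."""
    if lookback < 1 or lookback > len(rows):
        raise ValueError("lookback must be between 1 and len(rows)")
    flat = []
    for end in range(lookback - 1, len(rows)):
        window = rows[end - lookback + 1:end + 1]
        # Oldest row's features come first, newest row's features last.
        flat.append([value for row in window for value in row])
    return flat


# Same 5x5 example data as in the answer.
example = [[43, 79, 67, 20, 13],
           [80, 86, 78, 76, 71],
           [35, 23, 62, 31, 59],
           [67, 53, 92, 80, 15],
           [60, 20, 10, 45, 47]]

flat = flatten_windows(example, lookback=2)
print(len(flat), len(flat[0]))  # 4 10
```

For 59 features and a lookback of 20, each sample would have 20 * 59 = 1180 inputs, and the labels would need the same clipping of the first 19 values.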