tensorflow - Why do an attention mechanism and pretrained word vectors not improve seq2seq model performance?


I'm building a chatbot seq2seq model in TensorFlow, and it works. The model uses an LSTM with no attention. The training set is a chat log with 350k questions and 350k answers.

I then tried to improve its performance, aiming for a smaller loss. I tried adding these changes to the model: 1. adding an attention mechanism, 2. using a bidirectional LSTM, 3. using pretrained word vectors trained on the training set.
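For reference, here is a minimal sketch of what the attention step in change 1 computes, assuming dot-product (Luong-style) scoring. It is plain Python for illustration only, not the TensorFlow code from my model; the names `softmax` and `dot_attention` and the toy vectors are hypothetical.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_attention(decoder_state, encoder_states):
    # Score each encoder state by its dot product with the decoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of encoder states.
    dim = len(decoder_state)
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Toy example: three encoder time steps, hidden size 2 (made-up values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]
weights, context = dot_attention(decoder_state, encoder_states)
print(weights, context)
```

In a real seq2seq decoder the context vector would typically be combined with the decoder state at every output step before predicting the next token.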

With the same hyperparameters, I tried adding each change to the model on its own, and also all 3 changes together. None of them performed better than the original. At best they reach roughly 80%-90% of the original model's performance.

So what is wrong, and how can I make these changes improve the network's performance?

Thanks.

