COMPARISON OF DIFFERENT LSTM MODELS ON THE UCI NEWS AGGREGATOR DATASET

WORK IMPLEMENTED:

I implemented two simplified LSTM variants, obtained by reducing the number of parameters, and compared them with the base LSTM across different activation functions and learning rates. The data is balanced: 10,000 news titles are taken from each category. The model also uses a bidirectional LSTM layer to improve performance. The models are evaluated on the UCI News Aggregator dataset from Kaggle.
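
The balanced subset described above could be drawn along these lines. This is a minimal sketch under stated assumptions, not the project's actual code: the function name `balanced_sample`, the `(title, category)` row layout, and the toy rows are all hypothetical, and only the standard library is used. The letters b, t, e, m mirror the dataset's category codes.

```python
import random
from collections import defaultdict


def balanced_sample(rows, per_class, seed=0):
    """Return `per_class` randomly chosen rows from every category.

    `rows` is an iterable of (title, category) pairs (an assumed layout).
    """
    # Group the rows by their category label.
    by_category = defaultdict(list)
    for title, category in rows:
        by_category[category].append((title, category))

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    sample = []
    for category in sorted(by_category):
        items = by_category[category]
        if len(items) < per_class:
            raise ValueError(f"category {category!r} has only {len(items)} rows")
        # Sample without replacement so no title is repeated.
        sample.extend(rng.sample(items, per_class))

    rng.shuffle(sample)  # avoid feeding the model one category at a time
    return sample


# Toy data standing in for the real titles: 10 rows per category.
toy_rows = [(f"title {i}", "betm"[i % 4]) for i in range(40)]
subset = balanced_sample(toy_rows, per_class=5)
```

For the setup described here, `per_class` would be 10000 and `rows` would come from the dataset's title and category columns.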

CONCLUSION:

Two simplified LSTM models (LSTM10 and LSTM11) were implemented successfully.

With the sigmoid activation function, the base LSTM performed better than both LSTM10 and LSTM11, and LSTM10 outperformed LSTM11.

One important observation was that the model was overfitting: the training accuracy was higher than the validation accuracy in almost all cases (approx. 20% higher).

This issue can be addressed by applying regularization and tuning the hyperparameters, or by increasing the training data. To save time, only 10,000 news titles were taken from each category.
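
The overfitting check described above can be made concrete by tracking the gap between training and validation accuracy per epoch. The sketch below is illustrative only: the accuracy curves are made-up numbers, not results from this project, and early stopping is shown as one possible regularization option that the report does not claim to have used. The names `accuracy_gap` and `best_epoch` are hypothetical.

```python
def accuracy_gap(train_acc, val_acc):
    """Per-epoch difference between training and validation accuracy."""
    return [t - v for t, v in zip(train_acc, val_acc)]


def best_epoch(val_acc, patience=2):
    """Index of the epoch early stopping would keep.

    Training is considered stopped once validation accuracy has failed
    to improve for `patience` consecutive epochs.
    """
    best, best_idx, waited = float("-inf"), 0, 0
    for i, acc in enumerate(val_acc):
        if acc > best:
            best, best_idx, waited = acc, i, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_idx


# Made-up curves for illustration (NOT the project's measurements).
train = [0.60, 0.75, 0.85, 0.92, 0.96]
val = [0.58, 0.68, 0.72, 0.71, 0.70]

gaps = accuracy_gap(train, val)
stop_at = best_epoch(val, patience=2)
```

A gap that keeps widening while validation accuracy stalls, as in these toy curves, is the pattern that suggests stopping earlier or regularizing more strongly.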

Attached below is the formal IEEE-format report for the project, with the results written up in full.
