COMPARISON OF DIFFERENT LSTM MODELS ON UCI NEWS AGGREGATOR DATASET.
WORK IMPLEMENTED:
​
I implemented two simplified LSTM variants, obtained by reducing the number of parameters, and compared them with the base LSTM under different activation functions and learning rates. The data is balanced: 10,000 news titles were taken from each category. The model also uses a bidirectional LSTM layer to increase efficiency. All models are evaluated on the UCI News Aggregator dataset from Kaggle.
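The balancing step described above can be sketched in plain Python. This is a minimal illustration, not the project's actual preprocessing code: the helper name `balanced_sample`, the `seed` parameter, and the assumption that each row is a dict with `TITLE` and `CATEGORY` keys are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def balanced_sample(rows, label_key="CATEGORY", per_class=10000, seed=0):
    """Return up to `per_class` randomly chosen rows for each label value.

    `rows` is assumed to be a list of dicts (for example, read with
    csv.DictReader); the key names are assumptions made for this sketch.
    """
    # Group the rows by their category label.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = []
    for label in sorted(by_label):
        group = by_label[label]
        k = min(per_class, len(group))  # never ask for more rows than exist
        sample.extend(rng.sample(group, k))

    rng.shuffle(sample)  # mix the categories before any train/validation split
    return sample


if __name__ == "__main__":
    # Toy data with unequal category sizes, for illustration only.
    toy = [{"TITLE": f"title {i}", "CATEGORY": c}
           for c, n in [("b", 50), ("t", 30), ("e", 20), ("m", 40)]
           for i in range(n)]
    picked = balanced_sample(toy, per_class=20)
    print(Counter(r["CATEGORY"] for r in picked))
```

With `per_class=10000`, the same helper would draw 10,000 titles per category, provided every category contains at least that many rows.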
CONCLUSION:
​
Two simplified LSTM models (LSTM10 and LSTM11) were implemented successfully.
​
With the sigmoid activation function, the base LSTM performed better than both LSTM10 and LSTM11, and LSTM10 outperformed LSTM11.
​
One important observation was that the model was overfitting: the training accuracy was higher than the validation accuracy in almost all cases (by approximately 20%).
​
This issue can be mitigated by regularizing the model, tuning the hyperparameters, or increasing the training data. To save time, only 10,000 news titles were taken from each category.
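As one illustration of the kind of regularization mentioned above, early stopping halts training once the validation accuracy stops improving. The sketch below is an assumption about how this could be done, not part of the original experiments: the function names, the `patience` parameter, and the accuracy values are hypothetical and are not the project's results.

```python
def accuracy_gap(train_acc, val_acc):
    """Per-epoch difference between training and validation accuracy."""
    return [t - v for t, v in zip(train_acc, val_acc)]


def early_stopping_epoch(val_acc, patience=3):
    """Return the index of the best epoch seen before training would stop.

    Training stops once `patience` consecutive epochs pass without a new
    best validation accuracy.
    """
    best_epoch = 0
    best_acc = float("-inf")
    for epoch, acc in enumerate(val_acc):
        if acc > best_acc:
            best_acc = acc
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop here
    return best_epoch


if __name__ == "__main__":
    # Made-up learning curves showing a widening train/validation gap.
    train = [0.60, 0.72, 0.81, 0.88, 0.93, 0.96]
    val = [0.58, 0.66, 0.70, 0.69, 0.68, 0.67]
    print(accuracy_gap(train, val))
    print(early_stopping_epoch(val, patience=3))
```

A gap that keeps growing while the validation accuracy stays flat or falls is the overfitting pattern described above; restoring the weights from the returned epoch is one way to limit it.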
Attached below is the formal IEEE-format report for the project, with the full results.