Abstract
Automatic text summarization is indispensable for navigating today's information overload, yet many organizations cannot afford to train or serve the latest Transformer models. Although additive (Bahdanau) and dot-product (Luong) attention remain the cornerstones of recurrent sequence-to-sequence (Seq2Seq) summarizers, the literature still lacks a controlled, head-to-head comparison of the two mechanisms under identical conditions. We therefore built twin LSTM-based Seq2Seq systems that differed only in the attention layer and trained them on an 80k/20k article–title split of the Gigaword corpus, using 100-dimensional GloVe embeddings, the Adam optimizer, teacher forcing with a scheduled decay, dropout, and gradient clipping. Model quality was monitored through validation loss and ROUGE-1/-2/-L F1 scores, complemented by visual inspection of attention heat-maps. The Luong variant converged in fewer epochs, achieved lower validation loss, and outperformed the Bahdanau model on all ROUGE metrics, delivering summaries that were consistently more coherent and less repetitive. The additive model remained competitive but required more intensive hyper-parameter tuning and occasionally omitted salient details. Together, these results show that dot-product alignment offers the best accuracy-to-cost trade-off for resource-constrained abstractive summarization and provide a rigorous baseline for future work that augments Seq2Seq designs or benchmarks them against lightweight Transformer architectures.
Keywords: deep neural networks, natural language processing, automatic text summarization, recurrent neural networks, attention mechanism
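For reference, the two alignment scores contrasted in the abstract are conventionally written as below; the notation (encoder hidden state $h_i$, decoder state $s_t$ or $s_{t-1}$, learned parameters $v_a$, $W_a$, $U_a$) is the standard one from the Bahdanau and Luong papers and is assumed here, not quoted from this study's exact configuration:

\begin{align}
  e^{\mathrm{add}}_{t,i} &= v_a^{\top}\tanh\!\left(W_a s_{t-1} + U_a h_i\right) && \text{(additive, Bahdanau)}\\
  e^{\mathrm{dot}}_{t,i} &= s_t^{\top} h_i && \text{(dot-product, Luong)}\\
  \alpha_{t,i} &= \frac{\exp(e_{t,i})}{\sum_{j}\exp(e_{t,j})},
  \qquad c_t = \sum_{i}\alpha_{t,i}\,h_i && \text{(attention weights and context vector)}
\end{align}

The practical difference motivating the comparison is visible here: the additive score introduces extra learned projections and a $\tanh$ nonlinearity per encoder position, whereas the dot-product score reuses the existing hidden states directly, which is why it is typically cheaper to compute and tune.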