An up-to-date list is available on Google Scholar.
2023
-
Few-shot Unified Question Answering: Tuning Models or Prompts?
Srijan Bansal, Semih Yavuz, Bo Pang, Meghana Bhat, and 1 more author
arXiv preprint arXiv:2305.14569, 2023
Question-answering (QA) tasks often investigate specific question types, knowledge domains, or reasoning skills, leading to specialized models catering to specific categories of QA tasks. While recent research has explored the idea of unified QA models, such models are usually explored for high-resource scenarios and require re-training to extend their capabilities. To overcome these drawbacks, the paper explores the potential of two tuning paradigms, model tuning and prompt tuning, for unified QA under a low-resource setting. The paper provides an exhaustive analysis of their applicability using 16 QA datasets, revealing that prompt tuning can perform as well as model tuning in a few-shot setting given a good initialization. The study also shows that parameter sharing results in superior few-shot performance, that simple knowledge transfer techniques for prompt initialization can be effective, and that prompt tuning achieves a significant performance boost from pre-training in a low-resource regime. The research offers insights into the advantages and limitations of prompt tuning for unified QA in a few-shot setting, contributing to the development of effective and efficient systems in low-resource scenarios.
2022
-
PRO-CS : An Instance-Based Prompt Composition Technique for Code-Switched Tasks
Srijan Bansal, Suraj Tripathi, Sumit Agarwal, Teruko Mitamura, and 1 more author
In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Code-switched (CS) data is ubiquitous in today’s globalized world, but the dearth of annotated datasets in code-switching poses a significant challenge for learning diverse tasks across different language pairs. Parameter-efficient prompt-tuning approaches conditioned on frozen language models have shown promise for transfer learning in limited-resource setups. In this paper, we propose PRO-CS, a novel instance-based prompt composition technique for CS tasks that combines language and task knowledge. We compare our approach with prompt-tuning and fine-tuning for code-switched tasks on 10 datasets across 4 language pairs. Our model outperforms the prompt-tuning approach by significant margins across all datasets and outperforms or remains on par with fine-tuning while using just 0.18% of total parameters. We also achieve competitive results when compared with the fine-tuned model in the low-resource cross-lingual and cross-task setting, indicating the effectiveness of our approach in incorporating new code-switched tasks.
-
R3 : Refined Retriever-Reader pipeline for Multidoc2dial
Srijan Bansal, Suraj Tripathi, Sumit Agarwal, Sireesh Gururaja, and 4 more authors
In Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering, May 2022
In this paper, we present our submission to the DialDoc shared task based on the MultiDoc2Dial dataset. MultiDoc2Dial is a conversational question answering dataset that grounds dialogues in multiple documents. The task involves grounding a user’s query in a document followed by generating an appropriate response. We propose several improvements over the baseline’s retriever-reader architecture to aid in modeling goal-oriented dialogues grounded in multiple documents. Our proposed approach employs sparse representations for passage retrieval, a passage re-ranker, the fusion-in-decoder architecture for generation, and a curriculum learning training paradigm. Our approach shows a 12-point improvement in BLEU score compared to the baseline RAG model.
2021
-
Debiasing Multilingual Word Embeddings: A Case Study of Three Indian Languages
Srijan Bansal, Vishal Garimella, Ayush Suhane, and Animesh Mukherjee
In Proceedings of the 32nd ACM Conference on Hypertext and Social Media, 2021
In this paper, we advance the current state-of-the-art method for debiasing monolingual word embeddings so that it generalizes well in a multilingual setting. We consider different methods to quantify bias and different debiasing approaches for monolingual as well as multilingual settings. We demonstrate the significance of our bias-mitigation approach on downstream NLP applications. Our proposed methods establish state-of-the-art performance for debiasing multilingual embeddings for three Indian languages (Hindi, Bengali, and Telugu) in addition to English. We believe that our work will open up new opportunities in building unbiased downstream NLP applications that are inherently dependent on the quality of the word embeddings used.
2020
-
Code-Switching Patterns Can Be an Effective Route to Improve Performance of Downstream NLP Applications: A Case Study of Humour, Sarcasm and Hate Speech Detection
Srijan Bansal, Vishal Garimella, Ayush Suhane, Jasabanta Patro, and 1 more author
In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020
In this paper, we demonstrate how code-switching patterns can be utilised to improve various downstream NLP applications. In particular, we encode various switching features to improve humour, sarcasm and hate speech detection tasks. We believe that this simple linguistic observation can also be potentially helpful in improving other similar NLP applications.
2019
-
A deep-learning framework to detect sarcasm targets
Jasabanta Patro, Srijan Bansal, and Animesh Mukherjee
In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
In this paper, we propose a deep-learning framework for sarcasm target detection in predefined sarcastic texts. Identification of sarcasm targets can help in many core natural language processing tasks such as aspect-based sentiment analysis and opinion mining. To begin with, we perform an empirical study of socio-linguistic features and identify those that are statistically significant in indicating sarcasm targets (p-values in the range (0.05, 0.001)). Finally, we present a deep-learning framework augmented with socio-linguistic features to detect sarcasm targets in sarcastic book snippets and tweets. We achieve a substantial improvement in performance in terms of exact match and dice scores compared to the current state-of-the-art baseline.
-
Can Siamese Networks Help in Stance Detection?
T. Y.S.S. Santosh, Srijan Bansal, and Avirup Saha
In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, 2019
An important component of fake news detection is evaluating the stance that different news sources take towards an assertion. Automatic stance detection would facilitate the process of fact-checking. In this paper, we present our stance detection system, which comprises a siamese adaptation of Long Short-Term Memory (LSTM) networks augmented with an attention mechanism; the siamese adaptation forces the LSTM to fully capture the semantic differences during training, rather than supplementing the network with a more complex learner that can help resolve shortcomings in the learned representations. The system classifies the stance of a news article body relative to a headline as agree, disagree, discuss, or unrelated. Our experiments on a public benchmark dataset, FakeNewsChallenge (FNC), demonstrate the effectiveness of our approach.