An up-to-date list is available on Google Scholar.
2023
-
Few-shot Unified Question Answering: Tuning Models or Prompts?
Srijan Bansal, Semih Yavuz, Bo Pang, Meghana Bhat, and 1 more author
arXiv preprint arXiv:2305.14569, 2023
Question-answering (QA) tasks often investigate specific question types, knowledge domains, or reasoning skills, leading to specialized models catering to specific categories of QA tasks. While recent research has explored the idea of unified QA models, such models are usually explored for high-resource scenarios and require re-training to extend their capabilities. To overcome these drawbacks, the paper explores the potential of two tuning paradigms, model tuning and prompt tuning, for unified QA under a low-resource setting. The paper provides an exhaustive analysis of their applicability using 16 QA datasets, revealing that prompt tuning can perform as well as model tuning in a few-shot setting given a good initialization. The study also shows that parameter sharing results in superior few-shot performance, that simple knowledge transfer techniques for prompt initialization can be effective, and that prompt tuning achieves a significant performance boost from pre-training in a low-resource regime. The research offers insights into the advantages and limitations of prompt tuning for unified QA in a few-shot setting, contributing to the development of effective and efficient systems in low-resource scenarios.
2022
-
PRO-CS : An Instance-Based Prompt Composition Technique for Code-Switched Tasks
Srijan Bansal, Suraj Tripathi, Sumit Agarwal, Teruko Mitamura, and 1 more author
In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Code-switched (CS) data is ubiquitous in today’s globalized world, but the dearth of annotated datasets in code-switching poses a significant challenge for learning diverse tasks across different language pairs. Parameter-efficient prompt-tuning approaches conditioned on frozen language models have shown promise for transfer learning in limited-resource setups. In this paper, we propose PRO-CS, a novel instance-based prompt composition technique for CS tasks that combines language and task knowledge. We compare our approach with prompt-tuning and fine-tuning for code-switched tasks on 10 datasets across 4 language pairs. Our model outperforms the prompt-tuning approach by significant margins across all datasets and outperforms or remains on par with fine-tuning while using just 0.18% of total parameters. We also achieve competitive results when compared with the fine-tuned model in the low-resource cross-lingual and cross-task setting, indicating the effectiveness of our approach in incorporating new code-switched tasks.
-
R3 : Refined Retriever-Reader pipeline for Multidoc2dial
Srijan Bansal, Suraj Tripathi, Sumit Agarwal, Sireesh Gururaja, and 4 more authors
In Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering, May 2022
In this paper, we present our submission to the DialDoc shared task based on the MultiDoc2Dial dataset. MultiDoc2Dial is a conversational question answering dataset that grounds dialogues in multiple documents. The task involves grounding a user’s query in a document followed by generating an appropriate response. We propose several improvements over the baseline’s retriever-reader architecture to aid in modeling goal-oriented dialogues grounded in multiple documents. Our proposed approach employs sparse representations for passage retrieval, a passage re-ranker, the fusion-in-decoder architecture for generation, and a curriculum learning training paradigm. Our approach shows a 12-point improvement in BLEU score compared to the baseline RAG model.
2021
-
Debiasing Multilingual Word Embeddings: A Case Study of Three Indian Languages
Srijan Bansal, Vishal Garimella, Ayush Suhane, and Animesh Mukherjee
In Proceedings of the 32nd ACM Conference on Hypertext and Social Media, 2021
In this paper, we advance the current state-of-the-art method for debiasing monolingual word embeddings so that it generalizes well in a multilingual setting. We consider different methods to quantify bias and different debiasing approaches for monolingual as well as multilingual settings. We demonstrate the significance of our bias-mitigation approach on downstream NLP applications. Our proposed methods establish state-of-the-art performance for debiasing multilingual embeddings for three Indian languages (Hindi, Bengali, and Telugu) in addition to English. We believe that our work will open up new opportunities in building unbiased downstream NLP applications that are inherently dependent on the quality of the word embeddings used.
2020
-
Code-Switching Patterns Can Be an Effective Route to Improve Performance of Downstream NLP Applications: A Case Study of Humour, Sarcasm and Hate Speech Detection
Srijan Bansal, Vishal Garimella, Ayush Suhane, Jasabanta Patro, and 1 more author
In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020
In this paper, we demonstrate how code-switching patterns can be utilised to improve various downstream NLP applications. In particular, we encode various switching features to improve humour, sarcasm and hate speech detection tasks. We believe that this simple linguistic observation can also be potentially helpful in improving other similar NLP applications.
2019
-
A deep-learning framework to detect sarcasm targets
Jasabanta Patro, Srijan Bansal, and Animesh Mukherjee
In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
In this paper, we propose a deep-learning framework for sarcasm target detection in predefined sarcastic texts. Identification of sarcasm targets can help in many core natural language processing tasks such as aspect-based sentiment analysis and opinion mining. To begin with, we perform an empirical study of socio-linguistic features and identify those that are statistically significant in indicating sarcasm targets (p-values in the range (0.05, 0.001)). Finally, we present a deep-learning framework augmented with socio-linguistic features to detect sarcasm targets in sarcastic book snippets and tweets. We achieve a substantial improvement in performance in terms of exact match and dice scores compared to the current state-of-the-art baseline.
-
Can Siamese Networks Help in Stance Detection?
T. Y.S.S. Santosh, Srijan Bansal, and Avirup Saha
In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, 2019
An important component of fake news detection is evaluating the stance that different news sources take towards an assertion. Automatic stance detection would facilitate the process of fact-checking. In this paper, we present our stance detection system, which comprises a siamese adaptation of Long Short-Term Memory (LSTM) networks augmented with an attention mechanism; the siamese adaptation forces the LSTM to fully capture the semantic differences during training, rather than supplementing the network with a more complex learner that can help resolve shortcomings in the learned representations. The system classifies the stance of a news article body relative to a headline as agree, disagree, discuss, or unrelated. Our experiments on a public benchmark dataset, FakeNewsChallenge (FNC), demonstrate the effectiveness of our approach.