- Paragraph Retrieval: Mapping user queries to the most relevant context in the knowledge base.
- Answerability Prediction: Predicting whether a query is answerable, followed by span prediction if it is.
- Answer Retrieval: Extracting the answer from the retrieved context.
- Minimising latency and creating space-efficient models.
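
As an illustration of the paragraph retrieval step, here is a minimal sketch that ranks knowledge-base paragraphs against a query using TF-IDF vectors and cosine similarity. The original does not say which retriever the project uses, so TF-IDF is an assumption made for illustration, and the names `tokenize`, `build_index`, `cosine` and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(paragraphs):
    """Return (idf, vectors): smoothed IDF weights and one TF-IDF dict per paragraph."""
    docs = [Counter(tokenize(p)) for p in paragraphs]
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(doc.keys())
    # Smoothed IDF so that terms present in every paragraph keep a small weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, paragraphs, top_k=1):
    """Return the top_k (paragraph_index, score) pairs, best match first."""
    idf, vectors = build_index(paragraphs)
    counts = Counter(tokenize(query))
    # Query terms that never occur in the knowledge base are ignored.
    q_vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scored[:top_k]]
```

A call such as `retrieve("Where is the Eiffel Tower?", paragraphs)` returns the index and score of the best-matching paragraph. A dense embedding retriever could replace the TF-IDF vectors without changing the ranking logic.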
The pipeline for generating synthetic data focuses on data augmentation techniques, which are better suited than traditional GANs to fast-paced question-answer generation.
- Average Generation Time per Question-Answer Pair: 2s
- Quality of Generated Question-Answer Pairs:
  - F1 score of generated question-answer pairs = 0.80855
  - This F1 score was computed by comparing the generated answers with answers given by large question-answering models.
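
To make the comparison concrete, here is a minimal sketch of a token-overlap F1 between a generated answer and a reference answer, averaged over all pairs. It assumes a SQuAD-style F1 (lowercasing, stripping punctuation and articles). The original does not state which F1 variant was used, and the names `normalize`, `token_f1` and `mean_f1` are hypothetical.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, strip punctuation and English articles, then split into tokens."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction, reference):
    """Token-level F1 between one generated answer and one reference answer."""
    pred = normalize(prediction)
    ref = normalize(reference)
    if not pred or not ref:
        # Two empty answers count as a match; one empty answer is a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often as it occurs in both.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def mean_f1(pairs):
    """Average token F1 over (generated_answer, reference_answer) pairs."""
    return sum(token_f1(p, r) for p, r in pairs) / len(pairs)
```

Here each pair would hold the answer from the generation pipeline and the answer a large question-answering model gives for the same generated question.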
- Sketchy Reading: Makes an initial judgement about the answerability of a question. It has three main subprocesses:
  - Embedding generation
  - Interaction
  - External front verification
- Intensive Reading: Verifies the earlier answerability prediction through multi-headed cross-attention and threshold verification.
- Rear Verification: Combines the scores produced by both modules, Sketchy and Intensive.
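
The score combination in rear verification can be sketched as a weighted sum followed by a threshold. This assumes a Retro-Reader-style formulation, in which each module emits a score where higher means "more likely unanswerable". The weights `beta_ext` and `beta_diff`, the `threshold`, and the names `Verdict` and `rear_verify` are hypothetical and do not come from the original.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    score: float      # combined no-answer score
    answerable: bool  # final decision after thresholding


def rear_verify(score_ext, score_diff, beta_ext=0.5, beta_diff=0.5, threshold=0.0):
    """Combine the two no-answer scores and apply a threshold.

    score_ext:  sketchy-reading score (assumed: no-answer logit minus has-answer logit).
    score_diff: intensive-reading score (assumed: null-span score minus best-span score).
    """
    combined = beta_ext * score_ext + beta_diff * score_diff
    # The question is treated as answerable only if the combined
    # no-answer score does not exceed the threshold.
    return Verdict(score=combined, answerable=combined <= threshold)
```

In practice the weights and threshold would be tuned on a development set. When the verdict is "answerable", the span predicted by the intensive module would be returned as the answer.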
- Naive Implementation of Deformer: The Deformer architecture was implemented naively by changing the last layers of a model to work with a different architecture. Thus, a RoBERTa model working on an ELECTRA architecture serves as a simple Deformer layout.
- Transformer Compression: Quantisation and pruning.
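
The two compression techniques can be illustrated on a plain list of weights. This is a minimal sketch assuming unstructured magnitude pruning and symmetric per-tensor int8 quantisation. The original does not specify which variants were applied to the models, and the names `magnitude_prune`, `quantize_int8` and `dequantize` are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices of the k weights closest to zero.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantisation: map floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantized]
```

Pruning reduces the number of non-zero parameters, while quantisation reduces the bits stored per parameter. The rounding error of this quantiser is at most half of `scale` per weight.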
Team Members:
- Samvaidan (Team Leader)
- Akarshan
- Taraksh
- Ekansh
- Vansh
- Arush
- Ashutosh
- Mukesh
- Deepali
- Raj Singh