Selected Projects

For a complete list, please see my CV.


Developed a Siamese network combining convolutional, max-pooling, batch-normalization and dropout layers to verify, using one-shot learning, whether two signatures were signed by the same person. GitHub Link

Chest X-Rays Pneumonia Detection

Trained a CNN to detect pneumonia from chest X-rays. The trained model achieved a test accuracy of 94.56% and a recall of 0.97.

Silatra (Sign Language Translation)

An Android application that translates the signs and gestures performed by hearing- and speech-impaired people into voice, using image processing, segmentation and machine learning algorithms (KNN and HMMs). Paper published in IEEE.

Machine Learning Lab - IITB Virtual Labs

Contributed to the development of the Machine Learning Lab on behalf of K J Somaiya College of Engineering, Mumbai, which is hosted on IIT Bombay's Virtual Labs website. As part of this, I developed simulations of neural networks, optical character recognition using Tesseract.js, and the Hebbian and Perceptron learning rules.


Information Sciences Institute, USC

January 2022 - Present

Research Assistant, May Team

Designing bots that can understand communication and strategize in the board game Diplomacy, under the supervision of Dr. Jon May

Increased win rates for the powers in the game by an average of 5% by generating and understanding DAIDE messages and applying rules on top of the pre-trained, reinforcement-learning-based Dipnet bot

May 2022 - August 2022

Machine Learning Engineer Intern

Generated a dataset of 1 million invoices and statements from user data, after sanity checks, to improve the amount OCR model

Used an ensemble of document classification models to correct the imbalanced distribution of statements in the training dataset

Improved amount prediction test accuracy from 75.5% to 77.6% at a 50% confidence threshold

Information Sciences Institute, USC

February 2021 - January 2022

Research Assistant, Centre of Knowledge Graphs

Constructed a framework for identifying low-quality statements among the 1.1 billion statements in the Wikidata knowledge graph, based on deleted statements, deprecated statements and constraint violations

Enhanced the graph embeddings of nodes through retrofitting based on BERT embeddings and on structural and textual properties extracted from the Wikidata, Probase and DBpedia datasets, increasing Spearman correlation from 0.66 to 0.73 on the WordSim353 benchmark

Barclays Global Service Centre

July 2018 - December 2020

Software Developer, Barclaycard UK

Devised a prototype fraud detection pipeline using Kafka queues, Cassandra DB and PySpark servers running an ensemble of ML models

Designed a real-time tweet sentiment analysis engine to enable quick customer service responses, achieving an accuracy of around 90% in pilot runs

Created a classifier application that uses the LDA algorithm to extract insights from iOS and Android application reviews and customer complaints

Deployed a system, built with AngularJS, Java and MySQL, that connects colleagues who have available bandwidth and relevant skill sets with colleagues who need assistance, saving more than 900 man-hours annually

Implemented dashboards that automatically generate real-time delivery metrics for more than 30 teams from Agile Central and Jira data sources, saving around 150 man-hours annually. Received the Barclays Award of Stewardship for this initiative

Virtual Labs, Indian Institute of Technology, Bombay

March 2017 – August 2017

Web Development Intern, Team Leader

Led a team of three to develop a Virtual Lab for the online demonstration of machine learning concepts such as neural networks, learning rules and optical character recognition

The lab won the Global Online Laboratory Consortium International Lab Award


We present Viola, an open-domain dialogue system for spoken conversation that uses a topic-agnostic dialogue manager based on a simple generate-and-rank approach. Leveraging recent advances in generative dialogue systems powered by large language models, Viola fetches a batch of response candidates from various neural dialogue models trained with different datasets and knowledge-grounding inputs.
4th Proceedings of Alexa Prize (Alexa Prize 2020)

We constructed a framework for identifying low-quality statements among the 1.1 billion statements in the Wikidata knowledge graph, based on deleted statements, deprecated statements and constraint violations.
Journal of Web Semantics. Elsevier.

We discuss several use cases in which Kypher, the KGTK query language and processor, can be used to run various types of analyses on the full Wikidata KG on a laptop.
2021 Wikidata Workshop at ISWC

We designed an Android application that uses image processing, segmentation, and HMM and KNN models hosted on a server to translate the gestures and signs of Indian Sign Language used by hearing- or speech-impaired people.
9th International Conference on Computing, Communication and Networking Technologies (ICCCNT)

I explored a keystroke dynamics dataset to identify patterns hidden in users' keystrokes. Such patterns could help in performing non-intrusive, real-time authentication of users. The analysis and model training are covered in two parts.
Towards Data Science

In this Medium article, I walk through the derivation of the scary backpropagation formulae for LSTMs.


Deep Learning Specialization

Courses Included: 1) Neural Networks and Deep Learning, 2) Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization, 3) Structuring Machine Learning Projects, 4) Convolutional Neural Networks, 5) Sequence Models.

Machine Learning

Topics Covered: Logistic Regression, Artificial Neural Network, Machine Learning Algorithms, Principal Component Analysis, Collaborative Filtering


Please feel free to reach out to me with any queries.