Research and Project Work
I'm interested in computer vision, natural language processing, image processing, and machine learning. Most of my work lies at the intersection of different modalities, such as image and text, and draws on strong optimization techniques.
|
|
AMUSED: A Multi-Stream Vector Representation Method for Use In Natural Dialogue
Gaurav Kumar, Rishabh Joshi, Jaspreet Singh, Promod Yenigalla
The goal is to build a coherent, non-monotonous conversational agent with proper discourse and coverage. We propose an end-to-end multi-stream deep learning architecture that learns unified embeddings for query-response pairs by leveraging contextual information from memory networks and syntactic information from Graph Convolution Networks (GCN) over their dependency parses. One stream of the network also uses transfer learning, pre-training a bidirectional transformer to extract a semantic representation of each input sentence, and incorporates external knowledge through the neighborhood of the entities in a Knowledge Base (KB). We benchmark these embeddings on the next-sentence prediction task and significantly improve upon existing techniques.
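The syntactic stream above applies graph convolutions over a dependency parse. The following is a minimal sketch of one such layer, not the architecture from the paper: it assumes the common symmetric normalization ReLU(D^-1/2 (A + I) D^-1/2 H W), treats dependency arcs as undirected and unlabeled, and uses random stand-ins for the embeddings and weights. The function name `gcn_layer` and the toy parse are hypothetical.

```python
import numpy as np

def gcn_layer(node_feats, edges, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = node_feats.shape[0]
    adj = np.eye(n)                       # self-loops keep each token's own features
    for head, dep in edges:               # assumption: arcs are undirected, unlabeled
        adj[head, dep] = 1.0
        adj[dep, head] = 1.0
    deg_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    norm_adj = adj * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(norm_adj @ node_feats @ weight, 0.0)

# Hypothetical parse of "dogs chase cats": chase -> dogs, chase -> cats
tokens = ["dogs", "chase", "cats"]
edges = [(1, 0), (1, 2)]
rng = np.random.default_rng(0)
H = rng.normal(size=(len(tokens), 4))     # stand-in word embeddings
W = rng.normal(size=(4, 3))               # stand-in layer weights
out = gcn_layer(H, edges, W)              # one 3-dimensional vector per token
```

After one layer each token mixes in only its direct syntactic neighbors, so "dogs" and "cats" influence each other only once a second layer is stacked.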
|
|
Understanding physical properties of objects by visual cues using neural networks
Research work at the Centre for Integrative Neuroscience,
University of Tübingen
Worked on videos of varying cloths to understand their material properties based on certain external parameters.
Experimented with CNN+LSTM networks, two-stream networks, and multi-loss networks, along with optical flow for feature extraction.
Proposed a triplet loss function for videos, similar to the one in the FaceNet paper, to learn low-dimensional representations.
Results showed a strong correlation between human perception of physical properties and neural networks in the case of motion, with the perceptual graph being logarithmic. Work published in the Journal of Vision, 2019.
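The triplet loss mentioned above can be sketched as follows. This is a generic FaceNet-style formulation on precomputed embeddings, not the exact loss used in this work: the margin value, the squared Euclidean distance, and the helper names are assumptions, and the video encoder that would produce the embeddings is omitted.

```python
import numpy as np

def l2_normalize(x):
    """Project each embedding onto the unit sphere."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean of max(0, ||a - p||^2 - ||a - n||^2 + margin) over a batch."""
    pos_dist = np.sum((anchor - positive) ** 2, axis=1)
    neg_dist = np.sum((anchor - negative) ** 2, axis=1)
    return float(np.mean(np.maximum(pos_dist - neg_dist + margin, 0.0)))

# Toy 2-D embeddings: the positive matches the anchor, the negative does not.
anchor = l2_normalize(np.array([[1.0, 0.0]]))
positive = l2_normalize(np.array([[2.0, 0.0]]))
negative = l2_normalize(np.array([[0.0, 3.0]]))
well_separated = triplet_loss(anchor, positive, negative)   # triplet already satisfied
violated = triplet_loss(anchor, negative, positive)         # roles swapped
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only violating triplets contribute gradient during training.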
|
|
Converting Handwritten Mathematical equations to LaTeX
Course project, Image Modeling Techniques: Supervised by Dr. Tanaya Guha
[Slides]/[Report]
Developed, from scratch, a deep-learning-based pipeline to convert handwritten mathematical equations to LaTeX.
Used edge detectors, the Hough Transform, and other segmentation methods to pre-process the image, and built a novel tree-structure-based algorithm to identify multilevel superscripts and subscripts.
Trained multiple classifiers, including SVMs, CNNs, and Random Forests, for symbol recognition and reached accuracies close to 98%, even on complex trigonometric symbols.
Awarded best project, ahead of 20 others.
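To illustrate how multilevel superscripts and subscripts can be recovered from symbol positions, here is a small recursive sketch. It is not the project's algorithm: the `Symbol` class, the quarter-height thresholds, and the assumption that symbols arrive already recognized with bounding boxes are all hypothetical choices made for the example.

```python
from dataclasses import dataclass

@dataclass
class Symbol:
    label: str
    x: float        # left edge of the bounding box
    top: float      # y grows downward, as in image coordinates
    bottom: float

    @property
    def center(self):
        return (self.top + self.bottom) / 2.0

def relation(base, sym):
    """Classify sym relative to base as 'sup', 'sub', or 'inline'."""
    height = base.bottom - base.top
    if sym.center < base.top + 0.25 * height:      # assumed threshold
        return "sup"
    if sym.center > base.bottom - 0.25 * height:   # assumed threshold
        return "sub"
    return "inline"

def to_latex(symbols):
    """Read symbols left to right, nesting script groups recursively."""
    symbols = sorted(symbols, key=lambda s: s.x)
    parts, i = [], 0
    while i < len(symbols):
        base = symbols[i]
        parts.append(base.label)
        i += 1
        # Attach every following run of raised or lowered symbols to this base.
        while i < len(symbols) and relation(base, symbols[i]) != "inline":
            kind = relation(base, symbols[i])
            group = []
            while i < len(symbols) and relation(base, symbols[i]) == kind:
                group.append(symbols[i])
                i += 1
            parts.append(("^" if kind == "sup" else "_") + "{" + to_latex(group) + "}")
    return "".join(parts)

expression = [
    Symbol("x", 0, 10, 30), Symbol("2", 12, 2, 14), Symbol("3", 20, -2, 4),
    Symbol("+", 30, 15, 25), Symbol("y", 42, 10, 30),
]
latex = to_latex(expression)
```

Each script group is itself parsed with the same routine, which is what produces nested levels. A real pipeline would need a sturdier baseline estimate, since short symbols such as "+" make poor reference boxes.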
|
|
Detecting Semantically Similar Question Pairs
Course project, Natural Language Processing: Supervised by Dr. Harish Karnick
[Slides]
The goal was to detect duplicate question pairs in the Quora Question Dataset based on sentence semantics and structure.
Experimented with Siamese LSTMs and attention-based methods, along with varied embedding approaches. Incorporating features from the dependency tree to model syntactic information improved performance.
An ensemble of several such models with minor variations reached accuracies close to 82%.
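The Siamese setup and the ensembling step can be sketched as below. This is only an illustration of the idea: a mean of word vectors stands in for the LSTM encoder, the embeddings are random and untrained, and the exp(-L1) similarity is one common choice for Siamese networks rather than the scoring function confirmed for this project. All function names are hypothetical.

```python
import numpy as np

def build_embeddings(vocab, dim=16, seed=0):
    """Random word vectors; a trained model would learn these."""
    rng = np.random.default_rng(seed)
    return {word: rng.normal(size=dim) for word in vocab}

def encode(sentence, embeddings):
    """Shared encoder: mean of word vectors (stand-in for an LSTM's final state)."""
    vectors = [embeddings[w] for w in sentence.lower().split() if w in embeddings]
    dim = len(next(iter(embeddings.values())))
    return np.mean(vectors, axis=0) if vectors else np.zeros(dim)

def siamese_similarity(q1, q2, embeddings):
    """exp(-L1 distance), in (0, 1]; 1 means identical encodings."""
    h1, h2 = encode(q1, embeddings), encode(q2, embeddings)
    return float(np.exp(-np.sum(np.abs(h1 - h2))))

def ensemble_score(q1, q2, models):
    """Average the similarity over several model variants."""
    return float(np.mean([siamese_similarity(q1, q2, m) for m in models]))

vocab = ["how", "do", "i", "learn", "python", "what", "is", "rust"]
models = [build_embeddings(vocab, seed=s) for s in range(3)]   # three "variants"
same = ensemble_score("how do i learn python", "how do i learn python", models)
different = ensemble_score("how do i learn python", "what is rust", models)
```

Both questions pass through the same encoder with the same weights, which is what makes the network Siamese. Because the embeddings here are untrained, only the structure is meaningful, not the scores.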
|
|
Captcha Breaking
Course project, Machine Learning Techniques, supervised by Dr. Purushottam Kar
[Slides]
The objective was to build an efficient algorithm to break the captchas of the online SquirrelMail client using neural networks.
Applied feature engineering methods, such as clustering and dominant-color-based segmentation, to remove heavy noise and orthogonal clutter running through the captcha text before feeding it to a character recognizer.
Implemented variants of CNN models from scratch for character recognition and reached 98% accuracy. The end-to-end captcha breaker reached 85% accuracy on very noisy captchas and 98% on less noisy ones.
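As a rough illustration of dominant-color-based segmentation, the sketch below keeps only the pixels of the most frequent non-background color. It assumes the text is drawn in a single exact color that outnumbers every noise color and that the background color is known, neither of which is confirmed for the captchas in this project. The function name is hypothetical.

```python
import numpy as np

def dominant_color_mask(image, background):
    """Boolean mask of pixels matching the most frequent non-background color."""
    pixels = image.reshape(-1, image.shape[-1])
    colors, counts = np.unique(pixels, axis=0, return_counts=True)
    is_background = np.all(colors == np.asarray(background), axis=1)
    counts = np.where(is_background, 0, counts)    # never pick the background
    dominant = colors[np.argmax(counts)]
    return np.all(image == dominant, axis=-1)

# Toy image: white background, a blue "text" block, and a short red noise line.
image = np.full((5, 8, 3), 255, dtype=np.uint8)
image[1:3, 1:6] = (0, 0, 255)      # 10 text pixels
image[4, 0:3] = (255, 0, 0)        # 3 noise pixels
text_mask = dominant_color_mask(image, background=(255, 255, 255))
```

Real captchas have anti-aliased edges and color gradients, so in practice the exact-match test would be replaced by clustering nearby colors before picking the dominant group.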
|
|
Robust Principal Component Analysis and its applications
Term paper, supervised by Dr. Ketan Rajawat
[Report]
Used Robust PCA to perform foreground-background separation in videos and image inpainting.
Worked on methods such as alternating projections, IALM, and gradient descent for extracting the low-rank and sparse components of a matrix. Used the CVX toolbox and the PROPACK package in MATLAB to run the experiments.
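The low-rank plus sparse decomposition can be sketched with an inexact augmented Lagrange multiplier (IALM) loop. The version below is a NumPy illustration rather than the MATLAB code used for the term paper: it uses a full SVD instead of PROPACK's partial SVD, and the default weight 1/sqrt(max(m, n)), the initial penalty, and the growth factor 1.5 are commonly used settings assumed here.

```python
import numpy as np

def shrink(x, tau):
    """Soft-thresholding: the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svd_threshold(x, tau):
    """Singular value thresholding: the proximal operator of the nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * shrink(s, tau)) @ vt

def robust_pca(m, lam=None, tol=1e-7, max_iter=500):
    """Split m into low_rank + sparse by minimizing ||L||_* + lam * ||S||_1."""
    m = np.asarray(m, dtype=float)
    lam = lam or 1.0 / np.sqrt(max(m.shape))
    spectral = np.linalg.norm(m, 2)
    y = m / max(spectral, np.abs(m).max() / lam)   # dual variable initialization
    mu, rho = 1.25 / spectral, 1.5                 # penalty and its growth factor
    low_rank = np.zeros_like(m)
    sparse = np.zeros_like(m)
    for _ in range(max_iter):
        low_rank = svd_threshold(m - sparse + y / mu, 1.0 / mu)
        sparse = shrink(m - low_rank + y / mu, lam / mu)
        residual = m - low_rank - sparse
        y = y + mu * residual
        mu *= rho
        if np.linalg.norm(residual) <= tol * np.linalg.norm(m):
            break
    return low_rank, sparse

# Synthetic check: a rank-2 matrix corrupted by a few large sparse errors.
rng = np.random.default_rng(0)
true_low_rank = rng.normal(size=(60, 2)) @ rng.normal(size=(2, 60))
corrupt = rng.random((60, 60)) < 0.05
true_sparse = np.where(corrupt, rng.uniform(-10, 10, size=(60, 60)), 0.0)
observed = true_low_rank + true_sparse
low_rank, sparse = robust_pca(observed)
```

For video, each frame would be flattened into one column of the input matrix, so the low-rank part captures the static background and the sparse part the moving foreground.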
|