Past Projects

Here are some of the projects I have worked on in recent years. They have helped me build intuition for problem solving and for getting up to speed quickly on a new problem statement.

As an applied AI engineer, I have worked at the intersection of research and the commercialization of industry projects. This has made me better at reading research papers and at putting them into production.

I have broad exposure to industry-level challenges, and this experience makes me a good fit for tackling real-world problem statements.

Text-to-Speech model for Hindi language (Flipkart)

The aim of this project was to generate Hindi speech audio from input text sentences. I started on this project while working at Flipkart.

Key learnings:

  • Unsupervised generative models

  • Autoregressive models

  • Flow (& inverse Flow) based models

  • Papers I've read - Link

MNIST-Muddle (Personal project)

The aim of this project was to get practical exposure to how the latent domain (latent space) works in deep learning. It is a simple project that tries to generate poorly written handwritten digits by interpolating between the nearest clusters in the latent space.

Got practical exposure to:

  • Auto Encoders

  • Latent domain

  • Hosting a PyTorch model using Streamlit

  • Project Demo - Link
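
The interpolation idea can be sketched without a trained model. The snippet below is only an illustration, assuming that latent codes are plain vectors, that each digit cluster is summarized by its centroid, and that a "muddled" digit comes from blending one centroid with its nearest neighbor. The helper names `interpolate` and `nearest_centroid` and the toy centroids are hypothetical and not taken from the project; in practice the blended vector would be passed to the autoencoder's decoder.

```python
import math

def interpolate(z_a, z_b, t):
    """Linear interpolation between two latent vectors; t=0 gives z_a, t=1 gives z_b."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]

def nearest_centroid(z, centroids, exclude=None):
    """Return the label of the centroid closest to z (Euclidean), skipping `exclude`."""
    best_label, best_dist = None, math.inf
    for label, c in centroids.items():
        if label == exclude:
            continue
        d = math.dist(z, c)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Toy 2-D centroids standing in for per-digit clusters in a latent space (made-up values).
centroids = {3: [0.0, 0.0], 8: [1.0, 0.0], 1: [5.0, 5.0]}

z = centroids[3]
neighbor = nearest_centroid(z, centroids, exclude=3)
# Halfway between the "3" cluster and its nearest neighbor: an ambiguous latent code.
muddled = interpolate(z, centroids[neighbor], 0.5)
```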

C++ decoder for Speech recognition engine (Flipkart)

Worked on the decoder module of the Automatic Speech Recognition (ASR) pipeline.

Key responsibilities:

  • Implemented new features into production.

  • Handled __ lines of C++ code base.

  • Improved latency & memory consumption.

  • Blog post 1 - Intro to CTC Loss

  • Blog post 2 - DP based prefix beam search PPT
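
For readers unfamiliar with the topic of the second blog post, here is a minimal Python sketch of CTC prefix beam search. It is not the production C++ decoder: it works with raw probabilities instead of log-probabilities, uses no language model, and the function name `ctc_prefix_beam_search` and the toy input are assumptions made for illustration.

```python
from collections import defaultdict

def ctc_prefix_beam_search(probs, alphabet, beam_width=3, blank=0):
    """probs: per-frame probability lists over alphabet indices; index `blank` is the CTC blank."""
    # Each prefix maps to (p_blank, p_non_blank): the probability of all
    # alignments of that prefix ending in blank / ending in a non-blank symbol.
    beams = {(): (1.0, 0.0)}
    for frame in probs:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            for s, p in enumerate(frame):
                if s == blank:
                    # A blank leaves the prefix unchanged.
                    next_beams[prefix][0] += (p_b + p_nb) * p
                elif prefix and prefix[-1] == s:
                    # A repeated symbol collapses into the same prefix...
                    next_beams[prefix][1] += p_nb * p
                    # ...unless a blank separated the two occurrences.
                    next_beams[prefix + (s,)][1] += p_b * p
                else:
                    next_beams[prefix + (s,)][1] += (p_b + p_nb) * p
        # Prune to the `beam_width` most probable prefixes.
        ranked = sorted(next_beams.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {prefix: tuple(score) for prefix, score in ranked[:beam_width]}
    best_prefix, best_score = max(beams.items(), key=lambda kv: sum(kv[1]))
    return "".join(alphabet[s] for s in best_prefix), sum(best_score)

# Two frames where the blank "-" is the most likely symbol in each frame.
frames = [[0.6, 0.4], [0.6, 0.4]]
text, prob = ctc_prefix_beam_search(frames, alphabet=["-", "a"])
```

In this toy input, greedy decoding would pick the blank in both frames and output an empty string (path probability 0.36), while the prefix search sums the three alignments "a-", "-a" and "aa" and returns "a" with probability 0.64.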

Single Image HDR reconstruction & enhancement (Samsung)

Camera sensors typically have a low dynamic range, which often leads to over-exposed or under-exposed regions in outdoor scenes. This project aimed to enhance the overall image by improving such regions.

Key learnings:

  • Deep learning training with unpaired images

  • Cyclic loss, encoder-decoder

  • HDR, tone mapping, linear vs non-linear images, histogram

  • What is HDR? (Notes Link)
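
To make "tone mapping" and "linear vs non-linear images" concrete, the sketch below compresses linear HDR luminance with the global Reinhard operator and then applies the standard sRGB transfer function. This is a textbook illustration and not the learned model from the project; the function names and sample values are assumptions.

```python
def reinhard_tone_map(luminance):
    """Compress linear HDR luminance in [0, inf) into [0, 1) with the global Reinhard operator."""
    return luminance / (1.0 + luminance)

def linear_to_srgb(c):
    """Encode a linear value in [0, 1] with the sRGB transfer function (the non-linear, display-ready form)."""
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1.0 / 2.4) - 0.055

# Made-up linear scene luminances spanning a wide dynamic range.
hdr_pixels = [0.01, 0.5, 1.0, 4.0, 50.0]
ldr_pixels = [linear_to_srgb(reinhard_tone_map(v)) for v in hdr_pixels]
```

The tone-mapping step keeps very bright regions from clipping, and the sRGB step converts the still-linear result into the non-linear encoding that displays expect.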

Demosaicing using CNNs (Samsung)

A camera sensor captures a single-channel image in which each pixel records only one color. Demosaicing is the process of interpolating the remaining two colors from neighboring pixels to construct a full-color image. We used a deep learning model to approach this problem.
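
As a point of reference, the interpolation described above can be sketched with classical bilinear demosaicing. This is a simple baseline and not the CNN used in the project; the RGGB Bayer layout and the helper names `bayer_channel` and `demosaic_bilinear` are assumptions made for illustration.

```python
def bayer_channel(y, x):
    """Color index (0=R, 1=G, 2=B) sampled at pixel (y, x), assuming an RGGB Bayer layout."""
    if y % 2 == 0 and x % 2 == 0:
        return 0
    if y % 2 == 1 and x % 2 == 1:
        return 2
    return 1

def demosaic_bilinear(mosaic):
    """Fill in the two missing colors per pixel by averaging same-color samples
    in the 3x3 neighborhood. Expects a mosaic of at least 2x2 pixels."""
    h, w = len(mosaic), len(mosaic[0])
    rgb = [[[0.0, 0.0, 0.0] for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sums, counts = [0.0, 0.0, 0.0], [0, 0, 0]
            # The window is clipped at the image border.
            for ny in range(max(0, y - 1), min(h, y + 2)):
                for nx in range(max(0, x - 1), min(w, x + 2)):
                    c = bayer_channel(ny, nx)
                    sums[c] += mosaic[ny][nx]
                    counts[c] += 1
            for c in range(3):
                rgb[y][x][c] = sums[c] / counts[c]
            # Keep the color that was actually measured at this pixel.
            rgb[y][x][bayer_channel(y, x)] = float(mosaic[y][x])
    return rgb

# A flat-colored 4x4 scene, sampled through the Bayer pattern and reconstructed.
color = (0.2, 0.5, 0.8)
mosaic = [[color[bayer_channel(y, x)] for x in range(4)] for y in range(4)]
rgb = demosaic_bilinear(mosaic)
```

Averaging like this blurs edges and produces color fringing on real images, which is one reason learned approaches are attractive for this problem.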

Key learnings:

  • Used a ResNet-Bottleneck based deep learning architecture.

  • Wrote the architecture from scratch in TensorFlow 1.x to conduct multiple experiments.

  • Residual learning; skip connections

  • Links - Paper | Pre-print
