Diptanu Sarkar
hello [at] diptanu [dot] com

I am a computer science graduate student at the Rochester Institute of Technology. Currently, I am a student researcher at the NLP and SLP Center at RIT under Dr. Marcos Zampieri. I am also a member of the Center on Access Technology research group under Dr. Michael Stinson.

Earlier, I worked as a software engineering intern on Wayfair's platform engineering team. I earned my bachelor's degree in electronics engineering from NIT Agartala, India, in 2015. I also worked with the Infosys Engineering team in Bengaluru, India.

Post-graduation, I will join Microsoft's Cloud and AI team as a Software Engineer.

Research

I am passionate about large-scale distributed systems and natural language processing.

Polarization Detection in Fake News on Social Media Platforms
Alexander G. Ororbia II, Diptanu Sarkar, Marcos Zampieri

Research on a novel approach for detecting polarization in fake news and modeling that data to flag fake or offensive posts on social media in real time.

Automatic Speech Recognition to provide better accessibility for Deaf or Hard of Hearing
Diptanu Sarkar, Lisa B. Elliot, Michael Stinson

A machine learning model based on word importance in utterances, intended to help Automatic Speech Recognition (ASR) systems provide better phone captioning and accessibility for the deaf or hard of hearing (DHH) community.

Projects
Automatic Language Identification in Text [Live]
Technology Stack: Python, NumPy, SciPy, Scikit-learn, Flask

Developed an automatic language identification model using bigram, Naive Bayes, and artificial neural network approaches to detect ten different natural languages. The model is trained on the WiLI-2018 benchmark dataset, and the highest accuracy achieved on the test dataset is 99.7% with paragraph text.

e-Paste and Share [Live]
Technology Stack: Java, Spring Boot, JavaScript, Docker, Kubernetes, AWS

Designed and developed an easy-to-use online text-sharing tool. It started as a hobby project and is now deployed for free.

Part-of-Speech (POS) Tagger for the English Language
Technology Stack: Python, NLTK, Bag-of-Words, Hidden Markov Model, Bayes Net, Naive Bayes

Implemented a part-of-speech tagger for the English language using a Hidden Markov Model, a Bayesian network, and Naive Bayes, then compared the performance of the Forward-Backward algorithm and the Viterbi algorithm. The model achieved over 91.2% word accuracy and 63.6% sentence accuracy.

Image Classification using Deep Neural Networks
Technology Stack: Python, PyTorch, NumPy, OpenCV

Built deep learning architectures for image classification (AlexNet, VGG16, and ResNet) using transfer learning and fine-tuning in PyTorch. Final model accuracies on 10K test images were AlexNet 81.2%, VGG16 85.6%, and ResNet 84.7%.

Articles
Data Structures & Algorithms: Asymptotic Analysis & Notations, The Startup

This article explains the importance of asymptotic analysis and then introduces asymptotic notations. Worst-, average-, and best-case time complexity analyses are also briefly discussed.

Detecting Emotions in Lyrics

Music stimulates strong human emotions and feelings. Music platforms provide highly customized playlists for every user, along with playlists based on moods. Emotions are subjective, which makes emotion detection in music a very challenging task. Previously, music emotion detection relied solely on acoustic features. Recent studies have observed that using lyric features along with acoustic features significantly improves classification results.

Automatic Language Identification in Short Utterances

Language identification in natural language processing is the process of identifying the spoken language in speech utterances. This blog post examines three models for recognizing languages automatically: a Dynamic Hidden Markov Networks model, a Deep Neural Network model, and a Long Short-Term Memory Recurrent Neural Network model.



Appreciate the aesthetics? Credit: Jon Barron.
Last updated: 01 Nov 2020