Hi, I am Arjun

Arjun Prashanth

I'm a Machine Learning Engineer who loves taking AI research off the whiteboard and into the real world (where things break, scale, and actually make money). Over the last few years in Boston, I’ve built and deployed production-grade systems for search, recommendations, and Gen AI - mostly in the e-commerce space.

I enjoy tinkering with transformers, vector search, and the occasional project involving LLMs. I speak Python fluently, but I also get along well with PyTorch, PySpark, Kubernetes, LangChain, MongoDB, and friends. I’ve designed SDKs, optimized inference with quantization, accelerated training using distributed GPUs, and containerized more things than I can count.

Now heading back to India, I’m looking to join a team that’s building cool stuff in ML, NLP, or GenAI - and where shipping things matters just as much as training them. If you’re solving interesting problems and need someone who can turn research into revenue and wrangle ML infra for speed (and some coffee), I’m game.

Problem Solver
Team Player
Versatile
Inquisitive
Detail-Oriented
Dependable

Skills

Experiences

Machine Learning Engineer
Ahold Delhaize USA

August 2021 – June 2025, Boston, MA

Responsibilities:
  • Improved key search metrics and eliminated null queries by productionizing an AI-based hybrid search system using fine-tuned bi-encoder/cross-encoder transformers and a MongoDB vector database.
  • Developed top revenue-generating recommendation systems for cross-sell and newly discovered products using Contrastive Learning, a Two-Tower model and Neural Collaborative Filtering.
  • Built a multi-agent RAG chatbot for different HR functions using LangGraph, Llama 3, Guardrails and efficient memory management.
  • Significantly accelerated model training pipelines using Distributed training, Model distillation and Quantization techniques.
  • Improved latency of ML APIs using NVIDIA Triton inference server, TensorRT and optimized MongoDB queries.
  • Deployed scalable inference pipelines for Search and Recommendation Systems using containers, Kubernetes & Seldon.
  • Spearheaded a team of two on the Model Portal, enhancing team visibility and driving increased collaboration with new teams.
  • Decreased time to ship features, models and APIs by developing multiple SDKs to (i) efficiently push and retrieve features to and from the offline and online Feature Store using Databricks and MongoDB, (ii) track ML experiments, register ML models and log metrics using MLflow, and (iii) develop templates for boilerplate API builds using OOP.
  • Experimented with Sequential and Session-based recommendation transformers on clickstream data to predict the next item to purchase based on the customer journey.
  • Developed an end-to-end chit-chat/noise classifier on chatbot data using BERT and MLflow.

Data Science Intern
Glasswing Ventures

January 2021 - July 2021, Boston, MA

Responsibilities:
  • Designed MLOps Platform using GCP Vertex AI from Data Extraction to Model Deployment to continuously improve predictive performance of prospective AI investments.
  • Created baseline models on time-series data using neural networks in TensorFlow.
  • Reduced memory usage and runtime by 3x by optimizing Glasswing’s data pipeline.
  • Eliminated all manual effort by containerizing and deploying Glasswing’s data pipeline on Google Cloud Run.
  • Visualized, analyzed and aggregated data to provide insights to the investment team and enrich the Glasswing platform.

Machine Learning Research Intern
DRDO

Dec 2018 - May 2019, Bangalore, India

Responsibilities:
  • Researched WhatsApp’s network architecture and conducted experiments to collect WhatsApp network traffic-flow data.
  • Attained 97.3% accuracy using a 2-layer ensemble ML model consisting of Naive Bayes, KNN, Decision Trees and Logistic Regression to identify whether media transfer occurred in a WhatsApp chat.
  • Obtained 95.6% accuracy using XGBoost to classify WhatsApp messages as delivered, received, or seen.

Education

Sept 2019 - Apr 2022
MS in Computer Science
CGPA: 3.67 out of 4
Courses Taken:
  • Deep Learning
  • Large Scale Data Processing
  • Data Mining Techniques
  • Information Retrieval
  • Database Management
  • Algorithms
  • Program Design Paradigm
Jul 2015 - May 2019
B.Tech in Software Engineering
CGPA: 8.52 out of 10
Courses Taken:
  • Machine Learning
  • Linear Algebra
  • Probability and Statistics
  • Advanced Calculus
  • Software Testing
  • Agile Software Process

Projects/Open Source

Fine-tuned Phi-3 VLM on CIFAR10 Images
Jul 2020

● Created a dataset with Question Answer pairs by generating image descriptions using SmolVLM2 on CIFAR10 images.
● Generated aligned image embeddings using SigLIP model.
● Fine-tuned Phi-3 as VLM for vision-text alignment using QLoRA and descriptions based on image embeddings from the SigLIP model.
● Optimized training with 4-bit quantization, Flash Attention 2, gradient checkpointing, and mixed-precision training, achieving memory-efficient and stable model convergence.

Shakespearean Text Generator
Jul 2020

● Designed and trained a GPT-2 model from scratch on Shakespeare’s complete works to generate stylistically authentic text.
● Implemented tokenization, dataset preprocessing, and transformer architecture customization, and built a UI app using Gradio.

News Summarization Agent
Jul 2020

● Developed an AI-driven news aggregation and summarization agent leveraging LangGraph and MCP to automate the discovery, analysis, and synthesis of news articles.
● Designed a dynamic workflow that interprets user queries, collects and evaluates articles from multiple sources using NewsAPI and Beautiful Soup, and generates cohesive, multi-article summaries using advanced language models like gpt-4o-mini.
● Implemented adaptive search strategies, parallel content analysis, and context-aware summarization to deliver actionable insights and reduce information overload.

Sentiment Analysis Web App
May 2020

● Developed a Web App that predicts the sentiment of a user-submitted review.
● Performed text cleaning and preprocessing including stemming, stopword removal, tokenization and HTML parsing for over 50,000 reviews and uploaded the transformed data to AWS S3.
● Built an LSTM model with Word Embedding layer using skip-gram architecture to learn sentiments from the data.
● Deployed the model for testing on AWS SageMaker and achieved a test accuracy of 84%.
● Hosted the model on my Web App using AWS Lambda and AWS API Gateway.

Machine Translation
Jun 2020

● Built a Machine Translation model that translates English sentences to French using Keras.
● Developed a comprehensive pipeline to preprocess over 1.8 million English and French words.
● Experimented with different architectures, including Embedding layer + Bidirectional-GRU, Embedding layer with GRU, Bidirectional-GRU, Vanilla GRU, and Encoder-Decoder with LSTM.

Image Denoiser Using Convolutional Autoencoder
Jul 2020

● Created a custom MNIST image dataset by adding Gaussian noise.
● Implemented a Denoiser using an Encoder-Decoder model with Convolutional layers.
● Trained the Denoiser by supplying noisy images as input and original images as target.

Dog Breed Classifier Using CNN
Apr 2020

● Created a CNN that predicts the breed when given a dog image, or the closest-resembling dog breed when given a human image.
● Detected human faces in the images using OpenCV's Haar Cascades.
● Performed dog face detection and breed classification using Transfer Learning from a VGG16 model and achieved 86% accuracy on unseen data.

Patient Experience Website
Oct 2019 - Dec 2020

● Designed and built a JavaScript-based website for patients to look up doctors based on medical conditions and location.
● Integrated RESTful services as a middle tier to handle CRUD operations using JPA controllers and DAOs.
● Built a robust database using MySQL and wrote advanced SQL including joins, nested queries, triggers and views.
● Hosted the database on AWS RDS and the entire website on an EC2 instance via AWS Elastic Beanstalk.

Topical Web Crawler
Jan 2020

● Implemented an algorithm for a web crawler using link graphs and customized priority queues.
● Crawled over 140,000 web pages on Barack Obama and indexed the crawl data on Elastic Cloud.
● Created a Vertical Search using Flask to retrieve relevant pages based on keywords using the BM25 text retrieval model.

Neural Style Transfer
May 2020

● Developed a CNN that applies the style of an image onto the content of another image.
● Extracted features and constructed loss functions for style and content, using Gram matrices for the style loss.
● Performed Transfer Learning from a VGG19 model to build the CNN.

Automatic Speech Recogniser
Jul 2020

● Implemented an End-to-End Automatic Speech Recognition pipeline using Keras.
● Preprocessed raw audio to feature representations like MFCC and Spectrograms.
● Built Acoustic Models to map audio features to the transcribed text.
● Experimented with different Neural Network architectures, including Deep RNN + TimeDistributed Dense, CNN + RNN + TimeDistributed Dense, and Bidirectional RNN + TimeDistributed Dense.

Email Spam/Ham Classifier
Apr 2020

● Performed text preprocessing by email parsing, stemming and stopword removal using NLTK.
● Indexed the data to Elasticsearch and transformed text data to sparse matrices using CountVectorizer.
● Devised feature extraction using NLP techniques like skip-grams and TF-IDF.
● Developed Decision Trees, Logistic Regression and SVM models to achieve an ROC AUC score of 96%.

Image Segmentation using K-Means Clustering on MapReduce
Nov 2020

● Attained Speedup of 1.35 and Scaleup of 1.05 while performing Image Segmentation.
● Designed the K-Means algorithm from scratch to work on distributed sources of data and parallelize compute using the Hadoop ecosystem.

Face Mask Detection
Dec 2020

● Achieved sensitivity of 96% on the largest real-world Face Mask Dataset.
● Built a custom ResNet model with Data Augmentation.
● Implemented a Face Detection pipeline using YOLOv5 and OpenCV to detect faces.

Human Protein Classification
Jul 2020

● Ranked in the top 20% of Human Protein Classification Kaggle Competition.
● Developed CNNs using DenseNet and ResNet capable of classifying mixed patterns of proteins in microscope images.

Image Processing Application
Nov 2019

● Built an Image Filter App that applies various filters on user input images.
● Implemented image processing operations such as blur, greyscale, sepia and mosaic from scratch.
● Devised an MVC architecture using Object Oriented Programming and the Command design pattern.
● Additional features include generating custom images such as rainbow, checkerboards and flags based on user input.