Dip Rajeshbhai Shambhvani | Portfolio

Technical Expertise

C++ / C | Python | System Programming | PyTorch / TensorFlow | LangChain | SQL / NoSQL | Git | Linux / Bash

Featured Projects

Avadhana-LLM Framework

M.Tech Thesis

A benchmarking framework, inspired by the ancient art of "Avadhana", that evaluates LLM multitasking, memory retention, and distraction handling.

Implemented distraction pipelines using LangChain & LangGraph.
Designed "Memory Recall Score" (MRS) & "Word Overlapping Score" (WOS).
Benchmarked Llama 3, Mistral, and GPT-OSS under high cognitive load.

Python | LangChain | Research

BlinkDB: In-Memory KV Store

Systems

A Redis-inspired in-memory key-value store written in C++ that uses non-blocking, event-driven I/O to serve many concurrent clients.

Achieved 159k+ GET ops/sec and 133k+ SET ops/sec.
Implemented the RESP-2 protocol and kqueue-based non-blocking I/O.
Scaled to handle 1M+ requests with 1,000 concurrent clients.

C++ | TCP/IP | kqueue
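As an illustration of the RESP-2 wire format mentioned above, here is a minimal Python sketch of command encoding and reply parsing. It is not BlinkDB's C++ code; the names `encode_command`, `parse_reply`, and `RespError` are hypothetical, and the parser assumes the buffer already holds one complete reply.

```python
class RespError(Exception):
    """Raised when the server replies with a RESP-2 error ('-...')."""


def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP-2 array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]
    for part in parts:
        data = part.encode("utf-8")
        out.append(b"$%d\r\n" % len(data))
        out.append(data + b"\r\n")
    return b"".join(out)


def parse_reply(buf: bytes):
    """Parse one reply from the start of buf.

    Returns (value, bytes_consumed). Assumes buf contains a complete reply.
    """
    end = buf.index(b"\r\n")
    kind, line = buf[:1], buf[1:end]
    pos = end + 2  # skip the header line and its CRLF
    if kind == b"+":  # simple string
        return line.decode(), pos
    if kind == b"-":  # error
        raise RespError(line.decode())
    if kind == b":":  # integer
        return int(line), pos
    if kind == b"$":  # bulk string, -1 means null
        n = int(line)
        if n == -1:
            return None, pos
        return buf[pos:pos + n], pos + n + 2
    if kind == b"*":  # array, parsed recursively
        n = int(line)
        if n == -1:
            return None, pos
        items = []
        for _ in range(n):
            item, used = parse_reply(buf[pos:])
            items.append(item)
            pos += used
        return items, pos
    raise ValueError(f"unknown RESP type byte: {kind!r}")


print(encode_command("SET", "k", "v"))
# b'*3\r\n$3\r\nSET\r\n$1\r\nk\r\n$1\r\nv\r\n'
print(parse_reply(b"$5\r\nhello\r\n"))
# (b'hello', 11)
```

A real server would also need to buffer partial reads from non-blocking sockets before parsing, which this sketch leaves out.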

MemFS: Multithreaded File System

Systems

A volatile in-memory file system designed for thread safety and low-latency batch operations.

Engineered mutex-based synchronization for thread safety.
Achieved ~60µs latency for create/read operations.
Implemented thread pooling to maximize CPU throughput.

C++ | Multithreading | Mutex
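The combination of mutex-guarded state and a thread pool can be sketched as follows. This is a simplified Python illustration rather than MemFS's C++ implementation; the `MemFS` class shape, the `create_batch` helper, and the single coarse-grained lock are all assumptions made for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class MemFS:
    """Toy volatile file store: a dict guarded by one mutex (assumed design)."""

    def __init__(self):
        self._files = {}
        self._lock = threading.Lock()

    def create(self, name, data=b""):
        """Create a file; return False if the name already exists."""
        with self._lock:
            if name in self._files:
                return False
            self._files[name] = data
            return True

    def read(self, name):
        """Return the file contents, or None if the file does not exist."""
        with self._lock:
            return self._files.get(name)

    def delete(self, name):
        """Delete a file; return True if it existed."""
        with self._lock:
            return self._files.pop(name, None) is not None


def create_batch(fs, names, workers=4):
    """Run a batch of create operations on a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fs.create, names))


fs = MemFS()
results = create_batch(fs, [f"file{i}" for i in range(100)])
print(all(results))  # True: every name was new, so every create succeeded
```

A single lock keeps the sketch easy to reason about; finer-grained locking would be one way to reduce contention under heavier load.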

Robust Image Captioning

Deep Learning

A ViT-GPT2 vision-language model fine-tuned for resilience against image occlusion and corruption.

Designed a ViT-GPT2 encoder-decoder architecture.
Outperformed SmolVLM baseline in 10-80% occlusion tests.
Developed a BERT classifier (99.7% F1) to detect generated captions.

PyTorch | Transformers | ViT

Assembly Simulator & Interpreter

Compiler Design

An end-to-end language processor (lexer, parser, and interpreter) that simulates register-based hardware to execute custom assembly code.

Built lexer and parser using Python PLY (Lex & Yacc).
Simulated register memory architecture and arithmetic logic.
Implemented complex control flow (branching, loops).

Python | PLY | Interpreter
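To make the register simulation and control flow concrete, here is a small self-contained sketch of a register machine with branching and a loop. The instruction set (`MOV`, `ADD`, `SUB`, `JMP`, `JNZ`), the label syntax, and the `run` function are hypothetical, and plain string splitting stands in for the PLY lexer and parser used in the project.

```python
def run(program: str, max_steps: int = 10_000) -> dict:
    """Execute a tiny hypothetical assembly dialect and return the registers."""
    labels, instrs = {}, []
    for raw in program.splitlines():
        line = raw.split(";")[0].strip()  # drop ';' comments and whitespace
        if not line:
            continue
        if line.endswith(":"):  # a label names the next instruction's index
            labels[line[:-1]] = len(instrs)
        else:
            op, *args = line.replace(",", " ").split()
            instrs.append((op.upper(), args))

    regs = {}

    def val(tok):
        # Operands starting with a letter are registers (default 0),
        # anything else is an integer literal.
        return regs.get(tok, 0) if tok[0].isalpha() else int(tok)

    pc = steps = 0
    while pc < len(instrs):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step limit exceeded")
        op, a = instrs[pc]
        pc += 1
        if op == "MOV":
            regs[a[0]] = val(a[1])
        elif op == "ADD":
            regs[a[0]] = val(a[0]) + val(a[1])
        elif op == "SUB":
            regs[a[0]] = val(a[0]) - val(a[1])
        elif op == "JMP":  # unconditional branch
            pc = labels[a[0]]
        elif op == "JNZ":  # branch if register is non-zero
            if val(a[0]) != 0:
                pc = labels[a[1]]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


SUM_PROGRAM = """
    MOV r0, 0      ; accumulator
    MOV r1, 5      ; loop counter
loop:
    ADD r0, r1
    SUB r1, 1
    JNZ r1, loop   ; repeat until the counter reaches zero
"""

print(run(SUM_PROGRAM))  # {'r0': 15, 'r1': 0}
```

The step limit is a guard added for the sketch so that a program with an unconditional loop fails fast instead of hanging.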

Misinformation Detection

NLP

A high-accuracy text classification system for detecting COVID-19 misinformation using BERT variants.

Fine-tuned TwHIN-BERT achieving state-of-the-art 98.7% accuracy.
Optimized hyperparameters using Optuna.
Processed 10k+ tweet dataset with custom tokenization.

BERT | Optuna | NLP

Competitive Programming

LeetCode Stats

Links

GitHub | LinkedIn | Resume