my projects

multimodal voice search agent

view project | view repo

deployed a multimodal voice AI agent that combines voice and screen context, using Whisper (STT), an RL fine-tuned Qwen2.5 1.5B model, and gpt-4o-mini-tts (TTS), served from AWS EC2 with WebSocket streaming.

developed a Perplexity sub-agent that searches the web through Perplexity MCP and answers questions using the agent's context.

LangGraph, AWS EC2, OpenAI SDK, FastAPI, DSPy, WebSockets, MCP

fine-tuned qwen2.5-1.5b with grpo rl

view model

fine-tuned a Qwen2.5 (1.5B) model with Hugging Face TRL, using a LoRA-based SFT warmup followed by reinforcement learning via GRPO, to align model outputs with first-principles reasoning across ~750 annotated interactions using RLAIF with curriculum scheduling.

TRL, LoRA, GRPO, SFT, RLAIF, Unsloth
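as a rough illustration of the GRPO step mentioned in this project: GRPO scores several sampled completions for the same prompt and normalizes each reward against its own group, so no separate value model is needed. The sketch below shows only that group-relative advantage calculation. The function name, the `eps` constant, and the use of the population standard deviation are assumptions made for illustration; it is not the project's training code, and TRL's own implementation may differ in detail.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for a single prompt.

    Hypothetical helper: each advantage is (reward - group mean) divided by
    (group standard deviation + eps). eps avoids division by zero when every
    completion in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Four completions for one prompt, scored by a (hypothetical) AI judge.
    print(group_relative_advantages([1.0, 0.0, 0.5, 0.5]))
```

completions scored above the group average get a positive advantage and those below get a negative one; when all rewards in a group are equal, every advantage is zero and the group contributes no learning signal.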

podcast summarization rag (podnotes)

view repo

designed a RAG-based podcast summarization pipeline using OpenAI Whisper for transcription, Gemma3 for summarization, and hybrid retrieval (semantic and BM25) through LangChain, with speaker diarization and DynamoDB storage.

Whisper, Gemma3, LangChain, Chroma, DynamoDB, BM25, RAG
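to make the hybrid retrieval idea concrete: a semantic retriever and a BM25 retriever each return their own ranked list, and the two lists have to be merged into one. One common way to do that is reciprocal rank fusion (RRF). The sketch below is a generic, dependency-free version; whether this project fuses results this way is an assumption, and the function name, the chunk ids, and the constant `k=60` are illustrative only.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Hypothetical helper: every document earns 1 / (k + rank) from each list
    it appears in (rank starts at 1), and documents are returned by total
    score, highest first. Ties are broken by document id for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    semantic_hits = ["ep12_chunk3", "ep04_chunk1", "ep09_chunk7"]  # made-up ids
    bm25_hits = ["ep04_chunk1", "ep20_chunk2", "ep12_chunk3"]      # made-up ids
    print(reciprocal_rank_fusion([semantic_hits, bm25_hits]))
```

a chunk that appears in both lists accumulates score from each, so it tends to outrank a chunk that only one retriever found, which is the usual motivation for combining keyword and embedding search.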

other repos

🤖

multiagents

AI agents built with LangChain, LangGraph, and DSPy, exploring orchestration patterns.

view repo
🚇

mta routing

NYC subway routing with the RAPTOR algorithm, exposed through a custom MCP server and used with RAG search via an MCP client.

view repo
🧠

reinforcement learning

Reinforcement learning experiments and algorithm implementations.

view repo

tech stack across projects

🤖

ML/DL & LLM Frameworks

building and fine-tuning large language models with cutting-edge frameworks

PyTorch, Transformers, LangChain, LangGraph, DSPy, TRL, Unsloth, Ollama, LlamaIndex

LM Techniques & Training

advanced techniques for model alignment, fine-tuning, and optimization

LoRA, SFT, RLHF, RLAIF, PPO, GRPO, Quantization, RAG, Flash Attention, MCP
🔧

MLOps & Infrastructure

deploying and scaling ML systems in production environments

Docker, Kubernetes, GitHub Actions, Redis, AWS EC2, SageMaker, S3, DynamoDB
🌐

Web & APIs

building robust web applications and real-time communication systems

React, FastAPI, REST APIs, WebSockets, WebRTC, JavaScript, TypeScript, Angular