my projects
deployed a multimodal voice AI agent that combines voice and screen context, using Whisper (STT), the Gemini 2.0 multimodal model, and gpt-4o-mini-tts (TTS), hosted on AWS EC2 with WebSocket streaming.
developed a Perplexity sub-agent that searches the web through the Perplexity MCP and answers questions using the agent's context; orchestrated with LangGraph workflows, with DSPy for prompt optimization.
fine-tuned qwen2.5-1.5b with grpo rl
fine-tuned the Qwen 2.5 (1.5B) model using Hugging Face TRL, with a LoRA-based SFT warmup followed by reinforcement learning via GRPO, aligning model outputs to first-principles reasoning across ~750 annotated interactions using RLAIF with curriculum scheduling.
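a minimal sketch of the group-relative advantage step that gives GRPO its name, assuming one group of completions sampled for the same prompt and scored by some reward signal. this is an illustration only, not the project's training code: the function name, the reward values, and the choice of population standard deviation are all assumptions, and real implementations may use a different estimator.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Each completion is scored relative to its siblings, so no separate
    value network is needed to estimate a baseline.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; estimators vary
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions of a single prompt.
rewards = [0.2, 0.9, 0.5, 0.4]
advantages = group_relative_advantages(rewards)
for r, a in zip(rewards, advantages):
    print(f"reward={r:.2f} advantage={a:+.3f}")
```

completions that beat their group's average get a positive advantage and are reinforced; those below it are pushed down. when every completion in a group scores the same, all advantages are zero and the group contributes no gradient signal.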
the munger talks (agentic rag)
built an agentic rag system that emulates charlie munger's thinking style (including his mental models) by integrating LangGraph workflows, DSPy prompt optimization, and multi-agent orchestration (planner, retriever, mental model analyzer, synthesizer, verifier).
integrated DSPy for prompt optimization across planner, synthesizer, and verifier modules; retrieval pipeline uses FAISS with cross-encoder reranking, sustaining ~2.5s median end-to-end latency for multi-turn queries.
podcast summarization rag (podnotes)
designed a RAG-based podcast summarization system using OpenAI Whisper for transcription, Gemma 3 for summarization, and hybrid RAG that combines semantic and BM25 retrieval through LangChain, with speaker diarization and DynamoDB storage.
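one common way to merge a BM25 ranking with a semantic ranking is reciprocal rank fusion (RRF). the sketch below assumes that approach purely for illustration; the description above does not say which fusion method podnotes uses, and the function name, chunk ids, and the constant k=60 are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 results from each retriever for the same query.
bm25_ranking = ["ep12_chunk3", "ep07_chunk1", "ep12_chunk4"]
semantic_ranking = ["ep07_chunk1", "ep03_chunk9", "ep12_chunk3"]

fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
print(fused)
```

RRF works on ranks rather than raw scores, which sidesteps the problem that BM25 scores and embedding similarities live on different scales.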
other repos
mta routing
NYC subway routing with the RAPTOR algorithm, served through a custom MCP server and used with RAG search via an MCP client.
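a minimal sketch of the round-based idea behind RAPTOR, where round k finds the earliest arrivals reachable with at most k trips. it is heavily simplified and is not the repo's code: the timetable layout, function names, and stop ids are hypothetical, transfers happen only at a shared stop with zero transfer time, and footpaths are omitted.

```python
import math


def earliest_trip(route_trips, stop_index, not_before):
    """Return the first trip (sorted by time) boardable at stop_index."""
    for times in route_trips:
        if times[stop_index] >= not_before:
            return times
    return None


def raptor(routes, trips, source, target, depart, max_rounds=4):
    """Earliest arrival at target using at most max_rounds trips.

    routes: route id -> ordered list of stop ids
    trips:  route id -> list of trips, each a list of times per stop
    """
    stops = {s for seq in routes.values() for s in seq}
    best = {s: math.inf for s in stops}
    best[source] = depart
    marked = {source}  # stops whose arrival time improved last round

    for _ in range(max_rounds):
        prev = dict(best)  # arrivals as of the end of the previous round
        # For each route, find the earliest marked stop along it.
        queue = {}
        for route_id, seq in routes.items():
            for i, stop in enumerate(seq):
                if stop in marked:
                    queue[route_id] = i
                    break
        marked = set()
        for route_id, start in queue.items():
            seq = routes[route_id]
            trip = None
            for i in range(start, len(seq)):
                stop = seq[i]
                # Riding the current trip may improve this stop.
                if trip is not None and trip[i] < best[stop]:
                    best[stop] = trip[i]
                    marked.add(stop)
                # Boarding here may catch the same or an earlier trip.
                current = trip[i] if trip is not None else math.inf
                if prev[stop] <= current:
                    trip = earliest_trip(trips[route_id], i, prev[stop]) or trip
        if not marked:
            break
    return best[target]


# Hypothetical two-route timetable; times are minutes after departure.
routes = {"A": ["s1", "s2", "s3"], "B": ["s2", "s4"]}
trips = {"A": [[0, 10, 20], [30, 40, 50]], "B": [[5, 15], [15, 25]]}
print(raptor(routes, trips, "s1", "s4", depart=0))
```

in this toy timetable the rider takes route A from s1 to s2, then changes to the second trip on route B to reach s4, which needs two rounds.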
tech stack across projects
ML/DL & LLM Frameworks
building and fine-tuning large language models with cutting-edge frameworks
LM Techniques & Training
advanced techniques for model alignment, fine-tuning, and optimization
MLOps & Infrastructure
deploying and scaling ML systems in production environments
Web & APIs
building robust web applications and real-time communication systems