I am a Software Engineer working at the intersection of modern LLM orchestration and asynchronous backend infrastructure. I build application layers that transform unstructured data into high-precision knowledge bases. To ensure production readiness, I use test-driven development for system reliability and optimize token usage to minimize LLM runtime costs.
PDF Intelligence RAG Engine: a scalable RAG pipeline featuring stateful conversational memory, NLU-driven query expansion, and automated evaluation suites.
- Frameworks: LangChain, HuggingFace Transformers
- Vector Databases: FAISS
- Models & Local Inference: Ollama, Llama 3.3, Sentence-Transformers
- LLMOps & Observability: Langfuse v3 (Application Tracing & Evaluation)
- Concepts: Agentic RAG, Chain-of-Thought (CoT), Prompt Chaining, Semantic Search
- Languages: Python, SQL, C/C++, Java
- APIs & Frameworks: FastAPI (Async/Await), RESTful Design, Pydantic
- Data Stores: MySQL, MongoDB
- Cloud & DevOps: Docker, GitHub Actions, CI/CD Pipelines
- AI Evaluation & Quality Assurance: Test-Driven Development (TDD), RAG Faithfulness & Relevancy Benchmarking, Automated Evaluation Suites via pytest, Asynchronous Logic Verification (httpx)
- System Design: Architectural Hygiene, Asynchronous Non-Blocking I/O, Persistent Resource Management (Docker Volume Optimization)
- ⚡ Engineering Philosophy: I am a Documentation-First developer who practices Test-Driven AI Development. I maintain a "Second Brain" in Obsidian and believe that if a system isn't documented or tested, it isn't finished. I treat technical notes, READMEs, and automated evaluation suites with the same precision as the source code.
- ☕ Fuel Source: My code is almost exclusively powered by Italian espresso roast coffee and an unhealthy amount of technical curiosity.
- 🐾 The Hot Take: I am a staunch Cat Enthusiast 🐱. While I respect the loyalty of dogs, I prefer the quiet, independent judgment of a cat watching me debug code for three hours.
- 📘 AI Engineering (Chip Huyen) — Benchmarks, cost/latency tradeoffs, and orchestration loops.
- 📙 Python Testing with pytest (Brian Okken) — Async test harnesses and Test-Driven Development (TDD).
- 📗 Build a Large Language Model (From Scratch) (Sebastian Raschka) — Transformer mechanics, self-attention, and context constraints.
- 🚧 Currently working on: Integrating Langfuse v3 to trace LLM reasoning latencies, evaluating Chain-of-Thought (CoT) prompting strategies via local Ollama instances, and refining test-driven RAG evaluation harnesses using pytest-asyncio.
