Sophomore Computer Science and Math Student @ Harvey Mudd College
Welcome to my portfolio! I'm a software engineer passionate about natural language processing, machine learning, and building impactful solutions.
Outside of school, I love tinkering with mechanical keyboards, and I'm an especially big fan of perfumes! I'm always looking for new keyboards and fragrances to add to my ever-growing collection >:)
American Red Cross | September 2025 - Present
Yujun Venture Capital Management | July 2025 - August 2025
The Nature Conservancy | May 2025 - July 2025
Shenzhen Lanyang Technology | May 2024 - July 2024
Lab for Cognition and Attention in Time and Space (Lab for CATS) @ Harvey Mudd College | September 2025 - Present
Workflows for Humanistic Inference of Statistical Knowledge (WHISK) Lab @ Harvey Mudd College | January 2025 - May 2025
Sponsors for Educational Opportunity (SEO) - First Year Academy
Selected out of 1200 applicants to participate in software engineering pre-professional development training | February 2025 - August 2025
I'm a software engineer with experience in natural language processing, machine learning, and full-stack development. I'm passionate about leveraging AI and data-driven solutions to solve real-world problems.
My work ranges from building RAG-based LLM systems for environmental research to developing patented computer vision algorithms for safety applications. I've been fortunate to work with organizations like The Nature Conservancy, Yujun Venture Capital, and Shenzhen Lanyang Technology.
Leading a 5-person team building a neural-network detection system that identifies vulnerable buildings in Indonesia from 360° Mapillary imagery — 75% benchmarked accuracy.
Selenium scraper that consolidates public-company data for a venture-capital team, accelerating pre-investment research by 15% and surfacing perfume-market trends for an investor presentation.
RAG-based LLM system processing 2,000+ research papers, automating literature review for agroforestry queries.
Patented computer vision algorithm with 70% accuracy for detecting fraudulent liquefied petroleum gas tanks using OpenCV.
Won Best Use of Streamlit out of 50 teams at Caltech Hacktech — full-stack interactive environmentalism game.
Won Most Unique Game out of 24 teams at SEO First Year Academy Summer Program.
Web app for digitally coordinating college room-draw plans. Used by 900+ students at Harvey Mudd to plan housing each year.
A computer vision detection system built with the American Red Cross to identify structurally vulnerable buildings in Indonesia using 360° street-level imagery from Mapillary. The model helps humanitarian teams prioritize disaster-preparedness efforts by surfacing at-risk structures at scale.
American Red Cross | September 2025 - Present
Computer Vision Team Lead — leading a 5-person team through data pipeline design, model training, and evaluation.
The detection system gives the Red Cross a scalable way to survey building stock in at-risk regions of Indonesia — work that would otherwise require extensive ground-truth teams. Identifying vulnerable structures early enables better targeting of retrofits, evacuation planning, and disaster-response resources.
A Selenium-based web scraper built for Yujun Venture Capital Management to consolidate public-company information for the firm's pre-investment research process. The tool was also applied to the perfume industry to surface market trends for a pitch to the internship team and investors.
Yujun Venture Capital Management | July 2025 - August 2025
Replaced a manual information-gathering workflow with an automated pipeline, freeing analysts to focus on interpretation instead of collection. The perfume-market analysis demonstrated the scraper's applicability to thesis-driven research and informed the firm's discussion of that sector.
A sophisticated Retrieval-Augmented Generation (RAG) system developed at The Nature Conservancy to automate literature review processes for agroforestry research. The system processes over 2,000 research papers to provide domain-specific answers to complex queries.
The Nature Conservancy | May 2025 - July 2025
The system uses ChromaDB as a vector database to store embeddings of research papers. When a query is received, the RAG system retrieves the most relevant document chunks and uses the Qwen language model to generate comprehensive, contextually accurate answers. The regex-based preprocessing pipeline ensures clean, well-structured text for optimal retrieval performance.
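As a rough illustration of the preprocess-then-retrieve flow described above, here is a minimal sketch in plain Python. It is not the project's code: a toy bag-of-words vector stands in for the real embeddings stored in ChromaDB, the citation-stripping regex is only one example of the kind of cleanup a preprocessing step might do, and the names `clean_text`, `embed`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def clean_text(raw: str) -> str:
    """Regex cleanup: drop bracketed citation markers, collapse whitespace."""
    no_citations = re.sub(r"\s*\[\d+(?:,\s*\d+)*\]", "", raw)
    return re.sub(r"\s+", " ", no_citations).strip()


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Made-up example chunks, cleaned before indexing.
chunks = [clean_text(c) for c in [
    "Alley cropping  integrates trees with annual crops [3].",
    "Silvopasture combines trees,\nforage, and livestock [1, 2].",
    "Vector databases store embeddings for similarity search.",
]]

top = retrieve("Which crops grow alongside trees?", chunks, k=1)

# The retrieved text would then be placed into the prompt sent to the LLM.
prompt = "Answer using only this context:\n" + "\n".join(top)
```

In a real pipeline the `embed` and `retrieve` steps would be handled by the vector database and an embedding model, and `prompt` would be passed to the language model for answer generation.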
To evaluate system performance, I created a comprehensive dataset of approximately 7,000 unique questions covering various aspects of agroforestry. This dataset serves as a robust benchmark for testing retrieval accuracy and answer quality, and can be used to compare different RAG implementations.
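One simple way a question benchmark like this can score retrieval is hit rate at k: the fraction of questions whose expected source appears among the top k retrieved results. The sketch below assumes each question is paired with a single expected paper ID; that pairing, the metric choice, and the name `hit_rate_at_k` are illustrative assumptions, not a description of the actual evaluation.

```python
def hit_rate_at_k(
    results: dict[str, list[str]],
    expected: dict[str, str],
    k: int = 3,
) -> float:
    """Fraction of questions whose expected source is in the top-k results."""
    if not expected:
        return 0.0
    hits = sum(
        1 for question, gold in expected.items()
        if gold in results.get(question, [])[:k]
    )
    return hits / len(expected)


# Made-up data: question ID -> expected paper ID.
expected = {"q1": "paper_a", "q2": "paper_b", "q3": "paper_c", "q4": "paper_d"}

# Made-up data: question ID -> ranked list of retrieved paper IDs.
results = {
    "q1": ["paper_a", "paper_x"],
    "q2": ["paper_y", "paper_b"],
    "q3": ["paper_z", "paper_w", "paper_c"],  # correct paper only at rank 3
    "q4": ["paper_x"],
}

rate = hit_rate_at_k(results, expected, k=2)
```

Running the same function over two different retrieval setups with the same question set gives a like-for-like comparison between them.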
This system dramatically accelerates the research process for agroforestry experts at The Nature Conservancy, enabling them to quickly find relevant information across thousands of papers. The automated literature review capability supports more efficient decision-making for conservation and sustainable agriculture initiatives.
A patented computer vision algorithm developed to detect fraudulent liquefied petroleum gas (LPG) tanks, preventing the illegal sale of hazardous units. This project achieved 70% accuracy in identifying counterfeit tanks while reducing implementation costs by 80%, making safety technology accessible to low-income families in China.
Shenzhen Lanyang Technology | May 2024 - July 2024
The algorithm uses advanced image comparison techniques with OpenCV to analyze LPG tank features and identify discrepancies that indicate counterfeit products. The system compares tank characteristics including markings, serial numbers, manufacturing details, and physical features against a database of authentic tank specifications.
Counterfeit LPG tanks pose serious safety risks, as they may lack proper safety mechanisms and can lead to explosions or gas leaks. However, traditional detection methods were prohibitively expensive for many households. This project aimed to develop an affordable, accessible solution that could be widely deployed.
By leveraging computer vision and standard imaging equipment, the solution reduced costs by 80% compared to existing hardware-based authentication systems. This dramatic cost reduction was crucial for making the technology accessible to low-income families who are most vulnerable to purchasing counterfeit products.
This project directly contributes to public safety by preventing the distribution of dangerous counterfeit LPG tanks. By making detection technology affordable and accessible, it protects vulnerable communities from the risks associated with substandard gas containers. The patent ensures the technology can be commercialized and deployed at scale.
EcoBuddies is an interactive full-stack chatbot application designed to promote environmental awareness and sustainable living. The project won Best Use of Streamlit at Caltech's premier hackathon, Hacktech 2025.
🏆 Best Use of Streamlit - Hacktech 2025 (Caltech)
EcoBuddies integrates Google's Gemini LLM with Streamlit's web framework to create an engaging conversational interface that educates users about environmental topics, sustainability practices, and climate action. The application makes learning about environmentalism accessible and interactive through AI-powered conversations.
The application leverages Streamlit's component-based architecture to create a responsive and interactive web interface. The Google Gemini API powers the conversational capabilities, providing accurate and contextually relevant responses to user queries about environmental topics.
Winning Best Use of Streamlit at Caltech's Hacktech 2025 demonstrated the application's effective use of modern web frameworks and AI technology. The judges recognized the project's clean implementation, user-friendly design, and impactful mission of promoting environmental awareness.
Rainforest Revival is an innovative game that combines entertainment with environmental education. The project features three distinct mini-games and an integrated AI character chatbot, creating a unique gaming experience that raises awareness about rainforest conservation.
🏆 Most Unique Game - SEO First Year Academy Summer Program
Developed during the SEO First Year Academy Summer Program, Rainforest Revival stands out for its creative combination of traditional game mechanics with cutting-edge AI technology. The game provides an engaging platform for learning about environmental conservation.
Built using the Pygame framework for game mechanics and graphics, with Google Gemini API integration for the chatbot feature. The three mini-games were designed to be both entertaining and educational, each highlighting different conservation themes.
The project earned the Most Unique Game award for its innovative combination of traditional 2D gaming with advanced AI conversation capabilities. The seamless integration of an LLM-powered chatbot into a Pygame environment was particularly noteworthy.
Digidraw is a web application that digitally coordinates the annual room-draw (housing selection) process at Harvey Mudd College. It replaces an error-prone paper-and-whiteboard workflow with a shared interface where students can plan, preview, and finalize their room picks in real time.
Used by 900+ students
Adopted campus-wide, Digidraw has become the standard tool students use during room draw each year — streamlining a high-stakes event that previously depended on in-person coordination and shared spreadsheets.
A multimodal lie detection system that analyzes videos using both natural language processing and computer vision techniques. The system combines sentiment analysis of transcribed speech with visual cues to identify potential deception indicators.
This project explores the intersection of NLP and computer vision by implementing a comprehensive system that processes video content to detect deception through multimodal analysis.
The system processes videos by extracting audio and visual streams. Google Cloud Speech API converts speech to text for NLP sentiment analysis, while computer vision algorithms analyze facial expressions and behavioral cues simultaneously.
By combining NLP sentiment analysis and computer vision, the system leverages multiple information sources. This approach provides more comprehensive detection than single-modality systems by analyzing both verbal and non-verbal communication.
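One common way to combine per-modality signals is weighted late fusion, where each modality produces its own score and the scores are averaged by weight. The sketch below is illustrative only: the fusion method, the 0-to-1 score range, the weights, and the names `ModalityScore` and `fuse` are all assumptions, not the project's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class ModalityScore:
    name: str
    score: float   # assumed range 0..1, higher means more deception-like
    weight: float  # relative trust placed in this modality


def fuse(scores: list[ModalityScore]) -> float:
    """Weighted average of the per-modality scores (late fusion)."""
    total_weight = sum(s.weight for s in scores)
    if total_weight == 0:
        raise ValueError("at least one modality needs a nonzero weight")
    return sum(s.score * s.weight for s in scores) / total_weight


# Made-up scores for one video clip, with equal weights.
combined = fuse([
    ModalityScore("text_sentiment", 0.8, 1.0),
    ModalityScore("facial_cues", 0.4, 1.0),
])
```

With equal weights the result is the plain mean of the two scores, about 0.6 here. Raising one modality's weight shifts the combined score toward that modality.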
This project serves as an exploration of multimodal AI techniques. It's designed for educational and research purposes to understand the capabilities and challenges of combining NLP and computer vision for complex analysis tasks.
I'm currently looking for software engineering internships for summer 2027, and I would love to get in touch and chat more!