jason@jasonkim.io~$ whoami
Jason Kim
machine learning engineer (ex data scientist, ex software engineer) in sf
building and deploying deep learning models end-to-end, from experimentation and development to production deployment and real-time inference serving millions of users
jason@jasonkim.io~$ cat experience.txt
machine learning engineer
- Filed two patents and coauthored a research paper on multitask deep learning for delivery logistics; poster accepted and presented at NVIDIA GTC 2026
- Architected a mixture-of-experts deep learning model for ETA prediction across five delivery verticals serving millions of daily orders, with a +13% relative accuracy gain, driving $XX M+ in incremental annual order volume. Currently in full production
- Built a probabilistic deep neural network that models the full uncertainty distribution of delivery assignment durations, resulting in a 10.4% relative reduction in MAE, a +7.4% relative improvement in on-time accuracy, and a 75% reduction in systematic bias vs. the production baseline
- Developed an offline reinforcement learning policy to replace hand-tuned heuristics in DoorDash's real-time delivery assignment optimizer, reframing a legacy cost estimation system as a learned decision problem
- Designed a graduate intern project on non-parametric probabilistic ETA modeling and mentored the intern through it, reframing point prediction as full distribution learning; the resulting model is on track to replace the current champion production model
- All models shipped end-to-end through architecture search, power analysis, production A/B testing with guardrail metrics, and deployment at scale
data scientist
- First hire on the Data Science team at a $13B AUM multi-manager hedge fund, generating trading signals that portfolio managers actively traded on for positive P&L
- Built and internally deployed a pip-installable Python package for quantitative signal generation from unstructured text using LLMs and retrieval-augmented generation (RAG)
- Trained a weighted ensemble of 13 base models to forecast US Treasury yield behavior from auction announcement to sale, achieving a Sharpe ratio of X.XX on backtest and outperforming the S&P 500 over the same period
- Automated daily training and deployment of a LightGBM model for oil futures trend prediction on AWS SageMaker and Apache Airflow, with a custom loss function optimized for directional accuracy. Built a Streamlit dashboard for portfolio managers to view predictions and Shapley-based interpretability
- Engineered NLP research pipelines to extract trading signals from daily research reports using OpenAI embeddings, AWS, and Snowflake
- Partnered with the equity risk team to estimate cross-pod portfolio correlations using LLM-generated signals for a 100-stock covariance matrix, validating statistical significance with regression analysis
- Led vendor analysis, POC, and adoption of AWS SageMaker for the Data Science team
strategy analyst
- Built a data science platform analyzing financial and nonfinancial data to recommend managerial actions for Fortune 500 companies, computing 171 financial and 128 nonfinancial metrics per company
- Ran 77 million correlations and identified 200 significant links between managerial actions and company performance; designed a Power BI interface to visualize correlations and conduct what-if analyses
- Built financial datasets from public filings across all publicly listed healthcare companies and conducted quantitative analysis on stock performance, industry segmentation, and market dynamics for a healthcare conglomerate client
- Developed Python-based geospatial analysis tools using Google Maps API to map and analyze clinical research site distributions for a pharmaceutical client
- Built financial models projecting revenue and cost scenarios for new product offerings in home health
software engineer
- Software engineer at a YC startup later acquired by Klarna
- Built full-stack features in Spring Boot including receipt search with multi-field filtering and a historical marketing promotions viewer
- Improved codebase quality through bug fixes, test coverage expansion, and repository modernization
graduate teaching assistant
- CIS545 (Graduate) Big Data Analytics
- CIT594 (Graduate) Data Structures & Software Design
- CIT592 (Graduate) Mathematical Foundations of Computer Science
- NETS213 Crowdsourcing & Human Computation
jason@jasonkim.io~$ cat education.txt
university of pennsylvania
the wharton school
jason@jasonkim.io~$ cat contact.txt
- jason@jasonkim.io
- linkedin.com/in/jasonkim-io
- github.com/jasonkim-io