<E.Justice/>
ML Researcher | LLM Inference Optimization | vLLM
Hi, I'm Ethan Justice - a Computer Science student at the University of Michigan researching KV-cache optimization and multi-agent LLM systems. Building production-grade ML infrastructure on HPC clusters.
02. Experience Timeline
From ML research to enterprise software engineering, building production systems at scale.
03. Work & Projects
From ML research to hackathon wins, building solutions across the stack.
Machine Learning Researcher
Optimization of Large Language Model inference infrastructure focusing on KV-cache management.
- ▹Productionizing research from the paper 'Compute Or Load KV Cache? Why Not Both?' into the vLLM ecosystem
- ▹Implementing bidirectional KV prefill logic to minimize Time-to-First-Token (TTFT) latency
- ▹Porting experimental logic from legacy codebases to current main branches of vLLM and LMCache
- ▹Optimizing GPU memory transfer overheads on Great Lakes HPC (A100/H100) clusters
AI Engineering Intern
Core infrastructure engineer for an agentic memory platform; built the complete multimodal interaction pipeline.
- ▹Engineered the entire end-to-end multimodal pipeline, enabling seamless file uploads and voice interfaces
- ▹Achieved sub-second voice-to-voice agent interactions through optimized audio processing streams
- ▹Implemented a semantic caching layer via Convex, reducing API response times by over 99%
- ▹Built an 'LLM-as-a-Judge' testing suite to quantitatively evaluate agent memory performance
Efficient Heterogeneous LLM Multi-Agent Debate
Research framework and class project reducing inference costs for complex reasoning tasks by 40% via heterogeneous agents.
- ▹Achieved a 40% reduction in total FLOPs using a confidence-based gating mechanism for agent responses
- ▹Deployed high-performance inference pipelines on Great Lakes HPC cluster using Slurm scheduling
- ▹Implemented a Factory Pattern to modularly switch backends between vLLM and HuggingFace
- ▹Identified 'Syntactic Determinism' failure modes in token-confidence calibration
Information Digest
Agentic information retrieval pipeline with robust Infrastructure-as-Code and CI/CD automation.
- ▹Automated complete infrastructure lifecycle using Terraform and GitHub Actions
- ▹Engineered an LLM agent that autonomously decomposes complex queries and synthesizes answers
- ▹Developed comprehensive Pytest suites ensuring reliability before deployment to GCP Cloud Run
- ▹Designed a strongly-typed REST interface guaranteeing deterministic JSON outputs
Cycle-Accurate Processor Simulator
High-performance microarchitectural simulator modeling a 5-stage pipelined CPU with a configurable cache hierarchy.
- ▹Engineered a 5-stage pipeline simulator (IF, ID, EX, MEM, WB) achieving cycle-accurate execution of LC-2K binaries
- ▹Implemented hazard detection logic to handle data dependencies via forwarding units and stall injection
- ▹Integrated a unified instruction/data cache with configurable associativity, block size, and write-back/write-allocate policies
- ▹Optimized memory access simulation using Least Recently Used (LRU) eviction to minimize miss rates
LC-2K Compilation Toolchain
End-to-end translation system converting assembly code into executable machine binaries via intermediate object files.
- ▹Developed a two-pass assembler generating object files with symbol tables and relocation entries for global label resolution
- ▹Built a static linker capable of combining multiple object files, resolving cross-file dependencies, and managing stack allocation
- ▹Wrote optimized LC-2K assembly algorithms, including a recursive combination function managing stack frames and return addresses
- ▹Implemented bitwise simulation of the LC-2K ISA to validate machine code execution behavior
Software Engineering Intern
Backend development and DevOps automation for enterprise systems serving 5,000+ franchise locations.
- ▹Architected a C# .NET microservice to handle SMS automation at the scale of 5,000+ stores
- ▹Designed and implemented full CI/CD pipelines in Azure DevOps to automate deployment workflows
- ▹Created a batch-processing API that achieved a 90% reduction in developer time for data management tasks
- ▹Built compliance tracking tools for franchise store hours using MongoDB and Docker
Deep Metric Learning for Facial Recognition
Facial recognition system built using ResNet-18 and non-parametric instance discrimination.
- ▹Fine-tuned a ResNet-18 backbone using Contrastive Loss to optimize feature embeddings for facial identity
- ▹Implemented Non-Parametric Instance Discrimination (NPID) to maintain a memory bank of feature vectors
- ▹Visualized high-dimensional embedding clusters using t-SNE to verify class separation
- ▹Engineered a custom CNN architecture to benchmark against pre-trained models on a held-out dataset
Interview Bot Pro
AI-powered interview practice application featuring auditory analysis and content grading.
- ▹Trained a Random Forest ML model on the MIT interview dataset for voice prosody analysis
- ▹Integrated Google Gemini to grade answers based on STAR method structure and job requirements
- ▹Developed logic for dynamic follow-up questions tailored to previous user responses
Multi-Resolution Image Blending
Seamless image combination engine using Gaussian and Laplacian pyramids.
- ▹Constructed 4-level Laplacian Pyramids to decompose images into distinct frequency bands
- ▹Implemented seamless blending masks to combine images without visible seams or artifacts
- ▹Reconstructed high-fidelity final images by collapsing pyramid levels upsampled via Gaussian kernels
Steerable Filter Edge Detection
Advanced edge detection system using gradient filters and convolution math.
- ▹Designed steerable filters to detect edges at arbitrary angles (e.g., π/4, π/2, 3π/4)
- ▹Implemented Gaussian and Box filters to analyze noise reduction trade-offs in edge detection
- ▹Calculated gradient magnitude and orientation maps to visualize structural image features
Multi-Threaded Network File Server
Concurrent network file server using fine-grained locking and crash-consistent disk logging.
- ▹Implemented fine-grained reader/writer locks (`boost::shared_mutex`) for high-concurrency file access
- ▹Ensured file system consistency and crash recovery via strictly ordered disk writes (inode vs. data)
- ▹Built hierarchical directory management and inode allocation logic
- ▹Designed thread-safe network communication using TCP sockets
Virtual Memory Pager
External pager managing virtual address spaces with eviction policies and copy-on-write sharing.
- ▹Implemented the 'Clock' page replacement algorithm to approximate LRU eviction efficiency
- ▹Engineered copy-on-write optimizations for `fork` operations to minimize physical memory usage
- ▹Managed swap-backed vs. file-backed pages and eager swap reservation
- ▹Handled page faults and memory protection bits to simulate hardware MMU behavior
The Situation Room - Cal Hacks
Unity-based simulation game using AI for dialogue analysis in de-escalation scenarios.
- ▹Integrated Hume AI for real-time sentiment analysis of user voice input
- ▹Utilized Google Gemini to dynamically generate NPC responses based on conversation history
Ribbet - MHacks Project
Cross-platform social media application with betting mechanics built during MHacks.
- ▹Architected a MERN-stack backend to support cross-platform mobile functionality
- ▹Implemented full CRUD operations for user profiles, feeds, and betting transactions
Research Assistant
Embedded ML development for real-time user activity sensing and identification.
- ▹Engineered an embedded system on Orange Pi for real-time activity level identification
- ▹Optimized Llama 3 inference on edge hardware using strategic core routing and multithreading
- ▹Processed high-frequency sensor data streams for immediate classification
User-Level Thread Library
High-performance threading library supporting context switching and preemptive scheduling.
- ▹Implemented thread initialization and context switching using `getcontext`, `makecontext`, and `swapcontext`
- ▹Built synchronization primitives including Mutexes and Condition Variables with wait queues
- ▹Handled timer interrupts for preemptive thread scheduling and CPU time slicing
- ▹Designed RAII wrappers for automatic resource management and deadlock prevention
Inner Voice AI
Conversational AI therapist focusing on mental health, winner of 1st place at Stemist Hacks.
- ▹Won 1st Place at Stemist Hacks for best overall project
- ▹Engineered natural conversational flow using OpenAI API and Hume AI emotional analysis
- ▹Implemented session management and history tracking using Firebase
Software Development Intern
Full-stack modernization of internal employee rewards platforms used by 750+ team leads.
- ▹Drove a 20% increase in monthly active users by re-engineering the legacy rewards platform
- ▹Managed backend operations and Apex controllers for a dataset exceeding 50,000 entries
- ▹Implemented automated testing protocols ensuring high availability for critical internal tools
Flight Systems Developer
Autonomous navigation software development for M-Fly Aero Design competition planes.
- ▹Implemented the 3DVFH* algorithm for dynamic aerial obstacle avoidance
- ▹Developed autonomous control loops using ROS and MAVLink for actuator management
- ▹Integrated remote identification systems for real-time aircraft tracking
Lead Programmer
Technical lead for FRC Team 3536; architected software for 4 competitive robots (Reaper, Blade, Raptor, NavPod).
- ▹Designed 'NavPod', a custom sensor fusion system combining Optical Flow and IMU data for precise localization
- ▹Implemented Hermite spline pathing for complex autonomous trajectory generation
- ▹Programmed advanced drivetrain logic including Differential Swerve and field-oriented control
- ▹Developed autonomous turret tracking using computer vision and physics-based launch calculations
- ▹Mentored student engineers in C++, object-oriented design, and control theory
04. Education
Coursework, projects, and academic achievements at Michigan Engineering.
University of Michigan
Bachelor of Science in Engineering in Computer Science
Aug 2023 - May 2026 • Ann Arbor, Michigan