Patrick McBride
AI/ML Infrastructure Engineer
Summary
AI/ML infrastructure engineer with 9+ years building production-grade machine learning systems. Specialized in LLM integration, embedding systems, and scalable, cost-aware ML infrastructure. Currently building backend AI infrastructure at Bill.com and serving as co-founder and Head of AI at ApplyPass, where I built systems handling 800K+ API requests and 3B+ tokens per month.
Experience
Senior Software Engineer, Backend AI Infrastructure
Aug 2025 - Present • Bill.com • Remote, CA
- Design and operate agentic LLM and ML inference pipelines as async FastAPI services on AWS ECS Fargate
- Build high-throughput AI backend services with per-request token budgeting, exponential-backoff retries, and dead-letter queues
- Instrument production AI systems with structured JSON logging, correlation-ID tracing, and prompt-hash response caching for cost control
- Integrate Amazon Bedrock and OpenAI models behind a unified, retry-safe client layer for multi-agent orchestration
Head of AI & Machine Learning (Co-Founder)
Jun 2023 - Present • ApplyPass • Remote, CA
- Built the entire AI backend from scratch in Python/FastAPI, handling 800K+ requests and 3B+ tokens per month against the OpenAI API
- Shipped a function-calling job classifier and a two-tower recommendation system, lifting job-match accuracy from 70% to 90%+
- Cut LLM inference cost ~95% by migrating GPT-4 workloads to a fine-tuned GPT-3.5 with no measurable quality loss
- Led the full migration of the AI backend from AWS ECS Fargate to DigitalOcean App Platform with Terraform IaC, GitHub Actions CI/CD, and a zero-downtime DNS cutover
- Built an MCP server and Cowork agent-plugin suite (OAuth 2.0 with PKCE) for natural-language platform operations
Staff AI Software Engineer
Sep 2022 - Aug 2025 • Verizon • Remote, CA
- Spearheaded an AI chatbot for contract analysis, achieving 90% accuracy in deviation detection
- Directed development of a Neo4j GraphRAG tool for legal contracts, reducing hallucinations in complex document analysis
- Fine-tuned Llama/Mistral models for insurance language extraction, reaching 90% accuracy and saving 5,000+ hours per quarter
- Created a GenAI competitive-intelligence chatbot that queries BigQuery via SQL, delivering a POC in 2 weeks and an MVP in 1 month
- Developed real-time speech-to-text with Whisper, reducing latency from 30s to 5s
Senior AI Software Engineer
Mar 2016 - Sep 2022 • KLA • Milpitas, CA
- Developed a data storage API with PostgreSQL and MinIO, improving write speed 4X and read speed 6X
- Created a multi-container Docker Compose app for DL inference, analyzing 10,000+ images per workload
- Led an SRGAN+CNN image classification team, improving SEM review throughput by 4X
Education
M.S. Materials Science and Engineering
2014 - 2016 • Stanford University
B.S. Physics
2010 - 2014 • University of California, Santa Barbara
Technical Skills
LLM & AI
OpenAI API, Amazon Bedrock, RAG, GraphRAG, Function Calling, Structured Outputs, Fine-tuning, Evals, MCP
ML Frameworks
PyTorch, TensorFlow, Scikit-learn, Hugging Face, ONNX
Infrastructure
FastAPI, AWS ECS Fargate, DigitalOcean, Docker, Terraform, GitHub Actions CI/CD
Data
PostgreSQL, pgvector, MongoDB, Redis, Neo4j, BigQuery