Patrick McBride

AI/ML Infrastructure Engineer

Summary

AI/ML infrastructure engineer with 9+ years building production-grade machine learning systems. Specialized in LLM integration, embedding systems, and scalable, cost-aware ML infrastructure. Currently building backend AI infrastructure at Bill.com and serving as co-founder and Head of AI at ApplyPass, where I built systems handling 800K+ API requests and 3B+ tokens per month.

Experience

Senior Software Engineer, Backend AI Infrastructure

Aug 2025 - Present

Bill.com, Remote, CA

  • Design and operate agentic LLM and ML inference pipelines as async FastAPI services on AWS ECS Fargate
  • Build high-throughput AI backend services with per-request token budgeting, exponential-backoff retries, and dead-letter queues
  • Instrument production AI systems with structured JSON logging, correlation-ID tracing, and prompt-hash response caching for cost control
  • Integrate Amazon Bedrock and OpenAI models behind a unified, retry-safe client layer for multi-agent orchestration

Head of AI & Machine Learning (Co-Founder)

Jun 2023 - Present

ApplyPass, Remote, CA

  • Built the entire AI backend from scratch in Python/FastAPI, serving 800K+ requests and 3B+ tokens per month against the OpenAI API
  • Shipped a function-calling job classifier and a two-tower recommendation system, lifting job-match accuracy from 70% to 90%+
  • Cut LLM inference cost ~95% by migrating GPT-4 workloads to a fine-tuned GPT-3.5 with no measurable quality loss
  • Led the full migration of the AI backend from AWS ECS Fargate to DigitalOcean App Platform with Terraform IaC, GitHub Actions CI/CD, and a zero-downtime DNS cutover
  • Built an MCP server and Cowork agent-plugin suite (OAuth2 PKCE) for natural-language platform operations

Staff AI Software Engineer

Sep 2022 - Aug 2025

Verizon, Remote, CA

  • Spearheaded an AI chatbot for contract analysis achieving 90% accuracy in deviation detection
  • Directed a Neo4j GraphRAG tool for legal contracts, reducing hallucinations in complex document analysis
  • Fine-tuned Llama/Mistral models for insurance language extraction, reaching 90% accuracy and saving 5,000+ hours/quarter
  • Created a GenAI competitive-intel chatbot with SQL queries to BigQuery — POC in 2 weeks, MVP in 1 month
  • Developed real-time speech-to-text with Whisper, reducing latency from 30s to 5s

Senior AI Software Engineer

Mar 2016 - Sep 2022

KLA, Milpitas, CA

  • Developed a data storage API with PostgreSQL + MinIO, improving write speed 4X and read speed 6X
  • Created a multi-container Docker Compose app for DL inference analyzing 10,000+ images per workload
  • Led an SRGAN+CNN image classification team, improving SEM review throughput by 4X

Education

M.S. Materials Science and Engineering

2014 - 2016

Stanford University

B.S. Physics

2010 - 2014

University of California, Santa Barbara

Technical Skills

LLM & AI

OpenAI API, Amazon Bedrock, RAG, GraphRAG, Function Calling, Structured Outputs, Fine-tuning, Evals, MCP

ML Frameworks

PyTorch, TensorFlow, Scikit-learn, Hugging Face, ONNX

Infrastructure

FastAPI, AWS ECS Fargate, DigitalOcean, Docker, Terraform, GitHub Actions CI/CD

Data

PostgreSQL, pgvector, MongoDB, Redis, Neo4j, BigQuery