Staff AI/ML Engineer • Co-Founder

Patrick McBride

I build production-grade LLM systems that handle millions of requests. Currently leading AI at ApplyPass and shipping ML infrastructure at Verizon.

800K+

API requests/month

3B+

Tokens processed/month

9+

Years in ML

About Me

I'm a Staff AI/ML Engineer with a unique path: from physics and materials science to deep learning and production LLM systems.

At Verizon, I lead AI initiatives including contract analysis chatbots achieving 90% accuracy, GraphRAG systems for legal documents, and real-time speech-to-text reducing latency from 30s to 5s.

As Co-Founder and Head of AI at ApplyPass, I built the entire AI backend from scratch. It handles 800K requests/month and 3B tokens/month, and includes recommendation systems that boosted match accuracy from 30% to 90%.

I care deeply about systems that work in production, not just in notebooks. That means structured outputs, evaluation loops, and cost-conscious architecture.

LLMs

OpenAI API, RAG, GraphRAG, Function Calling, Structured Outputs, Fine-tuning

ML Frameworks

PyTorch, TensorFlow, Scikit-learn, Hugging Face

Infrastructure

FastAPI, AWS, Docker, Kubernetes, Terraform

Data

PostgreSQL, MongoDB, Redis, Neo4j, Milvus

Experience

Verizon

Staff AI Software Engineer

Sep 2022 - Present • Remote, CA

  • Spearheaded a comprehensive AI chatbot for contract analysis, achieving 90% accuracy in deviation detection
  • Directed Neo4j GraphRAG tool for long contracts, reducing hallucinations in legal document analysis
  • Fine-tuned Llama/Mistral models for insurance language extraction—90% accuracy, saving 5000+ hours/quarter
  • Created a GenAI competitive-intelligence chatbot that issues SQL queries to BigQuery; POC in 2 weeks, MVP in 1 month
  • Developed real-time speech-to-text with Whisper, reducing latency from 30s to 5s

ApplyPass

Head of AI & Machine Learning (Co-Founder)

June 2023 - Present • Remote, CA

  • Built entire AI backend with OpenAI API in Python/FastAPI—800K requests/month, 3B tokens/month
  • Created Job Classifier using function calling to convert listings to structured JSON
  • Migrated from GPT-4 to GPT-3.5, achieving comparable accuracy at 5% of the original cost
  • Fine-tuned a custom GPT-3.5 model, improving job filter accuracy from 80% to 95%
  • Developed Two-Tower recommendation system boosting match accuracy from 70% to 90%+

KLA

Senior AI Software Engineer

Mar 2016 - Sep 2022 • Milpitas, CA

  • Developed data storage API with PostgreSQL + MinIO—4X write speed, 6X read speed improvement
  • Created multi-container Docker Compose app for DL inference analyzing 10,000+ images per workload
  • Led SRGAN+CNN image classification team, improving SEM review throughput by 4X
  • Characterized GoogLeNet classification—reduced workloads from 10 hours to 30 minutes with 99%+ accuracy

Projects

Production

ApplyPass AI Backend

Production LLM system handling 800K+ requests/month and 3B tokens/month. Features job classification, answer generation, and a Two-Tower recommendation system.

FastAPI, OpenAI API, PostgreSQL, Embeddings

Enterprise

Contract Analysis GraphRAG

Neo4j-based GraphRAG system that parses legal contracts and identifies relationships between amendments and master agreements, reducing hallucinations.

Neo4j, LangChain, Python, GraphRAG

Open Source

OutcoGPT Interview Chatbot

Mock technical interview chatbot with speech input, real-time code grading, and generated audio responses, built on a Gradio interface.

ChatGPT, Gradio, Speech-to-Text, Python

Enterprise

Real-time Speech Transcription

OpenAI Whisper integration for agent-customer call transcription with speaker diarization, reducing latency from 30s to 5s.

Whisper, PyAnnote, FastAPI, Streaming