>_ Data Science & Agentic AI

Ben Robinson

Product data science, experimentation, and causal inference.
Production ML and agentic AI at Google. Evidence that drives product decisions.

Rigor in data science. Evidence through experimentation.

I work in applied ML at Google, with a background in platform security and abuse prevention and a current focus on product data science: how users adopt AI, what experiments show, and how measurement should inform roadmap and design. My work spans causal inference, A/B testing, production ML for growth, and agentic systems that fit real workflows, with clear standards for how we model, evaluate, and ship.

I stay close to the work: code, evaluation frameworks, and modeling decisions. I care about understanding how people integrate AI into their workflows, not just whether a model scores well offline. I introduced experimentation frameworks that became the standard for how we measure intervention impact. The goal is the right product decision, grounded in evidence. My working thesis is that models will commoditize and the best integrations will win, which is why I care as much about product judgment and roadmap as I do about the model itself.

I've worked across adtech, healthcare, media, and big tech. The constant: rigorous analysis, production-grade ML, and turning data into decisions Product and Engineering can act on.

Data Science, AI Agents, Product Data Science, Experimentation, Causal Inference, MLOps

Core areas

Data Science

Product-oriented data science: causal inference, experimentation, and measurement so product and engineering can make grounded decisions. The discipline is in how we define the question, choose the method, and interpret the result for the user and the business.

  • Causal inference & experimentation
  • Product metrics & impact sizing
  • Evaluation frameworks & metrics
  • From insight to the right decision

AI Agents

Designing and shipping AI agents that run in production, with clear evaluation and guardrails. I focus on how users actually integrate AI into their workflows: what they trust, where they intervene, and what experiments and telemetry say about adoption and outcomes.

  • Agent architecture & orchestration
  • Workflow integration & adoption
  • Evaluation in production
  • Product decisions from usage data

Product Data Science

Partnering with Product on experiments, metrics, and roadmap tradeoffs on high-volume platforms. Classification and graph-based models where they serve user and business outcomes, plus the measurement discipline to know whether a change worked.

  • A/B testing & experiment design
  • User behavior & adoption analysis
  • Graph ML & classification
  • Platform-scale deployment

MLOps & Evaluation

Modeling standards, evaluation frameworks, and review practices for production systems. How we define metrics, run experiments, monitor drift, and document decisions so models stay trustworthy in production.

  • Evaluation frameworks & metrics
  • A/B testing & experiment design
  • Monitoring & iteration
  • Documentation & reproducibility

Selected work

AI Agents, Production

Quota Allocation AI Agent, Google Cloud

Built and shipped an AI agent that automates quota allocation decisions, replacing a manual, vendor-dependent process. Defined evaluation and guardrails before rollout; eliminated vendor OpEx and reduced on-call burden by 90%. Rigorous measurement before scale was key.

Python, GCP

Data Science, Production

Quota Tier Redesign, Google Cloud

Partnered with Product to redesign quota tier structures using ML-driven controls and experimentation. Improved platform outcomes (85% reduction in unwanted activity) without impacting legitimate users. Required tight coordination across modeling, policy, and product on what “right” looks like for customers.

Python, SQL, Causal Inference

Data Science, Production

Medical Cost Reduction, Aetna (CVS Health)

Executed five data science projects delivering $9.5M in annualized medical claim cost reductions for Medicare. Production ML pipelines and causal inference within strict compliance. Owned technical direction across four data scientists; presented results to the Chief Actuary and executive stakeholders.

Python, SQL, Causal Inference

Experience

Product data science across industries. The through-line: experimentation, causal inference, and decisions that stick in production.

Applied ML Manager & Staff Applied ML Scientist

Google

2025 – Present

Own production ML systems and roadmap across customer growth, revenue, and access on Google Cloud. Partner with Product and Engineering on experiments, metrics, and how users adopt AI in their workflows. Background includes platform security and abuse prevention; current emphasis is product data science and evidence-based roadmap decisions.

  • ML systems that influence $300M+ in annual outcomes; experimentation and measurement applied consistently
  • Shipped an AI agent for quota allocation with defined evaluation and guardrails; eliminated vendor OpEx, 90% reduction in on-call toil
  • Partnered with Product on quota tier redesign; 85% reduction in unwanted activity without harming legitimate users
  • Causal inference and A/B testing frameworks adopted as org-wide standard

Senior Machine Learning Scientist

Google

2022 – 2025

ML and experimentation for GCP products; partnered with Product and Engineering on controls, metrics, and intervention design at scale. Introduced causal inference and experimentation as the standard for measuring what works.

  • Customer classification model: cut time to identify creditworthy customers from 60 days to 3; $50M+ revenue acceleration
  • Established causal inference and A/B testing frameworks for interventions; standard adopted across the broader org
  • Interim technical ownership during a 4-month leave period; maintained delivery on production ML roadmap

Data Scientist & Senior Data Scientist

Aetna (CVS Health)

2020 – 2022

Production ML and causal inference for Aetna Medicare, within strict compliance. Owned technical direction for four data scientists; rigorous evaluation and clear reporting to executive stakeholders.

  • $9.5M annualized medical cost reductions across five projects; tight cycles, clear metrics
  • Presented to Chief Actuary and executive stakeholders; clarity and accountability in reporting
  • Informed resource allocation across clinical and marketing; data science in service of the right decision

Data Scientist

Dotdash

2019 – 2020

Content recommendation and monetization: seasonality and text-similarity modeling at scale. Cross-functional work with Product and Ad Ops during a volatile period; focus on stable, interpretable results.

Data Analyst & Senior Data Analyst

AppNexus

2015 – 2019

Revenue prediction and distributed pipelines for real-time bidding and programmatic advertising. Data science in a high-throughput, production environment.

  • Revenue prediction models; ~$8M in annual seller revenue improvements
  • Distributed pipelines for high-volume real-time bidding

M.S., Mathematics

City College of New York

2019

B.A., Mathematics (Distinction)

Washington University in St. Louis

2010

Data to insight to the right decision, at speed and scale

For collaboration, questions, or a conversation about product data science, experimentation, causal inference, or AI in production, please reach out.