Project 001 · Latest
AI Agents · LLM Systems · Agentic Commerce

Agentic Checkout Optimization

A platform for autonomous agentic checkout testing. Hundreds of AI agents with varying personas stress-test purchase flows and optimize sites for agent-driven commerce.

Type: Original research / build
Timeline: 2025
Stack: LLM agents · Python · Web
Output: Optimization report + site copy
Phase 01
Execution Optimization
Hundreds of agents with varying personas and instructions visit a site and attempt to check out. The methodology is inspired by recent autoresearch techniques. When agents hit failure points in site design, instruction gaps, or functionality, a copy of the site is generated with targeted fixes, and the agents re-run against it to measure the improvement in execution rate.
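The run, measure, patch, re-run loop described above can be sketched in a few lines of Python. This is a minimal simulation under stated assumptions, not the platform's code: the failure-point names, their probabilities, and the functions `run_agent`, `run_batch`, `generate_site_copy`, and `optimize` are all hypothetical, and "generating a site copy" is reduced to removing one failure point from a dictionary.

```python
import random
from collections import Counter

# Hypothetical failure points with the chance an agent trips on each one.
# The names and probabilities are illustrative, not measured values.
BASELINE_SITE = {
    "forced_account_creation": 0.30,
    "blocking_upsell_modal": 0.20,
    "nonstandard_quantity_selector": 0.15,
}

def run_agent(site, rng):
    """Simulate one checkout attempt; return None on success, else the failure point hit."""
    for failure_point, probability in site.items():
        if rng.random() < probability:
            return failure_point
    return None

def run_batch(site, n_agents, seed):
    """Run n_agents attempts; return (execution_rate, Counter of failure points)."""
    rng = random.Random(seed)
    outcomes = [run_agent(site, rng) for _ in range(n_agents)]
    failures = Counter(o for o in outcomes if o is not None)
    execution_rate = outcomes.count(None) / n_agents
    return execution_rate, failures

def generate_site_copy(site, failure_point):
    """Stand-in for copy-site generation: return a copy without one failure point."""
    return {k: v for k, v in site.items() if k != failure_point}

def optimize(site, n_agents=200, max_rounds=5, seed=0):
    """Each round: measure, fix the most common failure, then re-run on the copy."""
    history = []
    for round_index in range(max_rounds):
        rate, failures = run_batch(site, n_agents, seed + round_index)
        history.append(rate)
        if not failures:
            break
        top_failure, _ = failures.most_common(1)[0]
        site = generate_site_copy(site, top_failure)
    return site, history

final_site, history = optimize(BASELINE_SITE)
print([f"{rate:.0%}" for rate in history])
```

Each entry in `history` is the execution rate for one round, so the list shows how the rate moves as fixes accumulate. In a real run the agents, the failure attribution, and the site-copy generation would each be far more involved than these stubs.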
Phase 02
Ranking Optimization
This phase evaluates the likelihood that an independent agent will purchase a specific product given a broad order (e.g. "order a comfortable pair of casual shoes"). Agents scrape text data as a key decision signal. Using a decision-compression framework with hesitation tracking and action confidence, an LLM assigns a purchase-likelihood metric. Dozens of site variants then test optimizations to improve how strongly agents prefer the product.
Live Demo Output

A real report from a water brand optimization run.

Below is a live scrollable excerpt of the optimization report generated for a DTC water company. The platform ran 200+ agents across two levels of optimization, identified execution failures, generated modified site copies, and produced ranked recommendations.

The left column shows quantitative agent performance metrics. The right column shows the actual recommendation output: the kind of artifact a brand or their dev team would use to improve agentic purchasability.

optimization_report.json — live output
Client: Clearflow
DTC water brand · Run: 2025-03-14 · Agents: 240
Level 01 — Execution Scores (Baseline)
Checkout Completion: 34%
Add to Cart: 61%
Product Found: 78%
Subscription Flow: 18%
Agent Run Matrix — Checkout Attempt (n=48)
A01 ✓   A02 ✗   A03 ✓   A04 ✗   A05 △
A06 ✓   A07 ✗   A08 ✗   A09 ✓   A10 △
A11 ✗   A12 ✓   A13 ✗   A14 ✓   A15 ✓
A16 ✗   A17 △   A18 ✗   A19 ✓   A20 ✗
✓ pass   ✗ fail   △ partial — showing first 20 of 48
Failure Clustering — Top Dropout Points
01. Checkout page requires account creation before purchase. Agents without persistent session state fail here (31% of failures).
02. Subscription upsell modal blocks cart review without a clear skip path. Agent hesitation spike detected (22% of failures).
03. Product quantity selector is non-standard UI. Agents default to 0 and cannot proceed (18% of failures).
04. Shipping address form has ambiguous field labeling. Agents frequently populate incorrect fields (14% of failures).
Level 01 — Post-Optimization Scores (Site Copy v2)
Checkout Completion: 81%
Add to Cart: 94%
Subscription Flow: 67%
Level 02 — Ranking Likelihood Scores
Clearflow (baseline): 28%
Competitor A: 44%
Competitor B: 18%
Clearflow (v2 copy): 71%
Recommendations — Site Text & Design
R1. Add explicit "ships same-day" copy above the fold. Agents heavily weight fulfillment speed signals in purchase decisions.
R2. Change the subscription framing from "Save 20%" to "Never run out". Agents respond more to scarcity/reliability framing than to discount language.
R3. Add structured ingredient/source data in a machine-readable format. LLM agents parse product metadata as a key trust signal.
R4. Remove the interstitial newsletter capture before checkout. It increases agent session abandonment by 2.3x.
R5. Add clear quantity-select affordances with semantic HTML labeling for non-visual agent parsing.
Final Verdict
The post-optimization site copy shows a 2.4× improvement in checkout completion (34% to 81%) and a 71% agent purchase likelihood, up from a 28% baseline. Recommend shipping the v2 copy to production with A/B validation against a human traffic cohort. The full modification list is in the appendix.
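The headline figures in the verdict can be checked directly from the scores in the report. The short script below is only an arithmetic check: the numbers are copied from the excerpt above, and the helper name `uplift` is an assumption introduced here for illustration.

```python
# Scores copied from the report excerpt, in percent.
baseline = {"Checkout Completion": 34, "Add to Cart": 61, "Subscription Flow": 18}
post_optimization = {"Checkout Completion": 81, "Add to Cart": 94, "Subscription Flow": 67}

def uplift(before, after):
    """Return (ratio, percentage-point change) between two percent scores."""
    return after / before, after - before

for metric, before in baseline.items():
    after = post_optimization[metric]
    ratio, points = uplift(before, after)
    print(f"{metric}: {before}% -> {after}% ({ratio:.1f}x, +{points} points)")

# Level 02 ranking likelihood: Clearflow baseline vs. Clearflow v2 copy.
ranking_ratio, ranking_points = uplift(28, 71)
print(f"Ranking likelihood: 28% -> 71% ({ranking_ratio:.1f}x, +{ranking_points} points)")
```

Checkout completion comes out at 81 / 34 ≈ 2.4×, which matches the verdict. Reporting the percentage-point change alongside the ratio helps when baselines differ a lot: subscription flow has the largest ratio largely because it started from the lowest baseline.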

Built on decision-compression and hesitation tracking.

The core insight behind this project is that agentic purchasing is fundamentally different from human purchasing. Agents parse structured signals, respond to semantic clarity over emotional framing, and fail at UI patterns humans navigate intuitively.
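One concrete form of "semantic clarity" is whether every form control has a machine-readable label, which the report's findings on ambiguous address fields and the quantity selector both touch on. The sketch below is a hypothetical pre-flight check, not part of the platform: it uses Python's standard `html.parser` to list controls that have neither a matching `<label for="...">` nor an `aria-label`. The class name `LabelAudit`, the function `audit`, and the sample form are assumptions for illustration.

```python
from html.parser import HTMLParser

class LabelAudit(HTMLParser):
    """Collect form controls and the ids referenced by <label for="...">."""

    def __init__(self):
        super().__init__()
        self.labelled_ids = set()
        self.controls = []  # (tag, id, has_aria_label)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "label" and attributes.get("for"):
            self.labelled_ids.add(attributes["for"])
        elif tag in {"input", "select", "textarea"}:
            if attributes.get("type") == "hidden":
                return  # hidden inputs are not user-facing controls
            has_aria = bool(attributes.get("aria-label"))
            self.controls.append((tag, attributes.get("id"), has_aria))

    def unlabelled(self):
        """Controls with no aria-label and no <label for> pointing at their id."""
        return [
            (tag, control_id)
            for tag, control_id, has_aria in self.controls
            if not has_aria and control_id not in self.labelled_ids
        ]

def audit(html):
    parser = LabelAudit()
    parser.feed(html)
    parser.close()
    return parser.unlabelled()

SAMPLE = """
<form>
  <label for="qty">Quantity</label>
  <input id="qty" type="number" min="1" value="1">
  <input id="addr1" type="text" placeholder="Line 1">
</form>
"""
print(audit(SAMPLE))  # [('input', 'addr1')]
```

The check is deliberately narrow. It does not recognise controls wrapped inside a `<label>` element or labelled through `aria-labelledby`, so a fuller audit would need to handle those cases as well.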

The decision-compression framework tracks two signals: hesitation (how often an agent reconsiders before acting) and action confidence (the probability weight assigned to each click). An LLM layered on top of the agents synthesizes these into a purchase-likelihood metric, enabling direct comparison across site variants.
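To make the two signals concrete, here is a minimal numeric sketch of how they could be combined into a single score. It is an assumption made for illustration, not the project's method: in the project an LLM performs the synthesis, whereas this stand-in uses a simple heuristic (the geometric mean of per-step confidence, discounted by the average number of reconsiderations). The names `Step`, `purchase_likelihood`, and `hesitation_penalty` are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Step:
    action: str
    confidence: float      # probability weight the agent gave the chosen action (0 to 1)
    reconsiderations: int  # times the agent revisited the decision before acting

def purchase_likelihood(steps, hesitation_penalty=0.1):
    """Heuristic stand-in for the LLM synthesis step (assumed, not the project's formula).

    Takes the geometric mean of per-step confidence, then discounts it
    exponentially by the mean number of reconsiderations per step.
    """
    if not steps:
        return 0.0
    # Clamp confidence away from zero so the logarithm is always defined.
    log_sum = sum(math.log(max(step.confidence, 1e-9)) for step in steps)
    mean_confidence = math.exp(log_sum / len(steps))
    mean_hesitation = sum(step.reconsiderations for step in steps) / len(steps)
    return mean_confidence * math.exp(-hesitation_penalty * mean_hesitation)

# Two illustrative traces for the same product on different site variants.
decisive = [Step("open_product", 0.90, 0), Step("add_to_cart", 0.85, 0), Step("checkout", 0.80, 1)]
hesitant = [Step("open_product", 0.60, 2), Step("add_to_cart", 0.50, 3), Step("checkout", 0.40, 2)]

print(round(purchase_likelihood(decisive), 3), round(purchase_likelihood(hesitant), 3))
```

A geometric mean is used so that a single low-confidence step pulls the score down more than an arithmetic mean would, which roughly mirrors how one confusing page can sink an otherwise smooth purchase flow. Any real scoring would need calibration against observed agent purchases.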

Inspired by the autoresearch methodology, the execution layer generates modified site copies algorithmically rather than manually, so the optimization loop runs autonomously once a site is submitted.