Brayden Watkins

01 · Index 2026

Brayden Watkins.

Expert at using AI to ship software. Six products in five months. Full-stack, in production, tested by hand.

02 · Manifesto

The next decade of software gets written by people talking to machines. I’m an expert at that.

Hire me and you get an engineer who ships 3-5x faster using Claude Code, Codex CLI, and Gemini CLI, and who writes the verifier suites that catch the bugs the model would otherwise have shipped into your production system. Building with AI and breaking AI are skills that compound.

Day job: at DataAnnotation I red-team coding agents directly. I design adversarial tasks that expose where Claude Code, Codex CLI, and Gemini CLI fail, and I write the reference reasoning that calibrates how grader models score future attempts. Public benchmark, real run data, methodology you can read.

Available now · Remote-first US · $50-150/hr · Phoenix metro on-site fine

03 · Selected Work

Why hire me?
Run the code yourself.

Five artifacts, each pointed at a specific hiring question. The Coding-Agent Shootout is live. The essays and the production SaaS are in development. The open-source tool and the OSS trail are next on the docket. Open any of them. Run the code yourself.


I · 01 / 05 · Live

The Benchmark

Coding-Agent Shootout · Public eval framework

A public benchmark comparing the major coding agents (Claude Code, Codex CLI, Gemini CLI, Cursor agent, Cline) on real engineering tasks. Each task ships with an executable Jest verifier suite. Pass means the generated code compiles, runs, and matches expected behavior. Monthly re-runs with statistical confidence intervals. Designed to be cited.
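One way to put a confidence interval on a pass rate is the Wilson score interval, sketched below. This is an illustration only: the source does not say which interval method the benchmark uses, and the names `Interval` and `wilsonInterval` are hypothetical.

```typescript
// Hypothetical helper: Wilson score interval for a binomial pass rate.
// Assumption: each test is treated as an independent pass/fail trial.
interface Interval {
  lower: number;
  upper: number;
}

// z = 1.96 corresponds to a two-sided 95% confidence level.
function wilsonInterval(passed: number, total: number, z = 1.96): Interval {
  if (
    !Number.isInteger(passed) ||
    !Number.isInteger(total) ||
    total <= 0 ||
    passed < 0 ||
    passed > total
  ) {
    throw new RangeError("expected integers with 0 <= passed <= total and total > 0");
  }
  const p = passed / total; // observed pass rate
  const z2 = z * z;
  const denom = 1 + z2 / total;
  // The center of the interval is pulled slightly toward 0.5.
  const center = (p + z2 / (2 * total)) / denom;
  const half =
    (z / denom) * Math.sqrt((p * (1 - p)) / total + z2 / (4 * total * total));
  // Clamp to [0, 1] to absorb floating-point drift at the extremes.
  return {
    lower: Math.max(0, center - half),
    upper: Math.min(1, center + half),
  };
}

// Example: 136 and 134 passes out of 138 tests (illustrative counts).
const a = wilsonInterval(136, 138);
const b = wilsonInterval(134, 138);
console.log(a, b, a.lower < b.upper ? "intervals overlap" : "intervals are disjoint");
```

The counts in the example are an assumption: they are what per-test pass rates of 98.6% and 97.1% on 138 tests would work out to, not figures published here. At that sample size the two intervals overlap, which is why repeated runs matter before calling a winner.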

Jest · Node.js · Coding agent CLIs

Live · 8 tasks · 138 tests · Opus 4.7 (98.6%) > Sonnet 4.6 (97.1%)

II · 02 / 05 · Coming Soon

The Tool

Open-source AI dev tool

An open-source tool that real AI engineers install and keep installed. Solves one real friction in the daily agent-driven workflow. Featured in the MCP marketplace. When someone asks AI Twitter what they should use for X, this is the answer.

MCP · TypeScript · Node.js

Target · 500+ stars · weekly downloads

Coming soon
III · 03 / 05 · In Development

The Essays

Technical writing series

Twelve to fifteen deep posts on coding-agent failure modes, eval design philosophy, agent orchestration patterns, and comparison studies. Several are engineered to hit the Hacker News front page or trend on AI Twitter. Authority you cannot fake.

Writing · Distribution

Target · 12+ posts · 50K+ cumulative views

Coming soon
IV · 04 / 05 · In Development

The Production Build

Founder · AI-built SaaS

A real AI-powered SaaS shipped to paying customers. Built almost entirely with Claude Code, Codex CLI, and Gemini CLI orchestrated in tmux. Public revenue and usage metrics page. The proof that the AI workflow ships to market, not just to GitHub.

Remix · Supabase · Stripe · Claude API

Target · first paying customer · public metrics

Coming soon
V · 05 / 05 · Coming Soon

The OSS Trail

Contributions to AI dev tools

Twenty-five plus merged PRs across the AI tooling ecosystem (Cline, Aider, Continue, Vercel AI SDK, Cursor extensions, MCP servers). Recognized contributor. Maintainer of at least one feature on a major repo. The Zed-model back-door hire pipeline.

Cline · Aider · Continue · Vercel AI SDK

Target · 25+ merged PRs

Coming soon
05 · Where I work

The day-to-day.

DataAnnotation

AI Trainer & Evaluator

April 2026 · Present

  • Design adversarial tasks targeting coding agents (Claude Code, Codex CLI, Gemini CLI) to expose ambiguity errors and instruction-following gaps
  • Write golden chain-of-thought reference reasoning that calibrates how AI grader models score responses
  • Rate and compare AI-generated responses across pairwise and direct rating tasks
  • Passed coding qualifications: agent operation, accounting, AWS, Docker tooling

Solo Founder

AI Engineer

January 2026 · Present

  • Ship full-stack LLM applications using Claude Code, Codex CLI, and Gemini CLI orchestrated in tmux
  • Build production Claude API integrations (Sonnet, Opus, Haiku) for live products
  • Implement REST APIs, webhooks, Stripe billing, and data pipelines on PostgreSQL and Supabase
  • Deploy to Vercel and Oracle Cloud ARM VMs; containerize with Docker
06 · Contact

Hire me.
Today.

Available immediately for remote AI engineering work. $50-150/hr contract or $90-180K base W2. Full-time, contract, or part-time. Phoenix metro on-site is also fine. Email gets the fastest response.