The honest range: $5k to $200k+

"It depends" is the consultant's hedge. So here are the brackets we see across our own pipeline and 12 competitors we benchmarked: $5k–$200k+ for the initial build, depending on tier and scope. Most production agents land between $25k and $80k for a first version.

That spread isn't because pricing is mysterious — it's because "an AI agent" can mean five very different things. Below: the three tiers we quote against, and what each actually gets you.

POC: $5k–$15k · 2–4 weeks

A working demo against your real data. Not production. Not multi-user. Not monitored. The point is to prove the agent can do the task at all — and to give you a real artifact to circulate internally before committing budget.

  • One workflow, single happy path
  • Runs on the developer's machine or a sandbox deploy
  • No auth, no logging, no rate limiting
  • Eval against ~20 sample inputs

What you get out: a Loom video, a Notion doc, a working agent that you can poke at. What you don't get: something you can put in front of customers.

MVP: $15k–$45k · 4–8 weeks

The first version actual humans can use. Production deploy, basic auth, logging, eval harness, one round of iteration on prompts and tools after seeing real usage.

  • Production deployment (Vercel, Fly, or your stack)
  • Auth, rate limiting, basic observability (Langfuse or similar)
  • Tool integrations: 2–4 real ones
  • Eval set of 100–200 inputs, regression-tested before deploys
  • Documentation and handoff
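
To make "regression-tested before deploys" concrete, here is a minimal sketch of a deploy gate. Everything in it is an assumption for illustration: the `EvalCase` shape, the substring-match scoring, the 2-point tolerance, and the `toy_agent` stand-in are hypothetical, and a real harness would likely score outputs more carefully.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    expected: str  # text the agent's answer must contain to count as a pass


def pass_rate(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Fraction of cases whose output contains the expected text (case-insensitive)."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(1 for c in cases if c.expected.lower() in agent(c.prompt).lower())
    return passed / len(cases)


def deploy_allowed(current: float, baseline: float, tolerance: float = 0.02) -> bool:
    """Allow the deploy unless the pass rate fell more than `tolerance` below baseline."""
    return current >= baseline - tolerance


# Hypothetical stand-in for the real agent call.
def toy_agent(prompt: str) -> str:
    if "refund" in prompt.lower():
        return "Refund approved"
    return "Escalating to a human"


cases = [
    EvalCase("Customer asks for a refund", "refund approved"),
    EvalCase("Customer reports a legal threat", "escalating"),
    EvalCase("Customer wants to change plan", "plan updated"),
]

rate = pass_rate(toy_agent, cases)
print(f"pass rate: {rate:.2f}")
print("deploy allowed:", deploy_allowed(rate, baseline=0.70))
```

In practice the baseline would be the pass rate of the last deployed version on the same 100–200 inputs, so a prompt or tool change that quietly breaks old cases gets caught before it ships.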

This is what 70% of our clients actually need. They don't realize it until they've tried to spec a "full system" and the scope balloons.

Production system: $45k–$200k+ · 8–16 weeks

Multi-tenant, multi-channel, real SLA. Custom data ingestion. Fine-tuned prompts. Eval pipelines. Human-in-the-loop for high-stakes outputs. Integrations across 5+ tools. Compliance work if you're in a regulated space (HIPAA, SOC 2).

This tier is where you stop calling it "the agent" and start calling it "the platform." Maintenance becomes a real line item.

Where the money goes

The number that surprises clients: model and prompt work is one of the smaller lines, at 10–20% of budget against 50–60% for integration.

Line item | % of budget
Integration engineering (APIs, webhooks, data plumbing) | 50–60%
Frontend / UX / dashboards | 10–15%
Prompts, tools, model selection, eval | 10–20%
Deployment, observability, security | 10–15%
PM + writing + docs | 5–10%
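
To see what those percentages mean in dollars, the sketch below applies them to a single budget. The $40k total is a hypothetical example, not a quote, and `split_budget` is just an illustrative helper name.

```python
# Percentage ranges copied from the budget breakdown above, as (low, high) shares.
SHARES = {
    "Integration engineering": (0.50, 0.60),
    "Frontend / UX / dashboards": (0.10, 0.15),
    "Prompts, tools, model selection, eval": (0.10, 0.20),
    "Deployment, observability, security": (0.10, 0.15),
    "PM + writing + docs": (0.05, 0.10),
}


def split_budget(total: int) -> dict[str, tuple[int, int]]:
    """Return the low and high dollar estimate for each line item."""
    return {
        item: (round(total * low), round(total * high))
        for item, (low, high) in SHARES.items()
    }


breakdown = split_budget(40_000)  # hypothetical MVP budget
for item, (low, high) in breakdown.items():
    print(f"{item:<40} ${low:>7,} to ${high:>7,}")
```

The ranges are independent estimates, so the low ends sum to 85% and the high ends to 120%. Treat the output as a sanity check on a quote, not an invoice.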

Monthly running cost (after launch)

Once it's live, you're paying for: model tokens, infra, observability tools, ongoing maintenance.

  • Low volume agent (under 5k runs/mo): $80–$300 model + $50–$150 infra, plus observability tooling: ~$200–$500/mo all-in
  • Medium volume (5k–50k runs/mo): $400–$2k model + $200–$500 infra, plus observability tooling: ~$800–$3k/mo all-in
  • High volume (50k+ runs/mo): negotiated rates with Anthropic/OpenAI, hosting on dedicated infrastructure. $3k–$20k+/mo depending on how token-heavy each run is
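
The model part of those numbers is simple arithmetic: runs times tokens per run times price per token. A rough sketch follows. The token counts and per-million-token prices are placeholder assumptions, not any vendor's actual rates, so substitute your provider's current pricing.

```python
def monthly_model_cost(
    runs: int,
    input_tokens_per_run: int,
    output_tokens_per_run: int,
    usd_per_m_input: float,
    usd_per_m_output: float,
) -> float:
    """Estimate monthly model spend in USD from run volume and token usage."""
    input_cost = runs * input_tokens_per_run / 1_000_000 * usd_per_m_input
    output_cost = runs * output_tokens_per_run / 1_000_000 * usd_per_m_output
    return input_cost + output_cost


def volume_band(runs: int) -> str:
    """Map monthly runs to the three volume bands listed above."""
    if runs < 5_000:
        return "low"
    if runs < 50_000:
        return "medium"
    return "high"


# Hypothetical workload: 4,000 runs/mo, 8k input and 1k output tokens per run,
# at placeholder prices of $3 and $15 per million tokens.
model_usd = monthly_model_cost(4_000, 8_000, 1_000, 3.0, 15.0)
print(volume_band(4_000), f"${model_usd:,.0f}/mo in model tokens")
```

With these placeholder inputs the estimate comes to $156/mo, inside the $80–$300 low-volume range. Doubling the input tokens per run pushes it to $252/mo, which is why token weight matters as much as run count.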

Plus a retainer if you want us (or another team) to keep optimizing: typically $3k–$8k/mo for 1 day/week of engineering.

Where you'll waste money

  1. Building before measuring. If you can't quantify the manual cost of the task today, you can't quantify the savings tomorrow. Measure for two weeks first.
  2. Multi-agent orchestration for solo tasks. CrewAI and friends are sexy. Most use cases need one agent with three tools, not five agents talking to each other.
  3. Buying a platform before having a use case. $15k/year for an AI-ops platform you'll use for one workflow. Start on a free tier.
  4. Over-engineering eval. A spreadsheet with 50 inputs/outputs is fine for v1. You don't need a full eval framework day one.
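
For the first item, the measurement you need is small: task volume, minutes per task, and a loaded hourly rate. A back-of-envelope sketch is below. Every figure in it is hypothetical, and `automation_share` (the fraction of the manual work the agent actually removes) is an assumption you would have to validate with a POC.

```python
from typing import Optional


def monthly_manual_cost(tasks_per_month: int, minutes_per_task: float, hourly_rate: float) -> float:
    """What the task costs today when done by hand, in USD per month."""
    return tasks_per_month * minutes_per_task / 60 * hourly_rate


def payback_months(
    build_cost: float,
    manual_cost: float,
    running_cost: float,
    automation_share: float,
) -> Optional[float]:
    """Months until the build pays for itself, or None if it never does."""
    net_savings = manual_cost * automation_share - running_cost
    if net_savings <= 0:
        return None
    return build_cost / net_savings


# Hypothetical numbers from two weeks of measurement, scaled to a month.
manual = monthly_manual_cost(tasks_per_month=2_000, minutes_per_task=6, hourly_rate=40)
months = payback_months(build_cost=30_000, manual_cost=manual, running_cost=500, automation_share=0.5)

print(f"manual cost: ${manual:,.0f}/mo")
print("payback:", "never" if months is None else f"{months:.1f} months")
```

If the function returns `None`, the agent costs more to run than it saves, and no amount of build quality fixes that.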

Most AI projects fail on integration, not on the model. Budget like that's true and you'll be right more often.