Most AI chatbots cap at 3–5 tool calls per conversation turn. ChatGPT does. Claude.ai does. The shiny new agent demos you saw on Twitter last week? They do too. Hit the cap, the model stops, hands back a partial result, and asks you to continue.

Foreman AI — the construction agent built into Cornerstone PM™ — chains 75 actions per turn. One prompt can spin up an entire design center category: create the OptionClasses, seed every attribute value, attach pricing modifiers, set tier access, lock buyer compatibility rules, and confirm. All in a single conversation turn. All in under a minute.

People ask us how. The short answer is we control our own agentic loop. The long answer is four pieces of architecture stacked on top of each other. Here's the breakdown.

TL;DR

We wrote our own tool-execution loop in our app code. We set the iteration cap.
A 24,500-word construction knowledge base keeps the model from drifting.
Every skill is a typed function with input validation, not generated code.
Skills hit Postgres directly — no HTTP round-trip, no rate limit.

Result: 75 calls at ~50ms each = under 4 seconds of real execution. The model spends most of the turn thinking, not waiting.

1. We control the loop

When you chat with ChatGPT or Claude.ai, you're talking to a generic agentic loop that lives on OpenAI's or Anthropic's servers. They send the model a prompt, get back a tool call request, execute it, feed the result back, and iterate. That loop has a conservative cap — usually 3 to 5 rounds — because OpenAI and Anthropic are protecting their own infrastructure and the general public from runaway agents.

That cap is set by the platform, not by the model's capability. The model is perfectly capable of chaining more. The cap is a policy decision.

We host Foreman AI's loop ourselves. It lives at app/api/agent/route.ts in the Cornerstone codebase. It calls Anthropic's Messages API to get the model's next step, executes the requested skill against our own database, feeds the result back as the next message, and loops — up to 75 rounds.

The simplified loop

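// app/api/agent/route.ts (simplified)
// anthropic, MODEL, systemPrompt, tools, userMessages, and execSkill are defined elsewhere.
const MAX_ITERATIONS = 75;
const messages = [...userMessages];

for (let i = 0; i < MAX_ITERATIONS; i++) {
  const response = await anthropic.messages.create({
    model: MODEL,
    max_tokens: 4096, // required by the API; the value here is a placeholder
    system: systemPrompt, // the system prompt is a top-level parameter, not a message
    messages,
    tools,
  });

  // Any stop reason other than "tool_use" (e.g. "end_turn") ends the loop
  if (response.stop_reason !== "tool_use") break;

  // Execute every tool call the model just requested.
  // execSkill returns a { type: "tool_result", tool_use_id, content } block.
  const toolUses = response.content.filter((block) => block.type === "tool_use");
  const toolResults = await Promise.all(toolUses.map(execSkill));

  // Feed the assistant turn and the tool results back as the next messages
  messages.push(
    { role: "assistant", content: response.content },
    { role: "user", content: toolResults }
  );
}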

That's it. The model decides when to stop (end_turn); we just give it more rope than the public chat apps do.
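One thing the simplified loop leaves open is what happens when the cap runs out while the model still wants tools. The sketch below is one way to make that outcome explicit. It is not taken from the Cornerstone codebase: every type and function name in it is hypothetical, and the model and skill runner are synchronous stubs so the example runs without network access (a real loop would await the API).

```typescript
// Hypothetical shapes; none of these names come from the Cornerstone codebase.
type ToolCall = { id: string; name: string; input: unknown };
type ModelStep =
  | { stopReason: "end_turn"; text: string }
  | { stopReason: "tool_use"; calls: ToolCall[] };
type ToolResult = { toolUseId: string; content: string };
type LoopOutcome = {
  status: "completed" | "cap_reached";
  iterations: number;
  toolCalls: number;
};

// Runs until the model stops asking for tools or the cap is hit.
// `nextStep` stands in for the Messages API call, `execSkill` for the skill runner.
function runAgentLoop(
  nextStep: (history: ToolResult[]) => ModelStep,
  execSkill: (call: ToolCall) => ToolResult,
  maxIterations: number
): LoopOutcome {
  const history: ToolResult[] = [];
  let toolCalls = 0;

  for (let i = 0; i < maxIterations; i++) {
    const step = nextStep(history);
    if (step.stopReason === "end_turn") {
      return { status: "completed", iterations: i + 1, toolCalls };
    }
    for (const call of step.calls) {
      history.push(execSkill(call));
      toolCalls++;
    }
  }
  // The cap ran out while the model still wanted tools: report it explicitly
  // so the caller can tell a finished job from a truncated one.
  return { status: "cap_reached", iterations: maxIterations, toolCalls };
}

// Stub model: requests one tool call per round until it has seen `needed` results.
const stubModel = (needed: number) => (history: ToolResult[]): ModelStep =>
  history.length >= needed
    ? { stopReason: "end_turn", text: "done" }
    : {
        stopReason: "tool_use",
        calls: [{ id: `call_${history.length}`, name: "noop", input: {} }],
      };

const stubSkill = (call: ToolCall): ToolResult => ({ toolUseId: call.id, content: "ok" });

const finished = runAgentLoop(stubModel(10), stubSkill, 75); // job needs 10 calls, cap is 75
const truncated = runAgentLoop(stubModel(10), stubSkill, 5); // same job, cap of 5
```

With a cap of 75 the ten-call job completes on its own; with a cap of 5 the same job comes back marked `cap_reached`, which is the situation the low caps on public chat apps produce.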

Why ChatGPT doesn't do this: they can't. They serve millions of arbitrary users with arbitrary tools. A 75-iteration cap on a public agent means runaway loops, runaway bills, and runaway hallucinations. Conservative limits are the right call for a general-purpose assistant.

But we're not general-purpose. We're construction. Which gets us to the second piece.

2. A 24,500-word knowledge base keeps the model on track

Here's the dirty secret of generic agentic AI: it goes off the rails after a few steps. It hallucinates a skill name. It passes the wrong parameters. It tries to call a tool that doesn't exist. It loops infinitely calling something that returned an empty array.

That's why the public agents cap at 3–5. Not because the model can't chain more calls, but because without domain knowledge it can't chain more calls reliably.

Foreman ships with three memory files baked into its system prompt:

📖 App Knowledge (158 KB · 2,300+ lines): Every feature, workflow, role, and data model in Cornerstone PM.
📐 Estimating Formulas (8 KB · 39 formulas): Drywall coverage, roofing squares, paint sqft, lumber takeoffs, concrete volume.
🎯 Prompt Library (16 KB · 10 categories): Pre-built setup recipes for the design center. Copy, swap vendors, run.

That's 24,500 words of construction expertise in the model's context before a single user message lands. Foreman doesn't have to guess what a spec level is, how design center selections cascade, or which skill to call to add a pricing modifier to a Quartz attribute group. It knows.

More context = more reliable behavior = safe to allow more iterations. The knowledge base is what makes the 75-action loop tractable. Without it, 75 iterations of generic Claude would be 75 iterations of slowly drifting nonsense.
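As a rough illustration of what "baked into the system prompt" can mean in practice, here is a minimal sketch that joins a preamble and a set of memory files into one prompt string and counts its words. The function names, the preamble, and the file bodies are all placeholders invented for this example; only the three file titles come from the post.

```typescript
// Hypothetical shape for one memory file; the titles mirror the three files named above.
type MemoryFile = { title: string; body: string };

// Whitespace-delimited word count, good enough for a rough context budget.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Joins a short preamble and every memory file into one system prompt,
// with a heading per file so the model can tell the sources apart.
function buildSystemPrompt(preamble: string, files: MemoryFile[]): string {
  const sections = files.map((f) => `## ${f.title}\n\n${f.body.trim()}`);
  return [preamble.trim(), ...sections].join("\n\n");
}

// Placeholder bodies; real files would be loaded once at startup.
const memoryFiles: MemoryFile[] = [
  { title: "App Knowledge", body: "placeholder app knowledge text" },
  { title: "Estimating Formulas", body: "placeholder formula text" },
  { title: "Prompt Library", body: "placeholder recipe text" },
];

const systemPrompt = buildSystemPrompt("You are Foreman AI.", memoryFiles);
const promptWords = countWords(systemPrompt);
```

The resulting string would be passed as the system prompt on every iteration of the loop, so its size is paid for on each model call. That is the trade being made: more tokens per call in exchange for fewer wrong calls.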

3. Every skill is a typed function, not generated code

There's a tempting alternative architecture floating around right now: have the model write code, then execute the code in a sandbox. “Code is the universal tool.” You've seen the demos.

That approach has the same problem as a generic agentic loop. The model can hallucinate function names. It can write subtly wrong logic. It can misinterpret a schema and corrupt 200 rows of your production database before you notice.

Foreman doesn't do that. Every one of its 396 skills is a hardcoded TypeScript function in our codebase with input validation, typed parameters, and structured error handling. The model isn't generating SQL. The model is choosing from a menu.

A real skill, simplified

// skills/design-center/createMultipleAttributeValues.ts
// AttrValueSchema (Zod), assertOrgAccess, and prisma are imported from elsewhere in the codebase.
export async function createMultipleAttributeValues(
  input: {
    optionClassId: string;
    attributeGroupId: string;
    values: Array<{ name: string; description?: string; priceModifier?: number }>;
  },
  ctx: { userOrgId: string } // request context carrying the caller's org
) {
  // 1. Validate input shape (Zod schema); throws on invalid input
  const parsed = AttrValueSchema.parse(input);

  // 2. Verify ownership against the user's org
  await assertOrgAccess(parsed.optionClassId, ctx.userOrgId);

  // 3. Hit Postgres directly via Prisma
  const created = await prisma.attributeValue.createMany({
    data: parsed.values.map(v => ({ ...v, attributeGroupId: parsed.attributeGroupId })),
  });

  return { success: true, count: created.count };
}

Typed inputs. Org-level ACL. Direct DB write. Returns a structured result the next loop iteration can reason about. Multiply this by 396.

When the model calls createMultipleAttributeValues, it's not asking us to generate code — it's passing structured JSON to a function that's been audited, type-checked, and battle-tested in production. Deterministic. Safe. Repeatable.

That's why we can let the loop run 75 times without losing sleep. Worst case, a skill fails its input validation and returns a typed error message that the model reads and adjusts. There's no “the AI wrote DELETE FROM homes and now we have a problem.”
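The "menu" idea and the typed-error path can be sketched as a small registry: the model supplies a skill name and raw JSON, and the dispatcher either runs a known, validated function or returns a structured error. This is an illustrative sketch only. The registry, the error codes, and the `addAttributeValues` skill are hypothetical, and a hand-written validator stands in for a Zod schema so the example has no dependencies.

```typescript
type SkillError = { code: "UNKNOWN_SKILL" | "INVALID_INPUT"; message: string };
type SkillResult =
  | { success: true; data: unknown }
  | { success: false; error: SkillError };
type SkillHandler = (raw: unknown) => SkillResult;

// Wraps a validator and an implementation into one handler. Validation
// failures become structured errors instead of uncaught exceptions.
function defineSkill<I>(parse: (raw: unknown) => I, run: (input: I) => unknown): SkillHandler {
  return (raw) => {
    let input: I;
    try {
      input = parse(raw);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { success: false, error: { code: "INVALID_INPUT", message } };
    }
    return { success: true, data: run(input) };
  };
}

// Hand-written validator standing in for a Zod schema.
function parseAddValues(raw: unknown): { groupId: string; names: string[] } {
  if (typeof raw !== "object" || raw === null) throw new Error("input must be an object");
  const { groupId, names } = raw as { groupId?: unknown; names?: unknown };
  if (typeof groupId !== "string" || groupId === "") {
    throw new Error("groupId must be a non-empty string");
  }
  if (!Array.isArray(names) || names.length === 0 || !names.every((n) => typeof n === "string")) {
    throw new Error("names must be a non-empty array of strings");
  }
  return { groupId, names: names as string[] };
}

// The menu: the model can only reach what is registered here.
const registry = new Map<string, SkillHandler>([
  ["addAttributeValues", defineSkill(parseAddValues, (input) => ({ count: input.names.length }))],
]);

function dispatch(name: string, raw: unknown): SkillResult {
  const handler = registry.get(name);
  if (!handler) {
    return { success: false, error: { code: "UNKNOWN_SKILL", message: `No skill named "${name}"` } };
  }
  return handler(raw);
}

const ok = dispatch("addAttributeValues", { groupId: "grp_1", names: ["Matte", "Polished"] });
const badInput = dispatch("addAttributeValues", { groupId: "grp_1", names: [] });
const unknownSkill = dispatch("dropAllTables", {});
```

A hallucinated skill name and a malformed payload both come back as ordinary results the next iteration can read, and neither one reaches the database.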

4. Direct database access (no HTTP middleman)

Most AI agents you've seen wired into a SaaS product look like this:

Model calls a tool
Tool makes an HTTP request to the SaaS REST API
The REST API authenticates the request
The REST API talks to its own database
Response comes back through the same chain

Every one of those hops adds latency (50–500ms per call), introduces a rate limit, and burns an API key quota. Multiply by 75 calls and the tool round-trips alone add up to somewhere between 4 and 38 seconds of wall-clock time, before any rate-limit backoff or retries.

Foreman's skills run inside the same Next.js server as the rest of Cornerstone PM. They import Prisma directly. They hit Postgres directly. They inherit the user's session and org permissions automatically.
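To make the "inherits the session" point concrete, here is a minimal sketch of an in-process skill that receives the caller's context as a plain argument and checks org ownership before writing. It assumes a lot: the types, the `renameOptionClass` skill, and the sample rows are invented, an in-memory map stands in for Postgres, and this `assertOrgAccess` is a synchronous guess at what such a helper might do rather than the real one.

```typescript
// Hypothetical types; the post only shows the assertOrgAccess call site.
type SkillContext = { userId: string; userOrgId: string };
type OptionClassRow = { id: string; orgId: string; name: string };

// In-memory stand-in for the option class table in Postgres.
const optionClassTable = new Map<string, OptionClassRow>([
  ["oc_1", { id: "oc_1", orgId: "org_a", name: "Countertops" }],
  ["oc_2", { id: "oc_2", orgId: "org_b", name: "Cabinets" }],
]);

// Throws unless the record exists and belongs to the caller's org.
// Missing and foreign records produce the same error so ids cannot be probed.
function assertOrgAccess(optionClassId: string, userOrgId: string): OptionClassRow {
  const row = optionClassTable.get(optionClassId);
  if (!row || row.orgId !== userOrgId) {
    throw new Error(`Option class ${optionClassId} not found for this org`);
  }
  return row;
}

// The skill gets the context as an argument: no token exchange, no HTTP hop.
function renameOptionClass(ctx: SkillContext, optionClassId: string, name: string): OptionClassRow {
  const row = assertOrgAccess(optionClassId, ctx.userOrgId);
  const updated = { ...row, name };
  optionClassTable.set(optionClassId, updated);
  return updated;
}

const ctxA: SkillContext = { userId: "user_1", userOrgId: "org_a" };
const renamed = renameOptionClass(ctxA, "oc_1", "Kitchen Countertops");

// A call against another org's record is rejected before any write happens.
let crossOrgBlocked = false;
try {
  renameOptionClass(ctxA, "oc_2", "Hijacked");
} catch (_err) {
  crossOrgBlocked = true;
}
```

Because the check runs inside the skill, the model never sees or supplies the org id. It comes from the session, which is what keeps a 75-call loop inside one tenant's data.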

The latency math

Generic HTTP-tool agent

~300ms per call (HTTP + auth + DB)
75 calls = ~22 seconds
Plus rate limits, plus API keys

Foreman (direct Prisma)

~50ms per call (Prisma + Postgres)
75 calls = under 4 seconds
No external rate limits, no key juggling

The model's thinking time is the bottleneck, not the tool execution. Foreman feels instant because it is instant on the execution side.
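The arithmetic behind the two columns is simple enough to check directly. The helper below is a throwaway illustration (its name is made up) and assumes calls run one after another at the per-call figures quoted above.

```typescript
// Total tool-execution time in seconds for sequential calls.
function totalToolLatencySeconds(calls: number, msPerCall: number): number {
  return (calls * msPerCall) / 1000;
}

const httpAgentSeconds = totalToolLatencySeconds(75, 300); // 22.5
const directSeconds = totalToolLatencySeconds(75, 50); // 3.75
const speedup = httpAgentSeconds / directSeconds; // 6
```

These are upper bounds for a sequential run. When the model requests several tool calls in one round and they execute concurrently, wall-clock time per round is closer to the slowest call than to the sum.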

Why this matters for builders

Stack the four pieces together and you get something most construction software buyers haven't seen before:

What 75 actions looks like in practice

You type: “Hey Foreman, set up Countertops with three material types: Granite, Quartz, and Laminate. Create separate OptionClasses for each. Add 5 options per class. Create attributes with real brand names. Set tier access so budget is Standard, premium is Upgrade III+.”

Foreman makes 60+ skill calls in one turn: 3 OptionClasses created, 15 options seeded, 30+ attribute values added across each class's attribute groups, pricing modifiers applied per value, tier access locked, compatibility rules saved, summary returned. Total wall time: under 90 seconds. Manual equivalent: 4 hours of clicking.

This is the difference between an AI that helps you do work and an AI that does the work for you. A 3-call cap means “summarize this home’s budget.” A 75-call cap means “rebuild my entire design center from scratch using the Bayshore vendor list.”

That's why our marketing copy says Foreman doesn't talk — it builds. It's not a slogan. It's the architecture.

Where this goes next

We'll keep raising the cap. Today it's 75. The knowledge base keeps growing. The skill registry keeps growing — we crossed 396 skills across 20 categories last week, with more shipping every release. As both grow, the ceiling moves with them.

The four-part recipe doesn't change: own the loop, ground the model in domain knowledge, give it typed tools, talk to your database directly. That's what an agent looks like when it's built for a specific industry instead of bolted onto a generic chatbot.

Want to see the 75-action loop in action?

Cornerstone PM™ Beta access is free for the first 100 builders. Foreman AI lives on the Pro plan ($499/mo flat, up to 30 users).


How We Let Foreman AI Chain 75 Actions in One Prompt (And Why Your AI Can't)