<h1>How I Run 6 Business Functions With One Multi-Agent AI System</h1>
<p>I run six business functions with a single multi-agent AI system: content production, stock research, operations, e-commerce, marketing, and sales. No employees. No freelancers for core operations. One orchestrator routes every task to the right specialized agent, and those agents coordinate through a checkpoint system that catches errors before they reach clients.</p>
<p>This is not a concept deck. This is the system I built and operate daily. I am going to walk you through the architecture, each agent's role, how they coordinate, and what the numbers look like after six months of production use.</p>
<h2 id="the-problem-one-person-six-business-functions">The problem: one person, six business functions</h2>
<p>Before this system existed, my week looked like this:</p>
<ul>
<li>Monday and Tuesday: stock research for a Discord community. Reading earnings reports, pulling analyst ratings, scanning sector data. 18-20 hours.</li>
<li>Wednesday: editing and scheduling social media clips across TikTok, Snapchat, and Instagram. 6-8 hours.</li>
<li>Thursday: client work. Landing pages, ad campaigns, onboarding workflows. 8-10 hours.</li>
<li>Friday: Amazon FBA inventory analysis, auction sourcing, pricing research. 5-6 hours.</li>
<li>Weekends: catching up on everything that slipped.</li>
</ul>
<p>Total: 45-55 hours per week, and I was still dropping balls. Follow-up emails slipped through the cracks. Research reports were late. Social media scheduling was inconsistent. I was the bottleneck in every workflow.</p>
<p>The breaking point came when I missed a client onboarding deadline because I was buried in stock research. That week, I started building the system.</p>
<h2 id="architecture-overview-the-model-workspace-protocol">Architecture overview: the model workspace protocol</h2>
<p>The entire system runs on what I call the Model Workspace Protocol, or MWP. The idea: divide your business into rooms, give each room a specialized agent, and route tasks through a central orchestrator that knows which room handles what.</p>
<p>Here is the high-level architecture:</p>
<table>
<thead>
<tr>
<th>Layer</th>
<th>Purpose</th>
<th>What lives here</th>
</tr>
</thead>
<tbody>
<tr>
<td>Layer 0</td>
<td>Orchestrator</td>
<td>Task classification, routing, checkpoints</td>
</tr>
<tr>
<td>Layer 1</td>
<td>Room contexts</td>
<td>6 specialized rooms with their own rules</td>
</tr>
<tr>
<td>Layer 2</td>
<td>Stage folders</td>
<td>Granular workflow stages within rooms</td>
</tr>
<tr>
<td>Layer 3</td>
<td>Tools and skills</td>
<td>Shared utilities, API clients, design systems</td>
</tr>
</tbody>
</table>
<p>The backbone is Claude Code running as the execution engine. Every task enters through the orchestrator at Layer 0, gets classified, and routes to the correct room. The orchestrator never does the work itself. It delegates, monitors, and verifies.</p>
<h3 id="why-rooms-instead-of-one-big-agent">Why rooms instead of one big agent</h3>
<p>I tried the monolithic approach first. One agent prompt with instructions for everything. It failed within two weeks. The context window filled up with irrelevant details, the agent confused stock research terminology with marketing copy, and error rates climbed to 23% on complex tasks.</p>
<p>Rooms solve this through isolation. The content agent never sees commerce data. The research agent never loads marketing context. Each room has its own context file that defines inputs, processes, outputs, and constraints for that domain.</p>
<p>The result: error rates dropped from 23% to under 4% within the first month of switching to the room architecture.</p>
<h2 id="the-six-agents">The six agents</h2>
<h3 id="1-content-agent">1. Content agent</h3>
<p>Manages a 6-stage production pipeline for YouTube scripts, video thumbnails, social media clips, and editor handoff packages.</p>
<p>The stages: brief generation (structured hooks, key points, target metrics), research (supporting data and statistics), script writing (tested format with hook in first 8 seconds, pattern interrupt every 45 seconds), thumbnail compositing (AI generation followed by programmatic text/arrow overlay using Pillow), edit notes (timestamped instructions for the video editor), and editor handoff (everything packaged into a Drive folder with standardized naming).</p>
<p>The agent also handles cross-platform video distribution. It takes a single landscape video and produces 9:16 vertical crops with safe zones (250px top margin, 450px bottom margin, 35px side margins) for TikTok, Snapchat, and Instagram Reels. Each platform gets its own scheduling queue through the OneUp API.</p>
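The crop geometry is simple to express. Here is a minimal sketch (helper names are mine, not from the production system) that computes a centered 9:16 crop box for a landscape frame plus the text-safe region inside it using the margins above; the box could be passed straight to Pillow's <code>Image.crop()</code>:

```python
def vertical_crop_box(width, height):
    """Centered 9:16 crop box (left, top, right, bottom) for a
    landscape frame, e.g. to pass to PIL.Image.crop()."""
    crop_w = int(height * 9 / 16)   # width of a 9:16 portrait slice
    left = (width - crop_w) // 2    # center horizontally
    return (left, 0, left + crop_w, height)

# Safe zones: regions overlay text must avoid so platform UI
# (username, buttons, progress bar) does not cover it.
SAFE_MARGINS = {"top": 250, "bottom": 450, "side": 35}

def text_safe_area(crop_box):
    """Inset the crop box by the safe-zone margins to get the
    region where overlaid text stays visible on every platform."""
    left, top, right, bottom = crop_box
    return (left + SAFE_MARGINS["side"], top + SAFE_MARGINS["top"],
            right - SAFE_MARGINS["side"], bottom - SAFE_MARGINS["bottom"])
```

For a 1920x1080 source, this yields a 607px-wide center slice, with overlays confined to the band the safe margins leave open.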
<p>Weekly output: 4-6 video packages, 15-20 social media clips scheduled across 3 platforms.</p>
<h3 id="2-research-agent">2. Research agent</h3>
<p>This agent saved the most time. The manual process was brutal. I would spend 3-4 hours per ticker deep-diving into financials, analyst ratings, and sector comparisons. The agent does it in 12-15 minutes per ticker.</p>
<p>It pulls data from Finnhub (analyst ratings, earnings calendars, company profiles), processes it through Claude for analysis, and generates formatted reports. These reports get posted directly to Discord using Discord.py, with embedded charts and PDF attachments.</p>
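As a sketch of the data-pull side, Finnhub's recommendation-trends endpoint can be hit with nothing but the standard library, and the newest period condensed into a one-liner for a Discord embed. The field names follow Finnhub's documented response shape; the summary format is my illustration, not the system's actual report template:

```python
import json
import urllib.request

FINNHUB_BASE = "https://finnhub.io/api/v1"

def fetch_recommendations(symbol, token):
    """Pull analyst recommendation trends from Finnhub's REST API."""
    url = f"{FINNHUB_BASE}/stock/recommendation?symbol={symbol}&token={token}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def summarize_latest(trends):
    """Condense the most recent period of analyst ratings into a
    one-line summary suitable for a Discord embed field."""
    latest = trends[0]  # Finnhub returns the newest period first
    return (f"{latest['symbol']} ({latest['period']}): "
            f"{latest['strongBuy']} strong buy / {latest['buy']} buy / "
            f"{latest['hold']} hold / {latest['sell']} sell")
```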
<p>Weekly output: 8-12 ticker deep dives, 2 sector scans, 1 weekly market summary. All delivered to Discord automatically.</p>
<h3 id="3-operations-agent">3. Operations agent</h3>
<p>The maintenance crew. It handles recurring tasks that are critical but not creative.</p>
<ul>
<li>Social media scheduling: manages posting queues for MakeStockMoney across TikTok and Snapchat. Posts go out during peak hours (12 PM - 9 PM) based on platform engagement data.</li>
<li>Discord bot management: monitors the Sentinel bot that runs 24/7 for community management and security alerts.</li>
<li>SOP documentation: when any agent develops a new workflow, operations generates a standard operating procedure so my partner Jake can understand and manage it without touching code.</li>
<li>Client onboarding: automated sequences for new CrestSetup clients covering welcome emails, intake forms, project setup, and first milestone scheduling.</li>
</ul>
<p>Weekly output: 35-50 scheduled posts, 2-3 updated SOPs, onboarding sequences for any new clients.</p>
<h3 id="4-commerce-agent">4. Commerce agent</h3>
<p>I sell consumer electronics accessories (iPad cases, AirPod cases) through Amazon FBA, sourced primarily from Nellis Auction in Las Vegas. The commerce agent handles inventory scanning (monitoring auction listings, comparing prices against Amazon selling prices, calculating margin after FBA fees), ASIN matching (taking product descriptions from auction manifests and matching them to existing Amazon listings to estimate demand and competition), pricing analysis (pulling SellerBoard data to track unit economics), and restock alerts (flagging when stock drops below 14-day supply).</p>
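The margin check is simple arithmetic. A minimal sketch (the 15% referral rate is an assumption here; Amazon's rate varies by category, and the function name is mine):

```python
def fba_margin(sale_price, cost_of_goods, fba_fee, referral_rate=0.15):
    """Estimate per-unit profit and margin for an FBA listing.
    referral_rate approximates Amazon's category referral fee."""
    referral_fee = sale_price * referral_rate
    profit = sale_price - cost_of_goods - fba_fee - referral_fee
    return {"profit": round(profit, 2),
            "margin": round(profit / sale_price, 4)}
```

An iPad case won at auction for $6.50 and listed at $24.99 with a $4.75 FBA fee would clear roughly $9.99 per unit, a margin just under 40%.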
<p>Weekly output: 3-5 auction scans, daily inventory monitoring, weekly profitability report.</p>
<h3 id="5-marketing-agent">5. Marketing agent</h3>
<p>CrestSetup is my premium AI automation agency. The marketing agent manages website builds (Next.js 15, Tailwind, Framer Motion with animated scroll transitions), SEO execution (technical audits, schema markup, content strategy through 11 specialized skills), ad management (Google Ads campaign structure, keyword research, ad copy, performance monitoring), and lead generation (conversion funnels, form submissions, call bookings, pipeline tracking).</p>
<p>Weekly output: 2-3 landing page iterations, ongoing SEO improvements, ad copy variations, weekly performance reports.</p>
<h3 id="6-sales-agent">6. Sales agent</h3>
<p>Turns capabilities into revenue.</p>
<ul>
<li>Pitch deck creation using a custom design system (dark themes, gradient accents, card layouts, each deck verified through a 5-gate quality check).</li>
<li>Platform listings on Gumroad, Fiverr, Upwork, and Whop.</li>
<li>Outreach sequences using researched prospect data and a 3-touch framework: value lead, case study follow-up, direct ask.</li>
<li>Pricing strategy through competitive analysis and tiered packaging.</li>
</ul>
<p>Weekly output: 1-2 pitch decks, outreach sequences for active campaigns, listing updates as offerings evolve.</p>
<h2 id="how-they-coordinate-the-orchestrator-pattern">How they coordinate: the orchestrator pattern</h2>
<p>The orchestrator is the traffic controller. When a task comes in, it follows this sequence:</p>
<ol>
<li>Classify: which room does this belong to?</li>
<li>Route: load that room's context (and only that room's)</li>
<li>Plan: break the task into numbered steps</li>
<li>Execute: do the work, one step at a time</li>
<li>Verify: run quality checks</li>
<li>Checkpoint: log what was done</li>
<li>Confirm: get sign-off for plans only (never for execution)</li>
</ol>
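One pass through that loop can be sketched in a few lines. The room interface and every name below are my illustration of the pattern, not the production implementation:

```python
def run_task(task, rooms, log):
    """One orchestrator pass for a single task. `rooms` maps a room
    name to {"match": fn, "plan": fn, "execute": fn, "verify": fn}."""
    room = next(name for name, r in rooms.items() if r["match"](task))  # classify
    steps = rooms[room]["plan"](task)                                   # plan
    outputs = [rooms[room]["execute"](s) for s in steps]                # execute step by step
    ok = all(rooms[room]["verify"](o) for o in outputs)                 # verify
    log.append({"room": room, "task": task, "ok": ok})                  # checkpoint
    return outputs if ok else None
```

A toy room shows the flow: a task matching the research room gets planned into steps, each step executed and verified, and the result logged before anything is returned.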
<h3 id="routing-logic">Routing logic</h3>
<p>The orchestrator uses keyword matching and intent classification. A stock ticker or company name routes to Research. "Schedule" or "SOP" routes to Operations. "YouTube" or "thumbnail" routes to Content. "Amazon" or "ASIN" routes to Commerce. "Landing page" or "SEO" routes to Marketing. "Pricing" or "pitch deck" routes to Sales.</p>
<p>For ambiguous tasks, the orchestrator asks one clarifying question. For tasks that span multiple rooms, it executes sequentially, completing and checkpointing in one room before moving to the next.</p>
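The keyword table itself can be as simple as a dict. This sketch uses trigger words drawn from the examples above (the exact lists are illustrative), with the ambiguity rule folded in: zero or multiple matches escalate to a clarifying question instead of guessing:

```python
ROOM_KEYWORDS = {
    "research":   ["ticker", "earnings", "analyst"],
    "operations": ["schedule", "sop", "onboarding"],
    "content":    ["youtube", "thumbnail", "script"],
    "commerce":   ["amazon", "asin", "inventory"],
    "marketing":  ["landing page", "seo", "ad copy"],
    "sales":      ["pricing", "pitch deck", "outreach"],
}

def classify(task):
    """Return the single room whose keywords appear in the task text.
    Zero or several matches -> None, meaning ask a clarifying question."""
    text = task.lower()
    hits = [room for room, words in ROOM_KEYWORDS.items()
            if any(w in text for w in words)]
    return hits[0] if len(hits) == 1 else None
```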
<h3 id="cross-room-handoffs">Cross-room handoffs</h3>
<p>The most common handoff: any room to Operations. When the content agent finishes a video package, operations handles the scheduling. When sales creates a new product offering, operations documents the SOP.</p>
<p>Each handoff follows a strict rule: complete and checkpoint the current room before entering the next. This prevents half-finished work from clogging the pipeline. In six months of operation, I have had zero cross-room data corruption issues.</p>
<h3 id="parallel-execution">Parallel execution</h3>
<p>For independent tasks, the orchestrator runs up to 4 agents simultaneously. A typical morning batch: research agent pulls overnight earnings data, content agent generates thumbnail composites, operations agent checks for failed OneUp posts and reschedules, commerce agent runs morning inventory scan.</p>
<p>All four run at the same time. The orchestrator collects their outputs and surfaces anything that needs my attention, which is typically nothing. On an average day, I review outputs for about 45 minutes total.</p>
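The parallel batch maps naturally onto a thread pool with the same 4-worker cap. In this sketch the task bodies are stand-ins for the real agent calls:

```python
from concurrent.futures import ThreadPoolExecutor

def run_batch(tasks, max_workers=4):
    """Run independent room tasks concurrently (capped, like the
    orchestrator) and collect each result under its task name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(fn) for name, fn in tasks.items()}
        return {name: f.result() for name, f in futures.items()}

# A morning batch: four independent rooms, four stand-in callables.
morning = {
    "research":   lambda: "overnight earnings pulled",
    "content":    lambda: "thumbnails composited",
    "operations": lambda: "failed posts rescheduled",
    "commerce":   lambda: "inventory scan complete",
}
results = run_batch(morning)
```

Threads are the right fit here because these tasks are I/O-bound (API calls, file writes), so the GIL is not a constraint.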
<h3 id="the-checkpoint-system">The checkpoint system</h3>
<p>Every completed task gets logged with what was done, which agent did it, time taken, an automated quality rating, and any errors encountered. This creates an audit trail I review weekly. More importantly, it feeds back into optimization. I can spot which agents are slower than expected or producing lower quality output, and tune their prompts.</p>
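A checkpoint entry needs nothing fancier than an append-only JSON-lines file. A sketch with illustrative field names:

```python
import json
import time

def checkpoint(log_path, agent, task, duration_s, quality, errors=()):
    """Append one audit record per completed task as a JSON line."""
    record = {
        "ts": time.time(),        # when the task finished
        "agent": agent,           # which room did the work
        "task": task,             # what was done
        "duration_s": duration_s, # time taken
        "quality": quality,       # automated rating, e.g. 0-100
        "errors": list(errors),   # anything that went wrong
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

JSON lines keep the log trivially greppable, and a weekly review is just a loop over `json.loads` per line, grouped by agent.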
<h2 id="results-six-months-of-production-data">Results: six months of production data</h2>
<p>I have been running this system in full production since October 2025. Here are the numbers.</p>
<h3 id="time-savings">Time savings</h3>
<table>
<thead>
<tr>
<th>Function</th>
<th>Before (hrs/week)</th>
<th>After (hrs/week)</th>
<th>Savings</th>
</tr>
</thead>
<tbody>
<tr>
<td>Stock research</td>
<td>18-20</td>
<td>2-3</td>
<td>85%</td>
</tr>
<tr>
<td>Content production</td>
<td>12-15</td>
<td>3-4</td>
<td>73%</td>
</tr>
<tr>
<td>Operations</td>
<td>8-10</td>
<td>1-2</td>
<td>83%</td>
</tr>
<tr>
<td>Commerce</td>
<td>5-6</td>
<td>1</td>
<td>82%</td>
</tr>
<tr>
<td>Marketing</td>
<td>10-12</td>
<td>3-4</td>
<td>68%</td>
</tr>
<tr>
<td>Sales</td>
<td>6-8</td>
<td>2-3</td>
<td>65%</td>
</tr>
<tr>
<td>Total</td>
<td>59-71</td>
<td>12-17</td>
<td>77%</td>
</tr>
</tbody>
</table>
<p>That is roughly 50 hours per week freed up. I now spend most of my time on creative strategy, relationship building, and system improvements. The three things that actually require a human.</p>
<h3 id="task-throughput">Task throughput</h3>
<p>Tasks completed per day: 25-40 (up from 8-12 manually). Average task completion time: 8 minutes (down from 45 minutes). Tasks requiring human intervention: 12% (down from 100%).</p>
<h3 id="error-rates">Error rates</h3>
<p>First-month error rate with the monolithic agent: 23%. Current error rate with the room architecture: 3.8%. Critical errors (wrong data delivered to clients): 0.4%. Errors caught by the verification framework before delivery: 91%.</p>
<p>The verification framework is what makes this work. Every deliverable passes through a 5-gate quality check before delivery: existence and format, rendering integrity, content accuracy, visual quality, and domain-specific validation.</p>
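Fail-fast gates are just an ordered list of predicates. In this sketch the gate names mirror the five gates above, but every check is a placeholder over an illustrative deliverable dict, not the real validation logic:

```python
# Each gate is (name, predicate over a deliverable dict); order matters.
GATES = [
    ("existence_and_format", lambda d: d.get("path", "").endswith(d.get("expected_ext", ""))),
    ("rendering_integrity",  lambda d: d.get("renders", False)),
    ("content_accuracy",     lambda d: d.get("facts_verified", False)),
    ("visual_quality",       lambda d: d.get("visual_score", 0) >= 80),
    ("domain_validation",    lambda d: d.get("domain_ok", False)),
]

def verify(deliverable):
    """Run the gates in order; fail fast on the first unmet gate so
    the audit trail records exactly where a deliverable stopped."""
    for name, check in GATES:
        if not check(deliverable):
            return {"passed": False, "failed_gate": name}
    return {"passed": True, "failed_gate": None}
```

Failing fast matters for the feedback loop: the checkpoint log records which gate stopped the deliverable, which is what makes targeted prompt tuning possible.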
<h3 id="revenue-impact">Revenue impact</h3>
<p>I am not going to share exact revenue numbers, but here is the structure. Capacity: I went from serving 3-4 clients simultaneously to 8-10 without quality degradation. New revenue streams: the system itself became a product, and I now consult on multi-agent architecture for other businesses. Reduced costs: zero freelancer costs for core operations, down from $2,000-3,000/month on research assistants and content help.</p>
<h2 id="what-i-would-do-differently">What I would do differently</h2>
<p>Six months of operating this system taught me things I wish I had known at the start.</p>
<h3 id="start-with-two-rooms-not-six">Start with two rooms, not six</h3>
<p>I built all six rooms in the first two weeks. That was a mistake. Each room needs 40-60 hours of prompt tuning, workflow testing, and edge case handling before it is production-ready. Building six simultaneously meant all six were half-baked for the first month.</p>
<p>If I started over, I would build Research and Operations first (highest time savings), get them stable, then add rooms one at a time every 2-3 weeks.</p>
<h3 id="invest-in-verification-earlier">Invest in verification earlier</h3>
<p>I added the 5-gate verification system in month three. The first two months had noticeably higher error rates because outputs were going straight to delivery without automated quality checks. The verification framework should be the second thing you build, right after the orchestrator.</p>
<h3 id="log-everything-from-day-one">Log everything from day one</h3>
<p>I did not start systematic logging until month two. That means I have incomplete data on early performance, which makes it harder to quantify the full improvement trajectory. Every task, every error, every timing metric should be logged from the first day.</p>
<h3 id="design-handoffs-before-building-agents">Design handoffs before building agents</h3>
<p>My initial cross-room handoffs were ad hoc. Agent A would finish and dump output in a shared folder, and Agent B would sometimes pick it up. Designing a formal handoff protocol with completion verification would have saved me two weeks of debugging lost deliverables.</p>
<h3 id="build-manager-friendly-dashboards">Build manager-friendly dashboards</h3>
<p>My operations partner Jake manages several workflows but does not write code. It took me too long to build SOP documentation and monitoring dashboards that he could actually use. If you are building a multi-agent system for a team, invest in the management layer early. The system is only valuable if non-technical team members can operate and monitor it.</p>
<h2 id="the-architecture-is-the-product">The architecture is the product</h2>
<p>The most unexpected outcome: the architecture itself became my most valuable offering. Clients do not just want me to automate their social media or build their landing page. They want the system. The orchestrator, the room structure, the verification framework, the checkpoint logging.</p>
<p>A well-designed multi-agent system is not just a productivity hack. It compounds. Every week, the agents get better tuned. Every month, the system handles more edge cases automatically. After six months, the gap between what this system can do and what a solo operator can do manually is enormous.</p>
<p>If you are considering building something similar, start small. One orchestrator, two rooms, proper logging. Get that stable. Then expand. The system will grow faster than you expect.</p>
<h2 id="frequently-asked-questions">Frequently asked questions</h2>
<h3 id="how-much-does-it-cost-to-run">How much does it cost to run?</h3>
<p>My monthly costs break down to approximately $150-250 in API usage (Claude API, Finnhub, various integrations), plus the Claude Code subscription. Compare that to $2,000-3,000/month I was previously spending on freelancer help. Infrastructure costs (VPS, domains, SSL) add another $50-80/month.</p>
<h3 id="do-i-need-to-know-how-to-code">Do I need to know how to code?</h3>
<p>You need enough technical literacy to write prompts, understand file structures, and debug basic issues. You do not need to be a software engineer. I built this system primarily using Claude Code itself, describing what I needed and iterating on the output. That said, understanding API calls, JSON, and basic scripting will make you 3-4x faster at building and debugging.</p>
<h3 id="how-long-did-it-take-to-build">How long did it take to build?</h3>
<p>The initial architecture took about 3 weeks of focused work. Getting all six rooms to production quality took another 8-10 weeks of iterative tuning. Total investment was roughly 250-300 hours spread over three months. The payback period was about 6 weeks. After that, the time savings exceeded the time invested.</p>
<h3 id="what-happens-when-an-agent-makes-a-mistake">What happens when an agent makes a mistake?</h3>
<p>The verification framework catches 91% of errors before delivery. For the remaining 9%, the checkpoint system creates an audit trail so I can identify what went wrong and where. Critical errors (wrong data to clients) happen at a rate of 0.4%, and in those cases I have a manual review and correction process. Every error also triggers a prompt improvement, so the same mistake rarely happens twice. I wrote more about the verification methodology on my <a href="/services/automation-audit">automation audit page</a>.</p>