Build an AI-Powered Discord Bot: Complete Python Tutorial with LLM Integration
<p>I run an AI-powered Discord bot that serves a community of 200+ members. It handles research queries, moderates content, posts scheduled updates, and has processed over 15,000 messages since deployment. This tutorial walks through building one from scratch.</p>
<p>By the end, you will have a production-ready AI Discord bot with LLM integration, rate limiting, content moderation, error handling, and a deployment setup that runs 24/7 on a VPS. This is the same architecture I use in production, not a toy demo.</p>
<h2 id="what-we-are-building">What we are building</h2>
<p>The bot does three things:</p>
<ol>
<li>Responds to mentions and slash commands with AI-generated answers using an LLM API</li>
<li>Rate-limits users to prevent abuse and control API costs</li>
<li>Runs as a systemd service on a Linux VPS with logging and auto-restart</li>
</ol>
<p>Total cost to operate: approximately $5-$30/month depending on usage (VPS + LLM API calls). A $6/month VPS handles a community of 500+ members easily. LLM costs scale with usage: at $0.003-$0.015 per query depending on the model, 1,000 queries/month cost $3-$15.</p>
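<p>As a rough check on those numbers, here is a minimal sketch of the estimate. The function name <code>estimate_monthly_cost</code> is hypothetical, and the prices are the ones quoted in this section, not live pricing.</p>

```python
def estimate_monthly_cost(vps_usd: float, queries: int, usd_per_query: float) -> float:
    """Return the estimated monthly bill: fixed VPS cost plus per-query LLM cost."""
    return round(vps_usd + queries * usd_per_query, 2)


if __name__ == "__main__":
    # Figures from this section: a $6 VPS and 1,000 queries per month.
    low = estimate_monthly_cost(6.0, 1_000, 0.003)
    high = estimate_monthly_cost(6.0, 1_000, 0.015)
    print(f"Estimated monthly cost: ${low:.2f} to ${high:.2f}")
```

<p>Swap in your own VPS price and per-query cost to see where your community would land.</p>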
<h2 id="prerequisites">Prerequisites</h2>
<ul>
<li>Python 3.10+</li>
<li>A Discord account and a server where you have admin permissions</li>
<li>An API key for an LLM provider (this tutorial uses Anthropic's Claude API, but the pattern works with OpenAI, Google, or any LLM with a REST API)</li>
<li>A Linux VPS for deployment (DigitalOcean, Hetzner, or similar, $5-$10/month)</li>
</ul>
<h2 id="part-1-project-setup">Part 1: project setup</h2>
<p>Create the project structure:</p>
<div class="codehilite"><pre><span></span><code>mkdir<span class="w"> </span>discord-ai-bot<span class="w"> </span><span class="o">&&</span><span class="w"> </span><span class="nb">cd</span><span class="w"> </span>discord-ai-bot
python3<span class="w"> </span>-m<span class="w"> </span>venv<span class="w"> </span>venv
<span class="nb">source</span><span class="w"> </span>venv/bin/activate
pip<span class="w"> </span>install<span class="w"> </span>discord.py<span class="w"> </span>anthropic<span class="w"> </span>python-dotenv
</code></pre></div>
<p>Your project structure:</p>
<div class="codehilite"><pre><span></span><code><span class="n">discord</span><span class="o">-</span><span class="n">ai</span><span class="o">-</span><span class="n">bot</span><span class="o">/</span>
<span class="w"> </span><span class="n">bot</span><span class="o">.</span><span class="n">py</span><span class="w"> </span><span class="c1"># Main bot file</span>
<span class="w"> </span><span class="n">llm_client</span><span class="o">.</span><span class="n">py</span><span class="w"> </span><span class="c1"># LLM API wrapper</span>
<span class="w"> </span><span class="n">rate_limiter</span><span class="o">.</span><span class="n">py</span><span class="w"> </span><span class="c1"># Rate limiting logic</span>
<span class="w"> </span><span class="n">config</span><span class="o">.</span><span class="n">py</span><span class="w"> </span><span class="c1"># Configuration loader</span>
<span class="w"> </span><span class="o">.</span><span class="n">env</span><span class="w"> </span><span class="c1"># Secrets (never commit this)</span>
<span class="w"> </span><span class="n">requirements</span><span class="o">.</span><span class="n">txt</span><span class="w"> </span><span class="c1"># Dependencies</span>
</code></pre></div>
<p>Create <code>requirements.txt</code>:</p>
<div class="codehilite"><pre><span></span><code>discord.py>=2.3.0
anthropic>=0.39.0
python-dotenv>=1.0.0
</code></pre></div>
<p>Create <code>.env</code> with your credentials:</p>
<div class="codehilite"><pre><span></span><code>DISCORD_TOKEN=your_discord_bot_token_here
ANTHROPIC_API_KEY=your_anthropic_key_here
GUILD_ID=your_server_id_here
</code></pre></div>
<h2 id="part-2-create-the-discord-application">Part 2: create the Discord application</h2>
<p>Before writing code, you need a bot token from Discord.</p>
<ol>
<li>Go to the <a href="https://discord.com/developers/applications">Discord Developer Portal</a></li>
<li>Click "New Application" and name it</li>
<li>Go to the "Bot" tab and click "Reset Token" to generate your token</li>
<li>Under "Privileged Gateway Intents," enable Message Content Intent (required for reading messages)</li>
<li>Go to "OAuth2" > "URL Generator," select <code>bot</code> and <code>applications.commands</code> scopes</li>
<li>Under bot permissions, select: Send Messages, Read Message History, Use Slash Commands, Embed Links</li>
<li>Copy the generated URL and open it to invite the bot to your server</li>
</ol>
<p>Save the bot token in your <code>.env</code> file. Do not hardcode tokens anywhere in your source code. A single leaked token gives anyone full control of your bot.</p>
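<p>For the curious, the URL Generator in steps 5 to 7 only assembles a query string from the scopes and a permission bitfield. The sketch below rebuilds that URL by hand. It assumes the bit positions from Discord's permissions reference (verify them there before relying on this); <code>build_invite_url</code> and the client ID are hypothetical.</p>

```python
from urllib.parse import urlencode

# Permission bit flags, assumed from Discord's permissions reference.
SEND_MESSAGES = 1 << 11
EMBED_LINKS = 1 << 14
READ_MESSAGE_HISTORY = 1 << 16
USE_APPLICATION_COMMANDS = 1 << 31  # the slash-command permission


def build_invite_url(client_id: str) -> str:
    """Build an invite URL with the scopes and permissions chosen in the steps above."""
    permissions = (
        SEND_MESSAGES | EMBED_LINKS | READ_MESSAGE_HISTORY | USE_APPLICATION_COMMANDS
    )
    query = urlencode(
        {
            "client_id": client_id,
            "permissions": permissions,
            "scope": "bot applications.commands",  # urlencode turns the space into "+"
        }
    )
    return f"https://discord.com/oauth2/authorize?{query}"


if __name__ == "__main__":
    # Hypothetical application (client) ID; use the one from your Developer Portal.
    print(build_invite_url("123456789012345678"))
```

<p>The Developer Portal remains the safer route, since it always reflects the current permission list.</p>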
<h2 id="part-3-configuration-loader">Part 3: configuration loader</h2>
<p>Start with <code>config.py</code>. This loads environment variables and validates them at startup. Fail fast if anything is missing.</p>
<div class="codehilite"><pre><span></span><code><span class="kn">import</span><span class="w"> </span><span class="nn">os</span>
<span class="kn">from</span><span class="w"> </span><span class="nn">dotenv</span><span class="w"> </span><span class="kn">import</span> <span class="n">load_dotenv</span>
<span class="n">load_dotenv</span><span class="p">()</span>
<span class="n">REQUIRED_VARS</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"DISCORD_TOKEN"</span><span class="p">,</span> <span class="s2">"ANTHROPIC_API_KEY"</span><span class="p">,</span> <span class="s2">"GUILD_ID"</span><span class="p">]</span>
<span class="k">def</span><span class="w"> </span><span class="nf">load_config</span><span class="p">():</span>
    <span class="n">missing</span> <span class="o">=</span> <span class="p">[</span><span class="n">var</span> <span class="k">for</span> <span class="n">var</span> <span class="ow">in</span> <span class="n">REQUIRED_VARS</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">var</span><span class="p">)]</span>
    <span class="k">if</span> <span class="n">missing</span><span class="p">:</span>
        <span class="k">raise</span> <span class="ne">EnvironmentError</span><span class="p">(</span>
            <span class="sa">f</span><span class="s2">"Missing required environment variables: </span><span class="si">{</span><span class="s1">', '</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">missing</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span>
        <span class="p">)</span>
    <span class="k">return</span> <span class="p">{</span>
        <span class="s2">"discord_token"</span><span class="p">:</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s2">"DISCORD_TOKEN"</span><span class="p">],</span>
        <span class="s2">"anthropic_api_key"</span><span class="p">:</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s2">"ANTHROPIC_API_KEY"</span><span class="p">],</span>
        <span class="s2">"guild_id"</span><span class="p">:</span> <span class="nb">int</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s2">"GUILD_ID"</span><span class="p">]),</span>
        <span class="s2">"max_tokens"</span><span class="p">:</span> <span class="nb">int</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"MAX_TOKENS"</span><span class="p">,</span> <span class="s2">"1024"</span><span class="p">)),</span>
        <span class="s2">"rate_limit_per_user"</span><span class="p">:</span> <span class="nb">int</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"RATE_LIMIT"</span><span class="p">,</span> <span class="s2">"10"</span><span class="p">)),</span>
        <span class="s2">"rate_limit_window"</span><span class="p">:</span> <span class="nb">int</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"RATE_WINDOW"</span><span class="p">,</span> <span class="s2">"60"</span><span class="p">)),</span>
    <span class="p">}</span>
</code></pre></div>
<p>This pattern (validate all config at startup, crash immediately if something is wrong) saves hours of debugging later. I have seen bots run for days with a bad config, silently failing on every request, because nobody validated inputs at startup.</p>
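<p>One detail of the check is easy to miss: because it tests <code>not os.environ.get(var)</code>, a variable that is set but empty counts as missing too. The sketch below isolates that logic so it can be run against a plain dictionary; <code>find_missing</code> and the sample values are hypothetical.</p>

```python
def find_missing(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required variable names that are absent or empty in env."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    required = ["DISCORD_TOKEN", "ANTHROPIC_API_KEY", "GUILD_ID"]
    # Hypothetical environment: one variable is absent, one is set but empty.
    fake_env = {"DISCORD_TOKEN": "abc", "GUILD_ID": ""}
    print(find_missing(fake_env, required))
```

<p>Both the absent <code>ANTHROPIC_API_KEY</code> and the empty <code>GUILD_ID</code> are reported, which is the behavior you want from a fail-fast check.</p>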
<h2 id="part-4-rate-limiter">Part 4: rate limiter</h2>
<p>Rate limiting is not optional. Without it, one enthusiastic user or one bad actor can burn through your entire monthly API budget in an afternoon. At $0.015 per query, 10,000 queries cost $150. A rate limiter caps that exposure.</p>
<div class="codehilite"><pre><span></span><code><span class="kn">import</span><span class="w"> </span><span class="nn">time</span>
<span class="kn">from</span><span class="w"> </span><span class="nn">collections</span><span class="w"> </span><span class="kn">import</span> <span class="n">defaultdict</span>
<span class="k">class</span><span class="w"> </span><span class="nc">RateLimiter</span><span class="p">:</span>
    <span class="k">def</span><span class="w"> </span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">max_requests</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">window_seconds</span><span class="p">:</span> <span class="nb">int</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">max_requests</span> <span class="o">=</span> <span class="n">max_requests</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">window</span> <span class="o">=</span> <span class="n">window_seconds</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">requests</span> <span class="o">=</span> <span class="n">defaultdict</span><span class="p">(</span><span class="nb">list</span><span class="p">)</span>

    <span class="k">def</span><span class="w"> </span><span class="nf">is_allowed</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">user_id</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-></span> <span class="nb">bool</span><span class="p">:</span>
        <span class="n">now</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span>
        <span class="n">cutoff</span> <span class="o">=</span> <span class="n">now</span> <span class="o">-</span> <span class="bp">self</span><span class="o">.</span><span class="n">window</span>
        <span class="c1"># Remove expired timestamps</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">requests</span><span class="p">[</span><span class="n">user_id</span><span class="p">]</span> <span class="o">=</span> <span class="p">[</span>
            <span class="n">ts</span> <span class="k">for</span> <span class="n">ts</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">requests</span><span class="p">[</span><span class="n">user_id</span><span class="p">]</span> <span class="k">if</span> <span class="n">ts</span> <span class="o">></span> <span class="n">cutoff</span>
        <span class="p">]</span>
        <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">requests</span><span class="p">[</span><span class="n">user_id</span><span class="p">])</span> <span class="o">>=</span> <span class="bp">self</span><span class="o">.</span><span class="n">max_requests</span><span class="p">:</span>
            <span class="k">return</span> <span class="kc">False</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">requests</span><span class="p">[</span><span class="n">user_id</span><span class="p">]</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">now</span><span class="p">)</span>
        <span class="k">return</span> <span class="kc">True</span>

    <span class="k">def</span><span class="w"> </span><span class="nf">remaining</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">user_id</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-></span> <span class="nb">int</span><span class="p">:</span>
        <span class="n">now</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span>
        <span class="n">cutoff</span> <span class="o">=</span> <span class="n">now</span> <span class="o">-</span> <span class="bp">self</span><span class="o">.</span><span class="n">window</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">requests</span><span class="p">[</span><span class="n">user_id</span><span class="p">]</span> <span class="o">=</span> <span class="p">[</span>
            <span class="n">ts</span> <span class="k">for</span> <span class="n">ts</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">requests</span><span class="p">[</span><span class="n">user_id</span><span class="p">]</span> <span class="k">if</span> <span class="n">ts</span> <span class="o">></span> <span class="n">cutoff</span>
        <span class="p">]</span>
        <span class="k">return</span> <span class="nb">max</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">max_requests</span> <span class="o">-</span> <span class="nb">len</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">requests</span><span class="p">[</span><span class="n">user_id</span><span class="p">]))</span>

    <span class="k">def</span><span class="w"> </span><span class="nf">reset_time</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">user_id</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-></span> <span class="nb">float</span><span class="p">:</span>
        <span class="k">if</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="n">requests</span><span class="p">[</span><span class="n">user_id</span><span class="p">]:</span>
            <span class="k">return</span> <span class="mi">0</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">requests</span><span class="p">[</span><span class="n">user_id</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">window</span> <span class="o">-</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span>
</code></pre></div>
<p>Default configuration: 10 requests per user per 60-second window. Generous enough for normal use, tight enough to prevent abuse. Adjust based on your community size and API budget.</p>
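<p>To see the sliding window in action without waiting a real minute, here is a self-contained variant with an injectable clock. This is a sketch, not the class used in the bot: <code>SlidingWindowLimiter</code>, its <code>clock</code> parameter, and the use of a <code>deque</code> are my own choices for illustration, though the allow/deny rule matches the one described in this part.</p>

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Per-user sliding-window limiter with an injectable clock (useful in tests)."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self.requests: dict[int, deque] = defaultdict(deque)

    def is_allowed(self, user_id: int) -> bool:
        now = self.clock()
        timestamps = self.requests[user_id]
        # Drop timestamps that have aged out of the window.
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            return False
        timestamps.append(now)
        return True


if __name__ == "__main__":
    fake_now = [0.0]  # mutable cell so the lambda sees later updates
    limiter = SlidingWindowLimiter(3, 60, clock=lambda: fake_now[0])
    print([limiter.is_allowed(42) for _ in range(4)])  # the fourth call is denied
    fake_now[0] = 61.0
    print(limiter.is_allowed(42))  # the window has passed, so this is allowed
```

<p>Injecting the clock keeps the limiter deterministic under test, and a monotonic clock avoids surprises if the system time is adjusted while the bot is running.</p>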
<h2 id="part-5-llm-client">Part 5: LLM client</h2>
<p>The LLM client wraps the API call with error handling, timeout management, and response validation. This is where most bot tutorials cut corners, and where most production bots break.</p>
<div class="codehilite"><pre><span></span><code><span class="kn">import</span><span class="w"> </span><span class="nn">logging</span>
<span class="kn">import</span><span class="w"> </span><span class="nn">anthropic</span>
<span class="n">logger</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">getLogger</span><span class="p">(</span><span class="vm">__name__</span><span class="p">)</span>
<span class="k">class</span><span class="w"> </span><span class="nc">LLMClient</span><span class="p">:</span>
    <span class="k">def</span><span class="w"> </span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">api_key</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">max_tokens</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">1024</span><span class="p">):</span>
        <span class="c1"># 30 seconds is an assumed value; without it the SDK's much longer default timeout applies</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">client</span> <span class="o">=</span> <span class="n">anthropic</span><span class="o">.</span><span class="n">Anthropic</span><span class="p">(</span><span class="n">api_key</span><span class="o">=</span><span class="n">api_key</span><span class="p">,</span> <span class="n">timeout</span><span class="o">=</span><span class="mf">30.0</span><span class="p">)</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">max_tokens</span> <span class="o">=</span> <span class="n">max_tokens</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">system_prompt</span> <span class="o">=</span> <span class="p">(</span>
            <span class="s2">"You are a helpful research assistant in a Discord community. "</span>
            <span class="s2">"Keep responses concise (under 1500 characters for Discord). "</span>
            <span class="s2">"Use markdown formatting. If you are unsure, say so. "</span>
            <span class="s2">"Never generate harmful, illegal, or explicit content."</span>
        <span class="p">)</span>

    <span class="k">def</span><span class="w"> </span><span class="nf">generate_response</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">user_message</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">context</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="s2">""</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="n">prompt</span> <span class="o">=</span> <span class="n">user_message</span>
            <span class="k">if</span> <span class="n">context</span><span class="p">:</span>
                <span class="n">prompt</span> <span class="o">=</span> <span class="sa">f</span><span class="s2">"Context from conversation:</span><span class="se">\n</span><span class="si">{</span><span class="n">context</span><span class="si">}</span><span class="se">\n\n</span><span class="s2">User question: </span><span class="si">{</span><span class="n">user_message</span><span class="si">}</span><span class="s2">"</span>
            <span class="n">response</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">client</span><span class="o">.</span><span class="n">messages</span><span class="o">.</span><span class="n">create</span><span class="p">(</span>
                <span class="n">model</span><span class="o">=</span><span class="s2">"claude-sonnet-4-20250514"</span><span class="p">,</span>
                <span class="n">max_tokens</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">max_tokens</span><span class="p">,</span>
                <span class="n">system</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">system_prompt</span><span class="p">,</span>
                <span class="n">messages</span><span class="o">=</span><span class="p">[{</span><span class="s2">"role"</span><span class="p">:</span> <span class="s2">"user"</span><span class="p">,</span> <span class="s2">"content"</span><span class="p">:</span> <span class="n">prompt</span><span class="p">}],</span>
            <span class="p">)</span>
            <span class="n">text</span> <span class="o">=</span> <span class="n">response</span><span class="o">.</span><span class="n">content</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">text</span>
            <span class="c1"># Discord has a 2000-character message limit</span>
            <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">text</span><span class="p">)</span> <span class="o">></span> <span class="mi">1900</span><span class="p">:</span>
                <span class="n">text</span> <span class="o">=</span> <span class="n">text</span><span class="p">[:</span><span class="mi">1897</span><span class="p">]</span> <span class="o">+</span> <span class="s2">"..."</span>
            <span class="k">return</span> <span class="n">text</span>
        <span class="k">except</span> <span class="n">anthropic</span><span class="o">.</span><span class="n">RateLimitError</span><span class="p">:</span>
            <span class="n">logger</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s2">"LLM API rate limit hit"</span><span class="p">)</span>
            <span class="k">return</span> <span class="s2">"I am currently rate-limited by my AI provider. Please try again in a moment."</span>
        <span class="k">except</span> <span class="n">anthropic</span><span class="o">.</span><span class="n">APIConnectionError</span><span class="p">:</span>
            <span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="s2">"LLM API connection failed"</span><span class="p">)</span>
            <span class="k">return</span> <span class="s2">"I could not reach my AI backend. Please try again shortly."</span>
        <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
            <span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="sa">f</span><span class="s2">"LLM request failed: </span><span class="si">{</span><span class="nb">type</span><span class="p">(</span><span class="n">e</span><span class="p">)</span><span class="o">.</span><span class="vm">__name__</span><span class="si">}</span><span class="s2">: </span><span class="si">{</span><span class="n">e</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
            <span class="k">return</span> <span class="s2">"Something went wrong processing your request. Please try again."</span>
</code></pre></div>
<p>A few things to note about this implementation. Each failure mode gets its own error handler: API rate limits, connection failures, and unexpected errors all produce different responses. The user gets a helpful message; the log gets diagnostic detail. I truncate replies at 1,900 characters because Discord caps messages at 2,000 characters and the model does not reliably honor the length request in the system prompt. And the system prompt sets tone, length expectations, and content boundaries. This is your first line of defense against the bot producing inappropriate content.</p>
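<p>Truncation is the simplest way to respect Discord's limit, but it throws away the tail of a long answer. If you would rather keep everything, one alternative is to split the text into several messages. The helper below is a hypothetical sketch of that idea (the name <code>split_message</code> and the whitespace-first splitting rule are my assumptions, not part of the bot above).</p>

```python
def split_message(text: str, limit: int = 1900) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers to break at the last newline or space inside the limit and falls
    back to a hard cut when a run of text contains no whitespace.
    """
    chunks = []
    remaining = text
    while len(remaining) > limit:
        cut = max(remaining.rfind("\n", 0, limit), remaining.rfind(" ", 0, limit))
        if cut <= 0:
            cut = limit  # no usable whitespace, so cut mid-word
        chunk = remaining[:cut].rstrip()
        if chunk:
            chunks.append(chunk)
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


if __name__ == "__main__":
    long_answer = "word " * 1000  # 5,000 characters
    for part in split_message(long_answer):
        print(len(part))
```

<p>In the bot you would then send each chunk in order, replying with the first and posting the rest to the same channel. Note that a split can land inside a markdown code block, so this sketch suits plain prose better than formatted output.</p>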
<h2 id="part-6-the-main-bot">Part 6: the main bot</h2>
<p>Now we wire everything together. The bot listens for mentions and slash commands, checks rate limits, calls the LLM, and sends the response.</p>
<div class="codehilite"><pre><span></span><code><span class="kn">import</span><span class="w"> </span><span class="nn">logging</span>
<span class="kn">import</span><span class="w"> </span><span class="nn">discord</span>
<span class="kn">from</span><span class="w"> </span><span class="nn">discord</span><span class="w"> </span><span class="kn">import</span> <span class="n">app_commands</span>
<span class="kn">from</span><span class="w"> </span><span class="nn">config</span><span class="w"> </span><span class="kn">import</span> <span class="n">load_config</span>
<span class="kn">from</span><span class="w"> </span><span class="nn">rate_limiter</span><span class="w"> </span><span class="kn">import</span> <span class="n">RateLimiter</span>
<span class="kn">from</span><span class="w"> </span><span class="nn">llm_client</span><span class="w"> </span><span class="kn">import</span> <span class="n">LLMClient</span>
<span class="n">logging</span><span class="o">.</span><span class="n">basicConfig</span><span class="p">(</span>
    <span class="n">level</span><span class="o">=</span><span class="n">logging</span><span class="o">.</span><span class="n">INFO</span><span class="p">,</span>
    <span class="nb">format</span><span class="o">=</span><span class="s2">"</span><span class="si">%(asctime)s</span><span class="s2"> [</span><span class="si">%(levelname)s</span><span class="s2">] </span><span class="si">%(name)s</span><span class="s2">: </span><span class="si">%(message)s</span><span class="s2">"</span><span class="p">,</span>
    <span class="n">handlers</span><span class="o">=</span><span class="p">[</span>
        <span class="n">logging</span><span class="o">.</span><span class="n">FileHandler</span><span class="p">(</span><span class="s2">"bot.log"</span><span class="p">),</span>
        <span class="n">logging</span><span class="o">.</span><span class="n">StreamHandler</span><span class="p">(),</span>
    <span class="p">],</span>
<span class="p">)</span>
<span class="n">logger</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">getLogger</span><span class="p">(</span><span class="vm">__name__</span><span class="p">)</span>
<span class="n">config</span> <span class="o">=</span> <span class="n">load_config</span><span class="p">()</span>
<span class="n">intents</span> <span class="o">=</span> <span class="n">discord</span><span class="o">.</span><span class="n">Intents</span><span class="o">.</span><span class="n">default</span><span class="p">()</span>
<span class="n">intents</span><span class="o">.</span><span class="n">message_content</span> <span class="o">=</span> <span class="kc">True</span>
<span class="n">bot</span> <span class="o">=</span> <span class="n">discord</span><span class="o">.</span><span class="n">Client</span><span class="p">(</span><span class="n">intents</span><span class="o">=</span><span class="n">intents</span><span class="p">)</span>
<span class="n">tree</span> <span class="o">=</span> <span class="n">app_commands</span><span class="o">.</span><span class="n">CommandTree</span><span class="p">(</span><span class="n">bot</span><span class="p">)</span>
<span class="n">rate_limiter</span> <span class="o">=</span> <span class="n">RateLimiter</span><span class="p">(</span>
    <span class="n">config</span><span class="p">[</span><span class="s2">"rate_limit_per_user"</span><span class="p">],</span>
    <span class="n">config</span><span class="p">[</span><span class="s2">"rate_limit_window"</span><span class="p">],</span>
<span class="p">)</span>
<span class="n">llm</span> <span class="o">=</span> <span class="n">LLMClient</span><span class="p">(</span><span class="n">config</span><span class="p">[</span><span class="s2">"anthropic_api_key"</span><span class="p">],</span> <span class="n">config</span><span class="p">[</span><span class="s2">"max_tokens"</span><span class="p">])</span>
<span class="nd">@bot</span><span class="o">.</span><span class="n">event</span>
<span class="k">async</span> <span class="k">def</span><span class="w"> </span><span class="nf">on_ready</span><span class="p">():</span>
    <span class="n">guild</span> <span class="o">=</span> <span class="n">discord</span><span class="o">.</span><span class="n">Object</span><span class="p">(</span><span class="nb">id</span><span class="o">=</span><span class="n">config</span><span class="p">[</span><span class="s2">"guild_id"</span><span class="p">])</span>
    <span class="n">tree</span><span class="o">.</span><span class="n">copy_global_to</span><span class="p">(</span><span class="n">guild</span><span class="o">=</span><span class="n">guild</span><span class="p">)</span>
    <span class="k">await</span> <span class="n">tree</span><span class="o">.</span><span class="n">sync</span><span class="p">(</span><span class="n">guild</span><span class="o">=</span><span class="n">guild</span><span class="p">)</span>
    <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Bot online as </span><span class="si">{</span><span class="n">bot</span><span class="o">.</span><span class="n">user</span><span class="si">}</span><span class="s2"> | Guild: </span><span class="si">{</span><span class="n">config</span><span class="p">[</span><span class="s1">'guild_id'</span><span class="p">]</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>

<span class="nd">@bot</span><span class="o">.</span><span class="n">event</span>
<span class="k">async</span> <span class="k">def</span><span class="w"> </span><span class="nf">on_message</span><span class="p">(</span><span class="n">message</span><span class="p">):</span>
    <span class="c1"># Ignore own messages</span>
    <span class="k">if</span> <span class="n">message</span><span class="o">.</span><span class="n">author</span> <span class="o">==</span> <span class="n">bot</span><span class="o">.</span><span class="n">user</span><span class="p">:</span>
        <span class="k">return</span>
    <span class="c1"># Only respond when mentioned</span>
    <span class="k">if</span> <span class="n">bot</span><span class="o">.</span><span class="n">user</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">message</span><span class="o">.</span><span class="n">mentions</span><span class="p">:</span>
        <span class="k">return</span>
    <span class="c1"># Rate limit check</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">rate_limiter</span><span class="o">.</span><span class="n">is_allowed</span><span class="p">(</span><span class="n">message</span><span class="o">.</span><span class="n">author</span><span class="o">.</span><span class="n">id</span><span class="p">):</span>
        <span class="n">retry_after</span> <span class="o">=</span> <span class="n">rate_limiter</span><span class="o">.</span><span class="n">reset_time</span><span class="p">(</span><span class="n">message</span><span class="o">.</span><span class="n">author</span><span class="o">.</span><span class="n">id</span><span class="p">)</span>
        <span class="k">await</span> <span class="n">message</span><span class="o">.</span><span class="n">reply</span><span class="p">(</span>
            <span class="sa">f</span><span class="s2">"You have hit the rate limit. Try again in </span><span class="si">{</span><span class="nb">int</span><span class="p">(</span><span class="n">retry_after</span><span class="p">)</span><span class="si">}</span><span class="s2"> seconds."</span>
        <span class="p">)</span>
        <span class="k">return</span>
    <span class="c1"># Strip the mention from the message</span>
    <span class="n">clean_content</span> <span class="o">=</span> <span class="n">message</span><span class="o">.</span><span class="n">content</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="sa">f</span><span class="s2">"<@</span><span class="si">{</span><span class="n">bot</span><span class="o">.</span><span class="n">user</span><span class="o">.</span><span class="n">id</span><span class="si">}</span><span class="s2">>"</span><span class="p">,</span> <span class="s2">""</span><span class="p">)</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">clean_content</span><span class="p">:</span>
        <span class="k">await</span> <span class="n">message</span><span class="o">.</span><span class="n">reply</span><span class="p">(</span><span class="s2">"You mentioned me but did not ask anything. How can I help?"</span><span class="p">)</span>
        <span class="k">return</span>
    <span class="c1"># Show typing indicator while generating</span>
    <span class="k">async</span> <span class="k">with</span> <span class="n">message</span><span class="o">.</span><span class="n">channel</span><span class="o">.</span><span class="n">typing</span><span class="p">():</span>
        <span class="c1"># generate_response is synchronous; run it in a worker thread so the event loop is not blocked</span>
        <span class="n">response</span> <span class="o">=</span> <span class="k">await</span> <span class="n">bot</span><span class="o">.</span><span class="n">loop</span><span class="o">.</span><span class="n">run_in_executor</span><span class="p">(</span>
            <span class="kc">None</span><span class="p">,</span> <span class="n">llm</span><span class="o">.</span><span class="n">generate_response</span><span class="p">,</span> <span class="n">clean_content</span>
        <span class="p">)</span>
    <span class="k">await</span> <span class="n">message</span><span class="o">.</span><span class="n">reply</span><span class="p">(</span><span class="n">response</span><span class="p">)</span>
    <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span>
        <span class="sa">f</span><span class="s2">"Responded to </span><span class="si">{</span><span class="n">message</span><span class="o">.</span><span class="n">author</span><span class="si">}</span><span class="s2"> in #</span><span class="si">{</span><span class="n">message</span><span class="o">.</span><span class="n">channel</span><span class="si">}</span><span class="s2">: "</span>
        <span class="sa">f</span><span class="s2">"</span><span class="si">{</span><span class="n">clean_content</span><span class="p">[:</span><span class="mi">80</span><span class="p">]</span><span class="si">}</span><span class="s2">..."</span>
    <span class="p">)</span>
<span class="nd">@tree</span><span class="o">.</span><span class="n">command</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">"ask"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Ask the AI assistant a question"</span><span class="p">)</span>
<span class="nd">@app_commands</span><span class="o">.</span><span class="n">describe</span><span class="p">(</span><span class="n">question</span><span class="o">=</span><span class="s2">"Your question for the AI"</span><span class="p">)</span>
<span class="k">async</span> <span class="k">def</span><span class="w"> </span><span class="nf">ask_command</span><span class="p">(</span><span class="n">interaction</span><span class="p">:</span> <span class="n">discord</span><span class="o">.</span><span class="n">Interaction</span><span class="p">,</span> <span class="n">question</span><span class="p">:</span> <span class="nb">str</span><span class="p">):</span>
<span class="k">if</span> <span class="ow">not</span> <span class="n">rate_limiter</span><span class="o">.</span><span class="n">is_allowed</span><span class="p">(</span><span class="n">interaction</span><span class="o">.</span><span class="n">user</span><span class="o">.</span><span class="n">id</span><span class="p">):</span>
<span class="n">remaining</span> <span class="o">=</span> <span class="n">rate_limiter</span><span class="o">.</span><span class="n">reset_time</span><span class="p">(</span><span class="n">interaction</span><span class="o">.</span><span class="n">user</span><span class="o">.</span><span class="n">id</span><span class="p">)</span>
<span class="k">await</span> <span class="n">interaction</span><span class="o">.</span><span class="n">response</span><span class="o">.</span><span class="n">send_message</span><span class="p">(</span>
<span class="sa">f</span><span class="s2">"Rate limit reached. Try again in </span><span class="si">{</span><span class="nb">int</span><span class="p">(</span><span class="n">remaining</span><span class="p">)</span><span class="si">}</span><span class="s2"> seconds."</span><span class="p">,</span>
<span class="n">ephemeral</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="p">)</span>
<span class="k">return</span>
<span class="k">await</span> <span class="n">interaction</span><span class="o">.</span><span class="n">response</span><span class="o">.</span><span class="n">defer</span><span class="p">()</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">llm</span><span class="o">.</span><span class="n">generate_response</span><span class="p">(</span><span class="n">question</span><span class="p">)</span>
<span class="k">await</span> <span class="n">interaction</span><span class="o">.</span><span class="n">followup</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">response</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Slash command from </span><span class="si">{</span><span class="n">interaction</span><span class="o">.</span><span class="n">user</span><span class="si">}</span><span class="s2">: </span><span class="si">{</span><span class="n">question</span><span class="p">[:</span><span class="mi">80</span><span class="p">]</span><span class="si">}</span><span class="s2">..."</span><span class="p">)</span>
<span class="nd">@tree</span><span class="o">.</span><span class="n">command</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">"status"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Check bot status and your rate limit"</span><span class="p">)</span>
<span class="k">async</span> <span class="k">def</span><span class="w"> </span><span class="nf">status_command</span><span class="p">(</span><span class="n">interaction</span><span class="p">:</span> <span class="n">discord</span><span class="o">.</span><span class="n">Interaction</span><span class="p">):</span>
<span class="n">remaining</span> <span class="o">=</span> <span class="n">rate_limiter</span><span class="o">.</span><span class="n">remaining</span><span class="p">(</span><span class="n">interaction</span><span class="o">.</span><span class="n">user</span><span class="o">.</span><span class="n">id</span><span class="p">)</span>
<span class="k">await</span> <span class="n">interaction</span><span class="o">.</span><span class="n">response</span><span class="o">.</span><span class="n">send_message</span><span class="p">(</span>
<span class="sa">f</span><span class="s2">"Bot is online.</span><span class="se">\n</span><span class="s2">"</span>
<span class="sa">f</span><span class="s2">"Your remaining queries: **</span><span class="si">{</span><span class="n">remaining</span><span class="si">}</span><span class="s2">** / </span><span class="si">{</span><span class="n">config</span><span class="p">[</span><span class="s1">'rate_limit_per_user'</span><span class="p">]</span><span class="si">}</span><span class="s2"> "</span>
<span class="sa">f</span><span class="s2">"(resets every </span><span class="si">{</span><span class="n">config</span><span class="p">[</span><span class="s1">'rate_limit_window'</span><span class="p">]</span><span class="si">}</span><span class="s2">s)"</span><span class="p">,</span>
<span class="n">ephemeral</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="p">)</span>
<span class="n">bot</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">config</span><span class="p">[</span><span class="s2">"discord_token"</span><span class="p">])</span>
</code></pre></div>
<p>This gives you two ways to interact with the bot: @mention it in any channel, or use the <code>/ask</code> slash command. The slash command provides a cleaner UX and better discoverability for new users.</p>
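<p>One caveat about the handlers above: <code>llm.generate_response</code> is called without <code>await</code>, which suggests it is a synchronous call. If so, it blocks the event loop for the whole API round trip, so the bot cannot process other messages or gateway heartbeats in the meantime. A minimal sketch of one way around this, using <code>asyncio.to_thread</code> (Python 3.9+). The <code>slow_generate</code> function is a hypothetical stand-in for the blocking LLM call, not part of the bot:</p>

```python
import asyncio
import time


def slow_generate(prompt: str) -> str:
    """Hypothetical stand-in for a blocking LLM client call."""
    time.sleep(0.2)  # simulates network latency
    return f"answer to: {prompt}"


async def generate_async(prompt: str) -> str:
    # Run the blocking call in a worker thread so the event loop
    # stays free to handle heartbeats and other messages.
    return await asyncio.to_thread(slow_generate, prompt)


async def demo() -> tuple[list[str], float]:
    start = time.perf_counter()
    # Three requests overlap instead of running back to back.
    answers = await asyncio.gather(*(generate_async(q) for q in ("a", "b", "c")))
    return list(answers), time.perf_counter() - start
```

<p>In the handlers, the equivalent change would be <code>response = await asyncio.to_thread(llm.generate_response, clean_content)</code>, with <code>import asyncio</code> at the top of the file. If your LLM client offers an async interface, using that directly is the cleaner option.</p>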
<h2 id="part-7-adding-guardrails">Part 7: adding guardrails</h2>
<p>The system prompt provides basic content filtering, but production bots need more layers. Here are three I implement on every bot.</p>
<h3 id="input-sanitization">Input sanitization</h3>
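<p>Discord messages can contain mention and emoji markup that confuses LLMs, excessively long strings, and unusual Unicode. Sanitize before sending to the API. The function below strips Discord-specific markup and truncates long input; it does not normalize Unicode.</p>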
<div class="codehilite"><pre><span></span><code><span class="kn">import</span><span class="w"> </span><span class="nn">re</span>
<span class="n">MAX_INPUT_LENGTH</span> <span class="o">=</span> <span class="mi">2000</span>
<span class="k">def</span><span class="w"> </span><span class="nf">sanitize_input</span><span class="p">(</span><span class="n">text</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="c1"># Remove Discord-specific formatting that confuses LLMs</span>
<span class="n">text</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="sa">r</span><span class="s2">"<@!?\d+>"</span><span class="p">,</span> <span class="s2">""</span><span class="p">,</span> <span class="n">text</span><span class="p">)</span> <span class="c1"># Remove mentions</span>
<span class="n">text</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="sa">r</span><span class="s2">"<#\d+>"</span><span class="p">,</span> <span class="s2">""</span><span class="p">,</span> <span class="n">text</span><span class="p">)</span> <span class="c1"># Remove channel refs</span>
<span class="n">text</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="sa">r</span><span class="s2">"<a?:\w+:\d+>"</span><span class="p">,</span> <span class="s2">""</span><span class="p">,</span> <span class="n">text</span><span class="p">)</span> <span class="c1"># Remove custom emoji</span>
<span class="c1"># Truncate excessively long inputs</span>
<span class="n">text</span> <span class="o">=</span> <span class="n">text</span><span class="p">[:</span><span class="n">MAX_INPUT_LENGTH</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
<span class="k">return</span> <span class="n">text</span>
</code></pre></div>
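<p>For the Unicode side, here is a minimal sketch that could run before <code>sanitize_input</code>. The function name <code>normalize_unicode</code> is my own for illustration. It folds look-alike characters with NFKC normalization and drops invisible control and format characters such as zero-width spaces and bidirectional overrides:</p>

```python
import unicodedata


def normalize_unicode(text: str) -> str:
    """Normalize compatibility characters and drop invisible control/format characters."""
    # NFKC folds look-alike forms (e.g. fullwidth letters) into their plain equivalents.
    text = unicodedata.normalize("NFKC", text)
    cleaned = []
    for ch in text:
        category = unicodedata.category(ch)
        # Cc = control, Cf = format (zero-width spaces, bidi overrides).
        # Keep newlines and tabs, which are legitimate in chat messages.
        if category in ("Cc", "Cf") and ch not in "\n\t":
            continue
        cleaned.append(ch)
    return "".join(cleaned)
```

<p>One trade-off: the zero-width joiner is also a format character, so this breaks up some composite emoji. For text sent to an LLM that is usually acceptable, but adjust the filter if your community relies on them.</p>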
<h3 id="output-validation">Output validation</h3>
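<p>Before sending the LLM response to the channel, check that it does not contain content that violates your community guidelines or leaks secrets. This is your second line of defense after the system prompt. The example pattern below catches text that looks like a credential; add patterns for your own guidelines.</p>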
<div class="codehilite"><pre><span></span><code><span class="n">BLOCKED_PATTERNS</span> <span class="o">=</span> <span class="p">[</span>
<span class="sa">r</span><span class="s2">"(?i)\b(api[_\s]?key|token|password|secret)\b.*[:=]\s*\S+"</span><span class="p">,</span>
<span class="p">]</span>
<span class="k">def</span><span class="w"> </span><span class="nf">validate_output</span><span class="p">(</span><span class="n">text</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-></span> <span class="nb">tuple</span><span class="p">[</span><span class="nb">bool</span><span class="p">,</span> <span class="nb">str</span><span class="p">]:</span>
<span class="k">for</span> <span class="n">pattern</span> <span class="ow">in</span> <span class="n">BLOCKED_PATTERNS</span><span class="p">:</span>
<span class="k">if</span> <span class="n">re</span><span class="o">.</span><span class="n">search</span><span class="p">(</span><span class="n">pattern</span><span class="p">,</span> <span class="n">text</span><span class="p">):</span>
<span class="k">return</span> <span class="kc">False</span><span class="p">,</span> <span class="s2">"Response contained potentially sensitive content and was blocked."</span>
<span class="k">return</span> <span class="kc">True</span><span class="p">,</span> <span class="n">text</span>
</code></pre></div>
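<p>It is worth checking a pattern like this against sample text before relying on it. The sketch below copies the pattern from <code>BLOCKED_PATTERNS</code> so it runs on its own; the sample strings are made up for illustration. Note the last one: harmless text that mentions a token and happens to contain a colon is blocked too.</p>

```python
import re

# Same pattern as in BLOCKED_PATTERNS above, copied so the sketch runs on its own.
SECRET_PATTERN = re.compile(r"(?i)\b(api[_\s]?key|token|password|secret)\b.*[:=]\s*\S+")

# Maps each sample to whether the pattern is expected to block it.
samples = {
    "Your API key: sk-12345": True,            # credential-like text
    "Never share your password.": False,       # no ':' or '=' after the keyword
    "Tokens are chunks of text.": False,       # 'Tokens' is not the whole word 'token'
    "A token is a unit: about 4 chars": True,  # false positive on harmless text
}

results = {text: bool(SECRET_PATTERN.search(text)) for text in samples}
```

<p>If false positives like this matter for your community (for example, one that discusses LLMs or authentication), tighten the pattern or log blocked responses for review rather than silently dropping them.</p>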
<h3 id="error-budget-tracking">Cost budget tracking</h3>
<p>Track your daily API spend and automatically disable the bot if costs exceed a threshold. This prevents a single viral thread from generating a $500 API bill overnight.</p>
<div class="codehilite"><pre><span></span><code><span class="kn">import</span><span class="w"> </span><span class="nn">json</span>
<span class="kn">from</span><span class="w"> </span><span class="nn">datetime</span><span class="w"> </span><span class="kn">import</span> <span class="n">date</span>
<span class="kn">from</span><span class="w"> </span><span class="nn">pathlib</span><span class="w"> </span><span class="kn">import</span> <span class="n">Path</span>
<span class="k">class</span><span class="w"> </span><span class="nc">CostTracker</span><span class="p">:</span>
<span class="k">def</span><span class="w"> </span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">daily_limit</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">10.0</span><span class="p">,</span> <span class="n">state_file</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="s2">"cost_state.json"</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">daily_limit</span> <span class="o">=</span> <span class="n">daily_limit</span>
<span class="bp">self</span><span class="o">.</span><span class="n">state_file</span> <span class="o">=</span> <span class="n">Path</span><span class="p">(</span><span class="n">state_file</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_load_state</span><span class="p">()</span>
<span class="k">def</span><span class="w"> </span><span class="nf">_load_state</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">state_file</span><span class="o">.</span><span class="n">exists</span><span class="p">():</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">state_file</span><span class="o">.</span><span class="n">read_text</span><span class="p">())</span>
<span class="k">if</span> <span class="n">data</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"date"</span><span class="p">)</span> <span class="o">==</span> <span class="nb">str</span><span class="p">(</span><span class="n">date</span><span class="o">.</span><span class="n">today</span><span class="p">()):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">today_cost</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s2">"cost"</span><span class="p">]</span>
<span class="k">return</span>
<span class="bp">self</span><span class="o">.</span><span class="n">today_cost</span> <span class="o">=</span> <span class="mf">0.0</span>
<span class="k">def</span><span class="w"> </span><span class="nf">_save_state</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">state_file</span><span class="o">.</span><span class="n">write_text</span><span class="p">(</span><span class="n">json</span><span class="o">.</span><span class="n">dumps</span><span class="p">({</span>
<span class="s2">"date"</span><span class="p">:</span> <span class="nb">str</span><span class="p">(</span><span class="n">date</span><span class="o">.</span><span class="n">today</span><span class="p">()),</span>
<span class="s2">"cost"</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">today_cost</span><span class="p">,</span>
<span class="p">}))</span>
    <span class="k">def</span><span class="w"> </span><span class="nf">record_cost</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">cost</span><span class="p">:</span> <span class="nb">float</span><span class="p">)</span> <span class="o">-></span> <span class="nb">bool</span><span class="p">:</span>
        <span class="c1"># Reload first so the total resets when the date changes on a long-running bot</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">_load_state</span><span class="p">()</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">today_cost</span> <span class="o">+=</span> <span class="n">cost</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">_save_state</span><span class="p">()</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">today_cost</span> <span class="o"><</span> <span class="bp">self</span><span class="o">.</span><span class="n">daily_limit</span>
    <span class="k">def</span><span class="w"> </span><span class="nf">is_within_budget</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">bool</span><span class="p">:</span>
        <span class="c1"># Reload first so yesterday's total does not block today's queries</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">_load_state</span><span class="p">()</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">today_cost</span> <span class="o"><</span> <span class="bp">self</span><span class="o">.</span><span class="n">daily_limit</span>
</code></pre></div>
<p>A typical Claude Sonnet query with 500 input tokens and 500 output tokens costs approximately $0.009, assuming list prices of $3 per million input tokens and $15 per million output tokens (check current pricing, since rates change). At a $10/day budget, that allows roughly 1,100 queries per day, more than enough for a community of several hundred members. Adjust the limit based on your actual usage patterns.</p>
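<p>To feed <code>CostTracker.record_cost</code> with real numbers, you need a per-query cost. A minimal sketch of that arithmetic follows. The function names are my own, and the prices in the example are assumptions; the result depends entirely on the rates you plug in, so substitute your provider's current ones:</p>

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Return the dollar cost of one query given per-million-token prices."""
    return (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    ) / 1_000_000


def queries_per_day(daily_budget: float, cost_per_query: float) -> int:
    """How many queries of this size fit in the daily budget."""
    return int(daily_budget // cost_per_query)


# Example prices are assumptions; substitute your provider's current rates.
cost = estimate_cost(500, 500, input_price_per_mtok=3.0, output_price_per_mtok=15.0)
budgeted = queries_per_day(10.0, cost)
```

<p>Most LLM APIs report the token counts for each request in the response, so in practice you would pass those counts in rather than fixed estimates, then hand the result to <code>record_cost</code>.</p>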
<h2 id="part-8-deployment-on-a-vps">Part 8: deployment on a VPS</h2>
<p>A Discord bot needs to run 24/7. Your laptop is not a server. Here is how to deploy to a Linux VPS with systemd for process management.</p>
<h3 id="upload-and-install">Upload and install</h3>
<div class="codehilite"><pre><span></span><code><span class="c1"># On your VPS</span>
mkdir<span class="w"> </span>-p<span class="w"> </span>/opt/discord-bot
<span class="nb">cd</span><span class="w"> </span>/opt/discord-bot
<span class="c1"># Copy files (from your local machine)</span>
<span class="c1"># scp -r ./* user@your-vps:/opt/discord-bot/</span>
python3<span class="w"> </span>-m<span class="w"> </span>venv<span class="w"> </span>venv
<span class="nb">source</span><span class="w"> </span>venv/bin/activate
pip<span class="w"> </span>install<span class="w"> </span>-r<span class="w"> </span>requirements.txt
<span class="c1"># Create the unprivileged account the service runs as, and give it the files</span>
sudo<span class="w"> </span>useradd<span class="w"> </span>--system<span class="w"> </span>--no-create-home<span class="w"> </span>--shell<span class="w"> </span>/usr/sbin/nologin<span class="w"> </span>botuser
sudo<span class="w"> </span>chown<span class="w"> </span>-R<span class="w"> </span>botuser:<span class="w"> </span>/opt/discord-bot
</code></pre></div>
<h3 id="create-the-systemd-service">Create the systemd service</h3>
<p>Create <code>/etc/systemd/system/discord-bot.service</code>:</p>
<div class="codehilite"><pre><span></span><code><span class="k">[Unit]</span>
<span class="na">Description</span><span class="o">=</span><span class="s">AI Discord Bot</span>
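<span class="na">After</span><span class="o">=</span><span class="s">network-online.target</span>
<span class="na">Wants</span><span class="o">=</span><span class="s">network-online.target</span>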
<span class="k">[Service]</span>
<span class="na">Type</span><span class="o">=</span><span class="s">simple</span>
<span class="na">User</span><span class="o">=</span><span class="s">botuser</span>
<span class="na">WorkingDirectory</span><span class="o">=</span><span class="s">/opt/discord-bot</span>
<span class="na">EnvironmentFile</span><span class="o">=</span><span class="s">/opt/discord-bot/.env</span>
<span class="na">ExecStart</span><span class="o">=</span><span class="s">/opt/discord-bot/venv/bin/python bot.py</span>
<span class="na">Restart</span><span class="o">=</span><span class="s">always</span>
<span class="na">RestartSec</span><span class="o">=</span><span class="s">10</span>
<span class="na">StandardOutput</span><span class="o">=</span><span class="s">journal</span>
<span class="na">StandardError</span><span class="o">=</span><span class="s">journal</span>
<span class="k">[Install]</span>
<span class="na">WantedBy</span><span class="o">=</span><span class="s">multi-user.target</span>
</code></pre></div>
<h3 id="enable-and-start">Enable and start</h3>
<div class="codehilite"><pre><span></span><code>sudo<span class="w"> </span>systemctl<span class="w"> </span>daemon-reload
sudo<span class="w"> </span>systemctl<span class="w"> </span><span class="nb">enable</span><span class="w"> </span>discord-bot
sudo<span class="w"> </span>systemctl<span class="w"> </span>start<span class="w"> </span>discord-bot
<span class="c1"># Check status</span>
sudo<span class="w"> </span>systemctl<span class="w"> </span>status<span class="w"> </span>discord-bot
<span class="c1"># View logs</span>
sudo<span class="w"> </span>journalctl<span class="w"> </span>-u<span class="w"> </span>discord-bot<span class="w"> </span>-f
</code></pre></div>
<p>The <code>Restart=always</code> directive means systemd will restart the bot if it crashes. <code>RestartSec=10</code> adds a 10-second delay to prevent rapid restart loops. Combined with the bot's own error handling, this gives you a deployment that recovers from crashes automatically.</p>
<h3 id="log-rotation">Log rotation</h3>
<p>The bot writes to <code>bot.log</code> in its working directory. Set up logrotate to prevent the file from growing indefinitely.</p>
<p>Create <code>/etc/logrotate.d/discord-bot</code>:</p>
<div class="codehilite"><pre><span></span><code>/opt/discord-bot/bot.log {
daily
rotate 14
compress
delaycompress
missingok
notifempty
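copytruncate
}
</code></pre></div>
<p>This keeps 14 days of rotated logs, with all but the most recent rotation compressed. <code>copytruncate</code> copies the log and then empties the original in place. That matters because a long-running Python process keeps <code>bot.log</code> open: if logrotate renamed the file and created a new one, a bot logging through a plain file handler would keep writing to the renamed file. For a typical bot, 14 days of logs is about 50-100 MB of storage.</p>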
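<p>An alternative is to let Python rotate the file itself, which avoids any coordination between logrotate and the running process. A minimal sketch using the standard library's <code>RotatingFileHandler</code>, which rotates by size rather than by day. The function name, logger name, and size limit are assumptions for illustration, not the bot's actual logging setup:</p>

```python
import logging
from logging.handlers import RotatingFileHandler


def build_logger(path: str, max_bytes: int = 5_000_000, backups: int = 14) -> logging.Logger:
    """Create a logger that rotates its own file once it reaches max_bytes."""
    logger = logging.getLogger("discord-bot-rotating")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    # Keeps bot.log plus up to `backups` older files (bot.log.1, bot.log.2, ...).
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

<p>Use one approach or the other, not both. If you rotate in Python, remove the logrotate entry for <code>bot.log</code>.</p>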
<h2 id="part-9-advanced-features">Part 9: advanced features</h2>
<p>Once the core bot is running reliably, here are three features that add real value.</p>
<h3 id="scheduled-posts">Scheduled posts</h3>
<p>Use discord.py's built-in task loop to post recurring content: daily summaries, market updates, or community announcements.</p>
<div class="codehilite"><pre><span></span><code><span class="kn">from</span><span class="w"> </span><span class="nn">discord.ext</span><span class="w"> </span><span class="kn">import</span> <span class="n">tasks</span>
<span class="kn">import</span><span class="w"> </span><span class="nn">datetime</span>
<span class="nd">@tasks</span><span class="o">.</span><span class="n">loop</span><span class="p">(</span><span class="n">time</span><span class="o">=</span><span class="n">datetime</span><span class="o">.</span><span class="n">time</span><span class="p">(</span><span class="n">hour</span><span class="o">=</span><span class="mi">9</span><span class="p">,</span> <span class="n">minute</span><span class="o">=</span><span class="mi">0</span><span class="p">))</span> <span class="c1"># 9 AM UTC daily</span>
<span class="k">async</span> <span class="k">def</span><span class="w"> </span><span class="nf">daily_update</span><span class="p">():</span>
<span class="n">channel</span> <span class="o">=</span> <span class="n">bot</span><span class="o">.</span><span class="n">get_channel</span><span class="p">(</span><span class="n">YOUR_CHANNEL_ID</span><span class="p">)</span>
<span class="k">if</span> <span class="n">channel</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="s2">"Daily update channel not found"</span><span class="p">)</span>
<span class="k">return</span>
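    <span class="c1"># The model does not know the current date, so pass it in explicitly</span>
    <span class="n">today</span> <span class="o">=</span> <span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">now</span><span class="p">(</span><span class="n">datetime</span><span class="o">.</span><span class="n">timezone</span><span class="o">.</span><span class="n">utc</span><span class="p">)</span><span class="o">.</span><span class="n">date</span><span class="p">()</span>
    <span class="n">summary</span> <span class="o">=</span> <span class="n">llm</span><span class="o">.</span><span class="n">generate_response</span><span class="p">(</span>
        <span class="sa">f</span><span class="s2">"Generate a brief daily community update for </span><span class="si">{</span><span class="n">today</span><span class="o">.</span><span class="n">isoformat</span><span class="p">()</span><span class="si">}</span><span class="s2">. "</span>
        <span class="s2">"Include a motivational note and a discussion prompt."</span>
    <span class="p">)</span>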
<span class="k">await</span> <span class="n">channel</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="sa">f</span><span class="s2">"**Daily Update**</span><span class="se">\n</span><span class="si">{</span><span class="n">summary</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
<span class="nd">@bot</span><span class="o">.</span><span class="n">event</span>
<span class="k">async</span> <span class="k">def</span><span class="w"> </span><span class="nf">on_ready</span><span class="p">():</span>
<span class="c1"># ... existing on_ready code ...</span>
<span class="k">if</span> <span class="ow">not</span> <span class="n">daily_update</span><span class="o">.</span><span class="n">is_running</span><span class="p">():</span>
<span class="n">daily_update</span><span class="o">.</span><span class="n">start</span><span class="p">()</span>
</code></pre></div>
<h3 id="multi-channel-awareness">Multi-channel awareness</h3>
<p>Different channels often need different bot behaviors. A <code>#research</code> channel might need detailed, technical responses while a <code>#general</code> channel needs shorter, casual ones.</p>
<div class="codehilite"><pre><span></span><code><span class="n">CHANNEL_CONFIGS</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"research"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"system_prompt"</span><span class="p">:</span> <span class="s2">"You are a research assistant. Provide detailed, cited responses."</span><span class="p">,</span>
<span class="s2">"max_tokens"</span><span class="p">:</span> <span class="mi">1500</span><span class="p">,</span>
<span class="p">},</span>
<span class="s2">"general"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"system_prompt"</span><span class="p">:</span> <span class="s2">"You are a friendly community bot. Keep responses brief and casual."</span><span class="p">,</span>
<span class="s2">"max_tokens"</span><span class="p">:</span> <span class="mi">500</span><span class="p">,</span>
<span class="p">},</span>
<span class="p">}</span>
<span class="k">def</span><span class="w"> </span><span class="nf">get_channel_config</span><span class="p">(</span><span class="n">channel_name</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-></span> <span class="nb">dict</span><span class="p">:</span>
<span class="k">return</span> <span class="n">CHANNEL_CONFIGS</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">channel_name</span><span class="p">,</span> <span class="n">CHANNEL_CONFIGS</span><span class="p">[</span><span class="s2">"general"</span><span class="p">])</span>
</code></pre></div>
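<p>Raising <code>max_tokens</code> has a side effect: a 1,500-token response can run well past Discord's 2,000-character limit for a single message, and sending it in one piece fails. A minimal sketch of a splitter that breaks at newlines where it can. The name <code>split_message</code> is my own, and this version does not try to keep Markdown code fences intact across chunks:</p>

```python
DISCORD_MESSAGE_LIMIT = 2000  # Discord's per-message character limit


def split_message(text: str, limit: int = DISCORD_MESSAGE_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring newline breaks."""
    chunks = []
    while len(text) > limit:
        # Break at the last newline inside the window, or hard-cut if there is none.
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks
```

<p>In the mention handler, you could reply with the first chunk and send the rest to the same channel in order.</p>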
<h3 id="conversation-context">Conversation context</h3>
<p>For more natural conversations, pass recent channel history as context to the LLM.</p>
<div class="codehilite"><pre><span></span><code><span class="k">async</span> <span class="k">def</span><span class="w"> </span><span class="nf">get_recent_context</span><span class="p">(</span><span class="n">channel</span><span class="p">,</span> <span class="n">limit</span><span class="o">=</span><span class="mi">5</span><span class="p">):</span>
<span class="n">messages</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">async</span> <span class="k">for</span> <span class="n">msg</span> <span class="ow">in</span> <span class="n">channel</span><span class="o">.</span><span class="n">history</span><span class="p">(</span><span class="n">limit</span><span class="o">=</span><span class="n">limit</span><span class="p">):</span>
<span class="k">if</span> <span class="n">msg</span><span class="o">.</span><span class="n">author</span> <span class="o">!=</span> <span class="n">bot</span><span class="o">.</span><span class="n">user</span><span class="p">:</span>
<span class="n">messages</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="sa">f</span><span class="s2">"</span><span class="si">{</span><span class="n">msg</span><span class="o">.</span><span class="n">author</span><span class="o">.</span><span class="n">display_name</span><span class="si">}</span><span class="s2">: </span><span class="si">{</span><span class="n">msg</span><span class="o">.</span><span class="n">content</span><span class="p">[:</span><span class="mi">200</span><span class="p">]</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
<span class="n">messages</span><span class="o">.</span><span class="n">reverse</span><span class="p">()</span>
<span class="k">return</span> <span class="s2">"</span><span class="se">\n</span><span class="s2">"</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">messages</span><span class="p">)</span>
</code></pre></div>
<p>This adds context but also adds cost. Each additional message in the context increases token usage. At 5 messages of context, expect a 30-50% increase in per-query cost. Monitor your cost tracker and adjust accordingly.</p>
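<p>One way to keep that cost bounded is to cap the context by characters before it reaches the LLM. A minimal sketch, where <code>build_prompt</code> and the prompt wording are my own illustration rather than part of the bot above:</p>

```python
def build_prompt(context: str, question: str, max_context_chars: int = 1000) -> str:
    """Combine recent channel history with the user's question.

    Keeps only the most recent `max_context_chars` characters of context
    so the added token cost stays bounded.
    """
    trimmed = context[-max_context_chars:] if max_context_chars > 0 else ""
    if not trimmed:
        return question
    return (
        "Recent conversation (for context only):\n"
        f"{trimmed}\n\n"
        f"Question: {question}"
    )
```

<p>You would pass the output of <code>get_recent_context</code> as <code>context</code>. Trimming from the front keeps the newest messages, which are usually the most relevant to the question.</p>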
<h2 id="part-10-monitoring-and-maintenance">Part 10: monitoring and maintenance</h2>
<p>A deployed bot needs ongoing attention. Here is the monitoring setup I use.</p>
<h3 id="health-check-endpoint">Health check endpoint</h3>
<p>Add a simple HTTP health check so external monitoring services (UptimeRobot, Better Stack) can verify the bot is running.</p>
<div class="codehilite"><pre><span></span><code><span class="kn">from</span><span class="w"> </span><span class="nn">aiohttp</span><span class="w"> </span><span class="kn">import</span> <span class="n">web</span>
<span class="kn">import</span><span class="w"> </span><span class="nn">asyncio</span>
<span class="k">async</span> <span class="k">def</span><span class="w"> </span><span class="nf">health_handler</span><span class="p">(</span><span class="n">request</span><span class="p">):</span>
<span class="k">return</span> <span class="n">web</span><span class="o">.</span><span class="n">json_response</span><span class="p">({</span><span class="s2">"status"</span><span class="p">:</span> <span class="s2">"healthy"</span><span class="p">,</span> <span class="s2">"latency"</span><span class="p">:</span> <span class="n">bot</span><span class="o">.</span><span class="n">latency</span><span class="p">})</span>
<span class="k">async</span> <span class="k">def</span><span class="w"> </span><span class="nf">start_health_server</span><span class="p">():</span>
<span class="n">app</span> <span class="o">=</span> <span class="n">web</span><span class="o">.</span><span class="n">Application</span><span class="p">()</span>
<span class="n">app</span><span class="o">.</span><span class="n">router</span><span class="o">.</span><span class="n">add_get</span><span class="p">(</span><span class="s2">"/health"</span><span class="p">,</span> <span class="n">health_handler</span><span class="p">)</span>
<span class="n">runner</span> <span class="o">=</span> <span class="n">web</span><span class="o">.</span><span class="n">AppRunner</span><span class="p">(</span><span class="n">app</span><span class="p">)</span>
<span class="k">await</span> <span class="n">runner</span><span class="o">.</span><span class="n">setup</span><span class="p">()</span>
<span class="n">site</span> <span class="o">=</span> <span class="n">web</span><span class="o">.</span><span class="n">TCPSite</span><span class="p">(</span><span class="n">runner</span><span class="p">,</span> <span class="s2">"0.0.0.0"</span><span class="p">,</span> <span class="mi">8080</span><span class="p">)</span>
<span class="k">await</span> <span class="n">site</span><span class="o">.</span><span class="n">start</span><span class="p">()</span>
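<span class="n">health_server_started</span> <span class="o">=</span> <span class="kc">False</span>
<span class="nd">@bot</span><span class="o">.</span><span class="n">event</span>
<span class="k">async</span> <span class="k">def</span><span class="w"> </span><span class="nf">on_ready</span><span class="p">():</span>
    <span class="c1"># ... existing on_ready code ...</span>
    <span class="c1"># on_ready can fire again after a reconnect, so start the server only once</span>
    <span class="k">global</span> <span class="n">health_server_started</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">health_server_started</span><span class="p">:</span>
        <span class="n">health_server_started</span> <span class="o">=</span> <span class="kc">True</span>
        <span class="n">asyncio</span><span class="o">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">start_health_server</span><span class="p">())</span>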
</code></pre></div>
<h3 id="monthly-maintenance-checklist">Monthly maintenance checklist</h3>
<p>I run this checklist on the first of every month for every bot I maintain:</p>
<ul>
<li>[ ] Review error logs for recurring issues</li>
<li>[ ] Check API cost trends. Are they stable, growing, or spiking?</li>
<li>[ ] Update dependencies (<code>pip list --outdated</code>)</li>
<li>[ ] Verify rate limit settings are appropriate for current community size</li>
<li>[ ] Test the bot manually with 5-10 representative queries</li>
<li>[ ] Confirm log rotation is working (check disk usage)</li>
<li>[ ] Review and update the system prompt if community needs have changed</li>
</ul>
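<p>The disk usage item on the checklist is easy to script. The following is a minimal sketch using only the standard library; the function names and the 80% threshold are assumptions for illustration, not part of the bot code in this tutorial.</p>

```python
import shutil


def disk_usage_percent(path: str = "/") -> float:
    """Return the percentage of space in use on the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def needs_attention(percent_used: float, threshold: float = 80.0) -> bool:
    """Flag the disk when usage reaches the (assumed) warning threshold."""
    return percent_used >= threshold


if __name__ == "__main__":
    pct = disk_usage_percent("/")
    status = "WARNING" if needs_attention(pct) else "ok"
    print(f"Disk usage: {pct:.1f}% ({status})")
```

<p>If usage keeps climbing month over month, that is a hint that log rotation is not deleting old files.</p>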
<p>Average monthly maintenance time: 30-45 minutes per bot. At a $150/hour consulting rate, that is $75-$112.50/month in maintenance cost. Factor this into your total cost of ownership.</p>
<h2 id="what-this-architecture-supports">What this architecture supports</h2>
<p>This tutorial gives you a foundation that handles communities up to 1,000 active members. For larger deployments, you would add message queuing (Redis or RabbitMQ) to handle burst traffic, database storage (PostgreSQL) for conversation history and analytics, multiple bot instances behind a load balancer for high availability, and webhook integration to connect the bot with external services.</p>
<p>But for most communities, the architecture in this tutorial is more than sufficient. My production bot serves a community of 200+ members on a single $6/month VPS with no performance issues.</p>
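<p>To make the message-queuing idea concrete, here is a minimal in-process sketch that uses <code>asyncio.Queue</code> as a stand-in for Redis or RabbitMQ. It is not the production setup: <code>handle_query</code> is a hypothetical placeholder for the LLM call, and the worker count and queue size are assumed values.</p>

```python
import asyncio


async def handle_query(query: str) -> str:
    """Hypothetical placeholder for the LLM call."""
    await asyncio.sleep(0.01)
    return f"answer to: {query}"


async def worker(queue: asyncio.Queue, results: list) -> None:
    """Pull queries off the queue one at a time until cancelled."""
    while True:
        query = await queue.get()
        try:
            results.append(await handle_query(query))
        finally:
            queue.task_done()


async def process_burst(
    queries: list[str], num_workers: int = 2, max_queue_size: int = 100
) -> list[str]:
    """Feed a burst of queries through a bounded queue and a fixed worker pool."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=max_queue_size)
    results: list[str] = []
    workers = [
        asyncio.create_task(worker(queue, results)) for _ in range(num_workers)
    ]
    for query in queries:
        # put() waits when the queue is full, which applies backpressure
        await queue.put(query)
    await queue.join()  # wait until every queued query has been handled
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    print(asyncio.run(process_burst(["q1", "q2", "q3"])))
```

<p>The fixed worker pool caps how many LLM calls run at once, and the bounded queue keeps a traffic spike from exhausting memory. An external broker adds persistence and lets several bot instances share the same queue.</p>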
<h2 id="build-vs-buy">Build vs. buy</h2>
<p>Before building a custom bot, consider whether an off-the-shelf solution fits. Tools like MEE6, Dyno, and Carl-bot handle moderation and basic automation well. Building custom makes sense when you need LLM-powered responses tuned to your community's domain, integration with your specific business systems (CRM, scheduling, databases), full control over data privacy and cost management, or features that no existing bot provides.</p>
<p>If your needs are simpler, start with an existing bot and build custom when you hit its limits. For a broader perspective on when to build custom versus use existing tools, see my post on <a href="/blog/ai-agents-vs-zapier">AI agents vs. Zapier</a>.</p>
<p>If you need a custom AI bot built for your community or business and do not want to build it yourself, <a href="/services/automation-audit">see my services</a>. Discord bots are one of the most common agentic engineering projects I deliver.</p>
<hr />
<h2 id="frequently-asked-questions">Frequently asked questions</h2>
<h3 id="how-much-does-it-cost-to-run-an-ai-discord-bot">How much does it cost to run an AI Discord bot?</h3>
<p>A VPS costs $5-$10/month. LLM API costs depend on usage: at $0.003-$0.015 per query, a community generating 500 queries/month costs $1.50-$7.50 in API fees. Total: $6.50-$17.50/month for most small to mid-size communities. The cost tracker in this tutorial helps you monitor and cap spending.</p>
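<p>The arithmetic above is simple enough to rerun for your own query volume. This is a small sketch with a hypothetical helper name; the default ranges are the figures quoted in this answer.</p>

```python
def monthly_cost_range(
    queries_per_month: int,
    vps_range: tuple[float, float] = (5.0, 10.0),
    per_query_range: tuple[float, float] = (0.003, 0.015),
) -> tuple[float, float]:
    """Return the (low, high) estimated monthly cost in dollars."""
    low = vps_range[0] + queries_per_month * per_query_range[0]
    high = vps_range[1] + queries_per_month * per_query_range[1]
    return round(low, 2), round(high, 2)


if __name__ == "__main__":
    low, high = monthly_cost_range(500)
    print(f"500 queries/month: ${low:.2f}-${high:.2f}")
```

<p>Calling <code>monthly_cost_range(500)</code> reproduces the $6.50-$17.50 total above.</p>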
<h3 id="can-i-use-a-free-llm-instead-of-a-paid-api">Can I use a free LLM instead of a paid API?</h3>
<p>Yes. You can run open-source models like Llama or Mistral locally on your VPS, but you will need a more powerful server (at least 16 GB RAM, ideally with a GPU). The tradeoff is higher server cost ($30-$80/month) but zero per-query API fees. For most small communities, the paid API approach is cheaper and simpler.</p>
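<p>One way to check that claim for your own community is a break-even estimate: how many queries per month it takes before the extra server cost is offset by the API fees you no longer pay. The helper below is a hypothetical sketch that uses the price ranges from this FAQ and ignores setup time and differences in model quality.</p>

```python
import math


def break_even_queries(
    local_server_cost: float, api_vps_cost: float, cost_per_query: float
) -> int:
    """Return the monthly query count at which self-hosting starts to cost less."""
    extra_server_cost = local_server_cost - api_vps_cost
    if extra_server_cost <= 0:
        return 0  # the bigger server is no more expensive, so it wins immediately
    return math.ceil(extra_server_cost / cost_per_query)


if __name__ == "__main__":
    # Most favorable case for self-hosting: cheapest large server ($30),
    # priciest small VPS ($10), priciest API rate ($0.015 per query)
    print(break_even_queries(30.0, 10.0, 0.015))
```

<p>Even in that most favorable case the break-even point is above 1,300 queries per month, which is consistent with the paid API being cheaper for small communities.</p>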
<h3 id="how-do-i-prevent-the-bot-from-generating-harmful-content">How do I prevent the bot from generating harmful content?</h3>
<p>Three layers of defense: a system prompt that explicitly prohibits harmful content, output validation that checks responses before sending, and rate limiting that prevents abuse at scale. No single layer is perfect, but together they provide strong protection. You should also set up a moderation log channel where flagged content is posted for human review.</p>
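<p>As an illustration of the output-validation layer, here is a minimal sketch of a check that runs before a response is sent. The blocked patterns are hypothetical examples, not the rules from my production bot; the 2,000-character cap is Discord's standard message length limit.</p>

```python
import re

DISCORD_MESSAGE_LIMIT = 2000  # Discord's standard per-message character limit

# Hypothetical example rules; replace with patterns suited to your community
BLOCKED_PATTERNS = [
    re.compile(r"@everyone|@here"),                  # mass pings
    re.compile(r"discord\.gg/\S+", re.IGNORECASE),   # server invite links
]


def validate_response(text: str) -> tuple[bool, str]:
    """Return (ok, reason) for an LLM response before it is posted."""
    if not text.strip():
        return False, "empty response"
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(text):
            return False, f"matched blocked pattern: {pattern.pattern}"
    if len(text) > DISCORD_MESSAGE_LIMIT:
        return False, "response exceeds Discord message limit"
    return True, "ok"


if __name__ == "__main__":
    print(validate_response("Here is a summary of the paper you asked about."))
    print(validate_response("Join now: discord.gg/example"))
```

<p>When the check fails, the bot can send a generic fallback reply and post the original text and the reason to the moderation log channel for review.</p>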
<h3 id="can-this-bot-handle-multiple-discord-servers">Can this bot handle multiple Discord servers?</h3>
<p>Yes, with minor modifications. Remove the <code>GUILD_ID</code> restriction and sync commands globally instead of per-guild. Be aware that global command sync can take up to an hour to propagate, and your rate limiting and cost tracking should account for aggregate usage across all servers.</p>