Local LLM via LM Studio

BunkerM can connect to a locally running large language model through LM Studio, giving you a fully private, offline-capable AI chat assistant that understands your broker's live state — no cloud subscription required.

Available on all plans — including the free Community edition. Local LLM works entirely on your own hardware. No BunkerAI Cloud account is needed.

What Is Local LLM Mode?

Local LLM mode lets BunkerM's AI chat use a model running on your own machine via LM Studio instead of BunkerAI Cloud. The AI receives live broker context on every message — connected clients, active topics and their latest payloads, broker statistics — and can execute actions like creating clients or publishing messages directly through BunkerM's internal APIs.

This is functionally equivalent to Cloud AI for the web chat interface. The key differences are:

  • Privacy: All data stays on your machine. Nothing leaves your network.
  • No usage limits: No interaction quota or monthly billing.
  • Model choice: Use any model supported by LM Studio (Llama, Mistral, Qwen, Phi, etc.).
  • No Telegram / Slack: Those connectors route through BunkerAI Cloud and cannot use a local model.

Prerequisites

  1. LM Studio installed — Download from lmstudio.ai. Available for macOS, Windows, and Linux.
  2. A model downloaded in LM Studio — Recommended starting points:
    • lmstudio-community/Qwen2.5-7B-Instruct-GGUF — good balance of speed and capability
    • bartowski/Llama-3.2-3B-Instruct-GGUF — fast on CPU, lower RAM usage
    • lmstudio-community/mistral-7b-instruct-v0.3-GGUF — solid instruction following
  3. LM Studio local server running — Enable it inside LM Studio under the Local Server tab. The default address is http://localhost:1234.

Step 1 — Start the LM Studio Server

  1. Open LM Studio.
  2. Go to the Local Server tab (the <-> icon in the sidebar).
  3. Select the model you want to use from the dropdown.
  4. Click Start Server. The status bar should show Server running on port 1234.

LM Studio exposes an OpenAI-compatible API at http://localhost:1234/v1. BunkerM talks to this endpoint directly.
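Because the endpoint speaks the standard OpenAI chat completions protocol, you can verify it from any HTTP client before configuring BunkerM. A minimal sketch using only the Python standard library (the model name here is illustrative; use whatever model you loaded in LM Studio):

```python
import json
import urllib.request

LM_STUDIO_URL = "http://localhost:1234/v1"  # default LM Studio server address

def build_chat_request(model, message):
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": message}],
        "temperature": 0.7,
    }

def chat(model, message, base_url=LM_STUDIO_URL):
    """Send one chat message to the local server and return the reply text."""
    payload = json.dumps(build_chat_request(model, message)).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Quick smoke test; prints an error instead of crashing if the server is down.
try:
    print(chat("qwen2.5-7b-instruct", "Say hello in one word."))
except OSError as exc:
    print(f"LM Studio not reachable: {exc}")
```

If this prints a model reply, the server is up and BunkerM will be able to reach it at the same URL.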

Step 2 — Configure BunkerM

  1. In BunkerM, go to Settings → Integrations.
  2. Scroll to the Local LLM card.
  3. Enter the LM Studio server URL. If BunkerM runs in Docker on the same machine, use:
    http://host.docker.internal:1234
    If BunkerM runs directly on the host (not in Docker), use:
    http://localhost:1234
  4. Click Fetch Models — BunkerM will query LM Studio and populate the model dropdown.
  5. Select the model you loaded in LM Studio.
  6. Toggle the Enable switch on.
  7. Click Save.

Tip: Toggling the Enable switch automatically fetches available models from LM Studio, so you don't need to click Fetch Models separately.
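Under the hood, fetching models amounts to a GET against LM Studio's OpenAI-compatible model list endpoint. A minimal sketch of the same query, in case you want to check what BunkerM will see:

```python
import json
import urllib.request

def extract_model_ids(payload):
    """Pull the 'id' field from each entry of an OpenAI-style model list."""
    return [m["id"] for m in payload.get("data", [])]

def fetch_model_ids(base_url="http://localhost:1234/v1"):
    """Query LM Studio's /models endpoint and return the available model IDs."""
    with urllib.request.urlopen(f"{base_url}/models") as resp:
        return extract_model_ids(json.load(resp))

# Prints the loaded models, or an error if the server is not running.
try:
    print(fetch_model_ids())
except OSError as exc:
    print(f"server not reachable: {exc}")
```

An empty list usually means the server is running but no model is loaded in LM Studio.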

Step 3 — Switch to Local LLM in the Chat

  1. Open AI → Chat in the BunkerM sidebar.
  2. In the chat header, click the Local LLM button (next to the Cloud button).
  3. Your mode selection is saved automatically — it persists when you navigate away and return.
  4. Type a message and press Enter. The local model will respond with full knowledge of your broker's current state.

What the AI Knows About Your Broker

On every message, BunkerM injects a live context snapshot into the local model's system prompt. This includes:

  • Broker statistics — connected clients count, messages per second, uptime, subscriptions
  • Active topics — all topics with their latest published payload
  • Connected clients — currently active client IDs
  • Registered clients — all DynSec clients with their roles

This means you can ask questions like "What is the current value of home/sensor/temp?" or "How many clients are connected?" and get accurate, live answers.
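Conceptually, the injection works by rendering the snapshot into the system prompt before each request. The sketch below shows one plausible layout; BunkerM's actual prompt format is internal and may differ:

```python
def build_system_prompt(stats, topics, connected, registered):
    """Render a live broker snapshot into a system prompt (illustrative layout,
    not BunkerM's actual internal format)."""
    lines = ["You are BunkerM's broker assistant. Current broker state:"]
    lines.append(
        f"- Connected clients: {stats['connected_clients']} "
        f"({stats['messages_per_sec']} msg/s, uptime {stats['uptime']})"
    )
    lines.append("- Active topics (latest payloads):")
    for topic, payload in topics.items():
        lines.append(f"    {topic} = {payload!r}")
    lines.append("- Connected client IDs: " + ", ".join(connected))
    lines.append(
        "- Registered clients: "
        + ", ".join(f"{c['id']} ({c['role']})" for c in registered)
    )
    return "\n".join(lines)

# Example snapshot, rendered the way it would be prepended to each message.
print(build_system_prompt(
    {"connected_clients": 2, "messages_per_sec": 5, "uptime": "3h"},
    {"home/sensor/temp": "21.5"},
    ["sensor-01", "sensor-02"],
    [{"id": "sensor-01", "role": "telemetry"}],
))
```

Because the snapshot is rebuilt on every message, answers reflect the broker's state at the moment you ask, not at the start of the conversation.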

What the AI Can Do

The local AI can execute actions against your broker directly — not just describe how to do it manually:

  • Create a single MQTT client — "Create a client called sensor-01 with a secure password"
  • Create multiple clients at once — "Create 5 clients named device-01 through device-05"
  • Publish a message to a topic — "Publish 'ON' to home/lights/kitchen"
  • Delete a client — "Delete client old-device-03"
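One common way to implement this pattern is to have the model embed a structured action directive in its reply, which the host application extracts and executes. The tag and JSON schema below are entirely hypothetical; BunkerM's real action format is internal and not documented here:

```python
import json
import re

# Hypothetical wire format: the model wraps a JSON action in <action>...</action>
# tags inside its reply. BunkerM's actual format may differ.
ACTION_RE = re.compile(r"<action>\s*(\{.*?\})\s*</action>", re.DOTALL)

def extract_actions(reply):
    """Return every action directive found in a model reply as a parsed dict."""
    return [json.loads(match) for match in ACTION_RE.findall(reply)]

reply = (
    'Done. <action>{"action": "publish", '
    '"topic": "home/lights/kitchen", "payload": "ON"}</action>'
)
print(extract_actions(reply))
```

This separation is why instruction-following quality matters: a model that emits an action block when you only asked a question will trigger a real broker operation.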

Resetting the Conversation

If the model gets confused by previous context, you can reset the conversation in either of two ways:

  • Type /reset in the chat input and press Enter.
  • Click the trash icon (🗑) in the chat header (visible when there are messages).

Both clear only the current mode's history (Local LLM history and Cloud AI history are stored separately).

Choosing the Right Model

Larger models (7B+) follow instructions more reliably, handle complex requests, and are less likely to add unnecessary action blocks to read-only queries. Smaller models (1–3B) respond faster and use less RAM but may occasionally misinterpret instructions.

For best results with BunkerM, use a model with strong instruction-following capability — Qwen2.5-Instruct and Llama-3-Instruct series perform well. Avoid base (non-instruct) models.

Troubleshooting

No models appear after clicking Fetch Models

  • Make sure the LM Studio local server is running (status bar shows green).
  • Check that the URL in BunkerM matches the LM Studio server address.
  • If BunkerM is in Docker, use http://host.docker.internal:1234 not localhost.

The model responds but doesn't know my broker state

  • Ensure Local LLM is enabled and saved in Settings → Integrations.
  • The broker context is fetched live on each message — check that BunkerM's internal services are running (docker compose logs -f).

The model creates clients when I only asked a question

  • This is a known limitation of smaller models that may ignore the instruction "only act on explicit requests".
  • Switch to a larger or more capable model (7B+).
  • Be explicit: start questions with "Tell me…" or "What is…" rather than imperative phrasing.

Connection refused error

  • LM Studio server may not be running. Go to LM Studio → Local Server tab and click Start Server.
  • If using a custom port, update the URL in Settings → Integrations accordingly.
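For a quick diagnosis, a small reachability probe against the model-list endpoint distinguishes "server down" from "wrong URL". A sketch using only the standard library:

```python
import urllib.request

def probe_lm_studio(base_url="http://localhost:1234/v1", timeout=3):
    """Return (ok, detail) after a quick reachability check of the server."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            return True, f"server reachable (HTTP {resp.status})"
    except OSError as exc:
        # Covers connection refused, DNS failure, and timeouts.
        return False, f"cannot reach {base_url}: {exc}"

ok, detail = probe_lm_studio()
print(("OK: " if ok else "FAIL: ") + detail)
```

Run it once with `http://localhost:1234/v1` and once with the exact URL configured in BunkerM; if only the first succeeds, the URL in Settings → Integrations is the problem (typically a missing `host.docker.internal` when BunkerM runs in Docker).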