Swival

A coding agent for any model.

Swival is a CLI coding agent built to be practical, reliable, and easy to use. It works with frontier models, but its main goal is to be as reliable as possible with smaller models, including local ones. It is designed from the ground up to handle tight context windows and limited resources without falling apart.

It connects to LM Studio, llama.cpp, HuggingFace Inference API, OpenRouter, Google Gemini, ChatGPT Plus/Pro, AWS Bedrock, any OpenAI-compatible server (ollama, mlx_lm.server, vLLM, etc.), or any external command (codex exec, custom wrappers, etc.), sends your task, and runs an autonomous tool loop until it produces an answer. With LM Studio and llama.cpp it auto-discovers your loaded model, so there's nothing to configure. Pure Python, no framework.

Quickstart

Pick the provider that matches how you want to run models:

| Provider | Auth | Required flags | First command |
| --- | --- | --- | --- |
| LM Studio | none | none | swival "Refactor src/api.py" |
| llama.cpp | none | --provider llamacpp | swival --provider llamacpp "Refactor src/api.py" |
| HuggingFace | HF_TOKEN or --api-key | --provider huggingface --model ORG/MODEL | swival --provider huggingface --model zai-org/GLM-5 "task" |
| OpenRouter | OPENROUTER_API_KEY or --api-key | --provider openrouter --model MODEL | swival --provider openrouter --model z-ai/glm-5 "task" |
| Google Gemini | GEMINI_API_KEY, OPENAI_API_KEY, or --api-key | --provider google --model MODEL | swival --provider google --model gemini-2.5-flash "task" |
| ChatGPT Plus/Pro | browser auth on first run or CHATGPT_API_KEY | --provider chatgpt --model MODEL | swival --provider chatgpt --model gpt-5.4 "task" |
| Generic | optional OPENAI_API_KEY | --provider generic --base-url URL --model MODEL | swival --provider generic --base-url http://127.0.0.1:8080 --model my-model "task" |
| AWS Bedrock | AWS credential chain (AWS_PROFILE, env vars, IAM) | --provider bedrock --model MODEL | swival --provider bedrock --model global.anthropic.claude-opus-4-6-v1 "task" |
| Command | none | --provider command --model "COMMAND" | swival --provider command --model "codex exec --full-auto" "task" |

Run swival --help for the grouped CLI reference and copy-paste examples.

LM Studio

  1. Install LM Studio and load a model with tool-calling support. Recommended first model: qwen3-coder-next (great quality/speed tradeoff on local hardware). Crank the context size as high as your hardware allows.
  2. Start the LM Studio server.
  3. Install Swival (requires Python 3.13+):
uv tool install swival

On macOS you can also use Homebrew: brew install swival/tap/swival

  4. Run:
swival "Refactor the error handling in src/api.py"

That's it. Swival finds the model, connects, and goes to work.

llama.cpp

  1. Start llama-server with a model (use --fit on to auto-size context to available memory):
    llama-server --reasoning auto --fit on \
        -hf unsloth/gemma-4-26B-A4B-it-GGUF:UD-Q4_K_XL
  2. Install Swival:
    uv tool install swival
  3. Run (model is auto-discovered from the server):
    swival --provider llamacpp "Refactor the error handling in src/api.py"

The default base URL is http://127.0.0.1:8080. Override with --base-url.
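If llama-server is listening elsewhere, the override is a single flag (the host and port below are illustrative):

```shell
# llama-server running on a different host/port (addresses are examples):
swival --provider llamacpp --base-url http://192.168.1.50:9090 \
    "Refactor the error handling in src/api.py"
```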

HuggingFace

export HF_TOKEN=hf_...
uv tool install swival
swival "Refactor the error handling in src/api.py" \
    --provider huggingface --model zai-org/GLM-5

You can also point it at a dedicated endpoint with --base-url and --api-key.

OpenRouter

export OPENROUTER_API_KEY=sk_or_...
uv tool install swival
swival "Refactor the error handling in src/api.py" \
    --provider openrouter --model z-ai/glm-5

Google Gemini

export GEMINI_API_KEY=...
uv tool install swival
swival "Refactor the error handling in src/api.py" \
    --provider google --model gemini-2.5-flash

ChatGPT Plus/Pro

Use OpenAI models through your existing ChatGPT Plus or Pro subscription -- no API key needed.

uv tool install swival
swival "Refactor the error handling in src/api.py" \
    --provider chatgpt --model gpt-5.4

On first use, a device code and URL are printed to your terminal. Open the URL, enter the code, and authorize with your ChatGPT account. Tokens are cached locally for subsequent runs.

Generic (OpenAI-compatible)

swival "Refactor the error handling in src/api.py" \
    --provider generic \
    --base-url http://127.0.0.1:8080 \
    --model my-model

Works with ollama, mlx_lm.server, vLLM, DeepSeek API, and anything else that speaks the OpenAI chat completions protocol. No API key required for local servers.
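For example, ollama serves an OpenAI-compatible API under /v1 on its default port 11434; the model name below is just an illustration of any tool-calling model you have pulled:

```shell
ollama pull qwen3-coder       # or any other tool-calling model
swival "Refactor the error handling in src/api.py" \
    --provider generic \
    --base-url http://127.0.0.1:11434/v1 \
    --model qwen3-coder
```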

Interactive sessions

swival

The REPL carries conversation history across questions, which makes it good for exploratory work and longer tasks.

Task Input From Stdin

If you omit the positional task and pipe stdin, Swival reads the task from stdin.

swival -q < objective.md

cat prompts/review.md | swival --provider huggingface --model zai-org/GLM-5

Useful for long prompts, avoiding shell-quoting headaches, and scripted workflows.
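In a script, stdin input combines naturally with stream redirection and the exit code (this sketch assumes a nonzero exit status on failure; see the Usage docs for the exact exit-code semantics):

```shell
if swival -q < objective.md > result.md 2> run.log; then
    echo "done: result.md"
else
    echo "agent failed; see run.log" >&2
fi
```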

Updates and uninstall

uv tool upgrade swival    # update (uv)
uv tool uninstall swival  # remove (uv)
brew upgrade swival       # update (Homebrew)
brew uninstall swival     # remove (Homebrew)

What makes it different

Reliable with small models. Context management is one of Swival's strengths. It keeps the context lean and focused, which matters most when you are working with models that have tight context windows. Graduated compaction, persistent thinking notes, and a todo checklist all survive context resets, so the agent doesn't lose track of multi-step plans even under pressure.

Your models, your way. Works with LM Studio, llama.cpp, HuggingFace Inference API, OpenRouter, Google Gemini, ChatGPT Plus/Pro, AWS Bedrock, any OpenAI-compatible server, and any external command. With LM Studio and llama.cpp, it auto-discovers whatever model you have loaded — nothing to configure. With HuggingFace or OpenRouter, point it at any supported model. With Google Gemini, use Gemini models through Google's native API. With ChatGPT Plus/Pro, authenticate through your browser and use OpenAI's models through your existing subscription. With AWS Bedrock, authenticate through the standard AWS credential chain. With the generic provider, connect to ollama, mlx_lm.server, vLLM, or any other compatible server. With the command provider, shell out to any program that reads a prompt on stdin and writes a response on stdout. You pick the model and the infrastructure.

Review loop and LLM-as-a-judge. Swival has a configurable review loop that can run external reviewer scripts or use a built-in LLM-as-judge to automatically evaluate and retry agent output. Good for quality assurance on tasks that matter.

Built for benchmarking. Pass --report report.json and Swival writes a machine-readable evaluation report with per-call LLM timing, tool success/failure counts, context compaction events, and guardrail interventions. Useful for comparing models, settings, skills, and MCP servers systematically on real coding tasks.
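A quick way to get oriented in a fresh report is to list its top-level keys (the exact schema is covered in the Reports docs; jq is assumed to be installed):

```shell
swival --report report.json "Refactor src/api.py"
jq 'keys' report.json
```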

Secrets stay on your machine. Swival transparently detects API keys and credential tokens in LLM messages and encrypts them before they leave your machine when you enable secret encryption with --encrypt-secrets. The LLM never sees the real values. Decryption happens locally when the response comes back, so tools still work normally. See Secret Encryption for details.

Cross-session memory. The agent remembers things across sessions. It stores notes in a local memory file and retrieves the most relevant entries for each new conversation using BM25 ranking, so context from past work carries forward without bloating the prompt. Use /learn in the REPL to teach it something on the spot.

Pick up where you left off. When a session is interrupted — Ctrl+C, max turns, context overflow — Swival saves its state to disk. Next time you run it in the same directory, it picks up where it left off: what it was doing, what it had figured out, and what was left.

A2A server mode. Run swival --serve and your agent becomes an A2A endpoint that other agents can call over HTTP. Multi-turn context, streaming, rate limiting, and bearer auth are built in.

Skills, MCP, and A2A. Extend the agent with SKILL.md-based skills for reusable workflows, connect to external tools via the Model Context Protocol, and talk to remote agents via the Agent-to-Agent (A2A) protocol.

Small enough to read and hack. A compact Python codebase with no framework underneath. If something doesn't work the way you want, change it.

CLI-native. stdout is exclusively the final answer. All diagnostics go to stderr. Pipe Swival's output straight into another command or a file.
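That separation makes redirection clean:

```shell
# Final answer to a file, diagnostics to a log (or your terminal):
swival "Summarize the TODOs in this repo" > answer.md 2> run.log
```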

Extensible with custom commands. Drop scripts or prompt templates into ~/.config/swival/commands/ and invoke them with !name in the REPL. The swival-commands community repo has ready-made commands like a security auditor and a PR reviewer.
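A minimal sketch of adding a command — the filename and template format here are assumptions, so check the Custom Commands docs for the exact conventions:

```shell
# Create the commands directory and drop in a prompt template.
# (audit.md and its plain-prompt format are assumed for illustration.)
mkdir -p ~/.config/swival/commands
cat > ~/.config/swival/commands/audit.md <<'EOF'
Review the current project for common security issues and report findings.
EOF
# In the REPL, invoke it as: !audit
```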

Documentation

Full documentation is available at swival.dev.

  • Getting Started -- installation, first run, what happens under the hood
  • Usage -- one-shot mode, REPL mode, CLI flags, piping, exit codes
  • Tools -- what the agent can do: file ops, search, editing, web fetching, thinking, task tracking, command execution
  • Safety and Sandboxing -- path resolution, symlink protection, filesystem access modes, command execution modes
  • Skills -- creating and using SKILL.md-based agent skills
  • Customization -- config files, project instructions, system prompt overrides, tuning parameters
  • Context Management -- compaction, snapshots, knowledge survival, and how Swival handles tight context windows
  • Providers -- LM Studio, HuggingFace, OpenRouter, Google Gemini, ChatGPT Plus/Pro, AWS Bedrock, generic OpenAI-compatible server, and command (external program) configuration
  • MCP -- connecting external tool servers via the Model Context Protocol
  • A2A -- connecting to remote agents via the Agent-to-Agent protocol
  • Reports -- JSON reports for benchmarking and evaluation
  • Web Browsing -- Chrome DevTools MCP, Lightpanda MCP, and agent-browser for web interaction
  • Reviews -- external reviewer scripts for automated QA and LLM-as-judge evaluation
  • Secret Encryption -- transparent encryption of credentials before they reach the LLM provider
  • Outbound LLM Filter -- user-defined scripts to redact or block outbound LLM requests
  • Lifecycle Hooks -- startup/exit hooks for syncing state to remote storage
  • Custom Commands -- REPL custom command setup and execution (see also the community commands repo)
  • Command Middleware -- pre-execution command rewriting and policy enforcement (RTK integration)
  • Python API -- library API for embedding Swival in Python applications
  • Not Just for Frontier Models -- why Swival is built to work well with small and open models too
  • Using Swival with AgentFS -- copy-on-write filesystem sandboxing for safe agent runs
