Quenda A Lightweight Layered Agent Framework
Quenda: A Lightweight, Layered Agent Framework for Python
Date: 2026-07-05 Project: github.com/xvshiting/quenda
When I first started building LLM-powered applications, I kept running into the same problem: agent frameworks either handed me a black box that was impossible to reason about, or they dumped so many primitives on my desk that wiring them together took longer than the actual product. I wanted something in between — a framework small enough to hold in my head, but structured enough to scale from a one-shot script to a persistent coding assistant. That’s why I built Quenda.
1. The Design Goal: Small Surface, Strict Layers
Quenda’s entire public SDK reduces to four concepts: Agent, Session, @tool, and a provider registry. Everything else is an implementation detail. The reason the surface stays small is a strict four-layer architecture:
Interface → Host → Runtime → Kernel
| Layer | Responsibility |
|---|---|
| Kernel | Synchronous model-tool loop. No knowledge of agents, sessions, users. |
| Runtime | Async Agent/Session/Run lifecycle, event emission, context mgmt. |
| Host | Persistence, identity, permissions, instruction composition, skills. |
| Interface | Event rendering, user interaction, REPL. |
Each layer depends only on the layer inside it. The innermost Kernel is a pure function of (messages, tools, model) → events, which means you can test the entire core with fake models and never touch the network. This is the property I cared most about: tests that don’t flake on rate limits or model downtime.
2. The Kernel: A Testable Model-Tool Loop
Most agent loops interleave concerns — they fetch instructions, persist state, render UI, and call the model all in the same function. Quenda’s Kernel refuses to do any of that. It only knows how to:
- Send the current messages + tools to a model
- Dispatch any tool calls the model returns
- Loop until the model produces a final message
Because nothing else leaks in, replacing the model with a deterministic fake turns the loop into a pure state machine. This makes regression tests instant and behavior changes auditable.
3. The Runtime: Async Sessions and Events
On top of the Kernel, the Runtime adds the async lifecycle: Agent, Session, and Run. A Session is a resumable conversation; a Run is a single send() invocation within that session. Every step of the run emits structured events — tool-started, tool-finished, model-responding, context-compressed — so any Interface can render progress without coupling to internals.
from quenda import Agent, tool
from quenda.providers import get_provider_registry
from quenda.tools import get_core_tools
import asyncio
@tool
def calculate(expression: str) -> float:
"""Safely evaluate a math expression."""
import ast
node = ast.parse(expression, mode='eval')
return eval(compile(node, '<string>', 'eval'), {"__builtins__": {}}, {})
model = get_provider_registry().get_model("deepseek", "deepseek-v4-flash")
agent = Agent(
name="assistant",
system_prompt="You are a helpful assistant.",
tools=[calculate, *get_core_tools(".")],
model=model,
)
async def main():
session = agent.open_session()
result = await session.send("What is 15% of 847?")
print(result)
asyncio.run(main())
That’s a complete agent. No boilerplate, no framework-specific DSL, no implicit globals.
4. Providers: 26 Behind One Registry
A common pain point is vendor lock-in. Quenda ships with 26 built-in providers covering 300+ models — OpenAI, Anthropic, DeepSeek, DashScope, Moonshot, OpenRouter, Ollama, and more — all behind one get_provider_registry() and one ModelSpec interface. Switching models mid-session is a single call:
/model deepseek/deepseek-v4-flash
Adding your own provider takes five lines — point it at your base URL, declare the API flavor (openai-completions or otherwise), and register it.
5. Tools: Workspace-Scoped by Default
get_core_tools(workspace) returns nine essential tools:
| Tool | Capability |
|---|---|
list_files |
Browse directories (ls, find, tree) |
search_text |
Search file contents (grep, rg) |
read_file |
View files with line ranges |
write_file |
Create or overwrite files |
apply_patch |
Apply targeted text patches |
run_shell |
Execute shell commands (filtered) |
execute_python |
Run Python in a sandbox |
request_interaction |
Ask the user for input |
request_skill_activation |
Ask to activate a skill |
None of them reach outside the workspace root, and the shell/Python tools enforce command filtering and import restrictions. Security lives in the code path, not in a checklist.
6. Skills: Composable Capability Packages
The newest addition is the Skills framework, introduced in the 2026-06 release. A skill is a package of instructions, resources, and optional tools that an agent can discover and activate on demand. Think of them as plug-in competencies — a “git-workflow” skill, a “code-review” skill, a “thesis-formatting” skill — that composition into a system prompt without bloating the base context.
Skills compose cleanly with context compression (also new in 2026-06): when the context grows large, Quenda summarizes earlier turns automatically, and the /compress command lets you trigger it manually. The agent stays responsive even in long sessions.
7. Quenda Code: A Coding Agent That Eats Its Own Dog Food
The flagship application is Quenda Code, an AI coding agent that runs in the terminal:
pip install quenda quenda-code
quenda code
It reads your codebase, writes code, runs commands, and helps you ship. What makes it a useful stress test of the framework is that it exercises every layer — Kernel math, Runtime session persistence, Host instruction composition + skills, and the Interface REPL — in a real workflow. If the SDK has an awkward corner, the coding agent finds it first.
A typical session:
> read the main entry point and explain how it works
I'll read the main entry point...
[Reads src/quenda/cli.py]
The entry point is `cli.py:main()` ...
> add a --version flag to the CLI
[Applies patch to cli.py]
Done. Added `--version` flag that prints the version and exits.
> run the tests
[Runs pytest]
All 42 tests passed.
REPL Commands
| Command | Description |
|---|---|
/help |
Show available commands |
/mode [code\|architect\|chat] |
Switch interaction mode |
/model <provider>/<model> |
Switch model mid-session |
/skill list |
List available skills |
/skill activate <name> |
Activate a skill |
/compress |
Manually compress context |
/status |
Show session and token info |
/reset |
Clear conversation history |
8. What I Learned Building It
A few things crystallized over the course of the project:
- Layering is a forcing function for testability. Once the Kernel had no I/O, the rest of the system got cheap to test almost for free.
- Tools should be policies, not magic.
run_shellfilters commands;execute_pythonrestricts imports. Putting that logic in the tool — rather than hoping the model behaves — is the only thing that scales. - Providers are a registry, not a class hierarchy. Modeling every vendor behind one
ModelSpecremoved a whole category of accidental complexity. - Events beat callbacks. Emitting structured events from the Runtime let me swap interfaces (CLI, HTTP, test harness) without touching the agent.
9. Roadmap
The 2026-06 release added skills, context compression, interaction requests, and command extensions. What’s next on my list:
- Skill marketplace — share + discover community skills
- Multi-agent orchestration — first-class agent-to-agent messaging built on the same event stream
- Fine-tuned draft models — speculative-decoding-style acceleration for agent loops
- More providers — push past 26 toward a self-describing provider spec
10. Get Started
pip install quenda quenda-code # CLI coding agent
pip install quenda # SDK only
Requires Python 3.12+. Zero required runtime dependencies.
- Repository: github.com/xvshiting/quenda
- SDK Tutorials: 8 chapters covering agents, tools, providers, sessions, and events
- CLI Tutorials: 5 chapters on Quenda Code
- Architecture Decisions: ADR records in
docs/decisions/
Quenda is intentionally small. If you’ve been looking for an agent framework that fits in your head but doesn’t disappear when the workflow gets serious, give it a try.