Describe what you want — re:factory builds it, tests it, and keeps improving it. Design an idea from scratch or point at an existing project for continuous improvement. Runs with Claude Code, Bob Shell, and OpenAI Codex.
All state is local — per-project in .factory/ (add to .gitignore), global in ~/.factory/. See Architecture for the full deep-dive.
Prerequisites: Python 3.11+, uv, and Claude Code (installed and authenticated).
git clone https://github.com/akashgit/remote-factory.git
cd remote-factory
uv sync
Then just run:
uv run factory
The welcome wizard launches automatically: a conversational agent that asks what you want to do, classifies your input (an idea, a file path, a GitHub URL, or a description), and presents the right command. No flags to memorize. Paste an idea and the wizard handles the rest.
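To make the classification step concrete, here is a minimal rule-based sketch. It is not the wizard's actual logic (the wizard is a conversational agent); the function name `classify_input`, the three labels, and the regex are assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumed pattern for a repository URL such as https://github.com/user/repo
GITHUB_URL = re.compile(r"^https?://github\.com/[\w.-]+/[\w.-]+/?$")


def classify_input(text: str) -> str:
    """Return 'github_url', 'path', or 'idea' (hypothetical labels)."""
    text = text.strip()
    if not text:
        return "idea"
    if GITHUB_URL.match(text):
        return "github_url"
    # Treat the input as a path if it looks like one or exists on disk.
    if text.startswith(("/", "~", "./", "../")) or Path(text).expanduser().exists():
        return "path"
    # Anything else is handled as a free-form idea or description.
    return "idea"
```

In this sketch an idea and a description fall into the same bucket; telling them apart would need the conversational step the wizard provides.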
You can also skip the wizard and call commands directly:
uv run factory ceo "Build a personal homepage with a blog" --mode design
See the full setup guide for authentication and environment variables.
| I want to… | Command |
|---|---|
| Start from a raw idea | uv run factory ceo "my idea" --mode design |
| Build from a spec or repo | uv run factory ceo spec.md |
| Improve an existing project | uv run factory ceo /path/to/project |
| Fix or add one thing | uv run factory ceo /path --focus "add dark mode" |
| Target a GitHub issue | uv run factory ceo /path --focus 42 |
Use design mode when you want to brainstorm before building. Start a conversation with the CEO to refine an idea, then build:
# From a raw idea — discuss and refine into a buildable spec
uv run factory ceo "distributed task runner" --mode design
# From a spec file — read and discuss before building
uv run factory ceo ~/ideas/my-app-spec.md --mode design
Design mode also works on existing projects. The CEO studies the backlog, eval scores, open issues, and experiment history, then discusses what to work on before executing:
uv run factory ceo ~/factory-projects/my-app --mode design
# Seed the conversation with a topic
uv run factory ceo ~/factory-projects/my-app --mode design --focus "auth layer"
When you already have a spec file, a GitHub repo, or a clear description, re:factory builds directly, with no design step needed:
uv run factory ceo ~/ideas/spec.md
uv run factory ceo https://github.com/user/repo
uv run factory ceo "Build a personal homepage with a blog"
The pipeline: Researcher surveys best practices → Strategist creates a plan → Builder implements and commits → E2E gate confirms it runs. Override the output directory with --dir my-site. (If you start with a raw idea via --mode design, the CEO refines it into a spec first, then transitions into this same build pipeline automatically. --mode interactive remains accepted as an alias.)
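The staged flow above can be pictured as a list of steps that each receive and return a shared context. The sketch below is illustrative only: `run_pipeline`, the `completed` and `failed` keys, and the stage callables are hypothetical names, not re:factory's internals.

```python
from typing import Callable, Iterable, Tuple

# A stage takes the shared context dict and returns an updated one.
Stage = Callable[[dict], dict]


def run_pipeline(stages: Iterable[Tuple[str, Stage]], context: dict) -> dict:
    """Run stages in order and stop at the first one that marks a failure."""
    for name, stage in stages:
        context = stage(context)
        # Record which stages ran, in order.
        context.setdefault("completed", []).append(name)
        if context.get("failed"):
            break
    return context
```

A caller would pass stages in the order named above (researcher, strategist, builder, E2E gate); a failing gate leaves later stages unrun.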
After the first build, a backlog appears at .factory/strategy/backlog.md — deferred features that feed future improvement cycles. Manage it with uv run factory backlog-list, uv run factory backlog-add, and uv run factory backlog-remove.
Point re:factory at an existing project and it enters Improve mode automatically:
uv run factory ceo ~/factory-projects/my-app
Each cycle: observe → hypothesize → build → review → measure → decide (keep or revert) → archive. The Strategist picks work from the backlog using FEEC priority (Fix > Exploit > Explore > Combine).
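As a rough illustration of FEEC ordering, the sketch below picks the highest-priority item from a list. How backlog items are actually tagged is not described here, so the `BacklogItem` type, its `kind` field, and `pick_next` are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

# Lower number means higher priority: Fix > Exploit > Explore > Combine.
FEEC_ORDER = {"fix": 0, "exploit": 1, "explore": 2, "combine": 3}


@dataclass
class BacklogItem:
    title: str
    kind: str  # hypothetical tag: one of the FEEC_ORDER keys


def pick_next(items: List[BacklogItem]) -> Optional[BacklogItem]:
    """Return the item with the highest FEEC priority, or None if empty."""
    # min() returns the first minimum, so ties keep backlog order.
    return min(items, key=lambda item: FEEC_ORDER[item.kind], default=None)
```

With an "explore" item listed before a "fix" item, `pick_next` returns the fix.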
When you know exactly what you want, --focus pins a single target — one hypothesis, one experiment, done:
uv run factory ceo ~/my-app --focus "add dark mode toggle"
uv run factory ceo ~/my-app --focus 42 # GitHub issue
uv run factory ceo ~/my-app --focus "owner/repo#42" # Issue shorthand
Other ways to steer: file GitHub issues (the Strategist reads them), add to the backlog manually, or pass a spec file with --prompt.
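The three --focus forms shown above (free text, a bare issue number, and owner/repo#N shorthand) can be told apart with a small parser. This is a sketch under assumptions: `Focus`, `parse_focus`, and the regex are hypothetical and may not match how the CLI parses the flag.

```python
import re
from typing import NamedTuple, Optional


class Focus(NamedTuple):
    kind: str  # "issue" or "text"
    repo: Optional[str]
    issue: Optional[int]
    text: Optional[str]


# Assumed shorthand shape: owner/repo#123
SHORTHAND = re.compile(r"^([\w.-]+/[\w.-]+)#([0-9]+)$")


def parse_focus(value: str) -> Focus:
    """Classify a --focus value as an issue reference or free text."""
    value = value.strip()
    if re.fullmatch(r"[0-9]+", value):
        # Bare number: an issue in the current project's repository.
        return Focus("issue", None, int(value), None)
    match = SHORTHAND.match(value)
    if match:
        return Focus("issue", match.group(1), int(match.group(2)), None)
    return Focus("text", None, None, value)
```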
After a build or improve cycle finishes in foreground mode, the CEO stays active instead of exiting. Ask for changes directly:
"Fix the typo in the header"
"Add error handling to the upload endpoint"
"Make the tests more thorough"
Each request runs through the full experiment pipeline: the Refiner scopes it → Builder implements → review + eval + E2E gate → keep/revert verdict. No shortcuts — every refinement is a tracked experiment with its own PR.
You can also invoke refinements directly with --refine:
uv run factory ceo ~/my-app --refine "add rate limiting to the API"
There's no cap on refinements. Advisory warnings appear at 5 and 10 to flag context growth, but the user decides when to stop.
Every change is measured by an 11-dimension composite score across three tiers: Hygiene (tests, lint, types, coverage), Growth (API surface, experiment diversity, observability), and Project (user-defined domain metrics). On first run, uv run factory discover auto-detects your project's language and framework to generate the eval profile. See Eval System for scoring details, weights, and guards.
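One way to picture a tiered composite is a weighted average of per-tier means. The weights and the `composite_score` function below are illustrative assumptions, not the project's real formula; the Eval System doc is the source for actual weights and guards.

```python
from statistics import mean
from typing import Dict

# Hypothetical tier weights, chosen only for this example.
TIER_WEIGHTS = {"hygiene": 0.5, "growth": 0.3, "project": 0.2}


def composite_score(scores: Dict[str, Dict[str, float]]) -> float:
    """Weighted average of tier means; tiers with no dimensions are skipped."""
    total = 0.0
    weight_used = 0.0
    for tier, weight in TIER_WEIGHTS.items():
        dims = scores.get(tier, {})
        if not dims:
            continue
        total += weight * mean(dims.values())
        weight_used += weight
    # Renormalize so a missing tier does not drag the score down.
    return total / weight_used if weight_used else 0.0
```

Each dimension is assumed to be a value between 0 and 1. Skipping empty tiers means a project with no user-defined metrics is scored on Hygiene and Growth alone.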
| Project | What it does | Mode |
|---|---|---|
| SWE-bench solver | Autonomous agent that resolves GitHub issues, improved via failure analysis | Research |
| HMMT math solver | Multi-agent team that solved HMMT Feb 2025 Combinatorics Problem 7 | Research |
| Text/Sketch → CAD | Natural language and sketches to executable CadQuery Python code for 3D models | Research |
| HLS design space explorer | Per-function AI agents + ILP solver for HLS optimization — 92% execution time reduction | Build |
| Pluck | iOS app that extracts structured data from screenshots using on-device AI | Build + Improve |
| SDG Hub | Agent-maintained open-source framework for synthetic data generation | Build + Improve |
| OpenSkies Airline Corpus | 85-document fictional airline corpus for RAG/fine-tuning evaluation with cross-document consistency validation | Design + Improve |
| re:factory itself | Runs on itself — continuously improved via its own experiment outcomes | Meta |
Built something with re:factory? Open a PR to add it here.
# Core workflow
uv run factory ceo <path|url|idea> # Build or improve
uv run factory ceo <path> --mode design # Discuss, then execute
uv run factory ceo <path> --focus "..." # One target, one experiment
uv run factory ceo <path> --refine "..." # Single targeted refinement
uv run factory ceo <path> --loop # Continuous improvement loop
uv run factory tmux <path> --loop # Loop in detached tmux session
See uv run factory --help for the complete list.
re:factory supports multiple CLI backends. Default is Claude Code — switch with --runner or FACTORY_RUNNER:
# Direct
CODEX_API_KEY="..." uv run factory ceo /path --runner codex
BOBSHELL_API_KEY="..." uv run factory ceo /path --runner bob
# Via config.toml profile (persistent)
uv run factory ceo /path --profile codex
Configure profiles in ~/.factory/config.toml:
[credentials.codex]
FACTORY_RUNNER = "codex"
CODEX_API_KEY = "..."
[credentials.bob]
FACTORY_RUNNER = "bob"
BOBSHELL_API_KEY = "..."
Run uv run factory config show to see resolved config, or uv run factory config edit to open the file. See Setup Guide for full details.
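Since the runner can come from a flag, an environment variable, or the default, a precedence rule is needed. The sketch below assumes the flag wins over FACTORY_RUNNER, which wins over the default; that ordering, the identifier "claude", and the `resolve_runner` name are assumptions, so check uv run factory config show for the real resolution.

```python
import os
from typing import Mapping, Optional

DEFAULT_RUNNER = "claude"  # assumed identifier for the Claude Code default


def resolve_runner(
    cli_runner: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a runner: explicit flag, then FACTORY_RUNNER, then the default."""
    env = os.environ if env is None else env
    return cli_runner or env.get("FACTORY_RUNNER") or DEFAULT_RUNNER
```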
LangFuse provides LLM observability and tracing — track agent invocations, token usage, and execution flow across all factory runs.
# Start LangFuse services
scripts/langfuse-setup start
# Set the env vars the factory needs
export LANGFUSE_HOST=http://localhost:3000
export LANGFUSE_BASE_URL=http://localhost:3000
export LANGFUSE_PUBLIC_KEY=pk-lf-dev-local-key
export LANGFUSE_SECRET_KEY=sk-lf-dev-local-key
export TELEMETRY_PLATFORM=langfuse
The dev credentials above match the docker-compose setup. Add them to your ~/.bashrc or ~/.zshrc to persist across sessions.
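Before starting a run, it can help to confirm that the variables listed above are set. This is a small standalone check, not part of the factory CLI; `missing_langfuse_vars` is a hypothetical helper name.

```python
import os
from typing import List, Mapping, Optional

# The variables exported in the setup snippet above.
REQUIRED_VARS = (
    "LANGFUSE_HOST",
    "LANGFUSE_BASE_URL",
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_SECRET_KEY",
    "TELEMETRY_PLATFORM",
)


def missing_langfuse_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

An empty list means every variable is present; otherwise the list names what still needs exporting.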
- Start LangFuse: scripts/langfuse-setup start
- Run the factory: uv run factory ceo /path/to/project
- Open http://localhost:3000 in your browser
- Login: dev@localhost.local / devpassword123
scripts/langfuse-setup start # Start LangFuse services
scripts/langfuse-setup stop # Stop services
scripts/langfuse-setup status # Show status and credentials
- Docker or Podman: any of docker compose, docker-compose, or podman-compose works
To disable tracing without stopping LangFuse:
export LANGFUSE_TRACING_ENABLED=false
For LLM connection setup, trace structure details, and troubleshooting, see infra/langfuse/README.md.
re:factory is also distributed as a fully-bundled Claude Code plugin — agents, skills, and slash commands packaged together. A GitHub Actions workflow rebuilds the plugins branch of this repo on every push to main, so it always tracks the latest generated artifacts.
From inside Claude Code:
/plugin marketplace add akashgit/remote-factory#plugins
/plugin install factory@remote-factory
/reload-plugins
Once installed, the plugin exposes:
- The /factory:implement slash command (entry point for the multi-agent pipeline).
- Namespaced subagents: invoke with factory:ceo, factory:researcher, factory:builder, etc.
- The bundled skills under .agents/skills/ (e.g. pipeline-subagents, implement).
The plugin still shells out to the factory CLI for the heavy lifting, so you'll need uv and the factory package installed locally as described in Quick Start.
To update later: /plugin marketplace update remote-factory. To remove: /plugin uninstall factory@remote-factory.
If you'd rather skip the marketplace and just register the specialist agents as standalone Claude Code (or Codex) subagents, use the built-in installer:
uv run factory install # Install all 9 agents to ~/.claude/agents/
uv run factory install --runner codex # Or install Codex TOML agents to ~/.codex/agents/
claude --agent factory-ceo "improve this project"
claude --agent factory-researcher "study the auth system"
This path only ships the agent prompts (no skills, no slash commands) and is independent of the plugin marketplace install above.
| Doc | What's in it |
|---|---|
| Setup Guide | Installation, authentication, environment variables |
| Getting Started | Lifecycle walkthrough, research mode details, factory.md config |
| Architecture | Three-layer system, agent roles, state machine, data flow |
| Eval System | Hygiene/growth/project tiers, scoring, guards, precheck |
| Configuration | factory.md reference — all sections and options |
| ACE Self-Improvement | How re:factory evolves its own agent playbooks |
| Contributing | Dev setup, code style, testing, PR workflow |
uv sync --all-groups # Install all deps including dev
uv run pytest -v # Full test suite
uv run ruff check . # Lint
uv run mypy factory/ # Type check
License: MIT (Akash Srivastava)
