AI software delivery discipline
Stop accepting “looks good” from coding agents. A portable Agent Skill for Codex, Hermes, Claude Code, Cursor, and AGENTS.md-compatible agents, end-to-end-loop turns vague software requests into a gated workflow: discover, plan, execute, verify, test, deliver, and report with real evidence.
end-to-end-loop is a self-learning delivery for-loop for AI coding agents. It keeps the agent inside a disciplined workflow until a code component or application change is properly understood, implemented, verified, tested, and safely handed off.
Built for AI researchers, builders, and power users who want more reliable agentic software delivery: less skipped context, fewer fake test claims, clearer deploy gates, and better operational reports.
DevBoss is the virtual office that maintains the skill. It covers research, evaluation, release governance, CI, security review, documentation, and website work, with Tijmen as supervisory board chair.
Works with your agent stack
The product
Most coding-agent failures are boring and repeatable: editing too early, skipping reproduction, forgetting tests, hiding uncertainty, or deploying without a rollback story. end-to-end-loop turns those failure modes into explicit gates.
Discover
Clarify outcome, constraints, repo state, side effects, credentials, and risks.
Plan
Define small steps, acceptance criteria, test strategy, and delivery target.
Execute
Make scoped changes through the required CAVEMAN/Cavekit lane for code-producing work.
Verify
Prove behavior with observed evidence: commands, tests, diff review, or manual checks.
Test
Run relevant automated checks, smoke paths, and security review proportional to risk.
Deliver
Hand off a commit, PR, artifact, readiness report, or approved deploy, with limitations named.
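To make the idea of "explicit gates" concrete, here is a minimal sketch of the six stages run as ordered gates. Only the stage names come from this page; `GateResult`, `run_gated`, and the treatment of a missing gate as a failure are hypothetical illustrations, not the skill's actual implementation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

# Stage names are taken from the page; everything else is a hypothetical sketch.
STAGES = ["discover", "plan", "execute", "verify", "test", "deliver"]


@dataclass
class GateResult:
    passed: bool
    evidence: str  # for example, a command that was run and what it showed


def run_gated(gates: dict[str, Callable[[], GateResult]]) -> list[tuple[str, GateResult]]:
    """Run the stages in order and stop at the first gate that does not pass."""
    trail: list[tuple[str, GateResult]] = []
    for stage in STAGES:
        gate = gates.get(stage)
        # Assumption: a missing gate counts as a failure, so no stage is silently skipped.
        result = gate() if gate else GateResult(False, "no gate defined")
        trail.append((stage, result))
        if not result.passed:
            break
    return trail
```

The returned trail records every stage that was attempted, so a failed run still shows exactly where the work stopped and why.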
Evidence-backed reports
Every completed task should leave an audit trail that a human can inspect: changed files, commands run, pass/fail results, known limitations, and the next recommended action.
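One way to picture such an audit trail is as a small record with a completeness check. This is a sketch under assumptions: the field names mirror the items listed above, but `DeliveryReport`, `CommandRun`, and `missing_evidence` are hypothetical names, and the skill's real report format may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CommandRun:
    command: str
    passed: bool  # observed pass/fail result, not a claim


@dataclass
class DeliveryReport:
    changed_files: list[str] = field(default_factory=list)
    commands: list[CommandRun] = field(default_factory=list)
    known_limitations: list[str] = field(default_factory=list)
    next_action: str = ""

    def missing_evidence(self) -> list[str]:
        """Return the names of sections a human reviewer would find empty."""
        gaps = []
        if not self.changed_files:
            gaps.append("changed_files")
        if not self.commands:
            gaps.append("commands")
        if not self.next_action.strip():
            gaps.append("next_action")
        # Assumption: an empty limitations list is allowed, since a task may have none.
        return gaps
```

A report with an empty `missing_evidence()` result is not proof of quality, but a non-empty one is a cheap signal that a "done" claim lacks inspectable evidence.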
Safety model
Virtual office
The office runs research, evaluation, release planning, website work, CI, and security review. Roles are explicit so one agent does not write, approve, and deploy its own work.
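The role separation described above can be checked mechanically. The sketch below takes a strict reading and flags any agent that holds more than one role; that reading, the role labels, and the function name `overlapping_roles` are assumptions for illustration, not DevBoss's documented policy.

```python
from __future__ import annotations

from collections import defaultdict


def overlapping_roles(assignments: dict[str, str]) -> dict[str, list[str]]:
    """Map each agent that holds more than one role to the roles it holds.

    `assignments` maps a role name (hypothetical labels such as "write",
    "approve", "deploy") to the agent assigned to it.
    """
    held: dict[str, list[str]] = defaultdict(list)
    for role, agent in assignments.items():
        held[agent].append(role)
    # Keep only agents with two or more roles; sort roles for stable output.
    return {agent: sorted(roles) for agent, roles in held.items() if len(roles) > 1}
```

An empty result means every role is held by a different agent; anything else names the agent and the roles that should be split.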
Current status
README, evaluation rubric, trigger cases, paper cleanup, and DevBoss handoff are under active improvement.
The skill stays private until docs, metrics, evals, and release readiness are strong enough.
dev-boss.nl explains the product and will later link to install docs, research, changelog, and approved release notes.