AI software delivery discipline
end-to-end-loop is a portable Agent Skill that forces software work through discovery, planning, implementation, verification, tests, deploy gates, and evidence-backed reporting before calling anything done.
A self-learning delivery for-loop for AI coding agents. It keeps the agent inside a disciplined workflow until a code component or application change is understood, implemented, verified, tested, and safely handed off.
Built for AI researchers, software builders, and power users who want reliable agentic development: less skipped context, fewer fake test claims, clearer deploy gates, and better operational reports.
It turns “agent says it is done” into an auditable artifact: planned scope, changed files, commands run, results, risks, limitations, and the next recommended action.
Works with your agent stack
The product
Most coding-agent failures are boring and repeatable: editing too early, skipping reproduction, forgetting tests, hiding uncertainty, or deploying without a rollback story. end-to-end-loop turns each of those failure modes into an explicit gate.
Clarify outcome, constraints, repo state, side effects, credentials, and risks.
Define small steps, acceptance criteria, test strategy, and delivery target.
Make scoped changes through the required CAVEMAN/Cavekit lane for code-producing work.
Prove behavior with observed evidence: commands, tests, diff review, or manual checks.
Run relevant automated checks, smoke paths, and security review proportional to risk.
Hand off as a commit, PR, artifact, readiness report, or approved deploy, with limitations named.
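The six steps above can be read as an ordered series of gates, where a later step never runs on top of an unproven earlier one. The following is a minimal Python sketch of that idea, not the skill's actual implementation. The stage labels, `GateResult`, and `run_loop` are hypothetical names introduced for illustration, and the evidence strings are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stage labels that paraphrase the six steps above;
# they are not identifiers taken from the skill itself.
STAGES = ["discover", "plan", "implement", "verify", "test", "deliver"]


@dataclass
class GateResult:
    stage: str
    passed: bool
    evidence: str  # what was observed, e.g. a command and its outcome


def run_loop(gates: Dict[str, Callable[[], GateResult]]) -> List[GateResult]:
    """Run the stages in order and stop at the first gate that fails."""
    results: List[GateResult] = []
    for stage in STAGES:
        result = gates[stage]()
        results.append(result)
        if not result.passed:
            break  # later stages are skipped until this gate passes
    return results


def always_pass(stage: str) -> Callable[[], GateResult]:
    """Build a stand-in gate that passes with placeholder evidence."""
    return lambda: GateResult(stage, True, f"{stage}: ok (placeholder)")


# Example run: every gate passes except verification.
gates = {stage: always_pass(stage) for stage in STAGES}
gates["verify"] = lambda: GateResult("verify", False, "placeholder: check failed")

results = run_loop(gates)
print([r.stage for r in results])
# prints ['discover', 'plan', 'implement', 'verify']
```

Because the loop stops at the failed verification gate, the test and delivery stages are never reached, which is the behavior the gates are meant to enforce.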
Evidence-backed reports
Every completed task should leave an audit trail that a human can inspect: changed files, commands run, pass/fail results, known limitations, and the next recommended action.
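One way to picture such an audit trail is as a small structured record that can be serialized and reviewed. The sketch below is an assumption about what a report could look like, built only from the fields named above. `DeliveryReport`, `CommandRecord`, and all field names are hypothetical, and the file path and commands are placeholder values rather than real output.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class CommandRecord:
    command: str
    passed: bool


@dataclass
class DeliveryReport:
    changed_files: List[str]
    commands_run: List[CommandRecord]
    known_limitations: List[str]
    next_action: str

    def all_passed(self) -> bool:
        """True only if at least one command was recorded and none failed."""
        return bool(self.commands_run) and all(c.passed for c in self.commands_run)

    def to_json(self) -> str:
        """Serialize the report so a human or a tool can inspect it."""
        return json.dumps(asdict(self), indent=2)


# Placeholder values for illustration only.
report = DeliveryReport(
    changed_files=["src/example.py"],
    commands_run=[
        CommandRecord("pytest -q", passed=True),
        CommandRecord("ruff check .", passed=False),
    ],
    known_limitations=["No integration test for the new code path yet."],
    next_action="Fix the lint failure and re-run checks.",
)

print(report.all_passed())
print(report.to_json())
```

Treating an empty command list as "not passed" is a deliberate choice in this sketch: a report with no recorded commands offers nothing to inspect, so it should not count as a success.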
Safety model
Research and performance documents
The research thesis, design implications, limitations, and artifact architecture.
Performance baseline: Reliability metrics and release posture
Current measurable baseline: trigger coverage, outcome scenarios, validation gates, and gaps.
Evaluation protocol: How end-to-end-loop should be tested
Trigger accuracy, loop compliance, deploy safety, CAVEMAN behavior, and result schema.
Deploy readiness: When agents may ship live changes
A pass/fail checklist for external writes, CI, rollback, smoke paths, and approvals.
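A pass/fail deploy checklist of this kind could be modeled as a set of boolean checks where any single failure blocks the deploy. The sketch below assumes one check per item named in the deploy readiness description. `DeployChecklist`, `deploy_ready`, and the field names are hypothetical and do not come from the project's documents.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DeployChecklist:
    # One hypothetical flag per item in the deploy readiness description.
    external_writes_reviewed: bool
    ci_green: bool
    rollback_plan_documented: bool
    smoke_paths_passed: bool
    human_approval_recorded: bool


def deploy_ready(checklist: DeployChecklist) -> Tuple[bool, List[str]]:
    """Return (ready, blockers); ready is True only when every check passed."""
    blockers = [name for name, ok in vars(checklist).items() if not ok]
    return (not blockers, blockers)


# Example: rollback and approval are missing, so the deploy is blocked.
ready, blockers = deploy_ready(
    DeployChecklist(
        external_writes_reviewed=True,
        ci_green=True,
        rollback_plan_documented=False,
        smoke_paths_passed=True,
        human_approval_recorded=False,
    )
)
print(ready, blockers)
# prints False ['rollback_plan_documented', 'human_approval_recorded']
```

Returning the list of blockers alongside the verdict keeps the result reportable: the agent can name exactly which checks failed instead of only saying the deploy was refused.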
Current status
README, evaluation rubric, trigger cases, outcome scenarios, and deploy readiness docs are in place.
The skill stays private until docs, metrics, evals, and install examples are strong enough.
dev-boss.nl is now a lean product site for end-to-end-loop and its research/performance artifacts.