Spec Loop — Design-First AI-Assisted Development

There are two common ways people use AI for coding.

Vibecoding: you describe intent, the model fills in the gaps, and you get a large diff with undocumented decisions. Review becomes archaeology. Tests are optional by accident.

Waterfall: you try to avoid that by writing a complete spec first. You can’t. Constraints appear during implementation. The spec inflates, then it either blocks change or gets ignored.

Spec Loop avoids both: write the next small spec, review it, then implement it with tests. Keep the spec local to the next step. Repeat until done.

Spec Loop is a framework of reusable skills.

Getting Started

Install the skills with npx skills

Recommended path:

Install the core task-workflow skills together. spec-loop-plan-task, spec-loop-clarify-task, spec-loop-prepare-implementation-approval, spec-loop-implementation-flow, and spec-loop-write-adr hand off to each other, reuse the shared spec-loop-plan-task bundle, and support the same planning artifacts.

Install spec-loop-write-glossary when your project uses a glossary. It is mandatory for the Spec Loop AsciiDoc glossary format.

spec-loop-setup-doc-rendering is optional. Install it only if you want the rendering setup and troubleshooting helper for task files and glossary files.

spec-loop-assess-pull-request is optional. Install it only if you need retrospective review of pull requests, merge requests, or commit ranges from repositories you trust. It fetches provider or Git content as review evidence and is not required for the main planning workflow.

Ensure Node.js is available so npx works.

Install Spec Loop in the current project with:

npx skills add dpolivaev/spec-loop -s '*'

Variations:

For selective, single-agent, or other installation variants, see https://github.com/vercel-labs/skills.

Prepare task and glossary rendering

Spec Loop task files use embedded PlantUML diagrams, and Spec Loop glossaries may include embedded diagrams. Prepare your editor for reviewing rendered task files and glossary files before continuing.

Ask the agent to use the spec-loop-setup-doc-rendering skill to prepare your editor preview setup.

If you do not want to use the skill and prefer manual setup, use these editor-specific references: VS Code-Based IDE Setup and JetBrains Setup Reference.

For example:

Please use the `spec-loop-setup-doc-rendering` skill to help me
prepare my editor for reviewing rendered Spec Loop task files and
glossary files.

My coding harness may run in a terminal, but I review files in
<VS Code, Cursor, another VS Code-based IDE, or JetBrains>.

If you review files in VS Code, Cursor, or another VS Code-based IDE, the same extension IDs and settings apply. When your editor exposes a supported CLI command, you can also run the helper script directly instead of asking an agent to use the skill. You can either run it from a local checkout or download the current main-branch copy directly from setup-vscode-server-based.sh. The script requires a supported editor CLI command on PATH (code, code-insiders, cursor, code.cmd, code-insiders.cmd, or cursor.cmd) and is intended for macOS, Linux, WSL, and Git Bash for Windows. Other VS Code-based IDEs should apply the same extension IDs and settings manually.

The helper automates only the server-based PlantUML preview path for supported VS Code-based IDEs together with the AsciiDoc extension used by Spec Loop glossaries. It does not automate the local-only PlantUML path or JetBrains setup.

From a local checkout:

bash skills/spec-loop-setup-doc-rendering/scripts/setup-vscode-server-based.sh --check
bash skills/spec-loop-setup-doc-rendering/scripts/setup-vscode-server-based.sh --apply

Without a local checkout:

curl -fsSLO https://raw.githubusercontent.com/dpolivaev/spec-loop/refs/heads/main/skills/spec-loop-setup-doc-rendering/scripts/setup-vscode-server-based.sh
bash setup-vscode-server-based.sh --check
bash setup-vscode-server-based.sh --apply
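
The PATH requirement described above can be checked before running the helper. The following is a minimal Python sketch and not part of Spec Loop: it looks up the CLI names listed above with the standard-library `shutil.which`. The function name `find_editor_cli` and its injectable `which` parameter are hypothetical and exist only for illustration; the helper script's own detection logic may differ.

```python
import shutil

# CLI names taken from the paragraph above; the helper script may check them differently.
SUPPORTED_EDITOR_CLIS = (
    "code", "code-insiders", "cursor",
    "code.cmd", "code-insiders.cmd", "cursor.cmd",
)


def find_editor_cli(which=shutil.which):
    """Return the first supported editor CLI name found on PATH, or None."""
    for name in SUPPORTED_EDITOR_CLIS:
        if which(name):
            return name
    return None


if __name__ == "__main__":
    cli = find_editor_cli()
    print(cli or "no supported editor CLI on PATH; use the manual setup references")
```

If nothing is found, the manual editor-specific references remain the fallback.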

How to update

Project-level update:

npx skills update

Global update:

npx skills update -g

Manual fallback when npx is unavailable

If npx is not available, clone or download this repository and copy the core task-workflow skills from skills/ into your agent's skills directory. Keep that core bundle together.

Install spec-loop-write-glossary when your project uses a glossary.

Install spec-loop-setup-doc-rendering only if you want rendering setup or troubleshooting help.

Install spec-loop-assess-pull-request only if you need retrospective review of trusted repositories.

Which directory your agent uses is agent-specific. See https://github.com/vercel-labs/skills for agent-specific installation details.

Further Reading

License

Licensed under the MIT License. See LICENSE.

Origin

This framework was developed and applied in Freeplane.

How Spec Loop Works

Spec Loop follows this workflow:

  • clarify - spec-loop-clarify-task resolves material unresolved questions before or during planning.
  • plan - the spec-loop-plan-task bundle governs plan-first work, including the fileless planning path in chat, the task-file path when needed, ADR and documentation routing, glossary triggers, and the gate before implementation.
  • approve - you approve either a fileless task in chat or a task-file plan; on the task-file path, spec-loop-prepare-implementation-approval prepares the task for that approval step.
  • implement - after implementation approval on either planning path, spec-loop-implementation-flow governs implementation-time work.
  • review/ready - spec-loop-implementation-flow also governs the move to review on the task-file path and readiness reporting on the fileless path.

The planning and approval rules for that workflow live in the spec-loop-plan-task bundle and its companion files.

The planning bundle starts with SKILL.md and common-task-guidance.md, plus chat-only-path-guidance.md on the chat-only path and task-file-path-guidance.md on the task-file path.

The spec-loop-write-glossary skill defines the Spec Loop AsciiDoc glossary format in glossary-format.md.

The spec-loop-setup-doc-rendering skill helps users prepare and troubleshoot rendering for task files and glossary files. If a user does not want to use the skill directly, see vscode-setup.md and jetbrains-setup.md for manual editor-specific setup references.

The spec-loop-assess-pull-request skill is optional. It reconstructs retrospective review files from trusted pull requests, merge requests, or commit ranges and can generate GitHub-friendly Mermaid variants from them when needed.

The model uses these skills while drafting and updating plan, task, or review artifacts; you review and approve either a fileless chat task or a task-file plan before implementation. Approved implementation then continues under spec-loop-implementation-flow. On task-file work, it governs implementation-time routing, Implementation notes, and the move to review. On the fileless path, it governs canonical chat-task maintenance, recovery re-emission or promotion, and readiness reporting. When the code already exists, you inspect retrospective review files instead.

Spec Loop also defines explicit work phases: plan, implementation, and done. Transitions to implementation and to done require explicit user approval.

When a project maintains a glossary as described in the project glossary section of the shared task semantics, that glossary defines the shared domain language above individual tasks and the code. It keeps design documents, tests, code symbols, and commit text aligned on the same terms across the whole project.

Spec Loop is designed to work with existing codebases at scale. Before any design or implementation step, the model captures relevant knowledge in the Research section of the active task artifact: existing behavior, constraints, APIs, interfaces, and established code practices.

It follows the classic research–plan–implement approach, broken down into small, incremental sub-tasks.

The research is explicitly scoped to the next increment. It captures only what is required to implement that increment correctly, and is intentionally partial. The result is a bounded, reviewable understanding whose size remains manageable.

For large codebases, the glossary is especially useful because it keeps domain terms stable across many increments, files, and subsystems.

Because the scope can be kept reasonably small and the research is written down, you can verify that the model examined the right parts of the codebase, identified the correct interfaces, and aligned with existing practices before any code is written. This is especially valuable in legacy systems: it prevents clean-room redesigns and makes incremental change safer.

Document Types and Lifetimes

Spec Loop uses more than one document type on purpose. They do not have the same job or the same lifetime.

  • Fileless chat tasks are short-lived canonical chat artifacts for simple work on the fileless path. They exist to drive research, implementation, and verification without task-file overhead. If alignment becomes unsafe, they are re-emitted or promoted to task files.
  • Task files are short-lived working artifacts for the next concrete slice of work when the task-file path is in use. They exist to drive research, review, implementation, and testing of that slice.
  • ADRs capture durable decisions and the reasons behind them.
  • Documentation-only work may stand on its own when no executable change is involved and no project rule requires a task file.
  • A glossary captures stable shared language across tasks, design, tests, code symbols, and commits.
  • Review files reconstruct and assess already-implemented work from trusted pull requests, merge requests, or commit ranges. When needed, they may also produce GitHub-friendly Mermaid variants for sharing the review.
  • Living project documents capture current truth that should remain useful after the task is accepted, such as technical shape, operations, or other stable project knowledge.

Historical task files do not need to be kept mutually consistent across time. The active task artifact, however, should stay aligned with the glossary, living project documents, and implemented code for its scope.

If a project maintains a technical design document, its purpose is to describe the current technical shape, stable boundaries, and important flows. It should not become a second glossary or a catalog of transient implementation detail.

Governance, Review, and Traceability

This document defines the governance, review, and traceability rules around Spec Loop work.

What the workflow rules, common task guidance, and task-file path guidance enforce

The workflow rules are the normative contract between the human developer and the model. common-task-guidance.md defines the shared no-subtask task form used on both planning paths. When Spec Loop uses a task file, the task-file path guidance adds task-file-only mechanics. Together, they enforce at minimum:

  • Explicit planning before executable work.
  • A fileless planning path in chat only for first-pass, straight-line work with lightweight research, a single clear implementation path, lightweight verification, no existing task file, and no need for subtasks or diagrams.
  • One shared main-task structure and section semantics across both planning paths, with task-file-only additions for subtasks, lifecycle, and diagrams.
  • Task files as the source of truth for scope, constraints, research, design, test expectations, and execution status when the task-file path is in use, plus Implementation notes when meaningful implementation-time history must remain visible.
  • A canonical fileless chat task as the source of truth on the fileless path, allowing an initial task with only the established sections, then section-only chat updates and full-task recovery re-emission when reconstruction confidence drops.
  • ADRs and documentation may stand as their own planning artifacts when they are the requested work and no task-file rule overrides that.
  • A default approval gate before any code, test, or configuration changes, with either fileless-task approval on the fileless path or task-file approval on the task-file path.
  • A separate post-approval implementation skill on both planning paths. On the task-file path it governs implementation-time clarification, task maintenance, and the move to review. On the fileless path it governs canonical chat-task maintenance, implementation-time clarification, recovery or promotion, and readiness reporting.
  • Implementation completeness: design, constraints when present, and test specification implemented unless tests are explicitly waived, plus any required implementation-note traceability captured.
  • Traceability discipline: identifiers in commit messages, and status/folder consistency where task files are in use.

If a local convention conflicts with the applicable workflow rules or the task-file path guidance, the governing rule wins.
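
The fileless-path conditions in the list above can be read as a single conjunction. The following is a minimal sketch, assuming each condition has already been judged by the developer or the model; the `TaskProfile` class and its field names are hypothetical and do not appear in the Spec Loop skills.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Hypothetical summary of a requested task; field names are illustrative."""
    first_pass: bool
    straight_line: bool
    lightweight_research: bool
    single_clear_implementation_path: bool
    lightweight_verification: bool
    existing_task_file: bool
    needs_subtasks: bool
    needs_diagrams: bool


def fileless_path_allowed(task: TaskProfile) -> bool:
    """True only when every fileless-path condition from the list holds."""
    required = (
        task.first_pass,
        task.straight_line,
        task.lightweight_research,
        task.single_clear_implementation_path,
        task.lightweight_verification,
    )
    excluded = (task.existing_task_file, task.needs_subtasks, task.needs_diagrams)
    return all(required) and not any(excluded)
```

Passing this check only makes the fileless path eligible. The user still has to approve skipping task-file creation and implementing from the fileless chat task.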

The human developer’s role

A central assumption of Spec Loop is that the human developer remains the primary source of understanding and intent.

The model is treated as a powerful implementation and reasoning aid that operates under explicit constraints, not as an independent decision-maker.

The developer is responsible for judging correctness, scope, and relevance. The model operates within the boundaries defined by the approved plan and requires explicit approval to cross implementation gates. On the task-file path, the task file is the source of truth for that approved plan.

Task files as present truth

A task file is not a general historical narrative. It is the stabilized description of what must be true now to implement the next increment correctly.

Practically:

  • Research records observations and verified facts only.
  • Constraints record binding limits for the increment when needed.
  • Design records the approved target design intent for the increment.
  • Test specification defines the verification that must exist for completion.
  • Implementation notes, when present, keep only the bounded implementation-time decision trail that later review needs.

History still belongs primarily in version control. Implementation notes is the narrow exception for implementation-time decisions that would otherwise be lost. The task file still represents the current intent.

Constraints as a control layer

When a task includes Constraints, they capture the limits that the target design and implementation must obey.

Typical examples are semantic invariants, non-goals, compatibility limits, identity rules, performance limits, and forbidden simplifications.

If Design conflicts with Constraints, Constraints wins.

Briefing as a soft entry point

Each task includes a Briefing section that serves as a soft entry point:

  • for someone unfamiliar with the codebase,
  • for the contributor returning to the task after time has passed,
  • for onboarding new contributors.

The briefing explains what matters, where to look first, and which modules, classes, and stack decisions orient a newcomer quickly.

It is not a summary of the task history. It is a guide for understanding the current intent.

Approval boundaries

Spec Loop has two planning approval surfaces:

  • Fileless planning-path approval in chat.
  • Task-file approval on the task-file path.

On the fileless planning path, the model must ask the user to approve both skipping task-file creation and implementing from the fileless chat task.

On the task-file path:

  • The model may edit task files without prior approval.
  • If task files were edited and there is no implementation directive, the model must request user review before changing code, tests, or configuration.
  • An explicit directive such as “implement”, “go ahead”, or “proceed” counts as approval and must not trigger another approval request.
  • After task-file implementation approval, spec-loop-implementation-flow governs implementation-time clarification, the post-implementation Implementation notes checkpoint, and the move to review.

On the fileless path, after fileless implementation approval, spec-loop-implementation-flow governs implementation-time clarification, canonical chat-task updates, full-task recovery re-emission when needed, promotion to the task-file path when fileless simplicity no longer holds, and readiness reporting.

If implementation stays within the approved design and only bounded clarification is needed, the canonical task artifact is updated in place and work continues. If scope or another approved contract changes materially, the model proposes next steps and requests renewed approval before continuing.

Phase model

Spec Loop defines three work phases: PLAN, IMPLEMENTATION, and DONE.

By default, phase transitions are constrained:

  • PLAN -> IMPLEMENTATION requires explicit approval.
  • IMPLEMENTATION -> DONE requires explicit approval.
  • Any new request, refinement, extension, or follow-up resets work to PLAN.

This keeps the model aligned and prevents implementation from continuing by inertia after scope changes.
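
The phase rules above behave like a small state machine with approval gates. The following is a minimal sketch under that reading; `PhaseTracker` and its method names are hypothetical, and Spec Loop itself expresses these rules as skill instructions rather than code.

```python
from enum import Enum


class Phase(Enum):
    PLAN = "PLAN"
    IMPLEMENTATION = "IMPLEMENTATION"
    DONE = "DONE"


# The only forward transitions named in the phase model.
NEXT_PHASE = {Phase.PLAN: Phase.IMPLEMENTATION, Phase.IMPLEMENTATION: Phase.DONE}


class PhaseTracker:
    """Hypothetical helper that tracks the current phase of one piece of work."""

    def __init__(self) -> None:
        self.phase = Phase.PLAN

    def advance(self, approved: bool) -> Phase:
        """Move forward one phase, but only with explicit user approval."""
        if self.phase is Phase.DONE:
            raise ValueError("work is DONE; a new request starts again at PLAN")
        if not approved:
            raise PermissionError(f"leaving {self.phase.value} requires explicit approval")
        self.phase = NEXT_PHASE[self.phase]
        return self.phase

    def new_request(self) -> Phase:
        """Any new request, refinement, extension, or follow-up resets to PLAN."""
        self.phase = Phase.PLAN
        return self.phase
```

Calling `new_request()` during IMPLEMENTATION models the reset rule: the next `advance` again needs approval before any further implementation.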

Review boundaries that map to normal practice

Spec Loop separates agreement on intent from review of implementation. Even with simplified statuses, review gates still exist at the implementation-approval boundary, at the task-file-path move-to-review boundary when that path is in use, and at final completion approval.

Reviewers assess correctness against approved intent.

PlantUML as a design artifact

The task-file path guidance requires Design sections in task files to use PlantUML diagrams that model structure or flow (class, component, or sequence diagrams), under strict formatting rules.

Design remains reviewable as a first-class artifact and is not encoded only in implementation.
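
A reviewer can confirm mechanically that a task file carries at least one diagram. The following is a minimal sketch, assuming the task file is Markdown and embeds each diagram in a fenced block labeled `plantuml` that starts with `@startuml` and ends with `@enduml`. The fence label and the function names are assumptions, and the sketch does not check the strict formatting rules from the task-file path guidance.

```python
import re

# Matches a fenced block labeled "plantuml"; `{3} stands for the three-backtick fence.
PLANTUML_BLOCK = re.compile(r"`{3}plantuml\n(.*?)\n`{3}", re.DOTALL)


def plantuml_blocks(markdown: str):
    """Return the bodies of all fenced plantuml blocks in the text."""
    return [match.group(1) for match in PLANTUML_BLOCK.finditer(markdown)]


def has_wellformed_diagram(markdown: str) -> bool:
    """True if at least one block is delimited by @startuml and @enduml."""
    for body in plantuml_blocks(markdown):
        lines = [line.strip() for line in body.strip().splitlines()]
        if lines and lines[0].startswith("@startuml") and lines[-1] == "@enduml":
            return True
    return False
```

The check scans the whole file rather than only the Design section, which keeps the sketch short at the cost of precision.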

Traceability mechanics

Spec Loop makes intent recoverable after the fact:

  • Task files define the intent boundary for a set of commits when the task-file path is in use.

  • Fileless tasks define the intent boundary in chat while the fileless path remains active.

  • Commit messages are structured artifacts and must start with the Primary Identifier:

    • Ticket ID when present, otherwise the Task Identifier.

This links implementation changes to an explicit, reviewable specification.
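
The Primary Identifier rule above is easy to check automatically. The following is a minimal sketch; the function names and the identifier formats used in the usage note (`PROJ-123`, `wordle-04`) are hypothetical, since this section does not fix an identifier syntax.

```python
from typing import Optional


def primary_identifier(ticket_id: Optional[str], task_id: str) -> str:
    """Ticket ID when present, otherwise the Task Identifier."""
    return ticket_id if ticket_id else task_id


def commit_message_is_traceable(message: str, ticket_id: Optional[str], task_id: str) -> bool:
    """True if the message starts with the Primary Identifier as a whole token."""
    identifier = primary_identifier(ticket_id, task_id)
    if not message.startswith(identifier):
        return False
    rest = message[len(identifier):]
    # Reject longer identifiers that merely share a prefix, e.g. PROJ-1234 vs PROJ-123.
    return rest == "" or not rest[0].isalnum()
```

For example, `commit_message_is_traceable("PROJ-123: add guess validation", "PROJ-123", "wordle-04")` accepts the message, while a message that starts with free text is rejected.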

Status folders and lifecycle discipline

On the task-file path, work is organized by status folders in the task directory:

  • backlog: planned or deferred work; research and design live here until design is approved.
  • in-progress: active research, design, implementation, or verification; subtasks carry explicit status.
  • done: user-verified completion; prefix rules preserve ordering.

Before commits on the task-file path, the model validates task status consistency and proposes folder or status updates. These are applied only after explicit user confirmation, unless the user has explicitly instructed the model to commit.
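
The consistency check can be pictured as comparing a task file's folder with its declared status. The following is a minimal sketch, assuming the declared status uses the same names as the status folders and that task files sit directly inside those folders. The function name and the file name in the usage note are hypothetical, and the prefix rules for done are not modeled.

```python
from pathlib import PurePosixPath
from typing import Optional

STATUS_FOLDERS = ("backlog", "in-progress", "done")


def proposed_move(task_path: str, declared_status: str) -> Optional[str]:
    """Return the path the task file should move to, or None if it is consistent."""
    if declared_status not in STATUS_FOLDERS:
        raise ValueError(f"unknown status: {declared_status}")
    path = PurePosixPath(task_path)
    if path.parent.name == declared_status:
        return None
    # Only a proposal: applying it still needs explicit user confirmation.
    return str(path.parent.parent / declared_status / path.name)
```

For a file at `tasks/in-progress/004-scoring.md` whose status says done, the sketch proposes `tasks/done/004-scoring.md` and leaves the move itself to the user.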

Definition of done in team context

Completion is not inferred from working code.

An increment is considered done only when:

  • the approved design is fully implemented,
  • the test specification is implemented and passing,
  • any deviations are documented in the active task artifact,
  • the user explicitly approves the transition to done.

This applies equally to human-written and model-written code.
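
The four conditions above form a checklist in which working code alone is never sufficient. The following is a minimal sketch of that checklist; `IncrementState` and its field names are hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class IncrementState:
    """Hypothetical snapshot of one increment; field names are illustrative."""
    design_fully_implemented: bool
    tests_implemented_and_passing: bool
    deviations_documented: bool
    user_approved_done: bool


def unmet_done_criteria(state: IncrementState) -> List[str]:
    """Return the names of criteria that are not yet satisfied, in field order."""
    return [name for name, satisfied in asdict(state).items() if not satisfied]


def is_done(state: IncrementState) -> bool:
    """An increment is done only when no criterion is left unmet."""
    return not unmet_done_criteria(state)
```

An increment with passing tests but no explicit user approval still reports `user_approved_done` as unmet, which mirrors the rule that completion is not inferred from working code.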

Architecture Decision Records

Use ADRs for decisions that outlive a single task, such as public behavior, dependencies, or long-term design.

ADRs capture context, decision, and consequences without turning task files into long-lived design encyclopedias.

AI Workflow Framework Comparison

Relative-fit comparison of AI workflow frameworks. Rows are actions or workflow goals ordered roughly by when they appear in the software-development cycle.

The stars show how strongly the reviewed materials support each activity for LLM use, based on the specificity, operational clarity, and enforcement visible in the documents. They are not formal benchmark results or a full real-world performance score.

  • ★★★★★ = strongest support
  • ★★★★ = strong support
  • ★★★ = meaningful secondary support
  • ★★ = limited but real support
  • ★ = weak / indirect support
  • - = not a purpose there, or not evidenced in the materials reviewed

Frameworks compared

  • OpenSpec — repo-local change/spec system with proposal, specs, design, tasks, verify, and archive flow.
  • Superpowers — full coding-agent methodology with design gates, task planning, TDD, subagents, worktrees, and closeout workflows.
  • Spec Loop — governed task/increment workflow with explicit research, context-building, approval, and implementation control.
  • grill-with-docs — clarification skill with strong domain-language pressure, contradiction surfacing, and optional inline glossary/ADR capture.
  • agent-skills — broad engineering workflow library with strong anti-rationalization and verification patterns.

Some frameworks start with a broad problem or idea. Others start with a specific task that is already chosen.

  • OpenSpec and Superpowers start earlier from a broader problem, idea, or change and then move toward spec/design/tasks/implementation.
  • Spec Loop is strongest once work has already been narrowed to a task or increment and you want explicit research, alignment, approval, and governed implementation for that increment.
  • grill-with-docs is mainly a clarification and shared-language component, not a full end-to-end SDLC framework.
  • agent-skills spans many stages, but less as one integrated artifact model.

1. Upstream discovery and scoping

Action / purpose | OpenSpec | Superpowers | Spec Loop | grill-with-docs | agent-skills
Analyze a broad problem area before choosing implementation work★★★★★★★★★★★★★★★★★★
Clarify a specific requested task or increment before implementation★★★★★★★★★★★★★★★★★★★★★
Split a broad initiative or change into smaller deliverable slices/tasks★★★★★★★★★★★★★★★

2. Shared language and durable decision context

Action / purpose | OpenSpec | Superpowers | Spec Loop | grill-with-docs | agent-skills
Challenge proposed terms against existing shared language and surface terminology conflicts★★★★★★★★★
Maintain a shared project glossary / terminology-★★★★★★★★★★
Model multiple domains/contexts and their boundaries★★-★★★★★-
Surface and record architecture decisions that need durable rationale★★★-★★★★★★★★★★★

3. Define the intended change

Action / purpose | OpenSpec | Superpowers | Spec Loop | grill-with-docs | agent-skills
Create durable spec/change artifacts that remain the source of truth★★★★★★★★★★★★★★★★★★★
Write detailed technical design before implementation★★★★★★★★★★★★★★★★★★★★
Make current and target structure/behavior explicit with reviewable diagrams★★★★★
Maintain brownfield deltas between current and proposed behavior★★★★★---

4. Make the next increment implementation-ready

Action / purpose | OpenSpec | Superpowers | Spec Loop | grill-with-docs | agent-skills
Make one implementation increment ready by explicitly capturing research, constraints, design, and test expectations★★★★★★★★★★★★★★★★★★
Break approved work into actionable implementation tasks/checklists★★★★★★★★★★★★★★★★★★
Support lightweight planning for one simple increment without opening a full formal artifact workflow★★★★★★★★★

5. Govern implementation while coding

Action / purpose | OpenSpec | Superpowers | Spec Loop | grill-with-docs | agent-skills
Keep implementation constrained to the approved increment/task/change during coding★★★★★★★★★★★★★★-★★★
Use explicit guardrails against rationalization and unjustified confidence during execution★★★★★★★★★★★★★
Treat test-first development as a required implementation method★★★★★★★-
Require root-cause analysis before fixes when debugging | - | ★★★★★ | - | - | -
Use subagents plus review loops as a primary implementation strategy-★★★★★--
Use isolated development workspaces/branches as part of the normal implementation flow★★★★★---
Coordinate implementation across multiple repos or linked workspaces | ★★★★★ | - | - | - | -

6. Verify and close out

Action / purpose | OpenSpec | Superpowers | Spec Loop | grill-with-docs | agent-skills
Check implemented work against the agreed artifacts before calling it done★★★★★★★★★★★★★★-★★★★
Assess pull requests, merge requests, or diffs and prepare review artifacts★★★★★★★★★--
Drive merge or branch-closeout as an operational workflow step | - | ★★★★★ | - | - | -
Preserve completed change context in an archive or other durable historical record★★★★★★★--

7. Workflow costs

This is a different kind of comparison. Lower is not automatically better: a framework can be cheaper here because it covers less of the job, or because it keeps less written state.

These cost labels are comparative judgments based on the reviewed materials, not measurements or benchmark results.

Activities

  • Before coding = the cost of clarification, specification, design, planning artifacts, and approvals before implementation starts.
  • Coding and testing = the cost once the increment is already chosen: coding mechanics, testing method, review loops, and implementation-time clarification.
  • Maintaining authoritative written artifacts as the system grows = the cost of keeping specs, glossary files, ADRs, or similar written artifacts believable as the codebase and behavior evolve.
  • Repeated research and re-alignment per increment = the cost of re-checking current truth and rebuilding enough local context for each new increment.

Analysis

Activity | OpenSpec | Superpowers | Spec Loop | grill-with-docs | agent-skills
Before coding | High | Very high | Medium | Low | High
Coding and testing | High | Very high | Medium | High | High
Maintaining authoritative written artifacts as the system grows | High | Medium | Low | Medium | Medium
Repeated research and re-alignment per increment | Medium | Medium | Medium | High | Medium

  • Spec Loop: low artifact-maintenance cost because it keeps the authoritative written state relatively narrow, but medium repeated re-alignment cost because it re-checks current system truth through codebase research for each increment.
  • OpenSpec: higher artifact-maintenance cost because it asks a larger enduring spec set to stay believable as the system grows.
  • Superpowers: very high upfront and execution-phase cost because it wants design, planning, TDD, and strong execution controls before and during coding.
  • grill-with-docs: low upfront cost mainly because it covers the clarification/shared-language slice, not the whole end-to-end workflow, but repeated re-alignment cost is higher because it does not carry the later implementation workflow itself.
  • agent-skills: scored here as a representative spec -> plan -> implement -> verify path rather than as an abstract rating of the whole catalog.

8. References used

This comparison is based on the following materials.

  • Spec Loop: current repository skills and docs, especially skills/spec-loop-plan-task/, skills/spec-loop-clarify-task/, skills/spec-loop-implementation-flow/, docs/review-responsibility-and-traceability.md, and README.md.
  • grill-with-docs: SKILL.md, CONTEXT-FORMAT.md, and ADR-FORMAT.md.
  • OpenSpec: README.md, docs/concepts.md, and docs/workflows.md.
  • agent-skills: README.md, AGENTS.md, docs/getting-started.md, docs/skill-anatomy.md, and selected skills including interview-me, idea-refine, spec-driven-development, planning-and-task-breakdown, documentation-and-adrs, and doubt-driven-development.
  • Superpowers: README.md, AGENTS.md, and selected workflow skills including brainstorming, writing-plans, executing-plans, subagent-driven-development, test-driven-development, systematic-debugging, verification-before-completion, using-git-worktrees, requesting-code-review, receiving-code-review, finishing-a-development-branch, using-superpowers, and writing-skills.

This is a comparison of representative core materials and selected skills, not a full-repository audit of every compared project.

Skills Overview

Included Skills

This repository currently ships these skills. The core task-workflow bundle is spec-loop-plan-task, spec-loop-clarify-task, spec-loop-prepare-implementation-approval, spec-loop-implementation-flow, and spec-loop-write-adr. spec-loop-write-glossary is required when a project uses a glossary. spec-loop-setup-doc-rendering and spec-loop-assess-pull-request are optional.

  1. spec-loop-plan-task

  2. spec-loop-clarify-task

    • the clarification skill for underspecified task creation, task updates, and design updates; preferred over generic grill-me variants in Spec Loop workflows;
    • defined by skills/spec-loop-clarify-task/SKILL.md.
  3. spec-loop-prepare-implementation-approval

  4. spec-loop-implementation-flow

  5. spec-loop-write-glossary

  6. spec-loop-setup-doc-rendering

    • the optional setup and troubleshooting skill for rendering task files and glossary files.
  7. spec-loop-write-adr

  8. spec-loop-assess-pull-request

Documentation

Recommended quick-check order:

  1. Check the planning, clarification, and implementation-flow skills briefly.

  2. Study the Wordle example by commit history.

    • The Wordle commit history shows the workflow under real version-control pressure: how task specifications evolve step by step, and how implementation and tests follow approved design.
  3. Check Governance, Review, and Traceability. It explains how fileless chat tasks, task files, workflow rules, common task guidance, and the task-file path guidance map to team development practice: boundaries, responsibility, commit linking, and status discipline.

  4. Compare framework trade-offs.

  5. Follow one of the hands-on tutorials.

    • Wordle Tutorial walks through a compact Java example with staged planning, approvals, implementation, glossary maintenance, and testing.
    • Online Art Game Tutorial walks through a complete browser-oriented example with staged planning, approvals, implementation, and testing.
    • The two tutorials teach the same Spec Loop workflow: planning first, explicit approval before implementation, small reviewable tasks or subtasks, verification, and user correction when the LLM misses a supporting update. The main difference is the technical setting: Wordle is a compact Java path, while the online art game is browser-oriented. You can choose either tutorial.
  6. Read the project glossary conventions.

Diagram and Rendering Policy

Spec Loop treats diagrams as specification artifacts: they make design intent reviewable at the same boundary as the surrounding text.

Where the task-file path guidance requires diagrams in task files, use PlantUML by default.

Mermaid is a weaker but acceptable alternative when the user or another governing instruction explicitly prefers it, for example when GitHub or a similar environment is used and PlantUML is not rendered.

PlantUML remains the recommended default in practice because it is usually easier to keep precise and reviewable for the structural and behavioral design work used in Spec Loop.

For inline PlantUML rendering in Markdown on the web, view the repo on GitLab. GitHub does not render PlantUML embedded in Markdown natively, so reading there can degrade the intended experience.

For local preview setup, use the spec-loop-setup-doc-rendering skill. If you do not want to use the skill and prefer manual setup, use these editor-specific references: VS Code-Based IDE Setup and JetBrains Setup Reference.

Online Art Game Tutorial: You Send, You See

This tutorial uses public data from the Art Institute of Chicago (AIC). This project is not affiliated with or endorsed by AIC.

Bootstrap

B1. Create an empty museum-tutorial-project

Run this from a workspace directory of your choice:

mkdir -p museum-tutorial-project
cd museum-tutorial-project
git init

B2. Install the Spec Loop skills

npx skills add dpolivaev/spec-loop -s '*'

This recommended path requires Node.js because it uses npx. For global installation for all agents, use:

npx skills add dpolivaev/spec-loop -g --all

--all installs all skills for all supported agents. For other installation variants, see https://github.com/vercel-labs/skills.

B3. Open the project

Open museum-tutorial-project in your coding tool.

B4. Select the model explicitly

For this tutorial, select the model explicitly instead of relying on automatic model choice. With an unknown model, poor instruction following is more likely.

Continue with Step 1 from the museum-tutorial-project root. Send the tutorial prompts from there unless a later step says otherwise.

B5. Prepare task and glossary rendering in your editor

Run this step unless you already know your editor is prepared to render:

  • Markdown task files with embedded PlantUML
  • AsciiDoc glossary files with embedded diagrams

If you review in VS Code, Cursor, or another VS Code-based IDE and want to run the helper script directly instead of using the skill, use the instructions in README.md: Prepare task and glossary rendering. Then skip the You send prompt below. Use Verification to confirm the expected editor state.

If you do not want to use the skill, use these editor-specific references instead: VS Code-Based IDE Setup and JetBrains Setup Reference.

You send

Please use the `spec-loop-setup-doc-rendering` skill to help me
prepare my editor for reviewing rendered Spec Loop task files and
glossary files.

My coding tool may run in a terminal, but I review files in
<VS Code, Cursor, another VS Code-based IDE, or JetBrains>.

You see

  • uses the spec-loop-setup-doc-rendering skill,
  • reads the setup document for your editor,
  • guides you through the rendering setup needed for task and glossary review.

Verification

  • your editor is ready to review Markdown task files with embedded PlantUML,
  • your editor is ready to review AsciiDoc glossary files with embedded diagrams.

⚠️ Default rule for later clarification questions

For the rest of this tutorial, if the assistant asks a clarification question and gives a recommendation, follow the recommendation unless you intentionally want a different path.

If the assistant starts asking too many separate clarification questions and you want to speed the rest up, tell it: Please prefer decision batches over separate questions for the rest of this clarification round.

Step 1: Confirm Spec Loop in the tutorial project

You send

I am following the Spec Loop online art game tutorial from my browser.
Please work in this project according to the Spec Loop workflow defined by the installed skills.

Tutorial-specific goals:
- use the normal planning workflow for non-trivial work,
- later tutorial steps will create and maintain `glossary.adoc`,
- rendering setup help is only needed again if a later step requires
  it,
- tell me how you will work here and restate the
  `PLAN -> IMPLEMENTATION` approval rule in one sentence.

Your intent

  • Confirm that the assistant is actually following the installed Spec Loop workflow in this repository.
  • Make it restate the planning-before-implementation approval boundary before any real work starts.

You see

Read the assistant's final response carefully, even if you skip intermediate reasoning. Before continuing, confirm these points:

  • the assistant says it will follow the Spec Loop workflow defined by the installed skills in this project;
  • the assistant makes clear that non-trivial work will go through the normal planning path before implementation;
  • the assistant correctly restates the PLAN -> IMPLEMENTATION approval rule.

You learned (this step)

  • Setup is now package installation, with a separate editor-rendering step when needed.
  • The tutorial may be open in your browser while the assistant only sees the museum-tutorial-project, so prompts must still carry the context it needs.

If setup seems wrong

  1. Ask the assistant which installed skills are active.
  2. Ask it to restate the PLAN -> IMPLEMENTATION approval rule.
  3. If that still looks wrong, reinstall the skills with:
npx skills add dpolivaev/spec-loop -s '*'
  4. For global installation for all agents, use -g --all. For other installation variants, check https://github.com/vercel-labs/skills.
  5. If npx is not available or does not help, copy the needed part of the skills/ directory from https://github.com/dpolivaev/spec-loop into the tool-specific skills directory.
  6. If the tool still does not automatically apply the expected workflow, explicitly ask for the needed skill by name.
  7. Continue only when the assistant clearly understands the setup and the workflow rules.

From here on

  • each You send block is a prompt to adapt and send,
  • each You see block describes the expected outcome,
  • if you want to finish the tutorial in minimum time, send the next prompt first, then read the rest of the step and think about it while the assistant works, because the assistant also needs time to act and respond,
  • validate progress from the changed files and the assistant's final response before continuing,
  • for routine steps, you can usually skip intermediate reasoning and read the assistant's final response carefully once it finishes,
  • if the assistant misses a required setup, project instructions, glossary, or status update, ask it to fix that before continuing,
  • if the setup or workflow rules seem wrong, use the recovery steps above before continuing.

Possible misalignment

If one of these happens, interrupt the flow and ask the assistant to correct it before continuing:

  • it starts changing files or config before showing the plan and getting approval,
  • it cannot clearly explain which Spec Loop setup is active or restate the PLAN -> IMPLEMENTATION approval rule,
  • it ignores the installed workflow rules,
  • it starts implementation before explicit approval,
  • unrelated changes are mixed into one subtask,
  • implementation changes are made without verification evidence,
  • it misses required supporting updates such as glossary, task status, or ignore rules,
  • the assistant's final response does not match the actual changed files,
  • a task or subtask is moved to done without explicit user confirmation.

Step 2: Project README (README.md)

You send

Project brief:

We are building a small website with two parts:
1) a museum overview page based on Art Institute of Chicago data,
2) a game called Progressive Timeline.

Data source attribution:
- Art Institute of Chicago (AIC): https://www.artic.edu/
- Attribution must be preserved in generated outputs.
- This project is an educational exercise and should clearly attribute
  AIC as the source of museum content and artwork metadata.

In Progressive Timeline, the player must order artworks by year
from earliest to latest.

Level progression:
- Level 1: 2 artworks
- Level 2: 3 artworks
- Level 3: 4 artworks
- each next level adds one artwork

Data rule:
- use only artworks with a clearly extractable year
- exclude artworks with ambiguous years

The game includes a leaderboard sorted by:
1) reached level (desc)
2) total completion time (asc) for ties

Please write `README.md` for this repository based on the project brief.
Include the project brief verbatim in the README under a "Project Brief"
section. The README must preserve the AIC attribution requirements from
the brief and clearly describe the two parts (museum overview page +
Progressive Timeline game), the core rules, and the leaderboard sorting.
Keep the README concise and practical.

Also create `glossary.adoc` from the approved project brief. It should
define the canonical project terms needed for this tutorial and keep
their wording consistent with the brief.

Also create `.gitignore` if you find any harness-specific or IDE-specific
configuration files in this repository.

Also update the active project instructions file (for example
`AGENTS.md`) so it explicitly tells the assistant to read `README.md`
and follow the "Project Brief" section there for project requirements
unless I explicitly override it. The instructions file must also say
that this project never uses the fileless planning path: any code change
requires creation of a task file.

This is documentation-only work, so we do not need a task file for it.

Your intent

  • Turn the project brief into durable project files before implementation starts.
  • Lock in the shared vocabulary, attribution rules, and the rule that every later code change needs a task file.

You see

  • README.md:
    • Exists and captures the project brief requirements.
    • Includes the project brief text under "Project Brief".
  • glossary.adoc:
    • Exists and defines the canonical project terms from the brief.
    • Uses wording consistent with the brief so later tasks can reuse it.
  • .gitignore:
    • Exists if harness-specific or IDE-specific configuration files were found.
  • Project instructions file:
    • Explicitly points the assistant to README.md as the source of the project brief and requirements.
    • States that the fileless planning path is never allowed in this project and that any code change requires a task file.

After completion (commit)

  • After you accept this work item as done: ask the assistant to commit the README, glossary.adoc, .gitignore (if created), and instructions-file changes.

You learned (this step)

  • The assistant can create documentation, add lasting instructions that point to the project brief, establish glossary.adoc as the project vocabulary, and (after you accept it) commit without creating a task file.
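
The level progression and ordering rule from the project brief can be sketched as two small functions. This is only an illustration of the rules, not the tutorial's implementation: the function names are hypothetical, Python is used purely as a neutral notation (the real stack is chosen later in the ADR step), and accepting equal years as correctly ordered is an assumption the brief does not settle.

```python
from typing import Sequence


def artworks_for_level(level: int) -> int:
    """Number of artworks shown at a level: Level 1 has 2, each next level adds one."""
    if level < 1:
        raise ValueError("levels start at 1")
    return level + 1


def is_correct_order(years_in_player_order: Sequence[int]) -> bool:
    """True when the player's order runs from earliest to latest year.

    Assumption: two artworks with the same year count as correctly ordered
    either way; the brief does not define tie handling.
    """
    pairs = zip(years_in_player_order, years_in_player_order[1:])
    return all(earlier <= later for earlier, later in pairs)
```

Keeping rules like these in plain functions or classes, separate from any page code, is also what makes them easy to cite in `glossary.adoc` and to test without a browser.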

Step 3: Museum Overview Page (site/index.html) + Just-Enough API Research

Optional note: Playwright MCP or Playwright CLI can help later if you want the assistant to navigate, check, and debug the web pages and scripts it produces. Depending on your tool, you can discuss with the assistant whether to install one of them now or later. For Playwright MCP, see https://github.com/microsoft/playwright-mcp#getting-started (helpful, but not required for finishing the tutorial).

You send

Ensure a sibling `data-aggregator` checkout exists at
`../data-aggregator` relative to this repository.

If it is missing, clone
`https://github.com/art-institute-of-chicago/data-aggregator.git`
into a parallel directory first.

If the clone fails because you do not have the needed access, stop and
ask me either to run the clone myself or to give you the needed access.

After the correct location is confirmed, add it to the active
project instructions file so future work can reuse it without
re-asking.

Your intent

  • Resolve the external sibling dependency up front instead of letting later steps guess or re-ask.
  • Record the confirmed path in project instructions so later work can reuse it.

You see

  • ../data-aggregator exists as a sibling checkout.
  • If the assistant had enough access, it performed the clone itself.
  • If the clone could not be performed automatically, the assistant stopped and told you exactly what to do before continuing.

You send

Let us work on the museum overview page in this repository by creating
`site/index.html`.

Requirements:
- run real HTTP checks with curl (or equivalent) against the public AIC
  API; do not run a local instance
- introduce AIC as the data source
- show departments
- show exactly 20 representative artworks with title, artist,
  department, and image for each item
- use API data and image URLs programmatically without manual downloads
- add automated checks that prove the page can be served and opened
- report the exact local serve command in chat

Your intent

  • Force real external API research before implementation instead of invented or local-only assumptions.
  • Keep the page task reviewable with exact serve/open verification requirements.

You see (plan)

  • A task file is created automatically, and implementation still waits for explicit approval.
  • Instructions: the active project instructions file is updated to record the confirmed sibling data-aggregator path.
  • Task file:
    • Contains Scope, Motivation, Research, Design, and Test specification (and other required sections, for example Scenario when applicable).
    • Research includes curl verification evidence and practical rules needed for the museum page (including image URL rules) and any relevant reference notes from data-aggregator.

Approve only after the task definition looks correct. If the assistant does not create the task automatically, the task content does not have the required form, or embedded PlantUML does not render correctly, correct it before approving anything. If needed, send the error text or a screenshot and ask the assistant to fix the diagram.
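
As a point of comparison for the image URL rules you should find under Research, here is a minimal sketch of how a list query and an image URL might be built. The endpoint, field names, and IIIF URL pattern reflect the public AIC API documentation as commonly described; treat all of them as assumptions that the assistant's curl evidence must confirm, and note that the function names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumptions to verify with curl: base URLs and field names are not taken
# from this tutorial, they are recalled from the public AIC API docs.
AIC_API_BASE = "https://api.artic.edu/api/v1"
DEFAULT_IIIF_BASE = "https://www.artic.edu/iiif/2"
ARTWORK_FIELDS = ["id", "title", "artist_display", "department_title", "image_id"]


def artworks_query_url(limit: int = 20) -> str:
    """Build a list query that asks only for the fields the page needs."""
    query = urlencode({"fields": ",".join(ARTWORK_FIELDS), "limit": limit}, safe=",")
    return f"{AIC_API_BASE}/artworks?{query}"


def image_url(image_id: Optional[str], iiif_base: str = DEFAULT_IIIF_BASE) -> Optional[str]:
    """Build an IIIF image URL, or return None when the artwork has no image."""
    if not image_id:
        return None
    return f"{iiif_base}/{image_id}/full/843,/0/default.jpg"
```

If the Research section in the task file records a different URL pattern or a different IIIF base (for example one read from the API response itself), the task file wins over this sketch.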

You see (after implementation is completed)

  • Verification evidence includes the exact local serve/open command and its result.
  • site/index.html: exists and shows exactly 20 artworks with title, artist, department, and image.
  • The task file is in review.
  • The task file may include Implementation notes when relevant; if present, review them as part of the reviewer-facing task artifact.
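
The "page can be served and opened" check might look roughly like the sketch below. It assumes a static directory served by Python's built-in `http.server`, which is only one possible choice; the actual serve command is whatever the assistant reports in chat, and `check_page_is_served` is a hypothetical helper name.

```python
import functools
import http.server
import threading
import urllib.request
from pathlib import Path


class _QuietHandler(http.server.SimpleHTTPRequestHandler):
    """Static file handler that does not write request logs to stderr."""

    def log_message(self, format, *args):
        pass


def check_page_is_served(site_dir: Path, page: str = "index.html") -> int:
    """Serve site_dir on a free local port, fetch the page, return the HTTP status."""
    handler = functools.partial(_QuietHandler, directory=str(site_dir))
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        url = f"http://127.0.0.1:{port}/{page}"
        # urlopen raises HTTPError for 4xx/5xx, so a missing page fails loudly.
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    finally:
        server.shutdown()
        server.server_close()
```

A check like this proves only that the file is served. Whether exactly 20 artworks render with title, artist, department, and image still needs its own verification evidence.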

After acceptance (move to done / commit)

  • After you accept this work item as done: tell the assistant to move the task to done and commit.

You learned (this step)

  • Implementation starts only after explicit approval and is verified with concrete evidence.

For all work items below that include implementation: the assistant is expected to follow the Spec Loop workflow rules automatically; direct manual guidance is the exception. If the assistant starts implementation before planning is complete and you have explicitly approved it, or over-designs future work too early, first check whether it remembers the workflow rules (for example, ask it to restate the PLAN -> IMPLEMENTATION approval gate), then tell it to stop and follow those workflow rules strictly.

Step 4: Architecture Decision Record (ADR) for Game Stack and Core Design Style

This step is intentionally more explicit than many real prompts for an initial implementation. Its purpose is to demonstrate architectural decision capture, tooling selection, reviewable design expectations, and later task alignment in a single example. In a smaller or lower-risk project, a lighter ADR prompt may be sufficient.

You send

Please create one ADR for stack selection and core design style for the
initial game implementation in `architecture-decisions/`.

First discuss the criteria with me. We want a stack and design approach
for the initial game implementation that support a clean, layered,
class-based design: the game rules should live in explicit domain
classes, should not be tied to the UI, the design should stay visible
and reviewable with a class diagram, and most core logic should be
testable without the browser. Persistence stack decisions are deferred.

Then compare 3-5 realistic stack options for the initial game
implementation with pros and cons. Include at least one simpler option
and at least one option that is a strong fit for clean or hexagonal
architecture.

Record one final choice with rationale. In the same ADR:
- define the practical test tooling
- define the exact test command(s)
- define the browser-based tooling for gameplay and design checks
- define the expected high-level architecture for the initial game
  implementation
- require a class-based core design with explicit domain classes and
  clear UI-adapter boundaries
- state that later task Design should be reviewable with a class
  diagram
- explain why the chosen stack and design style are a good fit for
  clean, reviewable design
- mark persistence as out of scope and deferred to the leaderboard work

Your intent

  • Ask for the criteria discussion in a way that should make the assistant use the normal spec-loop-clarify-task flow instead of free-form brainstorming.
  • Capture stack, design style, tooling, and the persistence deferral in one durable ADR.

You see

  • The ADR is preceded by a decision-criteria discussion in the normal spec-loop-clarify-task format.
  • If the assistant starts an unstructured discussion instead, stop it and say: Use the spec-loop-clarify-task skill for the criteria discussion before writing the ADR.
  • ADR:
    • Compares realistic stack options for the initial game implementation and records the chosen one with rationale.
    • Records the required core design style, not only the implementation stack.
    • Explains the choice in terms of clean/layered class-based design, not only implementation speed.
    • Includes test tooling and the exact test command(s).
    • Includes browser-based tooling for gameplay and design checks.
    • Defines the expected high-level architecture for the initial game implementation.
    • Requires explicit domain classes for core gameplay logic and clear boundaries to UI/browser code.
    • Makes later class-diagram-based design review an explicit expectation.
    • Marks persistence as out of scope and deferred to the leaderboard work.

After completion (commit)

  • After you accept the ADR as done: ask the assistant to commit the ADR. This step is ADR-only and does not involve moving anything to done.

You learned (this step)

  • ADRs capture long-lived decisions (including the exact test command) without requiring a task file.

Step 5: Core Gameplay (Subtasks)

You send

Starting point: reuse relevant AIC API research already recorded in
this repository and follow the ADR.

Let us work on core gameplay in this repository.

The scope must include a Level 1 playable flow with 2 artworks,
progressive levels where each next level adds one artwork, and strict
year eligibility that accepts only standalone 4-digit years like 1879
and rejects ranges, circa/ca., decades, null or unknown values, and
mixed text values. Ensure the game page is reachable from a link on
site/index.html.

For the initial task creation, do not fully design every future
subtask. Create only:
- the overall task,
- subtasks that each contain only Scope and Motivation.

Your intent

  • Make the assistant break gameplay into reviewable subtasks instead of designing the whole feature in one pass.
  • Reuse the approved ADR and earlier research rather than rediscovering those decisions inside the task.

You see (plan)

  • A task file is created automatically with a task header and an ordered subtask breakdown, and it is waiting for your review.
  • Task file:
    • Overall Scope and Motivation are clear.
    • Each subtask has Scope and Motivation, but future subtasks are not fully designed yet.
    • Relevant earlier task-file research is referenced where needed.
    • Task and subtask terminology aligns with glossary.adoc.

Subtask-by-subtask workflow

  • Review the task header and the task breakdown first.
  • If the breakdown needs adjustment, ask the assistant to revise it before any implementation starts.
  • If it looks good, ask the assistant to fully design only the first subtask.
  • Review that current subtask detail. If it looks good, ask the assistant to implement only that subtask.
  • After each implemented subtask reaches review, either ask for changes or accept it and ask the assistant to move that subtask to done.
  • Then ask it to create a separate commit and only after that ask it to design the next subtask.

You see (current subtask design)

  • Only the current subtask is fully designed, and implementation still waits for explicit approval.
  • Task file: the current subtask is designed with all class diagrams; future subtasks remain lightweight.
  • The current subtask Design and Constraints, when present, use glossary terms from glossary.adoc consistently and make any glossary term change explicit before approval.

You see (during subtask implementation)

  • Only the approved current subtask is implemented before the next review step.
  • The implemented current subtask moves to review when local verification is complete.
  • When the last remaining unfinished subtask reaches review and no more work remains, the overall task moves to review too.
  • Tests: separate verification evidence is provided per implemented subtask.
  • Git: there is a separate commit per accepted subtask.
  • Code: game is reachable from site/index.html and playable (after relevant subtasks complete).
  • glossary.adoc: expands to cover the core gameplay terms introduced by the implementation and links those terms to the relevant code.

After acceptance (move to done / commit)

  • After you accept an earlier subtask as done: ask the assistant to move that subtask to done, then commit.
  • After you accept the final subtask as done: ask the assistant to move that subtask to done; if no more work remains, also move the overall task to done, then commit.

You learned (this step)

  • Keep future subtasks lightweight until you reach them: review the current subtask in detail, implement it, verify it, commit it, then move on.
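
The strict year eligibility rule from the prompt above can be illustrated with a small parser. This is a sketch under stated assumptions, not the design the assistant will produce: the function name is hypothetical, the input is assumed to be a free-text date string such as an API display date, and trimming surrounding whitespace before matching is an assumption the prompt does not spell out.

```python
import re
from typing import Optional

# Exactly four ASCII digits and nothing else.
_STANDALONE_YEAR = re.compile(r"[0-9]{4}")


def extract_eligible_year(date_text: Optional[str]) -> Optional[int]:
    """Return the year for standalone 4-digit values, otherwise None.

    Rejected by construction: ranges ("1879-1885"), circa forms ("c. 1879",
    "ca. 1879"), decades ("1880s"), null or unknown values, and mixed text.
    """
    if date_text is None:
        return None
    candidate = date_text.strip()  # assumption: surrounding whitespace is harmless
    if _STANDALONE_YEAR.fullmatch(candidate):
        return int(candidate)
    return None
```

Using a full match instead of a search is the important design choice here: a search would happily pull 1879 out of "c. 1879" and let an ambiguous artwork into the game.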

Step 6: Leaderboard Clarification (In-Memory, Then Persistence)

You send

Let us work on the leaderboard in this repository.

I want you to fully design the new leaderboard task in the backlog.

Your intent

  • Intentionally leave the leaderboard under-specified so the assistant must surface the missing persistence decision.
  • Once that branch is resolved, keep the work staged: in-memory first, persistence later.

You see (clarification)

  • If the assistant starts fully designing the leaderboard task instead of clarifying first, stop it and say: Use the spec-loop-clarify-task skill before designing this task.
  • The assistant does not fully design the task immediately.
  • It first surfaces the material unresolved branch or branches and asks clarifying questions in the normal spec-loop-clarify-task format:
    • Question:
    • Recommendation:
    • Options: when explicit options are needed
    • Reason:

If the assistant's first clarification is about persistence scope, reply exactly with:

Break the implementation work down in this order:
1. in-memory leaderboard implementation
2. persistence implementation

Design only the in-memory leaderboard subtask fully.

If any other unresolved decisions remain, please prefer decision
batches over separate questions for the rest of this clarification
round.

If the assistant asks any other clarification question, or presents a decision batch, accept the recommended options unless you intentionally want a different path. If it includes persistence scope again and recommends something else, correct that answer to the in-memory-then-persistence path above.

You see (plan after clarification)

  • A separate leaderboard backlog task is created automatically and is waiting for your review.
  • Task file:
    • exists with ordered implementation subtasks,
    • keeps future implementation subtasks lightweight,
    • requires a separate persistence ADR before persistence implementation is fully designed, and
    • has the in-memory leaderboard subtask fully designed.

You send

Implement it.

You see (in-memory implementation)

  • Verification evidence is provided for the in-memory leaderboard subtask.
  • The in-memory leaderboard subtask is in review.
  • Behavior: leaderboard sorting matches the required rules.
  • glossary.adoc: links the leaderboard terms to the implemented code.
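
When you check the sorting behavior, it can help to have the rule from the project brief in executable form: reached level descending, then total completion time ascending for ties. The class and field names below are hypothetical and the sketch is independent of whatever stack the ADR selected.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class LeaderboardEntry:
    player: str
    reached_level: int
    total_completion_seconds: float


def sort_leaderboard(entries: Iterable[LeaderboardEntry]) -> List[LeaderboardEntry]:
    """Order by reached level (desc), then total completion time (asc) for ties."""
    return sorted(
        entries,
        key=lambda entry: (-entry.reached_level, entry.total_completion_seconds),
    )
```

A useful verification case has two players on the same level with different times plus one player on a lower level but with the fastest time, so that both sort keys are exercised.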

You send

Please create the persistence ADR. The ADR must define the
chosen persistence approach, storage location, reset procedure for local
development and tests with an exact command, and practical verification
commands.

You see (persistence ADR)

  • ADR: records the chosen persistence approach, storage location expectations, reset procedure expectations, and practical verification commands before the persistence implementation subtask is fully designed.

You send

Please design the remaining subtask.

You see (persistence subtask design)

  • Task file: the persistence implementation subtask is fully designed.

You send

Implement it.

You see (persistence implementation)

  • Verification evidence is provided for the persistence implementation subtask.
  • The persistence implementation subtask is in review.
  • If no more work remains, the overall task is in review too.
  • Docs: storage location and reset procedure are documented with an exact command.
  • Behavior: leaderboard sorting matches the required rules and data survives restart.
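
"Data survives restart" and "reset procedure" can be made concrete with a round trip like the one below. The sketch assumes a single JSON file as the store, which is only one option; the persistence ADR decides the real approach, storage location, and reset command, and all function names here are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, List


def save_entries(store: Path, entries: List[Dict]) -> None:
    """Write all leaderboard entries to the store file."""
    store.write_text(json.dumps(entries), encoding="utf-8")


def load_entries(store: Path) -> List[Dict]:
    """Read entries back; a missing store means an empty leaderboard."""
    if not store.exists():
        return []
    return json.loads(store.read_text(encoding="utf-8"))


def reset_store(store: Path) -> None:
    """Reset for local development and tests by removing the store file."""
    if store.exists():
        store.unlink()
```

A restart check then amounts to: save, discard all in-memory state, load, and compare. The reset step should leave the next load empty.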

After acceptance (move to done / commit)

  • After you accept the in-memory leaderboard subtask as done: ask the assistant to move that subtask to done, then commit.
  • After you accept the persistence ADR: ask the assistant to commit the ADR change.
  • After you accept the persistence implementation subtask as done: ask the assistant to move that subtask to done; if no more work remains, also move the overall task to done, then commit.

You learned (this step)

  • Intentionally incomplete prompts can trigger proactive clarification before task drafting.
  • Ordered delivery reduces risk: get the in-memory behavior working first, make the persistence decision explicitly, then implement persistence.

You learned

Each step follows the Spec Loop workflow model:

  • In chat, you ask the assistant to work on a feature or long-lived design decision.
  • For implementation work, the assistant should create the needed task automatically before making executable changes.
  • For larger tasks, the first planning pass may stop at the task header and an ordered subtask breakdown; only the current subtask is designed in detail before implementation.
  • You approve or reject implementation explicitly.
  • Only after explicit approval should the assistant make executable changes (code/tests/config/runtime assets).
  • Tasks should include automated tests for their deliverables.
  • In large implementation steps, ask the assistant to decompose work into smaller implementation subtasks before detailed design and implementation approval.
  • Every implementation subtask includes both implementation and testing.
  • When subtasks exist, require separate status updates per subtask (each subtask is tracked independently).
  • Review-ready implementation moves the current task or subtask to review; after you accept it, you may ask the assistant to move it to done.
  • If the assistant plans too much, skips needed file updates, or starts implementation too early, correct it and ask it to return to the expected workflow.
  • After you explicitly accept a work item as done, ask the assistant to commit before moving on.
  • Depending on your tool, you may be asked to confirm the commit command (review the commit message there), or the commit may happen immediately (review the commit message right after). If it does not match the work item's purpose, or it is misleading about what changed, ask the assistant to improve the message and amend the commit.
  • When a step is implemented via subtasks: move the overall task to done only after the last subtask is done.

Learning outcomes:

  • Keep task and subtask scopes small and reviewable.
  • Use ADRs for architectural decisions with clear rationale.
  • Verify behavior using concrete evidence, not assumptions.

How to think while running this tutorial:

  • Keep the process meaningful, not bureaucratic.
  • Low-risk, small cleanup that does not change behavior (for example .gitignore or documentation typo fixes) may be done as part of a step; after you accept it, you can ask the assistant to commit it with that step when appropriate.
  • Chat is for coordination and approvals; task files and ADRs are the long-lived specification files.
  • Trust the installed skills to choose the workflow, and correct the assistant explicitly if it skips planning, over-designs future work, or misses a required file update.
  • Only the user may relax or override these workflow rules.

Wordle Tutorial: You Send, You See

Bootstrap

B1. Create an empty wordle-tutorial-project

Run this from a workspace directory of your choice:

mkdir -p wordle-tutorial-project
cd wordle-tutorial-project
git init

B2. Install the Spec Loop skills

npx skills add dpolivaev/spec-loop -s '*'

This recommended path requires Node.js because it uses npx. For global installation for all agents, use:

npx skills add dpolivaev/spec-loop -g --all

--all installs all skills for all supported agents. For other installation variants, see https://github.com/vercel-labs/skills.

B3. Open the project

Open wordle-tutorial-project in your coding tool.

B4. Select the model explicitly

For this tutorial, select the model explicitly instead of relying on automatic model choice. With an unknown model, poor instruction following is more likely.

Continue with Step 1 from the wordle-tutorial-project root. Send the tutorial prompts from there unless a later step says otherwise.

B5. Prepare task and glossary rendering in your editor

Run this step unless you already know your editor is prepared to render:

  • Markdown task files with embedded PlantUML
  • AsciiDoc glossary files with embedded diagrams

If you review in VS Code, Cursor, or another VS Code-based IDE and want to run the helper script directly instead of using the skill, use the instructions in README.md: Prepare task and glossary rendering. Then skip the You send prompt below. Use Verification to confirm the expected editor state.

If you do not want to use the skill, use these editor-specific references instead: VS Code-Based IDE Setup and JetBrains Setup Reference.

You send

Please use the `spec-loop-setup-doc-rendering` skill to help me
prepare my editor for reviewing rendered Spec Loop task files and
glossary files.

My coding tool may run in a terminal, but I review files in
<VS Code, Cursor, another VS Code-based IDE, or JetBrains>.

You see

  • uses the spec-loop-setup-doc-rendering skill,
  • reads the setup document for your editor,
  • guides you through the rendering setup needed for task and glossary review.

Verification

  • your editor is ready to review Markdown task files with embedded PlantUML,
  • your editor is ready to review AsciiDoc glossary files with embedded diagrams.

⚠️ Default rule for later clarification questions

For the rest of this tutorial, if the assistant asks a clarification question and gives a recommendation, follow the recommendation unless you intentionally want a different path.

If the assistant starts asking too many separate clarification questions and you want to speed the rest up, tell it: Please prefer decision batches over separate questions for the rest of this clarification round.

Step 1: Confirm Spec Loop in the tutorial project

You send

I am following the Spec Loop Wordle tutorial from my browser.
Please work in this project according to the Spec Loop workflow defined by the installed skills.

Tutorial-specific goals:
- use the normal planning workflow for non-trivial work,
- later tutorial steps will create and maintain `glossary.adoc`,
- rendering setup help is only needed again if a later step requires
  it,
- browser automation setup is not needed for this tutorial,
- tell me how you will work here and restate the
  `PLAN -> IMPLEMENTATION` approval rule in one sentence.

Your intent

  • Confirm that the assistant is actually following the installed Spec Loop workflow in this repository.
  • Make it restate the planning-before-implementation approval boundary before any real work starts.

You see

Read the assistant's final response carefully, even if you skip intermediate reasoning. Before continuing, confirm these points:

  • the assistant says it will follow the Spec Loop workflow defined by the installed skills in this project;
  • the assistant makes clear that non-trivial work will go through the normal planning path before implementation;
  • the assistant correctly restates the PLAN -> IMPLEMENTATION approval rule.

You learned (this step)

  • Setup is now package installation, with a separate editor-rendering step when needed.
  • The tutorial may be open in your browser while the assistant only sees the wordle-tutorial-project, so prompts must still carry the context it needs.

If setup seems wrong

  1. Ask the assistant which installed skills are active.
  2. Ask it to restate the PLAN -> IMPLEMENTATION approval rule.
  3. If that still looks wrong, reinstall the skills with:
npx skills add dpolivaev/spec-loop -s '*'
  4. For global installation for all agents, use -g --all. For other installation variants, check https://github.com/vercel-labs/skills.
  5. If npx is not available or does not help, copy the needed part of the skills/ directory from https://github.com/dpolivaev/spec-loop into the tool-specific skills directory.
  6. If the tool still does not automatically apply the expected workflow, explicitly ask for the needed skill by name.
  7. Continue only when the assistant clearly understands the setup and the workflow rules.

From here on

  • each You send block is a prompt to adapt and send,
  • each You see block describes the expected outcome,
  • if you want to finish the tutorial in minimum time, send the next prompt first, then read it and think about it while the assistant works, because the assistant needs time to act and respond anyway,
  • validate progress from the changed files and the assistant's final response before continuing,
  • for routine steps, you can usually skip intermediate reasoning and read the assistant's final response carefully once it finishes,
  • if the assistant misses a required setup, project instructions, glossary, or status update, ask it to fix that before continuing,
  • if the setup or workflow rules seem wrong, use the recovery steps above before continuing.

Possible misalignment

If one of these happens, interrupt the flow and ask the assistant to correct it before continuing:

  • it starts changing files or config before showing the plan and getting approval,
  • it cannot clearly explain which Spec Loop setup is active or restate the PLAN -> IMPLEMENTATION approval rule,
  • it ignores the installed workflow rules,
  • it starts implementation before explicit approval,
  • unrelated changes are mixed into one subtask,
  • implementation changes are made without verification evidence,
  • it misses required supporting updates such as glossary, task status, or ignore rules,
  • the assistant's final response does not match the actual changed files,
  • a task or subtask is moved to done without explicit user confirmation.

Step 2: Project README (README.md)

You send

Project brief:

We are building a small Java implementation of Wordle.

Gameplay rules:
- the system selects one hidden five-letter solution word
- the player submits five-letter guesses
- each guessed letter produces feedback:
  - `=` correct letter in the correct position
  - `~` correct letter in the wrong position
  - `.` letter not present in the solution
- duplicate letters must be evaluated deterministically
- the player has a limited number of attempts; default 6

Interaction modes:
- CLI mode is required
- later, add a minimal UI that reuses the same core logic

Word list rules:
- keep an internal packaged word list
- later, allow overriding the word list source with a file path or URL

Technical direction:
- use Java with Gradle
- keep gameplay rules in explicit domain classes that are not tied to
  the UI

Please write `README.md` for this repository based on the project brief.
Include the project brief verbatim in the README under a "Project Brief"
section. The README must clearly describe the game rules, the later CLI
and UI paths, and the word-list expectations. Keep the README concise
and practical.

Also create `glossary.adoc` from the approved project brief. It should
define the canonical project terms needed for this tutorial and keep
their wording consistent with the brief.

Also create `.gitignore` if you find any harness-specific or IDE-specific
configuration files in this repository.

Also update the active project instructions file (for example
`AGENTS.md`) so it explicitly tells the assistant to read `README.md`
and follow the "Project Brief" section there for project requirements
unless I explicitly override it. The instructions file must also say
that this project never uses the fileless planning path: any code change
requires creation of a task file.

This is documentation-only work, so we do not need a task file for it.

Your intent

  • Turn the project brief into durable project files before implementation starts.
  • Lock in the shared vocabulary and the rule that every later code change needs a task file.

You see

  • README.md:
    • exists and captures the project brief requirements,
    • includes the project brief text under Project Brief.
  • glossary.adoc:
    • exists and defines the canonical project terms from the brief,
    • uses wording consistent with the brief so later tasks can reuse it.
  • .gitignore:
    • exists if harness-specific or IDE-specific configuration files were found.
  • Project instructions file:
    • explicitly points the assistant to README.md as the source of the project brief and requirements,
    • states that the fileless planning path is never allowed in this project and that any code change requires a task file.

After completion (commit)

  • After you accept this work item as done: ask the assistant to commit the README, glossary.adoc, .gitignore (if created), and instructions-file changes.

You learned (this step)

  • The assistant can create documentation, add lasting instructions that point to the project brief, and establish glossary.adoc as the project vocabulary without creating a task file.

Step 3: Gradle Java project setup

You send

Let us work on initial Gradle Java project setup in this repository.

The scope must include:
- a single-module Gradle project,
- Gradle wrapper files,
- Kotlin DSL build scripts,
- Java 21 toolchain configuration,
- application plugin wiring,
- standard `src/main/java`, `src/test/java`, and `src/main/resources`
  layout,
- just enough code to prove the application can build, test, and run.

Your intent

  • Start with a small implementation task that proves the normal plan-review-implement loop.
  • Keep scope tight: just enough Gradle and Java setup to build, test, and run.

You see (plan)

  • A task file is created automatically, and implementation still waits for explicit approval.
  • Task file:
    • contains Scope, Motivation, Briefing, Research, Design, and Test specification,
    • records the chosen Gradle wrapper version in Research,
    • includes a build-layout diagram (PlantUML by default; Mermaid only when explicitly preferred).

Approve only after the task definition looks correct. If the assistant does not create the task automatically, the task content does not have the required form, or embedded PlantUML does not render correctly, correct it before approving anything. If needed, send the error text or a screenshot and ask the assistant to fix the diagram. Then ask the assistant to implement it.

You see (after implementation is completed)

  • Build files exist and load as planned.
  • The project has wrapper scripts, Kotlin DSL build files, and the standard source layout.
  • Verification evidence includes the exact verification commands and their result.
  • The task file is in review.
  • The task file may include Implementation notes when relevant; if present, review them as part of the reviewer-facing task artifact.

After acceptance (move to done / commit)

  • After you accept this work item as done: tell the assistant to move the task to done and commit.

You learned (this step)

  • Initial build setup is still task-based work: it is planned first, then implemented after explicit approval.

Step 4: Wordle domain model and evaluation rules

You send

Let us work on the Wordle domain model and evaluation rules in this
repository.

The scope must include:
- domain objects for words and feedback that are not tied to the UI,
- deterministic duplicate-aware letter evaluation,
- immutable model boundaries suitable for later engine and interface
  work.

Break the work down into subtasks.
 
For the initial task creation, do not fully design every future
subtask. Create only:
- the overall task,
- subtasks containing Scope and Motivation each.

Your intent

  • Make the assistant decompose the core gameplay model into reviewable subtasks instead of over-designing everything at once.
  • Establish domain terms and boundaries that later engine and interface work will reuse.

You see (plan)

  • A task file is created automatically with a task header and an ordered subtask breakdown, and it is waiting for your review.
  • Task file:
    • has clear overall Scope, Motivation, and Scenario,
    • keeps future subtasks lightweight,
    • uses glossary.adoc terms consistently.

Subtask-by-subtask workflow

  • Review the task header and the task breakdown first.
  • If the breakdown needs adjustment, ask the assistant to revise it before any implementation starts.
  • If it looks good, ask the assistant to fully design only the first subtask.
  • Review that current subtask detail. If it looks good, ask the assistant to implement only that subtask.
  • After each implemented subtask reaches review, either ask for changes or accept it and ask the assistant to move that subtask to done.
  • Then ask it to create a separate commit, and only after that ask it to design the next subtask.

You see (current subtask design)

  • Only the current subtask is fully designed, and implementation still waits for explicit approval.
  • Task file:
    • the current subtask includes Research, Design, and Test specification,
    • future subtasks remain lightweight,
    • the current subtask uses glossary terms consistently.

You see (during subtask implementation)

  • Only the approved current subtask is implemented before the next review step.
  • The implemented current subtask moves to review when local verification is complete.
  • When the last remaining unfinished subtask reaches review and no more work remains, the overall task moves to review too.
  • Tests: separate verification evidence is provided per implemented subtask.
  • Git: there is a separate commit per accepted subtask.
  • glossary.adoc: expands to cover shared gameplay terms and links those terms to the implemented code.

After acceptance (move to done / commit)

  • After each accepted non-final subtask: ask the assistant to move that subtask to done, then commit.
  • After you accept the final subtask as done: ask the assistant to move that subtask to done; if no more work remains, also move the overall task to done, then commit.

You learned (this step)

  • Keep future subtasks lightweight until you reach them: review the current subtask in detail, implement it, verify it, commit it, then move on.
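
As an illustration of what "deterministic duplicate-aware letter evaluation" can mean in this step, here is a minimal Java sketch. It assumes the common two-pass Wordle convention (exact matches first, then misplaced letters from left to right while unmatched copies remain). The project brief does not prescribe this rule, and the names FeedbackSketch and evaluate are hypothetical, not part of the tutorial project.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; class and method names are hypothetical. */
final class FeedbackSketch {

    /** Returns one feedback symbol per guessed letter: '=', '~' or '.'. */
    static String evaluate(String solution, String guess) {
        if (solution.length() != guess.length()) {
            throw new IllegalArgumentException("solution and guess must have equal length");
        }
        int n = guess.length();
        char[] feedback = new char[n];
        Map<Character, Integer> unmatched = new HashMap<>();

        // Pass 1: mark exact matches and count the solution letters left over.
        for (int i = 0; i < n; i++) {
            char s = solution.charAt(i);
            if (guess.charAt(i) == s) {
                feedback[i] = '=';
            } else {
                feedback[i] = '.';
                unmatched.merge(s, 1, Integer::sum);
            }
        }

        // Pass 2: left to right, award '~' only while unmatched copies remain.
        for (int i = 0; i < n; i++) {
            if (feedback[i] == '=') {
                continue;
            }
            char g = guess.charAt(i);
            int left = unmatched.getOrDefault(g, 0);
            if (left > 0) {
                feedback[i] = '~';
                unmatched.put(g, left - 1);
            }
        }
        return new String(feedback);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("apple", "allee")); // prints =~..=
        System.out.println(evaluate("crane", "crane")); // prints =====
    }
}
```

In the first call the solution contains only one "l", so only the first guessed "l" gets "~" and the second gets ".". Whatever rule the assistant proposes, the task's Test specification should pin down cases like this one.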

Step 5: Word list loader and validation

You send

Let us work on the internal word list loader and validation.

The scope must include:
- a packaged `wordlist.txt` resource,
- a loader that reads the declared count header from the file,
- random selection of one candidate entry from the declared list,
- conversion of the selected value into the existing validated word
  type,
- no separate dictionary-membership checks beyond loading and existing
  validation.

Your intent

  • Treat word-list loading as real planned work, not a quick hidden utility.
  • Force explicit file-format research and automated tests before implementation.

You see (plan)

  • A task file is created automatically, and implementation still waits for explicit approval.
  • Task file:
    • documents the word-list file format in Research,
    • includes a loader-to-resource flow diagram (PlantUML by default; Mermaid only when explicitly preferred),
    • defines concrete automated tests for loader behavior.

Approve only after the task definition looks correct. Then ask the assistant to implement it.

You see (after implementation is completed)

  • src/main/resources/wordlist.txt exists.
  • Loader code exists and returns validated words from the packaged list.
  • Tests prove header parsing, normalization, and selection behavior.
  • The task file is in review.
  • If the loader work stabilizes a shared term such as Word List and the glossary was not updated, ask the assistant to add that missing glossary update before accepting the step.

After acceptance (move to done / commit)

  • After you accept this work item as done: tell the assistant to move the task to done and commit.

You learned (this step)

  • Infrastructure-facing work such as resource loading still benefits from explicit file-format research and testable design.
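
To make the file-format question concrete, the following Java sketch shows one possible loader. The format is an assumption for illustration only: the first line holds the declared count, followed by one word per line, and entries beyond the declared count are ignored. The real format is whatever the task file records in Research. WordListSketch, load, and pick are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Random;

/** Illustrative sketch only; the file format below is an assumption. */
final class WordListSketch {

    /** Parses "count header, then one word per line" from any Reader. */
    static List<String> load(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            String header = reader.readLine();
            if (header == null) {
                throw new IllegalArgumentException("missing count header");
            }
            int declared = Integer.parseInt(header.trim());
            List<String> words = new ArrayList<>();
            String line;
            while (words.size() < declared && (line = reader.readLine()) != null) {
                // Normalization: trim whitespace and lower-case the entry.
                String word = line.trim().toLowerCase(Locale.ROOT);
                if (word.isEmpty()) {
                    continue;
                }
                if (!word.matches("[a-z]{5}")) {
                    throw new IllegalArgumentException("not a five-letter word: " + line);
                }
                words.add(word);
            }
            if (words.size() != declared) {
                throw new IllegalArgumentException(
                        "header declares " + declared + " words, found " + words.size());
            }
            return List.copyOf(words);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Random selection; passing the Random in keeps tests repeatable. */
    static String pick(List<String> words, Random random) {
        return words.get(random.nextInt(words.size()));
    }

    public static void main(String[] args) {
        List<String> words = load(new StringReader("3\ncrane\n APPLE \nabbey\n"));
        System.out.println(words); // prints [crane, apple, abbey]
        System.out.println(pick(words, new Random()));
    }
}
```

Reading from a Reader rather than directly from the classpath keeps header parsing and normalization testable without a packaged resource. Production code would wrap the stream of src/main/resources/wordlist.txt in a reader.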

Step 6: Game engine

You send

Starting point: build on the relevant research already recorded in this
repository.

Let us work on the game engine in this repository.

The scope must include:
- immutable game state,
- explicit game status values,
- attempt limits,
- feedback history,
- game start logic,
- guess submission logic,
- win and lose termination behavior.

Break the work down into these subtasks:
1. define game state model
2. implement game engine logic

Your intent

  • Separate stable state structure from state-transition behavior.
  • Preserve ordered subtask review instead of merging the whole engine into one jump.

You see (plan)

  • A task file is created automatically with a task header and an ordered subtask breakdown, and it is waiting for your review.
  • Task file:
    • keeps future subtasks lightweight,
    • aligns with existing glossary terms,
    • clearly separates state modeling from engine behavior.

Subtask-by-subtask workflow

  • Review the task header and breakdown first.
  • If it looks good, ask the assistant to fully design only the first subtask.
  • Review that design and, if acceptable, ask it to implement only that subtask.
  • When the first subtask reaches review and you accept it, ask the assistant to move that subtask to done, then commit before asking for the next subtask design.

You see (during subtask implementation)

  • State-model work and engine-behavior work are implemented in separate reviewable increments.
  • Each implemented current subtask moves to review when local verification is complete.
  • When the last remaining unfinished subtask reaches review and no more work remains, the overall task moves to review too.
  • Tests prove start state, guess progression, attempt decrement, and win/lose transitions.
  • glossary.adoc stays aligned with Game, Game Engine, Game State, and Game Status terminology.

After acceptance (move to done / commit)

  • After you accept the first subtask as done: ask the assistant to move that subtask to done, then commit.
  • After you accept the second subtask as done: ask the assistant to move that subtask to done; if no more work remains, also move the overall task to done, then commit.

You learned (this step)

  • Separate the stable state shape from the state-transition behavior: it keeps the engine reviewable and the test coverage focused.
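
The split between state shape and state transitions might look like the following Java sketch. It relies on records, which the Java 21 toolchain from Step 3 supports. All names (GameStatusSketch, GameStateSketch, withGuess) are hypothetical, and feedback is passed in as a plain string so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

/** Explicit status values; names are illustrative. */
enum GameStatusSketch { IN_PROGRESS, WON, LOST }

/** Immutable snapshot: every accepted guess yields a new instance. */
record GameStateSketch(String solution, int maxAttempts,
                       List<String> feedbackHistory, GameStatusSketch status) {

    GameStateSketch {
        // Defensive, unmodifiable copy so callers cannot mutate the history.
        feedbackHistory = List.copyOf(feedbackHistory);
    }

    static GameStateSketch start(String solution, int maxAttempts) {
        return new GameStateSketch(solution, maxAttempts, List.of(),
                GameStatusSketch.IN_PROGRESS);
    }

    int attemptsLeft() {
        return maxAttempts - feedbackHistory.size();
    }

    /** Transition step; the feedback string is computed by the caller. */
    GameStateSketch withGuess(String guess, String feedback) {
        if (status != GameStatusSketch.IN_PROGRESS) {
            throw new IllegalStateException("game is already over");
        }
        List<String> history = new ArrayList<>(feedbackHistory);
        history.add(feedback);
        GameStatusSketch next;
        if (guess.equals(solution)) {
            next = GameStatusSketch.WON;
        } else if (history.size() >= maxAttempts) {
            next = GameStatusSketch.LOST;
        } else {
            next = GameStatusSketch.IN_PROGRESS;
        }
        return new GameStateSketch(solution, maxAttempts, history, next);
    }
}
```

A test for the first subtask can then assert on start state and attemptsLeft() alone, while tests for the second subtask cover the win and lose transitions.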

Step 7: AssertJ test migration

You send

Let us migrate the existing tests in this repository to AssertJ and add
the required dependency.

The scope must include:
- replacing JUnit assertion helpers with AssertJ,
- updating build configuration as needed,
- keeping existing production APIs unchanged,
- verifying that the full test suite still passes.

Your intent

  • Keep a testing-focused change narrow and reviewable.
  • Require proof that the full suite still passes after the assertion migration.

You see (plan)

  • A task file is created automatically, and implementation still waits for explicit approval.
  • Task file:
    • keeps scope limited to test sources and test dependency configuration,
    • includes concrete verification for the full test suite.

Approve only after the task definition looks correct. Then ask the assistant to implement it.

You see (after implementation is completed)

  • Test code uses AssertJ consistently.
  • Build configuration includes the AssertJ dependency.
  • Verification evidence includes the exact test command and its passing result.
  • The task file is in review.

After acceptance (move to done / commit)

  • After you accept this work item as done: tell the assistant to move the task to done and commit.

You learned (this step)

  • Technical cleanup that changes build configuration and tests is still implementation work and still needs a task, verification, and review.

Step 8: Architecture Decision Record (ADR) for CLI argument parsing

You send

Please create one ADR for CLI argument parsing in
`architecture-decisions/`.

First discuss the criteria with me.
The CLI must support:
- `--wordlist` for file path or URL input,
- `--attempts` with default value 6,
- `--cli` for explicit terminal mode,
- standard help output.

Then compare realistic options for argument parsing, including:
- manual parsing without a library,
- using a CLI parsing library.

Record one final choice with rationale.
The ADR should explain why the chosen approach is a good fit for a
small project now and for modest CLI growth later.
Also record the practical verification command for checking the CLI help
or basic option parsing path.

Your intent

  • Ask for the criteria discussion in a way that should make the assistant use the normal spec-loop-clarify-task flow instead of free-form brainstorming.
  • Record the parsing decision as a durable ADR with a real verification command.

You see

  • The final ADR is preceded by a criteria discussion in the normal spec-loop-clarify-task format.
  • If the assistant starts an unstructured discussion instead, stop it and say: Use the spec-loop-clarify-task skill for the criteria discussion before writing the ADR.
  • ADR:
    • compares realistic options,
    • records the chosen parsing approach with rationale,
    • explains the tradeoff between small-project simplicity and future CLI growth,
    • records a practical verification command for the parsing path.

After completion (commit)

  • After you accept the ADR as done: ask the assistant to commit the ADR change.

You learned (this step)

  • ADRs are useful for long-lived tooling or design choices that should not be rediscovered inside a later implementation task.
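
For a feel of what the "manual parsing without a library" option involves, here is a rough Java sketch covering the options named in the prompt. It is not the ADR's outcome: the ADR may well choose a library instead. CliOptionsSketch is a hypothetical name, and the help flags --help and -h are assumptions, since the prompt only asks for "standard help output".

```java
/** Hypothetical options holder for the manual-parsing option. */
record CliOptionsSketch(String wordlist, int attempts, boolean cli, boolean help) {

    static CliOptionsSketch parse(String... args) {
        String wordlist = null; // null means: use the packaged word list
        int attempts = 6;       // default from the project brief
        boolean cli = false;
        boolean help = false;
        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                // i++ skips the consumed value on the next loop iteration.
                case "--wordlist" -> wordlist = valueAfter(args, i++);
                case "--attempts" -> {
                    attempts = Integer.parseInt(valueAfter(args, i++));
                    if (attempts < 1) {
                        throw new IllegalArgumentException("--attempts must be at least 1");
                    }
                }
                case "--cli" -> cli = true;
                case "--help", "-h" -> help = true; // assumed flag names
                default -> throw new IllegalArgumentException("unknown option: " + args[i]);
            }
        }
        return new CliOptionsSketch(wordlist, attempts, cli, help);
    }

    private static String valueAfter(String[] args, int index) {
        if (index + 1 >= args.length) {
            throw new IllegalArgumentException(args[index] + " requires a value");
        }
        return args[index + 1];
    }
}
```

Even at this size, the hand-written version already has to deal with missing values, number validation, and unknown options, and it still produces no help text. That is the kind of tradeoff the criteria discussion should weigh against adding a dependency.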

Step 9: CLI game interface

You send

Starting point: build on the existing gameplay logic in this repository
and follow the approved CLI argument parsing ADR.

Let us work on the CLI game interface in this repository.

The CLI requirements are:
- interactive terminal play,
- `--wordlist` to accept a file path or URL,
- `--attempts` with default value 6,
- `--cli` to force terminal mode later when a UI also exists,
- deterministic textual feedback rendering.

Break the implementation work down in this order:
1. implement CLI parsing and game loop
2. implement feedback rendering
3. document CLI build and usage
4. document application distribution packaging

Your intent

  • Make the CLI feature follow the approved ADR instead of rediscovering parsing choices inside the task.
  • Keep runtime behavior, rendering, and docs in ordered increments.

You see (plan)

  • A task file is created automatically with a task header and an ordered subtask breakdown, and it is waiting for your review.
  • Task file:
    • uses an ordered subtask flow,
    • keeps future subtasks lightweight,
    • treats the documentation subtasks as part of the same accepted delivery path.

Subtask-by-subtask workflow

  • Review the overall task and ordered subtasks first.
  • Ask the assistant to fully design only the first subtask.
  • Review that current subtask design. If it looks correct, ask the assistant to implement only that subtask.
  • After each implemented subtask reaches review, either ask for changes or accept it and ask the assistant to move that subtask to done.
  • Create a separate commit before moving to the next subtask.

You see (during subtask implementation)

  • Each implemented current subtask moves to review when local verification is complete.
  • When the final unfinished subtask reaches review and no more work remains, the overall task moves to review too.
  • CLI parsing and the interactive loop are delivered first.
  • Feedback rendering is delivered as a separate increment with exact output tests.
  • README usage and distribution packaging docs are delivered as later accepted subtasks.
  • Verification evidence includes exact manual and automated verification commands for the CLI path.

After acceptance (move to done / commit)

  • After each accepted non-final subtask: ask the assistant to move that subtask to done, then commit.
  • After you accept the final subtask as done: ask the assistant to move that subtask to done; if no more work remains, also move the overall task to done, then commit.

You learned (this step)

  • Even when one feature spans runtime behavior and documentation, keeping the increments ordered and separately accepted preserves reviewability.

Step 10: UI Clarification and Minimal Swing UI

You send

Starting point: build on the existing gameplay logic in this
repository.

Let us work on a UI in this repository.

I want you to fully design the new UI task in the backlog.

Your intent

  • Leave the UI approach open so the assistant has to surface the missing framework decision.
  • After that, steer it to Swing while keeping CLI fallback and launch-policy constraints explicit.

You see (clarification)

  • The UI approach is intentionally left open here.
  • If the assistant asks what UI approach or framework this task should assume, choose Swing even if Swing is not the recommendation and is not listed in its options.
  • If the assistant starts fully designing the task without first asking what UI approach/framework it should assume, stop it and say: Use the spec-loop-clarify-task skill before designing this task.
  • If it still skips that question, say: Before designing this task, ask which UI approach/framework this task should assume.

If the assistant asks what UI approach/framework this task should assume, reply exactly with:

Use Swing.

Keep CLI availability.
When a display is available and `--cli` is not set, the application
should start the UI.
In headless mode or when `--cli` is set, the application should use
the CLI path.

Break the implementation work down in this order:
1. prepare shared input validation for CLI and UI
2. implement the minimal Swing UI
3. document UI build and usage

If any other unresolved decisions remain, please prefer decision
batches over separate questions for the rest of this clarification
round.

If the assistant asks any other clarification question, or presents a decision batch, follow the recommended options unless you intentionally want a different path. If it includes the UI approach/framework question again and recommends something else, correct that answer to Swing.

You see (plan after clarification)

  • A task file is created automatically with a task header and an ordered subtask breakdown, and it is waiting for your review.
  • Task file:
    • keeps future subtasks lightweight,
    • makes the CLI/UI boundary explicit,
    • uses glossary terms consistently.

Subtask-by-subtask workflow

  • Review the task header and the breakdown first.
  • Ask the assistant to fully design only the first subtask.
  • Review that design and, if acceptable, ask it to implement only that subtask.
  • After each implemented subtask reaches review, either ask for changes or accept it and ask the assistant to move that subtask to done.
  • Create a separate commit before moving on.

You see (during subtask implementation)

  • Each implemented current subtask moves to review when local verification is complete.
  • When the final unfinished subtask reaches review and no more work remains, the overall task moves to review too.
  • Shared input validation lands before the UI itself.
  • Swing UI behavior is delivered as a separate accepted increment.
  • README UI usage updates land as the final subtask.
  • Verification evidence includes exact commands for UI launch, CLI override, and headless fallback behavior.

After acceptance (move to done / commit)

  • After each accepted non-final subtask: ask the assistant to move that subtask to done, then commit.
  • After you accept the final subtask as done: ask the assistant to move that subtask to done; if no more work remains, also move the overall task to done, then commit.

You learned (this step)

  • Leaving the UI approach open can force the missing framework decision into a clarification round before task design.
  • Once the UI direction is chosen, the interface layer can stay small and reviewable when shared validation and engine behavior are separated first.
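
The launch policy from the "Use Swing." reply can be reduced to a small, testable decision. The Java sketch below is one possible shape, assuming the JDK's GraphicsEnvironment.isHeadless() is an acceptable way to detect a missing display. LaunchModeSketch and its methods are hypothetical names.

```java
import java.awt.GraphicsEnvironment;

/** Illustrative sketch of the CLI/UI launch decision. */
enum LaunchModeSketch {
    CLI, UI;

    /** Pure decision, testable without a real display. */
    static LaunchModeSketch select(boolean cliFlag, boolean headless) {
        // --cli or a headless environment forces the CLI path; otherwise start the UI.
        return (cliFlag || headless) ? CLI : UI;
    }

    /** Applies the policy using the JDK's own headless check. */
    static LaunchModeSketch detect(boolean cliFlag) {
        return select(cliFlag, GraphicsEnvironment.isHeadless());
    }
}
```

Keeping select free of environment lookups lets automated tests cover UI launch, CLI override, and headless fallback, while manual verification commands can exercise detect on a real machine.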

You learned

Each step follows the Spec Loop workflow model:

  • In chat, you ask the assistant to work on a feature, approved documentation change, or long-lived design decision.
  • For implementation work, the assistant should create the needed task automatically before making executable changes.
  • For larger tasks, the first planning pass may stop at the task header and an ordered subtask breakdown; only the current subtask is designed in detail before implementation.
  • You approve or reject implementation explicitly.
  • Only after explicit approval should the assistant make executable changes.
  • Tasks should include automated tests for their deliverables.
  • Every implementation subtask includes both implementation and testing.
  • When subtasks exist, require separate status updates per subtask.
  • Review-ready implementation moves the current task or subtask to review; after you accept it, you may ask the assistant to move it to done.
  • If glossary.adoc exists, later planning and implementation must keep it aligned with the approved shared terms.
  • Use ADRs for long-lived decisions such as the CLI parsing approach, then make later tasks follow that decision.
  • If the assistant plans too much, skips needed file updates, or starts implementation too early, correct it and ask it to return to the expected workflow.
  • After you explicitly accept a work item as done, ask the assistant to commit before moving on.
  • When a step is implemented via subtasks: move the overall task to done only after the last subtask is done.

Learning outcomes:

  • Keep task and subtask scopes small and reviewable.
  • Use ADRs for long-lived decisions and tasks for incremental delivery.
  • Use the glossary as the stable shared language across the project.
  • Verify behavior using concrete evidence, not assumptions.

How to think while running this tutorial:

  • Keep the process meaningful, not bureaucratic.
  • Chat is for coordination and approvals; task files and the glossary are the long-lived specification files.
  • Trust the installed skills to choose the workflow, and correct the assistant explicitly if it skips planning, over-designs future work, or misses a required file update.
  • Only the user may relax or override these workflow rules.

© Dimitry Polivaev, 2026

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.