# LLM Prepare Command Instructions

## Goal
Register sites into REGISTRY and create batches (with locking) before design/implement. The prepare phase is independent and can run concurrently with design/implement for other batches.

## Inputs
- `sites.csv` at repo root (domain, theme columns)
- `.wdmaker/config.toml` for temp_dir configuration

---

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ CRITICAL: READ ALL RULES BEFORE ANY ACTION ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

## General Rules (Main Agent AND Subagents)

### G1. Script Execution
**Run shell scripts directly; run Python scripts with `uv run python3.12`:**

```bash
# CORRECT — Shell scripts (run directly)
tools/prepare/info.sh --root .
tools/prepare/register.sh --input-file sites.csv
tools/prepare/batch.sh --version v1 --batch-size 20

# CORRECT — Python scripts (use uv run)
uv run python3.12 tools/design/check-dirs.py --batch-id 001 --root .

# WRONG — DO NOT USE THESE FORMS FOR PYTHON
python3 tools/some_script.py ...      # WRONG
python3.12 tools/some_script.py ...   # WRONG
```

**AVOID — Absolute paths add unnecessary directory prefix:**

```bash
# WRONG — Uses absolute path to script (symlink source)
# It also runs a shell script through `uv run python3.12`; shell scripts are run directly.
uv run python3.12 /Volumes/Common/QJoon/llm/wdmaker/tools/prepare/info.sh --root /Volumes/Ephemeral/CMassM1

# WRONG — Uses absolute path to script AND arguments (repo path)
/Volumes/Ephemeral/CMassK1/tools/prepare/info.sh --root /Volumes/Ephemeral/CMassK1

# CORRECT — Uses relative paths for both script and arguments
tools/prepare/info.sh --root .
tools/prepare/register.sh --input-file sites.csv
```

Scripts are symlinked into the repo; always use relative `tools/...` and `sites/...` paths.

### G2. No Invented Commands
- **DO NOT create new bash commands or scripts**
- **DO NOT use ad-hoc shell commands** (cat/ls/find/grep/wc/for-loops)
- Use ONLY the provided assist scripts listed below

**AVOID — Ad-hoc inline Python:**

```bash
# WRONG — Inline Python to parse CSV
uv run python3.12 -c "
import csv
for row in csv.reader(open('sites.csv')):
    print(row[0])
"

# CORRECT — Use provided scripts
tools/check/site-list.sh sites.csv
tools/prepare/info.sh --root .
```

**AVOID — Creating ad-hoc debug scripts in TMP:**

```bash
# WRONG — Creating and running debug scripts
uv run python3.12 /Volumes/Temp/WDMaker/CMassK1/parse_csv.py
uv run python3.12 /Volumes/Temp/WDMaker/CMassK1/check_registry.py

# CORRECT — Use provided scripts for inspection
tools/prepare/info.sh --root .
tools/check/status-report.sh
```

**AVOID — Using non-existent or invented scripts:**

```bash
# WRONG — tools/status.py does not exist
uv run python3.12 tools/status.py --root . | grep example.com

# CORRECT — Use provided status scripts
tools/check/status-report.sh
tools/check/batch-manage.sh list
```

**AVOID — Piping to grep:**

```bash
# WRONG — Piping output to grep
tools/check/batch-manage.sh list | grep example.com

# CORRECT — Read full output from provided scripts
tools/check/status-report.sh
tools/check/batch-manage.sh list
```

### G3. No HEREDOC Pattern
- **NEVER use HEREDOC** (`<<EOF`, `<<'EOF'`, etc.) for ad-hoc file writing
- Use **Write tool** or **Filesystem MCP server** when writing new files
- **Note:** HEREDOC in provided scripts (e.g., `tools/*.sh`) is fine — this rule prohibits ad-hoc LLM usage only

```bash
# WRONG — DO NOT USE HEREDOC for ad-hoc file creation
cat <<EOF > sites.csv
domain,theme
example.com,tech
EOF

# CORRECT — Use Write tool or MCP
# (Use the Write tool in Claude Code to create files)
```

### G4. Temp Directory
- Use `${TMP}` from `.wdmaker/config.toml` for temporary files
- **NEVER use `/tmp`** — always use the configured temp directory

### G5. Directory Creation
- **Use provided scripts** for creating/preparing directories (e.g., `tools/prepare/batch.sh`)
- **NEVER use `mkdir`** directly — directories are scaffolded by workflow scripts
- **DO NOT create `.gitkeep`**, `.keep`, or similar placeholder files

### G6. Preserve Reports
- **DO NOT delete** generated reports (batch reports, registry backups, logs)
- Reports in `.smbatcher/batches/`, `.smbatcher/runs/`, `${TMP}/` are historical records

---

## Assist Scripts Reference

### `tools/prepare/info.sh`
**Purpose:** Show config/temp_dir, preview sites.csv, and verify directory structure.

```bash
tools/prepare/info.sh [--root PATH] [--sites PATH] [--config PATH]
```

**Actions:**
- Displays repo root path
- Shows config.toml status and temp_dir value
- Previews first 5 rows of sites.csv (safe summary, not raw cat)
- Verifies presence of key directories (tools/, .smbatcher/, tools/check/, tools/prepare/)

**Options:**
- `--root <PATH>` — Repository root (default: current directory)
- `--sites <PATH>` — Path to sites CSV (default: sites.csv)
- `--config <PATH>` — Path to config.toml (default: .wdmaker/config.toml)

**Output Example:**
```
Repo root: /path/to/repo
config.toml: present at /path/to/repo/.wdmaker/config.toml
temp_dir: /Volumes/Temp/WDMaker/repo/
sites.csv: present at /path/to/repo/sites.csv (rows: 50)
sample rows:
  - domain='example.com' theme='tech'
  - domain='test.org' theme='minimal'
tools: present at /path/to/repo/tools (entries: 15)
```

---

### `tools/prepare/register.sh`
**Purpose:** Lock + register/update sites in REGISTRY from CSV.

```bash
tools/prepare/register.sh --input-file <csv>
tools/prepare/register.sh --sites "domain,title,desc;domain2,title2,desc2"
```

**Actions:**
- Creates `.wdmaker/config.toml` if missing (with default temp_dir)
- Creates temp directory if needed
- Acquires exclusive lock on REGISTRY.lock
- Parses input CSV or inline sites string
- Registers new sites with status "-" (unassigned)
- Preserves existing status/batch for already-registered sites
- Updates REGISTRY.md with timestamp
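
**Illustration only (not a command to run):** the source does not say how the lock on REGISTRY.lock is implemented. One common approach on Unix is an advisory `flock`, sketched below under that assumption. The helper name `exclusive_lock` is hypothetical and is not part of the provided scripts; the agent still uses only `tools/prepare/register.sh`.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block.

    Assumption: the real scripts may use a different locking mechanism.
    """
    with open(lock_path, "w") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until no other holder remains
        try:
            yield handle
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)  # release before the file is closed
```

Under this pattern, a second process that tries to take the same lock waits until the first one leaves the `with` block, which is what lets prepare run alongside design/implement for other batches.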

**Options:**
- `--input-file <PATH>` — Path to CSV file (default: sites.csv)
- `--sites <STRING>` — Inline sites: "domain,title,desc;domain2,title2,desc2"

**CSV Format:**
```csv
domain,theme
example.com,tech
test.org,minimal
```

**Exit codes:**
- 0 = Success
- 1 = No sites provided or parse error
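
**Illustration only (not a command to run):** the registration rules above (new sites get status "-", existing sites keep their status/batch) can be pictured as a simple merge. This is a hypothetical sketch; the function name `merge_registry` and the in-memory dict layout are assumptions, not the script's actual data model.

```python
def merge_registry(existing, incoming):
    """Merge incoming domains into a registry mapping.

    existing maps domain -> {"status": ..., "batch": ...} (assumed layout).
    New domains are added with status "-" and no batch; domains that are
    already registered keep their current status and batch unchanged.
    """
    merged = {domain: dict(entry) for domain, entry in existing.items()}
    for domain in incoming:
        if domain not in merged:
            merged[domain] = {"status": "-", "batch": None}
    return merged


registry = {"example.com": {"status": "B", "batch": "001"}}
result = merge_registry(registry, ["example.com", "test.org"])
print(result["example.com"])  # existing entry, left as it was
print(result["test.org"])     # newly registered, unassigned
```

Re-running registration with the same CSV is therefore harmless under this model: already-registered sites are left as they are.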

---

### `tools/prepare/batch.sh`
**Purpose:** Lock + select unimplemented sites, create batch, scaffold site directories.

```bash
tools/prepare/batch.sh [--version <vX>] [--batch-size <N>] [--input-file <csv>]
```

**Actions:**
- Acquires exclusive lock on REGISTRY.lock
- Reads registered sites from REGISTRY.md
- Filters sites not yet at target version
- Selects up to batch_size sites
- Assigns unique batch ID (e.g., 001, 002, ...)
- Updates REGISTRY.md with batch assignment and status "B"
- Creates `Batch_<ID>.md` in `.smbatcher/batches/`
- Scaffolds site directories: `sites/<domain>-<version>/`

**Options:**
- `--version <vX>` — Target version (default: v1)
- `--batch-size <N>` — Maximum sites per batch (default: 20)
- `--input-file <PATH>` — CSV for theme lookup (default: sites.csv)

**Output:**
```
SKIP: example.com: already at v1 (>= target v1)
BATCH_ID=001 (selected 20; skipped 5)
```

**Exit codes:**
- 0 = Success (even if no eligible sites)
- 1 = Error
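
**Illustration only (not a command to run):** the selection described above (skip sites already at the target version, take up to `batch_size` of the rest, assign the next zero-padded batch ID) might look roughly like this. The names `version_number` and `select_batch`, the tuple layout, and the "last ID + 1" numbering are assumptions made for the sketch; how the script behaves when no site is eligible is not modeled here.

```python
def version_number(version):
    """Parse 'v3' -> 3. Assumes versions are 'v' followed by an integer."""
    return int(version.lstrip("v"))


def select_batch(sites, target, batch_size, last_batch_id):
    """Pick up to batch_size sites whose current version is below target.

    sites is a list of (domain, current_version_or_None) tuples.
    Returns (batch_id, selected_domains, skipped_domains).
    Eligible sites beyond batch_size are left for a later batch.
    """
    selected, skipped = [], []
    for domain, current in sites:
        if current is not None and version_number(current) >= version_number(target):
            skipped.append(domain)  # corresponds to a SKIP line in the output
        elif len(selected) < batch_size:
            selected.append(domain)
    batch_id = f"{last_batch_id + 1:03d}"  # e.g. 001, 002, ...
    return batch_id, selected, skipped


# Sample data; "demo.net" is a made-up domain for the example.
sites = [("example.com", "v1"), ("test.org", None), ("demo.net", None)]
print(select_batch(sites, "v1", 1, 0))
```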

---

### `tools/check/site-list.sh`
**Purpose:** Lenient parse/validation of site CSV (allows blanks).

```bash
tools/check/site-list.sh <csv>
```

**Actions:**
- Parses CSV file
- Validates format
- Warns on empty rows (doesn't fail)
- Reports row count and any issues
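
**Illustration only (not a command to run):** "lenient" validation here means empty rows produce warnings rather than failures. A hypothetical sketch of that behavior, assuming a header row and the domain in the first column (the function name `check_site_list` and the warning wording are invented for the example):

```python
import csv
import io


def check_site_list(csv_text):
    """Return (data_row_count, warnings) for CSV text with a header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return 0, ["file is empty"]
    warnings = []
    count = 0
    # Line numbers assume one CSV record per physical line (no multi-line fields).
    for line_no, row in enumerate(reader, start=2):
        if not any(cell.strip() for cell in row):
            warnings.append(f"line {line_no}: empty row")  # warn, do not fail
            continue
        if not row[0].strip():
            warnings.append(f"line {line_no}: missing domain")
        count += 1
    return count, warnings


sample = "domain,theme\nexample.com,tech\n\ntest.org,minimal\n"
print(check_site_list(sample))
```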

---

### `tools/check/status-report.sh`
**Purpose:** Summarize REGISTRY/batches/status.

```bash
tools/check/status-report.sh
```

**Actions:**
- Shows overall site count
- Lists batches with their status
- Summarizes sites by status code

---

### `tools/check/batch-manage.sh`
**Purpose:** Simple batch/registry management commands.

```bash
# List all registry entries
tools/check/batch-manage.sh list

# Register a single site
tools/check/batch-manage.sh register <domain> <title> [description]
```

**Commands:**
- `list` — Show all registry entries (pipe-delimited)
- `register` — Add a single site to registry with status "-"

**Note:** For bulk registration, prefer `tools/prepare/register.sh --input-file`.

---

## Prohibited vs Alternatives

| ❌ Prohibited | ✅ Use Instead |
|--------------|----------------|
| `cat/less/head/tail` for file inspection | `tools/prepare/info.sh --root .` |
| `ls/find/wc/globs` for directory listing | `tools/prepare/info.sh --root .` |
| `grep "pattern"` for searching | `tools/check/status-report.sh` |
| `for ...; do` for loops | `tools/prepare/register.sh --input-file sites.csv` |
| `mkdir -p` for directory creation | `tools/prepare/batch.sh --version v1 --batch-size 20` |
| `python3 script.py` | `uv run python3.12 tools/.../<script>.py` (or shell scripts that handle Python internally) |
| `cat <<EOF` HEREDOC | Write tool or Filesystem MCP |
| Creating new scripts | Use existing assist scripts only |

---

## Steps for LLM (non-interactive)

> **Note:** General Rules (G1-G6) apply to ALL steps below.

### Step 1: Check Environment
First, execute:
Tool: Bash
Command: `tools/prepare/info.sh --root .`

Use this instead of `ls`/`cat` to preview config and sites.csv.
Do not use any other command to check the environment.

If `sites.csv` does not exist, report its absence to the user and stop here.

### Step 2: Validate and Register
Execute #1:
Motivation: Check existence of `.smbatcher/REGISTRY.md`
Tool: Read
Path: `.smbatcher/REGISTRY.md`

If the file does not exist (Read returns an error), execute #2 and #3.
If the file exists, proceed to Step 3.

Execute #2:
Motivation: Validate sites.csv format before registration
Tool: Bash
Command: `tools/check/site-list.sh sites.csv`

Execute #3:
Motivation: Register sites in REGISTRY.md from `sites.csv`
Tool: Bash
Command: `tools/prepare/register.sh --input-file sites.csv`

### Step 3: Create Batch

Execute:
Tool: Bash
Command: `tools/prepare/batch.sh --version v1 --batch-size 20 --input-file sites.csv`
Motivation:
- Script selects eligible sites
- Marks batch in REGISTRY
- Creates: `Batch_<ID>.md` under `.smbatcher/batches`
- Creates directories: `sites/<domain>-vX/`
Modifiable: adjust the version (v1) and batch size (20) to match the user's request

### Step 4: Verify Registration
Execute:
Tool: Bash
Command: `tools/check/status-report.sh`

---

## Outputs
- New `Batch_<ID>.md` in `.smbatcher/batches/` listing domains/themes/status.
- Sites selected for the new batch marked with status "B" in `.smbatcher/REGISTRY.md`.
- Site directories under `sites/<domain>-<version>/`.

## Status Code Flow
```
- (unassigned) → B (batched) → d (designing) → D (designed) → O (open) → i (implementing) → I (implemented) → Q (queued/complete)
```
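
**Illustration only (not a command to run):** the flow above is a strictly linear sequence of status codes, so the "next status" can be modeled as a lookup in an ordered list. The names `STATUS_FLOW` and `next_status` are hypothetical. The prepare phase itself only moves sites from "-" to "B".

```python
# Status codes in the order given by the flow above.
STATUS_FLOW = ["-", "B", "d", "D", "O", "i", "I", "Q"]


def next_status(current):
    """Return the status that follows current, or None at the end of the flow.

    Raises ValueError if current is not a known status code.
    """
    index = STATUS_FLOW.index(current)
    if index + 1 < len(STATUS_FLOW):
        return STATUS_FLOW[index + 1]
    return None


print(next_status("-"))  # the transition performed by the prepare phase
```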
