Design: Unified Pipeline Architecture

Section 2 of 4 — Watcher Redesign (Per-Repo Queues + LLM Pools)

repos.yaml — new fields
repos:
  - repo: owner/repo-A
    tracker_repo: owner/tracker-A
    parallel_issues: 2       # NEW: run 2 issues at once (default: 1)
    labels:
      ai-feature:
        pipeline: ai-feature  # optional override; default = label name
      ai-fix: {}

  - repo: owner/repo-B
    tracker_repo: owner/tracker-B
    parallel_issues: 1        # sequential
    labels:
      ai-feature: {}
      ai-custom:
        pipeline: my-custom-pipeline   # maps to pipelines/my-custom-pipeline.yaml
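
The defaults noted in the comments above (parallel_issues falls back to 1, and a label with no pipeline override uses its own name) could be applied when the file is loaded. The following is a minimal sketch, assuming repos.yaml has already been parsed into plain dicts; RepoConfig, normalize_repo, and pipeline_path are hypothetical names, not part of the existing code.

```python
from dataclasses import dataclass, field


@dataclass
class RepoConfig:
    """Hypothetical normalized view of one entry under `repos:`."""
    repo: str
    tracker_repo: str
    parallel_issues: int = 1
    pipelines: dict = field(default_factory=dict)  # label name -> pipeline name


def normalize_repo(entry: dict) -> RepoConfig:
    """Apply the documented defaults to one already-parsed repos.yaml entry."""
    parallel = int(entry.get("parallel_issues", 1))  # default: 1 (sequential)
    if parallel < 1:
        raise ValueError(f"parallel_issues must be >= 1, got {parallel}")

    pipelines = {}
    for label, label_cfg in (entry.get("labels") or {}).items():
        # `ai-fix: {}` carries no override, so the pipeline is the label name.
        pipelines[label] = (label_cfg or {}).get("pipeline", label)

    return RepoConfig(
        repo=entry["repo"],
        tracker_repo=entry["tracker_repo"],
        parallel_issues=parallel,
        pipelines=pipelines,
    )


def pipeline_path(pipeline_name: str) -> str:
    """Map a pipeline name to its file, following the pipelines/<name>.yaml comment."""
    return f"pipelines/{pipeline_name}.yaml"


# Usage: the owner/repo-B entry from the example above, as a parsed dict.
repo_b = normalize_repo({
    "repo": "owner/repo-B",
    "tracker_repo": "owner/tracker-B",
    "parallel_issues": 1,
    "labels": {"ai-feature": {}, "ai-custom": {"pipeline": "my-custom-pipeline"}},
})
print(repo_b.pipelines)
# {'ai-feature': 'ai-feature', 'ai-custom': 'my-custom-pipeline'}
```

Resolving defaults once at load time keeps the worker threads free of fallback logic; whether the real loader does this is an open design choice.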

config.yaml — new llm.pools section
llm:
  # ... existing settings unchanged ...
  pools:                        # NEW: per-backend concurrency limits
    ollama: 1                   # only 1 simultaneous Ollama call
    openai: 10                  # 10 concurrent OpenAI calls
    opencode-zen: 5
    opencode-go: 5
    anthropic: 5
    # default for unlisted backends: 5
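
One way the llm.pools mapping could become runtime limits is a manager that holds one semaphore per backend and creates a default-sized one for any backend not listed. This is a sketch under assumptions: the design only names LLMPoolManager and an acquire(backend_name) context manager, so the constructor signature and the private helper here are illustrative.

```python
import threading
from contextlib import contextmanager

DEFAULT_POOL_SIZE = 5  # mirrors the "default for unlisted backends: 5" comment


class LLMPoolManager:
    """Sketch: one bounded semaphore per LLM backend, sized from llm.pools."""

    def __init__(self, pools=None, default=DEFAULT_POOL_SIZE):
        self._default = default
        self._lock = threading.Lock()  # guards lazy creation of semaphores
        self._semaphores = {
            name: threading.BoundedSemaphore(limit)
            for name, limit in (pools or {}).items()
        }

    def _semaphore_for(self, backend):
        # Backends missing from the config get a semaphore with the default size.
        with self._lock:
            if backend not in self._semaphores:
                self._semaphores[backend] = threading.BoundedSemaphore(self._default)
            return self._semaphores[backend]

    @contextmanager
    def acquire(self, backend):
        """Block until a slot for `backend` is free; release it on exit."""
        sem = self._semaphore_for(backend)
        sem.acquire()
        try:
            yield
        finally:
            sem.release()


# Usage: limits taken from the llm.pools example above.
llm_pool = LLMPoolManager({"ollama": 1, "openai": 10})
with llm_pool.acquire("ollama"):
    pass  # the backend call would go here
```

BoundedSemaphore is used instead of Semaphore so that an accidental extra release raises an error rather than silently raising the limit.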

watcher.py internals
Global: LLMPoolManager (one Semaphore per backend, from config)
           │
           ├── Repo A: ThreadPoolExecutor(max_workers=parallel_issues)
           │     ├── Worker thread: issue #3 → load pipelines/ai-feature.yaml → run
           │     └── Worker thread: issue #5 → load pipelines/ai-fix.yaml → run
           │
           └── Repo B: ThreadPoolExecutor(max_workers=1)
                 └── Worker thread: issue #12 → load pipelines/ai-feature.yaml → run

Each LLM call in base_agent.py first acquires a slot from its backend's pool:
  with llm_pool.acquire(backend_name):  # blocks until a slot is free
      response = self._call_llm(...)    # at most N concurrent calls per backend (N from llm.pools)
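
The per-repo executors in the diagram above can be sketched as follows. This is an illustration only: run_repo_queues, the handle_issue callback, and the "issues" key are hypothetical (the real watcher would presumably discover issues from the tracker rather than read them from config), and the LLM pool is left out to keep the focus on per-repo concurrency.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_repo_queues(repos, handle_issue):
    """Run each repo's issues on its own executor, sized by parallel_issues.

    `repos` is a list of dicts with "repo", optional "parallel_issues", and a
    hypothetical "issues" list. Returns {repo: [results in submission order]}.
    """
    executors = {}
    futures = {}
    try:
        for cfg in repos:
            name = cfg["repo"]
            executor = ThreadPoolExecutor(
                max_workers=cfg.get("parallel_issues", 1),  # default: sequential
                thread_name_prefix=name.replace("/", "-"),
            )
            executors[name] = executor
            # Submitting never blocks; extra issues wait in the executor's queue.
            futures[name] = [
                executor.submit(handle_issue, name, issue) for issue in cfg["issues"]
            ]
        # result() re-raises any exception from the worker thread.
        return {name: [f.result() for f in futs] for name, futs in futures.items()}
    finally:
        for executor in executors.values():
            executor.shutdown(wait=True)


if __name__ == "__main__":
    def demo_handler(repo, issue):
        time.sleep(0.01)  # stands in for loading and running a pipeline
        return f"{repo}#{issue} on {threading.current_thread().name}"

    results = run_repo_queues(
        [
            {"repo": "owner/repo-A", "parallel_issues": 2, "issues": [3, 5]},
            {"repo": "owner/repo-B", "parallel_issues": 1, "issues": [12]},
        ],
        demo_handler,
    )
    for repo, lines in results.items():
        print(repo, lines)
```

Because each repo owns its executor, a slow issue in one repo cannot occupy a worker slot that belongs to another; only the shared LLM pools couple the repos together.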

What stays the same