actionlint enforcement — local + CI

Stop the next M130 saga before it starts. Catches GHA-native bugs that regular YAML lint and per-workflow CI runs miss.

Why this saves us

Recent example: M130 cron pipeline (PRs #1591..#1599) — 9 PRs over 3 hours.

Two real bugs caused most of it:

Both would have been caught in 30 seconds before the first push.

Two layers, two speeds

| Layer | When it runs | Speed | Coverage |
|---|---|---|---|
| Pre-commit hook | Before commit lands locally | ~50ms | Per-developer (only those with actionlint installed) |
| CI workflow | On every PR + push to main that touches .github/workflows/** | ~30s | Everyone (required check) |

Layer 1 — pre-commit hook

Extends bin/git-hooks/pre-commit (the repo's existing hook, runs via core.hooksPath = bin/git-hooks) with Section 8:

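# Section 8: lint staged GitHub Actions workflows.
# STAGED_FILES and ERRORS are expected to be set earlier in the hook.
WORKFLOW_FILES=$(echo "$STAGED_FILES" | grep -E '^\.github/workflows/.*\.ya?ml$' || true)
if [[ -n "$WORKFLOW_FILES" ]]; then
  echo -n "  Linting GitHub Actions workflows... "
  if command -v actionlint >/dev/null 2>&1; then
    # -shellcheck= (empty path) turns off the shellcheck integration.
    # $WORKFLOW_FILES stays unquoted on purpose so each path becomes its own argument.
    if ! actionlint -shellcheck= $WORKFLOW_FILES >/tmp/actionlint-precommit.log 2>&1; then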
      echo "FAILED"
      head -20 /tmp/actionlint-precommit.log
      ERRORS=$((ERRORS + 1))
    else
      echo "OK"
    fi
  else
    echo "SKIP (actionlint not installed — install: brew install actionlint)"
  fi
fi

Graceful fallback: missing actionlint → warn + skip, not block.
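
To see which staged paths the hook will actually lint, the filter from the first line of the snippet can be tried in isolation. This is a minimal sketch: `filter_workflow_files` is a hypothetical helper name, not something in the repo's hook, and the sample paths are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads paths on stdin, prints only workflow files.
# Uses the same regex as the hook; "|| true" keeps a no-match from failing the caller.
filter_workflow_files() {
  grep -E '^\.github/workflows/.*\.ya?ml$' || true
}

# Illustrative paths only. Both .yml and .yaml match; files outside
# .github/workflows/ and non-YAML files are ignored.
printf '%s\n' \
  '.github/workflows/ci.yml' \
  '.github/workflows/release.yaml' \
  '.github/dependabot.yml' \
  '.github/workflows/README.md' \
  | filter_workflow_files
```

Because the pattern uses `.*`, a path in a subdirectory of .github/workflows/ would also match.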

Layer 2 — CI workflow validate-workflows.yml

name: Validate Workflows
on:
  pull_request:
    paths: ['.github/workflows/**']
  push:
    branches: [main]
    paths: ['.github/workflows/**']
jobs:
  actionlint:
    runs-on: ubuntu-latest
    timeout-minutes: 3
    steps:
      - uses: actions/checkout@de0fac2e... # v6.0.2
        with:
          persist-credentials: false
      - run: |
          curl -sSfL https://.../download-actionlint.bash -o /tmp/install.sh
          bash /tmp/install.sh 1.7.7
          ./actionlint -version
      - run: ./actionlint -shellcheck=  # no file args: actionlint checks every workflow in the repo, .yml and .yaml

Why -shellcheck= (empty)

Diagnostic on current main: 0 GHA-native errors ([expression], [syntax-check], [deprecated], [action]) but 69 shellcheck info/style findings across 12 workflows.

Most are SC2086 (unquoted variables around ${{ ... }} GH expressions). GHA expands ${{ ... }} before the shell runs, so in these workflows the findings are style-level rather than breaking. Enabling shellcheck now would block every PR until all 69 are fixed by hand.

Strategy: enable actionlint NOW with shellcheck off — catch the dangerous bugs immediately. Re-enable shellcheck later as a hardening step (separate backlog ticket).
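
For the later hardening ticket, a per-rule tally of the shellcheck findings helps size the work. The sketch below assumes actionlint's default one-line error format, where shellcheck findings include a `SCnnnn:severity` token; `tally_shellcheck_codes` is a hypothetical helper name, not part of this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads actionlint output on stdin and prints
# "<code>:<severity> <count>" per shellcheck rule, most frequent first.
# Assumes finding lines contain a token such as "SC2086:info".
tally_shellcheck_codes() {
  grep -oE 'SC[0-9]+:(error|warning|info|style)' \
    | sort \
    | uniq -c \
    | awk '{ print $2, $1 }' \
    | sort -k2,2nr -k1,1
}

# Usage (needs actionlint and shellcheck on PATH, so it is left as a comment):
#   actionlint 2>&1 | tally_shellcheck_codes
```

Running it on main before and after each cleanup PR gives a simple burn-down for the backlog ticket.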

Verification

  1. This PR merges.
  2. Try staging a deliberately-broken workflow: pre-commit blocks with the actionlint message.
  3. Open a PR with the same broken workflow: Validate Workflows / actionlint CI check fails.
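  4. Once the check is added to required checks, PRs with a broken workflow cannot merge. Because the workflow is path-filtered, confirm that PRs which do not touch .github/workflows/** are not left waiting on a required check that never runs.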
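
For steps 2 and 3, a throwaway workflow with a known GHA-native bug is enough. This is a sketch under assumptions: `write_broken_workflow` and the file name zz-actionlint-smoke.yml are hypothetical, and the bug chosen here (a misspelled `github` context property) is just one example of something actionlint's expression check should flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes a deliberately broken workflow for smoke-testing
# the hook and the CI check. Delete the file once verification is done.
write_broken_workflow() {
  local target="${1:-.github/workflows/zz-actionlint-smoke.yml}"
  mkdir -p "$(dirname "$target")"
  # Quoted heredoc delimiter: the shell leaves ${{ ... }} untouched.
  cat >"$target" <<'EOF'
name: actionlint smoke test (delete me)
on: pull_request
jobs:
  broken:
    runs-on: ubuntu-latest
    steps:
      # Typo on purpose: the context property is "event", not "evnt".
      - run: echo "${{ github.evnt.pull_request.number }}"
EOF
}

# Usage, from the repo root (left as comments so nothing is committed by accident):
#   write_broken_workflow
#   git add .github/workflows/zz-actionlint-smoke.yml
#   git commit -m "smoke: broken workflow"   # the pre-commit hook should block this
```

Pushing the same file on a scratch branch with `git commit --no-verify` exercises the CI layer without the local hook in the way.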