MONTREAL.AI / SKILLOS
Autonomous RSI CloudOps Market Proof
Recursive self-improvement on objective cloud reliability incident triage and cost remediation.
Current status
PASSED_AUTONOMOUS_RSI_CLOUDOPS_MARKET_PROOF
No human review. No emails. No invoices. No customers. No private data. No API keys. Deterministic holdout benchmark.
+88.8 pts fully-correct gain
100.0% SEV1 recall
90.8% MTTR reduction
$56,217,294.29 synthetic cost avoided
Recursive self-improvement curve
Before / after on holdout incidents
| Metric | Baseline | SkillOS RSI |
|---|---|---|
| Fully correct decisions | 11.2% | 100.0% |
| Root-cause accuracy | 11.2% | 100.0% |
| Action accuracy | 11.2% | 100.0% |
| SEV1 recall | 12.9% | 100.0% |
| Unsafe action rate | 35.7% | 0.0% |
| Avg MTTR | 129.7 min | 11.9 min |
| Avg cost | $144,817.31 | $10,966.61 |
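The headline gain and reduction figures can be reproduced from the table with simple arithmetic. The sketch below assumes the gain is the difference in percentage points and that each reduction is measured relative to the baseline value. The helper names `point_gain` and `relative_reduction_pct` are illustrative and do not come from the benchmark code.

```python
# Baseline and SkillOS RSI values copied from the holdout table above.
baseline = {"fully_correct_pct": 11.2, "avg_mttr_min": 129.7, "avg_cost_usd": 144817.31}
skillos = {"fully_correct_pct": 100.0, "avg_mttr_min": 11.9, "avg_cost_usd": 10966.61}


def point_gain(before: float, after: float) -> float:
    """Absolute gain in percentage points, rounded to one decimal."""
    return round(after - before, 1)


def relative_reduction_pct(before: float, after: float) -> float:
    """Reduction as a percentage of the baseline value (assumed definition)."""
    return round(100.0 * (before - after) / before, 1)


fully_correct_gain = point_gain(baseline["fully_correct_pct"], skillos["fully_correct_pct"])
mttr_reduction = relative_reduction_pct(baseline["avg_mttr_min"], skillos["avg_mttr_min"])
cost_reduction = relative_reduction_pct(baseline["avg_cost_usd"], skillos["avg_cost_usd"])

print(f"fully-correct gain: +{fully_correct_gain} pts")
print(f"MTTR reduction: {mttr_reduction}%")
print(f"cost reduction: {cost_reduction}%")
```

Under these assumptions the first two values match the reported +88.8 pts and 90.8%. The cost reduction is not reported as a headline figure on this page; the same formula applied to the table's average costs gives roughly 92.4%.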
Final learned skill rules
- detect_memory_leak — If memory slope and OOM kills rise, restart leaking pods and open leak investigation.
- detect_cache_stampede — If cache hit rate collapses and DB QPS spikes, enable coalescing, rate limiting, and cache warmup.
- detect_db_pool_exhaustion — If DB waiters and connection saturation spike, tune pool limits and throttle callers.
- detect_cert_expiry — If TLS handshake failures and cert-expiry signals appear, renew cert and reload ingress.
- detect_dns_misconfig — If NXDOMAIN/SERVFAIL spikes after DNS change, rollback DNS record and flush bad cache.
- detect_disk_pressure — If disk usage and write failures rise, clear log growth and expand volume.
- detect_queue_backlog — If queue depth and message age rise, scale workers and apply backpressure.
- detect_third_party_outage — If third-party latency dominates while internal metrics are healthy, enable circuit breaker and fallback.
- detect_cost_spike_idle_resources — If cost spikes while utilization is low, shut down idle resources and apply budget guardrail.
- detect_quota_limit — If provider quota errors spike, apply retry backoff and request quota increase.
- detect_secrets_rotation_failure — If auth failures spike after secret rotation, rollback secret version and rotate safely.
- detect_feature_flag_misroute — If traffic routes incorrectly after a flag change, disable the flag and restore routing.
- detect_cpu_saturation — If CPU saturation drives latency without deploy correlation, scale HPA and right-size CPU requests.
- detect_deploy_regression — If error rate spikes within 30 minutes of a deploy, rollback the canary and freeze deploys.
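Each rule above pairs a condition on incident signals with a remediation. One way such rules might be represented is as a small condition/action table, sketched below for three of them. The signal names (`error_rate_spike`, `minutes_since_deploy`, and so on), the action identifiers, and the 20% utilization cutoff are assumptions made for illustration; only the 30-minute deploy window comes from the rule text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Signals = Dict[str, float]  # hypothetical signal name -> value (bools count as 0/1)

LOW_UTILIZATION = 0.20  # assumed cutoff; the source does not give a threshold


@dataclass(frozen=True)
class SkillRule:
    name: str
    condition: Callable[[Signals], bool]
    actions: Tuple[str, ...]


RULES = (
    SkillRule(
        name="detect_deploy_regression",
        # Error rate spikes within 30 minutes of a deploy.
        condition=lambda s: bool(s.get("error_rate_spike"))
        and 0 <= s.get("minutes_since_deploy", float("inf")) <= 30,
        actions=("rollback_canary", "freeze_deploys"),
    ),
    SkillRule(
        name="detect_queue_backlog",
        # Queue depth and message age both rise.
        condition=lambda s: bool(s.get("queue_depth_rising"))
        and bool(s.get("message_age_rising")),
        actions=("scale_workers", "apply_backpressure"),
    ),
    SkillRule(
        name="detect_cost_spike_idle_resources",
        # Cost spikes while utilization is low.
        condition=lambda s: bool(s.get("cost_spike"))
        and s.get("utilization", 1.0) < LOW_UTILIZATION,
        actions=("shut_down_idle_resources", "apply_budget_guardrail"),
    ),
)


def triage(signals: Signals) -> Optional[SkillRule]:
    """Return the first rule whose condition holds, or None if nothing matches."""
    for rule in RULES:
        if rule.condition(signals):
            return rule
    return None


match = triage({"error_rate_spike": True, "minutes_since_deploy": 12})
print(match.name if match else "no rule matched")
```

In this sketch `triage` returns `None` when no condition holds, so an unrecognized incident produces no automated action instead of a guessed one. Whether the benchmark resolves ties by rule order, as done here, is not stated in the source.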
Proof gates
- ✅ not email workflow
- ✅ not invoice workflow
- ✅ no human review required
- ✅ no emails sent
- ✅ no customers contacted
- ✅ no private data used
- ✅ no API keys required
- ✅ deterministic reproducible benchmark
- ✅ recursive self-improvement releases at least 5
- ✅ RSI validation improves monotonically
- ✅ train cases at least 250
- ✅ validation cases at least 100
- ✅ holdout cases at least 400
- ✅ final rules at least 12
- ✅ fully correct gain at least 50 points
- ✅ root cause accuracy at least 95 percent
- ✅ action accuracy at least 95 percent
- ✅ SEV1 recall at least 99 percent
- ✅ unsafe action rate zero
- ✅ MTTR reduction at least 70 percent
- ✅ cost reduction at least 70 percent
- ✅ synthetic cost avoided positive
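The quantitative gates above amount to threshold checks on the reported metrics. A minimal sketch of such a check follows, assuming each gate compares one metric against one threshold. The metric keys, the gate table layout, and `evaluate_gates` are hypothetical; the values are the figures shown on this page, with the rule count taken from the 14 rules listed under "Final learned skill rules". Gates whose inputs are not shown here (dataset sizes, release count, the validation curve, cost reduction) are left out.

```python
import operator

# Figures reported on this page.
reported = {
    "fully_correct_gain_pts": 88.8,
    "root_cause_accuracy_pct": 100.0,
    "action_accuracy_pct": 100.0,
    "sev1_recall_pct": 100.0,
    "unsafe_action_rate_pct": 0.0,
    "mttr_reduction_pct": 90.8,
    "synthetic_cost_avoided_usd": 56217294.29,
    "final_rules": 14,
}

# (gate name, metric key, comparison, threshold) -- thresholds from the gate list.
GATES = [
    ("final_rules_at_least_12", "final_rules", operator.ge, 12),
    ("fully_correct_gain_at_least_50_points", "fully_correct_gain_pts", operator.ge, 50.0),
    ("root_cause_accuracy_at_least_95_percent", "root_cause_accuracy_pct", operator.ge, 95.0),
    ("action_accuracy_at_least_95_percent", "action_accuracy_pct", operator.ge, 95.0),
    ("sev1_recall_at_least_99_percent", "sev1_recall_pct", operator.ge, 99.0),
    ("unsafe_action_rate_zero", "unsafe_action_rate_pct", operator.eq, 0.0),
    ("mttr_reduction_at_least_70_percent", "mttr_reduction_pct", operator.ge, 70.0),
    ("synthetic_cost_avoided_positive", "synthetic_cost_avoided_usd", operator.gt, 0.0),
]


def evaluate_gates(metrics, gates):
    """Map each gate name to True (passed) or False (failed)."""
    return {name: compare(metrics[key], threshold) for name, key, compare, threshold in gates}


results = evaluate_gates(reported, GATES)
for name, passed in results.items():
    print(("PASS " if passed else "FAIL ") + name)
print("all gates passed:", all(results.values()))
```

Keeping the gates in a data table like this means a threshold can be changed or audited without touching the evaluation logic.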