MONTREAL.AI / SKILLOS
Autonomous RSI Cyber Defense Market Proof
Recursive self-improvement on SOC alert triage and safe incident containment planning.
Current status
PASSED_AUTONOMOUS_RSI_CYBERDEFENSE_MARKET_PROOF
No human review. No emails. No invoices. No CloudOps reuse. No customers. No private data. No API keys. Deterministic holdout benchmark.
+83.1 pts fully-correct gain
100.0% SEV1 recall
98.1% containment-time reduction
$5,711,374,966.36 synthetic cost avoided
Recursive self-improvement curve
Before / after on holdout incidents
| Metric | Baseline | SkillOS RSI |
|---|---|---|
| Fully correct decisions | 16.9% | 100.0% |
| Incident accuracy | 16.9% | 100.0% |
| Action accuracy | 16.9% | 100.0% |
| SEV1 recall | 7.8% | 100.0% |
| Unsafe action rate | 83.1% | 0.0% |
| Avg time to containment | 361.4 min | 6.8 min |
| Avg cost | $9,027,401.56 | $103,378.17 |
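The headline figures can be re-derived from the table. The sketch below is illustrative only: it assumes the gain is a simple percentage-point difference and that each "reduction" is a relative reduction of the baseline average. The dictionary keys and the helper `relative_reduction_pct` are hypothetical names, not part of the benchmark.

```python
# Values copied from the before/after holdout table above.
baseline = {"fully_correct_pct": 16.9, "containment_min": 361.4, "avg_cost_usd": 9027401.56}
skillos = {"fully_correct_pct": 100.0, "containment_min": 6.8, "avg_cost_usd": 103378.17}


def relative_reduction_pct(before: float, after: float) -> float:
    """Relative reduction of `after` versus `before`, in percent (assumed definition)."""
    return 100.0 * (before - after) / before


# Percentage-point gain in fully correct decisions.
gain_pts = skillos["fully_correct_pct"] - baseline["fully_correct_pct"]

# Relative reductions in average containment time and average cost.
time_reduction = relative_reduction_pct(baseline["containment_min"], skillos["containment_min"])
cost_reduction = relative_reduction_pct(baseline["avg_cost_usd"], skillos["avg_cost_usd"])

print(f"fully-correct gain: {gain_pts:+.1f} pts")
print(f"containment-time reduction: {time_reduction:.1f}%")
print(f"cost reduction: {cost_reduction:.1f}%")
```

Under these assumptions the first two values match the reported +83.1 pts and 98.1%, and the cost reduction works out to roughly 98.9%.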
Final learned skills
- detect_impossible_travel_ato — Detect impossible travel plus sensitive access and revoke sessions, lock the account, and rotate tokens.
- detect_mfa_fatigue — Detect MFA push flooding and enforce phishing-resistant MFA before access resumes.
- detect_c2_dns_tunnel — Detect anomalous DNS tunneling and sinkhole domains while isolating resolver clients.
- detect_data_exfiltration — Detect unusual outbound data transfer after sensitive reads and block egress while preserving evidence.
- detect_ransomware_staging — Detect encryption staging and lateral movement, then isolate hosts and protect backups.
- detect_privilege_escalation — Detect unusual admin role grants and revoke privilege while reviewing the identity path.
- detect_public_bucket_exposure — Detect public exposure on sensitive storage and remove public access policy.
- detect_cloud_key_leak — Detect leaked cloud access keys and revoke, rotate, and audit usage.
- detect_suspicious_oauth_grant — Detect unusual OAuth grants with broad scopes and revoke the grant.
- detect_insider_mass_download — Detect abnormal mass download by a legitimate user and pause access for review.
- detect_endpoint_cryptominer — Detect cryptomining behavior and remove persistence without disrupting unrelated services.
- detect_phishing_session_hijack — Detect session hijack after phishing and revoke sessions while blocking the phishing domain.
- detect_supply_chain_token_abuse — Detect CI/CD token abuse and freeze the pipeline while verifying artifacts.
- detect_malware_beaconing — Detect endpoint beaconing to suspicious infrastructure and isolate affected hosts.
- detect_credential_stuffing — Detect high failed-login velocity with broad account spray and contain through rate limiting plus forced resets.
- detect_benign_anomaly — Recognize benign anomalies and avoid unnecessary containment.
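One way to picture a learned skill is as a named predicate over alert features paired with containment actions. The following is a minimal sketch under stated assumptions, not the SkillOS implementation: the `Skill` class, the alert fields (`failed_logins_per_min`, `distinct_accounts`), the thresholds, and the action labels are all hypothetical. Only the skill names and the described behaviour come from the list above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Alert = Dict[str, float]  # hypothetical alert representation: feature name -> value


@dataclass(frozen=True)
class Skill:
    name: str
    matches: Callable[[Alert], bool]
    actions: Tuple[str, ...]


# Hypothetical thresholds, chosen only to illustrate "high failed-login velocity
# with broad account spray".
SKILLS = [
    Skill(
        name="detect_credential_stuffing",
        matches=lambda a: a.get("failed_logins_per_min", 0) >= 100
        and a.get("distinct_accounts", 0) >= 50,
        actions=("rate_limit", "force_password_reset"),
    ),
]

# Fallback: treat unmatched alerts as benign and take no containment action.
BENIGN = Skill(name="detect_benign_anomaly", matches=lambda a: True, actions=())


def triage(alert: Alert) -> Skill:
    """Return the first matching skill, or the benign fallback."""
    for skill in SKILLS:
        if skill.matches(alert):
            return skill
    return BENIGN


spray = triage({"failed_logins_per_min": 250, "distinct_accounts": 120})
quiet = triage({"failed_logins_per_min": 3, "distinct_accounts": 1})
print(spray.name, spray.actions)
print(quiet.name, quiet.actions)
```

Placing the benign skill last as a catch-all is a design choice in this sketch: an alert triggers containment only when a specific rule fires, which mirrors the stated goal of avoiding unnecessary containment.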
Proof gates
- ✅ not email workflow
- ✅ not invoice workflow
- ✅ not CloudOps workflow
- ✅ defensive-only cybersecurity workflow
- ✅ no human review required
- ✅ no emails sent
- ✅ no customers contacted
- ✅ no private data used
- ✅ no API keys required
- ✅ deterministic, reproducible benchmark
- ✅ recursive self-improvement releases at least 7
- ✅ RSI validation improves monotonically
- ✅ train cases at least 300
- ✅ validation cases at least 150
- ✅ holdout cases at least 600
- ✅ final rules at least 15
- ✅ fully correct gain at least 70 points
- ✅ incident accuracy at least 99 percent
- ✅ action accuracy at least 99 percent
- ✅ SEV1 recall at least 99 percent
- ✅ unsafe action rate zero
- ✅ containment time reduction at least 80 percent
- ✅ cost reduction at least 80 percent
- ✅ synthetic cost avoided positive
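The metric gates above amount to threshold comparisons, which can be sketched as a small table of checks. This is an illustrative sketch only: the metric key names, the `GATES` structure, and `evaluate_gates` are hypothetical, while the thresholds and metric values are taken from this page. The cost-reduction gate and the dataset-size gates are omitted because their underlying values are not stated here.

```python
import operator

# Metric values as reported on this page.
metrics = {
    "fully_correct_gain_pts": 83.1,
    "incident_accuracy_pct": 100.0,
    "action_accuracy_pct": 100.0,
    "sev1_recall_pct": 100.0,
    "unsafe_action_rate_pct": 0.0,
    "containment_time_reduction_pct": 98.1,
    "synthetic_cost_avoided_usd": 5711374966.36,
}

# (metric name, comparison, threshold) taken from the proof-gate list.
GATES = [
    ("fully_correct_gain_pts", operator.ge, 70.0),
    ("incident_accuracy_pct", operator.ge, 99.0),
    ("action_accuracy_pct", operator.ge, 99.0),
    ("sev1_recall_pct", operator.ge, 99.0),
    ("unsafe_action_rate_pct", operator.eq, 0.0),
    ("containment_time_reduction_pct", operator.ge, 80.0),
    ("synthetic_cost_avoided_usd", operator.gt, 0.0),
]


def evaluate_gates(values, gates):
    """Map each gate's metric name to True (passed) or False (failed)."""
    return {name: op(values[name], threshold) for name, op, threshold in gates}


results = evaluate_gates(metrics, GATES)
all_passed = all(results.values())
for name, passed in results.items():
    print(("PASS" if passed else "FAIL"), name)
```

Feeding in the baseline unsafe action rate of 83.1% instead of 0.0% would fail the `unsafe_action_rate_pct` gate, which shows that the zero-tolerance check is an exact equality rather than a threshold.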