Reciprocal Rank Fusion (k=60)
Cormack, Clarke, Büttcher (2009), SIGIR
This audit tests Search.SmartQuery.sourceWeights (apple-docs=3.0, swift-evolution=1.5, packages=1.5, swift-book=1.0, swift-org=1.0, samples=1.0, apple-archive=0.5, hig=0.5), the per-source authority bias applied during cross-source RRF fan-out. This machinery had no test coverage before today.
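As a rough illustration of how a per-source weight can bias Reciprocal Rank Fusion, here is a minimal sketch. It assumes the weight simply multiplies the standard RRF term 1/(k + rank); the audit text does not show how Search.SmartQuery actually combines the two, so `weighted_rrf` and its input shape are hypothetical. Only the weight values and k=60 come from the text above.

```python
from collections import defaultdict

# Weights as listed in the audit; the fusion formula below is an assumption.
SOURCE_WEIGHTS = {
    "apple-docs": 3.0, "swift-evolution": 1.5, "packages": 1.5,
    "swift-book": 1.0, "swift-org": 1.0, "samples": 1.0,
    "apple-archive": 0.5, "hig": 0.5,
}

def weighted_rrf(rankings, weights=SOURCE_WEIGHTS, k=60):
    """Fuse per-source ranked URI lists into one list of (uri, score).

    rankings maps a source name to its URIs, best first.
    Each hit contributes weight / (k + rank), with rank starting at 1.
    """
    scores = defaultdict(float)
    for source, uris in rankings.items():
        w = weights.get(source, 1.0)  # unknown sources get a neutral weight
        for rank, uri in enumerate(uris, start=1):
            scores[uri] += w / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Under this assumed formula, a rank-2 apple-docs hit scores 3.0/62 and a rank-1 hig hit scores 0.5/61, so the apple-docs hit is placed first.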
Each card opens its own page. The headline and charts above are all you need at a glance; the cards are for the why and how.
The 6 mismatches are not "ranker chose wrong source from candidates"; they are "expected source had nothing in the top-10 at all because apple-docs's source weight outcompetes the lower-weight sources." A deeper finding…
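The arithmetic behind this kind of crowding-out can be sketched as follows, assuming each hit is scored as weight / (k + rank) with k=60. That scoring rule is an assumption for illustration, not something the audit confirms, and `last_winning_rank` is a hypothetical helper. It finds the deepest rank at which a higher-weight source's hit still strictly outscores the top hit of a lower-weight source.

```python
def last_winning_rank(w_high, w_low, k=60, low_rank=1):
    """Largest rank r where w_high/(k + r) > w_low/(k + low_rank).

    The comparison is cross-multiplied to avoid floating-point division.
    Returns 0 if the higher-weight hit never wins.
    """
    r = 0
    while w_high * (k + low_rank) > w_low * (k + r + 1):
        r += 1
    return r

# apple-docs (3.0) against the rank-1 hit of a 0.5-weight source (hig, apple-archive)
deepest_vs_half = last_winning_rank(3.0, 0.5)
# apple-docs (3.0) against the rank-1 hit of a 1.5-weight source
deepest_vs_mid = last_winning_rank(3.0, 1.5)
```

With these inputs the helper returns 305 and 61. Both are far deeper than a 10-result cutoff, which is consistent with the observation that a lower-weight source can be absent from the top-10 entirely whenever apple-docs returns ten or more candidates.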
For 19 queries where the expected top-1 source was apple-docs (weight 3.0), the actual top-1 was always apple-docs.
For each of these queries, the HIG or apple-archive corpus does have a relevant page (verified via cupertino search "<query>" --source hig and --source apple-archive).
25 (query, expected_source, rationale) triples. For each: run cupertino search "<query>" --limit 10, extract top-1 URI, derive its source via the URI prefix, compare against expected.
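The procedure above can be sketched as a small harness. This is an illustration under stated assumptions, not the audit's actual script: the `search` callable stands in for running cupertino search "<query>" --limit 10 (its output format is not shown here, so parsing is left to the caller), and the URI prefixes in the usage example are hypothetical placeholders.

```python
def source_of(uri, prefixes):
    """Map a URI to a source name by prefix; None if no prefix matches."""
    for prefix, source in prefixes.items():
        if uri.startswith(prefix):
            return source
    return None

def audit(triples, search, prefixes):
    """Return the triples whose top-1 source differs from the expected one.

    triples:  iterable of (query, expected_source, rationale)
    search:   callable(query, limit) -> list of URIs, best first
    prefixes: dict mapping a URI prefix to a source name (hypothetical)
    """
    mismatches = []
    for query, expected, rationale in triples:
        results = search(query, limit=10)
        actual = source_of(results[0], prefixes) if results else None
        if actual != expected:
            mismatches.append((query, expected, actual, rationale))
    return mismatches

# Usage with a stubbed search; queries, URIs, and prefixes are made up.
PREFIXES = {"apple-docs://": "apple-docs", "hig://": "hig"}
FAKE_INDEX = {
    "button styles": ["apple-docs://swiftui/buttonstyle"],
    "color guidance": ["apple-docs://swiftui/color", "hig://color"],
}
found = audit(
    [("button styles", "apple-docs", "API lookup"),
     ("color guidance", "hig", "design guidance")],
    lambda q, limit: FAKE_INDEX.get(q, [])[:limit],
    PREFIXES,
)
```

Injecting `search` keeps the comparison logic testable without the CLI. In the stubbed run, only the second triple is reported, because its top-1 URI resolves to apple-docs rather than hig.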
For the LLM-agent consumer, this baseline is good news: the agent grounding on cupertino's top-1 will always get apple-docs (the API reference) for any Swift-code-generation query that has an apple-docs answer.
Three of eight query classes from §1.4 now have documented baselines. Five remain: C (acronym), D (CamelCase fragment), G (prose), H (symbol-attribute), and Phase 1.7 agent-end-to-end.
Every metric and method this audit relies on, with a link to the foundational source. Auto-collected from the audit text.
Cormack, Clarke, Büttcher (2009), SIGIR
Conover (1999), Practical Nonparametric Statistics
Voorhees (1999), TREC-8 QA Report
Manning, Raghavan, Schütze (2008), IIR §8.4