FreshContext · Illustrative static demo · v1.0

Most RAG pipelines fail because of time, not embeddings.

Same model. Same mock retrieval set. Same query. Two different answers, because one ranking path preserved source age and the other did not.

This demo uses synthetic documents and simplified envelopes to show how FreshContext treats signals. It does not certify the documents, determine truth, or provide current technical advice.

The query

A developer in an illustrative 2026 scenario asks an AI agent for current best practices. Five mock documents are retrieved by semantic similarity.

The retrieval set

Each mock document is scored on pure semantic similarity to the query. The top result is an older, confidently written example with a dense keyword match.

The math

FreshContext applies a single correction: an exponential decay weight based on document age. Nothing else changes — same embeddings, same vectors, same retrieval set.

R_t = R_0 · e^{-λ·t}
  • R_0 = base semantic relevancy [0–100]
  • λ   = source-specific decay constant (per hour)
  • t   = hours elapsed since the document was published
  • R_t = decay-adjusted relevancy at query time
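As a minimal sketch of the decay weighting defined above, assuming the exponent is negative (as exponential decay implies) so that R_t = R_0 · e^{-λ·t}. The function name and the sample values are illustrative and are not taken from the FreshContext codebase.

```python
import math


def decay_adjusted_relevancy(r0: float, lam: float, t_hours: float) -> float:
    """Return R_t = R_0 * exp(-lam * t) for a document that is t_hours old."""
    if t_hours < 0:
        raise ValueError("document age cannot be negative")
    return r0 * math.exp(-lam * t_hours)


# Hypothetical inputs: the same base score at two different ages.
score_one_day = decay_adjusted_relevancy(80.0, 0.0001, 24.0)
score_one_year = decay_adjusted_relevancy(80.0, 0.0001, 24.0 * 365)
```

With identical base relevancy, the older document always receives the lower adjusted score; at t = 0 the score is unchanged.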

For this demo, λ = 0.0001 per hour (half-life = ln 2 / λ ≈ 6,931 hours ≈ 9.5 months). The live FreshContext engine uses source-specific λ values: HN front page ≈ 14h half-life, blog posts ≈ 29 days, academic papers ≈ 1.6 years.
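The half-lives quoted above relate to λ through half-life = ln 2 / λ. The sketch below converts in both directions. The dictionary keys and the 365.25-day year are assumptions made for illustration; the λ values actually used by the live engine may be rounded or derived differently.

```python
import math

HOURS_PER_DAY = 24.0
HOURS_PER_YEAR = 365.25 * HOURS_PER_DAY  # assumed year length


def half_life_hours(lam: float) -> float:
    """Hours after which a score decays to half its base value."""
    return math.log(2) / lam


def lam_from_half_life(hours: float) -> float:
    """Per-hour decay constant that yields the given half-life."""
    return math.log(2) / hours


# Demo constant from the text: 0.0001 per hour.
demo_half_life_days = half_life_hours(0.0001) / HOURS_PER_DAY

# Per-source constants derived from the half-lives quoted in the text.
source_lambdas = {
    "hn_front_page": lam_from_half_life(14.0),
    "blog_post": lam_from_half_life(29.0 * HOURS_PER_DAY),
    "academic_paper": lam_from_half_life(1.6 * HOURS_PER_YEAR),
}
```

Here `demo_half_life_days` comes out at roughly 289 days, which is the "≈ 9.5 months" figure.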

The ranking shift

The 2022 blog falls from rank 1 to rank 5. The 2026 X post rises from rank 4 to rank 1. Same documents, decay applied:

Without FreshContext: semantic top-k ranking
With FreshContext: decay-adjusted ranking
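The re-ranking step can be sketched as two sorts over the same list. Every score and age below is invented for illustration and is not the demo's actual data file; the values were chosen only so that the older blog moves from rank 1 to rank 5 and the recent post from rank 4 to rank 1, as described above. The document ids are likewise hypothetical.

```python
import math

LAMBDA_PER_HOUR = 0.0001  # demo constant from the text
HOURS_PER_YEAR = 8766.0   # assumed: 365.25 days

# Hypothetical retrieval set: (doc_id, base_score_0_to_100, age_hours).
DOCS = [
    ("blog_2022", 94.0, 4 * HOURS_PER_YEAR),
    ("doc_b", 90.0, 2 * HOURS_PER_YEAR),
    ("doc_c", 86.0, 1 * HOURS_PER_YEAR),
    ("x_post_2026", 81.0, 48.0),
    ("doc_e", 78.0, 720.0),
]


def rank_semantic(docs):
    """Order document ids by base semantic score only."""
    ranked = sorted(docs, key=lambda d: d[1], reverse=True)
    return [doc_id for doc_id, _, _ in ranked]


def rank_decayed(docs, lam):
    """Order document ids by R_0 * exp(-lam * t)."""
    ranked = sorted(docs, key=lambda d: d[1] * math.exp(-lam * d[2]), reverse=True)
    return [doc_id for doc_id, _, _ in ranked]


semantic_order = rank_semantic(DOCS)
decayed_order = rank_decayed(DOCS, LAMBDA_PER_HOUR)
```

Nothing about the embeddings or the retrieval set changes between the two orderings; only the sort key does.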

The answer that changes

Both illustrative answers use the same query. The only difference is which three documents appear at the top of the context window:

Without FreshContext
Top context: …
With FreshContext
Top context: …

What this is, and what it isn't

This isn't a model-certification claim. The demo shows how a model can faithfully summarize stale context when that is what appears first.

This isn't an embedding benchmark. The mock cosine scores are intentionally plausible so the freshness signal can be isolated.

This is a context-engineering problem. Retrieval ranks correctly along one axis (semantic similarity) and ignores the other axis that matters in production (temporal validity). FreshContext adds the missing axis.

"Most RAG pipelines rank context correctly semantically but incorrectly temporally."

Reproduce this

The retrieval set, the math, and the prompts are all open. Read the data file, change the query, swap λ, and run it against your own RAG pipeline.
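When swapping λ, it can help to know the value at which a fresher document overtakes an older, higher-scoring one. Setting R_old · e^{-λ·t_old} = R_new · e^{-λ·t_new} and solving gives λ* = ln(R_old / R_new) / (t_old − t_new). The sketch below computes that crossover; the scores and ages are hypothetical placeholders, not values from the demo's data file.

```python
import math


def decayed(r0: float, lam: float, t_hours: float) -> float:
    """Decay-adjusted score R_0 * exp(-lam * t)."""
    return r0 * math.exp(-lam * t_hours)


def crossover_lambda(r_old: float, t_old: float, r_new: float, t_new: float) -> float:
    """Per-hour lambda at which the older and newer documents tie.

    Assumes the older document has the higher base score; otherwise
    the newer document already wins at lambda = 0.
    """
    if not (r_old > r_new > 0 and t_old > t_new >= 0):
        raise ValueError("expected r_old > r_new > 0 and t_old > t_new >= 0")
    return math.log(r_old / r_new) / (t_old - t_new)


# Hypothetical pair: a four-year-old document scored 94 and a
# two-day-old document scored 81.
OLD = (94.0, 4 * 8766.0)
NEW = (81.0, 48.0)
lam_star = crossover_lambda(OLD[0], OLD[1], NEW[0], NEW[1])

# Below lam_star the older document still ranks first; above it, the newer one does.
old_wins_below = decayed(*OLD[:1], lam_star / 2, OLD[1]) > decayed(*NEW[:1], lam_star / 2, NEW[1])
new_wins_above = decayed(*NEW[:1], lam_star * 2, NEW[1]) > decayed(*OLD[:1], lam_star * 2, OLD[1])
```

For this hypothetical pair the crossover sits well below the demo's λ = 0.0001 per hour, so the demo constant is more than enough to flip the two.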