Freshness Decay Curves
A memory's value at time-of-retrieval is not the same as at time-of-write. A six-month-old fact about the user's job title is probably still right. A six-month-old preference for a particular IDE theme might be obsolete. A six-month-old "I'm in Boston this week" is definitely obsolete.
Per-type freshness decay encodes this. Each memory type has a half-life; decay is exponential. Access boost — a logarithmic function of retrieval count — counteracts decay for memories that prove their value at retrieval time.
The formula
freshness(t) = 2^(−t/τ)
Exponential decay; τ is the type-specific half-life in days.
- Fact: τ = 180 days.
- Preference: τ = 90 days.
- Event: τ = 30 days.
- Entity: τ = 365 days.
- Relation: τ = 180 days.
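The decay function and the per-type table fit in a few lines. A minimal sketch; the type names and dict layout are illustrative, not the actual schema:

```python
HALF_LIFE_DAYS = {
    "fact": 180,
    "preference": 90,
    "event": 30,
    "entity": 365,
    "relation": 180,
}

def freshness(age_days: float, memory_type: str) -> float:
    """Base-2 exponential decay: exactly 0.5 at one half-life."""
    tau = HALF_LIFE_DAYS[memory_type]
    return 2 ** (-age_days / tau)
```

For example, freshness(180, "fact") is exactly 0.5, and freshness(60, "event") is 0.25: two Event half-lives have elapsed.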
Decay math in depth
The formula 2^(-t/τ) is the standard radioactive-decay equation written in base-2. Equivalently: exp(-t × ln(2) / τ). The two forms compute identically; base-2 is preferred because it gives τ a direct, intuitive meaning — at exactly t = τ days, freshness = 0.5. That is a genuine half-life in the physics sense.
Why base-2 rather than base-e? With base-e you write exp(-t/τ) and at t = τ, freshness ≈ 0.368, not 0.5. That means τ is no longer a half-life — it's a "1/e life," which is harder to reason about when you're setting type parameters by hand. Base-2 keeps the parameter semantics human-readable.
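The equivalence of the two forms, and the half-life semantics, are easy to verify numerically:

```python
import math

t, tau = 120.0, 90.0
base2 = 2 ** (-t / tau)                    # base-2 form
base_e = math.exp(-t * math.log(2) / tau)  # equivalent base-e form
assert abs(base2 - base_e) < 1e-12         # identical curves

# Half-life semantics: at t = τ, freshness is exactly 0.5
assert 2 ** (-tau / tau) == 0.5
```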
Why exponential at all, rather than linear? A linear model 1 - t/(2τ) reaches zero at t = 2τ and goes negative beyond it. You'd need to clamp, producing a hard cliff: at day 359 the memory is fine; at day 361 it's completely dead. Exponential decay never reaches zero — it asymptotes. A memory is never completely forgotten; it just becomes very unlikely to surface in ranking. That aligns with how retrieval actually works: even a two-year-old fact might be the only instance of a key piece of information.
What floor do you need to keep a fact retrievable for N years?
Setting the freshness floor to f_floor and wanting the memory to remain retrievable until t = N × 365 days requires:
2^(−N×365/τ) ≥ f_floor
N × 365 / τ ≤ log₂(1/f_floor)
N ≤ τ × log₂(1/f_floor) / 365

For a Fact (τ = 180), floor = 0.1:
N ≤ 180 × log₂(10) / 365
N ≤ 180 × 3.322 / 365
N ≤ 1.64 years

So with a 0.1 floor and no access boost, a Fact that is never retrieved becomes suppressed after about 20 months. Add even modest access boost (access_count = 5, boost ≈ 2.79) and the effective freshness at 20 months is roughly 0.10 × 2.79 ≈ 0.28, well above floor, so accessed memories survive much longer. The floor is not a death sentence; it's a gate that access count can lift.
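The rearranged inequality translates directly into a small helper; a sketch:

```python
import math

def max_retrievable_years(tau_days: float, floor: float) -> float:
    """Years an unaccessed memory stays above the freshness floor."""
    return tau_days * math.log2(1 / floor) / 365
```

max_retrievable_years(180, 0.1) gives about 1.64; raising the floor to 0.2 cuts it to about 1.15 years.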
A concrete lookup table for Fact (τ = 180) with no access boost:
| Age (days) | Freshness |
|---|---|
| 30 | 0.891 |
| 90 | 0.707 |
| 180 | 0.500 |
| 360 | 0.250 |
| 540 | 0.125 |
| 720 | 0.063 |
At 720 days (two years) a completely unaccessed Fact has freshness 0.063 — below the 0.1 floor, so it gets suppressed from ranked retrieval. At 540 days (18 months) it still has freshness 0.125 — above floor, still returned.
Per-type half-lives: the reasoning
The half-lives aren't pulled from thin air. Each reflects an empirical claim about how fast that class of information loses predictive validity.
Fact (τ = 180 days). A fact is a declarative statement about the world: "the user's employer is Acme Corp", "their preferred language is Rust", "they live in Berlin". Facts are stable but not permanent — job changes, relocation, skill pivots. A 180-day half-life means that a fact about current employment has 70% freshness at 90 days and 50% at 180 days. At one year it's at 25% — still retrieved, but ranked lower than a fresher confirmation.
Preference (τ = 90 days). Consider: "prefers dark mode", "likes concise answers", "wants code examples in Go". Preferences drift faster than facts because they respond to environment changes (new machine, new team, new project). Eight months is well past 90 days — 2^(-240/90) = 2^(-2.67) ≈ 0.158. A preference from eight months ago is at 16% freshness. If it has been retrieved and confirmed a few times (access_count = 5, boost = 2.79), effective freshness is 0.158 × 2.79 ≈ 0.44 — comfortably retrievable but not dominant. If it has never been retrieved since being written, it's close to the floor and will soon be suppressed. That's the right behavior: an eight-month-old unconfirmed IDE preference probably doesn't deserve to inform responses.
Event (τ = 30 days). Events encode time-bounded reality: "I'm in Boston this week", "the sprint ends Friday", "we just shipped v2.1". A 30-day half-life is aggressive by design. At 30 days, freshness = 0.5. At 60 days, 0.25. At 120 days, 0.063 — below floor. A four-month-old event is, for practical purposes, gone from ranked retrieval. This is what you want. The agent should not volunteer "you were in Boston" as if it's a current fact.
Entity (τ = 365 days). Entities are nodes in the knowledge graph: people, organizations, products, codebases. "Alice is on the platform team" or "the payments service uses Stripe" are entity-linked facts that change on slow cycles — org restructures, vendor switches. A one-year half-life keeps them fresh through normal turnover. At two years, freshness = 0.25; at three years, ≈ 0.125. Still retrievable, but ranked lower. Entities intentionally outlast preferences and events.
Relation (τ = 180 days). Relations encode the edges: "Alice reports to Bob", "service A depends on service B", "this PR was reviewed by Carol". The 180-day half-life matches Facts because relations are typically grounded in the same kind of durable-but-mutable reality as declarative facts. A reporting chain is fairly stable; a code dependency is more volatile. If your domain has highly volatile relations, this is the first parameter to tune down.
Access boost interplay with decay
retrieval_freshness = freshness(t) × (1 + ln(1 + access_count))
The access boost is logarithmic in retrieval count, the same diminishing-returns shape as the repetition boost in the confidence formula.
The access boost table:
| access_count | boost |
|---|---|
| 0 | 1.000 |
| 1 | 1.693 |
| 5 | 2.792 |
| 10 | 3.398 |
| 100 | 5.615 |
Why logarithmic? The same reason the repetition boost in the confidence formula is logarithmic: to prevent popularity dominance. A memory accessed 1000 times has boost 1 + ln(1001) ≈ 7.91. A memory accessed 10 times has boost 3.40. The ratio is 2.3×, not 100×. A heavily-cached memory does not permanently crowd out newer memories.
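The boost values in the table come from a one-line function; a sketch:

```python
import math

def access_boost(access_count: int) -> float:
    """1 + ln(1 + count): grows without bound, but very slowly."""
    return 1 + math.log(1 + access_count)
```

access_boost(1000) / access_boost(10) is about 2.3, the compression ratio quoted above.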
Worked example: two memories, same age, different access counts.
Take two Preferences written 120 days ago (τ = 90).
freshness(120) = 2^(-120/90) = 2^(-1.333) ≈ 0.397

Both start at 0.397.
Memory A has been retrieved 0 times (access_count = 0):
effective_freshness_A = 0.397 × 1.000 = 0.397

Memory B has been retrieved 8 times (access_count = 8):
boost_B = 1 + ln(1 + 8) = 1 + ln(9) ≈ 1 + 2.197 = 3.197
effective_freshness_B = 0.397 × 3.197 ≈ 1.269

Memory B's effective freshness is above 1 — it will rank above even newly-written memories in the freshness component. Memory A, at 0.397, is still above the 0.1 floor and will be retrieved, but sits well below B.
Now push the clock forward another 150 days — both memories are 270 days old:
freshness(270) = 2^(-270/90) = 2^(-3) = 0.125

Memory A (still never retrieved):
effective_freshness_A = 0.125 × 1.000 = 0.125 (above floor, barely)

Memory B (still 8 retrievals):
effective_freshness_B = 0.125 × 3.197 ≈ 0.400

Memory A is at 0.125 — above 0.1, but it will be suppressed within the next few weeks if it keeps going unaccessed. Memory B is alive and healthy at 0.40. This is the "useful memories stay fresh" property in action: the retrieval system acts as a proof-of-value signal. If a memory keeps getting surfaced and used, its effective freshness stays high. If it was written once and never touched, it fades.
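The two-memory walkthrough can be reproduced in a few lines. A sketch combining the decay and boost formulas above:

```python
import math

def effective_freshness(age_days: float, tau: float, access_count: int) -> float:
    """Decay times access boost, as defined in the text."""
    return 2 ** (-age_days / tau) * (1 + math.log(1 + access_count))

# Two Preferences (τ = 90) at 270 days old:
mem_a = effective_freshness(270, 90, 0)  # never retrieved, ≈ 0.125
mem_b = effective_freshness(270, 90, 8)  # retrieved 8 times, ≈ 0.400
```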
Combined weight: the full calculation
At retrieval time, freshness is not the only signal. It multiplies into a full retrieval weight:
weight(m) = base_retrieval_score × freshness(age(m)) × access_boost(m)
base_retrieval_score is the fused RRF rank score; freshness and boost are applied in the policies stage.
Let's work through a complete example.
Inputs:
- Memory type: Fact (τ = 180 days)
- Age: 200 days
- access_count: 7
- base_retrieval_score from RRF fusion: 0.015
Step 1: freshness.
freshness(200) = 2^(-200/180) = 2^(-1.111) ≈ 0.463

Step 2: access boost.
boost = 1 + ln(1 + 7) = 1 + ln(8) = 1 + 2.079 = 3.079

Step 3: effective freshness.
effective_freshness = 0.463 × 3.079 ≈ 1.426

Step 4: weight.
weight = 0.015 × 1.426 ≈ 0.0214

Compare to a newer Fact with age 10 days, access_count = 0, same base score:
freshness(10) = 2^(-10/180) ≈ 0.962
boost = 1.000
effective_freshness = 0.962
weight = 0.015 × 0.962 ≈ 0.0144

The 200-day-old Fact with 7 retrievals (weight 0.0214) outranks the brand-new Fact with 0 retrievals (weight 0.0144). The access boost more than compensates for the freshness penalty. This is intentional: the older memory has demonstrated its value through repeated use; the newer one hasn't yet.
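The four steps reduce to one function. A sketch under the formula above; the inputs are the example's, not real fusion output:

```python
import math

def retrieval_weight(base_score: float, age_days: float,
                     tau: float, access_count: int) -> float:
    """base RRF score × freshness decay × access boost."""
    fresh = 2 ** (-age_days / tau)
    boost = 1 + math.log(1 + access_count)
    return base_score * fresh * boost

old_fact = retrieval_weight(0.015, 200, 180, 7)  # ≈ 0.0214
new_fact = retrieval_weight(0.015, 10, 180, 0)   # ≈ 0.0144
```

old_fact > new_fact, reproducing the ranking inversion worked through above.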
The base RRF score of 0.015 is typical for a mid-rank result in a corpus of a few thousand memories. RRF scores from fusion tend to cluster between 0.005 and 0.05; freshness and boost act as multiplicative modifiers on top of that score.
The freshness floor: why 0.1
effective_freshness = max(freshness(t), floor)
Effective freshness is clamped at a configurable floor; default 0.1.
The floor exists for a specific reason: without it, very old memories become completely unretrievable even if they hold unique information. A birthdate extracted on day 1 that was never re-retrieved would reach freshness 0.001 after several years and be buried far below any reasonable retrieval threshold.
What breaks if floor = 0? Old memories with zero access count vanish from ranked retrieval entirely. For most preferences that's fine — an eight-year-old preference is probably stale. But for long-lived facts — someone's date of birth, their primary language, a longstanding system constraint — zero floor means permanent amnesia. The agent forgets things that should be permanent but were simply never re-retrieved because they were never relevant enough to surface.
What breaks if floor = 0.5? A floor of 0.5 means the freshness penalty caps out at a 2× handicap relative to new memories. Ancient memories stay competitive in ranking forever. A seven-year-old event ("I was in London for a conference") would permanently compete with fresh context. The ranking signal degrades to near-noise for the freshness component.
Why 0.1? A 10% floor gives a 10× disadvantage to a memory that has never been refreshed. New memories and frequently-accessed old memories rank comfortably above floor-clamped memories. But floor-clamped memories aren't zero — they can still appear if the semantic match is strong enough and no fresher alternative exists. The 10% signal is weak but non-zero, which is exactly right for "this is the only known instance of this fact."
At floor = 0.1, a memory with RRF score 0.015 has final weight 0.015 × 0.1 = 0.0015. That's at the bottom of the ranking but not impossible to retrieve. A retrieval that returns 20 results might include it at position 18 or 19 — visible to a thorough application, invisible to one that cuts off at top-5.
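The clamp itself is one line; a sketch using the default from the text:

```python
def clamped_freshness(raw_freshness: float, floor: float = 0.1) -> float:
    """Effective freshness never drops below the configurable floor."""
    return max(raw_freshness, floor)
```

A memory decayed to 0.001 still carries the 0.1 floor signal; anything above floor passes through unchanged.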
Why exponential, not linear
Linear decay produces hard cliff-edges: under 1 − t/(2τ), the memory is exactly half useful at day τ and completely gone at day 2τ. Exponential decay is smooth: useful memories stay useful, marginal ones fade gracefully. Empirically, retrieval value tracks an exponential more closely than a linear over time, which is the same reason recency is modeled as a decaying feature in classical information retrieval. The exponential also composes cleanly with the access boost: a logarithmic boost on top of exponential decay yields a well-behaved steady state for a frequently-retrieved memory.
A linear model would also require explicit clamping at zero. Every memory type would need a hard cutoff date. That's operationally fragile: a Fact with τ = 180 would hard-expire at 360 days, requiring manual exception-handling for memories that are still valid at day 361.
Identity-class facts that shouldn't decay
Some facts don't decay. Birthdate, legal name, country of residence — these are facts that the user treats as permanent. An agent that slowly degrades its confidence in your birthdate because it hasn't been retrieved recently is doing the wrong thing.
The practical implementation has two options:
- τ → ∞. freshness(t) = 2^(−t/∞) = 2^0 = 1 at any age. The memory stays at full freshness forever. In practice, set τ to a very large number (e.g., 36,500 days — 100 years). This keeps the same code path; the math just never decays.
- A separate permanent type. Bypass the freshness calculation entirely. The retrieval weight reduces to base_retrieval_score × access_boost(m). Permanent memories are always retrieved at their semantic score without freshness penalty.
Option 2 is cleaner architecturally because it makes the intent explicit and avoids floating-point edge cases at large τ. In the current implementation, permanent-class storage uses τ = ∞ in the config with a special no-decay code path triggered by that sentinel.
The canonical set of identity-class facts: legal name, date of birth, language/locale, long-term home city, explicit system imports (calendar, CRM, HR system). The key signal is: the user would be annoyed if the agent got this wrong after a long gap.
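A sketch of the sentinel approach described above, with math.inf standing in for the config sentinel (the actual config format is not shown in this article):

```python
import math

PERMANENT_TAU = math.inf  # sentinel: identity-class, never decays

def freshness(age_days: float, tau: float) -> float:
    if math.isinf(tau):
        return 1.0            # explicit no-decay path
    return 2 ** (-age_days / tau)
```

freshness(36500, PERMANENT_TAU) is 1.0 even at 100 years; finite τ values follow the normal curve.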
Decay and drift: when type-level half-life isn't enough
Freshness decay handles the passage of time. It does not handle concept drift — when the meaning or value of a memory changes abruptly rather than gradually.
Consider "Alice is the team lead for platform." This Relation has τ = 180 days. If nothing happens, freshness decays smoothly. But if Alice gets promoted to VP and a new team lead is appointed, the memory is now not just stale — it's actively wrong. An agent relying on decayed-but-not-yet-suppressed data will give incorrect answers about the org structure. At 90 days post-change, the memory still has freshness 0.50. At 180 days it has 0.25. It's hurting retrieval quality throughout that window.
Drift detection (covered in the concept drift article) handles this case. The drift signal updates the memory's superseded_by field when a conflicting fact is extracted, which sets retrievable = false immediately — no waiting for the decay curve. Decay and drift are complementary:
- Decay handles gradual staleness: the memory ages out smoothly over months.
- Drift handles abrupt invalidation: the memory is marked superseded the moment a conflicting extraction arrives.
A well-tuned memory system uses both. Relying only on decay means wrong-but-recent data can crowd out correct-but-older data during the transition window. Relying only on drift detection misses the case where nothing explicit contradicts the memory — it just slowly stops being true.
The interaction between the two: a superseded memory's freshness decay continues even after being marked superseded, but the superseded flag independently sets retrievable = false. If the superseding memory is itself later superseded (Alice leaves, a new team lead is named, then also leaves), the system can potentially walk the supersession chain to recover earlier data — but it will have aged accordingly.
Tuning half-lives for your domain
The default half-lives are calibrated for a general-purpose personal assistant. Different deployment domains have different stability characteristics.
High-stability domains (medical, legal, financial records): Facts change slowly. A diagnosis, a drug allergy, a court order — these can stay accurate for years. τ for Facts should be pushed up to 365 days or beyond. Event τ can stay at 30 days (appointment dates expire normally). The risk in these domains is over-suppression: critical long-lived facts dropping below floor because they weren't re-retrieved in 18 months. Raise the floor to 0.2 or use permanent-class storage for safety-critical data.
High-churn domains (news, social monitoring, live markets): Facts become stale within hours or days. A news monitoring agent should set τ for Facts to 3–7 days. Events should be even shorter — 1–3 days. The access boost matters less here because you want aggressive pruning regardless of retrieval history. In these domains, consider lowering the boost coefficient (reduce the 1× multiplier on the log term) to prevent popular-but-old content from dominating.
Customer support (moderate churn): A user's subscription tier, their open tickets, their stated preferences — these change on a 30–90 day cycle. Fact τ around 90 days, Preference τ around 45 days. Entity τ (customer organizations) can stay high. The tricky case is product version information: a preference for behavior that existed in v1 but was changed in v2 is a drift problem, not a decay problem.
Personal knowledge management (slow churn): Daily-use notes, long-lived professional context. The defaults are roughly right here. The main tuning is increasing Entity τ to 730 days (2 years) for professional relationships and organizational affiliations.
A general heuristic: set τ to the median interval between changes in that fact category, then multiply by 2. The factor-of-2 buffer keeps the memory above floor during the gap between when the fact changes and when the system ingests the updated version.
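The heuristic is mechanical enough to automate. A sketch with hypothetical inputs (observed intervals, in days, between changes in a fact category):

```python
import statistics

def suggested_tau(change_intervals_days: list[float]) -> float:
    """Median interval between observed changes, times 2 (buffer factor)."""
    return 2 * statistics.median(change_intervals_days)

# If job titles in your corpus change every ~300 days on median,
# suggested_tau([250, 300, 400]) suggests a 600-day Fact half-life.
```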
The retrievable flag mechanics
Freshness decay is computed at query time from the stored creation timestamp. There is no scheduled job that walks the memory store and updates a freshness column. The computation is cheap — a single exponentiation per candidate — so it runs inline during the ranking pass.
The retrievable flag is a different mechanism. A background worker runs on a configurable schedule (default: nightly) and sets retrievable = false for memories where:
- Age > 365 days (configurable)
- Last accessed > 180 days ago
- freshness(age) × access_boost(access_count) < 0.1
- Either marked superseded and older than 1 year, OR access_count = 0
- No active Relation references this memory via evidence_memory_ids
All five conditions must hold. Condition 5 is the safety net: a memory that is anchored as evidence for an active relation cannot be pruned, even if all other conditions are met. This prevents orphaned relation nodes.
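The five conditions combine with AND. A sketch using hypothetical parameter names (the real schema is not shown in this article):

```python
import math

def should_suppress(age_days: float, days_since_access: float, tau: float,
                    access_count: int, superseded: bool,
                    referenced_by_active_relation: bool,
                    floor: float = 0.1) -> bool:
    """True only when all five pruning conditions hold."""
    effective = 2 ** (-age_days / tau) * (1 + math.log(1 + access_count))
    return (
        age_days > 365                                   # 1: old enough
        and days_since_access > 180                      # 2: cold enough
        and effective < floor                            # 3: below floor
        and ((superseded and age_days > 365)
             or access_count == 0)                       # 4: superseded or untouched
        and not referenced_by_active_relation            # 5: evidence safety net
    )
```

The last condition mirrors the evidence_memory_ids anchor check: one active relation reference vetoes suppression regardless of age or freshness.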
When retrievable = false, the memory is excluded from vector search index queries and keyword search filters. It skips the retrieval stage entirely — it doesn't even get a score. This is distinct from having low freshness: a low-freshness memory with retrievable = true still participates in retrieval and can surface if the semantic match is strong enough. A memory with retrievable = false does not participate at all.
The flag flips back. If a client calls the memory API with a specific memory ID, the direct-lookup path bypasses the retrievable flag entirely. The memory is returned regardless of freshness or retrievable status. After a successful direct lookup, the worker re-evaluates and can flip retrievable back to true if conditions no longer hold (e.g., the access_count just incremented past 0, voiding condition 4).
This matters for one practical scenario: a user explicitly asks about something old. "What did I say about X two years ago?" The agent fetches the relevant memory IDs via a separate administrative lookup, the direct-fetch path returns them, and those memories re-enter active status. They'll then rank according to their freshness (low, with floor) and access boost (now incremented), and may be suppressed again on the next background pass if they're still below threshold.
The audit trail is never purged by the retrievable flag. retrievable = false is invisible to the user; the memory's full record — creation timestamp, source, content, decay history — remains readable via audit APIs. This is intentional. Explainability requires that every decision to suppress a memory be reconstructable.
When decay misses the point
- Identity-class facts. Birthdate, name, hometown — these don't decay. Use τ → ∞ or a permanent type.
- Drifty preferences. A user who switched IDEs three years ago does not want decay-but-still-retrieved old preferences resurfacing. Decay should be type-aware and drift-aware. See concept drift detection.
- Cyclical events. Birthdays come around. A "birthday is next week" Event should not just decay — it should be boosted periodically on the calendar recurrence. Use periodic boosting around the calendar event, not just access count.
- Negations. "I no longer work at Acme" is not just a new Fact — it supersedes an old one. Relying on the old Fact's decay to handle the transition is too slow. Negations need explicit supersession, not patience.