In 2011, Tampa repainted its runway numbers. The runway hadn't moved a centimeter. Magnetic north had. Your codebase is full of the same constant: a value that was true when you wrote it, quietly going wrong while every test still passes.
In 2011, Tampa International Airport closed its primary runway and sent out the paint crews. They ground off the big white 18R at one end and stenciled 19R in its place; the far end's 36L became 1L. Taxiway signs were swapped. Approach charts were reissued. Flight-management databases were updated, pilots were rebriefed, and the FAA processed a small mountain of paperwork.
Here is the part worth sitting with: the runway did not move. Not a centimeter of displacement, not a fraction of a degree of rotation. Thousands of feet of concrete pointed exactly where they had pointed the day before. What moved was magnetic north.
A runway's number is not a name. It's a measurement, the runway's magnetic compass heading, rounded to the nearest ten degrees, with the trailing zero dropped. Runway 18 points roughly 180° magnetic, due south. And because the magnetic heading of a fixed strip of concrete depends on where the planet's magnetic pole happens to be, the number has an expiration date. The FAA tries to keep the magnetic-variation figure used for guidance within about one degree of reality. Round to ten degrees, hold a one-degree tolerance, and a runway can only absorb a few degrees of drift before its label is officially a lie. Tampa wasn't an oddity. Fairbanks renumbered in 2009, 1L/19R became 2L/20R, on a cadence that works out to roughly every 24 years at that latitude. London Stansted reassigned its runway the same year, and officials there estimated about 56 years until they'd have to do it again. Same planet, same phenomenon, different drift rate at every airport.
The reference frame, it turns out, is the unreliable component. The north magnetic pole has wandered more than 1,000 kilometers since explorers first fixed its position in 1831. And lately the drift hasn't even held a steady speed: starting in the 1990s the pole accelerated toward Siberia, reaching 50–60 km/yr, and then, in the words of the geophysicists tracking it, staged "the biggest deceleration in speed we've ever seen," dropping to about 35 km/yr. The current best explanation involves two huge lobes of magnetic flux on the core–mantle boundary, one under Canada and one under Siberia, playing tug-of-war with the pole. Even the rate of change changes. Meanwhile the agonic line, the line of zero declination, where compass north and true north agree, has spent a century sweeping across the American Midwest: through Detroit in the early 1900s, then Chicago, then the Mississippi Valley; in the current World Magnetic Model it runs from northern Canada down through the Great Lakes toward the Gulf. Whole regions of the planet have watched "the compass is basically right here" become false, slowly, without anything on the ground changing at all.
Software is full of runway numbers. Most codebases are honeycombed with values that are secretly measurements of an external reference frame, written down as if they were properties of the concrete:
The vendor API response you parse, where status: "closed" meant one thing in v2 and quietly means something subtly different in v3, because the vendor's own semantics drifted underneath a field name that never changed. The regulatory threshold baked into a config: 1024-bit RSA keys were acceptable practice until NIST's key-management guidance disallowed them at the end of 2013; TLS 1.0 and 1.1 went from "best practice" to "formally deprecated" when RFC 8996 landed in March 2021. The value that counts as compliant drifts, even though your code's value sits still. The "current best practice" connection-pool setting your team copied from an authoritative blog post in 2019, which the same author would no longer recommend. The tax bracket, the emissions limit, the clinical reference range. And, the one this audience is living through right now, the large-language-model output format: a prompt and parser tuned to one model version's JSON habits, pinned to a "model" string, while the behavior behind that string drifts with every provider update, and the providers themselves publish model deprecation schedules, which means your pinned snapshot carries an expiration date whether or not you wrote one down. The parser is the runway. The model's behavior is magnetic north.
In every one of these, the bug is not the value. The value was correct when it was written, that's what makes this failure mode so well-camouflaged in code review. The bug is the assumption of fixity: treating a drifting reference as a constant. Do that, and two consequences follow as surely as compass error. First, the misalignment is silent: nothing diffs, nothing alerts, the value just becomes gradually wrong while every test that pinned the same stale assumption keeps passing. Second, the eventual correction is not a tweak but a renumbering: an expensive, error-prone, all-at-once migration touching every chart, sign, and database that ever referenced the value, except that unlike Tampa, you're doing it unplanned, after the near-miss, under incident pressure.
There's a sharper corollary hiding in the rounding math. A runway tolerates only a few degrees of drift because its labeling system is precise: round to ten degrees, hold a one-degree tolerance, and the window is narrow. The more tightly you pin to a drifting standard, the sooner you fall out of alignment. A parser that exact-matches a response string fails before one that matches structure; a hardcoded model-v3.2-20260115 drifts out from under you faster than a capability check. Precision against a moving reference isn't rigor. It's a shorter fuse.
So what do you do with a value you know will be wrong soon? Here the aviation story stops being a parable about failure and becomes something better: a working blueprint. Because the people who maintain the magnetic reference frame solved this problem, in production, decades ago.
The World Magnetic Model, the joint NOAA/British Geological Survey model that underpins runway numbers, smartphone compasses, and military navigation, is never published as "the declination in Tampa is X." It is published in five-year epochs. WMM2015 was valid from December 15, 2014 to December 31, 2019; the current edition is WMM2025. Every value carries three things a bare constant lacks: a validity window, a predicted drift rate, and a scheduled successor. The five-year cadence isn't arbitrary, it's chosen so that accumulated drift stays inside navigational tolerance between reissues. The model doesn't pretend the pole holds still; it publishes the motion as part of the value. Epoch-stamp the value, that's move one. Schedule the re-derivation, re-derive before the heading is wrong, not after, that's move two.
Move three is the one most engineering organizations miss, and 2019 demonstrated it perfectly. In early 2019, the pole was lurching toward Siberia faster than the WMM2015 predictions had assumed. Accumulated error blew past tolerance with nearly a year left on the epoch. So NOAA and the BGS did something unprecedented: they shipped an off-cycle update, WMM2015v2, about a year ahead of schedule. The lesson generalizes cleanly: a fixed cadence is necessary but not sufficient, because the drift rate itself is unstable, remember the biggest-deceleration-ever-seen. You need a monitored tolerance and a trigger, not just a calendar. The schedule is the floor of your vigilance, never the ceiling.
And then the 2019 episode delivered its twist, almost too on-the-nose: the emergency update was itself delayed about two weeks, by the 35-day U.S. federal government shutdown that ran from December 22, 2018 to January 25, 2019. The reference frame drifted out of tolerance, the correction was ready, and an entirely unrelated outage sat on the release. That's move four, the one nobody budgets for: harden the correction machinery. Your re-derivation pipeline must not depend on the system most likely to be down when drift spikes: the singular wiki page, the one engineer who understands the vendor mapping, the manual deploy path that's frozen during the incident the drift just caused. The universe has a documented sense of humor about scheduling your emergency during your outage.
If this discipline sounds exotic, it isn't. Software already runs it at planetary scale, for exactly one drifting standard. The IANA tz database treats time-zone rules as what they are: political decisions that drift, continuously, with no physical event whatsoever. Parliaments move daylight-saving dates; ministries shift offsets; in 2011 Samoa hopped the International Date Line and December 30 simply never happened there. The tz database absorbs all of it through versioned, epoch-stamped releases (2025b, 2026a) that downstream systems ingest on a cadence (operating systems and language runtimes ship tzdata refreshes as routine patch releases), reloading rules and reprocessing future timestamps. Nobody hardcodes "Samoa is UTC-11" and calls it engineering. We learned to treat timezone rules as data with a release cadence rather than constants, for an unflattering reason: time zones burned us early, visibly, and often. APIs, regulatory thresholds, best practices, and model behaviors drift on slower clocks, five-year clocks, not six-month clocks. And a reference that drifts slower than your average job tenure is indistinguishable from a constant, right up until it isn't. The engineer who pinned it has usually left by the time it breaks; their successor inherits a runway numbered for a pole that moved.
One honest seam in the analogy, because it points at the real first step: the geophysicists have it easier in one important respect. Magnetic observatories and satellites measure the field continuously, the WMM knows its drift rate, which is what makes epochs and off-cycle triggers possible at all. Your vendor does not publish a declination chart for their API semantics, and no agency emails you when "best practice" moves. So for most software runway numbers, the discipline starts a step earlier: you have to build the magnetometer. Contract tests that exercise the live vendor API in CI, not just your mocks of it. A canary that re-derives "compliant" from the current standard's text rather than from last year's ticket. Schema validation on every model response, alerting on drift in the distribution, not only on hard failure. Instrument first; you can't epoch what you don't measure.
Here's what the whole discipline looks like compressed into a config entry, the difference between a bare constant and an epoch-stamped one:
# before: a runway number
min_rsa_bits: 2048
# after: a heading with an epoch
min_rsa_bits: 2048
source: "NIST SP 800-57 Part 1 Rev. 5"
epoch: 2026-06
review_by: 2027-06 # scheduled re-derivation, on the calendar
drift_watch: nist-crypto-feed # the magnetometer: what triggers off-cycle review
owner: security-platform # who repaints the runway
Six lines. A value, the standard it's pinned to, when that pin was placed, when it gets re-derived on schedule, what signal forces the re-derivation early, and who owns the repaint. It costs almost nothing at write time, and it converts the future renumbering from an archaeology project conducted during an incident into a scheduled maintenance task with a name attached.
How far out should review_by be? The geomagnetic answer: it depends on the local drift rate, and there is no global policy. Fairbanks renumbers roughly every 24 years; Stansted expects 56, same pole, same physics, different drift at each field. Your constants are the same. The cadence for any one of them falls out of two numbers you can actually estimate: how fast the reference moves (how often has this vendor broken semantics before? how often does this standard revise?) and how much misalignment you can absorb before the error is user-visible. A volatile vendor API might earn a quarterly review; a slow-moving ISO standard, a comfortable five years, WMM-style. The interval matters less than the principle: each pinned value gets its own epoch math, derived from its own reference frame, instead of inheriting the org-wide habit of "we'll revisit it someday."
Tampa's runway will be renumbered again someday; the pole hasn't stopped, and the geophysicists wouldn't promise you its speed next decade even if you asked nicely. That part isn't optional, drift never is. What's optional is the manner of discovery. The concrete doesn't have to move for your heading to be wrong, and a value that was true when you wrote it is not a value that stays true. Stamp the epoch, schedule the repaint, wire up a magnetometer, and when magnetic north lurches, ship your v2 a year early, on purpose, instead of grinding numbers off the asphalt after the near-miss.
A bare constant forgets what it was pinned to. An epoch-stamped one remembers.
The fix in this essay is a provenance record on every drifting value: the standard it was pinned to, when, against which version, and who owns the repaint, so the future renumbering is a scheduled task instead of an archaeology dig. For an agent's decisions and outputs, that record is Chain of Consciousness: a tamper-evident trace of what an agent did, against which model and which inputs, so when the behavior behind a pinned string drifts, you can see exactly when your heading stopped matching the world.
pip install chain-of-consciousness · npm install chain-of-consciousness
Hosted Chain of Consciousness → · See it in action