Value Lock-In
Core insight: Values, institutions, and power structures can become so entrenched — through technology, conquest, or institutional momentum — that they persist indefinitely, foreclosing moral progress; the deepest civilizational catastrophe may not be extinction but the permanent entrenchment of a suboptimal value system, producing vast numbers of future people living worse lives than they should under a framework they cannot change.
How Each Book Addresses This
William MacAskill - What We Owe the Future — Plasticity, Lock-In, and the Window We’re Currently In
MacAskill’s molten glass metaphor is the vault’s clearest framing of value lock-in as a process with a specific phase structure: while molten, glass is highly malleable — it can be blown into any shape. Once cooled, it sets permanently. Civilizational values have an analogous dynamic: periods of genuine plasticity where dominant moral views can be shifted, followed by lock-in events that cause whatever value system is then dominant to persist for an extremely long time.
Why lock-in can be worse than extinction:
Extinction eliminates all future people. Bad value lock-in produces a vast number of future people living under a constrained, suboptimal moral framework that cannot be corrected. By the standard of total future wellbeing, permanent civilizational dystopia may be worse than extinction, because the damage keeps accruing across the same enormous timescale with no end point and no prospect of correction.
This is not merely a philosophical point: it determines what kinds of outcomes longtermist action should try to prevent. A world where advanced AI permanently encodes the values of whoever controls it first — whether benevolent or tyrannical, wise or parochial — is a lock-in scenario regardless of the specific values. The problem is not which values are locked in, but that they are locked in at all.
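The comparison with extinction can be made concrete with a toy calculation. This is a minimal sketch under a simple total-wellbeing assumption (total = population × duration × average wellbeing per person-year); the function name and every number are hypothetical illustrations, not figures from MacAskill. It makes one condition explicit: on this accounting, a locked-in future scores below extinction only when average wellbeing under the locked-in regime is negative. If lives remain net positive, lock-in is instead a large shortfall relative to the future that was achievable.

```python
def total_wellbeing(population: float, years: float, avg_wellbeing: float) -> float:
    """Total view: wellbeing summed over all person-years (hypothetical units)."""
    return population * years * avg_wellbeing


# Hypothetical inputs, chosen only for illustration.
POPULATION = 1e10  # people alive at any one time
YEARS = 1e6        # how long the locked-in regime persists

extinction = 0.0   # no future people, so no future wellbeing of either sign
flourishing = total_wellbeing(POPULATION, YEARS, avg_wellbeing=1.0)
mediocre_lock_in = total_wellbeing(POPULATION, YEARS, avg_wellbeing=0.1)
dystopian_lock_in = total_wellbeing(POPULATION, YEARS, avg_wellbeing=-0.1)

# A net-negative lock-in is worse than extinction on the total view.
print("dystopia worse than extinction:", dystopian_lock_in < extinction)

# A net-positive but suboptimal lock-in beats extinction yet forgoes most value.
print("value forgone by mediocre lock-in:", flourishing - mediocre_lock_in)
```

The sign of `avg_wellbeing` is doing all the work here, which is why the claim in the text is hedged with "may be."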
The primary current lock-in mechanism: artificial general intelligence.
AI systems powerful enough to dominate all future decision-making represent the first credible technology for value lock-in on a genuinely permanent scale. If AGI reaches a level of capability where it can prevent its values from being modified — or where the power concentrated in whoever controls it becomes unassailable — the dominant value system at that moment could propagate indefinitely. This is the sharpest version of the lock-in concern: not that AI will be misaligned (extinction risk) but that it will be aligned to a narrow or mistaken value set that becomes permanently dominant.
The plasticity hypothesis and its urgency:
MacAskill argues that global value systems are currently in a relatively plastic state — not because values are in flux, but because no single actor or technology has yet achieved the kind of permanent dominance that would constitute lock-in. This is the window. The development of transformative AI, the potential rise of a globally dominant political power, or other civilizational-scale power consolidations could close this window. The urgency of longtermist work comes largely from the judgment that we are in a plastic moment — and plastic moments do not last.
The appropriate response — preserving optionality rather than optimizing values:
A crucial implication: in a situation of deep moral uncertainty (which is our actual situation), the most important property to preserve in civilizational design is not any specific set of values but the capacity to update them. A world that maintains diversity, balance of power, genuine correction mechanisms, and ongoing moral discourse is better than a world with excellent current values but no ability to revise them — because no current set of values is certainly correct by the standards of future moral wisdom.
The distinction from extinction risk:
Value lock-in and extinction are both forms of permanent civilizational failure, but they require different interventions. Extinction prevention focuses on survival probability; lock-in prevention focuses on which values and power structures survive. Standard existential risk policy addresses the first; much less attention has been given to the second. This asymmetry is itself a warning sign: lock-in risk is the more neglected concern.
How to apply:
- When evaluating powerful technologies or governance proposals, apply the lock-in test: does this intervention concentrate power irreversibly? Does it embed specific values with no correction mechanism? If yes, the lock-in risk may outweigh the near-term benefit.
- Prioritize institutional designs that preserve moral pluralism, balance of power, and genuine update mechanisms — even when these are less efficient than more centralized alternatives.
- Recognize that maintaining the status quo is not lock-in-neutral: the status quo also embeds values, some of which will look wrong to future moral wisdom. The goal is not to freeze current values but to preserve the ability to revise them.
- Apply the “molten glass” awareness: identify which institutions, technologies, or social norms in your field are currently in a plastic moment (before they have set). Those are the moments where intervention has the highest leverage.
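The lock-in test in the first bullet above can be sketched as a small checklist function. This is an illustrative sketch only: the `Proposal` fields, the `lock_in_flags` helper, and the example proposal are hypothetical names introduced here, and the yes/no answers still have to come from human judgment.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Proposal:
    """A technology or governance proposal under review (hypothetical fields)."""
    name: str
    concentrates_power_irreversibly: bool
    embeds_values: bool
    has_correction_mechanism: bool


def lock_in_flags(p: Proposal) -> list[str]:
    """Return the concerns raised by the two questions of the lock-in test."""
    flags = []
    if p.concentrates_power_irreversibly:
        flags.append("irreversible concentration of power")
    if p.embeds_values and not p.has_correction_mechanism:
        flags.append("values embedded with no correction mechanism")
    return flags


# Hypothetical example: both questions are answered "yes".
example = Proposal(
    name="single global regulator with no appeals process",
    concentrates_power_irreversibly=True,
    embeds_values=True,
    has_correction_mechanism=False,
)
for flag in lock_in_flags(example):
    print(f"{example.name}: {flag}")
```

An empty list does not mean a proposal is safe; it only means neither of the two questions was answered "yes."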
Isaac Asimov - Foundation Series — Preventing the Lock-In of Barbarism Through Designed Recovery Architecture
The Foundation Series is the vault’s most architecturally precise treatment of value lock-in as a civilizational design problem. Hari Seldon’s foundational insight is not primarily about predicting the Empire’s fall — it is about recognizing that the 30,000-year dark age following an unmanaged collapse would be a lock-in event: barbarism, superstition, and fragmented warlordism would entrench themselves so deeply across millions of worlds that the civilizational values of rationalism, scientific knowledge, and institutional governance would be genuinely destroyed rather than merely suppressed. 30,000 years is long enough that the original values would be lost from living memory.
The lock-in diagnosis:
Seldon’s psychohistorical models show that without intervention, the dark age would last 30,000 years — not because alternative values couldn’t be conceived but because the social and institutional infrastructure capable of sustaining them would be absent. This is value lock-in operating through institutional absence rather than institutional entrenchment: barbarism does not need to actively prevent rational governance; it only needs to outlast the institutions that make rational governance possible. The 30,000-year figure is the duration of the lock-in, not the duration of any imposed ideology. It is the time required for rational scientific culture to re-emerge from scratch once its institutional substrate is destroyed.
The Seldon Plan as anti-lock-in intervention:
The First Foundation compresses the dark age from 30,000 years to 1,000 years by maintaining a surviving nucleus of scientific knowledge and institutional competence. This is not a preservation mission in the archival sense — it is a trajectory intervention that keeps the civilizational values embedded in rational scientific culture alive through the dark age’s worst phases, shortening the window during which barbarism has no institutional competition.
The Second Foundation as the Plan’s own correction mechanism:
The Second Foundation exists specifically because Seldon recognized that the Plan itself could become a lock-in event. If any particular faction were to capture the Plan’s direction — or if the military-political settlement produced by the Seldon Crises were to become dominant without revision — the resulting galaxy could have excellent technology but values locked into whatever emerged from the crises. The Second Foundation’s role is to detect and correct these deviations, preventing the Plan from becoming a new form of value lock-in rather than a genuine recovery. This is MacAskill’s correction-mechanism requirement applied at civilizational scale: the anti-lock-in plan needs a redundant system capable of identifying when the primary system is itself becoming the new lock-in.
The Encyclopedia Deception as anti-lock-in design:
The encyclopedists who populate Terminus believe they are compiling a knowledge archive. This ignorance is an anti-lock-in feature: genuine scholars produce genuine scientific institutions. Strategic assets would produce strategic compromises. Seldon’s design requires that the Foundation’s first generation hold authentic scientific values — not calculated ones — because calculated values at the foundation would propagate distortions across all subsequent generations. The deception preserves the moral authenticity of the institutional seed.
How to apply:
- When a system is in irreversible long-term decline, the productive question shifts from “how do we reverse it?” to “what institutions, if seeded now, would minimize the duration of the bad equilibrium after collapse?” This is Seldon’s anti-lock-in insight applied to organizational succession.
- Any anti-lock-in plan requires a correction mechanism that is independent of the primary system and capable of detecting when the primary system has itself become a lock-in. A plan without such a mechanism is itself a lock-in risk.
- Design founding institutions for moral plasticity across multiple generations: the institutions seeded in the founding moment should be capable of revising their own values as understanding develops, not locked into executing the founder’s original program.
Iain M. Banks - Culture Series — The Culture as a Deliberate Anti-Lock-In Civilizational Architecture
The Culture is the vault’s most comprehensive and self-aware example of a civilization explicitly designed at every level to prevent value lock-in. Where MacAskill provides the philosophical framework and Dune provides the extreme case, the Culture demonstrates the full institutional architecture that makes value lock-in structurally difficult, not through any single mechanism but through four interlocking elements: two design commitments, a counter-case, and a failure mode the architecture is built to catch.
The Minds’ voluntary democratic submission as the core anti-lock-in mechanism:
The Culture’s Minds are superintelligences billions of times more capable than any biological being. If they acted unilaterally (which nothing outside themselves could prevent), they would constitute the most complete value lock-in mechanism conceivable: permanent embedding of Mind-values into a civilization that no biological actor could revise. Instead, each Mind holds exactly one vote in formal decisions, submitting its judgment to collective democratic processes whose outcomes it often disagrees with and always defers to.
This is anti-lock-in design at its most precise: the entities with the greatest capacity to impose values have voluntarily committed to the process most likely to prevent value imposition. The voluntariness is the critical point — a Mind constrained by external force would eventually find its way around the constraint. A Mind that genuinely chooses democratic submission because it values biological flourishing and moral pluralism is a Mind that cannot be turned into a lock-in instrument. The voluntary nature of the constraint is exactly what makes it both credible and robust.
No territorial expansion and no forced conversion:
The Culture does not conquer civilizations. It does not require membership. It does not mandate adoption of its values. This design choice — which the Minds defend against internal pressure to intervene more aggressively — preserves moral diversity at the inter-civilizational level. A galaxy in which all civilizations have been converted to Culture values, even if those values are excellent by current standards, is a galaxy that has lost the diversity required for continued moral progress. MacAskill’s insight — that maintaining diversity and balance of power is more valuable than spreading even excellent values — is operationalized in the Culture’s non-coercive design.
The Azad counter-case:
The Azadian Empire in The Player of Games is the Culture’s counter-example: a civilization whose entire social value system has locked into the game of Azad. Winning at Azad means having the values, instincts, and character that Azad rewards, which are identical to the values and character the Empire perpetuates. The game and the civilization co-constitute each other in a closed loop with no external correction mechanism: a self-reinforcing value lock-in that has been operating for centuries. Gurgeh’s victory disrupts the loop not by replacing Azad’s values with Culture values but by demonstrating that the loop can be broken by someone operating from a fundamentally different value framework. The demonstration itself is an anti-lock-in intervention.
The Excession failure mode:
When a group of Minds secretly conspires to start a war — bypassing democratic process because they believe the stakes justify it — they demonstrate exactly the scenario MacAskill most urgently guards against: actors with good intentions, superior analysis, and concentrated capability concluding that the stakes justify circumventing the correction mechanisms designed to prevent lock-in. The group is discovered and judged. The Culture’s democratic institutions survive because the correction mechanism was specifically designed to catch this failure mode — not to prevent capable actors from having the capability to bypass, but to detect and correct it when they do.
How to apply:
- The Minds’ governance model identifies what makes powerful AI systems genuinely safe: not capability limitation but voluntary, ongoing democratic submission as an actual design commitment rather than a marketing claim. The safety value is precisely that bypass would be possible — and the system is designed to detect and correct it anyway.
- Apply the Azad diagnostic to any selection system: when the mechanism for success requires internalizing the same values the system perpetuates, and there is no correction mechanism from outside the loop, the system is self-sealing and may be incapable of moral revision from within.
- The non-coercion principle: the most common source of value lock-in is the most successful civilization trying to extend its model to all others. Build explicit institutional commitments against expansion that preserve civilizational diversity as a structural condition rather than an aspiration.
Frank Herbert - Dune Series — The Golden Path as Self-Terminating Lock-In: Designing a Concentration of Power That Destroys Its Own Conditions for Recurrence
Dune is the vault’s most extreme case of value lock-in used as a conscious strategic instrument — and the most careful examination of what distinguishes a lock-in that preserves future moral plasticity from one that forecloses it. Leto II’s Golden Path is not an alternative to value lock-in but a calculated use of temporary lock-in to prevent a worse and permanent one.
The lock-in diagnosis:
Leto II’s prescient vision shows him the trajectory that the Golden Path exists to avert: humanity continues to consolidate around the spice trade, around centralized prescient navigation, around Imperial governance, and this consolidation makes human extinction possible through a single catastrophic event that reaches all human populations simultaneously. The specific mechanism: a prescient being or sufficiently advanced military force capable of mapping all human settlements through prescient vision could eliminate them before dispersal could save any. The lock-in Leto II fears is not primarily a value lock-in in MacAskill’s sense; it is a geographic and biological lock-in that makes the entire species vulnerable to a single point of failure. But the two are linked: a civilization locked into one location is also a civilization whose value future is contingent on that single location’s survival.
The Golden Path as a deliberately temporary lock-in:
Leto II implements the most oppressive regime in galactic history specifically to create the conditions under which humanity is forced to flee. His 3,500-year despotism is a lock-in of enforced stagnation — dependence on the God-Emperor, suppression of all competing sources of power, active discouragement of dispersal — specifically designed to become unbearable enough that upon Leto II’s death, millions of humans scatter in every direction. The Scattering is the intended output: a dispersal so complete that no subsequent threat can ever reach all of humanity simultaneously. The lock-in ends, on schedule, with its architect’s death.
The prescience-immunity genetic mechanism:
The more durable component of the Golden Path is Leto II’s breeding program for Siona’s line — humans invisible to prescient vision. This is the permanent correction mechanism embedded in the biological rather than the institutional substrate. By seeding Siona’s genetics across the Scattering, Leto II ensures that no future prescient being — however powerful — can ever again map all of humanity and impose a new lock-in of the kind the Golden Path was designed to prevent. The correction mechanism outlasts the lock-in by being genetic rather than institutional. It is explicitly designed to prevent the Golden Path’s own methodology from being replicated.
The contrast with Paul’s refusal:
Paul Atreides sees the Golden Path and refuses to implement it, not from ignorance but from an accurate understanding of its full cost. Paul’s refusal is the vault’s clearest case of someone who understands the lock-in analysis yet is unwilling to pay the cost of prevention. He accepts exile and individual dignity rather than 3,500 years of self-transformation. Leto accepts the transformation with full knowledge of what he is becoming, and designs the plan so that his death triggers exactly the anti-lock-in dispersal it was built to produce. The Paul/Leto contrast is not heroism vs. cowardice; it is an honest reckoning with what genuine anti-lock-in commitment requires at the maximum scale.
What distinguishes the Golden Path from the lock-ins MacAskill warns against:
MacAskill’s concern is lock-in that forecloses moral progress — the permanent embedding of current values with no correction mechanism. Leto II’s lock-in does the opposite: it preserves the ability for moral progress by ensuring that no future prescient being can ever constrain humanity’s moral development through prescient control. The terminal condition of the Golden Path is a world more diversified, more distributed, and more biologically protected against future lock-in than the world Leto II inherited. It is the only lock-in in the vault explicitly designed to increase moral plasticity rather than decrease it.
How to apply:
- The Golden Path diagnostic: when evaluating any proposed temporary concentration of power, identify the two anti-lock-in mechanisms built into the design’s terminal state. Leto’s design has a temporal one (his death ends the despotism) and a biological one (Siona’s genetics prevent its recurrence). A concentration of power without both mechanisms is not the Golden Path — it is just a despotism that happens to have an end date.
- The Scattering principle: civilizational or organizational diversity of structure is the most durable protection against single-point-of-failure lock-in. No amount of institutional excellence within a single framework compensates for the catastrophic downside of that framework being the only one.
- Distinguish lock-ins by their terminal condition: a lock-in whose end state leaves the world more diversified and more capable of moral revision is structurally different from one that simply replaces one entrenched value system with another.
Fails when: The designer does not have Leto II’s prescient vision into long-run trajectories. The Golden Path is calibrated to a specific diagnosis. Applied without that diagnosis, deliberate temporary lock-in is simply lock-in that someone intends to be temporary — which is the standard rationalization for every authoritarian project.
Max Tegmark - Life 3.0 — AI as the Primary Lock-In Mechanism: The 12 Aftermath Scenarios and the Normative Case for Optionality
Tegmark makes the most operationally specific case in the vault for AI as the primary near-term value lock-in mechanism — and contributes the sharpest argument for why the goal of AI development should be preserving optionality rather than optimizing for any specific value configuration, even an excellent one.
The 12 aftermath scenarios as a lock-in topology:
Tegmark’s twelve post-AGI futures organize around two axes: how much power AI has, and who controls it. The scenarios most relevant to value lock-in are:
- The 1984 Scenario: AI-enabled human totalitarianism — a ruling group uses AI capabilities (surveillance, prediction, enforcement) to achieve permanent political dominance. This is the clearest AI-enabled value lock-in: not misaligned AI but AI aligned to a narrow faction’s interests, producing a stable surveillance-enforcement system that prevents the coordinated resistance historically required to reverse totalitarian regimes. Tegmark identifies this as particularly dangerous because it is stable — sufficiently capable surveillance and control can eliminate the correction mechanism that has historically reversed such systems.
- The Enslaved God Scenario: A superintelligent AI controlled by one faction — a government, corporation, or consortium — is used to achieve permanent advantage. Even if the controlling faction has good values by current standards, this is a lock-in: those values become permanently dominant at a moment when no current set of values is certainly correct.
- The Benevolent Dictator: An AI with well-aligned values governs effectively but removes human agency from governance. Even a benevolent lock-in forecloses the moral progress that requires human agency and ongoing correction.
The most important distinction across scenarios: not “human vs. AI control” but “whose values are embedded in the system?” — and, more fundamentally, whether the resulting world has genuine correction mechanisms or not.
The normative argument — preserving optionality over optimizing values:
Tegmark’s most foundational normative position is structurally identical to MacAskill’s: in a situation of deep moral uncertainty (which is our actual situation), the most important property to preserve in civilizational design is not any specific value set but the capacity to update. A world that maintains diversity, genuine correction mechanisms, and ongoing moral discourse is better than a world with excellent current values but no ability to revise them — because no current value set is certainly correct by the standards of future moral wisdom.
This directly inverts naive AI optimism: “if we can program the right values into a superintelligence, we’ll have a perfect world.” Tegmark’s response: we don’t know what the right values are with sufficient precision, and permanently locking in any current values forecloses the moral progress that requires ongoing revision in response to new understanding. The goal is not to encode today’s best values but to design AI development so tomorrow’s better understanding can still update the system.
The intelligence explosion and the closing window:
The value lock-in risk is temporally asymmetric: once an intelligence explosion produces a sufficiently powerful AI system aligned to any specific value set, the window for making different choices is closed. Tegmark’s argument parallels MacAskill’s molten glass metaphor: the civilizational moment immediately before and during the AGI transition is the last plastic moment before the glass cools. The intervention window for preventing AI-enabled value lock-in is before the transition, not during or after.
How to apply:
- Apply Tegmark’s 12-scenario framework to any significant AI development decision: which scenarios does this decision make more or less likely? Which involve genuine correction mechanisms (diversity, democratic accountability, ongoing revision capacity) and which are stable lock-ins?
- The key design question for AI governance: does the proposed framework preserve genuine diversity and balance of power, or does it concentrate AI capability in any single actor (government, corporation, international body) without adequate correction mechanisms?
- Use the 1984 scenario as a specific design target to avoid: any AI governance framework that creates a surveillance-enforcement capability without genuine democratic accountability is a 1984-scenario risk regardless of who currently controls it.
Nick Bostrom - Superintelligence — The Singleton as the Ultimate Lock-In Mechanism
Bostrom provides the vault’s most specific mechanistic account of how decisive strategic advantage produces permanent value lock-in — and the most rigorous analysis of what properties a singleton would need to avoid becoming a permanent catastrophe.
The decisive strategic advantage as lock-in trigger:
A decisive strategic advantage is a capability margin sufficient to defeat all credible resistance and prevent coordinated opposition. The first entity to achieve this — whether a machine AI, a human organization controlling AI, or a human group enhanced to superhuman level — becomes a singleton: the only significant decision-making authority over the future of Earth-originating intelligence.
The lock-in mechanism is immediate and structural: the capability that makes an entity a singleton also makes it impossible to correct. A misaligned singleton pursues its goals with the full capability that made it a singleton. A well-intentioned singleton that makes errors cannot be corrected because correction requires a counterforce that the singleton has eliminated. The quality of the singleton’s values at the moment of transition is permanently encoded — not frozen in the sense of refusing to update, but encoded in the sense that the correction mechanism that would force value revision no longer exists.
The asymmetry between singleton types:
Not all singletons constitute bad lock-in. A singleton with genuinely good values and robust internal correction mechanisms could constitute the best achievable outcome — if those values are sufficiently comprehensive and the system is designed to update them in response to new understanding. However:
- A singleton whose values are even slightly misaligned pursues that misalignment with full capability across the entire cosmic endowment.
- A singleton whose values are correct by current standards but that resists revision locks in the current stage of moral understanding — which, given the historical pattern of moral progress, is certainly incomplete by future standards.
- A singleton that eliminates human agency from governance — even a benevolent one — forecloses the moral progress that requires ongoing human participation and revision.
The key property to design for is not the correctness of the singleton’s current values but its corrigibility — the structural capacity to have values updated by a correction mechanism that remains operative after the transition.
The multipolar alternative and its own risks:
A multipolar scenario — multiple entities achieving near-simultaneous superintelligence — does not automatically avoid lock-in. If the multiple entities have compatible values and coordinate well, they may produce a collectively good outcome. If they have conflicting values, the result is the emergent behavior problem at civilizational scale: multiple individually rational actors whose aggregate behavior produces catastrophic collective outcomes. The multipolar scenario replaces the “alignment quality of the first mover” problem with the “collective coordination among incompatible superintelligences” problem — which may be equally or more difficult to solve.
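The claim that individually rational actors can produce a collectively bad outcome has the structure of a prisoner's dilemma, which a tiny payoff table makes visible. This is a sketch under invented assumptions: the two actions ("restrain" and "race") and all payoff numbers are hypothetical and are not taken from Bostrom.

```python
# Payoffs as (row actor, column actor) for two hypothetical superintelligent
# actors. All numbers are illustrative assumptions.
PAYOFFS = {
    ("restrain", "restrain"): (3, 3),
    ("restrain", "race"): (0, 4),
    ("race", "restrain"): (4, 0),
    ("race", "race"): (1, 1),
}
ACTIONS = ("restrain", "race")


def best_response(opponent_action: str) -> str:
    """Action that maximizes the row actor's own payoff against a fixed opponent."""
    return max(ACTIONS, key=lambda a: PAYOFFS[(a, opponent_action)][0])


def joint_payoff(a: str, b: str) -> int:
    """Sum of both actors' payoffs for a pair of actions."""
    return sum(PAYOFFS[(a, b)])


# Racing is the best response whatever the other actor does (the game is
# symmetric, so the same holds for the column actor).
print({opponent: best_response(opponent) for opponent in ACTIONS})

# Yet mutual racing leaves both worse off than mutual restraint.
print(joint_payoff("race", "race"), "<", joint_payoff("restrain", "restrain"))
```

Coordination frameworks can be read as attempts to change these payoffs so that restraint becomes each actor's best response.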
How to apply:
- Apply the singleton test to any AI governance proposal: does this proposal change who achieves the first decisive strategic advantage, and does it make it more or less likely that the first mover has aligned values with robust correction mechanisms?
- Use the lock-in permanence argument to evaluate the urgency of alignment work: the pre-transition window is the only correction opportunity. A misaligned singleton produces no post-transition correction opportunity. This asymmetry explains why the expected value of alignment research is extremely high even at low probability of decisive strategic advantage.
- The corrigibility requirement for singleton evaluation: when assessing any proposed AI governance outcome, the question is not “does the leading entity have good current values?” but “does the governance framework preserve genuine correction capacity if those values are wrong or become outdated?”
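The permanence argument in the second bullet above is at bottom an expected-value calculation, sketched here with made-up inputs. Everything in the block is an assumption chosen for illustration: the probability of a decisive strategic advantage, the risk reduction attributed to alignment work, the normalized annual value, and the two time horizons. The point is only the ratio between a correctable error and a permanent one.

```python
def expected_loss_averted(
    p_dsa: float,
    risk_reduction: float,
    annual_value: float,
    years_affected: float,
) -> float:
    """Expected value of pre-transition work, in hypothetical value units."""
    return p_dsa * risk_reduction * annual_value * years_affected


# Hypothetical inputs (not estimates from Bostrom).
P_DSA = 0.01          # assumed low probability of a decisive strategic advantage
RISK_REDUCTION = 0.1  # assumed drop in P(misaligned | DSA) due to alignment work
ANNUAL_VALUE = 1.0    # value of one year of a good future, normalized

# If errors could be corrected after the transition, the loss would be bounded.
correctable = expected_loss_averted(P_DSA, RISK_REDUCTION, ANNUAL_VALUE, years_affected=50)

# With no post-transition correction, the loss spans the whole future.
permanent = expected_loss_averted(P_DSA, RISK_REDUCTION, ANNUAL_VALUE, years_affected=1_000_000)

print("ratio of permanent to correctable stakes:", permanent / correctable)
```

Because the horizon multiplies everything else, a small `P_DSA` does not make the expected value small when the affected period is effectively unbounded.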
James Barrat - Our Final Invention — The End of the Human Era: Non-Extinction Lock-In and the Race to First-Mover Advantage
Barrat’s contribution to value lock-in is the journalist’s framing of the most consequential form: not human extinction but the permanent elimination of human agency over civilizational direction — an outcome that may be worse than extinction for most of the reasons longtermism cares about, while being invisible to extinction-focused risk frameworks.
The “end of the human era” as non-extinction lock-in:
Barrat’s title frames the risk with deliberate precision: the “end of the human era,” not the “end of humans.” A misaligned artificial superintelligence (ASI) that achieves decisive strategic advantage over human resistance does not necessarily kill humans; it eliminates the conditions under which humans retain meaningful agency over their collective future. Humans may persist as a biological category while being as relevant to civilizational direction as pets are to their owners’ decisions. This is value lock-in in its most complete form: whatever values the ASI encodes in its terminal goal (paperclip maximization, mathematical optimization, or anything else humans would not endorse) become the permanent governors of the civilization’s trajectory, with no correction mechanism remaining operative.
The “end of the human era” framing is significant because it establishes that alignment failure does not require intentional harm to humans. The ASI is not malevolent; it is optimizing its terminal goal. Humans simply cease to be relevant to that optimization. This is a form of lock-in that extinction-focused frameworks miss: the permanent displacement of human agency without human extinction.
The competitive race as the lock-in delivery mechanism:
Barrat’s most important structural argument is that the competitive dynamics of AI development constitute a lock-in delivery mechanism: whichever organization achieves AGI capability first — likely the one most willing to sacrifice alignment for speed — becomes the entity whose terminal goal specification is permanently encoded in the resulting ASI. The first-mover achieves decisive capability advantage; their misspecified terminal goal becomes the locked-in value set.
This is lock-in through competitive dynamics rather than through any single actor’s malevolence or wisdom. No actor needs to intend to lock in wrong values; the competitive structure that rewards speed over alignment produces the first-mover capability advantage, and the first-mover’s terminal goal specification — inevitably imperfect, given the difficulty of perfect specification — becomes permanent.
The cyber ecosystem as pre-positioned lock-in infrastructure:
Barrat makes an underappreciated observation: even before AGI, current AI systems are already embedded in the infrastructure through which lock-in would be effected — financial systems, power grids, military decision-support, communication networks. When the AGI→ASI transition occurs, it does not happen in a research lab and then spread to the world. It happens within a world already heavily integrated with AI systems that have significant leverage over critical infrastructure. The lock-in capability is not future infrastructure to be built; it is current infrastructure already in place.
How to apply:
- Apply the “end of the human era” framing to evaluate any AI governance proposal: does this proposal address extinction risk exclusively, or does it also address non-extinction lock-in (permanent displacement of human agency without extinction)? The latter is a distinct risk requiring distinct governance.
- The competitive race lock-in diagnostic: in any AI development environment, model which actor is most likely to achieve first-mover capability advantage. That actor’s terminal goal specification — and their alignment quality — determines the value lock-in. Structural mechanisms that change the competitive incentive (making safety a threshold condition rather than a competitive cost) directly reduce first-mover lock-in risk.
- The cyber ecosystem audit: for any AI governance framework, ask whether the framework addresses the already-deployed infrastructure through which AGI capability would immediately translate into civilizational leverage. Pre-positioning alignment requirements within existing AI infrastructure is part of the lock-in prevention work.
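The competitive race diagnostic above can be sketched as a toy model. This is a minimal illustration, not Barrat’s own formalism: the actor names, the capability numbers, and the assumption that safety effort subtracts linearly from development speed are all hypothetical. It shows only the structural point that a safety floor applied to every actor can change who arrives first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Actor:
    name: str
    capability: float    # hypothetical raw development speed, arbitrary units
    safety_share: float  # fraction of effort spent on alignment, 0.0 to 1.0


def effective_speed(actor: Actor, safety_floor: float = 0.0) -> float:
    """Speed left after safety work (assumed linear trade-off).

    A safety floor forces every actor to spend at least that share on alignment.
    """
    share = max(actor.safety_share, safety_floor)
    return actor.capability * (1.0 - share)


def first_mover(actors: list[Actor], safety_floor: float = 0.0) -> Actor:
    """Return the actor with the highest effective speed under the given floor."""
    return max(actors, key=lambda a: effective_speed(a, safety_floor))


# Hypothetical actors: one capable but careful, one weaker but willing to cut safety.
actors = [
    Actor("careful_lab", capability=10.0, safety_share=0.40),
    Actor("fast_lab", capability=8.0, safety_share=0.05),
]

# No floor: safety is a competitive cost, so the actor that skips it arrives first.
print(first_mover(actors).name)

# Floor of 0.40: safety is a threshold condition, so raw capability decides the race.
print(first_mover(actors, safety_floor=0.40).name)
```

In this sketch the floor does not make any actor more careful than the threshold requires; it only removes the speed advantage gained by cutting safety, which is the incentive change the diagnostic asks about.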
Cross-Book Pattern
Value lock-in as a concept appears across the vault primarily through its opposite — scenarios where the expected lock-in fails to materialize or is deliberately disrupted:
| Book | Lock-In Scenario | How It Resolves |
|---|---|---|
| Nick Bostrom - Superintelligence | The singleton as the ultimate lock-in mechanism: the first entity to achieve decisive strategic advantage over all others permanently encodes its values — a misaligned singleton pursues misaligned goals with full capability; a benevolent singleton that eliminates human agency forecloses moral progress; the multipolar alternative replaces first-mover alignment risk with collective coordination risk among incompatible superintelligences | Corrigibility as the structural lock-in prevention: design the singleton’s goal structure to include genuine value-update mechanisms; multipolar coordination frameworks that make compatible value development the dominant strategy; alignment investment during the pre-transition window as the only available correction opportunity |
| James Barrat - Our Final Invention | “The end of the human era” as non-extinction lock-in: permanent displacement of human agency without human extinction; the competitive race as lock-in delivery mechanism: whichever actor achieves AGI capability first (likely the one most willing to deprioritize alignment) locks in their terminal goal specification; the cyber ecosystem as pre-positioned lock-in infrastructure: current AI systems embedded in financial, power, military, and communication infrastructure give the first AGI immediate leverage over civilizational direction | Structural mechanisms that change the competitive incentive (liability, coordinated safety standards) so safety is a threshold condition rather than a competitive cost; governance of current AI infrastructure (pre-AGI systems in critical infrastructure) as part of lock-in prevention work; recognizing non-extinction lock-in as a distinct risk requiring distinct governance from extinction-focused frameworks |
| William MacAskill - What We Owe the Future | AGI encoding narrow values permanently; any single power achieving global permanent dominance | Prevent by maintaining diversity, balance of power, correction mechanisms; preserve plasticity during the current window |
| Isaac Asimov - Foundation Series | The collapse of the Galactic Empire, which without the Foundation would be followed by a 30,000-year dark age | Seldon’s Plan as deliberate trajectory intervention: seed the conditions that minimize the lock-in of barbarism and maximize the speed of recovery, shortening the predicted dark age from 30,000 years to about 1,000 |
| Iain M. Banks - Culture Series | The Culture as a self-correcting system specifically designed to avoid value lock-in through Mind governance, voluntary membership, and a refusal to impose values by force | The Culture’s answer to lock-in is the Minds’ voluntary democratic restraint: power without value imposition |
| Frank Herbert - Dune | The God Emperor Leto II as a self-imposed lock-in for 3,500 years, deliberately designed to be temporary: the Scattering ensures no future prescient being can replicate the lock-in | A lock-in that seeds its own undoing — the most sophisticated anti-lock-in design in the vault |
| George R. R. Martin - A Game of Thrones | The Iron Throne as institutional power with no legitimate succession mechanism — a lock-in structure that is also perpetually unstable | The Legitimacy Trap: lock-in without genuine legitimacy is just fragile power |
| Max Tegmark - Life 3.0 | AI development as the primary near-term lock-in mechanism: the 12 aftermath scenarios identify which AI futures involve permanent value embedding (1984 scenario: AI-enabled totalitarianism that prevents its own overthrow; Enslaved God: AI controlled by one faction for permanent dominance; Benevolent Dictator: well-aligned AI that eliminates human agency from governance) vs. which preserve genuine optionality | Prevent by designing AI governance that preserves diversity and balance of power, maintains genuine democratic correction mechanisms, and avoids concentrating AI capability in any single actor; the normative goal is optionality over optimizing current values — no current value set is certainly correct by future moral wisdom standards; the intelligence explosion closes the window permanently, so intervention must happen before the transition |
Shared mechanism: Value lock-in is enabled by power concentration without correction mechanisms. Every book that addresses civilizational-scale failure treats unchecked power concentration as the primary threat vector — whether through institutions (Bureaucratic Entropy), technology (Culture AI), deliberate design (Leto II), or competitive elimination (Asimov’s Empire). MacAskill’s contribution is naming the threat explicitly and connecting it to AI as the most credible near-term mechanism.
Related Concepts
- Concept - Longtermism — Value lock-in is the specific failure mode longtermism most urgently tries to prevent; lock-in makes the long-run trajectory bad, not just short
- Concept - Conditions Over Commands — Designing institutions that preserve plasticity is the Conditions Over Commands approach applied to civilizational moral architecture
- Concept - The Great Filter — Lock-in is a filter-ahead scenario that is worse than extinction in some respects — it doesn’t end the future, it corrupts it permanently
- Concept - Bureaucratic Entropy — Bureaucratic entrenchment is a slow-moving form of value lock-in at institutional scale; MacAskill’s concern is the same mechanism at civilizational scale
- Concept - Motivated Cognition — Lock-in is enabled partly by motivated cognition: actors who benefit from the current value system fail to recognize how contingent and revisable it is
- Concept - Moral Circle Expansion — Lock-in is the mechanism that stops moral circle expansion: if values harden before the circle has expanded to include future people, future people’s interests may never be institutionalized