Maintenance, repair & spares economy
The capability of keeping the settlement's machines running across 26-month resupply gaps with no instant supply chain — through reliability engineering, repair, remanufacturing, and provisioned spares. It turns every component's failure rate into a survival calculation: carry a spare, make one locally, repair the failed unit, or lose the function. The governing trade is the spares-mass problem — reliability alone cannot guarantee a long Mars mission, so in-situ repair and manufacturing must cover the gap.
Governing equations
Availability is the fraction of time a system is up: A = MTBF / (MTBF + MTTR), where MTBF is mean time between failures and MTTR is mean time to repair. For life-critical systems A must approach 1, which on Mars means either very high MTBF (heavy and expensive) or low MTTR through fast local repair, and usually both. [1]
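As a rough illustration of the availability relation, the sketch below computes A from MTBF and MTTR and inverts it to find the longest repair time that still meets a target. The function names and the 10,000-hour MTBF are assumptions chosen for illustration, not values from the cited sources.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability A = MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def max_mttr_for_target(mtbf_hours: float, target: float) -> float:
    """Longest MTTR that still meets the target: MTTR = MTBF * (1 - A) / A."""
    if not 0 < target <= 1:
        raise ValueError("target availability must be in (0, 1]")
    return mtbf_hours * (1.0 - target) / target


# Hypothetical unit with 10,000 h between failures (assumed figure).
MTBF_HOURS = 10_000.0
for target in (0.999, 0.99999):
    limit = max_mttr_for_target(MTBF_HOURS, target)
    print(f"A >= {target}: repair must finish within {limit:.2f} h")
```

With these assumed numbers, a three-nines target tolerates roughly ten hours of repair time per failure, while five nines leaves about six minutes, which is why the highest targets lean on redundancy as well as fast repair.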
Poisson statistics size the spares: given failure rate λ over the resupply interval t, the expected number of failures is λ·t, and the stock is the smallest number of spares that gives an acceptable probability of not running out. The 26-month t makes the required inventory large. [2]
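A minimal sketch of Poisson spares sizing, assuming a constant failure rate and independent failures. The function names and the rate of 0.1 failures per month are hypothetical, picked only to make the arithmetic concrete.

```python
import math


def poisson_cdf(k: int, mu: float) -> float:
    """P(N <= k) for N ~ Poisson(mu), summed term by term."""
    term = math.exp(-mu)  # P(N = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i  # P(N = i) from P(N = i - 1)
        total += term
    return total


def spares_needed(rate_per_month: float, months: float, confidence: float) -> int:
    """Smallest stock k such that P(failures <= k) >= confidence."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    mu = rate_per_month * months  # expected failures over the gap
    k = 0
    while poisson_cdf(k, mu) < confidence:
        k += 1
    return k


# Assumed part: 0.1 failures per month, 26-month resupply gap.
for confidence in (0.99, 0.999):
    print(confidence, spares_needed(0.1, 26, confidence))
```

For this assumed part the expected demand is 2.6 failures over the gap, yet 99% confidence of not running out takes 7 spares and 99.9% takes 9: the stock is set by the tail of the distribution, not the mean.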
Spare mass scales with failure rate × resupply gap × part mass — for a long mission this grows beyond what reliability alone can carry, which is the quantitative case for in-situ repair and manufacturing over pure stockpiling. [2]
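The scaling can be sketched as a sum of failure rate × resupply gap × part mass over a parts list. This gives the expected replacement mass only (the mean demand, before any confidence margin), and the three parts and their rates and masses below are invented for illustration.

```python
def expected_spares_mass(parts, months):
    """Sum of failure rate x resupply gap x part mass over all part types, in kg."""
    return sum(rate * months * mass_kg for _name, rate, mass_kg in parts)


# Hypothetical parts list: (name, failures per month, mass in kg).
PARTS = [
    ("pump", 0.02, 12.0),
    ("valve", 0.05, 1.5),
    ("fan", 0.04, 0.8),
]

for gap_months in (6, 26):
    mass = expected_spares_mass(PARTS, gap_months)
    print(f"{gap_months}-month gap: {mass:.3f} kg expected replacement mass")
```

Because the relation is linear in t, stretching the gap from 6 to 26 months multiplies the expected mass by 26/6 for the same hardware, before the Poisson margin is added on top.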
Standardizing on fewer part types lets one spare cover many uses (pooling), cutting total inventory — design commonality is a first-order spares-mass lever. [2]
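The pooling effect can be illustrated with the same Poisson sizing: four applications that each use a unique part are compared with four applications sharing one common part. This is a sketch under assumed numbers (0.025 failures per month per application, 99% confidence), and the helper name is hypothetical.

```python
import math


def spares_needed(mu: float, confidence: float) -> int:
    """Smallest k with P(Poisson(mu) <= k) >= confidence."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    term = math.exp(-mu)  # P(N = 0)
    cdf = term
    k = 0
    while cdf < confidence:
        k += 1
        term *= mu / k  # P(N = k) from P(N = k - 1)
        cdf += term
    return k


GAP_MONTHS = 26
RATE_PER_USE = 0.025  # assumed failures per month in each application
N_USES = 4
CONFIDENCE = 0.99

mu_each = RATE_PER_USE * GAP_MONTHS
# Unique parts: every application needs its own stock at the target confidence.
separate_total = N_USES * spares_needed(mu_each, CONFIDENCE)
# Common part: one shared stock covers the summed demand.
pooled_total = spares_needed(N_USES * mu_each, CONFIDENCE)

print("separate stocks:", separate_total)
print("pooled stock:   ", pooled_total)
```

Under these assumptions the four separate stocks need 3 spares each (12 in total), while a single pooled stock needs 7 for the same per-stock confidence, since the shared inventory absorbs the random variation across uses.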
Key constants & quantities
| Quantity | Value | Units | Conditions | Description |
|---|---|---|---|---|
| Resupply interval | 26 | months (launch-window-locked) | — | The gap a colony must survive without resupply — the t that drives the entire spares calculation, fixed by orbital mechanics.[2] |
| Life-critical availability | 0.999–0.99999 | A (target) | — | Availability required of life-support and other life-critical functions — extreme, demanding redundancy plus fast repair.[1] |
| Spares mass fraction | 10–40 | % of system mass over mission life | — | Order-of-magnitude spares mass a long Mars mission must carry under a stockpile-only strategy — large enough that repair/ISM is worth heavy investment.[2] |
| Repair vs replace crossover | 1 | favors repair as λ·t·m grows | — | The point where repairing/remanufacturing beats stocking spares — reached quickly on Mars because t (26 mo) is large.[2] |
| Locally-remanufacturable fraction | 40–90 | % of failed parts (with machine tools + AM) | — | Share of failures a colony with machining and additive manufacturing can repair or remake — the rest (chips, membranes, catalysts) need spares.[3] |
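The repair-versus-replace crossover in the table can be sketched with a deliberately simple mass model. It is an assumption of this example, not a model from the cited sources: a stockpile strategy carries the full expected failed-part mass λ·t·m, while a repair strategy carries a fixed equipment mass plus a make-up fraction of that demand for feedstock and consumables. All names and figures are hypothetical.

```python
def stockpile_mass(demand_kg_per_month: float, months: float) -> float:
    """Stockpile-only mass: expected failed-part mass over the gap (sum of lambda*m, times t)."""
    return demand_kg_per_month * months


def repair_mass(equipment_kg: float, demand_kg_per_month: float,
                months: float, makeup_fraction: float) -> float:
    """Repair strategy: fixed tooling plus imported make-up material."""
    return equipment_kg + makeup_fraction * demand_kg_per_month * months


def crossover_months(equipment_kg: float, demand_kg_per_month: float,
                     makeup_fraction: float) -> float:
    """Gap length at which both strategies carry the same mass."""
    return equipment_kg / (demand_kg_per_month * (1.0 - makeup_fraction))


# Assumed figures: 600 kg of tools, 50 kg/month of failed parts, 20% make-up.
EQUIPMENT_KG, DEMAND_KG_PER_MONTH, MAKEUP = 600.0, 50.0, 0.2

print("crossover (months):", crossover_months(EQUIPMENT_KG, DEMAND_KG_PER_MONTH, MAKEUP))
print("stockpile at 26 mo:", stockpile_mass(DEMAND_KG_PER_MONTH, 26))
print("repair at 26 mo:   ", repair_mass(EQUIPMENT_KG, DEMAND_KG_PER_MONTH, 26, MAKEUP))
```

With these assumed inputs the strategies break even at about 15 months, so a 26-month gap already sits on the repair side of the crossover. The model only applies to the locally remakeable share of parts; un-makeable items still need stocked spares.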
Operating envelope
Mass balance
Basis: sustaining the operating base across one 26-month resupply gap (capability)
Inputs
| Input | Quantity | Basis | Ref |
|---|---|---|---|
| Provisioned spares (imported + local) | 1 | sized to λ·t | [2] |
| Repair/remanufacturing capability | 1 | machine tools + AM + diagnostics | [3] |
| Failed components (feedstock) | 1 | recycled/repaired | [1] |
- Provisioned spares (imported + local): Critical irreplaceables (chips, membranes, catalysts, seals) stocked deep; commodity parts made locally.
- Repair/remanufacturing capability: Machining, metal/polymer printing, electronics rework, welding — the local fix-it base.
- Failed components (feedstock): Broken parts are repaired, remanufactured, or recycled to material — not waste.
Outputs
| Output | Quantity | Basis | Ref |
|---|---|---|---|
| Sustained system availability | 1 | A → target | [1] |
- Sustained system availability: Functions kept running through the gap; life-critical systems above their availability targets.
Maintenance itself is not energy-heavy, but it is the multiplier on every other node's value: a reactor or chemistry plant that can't be kept running is worthless. Availability, not nameplate capacity, is what actually delivers over a mission.
Variants & trade-offs
Reliability + provisioned spares (baseline)
Design for high MTBF and carry Poisson-sized spares for the resupply gap: the conventional approach, extended to Mars timescales. [2]
- Predictable; proven; simple for irreplaceable parts (chips, membranes, catalysts)
- No local manufacturing needed for the stocked items
- Spares mass grows large over a long mission; can't anticipate every failure
- Obsolescence and inventory management overhead
When preferred: Irreplaceable/critical parts and early missions before local repair matures.
In-situ repair & remanufacturing
Fix failed units and remake commodity parts locally with machining, additive manufacturing, welding, and electronics rework, repairing rather than stockpiling. [3]
- Slashes spares mass for the large fraction of parts that are locally makeable
- Adapts to unanticipated failures; turns broken parts into feedstock
- Can't remake chips, membranes, catalysts, or seeds; needs the manufacturing base and skills
When preferred: Commodity mechanical/structural parts; the core of a self-sufficient maintenance economy.
Redundancy + graceful degradation
Run parallel/redundant units and design systems to degrade gracefully, so a single failure doesn't stop a function while repair proceeds. [1]
- Maintains availability through failures; buys time for repair
- Essential for life-critical functions
- Extra mass/complexity; redundant units can share common-mode failures
When preferred: Life-critical and single-point-of-failure systems (power, life support, key rotating equipment).
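A small sketch of how redundancy raises availability, assuming identical units that fail independently and a function that survives while any one unit works. Common-mode failures break that independence assumption, so real figures would be lower. The function names and the 0.95 unit availability are illustrative.

```python
def parallel_availability(unit_availability: float, n_units: int) -> float:
    """Availability of n independent parallel units: 1 - (1 - A)^n."""
    return 1.0 - (1.0 - unit_availability) ** n_units


def units_for_target(unit_availability: float, target: float,
                     max_units: int = 20) -> int:
    """Smallest number of parallel units that meets the target availability."""
    for n in range(1, max_units + 1):
        if parallel_availability(unit_availability, n) >= target:
            return n
    raise ValueError("target not reachable within max_units")


# Assumed single-unit availability of 0.95.
for target in (0.999, 0.99999):
    print(target, units_for_target(0.95, target))
```

Under the independence assumption, units that are each available 95% of the time reach three nines with 3 in parallel and five nines with 4, which shows how redundancy buys availability that a single unit's MTBF cannot.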
Condition-based / predictive maintenance
Monitor equipment health (vibration, temperature, performance) to fix things before they fail and to avoid unnecessary teardowns. [1]
- Catches failures early; optimizes spare use and downtime
- Reduces both surprise failures and wasteful preventive swaps
- Depends on the very electronics/sensors that are themselves import-dependent
When preferred: High-value rotating equipment (compressors, pumps, reactor systems) across the plant.
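One simple form of condition-based monitoring is to fit a trend to a health indicator and estimate when it will cross an alarm limit. The sketch below uses an ordinary least-squares line on vibration readings. The linear-trend assumption, the readings, and the 7.0 mm/s limit are all hypothetical, and real degradation is often nonlinear.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired samples")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("sample times must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_threshold(days, readings, threshold):
    """Days after the last reading until the fitted trend reaches the threshold.

    Returns None when the trend is flat or improving.
    """
    slope, intercept = linear_fit(days, readings)
    if slope <= 0:
        return None
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - days[-1])


# Assumed vibration readings (mm/s) from a pump, sampled every 10 days.
sample_days = [0, 10, 20, 30]
vibration = [2.0, 2.5, 3.0, 3.5]
print(days_until_threshold(sample_days, vibration, threshold=7.0))
```

For these made-up readings the trend rises 0.05 mm/s per day and reaches the limit about 70 days after the last sample, which is the window available to schedule a repair or fabricate a replacement part.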
Failure modes
| Mode | Cause | Detection | Mitigation |
|---|---|---|---|
| Spare stock-out of an irreplaceable part (safety-critical)[2] | A critical un-makeable component (chip, membrane, catalyst, seal) fails more often than stocked, with no local substitute and 26 months to resupply. | Inventory vs failure-rate tracking; reorder-point alarms against the resupply gap. | Deep spares on irreplaceables, redundancy, design commonality/pooling, and develop local substitutes where possible. |
| Common-mode failure defeats redundancy[1] | Redundant units fail together from a shared cause (same dust exposure, same bad batch, same software bug) — redundancy gives false confidence. | Failure correlation analysis; diverse-redundancy review. | Diverse (not identical) redundancy where feasible, separate failure causes, stagger maintenance and part lots. |
| Cascading failure during a maintenance backlog[2] | Multiple concurrent failures (e.g. after a dust storm) overwhelm repair capacity; backlog grows and availability collapses. | Repair-queue and availability trending. | Maintenance capacity margin, prioritization by criticality, redundancy to buy time, surge repair procedures. |
| Skill / knowledge gap[3] | The crew lacks the expertise to diagnose or repair a failure — the information-closure problem made concrete. | Skills-coverage audit; repair success rate. | Broad cross-training (the academy), repair documentation/automation, remote Earth expert support (across light-lag), retained local knowledge. |
| Recycling-loop contamination[1] | Remanufacturing from failed parts reintroduces contaminants or degraded material, propagating defects into "repaired" components. | Material/quality verification of remanufactured parts. | Material testing, quality control on remanufacture, segregate recycle streams, witness testing. |
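The inventory-versus-failure-rate tracking listed for the stock-out mode can be sketched as a running risk check: given the stock on hand and the months left until resupply, estimate the probability that failures exceed the stock. This assumes Poisson failures at a constant rate. The function names, the 1% risk tolerance, and the example numbers are hypothetical.

```python
import math


def stockout_probability(stock: int, rate_per_month: float,
                         months_to_resupply: float) -> float:
    """P(failures > stock) before resupply, for Poisson-distributed failures."""
    mu = rate_per_month * months_to_resupply
    term = math.exp(-mu)  # P(N = 0)
    cdf = term
    for k in range(1, stock + 1):
        term *= mu / k  # P(N = k) from P(N = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def reorder_alarm(stock: int, rate_per_month: float,
                  months_to_resupply: float, max_risk: float = 0.01) -> bool:
    """True when the stock-out risk exceeds the tolerated level."""
    return stockout_probability(stock, rate_per_month, months_to_resupply) > max_risk


# Assumed part: 3 spares on hand, 0.1 failures per month.
for months_left in (26, 6):
    risk = stockout_probability(3, 0.1, months_left)
    print(f"{months_left} months left: risk {risk:.4f}, alarm {reorder_alarm(3, 0.1, months_left)}")
```

With 3 spares and a full 26-month gap ahead, the assumed part has roughly a 26% chance of stocking out and raises the alarm. With only 6 months left, the risk falls to about 0.3% and the alarm clears.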
Mars adjustments
Reliability alone is not enough[2]
Impact: The central finding of Mars-logistics analysis: for 26-month-gap missions, no achievable component reliability avoids large spares mass — so in-situ repair and manufacturing are not optional extras but a core requirement.
Mitigation: Pair high reliability with deep local repair/remanufacturing and redundancy; budget spares with Poisson math against the gap.
Repair beats stockpiling for makeable parts[3]
Impact: Because spare mass scales with the long resupply gap, the large fraction of parts that machining + AM can remake is far cheaper to repair locally than to ship and store as spares.
Mitigation: Invest in the machine-tools/AM/recycling base; reserve imported spares for the un-makeable (chips, membranes, catalysts, seeds).
Dust accelerates wear everywhere[4]
Impact: Pervasive abrasive dust raises failure rates on seals, bearings, and mechanisms across the colony — λ is higher on Mars, inflating both spares and repair demand.
Mitigation: Dust-tolerant design, sealed mechanisms, and maintenance intervals set by dust exposure (the recurring lesson across nodes).
Availability, not capacity, is what survives[1]
Impact: A plant's nameplate capacity is meaningless if it's down; on Mars, sustained availability of life-critical functions is the real figure of merit, and maintenance is what delivers it.
Mitigation: Design and operate to availability targets; redundancy + fast local repair for life-critical systems.
Maintenance is the multiplier on the whole tree[2]
Impact: Every other node's value is conditional on being kept running. A reactor, chemistry plant, or life-support system that can't be maintained for 26 months is not an asset — maintenance is what turns capability into sustained capability.
Mitigation: Treat the maintenance/repair/spares economy as foundational infrastructure, co-equal with the production nodes it sustains.
Alternatives & substitutes
Frequent resupply (Earth as the warehouse)[2]
- No local repair needed; just ship replacements
- 26-month gap makes this impossible for anything that fails between windows; massive cargo cost
When preferred: Never sufficient alone on Mars; supplements, not replaces, local capability.
Extreme reliability / overdesign[2]
- Fewer failures to fix; longer MTBF reduces spares
- Reliability alone is provably insufficient for Mars-class durations; overdesign adds mass; surprises still happen
When preferred: Combined with repair and spares — not as a sole strategy (Owens' central finding).
Design for disposability + bulk spares[1]
- Simple modules swapped wholesale rather than repaired
- High spares mass; wasteful of irreplaceable content
When preferred: Cheap, locally-makeable modules; not for high-value imported assemblies.
Requires
References
- (2012). Practical Reliability Engineering, 5th Edition. Wiley. ISBN 978-0-470-97981-5. — Reliability engineering fundamentals: failure rates, MTBF, availability, maintainability, and spares provisioning.
- (2015). Limitations of reliability for long-endurance human spaceflight. AIAA SPACE 2015 Conference, AIAA 2015-4611. doi:10.2514/6.2015-4611 — Quantifies the spares-mass problem for Mars-class missions: the 26-month resupply gap drives large spare inventories or in-situ repair/manufacturing.
- (2004). Kinematic Self-Replicating Machines. Landes Bioscience. ISBN 978-1-57059-690-2. — The definitive survey of self-replication theory and engineering: replication closure, the closure-fraction metric, and feedstock/parts/information closure.
- (2002). Aeolian removal of dust types from photovoltaic surfaces on Mars. NASA Glenn Research Center, NASA/TM-2002-211837. — Mars dust deposition + removal mechanisms on optical / radiator surfaces; α_s and ε degradation rates.