Status: Compatibility-window mechanism specification
Last Updated: 2026-04-24
Scope: implementation-facing migration control-plane specification for future runtime-layout upgrades.
2026-04-24 breaking cleanup note:
docs/development/breaking-compat-cleanup-plan.md
scholaraio migrate finalize --confirm
2026-04-23 implementation note:
.scholaraio-control/ root, config accessors for instance.json / migration.lock / journal root, lightweight auto-creation of instance.json, and scholaraio migrate status|recover --clear-lock
migration.lock exists, and recovery can explicitly clear the lock while marking layout_state=needs_recovery when appropriate
instance.json.layout_version is newer than the running program supports, normal commands fail fast with an upgrade-required message while migrate status stays available
migrate status reports the journal inventory
scholaraio migrate plan writes a non-executing inventory-oriented plan.json into one journal, records store-level targets for the covered durable-library, spool, and papers moves, and reports planned legacy-move records plus simple blockers
scholaraio migrate verify refreshes verify.json for an existing journal and records component-aware checks for papers/workspaces/index-registry/keyword-search/citation-style loadability/explore openability/toolref current-version resolution/proceedings search/spool roots/translation-resume inventory
scholaraio migrate run --store citation_styles|toolref|explore|proceedings|spool|papers|workspace --confirm copies supported legacy stores into data/libraries/ or data/spool/, or rewrites workspace paper indexes, without overwriting conflicts; it records run metadata plus cleanup candidates and runs post-copy verification before marking the migration successful
verify.json.status = passed_with_warnings when the only remaining failures are explicitly non-authoritative rebuildable search-state checks that are expected to be rebuilt before final release signoff
scholaraio migrate cleanup requires a passed verification record, records preview/confirm journal steps, and archives explicit cleanup candidates into the migration journal instead of deleting them
scholaraio migrate upgrade --confirm runs the needed supported store moves, then calls finalization in the same journal; empty legacy roots are still cleanup candidates so historical directories do not linger after finalization
scholaraio migrate finalize --confirm rechecks target readiness, auto-migrates workspace paper indexes to refs/papers.json, migrates legacy workspace system outputs into workspace/_system/, reruns verification, archives remaining cleanup candidates, and records a dedicated finalize step plus conflict counts without overwriting canonical targets
citation_styles, toolref, explore, proceedings, spool, and papers require separate design and tests
This document defines the minimum migration mechanisms ScholarAIO SHOULD implement before performing a real runtime-layout move for existing users.
This is not the directory-vision document and not the user-facing migration story document. Its job is narrower:
This document exists so that future implementation work does not jump directly from “new directory idea” to “move user files”.
This document is a companion to the existing migration documents:
docs/development/directory-structure-spec.md
docs/development/directory-migration-sequence.md
docs/development/user-data-migration-strategy.md
docs/development/upgrade-validation-matrix.md
This document fills the missing middle layer:
The following product decisions are treated as settled assumptions for the first migration-capable implementation.
The first real runtime-layout migration MUST be an offline migration.
Meaning:
Rationale:
Compatibility for the first migration generation MUST be one-way only.
Meaning:
Rationale:
Large runtime-layout migration MUST remain an explicit user operation.
Meaning:
data/, workspace/, or other large user-owned trees
Small additive metadata upgrades MAY happen automatically if they are local, reversible, and do not relocate user content.
The first migration generation MAY assume a single canonical runtime root chosen by the user or by the existing config-resolution path.
Meaning:
This is an intentional simplification, not a claim that multiple roots never exist.
ScholarAIO SHOULD reserve a hidden control directory at runtime-instance root:
instance-root/
├── config.yaml
├── config.local.yaml
├── data/
├── workspace/
└── .scholaraio-control/
Recommended contents:
.scholaraio-control/
├── instance.json
├── migration.lock
└── migrations/
    └── <migration-id>/
        ├── plan.json
        ├── steps.jsonl
        ├── verify.json
        ├── rollback.json
        └── summary.md
The control metadata SHOULD NOT be mixed into data/ or workspace/ because those trees are themselves migration targets.
It SHOULD NOT use a generic filename such as layout.json at instance root because layout.json already has an established meaning inside paper directories for MinerU-derived layout artifacts.
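As an illustration of the layout above, the control paths could be resolved from a single instance root. This is a minimal sketch, not the project's actual accessor API: the class name ControlPaths and its property names are hypothetical, while the directory and file names come from the tree shown above.

```python
from dataclasses import dataclass
from pathlib import Path

# Directory name taken from the layout above.
CONTROL_DIR_NAME = ".scholaraio-control"


@dataclass(frozen=True)
class ControlPaths:
    """Hypothetical helper: resolves control-metadata paths under one instance root."""

    instance_root: Path

    @property
    def control_dir(self) -> Path:
        return self.instance_root / CONTROL_DIR_NAME

    @property
    def instance_json(self) -> Path:
        return self.control_dir / "instance.json"

    @property
    def migration_lock(self) -> Path:
        return self.control_dir / "migration.lock"

    @property
    def journal_root(self) -> Path:
        return self.control_dir / "migrations"

    def journal_dir(self, migration_id: str) -> Path:
        # One journal directory per migration run.
        return self.journal_root / migration_id
```

Keeping every path derived from one root means data/ and workspace/ can be relocated without the control metadata moving with them.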
The control metadata SHOULD live at a stable root-level location that works in both:
~/.scholaraio/
instance.json
instance.json is the minimum durable record that tells ScholarAIO what kind of runtime root it is opening.
instance.json SHOULD answer four questions:
The exact schema MAY evolve, but the first version SHOULD include fields equivalent to:
instance_meta_version
layout_version
layout_state
writer_version
instance_id
updated_at
last_successful_migration_id
Recommended meanings:
instance_meta_version
instance.json itself
layout_version
layout_state
legacy_implicit, normal, migrating, needs_recovery
writer_version
instance_id
updated_at
last_successful_migration_id
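The field list above could map onto a small record type. The following is a sketch under stated assumptions: the field names and the four layout_state values come from this document, but the integer type of the version fields, the use of 0 as the legacy layout version, and the helper names (InstanceMeta, new_legacy_meta, write_instance_meta, read_instance_meta) are hypothetical.

```python
import json
import os
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

# The four layout_state values named in this document.
LAYOUT_STATES = {"legacy_implicit", "normal", "migrating", "needs_recovery"}


@dataclass
class InstanceMeta:
    instance_meta_version: int  # assumption: integer schema version
    layout_version: int         # assumption: integer layout version
    layout_state: str
    writer_version: str
    instance_id: str
    updated_at: str
    last_successful_migration_id: Optional[str] = None

    def __post_init__(self) -> None:
        if self.layout_state not in LAYOUT_STATES:
            raise ValueError(f"unknown layout_state: {self.layout_state!r}")


def new_legacy_meta(writer_version: str) -> InstanceMeta:
    """Build a record for a root that has never been migrated (values are assumptions)."""
    return InstanceMeta(
        instance_meta_version=1,
        layout_version=0,
        layout_state="legacy_implicit",
        writer_version=writer_version,
        instance_id=uuid.uuid4().hex,
        updated_at=datetime.now(timezone.utc).isoformat(),
    )


def write_instance_meta(path: Path, meta: InstanceMeta) -> None:
    """Write to a temp file, then rename, so readers never see a partial file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_name(path.name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(asdict(meta), fh, indent=2)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp, path)


def read_instance_meta(path: Path) -> InstanceMeta:
    with open(path, encoding="utf-8") as fh:
        return InstanceMeta(**json.load(fh))
```

The temp-file-plus-rename write is one way to keep the record durable across an interrupted write; the real implementation may choose differently.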
The startup rules SHOULD be:
instance.json does not exist:
instance.json exists and layout_state == migrating:
instance.json exists and layout_version is newer than the running program supports:
instance.json exists and is supported:
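The four startup cases listed above can be sketched as a single classification function. This is illustrative only: the enum, its member names, and decide_startup are hypothetical, and the action attached to each case is an assumption guided by the implementation notes at the top of this document (auto-creation for a missing file, fail-fast with an upgrade-required message for a newer layout).

```python
from enum import Enum
from typing import Optional


class StartupDecision(Enum):
    TREAT_AS_LEGACY = "treat_as_legacy"      # no instance.json yet
    RECOVERY_REQUIRED = "recovery_required"  # a migration was left in progress
    UPGRADE_REQUIRED = "upgrade_required"    # layout newer than this program
    PROCEED = "proceed"                      # supported layout, normal startup


def decide_startup(meta: Optional[dict], max_supported_layout: int) -> StartupDecision:
    """Classify a runtime root; checks follow the order of the rules above.

    `meta` is the parsed instance.json content, or None when the file is absent.
    """
    if meta is None:
        return StartupDecision.TREAT_AS_LEGACY
    if meta.get("layout_state") == "migrating":
        return StartupDecision.RECOVERY_REQUIRED
    if int(meta.get("layout_version", 0)) > max_supported_layout:
        return StartupDecision.UPGRADE_REQUIRED
    return StartupDecision.PROCEED
```

Returning a decision value rather than raising lets the command layer decide which commands, such as migrate status, remain available in each case.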
The root-level layout marker MUST NOT be a plain layout.json.
Reason:
layout.json is already a meaningful per-paper artifact name in the current ingest and parsing flow
migration.lock
migration.lock is the mechanism that turns “offline migration only” into an enforceable runtime rule.
The lock MUST:
The lock file SHOULD record:
migration_id
pid
hostname
started_at
writer_version
mode
mode MAY distinguish states such as plan, run, or rollback if needed later. For V1, run is sufficient.
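One way to create such a lock file is exclusive file creation, so that a second writer fails instead of overwriting the first. This is a sketch under assumptions: the recorded fields are the ones listed above, but the function name acquire_migration_lock, the MigrationLockedError exception, and the choice of O_EXCL as the exclusion mechanism are hypothetical.

```python
import json
import os
import socket
from datetime import datetime, timezone
from pathlib import Path


class MigrationLockedError(RuntimeError):
    """Raised when migration.lock already exists (hypothetical name)."""


def acquire_migration_lock(
    lock_path: Path, migration_id: str, writer_version: str, mode: str = "run"
) -> dict:
    """Create migration.lock exclusively and record who holds it."""
    record = {
        "migration_id": migration_id,
        "pid": os.getpid(),
        "hostname": socket.gethostname(),
        "started_at": datetime.now(timezone.utc).isoformat(),
        "writer_version": writer_version,
        "mode": mode,
    }
    lock_path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError as exc:
        raise MigrationLockedError(f"migration lock already held: {lock_path}") from exc
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(record, fh, indent=2)
        fh.flush()
        os.fsync(fh.fileno())
    return record
```

Exclusive creation is reliable on local filesystems; behavior on some network filesystems differs, which is one reason the lock also records pid and hostname for human inspection.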
While migration.lock exists, normal ScholarAIO commands SHOULD fail fast unless they belong to the migration/recovery surface.
For V1, the simplest rule is:
This is intentionally conservative. The goal is not to maximize availability during migration. The goal is to minimize accidental mixed-state access.
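The fail-fast behavior described above could be enforced by one gate at command entry. This is a minimal sketch: the allowlist contents are an assumption based on the status and recover commands mentioned in the implementation notes, and the names MIGRATION_SURFACE, InstanceLockedError, and check_command_allowed are hypothetical.

```python
from pathlib import Path
from typing import Sequence

# Hypothetical allowlist: commands that stay usable while the lock exists.
MIGRATION_SURFACE = {("migrate", "status"), ("migrate", "recover")}


class InstanceLockedError(RuntimeError):
    """Raised when a normal command runs while migration.lock exists."""


def check_command_allowed(lock_path: Path, command: Sequence[str]) -> None:
    """Fail fast for normal commands while a migration lock is present."""
    if not lock_path.exists():
        return
    if tuple(command[:2]) in MIGRATION_SURFACE:
        return
    raise InstanceLockedError(
        f"{' '.join(command)}: instance is locked for migration ({lock_path})"
    )
```

A single allowlist keeps the rule conservative: a new command is blocked during migration unless someone deliberately adds it to the migration/recovery surface.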
The first implementation SHOULD treat stale locks cautiously.
Minimum behavior:
The system MUST NOT silently delete a lock just because a timestamp looks old.
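The cautious stale-lock handling above can be illustrated by separating inspection from removal. In this sketch, inspecting a lock only reports facts about it, and removal requires an explicit confirmation flag. The function names and the returned keys are hypothetical, and started_at is assumed to be a timezone-aware ISO 8601 timestamp.

```python
import json
import socket
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def inspect_lock(lock_path: Path, now: Optional[datetime] = None) -> Optional[dict]:
    """Describe an existing lock without modifying or deleting it."""
    try:
        record = json.loads(lock_path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return None
    now = now or datetime.now(timezone.utc)
    # Assumption: started_at is timezone-aware ISO 8601.
    started = datetime.fromisoformat(record["started_at"])
    return {
        "record": record,
        "age_seconds": (now - started).total_seconds(),
        "same_host": record.get("hostname") == socket.gethostname(),
    }


def clear_lock(lock_path: Path, *, confirm: bool) -> bool:
    """Remove the lock only on an explicit operator request."""
    if not confirm:
        raise PermissionError(
            "refusing to clear migration.lock without explicit confirmation"
        )
    try:
        lock_path.unlink()
    except FileNotFoundError:
        return False
    return True
```

An old timestamp only shows up as a large age_seconds value in the report; nothing in this sketch acts on it automatically. A fuller version would also mark layout_state=needs_recovery in instance.json when clearing, as the implementation note describes.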
Each migration run SHOULD create a dedicated journal directory under:
.scholaraio-control/migrations/<migration-id>/
The journal is the durable record of what the migration attempted and what actually happened.
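Creating the per-run journal directory might look like the sketch below. The migration-id format is not defined in this document, so the timestamp-plus-random-suffix scheme here is purely an assumption, as are the function names.

```python
import uuid
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def new_migration_id(now: Optional[datetime] = None) -> str:
    """Hypothetical id scheme: sortable UTC timestamp plus a short random suffix."""
    now = now or datetime.now(timezone.utc)
    return f"{now.strftime('%Y%m%dT%H%M%SZ')}-{uuid.uuid4().hex[:8]}"


def create_journal(journal_root: Path, migration_id: str) -> Path:
    """Create a fresh journal directory; refuse to reuse an existing one."""
    journal = journal_root / migration_id
    # exist_ok=False: two runs must never share one journal.
    journal.mkdir(parents=True, exist_ok=False)
    return journal
```

Refusing to reuse an existing directory keeps each journal an unambiguous record of exactly one run.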
The journal MUST make it possible to answer:
plan.json
The frozen migration plan for this run.
It SHOULD record at least:
steps.jsonl
An append-only execution log.
Each entry SHOULD include:
JSONL is preferred because it is easy to append safely and easy to inspect incrementally.
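The append-and-inspect property of JSONL can be sketched as follows. The entry fields used in any call are up to the caller and are not specified here; the function names append_step and read_steps are hypothetical, and tolerating a torn final line is an assumption about how an interrupted write should be handled.

```python
import json
import os
from pathlib import Path
from typing import List


def append_step(steps_path: Path, entry: dict) -> None:
    """Append one JSON object as a single line and force it to disk."""
    line = json.dumps(entry, sort_keys=True)
    with open(steps_path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
        fh.flush()
        os.fsync(fh.fileno())


def read_steps(steps_path: Path) -> List[dict]:
    """Read entries in order, stopping at a torn line from an interrupted write."""
    steps: List[dict] = []
    with open(steps_path, encoding="utf-8") as fh:
        for raw in fh:
            raw = raw.strip()
            if not raw:
                continue
            try:
                steps.append(json.loads(raw))
            except json.JSONDecodeError:
                break  # incomplete trailing entry; earlier entries remain valid
    return steps
```

Because every entry is one self-contained line, a crash during a write can damage at most the last line, and everything before it stays readable.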
verify.json
The structured verification result.
It SHOULD record:
rollback.json
The rollback recipe or rollback-relevant state.
This file does not require a perfect reverse operation graph in V1. It MUST at least preserve enough information to support deterministic recovery decisions.
summary.md
A human-readable report.
It SHOULD explain in plain language:
The migration journal MUST survive until cleanup completes.
The journal MUST NOT be deleted immediately after a successful run.
Verification is the final safety gate between “data moved” and “migration accepted”.
The first implementation SHOULD verify at least:
Verification SHOULD focus on user-visible system health, not just filesystem existence.
That means the verification target is not:
It is:
For compatible store-by-store migration runs, verification MAY additionally distinguish:
A migration MUST NOT be marked fully accepted until verification succeeds.
Implication:
instance.json MUST NOT be finalized as fully migrated before verification passes
Compatibility-window nuance:
passed_with_warnings
The desired startup behavior is:
This matches the compatibility-first strategy already defined in the migration strategy document.
migrate plan
migrate plan SHOULD:
migrate plan MUST NOT mutate user data.
migrate run
migrate run SHOULD:
instance.json as migrating
If migration fails mid-run, the runtime root MUST remain recoverable and MUST NOT be silently treated as fully migrated.
migrate verify
migrate verify SHOULD:
verify.json
verify.json.status SHOULD support at least:
passed
passed_with_warnings
failed
This command is especially useful when the user wants extra confidence before cleanup.
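The three status values can be derived from individual check results. The sketch below follows the rule stated in the implementation notes, where passed_with_warnings applies only when every remaining failure is a non-authoritative, rebuildable check. The CheckResult type and its authoritative flag are hypothetical names for that distinction.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class CheckResult:
    name: str
    ok: bool
    # Hypothetical flag: False marks rebuildable state such as a search index.
    authoritative: bool = True


def overall_status(checks: Iterable[CheckResult]) -> str:
    """Collapse per-component checks into one verify.json status value."""
    failed = [c for c in checks if not c.ok]
    if not failed:
        return "passed"
    if all(not c.authoritative for c in failed):
        return "passed_with_warnings"
    return "failed"
```

For example, a failing keyword-search check marked non-authoritative yields passed_with_warnings, while any failing authoritative check, such as papers, yields failed.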
migrate cleanup
migrate cleanup SHOULD:
Cleanup MUST be a separate step from migration execution.
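A cleanup step that archives rather than deletes, as the implementation notes describe, could be sketched like this. Several details are assumptions: the archive/ subdirectory name inside the journal, the function name, the preview-by-default behavior, and the choice to accept only a status of exactly passed.

```python
import json
import shutil
from pathlib import Path
from typing import Iterable, List, Tuple


def archive_cleanup_candidates(
    journal: Path, candidates: Iterable[Path], confirm: bool = False
) -> List[Tuple[Path, Path]]:
    """Move cleanup candidates into the journal; preview only unless confirm=True."""
    verify = json.loads((journal / "verify.json").read_text(encoding="utf-8"))
    if verify.get("status") != "passed":
        # Assumption: only an exact "passed" status unlocks cleanup.
        raise RuntimeError("cleanup requires a passed verification record")

    archive_root = journal / "archive"  # hypothetical location inside the journal
    planned = [
        (Path(src), archive_root / Path(src).name)
        for src in candidates
        if Path(src).exists()
    ]
    targets = [dst for _, dst in planned]
    # Check every target before moving anything, so a conflict cannot
    # leave the archive half-written.
    if len(set(targets)) != len(targets) or any(dst.exists() for dst in targets):
        raise FileExistsError("archive target conflict; refusing to overwrite")

    if not confirm:
        return planned  # preview: nothing is moved

    archive_root.mkdir(exist_ok=True)
    for src, dst in planned:
        shutil.move(str(src), str(dst))
    return planned
```

Calling the function without confirm=True returns the planned moves without touching the filesystem, which matches the preview/confirm split mentioned in the implementation notes.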
Whether rollback is fully automatic or partly operator-driven MAY vary in V1.
However, the implementation MUST support one of the following:
migrate run
The system MUST NOT leave operators with only an unstructured partial filesystem and no durable record of what happened.
migrate plan and inventory logic MUST NOT execute arbitrary user-provided code as part of discovery.
Implication:
If planning encounters files that do not match a known schema, it SHOULD:
It MUST NOT:
Migration protection MUST NOT rely only on the current backup feature, because current backup defaults still focus on data/ rather than the full runtime root.
Migration needs its own lock, journal, and verification path even if backup integration is added later.
The first implementation does not need to solve all future migration concerns.
It MAY explicitly defer:
The V1 goal is narrower:
Before ScholarAIO performs a real runtime-layout move for existing users, the codebase SHOULD have all of the following:
instance.json support
Without these pieces, ScholarAIO would still be doing a refactor, not a user-safe upgrade.