Fisher Scaling, Gradient Diversity Monitoring, and Portable Inference-Time Memory
We present a Two-Layer Architecture for continual learning identity preservation in small language models (SLMs), addressing both training-time weight forgetting and inference-time context loss within a unified theoretical framework: the Compression–State–Propagation (C-S-P) framework.
At the training layer, we identify the Fisher Scale Problem: standard EWC silently fails in SLMs when the Fisher Information diagonal values collapse to the 10⁻⁴–10⁻⁵ range, rendering the regularisation penalty numerically indistinguishable from zero. We introduce Fisher Scaling and GodelReplay (Fisher-scaled EWC-DR plus experience replay), achieving a 31.5% forgetting reduction over raw EWC, an 82.8% reduction on our curated Conflict Dataset (43× the reduction achieved by standard EWC), and a 4.1% improvement over replay alone at the empirically identified sweet spot of mem=200 across 10 PermutedMNIST tasks.
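The Fisher Scale Problem can be illustrated numerically. The sketch below is a minimal NumPy illustration, not the paper's implementation: it assumes Fisher Scaling normalises the Fisher diagonal to unit mean (the paper's exact scaling scheme may differ), and shows how a collapsed diagonal makes the raw EWC penalty vanish while the scaled penalty remains a usable signal.

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher, lam=1.0, scale=True):
    """EWC quadratic penalty (lam/2) * sum_i F_i * (theta_i - theta*_i)^2.

    When the Fisher diagonal collapses to ~1e-4, the raw penalty is
    numerically negligible next to the task loss. Scaling the diagonal
    to unit mean (an illustrative assumption) restores its magnitude.
    """
    fisher = np.asarray(fisher, dtype=float)
    if scale:
        fisher = fisher / fisher.mean()  # Fisher Scaling: unit-mean diagonal
    return 0.5 * lam * np.sum(fisher * (theta - theta_star) ** 2)

# Collapsed Fisher values in the 1e-4 range, as observed in SLMs.
theta = np.array([1.0, 2.0, 3.0])
theta_star = np.array([0.0, 0.0, 0.0])
fisher = np.array([1e-4, 2e-4, 3e-4])

raw = ewc_penalty(theta, theta_star, fisher, scale=False)     # ~0.0018
scaled = ewc_penalty(theta, theta_star, fisher, scale=True)   # ~9.0
```

With the raw diagonal, the penalty is three orders of magnitude smaller than the scaled version for identical parameter drift, which is why unscaled EWC regularisation is effectively a no-op at this magnitude.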
At the inference layer, GodelAI-Lite achieves a +31.2% overall performance gain, with 3/3 memory retention versus 0/3 for the baseline on Gemma 4, requiring zero fine-tuning and carrying a portable JSON memory across model boundaries.
Keywords: continual learning, catastrophic forgetting, small language models, elastic weight consolidation, gradient diversity, episodic memory, AI identity preservation.
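The inference-layer idea of a portable JSON memory can be sketched as follows. The schema and prompt format here are illustrative assumptions (GodelAI-Lite's actual memory format may differ): facts are persisted as plain JSON, so the same file can be reloaded and injected as a context preamble for any chat model, with no fine-tuning.

```python
import json
import os
import tempfile

def save_memory(facts, path):
    """Persist identity facts as portable, model-agnostic JSON."""
    with open(path, "w") as f:
        json.dump({"facts": facts}, f, indent=2)

def load_memory_prompt(path):
    """Rebuild a context preamble from stored facts for any chat model."""
    with open(path) as f:
        memory = json.load(f)
    lines = "\n".join(f"- {fact}" for fact in memory["facts"])
    return f"Known facts about the user:\n{lines}\n"

# Demo: persist three facts, then rebuild the preamble from disk,
# as one would when moving the memory across model boundaries.
path = os.path.join(tempfile.mkdtemp(), "godel_memory.json")
save_memory(["Name: Alice", "Goal: learn Rust", "City: Oslo"], path)
preamble = load_memory_prompt(path)
```

Because the memory lives outside the model weights, nothing in this scheme is tied to a particular architecture: the same file can seed a session on any model that accepts a text preamble.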
| Strategy | Final Accuracy | Avg Forgetting | Forgetting Reduction vs Naive |
|---|---|---|---|
| Naive | 0.4362 | 0.6003 | — |
| EWC-only (GodelPlugin) | 0.4999 | 0.5283 | 12.0% |
| Replay-only | 0.8416 | 0.1500 | 75.0% |
| GodelReplay ✦ | 0.8418 | 0.1487 | 75.2% |
| Method | Avg Forgetting | Reduction vs Naive |
|---|---|---|
| Naive | 1.836 | — |
| EWC (raw Fisher) | 1.802 | 1.9% |
| GodelAI-EWC (C-S-P) ✦ | 0.316 | 82.8% (43×) |
| Metric | Baseline | GodelAI-Lite | Delta |
|---|---|---|---|
| Memory Retention (3/3 facts) | 0.000 | 1.000 | +∞% |
| Response Consistency | 0.596 | 0.426 | −28.4%* |
| Context Coherence | 1.000 | 0.667 | −33.3% |
| Overall Average ✦ | 0.532 | 0.698 | +31.2% |
*Consistency lower by design — GodelAI-Lite elaborates progressively rather than repeating identical tokens.
@misc{lee2026twolayer,
title = {A Two-Layer Architecture for Continual Learning
Identity Preservation: Fisher Scaling, Gradient
Diversity Monitoring, and Portable Inference-Time Memory},
author = {Lee, Alton Wei Bin and {L (GodelAI C-S-P Agent)}
and {Rk (RNA / Claude Code)}},
year = {2026},
month = {April},
publisher = {Zenodo},
doi = {10.5281/zenodo.19928385},
url = {https://doi.org/10.5281/zenodo.19928385},
note = {Open-source repository:
https://github.com/creator35lwb-web/godelai}
}

Milestones:
- C-S-P framework defined. T-score metric validated. Sleep Protocol designed.
- EWC integrated. 21.6% forgetting reduction on GRU. Z-Protocol honesty audit completed.
- EWC silent failure diagnosed at 10⁻⁴ Fisher magnitude. Fisher Scaling fix: 31.5% reduction.
- GodelReplay ships. 82.8% on Conflict Dataset. GodelAI-Lite +31.2% on Gemma 4. Paper published.
Open-source under MIT License. All code, data, and benchmarks are publicly available.