Memory & Context
LLMs are place-oriented memory systems with shallow effective capacity and no native consolidation/forgetting machinery.
Zero claims were vindicated; zero were cleanly refuted. All ten landed in CONTESTED or SPLIT. Strong forms — formulated with overspecified language — systematically fail. Weak forms — algorithmic-level analogies productive on a restricted subspace — systematically survive.
| # | Claim | Verdict | What defeats the strong form |
|---|---|---|---|
| 01 | Magical Number Seven | Contested | N-back tests on transformers reveal continuous logarithmic decline, not a 7±2 cliff. |
| 02 | Thalamic-Cortical Equivalence | Contested | Burst/tonic firing, driver/modulator asymmetry, and neuromodulation break strong homology. |
| 03 | Persona States | Contested | No confirmatory factor analysis (CFA) of behavioral outputs against the human Big-Five state model; item reordering alone produces 20% scale shifts. |
| 04 | Metacognition | Split | Schaeffer et al. (NeurIPS 2023) show that apparent phase transitions are mostly metric artifacts. |
| 05 | RAG TOT | Contested | Mechanism inversion: TOT is form-access failure with semantic intact; RAG is the opposite. |
| 06 | CoT Phenomenology | Split | IIT 4.0 predicts near-zero Φ for transformers; CoT faithfulness fails systematically (Manuvinakurike 2025). |
| 07 | Spontaneous ToM | Split | Frozen weights → no gradient flow → 'growth' must be in-context accumulation, not new capability. |
| 08 | Sleep Consolidation | Split | Seven years of progressively more faithful replay implementations close some, but not most, of the gap. |
| 09 | Active Forgetting | Split | Post-hoc machine unlearning damages capabilities; gradient ascent breaks general competence. |
| 10 | Cortical Column | Split | Cortical column itself is a contested empirical unit (Horton & Adams 2005). |
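The metric-artifact point in row 04 can be made concrete with a small sketch. It assumes a per-token accuracy that rises smoothly with scale and an all-or-nothing exact-match score over a 20-token answer; the curve, the answer length, and the function names are invented for illustration and are not taken from any dossier or paper.

```python
# Toy illustration: a smooth per-token accuracy curve looks like a
# "threshold" once it is scored with an all-or-nothing metric.
# All numbers are assumptions chosen for illustration.

def per_token_accuracy(log10_params: float) -> float:
    """Assumed smooth curve: accuracy rises linearly in log10(parameter count)."""
    return min(0.99, 0.50 + 0.07 * (log10_params - 6))

def exact_match(p: float, answer_len: int) -> float:
    """Chance that all answer_len tokens are correct, assuming independence."""
    return p ** answer_len

ANSWER_LEN = 20                          # assumed answer length in tokens
scales = [6, 7, 8, 9, 10, 11, 12, 13]    # log10 of parameter count
smooth = [per_token_accuracy(s) for s in scales]
sharp = [exact_match(p, ANSWER_LEN) for p in smooth]

for s, p, em in zip(scales, smooth, sharp):
    print(f"1e{s:<2} params  per-token={p:.2f}  exact-match={em:.4f}")
```

The per-token column climbs by the same amount at every step, while the exact-match column stays near zero for most of the range and then jumps. The "transition" here comes from the scoring rule, not from the underlying curve.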
Six of ten claims confuse Marr's three levels of analysis (computational, algorithmic, implementational). Brain ↔ transformer mappings hold at the algorithmic level, and sometimes at the computational level, but break catastrophically when smuggled down to the implementation level.
"Homologous" is an evolutionary and structural term, not an algorithmic one. "Spontaneously emerged" implies a novel capability, not the elicitation of a pretrained one. "Phenomenally conscious" collapses Block's distinction between phenomenal and access consciousness. "Threshold" implies a phase transition where the data show smooth scaling.
Studies citing 7-entity setups as "evidence" for Miller's law had picked 7 as a methodological convenience. The number was an experimenter choice, not a measured ceiling.
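One way to turn a fixed 7-item setup into a measurement is to sweep the set size and compare a smooth logarithmic decline against a step at 7. The following is a minimal sketch under stated assumptions: the accuracies are synthetic, and `sse_log_fit` and `sse_cliff_fit` are hypothetical helper names, not functions from any cited study.

```python
import math

def sse_log_fit(ns, accs):
    """Least-squares fit of acc = a + b*ln(n); returns the sum of squared errors."""
    xs = [math.log(n) for n in ns]
    mx, my = sum(xs) / len(xs), sum(accs) / len(accs)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, accs))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, accs))

def sse_cliff_fit(ns, accs, cliff=7):
    """Step model: one mean up to `cliff` items, another mean beyond it."""
    def sse(group):
        m = sum(group) / len(group)
        return sum((y - m) ** 2 for y in group)
    below = [y for n, y in zip(ns, accs) if n <= cliff]
    above = [y for n, y in zip(ns, accs) if n > cliff]
    return sse(below) + sse(above)

# Synthetic accuracies (made up for illustration), swept over set sizes 1..16.
ns = list(range(1, 17))
accs = [0.98 - 0.15 * math.log(n) for n in ns]

log_error = sse_log_fit(ns, accs)
cliff_error = sse_cliff_fit(ns, accs)
print(f"log-decline SSE={log_error:.6f}  cliff-at-7 SSE={cliff_error:.6f}")
```

A design that tests only n = 7 yields a single point and cannot separate the two models; the sweep can, because the two models predict different shapes on either side of 7.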
"Spontaneous ToM" conflates pretraining-derived ToM (real, well-documented) with interaction-emergent ToM (frozen-weight loops cannot produce new capability via gradient).
Claims 6, 8, and the mechanistic part of 4 end up partially unfalsifiable. The right move is to admit it and propose discriminators; the wrong move is to ride the analogy.
LLMs are place-oriented memory systems with shallow effective capacity and no native consolidation/forgetting machinery.
Brain ↔ transformer mappings hold at the algorithmic level on a restricted subspace; strong 'homology' claims fail.
LLM self-models are real but shallow; the legible CoT trace cannot be trusted as a window into them.
What looks like emergent social cognition is mostly pretraining-derived capability being elicited, not generated.
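The frozen-weight argument behind this takeaway can be sketched with a toy stand-in for a deployed model. `FrozenModel`, its scoring rule, and the example turns are hypothetical; the only point is that with immutable weights, any change across turns has to come from the accumulated context.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FrozenModel:
    """Toy stand-in for a deployed LLM: weights are immutable at inference time."""
    weights: tuple

    def respond(self, context: tuple) -> int:
        # Output depends only on the fixed weights and the supplied context.
        # The integer is a placeholder for "apparent capability".
        return sum(self.weights) + len(context)

model = FrozenModel(weights=(1, 2, 3))

context = ()
scores = []
for turn in ("hi", "what does Sally think?", "and Anne?"):
    context = context + (turn,)          # the only thing that grows
    scores.append(model.respond(context))

# A fresh conversation with the same model starts from the same place.
fresh_score = model.respond(("hi",))
print(scores, fresh_score, model.weights)
```

The score rises over the conversation and resets when the context is reset, while `model.weights` never changes. In this toy, the "growth" lives entirely in the context.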
Distilled across all ten dossiers — eight load-bearing recommendations for engineers building agent systems: