Cross-cutting analysis

The convergent pattern is the finding.

Zero claims were vindicated; zero were cleanly refuted. All ten landed in CONTESTED or SPLIT. Strong forms, which depend on overspecified language, fail systematically. Weak forms, which treat the mapping as an algorithmic-level analogy that is productive on a restricted subspace, survive systematically.

Verdict matrix

#	Claim	Verdict	What defeats the strong form
01	Magical Number Seven	Contested	N-back tests on transformers reveal continuous logarithmic decline, not a 7±2 cliff.
02	Thalamic-Cortical Equivalence	Contested	Burst/tonic firing, driver/modulator asymmetry, and neuromodulation break strong homology.
03	Persona States	Contested	No confirmatory factor analysis (CFA) of behavioral outputs against the human Big-Five state model; item reordering alone produces 20% scale shifts.
04	Metacognition	Split	Schaeffer et al. (NeurIPS 2023) shows phase transitions are mostly metric artifacts.
05	RAG TOT	Contested	Mechanism inversion: TOT is form-access failure with semantic intact; RAG is the opposite.
06	CoT Phenomenology	Split	IIT 4.0 predicts near-zero Φ for transformers; CoT faithfulness fails systematically (Manuvinakurike 2025).
07	Spontaneous ToM	Refuted	Frozen weights → no gradient flow → 'growth' must be in-context accumulation, not new capability.
08	Sleep Consolidation	Split	7 years of progressively faithful replay implementations close some, not most, of the gap.
09	Active Forgetting	Split	Post-hoc machine unlearning damages capabilities; gradient ascent breaks general competence.
10	Cortical Column	Split	Cortical column itself is a contested empirical unit (Horton & Adams 2005).
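
The defeater in row 04 can be made concrete. The sketch below illustrates the general argument that a harsh metric can make smooth improvement look like a phase transition: if per-token accuracy rises smoothly with scale, exact-match accuracy over a long answer stays near zero for a long stretch and then climbs steeply. The logistic capability curve, its midpoint, and the 30-token answer length are illustrative assumptions, not values taken from the dossiers or from any cited study.

```python
import math


def per_token_accuracy(log10_scale: float) -> float:
    """Hypothetical smooth capability curve: a logistic in log10(model scale).

    The midpoint of 8.0 is an arbitrary illustrative choice.
    """
    return 1.0 / (1.0 + math.exp(-(log10_scale - 8.0)))


def exact_match(p: float, answer_len: int) -> float:
    """Chance that all answer_len tokens are right, assuming independent tokens."""
    return p ** answer_len


ANSWER_LEN = 30  # assumed answer length in tokens
scales = [7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0]
rows = [
    (s, per_token_accuracy(s), exact_match(per_token_accuracy(s), ANSWER_LEN))
    for s in scales
]

for s, p, em in rows:
    # Same underlying model, two metrics: one smooth, one apparently "emergent".
    print(f"scale=10^{s:.0f}  per-token={p:.3f}  exact-match={em:.3f}")
```

Under these assumptions the per-token column rises gradually while the exact-match column sits near zero and then jumps, even though nothing discontinuous happens in the underlying curve. A claim of a capability "threshold" therefore needs to be checked against a continuous metric before it is read as a phase transition.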

Five recurring failure modes

1. Marr-level confusion

Six of ten claims conflate Marr's three levels of analysis (computational, algorithmic, implementation). Brain ↔ transformer mappings hold at the algorithmic level, and sometimes at the computational level, but break catastrophically when carried down to the implementation level.

2. Strong words that smuggle in unstated equivalences

"Homologous" is an evolutionary/structural term, not an algorithmic one. "Spontaneously emerged" implies a novel capability, not the elicitation of a pretrained one. "Phenomenally conscious" collapses Block's distinction between phenomenal and access consciousness. "Threshold" implies a phase transition where the data show smooth scaling.

3. Confirmation-bias trap

Studies citing 7-entity setups as "evidence" for Miller's law had picked 7 as a methodological convenience. The number was an experimenter choice, not a measured ceiling.

4. Confounded causal channels

"Spontaneous ToM" conflates pretraining-derived ToM (real, well-documented) with interaction-emergent ToM (frozen-weight loops cannot produce new capability via gradient).

5. Partial unfalsifiability under current operationalization

Claims 6, 8, and the mechanistic part of 4 end up partially unfalsifiable. The right move is to admit it and propose discriminators; the wrong move is to ride the analogy.

Cross-cutting threads

Thread · memory

Memory & Context

LLMs are place-oriented memory systems with shallow effective capacity and no native consolidation/forgetting machinery.

01 Magical Number Seven · 05 RAG TOT · 08 Sleep Consolidation · 09 Active Forgetting

Thread · architecture

Architecture & Computation

Brain ↔ transformer mappings hold at the algorithmic level on a restricted subspace; strong 'homology' claims fail.

02 Thalamic-Cortical Equivalence · 10 Cortical Column

Thread · metacognition

Metacognition & Self-Model

LLM self-models are real but shallow; the legible CoT trace cannot be trusted as a window into them.

04 Metacognition · 06 CoT Phenomenology

Thread · social

Social & Identity

What looks like emergent social cognition is mostly pretraining-derived capability being elicited, not generated.

03 Persona States · 07 Spontaneous ToM

Engineering recommendations

Distilled from all ten dossiers: eight load-bearing recommendations for engineers building agent systems.

Architect for ~4–8 reliably tracked entities in active context, regardless of nominal window size. (Claim 1)
Track retrieval-confidence and generation-confidence as separate signals. (Claim 5)
Don't trust intrinsic self-correction. Close monitoring loops with external verifiers. (Claims 4, 6)
Build active forgetting into context-management pipelines. (Claim 9)
Implement CLS-inspired dual-store architectures with replay. (Claim 8)
Design multi-agent loops to elicit pretraining-derived ToM, not to generate new capability. (Claim 7)
Persona safety analysis must cover both semantic and activation-space pathways. (Claim 3)
Use neuroscience as inspiration for new architectures, not as validation of current ones. (Claims 2, 10)
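
Three of the recommendations above (Claims 1, 5, and 9) can be sketched together in a few lines. The example below is a minimal illustration, not a reference design: `ActiveContext`, `decide`, the budget of 6 entities, the least-recently-used eviction rule standing in for "active forgetting", and the 0.5 thresholds are all hypothetical choices made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ActiveContext:
    """Tracks a bounded set of entities in active context (hypothetical helper)."""

    max_entities: int = 6  # assumed budget inside the ~4-8 range
    _last_used: dict = field(default_factory=dict)  # entity -> step of last mention
    _step: int = 0

    def touch(self, entity: str) -> list:
        """Record a mention and return any entities evicted to stay in budget."""
        self._step += 1
        self._last_used[entity] = self._step
        evicted = []
        while len(self._last_used) > self.max_entities:
            # Forget the least recently mentioned entity first (assumed policy).
            stale = min(self._last_used, key=self._last_used.get)
            del self._last_used[stale]
            evicted.append(stale)
        return evicted

    def active(self) -> list:
        """Entities currently tracked, oldest mention first."""
        return sorted(self._last_used, key=self._last_used.get)


def decide(retrieval_conf: float, generation_conf: float,
           min_retrieval: float = 0.5, min_generation: float = 0.5) -> str:
    """Route on two separate signals instead of one blended score."""
    if retrieval_conf < min_retrieval:
        return "re-retrieve"        # the evidence is weak, so fetch again
    if generation_conf < min_generation:
        return "verify-externally"  # evidence is fine, the answer is shaky
    return "answer"


ctx = ActiveContext(max_entities=3)
for name in ["alice", "bob", "carol", "alice", "dave"]:
    dropped = ctx.touch(name)
    if dropped:
        print(f"forgot {dropped} after mentioning {name}")
print(ctx.active())
print(decide(retrieval_conf=0.9, generation_conf=0.2))
```

Keeping the two confidences separate means a low retrieval score and a low generation score trigger different recoveries, which a single blended score would hide. The eviction hook is also the natural place to hand forgotten entities to a slower store, in the spirit of the dual-store recommendation.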