Cross-cutting analysis

The convergent pattern is the finding.

Zero claims were vindicated; zero were cleanly refuted. All ten landed in CONTESTED or SPLIT. Strong forms, stated in overspecified language, systematically fail. Weak forms, meaning algorithmic-level analogies that are productive on a restricted subspace, systematically survive.

Verdict matrix

| # | Claim | Verdict | What defeats the strong form |
|---|---|---|---|
| 01 | Magical Number Seven | Contested | N-back tests on transformers reveal a continuous logarithmic decline, not a 7±2 cliff. |
| 02 | Thalamic-Cortical Equivalence | Contested | Burst/tonic firing, driver/modulator asymmetry, and neuromodulation break strong homology. |
| 03 | Persona States | Contested | No confirmatory factor analysis (CFA) on behavioral outputs against the human Big-Five state model; 20% scale shifts from item reordering. |
| 04 | Metacognition | Split | Schaeffer et al. (NeurIPS 2023) shows phase transitions are mostly metric artifacts. |
| 05 | RAG TOT | Contested | Mechanism inversion: tip-of-the-tongue (TOT) is a form-access failure with semantics intact; RAG is the opposite. |
| 06 | CoT Phenomenology | Split | IIT 4.0 predicts near-zero Φ for transformers; CoT faithfulness fails systematically (Manuvinakurike 2025). |
| 07 | Spontaneous ToM | Refuted | Frozen weights → no gradient flow → 'growth' must be in-context accumulation, not new capability. |
| 08 | Sleep Consolidation | Split | 7 years of progressively faithful replay implementations close some, not most, of the gap. |
| 09 | Active Forgetting | Split | Post-hoc machine unlearning damages capabilities; gradient ascent breaks general competence. |
| 10 | Cortical Column | Split | The cortical column itself is a contested empirical unit (Horton & Adams 2005). |

Five recurring failure modes

1. Marr-level confusion

Six of ten claims confuse Marr's three levels of analysis (implementation, algorithmic, computational). Brain ↔ transformer mappings hold at the algorithmic level (sometimes at the computational level) and break catastrophically when smuggled down to the implementation level.

2. Strong words that smuggle in unstated equivalences

"Homologous" is an evolutionary/structural term, not an algorithmic one. "Spontaneously emerged" implies a novel capability, not the elicitation of one already acquired in pretraining. "Phenomenally conscious" collapses Block's distinction between phenomenal and access consciousness. "Threshold" implies phase transitions where the data shows smooth scaling.
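The "threshold" point can be made concrete with a toy calculation in the spirit of the metric-artifact argument cited in the verdict matrix. This is a minimal sketch, not a reproduction of any study: the per-token accuracies, the answer length of 30, and the independence assumption are all illustrative choices, and `exact_match_rate` is a hypothetical helper name.

```python
# Toy illustration: a smoothly improving quantity can look like a threshold
# when scored with an all-or-nothing metric. All values are assumed.

ANSWER_LENGTH = 30  # hypothetical number of tokens that must all be correct


def exact_match_rate(per_token_accuracy: float, length: int = ANSWER_LENGTH) -> float:
    """Probability that every token is correct, assuming independent tokens."""
    return per_token_accuracy ** length


# Per-token accuracy rising smoothly (assumed values, not measurements).
per_token = [0.80, 0.85, 0.90, 0.95, 0.99]

for p in per_token:
    print(f"per-token={p:.2f}  exact-match={exact_match_rate(p):.3f}")
```

Under these assumptions the exact-match column stays near zero for most of the range and then climbs steeply, even though the underlying per-token accuracy never jumps. The apparent threshold comes from the metric, not from the model.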

3. Confirmation-bias trap

Studies citing 7-entity setups as "evidence" for Miller's law had picked 7 as a methodological convenience. The number was an experimenter choice, not a measured ceiling.
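The difference between an experimenter choice and a measured ceiling can be sketched as follows. This is an illustrative sketch only: `synthetic_accuracy` is a stand-in for a real evaluation (a smooth logarithmic decline with an assumed slope), and `sweep` and `largest_reliable_count` are hypothetical helper names.

```python
import math
from typing import Callable, Dict, Optional


def sweep(measure: Callable[[int], float], counts: range) -> Dict[int, float]:
    """Measure tracking accuracy at each entity count in the sweep."""
    return {n: measure(n) for n in counts}


def largest_reliable_count(results: Dict[int, float], threshold: float) -> Optional[int]:
    """Largest entity count whose accuracy meets the threshold.

    Returns None unless the sweep contains counts both above and below the
    threshold: without a crossing, no ceiling has actually been observed.
    Assumes accuracy declines roughly monotonically with entity count.
    """
    passing = [n for n, acc in results.items() if acc >= threshold]
    failing = [n for n, acc in results.items() if acc < threshold]
    if not passing or not failing:
        return None
    return max(passing)


def synthetic_accuracy(n: int) -> float:
    """Stand-in for a real evaluation: smooth logarithmic decline (assumed)."""
    return max(0.0, 1.0 - 0.2 * math.log(n))


single_point = sweep(synthetic_accuracy, range(7, 8))  # a 7-entities-only study
full_sweep = sweep(synthetic_accuracy, range(1, 33))   # entity counts 1..32

print(largest_reliable_count(single_point, threshold=0.7))  # no crossing observed
print(largest_reliable_count(full_sweep, threshold=0.7))
```

A study that only ever runs at 7 entities cannot report a ceiling at all. With a full sweep, the reported number also moves with the chosen `threshold`, which is what one would expect if the underlying decline is smooth rather than a cliff.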

4. Confounded causal channels

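"Spontaneous ToM" conflates pretraining-derived ToM (real, well-documented) with interaction-emergent ToM. With frozen weights there are no gradient updates, so an interaction loop cannot produce new capability; any apparent growth has to come from what accumulates in context.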

5. Partial unfalsifiability under current operationalization

Claims 6, 8, and the mechanistic part of 4 end up partially unfalsifiable. The right move is to admit it and propose discriminators; the wrong move is to ride the analogy.

Cross-cutting threads

Engineering recommendations

Distilled across all ten dossiers: eight load-bearing recommendations for engineers building agent systems.

  1. Architect for ~4–8 reliably tracked entities in active context, regardless of nominal window size. (Claim 1)
  2. Track retrieval-confidence and generation-confidence as separate signals. (Claim 5)
  3. Don't trust intrinsic self-correction. Close monitoring loops with external verifiers. (Claims 4, 6)
  4. Build active forgetting into context-management pipelines. (Claim 9)
  5. Implement dual-store architectures with replay, inspired by complementary learning systems (CLS). (Claim 8)
  6. Design multi-agent loops to elicit pretraining-derived ToM, not generate new capability. (Claim 7)
  7. Persona safety analysis must cover both semantic and activation-space pathways. (Claim 3)
  8. Use neuroscience as inspiration for new architectures, not as validation of current ones. (Claims 2, 10)
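
Recommendations 1 and 4 can be combined in a small context manager. The following is a minimal sketch under stated assumptions, not a prescribed design: the class and field names (`ActiveContext`, `TrackedEntity`, `touch`, `render`) are hypothetical, the default budget of 6 is simply a value inside the ~4–8 range, and least-recently-used eviction is one possible forgetting policy among many.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TrackedEntity:
    name: str
    summary: str
    last_used_turn: int


@dataclass
class ActiveContext:
    """Keeps at most `budget` entities; evicts the least recently used."""
    budget: int = 6  # assumed value inside the ~4-8 range of recommendation 1
    turn: int = 0
    entities: Dict[str, TrackedEntity] = field(default_factory=dict)
    archive: List[TrackedEntity] = field(default_factory=list)

    def touch(self, name: str, summary: str) -> None:
        """Record that an entity was mentioned or updated this turn."""
        self.turn += 1
        self.entities[name] = TrackedEntity(name, summary, self.turn)
        self._forget_over_budget()

    def _forget_over_budget(self) -> None:
        """Active forgetting: move stale entities out of the prompt-facing set."""
        while len(self.entities) > self.budget:
            stalest = min(self.entities.values(), key=lambda e: e.last_used_turn)
            self.archive.append(self.entities.pop(stalest.name))

    def render(self) -> str:
        """Text block to place in the model's context, oldest entity first."""
        ordered = sorted(self.entities.values(), key=lambda e: e.last_used_turn)
        return "\n".join(f"- {e.name}: {e.summary}" for e in ordered)


# Usage: a budget of 3 forces one eviction once a fourth entity appears.
ctx = ActiveContext(budget=3)
for name in ["alice", "bob", "carol", "dave"]:
    ctx.touch(name, f"notes about {name}")
ctx.touch("bob", "updated notes about bob")
print(ctx.render())
```

Evicted entities go to `archive` rather than being deleted, so a separate retrieval step could bring them back on demand. The budget applies no matter how large the nominal context window is.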