Methodology.
Why the verdicts can be trusted as much as their evidence will allow — and where to push back.
One sentence
Generate ten ambitious hypotheses spanning AI ↔ brain, formulate each with a strong form (theory-grade equivalence) and an implicit weak form (the underlying engineering or algorithmic intuition), then dispatch parallel adversarial-validation agents that surface supporting and contradicting evidence, apply two prompt-level persona lens primers (the full persona skills are loaded later in the pipeline, only at the roundtable), articulate a steelman counterargument, and commit to a verdict.
Stage 1 — Hypothesis generation
The ten claims were chosen to span four sub-domains (memory, attention/architecture, metacognition, social cognition) and to be deliberately over-strong — to fail on the strong form so the weak form's structure could be examined cleanly. None were chosen because they were likely to be vindicated; several were chosen because they were likely to be unfalsifiable, and that itself is informative.
Stage 2 — Adversarial research per claim
Ten parallel research agents (one per claim) executed:
- 8–12 web searches across peer-reviewed neuroscience, ML conferences, arXiv, and credible commentary
- Surfacing of 4–6 supporting and 4–6 contradicting sources, plus 1–2 active-debate references
- Application of two analytical lenses via prompt-level persona primers. The roster: Karpathy, Joscha Bach, Hickey, Carmack, Bret Victor, Bryan Cantrill, Ayanokoji, pg, Taleb. Deployment was uneven — see How it was made for honest counts.
- Explicit steelman counterargument
- Verdict ∈ { VINDICATED, PLAUSIBLE, CONTESTED, REFUTED, UNFALSIFIABLE } with what would change it
- 5–10 papers-to-read with annotations
Stage 3 — Synthesis
After the 10 dossiers returned, a synthesis pass identified five recurring failure modes (Marr-level confusion, load-bearing strong words, confirmation-bias trap, confounded causal channels, partial unfalsifiability) and four cross-cutting threads. The convergent verdict pattern (0/10 vindicated, 0/10 cleanly refuted, all 10 in CONTESTED or SPLIT) emerged directly from the dossier-by-dossier process — it was not designed in.
Stage 4 — Product-surface stress test
The strongest engineering surface (claim 9) was developed into a product candidate and put through the 4-persona roundtable in full mode (Round 1 + Round 2 + neutral synthesis). The roundtable unanimously killed the proposed library architecture and converged on an immutable-log + pure-function-views architecture as the substrate. The current site + observatory primitive ships the architecture that survived all of that scrutiny.
Limits & honest caveats
- Citation verification. The reading list deliberately foregrounds older, well-established references; verification status is shown per item.
- Persona lenses are stylistic, not authoritative. They are an analytical convenience, not endorsements by the named individuals.
- Strong-form / weak-form is a prompted dichotomy. Some claims may have intermediate forms not surfaced.
- The convergent pattern is partly an artifact of construction. Strong forms were deliberately chosen to overshoot.
Reproducibility
The full corpus — claim formulations, persona prompts, roundtable transcripts, and angle-research dossiers — lives in the source repository. Critique a verdict by reading the dossier, the steelman, and the listed papers. Disagreements are PRs.