ASSC 2026 · Association for the Scientific Study of Consciousness
A unifying consciousness theory for reasoning in LLMs
We build an explicit global workspace inside a pretrained large language model by augmenting the Transformer. The workspace improves reasoning and increases the synergy between the model's components.
Leave your email and we will send the preprint, full results, and the codebase the moment they are out.
No spam, one email when it ships.
The idea
Theories of consciousness are normally tested in brains, where clean intervention is hard.
An LLM with an explicit workspace gives us a system we can build a theory's commitments into and manipulate directly.
We use it as a testbed for Global Workspace Theory, Integrated Information Theory (IIT), and how they relate.
Architecture
A capacity-limited spotlight selects salient components, writes them sparsely to a shared workspace, and broadcasts the result back to all components; this cycle is iterated across layers. Unlike prior workspace networks trained from scratch, ours runs on a pretrained model's KV cache, and we measure the integration it produces.
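The select, write, broadcast cycle can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `workspace_step`, the weight names `salience_w`, `write_w`, `read_w`, and the top-k softmax gate are all hypothetical, and the sketch works on a plain array of component vectors rather than on a pretrained model's KV cache.

```python
import numpy as np

def workspace_step(components, salience_w, write_w, read_w, k):
    """One spotlight -> write -> broadcast step (illustrative only).

    components: (n, d) array, one row per component (e.g. a head output).
    salience_w: (d,)   hypothetical salience scoring vector.
    write_w:    (d, m) hypothetical projection into an m-dim workspace.
    read_w:     (m, d) hypothetical projection back to component space.
    k:          spotlight capacity, the number of components admitted.
    """
    n = components.shape[0]
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of components")

    # Spotlight: score every component, admit only the k most salient.
    scores = components @ salience_w
    top = np.argsort(scores)[-k:]

    # Sparse write: softmax weights over the admitted components, zero elsewhere.
    gate = np.zeros(n)
    shifted = scores[top] - scores[top].max()
    gate[top] = np.exp(shifted) / np.exp(shifted).sum()
    workspace = gate @ (components @ write_w)        # shape (m,)

    # Broadcast: every component receives the same workspace content.
    broadcast = workspace @ read_w                   # shape (d,)
    return components + broadcast, workspace, gate

# Toy run: 8 components of width 16, a 4-dim workspace, capacity 2,
# iterated three times to stand in for successive layers.
rng = np.random.default_rng(0)
n, d, m, k = 8, 16, 4, 2
x = rng.normal(size=(n, d))
salience_w = rng.normal(size=d)
write_w = rng.normal(size=(d, m)) / np.sqrt(d)
read_w = rng.normal(size=(m, d)) / np.sqrt(m)

for _ in range(3):
    x, ws, gate = workspace_step(x, salience_w, write_w, read_w, k)

print("components admitted per step:", np.count_nonzero(gate))
```

In this sketch the capacity limit is the single parameter `k`: setting `k` equal to the number of components gives a dense broadcast, and smaller values give a narrower spotlight.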
Results · reasoning
Across four backbones, broadcasting selected information raises the five-task average over both SFT and BT, though not on every individual task. The workspace does functional work: the access claim made concrete.
| Method | GSM8K | SVAMP | GSM-Hd | LogiQA | Gaokao | AVG |
|---|---|---|---|---|---|---|
| Llama-3.2 1B | | | | | | |
| SFT | 33.00 | 42.00 | 8.00 | 28.90 | 23.90 | 27.16 |
| BT | 35.30 | 43.70 | 8.20 | 28.60 | 25.60 | 28.28 |
| +GW | 35.60 | 46.70 | 7.70 | 29.50 | 25.40 | 28.98 |
| Llama-3.2 3B | | | | | | |
| SFT | 53.98 | 64.67 | 14.33 | 30.26 | 31.05 | 38.86 |
| BT | 57.09 | 71.33 | 14.63 | 30.26 | 30.20 | 40.70 |
| +GW | 58.30 | 69.67 | 15.85 | 31.49 | 30.48 | 41.16 |
| Llama-3.1 8B | | | | | | |
| SFT | 13.12 | 23.33 | 3.11 | 27.19 | 28.21 | 18.99 |
| BT | 20.85 | 39.33 | 4.93 | 27.04 | 26.50 | 23.73 |
| +GW | 21.76 | 40.00 | 5.38 | 26.57 | 25.07 | 23.76 |
| Qwen3-0.6B | | | | | | |
| SFT | 53.70 | 68.30 | 20.30 | 27.50 | 26.80 | 39.32 |
| BT | 54.80 | 68.70 | 21.20 | 27.20 | 26.50 | 39.68 |
| +GW | 55.00 | 70.00 | 21.10 | 27.50 | 27.10 | 39.94 |
SFT: supervised fine-tuning · BT: Bottlenecked Transformer · +GW: global workspace (dense broadcast). AVG is the mean over the five tasks; the best value per column is shaded.
Results · efficiency
Averaged over the standard LM benchmarks · 355–356M params · 20B tokens · matched across TF / BT / BGT.
Results · performance vs cost
We observe: accuracy rises with spotlight width and is highest at full broadcast, but a narrow spotlight recovers most of the accuracy at far lower compute.
Results · integration
Synergy is the information in the whole minus the sum of its parts, over balanced head partitions. More positive means higher synergy, and our model (BGT) tends toward it.
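A toy calculation makes the sign convention concrete. The sketch below computes whole-minus-sum synergy for small discrete distributions; it is an assumption-laden illustration, since the page does not say which estimator is applied to attention-head activations, and the names `mutual_information` and `whole_minus_sum` are hypothetical. Here the "partition" is simply two single-variable parts.

```python
from collections import defaultdict
from itertools import product
from math import log2

def mutual_information(joint, a_idx, b_idx):
    """I(A;B) in bits. `joint` maps outcome tuples to probabilities;
    a_idx and b_idx pick the tuple positions that make up A and B."""
    pa, pb, pab = defaultdict(float), defaultdict(float), defaultdict(float)
    for outcome, p in joint.items():
        a = tuple(outcome[i] for i in a_idx)
        b = tuple(outcome[i] for i in b_idx)
        pa[a] += p
        pb[b] += p
        pab[(a, b)] += p
    return sum(p * log2(p / (pa[a] * pb[b]))
               for (a, b), p in pab.items() if p > 0)

def whole_minus_sum(joint, parts, target):
    """Information in the whole minus the sum over parts:
    I(all parts; target) - sum_i I(part_i; target)."""
    whole_idx = [i for part in parts for i in part]
    whole = mutual_information(joint, whole_idx, target)
    return whole - sum(mutual_information(joint, part, target) for part in parts)

# Joint distributions over (x1, x2, y).
xor_gate = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}
copy_gate = {(a, a, a): 0.5 for a in (0, 1)}

parts, target = [(0,), (1,)], (2,)
print(whole_minus_sum(xor_gate, parts, target))   # 1.0  (synergy-dominated)
print(whole_minus_sum(copy_gate, parts, target))  # -1.0 (redundancy-dominated)
```

XOR is the textbook synergistic case: neither input alone says anything about the output, but together they determine it. The copy case is the opposite, with both parts carrying the same bit, so the measure goes negative.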
[Figure: synergy for the SFT baseline, BT, and BGT, with 95% CIs; axis direction: less separable →.] All models remain redundancy-dominated, so the effect is a relative shift.
Results · performance vs integration
We observe: synergy and accuracy do not track each other monotonically. Accuracy peaks within a few steps and then collapses if the processor keeps iterating, even as raw synergy continues to climb.
Discussion
Specialised modules approximate a factorised, mean-field posterior that drops the dependencies between them. Synergy is the part no subset captures. The workspace re-couples a few modules to recover it and to escape the local optima that factorised inference gets stuck in, at an energy cost.
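The dropped dependencies can be given a size. As a worked identity (a sketch, assuming each factor is set to the exact marginal $p(x_i)$, which is not necessarily what mean-field inference or the modules here converge to), the gap between a joint distribution and its fully factorised version is the total correlation:

```latex
\[
\mathrm{KL}\!\left( p(x_1,\dots,x_n) \,\Big\|\, \prod_{i=1}^{n} p(x_i) \right)
= \sum_{i=1}^{n} H(X_i) \;-\; H(X_1,\dots,X_n) \;\ge\; 0 .
\]
```

For $n = 2$ this reduces to the mutual information $I(X_1;X_2)$, and it is zero only when the modules are independent. Note that this quantity counts all dependence between modules, redundant and synergistic alike.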
Trained on the model's own loss, the workspace keeps what the latent tells us about the output while discarding input detail (data processing inequality). Compress the input, keep the output: the condition for generalisation.
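Read as an information bottleneck, this corresponds to the standard formulation below. The symbols are assumptions introduced here for illustration, not the authors' notation: $X$ is the input, $Z$ the workspace latent, $Y$ the output, and $\beta > 0$ a trade-off weight.

```latex
\[
Y \leftrightarrow X \rightarrow Z
\quad\Longrightarrow\quad
I(Z;Y) \;\le\; I(X;Y)
\qquad \text{(data processing inequality)}
\]
\[
\min_{p(z \mid x)} \; I(X;Z) \;-\; \beta\, I(Z;Y)
\qquad \text{(information bottleneck objective)}
\]
```

The first term of the objective penalises input detail retained in $Z$; the second rewards what $Z$ says about $Y$, which the inequality caps at $I(X;Y)$.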
Compression and synergy are orthogonal: how much of the input survives, versus how it is organised across modules. The workspace does both. We see synergy rise where broadcast helps; we have not yet shown it is the cause.
Open questions
We have the testbed. Tell us what experiment would convince you.