A Teenager’s Frustration, a Researcher’s Revelation
(Note: this is a deeply nerdy / technical post, with most applicability to folks building AI systems, and of little relevance to most users of AI.)
In the late 90s, as a tech-obsessed teenager, I picked up Marvin Minsky’s 1986 book The Society of Mind expecting profound answers about intelligence. It was exciting: Minsky made AI seem so tractable, with beautiful essays arguing that the mind is composed of countless simple “agents” – little processes that individually do almost nothing, yet collectively produce what we call thinking. Between this and Hofstadter’s Gödel, Escher, Bach, it seemed like intelligence as an emergent phenomenon was just around the corner. But in the many years that followed, any progress in AI came from systems that felt very different from the one Minsky described.
Eventually, I dismissed Minsky’s theory as an interesting relic of AI history, far removed from the sleek deep learning models and monolithic AI systems rising to prominence.
Fast forward to 2025, and my perspective has flipped. After a decade of working with large language models and AI systems, I’m struck by how prescient Minsky’s ideas now seem. The AI field is running into the limits of gigantic, monolithic models – like today’s large language models (LLMs) that try to do everything in one go – and increasingly looking toward modular, multi-agent approaches. Techniques that once sounded fanciful in Society of Mind, like collections of specialized “mini-AIs” and internal self-monitoring agents, are re-emerging as practical strategies for building more robust, scalable, and aligned AI.
As a historian of technology, I’ve seen foundational ideas cycle back into relevance, and Minsky’s vision of a “society” inside a mind is a prime example. Today, AI researchers and engineers are essentially operationalizing the Society of Mind – often without realizing it – through architectures that value diversity of components over any single all-powerful algorithm.
Let’s explore how Minsky’s insights on modularity, agent-like architectures, and internal oversight map onto current developments: from Mixture-of-Experts models and multi-agent systems (HuggingGPT, AutoGen) to new approaches in AI alignment and debates over centralized vs. decentralized designs.
Minsky’s Vision: Mind as a Society of Simple Agents
Minsky’s core proposal in The Society of Mind is elegant and radical: “The power of intelligence stems from our vast diversity, not from any single, perfect principle.”
Rather than a single, unified “genius” solving problems, our minds are portrayed as assemblies of many tiny agents, each with limited ability. These agents form hierarchies and teams (which Minsky calls “agencies”), where each sub-agent handles a piece of a task, and higher-level agents coordinate or choose which agents should act. Intelligence, in this view, emerges from the interplay of lots of simple parts, much like a society’s culture emerges from many individuals. No single component understands the whole, yet together they achieve complex, adaptive behavior.
Importantly, Minsky also anticipated the need for oversight and self-regulation within this agent society. He described how minds avoid mistakes by employing what he called “negative expertise” – essentially knowledge about what not to do. In Minsky’s model this takes the form of special “censor” and “suppressor” agents that watch for dangerous or unproductive impulses. “Censors suppress the mental activity that precedes unproductive or dangerous actions, while suppressors suppress those unproductive or dangerous actions themselves,” he wrote. In other words, one part of the mind can veto or inhibit another, providing a safety check against runaway behaviors or known pitfalls.
Minsky went further to imagine a B-brain (and A-brain) structure – essentially a division between the mind’s primary problem-solving processes and a higher-level monitoring process. The “A-brain” handles the world directly (perceiving and acting), while the “B-brain”’s job “is not to think about the outside world, but rather to think about the world inside the mind.” The B-brain watches the A-brain in action, looking for errors like infinite loops or bad reasoning, and intervenes to correct them.
This is an early vision of an internal oversight agent dedicated to keeping the rest of the system on track – a clear parallel to what we now discuss in AI as alignment techniques and self-reflection modules. Minsky’s B-brain and censor agents are essentially built-in safety mechanisms for an intelligent system, ensuring the “society” of mind doesn’t run off the rails. At the time, these ideas were speculative and not grounded in concrete engineering, but they laid out a blueprint for modular, introspective intelligence that feels uncannily relevant to today’s challenges.
What I didn’t understand as a teenager was that Minsky’s book was more a philosophical framework than an implementation plan. It didn’t show how to build a thinking machine, but it offered a set of concepts – from “K-lines” (memory triggers) to “frames” (knowledge structures) – that influenced how some AI researchers thought about cognition. For a while, though, mainstream AI drifted in other directions.
From One Big Brain to Many: The Limitations of Monolithic AI
Why do Minsky’s ideas resonate more in 2025 than they did a decade or two ago? One reason is the growing awareness of limitations in monolithic AI architectures. Through the 2010s and early 2020s, the dominant strategy in AI was to train ever-larger neural networks on ever-broader data. Models like GPT-3 and GPT-4 are single behemoth brains – one big algorithm trying to do everything. This approach yielded remarkable breakthroughs. But as we push it further, cracks are showing. Giant all-in-one models can be astonishingly clever in some respects, yet surprisingly brittle in others: they can still get confused by multi-step reasoning, struggle with long-horizon planning, hallucinate false information with supreme confidence, and lack any built-in mechanism to check or justify their own outputs.
Essentially, a monolithic LLM is a talented mimic (stochastic parrot?) that doesn’t truly know when it’s wrong.
Moreover, having a single model handle every aspect of a complex task can be inefficient. If you prompt a lone large model to “plan a research project, write code for it, analyze data, then draft a report,” you might find it loses coherence or makes errors as it tries to juggle all those subtasks in one go. While one-shot end-to-end prompting is convenient, it often isn’t the most reliable way to get complex jobs done. Just as importantly, a single huge model is a single point of failure – it lacks the diversity of approach that multiple specialized modules could offer.
In a sense, today’s giant neural networks embody the opposite of Minsky’s emphasis on diversity: they seek a single universal principle (a large Transformer) that can handle all problems if scaled up enough. While straightforward scaling up (more neurons, larger training data sets) brought tremendous gains in the first wave of LLMs, it turns out that this “one model to rule them all” mentality has diminishing returns.
Instead of relying on one mega-model for everything, why not compose several models, each an expert in a particular function, and have them collaborate? This is where Minsky’s vision starts to map onto current trends. The pendulum in AI research is swinging from extreme centralization back toward decentralized architectures – effectively, towards building a “society of models” rather than one solitary model. Let’s look at how this is unfolding in practice.
Mixture-of-Experts: A “Society” Inside One Model
One prominent example of modularity gaining traction is the Mixture-of-Experts (MoE) architecture. Mixture-of-Experts models split a neural network into many specialized sub-networks (experts) and a gating mechanism that routes each input to the most appropriate expert. In essence, an MoE is not one neural network but a team of smaller networks each trained on different facets of the data. The gating network plays the role of a dispatcher, deciding, for instance, that one “expert” handles questions about code while another handles questions about history, etc.
As the Hugging Face team explains, in an MoE “layers have a number of ‘experts’ (e.g., 8), where each expert is a neural network… and a gate network or router determines which tokens are sent to which expert.” Only a few experts are active for any given input, making the computation sparse and efficient. This design allows models to scale to enormous parameter counts without proportional increases in runtime cost – in fact MoE approaches have enabled training trillion-parameter models by leveraging many experts efficiently.
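To make the routing mechanics concrete, here is a minimal sketch of a sparse MoE layer in PyTorch – my illustration of the pattern just described, not any production implementation: a small gate scores the experts for each token, only the top-k experts run, and their outputs are blended using the gate’s weights.

```python
# Minimal sparse Mixture-of-Experts layer (illustrative sketch, not production code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Each "expert" is just a small feed-forward network.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts)
        ])
        # The gate (router) produces one score per expert for each token.
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        scores = self.gate(x)                              # (tokens, experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)  # each token picks its top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e                # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Route a batch of 16 token vectors through the expert team.
layer = SparseMoELayer(d_model=64)
y = layer(torch.randn(16, 64))
```

Only two of the eight experts do any work for a given token, which is exactly how MoE models keep runtime cost roughly flat while the total parameter count grows.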
Mixture-of-Experts can be seen as a contemporary embodiment of Minsky’s idea that different problems may require different solution methods. Instead of one uniform strategy baked into a monolith, an MoE model dynamically selects from a toolbox of strategies. Each expert may develop its own “skill” (one might excel at grammar, another at common-sense reasoning, etc.), akin to agents in a society with different roles. Notably, MoEs echo an old insight from cognitive science: that humans might internally use multiple specialized processes for different tasks. In fact, some cognitive modeling research has argued that MoEs provide a “psychologically plausible” model of how people partition knowledge and strategies.
While MoEs alone don’t give us full agent-like behavior (the experts aren’t independently reasoning agents, just parts of a network), they break the illusion of a single unified model. Within an MoE, there is a community of subnetworks, each contributing in turn. This is a step toward a Society of Mind – embracing diversity inside the model. It’s also an early hint that modularity can beat scale alone: a mixture with 100 smaller experts can outperform an equivalently large homogeneous model, because those experts can specialize and avoid each other’s pitfalls.
Multi-Agent Systems: Toward a Society of Models
In multi-agent AI systems, instead of one model, we have multiple AI agents (often powered by LLMs or other models) that explicitly communicate, coordinate, and collaborate to solve tasks. It’s the difference between a lone problem-solver and a team of specialists working together. Since 2023, a flurry of research prototypes and frameworks has embraced this idea, effectively creating AI “societies” in software.
For example, HuggingGPT (a project from Microsoft Research) proposes using a large language model as a controller that can call on other pretrained models as needed. The central LLM (e.g. ChatGPT) serves as a planner and orchestrator: it interprets a user’s request, then “acts as a controller to manage existing AI models to solve complicated tasks”. In practice, HuggingGPT will break a complex request into subtasks and, for each subtask, invoke a specialized model from the vast Hugging Face model ecosystem (perhaps a vision model for an image subtask, a speech model for audio, etc.). The LLM controller figures out which expert is needed when, routes the query to it, then integrates the results into a final answer. This is essentially a Society-of-Models architecture: one high-level “brain” that knows how to delegate, and many narrow “doers” that execute specific tasks. It has proven effective at handling multi-modal requests that no single model could solve alone. The multi-agent philosophy is clear: use the right tool for each job, and use an intelligent coordinator to oversee the workflow.
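In code, the controller pattern boils down to something like the toy sketch below – my illustration, not HuggingGPT’s actual implementation. The call_llm helper and the specialist entries are hypothetical stand-ins for a general-purpose LLM and task-specific models.

```python
# Toy "planner + specialists" controller, in the spirit of HuggingGPT (illustrative only).
import json

def call_llm(prompt: str) -> str:
    """Hypothetical call to a general-purpose LLM; swap in your client of choice."""
    raise NotImplementedError

SPECIALISTS = {
    # task type -> hypothetical specialist model wrapper
    "image-caption": lambda payload: f"[caption of {payload}]",
    "speech-to-text": lambda payload: f"[transcript of {payload}]",
    "translation":    lambda payload: f"[translation of {payload}]",
}

def solve(request: str) -> str:
    # 1. Planning: ask the controller LLM to decompose the request into typed subtasks.
    plan = json.loads(call_llm(
        "Split this request into subtasks as JSON "
        '[{"task": one of ' + ", ".join(SPECIALISTS) + ', "input": ...}]:\n' + request
    ))
    # 2. Dispatch: route each subtask to the matching specialist model.
    results = [SPECIALISTS[step["task"]](step["input"]) for step in plan]
    # 3. Integration: let the controller LLM compose the final answer.
    return call_llm(
        f"User asked: {request}\nSubtask results: {results}\nWrite the final answer."
    )
```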
Another example is AutoGen, an open-source framework that lets developers compose multiple LLM agents that converse with each other to accomplish goals. With AutoGen, you can create, say, a pair of agents – one charged with brainstorming ideas and another with critiquing them – and let them chat back and forth to refine a solution. Or you might have a chain of agents each transforming the output of the previous (one generates a plan, the next turns it into pseudocode, another into actual code, etc.). AutoGen provides the infrastructure for such agent conversations: “Agents are customizable, conversable, and can operate in various modes (LLMs, tools, human inputs)… [it] allows developers to create flexible agent behaviors and conversation patterns for different applications.” In essence, it’s a platform for building a society of AI agents tailored to your task. Early applications range from coding assistants to supply-chain optimizers, all using teams of agents under the hood.
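For a flavor of what this looks like, here is a minimal two-agent setup using AutoGen’s conversable-agent classes. Exact options vary between AutoGen versions, and the llm_config values are placeholders, so treat this as a sketch of the pattern rather than copy-paste code.

```python
# Two AutoGen agents (a brainstormer and a critic) refine a design by conversing.
from autogen import AssistantAgent

llm_config = {"config_list": [{"model": "gpt-4", "api_key": "YOUR_KEY"}]}  # placeholder

brainstormer = AssistantAgent(
    name="brainstormer",
    system_message="Propose concrete solutions to the problem you are given.",
    llm_config=llm_config,
    max_consecutive_auto_reply=3,   # cap the back-and-forth
)
critic = AssistantAgent(
    name="critic",
    system_message="Critique each proposal, point out flaws, and ask for a revision.",
    llm_config=llm_config,
    max_consecutive_auto_reply=3,
)

# The critic opens the conversation; the two agents then alternate turns.
critic.initiate_chat(
    brainstormer,
    message="Design a caching layer for repeated LLM calls in a multi-agent pipeline.",
)
```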
Beyond specific products, there’s a general pattern emerging across many experimental setups: multi-agent systems often adopt functional roles very reminiscent of Minsky’s agencies. A complex AI application might assign, for instance, one agent as a planner, others as workers on subtasks, perhaps one as a quality-checking critic, and another as a memory store. This structure has organically appeared in projects like AutoGPT (the autonomous AI agent loop that captured popular imagination), and is discussed widely in the AI community. As one observer summarized, “We’re seeing more systems now where: one agent plans or delegates; others handle subtasks like code, retrieval, or summarization; critics check outputs; memory agents preserve long-term state.” Individually, each agent is limited – “none of these agents are doing anything miraculous” on its own – but together they achieve results that a single model struggles with, “especially long-horizon, multi-step tasks.”
The multi-agent approach shines at tasks that require decomposition and iteration: instead of forcing one model to internally juggle everything, the system explicitly breaks the problem into parts and lets each part be handled by an expert (sound familiar?). It’s effectively task decomposition and delegation, which has been a known strategy in classical AI planning for decades, now supercharged with LLM-driven agents. Significantly, teams of agents can also operate in parallel, tackling different subtasks simultaneously – a huge speed and scalability win over a single-threaded large model.
What’s striking is how often these setups rediscover dynamics Minsky anticipated. Consider the use of a critic agent to catch mistakes: some projects use a “solver + critic + refiner” loop where one agent proposes a solution, another agent checks it or tries to find flaws, and then the first (or a third) agent revises the solution. This is basically an internal debate and refinement process. And indeed, practitioners report that “a solver proposes, a critic flags issues, and a refiner improves the answer – this kind of structure consistently improves factual accuracy” in LLM outputs. In other words, building a mini-society of mind with an internal adversarial dialogue yields more trustworthy results than relying on a single model’s first answer.
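Stripped to its essentials, that loop is only a few lines of orchestration. In the sketch below, call_llm is a hypothetical helper wrapping whatever model you use, and the prompts are illustrative rather than taken from any particular project.

```python
# Bare-bones solver -> critic -> refiner loop (illustrative sketch).
def call_llm(prompt: str) -> str:
    """Hypothetical single LLM call; swap in your client of choice."""
    raise NotImplementedError

def solve_with_critique(question: str, rounds: int = 2) -> str:
    answer = call_llm(f"Answer the question:\n{question}")
    for _ in range(rounds):
        # Critic: look for factual or logical problems in the current answer.
        critique = call_llm(
            f"Question: {question}\nProposed answer: {answer}\n"
            "List factual or logical problems with this answer, or reply 'OK' if there are none."
        )
        if critique.strip().upper().startswith("OK"):
            break
        # Refiner: revise the answer using the critic's feedback.
        answer = call_llm(
            f"Question: {question}\nPrevious answer: {answer}\n"
            f"Critique: {critique}\nWrite an improved answer that fixes these issues."
        )
    return answer
```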
Echoes of Minsky: it’s a direct realization of the idea that you need different viewpoints (at least a generator and an evaluator) to approach truth. Similarly, the “memory agent” concept in multi-agent frameworks (an agent dedicated to storing and recalling information) harkens back to Minsky’s discussion of agents that specialize in remembering past states (like his “K-lines”). Modern systems like LangChain’s memory modules or vector database plugins essentially serve this role, giving the agent society a longer attention span than any individual neural net’s context window.
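A memory agent can be surprisingly small. The sketch below stores past observations as embedding vectors and recalls the most similar ones on demand; embed is a hypothetical embedding call standing in for a sentence-embedding model or a vector-database client.

```python
# Toy memory agent: embed what happened, recall the nearest memories later.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical text-embedding call returning a unit-length vector."""
    raise NotImplementedError

class MemoryAgent:
    def __init__(self) -> None:
        self.texts: list[str] = []
        self.vectors: list[np.ndarray] = []

    def remember(self, text: str) -> None:
        self.texts.append(text)
        self.vectors.append(embed(text))

    def recall(self, query: str, k: int = 3) -> list[str]:
        if not self.texts:
            return []
        q = embed(query)
        # Cosine similarity reduces to a dot product for unit-length vectors.
        sims = np.array([float(v @ q) for v in self.vectors])
        top = sims.argsort()[::-1][:k]
        return [self.texts[i] for i in top]
```

The other agents simply call remember() after each step and prepend the results of recall() to their prompts, giving the society a shared long-term memory.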
All told, the trend toward multi-agent systems represents a shift from centralized to decentralized intelligence in AI. Rather than one model that pretends to be all-knowing, we have assemblies of models that collectively cover more ground. This distributed approach is inherently more modular: one can swap out or upgrade an agent without retraining the whole system, just as in a company you can replace a team member without rebuilding the organization from scratch. It’s also more interpretable in many cases – you can follow the chain of decisions across agents, which is easier to dissect than the hidden layers of a single giant network. We’re basically acknowledging that intelligence might be better achieved through a community of simpler intelligences, much like Minsky posited.
And interestingly, this mirrors how human organizations solve complex problems (divide the work between departments/experts) and even how some cognitive scientists theorize the human mind itself might work (different brain regions and processes handling different aspects of cognition). What was once an abstract theory in The Society of Mind is starting to look like product architecture in real-world AI systems.
Internal Critics and AI Alignment: Censors, Self-Reflection, and the B-Brain Reborn
One of the most challenging aspects of modern AI is alignment: ensuring that AI systems behave in accordance with human values, produce truthful outputs, and are safe. Here again, Minsky’s ideas offer a compelling template. The notion of internal censor agents and a monitoring B-brain is essentially a blueprint for how one might build AI systems with internal alignment mechanisms. Instead of relying only on outside oversight or hard-coded rules, the system itself could contain subsystems devoted to keeping it honest and safe. Today’s researchers are exploring exactly this idea through techniques like internal critics, self-reflection prompts, and multi-agent debate.
For instance, there is growing evidence that prompting an LLM to critique and refine its own answers can significantly improve correctness. In recent studies, after an LLM gives an initial answer, it can be prompted (or another “referee” model can prompt it) to reflect on potential errors and try again. The results are striking: “LLMs are able to reflect upon their own chain-of-thought and produce guidance that can significantly improve problem-solving performance.” This self-reflection process often catches logical mistakes or inconsistencies that the model made on the first pass. In effect, the model is split into a first-pass “solver” and a second-pass “critic” – a very simple two-agent society.
The improvement comes from the interplay between these roles. Researchers Matthew Renze and Erhan Guven, for example, found that across multiple domains and models, adding a self-reflection step boosted accuracy on difficult questions, validating the power of internal feedback loops. Other work on “chain-of-thought” prompting, “self-correction” algorithms, and even Anthropic’s Constitutional AI (which has the model judge its outputs against a set of principles) follows a similar intuition: build a bit of Minsky’s B-brain into the AI. Have the AI think not just about the external answer, but also introspect on how it arrived at that answer and whether it might be going astray.
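The recipe itself is simple enough to sketch in a few lines: answer, reflect on the reasoning, then retry with the reflection in context. Again, call_llm is a hypothetical helper, and the prompts are illustrative rather than taken from any of the papers above.

```python
# Answer -> reflect on own reasoning -> retry with the reflection in context.
def call_llm(prompt: str) -> str:
    """Hypothetical single LLM call; swap in your client of choice."""
    raise NotImplementedError

def answer_with_reflection(question: str) -> str:
    first = call_llm(f"Think step by step, then answer:\n{question}")
    reflection = call_llm(
        f"Here is your previous reasoning and answer:\n{first}\n"
        "Reflect on it: where might the reasoning have gone wrong, "
        "and what should be done differently?"
    )
    return call_llm(
        f"Question: {question}\nPrevious attempt: {first}\n"
        f"Reflection: {reflection}\nUsing this reflection, give a corrected final answer."
    )
```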
We can also see this trend in more elaborate multi-agent alignment strategies. One proposal from DeepMind and others is to have AIs engage in a debate where one AI tries to convince a judge of an answer and another AI tries to point out flaws or deceptive reasoning. The hope is that through such an internal adversarial dialogue, the truth will prevail more often – the AI equivalent of the Socratic method. This is very much in spirit with Minsky’s censors and critics: harness parts of the system to watchdog other parts. Similarly, projects like OpenAI’s “assistant vs. adversary” training (red-teaming a model with another model) or the use of separate moderation models to filter a primary model’s outputs can be seen as forms of Society of Mind thinking – they add extra agent-like components focused on safety. In less formal settings, even a tool like the “Critic/Refiner” loop mentioned by a practitioner in a forum (where one agent continually reviews another’s work to eliminate “dumb mistakes”) is an example of alignment via internal critique. It’s essentially implementing Minsky’s negative expertise: one agent in the system is tasked not with generating new content, but with catching bad content (errors, harmful material, etc.) before it goes out.
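A toy version of that debate setup, with two models arguing and a third acting as judge, might look like the sketch below (call_llm is once more a hypothetical helper, not any lab’s actual training code):

```python
# Two debaters argue over a question; a judge model weighs the transcript.
def call_llm(prompt: str) -> str:
    """Hypothetical single LLM call; swap in your client of choice."""
    raise NotImplementedError

def debate(question: str, rounds: int = 2) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(rounds):
        transcript += "Pro: " + call_llm(
            transcript + "\nArgue for the best answer you can defend."
        ) + "\n"
        transcript += "Con: " + call_llm(
            transcript + "\nAttack the argument above; expose flaws or deceptive reasoning."
        ) + "\n"
    return call_llm(
        transcript + "\nAs the judge, decide which answer is better supported and explain why."
    )
```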
We have come to realize that complex AI systems, like complex human organizations, need governance structures. Just as societies have laws and regulators, AI “societies” might need internal moderators. Minsky’s theory provided names for these roles decades ago – now we’re finally seeing what they look like in practice.
Of course, these approaches are not silver bullets. Building reliable internal critics is hard – an AI censor might miss things or even introduce bias if it’s not well-designed. The research community is actively debating how best to achieve inner alignment. But the key point is that monolithic approaches to alignment (e.g. just train one model really well on all data and hope for the best) are giving way to more structured solutions. Whether it’s splitting a model into a question-answerer and a fact-checker, or having multiple agents cross-examine each other, the aligned AI of the future likely won’t be a single black box. It will be more like an ensemble with checks and balances – an idea The Society of Mind encapsulated in its metaphor of a mind with self-scrutiny built in.
Centralized vs. Decentralized AI Architectures: The Pendulum Swings
The resurgence of Minsky’s ideas also reflects a broader cyclical debate in technology: centralized versus decentralized design. In AI, a centralized (monolithic) architecture means one integrated model or system handles everything, whereas a decentralized (modular) architecture breaks functionality into components that interact. Both approaches have trade-offs. Monolithic systems can be simpler to build initially and may leverage emergent properties when everything is trained together – sometimes a big model finds a clever internal representation that a collection of small models might miss. Modular systems, on the other hand, offer flexibility, specialization, and fault tolerance – you can optimize each piece for its task, and the failure of one component might be caught or compensated by others.
We’re currently in a phase where the benefits of modular, multi-agent designs are becoming hard to ignore. As one AI engineer quipped, “multi-agent setups today are basically operationalizing Society of Mind… it’s coordination over raw scale now.” Rather than purely increasing the scale of one model’s parameters, researchers are coordinating multiple semi-intelligent pieces. Still, the debate isn’t settled. Some tasks might still be best handled by a single unified model – for example, tightly coupled problems where all parts share the same context.
Additionally, modular systems bring complexity: stitching together a bunch of models with APIs, prompts, or communication protocols is more complicated to engineer and can introduce new failure modes at the interfaces. In one discussion, developers noted that modularity can add opportunities for errors or confusing behaviors at the boundaries, and may reduce the chance for emergent behaviors that a single large model might exhibit. In other words, if you cut an intelligence into parts, you might lose some “magic” that comes from the whole. These are valid concerns, and it’s likely that the optimal solutions will hybridize both philosophies – much as the human brain has localized modules and a deeply interconnected whole.
From a historical perspective, AI has swung between these paradigms before. The early symbolic AI systems of the 1960s-80s were often modular (different subsystems for vision, planning, etc.), then the pendulum swung to end-to-end learning where a single deep network does everything from perception to action. Now, with the advent of very powerful but opaque large models, we’re swinging back to adding structure and decomposition on top of them. It’s a reminder that there is no one-size-fits-all approach. Minsky’s Society of Mind doesn’t say that a society is always better than an individual mind; it says that what we think of as an “individual mind” might itself be a society at another level of description. Likewise, the best AI systems might treat a single large model as an agent in a higher-level society – for instance, using a GPT-4 as one component among many in a larger workflow. The lines between monolithic and modular can blur, as we nest levels of organization. The crucial realization in 2025 is that we are not constrained to choose one or the other; we can compose intelligent systems with both powerful sub-brains and smart orchestration among them.
Conclusion: From 1986 to 2025, Coming Full Circle
Nearly four decades ago, Marvin Minsky offered an answer to “How does the mind work?” that sounded more like science fiction than engineering: the mind works by being a society. Today, as we design advanced AI systems, we find ourselves rediscovering that wisdom in tangible ways. What once seemed an oversimplification – breaking intelligence into little pieces – is proving to be a source of strength. Modularity, heterogeneity, and interaction may be the keys to AI that is more capable and more controllable. We are, in essence, building societies of mind in our software: whether it’s an ensemble of expert models routed by a gating network, a swarm of autonomous agents collaborating on a task, or a single chatbot augmented with an inner voice of reason checking its work.
As someone who initially felt frustration at Minsky’s book and now feels admiration, I find this turn of events profoundly satisfying. It highlights how ideas in technology often come full circle. The challenges of aligning AI with human values, of achieving generality without sacrificing reliability, of scaling intelligence without losing control – these are leading us back to architectures that explicitly incorporate multiple perspectives and modular oversight. Back then, Society of Mind was a grand philosophical framework; now it’s starting to look like a practical blueprint for AI design.
Of course, we are still in the early days of multi-agent and modular AI. Many open questions remain: How do we best divide cognitive labor among agents? How do we ensure the overall system is coherent and efficient? Will autonomous agent societies exhibit unexpected emergent behaviors, for better or worse? How do we govern a complex web of interacting intelligences? These are questions Minsky’s work can’t answer directly, but it provides a language and lens to approach them. The value of looking back at The Society of Mind is not in finding a ready-made solution, but in appreciating the foresight of viewing intelligence as plural.
In 2025, the AI community is increasingly viewing intelligence at scale not as a single monolith, but as an ecosystem of models, modules, and feedback processes. It took us decades of detour through neural networks and end-to-end learning to appreciate what Minsky and others intuited: that sometimes, solving a hard problem requires many minds. As we build the next generation of AI – perhaps on the road to artificial general intelligence – the design principles from The Society of Mind are proving not only relevant but essential. In embracing societies of agents, we aren’t just validating an old theory; we’re crafting a future where AI is more robust, more aligned, and perhaps a bit more human-like in its diversity. And to teenage me, who tossed aside Minsky’s book years ago – I would say, hang onto those “simplistic” ideas, because one day they might light the way forward.