How Did We Get Here? From Symbolic to Stochastic (Part 3)
08/29/2025


By Qwerky Editorial Staff

In Part 1, we looked at the early history of artificial intelligence, focusing on so-called “symbolic” approaches. Symbolic AI was the dominant paradigm in public and private research through the second “AI winter” of the late 1980s, and because it failed to live up to the promises made by some of its most outspoken proponents, funding for artificial intelligence as a whole suffered until quite recently.


In Part 2, we saw how a number of connectionist-inspired approaches succeeded in tackling problems that had proven difficult for the various symbolic approaches. However, the stochastic architecture of contemporary LLMs (built, as they are, on those earlier successes) leaves them vulnerable to an altogether different set of challenges, some of which may be answered by hybrid approaches.


In our third (and final) part, we’ll turn our attention to three novel ways of combining these two approaches and see how they attempt to address these issues.


Hybrid Approaches


“Neuro-symbolic” (“NeSy”) approaches to AI are roughly what the name suggests: hybrids that combine symbolic AI with connectionist approaches, such as artificial neural networks, that take their inspiration from the mechanisms of the human brain. Recent talks and papers on the topic (such as those by Angelo Dalli, Henry Kautz, Francesca Rossi and Bart Selman) have argued for incorporating the insights of psychologist Daniel Kahneman (of Thinking, Fast and Slow fame) into the practice of combining symbolic and stochastic systems. Kahneman’s “System 1” (characterized by quick, intuitive and generalized judgments) would be approximated with stochastic systems, while “System 2” tasks (more careful, deliberative reasoning) would be handled by various forms of more traditional symbolic processing. Gary Marcus, a proponent of NeSy, argues that “[w]e cannot construct rich cognitive models in an adequate, automated way without the triumvirate of hybrid architecture, rich prior knowledge, and sophisticated techniques for reasoning.” We already have examples of this in practice, such as Waymo’s autonomous vehicles, which “nest” a neural system within an overall symbolic one.
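To make that division of labor concrete, here is a minimal sketch in Python. Both components are invented for the example: fast_guess stands in for a neural model’s quick, over-generalizing intuition (System 1), and slow_check stands in for a symbolic rule layer that can veto it (System 2). It illustrates the nesting idea, not any production system.

```python
# A minimal sketch of the “System 1 / System 2” split described above.
# Both functions are hypothetical stand-ins, not real components.

def fast_guess(animal):
    # System 1: a cheap, pattern-like guess (a lookup standing in for a
    # trained classifier that over-generalizes "birds can fly").
    return {"penguin": "can fly", "sparrow": "can fly"}.get(animal, "unknown")

# System 2: explicit, inspectable knowledge the symbolic layer can consult.
FLIGHTLESS = {"penguin", "ostrich", "kiwi"}

def slow_check(animal, guess):
    # The symbolic layer overrides the intuitive answer when a rule contradicts it.
    if animal in FLIGHTLESS and guess == "can fly":
        return "cannot fly"
    return guess

for animal in ["sparrow", "penguin"]:
    print(animal, "->", slow_check(animal, fast_guess(animal)))
# sparrow -> can fly
# penguin -> cannot fly
```

The design point is the nesting: the fast, statistical guess is cheap and usually right, while the slower symbolic check catches the cases where intuition over-generalizes.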


Within this larger context, “tool use design” refers to the process of designing an AI “agent” with the capability to use a number of external resources, such as querying APIs or databases, executing code, or interacting with knowledge bases. If the agent’s design assigns the right tools to the right tasks, certain errors that stem from the symbolic or stochastic nature of the architecture can be avoided in advance. The challenge is integrating these systems in a way that represents the data correctly while remaining computationally efficient. These “tools” can feed data into either neural or symbolic layers, depending on the architecture’s intended domain.
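The sketch below illustrates the routing idea under stated assumptions: run_model is a stub standing in for a real language model that emits a JSON “tool call”, and the calculator and lookup tools are invented for the example.

```python
import json

# A hypothetical agent whose "model" emits a JSON tool call instead of prose.
# run_model is a stand-in; a real system would query an actual LLM here.
def run_model(user_question):
    # Pretend the model has decided that arithmetic belongs to a symbolic tool.
    return json.dumps({"tool": "calculator",
                       "arguments": {"expression": "1234 * 5678"}})

# The symbolic side: exact, rule-governed components the stochastic model can call.
TOOLS = {
    "calculator": lambda args: eval(args["expression"], {"__builtins__": {}}),  # toy only
    "lookup": lambda args: {"Ada Lovelace": 1815}.get(args["name"], "unknown"),
}

def answer(user_question):
    call = json.loads(run_model(user_question))
    tool = TOOLS.get(call["tool"])
    if tool is None:
        return "No suitable tool; fall back to the model's own text."
    # The exact result comes from the symbolic component, not from sampled tokens.
    return tool(call["arguments"])

print(answer("What is 1234 times 5678?"))  # 7006652
```

The point of the design is that the exact answer comes from a symbolic component rather than from sampled tokens, so a whole class of arithmetic and lookup errors is ruled out in advance.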


A third design philosophy, called “constrained decoding”, attempts to structure an LLM’s output by specifying in advance what an acceptable output must look like, using only the resources of the model’s own decoding process. Constrained decoding differs from NeSy and tool use in that it seeks to extract well-ordered information from a single LLM, rather than modifying outputs with external resources (as in tool use and NeSy) or with multiple agents. One common way to do this is to restrict which tokens the model is allowed to emit at each decoding step so that the finished output conforms to a schema, for instance one expressed as a regular expression or a JSON Schema.
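As a rough illustration of that token-masking idea, here is a minimal sketch in Python. Everything in it is invented for the example: the character vocabulary, the fake_scores function standing in for a model’s logits, and the date-like NNNN-NN-NN template playing the part of a schema. A real implementation would apply the same masking step to an actual model’s output distribution.

```python
import random

# A toy stand-in for an LLM's next-token scores over a tiny character vocabulary.
# In a real system these would be the model's logits at each decoding step.
VOCAB = list("0123456789-") + ["</s>"]

def fake_scores(prefix):
    rng = random.Random(sum(map(ord, prefix)))  # purely illustrative scores
    return {tok: rng.random() for tok in VOCAB}

# Hypothetical schema: the output must have the shape NNNN-NN-NN (date-like).
# Each position maps to the characters the schema allows there.
TEMPLATE = (["0123456789"] * 4 + ["-"] +
            ["0123456789"] * 2 + ["-"] +
            ["0123456789"] * 2)

def allowed(position):
    return set(TEMPLATE[position]) if position < len(TEMPLATE) else {"</s>"}

def constrained_decode():
    out = ""
    while True:
        scores = fake_scores(out)
        legal = allowed(len(out))
        # The constraint: only consider tokens the schema permits at this position.
        token = max((t for t in scores if t in legal), key=scores.get)
        if token == "</s>":
            return out
        out += token

print(constrained_decode())  # always shaped like NNNN-NN-NN, not necessarily a real date
```

The output is guaranteed to be well-formed, which is the point; whether it is a sensible date is another question, which is why constrained decoding is often combined with techniques like those above.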


These three examples barely scratch the surface of the ways in which the two approaches can be constructively combined. Historically, both the neural/connectionist and the symbolic approaches have encountered their own unique sets of challenges, and at points the response to those challenges has been, essentially, to start over from the opposite side. Improvements in hardware since the shift away from expert systems in the 1980s mean that, in principle, neither approach taken in isolation holds a hardware advantage, as was the case before the widespread adoption of GPU computing.


Hybrid approaches, whether that means NeSy, tool use design, constrained decoding, or something else, have the advantage of being highly customizable. In theory, at least, symbolic layers could take the outputs of stochastic layers as inputs, or vice versa, depending on the behavior to be modeled. Researchers have indeed made progress on both fronts, and there seems to be no need to throw either baby out with the bathwater.


Our current position in the history of the field lets us see how each way of doing things proved inadequate at various times or under certain practical limitations. Their respective “failures” can now be viewed as steps on the path that brought us to where we are now. The dialectic between the two approaches is certainly not resolved, but it now includes a richer, more mutual understanding that would not have been possible without successes and failures on each side. The refinements of each approach owe a debt, for better or worse, to the struggles of the other.