I am really not sure why neuro-symbolic systems are not considered as alternatives to the current black-box ones.

A concrete example I have found (and am currently studying) is HOUDINI (https://arxiv.org/pdf/1804.00218). Essentially, it implements neural networks using higher-order combinators (map, fold, etc.) found via enumeration/genetic-programming search. Once a program is found, its higher-order combinators are "transformed" into trainable networks and added to an ever-growing library of "neural functions". The safety provided by such systems comes in the form of understanding the combined functions that form the solution to a problem. Perhaps mechanistic interpretability could be further used to dissect the inner workings of the trained networks.
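
To make that concrete, here is a minimal sketch of the idea in PyTorch. This is my own illustration, not HOUDINI's actual code: the NeuralMap/NeuralFold names and the specific module shapes are hypothetical, but they show how map/fold combinators can be "filled in" with small trainable networks and trained end to end.

```python
import torch
import torch.nn as nn

class NeuralMap(nn.Module):
    """map combinator: apply a trainable module f to each element of a sequence."""
    def __init__(self, f: nn.Module):
        super().__init__()
        self.f = f

    def forward(self, xs):                      # xs: (seq_len, in_dim)
        return torch.stack([self.f(x) for x in xs])

class NeuralFold(nn.Module):
    """fold combinator: accumulate over a sequence with a trainable step g(acc, x)."""
    def __init__(self, g: nn.Module, acc_dim: int):
        super().__init__()
        self.g = g
        self.init = nn.Parameter(torch.zeros(acc_dim))   # learned initial accumulator

    def forward(self, xs):                      # xs: (seq_len, hidden_dim)
        acc = self.init
        for x in xs:
            acc = self.g(torch.cat([acc, x]))
        return acc

# A program found by search might be the composition fold(g) . map(f).
f = nn.Sequential(nn.Linear(4, 8), nn.ReLU())       # learned per-element function
g = nn.Sequential(nn.Linear(8 + 8, 8), nn.ReLU())   # learned fold step
program = nn.Sequential(NeuralMap(f), NeuralFold(g, acc_dim=8))

xs = torch.randn(5, 4)     # a sequence of 5 feature vectors
out = program(xs)          # the whole composition is trainable end to end
```

The point is that the top-level structure (here, a fold of a map) stays a readable program, and only the leaf functions f and g are opaque learned components.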

Please explain why this is not a viable course for AI safety. For that matter, why are alternative technologies not considered at all (or, if they are, please mention them)? My initial guess would be that such systems are either not competitive enough, or are a form of "starting from scratch". However, these points might not apply to neuro-symbolic systems.


https://www.lesswrong.com/posts/gebzzEwn2TaA6rGkc/deep-learning-systems-are-not-less-interpretable-than-logic

I think @tailcalled hit the main point and it would be a good idea to revisit the entire "Why not just..." series of posts.

But more generally, I'd say to also revisit Inadequate Equilibria for a deeper exploration of the underlying problem. Let's assume you or anyone else really did have a proposed path to AGI/ASI that would be in some important senses safer than our current path. Who is the entity for whom this would or would not be a "viable course?" Who would need to be doing the "considering" of alternative technologies, and what is the process by which those alternative technologies could come to be at the forefront of AI? Where, in the system of companies and labs and researchers and funding mechanisms and governments, could the impetus for it come from, and why would they actually do that? If there is no such entity, then who has the power to convene a sufficient set of stakeholders that would collectively be able and willing to act on the information, and force a negotiated solution?

Consider that in our current system, 77% of all venture funding is going into extant AI approaches, and OpenAI alone accounts for 26%. And consider that competition in AI is intense enough to start breaking down many-decades-old barriers to building new nuclear power plants and upgrading the power grid, in a way climate change has never managed. Changing the course of AI in some way that is really fundamental may in fact be necessary, but forcing it to happen requires pushing back against, or sidestepping, a huge amount of pressure to stay the course.

Let's assume you or anyone else really did have a proposed path to AGI/ASI that would be in some important senses safer than our current path. Who is the entity for whom this would or would not be a "viable course?"

A new startup created specifically for the task. Examples: one, two.

Like, imagine that we actually did discover a non-DL AGI-complete architecture with strong safety guarantees, such that even MIRI would get behind it. Do you really expect that the project would then fail at the "getting funded"/"hiring personnel" stages?

tailcalled's argument is the sole true reason: we don't know of any neurosymbolic architecture that's meaningfully safer than DL. (The people in the examples above are just adding to the AI-risk problem.) That said, I think the lack of alignment research going into it is a big mistake, mainly caused by the undertaking seeming too intimidating/challenging to pursue / by the streetlighting effect.

Do you really expect that the project would then fail at the "getting funded"/"hiring personnel" stages?

Not at all, I'd expect them to get funded and get people. Plausibly quite well, or at least I hope so!

But when I think about paths by which such a company shapes how we reach AGI, I find it hard to see how that happens unless something (regulation, hitting walls in R&D, etc.) either slows the incumbents down or else causes them to adopt the new methods themselves. Both of which are possible! I'd just hope anyone seriously considering pursuing such a venture has thought through what success actually looks like. 

"Independently develop AGI through different methods before the big labs get there through current methods" is a very heavy lift that's downstream of but otherwise almost unrelated to "Could this proposal work if pursued and developed enough?" 

I think, "Get far enough fast enough to show it can work, show it would be safer, and show it would only lead to modest delays, then find points of leverage to get the leaders in capabilities to use it, maybe by getting acquired at seed or series A" is a strategy not enough companies go for (probably because VCs don't think its as good for their returns). 

These are very valid points, and it does indeed make sense to ask who would actually do it, advocate for it, steer the industry, etc. I was just wondering what the chances are of such an approach taking off, but maybe the current climate does not really allow for such major changes in system architecture.

Maybe my thinking is flawed, but the hope with this post was to confirm whether or not it would be harmful to work on neuro-symbolic systems. Another aim was to use such a system on benchmarks like ARC-AGI to prove that an alternative to the dominant LLMs is possible, while also being interpretable to some degree. The linked post by @tailcalled makes a good point, but I also noticed some criticism in its comments regarding concrete examples of how interpretable (or not) such probabilistic/symbolic systems really are. Perhaps some research on this question might not be harmful at all, but that is just my opinion.

I'm not a technical expert by any means, but given what I've read I'd be surprised if that kind of research were harmful. Curious to hear what others say.