Review

This post comes in two forms: a 15-minute talk, and a link to partway through the forward-forward algorithm's initial paper (see also papers citing it).

Talk version (2x speed recommended, captions recommended): 

I don't find the discussion of forward-forward itself to be the most interesting part; it's a plausible learning algorithm, perhaps. What I'm really interested in is the impact he thinks it'll have on how computers are designed: he claims they're going to be chunks of trained matter that interface with the outside world at their boundaries and are otherwise inscrutable, their behavior reliant on defects in the particular hardware's shape.
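
(For anyone who hasn't watched yet: forward-forward trains each layer in isolation to make a scalar "goodness", the sum of squared activations, high on real data and low on negative data, with no backward pass between layers. A minimal PyTorch sketch follows; the threshold, learning rate, and normalization details are my illustrative choices, not Hinton's reference code.)

```python
import torch
import torch.nn.functional as F

class FFLayer(torch.nn.Module):
    """One layer trained purely locally, forward-forward style."""

    def __init__(self, d_in, d_out, threshold=2.0, lr=0.03):
        super().__init__()
        self.linear = torch.nn.Linear(d_in, d_out)
        self.threshold = threshold
        # Each layer owns its optimizer; there is no global backward pass.
        self.opt = torch.optim.SGD(self.linear.parameters(), lr=lr)

    def forward(self, x):
        # Length-normalize the input so only the direction of the previous
        # layer's activity, not its goodness, is passed along.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return F.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        # Goodness = sum of squared activations; push it above the
        # threshold for positive data and below it for negative data.
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)
        loss = F.softplus(torch.cat([self.threshold - g_pos,
                                     g_neg - self.threshold])).mean()
        self.opt.zero_grad()
        loss.backward()  # gradients stay local to this one layer
        self.opt.step()
        # Detach the outputs so no gradient ever flows between layers.
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()
```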

This seems most relevant to me for the effects it might have on the shape of selfhood, if any, of an AI living in that low-power silicon brain.


Sidenote: Amusingly, he opens with a claim that this implies we can't do brain uploads, and yet describes exactly how to do them anyway: distillation training of a student, including copying of mistakes, which seems to me like the obvious way to do incremental hardware replacement of human brains as well. I also think he's overestimating how much this will prevent exact copying; exact copies won't behave exactly the same way, but it seems likely to me that one could copy knowledge out of a chip more precisely than distillation alone allows, by also involving the kind of offline scanning hardware one would use to examine a CPU. Using the resulting scan would require the help of a learned scan-to-hardware converter AI, though.
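
For concreteness, a minimal sketch of that teacher-student setup: the student never touches the teacher's weights, it only matches the teacher's output distribution, mistakes included. This is generic soft-target distillation in PyTorch; `teacher`, `student`, `temperature`, and the optimizer are illustrative assumptions rather than anything from the talk.

```python
import torch
import torch.nn.functional as F

def distill_step(teacher, student, x, optimizer, temperature=2.0):
    """One step of output-matching distillation: no weight access needed."""
    with torch.no_grad():
        # Soft targets: the teacher's full output distribution,
        # errors and all, is what the student learns to reproduce.
        t_logits = teacher(x) / temperature

    s_logits = student(x) / temperature
    loss = F.kl_div(F.log_softmax(s_logits, dim=1),
                    F.softmax(t_logits, dim=1),
                    reduction="batchmean") * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```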

Comments (3)

The presentation starts at 3:50.

Did Hinton explain why the idea of uploading a mind is ridiculous and would never work? I didn't hear a refutation.

Mortal computers sound fascinating and, in light of transistor sizes reaching fundamental limits, are certainly a technology tree worth trying. But unless their costs are several orders of magnitude lower than those of their silicon-based counterparts, I don't see broad applicability.

You're sacrificing immense generalizability and universality for a one-off, opaque black box that may or may not do what you want. Contrast that with pre-trained, essentially immortal models in silico: wouldn't the duplicability of such a model outweigh any perceived economic benefit? I struggle to find a single use case that a traditional model wouldn't handle better and (over the long term) cheaper.

A key thing here: as far as I can tell, Hinton is most driven to understand the brain, not so much to create superintelligent machines. He's making intelligent machines in order to build models for neuroscience; the generative perspective on intelligence.

re: brain uploading, I think the idea is that you'd distill into and out of the volumetric compute system by reading out its weights' implications for its outputs, whether it's a bio brain or a silicon one, instead of uploading its contents directly via lossless readout of the weights. What he claimed, specifically, is that you can't upload a high-error-rate brain, not that you can't make anything that duplicates the brain's behavior. In my view, if the two systems are tightly linked during online learning, incremental distillation will cause them to operate as one conscious entity, which still counts as uploading for my purposes as a brain myself. You could say that distillation is having kids while also replicating at the connectome level, not just the genome level; mitosis of a brain, perhaps.

He's saying you can't make an exact copy of a system that does hardware-level error correction: a system that learns to make use of its own errors as part of normal operation won't perform the same way if its internal state is copied, so any structural copy will lose tiny details that add up to an inexact replica.

I don't think he's totally barking up the wrong tree there, but I think it'll also be fairly possible to copy and relearn. The copy won't be an exact replica, but who cares: if you then also do the teacher-student distillation he's talking about, you end up with an even better replica than if you'd just distilled. Well, maybe, anyway. A sketch of what I mean is below.
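
This reuses `distill_step` from the sidenote above; the Gaussian "scan noise" and `sigma` are made-up stand-ins for whatever imperfection the structural copy introduces, and `loader` is assumed to yield `(inputs, labels)` batches.

```python
import copy
import torch

def copy_then_relearn(teacher, loader, sigma=0.01, epochs=1, lr=1e-3):
    # Start from an imperfect structural copy: same architecture and
    # weights, perturbed by "scan noise" to model the inexact replica.
    student = copy.deepcopy(teacher)
    with torch.no_grad():
        for p in student.parameters():
            p.add_(sigma * torch.randn_like(p))

    # Then refine by output-matching distillation against the original,
    # which should recover behavior a noisy copy alone would miss.
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in loader:
            distill_step(teacher, student, x, opt)
    return student
```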

re: efficiency and cost - I think the chip will, for the time being, still be made of silicon, or at least of standard chip components even if we move off silicon. There are neuromorphic-chip groups directing a lot of researcher and fab time and energy at this problem. The benefit is in completely avoiding IO wiring, which is most of the cost of current massively parallel chips, not in avoiding transistors. That said, I imagine you're more or less correct about the tradeoff, and mortality is not the key to why this approach could produce significant intelligence: compute-in-memory alone is enough to get the benefits he imagines, and the algorithms he comes up with for mortal compute should also work for compute-in-memory. But algorithms that scale down to chips where computation makes no attempt to be error-free, or to be structured in ways that generalize to other chips, are both a promising area of research for increasing capability per watt (the only true metric of intelligence) and should scale up to bulkier circuitry as well.

An issue I see here, though, is that this approach seems much more likely to exceed human intelligence per watt in the near term, a key metric that gives us a guideline about the foom curve and when to expect biological brains to be at significant risk of acute disempowerment. The question we need to answer from a safety perspective is still how to end up in a co-protective equilibrium with artificial beings, but now with the added modifier that their brains may only be imperfectly copyable.