They were not; computers were programmable long before that. Before the 4004, the functionality that we today find on a CPU was distributed over a larger collection of circuitry, with separate components for the ALU and the instruction interpreter and the register bank and the memory controller and such. But that assembly was already programmable, and had functioned as a universal, Turing-complete computer since the late 1940s. The innovation of the Intel 4004 was that it was the first design that had all that machinery on a single integrated chip (the first CPU).
One overriding perspective in Max's and my approach is that we need to design our systems so that their safety can be formally verified. In software, people often bring up the halting problem as an argument that general software can't be verified. But we don't need to verify general software; we are designing our systems so that they can be verified.
I am a great proponent of proof-carrying code that is designed and annotated for ease of verification, as a direction of development. But even from that starry-eyed perspective, the proposals that Andrew argues...
This seems very confused.
What makes good art a subjective quality is that its acceptance criterion is one that refers to the viewer as one of its terms. The is-good-art() predicate, or the art-quality() real-valued function, has a viewer parameter in it. What makes good physics-theory an objective quality is that its acceptance criterion doesn't refer to the viewer; the is-good-physics-theory() predicate, or the physics-theory-accuracy() real-valued function, is one that compares the theory to reality, without the viewer playing a role as a term inside the criterion.
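To put the same distinction in signature form (a toy sketch of my own, with the hyphenated names above turned into Python identifiers and all the actual evaluation left unimplemented): the subjective quality takes the viewer as an argument, the objective one does not.

```python
class Artwork: ...
class Viewer: ...
class Theory: ...
class Observations: ...   # stands in for "reality", as far as it has been measured

def art_quality(work: Artwork, viewer: Viewer) -> float:
    """Subjective: the viewer is one of the terms of the acceptance criterion."""
    raise NotImplementedError

def physics_theory_accuracy(theory: Theory, reality: Observations) -> float:
    """Objective: the theory is compared against reality; no viewer term appears."""
    raise NotImplementedError
```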
This was my experience studying in the Netherlands as well. University officials were indeed on board with this, with the general assumption being that lectures and instruction sessions and labs and such are a learning resource that you can use or not use at your discretion.
I would say that some formal proofs are actually impossible
Plausible. In the aftermath of Spectre and Meltdown I spent a fair amount of time thinking about how you could formally prove a piece of software to be free of information-leaking side channels, even assuming that the same thing already holds for all the components it depends on, such as the underlying processors and operating systems and the like, and got mostly nowhere.
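For concreteness, the textbook member of this class of leaks is a comparison that exits early on the first mismatching byte: its functional behaviour is exactly what the specification asks for, and the leak lives entirely in how long it takes, which is precisely what makes it so hard to capture in a proof about the code alone. A minimal sketch (my own illustration, not anything from the discussion above):

```python
import hmac

def leaky_equal(secret: bytes, guess: bytes) -> bool:
    """Early-exit comparison: running time grows with the number of leading
    bytes of `guess` that match `secret`, so an attacker who can time many
    calls can recover the secret one byte at a time."""
    if len(secret) != len(guess):
        return False
    for s, g in zip(secret, guess):
        if s != g:
            return False   # the earlier the mismatch, the sooner we return
    return True

def constant_time_equal(secret: bytes, guess: bytes) -> bool:
    """The standard mitigation: a comparison whose timing does not depend on
    where the first mismatch occurs (here delegated to the stdlib)."""
    return hmac.compare_digest(secret, guess)
```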
...In fact, I think survival timelines might even require anyone who might be working on classes of software reliability that don't relate to alignment
Yes, I agree with this.
I cannot judge to what degree I agree with your strategic assessment of this technique, though. I interpreted your top-level post as judging that assurances based on formal proofs are realistically out of reach as a practical approach; whereas my own assessment is that making proven-correct [and therefore proven-secure] software a practical reality is a considerably less impossible problem than many other aspects of AI alignment, and indeed one that I anticipate will actually happen in a timeline in which aligned AI materializes.
Assurance Requires Formal Proofs, Which Are Provably Impossible
The Halting Problem puts a certain standard of formalism outside our reach
This is really not true. The halting problem only makes it impossible to write a program that can analyze an arbitrary piece of code and then reliably say "this is secure" or "this is insecure". It is completely possible to write an analyzer that can say "this is secure" for some inputs, "this is definitely insecure for reason X" for some other inputs, and "I am uncertain about your input so please go improve it" for everything else.
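As a toy illustration of that three-verdict structure (my own sketch, with a deliberately trivial "security" policy standing in for a real one): the analyzer below certifies some inputs, rejects some with a concrete reason, and refuses to certify anything it cannot decide, rather than guessing.

```python
import ast
from enum import Enum, auto

class Verdict(Enum):
    SECURE = auto()     # "this is secure"
    INSECURE = auto()   # "this is definitely insecure, for reason X"
    UNKNOWN = auto()    # "I am uncertain about your input, please go improve it"

# Toy policy for the sake of the example: calling eval/exec is the insecure act.
FORBIDDEN = {"eval", "exec"}

def analyze(source: str) -> tuple[Verdict, str]:
    """Conservative analyzer: it never claims SECURE or INSECURE wrongly,
    and retreats to UNKNOWN whenever it cannot tell."""
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        return Verdict.UNKNOWN, f"could not even parse the input: {err}"
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in FORBIDDEN):
            return Verdict.INSECURE, f"direct call to {node.func.id}() on line {node.lineno}"
        if isinstance(node, ast.Name) and node.id in FORBIDDEN:
            # eval/exec mentioned without being called directly (aliased, passed
            # around, ...): we cannot rule out that it gets called later.
            return Verdict.UNKNOWN, f"{node.id} is referenced indirectly on line {node.lineno}"
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            # Pulls in code the analyzer has not seen; refuse to certify.
            return Verdict.UNKNOWN, "imports code this analyzer has not looked at"
    return Verdict.SECURE, "no use of eval or exec anywhere"

print(analyze("print(2 + 2)"))          # SECURE
print(analyze("eval(input())"))         # INSECURE, with the reason
print(analyze("f = eval; f(input())"))  # UNKNOWN: please rewrite it so the tool can tell
```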
We see this a lot on major events, as I’ve noted before, like the Super Bowl or the World Cup. If you are on the ball, you’ll bet somewhat more on a big event, but you won’t bet that much more on it than on other things that have similarly crazy prices. So the amount of smart money does not scale up that much. Whereas the dumb money, especially the partisans and gamblers, come out of the woodwork and massively scale up.
This sounds like it should generalize to "in big events, especially those on which there are vast numbers of partisans on both sides, the dumb money will swamp the smart money and prices will be correspondingly less reliable".
I do not think you are selling a strawman, but the notion that a utility function should be computable seems to me to be completely absurd. It seems like a confusion born from not understanding what computability means in practice.
Say I have a computer that will simulate an arbitrary Turing machine T, and will award me one utilon when that machine halts, and do nothing for me until that happens. With some clever cryptocurrency scheme, this is a scenario I could actually build today. My utility function ought plausibly to have a term in it that assigns a positive value to that machine halting.
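To spell out the computability issue (a toy sketch of my own, with the machine given as a plain transition table): the best any program can do with such a term is approximate it from below, by running T for some finite budget of steps. The approximation converges to the true value as the budget grows, but the halting problem is exactly the statement that no algorithm returns the true value for every T.

```python
def halted_within(tm, budget, blank=0):
    """Run a Turing machine, given as a transition table
    {(state, symbol): (next_state, written_symbol, head_move)},
    from state "start" on a blank tape, for at most `budget` steps."""
    tape, state, head = {}, "start", 0
    for _ in range(budget):
        if state == "halt":
            return True
        symbol = tape.get(head, blank)
        state, tape[head], move = tm[(state, symbol)]
        head += move
    return state == "halt"

def utility_lower_bound(tm, budget):
    """1 once the machine has been seen to halt, 0 while it has not halted *yet*.
    This converges to the true value of the term, but no algorithm computes
    the true value for every machine."""
    return 1 if halted_within(tm, budget) else 0

halts_at_once = {("start", 0): ("halt", 0, +1)}
runs_forever  = {("start", 0): ("start", 0, +1)}
print(utility_lower_bound(halts_at_once, 100))  # 1: the utilon is confirmed
print(utility_lower_bound(runs_forever, 100))   # 0: and no finite budget can ever confirm "never"
```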
This second claim sounds a bit trivial to me. Perhaps it is my reverse engineering background, but I have always taken it for granted that approximately any mechanism is understandable by a clever human given enough effort.
This book [and your review] explains a number of particular pieces of understanding of biological systems in detail, which is super interesting; but the mere point that these things can be understood with sufficient study almost feels axiomatic. Ignorance is in the map, not the territory; there are no confusing phenomena, only minds that are confused by them.
If you are going to include formal proofs with your AI showing that the code does what it's supposed to, in the style of Coq and friends, then the characteristics of traditionally unsafe languages are not a deal-breaker. You can totally write provably correct and safe code in C, and you don't need to restrict yourself to a sharply limited version of the language either. You just need to prove that you are doing something sensible each time you perform a potentially unsafe action, such as accessing memory through a pointer.
This slows things down a...
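As a rough sketch of the shape of such a per-action obligation (written in Python for brevity; in C the same condition would attach to the pointer dereference itself, and a verifier in the style of Coq and friends would discharge it statically rather than checking it at run time):

```python
def store_byte(buf: bytearray, i: int, value: int) -> None:
    """Write one byte at offset i.

    The assert below is the proof obligation: in the proof-carrying approach
    one shows, statically and for every call site, that it always holds, at
    which point the unsafe-looking access is known to be fine and the runtime
    check can be omitted entirely."""
    assert 0 <= i < len(buf) and 0 <= value < 256, "precondition violated"
    buf[i] = value
```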
The point is: if people understood how their bicycle worked, they’d be able to draw one even without having to literally have one in front of them as they drew it!
I don't think this is actually true. Turning a conceptual understanding into an accurate drawing is a nontrivial skill. It requires substantial spatial visualization ability, as well as quite a bit of drawing skill -- one who is not very skilled in drawing, like myself, might poorly draw one part of a bike, want to add two components to it, and then realize that there is no way to add a thir...
Even if such a person decides to do this, they will eventually get fed up and leave.
Will they, necessarily? The structure of the problem you describe sounds a lot like any sort of teaching, which involves a lot of finding out what a student misunderstands about a particular topic and then fixing that, even if you clear up that same misunderstanding for a different student every week. There are lots of people who do not get fed up with that. What makes this so different?
Pigeons have stable, transitive hierarchies of flight leadership, and they have stable pecking order hierarchies, and these hierarchies do not correlate.
one of the things you can do with the power to give instructions is to instruct others to give you more goodies.
It occurs to me that leading a flight is an unusual instruction-giving power, in that it comes with almost zero opportunities to divert resources in your own direction. Choosing where to fly and when to land affects food options, but it does not affect your food options relative to those of your flight-mates.
General rationality question that should not be taken to reflect any particular opinion of mine on the topic at hand:
At what point should "we can't find any knowledgeable critics offering meaningful criticism against <position>" be interpreted as substantial evidence in favor of <position>, and prompt one to update accordingly?
Having lost this signaling tool, we are that much poorer.
Are we? Signaling value is both a blessing and a curse, and my impression is that it is generally zero-sum. Personally, I consider myself *richer* when a mundane activity or lifestyle choice loses its signaling association, for it means I am now less restricted in applying it.
At the time of writing, for the two spoilers in the main post, hovering over either will reveal both. Is that intentional? It does not seem desirable.
I think there is about a three orders of magnitude difference between the difficulties of "inventing calculus where there was none before" and "learning calculus from a textbook explanation carefully laid out in the optimal order, with each component polished over the centuries to the easiest possible explanation, with all the barriers to understanding carefully paved over to construct the smoothest explanatory trajectory possible".
(Yes, "three orders of magnitude" is an actual attempt to estimate something, insofar as that is at all meaningful for an unquantified gut instinct; it's not just something I said for rhetoric effect.)
I think it will be next to impossible to set up a community norm around this issue for all communities save those with a superhuman level of general honesty. For if there is a norm like this in place, Alice always has a strong incentive to pretend that she is punching based on some generally accepted theory, and that the only thing that needs arguing is the application of this theory to Bob (point 2). Even when there is in fact a new piece of theory ready to be factored out of Alice's argument, it is in Alice's interest to pass this off as being already generally accepted.
While I'm not disputing the substance of what you are saying here (besides the 4004 timeline), from a computer science perspective I am a bit annoyed at the terminology. A machine that can load computational instructions from a storage medium would traditionally be called a programmable computer, whereas the system you describe, "a machine that runs an algorithm", is simply a computer. I understand that this is not a nuance represented in more popular terminology, but I feel that an article that is precise about this difference could benefit from also using the more precise terminology.