This seems very confused.
What makes good art a subjective quality is that its acceptance criterion is one that refers to the viewer as one of its terms. The is-good-art() predicate, or the art-quality() real-valued function, has a viewer parameter in it. What makes good physics-theory an objective quality is that its acceptance criterion doesn't refer to the viewer; the is-good-physics-theory() predicate, or the physics-theory-accuracy() real-valued function, is one that compares the theory to reality, without the viewer playing a role as a term inside the...
This was my experience studying in the Netherlands as well. University officials were indeed on board with this, with the general assumption being that lectures and instructions and labs and such are a learning resource that you can use or not use at your discretion.
I would say that some formal proofs are actually impossible
Plausible. In the aftermath of Spectre and Meltdown I spent a fair amount of time thinking about how you could formally prove a piece of software to be free of information-leaking side channels, even assuming that the same thing holds for all dependent components such as underlying processors and operating systems and the like, and got mostly nowhere.
...In fact, I think survival timelines might even require anyone who might be working on classes of software reliability that don't relate to alignment
Yes, I agree with this.
I cannot judge to what degree I agree with your strategic assessment of this technique, though. I interpreted your top-level post as judging that assurances based on formal proofs are realistically out of reach as a practical approach; whereas my own assessment is that making proven-correct [and therefore proven-secure] software a practical reality is a considerably less impossible problem than many other aspects of AI alignment, and indeed one I anticipate to actually happen in a timeline in which aligned AI materializes.
Assurance Requires Formal Proofs, Which Are Provably Impossible
The Halting Problem puts a certain standard of formalism outside our reach
This is really not true. The halting problem only makes it impossible to write a program that can analyze a piece of code and then reliably say "this is secure" or "this is insecure". It is completely possible to write an analyzer that can say "this is secure" for some inputs, "this is definitely insecure for reason X" for some other inputs, and "I am uncertain about your input so please go improve it" for everythin...
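To make the three-answer shape concrete, here is a minimal sketch of such an analyzer. It assumes a deliberately tiny toy property ("the source never calls eval or exec") as a stand-in for "secure"; the names `Verdict`, `ALLOWED_NODES`, `FORBIDDEN_CALLS` and `analyze` are hypothetical and chosen only for illustration. The point is the structure: a definite "yes" for a conservative allow-listed fragment, a definite "no" with a reason for a known-bad pattern, and "uncertain" for everything else.

```python
import ast
from enum import Enum
from typing import Tuple


class Verdict(Enum):
    SECURE = "secure"
    INSECURE = "insecure"
    UNCERTAIN = "uncertain"


# Hypothetical allow-list: straight-line arithmetic and assignments only.
# A program built solely from these nodes contains no calls at all.
ALLOWED_NODES = (
    ast.Module, ast.Expr, ast.Assign, ast.Name, ast.Constant,
    ast.BinOp, ast.UnaryOp, ast.operator, ast.unaryop, ast.expr_context,
)

# Hypothetical deny-list for the toy property.
FORBIDDEN_CALLS = {"eval", "exec"}


def analyze(source: str) -> Tuple[Verdict, str]:
    """Return a verdict plus a human-readable reason for the given source."""
    nodes = list(ast.walk(ast.parse(source)))

    # Definite "no": a direct call to a forbidden builtin.
    for node in nodes:
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in FORBIDDEN_CALLS):
            return Verdict.INSECURE, f"direct call to {node.func.id}()"

    # Definite "yes": everything falls inside the conservative fragment.
    if all(isinstance(node, ALLOWED_NODES) for node in nodes):
        return Verdict.SECURE, "only allow-listed constructs are used"

    # Otherwise the analyzer declines to answer rather than guess.
    return Verdict.UNCERTAIN, "construct outside the allow-list; please simplify"
```

For example, `analyze("x = 1 + 2")` lands in the secure case, `analyze("eval(input())")` in the insecure case, and `analyze("y = getattr(m, n)")` in the uncertain case. Widening the allow-list shrinks the uncertain region without ever requiring the analyzer to decide every input.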
We see this a lot on major events, as I’ve noted before, like the Super Bowl or the World Cup. If you are on the ball, you’ll bet somewhat more on a big event, but you won’t bet that much more on it than on other things that have similarly crazy prices. So the amount of smart money does not scale up that much. Whereas the dumb money, especially the partisans and gamblers, come out of the woodwork and massively scale up.
This sounds like it should generalize to "in big events, especially those on which there are vast numbers of partisans on both sides, th...
I do not think you are selling a strawman, but the notion that a utility function should be computable seems to me to be completely absurd. It seems like a confusion born from not understanding what computability means in practice.
Say I have a computer that will simulate an arbitrary Turing machine T, and will award me one utilon when that machine halts, and do nothing for me until that happens. With some clever cryptocurrency scheme, this is a scenario I could actually build today. My utility function ought plausibly to have a term in it that assigns a po
...This second claim sounds to me as being a bit trivial. Perhaps it is my reverse engineering background, but I have always taken it for granted that approximately any mechanism is understandable by a clever human given enough effort.
This book [and your review] explains a number of particular pieces of understanding of biological systems in detail, which is super interesting; but the mere point that these things can be understood with sufficient study almost feels axiomatic. Ignorance is in the map, not the territory; there are no confusing phenomena, only m...
The second claim was actually my main goal with this post. It is a claim I have heard honest arguments against, and even argued against myself, back in the day. A simple but not-particularly-useful version of the argument would be something like "the shortest program which describes biological behavior may be very long", i.e. high Kolmogorov complexity. If that program were too long to fit in a human brain, then it would be impossible for humans to "understand" the system, in some sense. We could fit the program in a long book, maybe, b...
If you are going to include formal proofs with your AI showing that the code does what it's supposed to, in the style of Coq and friends, then the characteristics of traditionally unsafe languages are not a deal-breaker. You can totally write provably correct and safe code in C, and you don't need to restrict yourself to a sharply limited version of the language either. You just need to prove that you are doing something sensible each time you perform a potentially unsafe action, such as accessing memory through a pointer.
This slows things down a...
The point is: if people understood how their bicycle worked, they’d be able to draw one even without having to literally have one in front of them as they drew it!
I don't think this is actually true. Turning a conceptual understanding into an accurate drawing is a nontrivial skill. It requires substantial spatial visualization ability, as well as quite a bit of drawing skill -- one who is not very skilled in drawing, like myself, might poorly draw one part of a bike, want to add two components to it, and then realize that there is no way to add a thir...
Even if such a person decides to do this, they will eventually get fed up and leave.
Will they, necessarily? The structure of the problem you describe sounds a lot like any sort of teaching, which involves a lot of finding out what a student misunderstands about a particular topic and then fixing that, even if you clear up that same misunderstanding for a different student every week. There are lots of people who do not get fed up with that. What makes this so different?
Pigeons have stable, transitive hierarchies of flight leadership, and they have stable pecking order hierarchies, and these hierarchies do not correlate.
one of the things you can do with the power to give instructions is to instruct others to give you more goodies.
It occurs to me that leading a flight is an unusual instruction-giving power, in that it comes with almost zero opportunities to divert resources in your own direction. Choosing where to fly and when to land affects food options, but it does not affect your food options relative to your flight-ma...
General rationality question that should not be taken to reflect any particular opinion of mine on the topic at hand:
At what point should "we can't find any knowledgeable critics offering meaningful criticism against <position>" be interpreted as substantial evidence in favor of <position>, and prompt one to update accordingly?
Having lost this signaling tool, we are that much poorer.
Are we? Signaling value is both a blessing and a curse, and my impression is that it is generally zero-sum. Personally, I consider myself *richer* when a mundane activity or lifestyle choice loses its signaling association, for it means I am now less restricted in applying it.
At the time of writing, for the two spoilers in the main post, hovering over either will reveal both. Is that intentional? It does not seem desirable.
I think there is about a three orders of magnitude difference between the difficulties of "inventing calculus where there was none before" and "learning calculus from a textbook explanation carefully laid out in the optimal order, with each component polished over the centuries to the easiest possible explanation, with all the barriers to understanding carefully paved over to construct the smoothest explanatory trajectory possible".
(Yes, "three orders of magnitude" is an actual attempt to estimate something, insofar as that is at all meaningful for an unquantified gut instinct; it's not just something I said for rhetorical effect.)
I think it will be next to impossible to set up a community norm around this issue for all communities save those with a superhuman level of general honesty. For if there is a norm like this in place, Alice always has a strong incentive to pretend that she is punching based on some generally accepted theory, and that the only thing that needs arguing is the application of this theory to Bob (point 2). Even when there is in fact a new piece of theory ready to be factored out of Alice's argument, it is in Alice's interest to pass this off as being ...
Why oh why does this system make it so needlessly inconvenient to partake in these courses?
I just want to read the lecture material on the topics that interest me, and possibly do some of the exercises. Why do I need to create an account for that using a phony email, subscribe to courses, take tests with limited numbers of retries, and all of that? I am not aiming to get a formal diploma here, and I don't think you plan on awarding me any. So why can't I just... browse the lectures in a stateless web 1.0 fashion?
It looks to me as if ihatestatistics.com is a for-profit business selling educational materials and systems to universities. The design decisions that make most sense for them are not necessarily the most convenient for students. So e.g. they may be keen to be able to distinguish one student from another, take measures against cheating, encourage and/or measure "engagement", etc. The things they do to that end may be annoying for students, but they still do them because they make their actually-paying customers happier, or make it easier to convi...
This. The phrasing "if you are on Vulcan, then you are on the mountain" *sounds* like it should be orthogonal to, and therefore gives no new information on and cannot affect the probability of, your being on Vulcan.
This is quite false, as can be shown easily by the statement "if you are on Vulcan, then false". But it is a line of reasoning I can see being tempting.
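A small worked calculation, as a sketch: it assumes the statement is read as the material conditional "(not on Vulcan) or (on the mountain)", and the function name and the example numbers are hypothetical. Conditioning on that statement removes exactly the worlds where you are on Vulcan but not on the mountain, so P(Vulcan) can only stay the same or go down.

```python
from fractions import Fraction


def posterior_on_vulcan(p_vulcan, p_mountain_given_vulcan):
    """P(V | V -> M), with 'V -> M' read as the material conditional (not V) or M.

    Assumes the conditional itself has nonzero probability.
    """
    p_v_and_m = p_vulcan * p_mountain_given_vulcan
    # The conditional holds in every not-V world and in the V-and-M worlds.
    p_conditional = (1 - p_vulcan) + p_v_and_m
    return p_v_and_m / p_conditional


half = Fraction(1, 2)
# Mountain unlikely given Vulcan: learning the conditional lowers P(Vulcan).
print(posterior_on_vulcan(half, Fraction(1, 10)))  # 1/11
# "If you are on Vulcan, then false": P(Vulcan) drops to zero.
print(posterior_on_vulcan(half, Fraction(0)))      # 0
# Mountain certain given Vulcan: the conditional carries no information.
print(posterior_on_vulcan(half, Fraction(1)))      # 1/2
```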
To me, this form of "epistemic should" doesn't feel like a responsibility-dodge at all. To me, it carries a very specific meaning of a particular warning: "my abstract understanding predicts that X will happen, but there are a thousand and one possible gotchas that could render that abstract understanding inapplicable, and I have no specific concrete experience with this particular case, so I attach low confidence to this prediction; caveat emptor". It is not a shoving off of responsibility, so much as a marker of low confidence, a...
I agree that a distillation of a complex problem statement to a simple technical problem represents real understanding and progress, and is valuable thereby. But I don't think your summary of the first half of the AI safety problem is one of these.
The central difficulty that stops this from being a "mere" engineering problem is that we don't know what "safe" is to mean in practice; that is, we don't understand in detail *what properties we would desire a solution to satisfy*. From an engineering perspective, that marks th...
I think what we are looking at here is Moloch eating all slack out of the system. I think that is a summary of about 75% of what Moloch does.
In these cases, I do not think such explanations are enough.
Eliezer gives the model of researchers looking for citations plus grant givers looking for prestige, as the explanation for why his SAD treatment wasn’t tested. I don’t buy it. Story doesn’t make sense.
On my model, the lack of exploitability is what allowed the failure to happen, whereas your theory on reasons why people do not try more dakka may be what caused the failure to happen.
If the problem were exploitable in the Eliezer-sense, the market would bulldoze straight through the roadblocks y...
My interpretation of this thesis immediately reminds me of Eliezer's post on locality and compactness of specifications, among others.
Under this framework, my analysis is that triangle-ness has a specification that is both compact and local; whereas L-opening-ness has a specification that is compact and nonlocal ("opens L"), and a specification that is local but noncompact (a full specification of what shapes it is and is not allowed to have), but no specification that is both local and compact. In other words, there is a short specification which...
Consensus tends to be dominated by those who will not shift their purported beliefs in the face of evidence and rational argument.
This appears to be empirically incorrect, at least in some fields. A few examples:
I have tried exactly this with basic topology, and it took me bloody ages to get anywhere despite considerable experience with Coq. It was a fun and interesting exercise in both the foundations of the topic I was studying and Coq, but it was by no means the most efficient way to learn the subject matter.
My take on it:
You judge an odds ratio of 15:85 for the money having been yours versus it having been Nick's, which presumably decomposes into a maximum entropy prior (1:1) multiplied by whatever evidence you have for believing it's not yours (15:85). Similarly, Nick has an 80:20 odds ratio that decomposes into the same 1:1 prior multiplied by 80:20 evidence.
In that case, the combined estimate would be the combination of both odds ratios applied to the shared prior, yielding a 1:1 * 15:85 * 80:20 = 12:17 ratio for the money being yours versus it being Nick's. Thus, you deserve 12/29 of it, and Nick deserves the remaining 17/29.
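The arithmetic can be checked with exact fractions. This is only a sketch of the calculation described above; the helper name `combine_odds` is hypothetical, and it assumes the two pieces of evidence are independent so that their likelihood ratios simply multiply.

```python
from fractions import Fraction


def combine_odds(prior, *likelihood_ratios):
    """Multiply a prior odds ratio (a, b) by each likelihood ratio (a, b)."""
    odds = Fraction(*prior)
    for for_yours, for_nicks in likelihood_ratios:
        odds *= Fraction(for_yours, for_nicks)
    return odds


# Shared 1:1 prior, your 15:85 evidence, Nick's 80:20 evidence.
odds = combine_odds((1, 1), (15, 85), (80, 20))
your_share = odds / (1 + odds)   # convert odds to a probability
nicks_share = 1 - your_share

print(odds)         # 12/17
print(your_share)   # 12/29
print(nicks_share)  # 17/29
```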
So it's nonstandard clever wordplay. Voldemort will still anticipate a nontrivial probability of Harry managing undetected clever wordplay. Which means it only has a real chance of working when threatening something that Voldemort can't test immediately.
I don't think this is likely, if only because of the unsatisfyingness. However:
And the messages would come out in riddles, and only someone who heard the prophecy in the seer's original voice would hear all the meaning that was in the riddle. There was no possible way that Millicent could just give out a prophecy any time she wanted, about school bullies, and then remember it, and if she had it would've come out as 'the skeleton is the key' and not 'Susan Bones has to be there'. (Ch.77)
Some foreshadowing on the idea of ominous-sounding prophecy terms a...
I took the survey. No scanner available, alas.
Seconded. P versus NP is the most important piece of the basic math of computer science, and a basic notion of algorithms is a bonus. The related broader theory which nonetheless still counts as basic math is algorithmic complexity and the notion of computability.
I've always modeled it as a physiological "mana capacity" aspect akin to muscle mass -- something that grows both naturally as a developing body matures, and as a result of exercise.
Certainly, though I should note that there is no original work in the following; I'm just rephrasing standard stuff. I particularly like Eliezer's explanation about it.
Assume that there is a set of things-that-could-happen, "outcomes", say "you win $10" and "you win $100". Assume that you have a preference over those outcomes; say, you prefer winning $100 over winning $10. What's more, assume that you have a preference over probability distributions over outcomes: say, you prefer a 90% chance of winning $100 and a 10% chance o...
It's full of hidden assumptions that are constantly violated in practice, e.g. that an agent can know probabilities to arbitrary precision, can know utilities to arbitrary precision, can compute utilities in time to make decisions, makes a single plan at the beginning of time about how they'll behave for eternity (or else you need to take into account factors like how the agent should behave in order to acquire more information in the future and that just isn't modeled by the setup of vNM at all), etc.
Those are not assumptions of the von Neumann-Morgens...
It's more than a metaphor; a utility function is the structure any consistent preference ordering that respects probability must have. It may or may not be a useful conceptual tool for practical human ethical reasoning, but "just a metaphor" is too strong a judgment.
I don't think I have much to add to this discussion that you guys aren't already going to have covered, except to note that Qiaochu definitely understands what a utility function is and all of the standard arguments for why they "should" exist, so his beliefs are not a function of not having heard these arguments (just noting this because this thread and some of the siblings seem to be trying to explain basic concepts to Qiaochu that I'm confident he already knows, and I'm hoping that pointing this out will speed up the discussion).
a utility function is the structure any consistent preference ordering that respects probability must have.
This is the sort of thing I mean when I say that people take utility functions too seriously. I think the von Neumann-Morgenstern theorem is much weaker than it initially appears. It's full of hidden assumptions that are constantly violated in practice, e.g. that an agent can know probabilities to arbitrary precision, can know utilities to arbitrary precision, can compute utilities in time to make decisions, makes a single plan at the beginning of ...
"a utility function is the structure any consistent preference ordering that respects probability must have."
Yes, but humans still don't have one. It's not even clear they can make themselves have one.
A more involved post about those Bad Confused Thoughts and the deep Bayesian issue underlying it would be really interesting, when and if you ever have time for it.
Upvoted for the simple reason that this is probably the first article I've EVER seen with a title of the form 'discussion about ' which is in fact about the quoted term, rather than the concept it refers to.
As a point of interest, I want to note that behaving like an illiterate immature moron is a common tactic for (usually banned) video game automation bots when faced with a moderator who is onto you, for exactly the same reason used here -- if you act like someone who just can't communicate effectively, it's really hard for others to reliably distinguish between you and a genuine foreign 13-year-old who barely speaks English.
"Worst case analysis" is a standard term of art in computer science, that shows up as early as second-semester programming, and Eliezer will be better understood if he uses the standard term in the standard way.
Actually, in the context of randomized algorithms, I've always seen the term "worst case running time" refer to Oscar's case 6, and "worst-case expected running time" -- often somewhat misleadingly simplified to "expected running time" -- refer to Oscar's case 2.
...A computer scientist would not describe the
Group selectionism alert. The "we are optimized for effectively playing the iterated prisoner's dilemma" argument, AKA "people will remember you being a jackass", sounds much more plausible.
...Even with measurements in hand, old habits are hard to shake. It’s easy to fall in love with numbers that seem to agree with you. It’s just as easy to grope for reasons to write off numbers that violate your expectations. Those are both bad, common biases. Don’t just look for evidence to confirm your theory. Test for things your theory predicts should never happen. If the theory is correct, it should easily survive the evidential crossfire of positive and negative tests. If it’s not you’ll find out that much quicker. Being wrong efficiently is what scienc
I already knew it, but this post made me understand it.
Passphrase: eponymous haha_nice_try_CHEATER
Well played :)
Trust -- the quintessential element of your so-called "tell culture" -- and vulnerability are two sides of the same coin.
That's true in general. In network security circles, a trusted party is one with the explicit ability to compromise you, and that's really the operational meaning of the term in any context.
My own definition - proto-science is something put forward by someone who knows the scientific orthodoxy in the field, suggesting that some idea might be true. Pseudo-science is something put forward by someone who doesn't know the scientific orthodoxy, asserting that something is true.
This seems like an excellent heuristic to me (and probably one of the key heuristics people actually use for making the distinction), but not valid as an actual definition. For example, Sir Roger Penrose's quantum consciousness is something I would classify as pseudoscience without a second thought, despite the fact that Penrose as a physicist should know and understand the orthodoxy of physics perfectly well.
Taking the survey IS posting something insightful.
Taken to completion.
The Cryonics Status question really needs an "other" answer. There are more possible statuses one can be in than the ones given; in particular there are more possible "I'd want to, but..." answers.
To figure out a strange plot, look at what happens, then ask who benefits. Except that Dumbledore didn't plan on you trying to save Granger at her trial, he tried to stop you from doing that. What would've happened if Granger had gone to Azkaban? House Malfoy and House Potter would've hated each other forever. Of all the suspects, the only one who wants that is Dumbledore. So it fits. It all fits. The one who really committed the murder is - Albus Dumbledore!
I think if you use this line of reasoning and then allow yourself to dismiss arbitrary parts of ...
I am a great proponent of proof-carrying code that is designed and annotated for ease of verification as a direction of development. But even from that starry-eyed perspective, the proposals that Andrew argues...