Then the wave crested.
Is there any more you can say about this?
People already tell me I have good vibes, and feel like I listen to them[1]. I give off an air of nonchalance, because I think I kill[2] my emotions in public.
But I do it because emotions are one more thing to deal with that I generally don't have energy for. I'm already bothered by all the light and sound.
It seems like you're saying "I meditated, and while at first that made sensory issues worse, eventually they just stopped." And I'd like to know why, if that was legible to you.
Would...
"Whether or not to get insurance should have nothing to do with what makes one sleep – again, it is a mathematical decision with a correct answer."
Don't be overly naive consequentialist about this. "Nothing" is an overstatement.
Peace of mind can absolutely be one of the things you are purchasing with an insurance contract. If your Kelly calculation says that motorcycle insurance is worth $899 a month, and costs $900 a month, but you'll spend time worrying about not being insured if you don't buy it, and won't if you do, I fully expect that is worth more than...
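For concreteness, here's a rough sketch of the kind of calculation I mean, in the standard expected-log-wealth (Kelly) framing. All the numbers and the helper are made up for illustration, not taken from anyone's actual situation:

```python
import math

# Toy sketch: insurance is "worth" the highest premium at which expected log
# wealth with insurance still beats expected log wealth without it.
# All numbers are made up for illustration.

wealth = 50_000      # current bankroll
loss = 20_000        # cost of the insured-against event
p_loss = 0.02        # monthly probability of that event

def expected_log_wealth(premium, insured):
    if insured:
        # Pay the premium; the insurer absorbs the loss either way.
        return math.log(wealth - premium)
    # Uninsured: bear the loss with probability p_loss.
    return (p_loss * math.log(wealth - loss)
            + (1 - p_loss) * math.log(wealth))

# Bisect for the break-even premium.
lo, hi = 0.0, float(loss)
for _ in range(60):
    mid = (lo + hi) / 2
    if expected_log_wealth(mid, insured=True) > expected_log_wealth(0, insured=False):
        lo = mid
    else:
        hi = mid

print(f"This agent should be willing to pay up to about ${lo:,.2f}/month")
```

The "peace of mind" point then just says: whatever time you'd spend worrying can legitimately be added on top of that break-even number.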
Is it important that negentropy be the result of subtracting from the maximum entropy? It seemed a sensible choice, up until it introduced infinities and made every state's negentropy infinite. (And also, if you instead subtract from 0, two identical states have the same negentropy, even in different systems. Unsure if that's useful, or harmful.)
Though perhaps that's important for noting that reducing an infinite system to a finite macrostate is an infinite reduction? I'm not sure if I understand how (or perhaps when?) that's more useful than...
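(Spelling it out: the definition I'm questioning is negentropy measured from the maximum-entropy state,

$$J \;=\; S_{\max} - S,$$

which is infinite for every state whenever $S_{\max}$ is infinite; the alternative is to measure from 0, i.e. $J = -S$, which depends only on the state's own entropy.)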
Back in Reward is not the optimization target, I wrote a comment, which received a (small, I guess) amount of disagreement.
I intended the important part of that comment to be the link to Adaptation-Executers, not Fitness-Maximizers. (And more precisely the concept named in that title, and less about things like superstimuli that are mentioned in the article) But the disagreement is making me wonder if I've misunderstood both of these posts more than I thought. Is there not actually much relation between those concepts?
There was, obviously, other content to ...
When I tried to answer why we don't trade with ants myself, communication was one of the first things (I can't remember what was actually first) I considered. But I worry it may be more analogous to AI than argued here.
We sort of can communicate with ants. We know to some degree what makes them tick, it's just we mostly use that communication to lie to them and tell them this poison is actually really tasty. The issue may be less that communication is impossible, and more that it's too costly to figure out, and so no one tries to become Antman even if they...
I interpret OP (though this is colored by the fact that I was thinking this before I read this) as saying Adaptation-Executers, not Fitness-Maximizers, but about ML. At which point you can open the reference category to all organisms.
Gradient descent isn't really different from what evolution does. It's just a bit faster, and takes a slightly more direct line. Importantly, it's not more capable of avoiding local maxima (per se, at least).
So, I want to note a few things. The original Eliezer post was intended to argue against this line of reasoning:
I occasionally run into people who say something like, "There's a theoretical limit on how much you can deduce about the outside world, given a finite amount of sensory data."
He didn't worry about compute, because that's not a barrier on the theoretical limit. And in his story, the entire human civilization had decades to work on this problem.
But you're right, in a practical world, compute is important.
I feel like you're trying to make this take ...
"you're jumping to the conclusion that you can reliably differentiate between..."
I think you absolutely can, and the idea was already described earlier.
You pay attention to regularities in the data. In most non-random images, pixels near each other are similar. In an MxN image, the pixel below a[i] is a[i+M], whereas in an NxM image, it's a[i+N]. If, across the whole image, the difference between a[i] and a[i+M] is smaller than the difference between a[i] and a[i+N], it's more likely an MxN image. I expect you could find the resolution by searching all possible resolutions from ...
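A rough sketch of that check, with a hypothetical flat pixel array `a` and candidate widths; none of this is from the original discussion:

```python
def row_stride_score(a, w):
    """Mean absolute difference between each pixel and the pixel one row
    below it, assuming the image is w pixels wide (row stride w)."""
    return sum(abs(a[i] - a[i + w]) for i in range(len(a) - w)) / (len(a) - w)

def likely_width(a, candidate_widths):
    # The stride whose "pixel below" looks most like the pixel above it is
    # the most likely true width.
    return min(candidate_widths, key=lambda w: row_stride_score(a, w))

# For a flat array of length M*N, compare the two candidate readings:
#   likely_width(a, [M, N])
# or search every divisor of len(a) as a candidate width.
```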
"the addition of an unemployable worker causes ... the worker's Shapley values to drop to $208.33 (from $250)."
I would emphasize here that "the workers" includes the unemployed one. It was not obvious to me until about halfway through the next paragraph, and I think that paragraph would read better with this in mind from the start.
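(For reference, numbers like these come out of the standard Shapley value formula, which averages each player's marginal contribution over all coalitions; nothing here is specific to the post's example:

$$\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,\bigl(|N|-|S|-1\bigr)!}{|N|!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr).)$$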
It seems odd to suggest that the AI wouldn't kill us because it needs our supply chain. If I had the choice between "Be shut down because I'm misaligned" (or "Be reprogrammed to be aligned" if not corrigible) and "Have to reconstruct the economy from the remnants of human civilization," I think I'm more likely to achieve my goals by trying to reconstruct the economy.
So if your argument was meant to say "We'll have time to do alignment while the AI is still reliant on the human supply chain," then I don't think it works. A functional AGI would rather destro...
Surely creating the full concrete details of the strategy is not much different from "putting forth as-good-as-human definitions, finding objections for them, and then improving the definition based on considered objections." I at least don't see why the same mechanism couldn't be used here (i.e. apply this definition iteration to the word "good", and then have the AI do that, and apply it to "bad" and have the AI avoid that). If you see it as a different thing, can you explain why?
I think you missed the point. I'd trust an aligned superintelligence to solve the objections. I would not trust a misaligned one. If we already have an aligned superintelligence, your plan is unnecessary. If we do not, your plan is unworkable. Thus, the problem.
If you still don't see that, I don't think I can make you see it. I'm sorry.
It seems simple and effective because you don't need to put weight on it. We're talking a superintelligence, though. Your definition will not hold when the weight of the world is on it.
And the fact that you're just reacting to my objections is the problem. My objections are not the ones that matter. The superintelligence's objections are. And it is, by definition, smarter than me. If your definition is not something like provably robust, then you won't know if it will hold to a superintelligent objection. And you won't be able to react fast enough to fix i...
I'm not sure this is being productive. I feel like I've said the same thing over and over again. But I've got one more try: Fine, you don't want to try to define "reason" in math. I get it, that's hard. But just try defining it in English.
If I tell the machine "I want to be happy," and it tries to determine my reason for that, what does it come up with? "I don't feel fulfilled in life"? Maybe that fits, but is it the reason, or do we have to go back further: "I have a dead-end job"? Or further still: "I don't have enough opportunities"?
Or does it go a ...
No, they really don't. I'm not trying to be insulting. I'm just not sure how to express the base idea.
The issue isn't exactly that computers can't understand this, specifically. It's that no one understands what those words mean well enough. Define "reason". You'll notice that your definition contains other words. Define all of those words. You'll notice that those are made of words as well. Where does it bottom out? When have you actually, rigorously, objectively defined these things? Computers only understand that language, but the fact that a computer wouldn't...
Again, what is a "reason"? More concretely, what is the type of a "reason"? You can't program an AI in English, it needs to be programmed in code. And code doesn't know what "reason" means.
It's not exactly that your plan "fails" anywhere in particular. It's that it's not really a plan. CEV says "Do what humans would want if they were more the people they want to be." Cool, but not a plan. The question is "How?" Your answer to that is still underspecified. You can tell by the fact you said things like "the AI could just..." and didn't follow it with "add tw...
The quickest check I can think of is something like "What does this mean?" Throw this at every part of what you just said.
For example: "Hear humanity's pleas (intuitions+hopes+worries)" What is an intuition? What is a hope? What is a worry? How does it "hear"?
Do humans submit English text to it? Does it try to derive "hopes" from that? Is that an aligned process?
An AI needs to be programmed, so you have to think like a programmer. What is the input and output type of each of these (e.g. "Hear humanity's pleas" takes in text, and outputs... what? Hopes? Wha...
This is the sequence post on it: https://www.lesswrong.com/posts/5wMcKNAwB6X4mp9og/that-alien-message, it's quite a fun read (to me), and should explain why something smart that thinks at transistor speeds should be able to figure things out.
For inventing nanotechnology, the given example is AlphaFold 2.
For killing everyone in the same instant with nanotechnology, Eliezer often references Nanosystems by Eric Drexler. I haven't read it, but I expect the insight is something like "Engineered nanomachines could do a lot more than those limited by designs that...
In addition to the mentions in the post about Facebook AI being rather hostile to the AI safety issue in general, convincing them and the top people at OpenAI and DeepMind might still not be enough. You need to stop every company that talks to some venture capitalists and can convince them how profitable AGI could be. Hell, depending on how easy the solution ends up being, you might even have to prevent anyone with a 3080 and access to arXiv from putting something together in their home office.
This really is "uproot the entire AI research field" and not "tell Deepmind to cool it."
To start, it's possible to know facts with confidence without all the relevant info. For example, I can't fit all the multiplication tables into my head, and I haven't done the calculation, but I'm confident that 2143*1057 is greater than 2,000,000.
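The confidence comes from a crude bound rather than from actually doing the multiplication:

$$2143 \times 1057 \;>\; 2000 \times 1000 \;=\; 2{,}000{,}000.$$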
Second, the line of argument runs like this: Most (a supermajority of) possible futures are bad for humans. A system that does not explicitly share human values has arbitrary values. If such a system is highly capable, it will steer the future into an arbitrary state. As established, most arbitrary states are...
Question. Even after the invention of effective contraception, many humans continue to have children. This seems a reasonable approximation of something like "Evolution in humans partially survived." Is this somewhat analogous to "an [X] percent chance of killing less than a billion people", and if so, how has this observation changed your estimate of "disassembl[ing] literally everyone"? (i.e. from "roughly 1" to "I suppose less, but still roughly 1" or from "roughly 1" to "that's not relevant, still roughly 1"? Or something else.)
(To take a stab at it my...
If you only kept promises when you wanted to, they wouldn't be promises. Does your current self really think that feeling lazy is a good reason to break the promise? I kinda expect toy-you would feel bad about breaking this promise, which, even if they do it, suggests they didn't think it was a good idea.
If the gym was currently on fire, you'd probably feel more justified breaking the promise. But the promise is still broken. What's the difference in those two breaks, except that current you thinks "the gym is on fire" is a good reason, and "I'm feeling lazy...
Promises should be kept. It's not only a virtue, but also useful for pre-commitment, if you can keep your promises.
But, if you make a promise to someone, and later both of you decide it's a bad idea to keep the promise, you should be able to break it. If that someone is your past self, this negotiation is easy: If you think it's a good idea to break the promise, they would be convinced the same way you were. You've run that experiment.
So, you don't really have much obligation to your past self. If you want your future self to have obligation to you, you are aski...
"Space pirates can profit by buying shares in the prediction market that pay money if Ceres shifts to a pro-Earth stance and then invading Ceres."
Has this line got a typo, or am I misunderstanding? Don't the pirates profit by buying pro-Mars shares, then invading to make Ceres pro-Mars (because Ceres is already pro-Earth)?
Mars bought pro-Earth to make pro-Mars more profitable, in the hope that pirates would buy pro-Mars and then invade.
I doubt my ability to be entertaining, but perhaps I can be informative. The need for mathematical formulation is because, due to Goodhart's law, imperfect proxies break down. Mathematics is a tool which is rigorous enough to get us from "that sounds like a pretty good definition" (like "zero correlation" in the radio signals example), to "I've proven this is the definition" (like "zero mutual information").
The proof can get you from "I really hope this works" to "As long as this system satisfies the proof's assumptions, this will work", because the ...
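For reference, the rigorous endpoint in that example is the standard mutual information:

$$I(X;Y) \;=\; \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)},$$

which is zero exactly when $X$ and $Y$ are independent, whereas zero correlation alone does not guarantee that.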
I understand the point of your dialog, but I also feel like I could model someone saying "This Alignment Researcher is really being pedantic and getting caught in the weeds." (especially someone who wasn't sure why these questions should collapse into world models and correspondence.)
(After all, the Philosopher's question probably didn't depend on actual apples, and was just using an apple as a stand-in for something with positive utility. So, the inputs of the utility functions could easily be "apples" (where an apple is an object with 1 property, "owner"...
In my post I wrote:
Am I correct after reading this that this post is heavily related to embedded agency? I may have misunderstood the general attitudes, but I thought of "future states" as "future to now" not "future to my action." It seems like you couldn't possibly create a thing that works on the last one, unless you intend it to set everything in motion and then terminate. In the embedded agency sequence, they point out that embedded agents don't have well defined i/o channels. One way is that "action" is not a well defined term, and is often not atomi...
I'm a little confused about what it hopes to accomplish. I mean, to start, I'm a little confused by your example of "preferences not about future states" (i.e. 'the pizza shop employee is running around frantically, and I am laughing' is a future state).
But to me, I'm not sure what the mixing of "paperclips" vs "humans remain in control" accomplishes. On the one hand, I think if you can specify "humans remain in control" safely, you've solved the alignment problem already. On the other, I wouldn't want that to seize the future: There are potentially much better fut...
I don't know if there's much counterargument beyond "no, if you're building an ML system that helps you think longer about anything important, you already need to have solved the hard problem of searching through plan-space for actually helpful plans."
This is definitely a problem, but I would further say that human amplification isn't a solution, because humans aren't aligned.
I don't really have a good handle on what human values are, even in an abstract English-definition sense, but I'm pretty confident that "human values" are not, and are not easily transformable ...
Maybe I misunderstand your use of robust, but this still seems to me to be breadth. If an optimum is broader, samples are more likely to fall within it. I took broad to mean "has a lot of (hyper)volume in the optimization space", and robust to mean "stable over time/perturbation". I still contend that those optimization processes are unaware of time, or of any environmental variation, and can only select for it insofar as it is expressed as breadth.
The example I have in my head is that if you had an environment, and committed to changing some aspect of it af...
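A toy version of the breadth claim, with a made-up one-dimensional loss; this is just my own illustration, not anything from the post:

```python
import math, random

# Two optima in a made-up 1-D loss: a broad, shallow well at x=+3 and a
# narrow, deeper well at x=-3.
def loss(x):
    broad = -1.0 * math.exp(-((x - 3.0) ** 2) / (2 * 2.0 ** 2))
    narrow = -2.0 * math.exp(-((x + 3.0) ** 2) / (2 * 0.3 ** 2))
    return broad + narrow

random.seed(0)
samples = [random.uniform(-10, 10) for _ in range(100_000)]
good = [x for x in samples if loss(x) < -0.5]      # "good enough" solutions
in_broad_well = sum(1 for x in good if x > 0)

print(f"{100 * in_broad_well / len(good):.0f}% of good-enough samples "
      "sit in the broad well, despite it being the shallower one")
```

With these made-up numbers, roughly four-fifths of the acceptable samples land in the broad well, even though the narrow well is twice as deep; nothing about time or perturbation is needed to get that result.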
"Real search systems (like gradient descent or evolution) don’t find just any optima. They find optima which are [broad and robust]"
I understand why you think that broad is true. But I'm not sure I get robust. In fact, robust seems to make intuitive dis-sense to me. Your examples are gradient descent and evolution, neither of which has memory, so how would they be able to know how "robust" an optimum is? Part of me thinks that the idea comes from how, if a system optimized for a non-robust optimum, it wouldn't internally be doing anything different, but we...
Can I try to parse out what you're saying about stacked sigmoids? Because it seems weird to me. Like, in that view, it still seems like showing a trendline is some evidence that it's not "interesting". I feel like this because I expect the asymptote of the AlphaGo sigmoid to be independent of MCTS bots, so surely you should see some trends where AlphaGo (or equivalent) was invented first, and jumped the trendline up really fast. So not seeing jumps should indicate that it is more a gradual progression, because otherwise, if they were independent, about hal...
So, I'm not sure if I'm further down the ladder and misunderstanding Richard, but I found this line of reasoning objectionable (maybe not the right word):
"Consider an AI that, given a hypothetical scenario, tells us what the best plan to achieve a certain goal in that scenario is. Of course it needs to do consequentialist reasoning to figure out how to achieve the goal. But that’s different from an AI which chooses what to say as a means of achieving its goals."
My initial (perhaps uncharitable) response is something like "Yeah, you could build a safe syste...
"Since my expectations sometimes conflict with my subsequent experiences, I need different names for the thingies that determine my experimental predictions and the thingy that determines my experimental results. I call the former thingies 'beliefs', and the latter thingy 'reality'."
I think this is a fine response to Mr. Carrico, but not to the post-modernists. They can still fall back to something like "Why are you drawing a line between 'predictions' and 'results'? Both are simply things in your head, and since you can't directly observe reality, your 'r...
In a Newcomb-less problem, where you can either have $1,000 or refuse it and have $1,000,000, you could argue that the rational choice is to take the $1,000,000 and then go back for the $1,000 when people's backs are turned, but that would seem to go against the nature of the problem.
In much the same way, if Omega is a perfect predictor, there is no possible world where you receive $1,000,000 and still end up going back for the second. Either Rachel wouldn't have objected, or the argument would've taken more than 5 minutes, and the boxes disappear, or somet...
In much the same way, estimates of value and calculations based on the number of permutations of atoms shouldn't be mixed together. There being a googolplex of possible states in no way implies that any of them has a value over 3 (or any other number). It does not, by itself, imply that any particular state is better than any other, let alone that any particular state should have value proportional to the total number of states possible.
Restricting yourself to atoms within 8000 light years, instead of the galaxy, just compounds the problem as well, but you n...
I still think the argument holds in this case, because even computer software isn't atom-less. It needs to be stored, or run, or something somewhere.
I don't doubt that you could drastically reduce the number of atoms required for many products today. For example, you could in future get a chip in your brain that makes typing without a keyboard possible. That chip is smaller than a keyboard, so represents lots of atoms saved. You could go further, and have that chip be an entire futuristic computer suite, by reading and writing your brain inputs and outputs...
I think this is a useful post, but I don't think the water thing helped in understanding:
"In the Twin Earth, XYZ is "water" and H2O is not; in our Earth, H2O is "water" and XYZ is not."
This isn't an answer, this is the question. The question is "does the function, curried with Earth, return true for XYZ, && does the function, curried with Twin Earth, return true for H2O?"
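In code, the question as I mean it looks something like this; `is_water`, `earth`, and `twin_earth` are of course hypothetical names:

```python
def is_water(world):
    """Curried form of the question: fix a world, get back a predicate on substances."""
    def predicate(substance):
        # Filling in this body, for each world, IS the disputed question;
        # the quoted passage just asserts an answer to it.
        raise NotImplementedError
    return predicate

# The question, in this notation (the world objects themselves are hypothetical):
#   is_water(earth)("XYZ")  and  is_water(twin_earth)("H2O")
```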
Now, this is a silly philosophy question about the "true meaning" of water, and the real answer should be something like "If it's useful, then yes, otherwise, no." But I don't thin...
I feel like I might be being a little coy stating this, but I feel like "heterogeneous preferences" may not be as inadequate as it seems. At least, if you allow that those heterogeneous preferences are not only innate, like a taste preference for apples over oranges.
If I have a comparative advantage in making apples, I'm going to have a lot of apples, and value the marginal apple less than the marginal orange. I don't think this is a different kind of "preference" than liking the taste of oranges better: Both base out in me preferring an orange to an apple. A...
I'm glad to hear that the question of what hypotheses produce actionable behavior is on people's minds.
I modeled Murphy as an actual agent, because I figured a hypothesis like "A cloaked superintelligence is operating in the area and will react to your decision to do X by doing Y" is always on the table, and is basically a template for allowing Murphy to perform arbitrary action Y.
I feel like I didn't quite grasp what you meant by "a constraint on Murphy is picked according to this probability distribution/prior, then Murphy chooses from the available ...
I'm still confused. My biology knowledge is probably lacking, so maybe that's why, but I had a similar thought to dkirmani after reading this: "Why are children born young?" Given that sperm cells are active cells (which should give transposons opportunity to replicate), why do they not produce children with larger transposon counts? I would expect whatever sperm divide from to have the same accumulation of transposons that causes problems in the divisions of stem cells.
Unless piRNA and siRNA are 100% effective at their jobs, and nothing is explicitly removing t...
My understanding is that transposon repression mechanisms (like piRNAs) are dramatically upregulated in the germ line. They are already very close to 100% effective in most cells under normal conditions, and even more so in the germ line, so that most children do not have any more transposons than their parents.
(More generally, my understanding is that germ line cells have special stuff going to make sure that the genome is passed on with minimal errors. Non-germ cells are less "paranoid" about mutations.)
Once the rate is low enough, it's handled by natural selection, same as any other mutations.
A little late to the party, but
I'm confused about the minimax strategy.
The first thing I was confused about was what sorts of rules could constrain Murphy, based on my actions. For example, in a bit-string environment, the rule "every other bit is a 0" constrains Murphy (he can't reply with "111..."), but not based on my actions. It doesn't matter what bits I flip, Murphy can always just reply with the environment that is maximally bad, as long as it has 0s in every other bit. Another example would be if you have the rule "environment must be a valid chess...
Thank you for the reply. I think I'll need to look into things more.
One clarification I wanted to make. I wouldn't have normally said that I need to put effort into listening. I think I generally feel like it doesn't take effort. But somewhat recently I had an interaction with someone go poorly. It was a date, and they said afterwards that they didn't feel like I wanted to get to know them, because I hadn't asked enough things about them[1].
So I figure there's something lacking in my interaction with people. Something that I'm not doing that I should...