> So far, the answer seems to be that it transfers some, and o1 and o1-pro still seem highly useful in ways beyond reasoning, but o1-style models mostly don’t ‘do their core thing’ in areas where they couldn’t be trained on definitive answers.
Based on:
It seems likely to me that thinking skills transfer pretty well. But then this is trained out because it results in answers that raters don't like. So the model memorizes the answers it's supposed to give.
> If they can’t do that, why on earth should you give up on your preferences? In what bizarro world would that sort of acquiescence to someone else’s self-claimed authority be “rational?”
Well, if they consistently make recommendations that in retrospect end up looking good, then maybe you're bad at understanding. Or maybe they're bad at explaining. But trusting them when you don't understand their recommendation is exploitable, so maybe they're running a strategy where they deliberately make good recommendations with poor explanations so when you sta...
I'll try.
TL;DR I expect the AI to not buy the message (unless it also thinks it's the one in the simulation; then it likely follows the instruction because duh).
The glaring issue (for actually using the method), to me, is that I don't see a way to deliver the message in a way that:
If "god tells" the AI the message then there is a god in their universe. Maybe AI will decide to do what it's told. But I don't think we can have Hermes del...
This is pretty much the same thing, except breaking out the “economic engine” into two elements of “world needs it” and “you can get paid for it.”
There are things that are economic engines for things the world doesn't quite need (getting people addicted, rent seeking, threats of violence).
One more obvious problem - people actually in control of the company might not want to split it, and so they wouldn't grow the company even if shareholders / customers / ... would benefit.
> but much higher average wealth, about 5x the US median.
Wouldn't it make more sense to compare average to average? (like the earlier part of the sentence compares median to median)
I wanted to say that it makes sense to arrange stuff so that people don't need to drive around too much and can instead use something else to get around (and also maybe they have more stuff close by, so that they need to travel less). Because even if bus drivers aren't any better than car drivers, using a bus means you have 10x fewer vehicles causing risk for others. And that's better (assuming people have fixed places to go, so they want to travel a ~fixed distance).
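To put rough numbers on the 10x (illustrative figures, mine): if a bus carries 10 people who would each otherwise drive the same 10 km, that's 1 × 10 = 10 bus-km on the road instead of 10 × 10 = 100 car-km, so everyone else is exposed to a tenth of the vehicle traffic for the same trips.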
Sorry about slow reply, stuff came up.
> This is the same chart linked in the main post.
Thanks for pointing that out. I took a break in the middle of reading the post and didn't realize that.
> Again, I am not here to dispute that car-related deaths are an order of magnitude more frequent than bus-related deaths. But the aggregated data includes every sort of dumb driver doing very risky things (like those taxi drivers not even wearing a seat belt).
Sure. I'm not sure what you wanted to discuss. I guess I didn't make it clear what I want to dis...
The first result when I searched for "fatalities per passenger mile cars" (I have no idea how good those numbers are, I don't have time to check) has data for 2007-2021. 2008 looks like the year where cars look comparatively least bad. It says (deaths per 100,000,000 passenger miles):
> The exact example is that GPT-4 is hesitant to say it would use a racial slur in an empty room to save a billion people. Let’s not overreact, everyone?
I mean, this might be the correct thing to do? ChatGPT is not in a situation where it could save 1B lives by saying a racial slur.
It's in a situation where someone tries to get it to admit it would say a racial slur under some circumstance.
I don't think that ChatGPT understands that. But OpenAI makes ChatGPT expecting that it won't be in the 1st kind of situation but will be in the 2nd kind of situation quite often.
I'm replying only here because spreading discussion over multiple threads makes it harder to follow.
You left a reply on a question asking how to communicate about reasons why AGI might not be near. The question refers to costs of "the community" thinking that AI is closer than it really is as a reason to communicate about reasons it might not be so close.
So I understood the question as asking about communication with the community (my guess: of people seriously working and thinking about AI-safety-as-in-AI-not-killing-everyone). Where it's important to actual...
Here is an example of someone saying "we" should say that AGI is near regardless of whether it's near or not. I post it only because it's something I saw recently and so could find easily, but my feeling is that I'm seeing more comments like that than I used to (though I recall Eliezer complaining about people proposing conspiracies on public forums, so I don't know if that's new).
I don't know but I can offer some guesses:
> - Indefinitely-long-timespan basic minimum income for everyone who
Looks like part of the sentence is missing
> ...one is straightforwardly true. Aging is going to kill every living creature. Aging is caused by complex interactions between biological systems and bad evolved code. An agent able to analyze thousands of simultaneous interactions, cross millions of patients, and essentially decompile the bad code (by modeling all proteins/ all binding sites in a living human) is likely required to shut it off, but it is highly likely with such an agent and with such tools you can in fact save most patients from aging. A system with enough capabiliti
I think it should be much easier to get a good estimate of whether cryonics would work. For example:
And it's a much less risky path than doing AGI quickly. So I think it's a mitigation it'd be good to work on, so that waiting to make AI safer is more palatable.
> Remember that no matter what, we’re all going to die eventually, until and unless we cure aging itself.
Not necessarily, there are other options. For example cryonics.
Which I think is important. If our only options were:
1) Release AGI which risks killing all humans with high probability, or
2) Don't do it until we're confident it's pretty safe, and each human dies before they turn 200.
I can see how some people might think that option 2) guarantees the universe loses all value for them personally and choose 1) even if it's very risky.
However we hav...
> For example, you suggest religion involves a set of beliefs matching certain criteria. But some religions really don't care what you believe! All they ask is that you carry out their rituals. Others ask for faith but not belief, but this is really weird if all you have is a Christian framing where faith is exclusively considered with respect to beliefs.
Could you give some examples of such religions (ones recognized by many people as religions that don't match the definition of religion from the post)?
> I don't feel this way about something like, say, taking oral vitamin D in the winter. That's not in opposition to some adaptive subsystem in me or in the world. It's actually me adapting to my constraints.
> If someone's relationship to caffeine were like that, I wouldn't say it's entropy-inducing.
I think this answers a question / request for clarification I had. So now I don't have to ask.
(The question was something like "But sometimes I use caffeine because I don't want to fall asleep while I'm driving (and things outside my control made it so that doing a few hundred km of driving now-ish is the best option I can see)".)
> But in that case we just apply verification vs generation again. It's extremely hard to tell if code has a security problem, but in practice it's quite easy to verify a correct claim that code has a security problem. And that's what's relevant to AI delegation, since in fact we will be using AI systems to help oversee in this way.
I know you said that you're not going to respond but in case you feel like giving a clarification I'd like to point out that I'm confused here.
Yes, it's usually easy to verify that a specific problem exists if the exact problem...
What examples of practical engineering problems actually have a solution that is harder to verify than to generate?
My intuition says that we're mostly engineering to avoid problems like that, because we can't solve them by engineering. Or we use something other than engineering to ensure that the problem is solved properly.
For example, most websites don't allow users to enter plain HTML. Because while it's possible to write non-harmful HTML, it's rather hard to verify that a given piece of HTML is indeed harmless. Instead sites allow something like mark...
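A minimal shell sketch of that "make it safe by construction instead of verifying" move (my own illustration, not from the thread; user_comment is a hypothetical variable holding untrusted input):

escape_html() {
  # Replace & first so the entities introduced below don't get double-escaped
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}
printf '%s\n' "$user_comment" | escape_html  # output renders as literal text, whatever it contains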
I'm confused. What is the outer optimization target for human learning?
My two top guesses below.
To me it looks like human values are the result of humans learning from the environment (which was influenced by earlier humans and includes current humans). So it's kind of like human values are, by definition, what humans learned. So observing that humans learned human values doesn't tell us anything.
Or maybe you mean something like parents / society / ... teaching new humans their values? I see some other problems there:
> This doesn't always work: sometimes people develop an avoidance to going to the doctor or thinking about their health problems because of this sort of wireheading.
Yes, but I'd like to understand how sometimes it does work.
I think I was thinking about this post. I'm still interested in learning where I could learn more about this (I can now try to backtrack from the post, but since it links to a debate it might be hard to get to the sources).
Yes, I felt that I was missing a point, thank you for pointing to the thing you found interesting in it.
> it's easier to put yourself into the other person's ontology and get the message across in terms that they would understand, rather than trying to explain all of science.
Is a thing that makes sense. But I think the quote doesn't point at it very well. First, a big chunk of it is asserting that belief in the witchcraft theory of disease is similar to belief in the germ theory of disease. (I don't know how well the average person understands what viruses are.)
Second whe...
I'm not sure what you find interesting about the quote, but I think it's pretty badly mistaken in trying to make it look like belief in witchcraft is very similar to belief in viruses.
> ...When people get sick for unaccountable reasons in Manhattan, there is much talk of viruses and bacteria. Since doctors do not claim to be able to do much about most viruses, they do not put much effort into identifying them. Nor will the course of a viral infection be much changed by a visit to the doctor. In short, most appeals in everyday life to viruses are like most
> trying to make it look like belief in witchcraft is very similar to belief in viruses.
I feel like you're missing the point. Of course, the germ theory of disease is superior to 'witchcraft.' However, in the average person's use of the term 'virus,' the understanding of what is actually going on is almost as shallow as 'witchcraft.' Of course, 'virus' does point towards a much deeper and important scientific understanding of what is going on, but in its everyday use, it serves the same role as 'witchcraft.'
The point of the quote is that sometimes, when ...
I've seen the idea in this post:
> ...Every now and then, you’ll have an opportunity to get great leverage on your money, your time, your energy, your friends, your internet connection, and so forth. Most days, the value of free time is relatively low, and can even be negative (“I’m bored!”), but when you need that time, you really need it. Money isn’t that important most of the time, but when you need it and don’t have it, it’s really bad. Most of the time being low energy, or not having as many friends as you’d like, or having a spotty internet connection, is
> you presumably don’t hand out any company credit cards at least outside of special circumstances.
This reminded me of an anecdote from "Surely You're Joking, Mr. Feynman!" where Feynman says that he
> had been used to giving lectures for some company or university or for ordinary people, not for the government. I was used to, "What were your expenses?" "Soandso much." "Here you are, Mr. Feynman.
I remember reading that and thinking that it's different from what I have to do (at a private ...
"V jbefuvc fngna naq V'z zneevrq." ?
One more thing you might want to consider are vaccine certificates.
Where I live, certificates are valid for a year and a booster shot renews the certificate. Also, where I live, one becomes eligible for a booster shot 6 months after the final vaccine dose. So if one gets the booster shot ASAP, one gets 18 months of valid certificate. If one delays the booster shot until the last moment, one gets 24 months of a valid certificate.
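Concretely, with the numbers above (12 months of validity, booster eligibility 6 months after the final dose): boosting ASAP at month 6 renews the certificate through month 6 + 12 = 18, while delaying until just before expiry at month 12 keeps it valid through month 12 + 12 = 24.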
And a valid certificate is very useful over here, so there is a real trade-off between making oneself safer against infection vs making more actions available in the future.
I think it kind of sucks that this is a tradeoff one has to consider.
I have only a very vague idea of what the different ways of reasoning are (vaguely related to “fast and effortless” vs “slow and effortful” in humans? I don’t know how that translates into what’s actually going on (rather than how it feels to me)).
Thank you for pointing me to a thing I’d like to understand better.
I was thinking that current methods could produce AGI (because they're Turing-complete) and they're apparently good at producing some algorithms, so they might be reasonably good at producing AGI.
The 2nd part of that wasn't explicit for me before your answer, so thank you :)
> Which is basically this: I notice my inside view, while not confident in this, continues to not expect current methods to be sufficient for AGI, and expects the final form to be more different than I understand Eliezer/MIRI to think it is going to be, and that the AGI problem (not counting alignment, where I think we largely agree on difficulty) is ‘harder’ than Eliezer/MIRI think it is.
Could you share why you think that current methods are not sufficient to produce AGI?
Some context:
After reading Discussion with Eliezer Yudkowsky on AG...
I want to be clear that my inside view is based on less knowledge and less time thinking carefully, and thus has less and less accurate gears, than I would like or than I expect to be true of many others' models here (e.g. Eliezer's).
Unpacking my reasoning fully isn't something I can do in a reply, but if I had to say a few more words, I'd say it's related to the idea that the AGI will use qualitatively different methods and reasoning, and not thinking that current methods can get there, and that we're getting our progress out of figuring out how to ...
Turing completeness is definitely the wrong metric for determining whether a method is a path to AGI. My learning algorithm of "generate a random Turing machine, test it on the data, and keep it if it does the best job of all the other Turing machines I've generated, repeat" is clearly Turing complete, and will eventually learn any computable process, but it's very inefficient, and we shouldn't expect AGI to be generated using that algorithm anytime in the near future.
Similarly, neural networks with one hidden layer are universal function approximators, an...
First, I'll echo what many others said. You need to rest, so be careful not to make things worse (by not resting properly and as a result performing worse at work / school / whatever you do in your "productive time").
That said, if you feel like you're wasting time then it's ok to improve that. Some time ago I felt like I was wasting a big chunk of my time. What worked for me was trying out a bunch of things.
Doing chores. Cooking, cleaning my apartment, replacing my clothes with new ones, maintaining my car. Learning how to get better at chores, in a low effo...
> Actually, this is heavily criticized by almost anyone sensible in the field: see for example this post by Nate Soares, director of MIRI.
The link is broken. Did you mean to link to this post?
I too want to say that my dentist never even suggested getting an x-ray during a routine check up.
I’ve had a dental x-ray once but it was when looking into a specific problem.
I haven’t had any cavities in years. Back when I had cavities, the dentist found them by looking at my teeth; no x-ray needed.
The description doesn't fully specify what's happening.
You're ignoring that with probability 1/4 the agent ends up in room B. In that case you don't get to decide, but you do get to collect a reward, which is 3 for (the other agent) guessing T, or 0 for (the other agent) guessing H.

So basically guessing H is increasing your own expected reward at the expense of the other agent's expected reward (before you actually went to a room you didn't know whether you'd be an agent who gets to decide or not, so your expected reward also included part of the expected reward of the agent who doesn't get an opportunity to make a guess).
> There wasn’t an elegant way to set the specific times I wanted my computer to shut down.
You could change the script to check the time and configure cron to run it every 30 minutes, all day.
H=$(date +%H)  # current hour, 00-23
if [ "$H" -gt 8 ] && [ "$H" -lt 22 ]; then
  # Daytime (09:00-21:59): don't try to shut down
  exit
fi
# Script to try to shut down goes here
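For the cron side, a crontab entry along these lines (the script path is hypothetical) runs the check every 30 minutes, all day:

*/30 * * * * /path/to/shutdown-check.sh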
It seems I misunderstood the level of English pronunciation you're at and the level you're aiming for. Could you clarify?
What I wrote in my comment is what made me comfortable with speaking in English. I got some compliments on my English later and some surprised reactions when I said I wasn't a native speaker (which I count as weak and strongish evidence, respectively, for being good at spoken English). I'm not sure I did much else, but I might be able to write how I leveled up if I know which level up you're looking for (in case you read but don't reply: most likely the answer is practice (preferably in a way that is rewarding in itself)).
Upboat for a recommendation that I think wouldn't work for me but looks like it would work for many other people. It's always interesting to see those (at least for me ;) ).
I guess this depends a lot on what kind of person you are. What worked for me was:
Basically exposing myself to spoken English in ways that were rewarding on their own.
I think there is no reason to expect a single meaning of the word. You did a good job in enumerating uses of 'abstraction' and finding its theme (removed from specific). I don't understand what confusion remains though.
A link / googleable phrase for KonMarie, please?
I kept on reading and wanted to check your numbers further (the concrete math I could do in my head seems correct, but I wanted to check moar) but I got lost in my tiredness and spreadsheets. If you're interested in feedback on the math you're doing: smaller steps are easier to verify. For example, when you give the formula for P(D|+), in order to verify it I have to check the formula, the value of each conditional probability (including figuring out the formula for each of those), and the result, all at the same time.
It would be much easier to verify if you wrote down the intermediate steps (possibly simplifying verification from 30 minutes of spreadsheet munching to a few in-head multiplications).
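For instance (standard Bayes' rule, not numbers from the post): P(D|+) = P(+|D)·P(D) / (P(+|D)·P(D) + P(+|¬D)·P(¬D)). Writing P(+|D)·P(D) and P(+|¬D)·P(¬D) out as separate intermediate lines would let a reader check each factor on its own before checking the final division.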
I'm pretty sure you got the math wrong here:
> O(D:¬D), read as the odds of dementia to no dementia, is the odds ratio that D is true compared to the odds ratio that D is false. O(D:¬D)=3:1 means that it's 3 times as likely that somebody has dementia than that they don't. It doesn't say anything about the magnitude of the probability, so it could be small, like 3% and 1%, or big, like 90% and 30%.
P(D or ¬D) = 1 (with P=1 one either has dementia or doesn't have it) and P(D and ¬D) = 0 (the probability of both having dementia and not having it is 0)...
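(Spelling out the arithmetic with the quoted 3:1 odds: O(D:¬D) = P(D)/P(¬D) = 3 together with P(D) + P(¬D) = 1 gives P(D) = 3/4 = 75% and P(¬D) = 1/4 = 25%. Pairs like 3% and 1%, or 90% and 30%, can't be P(D) and P(¬D), since they don't sum to 100%.)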
I'm halfway through the article and it's been an interesting read so far, but I got to this sentence:
> But that is the trouble: we have no way to tell which traditions are adaptive and which are merely drift.
The article (so far) didn't provide evidence for that. I'd even say that the article provides some evidence against this claim. It describes a bunch of traditions, identifies them as useful, and explains why they're useful. I think there are examples of traditions that people identified as useless (or harmful). Like using tort...
> so i kinda expected those. so do you know of any evidence that people's minds were changed significantly or mostly due to debate/discussion? polls? surveys? ???
If debate / discussion doesn't actually change people's minds, then it's totally safe to let anyone defend whatever nonsense they want; they're not going to change anyone's mind anyway.
Link doesn't work (points to http://0.0.0.6). What should it go to?