As a native speaker of a language that has only gender-neutral pronouns and no gendered ones, I often stumble and misgender people, because tracking that information is just not how referring works in my brain. I suspect that native English speakers don't have this problem and the self-reports are about them.
What language is this?
But you said that I should use orange juice as a replacement because it's similarly sweet.
Does ChatGPT think tequila is sweet, orange juice is bitter...or is it just trying to sell you drinks?*
tequila has a relatively low alcohol content
Relative to what ChatGPT drinks no doubt.
And tequila doesn’t have any sugar at all.
*Peer pressure you into drinking it, maybe.
At best this might describe some drinks that have tequila in them. Does it know the difference between "tequila" and "drinks with tequila"?
Does ChatGPT not differentiate between sweet and sug...
these success stories seem to boil down to just buying time, which is a good deal less impressive.
The counterpart to 'faster vaccination approval' is 'buying time', though. (Whether or not the time ends up being well used, it is a gain at the time.) The other reason to focus on it: how much can you affect pool testing versus vaccination approval speed? Other things, like improving statistical techniques, might be easier for a lot of people than changing a specific organization.
Overall this was pretty good.
That night, Bruce dreamt of being a bat, of swooping in to save his parents. He dreamt of freedom, and of justice, and of purity. He dreamt of being whole. He dreamt of swooping in to protect Alfred, and Oscar, and Rachel, and all of the other good people he knew.
The part about "purity" didn't make sense.
Bruce would act.
This is a bit of a change from before - something about the mistake seems like it would make more sense than worry. ('Bruce would get it right this time', or something like 'Bruce would act (and it would make things better this time)'. Maybe 'Bruce wouldn't be afraid'?)
I was thinking
The rules don't change over time, but what if on... the equivalent of the summer solstice, fire spells get +1 fire mana or something? I.e., periodic behavior. Wait, I misread that. I meant more like: rules might be different, say, once every hundred years (the anniversary of something important) - like there are more duels that day, so you might have to fight multiple opponents, or something.
This is a place where people might look at the game's flux, and go 'the rules don't change'.
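A rough sketch (pandas, with hypothetical column names - assumes a duels.csv with 'year' and 'won' columns) of how I'd check for that kind of periodic behavior:

```python
import pandas as pd

df = pd.read_csv("duels.csv")               # hypothetical dataset file
df["years_since_event"] = df["year"] % 100  # guess at a 100-year cycle
# If the win rate or duel count jumps at years_since_event == 0,
# the rules may work differently on the anniversary.
print(df.groupby("years_since_event")["won"].agg(["mean", "count"]))
```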
Our world is so inadequate that seminal psychology experiments are described in mangled, misleading ways. Inadequacy abounds, and status only weakly tracks adequacy. Even if the high-status person belongs to your in-group. Even if all your smart friends are nodding along.
It says he started with the belief - not that he was right, or that he ended with it. Keeping the idea contained to its source, so it's clear it's not being asserted, could be improved, yes.
This is what would happen if you were magically given an extraordinarily powerful AI and then failed to align it,
Magically given a very powerful, unaligned AI. (This 'the utility function is in code, in one place, and can be changed' assumption needs re-examination. Even if we assert it exists in there*, it might be hard to change in, say, a NN.)
* Maybe this is overgeneralizing from people, but what reason do we have to think an 'AI' will be really good at figuring out its utility function (so it can make changes to itself without changing it, if it so desires)? ...
Spoilering/hiding questions. Interesting.
Do the rules of the wizards' duels change depending on the date?
I'll aim to post the ruleset and results on July 18th (giving one week and both weekends for players). If you find yourself wanting extra time, comment below and I can push this deadline back.
The dataset might not have enough info for this, and the rules might not be deep enough, but a wizards' duel between analysts, or 'players', also sounds like it could be fun.
I think that is a flaw of comments, relative to 'Google Docs'. Long documents without the referenced areas being tagged in comments might make it hard to find other people asking the same question you did, even if someone wondered about the same section. (And the difficulty of ascertaining that quickly seems unfortunate.)
For example, if our function measures the probability that some particular glass is filled with water, the space near the maximum is full of worlds like “take over the galaxy and find the location least likely to be affected by astronomical phenomena, then build a megastructure around the glass designed to keep it full of water”.
If the function is 'fill it and see that it stays filled forever', then strange things may be required to accomplish that (to us) strange goal.
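To illustrate (a toy sketch, with made-up worlds and scores): a pure maximizer of such a function picks the strangest world, because the function only measures the glass, not everything else we care about.

```python
# Made-up candidate worlds, scored by P(glass stays full).
worlds = {
    "pour water into the glass": 0.90,
    "pour water, post a guard": 0.99,
    "build a megastructure around the glass": 0.999999,
}

def p_glass_full(world: str) -> float:
    """Proxy objective: probability the glass is (and stays) full."""
    return worlds[world]

# argmax lands on the extreme world.
print(max(worlds, key=p_glass_full))  # -> "build a megastructure around the glass"
```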
...Idea:
Don’t specify our goals to AI using functions.
Flaw:
Current deep learning methods use f
Agree/Disagree are weird when evaluating your comment.
Agree with you asking the question (it's the right question to ask) or disagree with your view?
I read Duncan's comment as requesting that the labeling of the buttons be more explicit in some way, though I wasn't sure if it was your way. (Also as Duncan disagreeing with what they reflect.)
I think some aspects of 'voting' might benefit from being public. 'Novelty' is one of them. (My first thought when you said 'can't be downvoted' was 'why?'. My filtering desires for this might be...complex. The simple feature being:
I want to be able to sort by novelty. (But also be able to toggle 'remove things I've read from the list'. A toggle, because I might want it to be convenient to revisit (some) 'novel' ideas.))
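A rough sketch of that simple feature (all names hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Comment:
    text: str
    novelty: float  # e.g. a public 'novelty' vote tally
    read: bool

def novelty_feed(comments: list[Comment], hide_read: bool = False) -> list[Comment]:
    """Sort by novelty, optionally hiding already-read items."""
    pool = [c for c in comments if not (hide_read and c.read)]
    return sorted(pool, key=lambda c: c.novelty, reverse=True)

feed = [
    Comment("old but novel idea", novelty=0.9, read=True),
    Comment("fresh take", novelty=0.7, read=False),
    Comment("routine reply", novelty=0.1, read=False),
]
# Toggle off: novel-but-read ideas stay visible for revisiting.
print([c.text for c in novelty_feed(feed)])
# Toggle on: only unread items remain.
print([c.text for c in novelty_feed(feed, hide_read=True)])
```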
Upvoting/downvoting self
'Agreeing'/'Disagreeing'
These methods aren't necessarily very effective (here).
Arguably, this can be done better by:
Having them be public (likely in text). What you think of your work is also important. ('This is wrong. I'm leaving it up, but also see this post explaining where I went wrong, etc.')
See the top of this article for an example: https://www.gwern.net/Fake-Journal-Clu...
For companies, this is something like the R&D budget. I have heard that construction companies have very little or no R&D. This suggests that construction is a "background assumption" of our society.
Or that research is happening elsewhere. Our society might not give it as much focus as it could, though.
In the context of quantilization, we apply limited steam to projects to protect ourselves from Goodhart. "Full steam" is classically rational, but we do not always want that. We might even conjecture that we never want that.
So you never do anything with your full strength, because getting results is bad?
Well, by 'we' you mean both 'you' and 'a thing you are designing with quantilization'.
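A minimal quantilizer sketch (toy action space, made-up proxy utility), contrasting 'full steam' with 'limited steam':

```python
import random

def quantilize(actions, utility, q=0.1):
    """Sample uniformly from the top q fraction of actions by proxy utility."""
    ranked = sorted(actions, key=utility, reverse=True)
    top = ranked[: max(1, int(len(ranked) * q))]
    return random.choice(top)

actions = list(range(100))         # stand-in action space
proxy = lambda a: a                # proxy utility: bigger looks "better"
print(max(actions, key=proxy))     # full steam: always the extreme action, 99
print(quantilize(actions, proxy))  # limited steam: a good-but-not-extreme action
```

Sampling from the good-enough set, instead of taking the argmax, is the adjustable 'steam' knob - it keeps you off the extreme tail where Goodhart bites hardest.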
It seems to me that in a competitive, 2-player, minimize-resource-competition StarCraft, you would want to go kill your opponent so that they could no longer interfere with your resource loss?
I would say that in general it's more about what your opponent is doing. If you are trying to lose resources and the other player is trying to gain them, you're going to get along fine. (This would likely be very stable and common if players can kill units and scavenge them for parts.) If both of you are trying to lose them...
Trying to minimize resources is a...
So far as I know, every principle of this kind, except for Jessica Taylor's "quantilization", and "myopia" (not sure who correctly named this as a corrigibility principle), was invented by myself; eg "low impact", "shutdownability". (Though I don't particularly think it hopeful if you claim that somebody else has publication priority on "low impact" or whatevs, in some stretched or even nonstretched way; ideas on the level of "low impact" have always seemed cheap to me to propose, harder to solve before the world ends.)
Low impact seems so easy to pro...
3. AI which ultimately wants to not exist in future as a terminal goal. Fulfilling the task is on the simplest trajectory to non-existence
The first part of that sounds like it might self-destruct. And if it doesn't care about anything else... that could go badly. Maybe nuclear-badly, depending... The second part makes it make more sense, though.
9. Ontological uncertainty about level of simulation.
So it stops being trustworthy if it figures out it's not in a simulation? Or that it is being simulated?
I think those are just two principles, not four.
Myopia seems like it includes/leads to 'shutdownability', and some other things.
Low impact: How low? Quantilization is meant as a form of adjustable impact. There's been other work* around this (formalizing power/affecting others' ability to achieve their goals).
*Like this, by TurnTrout: https://www.lesswrong.com/posts/yEa7kwoMpsBgaBCgb/towards-a-new-impact-measure
I think there might be more from TurnTrout, or relating to that. (Like stuff that was intended to explain it 'better' or as the ideas ch...
I would set up a "council" of AGI-systems (a system of systems), and when giving it requests in an oracle/genie-like manner I would see if the answers converged. At first it would be the initial AGI-system, but I would use that system to generate new systems to the "council".
I like this idea. Although, if things don't converge, i.e. there is disagreement, that could serve to identify information that is needed to proceed, or to reckon further/more efficiently.
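A rough sketch of the convergence check I'm imagining (stand-in 'systems', hypothetical names):

```python
from collections import Counter

def ask_council(systems, question, threshold=0.8):
    """Act only if enough of the council converges on one answer."""
    answers = [system(question) for system in systems]
    best, count = Counter(answers).most_common(1)[0]
    if count / len(answers) >= threshold:
        return best   # convergence: act on the answer
    return None       # disagreement: a signal that more info is needed

# Stand-in "systems": any callables mapping a question to an answer.
council = [lambda q: "yes", lambda q: "yes", lambda q: "no"]
print(ask_council(council, "Is the plan safe?"))  # None: 2/3 < 0.8, investigate
```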
-Tell operators anything about yourself they may want to or should know.
...
but of course explain what you think the result will be to them
Possible issue: They won't have time to listen. This will limit the ability to:
defer to human operators.
Also, does 'defer to human operators' take priority over 'humans must understand consequences'?
The article is short enough - one page! - that you should read it instead of the description that follows. One thing I appreciate about it is that it covers just one subject, briefly, and does so well.
I'm not sure if I have the right to copy the article over, so I didn't. I came across a screenshot of it online, and looked up the source above.
This article is about how feeling stupid is a sign of ignorance, but it's also something that happens when you're learning (e.g., grad school and beyond), especially when you're working on projects to find out things that no one else has yet. (e.g. ...
So, again, you end up needing alignment to generalize way out of the training distribution
I assume this means 'you need alignment if you are going to generalize way out of the training distribution and give the AI a lot of power' (or you will die).
And not something else like 'it must stay 'aligned' - and not wirehead itself - to pull something like this off, even though it's never done that before'. (And thus 'you need alignment to do X' not because you will die if you do it without alignment, but because alignment means something like 'the ability to generalize way out of ...
'This problem seems hard. Perhaps making AI that's generally good, and then having the AI do it would be easier.'
How technical is the use of the word 'distributed' here?
While arranging my evening, I may perform some Bayesian updates. Maybe I learn that the movie is not available on Netflix, so I ask a friend if they have a copy, then check Amazon when they don’t. This process is reasonably well-characterized as me having a centralized model of the places I might find the movie, and then Bayes-updating that model each time I learn another place where I can/can’t find it.
It seems more like going through a list of places and checking off 'not there' than Bayesian ...
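A toy contrast (made-up priors). For a search like this, the Bayesian version ends up tracking the checklist - each 'not there' just removes a place and renormalizes:

```python
prior = {"Netflix": 0.5, "friend": 0.3, "Amazon": 0.2}

def update_not_there(beliefs, place):
    """Condition on 'movie not found at place' and renormalize."""
    posterior = {k: v for k, v in beliefs.items() if k != place}
    total = sum(posterior.values())
    return {k: v / total for k, v in posterior.items()}

beliefs = update_not_there(prior, "Netflix")
print(beliefs)  # {'friend': 0.6, 'Amazon': 0.4} - same ordering as a checklist
```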
1. Yeah, this is tricky. I didn't like the terminology, but I didn't have a replacement. It's hard to come up with a term for this (for reasons discussed at length in the post). I was looking more at 'both are 'boundaries'' and disambiguating that it is your boundary (versus the social one) that you are sort of opting in/asking others to work with you to define. (Opting-in (by self) to boundary exploration (of self by others).) 'Boundary exploration' still doesn't sound good, though 'boundary violation' sounds worse. Emphasizing the opt-in part in the term...
(Prompt:)
The important part would be:
1. The post communicates its point but the terminology could be better. (Which is probably why there are so many "hedges".)
Less important:
2. In order to scale up, some things do require opt in/advance notice. Some possibilities are (largely) exclusive of each other. (A costume party and a surprise water balloon fight.)
3. The post mentions that different subcultures have different rules, but talks about society's boundaries as if they were just one thing.
(Purpose:)
Overall, I made notes as I read the post. (This post i...
Still reading the rest of this.
"Playful Thinking" (Curiosity driven ex... (read more)