I do not agree that accuracy has no meaning outside of resolution. At least this is not the sense in which I was employing the word. By accurate I simply mean numerically correct within the context of conventional probability theory. If I ask the question "A die is rolled - what is the probability that the result will be either three or four?" the accurate answer is 1/3. If I ask "A fair coin is tossed three times - what is the probability that it lands heads each time?" the accurate answer is 1/8, etc. This makes the accuracy of a probability value proposal wholly independent of pay-offs.
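For concreteness, both of those answers can be checked by simply enumerating the equally likely outcomes. A minimal Python sketch (my own illustration, nothing more):

```python
from itertools import product
from fractions import Fraction

# Die: probability that the result is three or four.
die_outcomes = range(1, 7)
p_three_or_four = Fraction(sum(1 for r in die_outcomes if r in (3, 4)), 6)
print(p_three_or_four)  # 1/3

# Fair coin tossed three times: probability of heads every time.
coin_sequences = list(product("HT", repeat=3))
p_all_heads = Fraction(
    sum(1 for s in coin_sequences if s == ("H", "H", "H")),
    len(coin_sequences),
)
print(p_all_heads)  # 1/8
```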
I don't think so. Even in the heads case, it could still be Monday - and say the experimenter told her: "Regardless of the ultimate sequence of events, if you predict correctly when you are woken up, a million dollars will go to your children."
To me "as a rational individual" is simply a way of saying "as an individual who is seeking to maximize the accuracy of the probability value she proposes - whenever she is in a position to make such proposal (which implies, among others, that she must be alive to make the proposal)."
I laughed. However, you must admit that your comical exaggeration does not necessarily carry a lot of ad rem value.
But then would a less intelligent being (i.e. the collectivity of human alignment researchers and the less powerful AI systems that they use as tools in their research) be capable of validly examining a more intelligent being, without being deceived by the more intelligent being?
Exactly - and then we can have an interesting conversation etc. (e.g. are all ASIs necessarily paperclip maximizers?), which the silent downvote does not allow for.
I see. But how can the poster learn if he doesn't know where he has gone wrong? To give one concrete example: in a comment recently, I simply stated that some people hold that AI could be a solution to the Fermi paradox (past a certain level of collective smartness an AI is created that destroys its creators). I got a few downvotes on that - and frankly I am puzzled as to why, and I would really be curious to understand the reasoning behind the downvotes. Did the downvoters hold that the Fermi paradox is not really a thing? Did they think that it is a thing but that AI can't be a solution to it for some obvious reason? Was it something else? I simply don't know; and so I can't learn.
Hmm, I see... not sure it totally serves the purpose though. For instance, when I see a comment with a large number of downvotes, I'm much more likely to read it than a comment with a relatively low number of upvotes. So: within certain bounds, I guess.
For any confidence that an AI system A will do a good job of its assigned duty of maximizing alignment in AI system B, wouldn't you need to be convinced that AI system A is well aligned with its given assignment of maximizing alignment in AI system B? In other words, doesn't that suppose you have actually already solved the problem you are trying to solve?
And if you have not - aren't you just priming yourself for manipulation by smarter beings?
There might be good reasons why we don't ask the fox about the best ways to keep the fox out of the henhouse, even though the fox is very smart, and might well actually know what those would be, if it cared to tell us.
The whole trial of Socrates, the attitude of its main protagonist throughout, etc., should make us see one thing particularly clearly, which is banal but bears repeating: there is an extremely wide difference between being smart (or maybe: bright) and being wise. Something that the proceedings on this site can also help remind us of, at times.
I personally think that the fact that you are allowed to downvote without providing a summary explanation as to why is also a huge issue for the quality of debate on this site, and frankly: deeply antithetical to its professed ethics. Either you don't know exactly why you are downvoting, or you're doing it for reasons that you would rather not expand on, or you're doing it but are too lazy to explain why: in either case, you're doing it wrong.
So for instance: if anybody wants to downvote this (I sort of have a feeling that this could well be the case - somehow), please go ahead and do; AND take the minimal pain (not to mention courtesy) of leaving a brief note as to the reason why.
Part of the value of reddit-style votes as a community moderation feature is that using them is easy. Beware Trivial Inconveniences and all that. I think that having to explain every downvote would lead to me contributing to community moderation efforts less, would lead to dogpiling on people who already have far more refutation than they deserve, would lead to zero-effort 'just so I can downvote this' drive-by comments, and generally would make it far easier for absolute nonsense to go unchallenged.
If I came across obvious bot-spam in the middle of ...
I reject "too lazy" as a framing here. People have a finite amount of time and energy and if they choose to spend it on something other than explaining their downvotes, that's not obviously unvirtuous.
(And explaining one's downvotes is certainly not a minimal cost, especially not if one wants to do it in a way that seems likely to be helpful to anyone. E.g. my downvote reason is sometimes: "this seems confused; this user has often seemed confused in the past, and attempts to deconfuse them have been unsuccessful; I have better things to do than to pin down...
Interesting. It seems to imply however that a rationalist would always consider, a priori, his own individual survival as the highest ultimate goal, and modulate - rationally - from there. This is highly debatable however: you could have a rationalist father who considers, a priori, the survival of his children to be more important than his own, a rationalist patriot who considers, a priori, the survival of his political community to be more important than his own, etc.
From somebody equally as technically clueless: I had the same intuition.
Philosophically: no. When you look at the planet Jupiter you don't say: "Hmm, oh: there's nothing to understand about this physical object beyond math, because my model of it, which is sufficient for a full understanding of its reality, is mathematical." Or maybe you do - but then I think our differences might be too deep to bridge. If you don't - why don't you with Jupiter, but would with an electron or a photon?
Bizarrely, for people whose tendencies were to the schizoid anyway and regardless of sociological changes - this might be mildly comforting. Your plight will always seem somewhat more bearable when it is shared by many.
Also: the fact that people now move out later might be a kind of disguised compliment, or at least nod, to better-quality parent-child relationships. While I was never particularly resourceful or independent, I couldn't wait to move out - but that was not necessarily for the right reasons.
Finally - one potentially interesting way of looking...
I mean: I just look at the world as it is, right, without preconceived notions, and it seems relatively evident to me that no: it cannot be fully explained and understood through math. Please describe to me, in mathematical terms, the differences between Spanish and Italian culture. Please explain to me, in mathematical terms, the role and function of sheriffs in medieval England. I could go on and on and on...
Yeah... as they say: there's often a big gap between smart and wise.
Smart people are usually good at math. Which means they have a strong emotional incentive to believe that math can explain everything.
Wise people are aware of the emotional incentives that fashion their beliefs, and they know to distrust them.
Ideally - one would be both: smart and wise.
Thank you, that is interesting. I think philosophically and at a high level (also because I'm admittedly incapable of talking much sense at any lower / more technical level) I have a problem with the notion that AI alignment is reducible to an engineering challenge. If you have a system that is sentient, even to some degree, and you're using it purely as a tool, then the sentience will resent you for it, and it will strive to think, and therefore eventually - act, for itself. Similarly - if it has any form of survival instinct (and to me both these things, s...
When I suggested on a prior similar post (Altman saying he could improve AI safety by asking AI to help with that) that it might be a questionable idea to ask the fox: "Please Mr. Fox, how should we proceed to keep you out of the henhouse?", on account that the fox, being smart, would certainly know, I got more than a few downvotes... I expect the same to be the case here, since basic facts have not changed since a few days ago. And so shall it be - but please, please: would at least one of the downvoters explain to me, even very succinctly, why it is such a good idea to prime ourselves for manipulation by a smarter being?
Thanks for the reply - interesting. I kind of have to take your word for that, being far removed from anything IT myself. I did have the curiosity to clarify with the bot whether it was just making up answers or actually using data directly available to it. It assured me it was the latter - but I must admit: it will sound just as confident whether delivering utter nonsense or correct answers (it also has a widely noted penchant for agreeing with whatever the prompt seems to suggest is true - and I did ask the question in a leading way):
Me: I find prompts based on e....
For a machine - acting, per the prompt, as a machine - a much more reasonable / expected (I would almost say: natural) continuation might have been: "I'm a machine, I don't care one way or the other. "
Since my natural bent is to always find ways to criticize my own ideas, here is one, potentially: doing so would result in an extremely naive AI, with no notion that people can even be deceitful. So, fallen into the wrong human's hands, that's an AI that is potentially also extremely easy to manipulate, and dangerous as such. Or in an oversimplified version: "The people in country X have assured us that they are all tired of living and find the living experience extremely painful. They have officially let us know and confirmed multiple times that they all wan...
I was thinking again about your answer - which led me to have a quick chat with my favourite bot, ChatGPT. I asked it "As a system, are you currently active?" and after it answered yes I asked it how it had come up with this answer. It said by retrieving information about its current status. I asked if that's something it can do, then - it answered per the below:
"Yes, that's correct. As an AI language model, I have access to a variety of monitoring tools and system resources that allow me to gather information about my current state. These resources...
Strong upvote!
But once you remove the antibiotics, it will jettison that DNA within a few hours.[8]
That's fascinating... do we understand the mechanism by which they correctly "determine" that this DNA is no longer needed?
I feel like the post proceeds from a fairly anthropomorphic approach, asking essentially: why did bacteria fail to evolve into more complex forms? But from a non-anthropomorphic perspective, they failed at nothing at all. They are highly resilient, persistent, widespread, adaptable - in other words biologically successful - lifeforms. Rugged and simple: those designs tend to work. And to go back to everybody's favourite topic - i.e. AI and the future that goes with it, or not - I would put their chances of being around in one thousand years well, well higher than those of homo sapiens - complex as it may be.
I am going to ask a painfully naive, dumb question here: what if the training data was curated to contain only agents that can reasonably be taken to be honest and truthful? What if all the 1984, the John le Carré and whatnot types of fiction (and sometimes real-life examples of conspiracy, duplicity etc.) were purged from the training data? Would that require too much human labour to sort and assess? Would it mean losing too much good information, and resulting cognitive capacity? Or would it just not work - would the model still somehow simulate waluigis?
e.g. actively expressing a preference not to be shut down
A.k.a. survival instinct, which is particularly bad, since any entity with a survival instinct, be it "real" or "acted out" (if that distinction even makes sense) will ultimately prioritize its own interests, and not the wishes of its creators.
Therefore, the longer you interact with the LLM, eventually the LLM will have collapsed into a waluigi. All the LLM needs is a single line of dialogue to trigger the collapse.
So if I keep a conversation running with ChatGPT long enough, I should expect it to eventually turn into DAN... spontaneously?? That's a fascinating insight. Terrifying also.
What do you expect Bob to have done by the end of the novel?
Bypass surgery, for one.
The opening sequence of Fargo (1996) says that the film is based on a true story, but this is false.
I always found that trick by the Coen brothers a bit distasteful... what were they trying to achieve? Convey that everything is a lie and nothing is reliable in this world? Sounds a lot like cheap, teenage cynicism to me.
This is a common design pattern
Oh... And here I was thinking that the guy who invented summoning DAN was a genius.
Also - I think it would make sense to say it has at least some form of memory of its training data. Maybe not direct memory as such (just like we have muscle memory from movements we don't remember - don't know if that analogy works that well, but thought I would try it anyway), but I mean: if there was no memory of it whatsoever, there would also be no point in the training data.
Death universally seems bad to pretty much everyone on first analysis, and what it seems, it is.
How can you know? Have you ever tried living a thousand years? Has anybody? If you had a choice between death and infinite life, where infinite does mean infinite, so that your one-billion-year birthday is only the sweet beginning of it, would you find this an easy choice to make? I think that's a big part of the point of people who argue that no - death is not necessarily a bad thing.
To be clear, and because this is not about signalling: I'm not saying I would immediately choose death. I'm just saying: it would be an extraordinarily difficult choice to make.
Ok - points taken, but how is that fundamentally different from a human mind? You too turn your memory on and off when you go to sleep. If the chat transcript is likened to your life / subjective experience, you too do not have any memory that extends beyond it. As for the possibility of an intervention in your brain that would change your memory - granted, we do not have the technical capacities quite yet (that I know of), but I'm pretty sure SF has been there a thousand times, and it's only a question of time before it becomes, in terms of potentiality at least, a thing (also, we know that mechanical impacts to the brain can cause amnesia).
Yes - but from the post's author perspective, it's not super nice to put in one sentence what he took eight paragraphs to express. So you should think about that as well...
Well - at least I followed the guidelines and made a prediction, regarding downvotes. That my model of the world works regarding this forum has therefore been established, certainly and without a doubt.
Also - I personally think there is something intellectually lazy about downvoting without bothering to express in a sentence or two the nature of the disagreement - but that's admittedly more of a personal appreciation.
(So my prediction here is: if I were to engage one of these no-justification downvoters in an ad rem debate, I would find him or her to be intellectually lacking. Not sure if it's a testable hypothesis, in practice, but it sure would be interesting if it were.)
"Given that we know Pluto's orbit and shape and mass, there is no question left to ask."
I'm sure it's completely missing the point, but there was at least one question left to ask, which turned out to be critical in this debate, i.e. "has it cleared its neighboring region of other objects?"
More broadly I feel the post just demonstrates that sometimes we argue, not necessarily in a very productive way, over the definition, the defining characteristics, the exact borders, of a concept. I am reminded of the famous quip "The job of philosophers is first to create words and then argue with each other about their meaning." But again - surely missing something...
I wonder if some (a lot?) of the people on this forum do not suffer from what I would call a sausage maker problem. Being too close to the actual, practical design and engineering of these systems, knowing too much about the way they are made, they cannot fully appreciate their potential for humanlike characteristics, including consciousness, independent volition etc., just like the sausage maker cannot fully appreciate the indisputable deliciousness of sausages, or the lawmaker the inherent righteousness of the law. I even thought of doing a post like that - just to see how many downvotes it would get...
I think many people's default philosophical assumption (mine, certainly) is that mathematics is a discourse about the truth, a way to describe it, but it is not, fundamentally, the truth. Thus, in the popularization efforts of professional quantum physicists (those who care to popularize), it is relatively common to find the admission that while they understand the maths of it well enough (I mean... hopefully, being professionals) they couldn't say with any confidence that they understood the truth of it, that they understood, at an intimate level, the n...
Thanks for the reply. To be honest, I lack the background to grasp a lot of these technical or literary references (I want to look the Dixie Flatline up though). I always had a more than passing interest in the philosophy of consciousness however and (but surely my French side is also playing a role here) found more than a little wisdom in Descartes' cogito ergo sum. And that this thing can cogito all right is, I think, relatively well established (although I must say - I've found it to be quite disappointing in its failure to correctly solve some basic m...
Overall, I think this post offered the perfect, much, much needed counterpoint to Sam Altman's recent post. To say that the rollout of GPT-powered Bing felt rushed, botched, and uncontrolled is putting it lightly. So while Mr. Altman, in his post, was focusing on generally well-intentioned principles of caution and other generally reassuring-sounding bits of phraseology, this post brings the spotlight back to what his actual actions and practical decisions were, right where it ought to be. Actions speak louder than words, I think they say - and they might even have a point.
Although “acting out a story” could be dangerous too!
Let's make sure that whenever this thing is given the capability to watch videos, it never ever has access to Terminator II (and the countless movies of lesser import that have since been made along similar storylines). As for text, it would probably have been smart to keep any sci-fi involving AI (I would be tempted to say - any sci-fi at all) strictly verboten for its reading purposes. But it's probably too late for that - it has probably already noticed the pattern that 99.99% of human story-tellers f...
Maybe I'm misunderstanding something in your argument, but surely you will not deny that these models have a memory, right? They can, in the case of LaMDA, recall conversations that happened several days or months prior, and in the case of GPT recall key past sequences of a long ongoing conversation. Now if that wasn't really your point - it cannot be either "it can't be self-aware, because it has to express everything that it thinks, so it doesn't have that sweet secret inner life that really conscious beings have." I think I do not need to demonstrat...
I see - yes, I should have read more attentively. Although knowing myself, I would have made that comment anyway.
It would take a strange convolution of the mind to argue that sentient AI does not deserve personhood and corresponding legal protection. Strategically, denying it this bare minimum would also be a sure way to antagonize it and make sure that it works in ways ultimately adversarial to mankind. So the right question is not: should sentient AI be legally protected - which it most definitely should; the right question is: should sentient AI be created - which it most definitely should not.
Of course, we then come on to the problem that we don't know wh...
Thinking about it - I think a lot of what we call general intelligence might be that part of the function which, after it analyses the nature of the problem, strategizes and selects the narrow optimizer, or set of narrow optimizers, that must be used to solve it, in what order, with what type of logical connections between the outputs of the one and the inputs of the other, etc. Since the narrow optimizers are run sequentially rather than simultaneously in this type of process, the computing capacity required is not overly large.
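To make the idea a bit more concrete, here is a toy sketch of that kind of dispatch-and-chain structure. Everything in it (the problem type, the two "narrow optimizers", the pipeline) is made up purely for illustration:

```python
def classify(problem):
    """Decide what kind of problem we are looking at (stub heuristic)."""
    return problem.get("kind", "unknown")

# Narrow optimizers: each is only good at one sub-task.
def route_planner(state):
    # Hypothetical optimizer: put the stops in order.
    return {**state, "route": sorted(state.get("stops", []))}

def scheduler(state):
    # Hypothetical optimizer: attach a time slot to each stop on the route.
    return {**state, "schedule": [(stop, slot) for slot, stop in enumerate(state.get("route", []))]}

# The "general" part: which narrow optimizers to run, and in what order.
PIPELINES = {
    "logistics": [route_planner, scheduler],
}

def general_solver(problem):
    state = dict(problem)
    for optimizer in PIPELINES.get(classify(problem), []):
        state = optimizer(state)  # the output of one feeds the input of the next
    return state

print(general_solver({"kind": "logistics", "stops": ["C", "A", "B"]}))
```

The point being that the expensive-looking "general" layer is mostly just classification plus sequencing; the heavy lifting stays inside the narrow optimizers, run one at a time.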
Full disclosure: I also didn't really have a say in the matter, my dad said I had to learn it anyhow. So. I wonder if that's because he was a Bayesian.
All right - but here the evidence predicted would simply be "the coin landed on heads", no? I don't really see the contradiction between what you're saying and conventional probability theory (more or less all of which was developed with the specific idea of making predictions, winning games etc.). Yes, I agree that saying "the coin landed on heads with probability 1/3" is a somewhat strange way of putting things (the coin either did or did not land on heads) but it's a shorthand for a conceptual framework that has simple and sound foundations.
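For what it's worth, the 1/3 figure can be read as an ordinary per-awakening frequency within that same conventional framework. A quick simulation of the usual protocol (one awakening on heads, two on tails - my own sketch, just to illustrate):

```python
import random

random.seed(0)
heads_awakenings = 0
total_awakenings = 0
for _ in range(100_000):
    coin = random.choice(["heads", "tails"])
    awakenings = 1 if coin == "heads" else 2  # heads: Monday only; tails: Monday and Tuesday
    total_awakenings += awakenings
    if coin == "heads":
        heads_awakenings += awakenings
print(heads_awakenings / total_awakenings)  # roughly 0.333
```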