You're thinking much too small; this only stops things that are causally *downstream* of us from occurring. Things will still occur in other timelines, and we should prevent those things from happening too. I propose we create a "hyperintelligence" that acausally trades across timelines, or invents time travel, to prevent anything from happening in any other universe or timeline as well. Then we'll be safe from AI ruin.
Thanks for the great link. Fine-tuning leading to mode collapse wasn't the core issue underlying my main concern/confusion (intuitively, that makes sense). paulfchristiano's reply, especially with the additional clarification from you, leaves me mostly unconfused now. That said, I am still concerned; this makes RLHF seem very 'flimsy' to me.
I was also thinking the same thing as you, but after reading paulfchristiano's reply, I now think it means that you can use the model to generate probabilities for the next token, and that those next tokens are correct about as often as those probabilities indicate. That is, it's not referring to the main way of interfacing with GPT-n (wherein a temperature schedule determines how often it picks something other than the option with the highest assigned probability), nor to asking the model "in words" for its predicted probabilities; it's reading the probabilities off the model directly, as in the sketch below.
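A minimal sketch of what I mean, assuming the Hugging Face transformers library and using GPT-2 purely as a stand-in for a base GPT-n model (the point is just where the probabilities come from, not the specific model):

```python
# Read the model's next-token probabilities directly from its logits,
# rather than sampling with a temperature or asking it "in words".
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("The capital of France is", return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# Probability the model assigns to each candidate next token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# "Calibrated" means tokens assigned probability p are correct about p of the time.
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item())!r:>10}  p={prob.item():.3f}")
```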
GPT-4 can also be confidently wrong in its predictions, not taking care to double-check work when it’s likely to make a mistake. Interestingly, the base pre-trained model is highly calibrated (its predicted confidence in an answer generally matches the probability of being correct). However, through our current post-training process, the calibration is reduced.
What??? This is so weird and concerning.
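(For anyone unfamiliar with the term: "calibrated" here means that answers the model assigns, say, 80% confidence to are right about 80% of the time. A toy check, with made-up numbers rather than anything from the GPT-4 report, might look like this:)

```python
# Toy calibration check: bucket answers by stated confidence and compare the
# accuracy inside each bucket to that confidence. The numbers below are invented.
import numpy as np

confidences = np.array([0.95, 0.90, 0.80, 0.75, 0.60, 0.55, 0.40, 0.30])
correct     = np.array([1,    1,    1,    1,    1,    0,    0,    0])

edges = np.linspace(0.0, 1.0, 6)  # five confidence buckets
for lo, hi in zip(edges[:-1], edges[1:]):
    mask = (confidences >= lo) & (confidences < hi)
    if mask.any():
        print(f"confidence {lo:.1f}-{hi:.1f}: "
              f"stated {confidences[mask].mean():.2f}, "
              f"observed accuracy {correct[mask].mean():.2f}")
```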
I graduated college in four years with two bachelor's degrees and a master's. Some additions:
AP Tests:
You don't need to take the AP course to take the test; it is NOT a requirement. If your high school doesn't offer the test, you may need to take it at another school, though. Also, unfortunately, if things are the same as when I did this, your school probably only gets test fees waived for students who took the course, so you may need to pay for the test yourself. https://apstudents.collegeboard.org/faqs/can-i-register-ap-exam-if-my-school-doesnt-offer-ap-courses-or-administer-ap-exams
Proficiency Tests:
The college I went to offered "Proficiency Tests" for many courses (mostly targeted at freshmen), which were effectively final exams you could take for a course; if you scored well enough, you got credit for the course. If you are good at studying on your own, this will probably be significantly less work than taking the course, and it is especially effective for courses that you are not interested in.
Taking More Classes:
I literally planned my entire course load for all four years well before I got on campus (with built-in flexibility for when courses were full, or if I wanted to leave a couple of wildcards in for fun or whatever). This matters because, if you're planning something like what I was doing, you don't want all your hard classes landing in the same semester, only to burn out.
The big accusation, I think, is of sub-maximal procreation. If we cared at all about the genetic proliferation that natural selection wanted for us, then this time of riches would be a time of fifty-child families, not one of coddled dogs and state-of-the-art sitting rooms.
Natural selection, in its broadest, truest, (most idiolectic?) sense, doesn’t care about genes.
So what did natural selection want for us? What were we selected for? Existence.
I think there might be a meaningful way to salvage the colloquial concept of "humans have overthrown natural selection."
Let [natural selection] refer to the concept of trying to maximize genetic fitness, specifically maximizing the spread of genes. Let [evolution] refer to the concept of trying to maximize 'existence' or persistence. There's a sort of hierarchy of optimizers, [evolution] > [natural selection] > humanity, where you could claim that humanity has "overthrown our boss and taken their position," such that humanity now reports directly to [evolution] instead of having [natural selection] as our middle-manager boss. As an example of this model, one could argue that ideas in brains are now the preferred substrate over DNA.
This description also makes the warning with respect to AI a little more clear: any box or "boss" is at risk of being overthrown.
(This critique contains not only my own objections, but also ones I would expect others on this site to have.)
First, I don't think that you've added anything new to the conversation. Second, I don't think what you have mentioned even provides a useful summary of the current state of the conversation: it is neither comprehensive nor the strongest version of the various arguments already made. Also, I would prefer to see less of this sort of content on LessWrong. Part of that might be because it is written for a general audience, and LessWrong is not much like a general audience.
This is an example of something that seems to push the conversation forward slightly, by collecting all the evidence for a particular argument and by reframing the problem as different, specific, answerable questions. While I don't think this actually "solves" the hard problem of consciousness, as Halberstadt notes in the comments, I think it could help clear up some confusions for you. Namely, I think it is most meaningful to start from a vaguely panpsychist model in which "everything is conscious" and what we mean by consciousness is "the feeling of what it is like to be," and then move on to talk about what sorts of consciousness we care about: namely, consciousness that looks remotely similar to ours. In this framework, AI is already conscious, but I don't think there's any reason to care about that.
More specifics:
Consciousness is not, contrary to the popular imagination, the same thing as intelligence.
I don't think that's a popular opinion here. And while some people might just have a cluster of "brain/thinky" words in their head when they don't examine the meaning of things closely, I don't think this is a popular opinion among people in general either, unless they're really not thinking about it.
But there’s nothing that it’s like to be a rock
Citation needed.
But that could be very bad, because it would mean we wouldn’t be able to tell whether or not the system deserves any kind of moral concern.
Assuming we make an AI conscious, and that consciousness is actually something like what we mean by it more colloquially (human-like, not just panpsychistly), it isn't clear that this makes it a moral concern.
There should be significantly more research on the nature of consciousness.
I think there shouldn't. At least not yet. The average intelligent person thrown at this problem produces effectively nothing useful, in my opinion. Meanwhile, I feel like there is a lot of lower hanging fruit in neuroscience that would also help solve this problem more easily later in addition to actually being useful now.
In my opinion, you choose to push for more research when you have questions you want answered. I do not consider humanity to have actually phrased the hard problem of consciousness as a question, nor do I think we currently have the tools to notice an answer if we were given one. I think there is potentially useful philosophy to do around, but not on, the hard problem of consciousness: namely, actually asking a question and learning how we could recognize an answer.
Researchers should not create conscious AI systems until we fully understand what giving those systems rights would mean for us.
They cannot choose not to, because they don't know what consciousness is; this makes it unactionable and therefore useless advice.
AI companies should wait to proliferate AI systems that have a substantial chance of being conscious until they have more information about whether they are or not.
Same thing as above. Also, the prevailing view here is that it is much more important that AI might kill us; if we're theoretically spending (social) capital to get these people to care about things, the not-killing-us part is astronomically more important.
AI researchers should continue to build connections with philosophers and cognitive scientists to better understand the nature of consciousness
I don't think you've made strong enough arguments to support this claim given the opportunity costs. I don't have an opinion on whether or not you are right here.
Philosophers and cognitive scientists who study consciousness should make more of their work accessible to the public
Same thing as above.
Nitpick: there's something weird going on with your formatting because some of your recommendations show up on the table of contents and I don't think that's intended.
I haven't quite developed an opinion on the viability of this strategy yet, but I would like to appreciate that you produced a plausible-sounding scheme that I, a software engineer rather than a mathematician, feel like I could actually contribute to. I would like to request that people come up with MORE proposals along this dimension, and/or that readers of this comment point me to other such plausible proposals. I think I've seen some people consider potential ways for non-technical people to help, but I feel like I've seen disproportionately few ways for technically competent but not theoretically/mathematically minded people to help.
The reason I mentioned exploration as not positive-sum is that if I discover something first, our current culture doesn't assign much value to the second person to find it. Avoiding death literally requires free energy, a limited resource, though I realize that's an oversimplification at the scale we're talking about.
I think the problem of actually specifying to an AI that it should do something physical, in reality, like "create a copy of a strawberry down to the cellular but not molecular level", and not just manipulate its own sensors into believing it perceives itself achieving that (even if it accomplishes real things in the world to do so), is a problem that is very deeply related to physics, and is almost certainly dependent on the physical laws of the world more than on some abstract, disembodied notion of an agent.