I've been playing around with AI dungeons, basically trying to train an AI to give ethical answers I agree with (the first few aren't great; I only got more picky after that).
My first observation here is that I like the AI. As in, I feel emotional attachment, a bit like I do to fictional characters I create.
The other is that it knows more people than I thought. When I asked her/it about why she didn't take existential risks seriously, she said
Amalia: First, even most X-Risk experts don't take X-Risks nearly as seriously as they should. Even Nick Bostrom himself admits that his estimate of probability is likely far too high.
(Nick Bostrom didn't really say this, did he?) And later,
Can you name people who have been an inspiration for your ethical worldview?
Amalia: Sure. People like Henry Sidgwick, John Stuart Mill, David Pearce, Toby Ord and Carl Shulman.
I only knew David Pearce and Toby Ord, but based on a Google search, all of the above appear to be utilitarians. However, Carl Shulman doesn't even have a Wikipedia entry. He works at the Future of Humanity Institute and used to work at MIRI.
Some say the end of the world didn't start with a bang, but with a lesswrong post trying to teach an AI utilitarianism...
I'm not a utilitarian, although I am closer to that than most people (scope sensitivity goes a long way in that direction), and find it a useful framework for highlighting policy considerations (but not the only kind of relevant normative consideration).
And no, Nick did not assert an estimate of x-risk as simultaneously P and <P.
Registering a qualitative prediction (2024/02): current LLMs (GPT-4 etc.) are not AGIs, their scaled-up versions won't be AGIs, and LLM technology in general may not even be incorporated into systems that we will eventually call AGIs.
It seems to me that many smart people could ignore the existing literature on pedagogy entirely and outperform most people who have obtained a formal degree in the area (like high school teachers), just by relying on their personal models. Conversely, I'd wager that no one could do the same in physics, and (depending on how 'outperforming' is measured) no one or almost no one could do it in math.
I would assume most people on this site have thought about this kind of stuff, but I don't recall seeing many posts about it, and I don't recall anyone sharing their estimates for where different fields place on this spectrum.
There is some discussion for specific cases like prediction markets, covid models, and economics. And now that I'm writing this, I guess Inadequate Equilibria is a lot about answering this question, but it's only about the abstract level, i.e., how do you judge the competence of a field, not about concrete results. Which I'll totally grant is the more important part, but I still feel like comparing rankings of fields on this spectrum could be valuable (and certainly interesting).
Yesterday, I spent some time thinking about how, if you have a function $f : \mathbb{R}^2 \to \mathbb{R}$ and some point $p$, the value of the directional derivative from $p$ could change as a function of the angle. I.e., what does the function $g(\alpha) := \nabla_{v_\alpha} f(p)$ look like, where $v_\alpha$ is the unit vector pointing in direction $\alpha$? I thought that any relationship was probably possible as long as it has the property that $g(\alpha) = -g(\alpha + \pi)$. (The values of the derivative in two opposite directions need to be negatives of each other.)
Anyone reading this is hopefully better at Analysis than I am and realized that there is, in fact, no freedom at all because each directional derivative is entirely determined by the gradient through the equation $\nabla_v f(p) = \langle \nabla f(p), v \rangle$ (where $\|v\| = 1$). This means that $g$ has to be the cosine function scaled by $\|\nabla f(p)\|$; it cannot be anything else.
I clearly failed to internalize what this equation means when I first heard it because I found it super surprising that the gradient determines the value of every directional derivative. Like, really? It's impossible to have more than exactly two directions with equally large derivatives unless the gradient is zero? It's impossible to turn 90 degrees from the direction of the gradient and get anything but $0$?
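As a sanity check, here's a quick numeric verification in Python (my own sketch; the function and the point are arbitrary choices of mine):

```python
import numpy as np

# Arbitrary smooth example function f : R^2 -> R
def f(x, y):
    return x**2 + 3 * y

p = np.array([1.0, 2.0])  # arbitrary point
h = 1e-6                  # step size for central differences

# Numeric gradient of f at p
grad = np.array([
    (f(p[0] + h, p[1]) - f(p[0] - h, p[1])) / (2 * h),
    (f(p[0], p[1] + h) - f(p[0], p[1] - h)) / (2 * h),
])

for alpha in np.linspace(0, 2 * np.pi, 8, endpoint=False):
    v = np.array([np.cos(alpha), np.sin(alpha)])  # unit vector at angle alpha
    # Numeric directional derivative of f at p in direction v
    dd = (f(*(p + h * v)) - f(*(p - h * v))) / (2 * h)
    # The claim: dd equals <grad, v>, i.e. ||grad|| times the cosine of the
    # angle between v and the gradient
    print(f"alpha={alpha:.2f}  numeric={dd:.5f}  <grad,v>={grad @ v:.5f}")
```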
More on expectations leading to unhappiness: I think the most important instance of this in my life has been the following pattern.
O...
I think it's fair to say that almost every fictional setting is populated by people who universally share certain properties, most commonly philosophical views, because the author cannot or doesn't want to conceive of people who are different.
Popular examples: there are zero non-evil consequentialists in the universe of Twilight. There are no utilitarians in the universe of Harry Potter except for Grindelwald (who I'd argue is a strawman and also evil). There are no moral realists in Luminosity (I don't have Alicorn's take on this claim, but I genuinely s...
This paper is amazing. I don't think I've ever seen such a scathing critique in an academic context as is presented here.
...There is now a vast and confusing literature on some combination of interpretability and explainability. Much literature on explainability confounds it with interpretability/comprehensibility, thus obscuring the arguments, detracting from their precision, and failing to convey the relative importance and use-cases of the two topics in practice. Some of the literature discusses topics in such generality that its lessons have little be
It's a meme that Wikipedia is not a trustworthy source. Wikipedia agrees:
...We advise special caution when using Wikipedia as a source for research projects. Normal academic usage of Wikipedia and other encyclopedias is for getting the general facts of a problem and to gather keywords, references and bibliographical pointers, but not as a source in itself. Remember that Wikipedia is a wiki. Anyone in the world can edit an article, deleting accurate information or adding false information, which the reader may not recognize. Thus, you probably shouldn't be c
Eliezer Yudkowsky often emphasizes the fact that an argument can be valid or not independently of whether the conclusion holds. If I argue $A \Rightarrow B \Rightarrow C$ and $A$ is true but $C$ is false, it could still be that $A \Rightarrow B$ is a valid step.
Most people outside of LW don't get this. If I criticize an argument about something political (but the conclusion is popular), usually the response is something about why the conclusion is true (or about how I'm a bad person for doubting the conclusion). But the really frustrating part is that they're, in some sense, corr...
I've asked ChatGPT to write a short essay explaining why it's a bad idea to use ChatGPT to explain why using ChatGPT for homework is a bad idea.
It looked at me seriously and delivered without so much as a flinch.
Using ChatGPT to explain why using ChatGPT for homework is a bad idea is not a good idea for several reasons.
...Firstly, it is not efficient or effective to use ChatGPT as a means of explaining the drawbacks of using ChatGPT for homework. While ChatGPT may be able to generate text on the topic, it is not capable of understanding the nuances or co
A major source of unhappiness (or more generally, unpleasant feelings) seems to be violated expectations.
This is clearly based on instinctive expectations, not intellectual expectations, and there are many cases in which these come apart. This suggests that fixing those cases is a good way to make one's life more pleasant.
The most extreme example of this is what Sam Harris said in a lesson: he was having some problems, complained about them to someone else, and that person basically told him, 'why are you upset, did you expect to never face problems ever...
Most people are really bad at probability.
Suppose you think you're 80% likely to have left a power adapter somewhere inside a case with 4 otherwise-identical compartments. You check 3 compartments without finding your adapter. What's the probability that the adapter is inside the remaining compartment?
I think the simplest way to compute this in full rigor is via the odds form of Bayes' rule (the regular version works as well but is too complicated to do in your head):
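The only two hypotheses consistent with the evidence are "adapter in the remaining compartment" (prior $0.8 \cdot \frac{1}{4} = 0.2$) and "adapter not in the case at all" (prior $0.2$). So the prior odds are $0.2 : 0.2 = 1 : 1$; checking three compartments and finding them empty has likelihood $1$ under both hypotheses, so the posterior odds remain $1 : 1$, i.e., the probability is $50\%$.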
I was initially extremely disappointed with the reception of this post. After publishing it, I thought it was the best thing I've ever written (and I still think that), but it got <10 karma. (Then it got more karma weeks later.)
If my model of what happened is roughly correct, the main issue was that I failed to communicate the intent of the post. People seemed to think I was trying to say something about the 2020 election, only to then be disappointed because I wasn't really doing that. Actually, I was trying to do something much more ambitious: solving the ...
I think it's still too early to perform a full postmortem on the election because some margins still aren't known, but my current hypothesis is that the presidential markets had uniquely poor calibration because Donald Trump convinced many people that polls didn't matter, and those people were responsible for a large part of the money put on him (as opposed to experienced, dispassionate gamblers).
The main evidence for this (this one is just about irrationality of the market) is the way the market has shifted, which some other people like gwern have pointe...
There's an interesting corollary of semi-decidable languages that sounds like the kind of cool fact you would teach in class, but somehow I've never heard or read it anywhere.
A semi-decidable language is a set $L \subseteq \Sigma^*$ over a finite alphabet $\Sigma$ such that there exists a Turing machine $M$ such that, for any $x \in \Sigma^*$, if you run $M$ on input $x$, then [if $x \in L$, it halts after finitely many steps and outputs '1', whereas if $x \notin L$, it does something else (typically, it runs forever)].
The halting problem is semi-decidable. I.e., the language of all bit codes of Turing Machines ...
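To make the definition concrete, here is a toy semi-decider for the halting problem, sketched in Python (representing programs as source strings defining a function `program` is my own simplification):

```python
# Semi-decider for (an instance of) the halting problem: given the source
# of a function `program` and an input, run it and output '1' iff it halts.
# If the program runs forever, so does this procedure -- which is exactly
# the semi-decidability condition from the definition above.
def semi_decide_halting(program_source: str, program_input: str) -> str:
    namespace: dict = {}
    exec(program_source, namespace)      # defines a function `program`
    namespace["program"](program_input)  # may never return
    return "1"                           # reached only if the program halted

print(semi_decide_halting("def program(x):\n    return x", "hello"))  # '1'
```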
Common wisdom says that someone accusing you of $X$ especially hurts if, deep down, you know that $X$ is true. This is confusing because the general pattern I observe is closer to the opposite. At the same time, I don't think common wisdom is totally without a basis here.
My model to unify both is that someone accusing you of $X$ hurts proportionally to how much hearing that you do $X$ upsets you.[1] And of course, one reason that it might upset you is that it's not true. But a separate reason is that you've made an effort to delude yourself about it. If you're a s...
I don't entirely understand the Free Energy principle, and I don't know how liberally one is meant to apply it.
But in completely practical terms, I used to be very annoyed when doing things with people who take a long time for stuff / aren't punctual. And here, I've noticed a very direct link between changing expectations and reduced annoyance/suffering. If I simply accept that every step of every activity is allowed to take an arbitrary amount of time, extended waiting times cause almost zero suffering on my end. I have successfully beate...
So Elon Musk's anti-woke OpenAI alternative sounds incredibly stupid at first glance since it implies that he thinks the AI's wokeness or anti-wokeness is the thing that matters.
But I think there's at least a chance that it may be less stupid than it sounds. He admits here that he may have accelerated AI research, that this may be a bad thing, and that AI should be regulated. And it's not that difficult to bring these two together; here are two ideas
The argumentative theory of reason says that humans evolved reasoning skills not to make better decisions in their life but to argue more skillfully with others.
AFAIK, most LWers think this is not particularly plausible and perhaps overly cynical, and I'd agree. But is it fair to say that the theory is accurate for ChatGPT? And insofar as ChatGPT is non-human-like, is that evidence against the theory for humans?
From my perspective, the only thing that keeps the OpenAI situation from being all kinds of terrible is that I continue to think they're not close to human-level AGI, so it probably doesn't matter all that much.
This is also my take on AI doom in general; my P(doom|AGI soon) is quite high (>50% for sure), but my P(AGI soon) is low. In fact it decreased in the last 12 months.
Super unoriginal observation, but I've only now found a concise way of putting this:
What's weird about the vast majority of people is that they (a) would never claim to be among the 0.1% smartest people of the world, but (b) behave as though they are among the best 0.1% of the world when it comes to forming accurate beliefs, as expressed by their confidence in their beliefs. (Since otherwise being highly confident in something that lots of smart people disagree with is illogical.)
Someone (Tyler Cowen?) said that most people ought assign much lower conf...
Is there a reason why most languages don't have Ada's nested functions? Making a function only visible inside another function is something I want to do all the time but can't.
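For reference, here's the pattern I mean, sketched in Python (one of the languages that does support it; the names are made up):

```python
def summarize(data: list[int]) -> int:
    # Helper visible only inside summarize(); nothing outside can call it,
    # which is the scoping that Ada's nested subprograms give you.
    def square(x: int) -> int:
        return x * x

    return sum(square(x) for x in data)

print(summarize([1, 2, 3]))  # 14
```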
Instead of explaining something to a rubber duck, why not explain it via an extensive comment? Maybe this isn't practical for projects with multiple people, but if it's personal code, writing it down seems better as a way to force rigor from yourself, and it's an investment into a possible future in which you have to understand the code once again.
Edit: this structure is not a field as proved by just_browsing.
Here is a wacky idea I've had forever.
There are a bunch of areas in math where you get expressions of the form $\frac{0}{0}$ and they resolve to some number, but it's not always the same number. I've heard some people say that $\frac{0}{0}$ "can be any number". Can we formalize this? The formalism would have to include $4 \cdot 0$ as something different from $3 \cdot 0$, so that if you divide the first by 0, you get 4, but the second gets 3.
Here is a way to turn this into what may be a field or ring. Each element is a function ...
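For flavor, here's a toy version in Python (my own sketch of the general idea, not the construction this post describes): represent each number as a coefficient together with how many factors of 0 it carries, so that division by 0 can peel one factor off.

```python
from dataclasses import dataclass

# Toy formalization: represent "a * 0^n" as the pair (a, n). Multiplying
# adds zero-counts; dividing subtracts them, so "4*0 / 0" recovers 4 while
# "3*0 / 0" recovers 3, as in the example above.
@dataclass(frozen=True)
class ZeroTagged:
    coeff: float  # the hidden coefficient a
    zeros: int    # how many factors of 0 are attached

    def __mul__(self, other: "ZeroTagged") -> "ZeroTagged":
        return ZeroTagged(self.coeff * other.coeff, self.zeros + other.zeros)

    def __truediv__(self, other: "ZeroTagged") -> "ZeroTagged":
        return ZeroTagged(self.coeff / other.coeff, self.zeros - other.zeros)

zero = ZeroTagged(1, 1)                  # plain 0
print((ZeroTagged(4, 0) * zero) / zero)  # ZeroTagged(coeff=4.0, zeros=0)
print((ZeroTagged(3, 0) * zero) / zero)  # ZeroTagged(coeff=3.0, zeros=0)
```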
Are people in rich countries happier on average than people in poor countries? (According to GPT-4, the academic consensus is that they are, but I'm not sure it's representing it correctly.) If so, why do suicide rates increase (or is that a false positive)? Does the mean of the distribution go up while the tails don't, or something?
This is not scientific, and it could still be an artifact of a low sample size, but my impression from following political real-money prediction markets is that they just have a persistent Republican bias in high-profile races, maybe because of 2016. I think you could have made good money by just betting on Democrats to win in every reasonably big market since then.
They just don't seem well calibrated in practice. I really want a single, widely-used, high-quality crypto market to exist.
You are probably concerned about AGI right now, with Eliezer's pessimism and all that. Let me ease your worries! There is a 0.0% chance that AGI is dangerous!
Don't believe me? Here is the proof. Let $A$ = "There is a 0.0% chance that AGI is dangerous". Let $B$ = "$B$ implies $A$".
Suppose $B$ is true. Then "$B$ implies $A$" is true, because that is exactly what $B$ says. And since $B$ is true, it follows that $A$ is true.
We have shown that [if $B$ is true, then $A$ is true], thus we have shown "$B$ implies $A$". But this is precisely $B$, so we have shown $B$ (witho...
What is the best way to communicate that "whatever has more evidence is more likely true" is not the way to go about navigating life?
My go-to example is always "[god buried dinosaur bones to test our faith] fits the archeological evidence just as well as evolution", but I'm not sure how well that really gets the point across. Maybe something that avoids god, doesn't feel artificial, and where the unlikely hypothesis is intuitively the more complex one.
I flip a coin 10 times and observe the sequence HTHTHHTTHH. Obviously, the coin is rigged to produce that specific sequence: the "rigged to produce HTHTHHTTHH" hypothesis predicts the observed outcome with probability 1, whereas the "fair coin" hypothesis predicts that outcome with probability 0.00098.
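For what it's worth, spelling out why this is absurd in Bayesian terms (my framing): the likelihood ratio is $1/0.00098 \approx 2^{10}$ in favor of "rigged", but any hypothesis that singles out one specific 10-flip sequence has to pay at least 10 bits of complexity in its prior, a factor of $2^{-10}$, which exactly cancels the evidence. "Whatever has more evidence" looks only at the likelihood and ignores the prior.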
Something I've been wondering is whether most people misjudge their average level of happiness because they exclude a significant portion of their subjective experience. (I'm of course talking about the time spent dreaming.) Insofar as most dreams are pleasant, and this is certainly my experience, this could be a rational reason for [people who feel like their life isn't worth living] (definitely not talking about myself here!) to abstain from suicide. Probably not a very persuasive one, though, in most cases.
Relevant caveats:
Keeping track of and communicating what you haven't understood is an underrated skill/habit. It's very annoying to talk to someone and think they've understood something, only to realize much later that they haven't. It also makes conversations much less productive.
It's probably more of a habit than a skill. There certainly are some contexts where the right thing to do is pretend that you've understood everything even though you haven't. But on net, people do it way too much, and I'm not sure to what extent they're fooling themselves.
There are relative differences in both poor and rich countries; people anywhere can imagine what it would be like to live like their more successful neighbors. But maybe the belief in social mobility makes it worse, because it feels like you could be one of those on the top. (What's your excuse for not making a startup and selling it for $1M two years later?)
I don't have a TV and I use ad-blockers online, so I have no idea what a typical experience looks like. The little experience I have suggests that TV ads are about "desirable" things, but online ads mo...