Yeah, I guess I should be clear that I generally like the idea of building virtuous AI, and maybe it somehow solves some of the problems we have with other designs. The trick is building something that actually implements whatever we think it means to be virtuous. That means getting precise enough about what virtue means that we can be sure we don't simply collapse back into the default thing all negative feedback systems do: optimize for their targets as hard as they can (with "can" doing a lot of work here!).
No matter what the goal, power seeking is of general utility. Even if an AI is optimizing for virtue instead of some other goal, more power would, in general, give it more ability to behave virtuously. Even if the virtue is something like "be an equal partner with other beings", an AI could ensure equality by gaining lots of power and enforcing equality on everyone.
A counterpoint: when I skip showers, my cat appears strongly in favor of the smell of my armpits, occasionally going so far as to burrow into my shirt sleeves and bite my armpit hair (which, to both my and my cat's distress, is extremely ticklish). Since studies suggest that cats have a much more sensitive olfactory sense than humans (see https://www.mdpi.com/2076-2615/14/24/3590), it stands to reason that their judgement regarding whether a smell is good or bad should hold more weight than our own. And while my own cat's preference for me smelling...
Yes, though note that this is still concerning.
Normally the way this works in a startup is that spend exceeding revenue should be in service of bootstrapping the company. That means that money is usually spent in a few ways:
OpenAI's spend is concerning for the same reason that, say, Uber's and Netflix's have/had concerning spend: they have to actually win their market to have a chance of reaping rewards, and if they don't, they'll simply be forced to raise prices and cut quality/R&D.
I think there's something to be said for a straightforward counterargument here: conceptual rounding off performs the valuable service of lumping together ideas that are not actually that different and only seem importantly different to their authors/supporters.
An idea should either be defined precisely enough that it's clear why it can't be rounded off (once the precise definition is known), or it's a vague idea. If vague, it either needs to become more precise to avoid being rounded, or it is inherently vague, in which case rounding can't do much harm because it was never clear where its boundaries were in concept space.
So... there surely are things like (overlapping, likely non-exhaustive):
My understanding is that that cloud method wastes perfume because less of it actually gets on you. Better to spray directly on the skin, usually either on the chest or on the back of the neck. If you want some on your hands, rub the back of your hands on the sprayed area immediately after spraying.
3 sprays with the cloud method is probably fine.
So while your point is mostly true, I want to highlight there are some situations where simply asking people to respect your food norms is a problem, and they mostly arise in a specific sort of culture that is especially communal with regard to food and sees you as part of the ingroup.
For example, it's a traditional upper-class Anglo norm that it's rude to put your hosts out by asking them to make you something special to accommodate your diet. You're expected to get along and eat what everyone else eats. You will be accommodated if you ask, but you will a...
It is indeed rude to ask your hosts to make you something special to accommodate your diet. That's why I don't do it. This is part of how I try to not be a problem for other people. If I'm not expecting vegetarian options, I just eat in advance and then nibble on the bread or something. I did this around Anglos even back when I ate a normal diet, because Anglos often serve so little food.
My East Asian family doesn't see it as an affront (though I can't speak for everyone—especially not anyone under the age of 18). To the contrary, it's a source of common g...
Okay, you just doubled down, so clearly this discussion isn't going anywhere. It's also off topic anyway since it's not directly addressing the claims of this post.
Please kindly refrain from continuing this line of discussion on this post. I'd welcome additional comments if you wanted to address the claims of the post directly, though.
Your claims overgeneralize, which makes them false. To make sure I'm not just biased because I am religious myself, here's Claude's take on your comment when I asked it to fact-check it.
Prompt: can you fact check this comment that was posted to an online forum (not mine but i suspect it might not be correct but don't trust myself not to be biased so looking for your take) [your comment copied in]
Response:
...I'll analyze this forum comment by examining its claims about religion, compassion, and philosophical history.
The comment contains several sweeping generaliz
Ranked in order of how interesting they were to me when I got interested in them, which means in approximately chronological order because the more ideas I knew the less surprising new ideas were (since they were in part predicted by earlier ideas that had been very interesting).
While history suggests we should be skeptical, current AI models produce real results of economic value, not just interesting demos. This suggests that we should be willing to take more seriously the possibility that they will produce TAI, since they are more clearly on that path and already having significant transformative effects on the world.
I don't think a philosophy of mind is necessary for this, no, although I can see why it might seem like it is if you've already assumed that philosophy is necessary to understand the world.
It's enough to just be able to model other minds in the world to know how to show them compassion, and even without modeling, compassion can be enacted, even if it's not known to be compassionate behavior. This modeling need not rise to the level of philosophy to get the job done.
I'm a SWE, use AI every day to do my job, and I think the idea that AI is the cause of reduced engineer hiring is basically false.
There is probably some marginal effect, but I instead think what we're seeing today is because:
For example, I suspect philosophical intelligence was a major driver behind Eliezer's success (and not just for his writing about philosophy). Conversely, I think many people with crazy high IQ who don't have super impressive life achievements (or only achieve great things in their specific domain, which may not be all that useful for humanity) probably don't have super high philosophical intelligence.
Rather than "philosophical intelligence" I might call this "ability to actually win", which is something like being able to keep your thoughts in contact wit...
AI will not kill everyone without sequential reasoning.
This statement might be literally true, but only because of a loophole like "AI needs humans to help it kill everyone". We're probably not far away from, or may already have, the ability to create novel biological weapons, like engineered viruses, that could kill all humans before a response could be mustered. Yes, humans have to ask the LLM to help create the thing, and then humans have to actually do the lab work and deployment, but from an outside view (which is especially important from a policy perspective), this looks a lot like "AI could kill everyone without sequential reasoning".
I basically don't care about philosophy of mind anymore, mostly because I don't care about philosophy anymore.
Philosophy, as a project, is usually about two things. One, figure out metaphysics. Two, figure out a correct ontology for reality.
Both of these are flawed projects. Metaphysics is that which we can't know from experience, so it's all speculative and also unnecessary, because we can model the world adequately without supposing to know how it works beyond our ability to observe it. Fake metaphysics is helpful contingently because it lets you have fa...
This is gonna sound mean, but the quality of EA-oriented online spaces has really gone downhill in the last 5 years. I barely even noticed Kat Woods' behavior, because she is just one more in a sea of high volume, low quality content being posted in EA spaces.
That's why I've mostly given up on EA sites and events, other than attending EA Global (can't break my streak), and just hang out here on Less Wrong, where the vibes are still good and the quality bar is higher.
A couple notes:
No, I've only tried it with Claude so far. I did think about trying other models to see how it compares, but I think Claude gave me enough info that trying to do this in chat is unlikely to be useful. I got enough info to feel like, in theory, teaching LLMs to meditate is not exactly a useful thing to do, but if it is then it needs to happen as part of training.
Also, more generally, no prediction market price lets you immediately conclude what the probability of any outcome is, because for most markets we have only subjective probability (maybe this is always true, but I'm trying to ignore things like fair coin flips that have agreed-upon "objective" probabilities). There is no fact of the matter about the real probability of something happening, only subjective probability based on the available information.
Instead a prediction market is simply, in the ideal case, the market clearing price at whic...
Honestly, this fits my intuition. If I think of all the rationalists I know, they feel like they are on average near 120 IQ, with what feels like a roughly normal distribution around it, though in reality it's probably not quite normal, with a longer upper tail than lower tail, i.e. fewer 90s than 150s, etc. Claims that the average is much higher than 120 feel off to me, relative to folks I know and have interacted with in the community (insert joke about how I have "dumb" friends, maybe).
Mine:
The world is perfect, meaning it is exactly as it is and always was going to be. However, the world as we know it is an illusion in that it only exists in our minds. We only know our experience, and all (metaphysical) claims to know reality, no matter how useful and predictive they are, are contingent and not fundamental. But we get confused about this because those beliefs are really useful and really predictive, and we separate ourselves from reality by first thinking the world is real, and then thinking our beliefs are about the world rather than o...
I can't help but wonder if part of the answer is that they seem dangerous and people are selecting out of producing them.
Like, I'm not an expert, but creating AI agents seems extremely fun and appealing, and I'm intentionally not working on them because it seems safer not to build them. (Whether you think my contributions to trying to build them would matter or not is another question.)
Most arguments I see in favor of AGI ignore economic constraints. I strongly suspect that we can't actually afford to create AGI yet; world GDP isn't high enough. They seem to be focused on inside-view arguments for why method X will make it happen, which sure, maybe, but even if we achieve AGI, if we aren't rich enough to run it or use it for anything it hardly matters.
So the question in my mind is, if you think AGI is soon, how are we getting the level of economic growth needed in the next 2-5 years to afford to use AGI at all before AGI is created?
I generally avoid alts for myself, and one of the benefits I see is that I feel the weight of what I'm about to post.
Maybe I would sometimes write funnier, snarkier things on Twitter that would get more likes, but because my name is attached I'm forced to reconsider. Is this actually mean? Do I really believe this? Does this joke go too far?
Strange to say perhaps, but I think not having alts makes me a better person, in the sense of being better at being the type of person I want to be, because I can't hide behind anonymity.
Thanks for writing this up. This is something I think a lot of people are struggling with, and will continue to struggle with as AI advances.
I do have worries about AI, mostly that it will be unaligned with human interests and we'll build systems that squash us like bugs because they don't care if we live or die. But I have no worries about AI taking away our purpose.
The desire to feel like one has a purpose is a very human characteristic. I'm not sure that any other animals share our motivation to have a motivation. In fact, past humans seemed to have le...
I want to push back a little in that I was fully vegan for a few years with no negative side effects, other than sometimes being hungry because there was nothing I would eat and annoying my friends with requests to accommodate my dietary preferences. I even put on muscle and cut a lot of fat from my body!
I strongly suspect, based on experience with lots of other vegans, that vegans who struggle with nutritional deficiencies are bad at making good choices about macro nutrients.
Broadly speaking, the challenge in a vegan diet is getting enough lysine. Most ev...
What would it mean for this advice to not generalize? Like what cases are you thinking of where what someone needs to do to be more present isn't some version of resolving automatic predictions of bad outcomes?
I ask because this feels like a place where disagreeing with the broad form of the claim suggests you disagree with the model of what it means to be present rather than that you disagree with the operationalization of the theory, which is something that might not generalize.
I think you still have it wrong, because being present isn't a skill. It's more like an anti-skill: you have to stop doing all the stuff you're doing that keeps you from just being.
There is, instead, a different skill that's needed to make progress towards being present. It's a compound skill: noticing what you do out of habit rather than in response to present conditions, figuring out why you have those habits, practicing not engaging in those habits when you otherwise would, and thereby developing trust that you can safely drop those habits, thus retraining yourself to do less out of habit and be closer to just being and responding.
I can't think of a time where such false negatives were a real problem. False positives, in this case, are much more costly, even if the only cost is reputation.
If you never promise anything, that could be a problem. Same if you make promises but no one believes them. Being able to make commitments is sometimes really useful, so you need to at least keep live the ability to make and hit commitments so you can use them when needed.
As AI continues to accelerate, the central advice presented in this post, to be at peace with doom, will become increasingly important to help people stay sane in a world where it may seem like there is no hope. But really there is hope so long as we keep working to avert doom, even if it's not clear how we do that, because we've only truly lost when we stop fighting.
I'd really like to see more follow-up on the ideas in this post. Our drive to care is arguably why we're willing to cooperate, and making AI that cares the same way we do is a potentially viable path to AI aligned with human values, but I've not seen anyone take it up. Regardless, I think this is an important idea and think folks should look at it more closely.
This post makes an easy-to-digest and compelling case for getting serious about giving up flaws. Many people build their identity around various flaws, and having a post that crisply makes the case that doing so is net bad is helpful to be able to point people at when you see them suffering in this way.
I think this post is important because it brings old insights from cybernetics into a modern frame that relates to how folks are thinking about AI safety today. I strongly suspect that the big idea in this post, that ontology is shaped by usefulness, matters greatly to addressing fundamental problems in AI alignment.
I'm less confident than you are about your opening claim, but I do think it's quite likely that we can figure out how to communicate with orcas. Kudos for just doing things.
I'm not sure how it would fit with their mission, but maybe there's a way you could get funding from EA Funds. It doesn't sound like you need a lot of money.
Back when I tried playing some calibration games, I found I was not able to get successfully calibrated above 95%. At that point I start making errors from things like "misinterpreting the question" or "randomly hit the wrong button" and things like that.
The math is not quite right on this, but from this I've adopted a personal 5% error margin policy; in practice that seems to be about the limit of my ability to make accurate predictions, and it's served me well.
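A minimal sketch of what such a policy might look like if written down (the function name and the exact clamping rule are my own illustration, not something from the calibration games themselves):

```python
def clamp_confidence(p: float, margin: float = 0.05) -> float:
    """Cap a stated probability to [margin, 1 - margin].

    Reflects the idea that personal calibration breaks down past ~95%,
    so never report more extreme confidence than that.
    """
    return max(margin, min(1.0 - margin, p))

# A raw 99% prediction gets reported as 95%; moderate predictions pass through.
print(clamp_confidence(0.99))  # 0.95
print(clamp_confidence(0.50))  # 0.5
```

The same rule applies symmetrically at the low end: a 1% prediction would be reported as 5%.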
Depends on whose sense of smell you're optimizing for.
My cats like to sniff each other's butts.
Many dogs love smelling stinky garbage.
I'm not sure I would trust my cats' senses of smell to tell me if I would smell good to other humans.