Power allows people to benefit from immoral acts without having to take responsibility or even be aware of them. The most powerful person in a situation may not be the most morally culpable, as they can remain distant from the actual "crime". If you're not actively looking into how your wants are being met, you may be unknowingly benefiting from something unethical.
In our jobs as AI safety researchers, we think a lot about what it means to have reasonable beliefs and to make good decisions. This matters because we want to understand how powerful AI systems might behave. It also matters because we ourselves need to know how to make good decisions in light of tremendous uncertainty about how to shape the long-term future.
It seems to us that there is a pervasive feeling in this community that the way to decide which norms of rationality to follow is to pick the ones that win. When it comes to the choice between CDT vs. EDT vs. LDT…, we hear we can simply choose the one that gets the most utility. When we say that perhaps we ought to be...
Without a clear definition of "winning,"
This is part of the problem we're pointing out in the post. We've encountered claims of this "winning" flavor that haven't been made precise, so we survey different things "winning" could mean more precisely, and argue that they're inadequate for figuring out which norms of rationality to adopt.
Epistemic Status: I believe I am well-versed in this subject. I erred on the side of making claims that were too strong and allowing readers to disagree and start a discussion about precise points rather than trying to edge-case every statement. I also think that using memes is important because safety ideas are boring and anti-memetic. So let’s go!
Many thanks to @scasper, @Sid Black, @Neel Nanda, @Fabien Roger, @Bogdan Ionut Cirstea, @WCargo, @Alexandre Variengien, @Jonathan Claybrough, @Edoardo Pona, @Andrea_Miotti, Diego Dorn, Angélina Gentaz, Clement Dumas, and Enzo Marsot for useful feedback and discussions.
When I started this post, I began by critiquing the article A Long List of Theories of Impact for Interpretability, by Neel Nanda, but I later expanded the scope of my critique. Some ideas...
The former can be sufficient—e.g. there are good theoretical researchers who have never done empirical work themselves.
In hindsight I think "close conjunction" was too strong—it's more about picking up the ontologies and key insights from empirical work, which can be possible without following it very closely.
In the USA, the president isn't determined by a straight popular vote. Instead, each state gets a certain number of Electoral College (EC) votes, and a candidate needs at least 270 of the 538 EC votes to win.
It's up to each state to decide how to allocate its EC votes. Most do “winner-takes-all,” but some, e.g., Maine and Nebraska, split them up.
California and Texas have the most EC votes of any state, with 54 and 40 votes respectively, so you would think they would get a lot of love from presidential candidates. Instead, they're mostly ignored—California will always be Blue, and Texas Red, so what's the point of pandering to them? This is clearly bad for Californians and Texans as their interests aren't listened to.
So why doesn't California switch to a proportional EC vote...
I think it would be better to form a big winner-take-all bloc. With proportional voting, the number of electoral votes at stake will be only a small fraction of the total, so the per-voter influence of CA and TX would probably remain below the national average.
In this post, we’re going to use the diagrammatic notation of Bayes nets. However, we use the diagrams a little bit differently than is typical. In practice, such diagrams are usually used to define a distribution - e.g. the stock example diagram
... in combination with the five distributions , , , , , defines a joint distribution
In this post, we instead take the joint distribution as given, and use the diagrams to concisely state properties of the distribution. For instance, we say that a distribution “satisfies” the diagram
if-and-only-if . (And once we get to approximation, we’ll say that approximately satisfies the diagram, to within , if-and-only-if .)
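For readers who want the condition spelled out: the formulas in this excerpt did not survive, so what follows is only the textbook factorization condition for a generic directed acyclic graph, not necessarily the post's exact statement. The notation is an assumption: $X_1, \dots, X_n$ are the variables and $\mathrm{pa}(i)$ denotes the parents of node $i$ in the diagram.

```latex
% Textbook condition: a joint distribution P satisfies a Bayes net diagram
% iff it factors into one conditional per node, given that node's parents.
P[X_1, \dots, X_n] \;=\; \prod_{i=1}^{n} P\big[X_i \,\big|\, X_{\mathrm{pa}(i)}\big]

% Example (assumed, for illustration): the chain X_1 -> X_2 -> X_3
P[X_1, X_2, X_3] \;=\; P[X_1]\, P[X_2 \mid X_1]\, P[X_3 \mid X_2]
```

Read this way, a diagram is a claim about a given distribution rather than a recipe for building one, which matches the usage described above.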
The usage we’re interested in looks like:
In other words, we want to write proofs diagrammatically - i.e....
Proof that the quoted bookkeeping rule works, for the exact case:
The approximate case then follows by the new-and-improved Bookkeeping Theorem.
Not sure where the disconnect/con...
As Americans know, the electoral college gives disproportionate influence to swing states, which means a vote in the extremely blue state of California was basically wasted in the 2024 election, as are votes in extremely red states like Texas, Oklahoma, and Louisiana. State legislatures have the Constitutional power to assign their state's electoral votes. So why don't the four states sign a compact to assign all their electoral votes in 2028 and future presidential elections to the winner of the aggregate popular vote in those four states? Would this even be legal?
The population of CA is 39.0M (54 electoral votes), and the population of the three red states is 38.6M (55 electoral votes). The combined bloc would control a massive 109 electoral votes, and would have gone...
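The electoral-vote arithmetic above can be checked with a short sketch. The California and Texas counts and the 55-vote red-state total come from the posts; the split between Oklahoma (7) and Louisiana (8) is filled in here from the 2020 census apportionment, and the helper name `bloc_votes` is hypothetical.

```python
# Electoral votes per state under the 2020 census apportionment.
# CA and TX are quoted in the text; OK and LA are filled in as an
# assumption consistent with the 55-vote total given for the red states.
ELECTORAL_VOTES = {"CA": 54, "TX": 40, "OK": 7, "LA": 8}
VOTES_TO_WIN = 270  # majority of the 538 total electoral votes


def bloc_votes(states):
    """Sum the electoral votes of the given states (hypothetical helper)."""
    return sum(ELECTORAL_VOTES[state] for state in states)


red_states = bloc_votes(["TX", "OK", "LA"])
full_bloc = bloc_votes(ELECTORAL_VOTES)  # iterating a dict yields its keys
share_of_majority = full_bloc / VOTES_TO_WIN

print(f"Red-state votes: {red_states}")
print(f"Four-state bloc: {full_bloc}")
print(f"Share of the 270 needed to win: {share_of_majority:.1%}")
```

The sketch only confirms the totals; it says nothing about whether such a compact would be legal or how candidates would respond to it.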
When I was growing up most families in our neighborhood had the daily paper on their lawn. As getting your news on the internet became a better option for most folks, though, delivery became less efficient: fewer houses on the same route. Prices went up, more people cancelled, and decline continued. I wonder if we might see something similar with the power grid?
Solar panels keep getting cheaper. When we installed panels in 2019 they only made sense because of the combination of net metering and the relatively generous SREC II incentives. By the time we installed our second set of panels a few months ago net metering alone was enough to make it worth it.
Now, as I've said a few times, net metering is kind of absurd. The way it works here is that...
You say solar is getting cheaper, but it is only the panels that are getting cheaper. They will continue to get even cheaper, but this is not relevant to retrofitting individual houses, where the cost is already dominated by labor. As the cost of labor dominates, economies of scale in labor will be more relevant.
Foresight Institute's AI Safety Grants Program added a new focus area in response to the continually evolving field. Moving forward, our funding ($1.5M-$2M annually) will be allocated across the following four focus areas:
2. Neurotech to integrate with or compete against AGI
3. Security technologies for securing AI systems
Hi there.
Quick question. I am using a few articles from LessWrong for a dissertation. Are there any mainstream articles or sources that reference LessWrong as a catalyst or partial source for AI alignment research, its researchers, and related academic literature? I think it's snobbish, or even discriminatory, to regard LessWrong as merely another website. I was hoping to get some advice on how to formulate a paragraph justifying the citation of LessWrong.
Thanks.
Depending on the posts, I think you could argue they're comparable to one of those other source types I listed.
I asked for further details on the 10th point and Claude listed a bunch of stuff I’ve absolutely never heard of. I’d say it’s probably related to meditation if I had to guess. Here’s that.
—
Claude: Let me break down Time Perception Management into its deeper components, as this is one of the most subtle yet powerful micro-skills...