All of momom2's Comments + Replies

momom230

Contra 2:
ASI might provide a strategic advantage of a kind which doesn't negatively impact the losers of the race, e.g. it increases GDP by x10 and locks competitors out of having an ASI.
Then, losing control of the ASI would not be capable of posing an existential risk to the US.
I think it's quite likely this is what some policymakers have in mind: some sort of innovation which will make everything better for the country by providing a lot of cheap labor and generally improving productivity, the way we see AI applications do right now but on a bigger scale.... (read more)

momom210

From the disagreement between the two of you, I infer there is yet debate as to what environmentalism means. The only way to be a true environmentalist then is to make things as reversible as possible until such time as an ASI can explain what the environmentalist course of action regarding the Sun should be.

momom210

The paradox arises because the action-optimal formula mixes world states and belief states. 
The [action-planning] formula essentially starts by summing up the contributions of the individual nodes as if you were an "outside" observer that knows where you are, but then calculates the probabilities at the nodes as if you were an absent-minded "inside" observer that merely believes to be there (to a degree). 

So the probabilities you're summing up are apples and oranges, so no wonder the result doesn't make any sense. As stated, the formula for actio... (read more)

momom221

Having read Planecrash, I do not think there is anything in this review that I would not have wanted to know before reading the work (which is the important part of what people consider "spoilers" for me).

momom210

Top of the head like when I'm trying to frown too hard

momom210

distraction had no effect on identifying true propositions (55% success for uninterrupted presentations, vs. 58% when interrupted); but did affect identifying false propositions (55% success when uninterrupted, vs. 35% when interrupted)

If you are confused by these numbers (why so close to 50%? Why below 50%?), it's because participants could pick among four options (corresponding to true, false, don't know and never seen).
You can read the study, search for keyword "The Identification Test".

momom210
  1. I don't see what you mean by the grandfather problem.
    1. I don't care about the specifics of who spawns the far future generation; whether it's Alice or Bob I am only considering numbers here.
    2. Saving lives now has consequences for the far future insofar as current people are irreplaceable: if they die, no one will make more children to compensate, resulting in a lower total far future population. Some deaths are less impactful than others for the far future.
  2. That's an interesting way to think about it, but I'm not convinced; killing half the population does not
... (read more)
3AnthonyC
I think the grandfather idea is that if you kill 100 people now, and the average person who dies would have had 1 descendant, and the large loss would happen in 100 years (~4 more generations), then the difference in total lives lived between the two scenarios is ~500, not 900. If the number of descendants per person is above ~1.2, then burying the waste means population after the larger loss in 100 years is actually higher than if you processed it now. Obviously I'm also ignoring a whole lot of things here that I do think matter, as well. And of course, as you pointed out in your reply to my comment above, it's probably better to ignore the scenario description and just look at it as a pure choice along the lines of something like "Is it better to reduce total population by 900 if the deaths happen in 100 years instead of now?"
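The arithmetic in the comment above can be sketched in a few lines. This is only an illustration under assumptions that are not spelled out in the thread: the later loss is taken to be 1000 deaths (so the naive gap is 900), descendants are counted per generation over 4 generations, and the function name `lives_lost_now` is hypothetical.

```python
def lives_lost_now(deaths_now: int, descendants_per_person: float, generations: int) -> float:
    """Total lives removed from the future if `deaths_now` people die today.

    Counts the people themselves (generation 0) plus the descendants they
    would have had in each of the following `generations` generations.
    """
    return deaths_now * sum(
        descendants_per_person ** k for k in range(generations + 1)
    )


# Figures from the comment: 100 deaths now, 1 descendant per person,
# ~4 generations in 100 years.
deaths_now = 100
generations = 4

# Assumption (not stated in the comment): the later loss is 1000 deaths,
# which gives the naive difference of 1000 - 100 = 900.
deaths_later = 1000

lost_now = lives_lost_now(deaths_now, 1.0, generations)
difference = deaths_later - lost_now

print(f"Lives lost if the deaths happen now: {lost_now:.0f}")
print(f"Difference between the two scenarios: {difference:.0f}")
```

With one descendant per person, the 100 early deaths remove about 500 lives in total, so the gap between the scenarios shrinks from 900 to about 500. Changing `descendants_per_person` shows how sensitive the comparison is to that single parameter.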
momom230

Yes, that's the first thing that was talked about in my group's discussion on longtermism. For the sake of the argument, we were asked to assume that the waste processing/burial choice amounted to a trade in lives all things considered... but the fact that any realistic scenario resembling this thought experiment would not be framed like that is the central part of my first counterargument.

momom230

I enjoy reading any kind of cogent fiction on LW, but this one is a bit too undeveloped for my tastes. Perhaps be more explicit about what Myrkina sees in the discussion which relates to our world?
You don't have to always spell earth-shattering revelations out loud (in fact it's best to let the readers reach the correct conclusion by themselves imo), but there needs to be enough narrative tension to make the conclusion inevitable; as it stands, it feels like I can just meh my way out of thinking more than 30s on what the revelation might be, the same way Tralith does.

2Logan Zoellner
  I'm glad you found one of the characters sympathetic.  Personally I feel strongly both ways, which is why I wrote the story the way that I did.
momom230

Thanks, it does clarify, both on separating the instantiation of an empathy mechanism in the human brain vs in AI and on considering instantiation separately from the (evolutionary or training) process that leads to it.

momom230

I was under the impression that empathy was explained by evolutionary psychology as a result of the need to cooperate, combined with the fact that we already had all the apparatus to simulate other people (like Jan Kulveit's first proposition).
(This does not translate to machine empathy as far as I can tell.)

I notice that this impression is justified by basically nothing besides "everything is evolutionary psychology". Seeing that other people's intuitions about the topic are completely different is humbling; I guess emotions are not obvious.

So, I would appreciate if yo... (read more)

2Steven Byrnes
I definitely think that the human brain has innate evolved mechanisms related to social behavior in general, and to caring about (certain) other people’s welfare in particular. And I agree that the evolutionary pressure explaining why those mechanisms exist are generally related to the kinds of things that Robert Trivers and other evolutionary psychologists talk about. This post isn’t about that. Instead it’s about what those evolved mechanisms are, i.e. how they work in the brain. Does that help? …But I do want to push back against a strain of thought within evolutionary psychology where they say “there was an evolutionary pressure for the human brain to do X, and therefore the human brain does X”. I think this fails to appreciate the nature of the constraints that the brain operates under. There can be evolutionary pressure for the brain to do something, but there’s no way for the brain to do it, so it doesn’t happen, or the brain does something kinda like that but with incidental side-effects or whatever. As an example, imagine if I said: “Here’s the source code for training an image-classifier ConvNet from random initialization using uncontrolled external training data. Can you please edit this source code so that the trained model winds up confused about the shape of Toyota Camry tires specifically?” The answer is: “Nope. Sorry. There is no possible edit I can make to this PyTorch source code such that that will happen.” You see what I mean? I think this kind of thing happens in the brain a lot. I talk about it more specifically here. More of my opinions about evolutionary psychology here and here.
momom21512

I do not find this post reassuring about your approach.

  • Your plan is unsound; instead of a succession of events which need to go your way, I think you should aim for incremental marginal gains. There is no cost-effectiveness analysis, and the implicit theory of change is lacunar.
  • Your press release is unreadable (poor formatting), and sounds like a conspiracy theory (catchy punchlines, ALL CAPS DEMANDS, alarmist vocabulary and unsubstantiated claims); I think it's likely to discredit safety movements and raise attention in counterproductive ways.
  • The figures
... (read more)
4Remmelt
Thanks, as far as I can tell, this is a mix of critiques of strategic approach (fair enough), about communication style (fair enough), and partial misunderstandings of the technical arguments. I agree that we should not get hung up on a succession of events to go a certain way. IMO, we need to get good at simultaneously broadcasting our concerns in a way that’s relatable to other concerned communities, and opportunistically look for new collaborations there. At the same time, local organisers often build up an activist movement by ratcheting up the number of people joining the events and the pressure they put on demanding institutions to make changes. These are basic cheap civil disobedience tactics that have worked for many movements (climate, civil rights, feminist, changing a ruling party, etc). I prefer to go with what has worked, instead of trying to reinvent the wheel based on fragile cost-effectiveness estimates. But if you can think of concrete alternative activities that also have a track record of working, I’m curious to hear. I think this is broadly fair. The turnaround time of this press release was short, and I think we should improve on the formatting and give more nuanced explanations next time. Keep in mind the text is not aimed at you but at people more broadly who are feeling concerned and whom we want to encourage to act. A press release is not a paper. Our press release is more like a call to action – there is a reason to add punchy lines here. Let me recheck the AI Impacts paper. Maybe I was ditzy before, in which case, my bad. As you saw from my commentary above, I was skeptical about using that range of figures in the first place. Not sure what you see as the conflation? AGI, as an autonomous system that would automate many jobs, would necessarily be self-modifying – even in the limited sense of adjusting its internal code/weights on the basis of new inputs. The reasoning shared in the press release by my colleague was rather l
momom230

I agree with the broad idea, but I'm going to need a better implementation.
In particular, the 5 criteria you give are insufficient because the example you give scores well on them, and is still atrocious: if we decreed that "black people" was unacceptable and should be replaced by "black peoples", it would cause a lot of confusion on account of how similar the two terms are and how ineffective the change is.

The cascade happens because of a specific reason, and the change aims at resolving that reason. For example, "Jap" is used as a slur, and not saying it... (read more)

momom231
  • Probability of existential catastrophe before 2032 assuming AGI arrives in that period and Harris wins[12] = 30%

  • Probability of existential catastrophe before 2032 assuming AGI arrives in that period and Trump wins[13] = 35%.

A lot of your AI-risk reason to support Harris seems to hinge on this, which I find very shaky. How wide are your confidence intervals here?
My own guesses are much more fuzzy. According to your argument, if my intuition was .2 vs .5, then it's an overwhelming case for Harris but I'm unfamiliar enough with the topic that it cou... (read more)

momom221

Seems like you need to go beyond arguments of authority and stating your conclusions and instead go down to the object-level disagreements. You could say instead "Your argument for ~X is invalid because blah blah" and if Jacob says "Your argument for the invalidity of my argument for ~X is invalid because blah blah" then it's better than before because it's easier to evaluate argument validity than ground truth.
(And if that process continues ad infinitum, consider that someone who cannot evaluate the validity of the simplest arguments is not worth arguing with.)

momom230

It's thought-provoking.
Many people here identify as Bayesians, but are as confused as Saundra by the troll's questions, which indicates that they're missing something important.

momom210

It wasn't mine. I did grow up in a religious family, but becoming a rationalist came gradually, without sharp divide with my social network. I always figured people around me were making all sorts of logical mistakes though, and noticed very early deep flaws in what I was taught.

momom231

It's not. The paper is hype, the authors don't actually show that this could replace MLPs.

momom221

This is very interesting!
I did not expect that Chinese would be more optimistic about benefits than worried about risks and that they would rank it so low as an existential risk. 
This is in contrast with posts I see on social media and articles showcasing safety institutes and discussing doomer opinions, which gave me the impression that Chinese academia was generally more concerned about AI risk and especially existential risk than the US.

I'm not sure how to reconcile this survey's results with my previous model. Was I just wrong and updating too much on anecdotal evidence?
How representative of policymakers and of influential scientists do you think these results are?

5Nick Corvino
I think the crux is that the thoughts of the CCP and Chinese citizens don't necessarily have to have a strong correlation - in many ways they can be orthogonal, and sometimes even negatively correlated (like when the gov trades off on personal freedoms for national security). I think recent trends suggest the Chinese gov / Xi Jinping are taking risks (especially the tail risks) more seriously, and have done some promising AI safety stuff. Still mostly unclear, tho. Highly recommend checking out Concordia AI's The State of AI Safety in China Spring 2024 Report.
momom210

About the Christians around me: it is not explicitly considered rude, but it is a signal that you want to challenge their worldview, and if you are going to predictably ask that kind of question often, you won't be welcome in open discussions.
(You could do it once or twice for anecdotal evidence, but if you actually want to know whether many Christians believe in a literal snake, you'll have to do a survey.)

momom2123

I disagree – I think that no such perturbations exist in general, rather than that we have simply not had any luck finding them.

I have seen one such perturbation. It was two images of two people, one of which was clearly male and the other female, though I wasn't able to tell any significant difference between the two images after 15s of trying to find one, except for a slight difference in hue.
Unfortunately, I can't find this example again after a 10-minute search. It was shared on Discord; the people in the image were white and freckled. I'll save it if I find it again.

https://x.com/jeffreycider/status/1648407808440778755

(I'm writing a post on cognitohazards, the perceptual inputs that hurt you. So, I have this post conveniently referenced in my draft lol)

momom210

The pyramids in Mexico and the pyramids in Egypt are related via architectural constraints and human psychology.

momom220

In practice, when people say "one in a million" in that kind of context, it's much higher than that. I haven't watched Dumb and Dumber, but I'd be surprised if Lloyd did not, actually, have a decent chance of ending up together with Mary.

On one hand, we claim [dumb stuff using made up impossible numbers](https://www.lesswrong.com/posts/GrtbTAPfkJa4D6jjH/confidence-levels-inside-and-outside-an-argument) and on the other hand, we dismiss those numbers and fall back on there's-a-chancism.
These two phenomena don't always perfectly compensate one another (as examples show in both posts), but common sense is more reliable than it may seem at first. (I'm not saying it's the correct approach nonetheless.)

Answer by momom220

Epistemic status: amateur, personal intuitions.

If this were the case, it makes sense to hold dogs (rather than their owners, or their breeding) responsible for aggressive or violent behaviour.

I'd consider whether punishing the dog would make the world better, or whether changing the system that led to its breeding, or providing incentives to the owner or any combination of other actions would be most effective.

Consequentialism is about considering the consequences of actions to judge them, but various people might wield this in various ways. 
Implicitl... (read more)

momom210

I can imagine plausible mechanisms for how the first four backlash examples were a consequence of perceived power-seeking from AI safetyists, but I don't see one for e/acc. Does someone have one?

Alternatively, what reason do I have to expect that there is a causal relationship between safetyist power-seeking and e/acc even if I can't see one?

e/acc has coalesced in defense of open-source, partly in response to AI safety attacks on open-source. This may well lead directly to a strongly anti-AI-regulation Trump White House, since there are significant links between e/acc and MAGA.

I think of this as a massive own goal for AI safety, caused by focusing too much on trying to get short-term "wins" (e.g. dunking on open-source people) that don't actually matter in the long term.

momom210

That's not interesting to read unless you say what your reasons are and they differ from other critics'. Perhaps not say it all in a comment, but at least a link to a post.

momom210

Interestingly, I think that one of the examples of proving too much on Wikipedia can itself be demolished by a proving too much argument, but I’m not going to say which one it is because I want to see if other people independently come to the same conclusion.

For those interested in the puzzle, here is the page Scott was linking to at the time: https://en.wikipedia.org/w/index.php?title=Proving_too_much&oldid=542064614
The article was edited a few hours later, and subsequent conversation showed that Wikipedia editors came to the conclusion Scott hinted a... (read more)

momom210

Another way to avoid the mistake is to notice that the implication is false, regardless of the premises. 
In practice, people's beliefs are not deductively closed, and (in the context of a natural language argument) we treat propositional formulas as tools for computing truths rather than timeless statements.

momom220

it can double as a method for creating jelly donuts on demand

For those reading this years later, here's the comic that shows how to make ontologically necessary donuts.

momom281

I'd appreciate examples of the sticker shortcut fallacy with in-depth analysis of why they're wrong and how the information should have been communicated instead.

9ymeskhout
I wanted to include very basic examples first: I am planning yet another follow-up to outline more contentious examples. Basically, almost any dispute  that is based on a disguised query and hinges on specific categorization matches the fallacy. Some of the prominent examples that come to mind, with the sticker shortcut label italicized: * Was January 6th an insurrection? * Is Israel committing a genocide? * Are IQ tests a form of eugenics? All of these questions appear to be a disguised query into asking whether X is a "really bad thing". But instead of asking this directly, they try to sneak in the connotation through the label. Similarly, the whole debate over whether transwomen are women is a hodgepodge of disguised queries that try to sneak in a preferred answer through the acceptance of labels. In each of these examples, we're better served by discussing the thing directly rather than debating over labels. Does this help clarify?
momom210

"Anyone thinks they're a reckless idiot" is far too easy a bar to reach for any public figure.
I do not know of major anti-Altman currents in my country, but considering surveys consistently show a majority of people worried about AI risk, a normal distribution of extremeness of opinion on the subject ensures there'll be many who do consider Sam Altman a reckless idiot (for good or bad reason - I expect a majority of them to consider Sam Altman to have any negative trait that comes to their attention because it is just that easy to have a narrow hateful opinion on a subject for a large portion of the population).

1Anders Lindström
For the record, I do not mean to single out Altman. I am talking in general about leading figures (i.e. Altman et al.) in the AI space, for which Altman has become a convenient proxy since he is a very public figure.
momom251

I have cancelled my subscription as well. I don't have much to add to the discussion, but I think signalling participation in the boycott will help conditional on the boycott having positive value.

momom232

Thanks for the information.
Consider though that for many people the price of the subscription is motivated by convenience of access and use.

It took me a second to see how your comment was related to the post so here it is for others: 
Given this information, using the API preserves most of the benefits of access to SOTA AI (assuming away the convenience value) while destroying most of the value for OpenAI, which makes this a very effective intervention compared to cancelling the subscription entirely.

1O O
There’s an API playground which is essentially a chat interface. It’s highly convenient.
momom210

When I vote, I basically know the full effect this has on what is shown to other users or to myself. 

Mindblowing moment: It has been a private pet peeve of mine that it was very unclear what policy I should follow for voting.

In practice, I vote mostly on vibes (and expect most people to), but given my own practices for browsing LW, I also considered alternative approaches.
- Voting in order to assign a specific score (weighted for inflation by time and author) to the post. Related uses: comparing karma of articles, finding desirable articles on a given... (read more)

momom210

Not everything suboptimal, but suboptimal in a way that causes suffering on an astronomical scale (e.g. galactic dystopia, or dystopia that lasts for thousands of years, or dystopia with an extreme number of moral patients (e.g. uploads)).
I'm not sure what you mean by Ord, but I think it's reasonable to have a significant probability of S-risk from a Christiano-like failure.

momom21-1

I think you miss one important existential risk separate from extinction, which is having a lastingly suboptimal society. Like, systematic institutional inefficiency, and being unable to change anything because of disempowerment.
In that scenario, maybe humanity is still around because one of the things we can measure and optimize for is making sure a minimum amount of humans are alive, but the living conditions are undesirable.

4otto.barten
Stretching the definition to include anything suboptimal is the most ambitious stretch I've seen so far. It would include literally everything that's wrong, or can ever be wrong, in the world. Good luck fixing that. On a more serious note, this post is about existential risk as defined by eg Ord. Anything beyond that (and there's a lot!) is out of scope.
momom220

I'm not sure either, but here's my current model:
Even though it looks pretty likely that AISC is an improvement on no-AISC, there are very few potential funders:
1) EA-adjacent charitable organizations.
2) People from AIS/rat communities.

Now, how to explain their decisions?
For the former, my guess would be a mix of not having heard of/received an application from AISC and preferring to optimize heavily towards top-rated charities. AISC's work is hard to quantify, as you can tell from the most upvoted comments, and that's a problem when you're looking for pro... (read more)

momom220

Follow this link to find it. The translation is made by me, and open to comments. Don't hesitate to suggest improvements.

1Heron
Thanks!
momom232

It's not obvious at all to me, but it's certainly a plausible theory worth testing!

5Lao Mein
The most direct way would be to spell-check the training data and see how that impacts spelling performance. How would spelling performance change when you remove typing errors like " hte" vs phonetic errors like " hygeine" or doubled-letters like " Misissippi"? Also, misspellings often break up a large token into several small ones (" Mississippi" is [13797]; " Misissippi" is [31281, 747, 12715][' Mis', 'iss', 'ippi']) but are used in the same context, so maybe looking at how the spellings provided by GPT3 compare to common misspellings of the target word in the training text could be useful. I think I'll go do that right now. The research I'm looking at suggests that the vast majority of misspellings on the internet are phonetic as opposed to typing errors, which makes sense since the latter is much easier to catch.   Also, anyone have success in getting GPT2 to spell words? 
momom220

To whom it may concern, here's a translation of "Bold Orion" in French.

1Heron
Où?
momom21811

A lot of the argumentation in this post is plausible, but also, like, not very compelling?
Mostly the "frictionless" model of sexual/gender norms, and the examples associated: I can see why these situations are plausible (if only because they're very present in my local culture) but I wouldn't be surprised if they are a bunch of social myths either, in which case the whole post is invalidated.

I appreciate the effort though; it's food for thought even if it doesn't tell me much about how to update based on the conclusion.

momom266

Epistemic status: Had a couple conversations on AI Plans with the founder, participated in the previous critique-a-thon. I've helped AI Plans a bit before, so I'm probably biased towards optimism.

 

Neglectedness: Very neglected. AI Plans wants to become a database of alignment plans which would allow quick evaluation of whether an approach is worth spending effort on, at least as a quick sanity check for outsiders. I can't believe it didn't exist before! Still very rough and unusable for that purpose for now, but that's what the critique-a-thon is for... (read more)

momom210

Thank you, this is incredibly interesting! Did you ever write up more on the subject? I'm excited to see how it relates to mesa-optimisation in particular.

In the finite case, where , then 

Typo: I think you mean  ?

momom220

I'm surprised to hear they're posting updates about CoEm.

At a conference held by Connor Leahy, I said that I thought it was very unlikely to work, and asked why they were interested in this research area, and he answered that they were not seriously invested in it.

We didn't develop the topic and it was several months ago, so it's possible that 1- I misremember, 2- they changed their minds, or 3- I appeared adversarial and he didn't feel like debating CoEm. (For example, maybe he actually said that CoEm didn't look promising and this changed recently?)
Still, anecdotal evidence is better than nothing, and I look forward to seeing OliviaJ compile a document to shed some light on it.

momom211

I invite you. You can send me this summary in private to avoid downvotes.

momom21-2

There's a whole part of the argument which is missing which is the framing of this as being about AI risk.
I've seen various propositions for why this happened, and the board being worried about AI risk is one of them but not the most plausible afaict.
 

In addition, this is phrased similarly to technical problems like corrigibility, which it is very much not about.
People who say "why can't you just turn it off" typically refer to literally turning off the AI if it appears to be dangerous, which this is not about. This is about turning off the AI company, not the AI.

momom213

1- I didn't know Executive Orders could be repealed easily. Could you please elaborate?
2- Why is it good news? To me, this looks like a clear improvement on the previous status of regulations.

9Colin McGlynn
Executive Orders aren't legislation. They are instructions that the White House issues to executive branch agencies. So the president can issue new executive orders that change or reverse older executive orders made by themselves or past presidents.
momom25-2

AlexNet dates back to 2012, I don't think previous work on AI can be compared to modern statistical AI.
Paul Christiano's foundational paper on RLHF dates back to 2017.
Arguably, all of agent foundations work turned out to be useless so far, so prosaic alignment work may be what Roko is taking as the beginning of AIS as a field.

2Vaniver
When were convnets invented, again? How about backpropagation?
5Roko
yes
momom286

The AI safety leaders currently see slow takeoff as humans gaining capabilities, and this is true; and also already happening, depending on your definition. But they are missing the mathematically provable fact that information processing capabilities of AI are heavily stacked towards a novel paradigm of powerful psychology research, which by default is dramatically widening the attack surface of the human mind.

I assume you do not have a mathematical proof of that, or you'd have mentioned it. What makes you think it is mathematically provable?
I would be ve... (read more)

5trevor
Yes, I thought for years that the research should be private but as it turns out, most people in policy are pretty robustly not-interested in anything that sounds like "mind control" and the math is hard to explain, so if this stuff ends up causing a public scandal that damages the US's position in international affairs then it probably won't originate from here (e.g. it would get popular elsewhere like the AI surveillance pipeline) so AI safety might as well be the people that profit off of it by open-sourcing it early. It's actually a statistical induction. When you have enough human behavioral data in one place, you can use gradient descent to steer people in measurable directions if the people remain in the controlled interactive environment that the data came from (and social media news feeds are surprisingly optimized to be that perfect controlled environment). More psychologists mean better quality data-labeling, which means people can be steered more precisely.