All of Drake Thomas's Comments + Replies

A problem I have that I think is fairly common:

  1. I notice an incoming message of some kind.
  2. For whatever reason it's mildly aversive or I'm busy or something.
  3. Time passes.
  4. I feel guilty about not having replied yet.
  5. Interacting with the message is associated with negative emotions and guilt, so it becomes more aversive.
  6. Repeat steps 4 and 5 until the badness of not replying outweighs the aversion built up by the escalating 4/5 cycle, or until the end of time.

Curious if anyone who once had this problem feels like they've resolved it, and if so what worked! 

1Hastings
I haven’t totally defeated this, but I’ve had some luck with immediately replying “I am looking to reply to this properly, first I need to X” if there is an X blocking a useful reply
3Canaletto
Check this out https://www.lesswrong.com/posts/EFQ3F6kmt4WHXRqik/ugh-fields 

So it's been a few months since SB1047. My sense of the main events that have happened since the peak of LW commenter interest (might have made mistakes or missed some items) are:

  • The bill got vetoed by Newsom for pretty nonsensical stated reasons, after passing in the state legislature (but the state legislature tends to pass lots of stuff so this isn't much signal).
  • My sense of the rumor mill is that there are perhaps some similar-ish bills in the works in various state legislatures, but AFAIK none that have yet been formally proposed or accrued serious di
... (read more)

though I don't think xAI took an official position one way or the other

I assumed most everybody assumed xAI supported it, since Elon did. I didn't bother pushing for an additional xAI endorsement given that Elon endorsed it.

Note that the lozenges dissolve slowly, so (bad news) you'd have the taste around for a while but (good news) it's really not a very strong peppermint flavor while it's in your mouth, and in my experience it doesn't really have much of the menthol-triggered cooling effect. My guess is that you would still find it unpleasant, but I think there's a decent chance you won't really mind. I don't know of other zinc acetate brands, but I haven't looked carefully; as of 2019 the claim on this podcast was that only the Life Extension brand is any good.

On my model of what's going on, you probably want the lozenges to spend a while dissolving, so that you have fairly continuous exposure of throat and nasal tissue to the zinc ions. I find that they taste bad and astringent if I actively suck on them but are pretty unobtrusive if they just gradually dissolve over an hour or two (sounds like you had a similar experience). I sometimes cut the lozenges in half and let each half dissolve so that they fit into my mouth more easily - you might want to give that a try?

3Joey KL
Interesting, I can see why that would be a feature. I don't mind the taste at all actually. Before, I had some of their smaller citrus flavored kind, and they dissolved super quick and made me a little nauseous. I can see these ones being better in that respect. 

I agree, zinc lozenges seem like they're probably really worthwhile (even in the milder-benefit worlds)! My less-ecstatic tone is only relative to the promise of older lesswrong posts that suggested it could basically solve all viral respiratory infections, but maybe I should have made the "but actually though, buy some zinc lozenges" takeaway more explicit. 

I liked this post, but I think there's a good chance that the future doesn't end up looking like a central example of either "a single human seizes power" or "a single rogue AI seizes power". Some other possible futures:

  • Control over the future by a group of humans, like "the US government" or "the shareholders of an AI lab" or "direct democracy over all humans who existed in 2029"
  • Takeover via an AI that a specific human crafted to do a good job at enacting that human's values in particular, but which the human has no further steering power over
  • Lots of diff
... (read more)

The action-relevant question, for deciding whether you want to try to solve alignment, is how the average world with human-controlled AGI compares to the average AGI-controlled world.

To nitpick a little, it's more like "the average world where we just barely didn't solve alignment, versus the average world where we just barely did" (to the extent making things binary in this way is sensible), which I think does affect the calculus a little - marginal AGI-controlled worlds are more likely to have AIs which maintain some human values. 

(Though one might ... (read more)

Update: Got tested, turns out the thing I have is bacterial rather than viral (Haemophilus influenzae). Lines up with the zinc lozenges not really helping! If I remember to take zinc the next time I come down with a cold, I'll comment here again.

My impression is that since zinc inhibits viral replication, it's most useful in the regime where viral populations are still growing and your body hasn't figured out how to beat the virus yet. So getting started ASAP is good, but it's likely still helpful throughout the first 2-3 days of the illness.

An important part of the model here that I don't understand yet is how your body's immune response varies as a function of viral populations - e.g. two models you could have are 

  1. As soon as any immune cell in your body has ever seen a virus, a fixed scale-up of immune
... (read more)
1Maxwell Peterson
Thanks!

The 2019 LW post discusses a podcast which talks a lot about gears-y models and proposed mechanisms; as I understand it, the high level "zinc ions inhibit viral replication" model is fairly well accepted, but some of the details around which brands are best aren't as well-attested elsewhere in the literature. For instance, many of these studies don't use zinc acetate, which this podcast would suggest is best. (To its credit, the 2013 meta-analysis does find that acetate is (nonsignificantly) better than gluconate, though not radically so.)

(TLDR: Recent Cochrane review says zinc lozenges shave 0.5 to 4 days off of cold duration with low confidence, middling results for other endpoints. Some reason to think good lozenges are better than this.)

There's a 2024 Cochrane review on zinc lozenges for colds that's come out since LessWrong posts on the topic from 2019, 2020, and 2021. 34 studies, 17 of which are lozenges, 9/17 are gluconate and I assume most of the rest are acetate but they don't say. Not on sci-hub or Anna's Archive, so I'm just going off the abstract and summary here; would love a P... (read more)

2Linda Linsefors
If you email the authors they will probably send you the full article.
2David Matolcsi
Does anyone know of a non-peppermint-flavored zinc acetate lozenge? I really dislike peppermint, so I'm not sure it would be worth it to drink 5 peppermint-flavored glasses of water a day to decrease the duration of a cold by one day, and I haven't found other zinc acetate lozenge options yet; the acetate version seems to be rare among zinc supplements. (Why?)
1Joey KL
I ordered some of the Life Extension lozenges you said you were using; they are very large and take a long time to dissolve. It's not super unpleasant or anything, I'm just wondering if you would count this against them?
5Drake Thomas
Update: Got tested, turns out the thing I have is bacterial rather than viral (Haemophilus influenzae). Lines up with the zinc lozenges not really helping! If I remember to take zinc the next time I come down with a cold, I'll comment here again.
2Maxwell Peterson
Thanks for putting this together! I have a vague memory of a post saying that taking zinc early, while virus was replicating in the upper respiratory tract, was much more important than taking it later, because later it would have spread all over the body and thus the zinc can’t get to it, or something like this. So I tend to take a couple early on then stop. But it sounds like you don’t consider that difference important.  Is it your current (Not asking you to do more research!) impression that it’s useful to take zinc throughout the illness?
2stavros
I woke up this morning thinking 'would be nice to have a concise source for the whole zinc/colds thing'. This is amazing. I help run an EA coliving space, so I started doing some napkin math on how many sick days you'll be saving our community over the next year. Then vaguely extrapolated to the broader lesswrong audience who'll read your post and be convinced/reminded to take zinc (and given decent guidance for how to use it effectively). I'd guess at minimum you've saved dozens of days over the next year by writing this post. That's pretty cool. Thank you <3
1Hzn
Do you have any thoughts on mechanism & whether prevention is actually worse independent of inconvenience?
3Elizabeth
David Maciver over on Twitter likes a zinc mouthwash, which presumably has a similar mechanism
5MichaelDickens
This is actually a crazy big effect size? Preventing ~10–50% of a cold for taking a few pills a day seems like a great deal to me.

I agree this seems pretty good to do, but I think it'll be tough to rule out all possible contaminant theories with this approach: 

  • Some kinds of contaminants will be really tough to handle, eg if the issue is trace amounts of radioactive isotopes that were at much lower levels before atmospheric nuclear testing.
  • It's possible that there are contaminant-adjacent effects arising from preparation or growing methods that aren't related to the purity of the inputs, eg "tomato plants in contact with metal stakes react by expressing obesogenic compounds in th
... (read more)

I've gotten enormous value out of LW and its derived communities during my life, at least some of which is attributable to the LW2.0 revival and its effects on those communities. More recently, since moving to the Bay, I've been very excited by a lot of the in-person events that Lighthaven has helped facilitate. Also, LessWrong is doing so many things right as a website and source-of-content that no one else does (karma-gated RSS feeds! separate upvote and agree-vote! built-in LaTeX support!) and even if I had no connection to the other parts of its missio... (read more)

The theoretical maximum FLOPS of an Earth-bound classical computer is something like .

Is this supposed to have a different base or exponent? A single H100 already gets like 10^15 FLOP/s.

2EuanMcLean
fixed, thanks. Careless exponent juggling

So I would guess it should be possible to post-train an LLM to give answers like "................... Yes" instead of "Because 7! contains both 3 and 5 as factors, which multiply to 15 Yes", and the LLM would still be able to take advantage of CoT

This doesn't necessarily follow - on a standard transformer architecture, this will give you more parallel computation but no more serial computation than you had before. The bit where the LLM does N layers' worth of serial thinking to say "3" and then that "3" token can be fed back into the start of N more layers... (read more)
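As a toy illustration of the parallel-vs-serial point (a sketch I'm adding here, not a claim about any specific model; the layer and token counts are made up), one can count the longest dependency chain in the computation graph with and without the sampled-token feedback:

```python
# Toy "depth accounting" for a decoder-only transformer with n_layers layers that
# generates k extra tokens, comparing fixed filler tokens against real chain of thought.
# Nodes are (layer, position); attention lets (layer - 1, p') feed (layer, p) for p' <= p.
# With real CoT, the token *sampled* from position p's top layer additionally feeds the
# layer-0 input of position p + 1, which is what creates a longer serial dependency chain.
from functools import lru_cache

n_layers, k = 12, 8  # made-up sizes, just for illustration

def longest_chain(cot: bool) -> int:
    @lru_cache(maxsize=None)
    def depth(layer: int, pos: int) -> int:
        if layer > 0:
            # fed by every position <= pos at the previous layer
            return max(depth(layer - 1, p) for p in range(pos + 1)) + 1
        if cot and pos > 0:
            # layer-0 input is the token sampled from the previous position's final layer
            return depth(n_layers - 1, pos - 1) + 1
        return 0
    return max(depth(n_layers - 1, p) for p in range(k))

print("filler tokens:", longest_chain(cot=False))  # ~n_layers, no matter how large k is
print("real CoT:     ", longest_chain(cot=True))   # grows with k (roughly k * n_layers)
```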

3Vladimir_Nesov
That's relevant, but about what I expected and why I hedged with "it should be possible to post-train", which that paper doesn't explore. The residual stream on many tokens is working memory: N layers of "vertical" compute over one token only have one activation vector to work with, while with more filler tokens you have many activation vectors that can work on multiple things in parallel and then aggregate. If a weaker model doesn't take advantage of this, or gets too hung up on concrete tokens to think about other things in the meantime instead of being able to maintain multiple trains of thought simultaneously, a stronger model might[1].

Performance on large questions (such as reading comprehension) with an immediate answer (no CoT) shows that N layers across many tokens, with no opportunity for deeper serial compute, is sufficient for many purposes. But a question is only fully understood when it's read completely, so some of the thinking about the answer can't start before that. If there are no more tokens, this creates an artificial constraint on working memory for thinking about the answer; filler tokens should be useful for lifting it. Repeating the question seems to help, for example (see Figure 3 and Table 5).

1. The paper is from Jul 2023 and not from OpenAI, so it didn't get to play with 2e25+ FLOPs models, and a new wave of 2e26+ FLOPs models is currently imminent.

I don't think that's true - in eg the GPT-3 architecture, and in all major open-weights transformer architectures afaik, the attention mechanism is able to feed lots of information from earlier tokens and "thoughts" of the model into later tokens' residual streams in a non-token-based way. It's totally possible for the models to do real introspection on their thoughts (with some caveats about eg computation that occurs in the last few layers), it's just unclear to me whether in practice they perform a lot of it in a way that gets faithfully communicated to the user.
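A minimal numpy sketch of that information-flow claim (illustrative only: a single attention head with made-up toy weights, no claim about any particular model's implementation). Perturbing an internal feature of an early position, without changing any emitted token, changes later positions' outputs:

```python
import numpy as np

# Toy single-head causal self-attention over hidden states (no layer norm, biases, or
# output projection; shapes and weights are made up for illustration).
# The point: the output at position t mixes in the *hidden states* h[0..t] of earlier
# positions, which carry much more information than the discrete tokens sampled from them.

def causal_attention(h, Wq, Wk, Wv):
    """h: (T, d) hidden states; returns (T, d) attention outputs."""
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    scores = q @ k.T / np.sqrt(h.shape[-1])           # (T, T)
    mask = np.tril(np.ones_like(scores, dtype=bool))  # position t attends to positions <= t
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                                # row t is a mixture of v[0..t]

rng = np.random.default_rng(0)
T, d = 5, 8
h = rng.normal(size=(T, d))                           # hidden states at some layer
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

out = causal_attention(h, Wq, Wk, Wv)

# Perturb an internal feature of an early position (no token identity changes) and the
# later positions' outputs change, i.e. non-token information flows forward.
h_perturbed = h.copy()
h_perturbed[1] += 1.0
out_perturbed = causal_attention(h_perturbed, Wq, Wk, Wv)
print(np.abs(out_perturbed[-1] - out[-1]).max() > 0)  # True
```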

3green_leaf
Are you saying that after it has generated the tokens describing what the answer is, the previous thoughts persist, and it can then generate tokens describing them? (I know that it can introspect on its thoughts during the single forward pass.)

Yeah, I'm thinking about this in terms of introspection on non-token-based "neuralese" thinking behind the outputs; I agree that if you conceptualize the LLM as being the entire process that outputs each user-visible token including potentially a lot of CoT-style reasoning that the model can see but the user can't, and think of "introspection" as "ability to reflect on the non-user-visible process generating user-visible tokens" then models can definitely attain that, but I didn't read the original post as referring to that sort of behavior.

3green_leaf
Yeah. The model has no information (except for the log) about its previous thoughts and it's stateless, so it has to infer them from what it said to the user, instead of reporting them.

In other words, they can think about the thoughts "behind" the previous words they wrote. If you doubt me on this, try asking one what its words are referring to, with reference to its previous words. Its "attention" modules are actually intentionally designed to know this sort of thing, using key/query/value lookups that occur "behind the scenes" of the text you actually see on screen.

I don't think that asking an LLM what its words are referring to is a convincing demonstration that there's real introspection going on in there, as opposed to "plausi... (read more)

8Logan Riggs
I tried a similar experiment w/ Claude 3.5 Sonnet, where I asked it to come up w/ a secret word and, in branching paths:
1. Asked directly for the word
2. Played 20 questions, and then guessed the word
in order to see if it does have a consistent word it can refer back to.

Branch 1:

Branch 2:

Which I just thought was funny. Asking again, telling it about the experiment and how it's important for it to try to give consistent answers, it initially said "telescope" and then gave hints towards a paperclip. Interesting to see when it flips its answers, though it's a simple setup to repeatedly ask its answer every time. Also could be confounded by temperature.
3green_leaf
Claude can think for himself before writing an answer (which is an obvious thing to do, so ChatGPT probably does it too). In addition, you can significantly improve his ability to reason by letting him think more, so even if it were the case that this kind of awareness is necessary for consciousness, LLMs (or at least Claude) would already have it.

I think my original comment was ambiguous - I also consider myself to have mostly figured it out, in that I thought through these considerations pretty extensively before joining and am in a "monitoring for new considerations or evidence or events that might affect my assessment" state rather than a "just now orienting to the question" state. I'd expect to be most useful to people in shoes similar to my past self (deciding whether to apply or accept an offer) but am pretty happy to talk to anyone, including eg people who are confident I'm wrong and want to convince me otherwise.

Thanks for clearing that up. It sounds like we're thinking along very similar lines, but that I came to a decision to stop earlier. From a position inside one of the major AI labs, you'll be positioned to more correctly perceive when the risks start outweighing the benefits. I was perceiving events more remotely from over here in Boston, and from inside a company that uses AI as one of a number of tools, not as the main product.

I’ve been aware of the danger of superintelligence since the turn of the century, and I did my “just now orienting... (read more)

See my reply to Ryan - I'm primarily interested in offering advice on something like that question since I think it's where I have unusually helpful thoughts, I don't mean to imply that this is the only question that matters in making these sorts of decisions! Feel free to message me if you have pitches for other projects you think would be better for the world.

3ryan_greenblatt
Gotcha, I interpreted your comment as implying you were interested in trying to improve your views on the topic in collaboration with someone else (who is also interested in improving their views on the topic). So I thought it was relevant to point out that people should probably mostly care about a different question.

Yeah, I agree that you should care about more than just the sign bit. I tend to think the magnitude of effects of such work is large enough that "positive sign" often is enough information to decide that it dominates many alternatives, though certainly not all of them. (I also have some kind of virtue-ethical sensitivity to the zero point of the impacts of my direct work, even if second-order effects like skill building or intra-lab influence might make things look robustly good from a consequentialist POV.)

The offer of the parent comment is more narrowly ... (read more)

4ryan_greenblatt
FWIW, my guess is that this is technically true if you mean something broad by "many alternatives", but if you mean something like "the best several alternatives that you would think of if you spent a few days thinking about it and talking to people" then I would disagree.

I work on a capabilities team at Anthropic, and in the course of deciding to take this job I've spent[1] a while thinking about whether that's good for the world and which kinds of observations could update me up or down about it. This is an open offer to chat with anyone else trying to figure out questions of working on capability-advancing work at a frontier lab! I can be reached at "graham's number is big" sans spaces at gmail.

  1. ^

    and still spend - I'd like to have Joseph Rotblat's virtue of noticing when one's former reasoning for working on a projec

... (read more)

I’m not “trying to figure out” whether to work on capabilities, having already decided I’ve figured it out and given up such work.  Are you interested in talking about this to someone like me?  I can’t tell whether you want to restrict discussion to people who are still in the figuring out stage.  Not that there’s anything wrong with that, mind you.

1yc
Just saw the OP replied in another comment that he is offering advice.

Isn't the most relevant question whether it is the best choice for you? (Taking into account your objectives which are (mostly?) altruistic.)

I'd guess having you work on capabilities at Anthropic is net good for the world[1], but probably isn't your best choice long run and plausibly isn't your best choice right now. (I don't have a good understanding of your alternatives.)

My current view is that working on capabilites at Anthropic is a good idea for people who are mostly altruistically motivated if and only if that person is very comparatively advantaged ... (read more)

I agree it seems unlikely that we'll see coordination on slowing down before one actor or coalition has a substantial enough lead over other actors that it can enforce such a slowdown unilaterally, but I think it's reasonably likely that such a lead will arise before things get really insane.

A few different stories under which one might go from aligned "genius in a datacenter" level AI at time t to outcomes merely at the level of weirdness in this essay at t + 5-10y:

  • The techniques that work to align "genius in a datacenter" level AI don't scale to wildly s
... (read more)

(I work at Anthropic.) My read of the "touch grass" comment is informed a lot by the very next sentences in the essay:

But more importantly, tame is good from a societal perspective. I think there's only so much change people can handle at once, and the pace I'm describing is probably close to the limits of what society can absorb without extreme turbulence.

which I read as saying something like "It's plausible that things could go much faster than this, but as a prediction about what will actually happen, humanity as a whole probably doesn't want thing... (read more)

humanity as a whole probably doesn't want things to get incredibly crazy so fast, and so we're likely to see something tamer

Doesn't this require a pretty strong and unprecedented level of international coordination on stopping an obviously immediately extremely valuable and militarily relevant technology? I think a US backed entente could impose this on the rest of the world, but that would also be an unprecedentedly large effort.

I think this is certainly possible and I hope this level of coordination happens, but I don't exactly think this is likely in... (read more)

9aysja
I feel confused about how this squares with Dario’s view that AI is "inevitable," and "driven by powerful market forces." Like, if humanity starts producing a technology which makes practically all aspects of life better, the idea is that this will just… stop? I’m sure some people will be scared of how fast it’s going, but it’s hard for me to see the case for the market in aggregate incentivizing less of a technology which fixes ~all problems and creates tremendous value. Maybe the idea, instead, is that governments will step in...? Which seems plausible to me, but as Ryan notes, Dario doesn’t say this.

I've fantasized about a good version of this feature for math textbooks since college - would be excited to beta test or provide feedback about any such things that get explored! (I have a couple math-heavy posts I'd be down to try annotating in this way.)

(I work on capabilities at Anthropic.) Speaking for myself, I think of international race dynamics as a substantial reason that trying for global pause advocacy in 2024 isn't likely to be very useful (and this article updates me a bit towards hope on that front), but I think US/China considerations get less than 10% of the Shapley value in me deciding that working at Anthropic would probably decrease existential risk on net (at least, at the scale of "China totally disregards AI risk" vs "China is kinda moderately into AI risk but somewhat less than the US... (read more)

A proper Bayesian currently at less than 0.5% credence for a proposition P should assign a less than 1 in 100 chance that their credence in P rises above 50% at any point in the future. This isn't a catch for someone who's well-calibrated.

In the example you give, the extent to which it seems likely that critical typos would happen and trigger this mechanism by accident is exactly the extent to which an observer of a strange headline should discount their trust in it! Evidence for unlikely events cannot be both strong and probable-to-appear, or the events would not be unlikely.
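For reference, a sketch of the bound used in the first paragraph (a standard consequence of a Bayesian's credences forming a martingale; the 0.5% and 50% figures are the ones from the comment above):

```latex
% A Bayesian's credences p_t form a martingale: E[p_{t+1} | p_t] = p_t.
% Let \tau be the first time the credence reaches 0.5 (or a far-future horizon if it never does).
% Optional stopping gives E[p_\tau] = p_0, so by Markov's inequality:
\[
\Pr\big[\,p_t \ge 0.5 \text{ at some point}\,\big]
  \le \frac{\mathbb{E}[p_\tau]}{0.5}
  = \frac{p_0}{0.5}
  < \frac{0.005}{0.5} = 1\%.
\]
```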

4Groudon466
If the purpose of this betting is to reward those who bet on the truth, though, then allowing a spike in credulity to count for it works against that purpose, and turns it into more of a combined bet of “Odds that the true evidence available to the public and LW suggests >50% likelihood or that substantial false evidence comes out for a very short period within the longer time period”. In his comment reply to me, OP mentioned he would be fine with a window of a month for things to settle and considered it a reasonable concern, which suggests that he is (rightly) focused more on betting about actual UFO likelihood, rather than the hybrid likelihood that includes hypothetical instances of massive short-term misinformation. While you are correct that the probability of that misinformation should theoretically be factored in on the bettor’s end, that’s not what the OP is really wanting to bet on in the first place; as such, I don’t think it was a mistake to point it out.

An example of the sort of strengthening I wouldn't be surprised to see is something like "If V is not too badly behaved in the following ways, and for all v we have [some light-tailedness condition] on the conditional distribution (X | V = v), then catastrophic Goodhart doesn't happen." This seems relaxed enough that you could actually encounter it in practice.

3Thomas Kwa
Suppose that we are selecting for U = X + V, where V is true utility and X is error. If our estimator is unbiased (E[X | V = v] = 0 for all v) and X is light-tailed conditional on any value of V, do we have lim_{t→∞} E[V | X + V ≥ t] = ∞?

No; here is a counterexample. Suppose that V ~ N(0,1), and X | V ~ N(0,4) when V ∈ [−1,1], otherwise X = 0. Then I think lim_{t→∞} E[V | X + V ≥ t] = 0. This is worrying because in the case where V ~ N(0,1) and X ~ N(0,4) independently, we do get infinite V. Merely making the error *smaller* for large values of V causes catastrophe.

This suggests that success caused by light-tailed error when V has even lighter tails than X is fragile, and that these successes are "for the wrong reason": they require a commensurate overestimate of the value when V is high as when V is low.
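A quick Monte Carlo sketch of the comparison in this counterexample (my own illustration, assuming N(0, 4) denotes variance 4, i.e. standard deviation 2, and looking at a finite threshold rather than the limit):

```python
import numpy as np

# Monte Carlo illustration of the counterexample above (N(0, 4) taken to mean variance 4,
# i.e. standard deviation 2), at a finite threshold t rather than the t -> infinity limit.
rng = np.random.default_rng(0)
n, t = 1_000_000, 5.0

V = rng.normal(0.0, 1.0, n)                                       # true utility
X_indep = rng.normal(0.0, 2.0, n)                                 # error independent of V
X_cond = np.where(np.abs(V) <= 1, rng.normal(0.0, 2.0, n), 0.0)   # error only where |V| <= 1

print("independent error:    E[V | X+V >= t] ~", V[V + X_indep >= t].mean())
print("counterexample error: E[V | X+V >= t] ~", V[V + X_cond >= t].mean())
# The independent case gives a noticeably larger conditional mean of V (and it keeps
# growing as t increases), while the counterexample stays bounded: selection pressure
# mostly lands on the noisy |V| <= 1 region, where V isn't actually large.
```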

I'm not sure what you mean formally by these assumptions, but I don't think we're making all of them. Certainly we aren't assuming things are normally distributed - the post is in large part about how things change when we stop assuming normality! I also don't think we're making any assumptions with respect to additivity; writing U = X + V is more of a notational or definitional choice, though as we've noted in the post it's a framing that one could think doesn't carve reality at the joints. (Perhaps you meant something different by additivity, though - feel... (read more)

3rotatingpaguro
I wasn't saying you made all those assumptions, I was trying to imagine an empirical scenario to get your assumptions, and the first thing to come to my mind produced even stricter ones. I do realize now that I messed up my comment when I wrote that. Here there should not be Normality, just additivity and independence, in the sense of U − V ⊥ V. Sorry.

I do agree you could probably obtain similar-looking results with relaxed versions of the assumptions. However, the same way U − V ⊥ V seems quite specific to me, and you would need to make a convincing case that this is what you get in some realistic cases to make your theorem look useful, I expect this will continue to apply for whatever relaxed condition you can find that allows you to make a theorem. Example: if you said "I made a version of the theorem assuming there exists f such that f(U,V) ⊥ V for f in some class of functions", I'd still ask "and in what realistic situations does such a setup arise, and why?"

.00002% — that is, one in five hundred thousand

0.00002 would be one in five hundred thousand, but with the percent sign it's one in fifty million.

Indeed, even on basic Bayesianism, volatility is fine as long as the averages work out

I agree with this as far as the example given, but I want to push back on oscillation (in the sense of regularly going from one estimate to another) being Bayesian. In particular, the odds you should put on assigning 20% in the future, then 30% after that, then 20% again, then 30% again, and so on for ten up-down oscillations, s... (read more)
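One way to make that kind of oscillation bound concrete (a sketch of the standard martingale argument, not necessarily the exact numbers in the full comment):

```latex
% Credences are a martingale, so from 0.2 the chance of ever reaching 0.3 is at most
% 0.2/0.3, and from 0.3 the chance of ever falling back to 0.2 is at most 0.7/0.8
% (apply the same bound to the credence in \neg P). Each full up-down oscillation
% therefore has probability at most:
\[
\frac{0.2}{0.3} \cdot \frac{0.7}{0.8} = \frac{7}{12} \approx 0.58,
\qquad \text{so ten oscillations have probability at most } (7/12)^{10} \approx 0.5\%.
\]
```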

4Joe Carlsmith
Re: "0.00002 would be one in five hundred thousand, but with the percent sign it's one in fifty million." -- thanks, edited.  Re: volatility -- thanks, that sounds right to me, and like a potentially useful dynamic to have in mind. 

These graphs seem concerning to me, but I'm worried about an information cascade before Eliezer's responded or someone with substantial expertise in macroeconomic policy has weighed in, so I'm planning to refrain from voting on this post until a week from now.

(Posting as a comment in case others feel inclined to adopt a similar policy.)

Edit: I've now upvoted, since no contrary info has come in that I've seen and at least one person with experience in economics has commented supportively.

Late comment, but my reactions reading this:

Now's your chance to figure out what the next few obstacles are without my giving you spoilers first. Feel free to post your list under spoiler tags in the comment section.

[lightly edited for LaTeX and typos, not otherwise changed since seeing the spoilers]

1. You don’t know what you want all that legibly, or what kinds of concrete commitments the AI can make. This seems pretty okay, if you’re unhackable - the AI presents you with some formal specification of desiderata and you understand why they’re correct ones

... (read more)
2Donald Hobson
I disagree. If your simulation is perfectly realistic, the simulated humans might screw up at alignment and create an unfriendly superintelligence, for much the same reason real humans might. Also, if the space of goals that evolution + culture can produce is large, then you may be handing control to a mind with rather different goals. Rerolling the same dice won't give the same answer. These problems may be solvable, depending on what the capabilities here are, but they aren't trivial.

I think a lot of people in AI safety don't think it has a high probability of working (in the sense that the existence of the field caused an aligned AGI to exist where there otherwise wouldn't have been one) - if it turns out that AI alignment is easy and happens by default if people put even a little bit of thought into it, or it's incredibly difficult and nothing short of a massive civilizational effort could save us, then probably the field will end up being useless. But even a 0.1% chance of counterfactually causing aligned AI would be extremely worth... (read more)

Paul Christiano provided a picture of non-Singularity doom in What Failure Looks Like. In general there is a pretty wide range of opinions on questions about this sort of thing - the AI-Foom debate between Eliezer Yudkowsky and Robin Hanson is a famous example, though an old one.

"Takoff speed" is a common term used to refer to questions about the rate of change in AI capabilities at the human and superhuman level of general intelligence - searching Lesswrong or the Alignment Forum for that phrase will turn up a lot of discussion about these questions, thou... (read more)

Three thoughts on simulations:

  • It would be very difficult for 21st-century tech to provide a remotely realistic simulation relative to a superintelligence's ability to infer things from its environment; outside of incredibly low-fidelity channels, I would expect anything we can simulate to either have obvious inconsistencies or be plainly incompatible with a world capable of producing AGI. (And even in the low-fidelity case I'm worried - every bit you transmit leaks information, and it's not clear that details of hardware implementations could be safely obs
... (read more)

I'm not claiming that you should believe this, I'm merely providing you the true information that I believe it.

Something feels off to me about this notion of "a belief about the object level that other people aren't expected to share" from an Aumann's Agreement Theorem point of view - the beliefs of other rational agents are, in fact, enormous updates about the world! Of course Aumannian conversations happen exceedingly rarely outside of environments with tight verifiable feedback loops about correctness, so in the real world maybe something like these ... (read more)

Since this very old post shows up prominently in the search results for New York rationality meetups, it’s worth clarifying that these are still going strong as of 2022! The google group linked in this post is still active and serves as the primary source of meetup announcements; good faith requests to join are generally approved.

Of course the utility lost by missing a flight is vastly greater than that of waiting however long you’d have needed to to make it. But it’s a question of expected utilities - if you’re currently so cautious that you could take 1000 flights and never miss one, you’re arriving early enough to get a 99.9% chance of catching the flight. If showing up 2 minutes later lowers that to 99.8%, you’re not trading 2 minutes per missed flight, you’re trading 2000 minutes per missed flight, which seems worth it.
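A worked version of the same expected-value comparison (using the numbers from the comment above; C stands for whatever disutility you assign to a missed flight, measured in minutes of waiting):

```latex
% Leaving 2 minutes later saves 2 minutes with certainty and adds a 0.1% chance of
% missing the flight, so the later departure is worth it exactly when:
\[
0.001 \cdot C < 2 \text{ minutes}
\quad\Longleftrightarrow\quad
C < 2000 \text{ minutes} \approx 33 \text{ hours}.
\]
```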

3JBlack
Yes, that was my point. If you set reasonable numbers for these things, you get something on the order of magnitude of 1% chance of missing one as a good target. Hence if you've made fewer than 100 flights then having not yet missed one is extremely weak evidence for having spent too much time in airports, and is largely consistent with having spent too little. Most people have not made 100 flights in their lives, so the advice stands a very high chance of being actively harmful to most people. It would be more reasonable to say that if you have missed a flight then you're spending "too much" time in airports, because you're probably doing way too much flying.
1Martin Randall
Yes, it's a question of expected utilities. But if you take the saying literally, someone who has taken ten flights in their life should have taken a >10% chance of missing each flight. At that rate the consequences of missing a flight weigh heavily in the expected utility. An alternative saying: if you've ever missed a flight, you're spending too much time in airports, either because your carbon emissions are too high, or because you are taking an excessive risk of missing a flight, or both. (or you're a pilot, or you were unlucky, or ...)

Does “stamp out COVID” mean success for a few months, or epsilon cases until now? The latter seems super hard, and I think every nation that’s managed it has advantages over the US besides competence (great natural borders or draconian law enforcement).

Update: by 7:30 the meetup was maybe at 30% of peak attendance, at 8PM or so it migrated to another park because the first one closed, and the latest meetup interactions I know of went until around 12:40AM.

I think people typically hang out for as long as they want, and the size of the group gradually dwindles. There's no official termination point - I'd be a little surprised if more than half of people were left by 7:30, but I'd also be surprised if at least some meetup attendees weren't still interacting by 10PM or later. 

2Drake Thomas
Update: by 7:30 the meetup was maybe at 30% of peak attendance, at 8PM or so it migrated to another park because the first one closed, and the latest meetup interactions I know of went until around 12:40AM.

A path ads could take that seems like it would both be more ethical and more profitable, yet I don't see happening: actually get direct consumer feedback!

I like the concept of targeted ads showing me things I enjoy and am interested in, but empirically, they're not very good at it! Maybe it's because I use an adblocker most of the time, but even on my phone, ads are reliably uninteresting to me, and I think the fraction that I click on or update positively towards the company from must be far below 1%.* So why don't advertisers have an option for me to say... (read more)

You could pick many plausible metrics (number of matches, number of replies to messages, number of dates, number of longterm relationships) but it seems unlikely that any of them aren't impacted positively for most people in the online dating market by having better photos. Do you have reason to think that two reasonable metrics of success would affect the questions raised in this post differently?

2Charles Zheng
I personally don't have a desire to maximize any of these numbers. Do you know anyone who explicitly wants to maximize "number of longterm relationships?" I was being Socratic but the point I was trying to make is that I don't think there exists *any* metric that can adequately capture what people are looking for in a relationship. Hence, it becomes difficult to conclude that anyone is being "suboptimal", either.