All of kave's Comments + Replies

kave60

This is a bit of an aside, but I hesitate to be too shocked by differences in funding:DALY ratios. After all, what you really want to know is change in DALYs at a given level of funding. It seems pretty plausible that some diseases are 10x (or even 100x) as cost-effective to ameliorate as others.

That said, funding:DALY seems like a fine heuristic for searching for misallocated resources. And to be clear, I expect it's not actually a difference in cost-effectiveness that's driving the different spending, but I'd want to check before updating too much.

kave*20

I would definitely consider collaborative filtering ML, though I don't think people normally make deep models for it. You can see on Recombee's website that they use collaborative filtering, and use a bunch of weasel language that makes it unclear whether they actually use much else at all.

kave42

They tout their transformer ("beeformer") in marketing copy, but I expect it's mostly driven by collaborative filtering, like most recommendation engines.

4habryka
My guess is most recommendation engines in use these days are ML/DL based. At least I can't think of any major platform that hasn't yet switched over, based on what I read.
kave*40

We've actually been thinking about something quite related! More info soon.

(Typo: Lightcone for a writing retreat -> Lighthaven for a writing retreat)

kave20

My writing is often hard to follow. I think this is partly because I tend to write like I talk, but I can't use my non-verbals to help me out, and I don't get live feedback from the listener.

kave40

Interesting! Two yet more interesting versions of the test:

  • Someone who currently gets use from LLMs writing more memory-efficient code, though maybe this is kind of question-begging
  • Someone who currently gets use from LLMs, and also is pretty familiar with trying to improve the memory efficiency of their code (which maybe is Ray, idk)
kaveModerator Comment40

Moderation note: RFEs with interesting writeups have been a bit hard to frontpage recently. Normally, an announcement of a funding round is on the "personal" side, but I do think the content of this post, other than the announcement, is frontpage-worthy. For example, it would be interesting for people to see in recommendations in a few months' time.

With the recent OpenPhil RFE, we asked them to split out the timeless content, which we then frontpaged. I would be happier if this post did that, but for now I'll frontpage it. I might change my mind and remove ... (read more)

4aog
Thanks for the heads up. I’ve edited the title and introduction to better indicate that this content might be interesting to someone even if they’re not looking for funding. 
kave62

This popped up in my Recommended feed, and it piqued my interest: I think it relates to a design skill I've picked up at Lightcone. When staging a physical space, or creating a website, I often want to stop iterating quite early, when the core idea is realised. "It would be a lot of work to get all the minor details here, but how much does stuff feeling a bit jank really matter?"

I think jank matters surprisingly much! When creating something, it's easy to think "wow, that's a cool idea. I've got a proof of concept sorted". But users, even if they think sim... (read more)

kave62

I had terrible luck with symbolic regression, for what it's worth.

6abstractapplic
Looked into it more and you're right: conventional symbolic regression libraries don't seem to have the "calculate a quantity then use that as a new variable going forward" behavior I'd have needed to get Total Value and then decide&apply a tax rate based on that. I . . . probably should have coded up a proof-of-concept before impugning everyone including myself.
Answer by kave50

Two options I came up with:

  1. European Roulette has a house edge of 2.7%. I think in the UK, gambling winnings don't get taxed. I think in the US, you wouldn't tax each win, but just your total winnings.
  2. I'm seeing bid-ask spreads of about 10% on SPX call options. Naïvely, perhaps that means the "edge" is about 5% for someone looking to gamble. I think that the implied edge is much worse for further out-of-the-money options, but if you chain in-the-money options you'd have to pay tax on each win (assuming you won).
Dagon*152
  1. European Roulette has a house edge of 2.7%. I think in the UK, gambling winnings don't get taxed. I think in the US, you wouldn't tax each win, but just your total winnings.

If you go the casino route, craps is slightly better.  Don't pass is 1.40% house edge, and they let you take (or lay, for don't pass) "free odds" on a point, which is 0% (pays true odds of winning), getting it down below a percent.  Taxes may not matter if you can deduct the full charitable contribution.  

Note that if you had a perfect 50% even-money bet, you'd have to win 10 times to turn your $1000 into $1.024M. 0.5 ^ 10 = 0.000977, so you've got almost a tenth of a percent chance of winning.
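To make that arithmetic concrete, here's a minimal Python sketch of the chance of chaining ten double-or-nothing wins under different house edges. The roulette and craps figures are the ones quoted above; the craps-with-odds edge is an assumed round number, since it depends on how much odds you lay.

```python
# Chance of winning ten consecutive even-money bets under a given house edge.
# An even-money bet with house edge e pays 1:1 and wins with probability (1 - e) / 2.

def chance_of_ten_doublings(house_edge: float, wins_needed: int = 10) -> float:
    p_win = (1 - house_edge) / 2
    return p_win ** wins_needed

for name, edge in [
    ("fair coin", 0.0),                  # 0.5**10 = 0.000977, the figure above
    ("craps don't-pass + odds", 0.006),  # assumed round figure
    ("craps don't-pass", 0.014),
    ("European roulette", 0.027),
]:
    print(f"{name:>24}: {chance_of_ten_doublings(edge):.6%}")
```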

kave106

Yes, I expect they will mostly be negative EV. I'm interested in those

kave50

Yeah, I'm looking for how one would roll one's own donor lottery out of non-philanthropic funds

kaveΩ574

Meta: I'm confused and a little sad about the relative upvotes of Habryka's comment (35) and Sam's comment (28). I think it's trending better, but what does it even mean to have a highly upvoted complaint comment based on a misunderstanding, especially one more highly upvoted than the correction?

Maybe people think Habryka's comment is a good critique even given the correction, even though I don't think Habryka does?

7Adam Scholl
I interpreted Habryka's comment as making two points, one of which strikes me as true and important (that it seems hard/unlikely for this approach to allow for pivoting adequately, should that be needed), and the other of which was a misunderstanding (that they don't literally say they hope to pivot if needed).
kave20

Do we know how many silver pieces there are to a gold piece?

2aphyer
There are ten silver pieces to a gold piece.  I'll edit that into the doc.
kave20

In particular, it's hard to distinguish in the amount of time that I have to moderate a new user submission. Given that I'm trying to spend a few minutes on a new user, it's very helpful to be able to rely on style cues.

kave*40

Eusocial organisms have more specialisation at the individual level than non-eusocial organisms (I think). I might expect that I would want a large number of interchangeable individuals for each specialisation (a low bus factor), rather than more expensive, big, rare entities.

This predicts that complex multicellular organisms would have smaller cells than unicellular organisms or simple multicellular organisms (i.e. those where there isn't differentiation between the cells).

kave136

A relevant plot from the wonderful calibration.city:

Early in a prediction question's lifetime, the average Brier score is something like 0.17 to 0.24, which is like a 1-8% edge.

kave*235

I quite like the article The Rise and Fall of the English Sentence, which partially attributes reduced structural complexity to an increase in noun compounds (like "state hate crime victim numbers" rather than "the numbers of victims who have experienced crimes that were motivated by hatred directed at their ethnic or racial identity, and who have reported these crimes to the state")

kave141

They're looking to make bets with people who disagree. Could be a good opportunity to get some expected dollars

7Rafael Harth
Thanks. I've submitted my own post on the 'change our mind form', though I'm not expecting a bounty. I'd instead be interested in making a much bigger bet (bigger than Cole's 100 USD), gonna think about what resolution criterion is best.
8Cole Wyeth
Sure, I’ll keep it simple (will submit through proper channels later): Here’s my attempt to change their minds: https://www.lesswrong.com/posts/vvgND6aLjuDR6QzDF/my-model-of-what-is-going-on-with-llms I’ll bet 100 USD that by 2027 AI agents have not replaced human AI engineers. If it’s hard to decide I’ll pay 50 USD. 
kave40

Have you ever noticed a verbal tic and tried to remove it? For example, maybe you've noticed you say "tubular" a lot, and you'd like to stop. Or you've decided to use "whom" 'correctly'.

When I try and change my speech, I notice that by default my "don't do that" triggers run on other people's speech as well as my own, and I have to make a conscious refinement to separate them.

kave20

I'm inclined to agree, but at least this is an improvement over it only living in Habryka's head. It may be that this + moderation is basically sufficient, as people seem to have mostly caught on to the intended patterns.

kave*160

I spent some time Thursday morning arguing with Habryka about the intended use of react downvotes. I think I now have a fairly compact summary of his position.

PSA: When to upvote and downvote a react

Upvote a react when you think it's helpful to the conversation (or at least, not antihelpful) and you agree with it. Imagine a react were a comment. If you would agree-upvote it and not karma-downvote it, you can upvote the react.

Downvote a react when you think it's unhelpful for the conversation. This might be because you think the react isn't being used for i... (read more)

It's not really feasible for the feature to rely on people reading this PSA to work well. The correct usage needs to be obvious.

Elizabeth133

follow up: if you would disagree-vote with a react but not karma downvote, you can use the opposite react. 

kave*Ω12232

You claim (and I agree) that option control will probably not be viable at extreme intelligence levels. But I also notice that when you list ways that AI systems help with alignment, all but one (maybe two), as I count it, are option control interventions.

evaluating AI outputs during training, labeling neurons in the context of mechanistic interpretability, monitoring AI chains of thought for reward-hacking behaviors, identifying which transcripts in an experiment contain alignment-faking behaviors, classifying problematic inputs and outputs for the purpos

... (read more)
kave*Ω340

I do not think your post is arguing for creating warning shots. I understand it to be advocating for not averting warning shots.

To extend your analogy, there are several houses that are built close to a river, and you think that a flood is coming that will destroy them. You are worried that if you build a dam that would protect the houses currently there, then more people will build by the river and their houses will be flooded by even bigger floods in the future. Because you are worried people will behave in this bad-for-them way, you choose not to help t... (read more)

kaveΩ465

I expect moderately sized warning shots to increase the chances humanity as a whole takes serious actions and, for example, steps up efforts to align the frontier labs.

It seems naïvely evil to knowingly let the world walk into a medium-sized catastrophe. To be clear, I think that sometimes it is probably evil to stop the world from walking into a catastrophe, if you think that increases the risk of bad things like extinctions. But I think the prior of not diagonalising against others (and of not giving yourself rope with which to trick yourself) is strong.

3Jan_Kulveit
The quote is somewhat out of context. Imagine a river with some distribution of flood sizes. Imagine this proposed improvement: a dam which is able to contain 1-year, 5-year and 10-year floods. It is too small for 50-year floods or larger, and may even burst and make the flood worse. I think such a device is not an improvement, and may make things much worse: because of the perceived safety, people may build houses close to the river, and when the large flood hits, the damages could be larger. I have a hard time parsing what you want to say relative to my post. I'm not advocating for people to deliberately create warning shots.
kave114

there's evidence about bacteria manipulating weather for this purpose

Sorry, what?

gwern180

Ice-nucleating bacteria: https://www.nature.com/articles/ismej2017124 https://www.sciencefocus.com/planet-earth/bacteria-controls-the-weather

If you can secrete the right things, you can potentially cause rain/snow inside clouds. You can see why that might be useful to bacteria swept up into the air: the air may be a fine place to go temporarily, and to go somewhere, but like a balloon or airplane, you do want to come down safely at some point, usually somewhere else, and preferably before the passengers have begun to resort to cannibalism. So given that ev... (read more)

kaveΩ342

I think you train Claude 3.7 to imitate the paraphrased scratchpad, but I'm a little unsure because you say "distill". Just checking that Claude 3.7 still produces CoT (in the style of the paraphrase) after training, rather than being trained to perform the paraphrased-CoT reasoning in one step?

4Fabien Roger
By distillation, I mean training to imitate. So in the distill-from-paraphrased setting, the only model involved at evaluation time is the base model fine-tuned on paraphrased scratchpads, and it generates an answer from beginning to end.
kave*93

It's been a long time since I looked at virtual comments, as we never actually merged them in. IIRC, none were great, but sometimes they were interesting (in a "bring your own thinking" kind of way).

They were implemented as a Turing test, where mods would have to guess which was the real comment from a high karma user. If they'd been merged in, it would have been interesting to see the stats on guessability.

kave3-1

Could exciting biotech progress lessen the societal pressure to make AGI?

Suppose we reach a temporary AI development pause. We don't know how long the pause will last; we don't have a certain end date nor is it guaranteed to continue. Is it politically easier for that pause to continue if other domains are having transformative impacts?

I've mostly thought this is wishful thinking. Most people don't care about transformative tech; the absence of an alternative path to a good singularity isn't the main driver of societal AI progress.

But I've updated some her... (read more)

kave71

I think your comment is supposed to be an outside view argument that tempers the gears-level argument in the post. Maybe we could think of it as providing a base-rate prior for the gears-level argument in the post. Is that roughly right? I'm not sure how much I buy into this kind of argument, but I also have some complaints by the outside view's lights.

First, let me quickly recap your argument as I understand it.

R&D increases welfare by allowing an increase in consumption. We'll assume that our growth in consumption is driven, in some fraction, by R&

... (read more)
1Vasco Grilo
Thanks for the great summary, Kave! Nitpick. SWP received 1.82 M 2023-$ (= 1.47*10^6*1.24) during the year ended on 31 March 2024, which is 1.72*10^-8 (= 1.82*10^6/(106*10^12)) of the gross world product (GWP) in 2023, and OP estimated R&D has a benefit-to-cost ratio of 45. So I estimate SWP can only be up to 1.29 M (= 1/(1.72*10^-8)/45) times as cost-effective as R&D due to this increasing SWP’s funding. Fair points, although I do not see how they would be sufficiently strong to overcome the large baseline difference between SWP and general R&D. I do not think reducing the nearterm risk of human extinction is astronomically cost-effective, and I am sceptical of longterm effects.
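A quick Python sanity check of the figures in the reply above; all inputs are the values quoted there.

```python
# Reproducing the cost-effectiveness bound from the comment above.
swp_gbp = 1.47e6        # SWP receipts, year ended 31 March 2024
usd_per_gbp = 1.24      # conversion factor used in the comment
gwp_2023 = 106e12       # gross world product in 2023, USD
rd_benefit_cost = 45    # OP's benefit-to-cost ratio for R&D

swp_usd = swp_gbp * usd_per_gbp                     # ~1.82 M 2023-$
share_of_gwp = swp_usd / gwp_2023                   # ~1.72e-8
max_multiple = 1 / share_of_gwp / rd_benefit_cost   # ~1.29 M
print(f"{swp_usd:.3g} $, {share_of_gwp:.3g} of GWP, {max_multiple:.3g}x bound")
```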
kave20

From population mean or from parent mean?

4GeneSmith
Population mean
kave90

Curated. Genetically enhanced humans are my best guess for how we achieve existential safety. (Depending on timelines, they may require a coordinated slowdown to work). This post is a pretty readable introduction to a bunch of the why and how and what still needs to be done.

I think this post is maybe slightly too focused on "how to genetically edit for superbabies" to fully deserve its title. I hope we get a treatment of more selection-based methods sometime soon.

GeneSmith mentioned the high-quality discussion as a reason to post here, and I'm glad we're a... (read more)

5GeneSmith
Yes, the two other approaches not really talked about in this thread that could also lead to superbabies are iterated meiotic selection and genome synthesis. Both have advantages over editing (you don't need to have such precise knowledge of causal alleles with iterated meiotic selection or with genome synthesis), but my impression is they're both further off than an editing approach. I'd like to write more about both in the future.
kave*30

My understanding when I last looked into it was that the efficient updating of the NNUE basically doesn't matter, and what really matters for its performance and CPU-runnability is its small size.

kave22

I'm not aware of a currently published protocol; sorry for confusing phrasing!

kave70

There are various technologies that might let you make many more egg cells than are possible to retrieve from an IVF cycle. For example, you might be able to mature oocytes from an ovarian biopsy, or you might be able to turn skin cells into eggs.

2JenniferRM
Wait, what? I know Aldous Huxley is famous for writing a scifi novel in 1931 titled "Don't Build A Method For Simulating Ovary Tissue Outside The Body To Harvest Eggs And Grow Clone Workers On Demand In Jars" but I thought that his warning had been taken very very seriously. Are you telling me that science has stopped refusing to do this, and there is now a protocol published somewhere outlining "A Method For Simulating Ovary Tissue Outside The Body To Harvest Eggs"???
kave233

Copying over Eliezer's top 3 most important projects from a tweet:

1.  Avert all creation of superintelligence in the near and medium term.

2.  Augment adult human intelligence.

3.  Build superbabies.

kave40

Looks like the base url is supposed to be niplav.site. I'll change that now (FYI @niplav)

kaveΩ682

I think TLW's criticism is important, and I don't think your responses are sufficient. I also think the original example is confusing; I've met several people who, after reading OP, seemed to me confused about how engineers could use the concept of mutual information.

Here is my attempt to expand your argument.

We're trying to design some secure electronic equipment. We want the internal state and some of the outputs to be secret. Maybe we want all of the outputs to be secret, but we've given up on that (for example, radio shielding might not be practical or... (read more)
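To gesture at how mutual information gives a guarantee here, a toy sketch: model the leak as a binary symmetric channel between a uniform secret bit and the adversary's observation. Zero mutual information means no adversary, however clever, can learn the secret from that output. The channel model and function names are my illustrative assumptions, not from the original discussion.

```python
# Mutual information between a uniform secret bit S and a noisy observation O,
# where O flips S with probability eps (binary symmetric channel):
# I(S;O) = 1 - H(eps) bits.
from math import log2

def binary_entropy(p: float) -> float:
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def leaked_bits(eps: float) -> float:
    return 1 - binary_entropy(eps)

for eps in (0.0, 0.1, 0.4, 0.5):
    print(f"flip probability {eps}: {leaked_bits(eps):.4f} bits leaked")
# eps = 0.5 gives 0 bits: the emission is provably useless to any adversary.
```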

7johnswentworth
I think that's basically right, and good job explaining it clearly and compactly. I would also highlight that it's not just about adversaries. One of the main powers of proof-given-assumptions is that it allows us to rule out large classes of unknown unknowns in one go. And, insofar as the things-proven-given-assumptions turn out to be false, it allows us to detect previously-unknown unknowns.
kave20

With LLMs, we might be able to aggregate more qualitative anonymous feedback.

kave20

The general rule is roughly "if you write a frontpage post which has an announcement at the end, that can be frontpaged". So for example, if you wrote a post about the vision for Online Learning, that included as a relatively small part the course announcement, that would probably work.

By the way, posts are all personal until mods process them, usually around twice a day. So that's another reason you might sometimes see posts landing on personal for a while.

2Alex Flint
Got it! Good to know.
kave20

Mod note: this post is personal rather than frontpage because event/course/workshop/org... announcements are generally personal, even if the content of the course, say, is pretty clearly relevant to the frontpage (as in this case)

2Alex Flint
Thanks! We were wondering about that. Is there any way we could be changed to the frontpage category?
kave40

I believe it includes some older donations:

  1. Our Manifund application's donations, including donations going back to mid-May, totalling about $50k
  2. A couple of older individual donations, in October/early Nov, totalling almost $200k
kave*40

Mod note: I've put this on Personal rather than Frontpage. I imagine the content of these talks will be frontpage content, but event announcements in general are not.

kave70

neural networks routinely generalize to goals that are totally different from what the trainers wanted

I think this is slightly a non sequitur. I take Tom to be saying "AIs will care about stuff that is natural to express in human concept-language" and your evidence to be primarily about "AIs will care about what we tell it to", though I could imagine there being some overflow evidence into Tom's proposition.

I do think the limited success of interpretability is an example of evidence against Tom's proposition. For example, I think there's lots of work where... (read more)

8eggsyntax
Thanks, that's a totally reasonable critique. I kind of shifted from one to the other over the course of that paragraph.  Something I believe, but failed to say, is that we should not expect those misgeneralized goals to be particularly human-legible. In the simple environments given in the goal misgeneralization spreadsheet, researchers can usually figure out eventually what the internalized goal was and express it in human terms (eg 'identify rulers' rather than 'identify tumors'), but I would expect that to be less and less true as systems get more complex. That said, I'm not aware of any strong evidence for that claim, it's just my intuition. I'll edit slightly to try to make that point more clear.
kave551

I dug up my old notes on this book review. Here they are:

So, I've just spent some time going through the World Bank documents on its interventions in Lesotho. The Anti-Politics Machine is not doing great on epistemic checking

  • There is no recorded Thaba-Tseka Development Project, despite the period in which it should have taken place being covered
  • There is a Thaba-Bosiu development project (parts 1 and 2) taking place at the correct time.
    • Thaba-Bosiu and Thaba-Tseka are both regions of Lesotho
    • The spec doc for Thaba-Bosiu Part 2 references the alleged problems
... (read more)
8Benquo
Wow, thanks for doing the legwork on this - seems like quite possibly I'm analyzing fiction? Annoying if true. Google's AI response to my search for the Thaba-Tseka Development Project says: There's a good chance this is an AI hallucination, though; a cursory search of the main documents didn't yield any references to a "Thaba-Tseka development project," or the wood or ponies. I'm not familiar with World Bank documentation, though, and likely the right followup would involve looking at exactly what's cited in the book. However, the other lead funder, the Canadian International Development Agency, does seem to have at least one publicly referenced document about a "Thaba-Tseka rural development program": Evaluation, the Kingdom of Lesotho rural development : evaluation design for phase 1, the Thaba Tseka project

I think 2023 was perhaps the peak for discussing the idea that neural networks have surprisingly simple representations of human concepts. This was the year of Steering GPT-2-XL by adding an activation vector, cheese vectors, and the slightly weird lie detection paper, and was just after Contrast-consistent search.

This is a pretty exciting idea, because if it's easy to find human concepts we want (or don't want) networks to possess, then we can maybe use that to increase the chance of systems that are honest, kind, loving (and can ask them... (read more)
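For readers who haven't seen the activation-addition technique named above, here is a minimal sketch using Hugging Face transformers and GPT-2. The layer index, steering coefficient, and "Love"/"Hate" contrast pair are illustrative choices on my part, not tuned values from the paper.

```python
# Minimal sketch of steering GPT-2 by adding an activation vector.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
LAYER = 6  # which block's output to steer (illustrative choice)

@torch.no_grad()
def resid(prompt: str) -> torch.Tensor:
    ids = tok(prompt, return_tensors="pt").input_ids
    hs = model(ids, output_hidden_states=True).hidden_states
    return hs[LAYER + 1][0, -1]  # block LAYER's output at the final token

steer = resid("Love") - resid("Hate")  # contrast vector between two prompts

def add_vector(module, args, output):
    output[0].add_(4.0 * steer)  # shift the residual stream at every position
    return output

handle = model.transformer.h[LAYER].register_forward_hook(add_vector)
ids = tok("I think you are", return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=20, do_sample=False)
print(tok.decode(out[0]))
handle.remove()  # restore the unsteered model
```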
