All of Peter Wildeford's Comments + Replies

What do you think of the counterargument that OpenAI announced o3 in December and publicly solicited external safety testing then, and isn't deploying until ~4 months later?

4Zach Stein-Perlman
I don't know. I don't have a good explanation for why OpenAI hasn't released o3. Delaying to do lots of risk assessment would be confusing because they did little risk assessment for other models.

Here's my summary of the recommendations:

  • National security testing
    • Develop robust government capabilities to evaluate AI models (foreign and domestic) for security risks
    • Once ASL-3 is reached, government should mandate pre-deployment testing
    • Preserve the AI Safety Institute in the Department of Commerce to advance third-party testing
    • Direct NIST to develop comprehensive national security evaluations in partnership with frontier AI developers
    • Build classified and unclassified computing infrastructure for testing powerful AI systems
    • Assemble interdisciplinary team
... (read more)

If you've liked my writing in the past, I wanted to share that I've started a Substack: https://peterwildeford.substack.com/

Ever wanted a top forecaster to help you navigate the news? Want to know the latest in AI? I'm doing all that in my Substack -- forecast-driven analysis about AI, national security, innovation, and emerging technology!

5Chris_Leong
Thanks, this seems pretty good on a quick skim. I'm a bit less certain about the corrigibility section, and more issues might become apparent if I read through it more slowly.

My current working take is that it is at the level of a median-but-dedicated undergraduate at a top university who is interested and enthusiastic about AI safety. But Deep Research can do in 10 minutes what would take that undergraduate about 20 hours.

Happy to try a prompt for you and see what you think.

How about "Please summarise Eliezer Yudkowsky's views on decision theory and its relevance to the alignment problem".

This was very helpful - thank you for putting it together!

1Valentin Baltadzhiev
Glad to hear that!

I'd also add that historically I believe about two-thirds of LTFF's money has come from OpenPhil, so LTFF doesn't represent a fully independent funder (though the decision-making around grants is pretty independent).

This is a great poll and YouGov is a highly reputable pollster, but there is a significant caveat to note about the pause finding.

The way the question is framed provides information about "1000 technology leaders" who have signed a letter in favor of the pause but does not mention any opposition to the pause. I think this would push respondents to favor the pause. Ideal question construction would present the issue more neutrally, with both support and oppose statements.

3Lech Mazur
YouGov's answer to these concerns: https://today.yougov.com/topics/technology/articles-reports/2023/04/14/ai-nuclear-weapons-world-war-humanity-poll "Even with all those changes, results on concern over AI's potential to end humanity were almost identical to the first poll: 18% of U.S. adult citizens are very concerned and 28% are somewhat concerned about AI ending the human race; 10% think it's impossible. (Another poll asking the same question, conducted by Conjointly, got similar results.)"
1Ulisse Mini
Strong upvoted and agreed. I don't think the public has opinions on AI X-Risk yet, so any attempt to elicit them will entirely depend on framing.

Good to know you won't be out of a job just yet!

I asked GPT4 the same prompt and here is what it said:

Of course! Here's a new contra dance for you:

Contra Dance
Whirlwind Delight
By ChatGPT
(Duple improper)

A1
(8) Circle left 3/4
(8) Neighbor swing

A2
(8) Long lines forward and back
(8) Ladies chain

B1
(8) Star right 1x
(8) Gents aleman left 1.5

B2
(8) Half hey, ladies pass right shoulders to start
(8) Partner balance and swing, end facing down in a line of four

Enjoy dancing Whirlwind Delight!

How did it do?

4jefftk
1: Yes, all the figures are real
2: Yes, though it's the same "basically everything is eight counts" that ChatGPT did
3: Not quite: "Partner balance and swing" in the B2 should be 16 beats
4: No. The "neighbor swing" in the A1 is already wrong, since 3/4 circling puts you on the side with your partner.
5: Very much no. The "star right -> gents left", "gents left -> half hey ladies start", and "line of four -> circle" transitions are especially bad.

If we want to know what arguments resonate with readers of New York Times articles, we can actually use surveys, message testing, and focus groups to check, and we don't need to guess! (Disclaimer: My company sells these services.)

4JakubK
Does RP have any results to share from these studies? What arguments seem to resonate with various groups?

Cool - I'll follow up when I'm back at work.

That makes a lot of sense. We can definitely test a lot of different framings. I think the problem with a lot of these kinds of questions is that they are low-saliency, so people tend not to have pre-existing opinions and instead generate an opinion on the spot. We have a lot of experience polling on low-saliency issues, though, because we've done a lot of polling on animal farming policy, which has similar framing effects.

8habryka
I would definitely vote in favor of a grant to do this on the LTFF, as well as the SFF, and might even be interested in backstopping it with my personal funds or Lightcone funds.
6Stefan_Schubert
I think that's exactly right.

I'll shill here and say that Rethink Priorities is pretty good at running polls of the electorate if anyone wants to know what a representative sample of Americans think about a particular issue such as this one. No need to poll Uber drivers or Twitter when you can do the real thing!

7Stefan_Schubert
I think that could be valuable. It might be worth testing quite carefully for robustness - to ask multiple different questions probing the same issue, and see whether responses converge. My sense is that people's stated opinions about risks from artificial intelligence, and existential risks more generally, could vary substantially depending on framing. Most haven't thought a lot about these issues, which likely contributes. I think a problem with some studies on these issues is that researchers over-generalise from highly framing-dependent survey responses.

I'd very much like to see this done with standard high-quality polling techniques, e.g. while airing counterarguments (like support for expensive programs that looks like a majority but collapses if higher taxes to pay for them are mentioned). In particular, how the public would react given different views coming from computer scientists/government commissions/panels.

Yeah, it came from a lawyer. The point being that if you confess to something bad, we may be legally required to report that, so be careful.

Feel free to skip questions if you feel they aren't applicable to you.

Does the chance evolution got really lucky cancel out with the chance that evolution got really unlucky? So maybe this doesn't change the mean but does increase the variance? As for how much to increase the variance, maybe something like an additional +/-1 OOM tacked on to the existing evolution anchor?

I'm kinda thinking there's like a 10% chance you'd have to increase it by 10x and a 10% chance you'd have to decrease it by 10x. But maybe I'm not thinking about this right?
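To make the arithmetic concrete, here's a minimal sketch (my own illustration, not from the original exchange) of how a 10% chance of a +1 OOM adjustment and a 10% chance of a -1 OOM adjustment leaves the mean of a log-space anchor roughly unchanged while widening its spread; the anchor's location and scale below are placeholder values, not estimates from the report.

```python
# Rough sketch: treat the evolution anchor as a distribution over log10(FLOP)
# and apply a symmetric 10%/10% chance of shifting it by +/- 1 OOM.
# The location (41) and scale (1.0) are placeholders, not sourced values.
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(loc=41.0, scale=1.0, size=100_000)            # hypothetical anchor
shift = rng.choice([-1.0, 0.0, 1.0], p=[0.1, 0.8, 0.1], size=base.size)
adjusted = base + shift

print(f"mean log10(FLOP): {base.mean():.2f} -> {adjusted.mean():.2f}")  # ~unchanged
print(f"stdev in OOMs:    {base.std():.2f} -> {adjusted.std():.2f}")    # widens
```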

2Ege Erdil
The problem with the "evolution got really unlucky" assumption is the Fermi paradox. It seems like to resolve the Fermi paradox we basically have to assume that evolution got really lucky at least at some point if we assume the entire Great Filter is already behind us. Of course in principle it's possible all of this luck was concentrated in an early step like abiogenesis which AI capabilities research has already achieved the equivalent of, and there was no special luck that was needed after that. The important question seems to be whether we're already past "the Great Filter" in what makes intelligence difficult to evolve naturally or not. If the difficulty is concentrated in earlier steps then we're likely already past it and it won't pose a problem, but e.g. if the apes -> humans transition was particularly difficult then it means building AGI might take far more compute than we'll have at our disposal, or at least that evolutionary arguments cannot put a good bound on how much compute it would take. The counterargument I give is that Hanson's model implies that if the apes -> humans transition was particularly hard then the number of hard steps in evolution has to be on the order of 100, and that seems inconsistent with both details of evolutionary history (such as how long it took to get multicellular life from unicellular life, for example) and what we think we know about Earth's remaining habitability lifespan. So the number of hard steps was probably small and that is inconsistent with the apes -> humans transition being a hard step.

There are a lot of different ways you can talk about "efficiency" here. The main thing I am thinking about with regard to the key question "how much FLOP would we expect transformative AI to require?" is whether, when using a neural net anchor (not evolution), to add a 1-3 OOM penalty to FLOP needs due to 2022 AI systems being less sample-efficient than humans (requiring more data to produce the same capabilities), with this penalty decreasing over time given expected algorithmic progress. The next question would be how much more efficient potential AI (... (read more)
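As a rough illustration of that last point (my own sketch; the anchor value, penalty size, and halving time are all placeholder assumptions, not figures from the comment), a sample-efficiency penalty that shrinks with algorithmic progress would modify a neural-net-anchor estimate like this:

```python
# Rough sketch: a neural-net-anchor FLOP estimate with a sample-efficiency
# penalty (in OOMs) that decays as algorithmic progress accumulates.
# All numbers below are hypothetical placeholders.
base_log10_flop = 35.0       # hypothetical neural-net anchor, log10(FLOP)
initial_penalty_ooms = 2.0   # somewhere in the suggested 1-3 OOM range
halving_time_years = 3.0     # assumed pace of algorithmic progress

for year in (2022, 2030, 2040):
    penalty = initial_penalty_ooms * 0.5 ** ((year - 2022) / halving_time_years)
    print(f"{year}: ~10^{base_log10_flop + penalty:.1f} FLOP (penalty {penalty:.2f} OOMs)")
```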

2Jesse Hoogland
To me this isn't clear. Yes, we're better one-shot learners, but I'd say the most likely explanation is that the human training set is larger and that much of that training set is hidden away in our evolutionary past. It's one thing to estimate evolution FLOP (and as Nuño points out, even that is questionable). It strikes me as much more difficult (and even more dubious) to estimate the "number of samples" or "total training signal (bytes)" over one's lifetime / evolution.

Yeah ok 80%. I also do concede this is a very trivial thing, not like some "gotcha look at what stupid LMs can't do no AGI until 2400".

This is admittedly pretty trivial but I am 90% sure that if you prompt GPT4 with "Q: What is today's date?" it will not answer correctly. I think something like this would literally be the least impressive thing that GPT4 won't be able to do.

2Matt Goldenberg
AFAICT OpenAI now includes the current date in the prompt, so I think this is wrong
9gwern
Are you really 90% sure on that? For example, LaMDA apparently has live web query access (a direction OA was also exploring with WebGPT), and could easily recognize that as a factual query worth a web query, and if you search Google for "what is today's date?" it will of course spit back "Monday, August 22, 2022", which even the stupidest LMs could make good use of. So your prediction would appear to boil down to "OA won't do an obviously useful thing they already half-did and a competitor did do a year ago".

Is it ironic that the link to "All the posts I will never write" goes to a 404 page?

3adamShimi
It's a charitable (and hilarious) interpretation. What actually happened is that he drafted it by mistake instead of just editing it to add stuff. It should be fine now.
5Owain_Evans
We didn't try but I would guess that finetuning on simple math questions wouldn't help with Metaculus forecasting. The focus of our paper is more "express your own uncertainty using natural language" and less "get better at judgmental forecasting". (Though some of the ideas in the paper might be useful in the forecasting domain.)

This sounds like something that could be done by an organization creating a job for it, which could help with mentorship/connections/motivation/job security relative to expecting people to apply to EAIF/LTFF.

My organization (Rethink Priorities) is currently hiring for research assistants and research fellows (among other roles) and some of their responsibilities will include distillation.

These conversations are great and I really admire the transparency. It's really nice to see discussions that normally happen in private happen instead in public where everyone can reflect, give feedback, and improve their own thoughts. On the other hand, the combined conversations add up to a decent-sized novel - LW says 198,846 words! Is anyone considering investing heavily in summarizing the content so people can get involved without having to read all of it?

6Daniel Kokotajlo
Here is a heavily condensed summary of the takeoff speeds thread of the conversation, incorporating earlier points made by Hanson, Grace, etc. https://objection.lol/objection/3262835 :) (kudos to Ben Goldhaber for pointing me to it)

Echoing that I loved these conversations and I'm super grateful to everyone who participated — especially Richard, Paul, Eliezer, Nate, Ajeya, Carl, Rohin, and Jaan, who contributed a lot.

I don't plan to try to summarize the discussions or distill key take-aways myself (other than the extremely cursory job I did on https://intelligence.org/late-2021-miri-conversations/), but I'm very keen on seeing others attempt that, especially as part of a process to figure out their own models and do some evaluative work.

I think I'd rather see partial summaries/respons... (read more)

7Ben Pace
I chatted briefly the other day with Rob Bensinger about me turning them into a little book. My guess is I'd want to do something to compress especially the long Paul/Eliezer bet hashing-out, which felt super long to me and not all worth reading. Interested in other suggestions for compression. (This is not a commitment to do this, I probably won't.)

I don't recall the specific claim, just that EY's probability mass for the claim was in the 95-99% range. The person argued that because EY disagrees with some other thoughtful people on that question, he shouldn't have such confidence.


I think people conflate the very reasonable "I am not going to adopt your 95-99% range because other thoughtful people disagree and I have no particular reason to trust you massively more than I trust other people" with the different "the fact that other thoughtful people disagree means there's no way you could arrive at 95-99% confidence", which is false. I think thoughtful people disagreeing with you is decent evidence you are wrong, but it can still be outweighed.

I will be on the lookout for false alarms.

I can see whether the site is down or not. Seems pretty clear.

1Forged Invariant
Just be aware that other users have already noticed messages which could be deliberate false alarms: https://www.lesswrong.com/posts/EW8yZYcu3Kff2qShS/petrov-day-2021-mutually-assured-destruction?commentId=JbsutYRotfPDLNskK
1MichaelStJules
I don't think you'll be able to retaliate if the site is down.
1[comment deleted]

Attention LessWrong - I am a chosen user of EA Forum and I have the codes needed to destroy LessWrong. I hereby make a no first use pledge and I will not enter my codes for any reason, even if asked to do so. I also hereby pledge to second strike - if the EA Forum is taken down, I will retaliate.

Regarding your second strike pledge: it would of course be wildly disingenuous to remember Petrov's action, which was not jumping to retaliation, by doing the opposite and jumping to retaliation.

I believe you know this, and would guess that if in fact one of the sites went down, you'd do nothing but instead later post about your moral choice of not retaliating.

(I'd also guess, if you choose to respond to this comment, it'd be to reiterate the pledge to retaliate, as you've done elsewhere. This does make sense--threats must be unequivocal to be believed, e... (read more)

4Neel Nanda
Mutual Assured Destruction just isn't the same when you can see for sure whether you were nuked

Seems like "the right prompt" is doing a lot of work here. How do we know if we have given it "the right prompt"?

Do you think GPT-4 could do my taxes?

1Michaël Trazzi
re right prompt: GPT-3 has a context window of 2048 tokens, so this limits quite a lot what it can do. Also, it's not accurate at two-digit multiplication (what you would at least need to multiply your $ to %), even worse at 5-digit. So in this case, we're sure it can't do your taxes. And in the more general case, gwern wrote some debugging steps to check if the problem is GPT-3 or your prompt. Now, for GPT-4, given they keep scaling the same way, it won't be possible to have accurate enough digit multiplication (like 4-5 digits, cf. this thread), but with three more scalings it should do it. The prompt would be "here are a few examples of how to do tax multiplication and addition given my format, so please output the result in this format", and concatenate those two. I'm happy to bet $1 1:1 on GPT-7 doing tax multiplication to 90% accuracy (given only integer precision).

1.) I think the core problem is that honestly no one (except 80K) has actually invested significant effort in growing the EA community since 2015 (especially compared to the pre-2015 effort, and especially as a percentage of total EA resources)

2.) Some of these examples are suspect. The GiveWell numbers definitely look to be increasing beyond 2015, especially when OpenPhil's understandably constant fundraising is removed - and this increase in GiveWell seems to line up with GiveWell's increased investment in their outreach. The OpenPhil numbers also look ... (read more)

FWIW I put together "Is EA Growing? EA Growth Metrics for 2018" and I'm looking forward to doing 2019+2020 soon

1AppliedDivinityStudies
This is great, thanks! Wish I had seen this earlier.

Mr. Money Mustache has a lot of really good advice that I get a lot of value from. However, I think Mr. Money Mustache underestimates the ease and impact of opportunities to grow income relative to cutting spending - especially if you're in (or can be in) a high-earning field like tech. Doubling your income will put you on a much faster path than cutting your spending a further 5%.

2Adam Zerner
Yeah, that makes a lot of sense.
Answer by Peter Wildeford40

PredictionBook is really great for lightweight, private predictions and does everything you're looking for. Metaculus is great for more fully-featured predicting and I believe also supports private questions, but may be a bit of overkill for your use case. A spreadsheet also seems more than sufficient, as others have mentioned.
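For anyone going the spreadsheet route, here's a minimal sketch (my own illustration; the file layout, column names, and example questions are assumptions, not a recommendation of any particular tool) of a prediction log scored with a Brier score:

```python
# Minimal sketch of the "just use a spreadsheet" approach: a tiny CSV of
# predictions with probabilities and resolved outcomes, scored by Brier score.
# The questions, columns, and numbers are all hypothetical examples.
import csv
import io

log = io.StringIO(
    "question,probability,outcome\n"
    "Project ships by March,0.7,1\n"
    "Paper accepted,0.4,0\n"
    "Team grows to 10 people,0.9,1\n"
)

rows = list(csv.DictReader(log))
brier = sum((float(r["probability"]) - int(r["outcome"])) ** 2 for r in rows) / len(rows)
print(f"Brier score over {len(rows)} resolved predictions: {brier:.3f}")  # lower is better
```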

Thanks. I'll definitely aim to produce them more quickly... this one got away from me.

My understanding is that we have also spent, and might in the future spend, a decent amount of time in a "level 2.5", where some but not all non-essential businesses are open (i.e., no groups larger than ten, restaurants are closed to dine-in, hair salons are open).

A binary search strategy still could be more efficient, depending on the ratio of positives to negatives.
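As a rough illustration (my own sketch, not from the thread) of how strongly the test count depends on the positive rate, here is a small simulation of adaptive binary-splitting pooled testing versus testing everyone individually; it assumes a perfectly accurate pooled test and ignores practical pool-size limits:

```python
# Rough sketch: adaptive binary-splitting group testing. Test a whole pool;
# if it's positive, split it in half and recurse. Counts total tests used.
# Assumes an idealized, perfectly accurate pooled test.
import random

def tests_needed(samples):
    if not samples:
        return 0
    if not any(samples):      # pool tests negative: one test clears everyone in it
        return 1
    if len(samples) == 1:     # a single positive sample
        return 1
    mid = len(samples) // 2
    return 1 + tests_needed(samples[:mid]) + tests_needed(samples[mid:])

random.seed(0)
for prevalence in (0.01, 0.05, 0.20):
    population = [random.random() < prevalence for _ in range(1024)]
    print(f"prevalence {prevalence:.0%}: {tests_needed(population)} pooled tests "
          f"vs 1024 individual tests")
```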

2Steven Byrnes
Don't forget there could be many positives per pool...

Not really an answer, but a statement and a question - I imagine this is literally the least neglected issue in the world right now. How much does that affect the calculus? How much should we defer to people with more domain expertise?

3Kenny
We should defer to people with more domain expertise exactly as much as we would normally do (all else being equal). Almost all of what's posted to and discussed on this site is 'non-original work' (or, at best, original derivative work). That's our comparative advantage! Interpreting and synthesizing others' work is what we do best and this single issue affects both every regular user and any potential visitor immensely. There's no reason why we can't continue to focus long-term on our current priorities – but the pandemic affects all of our abilities to do so and I don't think any of us can completely ignore this crisis.

It could also be on the list of pros, depending on how one uses LW.

Raemon170

I feel obligated to note that it will in fact only destroy the frontpage of LW, not the rest of the site.

Are you offering to take donations in exchange for pressing the button or not pressing the button?

1Ramiro P.
I thought he was being ambiguous on purpose, so as to maximize donations.
1William_S
I think the better version of this strategy would involve getting competing donations from both sides, using some weighting of total donations for/against pushing the button to set a probability of pressing the button, and tweaking the weighting of the donations such that you expect the probability of pressing the button will be low (because pressing the button threatens to lower the probability of future games of this kind; this is an iterated game rather than a one-shot).
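As a sketch of the mechanism being described (my own illustration; the weighting scheme, function name, and numbers are assumptions rather than William_S's actual proposal), competing donations could be turned into a launch probability like this:

```python
# Rough sketch: convert competing donations into a probability of pressing
# the button, with pro-launch donations discounted by a weight w so the
# expected launch probability stays low. All values are hypothetical.
def launch_probability(donations_for: float, donations_against: float,
                       w: float = 0.1) -> float:
    weighted_for = w * donations_for
    total = weighted_for + donations_against
    return weighted_for / total if total > 0 else 0.0

# Example: $500 for pressing vs $1,000 against, with pro-launch money
# weighted at 10%, gives roughly a 4.8% launch probability.
print(launch_probability(500, 1000))  # ~0.048
```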
7jefftk
I would give someone my launch codes in exchange for a sufficiently large counterfactual donation. I haven't thought seriously about how large it would need to be, because I don't expect someone to take me up on this, but if you're interested we can talk.

What happens if you don't check off everything for the day?

3VipulNaik
That's a normal part of life :). Anything that I decide to do on a future day, I'll copy/paste over there, but I usually won't delete the items from the checklist for the day where I didn't complete them (thereby creating a record of things I expected or hoped to do, but didn't). For instance, at https://github.com/vipulnaik/daily-updates/issues/54 I have two undone items.