LESSWRONG

All of davekasten's Comments + Replies

Ok, so it seems clear that we are, for better or worse, likely going to try to get AGI to do our alignment homework.

Who has thought through all the other homework we might give AGI that is as good an idea, assuming a model that isn't an instant-game-over for us? E.g., I remember @Buck rattling off a list of other ideas that he had in his The Curve talk, but I feel like I haven't seen the list of, e.g., "here are all the ways I would like to run an automated counterintelligence sweep of my organization" ideas.

(Yes, obviously, if the AI is sne...

2Quinn3d

I'm working on making sure we get high quality critical systems software out of early AGI. Hardened infrastructure buys us a lot in the slightly crazy story of "self-exfiltrated model attacks the power grid", but buys us even more in less crazy stories about all the software modules adjacent to AGI having vulnerabilities rapidly patched at crunchtime.

3Ebenezer Dukakis7d

I think unlearning could be a good fit for automated alignment research. Unlearning could be a very general tool to address a lot of AI threat models. It might be possible to unlearn deception, scheming, manipulation of humans, cybersecurity, etc. I challenge you to come up with an AI safety failure story that can't, in principle, be countered through targeted unlearning in some way, shape, or form. Relative to some other kinds of alignment research, unlearning seems easy to automate, since you can optimize metrics for how well things have been unlearned. I like this post.

4Thane Ruthenis7d

Technology for efficient human uploading. Ideally backed by theory we can independently verify as correct and doing what it's intended to do (rather than, e.g., replacing the human upload with a copy of the AGI that developed this technology).

5trevor7d

How to build a lie detector app/program to release to the public (preferably packaged with advice/ideas on ways to use it and strategies for marketing the app, e.g. packaging it with an animal body-language to English translator).

1yams7d

Preliminary thoughts from Ryan Greenblatt on this here.

Buck7d142

@ryan_greenblatt is working on a list of alignment research applications. For control applications, you might enjoy the long list of control techniques in our original post.

nikola's Shortform

davekasten10d21

Huh? "fighting election misinformation" is not a sentence on this page as far as I can tell. And if you click through to the election page, you will see that the elections content is them praising a bipartisan bill backed by some of the biggest pro-Trump senators.

-3ChristianKl9d

You are right, the wording is even worse. It says "Partnering with governments to fight misinformation globally". That would be more than just "election misinformation". I just tested that ChatGPT is willing to answer "Tell me about the latest announcement of the trump administration about cutting USAID funding?" while Gemini isn't willing to answer that question, so in practice their policy isn't as bad as Gemini's. It still sounds different from what Elon Musk advocates as "truth aligned"-AI. Lobbyists should be able to use AI to inform themselves about proposed laws. If you asked David Sacks, as the person who coordinates AI policy, I'm very certain that he would support Elon Musk's idea that AI should help people learn the truth about political questions. If they wanted to appeal to the current administration, they could say something about the importance of AI giving truthful information and not misleading the user, instead of speaking about "fighting misinformation".

-1Maxwell Peterson10d

The Elections panel on OP’s image says “combat disinformation”, so while you’re technically right, I think Christian’s “fighting election misinformation” rephrasing is close enough to make no difference.

nikola's Shortform

davekasten12d81

Without commenting on any strategic astronomy and neurology, it is worth noting that "bias", at least, is a major concern of the new administration (e.g., the Republican chair of the House Financial Services Committee is actually extremely worried about algorithmic bias being used for housing and financial discrimination and has given speeches about this).

nikola's Shortform

davekasten13d71

I am not a fan, but it is worth noting that these are the issues that many politicians bring up already, if they're unfamiliar with the more catastrophic risks. The only one missing there is job loss. So while this choice by OpenAI sucks, it sort of usefully represents a social fact about the policy waters they swim in.

3ChristianKl10d

The page does not seem to be directed at what's politically advantageous. The Trump administration, which is fighting DEI, does not look favorably on the mission to prevent AI from reinforcing stereotypes, even if those stereotypes are true. "Fighting election misinformation" is similarly a keyword that likely invites skepticism from the Trump administration. They just shut down USAID, and its investment in "combating misinformation" is one of the reasons for that. It seems more likely that they hired a bunch of woke and deep state people into their safety team and this reflects the priorities of those people.

7aogara12d

I’m surprised they list bias and disinformation, as I doubt those concerns will be popular with the new administration. (Maybe this is a galaxy-brained attempt to make AI safety seem left-coded, but I doubt it. It seems more likely that x-risk-focused people left the company while traditional AI ethics people stuck around and rewrote the website.)