All of tamgent's Comments + Replies

Not a textbook (more for a general audience) but The Alignment Problem by Brian Christian is a pretty good introduction that I reckon most people interested in this would get behind.

Do you have the transcript from this?

1Pablo Villalobos
Yes, it's in Spanish though. I can share it via DM.

I like it - interesting how much of it has to do with the specific vulnerabilities of humans, and how humans exploiting other humans' vulnerabilities was what enabled and exacerbated the situation.

Whilst we're sharing stories... I'll shamelessly promote one of my (very) short stories on human manipulation by AI. In this case the AI is being deliberate, at least in achieving its instrumental goals. https://docs.google.com/document/d/1Z1laGUEci9rf_aaDjQKS_IIOAn6D0VtAOZMSqZQlqVM/edit

1tamgent
There's also a romantic theme ;-)

Is it a coincidence that your handle is blaked? (It's a little similar to Blake) Just curious.

4blaked
Throwaway account specifically for this post, Blake is used as a verb here :) (or an adjective? past participle? not a native English speaker)

Ha! I meant the former, but I like your second interpretation too!

I like 'do the impossible - listen'.

2Valentine
Just curious: Do you mean "Do the impossible, which is to listen"? Or "Do the impossible, and then listen"? Or something else?

Recruitment - in my experience often a weeks-long process from start to finish, well-oiled and systematic, using all the tips on selection from the organizational behaviour handbook, often with feedback given too. By comparison, some tech companies can take several months to hire, with lots of ad hoc decision-making, no processes around biases or conflicts of interest, and no feedback.

Happy to give more examples if you want by DM.

I should say my sample size is tiny here - I know one gov dept in depth, one tech company in depth, and a handful of other tech companies and gov depts (not fully from the inside, just from talking with friends who work there, etc.).

What exactly is the trust problem you're referring to?

Is it that you think people are not as trusting as they should be, in general?

I also interpreted it this way and was confused for a while. I think your suggested title is clearer, Neel.

Thank you for writing this. On your section 'Obstruction doesn't need discernment' - see also this post that went up on LW a while back called The Regulatory Option: A response to near 0% survival odds. I thought it was an excellent post, and it didn't get anywhere near the attention it deserved, in my view.

I think the two camps are less orthogonal than your examples of privacy and compute reg portray. There's room for plenty of excellent policy interventions that both camps could work together to support. For instance, increasing regulatory requirements for transparency on algorithmic decision-making (and crucially, building capacity, both in regulators and in the market supporting them, to enforce this) is something that I think both camps would get behind (the xrisk one because it creates demand for interpretability and more, and the other because e.g. it's... (read more)

To build on the benefit you noted here:

  1. better citability (e.g. if somebody writes an ML paper to be published in ML venues, it gives more credibility to cite arXiv papers than Alignment Forum/LessWrong posts).

There are some areas of work where it's useful not to be implicitly communicating that you affiliate with a somewhat weird group like LW or AF folks, so that the content gets read at face value when you share it with folks coming from different subcultures and perspectives. I think it'd be hugely valuable for this collection of people who are sharing things.

This seems solvable and very much worth solving!

Agree.

Human values are very complex and most recommender systems don't even try to model them. Instead most of them optimise for things like 'engagement', which they claim is aligned with a user's 'revealed preference'. This notion of 'revealed preference' is a far cry from true preferences (which are very complex), let alone human values (which are also very complex). I recommend this article for an introduction to some of the issues here: https://medium.com/understanding-recommenders/what-does-it-mean-to-give-someone-what-they-want-the-nature-of-prefere... (read more)

Support.

I would add to this that The Alignment Problem by Brian Christian is a fantastic general audience book that shows how the immediate and long-term AI policy really are facing the same problem and will work better if we all work together.

If you know of any more such analyses, could you share them?

I would be interested in seeing a list of any existing work in this area. I think determining the red lines well is going to be very useful for policymakers in the next few years.

I enjoyed reading this, and look forward to future parts.

I just want to let you know that this table was really useful for me for something I'm working on. Thank you for making it.

2elspood
I'm glad you found it useful, even in this form. If the thing you're working on is something you could share, I'd be happy to offer further assistance, if you like.

I was explicitly taught to model this physical thing in a wood carving survivalist course.

Thanks for sharing, this is a really nice resource for a number of problems and solutions.

Thanks for writing this, I find the security mindset useful all over the place and appreciate its applicability in this situation.

I have a small thing unrelated to the main post:

To my knowledge, no one tried writing a security test suite that was designed to force developers to conform their applications to the tests. If this was easy, there would have been a market for it.

I think weak versions exist (i.e. things that do not guarantee/force, but nudge/help). I first learnt to code in a bootcamp which emphasised test-driven development (TDD). One of the f... (read more)

4elspood
My project seems to have expired from the OWASP site, but here is an interactive version that should have most of the data: https://periodictable.github.io/ You'll need to mouse over the elements to see the details, so not really mobile friendly, sorry. I agree that linters are a weak form of automatic verification that are actually quite valuable. You can get a lot of mileage out of simply blacklisting unsafe APIs and a little out of clever pattern matching.
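
(As a rough illustration, not taken from either comment above: a minimal sketch of what blacklisting unsafe APIs can look like, using Python's built-in ast module. The specific call names flagged here are illustrative assumptions, not elspood's actual list.)

```python
import ast

# Illustrative blacklist of call names often flagged by security linters
# (these particular names are assumptions for the sketch, not a vetted list).
UNSAFE_CALLS = {"eval", "exec", "os.system", "pickle.loads", "yaml.load"}

def dotted_name(func: ast.expr) -> str:
    """Rebuild a dotted name like 'pickle.loads' from a call's func node."""
    parts = []
    while isinstance(func, ast.Attribute):
        parts.append(func.attr)
        func = func.value
    if isinstance(func, ast.Name):
        parts.append(func.id)
    return ".".join(reversed(parts))

def find_unsafe_calls(source: str):
    """Yield (line_number, call_name) for every blacklisted call in the source."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in UNSAFE_CALLS:
                yield node.lineno, name

if __name__ == "__main__":
    sample = "import pickle\ndata = pickle.loads(blob)\nresult = eval(user_input)\n"
    for lineno, name in find_unsafe_calls(sample):
        print(f"line {lineno}: unsafe call to {name}")
```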

I'm not in these fields, so take everything I say very lightly, but intuitively this feels wrong to me. I understood your point to be something like: the labels are doing all the work. But for me, the labels are not what makes those approaches seem more interpretable than a DNN. It's that in a DNN, the features are not automatically locatable (even pseudonymously so) in a way that lets you figure out the structure/shape that separates them - each training run of the model learns a new way to separate them, and it isn't clear how to know what those sha... (read more)

3tamgent
Even if you could find some notion of a, b, c we think are features in this DNN - how would you know you were right? How would you know you're on the correct level of abstraction / cognitive separation / carving at the joints instead of right through the spleen and then declaring you've found a, b and c. It seems this is much harder than in a model where you literally assume the structure and features all upfront.

Siblings do this a lot growing up.

I didn't downvote this just because I disagree with it (that's not how I downvote), but if I could hazard a guess at why people might downvote, it'd be that some might think it's a 'thermonuclear idea'.

Try Googling a few AI-related topics that no one talked about 5-10 years ago to see if today more people are talking about one or more of those topics.

You can use Google Trends to see search-term popularity over time.
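
(A minimal sketch of pulling that data programmatically, assuming the unofficial pytrends package; the search terms are just examples, not part of the original comment.)

```python
# Assumes the unofficial pytrends package (pip install pytrends), not an official Google API.
from pytrends.request import TrendReq

pytrends = TrendReq(hl="en-US", tz=0)
# Relative search interest (0-100) over the last five years for a couple of example terms.
pytrends.build_payload(["AI alignment", "AI safety"], timeframe="today 5-y")
interest = pytrends.interest_over_time()
print(interest.tail())
```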

These are really interesting, thanks for sharing!

So regulatory capture is a thing that can happen. I don't think I got a complete picture of how, in your view, oversight for dominant companies is scary. You mentioned two possible mechanisms: rubber-stamping things, and enforcing sharing of data. It's not clear to me that either of these is obviously contra the goal of slowing things down. Like, maybe sharing of data (I'm imagining you mean to smaller competitors, as in the case of competition regulation) - but data isn't really useful alone; you need compute and technical capability to use it. More li... (read more)

Thank you for your elaboration, I appreciate it a lot, and upvoted for the effort. Here are your clearest points paraphrased as I understand them (sometimes just using your words), and my replies:

  1. The FDA is net negative for health; therefore creating an FDA-for-AI would likely be net negative for the AI challenges.

I don't think you can come to this conclusion, even if I agree with the premise. The counterfactuals are very different. With drugs, the counterfactual of no FDA might be that some people get more treatments, and some die but many don't, and they ... (read more)

No worries, thank you, I look forward to it

8Aiyen
Alright, if we want to estimate the likely effects of allowing government regulation of AI, it's worth considering the effects of government regulation of everything else. The FDA's efforts to slow the adoption of new medicines kill far more people than they save (at least according to SlateStarCodex, which has a lot of excellent material on the topic). It is not uncommon for them to take actions that do not even pretend to be about patient safety, such as banning a drug because initial trials make it appear so helpful that it would be "unethical" to have a control group that was denied it in further studies, but apparently not so helpful that it's worth allowing the general public access. I highly recommend Zvi Mowshowitz's blog posts on the subject; he's collected more sources and examples on the topic than this margin can contain.

There is a very common reaction I have noticed to these sorts of events, where most people brush them off as "just how the world works". A patient dying due to having been deliberately refused medicine is treated as a tragedy, but no one is actually at fault. Meanwhile, a patient who is slightly inconvenienced by an officially approved treatment is treated as strong evidence that we need more medical regulation. Certainly this reaction is not universal, but it's common enough to create an inferential gap between general perceptions of regulation and a claim like "the FDA are mass murderers". However, whether or not you want to call it murder, the real-world effect of medical regulation is primarily to make more people sick and dead.

This raises two concerns about having an "FDA for AI", as the original post recommends. First, that the same sorts of errors would occur, even in the absence of malice. And secondly, that malice would in fact be present, and create deliberate problems for the general population. How likely is this?

Enough errors would almost certainly occur in AI regulation to make it net negative. Even leaving

Another response to the China objection is that, just as regulators copy each other internationally, so do academics/researchers; so if you slow down development of research in some parts of the world, you might also slow down development of that research in other parts of the world. Especially when there's an asymmetry in the openness of publication of the research.

I'm a bit confused about why you think it's so clearly a bad idea; your points weren't elaborated at all, so I'd absolutely love some elaboration by you or some of the people who voted up your comment, because clearly I'm missing something.

  • on the reduction of the chance of FAI being developed: sure, some of this would of course happen, but slowing down the development of solutions to a problem (the alignment problem) whilst slowing down the growth of the problem itself even more is surely net good for stopping the problem? Especially if you're really worried about the
... (read more)
7Aiyen
Hey, that’s a great question. When I get a bit more time I’ll write a clarification. Sorry for the delay.

I would also appreciate an elaboration by Aiyen on the suffering risk point.

I'd find it really hard to imagine MIRI getting regulated. It's more common that regulation steps in where an end user or consumer could be harmed, and for that you need to deploy products to those users/consumers. As far as I'm aware, this is quite far from the kind of safety research MIRI does.

Sorry, I must be really dumb, but I didn't understand what you mean by the alignment problem for regulation. Aligning regulators to regulate the important/potentially harmful bits? I don't think this is completely random: even if regulators focus more on trivial issues, they're more likely to support safety teams (although, sure, the models they'll be working on making safe won't be as capable; that's the point).

"What do condoms have in common with AI?"

1tamgent
OK I admit this one doesn't fit any audience under any possible story in my mind except a general one. Let me know if you want to read the private (not yet drafted) news article though and I'll have a quick go.

"Evolution didn’t optimize for contraception. AI developers don’t optimize against their goals either. Accidents happen. Use protection (optional this last bit)"

1tamgent
ML engineers?

"Evolution wasn’t prepared for contraception. We can do better. When deploying AI, think protection."

1tamgent
OK I have to admit, I didn't think through audience extremely carefully as most of these sound like clickbait news article headlines, but I'll go with tech executives. I do think reasonably good articles could be written explaining the metaphor though.

"We tricked nature with contraception; one day, AI could trick us too."

1tamgent
Policymakers?

Ah, instrumental and epistemic rationality clash again

I am curious about how you felt when writing this bit:

There's no need to make reference to culture.

5ambigram
I guess I'd say frustrated, worried, confused. I was somewhat surprised/alarmed by the conclusion that Alec was actually trying to request information on how to be considered part of the group. It seems to me like a rather uncharitable interpretation of Alec's response, to assume that he just wants to figure out how to belong, rather than genuinely desiring to find out how best to contribute. I would be rather insulted by this response, because it implies that I am looking for a vetted-by-the-group answer, and also seems to be criticising me for asking Anna about optimal careers. Firstly, that was never my intent. Secondly, asking an expert for their opinion sounds like a perfectly reasonable course of action to me. However, this does assume that Alec shares my motivations and assumptions. I'm not sure of my assumptions/beliefs/conclusions though. I might be missing context (e.g. I don't know what the Bay Area is like, or the cultural norms), and I didn't really understand the essay (I found the example too distracting for me to focus on the concept of narrative syncing - I like the new examples much more).

I think the difference between 1 and 3 is that in 3 there is explicit acknowledgement of the idea that what the person might be asking is "what is the done thing around here", by attempting to directly answer the inferred subtext.

Also, I like your revised answer.
