All of tamgent's Comments + Replies

Not a textbook (more for a general audience) but The Alignment Problem by Brian Christian is a pretty good introduction that I reckon most people interested in this would get behind.

Do you have the transcript from this?

1Pablo Villalobos
Yes, it's in Spanish though. I can share it via DM.

I like it - interesting how much of it has to do with the specific vulnerabilities of humans, and how humans exploiting other humans' vulnerabilities was what enabled and exacerbated the situation.

Whilst we're sharing stories... I'll shamelessly promote one of my (very) short stories on human manipulation by AI. In this case the AI is being deliberate, at least in achieving its instrumental goals. https://docs.google.com/document/d/1Z1laGUEci9rf_aaDjQKS_IIOAn6D0VtAOZMSqZQlqVM/edit

1tamgent
There's also a romantic theme ;-)

Is it a coincidence that your handle is blaked? (It's a little similar to Blake) Just curious.

4blaked
Throwaway account specifically for this post, Blake is used as a verb here :) (or an adjective? past participle? not a native English speaker)

Ha! I meant the former, but I like your second interpretation too!

I like 'do the impossible - listen'.

2Valentine
Just curious: Do you mean "Do the impossible, which is to listen"? Or "Do the impossible, and then listen"? Or something else?

Recruitment - in my experience often a weeks-long process from start to finish, well-oiled and systematic, using all the tips on selection from the organizational behaviour handbook, often with feedback given too. By comparison, some tech companies can take several months to hire, with lots of ad hoc decision-making, no processes around biases or conflicts of interest, and no feedback.

Happy to give more examples if you want by DM.

I should say my sample size is tiny here - I know one gov dept in depth, one tech company in depth, and a handful of other tech companies and gov depts (not fully from the inside, just from talking with friends who work there, etc.).

What exactly is the trust problem you're referring to?

Is it that you think people are not as trusting as they should be, in general?

I also interpreted it this way and was confused for a while. I think your suggested title is clearer, Neel.

Thank you for writing this. On your section 'Obstruction doesn't need discernment' - see also this post that went up on LW a while back called The Regulatory Option: A response to near 0% survival odds. I thought it was an excellent post, and it didn't get anywhere near the attention it deserved, in my view.

I think the two camps are less orthogonal than your examples of privacy and compute reg portray. There's room for plenty of excellent policy interventions that both camps could work together to support. For instance, increasing regulatory requirements for transparency on algorithmic decision-making (and crucially, building capacity, both in regulators and in the market supporting them, to enforce this) is something that I think both camps would get behind (the xrisk one because it creates demand for interpretability and more, and the other because e.g. it's... (read more)

To build on the benefit you noted here:

  1. better citability (e.g. if somebody writes an ML paper to be published in ML venues, it gives more credibility to cite arXiv papers than Alignment Forum/LessWrong posts).

There are some areas of work where it's useful not to be implicitly communicating that you affiliate with a somewhat weird group like LW or AF folks, so that the content gets read at face value when you share it with folks coming from different subcultures and perspectives. I think it'd be hugely valuable for this collection of people who are sharing things.

This seems solvable and very much worth solving!

Agree.

Human values are very complex and most recommender systems don't even try to model them. Instead most of them optimise for things like 'engagement', which they claim is aligned with a user's 'revealed preference'. This notion of 'revealed preference' is a far cry from true preferences (which are very complex), let alone human values (which are also very complex). I recommend this article for an introduction to some of the issues here: https://medium.com/understanding-recommenders/what-does-it-mean-to-give-someone-what-they-want-the-nature-of-prefere... (read more)

Support.

I would add to this that The Alignment Problem by Brian Christian is a fantastic general audience book that shows how the immediate and long-term AI policy really are facing the same problem and will work better if we all work together.

If you know of any more such analyses, could you share them?

I would be interested in seeing a list of any existing work in this area. I think determining the red lines well is going to be very useful for policymakers in the next few years.

I enjoyed reading this, and look forward to future parts.

I just want to let you know that this table was really useful for me for something I'm working on. Thank you for making it.

2elspood
I'm glad you found it useful, even in this form. If the thing you're working on is something you could share, I'd be happy to offer further assistance, if you like.

I was explicitly taught to model this physical thing in a wood carving survivalist course.

Thanks for sharing, this is a really nice resource for a number of problems and solutions.

Thanks for writing this, I find the security mindset useful all over the place and appreciate its applicability in this situation.

I have a small thing unrelated to the main post:

To my knowledge, no one tried writing a security test suite that was designed to force developers to conform their applications to the tests. If this was easy, there would have been a market for it.

I think weak versions exist (i.e. things that do not guarantee/force, but nudge/help). I first learnt to code in a bootcamp which emphasised test-driven development (TDD). One of the f... (read more)

4elspood
My project seems to have expired from the OWASP site, but here is an interactive version that should have most of the data: https://periodictable.github.io/ You'll need to mouse over the elements to see the details, so not really mobile friendly, sorry. I agree that linters are a weak form of automatic verification that are actually quite valuable. You can get a lot of mileage out of simply blacklisting unsafe APIs and a little out of clever pattern matching.
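
(As a rough illustration, not taken from either comment above: a minimal sketch of what blacklisting unsafe APIs can look like, using Python's built-in ast module. The specific call names flagged here are illustrative assumptions, not elspood's actual list.)

```python
import ast

# Illustrative blacklist of call names often flagged by security linters
# (these particular names are assumptions for the sketch, not a vetted list).
UNSAFE_CALLS = {"eval", "exec", "os.system", "pickle.loads", "yaml.load"}

def dotted_name(func: ast.expr) -> str:
    """Rebuild a dotted name like 'pickle.loads' from a call's func node."""
    parts = []
    while isinstance(func, ast.Attribute):
        parts.append(func.attr)
        func = func.value
    if isinstance(func, ast.Name):
        parts.append(func.id)
    return ".".join(reversed(parts))

def find_unsafe_calls(source: str):
    """Yield (line_number, call_name) for every blacklisted call in the source."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in UNSAFE_CALLS:
                yield node.lineno, name

if __name__ == "__main__":
    sample = "import pickle\ndata = pickle.loads(blob)\nresult = eval(user_input)\n"
    for lineno, name in find_unsafe_calls(sample):
        print(f"line {lineno}: unsafe call to {name}")
```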

I'm not in these fields, so take everything I say very lightly, but intuitively this feels wrong to me. I understood your point to be something like: the labels are doing all the work. But for me, the labels are not what makes those approaches seem more interpretable than a DNN. It's that in a DNN, the features are not automatically locatable (even pseudonymously so) in a way that lets you figure out the structure/shape that separates them - each training run of the model learns a new way to separate them, and it isn't clear how to know what those sha... (read more)

3tamgent
Even if you could find some notion of a, b, c we think are features in this DNN - how would you know you were right? How would you know you're on the correct level of abstraction / cognitive separation / carving at the joints instead of right through the spleen and then declaring you've found a, b and c. It seems this is much harder than in a model where you literally assume the structure and features all upfront.

Siblings do this a lot growing up.

I didn't downvote this just because I disagree with it (that's not how I downvote), but if I could hazard a guess at why people might downvote, it'd be that some might think it's a 'thermonuclear idea'.

Try Googling a few AI-related topics that no one talked about 5-10 years ago to see if today more people are talking about one or more of those topics.

You can use Google Trends to see search-term popularity over time.
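
(A minimal sketch of pulling that data programmatically, assuming the unofficial pytrends package; the search terms are just examples, not part of the original comment.)

```python
# Assumes the unofficial pytrends package (pip install pytrends), not an official Google API.
from pytrends.request import TrendReq

pytrends = TrendReq(hl="en-US", tz=0)
# Relative search interest (0-100) over the last five years for a couple of example terms.
pytrends.build_payload(["AI alignment", "AI safety"], timeframe="today 5-y")
interest = pytrends.interest_over_time()
print(interest.tail())
```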

These are really interesting, thanks for sharing!

So regulatory capture is a thing that can happen. I don't think I got a complete picture of how, in your view, oversight for dominant companies is scary. You mentioned two possible mechanisms: rubber-stamping things, and enforcing sharing of data. It's not clear to me that either of these is obviously contra the goal of slowing things down. Like, maybe sharing of data (I'm imagining you mean to smaller competitors, as in the case of competition regulation) - but data isn't really useful alone; you need compute and technical capability to use it. More li... (read more)

Thank you for your elaboration, I appreciate it a lot, and upvoted for the effort. Here are your clearest points paraphrased as I understand them (sometimes just using your words), and my replies:

  1. The FDA is net negative for health; therefore creating an FDA-for-AI would likely be net negative for the AI challenges.

I don't think you can come to this conclusion, even if I agree with the premise. The counterfactuals are very different. With drugs, the counterfactual of no FDA might be that some people get more treatments, and some die but many don't, and they ... (read more)

No worries, thank you, I look forward to it

8Aiyen
Alright, if we want to estimate the likely effects of allowing government regulation of AI, it's worth considering the effects of government regulation of everything else. The FDA's efforts to slow the adoption of new medicines kill far more people than they save (at least according to SlateStarCodex, which has a lot of excellent material on the topic). It is not uncommon for them to take actions that do not even pretend to be about patient safety, such as banning a drug because initial trials make it appear so helpful that it would be "unethical" to have a control group that was denied it in further studies, but apparently not so helpful that it's worth allowing the general public access. I highly recommend Zvi Mowshowitz's blog posts on the subject; he's collected more sources and examples on the topic than this margin can contain.

There is a very common reaction I have noticed to these sorts of events, where most people brush them off as "just how the world works". A patient dying due to having been deliberately refused medicine is treated as a tragedy, but no one is actually at fault. Meanwhile, a patient who is slightly inconvenienced by an officially approved treatment is treated as strong evidence that we need more medical regulation. Certainly this reaction is not universal, but it's common enough to create an inferential gap between general perceptions of regulation and a claim like "the FDA are mass murderers". However, whether or not you want to call it murder, the real-world effect of medical regulation is primarily to make more people sick and dead.

This raises two concerns about having an "FDA for AI", as the original post recommends. First, that the same sorts of errors would occur, even in the absence of malice. And secondly, that malice would in fact be present, and create deliberate problems for the general population. How likely is this?

Enough errors would almost certainly occur in AI regulation to make it net negative. Even leaving

Another response to the China objection is that, just as regulators copy each other internationally, so do academics/researchers; so if you slow down development of research in some parts of the world, you might also slow down development of that research in other parts of the world. Especially when there's an asymmetry in the openness of publication of the research.

I'm a bit confused about why you think it's so clearly a bad idea; your points weren't elaborated at all, so I'd absolutely love some elaboration by you or some of the people who voted up your comment, because clearly I'm missing something.

  • on the reduction of the chance of FAI being developed: sure, some of this would of course happen, but slowing down the development of solutions to a problem (the alignment problem) whilst slowing down the growth of the problem itself even more is surely net good for stopping the problem? Especially if you're really worried about the
... (read more)
7Aiyen
Hey, that’s a great question. When I get a bit more time I’ll write a clarification. Sorry for the delay.

I would also appreciate an elaboration by Aiyen on the suffering risk point.

I'd find it really hard to imagine MIRI getting regulated. It's more common that regulation steps in where an end user or consumer could be harmed, and for that you need to deploy products to those users/consumers. As far as I'm aware, this is quite far from the kind of safety research MIRI does.

Sorry, I must be really dumb, but I didn't understand what you mean by the alignment problem for regulation. Aligning regulators to regulate the important/potentially harmful bits? I don't think this is completely random: even if regulators focus more on trivial issues, they're more likely to support safety teams (although, sure, the models they'll be working on making safe won't be as capable; that's the point).

"What do condoms have in common with AI?"

1tamgent
OK I admit this one doesn't fit any audience under any possible story in my mind except a general one. Let me know if you want to read the private (not yet drafted) news article though and I'll have a quick go.

"Evolution didn’t optimize for contraception. AI developers don’t optimize against their goals either. Accidents happen. Use protection (optional this last bit)"

1tamgent
ML engineers?

"Evolution wasn’t prepared for contraception. We can do better. When deploying AI, think protection."

1tamgent
OK I have to admit, I didn't think through audience extremely carefully as most of these sound like clickbait news article headlines, but I'll go with tech executives. I do think reasonably good articles could be written explaining the metaphor though.

"We tricked nature with contraception; one day, AI could trick us too."

1tamgent
Policymakers?

Ah, instrumental and epistemic rationality clash again

I am curious about how you felt when writing this bit:

There's no need to make reference to culture.

5ambigram
I guess I'd say frustrated, worried, confused. I was somewhat surprised/alarmed by the conclusion that Alec was actually trying to request information on how to be considered part of the group. It seems to me like a rather uncharitable interpretation of Alec's response, to assume that he just wants to figure out how to belong, rather than genuinely desiring to find out how best to contribute. I would be rather insulted by this response, because it implies that I am looking for a vetted-by-the-group answer, and also seems to be criticising me for asking Anna about optimal careers. Firstly, that was never my intent. Secondly, asking an expert for their opinion sounds like a perfectly reasonable course of action to me. However, this does assume that Alec shares my motivations and assumptions. I'm not sure of my assumptions/beliefs/conclusions though. I might be missing context (e.g. I don't know what the Bay Area is like, or the cultural norms), and I didn't really understand the essay (I found the example too distracting for me to focus on the concept of narrative syncing - I like the new examples much more).

I think the difference between 1 and 3 is that in 3 there is explicit acknowledgement of the idea that what the person might be asking is "what is the done thing around here", by attempting to directly answer the inferred subtext.

Also, I like your revised answer.
