Alex Flint

Independent AI alignment researcher

Sequences

The accumulation of knowledge

Comments

It's worse, even, in a certain way, than that: the existence of optimizing systems organized around a certain idea of "natural class" feeds back into more observers observing data that is distributed according to this idea of "natural class", leading to more optimizing systems being built around that idea of "natural class", and so on.

Once a certain idea of "natural class" gains a foothold somewhere, observers will make real changes in the world that further suggest this particular idea of "natural class" to others, and this forms a feedback loop.

If you pin down what a thing refers to according to what that thing was optimized to refer to, then don't you have to look at the structure of the one who did the optimizing in order to work out what a given thing refers to? That is, to work out what the concept "thermodynamics" refers to, it may not be enough to look at the time evolution of the concept "thermodynamics" on its own; I may instead need to know something about the humans who were driving those changes, and the goals held within their minds. But, if this is correct, then doesn't it raise another kind of homunculus-like regression? We were trying to directly explain semantics, but we ended up needing to inquire into yet another mind, the complete understanding of which would require further unpacking of the frames and concepts held in that mind, and the complete understanding of those frames and concepts would require even further inquiry into a yet earlier mind that was responsible for optimizing those frames and concepts.

There seems to be some real wisdom in this post but given the length and title of the post, you haven't offered much of an exit -- you've just offered a single link to a YouTube channel for a trauma healer. If what you say here is true, then this is a bit like offering an alcoholic friend the sum total of one text message containing a single link to the homepage of Alcoholics Anonymous -- better than nothing, but not worthy of the bombastic title of this post.

friends and family significantly express their concern for my well being

What exact concerns do they have?

  1. You don't get to fucking assume any shit on the basis of "but... ah... come on". If you claim X and someone asks why, then congratulations now you're in a conversation. That means maybe possible shit is about to get real, like some treasured assumptions might soon be questioned. There are no sarcastic facial expressions or clever grunts that get you an out from this. You gotta look now at the thing itself.

I just want to acknowledge the very high emotional weight of this topic.

For about two decades, many of us in this community have been kind of following in the wake of a certain group of very competent people tackling an amazingly frightening problem. In the last couple of years, coincident with a quite rapid upsurge in AI capabilities, that dynamic has really changed. This is truly not a small thing to live through. The situation has real breadth -- it seems good to take it in for a moment, not in order to cultivate anxiety, but in order to really engage with the scope of it.

It's not a small thing at all. We're in this situation where we have AI capabilities kind of out of control. We're not exactly sure where any of the leaders we've previously relied on stand. We all have this opportunity now to take action. The opportunity is simply there. Nobody, actually, can take it away. But there is also the opportunity, truly available to everyone regardless of past actions, to falter, exactly when the world most needs us.

What matters, actually, is what, concretely, we do going forward.

That is correct. I know it seems a little weird to generate a new policy on every timestep. The reason it's done that way is that the logical inductor needs to understand the function that maps prices to the quantities that will be purchased, in order to solve for a set of prices that "defeat" the current set of trading algorithms. That function (from prices to quantities) is what I call a "trading policy", and it has to be represented in a particular way -- as a set of syntax trees over trading primitives -- in order for the logical inductor to solve for prices. A trading algorithm is a sequence of such sets of syntax trees, where each element in the sequence is the trading policy for a different time step.

Normally, it would be strange to set up one function (trading algorithms) that generates another function (trading policies) that is different for every timestep. Why not just have the trading algorithm directly output the amount that it wants to buy/sell? The reason is that we need not just the quantity to buy/sell, but that quantity as a function of price, since prices themselves are determined by solving an optimization problem with respect to these functions. Furthermore, these functions (trading policies) have to be represented in a particular way. Therefore it makes most sense to have trading algorithms output a sequence of trading policies, one per timestep.
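To make the two-level structure concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the formal construction: the primitive set (`Const`, `Price`, `Sum`, `Product`, `Max`), the names `evaluate`, `quantities`, and `trading_algorithm`, and the toy trading rule are all hypothetical. The point is just that a trading policy is data (syntax trees that can be inspected and evaluated at any candidate prices), while a trading algorithm is a sequence that emits one such policy per timestep.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Union

# Hypothetical trading primitives. Each node is a small syntax tree that can be
# evaluated against a dictionary mapping sentences to their current prices.

@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class Price:
    sentence: str  # stands for "the current price of this sentence"

@dataclass(frozen=True)
class Sum:
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class Product:
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class Max:
    left: "Expr"
    right: "Expr"

Expr = Union[Const, Price, Sum, Product, Max]

# A trading policy (assumed representation): one syntax tree per sentence,
# giving the quantity to buy (negative means sell) as a function of prices.
TradingPolicy = Dict[str, Expr]


def evaluate(expr: Expr, prices: Dict[str, float]) -> float:
    """Evaluate a syntax tree at a particular assignment of prices."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Price):
        return prices[expr.sentence]
    if isinstance(expr, Sum):
        return evaluate(expr.left, prices) + evaluate(expr.right, prices)
    if isinstance(expr, Product):
        return evaluate(expr.left, prices) * evaluate(expr.right, prices)
    if isinstance(expr, Max):
        return max(evaluate(expr.left, prices), evaluate(expr.right, prices))
    raise TypeError(f"unknown primitive: {expr!r}")


def quantities(policy: TradingPolicy, prices: Dict[str, float]) -> Dict[str, float]:
    """Map prices to purchased quantities, which is what a policy represents."""
    return {s: evaluate(tree, prices) for s, tree in policy.items()}


def trading_algorithm() -> Iterator[TradingPolicy]:
    """A toy trading algorithm: yields a different trading policy on each timestep.

    On step t it buys sentence "A" whenever the price of "A" is below a target
    of 1 - 1/(t + 1), in proportion to the gap. The rule is made up purely
    for illustration.
    """
    t = 1
    while True:
        target = 1.0 - 1.0 / (t + 1)
        # quantity("A") = max(0, target - price("A")), written as a syntax tree
        gap = Sum(Const(target), Product(Const(-1.0), Price("A")))
        yield {"A": Max(Const(0.0), gap)}
        t += 1
```

Because each policy is a tree rather than an opaque number, a price-solver could evaluate it at many candidate prices (or analyze its structure) before any trade happens. For example, `quantities(next(trading_algorithm()), {"A": 0.25})` evaluates the first timestep's policy at one candidate price, and calling it again with a different price requires no further input from the trading algorithm.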

Thank you for this extraordinarily valuable report!

I believe that what you are engaging in, when you enter into a romantic relationship with either a person or a language model, is a kind of artistic creation. What matters is not whether the person on the "other end" of the relationship is a "real person" but whether the thing you create is of true benefit to the world. If you enter into a romantic relationship with a language model and produce something of true benefit to the world, then the relationship was real, whether or not there was a "real person" on the other end of it (whatever that would mean, even in the case of a human).

This is a relatively banal meta-commentary on reasons people sometimes give for doing worst-case analysis, and the differences between those reasons. The post reads like a list of things with no clear through-line. There is a gesture at an important idea from a Yudkowsky post (the logistic success curve idea) but the post does not helpfully expound that idea. There is a kind of trailing-off towards the end of the post as things like "planning fallacy" seem to have been added to the list with little time taken to place them in the context of the other things on the list. In the "differences between these arguments" section, the post doesn't clearly elucidate deep differences between the arguments; it just lists verbal responses that you might make if you are challenged on plausibility grounds in each case.

Overall, I felt that this post under-delivered on an important topic.
