All of Alex_Altair's Comments + Replies

For anyone reading this comment thread in the future, Dalcy wrote an amazing explainer for this paper here.

Indeed, we know about those posts! Lmk if you have a recommendation for a better textbook-level treatment of any of it (modern papers etc). So far the grey book feels pretty standard in terms of pedagogical quality.

Has anyone started accumulating errata for the SLT grey book? (I.e. Algebraic Geometry and Statistical Learning Theory by Sumio Watanabe.) This page on Watanabe's website seems to just be about the Japanese version of the book.

2Alexander Gietelink Oldenziel
meta note that I would currently recommend against spending much time with Watanabe's original texts for most people interested in SLT. Good to be aware of the overall outlines, but much of what most people would want to know is better explained elsewhere [e.g. I would recommend first reading most posts with the SLT tag on LessWrong before doing a deep dive in Watanabe]. Another meta note: if you do insist on reading Watanabe, I highly recommend you make use of AI assistance. I.e. download a pdf, cut it down into chapters, and upload them to your favorite LLM.

Some small corrections/additions to my section ("Altair agent foundations"). I'm currently calling it "Dovetail research". That's not publicly written anywhere yet, but if it were listed as that here, it might help people who are searching for it later this year.

Which orthodox alignment problems could it help with?: 9. Humans cannot be first-class parties to a superintelligent value handshake

I wouldn't put number 9. Not intended to "solve" most of these problems, but is intended to help make progress on understanding the nature of the problems through... (read more)

2technicalities
Done, thanks!

FWIW I can't really tell what this website is supposed to be/do by looking at the landing page and menu

The title reads ambiguous to me; I can't tell if you mean "learn to [write well] before" or "learn to write [well before]".

2eukaryote
😅 You know, I was thinking of calling it "Learn to write good BEFORE you have something worth saying", but figured I'd get some people rolling their eyes at the grammar of "write good" in a post purporting to offer writing advice. This would however have disambiguated the point you mentioned, which I hadn't thought about. Really goes to show you something or other.

DM me if you're interested.

I, too, am quite interested in trialing more people for roles on this spectrum.

Thanks. Is "pass@1" some kind of lingo? (It seems like an ungoogleable term.)

7Vladimir_Nesov
Pass@k means that at least one of k attempts passes, according to an oracle verifier. Evaluating with pass@k is cheating when k is not 1 (but still interesting to observe), the non-cheating option is best-of-k where the system needs to pick out the best attempt on its own. So saying pass@1 means you are not cheating in evaluation in this way.
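The difference can be sketched in a few lines (a toy illustration; the attempts, the oracle verifier, and the model's own scorer here are all hypothetical):

```python
def pass_at_k(attempts, is_correct):
    # "Cheating" metric: credit if ANY of the k attempts passes the oracle verifier.
    return any(is_correct(a) for a in attempts)

def best_of_k(attempts, is_correct, model_score):
    # Non-cheating: the system must pick its single best attempt on its own,
    # and only that one attempt is checked against the oracle.
    best = max(attempts, key=model_score)
    return is_correct(best)

# Toy example: three candidate answers to "2 + 2", where the model's
# own scorer misranks them.
attempts = ["3", "4", "5"]
is_correct = lambda a: a == "4"
model_score = lambda a: {"3": 0.9, "4": 0.5, "5": 0.1}[a]

print(pass_at_k(attempts, is_correct))               # credit: some attempt was right
print(best_of_k(attempts, is_correct, model_score))  # no credit: it picked "3"
```

So pass@1 collapses both metrics into the same thing: one attempt, no second chances.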

I guess one thing I want to know is like... how exactly does the scoring work? I can imagine something like, they ran the model a zillion times on each question, and if any one of the answers was right, that got counted in the light blue bar. Something that plainly silly probably isn't what happened, but it could be something similar.

If it actually just submitted one answer to each question and got a quarter of them right, then I think it doesn't particularly matter to me how much compute it used.

4Zach Stein-Perlman
It was one submission, apparently.

On the livestream, Mark Chen says the 25.2% was achieved "in aggressive test-time settings". Does that just mean more compute?

2Charlie Steiner
It likely means running the AI many times and submitting the most common answer from the AI as the final answer.
1Jonas Hallgren
Extremely long chain of thought, no?

I wish they would tell us what the dark vs light blue means. Specifically, for the FrontierMath benchmark, the dark blue looks like it's around 8% (rather than the light blue at 25.2%). Which, like, I dunno, maybe this is nitpicking, but 25% on FrontierMath seems like a BIG deal, and I'd like to know how much to be updating my beliefs.

From an apparent author on reddit:

[Frontier Math is composed of] 25% T1 = IMO/undergrad style problems, 50% T2 = grad/qualifying exam style problems, 25% T3 = early researcher problems

The comment was responding to a claim that Terence Tao said he could only solve a small percentage of questions, but Terence was only sent the T3 questions. 

9Eric Neyman
My random guess is: * The dark blue bar corresponds to the testing conditions under which the previous SOTA was 2%. * The light blue bar doesn't cheat (e.g. doesn't let the model run many times and then see if it gets it right on any one of those times) but spends more compute than one would realistically spend (e.g. more than how much you could pay a mathematician to solve the problem), perhaps by running the model 100 to 1000 times and then having the model look at all the runs and try to figure out which run had the most compelling-seeming reasoning.

things are almost never greater than the sum of their parts Because Reductionism

Isn't it more like, the value of the sum of the things is greater than the sum of the value of each of the things? That is, U(a∧b) > U(a) + U(b) (where perhaps U is a utility function). That seems totally normal and not-at-all at odds with Reductionism.

2cubefox
More specifically, for a Jeffrey utility function U defined over a Boolean algebra of propositions, and some propositions a,b, "the sum is greater than its parts" would be expressed as the condition U(a∧b)>U(a)+U(b) (which is, of course, not a theorem). The respective general theorem only states that U(a∧b)=U(a)+U(b∣a), which follows from the definition of conditional utility U(b∣a)=U(a∧b)−U(a).
4habryka
I think people usually want that sentence to mean something confused. I agree it has fine interpretations, but people by default use it as a semantic stopsign to stop looking for ways the individual parts mechanistically interface with each other to produce the higher utility thing than the individual parts naively summed would (see also https://www.lesswrong.com/posts/8QzZKw9WHRxjR4948/the-futility-of-emergence )
2Raemon
Oliver specifically wanted me to include the word "naive" because obviously there are sensible things people could mean by this but they phrase things overly strongly and the Lightcone Team's Autism is Powerful. Yes I think your equation looks right.

I'd vote for removing the stage "developing some sort of polytime solution" and just calling 4 "developing a practical solution". I think listing that extra step is coming from the perspective of someone who's more heavily involved in complexity theory. We're usually interested in polynomial time algorithms because they're usually practical, but there are lots of contexts where practicality doesn't require a polynomial time algorithm, or really, where we're just not working in a context where it's natural to think in terms of algorithms with run-times.

5Noosphere89
In what contexts is it not natural to think in terms of algorithms with specific run-times?

Thank you for writing this! Your description in the beginning about trying to read about the GRT and coming across a sequence of resources, each of which didn't do quite what you wanted, is a precise description of the path I also followed. I gave up at the end, wishing that someone would write an explainer, and you have written exactly the explainer that I wanted!

Positive feedback: I am happy to see the comment karma arrows pointing up and down instead of left and right. I have some degree of left-right confusion and was always clicking and unclicking my comment votes to figure out which was up and down.

Also appreciate that the read time got put back into main posts.

(Comment font stuff looks totally fine to me, both before and after this change.)

[Some thoughts that are similar but different to my previous comment;]

I suspect you can often just prove the behavioral selection theorem and structural selection theorem in separate, almost independent steps.

  1. Prove a behavioral theorem
  2. add in a structural assumption
  3. prove that behavioral result plus structural assumption implies structural result.

Behavior essentially serves as an "interface", and a given behavior can be implemented by any number of different structures. So it would make sense that you need to prove something about structure separately (and t... (read more)

For some reason the "only if" always throws me off. It reminds me of the unless keyword in ruby, which is equivalent to if not, but somehow always made my brain segfault.

It's maybe also worth saying that any other description method is a subset of programs (or is incomputable and therefore not what real-world AI systems are). So if the theoretical issues in AIT bother you, you can probably make a similar argument using a programming language with no while loop, or I dunno, finite MDPs whose probability distributions are Gaussian with finite parameter descriptions.

Yeah, I think structural selection theorems matter a lot, for reasons I discussed here.

This is also one reason why I continue to be excited about Algorithmic Information Theory. Computable functions are behavioral, but programs (= algorithms) are structural! The fact that programs can be expressed in the homogeneous language of finite binary strings gives a clear way to select for structure; just limit the length of your program. We even know exactly how this mathematical parameter translates into real-world systems, because we can know exactly how many bi... (read more)


FWIW I think this would be a lot less like "tutoring" and a lot more like "paying people to tell you their opinions". Which is a fine thing to want to do, but I just want to make sure you don't think there's any kind of objective curriculum that comprises AI alignment.

3Seth Herd
There are a lot of detailed arguments for why alignment is going to be more or less difficult. Understanding all of those arguments, starting with the most respected, is a curriculum. Just pulling a number out of your own limited perspective is a whole different thing.
8habryka
Hmm, a bit confused what this means. There is I think a relatively large set of skills and declarative knowledge that is pretty verifiable and objective and associated with AI Alignment.  It is the case that there is no consensus on what solutions to the AI Alignment problem might look like, but I think the basic arguments for why this is a thing to be concerned about are pretty straightforward and are associated with some pretty objective arguments.

Nice! Yeah I'd be happy to chat about that, and also happy to get referrals of any other researchers who might be interested in receiving this funding to work on it.

1Cole Wyeth
Cool, I'll DM you.

Note to readers: the obligatory warning on any post like this is that you should not run random scripts downloaded from the internet without reading them to see what they do, because there are many harmful things they could be doing.

6Richard_Kennaway
This one is sufficiently egregious that it should be deleted and the author banned. It's at best spam, at worst malware. Fortunately, the obfuscated URL does not actually work.

FWIW I have used Perplexity twice since you mentioned it, it was somewhat helpful both times, but also, both times the citations had errors. By that I mean it would say something and then put a citation number next to it, but what it said was not in the cited document.

2Elizabeth
I got my first hallucination shortly after posting this- it's definitely not perfect. But I still find the ease of checking a big improvement over other models. 

Aren’t they sick as hell???

Can confirm, these are sick as hell

I know that there's something called the Lyapunov exponent. Could we "diminish the chaos" if we use logarithms, like with the Richter scale for earthquakes?

This is a neat question. I think the answer is no, and here's my attempt to describe why.

The Lyapunov exponent measures the difference between the trajectories over time. If your system is the double pendulum, you need to be able to take two random states of the double pendulum and say how different they are. So it's not like you're measuring the speed, or the length, or something like that. And if you ... (read more)
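To make the divergence concrete, here's a numerical sketch using the logistic map rather than the double pendulum (my substitution: the logistic map is a standard chaotic system that's easy to iterate, and the distance measure here is just the difference between the two states):

```python
def logistic(x, r=4.0):
    # The logistic map at r=4.0 is a standard textbook chaotic system.
    return r * x * (1 - x)

# Two trajectories whose initial conditions differ by only 1e-10.
x, y = 0.2, 0.2 + 1e-10
max_gap = abs(x - y)
for _ in range(60):
    x, y = logistic(x), logistic(y)
    max_gap = max(max_gap, abs(x - y))

# The tiny initial difference grows by many orders of magnitude;
# the growth rate of this gap is what the Lyapunov exponent quantifies.
print(max_gap)
```

For the double pendulum the state is four-dimensional (two angles, two angular velocities), so "how different are two states" needs a choice of distance on that space, which is the point above.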

1Hudjefa
Hopefully not talking out of my hat, but the difference between the final states of a double pendulum can be typed:
  1. Somewhere in the middle of the pendulum's journey through space and time. I've seen this visually, and true, there's divergence. This divergence is based on measurement of the pendulum's position in space at a given time. So with initial state A, the pendulum at time Tn was at position P1, while beginning with initial state B (|A−B|≈0), the pendulum at time Tn was at position P2. The alleged divergence is the difference |P1−P2|, oui? Taken in absolute terms, |P1−P2| = 10^6, but logarithmically, log|P1−P2| = only 6.
  2. At the very end, when the pendulum comes to rest. There's no divergence there, oui?

It possesses this subjective element (what we consider to be negligible differences) that seems to undermine its standing as a legitimate mathematical discipline.

I think I see what you're getting at here, but no, "chaotic" is a mathematical property that systems (of equations) either have or don't have. The idea behind sensitive dependence on initial conditions is that any difference, no matter how small, will eventually lead to diverging trajectories. Since it will happen for arbitrarily small differences, it will definitely happen for whatever difference... (read more)

The paper Gleick was referring to is this one, but it would be a lot of work to discern whether it was causal in getting telephone companies to do anything different. It sounds to me like the paper is saying that the particular telephone error data they were looking at could not be well-modeled as IID, nor could it be well-modeled as a standard Markov chain; instead, it was best modeled as a statistical fractal, which corresponds to a heavy-tailed distribution somehow.

Definitely on the order of "tens of hours", but it'd be hard to say more specifically. Also, almost all of that time (at least for me) went into learning stuff that didn't go into this post. Partly that's because the project is broader than this post, and partly because I have my own research priority of understanding systems theory pretty well.

For what it's worth, I think you're getting downvoted in part because what you write seems to indicate that you didn't read the post.

Huh, interesting! So the way I'm thinking about this is, your loss landscape determines the attractor/repellor structure of your phase space (= network parameter space). For a (reasonable) optimization algorithm to have chaotic behavior on that landscape, it seems like the landscape would either have to have 1) a positive-measure flat region, on which the dynamics were ergodic, or 2) a strange attractor, which seems more plausible.

I'm not sure how that relates to the above link; it mentions the parameters "diverging", but it's not clear to me how neural network weights can diverge; aren't they bounded?

I'm curious about this part;

even though the motion of the trebuchet with sling isn't chaotic during the throw, it can be made chaotic by just varying the initial conditions, which rules out a simple closed form solution for non-chaotic initial conditions

Do you know what theorems/whatever this is from? It seems to me that if you know that "throws" constitute a subset of phase space that isn't chaotic, then you should be able to have a closed-form solution for those trajectories.

6Hastings
So this turns out to be a doozy, but it's really fascinating. I don't have an answer; an answer would look like "normal chaotic differential equations don't have general exact solutions" or "there is no relationship between being chaotic and not having an exact solution", but deciding which is which won't just require proof, it would also require good definitions of "normal differential equation" and "exact solution". (The good definition of "general" is "initial conditions with exact solutions have nonzero measure".)

I have some work. A chaotic differential equation has to be nonlinear and at least third order, and almost all nonlinear third order differential equations don't admit general exact solutions. So the statement "as a heuristic, chaotic differential equations don't have general exact solutions" seems pretty unimpressive. However, I wrongly believed the strong version of this heuristic and that belief was useful: I wanted to model trebuchet arm-sling dynamics, recognized that the true form could not be solved, and switched to a simplified model based on what simplifications would prevent chaos (no gravity, sling is wrapped around a drum instead of fixed to the tip of an arm), and then was able to find an exact solution (note that this solvable system starts as nonlinear 4th order, but can be reduced using conservation of angular momentum hacks).

Now, it is known that a chaotic difference equation can have an exact solution: the equation x(n+1) = 2x(n) mod 1 is formally chaotic and has the exact solution x(n) = 2^n x(0) mod 1.

A chaotic differential equation exhibiting chaotic behaviour can have an exact solution if it has discontinuous derivatives, because this difference equation can be constructed. The equation is in three variables x, y, z, with dz/dt always equal to 1:

if 0 < z < 1:
    if x > 0:
        dx/dt = 0
        dy/dt = 1
    if x < 0:
        dx/dt = 0
        dy/dt = -1
if 1 < z < 2:
    if y > 0:
        dx/dt = -.5
        dy/dt = 0
    if y < 0:
        dy dt
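The doubling-map claim above can be checked directly with exact rational arithmetic (floats would truncate the binary expansion after ~53 steps and kill the chaos; the starting point below is arbitrary):

```python
from fractions import Fraction

def doubling_step(x):
    # One step of the chaotic difference equation x(n+1) = 2*x(n) mod 1.
    return (2 * x) % 1

x0 = Fraction(1, 3) + Fraction(1, 16807)  # an arbitrary rational starting point
x = x0
for n in range(20):
    x = doubling_step(x)

# Closed-form solution: x(n) = 2^n * x(0) mod 1.
closed_form = (2**20 * x0) % 1
print(x == closed_form)
```

So "formally chaotic" and "has an exact solution" really do coexist here: iterating the map and evaluating the closed form agree exactly, step for step.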
1Hastings
I am suddenly unsure whether it is true! It certainly would have to be more specific than how I phrased it, as it is trivially false if the differential equation is allowed to be discontinuous between closed form regions and chaotic regions

It turns out I have the ESR version of firefox on this particular computer: Firefox 115.14.0esr (64-bit). Also tried it in incognito, and with all browser extensions turned off, and checked multiple posts that used sections.

2Raemon
Yeah I just replicated this with the mac version of the ESR version.

My overall review is, seems fine, some pros and some cons, mostly looks/feels the same to me. Some details;

  • I had also started feeling like the stuff between the title and the start of the post content was cluttered.
  • I think my biggest current annoyance is the TOC on the left sidebar. This has actually disappeared for me, and I don't see it on hover-over, which I assume is maybe just a firefox bug or something. But even before this update, I didn't like the TOC. Specifically, you guys had made it so that there was spacing between the sections that was suppos
... (read more)

It does not! At least, not anywhere that I've tried hovering.

2habryka
Huh, want to post your browser and version number? Could be a bug related to that (it definitely works fine in Chrome, FF and Safari for me)
2Raemon
It definitely should appear if you hover over it – doublechecking that on the ones you're trying it on, there are actual headings in the post such that there'd be a ToC?
1Mateusz Bagiński
It does for me

Is it just me, or did the table of contents for posts disappear? The left sidebar just has lines and dots now.

1[comment deleted]
1[anonymous]
Does it reappear when you hover your cursor over it?

There is a little crackpot voice in my head that says something like, "the real numbers are dumb and bad and we don't need them!" I don't give it a lot of time, but I do let that voice exist in the back of my mind trying to work out other possible foundations. A related issue here is that it seems to me that one should be able to have a uniform probability distribution over a countable set of numbers. Perhaps one could do that by introducing infinitesimals.

1Michael Roe
I guess you could view that random number in [0,1] as a choice sequence (cf. intuitionism) and you're allowed to see any finite number of bits of it by flipping coins to see what those bits are, but you don't know the answer to any question that would require seeing infinitely many bits...
2leogao
How do you sample uniformly from the integers?
1LGS
I think the problem to grapple with is that I can cover the rationals in [0,1] with countably many intervals of total length only 1/2 (e.g. enumerate the rationals in [0,1], and place an interval of length 1/4 around the first rational, an interval of length 1/8 around the second, etc.). This is not possible with the reals; that's the insight that makes measure theory work! The covering means that the rationals in an interval cannot have a well defined length or measure which behaves reasonably under countable unions. This is a big barrier to doing probability theory. The same problem happens with ANY countable set; the reals only avoid it by being uncountable.
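The covering's total length is just a geometric series, which is easy to verify with exact arithmetic (the particular enumeration of the rationals doesn't matter for the sum; the helper name is mine):

```python
from fractions import Fraction

# The interval around the k-th rational (k = 0, 1, 2, ...) has length 1/2^(k+2):
# 1/4, 1/8, 1/16, ...  The total is a geometric series summing to 1/2.
def total_length(n_intervals):
    return sum(Fraction(1, 2 ** (k + 2)) for k in range(n_intervals))

print(total_length(10))  # partial sum: 1/2 minus 1/2^11
print(total_length(50))  # partial sum: 1/2 minus 1/2^51
```

So all the rationals in [0,1] sit inside a set of measure at most 1/2, which forces their measure to be at most 1/2 (in fact 0, since the same trick works with any starting length).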
1notfnofn
I'd be surprised if it could be salvaged using infinitesimals (imo the problem is deeper than the argument from countable additivity), but maybe it would help your intuition to think about how some Bayesian methods intersect with frequentist methods when working on a (degenerate) uniform prior over all the real numbers. I have a draft of such a post that I'll make at some point, but you can think about univariate linear regression, the confidence regions that arise, and what prior would make those confidence regions credible regions.

Agreed the title is confusing. I assumed it meant that some metric was 5% for last year's course, and 37% for this year's course. I think I would just nix numbers from the title altogether.

One model I have is that when things are exponentials (or S-curves), it's pretty hard to tell when you're about to leave the "early" game, because exponentials look the same when scaled. If every year has 2x as much activity as the previous year, then every year feels like the one that was the big transition.

For example, it's easy to think that AI has "gone mainstream" now. Which is true according to some order of magnitude. But even though a lot of politicians are talking about AI stuff more often, it's nowhere near the top of the list for most of them. I... (read more)
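The scale-invariance point can be illustrated in a couple of lines (toy numbers of my own):

```python
# Yearly activity that doubles every year: every year's ratio to the previous
# year is identical, so no single year stands out as "the" transition when you
# look at growth rates, even though the absolute numbers explode.
activity = [2 ** year for year in range(10)]
ratios = [activity[i + 1] / activity[i] for i in range(9)]
print(ratios)
```

On a log scale the whole series is a straight line, which is exactly why "this year feels like the big one" is a poor signal about where you are on the curve.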

5William_S
You get more discrete transitions when one s-curve process takes the lead from another s-curve process, e.g. deep learning taking over from other AI methods.

(Tiny bug report, I got an email for this comment reply, but I don't see it anywhere in my notifications.)

I propose that this tag be merged into the tag called Infinities In Ethics.

2habryka
I am in favor. If you tag all the posts in Infinities in Ethics, I'll delete this tag.
7TsviBT
idk, sounds dangerously close to deferences

I'm noticing what might be a miscommunication/misunderstanding between your comment and the post and Kuhn. It's not that the statement of such open problems creates the paradigm; it's that solutions to those problems create the paradigm.

The problems exist because the old paradigms (concepts, methods etc) can't solve them. If you can state some open problems such that everyone agrees that those problems matter, and whose solution could be verified by the community, then you've gotten a setup for solutions to create a new paradigm. A solution will necessari... (read more)

[Continuing to sound elitist,] I have a related gripe/hot take that comments give people too much karma. I feel like I often see people who are "noisy" in that they comment a lot and have a lot of karma from that,[1] but have few or no valuable posts, and who I also don't have a memory of reading valuable comments from. It makes me feel incentivized to acquire more of a habit of using LW as a social media feed, rather than just commenting when a thought I have passes my personal bar of feeling useful.

  1. ^

Note that self-karma contributes to a comment's pos

... (read more)

I think the guiding principle behind whether or not scientific work is good should probably look something more like “is this getting me closer to understanding what’s happening”

One model that I'm currently holding is that Kuhnian paradigms are about how groups of people collectively decide that scientific work is good, which is distinct from how individual scientists do or should decide that scientific work is good. And collective agreement is way more easily reached via external criteria.

Which is to say, problems are what establishes a paradigm. It's way... (read more)
