Since it was kind of a pain to run, I'm sharing these probably minimally interesting results. I tried encoding this paragraph from my comment:
...I wonder how much information there is in those 1024-dimensional embedding vectors. I know you can jam an unlimited amount of data into infinite-precision floating point numbers, but I bet if you add Gaussian noise to them they still decode fine, and the magnitude of noise you can add before performance degrades would allow you to compute how many effective bits there are. (Actually, do people use this technique on la
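In case anyone wants to replicate this, here's roughly the shape of the experiment the quoted paragraph describes, as a sketch only: encode and decode are hypothetical stand-ins for whatever SONAR text-to-embedding and embedding-to-text pipelines one actually uses, exact-match round-tripping is a deliberately crude "still decodes fine" criterion, and the last step is just the standard Gaussian-channel capacity formula (0.5·log2(1 + SNR) bits per dimension).

```python
import numpy as np

# Hypothetical stand-ins for whatever SONAR text->embedding and
# embedding->text pipelines were actually used; not a real API.
def encode(text: str) -> np.ndarray: ...
def decode(vec: np.ndarray) -> str: ...

def effective_bits(text, noise_stds, trials=20, seed=0):
    """Add Gaussian noise of increasing magnitude to the embedding, find the
    largest noise level at which the round trip still recovers the text, and
    convert that to a crude capacity estimate."""
    rng = np.random.default_rng(seed)
    v = encode(text)
    signal_std = v.std()
    tolerated = None
    for sigma in noise_stds:  # e.g. np.logspace(-3, 0, 12) * signal_std
        ok = sum(decode(v + rng.normal(0.0, sigma, v.shape)) == text
                 for _ in range(trials))
        if ok / trials >= 0.9:
            tolerated = sigma
        else:
            break
    if tolerated is None:
        return 0.0  # even the smallest noise level broke decoding
    # Gaussian-channel capacity: 0.5 * log2(1 + SNR) bits per dimension.
    snr = (signal_std / tolerated) ** 2
    return v.size * 0.5 * np.log2(1 + snr)
```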
You appear to have two full copies of the entire post here, one above the other. I wouldn't care (it's pretty easy to recognize this and skip the second copy) except that it totally breaks the way LW does comments on and reactions to specific parts of the text; one has to select a unique text fragment to use those, and with two copies of the entire post, there aren't any unique fragments.
Wow, the SONAR encode-decode performance is shockingly good, and I read the paper and they explicitly stated that their goal was translation, and that the autoencoder objective alone was extremely easy! (But it hurt translation performance, presumably by using a lot of the latent space to encode non-semantic linguistic details, so they heavily downweighted autoencoder loss relative to other objectives when training the final model.)
I wonder how much information there is in those 1024-dimensional embedding vectors. I know you can jam an unlimited amount of ...
Sorry, I think it's entirely possible that this is just me not knowing or understanding some of the background material, but where exactly does this diverge from justifying the AI pursuing a goal of maximizing the inclusive genetic fitness of its creators? Which clearly either isn't what humans actually want (there are things humans can do to make themselves have more descendants that no humans, including the specific ones who could take those actions, want to take, because of godshatter) or is just circular (who knows what will maximize inclusive genetic...
As the person who requested that MIRI release the Sequences as paper books in the first place, I have asked MIRI to release the rest of them, and credibly promised to donate thousands of dollars if they did so. Given the current situation vis-a-vis AI, I'm not that surprised that it still does not appear to be a priority for them, although I am disappointed.
MIRI, if you see this, yet another vote for finishing the series! And my offer still stands!
Thank you for writing this. It has a lot of stuff I haven't seen before (I'm only really interested in neurology insofar as it's the substrate for literally everything I care about, but that's still plenty for "I'd rather have a clue than treat the whole area as spooky stuff that goes bump in the night").
As I understand it, you and many scientists are treating energy consumption by anatomical part of the brain (as proxied by blood flow) as the main way to see "what the brain is doing". It seems possible to me that there are other ways that specific though...
The most recent thing I've seen on the topic is this post from yesterday on debate, which found that debate does basically nothing. In fairness there have also been some nominally-positive studies (which the linked post also mentions), though IMO their setup is more artificial and their effect sizes are not very compelling anyway.
My qualitative impression is that HCH/debate/etc have dropped somewhat in relative excitement as alignment strategies over the past year or so, more so than I expected. People have noticed the unimpressive results to some extent, ...
might put lawyers out of business
This might be even worse than she thought. Many, many contracts include the exact opposite of this clause, i.e., that the section titles are without any effect whatsoever on the actual interpretation of the contract. I never noticed until just now that this is an instance of self-dealing on the part of the attorneys (typically) drafting the contracts! They're literally saying that if they make a drafting error, in a way that makes the contract harder to understand and use and is in no conceivable way an improvem...
I was just reading about this, and apparently subvocalizing refers to small but physically detectable movement of the vocal cords. I don't know whether / how often I do this (I am not at all aware of it). But it is literally impossible for me to read (or write) without hearing the words in my inner ear, and I'm not dyslexic (my spelling is quite good and almost none of what's described in OP sounds familiar, so I doubt it's that I'm just undiagnosed). I thought this was more common than not, so I'm kind of shocked that the reacts on this comment's grandpar...
Leaving an unaligned force (humans, here) in control of 0.001% of resources seems risky. There is a chance that you've underestimated how large the share of resources controlled by the unaligned force is, and probably more importantly, there is a chance that the unaligned force could use its tiny share of resources in some super-effective way that captures a much higher fraction of resources in the future. The actual effect on the economy of the unaligned force, other than the possibility of its being larger than thought or being used as a springboard to g...
Ah, okay, some of those seem to me like they'd change things quite a lot. In particular, a week's notice is usually possible for major plans (going out of town, a birthday or anniversary, concert that night only, etc.) and being able to skip books that don't interest one also removes a major class of reason not to go. The ones I can still see are (1) competing in-town plans, (2) illness or other personal emergency, and (3) just don't feel like going out tonight. (1) is what you're trying to avoid, of course. On (3) I can see your opinion going either way. ...
I started a book club in February 2023 and since the beginning I pushed for the rule that if you don't come, you pay for everyone's drinks next time.
I'm very surprised that it worked in that particular form, because the extremely obvious way to postpone (or, in the end, avoid) the penalty is to not go next time either (or, in the end, ever again). I guess if there's agreement that pretty close to 100% attendance is the norm, as in "if you can only show up 60% of the time, don't bother showing up at all", then it could work. That would make sense for some...
I think this is a very important distinction. I prefer to use "maximizer" for "timelessly" finding the highest value of an objective function, and reserve "optimizer" for the kind of stepwise improvement discussed in this post. As I use the terms, to maximize something is to find the state with the highest value, but to optimize it is to take an initial state and find a new state with a higher value. I recognize that "optimize" and "optimizer" are sometimes used the way you're saying, as basically synonymous with "maximize" / "maximizer", and I could retre...
Good post; this has way more value per minute spent reading and understanding it than the first 6 chapters of Jaynes, IMO.
There were 20 destroyed walls and 37 intact walls, leading to 10 − 3×20 − 1×37 = 13db
This appears to have an error; 10 − 3×20 − 1×37 = 10 - 60 - 37 = -87, not 13. I think you meant for the 37 to be positive, in which case 10 - 60 + 37 = -13, and the sign is reversed because of how you phrased which hypothesis the evidence favors (although you could also just reverse all the signs if you want the arithmetic to come out perfectly).
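Spelling out both readings with the numbers quoted above:

```python
destroyed, intact, prior_db = 20, 37, 10
print(prior_db - 3 * destroyed - 1 * intact)  # -87, as literally written
print(prior_db - 3 * destroyed + 1 * intact)  # -13, with intact walls at +1 db each
```

Flipping every sign gives +13 in favor of the opposite hypothesis, matching the magnitude in the quoted line.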
Al...
The way that LLM tokenization represents numbers is all kinds of stupid. It's honestly kind of amazing to me they don't make even more arithmetic errors. Of course, an LLM can use a calculator just fine, and this is an extremely obvious way to enhance its general intelligence. I believe "give the LLM a calculator" is in fact being used, in some cases, but either the LLM or some shell around it has to decide when to use the calculator and how to use the calculator's result. That apparently didn't happen or didn't work properly in this case.
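To make "the LLM or some shell around it has to decide when to use the calculator" concrete, here's a minimal sketch of the wrapper half; the [calc: ...] tag convention and the example model draft are made up for illustration, not any real product's API.

```python
import re

def fill_in_calculator_calls(model_draft: str) -> str:
    """Replace [calc: expression] tags (a made-up convention) with the
    evaluated result, so the model never has to produce the digits itself."""
    def run(match):
        expr = match.group(1)
        # Only evaluate plain arithmetic; a real tool would use a proper parser.
        if re.fullmatch(r"[\d\s+*/().-]+", expr):
            return str(eval(expr))
        return match.group(0)
    return re.sub(r"\[calc:\s*([^\]]+)\]", run, model_draft)

# A made-up model draft, to show the flow: the wrapper spots the tag,
# runs the calculator, and substitutes the result back into the text.
print(fill_in_calculator_calls("The total is [calc: 1793 * 212] units."))
# -> "The total is 380116 units."
```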
Thanks for your reply. "70% confidence that... we have a shot" is slightly ambiguous - I'd say that most shots one has are missed, but I'm guessing that isn't what you meant, and that you instead meant 70% chance of success.
70% feels way too high to me, but I do find it quite plausible that calling it a rounding error is wrong. However, with a 20 year timeline, a lot of people I care about will almost definitely still die, who could have not died if death were Solved, which group with very much not negligible probability includes myself. And as you note do...
P.S. Having this set of values and beliefs is very hard on one's epistemics. I think it's a writ-large version of what Eliezer has stated as "thinking about AI timelines is bad for one's epistemics". Here are some examples:
(1) Although I've never been at all tempted by e/acc techno-optimism (on this topic specifically) / alignment isn't a problem at all / alignment by default, boy, it sure would be nice to hear about a strategy for alignment that didn't sound almost definitely doomed for one reason or another. Even though Eliezer can (accurately, IMO) sh...
I agree with the Statement. As strongly as I can agree with anything. I think the hope of current humans achieving... if not immortality, then very substantially increased longevity... without AI doing the work for us, is at most a rounding error. And ASI that was even close to aligned, that found it worth reserving even a billionth part of the value of the universe for humans, would treat this as the obviously most urgent problem, and would pretty much solve death if there's any physically possible way of doing so. And when I look inside, I find that I simply don...
Yes he should disclose somewhere that he's doing this, but deepfakes with the happy participation of the person whose voice is being faked seems like the best possible scenario.
Yes and no. The main mode of harm we generally imagine is to the person deepfaked. However, nothing prevents the main harm in a particular incident of harmful deepfaking from being to the people who see the deep fake and believe the person depicted actually said and did the things depicted.
That appears to be the implicit allegation here - that recipients might be deceived into th...
I've seen a lot of attempts to provide "translations" from one domain-specific computer language to another, and they almost always have at least one of these properties:
Malbolge? Or something even nastier in a similar vein, since it seems like people actually figured out (with great effort) how to write programs in Malbolge. Maybe encrypt all the memory after every instruction, and use a real encryption algorithm, not a lookup table.
Some points which I think support the plausibility of this scenario:
(1) EY's ideas about a "simple core of intelligence", how chimp brains don't seem to have major architectural differences from human brains, etc.
(2) RWKV vs Transformers. Why haven't Transformers been straight up replaced by RWKV at this point? Looks to me like potentially huge efficiency gains being basically ignored because lab researchers can get away with it. Granted, it affects efficiency of inference but not training AFAIK, and maybe it wouldn't work at the 100B+ scale, but it certainly...
I certainly don't think labs will only try to improve algorithms if they can't scale compute! Rather, I think that the algorithmic improvements that will be found by researchers trying to figure out how to improve performance given twice as much compute as the last run won't be the same ones found by researchers trying to improve performance given no increase in compute.
One would actually expect the low hanging fruit in the compute-no-longer-growing regime to be specifically the techniques that don't scale, since after all, scaling well is an existing cons...
Slowing compute growth could lead to a greater focus on efficiency. Easy to find gains in efficiency will be found anyway, but harder to find gains in efficiency currently don't seem to me to be getting that much effort, relative to ways to derive some benefit from rapidly increasing amounts of compute.
If models on the capabilities frontier are currently not very efficient, because their creators are focused on getting any benefit at all from the most compute that is practically available to them now, restricting compute could trigger an existing "efficie...
I can actually sort of write the elevator pitch myself. (If not, I probably wouldn't be interested.) If anything I say here is wrong, someone please correct me!
Non-realizability is the problem that none of the options a real-world Bayesian reasoner is considering is a perfect model of the world. (It actually information-theoretically can't be, if the reasoner is itself part of the world, since it would need a perfect self-model as part of its perfect world-model, which would mean it could take its own output as an input into its decision process, but th...
Let's say that I can understand neither the original IB sequence, nor your distillation. I don't have the prerequisites. (I mean, I know some linear algebra - that's hard to avoid - but I find topology loses me past "here's what an open set is" and I know nothing about measure theory.)
I think I understand what non-realizability is and why something like IB would solve it. Is all the heavy math actually necessary to understand how IB does so? I'm very tempted to think of IB as "instead of a single probability distribution over outcomes, you just keep a (c...
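To illustrate the toy picture in my head (just "keep several candidate distributions and act on the worst case"), here's a minimal sketch. The distributions, actions, and payoffs are invented for illustration, and this is emphatically not the actual infra-Bayesian machinery, which as I understand it uses convex sets of sub-probability measures and a lot more structure:

```python
# Toy setup: two candidate distributions over outcomes that the agent
# refuses to pick between, and two actions.
outcomes = ["good", "bad"]
candidate_dists = [
    {"good": 0.9, "bad": 0.1},
    {"good": 0.4, "bad": 0.6},
]
utility = {
    ("safe", "good"): 1.0, ("safe", "bad"): 1.0,    # payoff independent of environment
    ("risky", "good"): 3.0, ("risky", "bad"): -5.0,
}

def worst_case_value(action):
    # Expected utility under each candidate distribution, then take the minimum.
    return min(sum(p[o] * utility[(action, o)] for o in outcomes)
               for p in candidate_dists)

best = max(["safe", "risky"], key=worst_case_value)
print(best, worst_case_value(best))
# -> safe 1.0   (risky's worst case is 0.4*3 + 0.6*(-5) = -1.8)
```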
I was wondering if anyone would mention that story in the comments. I definitely agree that it has very strong similarities in its core idea, and wondered if that was deliberate. I don't agree with any implications (which you may or may not have intended) that it's so derivative as to make not mentioning Omelas dishonest, though, and independent invention seems completely plausible to me.
Edited to add: although the similar title does point rather strongly to Omelas being an acknowledged source.
It seems like there might be a problem with this argument if the true values are not just unknown, but adversarially chosen. For example, suppose the true values are the actual locations of a bunch of landmines, drawn from a full set of possible landmine positions. We are trying to get a vehicle from A to B, and all possible paths go over some of those possible positions. We may know that the opponent placing the landmines only has a limited number of landmines to place. Furthermore, suppose each landmine only goes off with some probability even if the vehicle drives over it. If we can mechanist...
I like this frame, and I don't recall seeing it already addressed.
What I have seen written about deceptiveness generally seems to assume that the AGI would be sufficiently capable of obfuscating its thoughts from direct queries and from any interpretability tools we have available that it could effectively make its plans for world domination in secret, unobserved by humans. That does seem like an even more effective strategy for optimizing its actual utility function than not bothering to think through such plans at all, if it's able to do it. But it's ha...
Hmm. My intuition says that your A and B are "pretty much the same size". Sure, there are infinitely many times that they switch places, but they do so about as regularly as possible and they're always close.
If A is "numbers with an odd number of digits" and B is "numbers with an even number of digits" that intuition starts to break down, though. Not only do they switch places infinitely often, but the extent to which one exceeds the other is unbounded. Calling A and B "pretty much the same size" starts to seem untenable; it feels more like "the concept of...
The obvious way to quickly and intuitively illustrate whether reactions are positive or negative would seem to be color; another option would be grouping them horizontally or vertically with some kind of separator. The obvious way to quickly and intuitively make it visible which reactions were had by more readers would seem to be showing a copy of the same icon for each person who reacted a certain way, not a number next to the icon.
I make no claim that either of these changes would be improvements overall. Clearly the second would require a way to handl...
In the current UI, the list of reactions from which to choose is scrollable, but that's basically impossible to actually see. While reading the comments I was wondering what the heck people were talking about with "Strawman" and so forth. (Like... did that already get removed?) Then I discovered the scrolling by accident after seeing a "Shrug" reaction to one of the comments.
I've had similar thoughts. Two counterpoints:
This is basically misuse risk, which is not a weird problem that people need to be convinced even needs solving. To the extent AI appears likely to be powerful, society at large is already working on this. Of course, its efforts may be ineffective or even counterproductive.
They say power corrupts, but I'd say power opens up space to do what you were already inclined to do without constraints. Some billionaires, e.g. Bill Gates, seem to be sincerely trying to use their resources to help people. It isn't har
On SBF, I think a large part of the issue is that he was working in an industry called cryptocurrency that basically has fraud as the bedrock of it all. There was nothing real about crypto, so the collapse of FTX was basically inevitable.
I don't deny that the cryptocurrency "industry" has been a huge magnet for fraud, nor that there are structural reasons for that, but "there was nothing real about crypto" is plainly false. The desire to have currencies that can't easily be controlled, manipulated, or implicitly taxed (seigniorage, inflation) by gove...
Thank you for writing these! They've been practically my only source of "news" for most of the time you've been writing them, and before that I mostly just ignored "news" entirely because I found it too toxic and it was too difficult+distasteful to attempt to decode it into something useful. COVID the disease hasn't directly had a huge effect on my life, and COVID the social phenomenon has been on a significant decline for some time now, but your writing about it (and the inclusion of especially notable non-COVID topics) have easily kept me interested enou...
I don't know how far a model trained explicitly on only terminal output could go, but it makes sense that it might be a lot farther than a model trained on all the text on the internet (some small fraction of which happens to be terminal output). Although I also would have thought GPT's architecture, with a fixed context window and a fixed number of layers and tokenization that isn't at all optimized for the task, would pay large efficiency penalties at terminal emulation and would be far less impressive at it than it is at other tasks.
Assuming it does work, could we get a self-operating terminal by training another GPT to roleplay the entering commands part? Probably. I'm not sure we should though...
Thanks! This is much more what I expected. Things that look generally like outputs that commands might produce, and with some mind-blowing correct outputs (e.g. the effect of tr on the source code) but also some wrong outputs (e.g. the section after echo A >a; echo X >b; echo T >c; echo H >d; the output being consistent between cat a a c b d d and cat a a c b d d | sort (but inconsistent with the "actual contents" of the files) is especially the kind of error I'd expect an LLM to make).
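For reference, here's the ground truth those commands should have produced, simulated in Python rather than an actual shell:

```python
files = {"a": "A", "b": "X", "c": "T", "d": "H"}  # echo A >a; echo X >b; echo T >c; echo H >d

def cat(names):
    return [files[n] for n in names]

lines = cat(["a", "a", "c", "b", "d", "d"])
print("\n".join(lines))          # A A T X H H  (cat a a c b d d)
print("\n".join(sorted(lines)))  # A A H H T X  (cat a a c b d d | sort)
```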
Got it. This post also doesn't appear to actually be part of that sequence though? I would have noticed if it was and looked at the sequence page.
EDIT: Oh, I guess it's not your sequence.
EDIT2: If you just included "Alignment Stream of Thought" as part of the link text in your intro where you do already link to the sequence, that would work.
ASoT
What do you mean by this acronym? I'm not aware of its being in use on LW, you don't define it, and to me it very definitely (capitalization and all) means Armin van Buuren's weekly radio show A State of Trance.
If only relative frequency of genes matters, then the overall size of the gene pool doesn't matter. If the overall size of the gene pool doesn't matter, then it doesn't matter if that size is zero. If the size of the gene pool is zero, then whatever was included in that gene pool is extinct.
Yes, it's true people make all kinds of incorrect inferences because they think genes that increase the size of the gene pool will be selected for or those that decrease it will be selected against. But it's still also true that a gene that reduces the size of the po...
I mean, just lag, yes, but there are also plain old incorrect readings. But yes, it would be cool to have a system that incorporated glucagon. Though, diabetics' bodies still produce glucagon AFAIK, so it'd really be better to just have something that senses glucose and releases insulin the same way a working pancreas would.
Context: I am a type 1 diabetic. I have a CGM, but for various reasons use multiple daily injections rather than an insulin pump; however, I'm familiar with how insulin pumps work.
A major problem with a closed-loop CGM-pump system is data quality from the CGM. My CGM (Dexcom G6) has ~15 minutes of lag (because it reads interstitial fluid, not blood). This is the first generation of Dexcom that doesn't require calibrations from fingersticks, but I've occasionally had CGM readings that felt way off and needed to calibrate anyway. Accuracy and noisiness v...
Looks like an anti-football (*American* football, that is) thing, to me. American football doesn't have goals, and soccer (which is known as "football" in most of the world) does. And you mentioned earlier that the baseball neuron is also anti-football.