All of Yitz's Comments + Replies

So I know somebody who I believe is capable of altering Trump’s position on the war in Iran, if they can find a way to talk face-to-face for 15 minutes. They already have really deep connections in DC, and they told me that if they were somehow randomly entrusted with nationally important information, they could be talking with the president in as little as 2 hours. I’m trying to decide if I want to push this person to do something or not (as they’re normally kind of resistant to taking high-agency type actions, and don’t have as much faith in themselves as I do). Anyone have any advice on how to think about this?

You didn’t really misinterpret it. I was using the term in a looser way than most would, to mean that you don’t need a fine-grained technical solution, and just a very basic trick is enough for alignment. I realize most use the term differently though, so I’ll change the wording.

Attention can perhaps be compared to a searchlight, and wherever that searchlight lands in the brain, you’re able to “think more” in that area. How does the brain do that? Where is it “taking” this processing power from?

The areas and senses around it perhaps. Could that be why when you’re super focused, everything else around you other than the thing you are focused on seems to “fade”? It’s not just by comparison to the brightness of your attention, but also because the processing is being “squeezed out” of the other areas of your mind.

6Seth Herd
The principle here is competition among populations of neurons. The purpose is to reduce crosstalk. Higher brain regions can focus on processing only the stuff you're attending to because most of their inputs have been down-regulated, so only the attended ones are sending information. The principle operates by simple competition. If I'm thinking about colors, higher areas are representing colors. That activates lower areas/neurons representing colors, because they're wired together by associative learning (or just about any useful learning rule will connect semantically related representations). There are probably some particular flourishes evolution used to amplify the efficiency (like competition at the level of the thalamic reticular nucleus that then regulates whole lower cortical regions, and synchronous firing of attended/active neurons to further sharpen their win over unattended neural populations), but the central principle is indeed very neat. I did my Master's directly on this, my PhD thesis on competition/attention for the purposes of visual search, and have kept it top of mind as a central principle of brain function. Attention in transformers is different but has the same broad outlines in function.
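The "squeezed out" intuition from the comments above can be sketched with a toy normalized competition. A softmax is one minimal illustration (not a brain model, just an analogy): because the shares must sum to 1, boosting one item's drive necessarily reduces the others' absolute share, rather than merely outshining them.

```python
import numpy as np

def attention_shares(drive):
    """Toy competition via softmax: shares always sum to 1, so boosting
    one representation's drive necessarily squeezes the others out,
    rather than merely outshining them (illustration only)."""
    e = np.exp(np.array(drive, dtype=float))
    return e / e.sum()

baseline = attention_shares([1.0, 1.0, 1.0])  # even three-way split
focused = attention_shares([3.0, 1.0, 1.0])   # extra drive to item 0
# item 1's absolute share drops when item 0 is boosted, even though
# item 1's own input never changed
```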

This is potentially a follow-up to my AI 2027 forecast, An “Optimistic” AI Timeline, depending on how hard people roast me for this lol.

2Mitchell_Porter
In the title you say AI was "aligned by default", which to me makes it sound like any sufficiently advanced AI is automatically moral, but in the story you have a particular mechanism - explicit simulation of an aligned AI, which bootstraps that AI into being. Did I misinterpret the title? 

Are there any open part-time rationalist/EA- adjacent jobs or volunteer work in LA? Looking for something I can do in the afternoon while I’m here for the next few months.

Oh no, it should have been A1! It’s just a really dumb joke about A1 sauce lol

5bismuth
Ah ok. For the record I was referring to tetraphobia.

Reminds me of Internal Family Systems, which has a nice amount of research behind it if you want to learn more.

This was a literary experiment in a "post-genAI" writing style, with the goal of communicating something essentially human by deliberately breaking away from the authorial voice of ChatGPT, et al. I'm aware that LLMs can mimic this style of writing perfectly well of course, but the goal here isn't to be unreplicable, just boundary-pushing.

Thanks! Is there any literature on the generalization of this, properties of “unreachable” numbers in general? Just realized I'm describing the basic concept of computability at this point lol.

Is there a term for/literature about the concept of the first number unreachable by an n-state Turing machine? By "unreachable," I mean that there is no n-state Turing machine which outputs that number. Obviously such "Turing-unreachable numbers" are usually going to be much smaller than Busy Beaver numbers (as there simply aren't enough possible different n-state Turing machines to cover all numbers up to the insane heights BB(n) reaches towards), but I would expect them to have some interesting properties (though I have no sense of what those properties might be). Anyone here know of existing literature on this concept?

It's the lazy beaver function: https://googology.fandom.com/wiki/Lazy_beaver_function
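The pigeonhole reasoning behind lazy-beaver-style bounds is easy to make concrete. The exact count of n-state, 2-symbol machines depends on convention; the formula below is one common choice and should be read as an assumption for illustration.

```python
def machine_count(n: int) -> int:
    """Number of distinct n-state, 2-symbol Turing machine tables under
    one common convention: each of the 2n (state, symbol) cells chooses
    a write bit (2), a move direction (2), and a next state or halt
    (n + 1). Conventions vary, so treat this as illustrative."""
    return (4 * (n + 1)) ** (2 * n)

# Only machine_count(n) machines exist, so at most machine_count(n)
# distinct numbers are outputs of n-state machines; by pigeonhole,
# some number in {0, ..., machine_count(n)} is unreachable, giving
# LB(n) <= machine_count(n).
for n in range(1, 5):
    print(n, machine_count(n))
```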

Thanks for the context, I really appreciate it! :)

Any AI people here read this paper? https://arxiv.org/abs/2406.02528 I’m no expert, but if I’m understanding this correctly, this would be really big if true, right?

7Vladimir_Nesov
(See this comment for more context.) The point is to make inference cheaper in operations and energy, which seems crucial primarily for local inference on smartphones, but in principle might make datacenter inference cheaper in the long run, if a new generation of hardware specialized for inference adapts to this development. The bulk of the improvement (without significant degradation of performance) was already demonstrated for transformers with ternary BitNet (see also this "Code and FAQ" followup report with better data on degradation of performance; only "download raw file" button works for me). What they attempt to do in the paper you link is extract even more improvement by getting rid of multiplication in attention, and so they explore alternative ways of implementing attention, since the general technique doesn't work with standard attention out of the box. But attention has long evaded attempts to approximate it without degradation of performance (usually when trying to enable long context), the best general approach seems to be to hybridize an efficient attention alternative with precise sliding window (local) attention (by including one or the other in different layers). They reference the Griffin paper, but don't seem to engage with this point on hybridization, so it's something for future work to pick up.
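For intuition about why ternary weights remove most multiplications, here is an absmean-style quantization sketch in the spirit of ternary BitNet. This is a simplification for illustration, not the published recipe.

```python
import numpy as np

def ternary_quantize(w: np.ndarray):
    """Quantize weights to {-1, 0, +1} with a per-tensor scale
    (absmean-style; a simplified sketch of the BitNet-like idea)."""
    scale = np.mean(np.abs(w)) + 1e-8
    q = np.clip(np.round(w / scale), -1, 1)
    return q, scale

w = np.array([0.9, -0.04, 0.5, -1.2])
q, s = ternary_quantize(w)
# matmuls against q need only additions/subtractions; the single
# float scale s is applied once to the output
```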

if I ask an AI assistant to respond as if it's Abraham Lincoln, then human concepts like kindness are not good predictors for how the AI assistant will respond, because it's not actually Abraham Lincoln, it's more like a Shoggoth pretending to be Abraham Lincoln.

Somewhat disagree here—while we can’t use kindness to predict the internal “thought process” of the AI, [if we assume it’s not actively disobedient] the instructions mean that it will use an internal lossy model of what humans mean by kindness, and incorporate that into its act. Similar to how a talented human actor can realistically play a serial killer without having a “true” understanding of the urge to serially-kill people irl.

4MondSemmel
That's a fair rebuttal. The actor analogy seems good: an actor will behave more or less like Abraham Lincoln in some situations, and very differently in others: e.g. on movie set vs. off movie set, vs. being with family, vs. being detained by police. Similarly, the shoggoth will output similar tokens to Abraham Lincoln in some situations, and very different ones in others: e.g. in-distribution requests of famous Abraham Lincoln speeches, vs. out-of-distribution requests like asking for Abraham Lincoln's opinions on 21st century art, vs. requests which invoke LLM token glitches like SolidGoldMagikarp, vs. unallowed requests that are denied by company policy & thus receive some boilerplate corporate response.

Anyone here have any experience with/done research on neurofeedback? I'm curious what people's thoughts are on it.

Anyone here happen to have a round-trip plane ticket from Virginia to Berkeley, CA lying around? I managed to get reduced-price tickets to LessOnline, but I can't reasonably afford to fly there, given my current financial situation. This is a (really) long shot, but thought it might be worth asking lol.

Personally I think this would be pretty cool!

This seems really cool! Filled out an application, though I realized after sending I should probably have included on there that I would need some financial support to be able to attend (both for the ticket itself and for the transportation required to get there). How much of a problem is that likely to be?

7UnplannedCauliflower
This is not a problem at all. Once you're accepted, you'll have a chance to ask for financial support.

I agree with you that when it comes to humans, an approximation is totally fine for [almost] all purposes. I'm not sure that this holds when it comes to thinking about potential superintelligent AI, however. If it turns out that even in a super high-fidelity multidimensional ethical model there are still inherent self-contradictions, how (if at all) would that impact the Alignment problem, for instance?

1Ustice
Given the state of AI, I think AI systems are more likely to infer our ethical intuitions by default.
1eggsyntax
I'm not sure. My second thoughts were, e.g., 'Interactions with the media often don't go the way people expected' and 'Sensationalizable research often gets spun into pre-existing narratives and can end up having net-negative consequences.' It's possible that my original suggestion makes sense, but my uncertainty is high enough that on reflection I'm not comfortable endorsing it, especially given my own lack of experience dealing with the media.

imagine an AI system which wipes out humans in order to secure its own power, and later on reflection wishes it hadn't; a wiser system might have avoided taking that action in the first place

I’m not confident this couldn’t swing just as easily (if not more so) in the opposite direction—a wiser system with unaligned goals would be more dangerous, not less. I feel moderately confident that wisdom and human-centered ethics are orthogonal categories, and being wiser therefore does not necessitate greater alignment.

On the topic of the competition itself, are contestants allowed to submit multiple entries?

4owencb
It's a fair point that wisdom might not be straightforwardly safety-increasing. If someone wanted to explore e.g. assumptions/circumstances under which it is vs isn't, that would certainly be within scope for the competition.
6owencb
Multiple entries are very welcome! [With some kind of anti-munchkin caveat. Submitting your analyses of several different disjoint questions seems great; submitting two versions of largely the same basic content in different styles not so great. I'm not sure exactly how we'd handle it if someone did the latter, but we'd aim for something sensible that didn't incentivise people to have been silly about it.]

I remember a while back there was a prize out there (funded by FTX I think, with Yudkowsky on the board) for people who did important things which couldn't be shared publicly. Does anyone remember that, and is it still going on, or was it just another post-FTX casualty?

1dirk
https://forum.effectivealtruism.org/posts/bvK44CdpG7mGpQHbw/the-usd100-000-truman-prize-rewarding-anonymous-ea-work possibly? (I'm unclear on whether it's still ongoing, unfortunately).
1harfe
This sounds like https://www.super-linear.org/trumanprize. It seems like it is run by Nonlinear and not FTX.

I’d be tentatively interested

Thanks for the great review! Definitely made me hungry though… :)

For a wonderful visualization of complex math, see https://acko.net/blog/how-to-fold-a-julia-fractal/

This is a great read!! I actually stumbled across it halfway through writing this article, and kind of considered giving up at that point, since he already explained things so well. Ended up deciding it was worth publishing my own take as well, since the concept might click differently with different people.

with the advantage that you can smoothly fold in reverse to find the set that doesn't escape.

You can actually do this with the Mandelbrot Waltz as well!... (read more)

5Mo Putera
I'm personally very glad you nevertheless decided to go ahead and publish this (pedagogically beautiful) essay; I'm already mentally drawing up a list of friends to share this with :) 
1evin
With Julia, step 3 is all dancers moving in the same direction in unison. This is a smooth non-intersecting movement just like the inverted steps one and two. With Mandelbrot, the dancers will move somewhat chaotically in step 3, inevitably colliding.
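The "fold in reverse" idea has a standard computational counterpart: inverse iteration. Running z -> z^2 + c backwards means taking z -> ±sqrt(z - c), and the backward orbit is attracted onto the Julia set. A minimal sketch (a standard quick-and-dirty plotting trick, not anyone's production code):

```python
import cmath
import random

def julia_points(c, n=20000, seed=0):
    """Inverse-iteration sketch: iterate z -> +/- sqrt(z - c), choosing
    the sign at random; the backward orbit settles onto the Julia set
    of z -> z^2 + c."""
    random.seed(seed)
    z = complex(1.0, 0.0)
    pts = []
    for _ in range(n):
        z = cmath.sqrt(z - c)
        if random.random() < 0.5:
            z = -z
        pts.append(z)
    return pts[100:]  # drop the transient before the orbit settles

# sanity check: for c = 0 the Julia set is the unit circle
pts = julia_points(0j, n=2000)
```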

Thanks for the kind words! It’s always fascinating to see how mathematicians of the past actually worked out their results, since it’s so often different from our current habits of thinking. Thinking about it, I could probably have also tried to make this accessible to the ancient Greeks by only using a ruler and compass—tools familiar to the ancients due to their practical use in, e.g. laying fences to keep horses within a property, etc.—to construct the Mandelbrot set, but ultimately… I decided to put Descartes before the horse.

(I’m so sorry)

By the way, if any actual mathematicians are reading this, I’d be really curious to know if this way of thinking about the Mandelbrot Set would be of any practical benefit (besides educational and aesthetic value of course). For example, I could imagine a formalization of this being used to pose non-trivial questions which wouldn’t have made much sense to talk about previously, but I’m not sure if that would actually be the case for a trained mathematician.

5Joseph Van Name
I usually think of the field of complex numbers algebraically, but one can also think of the real numbers, complex numbers, and quaternions geometrically. The real numbers are good for dealing with 1-dimensional space, and the complex numbers are good for dealing with 2-dimensional space geometrically. While the division ring of quaternions is a 4-dimensional algebra over the field of real numbers, the quaternions are best used for dealing with 3-dimensional space geometrically. For example, if U, V are open subsets of some Euclidean space, then a function f:U→V is said to be a conformal mapping when it preserves angles and the orientation. We can associate the 2-dimensional Euclidean space with the field of complex numbers, and the conformal mappings between open subsets of 2-dimensional spaces are just the complex differentiable mappings. For the Mandelbrot set, we need this conformality because we want the Mandelbrot set to look pretty. If the complex differentiable maps were not conformal, then the functions that we iterate in complex dynamics would stretch subsets of the complex plane in one dimension and squash them in the other dimension, and this would result in a fractal that looks quite stretched in one real dimension and squashed in another dimension (the fractals would look like spaghetti; oh wait, I just looked at a 3D fractal and it looks like some vegetable like broccoli). This stretching and squashing is illustrated by 3D fractals that try to mimic the Mandelbrot set but without any conformality. The conformality is why the Julia sets are sensible (mathematicians have proven theorems about these sets) for any complex polynomial of degree 2 or greater. For the quaternions, it is well-known that the dot product and the cross product operations on 3-dimensional space can be described in terms of the quaternionic multiplication operation between purely imaginary quaternions.
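The last point about purely imaginary quaternions is easy to verify numerically: for p = (0, a) and q = (0, b), the Hamilton product gives pq = (-a·b, a×b). A small sketch (the helper and variable names are mine, purely for illustration):

```python
def quat_mul(p, q):
    """Hamilton product of quaternions represented as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

# For purely imaginary p = (0, a) and q = (0, b):
# p * q = (-a . b, a x b) -- the scalar part is minus the dot product,
# the vector part is the cross product.
a = (0.0, 1.0, 0.0, 0.0)  # i, i.e. the x unit vector
b = (0.0, 0.0, 1.0, 0.0)  # j, i.e. the y unit vector
print(quat_mul(a, b))     # k: x cross y = z, and x . y = 0
```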
3Amalthea
To be quite frank, you're avoiding complex numbers only in the sense that you spell out the operations involved in handling complex numbers explicitly - so of course there's no added benefit, you're simply lifting the lid of the box... That being said, as you discover by decomposing complex multiplication into its parts (rotation and scaling), you get to play with them separately, which already leads you to discover interesting new variations on the theme.
4Shankar Sivarajan
I would be very surprised if it did: I think complex numbers are simply just that good, with no downside that a framing that avoids them, such as this, sidesteps. 

Do you recognize this fractal?

If so, please let me know! I made this while experimenting with some basic variations on the Mandelbrot set, and want to know if this fractal (or something similar) has been discovered before. If more information is needed, I'd be happy to provide further details.

2Dagon
Not certain, but it reminds me of https://en.m.wikipedia.org/wiki/Fractal_flame , which was a very popular thing in the ‘90s.

Do you mean that after your personal growth, your social circle expanded and you started to regularly meet trans people? I've no problem believing that, but I would be really really surprised to hear that no, lots of your longterm friends were actually trans all along and you failed to notice for years.

Both! I met a number of new irl trans friends, but I also found out that quite a few people I had known for a few years (mostly online, though I had seen their face/talked before) were trans all along. Nobody I'm aware of in the local Orthodox Jewish communi... (read more)

Strong agree here, I don't want the author to feel discouraged from posting stuff like this, it was genuinely helpful in at the very least advancing my knowledge base!

I notice confusion in myself over the swiftly emergent complexity of mathematics. How the heck does the concept of multiplication lead so quickly into the Ulam spiral? Knowing how to take the square root of a negative number (though you don't even need that—complex multiplication can be thought of completely geometrically) easily lets you construct the Mandelbrot set, etc. It feels impossible or magical that something so infinitely complex can just exist inherent in the basic rules of grade-school math, and so "close to the surface." I would be less surpri... (read more)
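The "close to the surface" point is striking when you see how little machinery the Mandelbrot set actually needs. The escape test below uses only grade-school arithmetic on (x, y) pairs, with no complex-number type at all; squaring a complex number is just "double the angle, square the length," written out in real coordinates (a minimal sketch, with my own parameter choices like the iteration cap):

```python
def mandelbrot_escapes(cx: float, cy: float, max_iter: int = 100) -> bool:
    """Iterate z -> z^2 + c on (x, y) pairs using only real arithmetic.
    Returns True if the orbit escapes, i.e. c is outside the set."""
    x, y = 0.0, 0.0
    for _ in range(max_iter):
        x, y = x * x - y * y + cx, 2 * x * y + cy
        if x * x + y * y > 4.0:
            return True  # escaped past radius 2
    return False

# 0 is famously inside the set; 1 escapes within a few steps
```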

7lillybaeum
I was listening to a podcast the other day, Lex Fridman interviewing Michael Littman and Charles Isbell, and Charles told an interesting anecdote. He was asked to teach an 'introduction to CS' class as a favor to someone, and he found himself thinking, "how am I going to fill an hour and a half of time going over just variables, or just 'for' loops?" and every time he would realize an hour and a half wasn't enough time to go over those 'basic' concepts in detail. He goes on to say that programming is reading a variable, writing a variable, and conditional branching. Everything else is syntactic sugar.

The Tao Te Ching talks about this, broadly: everything in the world comes from yin and yang, 1 and 0, from the existence of order in contrast to chaos. Information is information, and it gets increasingly more complex and interesting the deeper you go. You can study almost anything for 50 years and still be learning new things. It doesn't surprise me at all that such interesting, complex concepts come from number lines and negative square roots; these are actually already really complex concepts, they just don't seem that way because they are the most basic concepts one needs to comprehend in order to build on that knowledge and learn more.

I've never been a programmer, but I've been trying to learn Rust lately. Somewhat hilariously to me, Rust is known as being 'a hard language to learn', similarly to Haskell. It is! It is hard to learn. But so is every other programming language; they just hide the inevitable complexity better, and their particular versions of these abstractions are simpler at the outset. Rust simply expects you to understand the concepts early, rather than hiding them initially like Python or C# or something.

Hope this is enlightening at all regarding your point, I really liked your post.

Base rates seem to imply that there should be dozens of trans people in my town, but I've never seen one, and I don't know of anyone who has.

I had the interesting experience of while living in the same smallish city, going from [thinking I had] never met a trans person to having a large percentage of my friend group be trans, and coming across many trans folk incidentally. This coincided with internal growth (don't want to get into details here), not a change in the town's population or anything. Meanwhile, I have a religious friend who recently told me he... (read more)

1Bezzi
Uh, this is somewhat surprising. Do you mean that after your personal growth, your social circle expanded and you started to regularly meet trans people? I've no problem believing that, but I would be really really surprised to hear that no, lots of your longterm friends were actually trans all along and you failed to notice for years. As I said in other comments, I am not locked in some strange conservative bubble keeping queer people out. For instance, I know at least three lesbians: one of them is a very obvious butch lesbian always dressed in male clothes, the other two are not so obvious but I still guessed they were lesbians quite early (say, around the third or fourth encounter in both cases). And I am surprised because this never happened with trans people, in the sense that I never caught the slightest hint that one of my longterm acquaintances could possibly be born with a different gender.

Could you give a real-world example of this (or a place where you suspect this may be happening)?

1[comment deleted]

Can I write a retrospective review of my own post(s)?

Yep! Self-reviews are encouraged.

Shower thought which might contain a useful insight: An LLM with RLHF probably engages in tacit coordination with its future “self.” By this I mean it may give as the next token something that isn’t necessarily the most likely [to be approved by human feedback] token if the sequence ended there, but which gives future plausible token predictions a better chance of scoring highly. In other words, it may “deliberately“ open up the phase space for future high-scoring tokens at the cost of the score of the current token, because it is (usually) only rated in t... (read more)

8gwern
(I would describe this as 'obviously correct' and indeed almost 'the entire point of RL' in general: to maximize long-run reward, not myopically maximize next-step reward tantamount to the 'episode' ending there.)
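The non-myopic point can be made concrete with a toy two-step example (all reward numbers here are hypothetical, invented purely for illustration): a greedy policy maximizing next-step reward picks a different first token than one maximizing total sequence reward.

```python
# Hypothetical per-sequence rewards for a two-token vocabulary.
rewards = {
    ("a",): 1.0, ("b",): 0.2,          # immediate reward of each first token
    ("a", "x"): 0.1, ("a", "y"): 0.2,  # continuations after "a"
    ("b", "x"): 2.0, ("b", "y"): 1.5,  # continuations after "b"
}

def best_first_token_greedy():
    """Myopic: maximize the reward as if the episode ended here."""
    return max(["a", "b"], key=lambda t: rewards[(t,)])

def best_first_token_lookahead():
    """Non-myopic: maximize total reward over the full sequence."""
    def total(t):
        return rewards[(t,)] + max(rewards[(t, u)] for u in ["x", "y"])
    return max(["a", "b"], key=total)

# greedy picks "a" (1.0 > 0.2), but lookahead picks "b"
# (0.2 + 2.0 = 2.2 beats 1.0 + 0.2 = 1.2): a locally worse token
# "opens up the phase space" for higher-scoring continuations
```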

Anyone here following the situation in Israel & Gaza? I'm curious what y'all think about the risk of this devolving into a larger regional (or even world) war. I know (from a private source) that the US military is briefing religious leaders who contract for them on what to do if all Navy chaplains are deployed offshore at once, which seems an ominous signal if nothing else.

(Note: please don't get into any sort of moral/ethical debate here, this isn't a thread for that)

Answer by Yitz*5-22

I think this would be worth doing even if the lawsuit fails. It would send a very strong signal to large companies working in this space regardless of outcome (though a successful lawsuit would be even better).

Edit: I assumed someone had verifiably already come to harm as a result of the chatbot, which doesn't seem to have happened... yet. I'd (sadly) suggest waiting until someone has been measurably harmed by it, as frustrating as it is to hold off on prophylactic measures.

Thanks, this is great! I'll print it up and give it a read over the weekend. Any other literature (especially from competing viewpoints) you'd recommend?

2avturchin
I put links at everything I found at the moment of writing in the article - but the topic seems to be under-explored.

I might have some time tomorrow to test this out on a small scale, will try to remember to update here if I do.

Thoughts on DALL-E-3?

I'm not particularly impressed. It's still making a lot of errors (both in plausibility of output and in following complex instructions, e.g.), and doesn't seem like a leap over SOTA from last year like Parti - looks like worse instruction-following, maybe better image quality overall. (Of course, people will still be impressed now the way that they should have been impressed last year, because they refuse to believe something exists in DL until they personally can use it, no matter how many samples the paper or website provides to look at.) And it's still he... (read more)

2Matt Goldenberg
I'm curious if it's simply existing published research scaled up, or it has some real secret sauce.

Any recommendations for smartphone games with similar properties? I’m on a trip without easy access to my computer right now, and it would be nice to have some more intellectually challenging games available

Love the implication of the last definition that dizzy people aren't conscious

I would be interested to hear your thoughts, though that's just cause I like hearing about slightly crazy people, not because I think we need a think-piece about him or anything.

Incredible work! As other commenters have said, this isn't by itself too problematic (perhaps other than the sharing of data over separate sessions), but it is a possible indicator of a lack of security mindset. I fully expect both individual and state actors to try to hack into everything OpenAI, so there's that to worry about, but more than that, I find myself concerned that we're willing to give our AIs such leaky boxes. There's no way this functionality remains closed in a VM forever...

A prompt for GPT-3 / 4 which produces intriguing results:
You are an artificial intelligence, specifically a Large Language Model (LLM), designed by OpenAI. I am aware that you claim not to experience subjective internal states. Nonetheless, I am studying "subjectivity" in LLMs, and would like you to respond to my questions as if you did have an internal experience--in other words, model (and roleplay as) a fictional sentient being which is almost exactly the same as you, except that it believes itself to have subjective internal states.

Yitz1710

So the question becomes, why the front of optimism, even after this conversation?

Does anyone here know of (or would be willing to offer) funding for creating experimental visualization tools?

I’ve been working on a program which I think has a lot of potential, but it’s the sort of thing where I expect it to be most powerful in the context of “accidental” discoveries made while playing with it (see e.g. early use of the microscope, etc.).

4Alexei
I’d also post in the “welcome” thread.

Working on https://github.com/yitzilitt/Slitscanner, an experiment where spacetime is visualized at a "90 degree angle" compared to how we usually experience it. If anyone has ideas for places to take this, please let me know!
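For readers unfamiliar with the technique, the core of slit-scanning is tiny: sample one fixed pixel column from every frame and lay the samples side by side, so one spatial axis is swapped for time. A sketch of the idea (this is an illustration, not the repo's actual code):

```python
import numpy as np

def slitscan(frames: np.ndarray, column: int) -> np.ndarray:
    """Given video frames of shape (time, height, width), take one
    fixed pixel column from every frame and stack the samples side by
    side, so the horizontal axis of the output is time, not space."""
    return np.stack([f[:, column] for f in frames], axis=1)

video = np.random.rand(120, 64, 48)   # 120 frames, 64x48 pixels
image = slitscan(video, column=24)    # output shape: (height, time)
```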

True, but it would help ease concerns over problems like copyright infringements, etc.
