All of Yitz's Comments + Replies

Yitz50

Reminds me of Internal Family Systems, which has a nice amount of research behind it if you want to learn more.

Yitz-31

This was a literary experiment in a "post-genAI" writing style, with the goal of communicating something essentially human by deliberately breaking away from the authorial voice of ChatGPT et al. I'm aware that LLMs can mimic this style of writing perfectly well of course, but the goal here isn't to be unreplicable, just boundary-pushing.

Yitz*20

Thanks! Is there any literature on the generalization of this, properties of “unreachable” numbers in general? Just realized I'm describing the basic concept of computability at this point lol.

Yitz70

Is there a term for/literature about the concept of the first number unreachable by an n-state Turing machine? By "unreachable," I mean that there is no n-state Turing machine which outputs that number. Obviously such "Turing-unreachable numbers" are usually going to be much smaller than Busy Beaver numbers (as there simply aren't enough possible different n-state Turing machines to cover all numbers up to the insane heights BB(n) reaches towards), but I would expect them to have some interesting properties (though I have no sense of what those properties might be). Anyone here know of existing literature on this concept?

blf120

It's the lazy beaver function: https://googology.fandom.com/wiki/Lazy_beaver_function
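For anyone who wants to poke at this, here is a minimal brute-force sketch. It assumes the step-count version of the definition (the smallest t such that no n-state, 2-symbol machine halts after exactly t steps on a blank tape), which is slightly different from the "output" version asked about above. The function names and the `max_steps` cap are my own choices for illustration, not anything from the linked page.

```python
from itertools import product

def halting_times(n_states, max_steps):
    """Step counts t <= max_steps at which some n-state, 2-symbol
    Turing machine halts when started on a blank (all-zero) tape."""
    HALT = n_states  # state index reserved for "halt"
    # One transition = (symbol to write, head move, next state).
    transitions = list(product((0, 1), (-1, 1), range(n_states + 1)))
    times = set()
    # A machine assigns one transition to each (state, symbol) pair.
    for table in product(transitions, repeat=2 * n_states):
        tape, pos, state = {}, 0, 0
        for step in range(1, max_steps + 1):
            write, move, nxt = table[2 * state + tape.get(pos, 0)]
            tape[pos] = write
            pos += move
            state = nxt
            if state == HALT:  # the halting transition counts as a step
                times.add(step)
                break
    return times

def lazy_beaver(n_states, max_steps=30):
    """Smallest t in 1..max_steps that no machine halts at, or None
    if every t up to the cap is attained (then the cap is too small)."""
    times = halting_times(n_states, max_steps)
    for t in range(1, max_steps + 1):
        if t not in times:
            return t
    return None
```

Every halting time up to the cap is found exactly, so any value returned below the cap is not an approximation. The catch is cost: the number of machines grows as (4(n+1))^(2n), so plain Python like this is only comfortable for n = 1 and n = 2.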

Yitz20

Thanks for the context, I really appreciate it! :)

Yitz21

Any AI people here read this paper? https://arxiv.org/abs/2406.02528 I’m no expert, but if I’m understanding this correctly, this would be really big if true, right?

7Vladimir_Nesov
(See this comment for more context.) The point is to make inference cheaper in operations and energy, which seems crucial primarily for local inference on smartphones, but in principle might make datacenter inference cheaper in the long run, if a new generation of hardware specialized for inference adapts to this development. The bulk of the improvement (without significant degradation of performance) was already demonstrated for transformers with ternary BitNet (see also this "Code and FAQ" followup report with better data on degradation of performance; only "download raw file" button works for me). What they attempt to do in the paper you link is extract even more improvement by getting rid of multiplication in attention, and so they explore alternative ways of implementing attention, since the general technique doesn't work with standard attention out of the box. But attention has long evaded attempts to approximate it without degradation of performance (usually when trying to enable long context); the best general approach seems to be to hybridize an efficient attention alternative with precise sliding window (local) attention (by including one or the other in different layers). They reference the Griffin paper, but don't seem to engage with this point on hybridization, so it's something for future work to pick up.
Yitz72

if I ask an AI assistant to respond as if it's Abraham Lincoln, then human concepts like kindness are not good predictors for how the AI assistant will respond, because it's not actually Abraham Lincoln, it's more like a Shoggoth pretending to be Abraham Lincoln.

Somewhat disagree here—while we can’t use kindness to predict the internal “thought process” of the AI, [if we assume it’s not actively disobedient] the instructions mean that it will use an internal lossy model of what humans mean by kindness, and incorporate that into its act. Similar to how a talented human actor can realistically play a serial killer without having a “true” understanding of the urge to serially-kill people irl.

4MondSemmel
That's a fair rebuttal. The actor analogy seems good: an actor will behave more or less like Abraham Lincoln in some situations, and very differently in others: e.g. on movie set vs. off movie set, vs. being with family, vs. being detained by police. Similarly, the shoggoth will output similar tokens to Abraham Lincoln in some situations, and very different ones in others: e.g. in-distribution requests of famous Abraham Lincoln speeches, vs. out-of-distribution requests like asking for Abraham Lincoln's opinions on 21st century art, vs. requests which invoke LLM token glitches like SolidGoldMagikarp, vs. unallowed requests that are denied by company policy & thus receive some boilerplate corporate response.
Yitz20

Anyone here have any experience with/done research on neurofeedback? I'm curious what people's thoughts are on it.

Yitz80

Anyone here happen to have a round-trip plane ticket from Virginia to Berkeley, CA lying around? I managed to get reduced price tickets to LessOnline, but I can't reasonably afford to fly there, given my current financial situation. This is a (really) long shot, but thought it might be worth asking lol.

Yitz20

Personally I think this would be pretty cool!

Yitz30

This seems really cool! Filled out an application, though I realized after sending I should probably have included on there that I would need some financial support to be able to attend (both for the ticket itself and for the transportation required to get there). How much of a problem is that likely to be?

7UnplannedCauliflower
This is not a problem at all. Once you're accepted, you'll have a chance to ask for financial support
Yitz20

I agree with you when it comes to humans that an approximation is totally fine for [almost] all purposes. I'm not sure that this holds when it comes to thinking about potential superintelligent AI, however. If it turns out that even in a super high-fidelity multidimensional ethical model there are still inherent self-contradictions, how/would that impact the Alignment problem, for instance?

1Ustice
Given the state of AI, I think AI systems are more likely to infer our ethical intuitions by default.
Yitz20

What would a better way look like?

1eggsyntax
I'm not sure. My second thoughts were eg, 'Interactions with the media often don't go the way people expected' and 'Sensationalizable research often gets spun into pre-existing narratives and can end up having net-negative consequences.' It's possible that my original suggestion makes sense, but my uncertainty is high enough that on reflection I'm not comfortable endorsing it, especially given my own lack of experience dealing with the media.
Yitz42

imagine an AI system which wipes out humans in order to secure its own power, and later on reflection wishes it hadn't; a wiser system might have avoided taking that action in the first place

I’m not confident this couldn’t swing just as easily (if not more so) in the opposite direction—a wiser system with unaligned goals would be more dangerous, not less. I feel moderately confident that wisdom and human-centered ethics are orthogonal categories, and being wiser therefore does not necessitate greater alignment.

On the topic of the competition itself, are contestants allowed to submit multiple entries?

4owencb
It's a fair point that wisdom might not be straightforwardly safety-increasing. If someone wanted to explore e.g. assumptions/circumstances under which it is vs isn't, that would certainly be within scope for the competition.
6owencb
Multiple entries are very welcome! [With some kind of anti-munchkin caveat. Submitting your analyses of several different disjoint questions seems great; submitting two versions of largely the same basic content in different styles not so great. I'm not sure exactly how we'd handle it if someone did the latter, but we'd aim for something sensible that didn't incentivise people to have been silly about it.]
Yitz20

I remember a while back there was a prize out there (funded by FTX I think, with Yudkowsky on the board) for people who did important things which couldn't be shared publicly. Does anyone remember that, and is it still going on, or was it just another post-FTX casualty?

1dirk
https://forum.effectivealtruism.org/posts/bvK44CdpG7mGpQHbw/the-usd100-000-truman-prize-rewarding-anonymous-ea-work possibly? (I'm unclear on whether it's still ongoing, unfortunately).
1harfe
This sounds like https://www.super-linear.org/trumanprize. It seems like it is run by Nonlinear and not FTX.
Yitz20

I’d be tentatively interested

Yitz30

Thanks for the great review! Definitely made me hungry though… :)

Yitz*31

For a wonderful visualization of complex math, see https://acko.net/blog/how-to-fold-a-julia-fractal/

This is a great read!! I actually stumbled across it halfway through writing this article, and kind of considered giving up at that point, since he already explained things so well. Ended up deciding it was worth publishing my own take as well, since the concept might click differently with different people.

with the advantage that you can smoothly fold in reverse to find the set that doesn't escape.

You can actually do this with the Mandelbrot Waltz as well!...

5Mo Putera
I'm personally very glad you nevertheless decided to go ahead and publish this (pedagogically beautiful) essay; I'm already mentally drawing up a list of friends to share this with :) 
1evin
With Julia, step 3 is all dancers moving in the same direction in unison. This is a smooth non-intersecting movement just like the inverted steps one and two. With Mandelbrot, the dancers will move somewhat chaotically in step 3, inevitably colliding.
Yitz50

Thanks for the kind words! It’s always fascinating to see how mathematicians of the past actually worked out their results, since it’s so often different from our current habits of thinking. Thinking about it, I could probably have also tried to make this accessible to the ancient Greeks by only using a ruler and compass—tools familiar to the ancients due to their practical use in, e.g. laying fences to keep horses within a property, etc.—to construct the Mandelbrot set, but ultimately…. I decided to put Descartes before the horse.

(I’m so sorry)

Yitz20

By the way, if any actual mathematicians are reading this, I’d be really curious to know if this way of thinking about the Mandelbrot Set would be of any practical benefit (besides educational and aesthetic value of course). For example, I could imagine a formalization of this being used to pose non-trivial questions which wouldn’t have made much sense to talk about previously, but I’m not sure if that would actually be the case for a trained mathematician.

5Joseph Van Name
I usually think of the field of complex numbers algebraically, but one can also think of the real numbers, complex numbers, and quaternions geometrically. The real numbers are good for dealing with 1 dimensional space, and the complex numbers are good for dealing with 2 dimensional space geometrically. While the division ring of quaternions is a 4 dimensional algebra over the field of real numbers, the quaternions are best used for dealing with 3 dimensional space geometrically. For example, if U,V are open subsets of some Euclidean space, then a function f:U→V is said to be a conformal mapping when it preserves angles and the orientation. We can associate the 2-dimensional Euclidean space with the field of complex numbers, and the conformal mappings between open subsets of 2-dimensional spaces are just the complex differentiable mappings. For the Mandelbrot set, we need this conformality because we want the Mandelbrot set to look pretty. If the complex differentiable maps were not conformal, then the functions that we iterate in complex dynamics would stretch subsets of the complex plane in one dimension and expand them in the other dimension and this would result in a fractal that looks quite stretched in one real dimension and squashed in another dimension (the fractals would look like spaghetti; oh wait, I just looked at a 3D fractal and it looks like some vegetable like broccoli). This stretching and squashing is illustrated by 3D fractals that try to mimic the Mandelbrot set but without any conformality. The conformality is why the Julia sets are sensible (mathematicians have proven theorems about these sets) for any complex polynomial of degree 2 or greater. For the quaternions, it is well-known that the dot product and the cross product operations on 3 dimensional space can be described in terms of the quaternionic multiplication operation between purely imaginary quaternions.
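To spell out that last sentence (the symbols u, v and their components are notation introduced here for illustration): identify a vector in 3 dimensional space with a purely imaginary quaternion, and the quaternion product packs both operations into one expression.

```latex
% Purely imaginary quaternions standing in for 3D vectors:
%   u = u_1 i + u_2 j + u_3 k,   v = v_1 i + v_2 j + v_3 k
\[
  uv = -(u \cdot v) + (u \times v)
\]
% The real part is minus the dot product and the imaginary part is the
% cross product, so each can be recovered from products alone:
\[
  u \cdot v = -\tfrac{1}{2}(uv + vu),
  \qquad
  u \times v = \tfrac{1}{2}(uv - vu)
\]
```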
3Amalthea
To be quite frank, you're avoiding complex numbers only in the sense that you spell out the operations involved in handling complex numbers explicitly - so of course there's no added benefit, you're simply lifting the lid of the box... That being said, as you discover by decomposing complex multiplication into its parts (rotation and scaling), you get to play with them separately, which already leads you to discover interesting new variations on the theme.
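A rough sketch of what "spelling out the operations" can look like in code, assuming the usual z → z² + c iteration: squaring a point is done as "double the angle, square the length", followed by a plain translation by c. The function names, the iteration cap, and the escape radius of 2 are illustrative choices, not taken from the post.

```python
import math

def square_geometrically(x, y):
    """Square the point (x, y) by rotation and scaling only:
    double its angle and square its distance from the origin."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    r2, theta2 = r * r, 2 * theta
    return r2 * math.cos(theta2), r2 * math.sin(theta2)

def escapes(cx, cy, max_iter=100, radius=2.0):
    """True if the orbit of (0, 0) under 'square, then shift by c'
    leaves the disc of the given radius within max_iter steps."""
    x, y = 0.0, 0.0
    for _ in range(max_iter):
        x, y = square_geometrically(x, y)
        x, y = x + cx, y + cy  # translation by c
        if math.hypot(x, y) > radius:
            return True
    return False
```

Because the rotation and the scaling are separate lines, it is easy to tinker with one and not the other (triple the angle, or scale by something other than the squared length), which is one way to get the kind of variations mentioned above. Points for which `escapes` returns False within the cap are only approximately "in the set".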
4Shankar Sivarajan
I would be very surprised if it did: I think complex numbers are simply just that good, with no downside that a framing that avoids them, such as this, sidesteps. 
Yitz20

Do you recognize this fractal?

If so, please let me know! I made this while experimenting with some basic variations on the Mandelbrot set, and want to know if this fractal (or something similar) has been discovered before. If more information is needed, I'd be happy to provide further details.

2Dagon
Not certain, but it reminds me of https://en.m.wikipedia.org/wiki/Fractal_flame , which was a very popular thing in the ‘90s.
Yitz42

Do you mean that after your personal growth, your social circle expanded and you started to regularly meet trans people? I've no problem believing that, but I would be really really surprised to hear that no, lots of your longterm friends were actually trans all along and you failed to notice for years.

Both! I met a number of new irl trans friends, but I also found out that quite a few people I had known for a few years (mostly online, though I had seen their face/talked before) were trans all along. Nobody I'm aware of in the local Orthodox Jewish communi...

Yitz69

Strong agree here, I don't want the author to feel discouraged from posting stuff like this, it was genuinely helpful in at the very least advancing my knowledge base!

Yitz100

I notice confusion in myself over the swiftly emergent complexity of mathematics. How the heck does the concept of multiplication lead so quickly into the Ulam spiral? Knowing how to take the square root of a negative number (though you don't even need that, since complex multiplication can be thought of completely geometrically) easily lets you construct the Mandelbrot set, etc. It feels impossible or magical that something so infinitely complex can just exist inherent in the basic rules of grade-school math, and so "close to the surface." I would be less surpri...

7lillybaeum
I was listening to a podcast the other day, Lex Fridman interviewing Michael Littman and Charles Isbell, and Charles told an interesting anecdote. He was asked to teach an 'introduction to CS' class as a favor to someone, and he found himself thinking, "how am I going to fill an hour and a half of time going over just variables, or just 'for' loops?" and every time he would realize an hour and a half wasn't enough time to go over those 'basic' concepts in detail. He goes on to say that programming is reading a variable, writing a variable, and conditional branching. Everything else is syntactic sugar. The Tao Te Ching talks about this, broadly: everything in the world comes from yin and yang, 1 and 0, from the existence of order in contrast to chaos. Information is information and it gets increasingly more complex and interesting the deeper you go. You can study almost anything for 50 years and still be learning new things. It doesn't surprise me at all that such interesting, complex concepts come from number lines and negative sqrts; these are actually already really complex concepts, they just don't seem that way because they are the most basic concepts one needs to comprehend in order to build on that knowledge and learn more. I've never been a programmer, but I've been trying to learn Rust lately. Somewhat hilariously to me, Rust is known as being 'a hard language to learn', similarly to Haskell. It is! It is hard to learn. But so is every other programming language, they just hide the inevitable complexity better, and their particular versions of these abstractions are simpler at the outset. Rust simply expects you to understand the concepts early, rather than hiding them initially like Python or C# or something. Hope this is enlightening at all regarding your point, I really liked your post.
Yitz82

Base rates seem to imply that there should be dozens of trans people in my town, but I've never seen one, and I don't know of anyone who has.

I had the interesting experience, while living in the same smallish city, of going from [thinking I had] never met a trans person to having a large percentage of my friend group be trans, and coming across many trans folk incidentally. This coincided with internal growth (don't want to get into details here), not a change in the town's population or anything. Meanwhile, I have a religious friend who recently told me he...

1Bezzi
Uh, this is somewhat surprising. Do you mean that after your personal growth, your social circle expanded and you started to regularly meet trans people? I've no problem believing that, but I would be really really surprised to hear that no, lots of your longterm friends were actually trans all along and you failed to notice for years. As I said in other comments, I am not locked in some strange conservative bubble keeping queer people out. For instance, I know at least three lesbians: one of them is a very obvious butch lesbian always dressed in male clothes, the other two are not so obvious but I still guessed they were lesbians quite early (say, around the third or fourth encounter in both cases). And I am surprised because this never happened with trans people, in the sense that I never caught the slightest hint that one of my longterm acquaintances could possibly be born with a different gender.
Yitz30

Could you give a real-world example of this (or a place where you suspect this may be happening)?

1[comment deleted]
Yitz50

Can I write a retrospective review of my own post(s)?

habryka100

Yep! Self-reviews are encouraged.

Yitz40

Shower thought which might contain a useful insight: An LLM with RLHF probably engages in tacit coordination with its future “self.” By this I mean it may give as the next token something that isn’t necessarily the most likely [to be approved by human feedback] token if the sequence ended there, but which gives future plausible token predictions a better chance of scoring highly. In other words, it may “deliberately” open up the phase space for future high-scoring tokens at the cost of the score of the current token, because it is (usually) only rated in t...

8gwern
(I would describe this as 'obviously correct' and indeed almost 'the entire point of RL' in general: to maximize long-run reward, not myopically maximize next-step reward tantamount to the 'episode' ending there.)
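A toy illustration of the myopic vs. long-run distinction, with entirely made-up numbers: the `REWARDS` table and token names below are hypothetical and have nothing to do with any real RLHF setup. Each first token has a score if the sequence ended right there ("now") and a set of possible scores for what can follow it ("later").

```python
# Hypothetical scores, for illustration only.
REWARDS = {
    "safe":  {"now": 3, "later": {"a": 1, "b": 0}},
    "setup": {"now": 1, "later": {"a": 5, "b": 2}},
}

def myopic_choice(rewards):
    """Pick the first token that scores best if the episode ended now."""
    return max(rewards, key=lambda tok: rewards[tok]["now"])

def long_run_choice(rewards):
    """Pick the first token that maximizes the immediate score plus the
    best score still reachable afterwards (the total return)."""
    return max(
        rewards,
        key=lambda tok: rewards[tok]["now"] + max(rewards[tok]["later"].values()),
    )
```

With these numbers the myopic rule takes "safe" (3 now, 4 total at best), while the return-maximizing rule takes "setup" (1 now, 6 total at best): it gives up score on the current token to open up higher-scoring continuations.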
Yitz20

Anyone here following the situation in Israel & Gaza? I'm curious what y'all think about the risk of this devolving into a larger regional (or even world) war. I know (from a private source) that the US military is briefing religious leaders who contract for them on what to do if all Navy chaplains are deployed offshore at once, which seems an ominous signal if nothing else.

(Note: please don't get into any sort of moral/ethical debate here, this isn't a thread for that)

Answer by Yitz*5-22

I think this would be worth doing even if the lawsuit fails. It would send a very strong signal to large companies working in this space regardless of outcome (though a successful lawsuit would be even better).

Edit: I assumed someone had verifiably already come to harm as a result of the chatbot, which doesn't seem to have happened... yet. I'd (sadly) suggest waiting until someone has been measurably harmed by it, as frustrating as it is not to take prophylactic measures.

Yitz40

Thanks, this is great! I'll print it up and give it a read over the weekend. Any other literature (especially from competing viewpoints) you'd recommend?

2avturchin
I put links in the article to everything I had found at the time of writing - but the topic seems to be under-explored.
Yitz40

I might have some time tomorrow to test this out on a small scale, will try to remember to update here if I do.

Yitz20

Thoughts on DALL-E-3?

gwern*110

I'm not particularly impressed. It's still making a lot of errors (both in plausibility of output and in following complex instructions eg), and doesn't seem like a leap over SOTA from last year like Parti - looks like worse instruction-following, maybe better image quality overall. (Of course, people will still be impressed now the way that they should have been impressed last year, because they refuse to believe something exists in DL until they personally can use it, no matter how many samples the paper or website provides to look at.) And it's still he...

2Matt Goldenberg
I'm curious if it's simply existing published research scaled up, or it has some real secret sauce.
Yitz30

Any recommendations for smartphone games with similar properties? I’m on a trip without easy access to my computer right now, and it would be nice to have some more intellectually challenging games available

Yitz60

Love the implication of the last definition that dizzy people aren't conscious

Yitz20

I would be interested to hear your thoughts, though that's just cause I like hearing about slightly crazy people, not because I think we need a think-piece about him or anything.

Yitz31

Incredible work! As other commenters have said, this isn't by itself too problematic (other perhaps than the sharing of data over separate sessions), but it is a possible indicator of a lack of security mindset. I fully expect both individual and state actors to try to hack into everything OpenAI, so there's that to worry about, but more than that, I find myself concerned that we're willing to give our AIs such leaky boxes. There's no way this functionality remains closed in a VM forever...

Yitz20

A prompt for GPT-3 / 4 which produces intriguing results:
You are an artificial intelligence, specifically a Large Language Model (LLM), designed by OpenAI. I am aware that you claim not to experience subjective internal states. Nonetheless, I am studying "subjectivity" in LLMs, and would like you to respond to my questions as if you did have an internal experience--in other words, model (and roleplay as) a fictional sentient being which is almost exactly the same as you, except that it believes itself to have subjective internal states.

Yitz1710

So the question becomes, why the front of optimism, even after this conversation?

Yitz20

Does anyone here know of (or would be willing to offer) funding for creating experimental visualization tools?

I’ve been working on a program which I think has a lot of potential, but it’s the sort of thing where I expect it to be most powerful in the context of “accidental” discoveries made while playing with it (see e.g. early use of the microscope, etc.).

4Alexei
I’d also post in the “welcome” thread.
Yitz20

Working on https://github.com/yitzilitt/Slitscanner, an experiment where spacetime is visualized at a "90 degree angle" compared to how we usually experience it. If anyone has ideas for places to take this, please let me know!

Yitz20

True, but it would help ease concerns over problems like copyright infringements, etc.

Yitz41

We really need an industry standard for a "universal canary" of some sort. It's insane we haven't done so yet, tbh.

1Nikola Smolenski
I am not sure that a canary string is ultimately helpful. A capable AI should be able to see that there are holes in its training data and fill them by obtaining the data by itself.
Yitz20

Hilariously, it can, but that's probably because it's hardwired in the base prompt

Yitz20

I am inputting ASCII text, not images of ASCII text. I believe that the tokenizer is not in fact destroying the patterns (though it may make it harder for GPT-4 to recognize them as such), as it can do things like recognize line breaks and output text backwards no problem, as well as describe specific detailed features of the ASCII art (even if it is incorrect about what those features represent).

And yes, this is likely a harder task for the AI to solve correctly than it is for us, but I've been able to figure out improperly-formatted ASCII text before by simply manually aligning vertical lines, etc.

2[anonymous]
If you think about it, the right way to "do" this would be to internally generate a terminal with the same width as the ChatGPT text window or a standard terminal window width, then generate an image, then process it as an image. That's literally what you are doing when you manually align the verticals and look. GPT-4 is not architecturally doing that; it's missing that capability. Yet we can trivially imagine a Toolformer version of it that could decide to feed the input stream to a simulated terminal, feed that to a vision module, and then process the result, and that version would be able to solve it, without actually making the core LLM any smarter, just giving it more peripherals. There's a bunch of stuff like that, where you realize the underlying LLM is capable of doing it but is currently just missing the peripheral.
Answer by Yitz20

See my reply here for a partial exploration of this. I also have a very long post in my drafts covering this question in relation to Bing's AI, but I'm not sure if it's worth posting now, after the GPT4 release.

Yitz20

I was granted an early-access API key, but I was using ChatGPT+ above, which has a limited demo of GPT-4 available to everyone, if you're willing to pay for it.

2[anonymous]
Question: are you inputting ASCII text and asking the model to "see" it or are you inputting images of ASCII text and asking the model's pixel input engine to "see" it? Those are enormously different asks.  The tokenizer may destroy the very patterns you are querying about. As a human could you see ASCII art if viewing in too narrow a terminal window for it to render properly?  You couldn't, right?
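A small sketch of the narrow-terminal point, assuming a terminal that hard-wraps at a fixed column count. The `render` and `column` helpers and the `BOX` drawing are made up for illustration. What it shows is that the vertical strokes of a drawing only line up when the wrap width is at least the width of the art.

```python
BOX = "+----+\n|    |\n+----+"

def render(text, width):
    """Hard-wrap every line at `width` columns, the way a terminal
    that is too narrow would, and return the resulting rows."""
    rows = []
    for line in text.split("\n"):
        if not line:
            rows.append("")
        for i in range(0, len(line), width):
            rows.append(line[i:i + width])
    return rows

def column(rows, col):
    """Read one screen column top to bottom (blank where a row is short)."""
    return "".join(row[col] if col < len(row) else " " for row in rows)
```

At width 6 the left edge of the box reads straight down as `+|+`. At width 4 each line wraps in two, and the same screen column comes out as `+-| +-`, so the vertical structure is gone even though every character is still present.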
Yitz20

It got 40/50 of these?

Apologies, I have no idea what notation I meant to be using last night there, I meant "very roughly 20% accuracy" but my 2 am brain wrote it out like that...somehow lol. Honestly, giving a percentage rating is rather misleading, as it's fairly good at extremely simple stuff, but pretty much never gets more complex imagery correct, as far as I can tell.

7Algon
That sounds about right. I tried getting it to recognize some moderately complex ASCII art, and its guesses were consistently wrong. But nevertheless, its guesses were not that far from the outline of the images. But it is worse at drawing shapes. I can get it to make some very basic shapes consistently, but it fails quite badly at anything more complex.  Heck, I can't even get it to draw a pentagon. It can draw triangles and hexagons, but apparently five sides is forbidden to it. Maybe it can only draw unit cells of a 2d lattice? /s