
Editability and findability --> higher quality over time

 

Editability

Code being easier to find and easier to edit, for example because

it's in the same live environment where you're working, or a simple hotkey away, or an alt-tab away in a config file that updates your settings without having to restart,

makes it more likely to actually get edited, and more subject to feedback loop dynamics.
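(As a loose illustration of the "config an alt-tab away" case, here's a minimal Python sketch of a program that re-reads its config whenever the file changes, so an edit takes effect without a restart. The file name settings.json and the apply_settings function are hypothetical, just stand-ins for whatever your tool actually uses.)

```python
import json, os, time

CONFIG_PATH = "settings.json"  # hypothetical config file, an alt-tab away in your editor

def load_config(path):
    with open(path) as f:
        return json.load(f)

def watch_config(path, on_change, poll_seconds=1.0):
    """Re-load the config whenever its modification time changes."""
    last_mtime = None
    while True:
        mtime = os.path.getmtime(path)
        if mtime != last_mtime:
            last_mtime = mtime
            on_change(load_config(path))  # settings update without restarting
        time.sleep(poll_seconds)

def apply_settings(cfg):
    # hypothetical: push the new layout/behavior into the running program
    print("applied settings:", cfg)

if __name__ == "__main__":
    watch_config(CONFIG_PATH, apply_settings)
```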

The same applies to writing, or to anything where you have connected objects that influence each other, where the "influencer node" is editable and visible.

configs : program layout / behavior
informal rules about how your relationship is to your friend : the dynamics of the relationship
layout of your desk : the way you work
underlying philosophy of ideas : writing ideas
ideas "going viral" in social media : people discussing them (think about Luigi Mangione triggering notions of killing people you don't like (this is bad!! and it's done under the memetic shield of "big corpo bad"), or Elon x Trump's doge thing having all sorts of people discussing efficiency of organizations and bureaucracy (this is amazing) )

(not sure if the last one is a good example)

Imagine if, when writing this Quick Take™, I had a side panel that on every keystroke pulled up related paragraphs from all my existing writings!
I could see past writings, which is cool, but I could also edit them way more easily (assuming a "jump to" feature); in the long term this yields many more edits, and a more polished and readable total body of work.
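(A rough sketch of what that side panel could do under the hood, assuming nothing fancier than word overlap between the current draft and a folder of past writings; the folder name and the scoring are made up for illustration.)

```python
from pathlib import Path

NOTES_DIR = Path("my_writings")  # hypothetical folder of past writings, one .txt file per post

def related_paragraphs(draft: str, top_k: int = 3):
    """Score every stored paragraph by word overlap with the current draft."""
    draft_words = set(draft.lower().split())
    scored = []
    for note in NOTES_DIR.glob("*.txt"):
        for paragraph in note.read_text().split("\n\n"):
            overlap = len(draft_words & set(paragraph.lower().split()))
            if overlap:
                scored.append((overlap, note.name, paragraph))
    return sorted(scored, reverse=True)[:top_k]

# On every keystroke (more realistically, every pause in typing), the side panel
# would call related_paragraphs(current_draft) and render each hit with a
# "jump to" link back to the source file, so editing old writing is one click away.
```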

 

Findability

If you can easily see the contents of something and go, "wait, this is dumb", then even if it's "far away" (you have to find it in the browser, do mouse clicks and scrolls), you'll still edit it.
What actually determined whether you edit it is that the threshold for loading its contents into your mind had been lowered. Once you load it, the opinion is triggered instantly.

It seems easier to fit an analogy to an image than to generate an image starting with a precise analogy (meaning a detailed prompt).

Maybe because an image is in a higher-dimensional space and you're projecting onto the lower-dimensional space of words.
(take this analogy loosely idk any linear algebra)

Claude: "It's similar to how it's easier to recognize a face than to draw one from memory!"

 

(what led to this:)

I've been trying to create visual analogies of concepts/ideas in order to compress them, and I noticed how hard it is. It's hard to describe images, and image generators have trouble sticking to descriptions (unless you pay a lot per image).

Then I started instead to just caption existing images to create sensible analogies. I found this way easier to do, and it gives good-enough representations of the ideas.

(some people's solution to this is "just train a bigger model" and I'd say "lol")

 

(Note, images are high-dimensional not in terms of pixels but in concept space.
You have color, relationships of shared/contrasting colors between the objects (and replace "color" with any other trait that implies relationships among objects), the way things are positioned, lighting, textures, etc. etc.
Sure, you can treat them all as configurations of pixel space, but I just don't think that's a correct framing (too low of an abstraction).)

 

 

(specific example that triggered this idea and this post:)

pictured: "algorithmic carcinogenesis" idea, talked about in this post https://www.conjecture.dev/research/conjecture-a-roadmap-for-cognitive-software-and-a-humanist-future-of-ai

(generated with NovelAI)

I was initially trying to do more precise and detailed analogies, like, sloppy computers/digital things spreading too fast like cancer, killing and overcrowding existing beauty, where the existing beauty is of the same "type" -- meaning also computers/digital things -- but just more organized or less "cancerous" (somehow). It wasn't working.

On our "interfaces" to the world...

 

Prelude

This is a very exploratory post.

It underlies my thinking in this post: https://www.lesswrong.com/posts/wQ3bBgE8LZEmedFWo/root-node-of-my-posts. It's hard to put into words, but I'll make a first attempt here. I also expressed some of it here, https://www.lesswrong.com/posts/MCBQ5B5TnDn9edAEa/atillayasar-s-shortform?commentId=YNGb5QNYcXzDPMc5z, in the part about deeper mental models / information bottlenecks.

It's also why I've been spending over a year making a GUI that can encompass as many of my activities on the computer as possible -- which is most of my entire life.
(The beauty of coding personal tooling is that you get to construct your own slice of reality -- which can be trivial or extremely profound, depending on your abilities and on whether controlling your slice of reality even matters to begin with -- and the digital world is so much more malleable than the physical world.)

 

Start

Our interface to the world and our minds is basically our memory, which is not a static lookup; it changes as you spend time "remembering" (or reconstructing your memories and ideas, which maybe shouldn't be seen as "remembering" exactly).

Tools extend the available "surface area" of our interface, meaning our memory, because a tool is still one "item" in your mind but it stores a lot more knowledge (and/or references to knowledge) than if you had to store the unpacked version.
When you replace "all programming knowledge known to man" with "skill at Googling", you pay for storage space with runtime compute.

(And sometimes you actually want to buy runtime compute by spending storage, meaning, you practice with some topic so that you can immediately recall things and actually think about things directly instead of every thought being behind a search query, which would be like 100x slower or something and totally impractical)
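(A toy way to see this storage-vs-compute trade in code, using memoization as the stand-in; the Fibonacci example is mine, purely for illustration.)

```python
from functools import lru_cache

def fib_recompute(n: int) -> int:
    """Stores nothing and recomputes everything, every time: all runtime compute, no storage."""
    return n if n < 2 else fib_recompute(n - 1) + fib_recompute(n - 2)

@lru_cache(maxsize=None)
def fib_stored(n: int) -> int:
    """Spends storage (a cache of past answers) to buy back runtime compute."""
    return n if n < 2 else fib_stored(n - 1) + fib_stored(n - 2)

# fib_recompute(35) takes millions of recursive calls; fib_stored(35) takes about 36 after warm-up.
# "Skill at Googling" is the recompute end of the trade; a practiced, memorized topic is the stored end.
```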

 

Holistic worldview means fewer root nodes

A very holistic worldview basically means you have fewer items of that initial interface, like, anything you wanna think about passes through a smaller number of initial nodes. This probably makes it easier to think about things, and if this worldview actually works and is consistent, you can iterate on it and refine it over years and years, which is much harder if you have like 100 scattered thoughts and ways of thinking, because they won't be revisited as often.

(I wonder if this is the holy grail of philosophy)

(But of course it only works if it's not costing you too much "runtime compute". If you have a worldview that is arguably holistic and consistent, but every time you think about some topic you have to spend 5 minutes explaining how exactly it maps to your worldview, when it would be better to just think about that topic directly, then it's worthless.)

 

Examples

Having a longtime best friend/wife/husband/partner who you always talk to about whatever you're thinking of is kind of like having a node that encompasses most of your worldview. So you can iterate on it.

Something like asking "what would Jesus do" is like that too, or any religious figure or role model. (Every time you do this, you actually understand Jesus better -- at least your model and other people's model of him -- but you also get better at using this framing to inform your decisions (or your heart); it's not literally about doing what Jesus would do, I think.)

Maybe any general habit someone has that they consistently use to deal with problems is like this. A thing which the majority of one's thoughts pass through. ( <-- wait that's weird lol )

 

This post got weird

Somehow I ended up with a different definition than the "interface" idea I started out with.

Maybe the "node that most thoughts pass through" definition makes more sense than interfaces anyway, because we're talking about a cybernetic-like system, with constant feedback loops and no clear start or end. And the node can be anywhere within the system and have any arbitrary "visit frequency", like maybe your deep conversation with your best friend or mentor is only once every 2 months.

(I guess the "initial interface" framing is a special case of this more general version)

Anyway this is a good time to end it.

Just because X describes Y in a high-level, abstract way doesn't mean studying X is the best way of understanding Y.

Often, the best way is to simply study Y, and studying X just makes you sound smarter when talking about Y.

 

pointless abstractions: cybernetics and OODA loop

This is based on my experience trying to learn stuff about cybernetics, in order to understand GUI tool design for personal use, and to understand the feedback loop that roughly looks like: build -> use -> rethink -> let it help you rethink -> rebuild, where I and any LLM instance I talk to (via the GUI) are both part of the cybernetic system. Whenever I "loaded cybernetics concepts" into my mind and tried to view GUI design from that perspective, I was just spending a bunch of effort mapping the abstract ideas to concrete things, and then being like, "ok but so what?".

A similar thing happened while looking into the OODA loop, though at least its Wiki page has a nice little flowchart, and it's much more concrete than cybernetics. And you can draw more concrete inspiration about GUI design by thinking about fighter pilot interfaces.

 

It's also because I often see people using abstract reasoning, and whenever I dig into what they're actually saying, it doesn't make that much sense. And because of personal experience where things become way clearer and easier to think about after phrasing them in very concrete and basic ways.

Jocko Willink talking about OODA loops, paraphrase

The F-86 wasn't as good at individual actions, but it could transition between them faster than the MiG-15.

This is analogous to how end-to-end algorithms, LLM agents, and things optimized for the tech demo are "impressive single actions", but not as good for long-term tasks.

Two tools is a lot more complex than one, not just +1 or *2

When you have two tools, you have to think about their differences, or about specifically what each one lets you do, and then pattern-match to your current problem before using one. With one tool, you don't have to understand the shape of its contents at all, because it's the only tool and you already know you want to use it.

 

Concrete example, doing groceries

Let's compare the amount of information you need to remember with 1 vs 2 tools. You want food (the task), and you're going to get it from a supermarket (the tool).

With 1 available supermarket:

You want food. Supermarket has food. Done. The tool only requires 1 trait: "has food".

With 2 available supermarkets:

Market A has <list of products> (or a "vibe", which is a proxy for a list of products), market B has <list of products/vibe>, you compare the specific food you want to the products of each market, then make a decision.

Each tool's complexity grew from "has food" to "list of product types", and your own "food" requirement grew to "specific food". The total goes from 1 to 2 * (complexity of a given comparison (= tool complexity + requirement complexity)) + the difficulty of comparing multiple tools.

 

And after two, the complexity increase gets smaller and smaller, for the obvious reason that now you're just adding 1 tool straightforwardly to the existing list of tools. But also because you're already comparing multiple tools to each other, which you weren't doing with one.
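(To make that shape concrete, here's a toy counting model; the constants are arbitrary and mine, it only exists to show the pattern: a big jump from one tool to two, then a small fixed increment per extra tool.)

```python
def facts_to_track(num_tools: int) -> int:
    """Rough count of things you must hold in mind before acting."""
    if num_tools <= 1:
        return 1                  # "it has food" -- no comparison needed at all
    per_tool = 2                  # tool complexity + requirement complexity, per tool
    comparison_overhead = 3       # the cost of weighing tools against each other at all
    return num_tools * per_tool + comparison_overhead

print([facts_to_track(n) for n in range(1, 6)])
# [1, 7, 9, 11, 13] -- the 1 -> 2 step dominates; each later tool adds a small, constant amount
```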

Twitter doesn't incentivize truth-seeking

Twitter is designed for writing things off the top of your head, and for things that others will share or reply to. There are almost no mechanisms to reward good ideas or punish bad ones, none for incentivizing the consistency of your views, and none for even seeing whether someone updates their beliefs, or whether a comment pointed out that they were wrong.

(The fact that there are comments is really really good, and it's part of what makes Twitter so much better than mainstream media. Community Notes is great too.)

The solution to Twitter sucking is not to follow different people, and DEFINITELY not to correct every wrong statement (oops); it's to just leave. Even smart people, people who are way smarter and more interesting and knowledgeable and funny than me, simply don't care that much about their posts. If a post is thought-provoking, you can't even do anything with that fact, because nothing about the website is designed for deeper conversations. Though I've had a couple of nice moments where I went deep into a topic with someone in the replies.

 

Shortforms are better

The above is also a danger with Shortforms, but to a lesser extent, because things are easier to find, and it's much more likely that I'll see something I've written, see that I'm wrong, and delete it or edit it. Posts on Twitter are not editable, they're harder to find, there's no preview-on-hover, and there is no hyperlinked text.

When chatting with an LLM, do you wonder what its purpose is in the responses it gives? I'm pretty sure it's "predict a plausible next token", but I don't know how I'll know to change my belief.

 

I think "it has a simulated purpose, but whether it has an actual purpose is not relevant for interacting with it".

My intuition is that the token predictor doesn't have a purpose, that it's just answering: "what would this chatbot that I am simulating respond with?"
For the chatbot character (the Simulacrum) it's: "What would a helpful chatbot want in this situation?" It behaves as if its purpose is to be helpful and harmless (or whatever personality it was instilled with).

(I'm assuming that as part of making a prediction, it is building (and/or using) models of things, which is a strong statement and idk how to argue for it)

I think framing it as "predicting the next token" is similar to explaining a rock's behavior when rolling as, "obeying the laws of physics". Like, it's a lower-than-useful level of abstraction. It's easier to predict the rock's behavior via things it's bouncing off of, its direction, speed, mass, etc.

Or put another way, "sure it's predicting the next token, but how is it actually doing that? what does that mean?". A straightforward way to predict the next token is to actually understand what it means to be a helpful chatbot in this conversation (which includes understanding the world, human psychology, etc.) and completing whatever sentence is currently being written, given that understanding.
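(To show what the bare "it's just predicting the next token" level of description looks like mechanically, here's a toy character-level predictor; it has nothing to do with how real LLMs work internally, it's only there to make the abstraction-level point visible.)

```python
import random
from collections import Counter, defaultdict

def train(text: str):
    """Count, for each character, which characters tend to follow it."""
    counts = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def generate(counts, seed: str, length: int = 40) -> str:
    out = seed
    for _ in range(length):
        options = counts.get(out[-1])
        if not options:
            break
        chars, weights = zip(*options.items())
        out += random.choices(chars, weights=weights)[0]  # "predict a plausible next token"
    return out

model = train("to be or not to be, that is the question")
print(generate(model, "t"))
# This description is true of the toy model and (loosely) of an LLM's sampling loop,
# but it says nothing about *how* a good prediction gets made -- which is the point
# about "next-token prediction" being a lower-than-useful level of abstraction.
```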

(There's another angle that makes this very confusing: whether RLHF fundamentally changes the model or not. Does it turn it from a single-token predictor into a multi-token response predictor? Also, is it possible that the base model already has goals beyond predicting 1 token? Maybe the way it's trained somehow makes it useful for it to have goals.)

 

 

There have been a number of debates (which I can't easily search on, which is sad) about whether speech is an action (intended to bring about a consequence) or a truth-communication or truth-seeking (both imperfect, of course) mechanism

The immediate reason we speak is that we just enjoy talking to people. Similar to "we eat because we enjoy food / dislike being hungry". Biologically, hunger developed for many low-level processes like our muscles needing glycogen, but subjectively, the cause of eating is some emotion.

I think asking what speech really is at some deeper level doesn't make sense. Or at least it should recognize that why individual people speak, and why speech developed in humans, are separate topics, with (I'm guessing) very small intersections.

personality( ground_truth ) --> stated_truth

Or, what someone says is a subset of what they're really thinking, transformed by their personality.
(personality plus intentions, context, emotions, etc.)

(You can tell I'm not a mathematician because I didn't express this in LaTeX xD  but I feel like there's a very elegant linear algebra description where the ground truth is a high-dimensional object, and personality transforms it to be low-dimensional enough to communicate, and changes its "angle" (?) / vibe so it fits their goals better)

So, if you know someone's personality, meaning, what kind of way they reshape a given ground truth, you can infer things about the ground truth.
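(Since I didn't do the LaTeX, here's the rough linear-algebra picture as a few lines of numpy instead; the dimensions and matrices are made up, it's purely illustrative.)

```python
import numpy as np

rng = np.random.default_rng(0)

ground_truth = rng.normal(size=100)        # high-dimensional: everything they actually think and feel
personality = rng.normal(size=(5, 100))    # person-specific map down to what gets communicated;
                                           # which directions it keeps is the "angle" / vibe

stated_truth = personality @ ground_truth  # low-dimensional: what they actually say

# Knowing `personality` doesn't let you recover ground_truth exactly (information was projected away),
# but it does constrain it: you learn which components of ground_truth the person tends to express,
# which is why knowing how someone reshapes things lets you infer something about the original.
```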

 

Anecdote

I feel like everyone does this implicitly; it's one of those social skills I struggle with. I always thought it was a flaw of other people, to not simply take things at face value, but to think at the level of "why do they say this", "this is just because they want <x>", while I'm always thinking, "no but what did they literally say? is it true or not?".

It's frustrating because usually, the underlying social/personality context completely dominates what someone says, like, 95% of the time it literally doesn't fucking matter one bit what someone is saying at the object level. And maybe, the fact that humans have evolved to think in this way, is evidence that in fact, 95% of what people say doesn't matter and shouldn't matter object-level, that the only thing that matters is implications about underlying personality and intentions.

My recollection of sitting through "adult conversations" of family and relative visits for probably many hundreds of hours is, "I remember almost zero things that anybody has ever talked about". I guess the point is that they're talking, not what it's about.

 

Practical

This reframing makes it easier to handle social situations. I can ask myself questions like, "what is the point of this social interaction?", "why is this person saying this to me?"

Often the answer is, "they just wanna vibe, and this is an excuse to do so".
