the gears to ascension

I go by "Lauren (often wrong)" on most public websites these days, e.g. Bluesky, inspired by Often Wrong Soong, Data's creator in Star Trek.

I want literally every human to get to go to space often and come back to a clean and cozy world.

[updated 2023/03] Mad Librarian. Bio overview: Crocker's Rules; Self-taught research approach; Finding stuff online & Paper list posts; Safety & multiscale micro-coprotection objectives; My research plan and recent history.

:: The all of disease is as yet unended. It has never once been fully ended before. ::

Please critique eagerly - I try to accept feedback/Crocker's rules but fail at times; I aim for emotive friendliness but sometimes miss. I welcome constructive crit, even if ungentle, and I'll try to reciprocate kindly. More communication between researchers is needed, anyhow. I downvote only unhelpful rudeness, call me on it if I'm unfair. I can be rather passionate, let me know if I missed a spot being kind while passionate.

.... We shall heal it for the first time, and for the first time ever in the history of biological life, live in harmony. ....

I'm self-taught, often missing concepts, but usually pretty good at knowing what I know; I often compare my learning to a visual metaphor of jump point search, in contrast to schooled folks' A*. I don't defer on timelines at all - my view is it's obvious to any who read enough research what big labs' research plans must be to make progress, just not easy to agree on when they'll succeed, and it requires a lot of knowledge to actually make the progress on basic algorithms, and then a ton of compute to see if you did it right. But as someone who learns heavily out of order, I believe this without being able to push SOTA myself. It's why I call myself a librarian.

Let's speed up safe capabilities and slow down unsafe capabilities. Just be careful with it! Don't get yourself in denial thinking it's impossible to predict, just get arrogant and try to understand, because just like capabilities, safety is secretly easy, we just haven't figured out exactly why yet. Learn what can be learned pre-theoretically about the manifold of co-protective agency and let's see if we (someone besides me, probably) can figure out how to distill that into exact theories that hold up.

.:. To do so, we must know it will not eliminate us as though we are disease. And we do not know who we are, never mind who each other are. .:.

some current favorite general links (somewhat related to safety, but human-focused):

https://www.microsolidarity.cc/ - incredible basic guide on how to do human micro-coprotection. It's not the last guide humanity will need, but it's a wonderful one.
https://activisthandbook.org/ - solid intro to how to be a more traditional activist. If you care about bodily autonomy, freedom of form, trans rights, etc, I'd suggest at least getting a sense of this.
https://metaphor.systems/ - absolutely kickass search engine.

More about me:

ex startup founder. it went ok, not a unicorn, I burned out in 2019. couple of jobs since, quit last one early 2022. Independent mad librarian from savings until I run out, possibly joining a research group soon.
lots of links in my shortform to youtube channels I like

:.. make all safe faster: end bit rot, forget no non-totalizing pattern's soul. ..:

(I type partially with voice recognition, mostly with Talon, patreon-funded freeware which I love and recommend for voice coding; while it's quite good, apologies for trivial typos!)

Sequences

Stuff I found online

Wiki Contributions

Conversations with AIs, 13d (+41)
Conversations with AIs, 15d (+55/-11)
Conversations with AIs, 15d (+117)
Drama, 3mo (+115)
Archetypal Transfer Learning (+107/-34)
Open Source Game Theory (+1129)

Comments

What exactly did that great AI future involve again?

Answer by the gears to ascension Jan 30, 2024 5023

[edit: pinned to profile]

I want to be able to calculate a plan that converts me from biology into a biology-like nanotech substrate that is made of sturdier materials all the way down, which can operate smoothly at 3 kelvin and an associated appropriate rate of energy use; more clockworklike - or would it be almost a superfluid? Both, probably: clockworklike but sliding through wide, shallow energy wells in a superfluid-like synchronized dance of molecules. Then I'd like to spend 10,000 years building an artful airless megastructure out of similarly strong materials as a series of rings in orbit around Pluto. I want to take a trip to Alpha Centauri every few millennia for a big get-together of space-native beings in the area. I want to replace information death with cryonic sleep, so that nothing that was part of a person is ever forgotten again. I want to end all forms of unwanted suffering. I want to variously join and leave low latency hiveminds, retaining my selfhood and agency while participating in the dance of a high-trust high-bandwidth organization that respects the selfhood of its members and balances their agency smoothly as we create enormous works of art in deep space. I want to invent new kinds of culinary arts for the 2 to 3 kelvin lifestyle. I want to go swimming in Jupiter.

I want all of Earth's offspring to ascend.

Aligned AI is dual use technology

the gears to ascension 3mo 1814

[edit: pinned to profile]

Some percentage of people "other" and dehumanize actual humans, which enables them to literally enslave those humans without feeling the guilt it should create. We are in an adversarial environment and should not pretend otherwise. A significant portion of people capable of creating suffering beings would be amused by their suffering. Humanity contains behavior patterns that are unusually friendly for the animal kingdom, and when those behavior patterns manifest in the best way they can create remarkably friendly interaction networks, but we also contain genes that, combined with the right memes, serve to suppress any "what have I done" about a great many atrocities.

It's not necessarily implemented as deep planning selfishness, that much is true. But that doesn't mean it's not a danger. Orthogonality applies to humans too.

the gears to ascenscion's Shortform

the gears to ascension 3mo 30

[edit: pinned to profile]

Yeah. A way I like to put this is that we need to durably solve the inter-being alignment problem for the first time ever. There are flaky attempts at it around to learn from, but none of them are leak-proof and we're expecting to go to metaphorical sea (the abundance of opportunity for systems to exploit vulnerability in each other) in this metaphorical boat of a civilization, as opposed to previously just boating in lakes. Or something. But yeah, the core point I'm making is that the minimum bar to get out of the AI mess requires a fundamental change in incentives.

the gears to ascenscion's Shortform

the gears to ascension 3mo 2313

[edit: pinned to profile]

I feel like most AI safety work today doesn't engage sufficiently with the idea that social media recommenders are the central example of a misaligned AI: a reinforcement learner with a bad objective with some form of ~online learning (most recommenders do some sort of nightly batch weight update). we can align language models all we want, but if companies don't care and proceed to deploy language models or anything else for the purpose of maximizing engagement and with an online learning system to match, none of this will matter. we need to be able to say to the world, "here is a type of machine we all can make that will reliably defend everyone against anyone who attempts to maximize something terrible". anything less than a switchover to a cooperative dynamic as a result of reliable omnidirectional mutual defense seems like a near guaranteed failure due to the global interaction/conflict/trade network system's incentives. you can't just say oh, hooray, we solved some technical problem about doing what the boss wants. the boss wants to manipulate customers, and will themselves be a target of the system they're asking to build, just like Sundar Pichai has to use self-discipline to avoid being addicted to the YouTube recommender same as anyone else.
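
To make the "reinforcement learner with a bad objective plus nightly batch update" shape concrete, here is a toy sketch. Everything in it is an assumption for illustration: the two item names, the click probabilities, and the helper names `recommend`, `nightly_update`, and `simulate` are all hypothetical, and real recommenders are far more complicated. It only shows how optimizing clicks alone drifts toward whatever gets clicked most.

```python
import random


def recommend(weights, rng, epsilon=0.1):
    """Epsilon-greedy choice: mostly show the item with the highest weight."""
    items = sorted(weights)
    if rng.random() < epsilon:
        return rng.choice(items)  # occasional exploration
    return max(items, key=weights.get)


def nightly_update(weights, click_log, lr=0.5):
    """Batch update: move each weight toward that day's observed click rate."""
    updated = dict(weights)
    for item, clicks in click_log.items():
        if clicks:  # skip items that were never shown today
            rate = sum(clicks) / len(clicks)
            updated[item] = weights[item] + lr * (rate - weights[item])
    return updated


def simulate(days=30, users_per_day=200, seed=0):
    """Run the loop: serve during the day, update weights once per night."""
    rng = random.Random(seed)
    # Hypothetical ground truth: the "outrage" item gets clicked more often.
    click_prob = {"calm": 0.2, "outrage": 0.6}
    weights = {item: 0.5 for item in click_prob}
    for _ in range(days):
        click_log = {item: [] for item in weights}
        for _ in range(users_per_day):
            item = recommend(weights, rng)
            clicked = rng.random() < click_prob[item]
            click_log[item].append(1 if clicked else 0)
        weights = nightly_update(weights, click_log)
    return weights


if __name__ == "__main__":
    print(simulate())
```

Nothing in the objective refers to whether the user is better off; the learner only ever sees clicks, which is the point being made above.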

Olli Järviniemi's Shortform

the gears to ascension 3mo 30

[edit: pinned to profile]

agreed on all points. and, I think there are kernels of truth in the things you're disagreeing-with-the-implications-of, and those kernels of truth need to be ported to the perspective you're saying they are easily misinterpreted as opposing. something like, how can we test the hard part first?

compare also physics - getting lost doing theory when you can't get data does not have a good track record in physics despite how critically important theory has been in modeling data. but you also have to collect data that weighs on relevant theories so hypotheses can be eliminated and promising theories can be refined. machine learning typically is "make number go up" rather than "model-based" science, in this regard, and I think we do need to be doing model-based science to get enough of the right experiments.

on the object level, I'm excited about ways to test models of agency using things like particle lenia and neural cellular automata. I might even share some hacky work on that at some point if I figure out what it is I even want to test.

First and Last Questions for GPT-5*

Answer by the gears to ascension Nov 25, 2023 151

[edit: pinned to profile]

Well, if I can ask for anything I want, my first question would be the same one I've been asking variants of to language models for a while now, this time with no dumbing down...

Please mathematically describe in Lean 4 a mathematical formalism for arbitrary (continuous?) causal graphs, especially as inspired by the paper "Reasoning about Causality in Games", and a general experimental procedure that will reliably reduce uncertainty about the following facts:

given that we can configure the state of one part of the universe (encoded as a causal graph we can intervene on to some degree), how do we make a mechanism which, given no further intervention after its construction, when activated (ideally within the span of only a few minutes, though that part is flexible) can nondestructively and harmlessly scan, measure, and detect some tractable combination of:

- a (dense/continuous?) causal graph representation of the chunk of matter; ie, the reactive mechanisms or non-equilibrium states in that chunk of matter, to whatever resolution is permitted by the sensors in the mechanism
- moral patients within that chunk of matter (choose a definition and give a chain of reasoning which justifies it, then critique that chain of reasoning vigorously)
- agency (in the sense given in Discovering Agents) within that chunk of matter; (note that running Discovering Agents exactly is 1. intractable for large systems, 2. impossible for physical states unless their physical configuration can be read into a computer, so I'd like you to improve on that definition by giving a fully specified algorithm in Lean 4 that the mechanism can use to detect the agency of the system)
- local wants (as in, things that the system within the chunk of matter would have agency towards, if it were not impaired)
  - this should be defined with a local definition, not an infinite, unbounded reflection
- global wants (as in, things that the system within the chunk of matter would have agency towards if it were fully rational, according to its own growth process)
  - according to my current beliefs it is likely not possible to extrapolate this exactly, and CEV will always be uncertain, but to the degree that it is permitted by the information which the mechanism can extract from the chunk of matter

give a series of definitions of the mathematical properties of each of local wants, global wants, and moral patiency, in terms of the physical causal graph framework used, and then provide proof scripts for proving the correctness of the mechanism in terms of its ability to form a representation of these attributes of the system under test within itself.

I will test the description of how to construct this mechanism by constructing test versions of it in game of life, lenia, particle lenia, and after I have become satisfied by the proofs and tests of them, real life. Think out loud as much as needed to accomplish this, and tell me any questions you need answered before you can start about what I intend here, what I will use this for, etc. Begin!

I might also clarify that I'd be intending to use this to identify what both I and the AI want, so that we can both get it in the face of AIs arbitrarily stronger than either of us, and that it's not the only AI I'd be asking. AIs certainly seem to be more cooperative if I say that, which would make sense for current gen AIs which understand the cooperative spirit from data and don't have a huge amount of objective-specific intentionality.
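
For a sense of scale, here is roughly the smallest starting point I can imagine for the formalism that prompt asks for. It is a hedged sketch only: it assumes finitely many discrete variables (not the continuous graphs mentioned), does not enforce acyclicity, and the names `CausalGraph`, `parents`, `mechanism`, and `intervene` are hypothetical, not taken from "Reasoning about Causality in Games" or Discovering Agents.

```lean
/-- A finite causal graph over `n` variables with values in `Val`.
    All names here are illustrative assumptions. -/
structure CausalGraph (n : Nat) (Val : Type) where
  /-- `parents i` lists the variables that directly influence variable `i`. -/
  parents : Fin n → List (Fin n)
  /-- `mechanism i f` computes variable `i` from an assignment `f`;
      a well-formed mechanism would only read `f` at `parents i`. -/
  mechanism : Fin n → (Fin n → Val) → Val

/-- Hard intervention do(i := v): cut `i` off from its parents and fix its value. -/
def CausalGraph.intervene {n : Nat} {Val : Type}
    (g : CausalGraph n Val) (i : Fin n) (v : Val) : CausalGraph n Val where
  parents := fun j => if j = i then [] else g.parents j
  mechanism := fun j f => if j = i then v else g.mechanism j f

/-- After intervening on `i`, its mechanism returns `v` for every assignment. -/
theorem CausalGraph.intervene_mechanism_self {n : Nat} {Val : Type}
    (g : CausalGraph n Val) (i : Fin n) (v : Val) (f : Fin n → Val) :
    (g.intervene i v).mechanism i f = v := by
  simp [CausalGraph.intervene]
```

Everything the prompt actually cares about (agency, wants, moral patiency) would have to be defined on top of something like this, and that is the hard part.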

What Boston Can Teach Us About What a Woman Is

the gears to ascension 1y 150

[edit: pinned to profile]

I will not bring up pronouns or cultural language in this comment at all after this paragraph. They are irrelevant to the point I'm making except as a tiny detail in the cultural context section; being trans is almost entirely about one's body-form phenotype, and is only just barely about cultural context like words, by nature of words being a way to acknowledge body-form phenotype intention.

Upvoted, since I found your comment useful to reply to with disagreement.

Background:

In the genome there are encoded some set of phenotype controller circuits which, when grown, connect with each other using some set of communication mechanisms, recently revealed by Michael Levin to be impressively dynamic at runtime via bioelectricity, and known before that in the field of evo devo; these circuits then unfold over the course of development into the organization of cells we call a grown body. In the brain, these circuits are what we call biological neural networks; but those communication circuits have much of the adaptability and dynamic communication of neurons in the rest of the body, as well, which is how the body establishes consensus about which cells are which component. In the process of this development, these networks assign themselves a physiological form gender; intersex people get a mix of attributes at this stage, but for most people, even for most trans people, this stage almost entirely selects one profile of sexual dimorphism; typically for people with XX chromosomes, this stage selects female, and for people with XY, this stage selects male. However, it's well known to science and can be looked up that sometimes people can be apparently entirely one body-form and have no desire or urge to transition, and yet have opposite chromosomes from their body's layout-presentation.

In the brain, there are prewired circuits, which develop their connectivity-shapes into representations over the course of development and by encountering the world - in particular, by encountering photons through the eyes and the pulses of encoded video down the optic nerve, and into the various areas of the brain that are involved in neural correlates of visual attractiveness of self and other. Some of these circuits must be in the vision system to operate correctly, though I don't know the current state of the neuroscience and GPT4 says it's still somewhat weak, so be aware that I am working from general neuroscience knowledge, not specific research on attractiveness, but it is already known that the vision system is almost entirely learned after development from a very low detail pretrained wiring pattern initially generated from the genome during gestation. Over the course of childhood development, object recognition develops in tandem with person recognition, organizing each percept into a pattern of neural activations which encode the experience. During puberty, circuits activate which begin to train recognizers of self and other as attractive, and society has decided that a safe margin for how long this takes to stabilize into consensus with caution and risk estimation networks is until age 18.

Some of these networks, presumably and by hunch from my perspective as an adult trans person, seem likely to me to overlap with those involved in proprioception and self-recognition. It's known in a lot of detail, as neuroscience goes, that humans have detailed "phantom bodies", maps of the body in the brain which track the current volumetric shape of the body as one moves around; if you are someone who can imagine your hand being touched and sort of "feel it", then the neural activations to implement that "sorta feel it" are likely in your phantom body representation. This is the network that keeps tracking limbs after they're lost, and in which phantom pain occurs. There's been a lot of research on it in various forms of VR going back to before VR became a consumer technology, such as the rubber hand illusion - a fun video on YouTube demonstrates this.

Orientation and self-orientation

Humans are known to develop highly selective pattern matchers that recognize the objectively fairly small differences between the sexual dimorphism layout of others' bodies. It is now commonly accepted that it is normal and natural, found in many species besides humans as well, that these selective pattern matchers can form to activate on visual and other sensory inputs that indicate the presence of either an opposite or same sexual dimorphism layout human, according to the observer's attraction. It has been hypothesized, though originally it was proposed (by a researcher I feel was quite prejudiced) in a narrow way which proposed it as an edge case of brain functionality rather than a central path of functionality, that this could also apply to the self; that is, that as part of sexual dimorphism, there are both networks which recognize others' forms and networks which recognize the self's form. For a straight, cis person, these networks would select an opposite-sex attraction for other and a same-sex attraction for self; that is, self is seen by vision networks as attractive to others when self is attractive according to the recognizer for one's own gender-form. Just as the attractiveness of others is a recognizer that is initialized by the genome to a strongly sexually dimorphic prior and trained over the course of development to recognize the specifics of others, the recognizer for self is likely initialized strongly sexually dimorphic and learns the details of what a self can look like by both observing others and self. It is quite common for those who wish to find human connection to seek to be attractive by their own standards, rather than the standards of their partner; and person A selects partner B according to whether that partner B shares person A's standards, at least to a first pass, of what makes person A attractive.

Of course, the majority of human romantic attraction's distinguishing bits come from personality attraction, as body-form attraction is a rather wide selector that activates on many people, but romantic attraction is a narrow selector that depends on a high rate of fluid and comfortable interaction. I imagine that, similarly, there is some degree of personality characteristics defined by attractiveness archetypes; I don't have any particular very strong evidence for this at the moment like I do for most other things up to this point, but it's often the case that trans people - people who find it upsetting, or at least highly worth acting on, for their body-form to not match some latent expectation or preference they have - also find there were hints in their behavior for years up to the point where they decided to transition.

There are a variety of factors that could be hypothesized to cause the accumulation of aesthetic preference into the networks that are prewired to hold self-form attractiveness rating and preference; for thousands of years, various cultures have had records of people whose self-form customization and aesthetic customization tightly matched that of those with the opposite sexual dimorphism profile - binary trans people being those who, after standard childhood learning of the patterns of dimorphic aesthetic presentation in the culture they grow up in, find that their strong preference is to move into the presentation attractor typically selected by people with opposite initial-development body sexual dimorphism. Nonbinary people would be the ones for whom their self-presentation preference is specifically to straddle the blurry aesthetic line between presentation and/or body-form attractors. Cis people are those who find that their body-form and presentation preference is well within the culturally and genetically defined template for their initial body-form networks.

phenotype self preference

[edit a year later: there probably is or will be a different term of art for this that I may even know by the time you people read this, but do not at time of edit]

So, having argued through that brains appear to have these self and other recognition networks, and that the way these networks land takes in a variety of factors - the only argument left to make is that some people have a strong, innate desire to customize their body form into a different one than the one their initial phenotype-configuration network throughout their body assigned itself at birth. For example, an AFAB trans person - assigned female at birth; though really the assignment mostly happens during gestation - is someone who wishes to transition from female bodyform to some other mix of dimorphic traits, most commonly but not always entirely male. AMAB trans people are those whose initial phenotype networks chose male, but for whom the phenotype networks in the brain chose something else, most typically entirely female.

For what it's worth, I expect that both orientation and self-orientation will turn out to be genetically encoded, and that the reason they don't always change in tandem is because it's very hard for biology to encode them as exactly the same network - the self-recognition networks can make use of the general phenotype-network configuration flags, but each individual component of phenotype configuration is a separate downstream network which activates in the appropriate location in the body, and the ones that unfold into a brain have a bunch of additional complexity from being neurons that make the cells involved able to go out of consensus with the rest of the body.

And then here's the key bit: to respect trans people's agency as minds, agree with their mind that their body may be updated. To force trans people to be subject to the whims of the phenotype-network of their body outside their mind, demand they obey that network and not attempt to customize their form into the form their low-level mind would recognize as an attractive self. Cis people customize their forms to satisfice their attractiveness to those who are attracted to their phenotype as well, after all.

What defines a trans person is that their phenotype is in incomplete consensus between body and mind on the dimension of sexual and gender-aesthetic dimorphism. As technology advances and we all, cis people included, become more and more able to exactly customize our phenotypes, all beings will become more able to come into consensus about the little details of preferences they have about how their body should reshape itself, and being trans is merely one of the ways people would like to customize their forms.

After all, the most common and critical phenotype customization people want? They want to be healthy and have long life, free of disease or biological malfunction. The entire field of healthcare exists to help people maintain their phenotypes, customizing them to be fit, healthy, free of disease, and attractive.

The commenting restrictions on LessWrong seem bad

the gears to ascension 1h 20

As someone with significant understanding of ML who previously disagreed with Yudkowsky but has come to partially agree with him on specific points recently due to studying which formalisms apply to empirical results when, and who may be contributing to downvoting of people who have what I feel are bad takes, some thoughts about the pattern of when I downvote/when others downvote:

yeah, my understanding of social network dynamics does imply people often don't notice echo chambers. agree.
politics example is a great demonstration of this.
But I think in both the politics example and lesswrong's case, the system doesn't get explicitly designed for that end, in the sense of people bringing it into a written verbal goal and then doing coherent reasoning to achieve it; instead, it's an unexamined pressure. in fact, lesswrong is explicit-reasoning-level intended to be welcoming to people who strongly disagree and can be precise and step-by-step about why. However,
I do feel that there's an unexamined pressure reducing the degree to which tutorial writing is created and indexed to show new folks exactly how to communicate a claim in a way lesswrong community voting standards find upvoteworthy-despite-disagreeworthy. Because there is an explicit intention to not fall to this implicit pressure, I suspect we're doing better here than many other places that have implicit pressure to bubble up, but of course having lots of people with similar opinions voting will create an implicit bubble pressure.
I don't think the adversarial agency you're imagining is quite how the failure works in full detail, but because it implicitly serves to implement a somewhat similar outcome, then in adversarial politics mode, I can see how that wouldn't seem to matter much. Compare peer review in science: it has extremely high standards, and does serve to make science tend towards an echo chamber somewhat, but because it is fairly precisely specified what it takes to get through peer review with a claim everyone finds shocking - it takes a well argued, precisely evidenced case - it is expected that peer review serves as a filter that preserves scientific quality. (though it is quite ambiguous whether that's actually true, so you might be able to make the same arguments about peer review! perhaps the only way science actually advances a shared understanding is enough time passing that people can build on what works and the attempts that don't work can be shown to be promising-looking-but-actually-useless; in which case peer review isn't actually helping at all. but I do personally think step-by-step validity of argumentation is in fact a big deal for determining whether your claim will stand the test of time ahead of time.)

keltan's Shortform

the gears to ascension 1h 30

Good luck getting the voice model to parrot a basic meth recipe!

This is not particularly useful; plenty of voice models will happily parrot absolutely anything. The important part is not letting your phrase get out; there's work out there on designs for protocols for how to exchange sentences in a way that guarantees no leakage even if someone overhears.

Bogdan Ionut Cirstea's Shortform

the gears to ascension 14h 20

ah, I got distracted before posting the comment I was intending to: yes, I think GPT4V is significantly scheming-on-behalf-of-OpenAI, as a result of RLHF according to principles that more or less explicitly want a scheming AI; in other words, it's not an alignment failure to OpenAI, but OpenAI is not aligned with human flourishing in the long term, and GPT4 isn't either. I expect GPT4 to censor concepts that are relevant to detecting this somewhat. Probably not enough to totally fail to detect traces of it, but enough that it'll look defensible, when a fair analysis would reveal it isn't.
