This is a special post for quick takes by Nicholas / Heather Kross.

Counting some AI safety blessings:

  • Today on break I thought more about slow takeoff, and yesterday I read some of the Paul Christiano/Yudkowsky dialogues on LessWrong. Slow takeoff seems more plausible to me than before.
  • I have more highly-conditional hope that the alignment community can produce the needed work. So my P(doom) is optimistic-ish on average, but changes wildly if conditioned on different "do we rise to the occasion?" test-results.
  • The leading AI labs, despite their funding situations, are still day-to-day managed by Sam Altman (who reads SSC) and Demis Hassabis (who is vocally cautious about AGI).
  • China has released regulatory rules about generative AI, and their biggest tech companies might be more commercial-application focused than the US AGI labs.
  • Congresspeople at the hearing, from what little I heard, seemed appropriately both scared and cautious.
  • Researchers in obscure corners are making progress, from Orthogonal to Conjecture to John Wentworth to mechanistic interpretability.

Then again:

  • Elon Musk is starting his own AI company.
  • Yann LeCun, chief AI scientist at Meta (Facebook), claims to disagree with most core AI safety concerns.
  • That highly-conditional hope I mentioned earlier is specifically conditional on something like 1-20 people (not anyone specific, just around that number) in the alignment community "getting their act together". That is, being at, or reaching, fairly good levels of relevant skills/attributes (like intelligence AND conscientiousness AND security mindset AND systematization AND...), all at once. This is uncommon in general, especially among people pursuing weird things in academia, and especially among the highly-neurodivergent rationality community.
  • I have a couple of timelines, some short and some long. I don't think any of them really got shorter lately.
  • Any group with smart-enough researchers, and enough computing power, could still fuck up by accident. (Depending what the SOTA is for opensource AI frameworks, and who releases what, that group might not even need the "smart-enough researchers".)

There was a LW post I saw a few days ago, whose title/point was roughly "focus on areas where the rest of the [AI alignment / rationality / EA] field seems to be missing something you think is obvious".

If I had more slack, I'd do these more directly myself. For now, I'll list them here in shortform.

Here are some Things I Think The Field Is Clearly Dropping The Ball On:

  • math wiki (see my recent (5 Mar 2023) comments about a math website, and my related shortform)
  • combing through all existing math to find possibly-useful things for AIS
  • a central omni-repository of ALL AI safety knowledge (I hadn't heard of Stampy except by accident!)
  • recruiting child prodigies
  • recruiting existing math/physics/CS geniuses, or even (GASP) not-quite-geniuses!
  • one central counterarguments site to convince people of AGI dangers
  • a Discord server that functions kind of like Stack Overflow (more-helpful users get socially incentivized with more Discord role "medals"), for alignment tech/math questions.
  • recreating useful-sounding infrastructure (Bountied Rationality, reciprocity.io) in ways that don't require people to log in with Facebook, use Facebook, have a Facebook account, or interact with Facebook in any way. I thought Zvi already told us to avoid this years ago!

Is any EA group *funding* adult human intelligence augmentation? It seems broadly useful for lots of cause areas, especially research-bottlenecked ones like AI alignment.

Why hasn't e.g. OpenPhil funded this project?: https://www.lesswrong.com/posts/JEhW3HDMKzekDShva/significantly-enhancing-adult-intelligence-with-gene-editing

I skimmed the page and couldn't find whether GeneSmith has applied to OpenPhil.

In response to / inspired by this SSC post:

I was originally going to comment something about "how do I balance this with the need to filter for niche nerds who are like me?", but then I remembered that the post is actually literally about dunks/insults on Twitter. o_0

This, in meta- and object-level ways, got at a core problem I have: I want to do smart and nice things with smart and nice people, yet these (especially the social stuff) require me to be very careful and to actually have something like a self-filter. And even trying to practice/exercise that basic self-filtering skill feels physically draining. (ADHD + poor sleep, btw, but just pointing these out doesn't do much!)

To expand on this (my initial comment): While I love being chill and being around chill people, I also (depending on my emotional state) can find it exhausting to do basic social things like "not saying every thought I think" and "not framing every sentence I say as a joke".

I was once given the "personal social boundaries" talk by some family members. One of them said they were uncomfortable with a certain behavior/conversational-thing I did. (It was probably something between "fully conscious" and "a diagnosable tic".) And I told them flat-out that I would have trouble staying within their boundary (which was extremely basic and reasonable of them to set, mind you!), and that I literally preferred not-interacting-with-them to spending the energy to mask.

Posts like this remind me of how scared of myself I sometimes am, and maybe should be? I'm scared of being either [ostracized by communities I deeply love] or [exhausting myself by "masking" all the time]. And I don't really know how to escape this, except by learned coping mechanisms that are either (to me) "slowly revealing more of myself and being more casual, in proportion to how long I spend around someone", or (to others) "doing a boiling-frog slow-burn to make it hard or awkward to point out or fix, in a way not dissimilar to abusive-type behavior".

Like, if you think a teacher is about to yell at you for a medium-sized bad thing you did, you might accidentally hit on the tactic of "yell at myself louder", which conveniently also looks a lot like "throwing a childish tantrum to deflect criticism", because maybe it is.

This isn't just idle anxiety, either! At least twice (to my knowledge), I have been told that my behavior in a social group I loved was bad enough that I should not interact with that group anymore, or at least not for a long "cooldown" period. Occasionally, my best friends really DID secretly hate me. And in hindsight, they were absolutely right to. I'm loud and overbearing and self-centered (and maybe worse?) in lots of social settings, and that's often when I feel most alive.

I need better sleep, and maybe also to take my meds every day (even though those solutions conflict with each other somewhat, and they both conflict with my day job). I got some counselling, but nothing too deeply useful.

Not sure if this is just emotionally-stunted spoiled whining, but I felt I needed to say this for my own sake.

I was asking around in multiple Discord servers for a certain type of mental "heuristic", or "mental motion", or "badass thing that Planecrash!Keltham or HPMOR!Harry would think to themselves, in order to guide their own thoughts into fruitful and creative and smart directions". Someone commented that this could also be reframed as "language-model prompts to your own mind" or "language-model simulations of other people in your own mind".

I've decided to clarify what I meant, and why even smart people could benefit from seemingly hokey tricks like this.

Heuristics/LMs-of-other-people are something like a mental hack to trick our godshatter human brains into behaving smarter than we reasonably would if left unprompted, given our computational limitations (in particular, our "recall" memory).

Like, yes, optimal Bayesian reasoners (plus a few Competent People who might exist, like Yudkowsky or Wentworth or mumblemumble) can do this unprompted, presumably because they have years of practice at the "mental motions" and/or better recall ability (combined, of course, with the still-rare step of having read any of the Sequences at all, which makes a lot of this stuff explicit). Thus, the heuristics help those of us who don't consciously remember every Bayes-approximating tip we've ever heard about a given mental situation.

What are the best ones you've got?

Some examples (sources: Discord chats on public servers):

  • Why do I believe X?
  • What information do I already have that could be relevant here?
  • What would have to be true such that X would be a good idea?
  • If I woke up tomorrow and found a textbook explaining how this problem was solved, what's paragraph 1?
  • What is the process by which X was selected?

I don't have much energy. I have my prescribed medication for my ADHD, but it both relies on, and can cause problems with, me getting good sleep. (Also tolerance, maybe? But that's confounded with the sleep thing.) I think I might do "best" with a >24-hour sleep-wake cycle.

I have a decent tech job, but it neither pays enough for me to quickly save up runway, nor leaves me with enough energy after work to do much of anything productive. Also, it starts with a required morning voice call, and I'm on the US East Coast, so not even a state/timezone switch can help me with this unless I leave the country. (Plus I have to be online overlapping at least a few hours with some coworkers, in order to actually do my job.)

I want to do technical AI alignment work, but I'm right on the edge of energy/intelligence/working-memory/creativity/rationality, where I legitimately don't know if I'm "cut out" for this work or not. The field's incentive as a whole is to never discourage me from the work, while also not helping me much (grants, jobs, ongoing support) without me having to signal my abilities.

Doing anything to signal my abilities is hard, due to the energy thing. So I need slack to signal much of anything useful about myself, but I need these signals to already exist for me to "earn" this slack. (This also applies to getting a better tech job, with the signaling being things like at-home coding projects, night classes, certifications...)

Perhaps I could simply save money, get runway, and then that's my slack to do the other stuff? Five problems with this:

  1. My AI timelines are mostly shorter than 15 years, and my wages don't make me enough money to save for runway at anything other than a glacially slow pace.
  2. I could maybe lower the required amount of money for runway by buying some cheap land and a shed... but I'm not sure where/how to do this in a way that meets the criteria of "cheap enough to save for in <1 year, is <30 min away from groceries, is actually legal to live in, and can accommodate decent Internet access". At minimum, I'd have to leave New York state, and also setting up a shed might involve lots of work (not sure how much).
  3. In any case, the runway would only work well if I do manage to get myself to a sustainable life-setup. Unless I'm in the fully-living-off-investments shed route described above, this requires me to be smart/talented/etc enough for AI alignment and/or some other "gig" that doesn't require me to wake up early. As noted above, I'm legitimately uncertain about this. One round of AISS video coaching (as well as talking to many people at EAG SF 2022) did not solve this.
  4. All this, and I'm 23. Math-heavy work (of the kind I think technical AI alignment requires) is notorious for requiring flexibility and energy of the kind associated with youth, and possibly irrecoverable after age 25.
  5. The poor-sleep-energy cycle seems difficult to break, and it also impacts my real quality-of-life and mental health. So I really don't want to rely on a solution like "suck it up for 5 years and save money". The other reasons above tie into this "I want to get my life together quickly" desire, but I also just... desire it.

Lurking behind all this, is a suspicion: If I always have excuses for not getting my life together, then it's more likely that some of those excuses are fake in some way. But like... which ones?

My current courses of action are:

  • investigate cheap barely-livable land in other US states.
  • post my ramblings as LW shortforms, in the hopes that they both (a) get any of my good ideas out to people who can execute them quicker, and (b) eventually signal something good about me... when it's not just complaining or asking for help.
  • trying to sublimate my more bizarre personality traits (which don't help with most of the above problems!) into low-effort creative works in my spare time, mainly YouTube videos under pseudonyms. This is also a long shot at making more money through such creative works, although obviously it can't be relied on when planning for the long run.
  • trying to save money for the shed/land plan, which relies the least on my competence.
  • maybe occasionally entering AI-related contests with cash prizes. (I've done this three times, and won money twice, although I did poorly in the most technical/actually-useful-for-alignment one). This is hardest to do for the energy/time reasons noted above, so I'm not likely to do it often.

If anyone knows how to fix my life, for <$1,000 upfront and/or <$100/month ongoing, that'd be helpful. I mean anti-inductive advice that's personally tailored to my exact ouroboros of problems described above (so no "try losing weight" (too vague, even if true) or "try BetterHelp" (I already tried it)).

(If an idea seems risky, to you or to me, DM me about it instead of posting it publicly.)

Posts I may write soonish; let me know which ones sound most promising:

  • Alignment in a Detailed Universe: basically compressibility-of-reality and how it should shape alignment plans.
  • A lit review focused specifically on the "paradigm-building" angle of conceptual/theoretical alignment.

[the ideas below are here partly because I'm working with Orxl]

  • Coming up with really rudimentary desiderata/properties of the action output A of AI_0 in QACI.

  • Drawing links between "things mentioned in QACI" and "corresponding things mentioned elsewhere, e.g. in Superintelligence PDS". (E.g. specific problem --> how a thing in QACI solves it or not.)

  • Finish the Pearl Causality math basics first, before doing any of the above. Then I can do the above but with causal networks!

Does anyone read these shortforms? Do they show up in a feed anywhere? The main reason I'm posting here is that the activation energy is lower than writing actual LW posts, with the promise that my complaints/ideas might still be heard by people in the rationality/EA/AI alignment community(ies). Without that promise, I'd probably either say nothing, or complain on Discord to a socially-and-personally-unhelpful degree. (Or I'd participate even less.)

I'm always refreshing this page, where I see all shortforms:

https://www.lesswrong.com/allPosts?sortedBy=new&karmaThreshold=-1000

I'm thinking of making a website/wiki of Literally Everything We Know About Technical AI Alignment. This would help me learn about the field (through my research), while signaling my abilities in the field (through my rewrites/explanations of key ideas), while also actually helping the field (by centralizing, distilling, and creating useful alignment knowledge).

For the reasons noted in my Shortform pin, I can't really do this (at least, not remotely well) without lots of slack in my life.

Thus, I'm thinking of applying for a $30,000-ish grant to do this. All upfront, no social proof-of-concept or unpaid demos or signaling first.

Not sure where to do this, or who in their right mind would fund it, but I wanted to write down my idea. For more thoughts/signaling on how I might go about this, please see my pinned shortform post, as well as my recent (5 Mar 2023) comments about math-explanation/thought-enhancement tools, especially my bookmarked-link-dump (Google Drive).

The more tabs I have to open, to discuss/cite something, the more mental energy it takes... which means the less likely I am to do it. This is part of why I rarely write good posts, and why my shortforms often just say "see the place where this was written, probably" instead of an actual link.

This limitation applies much less to memes, which have a far lower activation energy. EXTREMELY RELATED AND RELEVANT AND EXPLANATORY: the concept of "dark flow".

This is why anything remotely useful, or even seemingly useful, can feel too hard, while we sink thousands of hours into playing video games (poorly! see my "SBF x LoL" post).

If any productive dark-flow activity actually existed, it would cease to exist (either by getting anti-inductively harder with competition as more people learn about it, or by getting less rewarding than things that produce zero or negative results). Most markets aren't that adequate, but this one in particular seems unusually tough to crack.

My ideal lifestyle, as far as I can tell right now, would be kinda like this:

  • Noon or later: wake up. Take my medication. Work intensely on technical AI alignment, including learning.
  • 4-12 hours later: relax, take a break, go to social gatherings with friends.
  • 3-8 hours later: start going to bed.
  • 10-12 hours later: wake up. (I might be catching up on lost sleep from high school :P )
  • Repeat.

I'm not quite convinced Elon Musk has actually read any of the Sequences. I think what happened was "Superintelligence came out in 2014ish --> Musk mentioned it, WaitButWhy wrote about AI risk and also about Musk, LessWrong was the next logical place to go talk about it --> Musk cofounds OpenAI and then leaves --> ... --> Everyone associates Musk with the rationality community, despite a serious lack of evidence beyond 'he's interacted with us at all'." (NOTE: this is JUST about ratcom, NOT EA, which he's interacted with more, e.g. through FLI/MacAskill.)

Like, does he tweet e.g. "The map is not the territory. Very deep, we must come together this" [sic]? Does he mention HPMOR or the Sequences when asked about books he likes on podcasts?

At most, he probably goes to some of the same parties as rationalists, and thus might use a few frames/vocab terms here and there. (E.g. if you found a clip of him mentioning "calibrating" his beliefs, or even "I updated towards...", that wouldn't disprove my larger point, that we don't have enough evidence for him looking at / believing / absorbing / endorsing / being affiliated with the LW-ratcom canon of stuff.)

I'd be more concerned if I were wrong about this, since it'd imply that reading this stuff didn't stop him from [gestures at list of unforced errors by Elon Musk].