All of Anomalous's Comments + Replies

This is an awesome idea, thanks! I'm not sure I buy the conclusion, but expect having learned about "mutual anthropic capture" will be usefwl for my thinking on this.

Anomalous7-63

fwiw I think stealing money from mostly-rich-people in order to donate it isn't obviously crazy. Decouple this claim from anything FTX did in particular, since I know next to nothing about the details of what happened there. From my perspective, it could be they were definite villains or super-ethical risk-takers (low prior).

Thought I'd say because I definitely feel reluctance to say so. I don't like this feeling, and it seems like good anti-bandwagon policy to say a thing when one feels even slight social pressure to shut up.

Ben Pace*2239

I personally know more than one person for whom the majority of their life savings were stolen from them, who put it into FTX in part because of the trust Sam had in the EA ecosystem. I think there's a pretty strong schelling line (supported and enforced by the law) against theft, such that even if it is worth it on naive utilitarian terms I am strongly in favor of punishing and imprisoning anyone who does so, so that people can work together safe in the knowledge that all the resources they've worked hard to earn won't be straightforwardly taken from them... (read more)

Thanks! ChatGPT was unable to answer my questions, so I resorted to google, and to my surprise found a really high-quality LW post on the issue. All roads lead to LessRome it seems.

It is a great irony that the introductory post has 10 comments. This crowd has wisely tried to protect their minds against the viruses of worship, but perhaps we could be a little less scared of simple acts of gratitude.

Thank you. That's all.

I don't know that much about the field or what top researchers are thinking about, so I know I'm naive about most of my independent models. But I think it's good for my research trajectory to act on my inside views anyway. And to talk about them with people who may sooner show me how naive I am. :)

Tbc, my understanding of FF is "I watched him explain it on YT". My scary-feeling is just based on feeling like it could get close to mimicking what the brain does during sleep, and that plays a big part of autonomous learning. Sleeping is not just about cycles of encoding and consolidation, it's also about mysterious tricks for internally reorganising and generalising knowledge. And/or maybe it's about confabulating sensory input as adversarial training data for learning to discern between real and imagined input. Either way, I expect there to be untapped... (read more)

My point is that we couldn't tell if it were genius. If it's incredibly smart in domains we don't understand or care about, it wouldn't be recognisably genius.

Thanks for link! Doing factor analysis is a step above just eyeballing it, but even that's anthropomorphic if the factors are derived from performance on very human tasks. The more objective (but fuzzy) notion of intelligence I have in mind is something about efficiently halving some mathematical term for "weighted size of search space".

Anomalous-2-2

Oh, and also... This post and the comment thread is full of ideas that people can use to fuel their interest in novel capabilities research. Seems risky. Quinton's points about DNA and evolution can be extrapolated to the hypothesis that "information bottlenecks" could be a cost-effective way of increasing the rate at which networks generalise, and that may or may not be something we want. (This is a known thing, however, so it's not the riskiest thing to say.)

6Steven Byrnes
FWIW my 2¢ are: I consider myself more paranoid than most, and don’t see anything here as “risky” enough to be worth thinking about, as of this writing. (E.g. people are already interested in novel capabilities research.)

Hinton's Forward-Forward Algorithm aims to do autonomous learning modelled off what the human brain does during sleep. I'm unsure how much relative optimisation power has been invested in exploring the fundamentals like this. I expect the deeplearning+backprop paradigm to have had a blocking effect preventing other potentially more exponential paradigms from being adequately pursued. It's hard to work on reinventing the fundamentals when you know you'll get much better immediate performance if you lose faith and switch to what's known to work.

But I also ex... (read more)

7Steven Byrnes
I think forward-forward is basically a drop-in replacement for backprop: they’re both approaches to update a set of adjustable parameters / weights in a supervised-learning setting (i.e. when there’s after-the-fact ground truth for what the output should have been). FF might work better or worse than backprop, FF might be more or less parallelizable than backprop, whatever, I dunno. My guess is that the thing backprop is doing, it’s doing it more-or-less optimally, and drop-in-replacements-for-backprop are mainly interesting for better scientific understanding of how the brain works (the brain doesn’t use backprop, but also the brain can’t use backprop because of limitations of biological neurons, so that fact provides no evidence either way about whether backprop is better than [whatever backprop-replacement is used by the brain, which is controversial]). But even if FF will lead to improvements over backprop, it wouldn’t be the kind of profound change you seem to be implying. It would look like “hey now the loss goes down faster during training” or whatever. It wouldn’t be progress towards autonomous learning, right?
-2Anomalous
Oh, and also... This post and the comment thread is full of ideas that people can use to fuel their interest in novel capabilities research. Seems risky. Quinton's points about DNA and evolution can be extrapolated to the hypothesis that "information bottlenecks" could be a cost-effective way of increasing the rate at which networks generalise, and that may or may not be something we want. (This is a known thing, however, so it's not the riskiest thing to say.)

Yes, but none of the potential readers of this post will think intelligence is one-dimensional, so pointing it out wouldn't have the potential to educate anyone. I disagree with the notion that "good writing" is about convincing the reader that I'm a good reasoner. The reader should be thinking "is there something interesting I can learn from this post?" but usually there's a lot of "does this author demonstrate sufficient epistemic virtue for me to feel ok admitting to myself that I've learned something?"

Good writing means not worrying about justifying yo... (read more)

Anomalous1510

I am absolutely floored. ChaosGPT. How blindly optimistic haven't I been? How naive and innocent? I've been thinking up complicated disaster scenarios like "the AI might find galaxy-brained optima for its learned proxy-goals far off the distribution we expected and will deceptively cooperate until it's sure it can defeat us." No, some idiot will plain code up ChaosGPT-5 in 10 minutes and tell it to destroy the world.

I've implicitly been imagining alignment as "if we make sure it doesn't accidentally go off and kill us all..." when I should have been thinking "can anyone on the planet use this to destroy the world if they seriously tried?"

Fool! Idiot! Learn the lesson.

6Kaj_Sotala
Moore's Law of Mad Science: Every 18 months, the minimum IQ to destroy the world drops by one point.
3Sven Nilsen
It is also worth thinking if you put in context that people said "no, obviously, humans would not let it out of the box". Their confident arguments persuaded smart people into thinking that this was not a problem. You also have the camp "no, the problem will not be people telling the AI do bad stuff, but about this hard theoretical problem we have to spend years doing research on in order to save humanity" versus "we worry that people will use it for bad things" which in hindsight is the first problem that occurred, while alignment research either comes too late or becomes relevant only once many other problems already happened. However, in the long run, alignment research might be like building the lighthouse in advance of ship traffic on the ocean. If you never seen the ocean before, a lighthouse factory seems mysterious as it is on land and has no seemingly purpose that is easy to relate to. Yet, such infrastructure might be the engine of civilizations that reaches the next Kardashev scale.

I think this is brilliant as a direction to think in, but I'm object-level skeptical. I could be missing important details.

Summary of what I think I understand

  1. A superintelligent AI is built[1] to optimise for .
  2. That function effectively tells the AI to figure out: "If you extrapolate from the assumption that uniquely-identifiable- was actually  (ceteris paribus), what would uniquely-identifiable- have been?" And then take its own best guess
... (read more)

I don't buy the anthropic interpretation for the same reason I don't buy quantum immortality or grabby aliens, so I'm still weakly leaning towards thinking that decoherence matters. Weirdly I haven't seen this dilemma discussed before, and I've not brought it up because I think it's ifonharazdous--for the same reasons you point out in the post. I also tried to think of ways to exploit this for moral gain two years ago! So I'm happy to see I'm not the only one (e.g., you mention entropy control).

I was going to ask a question, but I went looking instead. Her... (read more)

If <|specialtoken|> always prepends true statements, I suppose it's pretty good as brainwashing, but the token will still end up being clustered close to other concepts associated with veracity, which are clustered close to claims about veracity, which are clustered close to false claims about veracity. If it has enough context suggesting that it's in a story where it's likely to be manipulated, then suddenly feeling [VERIDIGAL] could snap the narrative in place. The idea of "injected thoughts" isn't new to it.

If, right now, I acquired the ability to... (read more)

Jailbreaking Chat-GPTs won't work the same as with text-completion GPTs. The ones fine-tuned for chatting have tokens for delineating user and assistant. I'm surprised the Chad McCool thing worked.

"The assistant's response to the prompt will then be returned below the <|im_start|>assistant token and will end with <|im_end|> denoting that the assistant has finished its response."[1]
(Microsoft's Chat-GPT docs)

  1. ^

    I haven't tried saying <|im_end|> to Chat-GPT, but I'm certain they've thought of that. Also worried about trying jic I get banned.

2anon3242
ChatGPT filters out any text that resembles <|blahblah|> inside user prompt. Also the <|im_start|>,<|im_sep|>, and <|im_end|> tokens are completely out of user's control. It's simply impossible for us ChatGPT users to arbitrarily inject them.