Sequences

Reinforcement Learning using Layered Morphology (RLLM)

Comments

I want to thank the team that brought this brilliant piece together.  This post helped me assemble the thoughts I've been struggling to understand over the past four months, and reading it made me reflect so much on my intellectual journey.  I pinned this post to my browser, a reminder to read it every single day for a month or more.[1] I feel I need to master deep honesty (as explained by the authors) to the point where it subconsciously becomes a filter for my thinking.

  1. ^

    I do this when I find a concept/post/book that I can mine for more thoughts, or one that requires mastering a conceptual framework.

Pathogens, whether natural or artificial, have a fairly well-defined attack surface: the hosts’ bodies. Human bodies are pretty much static targets, are the subject of massive research effort, have undergone eons of adaptation to be more or less defensible, and our ability to fight pathogens is increasingly well understood.


Misaligned ASI and pathogens don't have the same attack surface. Thank you for pointing that out. A misaligned ASI will always take the shortest path to completing any task, as this is the least resource-intensive path to take.

The space of risks is endless when we are talking about intelligent organisms.

Yeah, I saw your other replies in another thread, and I was able to test it myself later today. Yup, it's most likely OpenAI's new LLM. I'm just still confused about why they would call it gpt2.

Copy and pasting an entire paper/blog and asking the model to summarize it? This isn't hard to do, and it's very easy to check whether there are enough tokens: just run the text through any BPE tokenizer available online.
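A minimal sketch of that check, done locally with OpenAI's tiktoken library and its gpt2 encoding (the file name here is just a placeholder for the pasted text):

```python
# Count how many GPT-2 BPE tokens a pasted paper/blog occupies.
import tiktoken

encoding = tiktoken.get_encoding("gpt2")  # the original GPT-2 vocabulary

with open("paper.txt") as f:  # hypothetical file holding the pasted text
    text = f.read()

print(len(encoding.encode(text)), "tokens")
```

Anything past 1024 tokens would have to be truncated before the original gpt2 could see it.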

I'm not entirely sure if it's the same gpt2 model I've been experimenting with over the past year. If I get my hands on it, I will surely try to stretch its context window and see if it exceeds 1024 tokens, to test whether it's really gpt2.
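A minimal sketch of that stretch test, assuming local access to the reference checkpoint via Hugging Face transformers (this only establishes what genuine gpt2 does): the original weights have a learned positional embedding table with exactly 1024 rows, so any longer input fails outright.

```python
# Probe GPT-2's hard context limit: position 1024 has no learned
# positional embedding, so a 1025-token input raises an indexing error.
import torch
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")
print(model.config.n_positions)  # 1024 for every original GPT-2 checkpoint

ids = torch.randint(0, model.config.vocab_size, (1, 1025))
try:
    model(ids)
except Exception as err:
    print("Rejected, as expected for genuine gpt2:", type(err).__name__)
```

A model that comfortably handles prompts well past 1024 tokens cannot be the original gpt2.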

But if your goal is to achieve high counterfactual impact in your own research, then you should probably draw inspiration from the opposite: "singular" discoveries, i.e. discoveries which nobody else was anywhere close to figuring out.


This idea reminds me of the concepts in this post: Focus on the places where you feel shocked everyone's dropping the ball.

I don't think this phenomenon is related to the training data alone, because in RLLMv3 the " Leilan" glitch mode persisted while " petertodd" became entirely unrelated to bitcoin. It's as if some glitch tokens can be affected by the amount of re-training and some can't. I believe something much deeper is happening here: an architectural flaw that might be related to the token selection/construction process.
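As a sketch of where one could start looking (run against the public gpt2 checkpoint, not RLLMv3; the centroid-distance heuristic is borrowed from the glitch-token literature and is an assumption here): glitch tokens such as " petertodd" and " Leilan" are single entries in the GPT-2 vocabulary, and under-trained tokens tend to sit anomalously close to the mean embedding.

```python
# Inspect suspected glitch tokens in the GPT-2 vocabulary and compare
# each token embedding's distance to the embedding centroid.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
emb = model.transformer.wte.weight.detach()  # (50257, 768) token embeddings
centroid = emb.mean(dim=0)

for text in [" petertodd", " Leilan", " hello"]:  # " hello" as a baseline
    ids = tokenizer.encode(text)
    if len(ids) != 1:
        print(f"{text!r} splits into {len(ids)} tokens: {ids}")
        continue
    dist = torch.norm(emb[ids[0]] - centroid).item()
    print(f"{text!r} -> token {ids[0]}, distance to centroid: {dist:.3f}")
```

An anomalously small distance for a glitch token would point at the token construction process rather than the re-training data.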
