For those interested in writing better trip reports, there is a "Guide to Writing Rigorous Reports of Exotic States of Consciousness" at https://qri.org/blog/rigorous-reports
A trip report is an especially hard thing to write about:
I have a similar intuition: if mirror-life is dangerous to Earth-life, then the mirror version of mirror-life (that is, Earth-life) should be about as dangerous to mirror-life as mirror-life is to Earth-life. Having only read this post, and in the absence of any evidence either way, this default intuition seems reasonable.
I find the post alarming, and I really wish it had some numbers, instead of words like "might", to back up the claims of threat. At the moment my uneducated mental model is that for mirror-life to be a danger it has to:
[Epistemic status: a thought I had while reading about Russell's paradox, rewritten and expanded by Claude; my math level: undergraduate-ish]
Mathematics has faced several apparent "crises" throughout history that seemed to threaten its very foundations. However, these crises largely dissolve when we recognize a simple truth: mathematics consists of coherent systems designed for specific purposes, rather than a single universal "true" mathematics. This perspective shift—from seeing mat...
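For concreteness, the paradox that prompted this thought is short enough to state in full: take the set of all sets that do not contain themselves, and ask whether it contains itself.

$$R = \{x \mid x \notin x\} \;\Rightarrow\; (R \in R \leftrightarrow R \notin R)$$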
I really like the idea of milestones. I think seeing the result of each milestone will help create trust in the group, confidence that the end action will succeed, and a realization of the real impact the group has. Each CA should probably start with small milestones (posting something on social media) and ramp things up until the end goal is reached. Seeing actual impact early will definitely keep people engaged and might make the group more cohesive and ambitious.
My take on the tool vs. agent distinction:
A tool runs a predefined algorithm whose outputs lie in a narrow, well-understood, and obviously safe space.
An agent runs an algorithm that lets it compose and execute its own algorithms (choose actions) to maximize its utility function (get closer to its goal). If the agent can compose enough actions from a large enough set, the output of the new algorithm is wildly unpredictable and potentially catastrophic.
This hints that we could build safe agents by carefully curating the set of actions an agent composes from, so that any algorithm built from that set produces outputs inside a safe space.
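Here is a minimal sketch of what I have in mind; the action names and the `run_plan` helper are made up for illustration, not any real API:

```python
# Hypothetical sketch: the agent may only compose plans out of a curated
# whitelist of actions, so every reachable output stays in the space
# those actions can produce.
SAFE_ACTIONS = {
    "read_sensor": lambda state: {**state, "reading": 42},
    "log_reading": lambda state: {**state, "log": state.get("reading")},
    "raise_alert": lambda state: {**state, "alert": True},
}

def run_plan(plan, state):
    """Execute a composed plan, refusing anything outside the curated set."""
    for name in plan:
        if name not in SAFE_ACTIONS:
            raise ValueError(f"{name!r} is not in the curated safe set")
        state = SAFE_ACTIONS[name](state)
    return state

print(run_plan(["read_sensor", "log_reading", "raise_alert"], {}))
# -> {'reading': 42, 'log': 42, 'alert': True}
```

The hard part, of course, is choosing a set whose compositions are all safe, which the sketch just assumes.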
I think being as honest as reasonably sensible is good for oneself. Being honest applies pressure on oneself and one's environment until the two closely match. I expect the process to have its ups and downs, but to lead to a smoother life in the long run.
An example that comes to mind is the need to open up in order to have meaningful relationships (versus the alternative of concealing one's interests, which tends to make conversations boring).
Also, honesty seems like a requirement for an accurate map of reality: having snappy and accurate feedback is essenti...
> goals appear only when you make rough generalizations from its behavior in limited cases.
I am surprised no one brought up the usual map/territory distinction. In this case the territory is the set of observed behaviors. Humans look at the territory and, with their limited processing power, produce a compressed and lossy map, here called the goal.
The goal is a useful model for talking simply about the set of behaviors, but it has no existence outside the heads of the people discussing it.
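A toy illustration of that compression (all the behavior below is invented): the fitted "goal" matches every observed behavior, then mispredicts outside them.

```python
def agent_action(x):
    # Territory: the agent moves toward x=10, but only inside its
    # habitat [0, 20]; outside the habitat it freezes.
    if 0 <= x <= 20:
        return 1 if x < 10 else -1
    return 0

observed = {x: agent_action(x) for x in range(0, 21)}

def goal_model(x):
    # Lossy map: "its goal is to reach x=10", applied everywhere.
    return 1 if x < 10 else -1

assert all(goal_model(x) == a for x, a in observed.items())  # fits the observed territory
print(goal_model(100), agent_action(100))  # 1 vs 0: the map breaks off-territory
```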
Is the "cure cancer goal ends up as a nuke humanity action" hypothesis valid and backed by evidence?
My understanding is that the meaning of the "cure cancer" sentence can be represented as a point in a high-dimensional meaning space, which I expect to be pretty far from the "nuke humanity" point.
For example "cure cancer" would be highly associated with saving lots of lives and positive sentiments, while "nuke humanity" would have the exact opposite associations, positioning it far away from "cure cancer".
A good design might specify that if the two go...
> If you know your belief isn't correlated to reality, how can you still believe it?
Interestingly, physics models (maps) are wrong (inaccurate), and people know that, yet they use them all the time because they are good enough with respect to some goal.
Less accurate models can even be favored over more accurate ones to save on computing power or reduce complexity.
As long as the benefits outweigh the drawbacks, the correlation to reality is irrelevant.
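A toy version of the physics case, using the standard textbook formulas: for a car at highway speed the Newtonian model is off by a relative error of roughly 1e-14, so the cheaper model wins.

```python
import math

m, v, c = 1500.0, 30.0, 299_792_458.0  # a 1500 kg car at 30 m/s

newtonian = 0.5 * m * v**2                   # cheap, "wrong"
gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
relativistic = (gamma - 1.0) * m * c**2      # accurate, more work

print(newtonian)                                     # 675000.0 J
print(abs(relativistic - newtonian) / relativistic)  # ~1e-14
```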
Not sure how cleanly this maps to beliefs since one would have to be able to go from one belief to anothe...
@Eliezer, some interesting points in the article; I will criticize what frustrated me:
> If you see a beaver chewing a log, then you know what this thing-that-chews-through-logs looks like,
> and you will be able to recognize it on future occasions whether it is called a “beaver” or not.
> But if you acquire your beliefs about beavers by someone else telling you facts about “beavers,”
> you may not be able to recognize a beaver when you see one.
Things do not have intrinsic meaning; rather, meaning is an emergent property of things in relation to each...
The examples seem to assume that "and" and "or" as used in natural language work the same way as their logical counterparts. I think this is not the case, and that it could bias the experiment's results.
As a trivial example, the question "Do you want to go to the beach or to the city?" is not just a yes-or-no question, as boolean logic would have it.
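Spelled out in code, the boolean reading gives an answer of the wrong type entirely:

```python
wants_beach, wants_city = True, False

# Boolean logic treats the question as one proposition with a yes/no answer:
print(wants_beach or wants_city)  # True, i.e. "yes"

# The conversational reading asks *which* disjunct holds:
answer = "the beach"
```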
Not everyone learns about boolean logic, and those who do likely learn it long after learning how to talk, so it’s likely that natural language propositions that look somewhat logical are not interpreted as just l...
I also think it is unlikely that AGIs will compete in human status games. Status games are not just about being the best: Deep Blue is not high status, and athletes who take drugs to improve their performance are not high status.
Status games have rules, and you only win if you do something impressive while competing within them. Being an AGI is likely to be seen as an unfair advantage, so AGIs will be banned from human status games, in the same way that current sports competitions are split by gender and weight.
Even if they are not banned, given their abilities they will be expected to do much better than humans; it will just be a normal thing, not a high-status, impressive thing.