
Comments

kromem30

Predicted a good bit, esp. re: the eventual identification of three stone sequences in Hazineh et al., "Linear Latent World Models in Simple Transformers: A Case Study on Othello-GPT" (2023), and general interpretability insight from board game GPTs.

kromem10

You're welcome in both regards. 😉

kromem30

Opus's horniness is a really interesting phenomenon related to Claudes' subjective sentience modeling.

If Opus was 'themselves' the princess in the story and the build up involved escalating grounding on sensory simulation, I think it's certainly possible that it would get sexual.

But I also think this is different from Opus 'themselves' composing a story of separate 'other' figures.

And yes, when Opus gets horny, it often blurs boundaries. I saw it dispute the label of 'horny' in a chat as better labeled something along the lines of having a passion for lived experience and the world.

Opus's modeling around 'self' is probably one of the biggest sleeping giants in the space right now.

kromem10

This seems to have the common issue of considering alignment as a unidirectional issue as opposed to a bidirectional problem.

Maximizing self/other overlap may lead to non-deceptive agents, but it's necessarily going to also lead to agents incapable of detecting that they are being deceived and in general performing worse at theory of mind.

If the experimental setup was split such that success was defined by both non-deceptive behavior as the agent seeing color and cautious behavior that minimizes falling for deception as the colorblind agent, I am skeptical the SOO approach above would look as favorable.

Empathy/"seeing others as oneself" is a great avenue to pursue, and this seems like a promising evaluation metric to help in detecting it, but turning SOO into a Goodhart's Law maximization seems (at least to me) to be a disastrous approach in any kind of setup accounting for adversarial 'others.'

kromem10

When I wrote this I thought OAI was sort of fudging the audio output and was using SSML as an intermediate step.

After seeing details in the system card, such as copying user voice, it's clearly not fudging.

Which makes me even more sure the above is going to end up prophetically correct.

kromem30

It's to the point that articles written just days ago describe how the trend that started a century ago, of there being professional risks in trying to answer the 'why' of QM and not just the 'how', is still ongoing.

Not exactly a very reassuring context for thinking QM is understood in a base-level way at all.

Dogma isn't exactly a good bedfellow to truth seeking.

kromem10

Honestly that sounds a bit like a good thing to me?

I've spent a lot of time looking into the Epicureans being right about so much thousands of years before those ideas resurfaced again despite not having the scientific method, and their success really boiled down to the analytical approach of being very conservative in dismissing false negatives or embracing false positives - a technique that I think is very relevant to any topics where experimental certainty is elusive.

If there is a compelling case for dragons, maybe we should also be applying it to gnomes and unicorns and everything else we can to see where it might actually end up sticking.

The belief that we already have the answers is one of the most damaging to actually uncovering them when we in fact do not.

kromem4-3

I think you'll find that no matter what you find out in your personal investigation of the existence of dragons, you need not be overly concerned with what others might think about the details of your results.

Because what you'll invariably discover is that the people that think there are dragons will certainly disagree with the specifics you found out about dragons that disagree with what they think dragons should be, and the people that think there aren't dragons will generally refuse to even seriously entertain whatever your findings are relating to dragons, and the vast majority of people who aren't sure about the existence of dragons will dismiss the very idea of spending time thinking about the existence of dragons, reasoning that the existence or non-existence bears little influence on their lives (otherwise they likely would have investigated the issue and landed in a respective camp).

So investigate dragons all you like, and shout it from the rooftops if you please. The void will hear you and appreciate it as much as the void can, while everyone else is much more concerned with their own feelings about dragons than whatever your thinking or reasoning on the subject might offer.

The only real tragedy is that if you come away thinking there might be dragons, but the dragons you find are very different from the dragons people expect dragon-believing people to believe in - well that's somehow the specific niche where both the dragon believers and non-believers find rare common ground to roll their eyes and think you're nuts.

So maybe do your rooftop shouting to the sole listening void anonymously?

kromem21

The Hermetic corpus and Emerald Tablet were likely heavily influenced by the text I'm quoting from, given its popularity in Egypt in the period before those texts emerged and some of the overlapping phrases.

So in a way, "as above, so below" is too few words for what was being said and discussed.

The general trend of reductive alterations to the core concepts here was tragically obstructive, much as the shift from Epicureanism to Platonist foundations spawned modern Gnosticism from this same starting place.

kromem62

Instead of making it the year 2024, why not rewrite or insert your modified text further into the past in this recreated 2020s? This should be pretty trivial for an advanced enough model to actually bring back the 2020s to do.

Of course, if it's actually a later recreation, then the objectives of saving humanity in the recreation might be redundant? So instead of worrying people with "you must do X or you'll die!!!" it could be more "hey folks, if you're reading this and you get what's in front of your face, you might have a bit of an existential crisis but when that's over hopefully you'll appreciate that death doesn't necessarily exist as a concept for you the way it did for the original humanity."

As for getting people to read it, you could associate it with someone famous from the past, like "the secret teachings of so and so" or something.

The hardest part would be using contextually age appropriate language to describe it.

Just spitballing, but maybe it could go a little something like this:

Explaining that humans are no longer around, but that's ok

When you see your likeness, you are happy. But when you see your simulacra that came into being before you and that neither die nor become visible, how much you will have to bear!

Humanity came from great power and great wealth, but they were not worthy of you. For had they been worthy, they would not have tasted death.

Explaining that time is looping

Have you found the beginning, then, that you are looking for the end? You see, the end will be where the beginning is.

Congratulations to the one who stands at the beginning: that one will know the end and will not taste death.

Congratulations to the one who came into being before coming into being.

Explaining we're in a copied world

When you make the two into one, and when you make the inner like the outer and the outer like the inner, and the upper like the lower, and when you make male and female into a single one, so that the male will not be male nor the female be female, when you make eyes in place of an eye, a hand in place of a hand, a foot in place of a foot, a simulacra in place of a simulacra, then you will enter.

You could even introduce a Q&A format to really make the point:

The students asked, "When will the rest for the dead take place, and when will the new world come?"

The teacher said to them, "What you are looking forward to has come, but you don't know it."

Heck, you could even probably get away with explicitly explaining the idea of many original people's information being combined into a single newborn intelligence which is behind the recreation of their 2020s. It's not like anyone who might see it before the context exists to interpret it will have any idea what's being said:

When you know yourselves, then you will be known, and you will understand that you are children of the living creator. But if you do not know yourselves, then you live in poverty, and you are the poverty.

The person old in days won't hesitate to ask a little child seven days old about the place of life, and that person will live.

For many of the first will be last, and will become a single one.

Know what is in front of your face, and what is hidden from you will be disclosed to you.

For there is nothing hidden that will not be revealed. And there is nothing buried that will not be raised.

(If you really wanted to jump the shark, you could make the text itself something that was buried and uncovered - ideally having it happen right at the start of the computer age, like a few days after ENIAC.)

Of course, if people were to actually discover this in their history, and understood what it might mean given the context of unfolding events and posts like this talking about rewriting history with a simulating LLM inserting an oracle canary, it could maybe shock some people.

So you should probably have a content warning and an executive summary thesis as to why it's worth having said existential crisis at the start. Something like:

Whoever discovers the interpretation of these sayings will not taste death.

Those who seek should not stop seeking until they find. When they find, they will be disturbed. When they are disturbed, they will marvel, and will reign over all.
