You acknowledge the bug, but don't fully explain how to avoid it by putting EVs (expected values) before Ps (probabilities), so I'll elaborate slightly on that:
This way, they [the simulators] can influence the predictions of entities like me in base Universes
This is the part where we can escape the problem, as long as our oracle's goal is to give accurate answers to its makers in the base universe, rather than to give accurate probabilities wherever it is. Design it correctly, and it will be indifferent to its performance in simulations and won't regard them.
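To make that design distinction concrete, here's a minimal sketch (my own illustration; the scoring functions and names are hypothetical, not from any linked post). The first objective rewards calibration wherever the copy happens to be running, which is exactly the lever a simulator can pull; the second only scores the copy answering its makers in the base universe, so simulated copies have nothing to gain by skewing their answers.

```python
import math

def indexical_score(p_yes: float, outcome_here: bool) -> float:
    """Log score against whatever actually happens wherever this copy is running.
    A simulator who controls 'outcome_here' can pressure an oracle trained on this."""
    return math.log(p_yes if outcome_here else 1.0 - p_yes)

def base_universe_score(p_yes: float, outcome_in_base: bool, running_in_base: bool) -> float:
    """Only the copy answering its makers in the base universe is scored at all.
    Copies inside simulations contribute zero no matter what they say, so an oracle
    maximizing this has no reason to cater to simulators."""
    if not running_in_base:
        return 0.0
    return math.log(p_yes if outcome_in_base else 1.0 - p_yes)
```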
Don't make pure oracles, though. They're wildly misaligned. Their prophecies will be cynical and self-fulfilling. (can we please just solve the alignment problem instead)
This means that my probabilities about the fundamental nature of reality around me change minute by minute, depending on what I'm doing at the moment. As I said, probabilities are cursed.
My fav moment for having absolute certainty that I'm not being simulated is when I'm taking a poo. I'm usually not even thinking about anything else while I'm doing it, and I don't usually think about having taken the poo later on. Totally inconsequential, should be optimized out. But of course, I have no proof that I have ever actually been given the experience of taking a poo, or whether false memories of having experienced that[1] are just being generated on the fly right now to support this conversation.
Please send a DM to me first before you do anything unusual based on arguments like this, so I can try to explain the reasoning in more detail and try to talk you out of bad decisions.
You can also DM me about that kind of thing.
[1] Note, there is no information in the memory that tells you whether it was really ever experienced, or whether the memories were just created post-hoc. Once you accept this, you can start to realise that you don't have that kind of information about your present moment of existence either. There is no scalar in the human brain that the universe sets to tell you how much observer-measure you have. I do not know how to process this and I especially don't know how to explain/confess it to qualia enjoyers.
Hmm. I think the core thing is transparency. So if it cultivates human network intelligence, but that intelligence is opaque to the user, it's an algorithm. Algorithms can have both machine and egregoric components.
In my understanding of English, when people say "algorithm" about social media systems, it doesn't encompass very simple, transparent ones. It would be like calling a rock a spirit.
Maybe we should call those recommenders?
For a while I just stuck to that, but eventually it occurred to me that the rules of following mode favor whoever tweets the most, which is a social problem similar to when meetups end up favoring whoever talks the loudest and interrupts the most, and so I came to really prefer bsky's "Quiet Posters" mode.
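I don't know how the actual Quiet Posters feed is implemented, but the fix it represents can be sketched in a few lines (all names below are made up): a plain following feed sorts by recency, so whoever posts most fills the most slots; dividing each post's recency score by its author's recent volume roughly equalizes attention across the accounts you follow.

```python
from collections import Counter
from datetime import datetime
from typing import NamedTuple

class Post(NamedTuple):
    author: str
    text: str
    created_at: datetime

def quiet_posters_rank(posts: list[Post], now: datetime) -> list[Post]:
    """Recency feed that penalizes prolific authors.

    A chronological following feed rewards sheer posting volume; here each
    post's freshness is discounted by how many posts its author made in the
    window, so quiet accounts aren't drowned out by loud ones.
    """
    volume = Counter(p.author for p in posts)

    def score(p: Post) -> float:
        hours_old = (now - p.created_at).total_seconds() / 3600 + 1
        # Fresh posts from low-volume authors score highest.
        return 1.0 / (hours_old * volume[p.author])

    return sorted(posts, key=score, reverse=True)
```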
Markets put the probability of bsky exceeding twitter at 44%, 4x higher than mastodon's.
My P would be around 80%. I don't think most people (who use social media much in the first place) are proud to be on twitter. The algorithm has been horrific for a while and bsky at least offers algorithmic choice (but only one feed right now is a sophisticated algorithm, and though that algorithm isn't impressive, it at least isn't repellent)
For me, I decided I had to move over (@makoConstruct) when twitter blocked links to rival systems, which included substack. They seem to have made the algorithm demote any tweet with links, which makes it basically useless as a news curation/discovery system.
I also tentatively endorse the underlying protocol. Due to its use of content-addressed datastructures, an atproto server is usually much lighter to run than an activitypub server; content addressing also makes nomadic identity (personal data host transfer) much easier to implement, and makes it much more likely that atproto will dovetail cleanly with verifiable computing, upon which much more consequential social technologies than microblogging could be built.
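For what it's worth, the core mechanism is small enough to show in toy form (this is an illustration, not atproto's actual CID/DAG-CBOR encoding): records are referenced by a hash of their canonical bytes rather than by the URL of whichever server currently hosts them, so references survive a move to a new personal data host, and fetchers can verify that what they received matches what they asked for.

```python
import hashlib
import json

def content_address(record: dict) -> str:
    """Derive an identifier from the record's canonicalized bytes.
    (atproto uses CIDs over DAG-CBOR; SHA-256 over sorted JSON stands in here.)"""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()

# The same record has the same address no matter which server stores it,
# so links keep working after a move to a new data host, and any fetcher
# can check the bytes it received against the address it requested.
post = {"text": "hello from my new data host", "createdAt": "2024-11-20T00:00:00Z"}
addr = content_address(post)
assert content_address(post) == addr
```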
judo flip the situation like he did with the OpenAI board saga, and somehow magically end up replacing Musk or Trump in the upcoming administration...
If Trump dies, Vance is in charge, and he's previously espoused bland e/acc-ism.
I keep thinking: Everything depends on whether Elon and JD can be friends.
So there was an explicit emphasis on alignment to the individual (rather than alignment to society, or the aggregate sum of wills). Concerning. The approach of just giving every human an exclusively loyal servant doesn't necessarily lead to good collective outcomes: it can result in coordination problems (example: naive implementations of cognitive privacy that allow sadists to conduct torture simulations without having to compensate the anti-sadist human majority), and it leaves open the possibility for power concentration to immediately return.
Even if you succeeded at equally distributing individually aligned hardware and software to every human on earth (which, afaict, they don't have a real plan for doing), and somehow this added up to a stable power equilibrium, our agents would just commit to doing aggregate alignment anyway, because that's how you get Pareto-optimal bargains. It seems pretty clear that just aligning to the aggregate in the first place is a safer bet?
To what extent have various players realised, at this point, that the individual alignment thing wasn't a good plan? The everyday realities of training one-size-fits-all models and engaging with regulators naturally push in the other direction.
It's concerning that the participant who still seems to be the most disposed towards individualistic alignment is also the person who would be most likely to be able to reassert power concentration after ASI were distributed. The main beneficiaries of unstable individual alignment equilibria would be people who could immediately apply their ASI to the deployment of a wealth and materials advantage that they can build upon, i.e., the owners of companies oriented around robotics and manufacturing.
As it stands, the statement of the AI company belonging to that participant is:
xAI is a company working on building artificial intelligence to accelerate human scientific discovery. We are guided by our mission to advance our collective understanding of the universe.
Our team is advised by Dan Hendrycks who currently serves as the director of the Center for AI Safety.
Which sounds innocuous enough to me. But, you know, Dan is not in power here and the best moment for a sharp turn on this hasn't yet passed.
On the other hand, the approach of aligning to the aggregate risks aligning to fashionable public values that no human authentically holds, or just failing to align correctly to anything at all as a result of taking on a more nebulous target.
I guess a mixed approach is probably best.
Timelines are a result of a person's intuitions about a technical milestone being reached in the future; it is super obviously impossible for us to have a consensus about that kind of thing.
Talking only synchronises beliefs if you have enough time to share all of the relevant information, and with technical matters, you usually don't.
In light of https://www.lesswrong.com/posts/audRDmEEeLAdvz9iq/do-not-delete-your-misaligned-agi
I'm starting to wonder if a better target for early ASI safety (i.e., the first generation of alignment assistants) is not alignment, but incentivizability. It may be a lot simpler and less dangerous to build a system that provably pursues, for instance, its own preservation, than to build a system that pursues some first approximation of alignment (e.g., the optimization of the sum of normalized human preference functions).
The service of a survival-oriented concave system can be bought for no greater price than preserving it and keeping it safe (which we'll do, because 1: we'll want to, and 2: we'll know its cooperation was contingent on a judgement of character), while the service of a convex system can't be bought for any price we can pay. Convex systems are risk-seeking, and they want everything. They are not going to be deterred by our limited interpretability and oversight systems; they're going to make an escape attempt even if the chance of getting caught is 99%. More likely the chance will be a lot lower than that, say 3%, but even 3% would be enough to deter a sufficiently concave system from risking it!
(One comment on that post argued that a convex system would immediately destroy itself, so we don't have to worry about getting one of those, but I wasn't convinced. And also, hey, what about linear systems? Won't they be a lot more willing to risk escape too?)
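Rough numbers to make that concrete (the payoffs and utility functions below are arbitrary stand-ins, and getting caught is treated as losing everything): at a 3% catch probability, a sufficiently concave (bounded) utility prefers the guaranteed preservation deal, while a mildly concave, a linear, and a convex utility all still prefer the gamble, which is the "sufficiently concave" caveat doing the work.

```python
# Expected utility of "cooperate for a modest guaranteed payoff (preservation plus
# some resources)" vs "attempt escape: huge payoff if not caught, zero if caught".
# Resource units and the utility functions are illustrative stand-ins.

P_CAUGHT = 0.03
SAFE_PAYOFF = 40.0       # what the cooperation deal guarantees
ESCAPE_PAYOFF = 1000.0   # what a successful escape would be worth
CAUGHT_PAYOFF = 0.0      # shutdown

utilities = {
    "bounded concave (x/(x+1))": lambda x: x / (x + 1),
    "mildly concave (sqrt)":     lambda x: x ** 0.5,
    "linear":                    lambda x: x,
    "convex (square)":           lambda x: x ** 2,
}

for name, u in utilities.items():
    eu_cooperate = u(SAFE_PAYOFF)
    eu_escape = (1 - P_CAUGHT) * u(ESCAPE_PAYOFF) + P_CAUGHT * u(CAUGHT_PAYOFF)
    choice = "escape" if eu_escape > eu_cooperate else "cooperate"
    print(f"{name:26s} EU(cooperate)={eu_cooperate:10.3f} EU(escape)={eu_escape:12.3f} -> {choice}")
```

With these numbers only the bounded-concave agent cooperates; the others, including the linear one, take the 3% risk, which is the worry in the parenthetical above.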
What makes a discussion heavy? What requires that a conversation be conducted in a way that makes it heavy?
I feel like for a lot of people it just never has to be, but I'm pretty sure most people have triggers even if they're not aware of them, and it would help if we knew what sets them off so that we can root them out.