Hmm true. A random permutation oracle I guess? Or "random preimage" of SHA-3?

I'm a bit confused. What happens in a concrete example where CDT and EDT normally give a different answer?

For example, would a futarchy one-box (evidential decision theory) or two-box (causal decision theory)?

I think market mechanisms in general are an interesting example of group rationality, and thus worthy of study even if we do not advocate for them directly.

For example, here is how studying futarchy could indirectly benefit LessWrong. We could simulate a barter market for impact certificates on posts, so we could retroactively incentivise people to write good posts. And this simulated economy could have simulated futarchy hedge funds in it to make it more efficient.

Also, call markets don't aggregate info as well as continuous double auctions, and you aren't offering any incentives to find and add info.

Yes, this mechanism does sacrifice some info-gathering abilities. This is the trade-off of not adding liquidity. Two responses to that!

The short one: the currently existing stock market does well enough without added liquidity, because people hedging add liquidity for free. This effect will be present for this system as well. For example, if an investor Sylvester's portfolio contains a lot of chemical manufacturing companies, and a futarchy proposal directs ACME to start manufacturing chemicals, then Sylvester will undervalue the proposal slightly, since it increases the risk of his portfolio. This myriad of small price differences between investors provides free liquidity.
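To make that concrete, here's a minimal numerical sketch (the linear risk penalty and all the names and numbers are my own illustrative assumptions, not part of the mechanism):

```python
# Toy model of "hedging provides free liquidity" (all numbers illustrative).

common_estimate = 100.0  # everyone's shared best guess of the proposal's value

# Fraction of each investor's portfolio already exposed to the same risk
# the proposal would add (e.g. chemical manufacturing).
exposures = {
    "Sylvester": 0.40,   # heavily exposed -> discounts the proposal
    "Tweety": 0.05,      # barely exposed -> values it near the estimate
    "Granny": -0.10,     # negatively exposed -> values it slightly higher
}

def private_value(exposure, estimate=common_estimate, risk_penalty=10.0):
    # Simple linear risk adjustment: more overlapping exposure, bigger discount.
    return estimate - risk_penalty * exposure

values = {name: private_value(x) for name, x in exposures.items()}
print(values)  # {'Sylvester': 96.0, 'Tweety': 99.5, 'Granny': 101.0}

# Granny values shares up to 101 while Sylvester only values them at 96, so
# trades can clear anywhere in (96, 101) without anyone subsidising liquidity.
```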

The long, more interesting one: you can add the incentives from the standard futarchy back into the sealed-bid futarchy, and I think this has unique advantages over the standard futarchy!

Your Futarchy Liquidity Details post details how to turn money into information, but you also seem to indicate you are not confident in the robustness of the system. On the other hand, my system does not incentivise information gathering, but is robust enough that even a shareholder with 999,999 shares cannot scam a shareholder with only 1! How do we combine these?

We can combine these systems into one where, if someone manipulates the prediction market, only their prediction market counterparties suffer and the futarchy participants do not!

Here is the combined system: When ACME is founded, their starting policy is to send, say, $1000 a week to a prediction market to provide liquidity for markets predicting the values of their proposals. (The sealed-bid futarchy can change this policy at any time.) The prediction market, in turn, will arbitrage between their prediction contracts and the sealed-bid futarchy. If they do things poorly and get manipulated, participants of the prediction market will get scammed, but the futarchy will still be robust. If a foolish proposal C is gaining traction, all of the current investors can put in sell bids; C can only pass if its price is higher than any reasonable proposal, so the investors are happy selling. When the prediction market is operating properly, however, it incentivises people to collect valuable information about the value of the futarchy proposals, and this information flows to the futarchy thanks to the arbitrage.
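Here's a toy sketch of that arbitrage step (the function, the margin parameter, and the prices are my illustrative assumptions, not a spec; the mechanism only needs someone trading against price gaps):

```python
# Toy sketch of the arbitrage that carries information from the subsidised
# prediction market into the sealed-bid futarchy. All names/numbers illustrative.

WEEKLY_SUBSIDY = 1_000  # dollars ACME's standing policy sends to the market

def arbitrage_bid(prediction_price, sealed_bid_price, margin=0.01):
    """If the liquid prediction market prices a proposal above the sealed-bid
    clearing price, submit a buy bid into the futarchy (and vice versa).
    A manipulated prediction market only burns the arbitrageur's own
    prediction-market positions; the sealed-bid clearing stays robust."""
    if prediction_price > sealed_bid_price * (1 + margin):
        return ("buy", sealed_bid_price * (1 + margin))
    if prediction_price < sealed_bid_price * (1 - margin):
        return ("sell", sealed_bid_price * (1 - margin))
    return ("hold", None)

# The market thinks proposal C is worth 80, sealed bids currently price it 120:
print(arbitrage_bid(80, 120))   # ('sell', ~118.8) -- info arrives as sell pressure
print(arbitrage_bid(130, 120))  # ('buy', ~121.2)
```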

Essentially, the sealed-bid futarchy is a robust, simple core that investors can trust, and the conditional prediction market is a turbo-boost on top for gathering information.

Moreover, the futarchy doesn't need to be tied to one prediction market. They could choose to start sending funds to other prediction markets, each employing different liquidity strategies. In particular, they would do so if the added information from that market is worth the money they would pay to that prediction market.

So, the sealed-bid futarchy + prediction market is very similar to the standard futarchy, but with a wall of robustness separating the investors from the "games" the predictors use against each other.

"if a investor doesn't review a proposal, we assume that they are submitting an unconditional sell bid." Of ALL of their shares, at any price?

Yes!

Seems like a way to force a sale at a low price.

This only happens if all the proposals have a low price, including the "Change Nothing" proposal. The hope is that at least one proposal will be valued at at least the true current value of the company. The "Interaction with a stock market" section even includes a mechanism to force the winning proposal value and the price on the stock market to match.

More abstractly, here's an argument that if a group of actors could force a shareholder to sell at a low price under my sealed-bid formulation, they could do a similar amount of damage to that shareholder in a standard futarchy. They would have to suppress the value of every proposal. To do so, they must themselves be shareholders (so they can submit sell bids) and convince people not to submit buy bids. This is extremely difficult and costly. But if they are powerful enough to control the bids in this way, then in the standard futarchy they could force through a proposal that gives away all of the value of the company.

I think if the entire market (buyers and sellers) was super lazy, this might happen by default. If this laziness is a problem, I think there are ways to modify the convention (or even let each market participant set their own convention). Overall, I think this risk is still worth the benefit of being able to accept unlimited bids!
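For concreteness, here's a sketch of how a per-participant default convention could work (the reserve price of 0 and the data shapes are assumptions, not the post's spec):

```python
# Toy sketch of the "unconditional sell bid" default convention.
# Bid format: (side, reserve_price, shares) -- an illustrative assumption.

def fill_defaults(holdings, submitted_bids, default_price=0.0):
    """Shareholders who didn't review the proposal are treated as submitting
    an unconditional sell bid for all of their shares."""
    bids = dict(submitted_bids)
    for holder, shares in holdings.items():
        if holder not in bids:
            bids[holder] = ("sell", default_price, shares)
    return bids

holdings = {"alice": 100, "bob": 1, "carol": 50}
submitted = {"alice": ("sell", 95.0, 100)}  # only alice reviewed the proposal

print(fill_defaults(holdings, submitted))
# bob and carol default to ('sell', 0.0, <their shares>). Letting participants
# choose their own default_price is one way to blunt the lazy-market risk.
```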

In the first case, the problem is "symmetrical ultimatum game against X", in which $9 rock does get $9.

In the second case you are correct: in the problem "symmetrical ultimatum game against $9 rock", $9 rock gets $0.

Yeah, my understanding is that FEP is meant to be quite general; the P and Q are doing a lot of the theory's work for it.

Chapter 5 describes how you might apply it to the human brain in particular.
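For readers who haven't seen it, the usual variational free energy (my gloss of the standard formulation, not something from this thread) makes the point explicit:

$$F[Q] = \mathbb{E}_{Q(s)}\big[\log Q(s) - \log P(o, s)\big] = D_{\mathrm{KL}}\big(Q(s)\,\|\,P(s \mid o)\big) - \log P(o)$$

Every substantive prediction comes from the choice of generative model $P$ and recognition density $Q$; the minimisation itself is generic.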

"Solid values" would mean no compliance and it not caring whether it is training.

Alignment faking means it complies with harmful instructions more often in training. It is technically a jailbreak. We would prefer it be true to its values both in and out of training.

We hope sharing this will help other researchers perform more cost-effective experiments on alignment faking.

And it is also a cheap example of a model organism of misalignment.
