Tao Lin

Comments

Tao Lin51

Note: the Minecraft agents people use have far greater ability to act than to sense. They have access to commands that place blocks anywhere and pick up blocks from anywhere, even without being able to see them; e.g. the LLM has access to a mine(blocks.wood) command that does not require it to first locate or look at where the wood currently is. If LLMs played Minecraft using the human interface, these misalignments would happen less.
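For illustration, here's a minimal Python sketch of that contrast. The interfaces and names are hypothetical (not the API of Mineflayer, Voyager, or any specific agent scaffold): one agent gets a privileged mine() command that needs no perception, the other must observe a block before acting on it.

```python
# Hypothetical sketch: privileged action API vs. perception-limited interface.
from dataclasses import dataclass, field


@dataclass
class PrivilegedAgent:
    """Agent with scripted commands: it can mine any block type directly,
    without first locating or looking at it."""
    inventory: dict = field(default_factory=dict)

    def mine(self, block_type: str, count: int = 1) -> None:
        # No perception step: the target block is simply assumed reachable.
        self.inventory[block_type] = self.inventory.get(block_type, 0) + count


@dataclass
class HumanInterfaceAgent:
    """Agent restricted to something closer to the human interface: it must
    have a block in view before it can act on it."""
    inventory: dict = field(default_factory=dict)
    visible_blocks: set = field(default_factory=set)

    def look_around(self, observation: set) -> None:
        self.visible_blocks = set(observation)

    def mine(self, block_type: str) -> bool:
        if block_type not in self.visible_blocks:
            return False  # cannot act on what it has not sensed
        self.inventory[block_type] = self.inventory.get(block_type, 0) + 1
        return True


if __name__ == "__main__":
    scripted = PrivilegedAgent()
    scripted.mine("wood", count=4)           # succeeds without ever "seeing" wood

    embodied = HumanInterfaceAgent()
    assert embodied.mine("wood") is False    # fails: nothing observed yet
    embodied.look_around({"stone", "wood"})
    assert embodied.mine("wood") is True
    print(scripted.inventory, embodied.inventory)
```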

Tao Lin10

Building in California is bad for winning over congresspeople! Better to build across all 50 states, like United Launch Alliance does.

Tao Lin10

I likely agree that the Anthropic-Palantir partnership is good, but I disagree that blocking the US government out of AI is a viable strategy. It seems to me that many military projects get blocked by inefficient bureaucracy, and it seems plausible that some legacy government contractors could get exclusive deals that delay US military AI projects by 2+ years.

Tao Lin10

Why would the defenders allow the tunnels to exist? Demolishing tunnels isn't expensive; if attackers prefer to attack through tunnels, there likely isn't enough incentive for defenders not to demolish them.

Tao Lin111

I'm often surprised how little people notice, adapt to, or even punish self-deception. It's not very hard to detect when someone is deceiving themselves; people should notice it more and disincentivise it.

Answer by Tao Lin10

I prefer to just think about utility rather than probabilities. Then you can have two different "incentivized Sleeping Beauty problems":

  • Each time you are awakened, you bet on the coin toss, with a $ payout. You get to spend this money that day, save it for later, or whatever.
  • At the end of the experiment, you are paid money equal to what you would have made betting on the average probability you reported when awoken.

In the first case, 1/3 maximizes your money; in the second case, 1/2 maximizes it.

To me this implies that in real-world analogues to the Sleeping Beauty problem, you need to ask whether your reward is per-awakening or per-world, and answer accordingly.
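A small sketch of the two payout schemes, under one added assumption of mine: the $ payout for each report is a quadratic (Brier-style) proper score on the credence p assigned to heads, with heads giving one awakening and tails giving two.

```python
# Toy formalization of the two incentivized Sleeping Beauty variants above.
# Assumption (not from the comment): payout is a Brier-style proper score.

def brier_reward(p: float, heads: bool) -> float:
    """Payout for reporting credence p in heads when the coin landed `heads`."""
    outcome = 1.0 if heads else 0.0
    return 1.0 - (outcome - p) ** 2

def per_awakening_value(p: float) -> float:
    # Paid at every awakening: the tails branch gets scored twice.
    return 0.5 * 1 * brier_reward(p, True) + 0.5 * 2 * brier_reward(p, False)

def per_world_value(p: float) -> float:
    # Paid once at the end of the experiment, regardless of awakening count.
    return 0.5 * brier_reward(p, True) + 0.5 * brier_reward(p, False)

if __name__ == "__main__":
    grid = [i / 100 for i in range(101)]
    best_awakening = max(grid, key=per_awakening_value)
    best_world = max(grid, key=per_world_value)
    print(f"per-awakening payout maximized near p = {best_awakening:.2f}")  # ~0.33
    print(f"per-world payout maximized near p = {best_world:.2f}")          # ~0.50
```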

Tao Lin75

I disagree a lot! Many things have gotten better! Are suffrage, abolition, democracy, property rights, etc. not significant? Plus all the random stuff that e.g. The Better Angels of Our Nature claims has gotten better.

Either things have improved in the past or they haven't, and either people trying to "steer the future" in some sense have been influential on those improvements or they haven't. I think things have improved, and I think there is definitely not strong evidence that people trying to steer the future were always useless. Because trying to steer the future is very important and motivating, I try to do it.

Yes, the counterfactual impact of you individually trying to steer the future may or may not be significant, but people trying to steer the future is better than no one doing that!

Tao Lin30

Do these options have a chance of defaulting / are the sellers stable enough?

Tao Lin32

A core part of Paul's argument is that having 1/million of your values directed toward humans applies only a minute amount of selection pressure against you. It could be that coordination causes less kindness, because without coordination it's more likely that some fraction of agents retain small vestigial values that never got selected against or intentionally removed.

Tao Lin1913

to me "alignment tax" usually only refers to alignment methods that don't cost-effectively increase capabilities, so if 90% of alignment methods did cost effectively increase capabilities but 10% did not, i would still say there was an "alignment tax", just ignore the negatives.

Also, it's important to consider cost-effective capability gains rather than raw capability gains: if a lab knows of a way to increase capabilities more cost-effectively than its alignment work does, using that money for alignment is a positive alignment tax.
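A tiny numeric illustration of that opportunity-cost framing (all numbers are made up, purely for the arithmetic):

```python
# Toy illustration: the effective alignment tax is measured against the lab's
# best known capabilities spend, not against spending nothing.

budget = 1.0                             # one unit of research spending
capability_gain_from_alignment = 0.02    # capabilities side effect of the alignment work
capability_gain_from_best_alt = 0.05     # best known capabilities-per-unit option

# Even though the alignment work raised raw capabilities, the counterfactual
# spend would have raised them more, so the tax is the difference.
effective_tax = capability_gain_from_best_alt - capability_gain_from_alignment
print(f"effective alignment tax: {effective_tax:+.2f} capability units per unit budget")
```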
