This is a special post for quick takes by DanielFilan. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

Rationality-related writings that are more comment-shaped than post-shaped. Please don't leave top-level comments here unless they're indistinguishable to me from something I would say here.

DanielFilan's Shortform Feed

The below is the draft of a blog post I have about why I like AI doom liability. My dream is that people read it and decide "ah yes this is the main policy we will support" or "oh this is bad for a reason Daniel hasn't noticed and I'll tell him why". I think usually you're supposed to flesh out posts, but I'm not sure that adds a ton of information in this case.

Why I like AI doom liability

  • AI doom liability is my favourite approach to AI regulation. I want to sell you all on it.

  • the basic idea

    • general approach to problems: sue people for the negative impacts
      • internalizes externalities
      • means that the people figuring out how to avoid harms are informed and aligned (rather than bureaucrats less aware of on-the-ground conditions / trying to look good / seeking power)
      • less fucked than criminal law, regulatory law
        • look at what hits the supreme court, which stuff ends up violating people's rights the worst, what's been more persistent over human history, what causes massive protests, etc.
    • first-pass approach to AI: sue for liabilities after AI takes over
      • can't do that
    • so sue for intermediate disasters, get punitive damages for how close you were to AI takeover
      • intuition: pulling
... (read more)
3DanielFilan
Further note: this policy doesn't work to regulate government-developed AGI, which is a major drawback if you expect the government to develop AGI. It also probably lowers the relative cost for the government to develop AGI, which is a major drawback if you think the private sector would do a better job of responsible AGI development than the government.
1utilistrutil
I think you could also push to make government liable as part of this proposal
6DanielFilan
You could but (a) it's much harder constitutionally in the US (governments can only be sued if they consent to being sued, maybe unless other governments are suing them) and (b) the reason for thinking this proposal works is modelling affected actors as profit-maximizing, which the government probably isn't.
3DanielFilan
Oh: it would be sad if there were a bunch of frivolous suits for this. One way to curb that without messing up optionality would be to limit such suits to large enough intermediate disasters.
1Ben Millwood
You can't always use liability to internalise all the externality because e.g. you can't effectively sue companies for more than they have, and for some companies that stay afloat by regular fundraising rounds, that may not even be that much? like, if they're considering an action that is a coinflip between "we cause some huge liability" and "we triple the value of our company" then it's usually going to be correct from a shareholder perspective to take it, no matter the size of the liability, right? Criminal law has the ability to increase the deterrent somewhat – probably many people will not accept any amount of money for a significant enough chance of prison – though obviously it's not perfect either
5DanielFilan
OP doesn't emphasize liability insurance enough but part of the hope is that you can mandate that companies be insured up to $X00 billion, which costs them less than $X00 billion assuming that they're not likely to be held liable for that much. Then the hope is the insurance company can say "please don't do extremely risky stuff or your premium goes up".
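To make the last two comments concrete, here is a minimal sketch of the arithmetic, with made-up numbers (in billions of dollars). The function names `shareholder_ev` and `fair_premium` are hypothetical, and the actuarially fair premium is an assumption for illustration, not a claim about how real insurers would price this.

```python
def shareholder_ev(value, p_disaster, liability, upside_multiple, limited_liability=True):
    """Expected shareholder payoff from taking a risky action."""
    # With limited liability, shareholders cannot lose more than the company is worth.
    loss = min(liability, value) if limited_liability else liability
    disaster_payoff = value - loss
    success_payoff = value * upside_multiple
    return p_disaster * disaster_payoff + (1 - p_disaster) * success_payoff


def fair_premium(p_disaster, liability, coverage):
    """Actuarially fair premium: expected payout, capped at the mandated coverage."""
    return p_disaster * min(liability, coverage)


# Hypothetical numbers: a company worth 1 (billion) facing a coinflip.
value = 1.0
for liability in (10.0, 100.0):
    ev = shareholder_ev(value, 0.5, liability, 3.0)
    print(f"liability={liability}: EV of risky action = {ev}, vs {value} for not acting")

# If the full liability could actually be collected, the same bet looks terrible.
print(shareholder_ev(value, 0.5, 100.0, 3.0, limited_liability=False))

# A 1% chance of a 100 (billion) payout costs far less than 100 to insure.
print(fair_premium(0.01, 100.0, 100.0))
```

Under these assumptions the size of the liability drops out of the shareholders' calculation once it exceeds the company's value, which is the coinflip point above. Mandated insurance is meant to put that tail back into someone's pricing, so that riskier behaviour shows up as a higher premium.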

Hot take: if you think that we'll have at least 30 more years of future where geopolitics and nations are relevant, I think you should pay at least 50% as much attention to India as to China. Similarly large population, similarly large number of great thinkers and researchers. Currently seems less 'interesting', but that sort of thing changes over 30-year timescales. As such, I think there should probably be some number of 'India specialists' in EA policy positions that isn't dwarfed by the number of 'China specialists'.

For comparison, in a universe where EA existed 30 years ago we would have thought it very important to have many Russia specialists.

6DanielFilan
Just learned that 80,000 Hours' career guide includes the claim that becoming a Russia or India specialist might turn out to be a very promising career path.
5Adam Scholl
I've been wondering recently whether CFAR should try having some workshops in India for this reason. Far more people speak English there than in China, and I expect we'd encounter fewer political impediments.
5[anonymous]
Also, anecdotally, there have been lots of Indian applicants (and attendees) at ESPR throughout the years. Seems like people there also think rationality is cool (lots of the people I interviewed had read HPMOR, there are LW meetups there, etc. etc.)
2Raemon
Also fyi, a nontrivial fraction of new users on LessWrong have Indian sounding usernames.
3Linch
Brazil is another interesting place. In addition to the large populations and GDP, anecdotally based on online courses I've taken, philosophy meme groups etc, Brazilians seem more interested in Anglo-American academic ethics than people from China or India, despite the presumably large language barrier.
3mingyuan
fwiw the global poverty part of EA already does a fair amount of work in India. I know EA is a bit (and increasingly) fragmented between different cause areas, but that still might be a useful entry point?

The Indian grammarian Pāṇini wanted to exactly specify what Sanskrit grammar was in the shortest possible length. As a result, he did some crazy stuff:

Pāṇini's theory of morphological analysis was more advanced than any equivalent Western theory before the 20th century. His treatise is generative and descriptive, uses metalanguage and meta-rules, and has been compared to the Turing machine wherein the logical structure of any computing device has been reduced to its essentials using an idealized mathematical model.

There are two surprising facts about this:

  1. His grammar was written in the 4th century BC.
  2. People then failed to build on this machinery to do things like formalise the foundations of mathematics, formalise a bunch of linguistics, or even do the same thing for languages other than Sanskrit, in a way that is preserved in the historical record.

I've been obsessing about this for the last few days.

A complaint about AI pause: if we pause AI and then unpause, progress will then be really quick, because there's a backlog of improvements in compute and algorithmic efficiency that can be immediately applied.

One definition of what an RSP is: if a lab makes observation O, then they pause scaling until they implement protection P.

Doesn't this sort of RSP have the same problem with fast progress after pausing? Why have I never heard anyone make this complaint about RSPs? Possibilities:

  • They do and I just haven't seen it
  • People expect "AI pause" to produce longer / more serious pauses than RSPs (but this seems incidental to the core structure of RSPs)
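A toy way to see why the overhang argument shouldn't care what kind of pause it is: the sketch below assumes (my assumption, not something established above) that the compute/algorithms "frontier" keeps growing at a constant rate while scaling is paused, and that deployed capability snaps to the frontier once the pause lifts. The function names and numbers are hypothetical.

```python
def deployed_capability(months, pause_start, pause_end, growth_per_month=1.0):
    """Toy model: the frontier grows every month, but deployed capability is
    frozen while scaling is paused and catches up as soon as the pause lifts."""
    deployed = []
    current = 0.0
    for t in range(months):
        frontier = growth_per_month * t
        paused = pause_start <= t < pause_end
        if not paused:
            current = frontier  # not paused: deploy at the frontier
        deployed.append(current)
    return deployed


def largest_monthly_jump(trajectory):
    """Biggest month-over-month increase in deployed capability."""
    return max(b - a for a, b in zip(trajectory, trajectory[1:]))


no_pause = deployed_capability(24, pause_start=0, pause_end=0)
short_pause = deployed_capability(24, pause_start=6, pause_end=9)
long_pause = deployed_capability(24, pause_start=6, pause_end=12)

print(largest_monthly_jump(no_pause))
print(largest_monthly_jump(short_pause))
print(largest_monthly_jump(long_pause))
```

In this model the size of the post-pause jump depends only on how long the pause lasted. Nothing in it distinguishes a pause that was declared unconditionally from one that started on observation O and ended when protection P was in place.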
[-]aysja228

Basically I just agree with what James said. But I think the steelman is something like: you should expect shorter (or no) pauses with an RSP if all goes well, because the precautions are matched to the risks. Like, the labs aim to develop safety measures which keep pace with the dangers introduced by scaling, and if they succeed at that, then they never have to pause. But even if they fail, they're also expecting that building frontier models will help them solve alignment faster. I.e., either way the overall pause time would probably be shorter?

It does seem like in order to not have this complaint about the RSP, though, you need to expect that it's shorter by a lot (like by many months or years). My guess is that the labs do believe this, although not for amazing reasons. Like, the answer which feels most "real" to me is that this complaint doesn't apply to RSPs because the labs aren't actually planning to do a meaningful pause. 

Good point!

Man, my model of what's going on is:

  • The AI pause complaint is, basically, total self-serving BS that has not been called out enough
  • The implicit plan for RSPs is for them to never trigger in a business-relevant way
  • It is seen as a good thing (from the perspective of the labs) if they can lose less time to an RSP-triggered pause

...and these, taken together, should explain it.

9Matthew Barnett
The point that a capabilities overhang might cause rapid progress in a short period of time has been made by a number of people without any connections to AI labs, including me, which should reduce your credence that it's "basically, total self-serving BS". More to the point of Daniel Filan's original comment, I have criticized the Responsible Scaling Policy document in the past for failing to distinguish itself clearly from AI pause proposals. My guess is that your second and third points are likely mostly correct: AI labs think of an RSP as different from AI pause because it's lighter-touch, more narrowly targeted, and the RSP-triggered pause could be lifted more quickly, potentially minimally disrupting business operations.
3Amalthea
I think it's not an unreasonable point to take into account when talking price, but also a lot of the time it serves as a BS talking point for people who don't really care about the subtleties.
4ZY
My guess is:
  • AI pause: no observation on what safety issue to address, so labs work on capabilities anyway, which may lead to only capability improvements. (The assumption is that pausing AI means not releasing models.)
  • RSP: having observed O, shift more resources to mitigating O and fewer to capabilities; when protection P is done, publish the model, then shift back to capabilities. (Ideally.)
4DanielFilan
I'm not saying there's no reason to think that RSPs are better or worse than pause, just that if overhang is a relevant consideration for pause, it's also a relevant consideration for RSPs.
3Sodium
I'd imagine that RSP proponents think that if we execute them properly, we will simply not build dangerous models beyond our control, period. If progress were faster than what labs can handle after pausing, RSPs should imply that you'd just pause again. On the other hand, there's no clear criterion for when we would pause again after, say, a six-month pause in scaling. Now whether this would happen in practice is perhaps a different question.
4DanielFilan
I think pause proponents think similarly!
2DanielFilan
Realized that I didn't respond to this - PauseAI's proposal is for a pause until safety can be guaranteed, rather than just for 6 months.
3alexbleakley
Are they the same people advocating for RSPs and also using compute/algorithm overhang as a primary argument against a pause? My understanding of the main argument in favor of RSPs over an immediate pause is:
  1. Sure, we could continue to make some progress on safety if we paused other AI progress.
  2. But:
    a. we could make even more progress on safety if we could work with more advanced models; and
    b. right now we have the necessary safety measures to create the next generation of models with low risk.
  3. If AI progress continues without corresponding progress on safety, then (2.b) will no longer hold, so we should indeed pause at that time, hence the RSP.
If you believe that (2.a) and (2.b) are both true, then you can argue that RSPs are better than an immediate pause without referring to compute/algorithm overhang. If you believe that one of (2.a) and (2.b) is false, but are skeptical of a pause because you believe compute/algorithm overhang would increase risk (or at least negate the benefit), then it seems you should also be skeptical of RSPs.
3DanielFilan
I'm not saying that RSPs are or aren't better than a pause. But I would think that if overhang is a relevant consideration for pauses, it's also a relevant consideration for RSPs.
1alexbleakley
I agree that if overhang is a relevant consideration for pauses, then it's also a relevant consideration for RSPs. My previous question was: Do you see the same people invoking overhang as an argument against pauses and also talking about RSPs as though they are not also impacted? Maybe you're not saying that there are people taking that position, but rather that those who invoke overhang as an argument against pauses don't seem to be equally vocal against RSPs (if not necessarily in favor of them either). I can think of a couple of separate reasons this could be the case:
  1. To the extent I think a pause is bad (for example, because of overhang), I might still be more motivated to prioritize arguing against "unconditional pause" than "maybe pause in the future", even if the argument could apply to both. This is especially true if I consider the prospect of an unconditional pause a legitimate, near-term threat.
  2. If I think a pause introduces a high, additional risk, and I think the base level of risk is low, it seems clear that I should not introduce that high risk. But if I get new evidence that there is an immediate, even-higher risk, which a pause could help mitigate, I should be willing to roll the dice on the pause, which now comes with a net reduction in risk.
(2) isn't a very reassuring position, but it does suggest that "immediate pause bad because overhang" and "RSPs good [in spite of overhang]" are logically compatible.
2DanielFilan
I guess I'm not tracking this closely enough. I'm not really that focussed on any one arguer's individual priorities, but more about the discourse in general. Basically, I think that overhang is a consideration for unconditional pauses if and only if it's a consideration for RSPs, so it's a bad thing if overhang is brought up as an argument against unconditional pauses and not against RSPs, because this will distort the world's ability to figure out the costs and benefits of each kind of policy. Also, to be clear, it's not impossible that RSPs are all things considered better than unconditional pauses, and better than nothing, despite overhang. But if so, I'd hope someone somewhere would have written a piece saying "RSPs have the cost of causing overhang, but on net are worth it".
2Noosphere89
As others have said, I believe AI pauses by governments would absolutely be more serious and longer, preventing overhangs from building up too much. The big worry I do have with pause proposals in practice is that I expect most realistic pauses to buy us several years at most, not decades, because people's incentives will shift towards algorithmic progress, which isn't very controllable by default. I also expect there to be at most 1 OOM of compute left to build AGI which scales to superintelligence by the time we pause. That makes it a very unstable policy: any algorithmic advance, like AI search actually working in complicated domains, would immediately blow up the pause, and there are likely strong incentives to break the pause once people realize what superintelligence means. See here for one example: https://yellow-apartment-148.notion.site/AI-Search-The-Bitter-er-Lesson-44c11acd27294f4495c3de778cd09c8d
2DanielFilan
Are you saying that overhangs wouldn't build up too much under pauses because the government wouldn't let it happen, or that RSPs would have less overhang because they'd pause for less long so less overhang would build up? I can't quite tell.
2Noosphere89
That RSPs would have less overhang because they'd pause for less long so less overhang would build up.
[-]DanielFilanΩ17322

A theory of how alignment research should work

(cross-posted from danielfilan.com)

Epistemic status:

  • I listened to the Dwarkesh episode with Gwern and started attempting to think about life, the universe, and everything
  • less than an hour of thought has gone into this post
  • that said, it comes from a background of me thinking for a while about how the field of AI alignment should relate to agent foundations research

Maybe obvious to everyone but me, or totally wrong (this doesn't really grapple with the challenges of working in a domain where an intelligent being might be working against you), but:

  • we currently don't know how to make super-smart computers that do our will
    • this is not just a problem of having a design that is not feasible to implement: we do not even have a sense of what the design would be
    • I'm trying to somewhat abstract over intent alignment vs control approaches, but am mostly thinking about intent alignment
    • I have not thought very much about societal/systemic risks, and this post doesn't really address them.
  • ideally we would figure out how to do this
  • the closest traction that we have: deep learning seems to work well in practice, altho our theoretica
... (read more)
[-]Chris_LeongΩ410-2

I agree that we probably want most theory to be towards the applied end these days due to short timelines. Empirical work needs theory in order to direct it; theory needs empirics in order to remain grounded.

6Chris_Leong
Thanks for writing this. I think it is a useful model. However, there is one thing I want to push back against: I agree with Apollo Research that evals isn't really a science yet. It mostly seems to be conducted according to vibes. Model internals could help with this, but things like building experience or auditing models using different schemes and comparing them could help make this more scientific. Similarly, a lot of work with Model Organisms of Alignment requires a lot of careful thought to get right.
4DanielFilan
When I wrote that, I wasn't thinking so much about evals / model organisms as stuff like:
  • putting a bunch of agents in a simulated world and seeing how they interact
  • weak-to-strong / easy-to-hard generalization
basically stuff along the lines of "when you put agents in X situation, they tend to do Y thing", rather than trying to understand latent causes / capabilities
5yams
I think the key missing piece you’re pointing at (making sure that our interpretability tools etc actually tell us something alignment-relevant) is one of the big things going on in model organisms of misalignment (iirc there’s a step that’s like ‘ok, but if we do interpretability/control/etc at the model organism does that help?’). Ideally this type of work, or something close to it, could become more common // provide ‘evals for our evals’ // expand in scope and application beyond deep deception. If that happened, it seems like it would fit the bill here. Does that seem true to you?
6DanielFilan
Yeah, that seems right to me.
4DanielFilan
Oh except: I did not necessarily mean to claim that any of the things I mentioned were missing from the alignment research scene, or that they were present.

I continue to think that agent foundations research is kind of underrated. Like, we're supposed to do mechinterp to understand the algorithm models implement - but how do we know what algorithms are good?

[-]Kaarel101

It additionally seems likely to me that we are presently missing major parts of a decent language for talking about minds/models, and developing such a language requires (and would constitute) significant philosophical progress. There are ways to 'understand the algorithm a model is' that are highly insufficient/inadequate for doing what we want to do in alignment — for instance, even if one gets from where interpretability is currently to being able to replace a neural net by a somewhat smaller boolean (or whatever) circuit and is thus able to translate various NNs to such circuits and proceed to stare at them, one probably won't thereby be more than of the way to the kind of strong understanding that would let one modify a NN-based AGI to be aligned or build another aligned AI (in case alignment doesn't happen by default) (much like how knowing the weights doesn't deliver that kind of understanding). To even get to the point where we can usefully understand the 'algorithms' models implement, I feel like we might need to have answered sth like (1) what kind of syntax should we see thinking as having — for example, should we think of a model/mind as a library of small prog... (read more)

2quetzal_rainbow
Not only "good", but "obedient", "non-deceptive", "minimal impact", "behaviorist" and don't even talk about "mindcrime".
2cubefox
In this sense agent foundations research seems similar to research on normative ethics.

Shower thought[*]: the notion of a task being bounded doesn't survive composition. Specifically, say a task is bounded if the agent doing it is only using bounded resources and only optimising a small bit of the world to a limited extent. The task of 'be a human in the enterprise of doing research' is bounded, but the enterprise of research in general is not bounded. Similarly, being a human with a job vs the entire human economy. I imagine keeping this in mind would be useful when thinking about CAIS.

Similarly, the notion of a function being interpretable doesn't survive composition. Linear functions are interpretable (citation: the field of linear algebra), as is the ReLU function, but the consensus is that neural networks are not, or at least not in the same way.

I basically wish that the concepts that I used survived composition.

[*] Actually I had this on a stroll.

4Raemon
Fwiw, this seems like an interesting thought but I'm not sure I understand it, and curious if you could say it in different words. (but, also, if the prospect of being asked to do that for your shortform comments feels ughy, no worries)
6DanielFilan
Often big things are made of smaller things: e.g., the economy is made of humans and machines interacting, and neural networks are made of linear functions and ReLUs composed together. Say that a property P survives composition if knowing that P holds for all the smaller things tells you that P holds for the bigger thing. It's nice if properties survive composition, because it's easier to figure out if they hold for small things than to directly tackle the problem of whether they hold for a big thing. Boundedness doesn't survive composition: people and machines are bounded, but the economy isn't. Interpretability doesn't survive composition: linear functions and ReLUs are interpretable, but neural networks aren't.
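Interpretability isn't something you can check in code, so as a stand-in, here is a small sketch of the same pattern using a property that can be checked numerically: convexity. This is my toy analogy rather than an example from the comment above, and the helper names are hypothetical. A linear function is convex and ReLU is convex, but composing them can give a function that isn't.

```python
def relu(x):
    return max(0.0, x)


def negate(x):
    return -x  # a linear function


def compose(*fs):
    """compose(f, g)(x) == f(g(x))"""
    def composed(x):
        for f in reversed(fs):
            x = f(x)
        return x
    return composed


def looks_convex(f, points):
    """Sampled midpoint check: f((a+b)/2) <= (f(a)+f(b))/2 for all sample pairs.
    Passing is only evidence of convexity; failing is a genuine counterexample."""
    return all(
        f((a + b) / 2) <= (f(a) + f(b)) / 2 + 1e-12
        for a in points
        for b in points
    )


points = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(looks_convex(relu, points))                   # small thing 1
print(looks_convex(negate, points))                 # small thing 2
print(looks_convex(compose(negate, relu), points))  # the bigger thing
```

Each part passes the check, but the composite x ↦ -relu(x) fails it (take a = -1 and b = 1), so here knowing that P holds for the parts tells you nothing about the whole.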
[-]DanielFilanΩ11202

Frankfurt-style counterexamples for definitions of optimization

In "Bottle Caps Aren't Optimizers", I wrote about a type of definition of optimization that says system S is optimizing for goal G iff G has a higher value than it would if S didn't exist or were randomly scrambled. I argued against these definitions by providing examples of systems that satisfy the criterion but are not optimizers. But today, I realized that I could repurpose Frankfurt cases to get examples of optimizers that don't satisfy this criterion.

A Frankfurt case is a thought experiment designed to disprove the following intuitive principle: "a person is morally responsible for what she has done only if she could have done otherwise." Here's the basic idea: suppose Alice is considering whether or not to kill Bob. Upon consideration, she decides to do so, takes out her gun, and shoots Bob. But little-known to her, a neuroscientist had implanted a chip in her brain that would have forced her to shoot Bob if she had decided not to. That said, the chip didn't activate, because she did decide to shoot Bob. The idea is that she's morally responsible, even tho she couldn't have done otherwise.
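Here is a minimal sketch of how the Frankfurt structure interacts with the counterfactual criterion, in an abstract toy world of my own rather than any specific example from this post. The names `outcome` and `passes_counterfactual_criterion` are hypothetical, and the sketch only models the "S didn't exist" branch of the criterion, not the "randomly scrambled" one.

```python
def outcome(agent_present, backup_present):
    """Goal value G: 1.0 if the target outcome happens, else 0.0.

    The agent, when present, always brings the outcome about itself.
    The backup (the chip, in the Frankfurt case) only steps in if the
    agent has not already done so.
    """
    agent_acts = agent_present
    backup_acts = backup_present and not agent_acts
    return 1.0 if (agent_acts or backup_acts) else 0.0


def passes_counterfactual_criterion(backup_present):
    """Is G higher with the agent in the world than without it?"""
    with_agent = outcome(agent_present=True, backup_present=backup_present)
    without_agent = outcome(agent_present=False, backup_present=backup_present)
    return with_agent > without_agent


print(passes_counterfactual_criterion(backup_present=False))
print(passes_counterfactual_criterion(backup_present=True))
```

The agent behaves identically in both worlds, and the backup never actually fires when the agent is present. Yet adding the backup flips the criterion's verdict, because G is now just as high without the agent.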

Anyway, let's do this w... (read more)

3Martín Soto
Interesting, but I'm not sure how successful the counterexample is. After all, if your terminal goal in the whole environment was truly for your side to win, then it makes sense to understand anything short of letting Shin play as a shortcoming of your optimization (with respect to that goal). Of course, even in the case where that's your true goal and you're committing a mistake (which is not common), we might want to say that you are deploying a lot of optimization, with respect to the different goal of "winning by yourself", or "having fun", which is compatible with failing at another goal. This could be taken to absurd extremes (whatever you're doing, I can understand you as optimizing really hard for doing exactly what you're doing), but the natural way around that is for your imputed goals to be required simple (in some background language or ontology, like that of humans). This is exactly the approach mathematically taken by Vanessa in the past (the equation at 3:50 here). I think this "goal relativism" is fundamentally correct. The only problem with Vanessa's approach is that it's hard to account for the agent being mistaken (for example, you not knowing Shin is behind you).[1] I think the only natural way to account for this is to see things from the agent's native ontology (or compute probabilities according to their prior), however we might extract those from them. So we're unavoidably back at the problem of ontology identification (which I do think is the core problem). 1. ^ Say Alice has lived her whole life in a room with a single button. People from the outside told her pressing the button would create nice paintings. Throughout her life, they provided an exhaustive array of proofs and confirmations of this fact. Unbeknownst to her, this was all an elaborate scheme, and in reality pressing the button destroys nice paintings. Alice, liking paintings, regularly presses the button. A naive application of Vanessa's criterion would impute Al

Live in Berkeley? I think you should consider running for the city council. Why?

  • 4 seats are going to be open with no incumbents:
    • District 4: the area between Sacramento, Blake, Fulton, and University, plus the area between University, Cedar, MLK, and Fulton. Lots of rationalists live in this area. This will be a special election that's yet to be scheduled, but I imagine it will be held in April or May, with a filing deadline in late Feb / early Mar. (Or maybe it will be held at the same time as District 7, on April 16, filing deadline on EOD Feb 16)
    • District 5: north of Cedar, between Spruce and Sacramento/Tulare/Nelson. Election in November.
    • District 6: north of Hearst, between Oxford/Spruce and Wildcat Canyon Road. Election in November.
    • District 7: campus and the couple blocks immediately south of it. Borders are hard to describe, check here. Special election: filing deadline is EOD Feb 16, election is April 16.
  • Nobody is running in those races yet.
  • You probably have gripes with how the city is running: maybe you wish policing were different, or there were more permissive zoning, or better education.
  • You probably have a bunch of friends who feel similarly who maybe would wan
... (read more)

On the most recent episode of the podcast Rationally Speaking, David Shor discusses how members of the USA's Democratic Party could perform better electorally by not talking about their unpopular extreme views. He notes, however, that many individual Democrats have better lives by talking about views that are unpopular with voters but popular with left-wing activists (e.g. because they become more prominent and get to feel good about themselves), which causes some voters to associate those views with the Democratic Party and not vote for it.

This is discussed as a sad irrationality that constitutes a coordination failure among Democrats, but I found that an odd tone. Part of the model in the episode is that Democratic politicians in fact have these unpopular extreme views, but it would hurt their electoral chances if that became known. From a non-partisan perspective, you'd expect it to be a good thing to know what elected officials actually think. Now, you might think that elected officials shouldn't enact the unpopular policies that they in fact believe in, but it's odd to me that they apparently can't credibly communicate that they won't enact those policies. At any rate, I'm a bit bothered by the idea of coordinated silence to ensure that people don't know what powerful people actually think being portrayed as good.

5Viliam
The episode is quite interesting! The system is set up in the way that before a Democratic candidate can enter their final battle against the Republican candidate, first they have to defeat their fellow Democrats. And things that help them in the previous rounds (talking like a SJW, to put it bluntly) seem to hurt them in the final round, and vice versa. The underlying reason is that within Democratic Party, the opinions of the vanguard got recently so far from the opinions of hoi polloi, that it became almost impossible for any candidate to make both happy. With the vanguard, you score by being extreme, by "pushing the Overton window". With hoi polloi, you score by being a relatable person, by (illusion of) caring about their boring everyday problems. (Followed by an interesting explanation why Republican Party doesn't have the symmetric problem. Within both parties, the more educated and more politically active people are more left-wing than their average voter. In Republican Party, this pushes the candidates towards center, making them more attractive for voters in general; in Democratic Party, this pushes the candidates away from center, making them less attractive for voters.) The part where I disagree with Julia's summary is that to Julia, if I understand her correctly, the vanguard is a more extreme version of hoi polloi. To me it seems like they often care about different things. Consider the television ads that were popular among elite Democrats, but actually made people more likely to vote for Republicans. I don't read this as "you represent my opinion too strongly", but rather as "you don't represent my opinion". David suggests an interesting solution: Democrats should have more non-white candidates with less woke opinions, because (these are my words) the vanguard will hesitate to attack them because of their color, and hoi polloi will find them more acceptable because of their opinions. (Kinda like Obama.) Cool trick, but I suspect it will stop wo
7DanielFilan
I think this is false. Shor, from the transcript: I don't have much to say about your take, but it was interesting!
3Viliam
You're right, the main difference is not between the primaries and the final round, but rather somewhere between Twitter/journalists and primaries.
4Unnamed
It seems clear that we want politicians to honestly talk about what they're intending to do with the policies that they're actively trying to change (especially if they have a reasonable chance of enacting new policies before the next election). That's how voters can know what they're getting. It's less obvious how this should apply to their views on things which aren't going to be enacted into policy. Three lines of thinking that point in the direction of maybe it's good for politicians to keep quiet about (many of) their unpopular views: It can be hard for listeners to tell how likely the policy is to be enacted, or how actively the politician will try to make it happen. I guess it's hard to fit into 5 words? e.g. I saw a list of politicians' "broken promises" on one of the fact checking sites, which was full of examples where the politician said they were in favor of something and then it didn't get enacted, and the fact checkers deemed that sufficient to count it as a broken promise. This can lead to voters putting too little weight on the things that they're actually electing the politician to do, e.g. local politics seems less functional if local politicians focus on talking about their views on national issues that they have no control over. Another issue is that it's cheap talk. The incentive structure / feedback loops seem terrible for politicians talking about things unrelated to the policies they're enacting or blocking. Might be more functional to have a political system where politicians mostly talk about things that are more closely related to their actions, so that their words have meaning that voters can see. Also, you can think of politicians' speech as attempted persuasion. You could think of voters as picking a person to go around advocating for the voters' hard-to-enact views (as well as to implement policies for the voters' feasible-to-enact views). So it seems like it could be reasonable for voters to say "I think X is bad, so I'm not going
2DanielFilan
Note that the linked podcast is not merely arguing that politicians should keep quiet about their views; it's also arguing that their fellow partisans in e.g. think-tanks and opinion sections should keep quiet, because people can tell that the politicians secretly believe what the think-tankers and opinionists openly say. I think these arguments don't imply that those think-tankers and opinionists should keep quiet.
1Liam Donovan
Shor is very open about the fact that his views are to the left of 90%+ of the electorate, and that his goal is to maximize the power of people that share his views despite their general unpopularity. 
3DanielFilan
Yeah, I think I'm more surprised by Galef's tone than Shor's.

I get to nuke LW today, AMA.

I think the use of dialogues to illustrate a point of view is overdone on LessWrong. Almost always, the 'Simplicio' character fails to accurately represent the smart version of the viewpoint he stands in for, because the author doesn't try sufficiently hard to pass the ITT of the view they're arguing against. As a result, not only is the dialogue unconvincing, it runs the risk of misleading readers about the actual content of a worldview. I think this is true to a greater extent than posts that just state a point of view and argue against it, because the dialogue format naively appears to actually represent a named representative of a point of view, and structurally discourages disclaimers of the type "as I understand it, defenders of proposition P might state X, but of course I could be wrong".

3Dagon
I've seen such dialogs, and felt exactly the same way. At least twice I've later found out that the dialog actually happened and there was no misrepresentation or simplification, just a HUGE inferential distance about what models of the universe (really, models of groups of people are the main sticking points) should be applied in what circumstances.
2Matt Goldenberg
Possibly this could also be a strength, because by representing the views separately like that it makes it easier to see exactly what assumptions are causing them to fail the ITT. On the other hand if they're sufficiently far off, the dialogue basically goes off in the entirely wrong direction.
1Mark Xu
Do you have examples of dialogues that fail to pass the ITT? I'm curious if you think any of the dialogues I've read might have been misleading.

A bunch of my friends are very skeptical of the schooling system and promote homeschooling or unschooling as an alternative. I see where they're coming from, but I worry about the reproductive consequences of stigmatising schooling in favour of those two alternatives. Based on informal conversations, the main reason why people I know aren't planning on having more children is the time cost. A move towards normative home/unschooling would increase the time cost of children, and as such make them less appealing to prospective parents[*]. This in turn would reduce birth rates, worsening the problem that first-world countries face in the next couple of decades of a low working-age:elderly population ratio [EDIT: also, low population leading to less innovation, also low population leading to fewer people existing who get to enjoy life]. As such, I tentatively wish that home/unschooling advocates would focus on more institutional ways of supervising children, e.g. Sudbury schools, community childcare, child labour [EDIT: or a greater emphasis on not supervising children who don't need supervision, or similar things].

[*] This is the weakest part of my argument - it's possible that more people home/unschooling their kids would result in cooler kids that were more fun to be around, and this effect would offset the extra time cost (or kids who are more willing to support their elderly parents, perhaps). But given how lucrative the first world labour market is, I doubt it.

9Isnasene
While I agree that a world where home/un-schooling is a norm would result in greater time-costs and a lower child-rate, I don't think that promoting home/un-schooling as an alternative will result in a world where home/un-schooling is normative. Because of this, I don't think that promoting home/un-schooling as an alternative to the system carries any particularly broad risks. Here's my reasoning: * I expect the associated stigmas and pressures for having kids to always dwarf the associated stigmas and pressures against having kids if they are not home/un-schooled. Having kids is an extremely strong norm both because of the underpinning evolutionary psychology and because a lot of life-style patterns after thirty are culturally centered around people who have kids. * Despite its faults, public school does the job pretty well for most people. This applies to the extent that the opportunity cost of home/un-schooling instead of building familial wealth probably outweighs the benefits for most people. Thus, I don't believe that the promotion of home/un-schooling is scalable to everyone. * Lots of rich people who have the capacity to home/un-school who dislike the school system decide not to do that. Instead they (roughly speaking) coordinate towards expensive private schools outside the public system. I doubt that this has caused a significant number of people to avoid having children for fear of not sending them to a fancy boarding school. * Even if the school system gets sufficiently stigmatised, I actually expect that the incentives will naturally align around institutional schooling outside the system for most children. Comparative advantages exist and local communities will exploit them. * Home/un-schooling often already involves institutional aspects. Explicitly, home/un-schooled kids would ideally have outlets for peer-to-peer interactions during the school-day and these are often satisfied through community coordination I grant that maybe increased po
6DanielFilan
Developed countries already have below-replacement fertility (according to this NPR article, the CDC claims that the US has been in this state since 1971), so apparently you can have pressures that outweigh pressures to have children. In general I don't understand why you think that a marginal increase in the pressure to invest in each kid won't result in marginally fewer kids. Presumably this is not true in a world where many people believe that schools are basically like prisons for children, which is a sentiment that I do see and seems more memetically fit than "homeschooling works for some families but not others". My impression was that rich people often dislike the public school system, but are basically fine with schools in general? Rich people have fewer kids than poor people and it doesn't seem strange to me to imagine that that's partly due to the fact that each child comes at higher expected cost. This seems right to me barring strong normative home/unschooling, and I wish that this were a more promoted alternative (as my post mentions!). Yep - you'll notice that my post doesn't deny the manifold benefits of the home/unschooling movement, and I think the average unschooling advocate is basically right about how bad typical schools are.
1Isnasene
I think the crux of our perspective difference is that we model the decrease in reproduction differently. I tend to view poor people and developing countries having higher reproduction rates as a consequence of less economic slack. That is to say, people who are poorer have more kids because those kids are decent long-term investments overall (i.e. old-age support, help-around-the-house). In contrast, wealthy people can make way more money by doing things that don't involve kids. This can be interpreted in two ways: * Wealthier people see children as higher cost and elect not to have children because of the costs or * Wealthier people are not under as much economic pressure so have fewer children because they can afford to get away with it At the margin, both of these things are going on at the same time. Still, I attribute falling birthrates as mostly due to the latter rather than the former. So I don't quite buy the claim that falling birth-rates have been dramatically influenced by greater pressures. Of course, Wei Dai indicates that parental investment definitely has an effect so maybe my attribution isn't accurate. I'd be pretty interested in seeing some studies/data trying to connect falling birthrates to the cultural demands around raising children. ... Also, my understanding of the pressures re:homeschooling is something like this: * The social stigma against having kids is satisficing. Having one kid (below replacement level) hurts you dramatically less than having zero kids * The capacity to home-school is roughly all-or-nothing. Home-schooling one kid immediately scales to home-schooling all your kids. * I doubt the stigma for schooling would punish a parent who sends two kids to school more than a parent who sends one kid to school This means that, for a given family, you essentially choose between having kids and home-schooling all of them (the expected cost of home-schooling doesn't scale with number of children) or having no kids (maximum soc
2cousin_it
Kids will grow up and move away no matter if you're rich or poor though, so I'm not sure the investment explanation makes sense. But your last sentence rings true to me. If someone cares more about career than family, they will always have "no time" for a family. I've heard it from well-paid professionals many times: "I'd like to have kids... eventually..."
2Wei Dai
I think you're overstating the stigma against not having kids. I Googled "is there stigma around not having kids" and the top two US-based articles both say something similar: USA Today: Times:
1Isnasene
Agreed. Per my latest reply to DanielFilan: I massively underestimated the rate of childfree-ness and, broadly speaking, I'm in agreement with Daniel now.
2DanielFilan
[next quote is reformatted so that I can make it a quote] Glad to see we agree - and again, the important point for my argument isn't whether most of existing low fertility can be attributed to the existing cost of kids, but whether adding extra cost per kid will reduce the number of kids (as the law of demand predicts). I'm sure this can't be exactly right, but I do think that the low marginal cost of home-schooling was something I was missing. I continue to think that you aren't thinking on the margin, or making some related error (perhaps in understanding what I'm saying). Electing for no kids isn't going to become more costly, so if you make having kids more costly, then you'll get fewer of them than you otherwise would, as the people who were just leaning towards having kids (due to idiosyncratically low desire to have kids/high cost to have kids) start to lean away from the plan. (I assume you meant pressure in favour of home-schooling?) Please note that I never said it had a high effect relative to other things: merely that the effect existed and was large and negative enough to make it worthwhile for homeschooling advocates to change course.
1Isnasene
Yeah, I was thinking in broad strokes there. I agree that there is a margin at which point people switch from choosing to have kids to choosing not to have kids and that moving that margin to a place where having kids is less net-positive will cause some people to choose to have fewer kids. My point was that the people on the margin are not people who will typically say "well we were going to have two kids but now we're only going to have one because home-schooling"; they're people who will typically say "we're on the fence about having kids at all." Whereas most marginal effects relating to having kids (i.e. the cost of college) pertain to the former group, the bulk of marginal effects on reproduction pertaining to schooling stigmas pertain to the latter group. Both the margin and the population density at the margin matter in terms of determining the effect. What I'm saying is that the population density at the margin relevant to schooling-stigmas is notably small. However, I've actually been overstating my case here. The childfree rate in the US is currently around 15% which is much larger than I expected. The childfree rate for women with above a bachelor's degree is 25%. In absolute terms, these are not small numbers and I've gotta admit that this indicates a pretty high population density at the margin. Per the above stats, I've updated to agree with this claim.
4Wei Dai
The trend in China of extreme parental investment (lots of extra classes starting from a young age, forcing one's kid to practice hours of musical instrument each week, paying huge opportunity costs to obtain a 学区房) almost certainly contributes significantly to its current low birth rate. I think normative home/unschooling has the potential to have a similar influence elsewhere. But have you thought about whether lower birth rate is good or bad from a longtermist / x-risk perspective? It's not clear to me that it's bad, at least.
2DanielFilan
I haven't thought incredibly carefully about this. My guess is that a high birth rate accelerates basically everything but elderly care, and so the first-order question is whether you think humanity is pushing in roughly the right or wrong direction - I'd say it's going in the right direction. That being said, there's also a trickier factor of whether you'd rather have all your cognition be in serial or in parallel, and if you want it to be in serial, then low birth rates look good.
5Wei Dai
A couple of considerations in the "lower birth rate is good for longtermism" direction: 1. Lower birth rate makes war less likely. (Less perceived need to grab other people's resources. Parents are loath to lose their only children in war.) 2. Increased parental investment and inheritance which shifts up average per-capita human and non-human capital, which is probably helpful for increasing understanding of x-risk and ability/opportunity to work on it. (Although this depends on the details of how the parental investment is done, since some kinds, e.g., helicopter parenting, can be counterproductive. Home/unschooling seems likely to be good in this regard though.)
2DanielFilan
One factor here that is big in my mind: I expect per-capita wealth to be lower in worlds with lower populations, since fewer people means fewer ideas that enrich everyone. I think that this makes 2 go in the opposite direction, but it's not obvious to me what it does for 1.
0Dagon
It's not clear that positive-sum innovation is linear (or even monotonically positive) with total population. There almost certainly exist levels at which marginal mouths to feed drive unpleasant and non-productive behaviors more than they do the growth-driving shared innovations. Whether we're in a downward-sloping portion of the curve, and whether it slopes up again in the next few generations, are both debatable. And they should be debated.
6DanielFilan
My sense is that on average, more population means more growth (see this study on the question). But at some point you presumably run out of ideas for how to make material more valuable and growth just becomes making more people with the same consumption per capita. I find this comment kind of irksome, because (a) neither I nor anybody else said that they weren't proper subjects for debate and (b) you've exhorted debate on the topic but haven't contributed anything other than the theoretical possibility that the effect could go the other way. So I see this as trying to advance some kind of point illegitimately. If you make another such comment that I find irksome in the same way, I'll delete it, as per my commenting guidelines.
2DanielFilan
I now think the biggest flaw with this argument is that home/unschooling actually don't take that many hours out of the day, and there's a lot of pooling of work going on. Thanks to many FB commenters and Isnasene for pointing that out.
2DanielFilan
And also that anti-standard-school memes are less fit than pro-home/unschooling memes, such that "normative home/unschooling" doesn't seem that likely to be a big thing.
1Pattern
So you're a proponent of improving institutional ways of supervising children?
2DanielFilan
Tentatively, yes. But I've only just had this thought today, so I'm not very committed to it. Also note my edit: it's more about being in favour of low-time-investment ways to raise children that don't have the problems schooling is alleged to have.

I often see (and sometimes take part in) discussion of Facebook here. I'm not sure whether when I partake in these discussions I should disclaim that my income is largely due to Good Ventures, whose money largely comes from Facebook investments. Nobody else does this, so shrug.

4Raemon
Huh. Indeed seems good to at least have talked about talking about.

Why I am less than infinitely hostile to the Time / Bloomberg pieces:

  • They are kinda informative about the way the scene has been in the past
  • Much of the behaviour described in them is pretty fucked up
  • It is relevant for people to know that EA/rationality is not an abuse-free zone
  • The people reported on have faced professional consequences, including expulsion from the community for the most serious offenders, but given that there were several, it's plausible that others will crop up.
  • They point to a dynamic of "single-mindedness on AI stuff / extremely bad at normal human relationships" that is real and kinda bad.

One result that's related to Aumann's Agreement Theorem is that if you and I alternate saying our posterior probabilities of some event, we converge on the same probability if we have common priors. You might therefore wonder why we ever do anything else. The answer is that describing evidence is strictly more informative than stating one's posterior. For instance, imagine that we've both secretly flipped coins, and want to know whether both coins landed on the same side. If we just state our posteriors, we'll immediately converge to 50%, without actually learning the answer, which we could have learned pretty trivially by just saying how our coins landed. This is related to the original proof of the Aumann agreement theorem in a way that I can't describe briefly.
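The coin example can be checked by brute force. Below is a minimal sketch, assuming two fair, independent coins; the function names are made up for illustration. It shows that each person's posterior that the coins match is 1/2 whatever their own coin shows, so stating posteriors reveals nothing, while stating the flips settles the question.

```python
from fractions import Fraction
from itertools import product

# All four equally likely outcomes of (my_coin, your_coin); fair,
# independent coins are an assumption of this sketch.
OUTCOMES = list(product("HT", repeat=2))

def posterior_same(my_coin):
    """P(both coins match | my own coin), by counting consistent outcomes."""
    consistent = [(a, b) for a, b in OUTCOMES if a == my_coin]
    matches = [(a, b) for a, b in consistent if a == b]
    return Fraction(len(matches), len(consistent))

def answer_from_evidence(my_coin, your_coin):
    """Once both flips are stated, the question is simply answered."""
    return my_coin == your_coin

if __name__ == "__main__":
    for coin in "HT":
        print(coin, posterior_same(coin))
    print(answer_from_evidence("H", "T"))
```

Since both posteriors are already equal before anyone speaks, the alternating-announcement process stops at once, with the answer still unknown.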

Models and considerations.

There are two typical ways of deciding whether on net something is worth doing. The first is to come up with a model of the relevant part of the world, look at all the consequences of doing the thing in the model, and determine if those consequences are net positive. When this is done right, the consequences should be easy to evaluate and weigh off against each other. The second way is to think of a bunch of considerations in favour of and against doing something, and decide whether the balance of considerations supports doing the thing or not.

I prefer model-building to consideration-listing, for the following reasons:

  • By building a model, you're forcing yourself to explicitly think about how important various consequences are, which is often elided in consideration-listing. Or rather, I don't know how to quantitatively compare importances of considerations without doing something very close to model-building.
  • Building a model lets you check which possible consequences are actually likely. This is an improvement on considerations, which are often of the form "such-and-such consequence might occur".
  • Building a model lets you notice consequences which you m
... (read more)
4DanielFilan
Homework: come up with a model of this.

Hot take: the norm of being muted on video calls is bad. It makes it awkward and difficult to speak, clap, laugh, or make "I'm listening" sounds. A better norm set would be:

  • use zoom in gallery mode, so somebody making noise doesn't make them more focussed than they were before
  • call from a quiet room
  • be more tolerant of random background sounds, the way we are IRL
1Pongo
Agreed. I often find myself unmuting because I'm trying to make social sounds (often laughter). However, in a large conversation, I'd rather someone become a weird void without backchannel sounds than be plunged into domestic mayhem.

As far as I can tell, people typically use the orthogonality thesis to argue that smart agents could have any motivations. But the orthogonality thesis is stronger than that, and its extra content is false - there are some goals that are too complicated for a dumb agent to have, because the agent couldn't understand those goals. I think people should instead directly defend the claim that smart agents could have arbitrary goals.

2DanielFilan
I no longer endorse this claim about what the orthogonality thesis says.

A rough and dirty estimate of the COVID externality of visiting your family in the USA for Christmas when you don't feel ill [EDIT: this calculation low-balls the externality, see below]:

You incur some number of μCOVIDs[*] a week, let's call it x. Since the incubation time is about 5 days, let's say that your chance of having COVID is about 5x/7,000,000 when you arrive at the home of your family with n other people. In-house attack rate is about 1/3, I estimate based off hazy recollections, so in expectation you infect 5xn/21,000,000 people, which is about... (read more)
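The arithmetic so far can be written out as a short sketch. The 5-day incubation window and the 1/3 in-house attack rate are the figures given above; the function names and the example inputs (x = 100, n = 4) are hypothetical and only there to make the numbers concrete.

```python
from fractions import Fraction

INCUBATION_DAYS = 5            # figure used in the estimate above
ATTACK_RATE = Fraction(1, 3)   # in-house attack rate from the estimate above

def p_infected_on_arrival(x):
    """Chance of carrying COVID on arrival, given x microCOVIDs per week.

    One microCOVID is a one-in-a-million chance of infection, and only the
    last 5 of 7 days count, giving 5x / 7,000,000.
    """
    return Fraction(INCUBATION_DAYS, 7) * Fraction(x) / 1_000_000

def expected_household_infections(x, n):
    """Expected number of the n other household members you infect."""
    return p_infected_on_arrival(x) * n * ATTACK_RATE

if __name__ == "__main__":
    # Hypothetical inputs: 100 microCOVIDs per week, 4 other people.
    print(expected_household_infections(100, 4))
```

This only covers the first step (you infecting your household); it does not attempt the downstream parts of the estimate.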

4DanielFilan
I recently realized, thanks to a FB comment by Paul Christiano, that this is thinking about things in kind of the wrong way. R is approximately 1 because society is tamping down infection rates when infections are high and 'loosening' when infections are low. So, by infecting people, you cause some chain of counterfactual infections that perhaps ends when society notices and tamps down infection, but also you cause the rest of society to do less fun interacting in order to tamp down the virus. So the cost of infecting somebody is to cause everybody else to be more conservative. I'm still not quite sure how to think about that cost tho.
2DanielFilan
Note: this calculation only accounts for you infecting your relatives who then infect others, and not your relatives infecting you and you infecting others. Accounting for this should probably raise the cost by a factor of 2.
2DanielFilan
Note: this calculation assumes that travelling is not risky at all. Realistically that should be bundled into x.

Better to concretise 3 ways than 1 if you have the time.

Here's a tale I've heard but not verified: in the good old days, Intrade had a prediction market on whether Obamacare would become law, which resolved negative, due to the market's definition of Obamacare.

Sometimes you're interested in answering a vague question, like 'Did Donald Trump enact a Muslim ban in his first term' or 'Will I be single next Valentine's day'. Standard advice is to make the question more specific and concrete into something that can be more objectively evaluated. I think that th

... (read more)

This weekend, I looked up Benquo's post on zetetic explanation in order to nominate it for the 2019 review. Alas, it was posted in 2018, and wasn't nominated for that year's review. Nevertheless, I've recently gotten interested in amateur radio, and have noticed that the mechanistic/physical explanations of radio waves and such that I've come across while studying for exams are not really sufficient to empower me to actually get on the radio, and more zetetic explanations are useful, altho harder to test. Anyway, I recommend re-reading the post.

My bid for forecasters: come up with conditional prediction questions to forecast likely impacts of potential US policies towards Ukraine. See this thread where I brainstorm potential such questions.

Challenges as I see it: figuring out which policies are live options, operationalizing, and figuring out good success/failure metrics.

Benefits: potentially make policy more sane, or more realistically practice doing the sort of thing that might one day make policy more sane.

Ted Kaczynski as a relatively apolitical test case for cancellation norms:

Ted Kaczynski was a mathematics professor who decided that industrial society was terrible, and waged a terroristic bombing campaign to foment a revolution against technology. As part of this campaign, he wrote a manifesto titled "Industrial Society and Its Future" and said that if a major newspaper printed it verbatim he would desist from terrorism. He is currently serving eight life sentences in a "super-max" security prison in Colorado.

My understanding is that his manifesto (which... (read more)

Generally speaking, if someone commits heinous and unambiguous crimes in service of an objective like "getting people to read X", and it doesn't look like they're doing a tricky reverse-psychology thing or anything like that, then we should not cooperate with that objective. If Kaczynski had posted his manifesto on LessWrong, I would feel comfortable deleting it and any links to it, and I would encourage the moderator of any other forum to do the same under those circumstances.

But this is a specific and unusual circumstance. When people try to cancel each other, usually there's no connection or a very tenuous connection between their writing and what they're accused of. (Also the crime is usually less severe and less well proven.) In that case, the argument is different: the people doing the cancelling think that the crime wasn't adequately punished, and are trying to create justice via a distributed minor punishment. If people are right about whether the thing is bad, then the main issues are about standards of evidence (biased readings and out-of-context quotes go a long way), proportionality (it's not worth blowing up people's lives over having said something dumb on the internet), and relation to nonpunishers (problems happen when things escalate from telling people why someone is bad, to punishing people for not believing or not caring).

2Dagon
There's no need to cancel anyone who's failing to have influence already. I suspect there are no apolitical test cases: cancellation (in the form of verbally attacking and de-legitimizing someone as a person, rather than arguing against specific portions of their work) is primarily politically motivated. It's pretty pure ad-hominem argument: "don't listen to or respect this person, regardless of what they're saying". In this case, I'm not listening because I think it's low-value on its own, regardless of authorship. The manifesto is pretty easy to find in PDF form for free. I wasn't able to get very far - way too many crackpot signals and didn't seem worth my time. To your bullet points: 1. I can read this two ways: "should anybody" meaning "do you recommend any specific person read it" or "do you object to people reading it". My answers are "yes, but not many people", and "no". Anybody who is interested, either from a direct curiosity on the topic (which I predict won't be rewarded) or from wanting to understand this kind of epistemic pathology (which might be worthwhile) should read it. 2. It's absolutely acceptable. I wouldn't enjoy it, but I'm not a member of the group, so no harm there. To decide whether YOUR group should do it, try to identify what you'd hope to get out of it, and what likely consequences there are from pursuing that direction. If your group is visible and sensitive to public perception (aka politically influenced), then certainly you should consider those effects.
2DanielFilan
To be explicit, here are some reasons that the EA community should cancel Kaczynski. Note that I do not necessarily think that they are sound or decisive. * EAs are known as utilitarians who are concerned about the impact of AI technology. Associating with him could give people the false impression that EAs are in favour of terroristic bombing campaigns to retard technological development, which would damage the EA community. * His threat to bomb more people and buildings if the Washington Post (WaPo) didn't publish his manifesto damaged good discourse norms by inducing the WaPo to talk about something it wasn't otherwise inclined to talk about, and good discourse norms are important for effective altruism. * It seems to me (not having read the manifesto) that the policies he advocates would cause large amounts of harm. For instance, without modern medical technology, I and many others would not have survived to the age of one year. * His bombing campaign is evidence of very poor character.
2Pattern
1. Did a newspaper print it verbatim? 2. Did he desist? Did he start again later? 3. Who wrote these: "(which, incidentally, has been updated and given a new title "Anti-Tech Revolution: Why and How", the second edition of which was released this year)"? 4. How long is "eight life sentences", and how much time does he have left?
2DanielFilan
1. Yes, it was published by the Washington Post. 2. Yes, there were no further bombings after its publication. 3. He did. 4. "eight life sentences", IIUC, means that he will serve the rest of his life, and if the justice system decides that one (or any number less than 8) of the sentences should be vacated, he will still serve the rest of his life. I'm not sure what his life expectancy is, but he's 78 at the moment.

I made this post with the intent to write a comment, but the process of writing the comment out made it less persuasive to me. The planning fallacy?

7Raemon
If this is all that Shortform Feed posts ever do it still seems net positive. :P [edit: conditional on, you know, you endorsing it being less persuasive]
4Raemon
Similarly, I sometimes start a shortform post and then realize "you know what, this is actually a long post". And I think that's also shortform doing an important job of lowering the barrier to getting started even if it doesn't directly get used.

Here's a script I wrote to analyze how good Manifold Markets is at predicting Ukraine stuff. Basically: it's about as good as you would be if you were calibrated at 80% accuracy if you average market prices over the life of the market, and if you take the probabilities at the mid-point of the market, it's about as good as you would be if you were calibrated at 72% accuracy.
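For readers wondering what "as good as being calibrated at 80% accuracy" could mean numerically, here is one possible reading, sketched under stated assumptions. It assumes the Brier score as the scoring rule, which may not be what the script above uses, and both function names are hypothetical. A forecaster who always puts probability a on the side they favour, and is right a fraction a of the time, has expected Brier score a(1 - a); inverting that maps an observed score back to an equivalent a.

```python
import math

def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def equivalent_calibrated_accuracy(brier):
    """Accuracy a >= 0.5 of a calibrated forecaster with the same Brier score.

    Such a forecaster scores a*(1-a)**2 + (1-a)*a**2 = a*(1-a) in
    expectation, so we solve a*(1-a) = brier for the root above one half.
    """
    if not 0 <= brier <= 0.25:
        raise ValueError("Brier score must lie in [0, 0.25] for this mapping")
    return (1 + math.sqrt(1 - 4 * brier)) / 2

if __name__ == "__main__":
    # Hypothetical data: five forecasts of 0.8, four of which came true.
    score = brier_score([0.8] * 5, [1, 1, 1, 1, 0])
    print(score, equivalent_calibrated_accuracy(score))
```

A log-score version would work the same way in outline, with a different expected-score formula to invert.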

3DanielFilan
In order to figure out how good this is, you'd also want to check how hard the questions were.

Some puzzles:

  • rubber ducking is really effective
  • it's very difficult to write things clearly, even if you understand them clearly

These seem like they should be related, but I don't quite know how. Maybe if someone thought about it for an hour they could figure it out.

3Gunnar_Zarncke
Related: As I wrote just recently: https://www.facebook.com/Xuenay/posts/10161257148333662?comment_id=10161257444543662 The feeling of something being obvious or easy in the above sense can be mistaken sometimes. It is an intuition or heuristic that our brain applies, I guess, to figure out which things we are supposed to know in a tribe. It can be put on more solid footing by spelling out things and being forced to make intuitions explicit.
1Measure
My 5-second take is basically what Gunnar_Zarncke already said. If you're finding difficulty writing something clearly, it might mean you don't understand it as clearly as you think. Maybe you understand 90%, and you gloss over the unclear 10%. Writing it out (or trying to fully explain it to someone) forces you to work through that 10%.
2DanielFilan
You might be better at writing than I am.

Quantitative claims about code maintenance from Working in Public, plausibly relevant to discussion of code rot and machine intelligence:

  • "most computer programmers begin their careers doing software maintenance, and many never do anything but", attributed to Nathan Ensmenger, professor at Indiana University.
  • "most software at Google gets rewritten every few years", attributed to Fergus Henderson of Google.
  • "A 2018 Stripe survey of software developers suggested that developers spend 42% of their time maintaining code" - link
  • "Nathan Ensmenger, the informa
... (read more)
2Viliam
Does this definition of "maintenance" include writing new functionality for existing applications? If yes, then I agree; it is a rare opportunity to start coding a non-trivial project from scratch. If no, then I find it difficult to believe how someone could e.g. fix bugs without ever having written their own code first (the school exercises do not count, because in my experience they do not resemble actual industry code).
4DanielFilan
From when the book introduces 'maintenance': So, sounds like the book author isn't including writing new functionality, but IDK if the term has such a fixed and clear meaning that Nathan Ensmenger and all the respondents to the Stripe survey mean the same thing as the book.

Here's a project idea that I wish someone would pick up (written as a shortform rather than as a post because that's much easier for me):

  • It would be nice to study competent misgeneralization empirically, to give examples and maybe help us develop theory around it.
  • Problem: how do you measure 'competence' without reference to a goal??
  • Prior work has used the 'agents vs devices' framework, where you have a distribution over all reward functions, some likelihood distribution over what 'real agents' would do given a certain reward function, and do Bayesian i
... (read more)
2DanielFilan
Toryn Q. Klassen, Parand Alizadeh Alamdari, and Sheila A. McIlraith wrote a paper on the multi-agent AUP thing, framing it as a study of epistemic side effects.

This is a fun Aumann paper that talks about what players have to believe to be in a Nash equilibrium. Here, instead of imagining agents randomizing, we're instead imagining that the probabilities over actions live in the heads of the other agents: you might well know exactly what you're going to do, as long as I don't. It shows that in 2-player games, you can write down conditions that involve mutual knowledge but not common knowledge that imply that the players are at a Nash equilibrium: mutual knowledge of players' conjectures about each other, players' ... (read more)

2DanielFilan
Got it, sort of. Once you have 3 people, then each person has a conjecture about the actions of the other two people. This means that your distribution might not be the product of the marginals over your distributions over the actions of each opponent, so you might be maximizing expected utility wrt your actual beliefs, but not wrt the product of the marginals - and the marginals are what are supposed to form the Nash equilibrium. Common knowledge and common priors stop this by forcing your conjecture over the different players to be independent. I still have a hard time explaining in words why this has to be true, but at least I understand the proof.
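A toy illustration of the three-player point, with made-up actions and payoffs (nothing here is taken from the paper): player 1 believes players 2 and 3 are perfectly correlated, so the joint conjecture is not the product of its marginals. The best response to the joint conjecture then differs from the best response to the product of the marginals.

```python
from itertools import product

# Hypothetical conjecture of player 1 over (player 2's action, player 3's action):
# the two opponents are believed to always play the same action.
joint = {(0, 0): 0.5, (1, 1): 0.5}


def marginal(belief, index):
    """Marginal distribution over one opponent's action."""
    out = {}
    for actions, prob in belief.items():
        out[actions[index]] = out.get(actions[index], 0.0) + prob
    return out


def product_of_marginals(belief):
    """Independent belief with the same marginals as the joint conjecture."""
    m2, m3 = marginal(belief, 0), marginal(belief, 1)
    return {(a2, a3): m2[a2] * m3[a3] for a2, a3 in product(m2, m3)}


def utility(action, a2, a3):
    """Hypothetical payoffs for player 1."""
    if action == "bet_on_match":
        return 1.0 if a2 == a3 else -1.0
    return 0.5  # "safe" pays the same regardless of the opponents


def expected_utility(action, belief):
    return sum(prob * utility(action, a2, a3) for (a2, a3), prob in belief.items())


def best_response(belief):
    return max(["bet_on_match", "safe"], key=lambda a: expected_utility(a, belief))


print(best_response(joint))                        # best response to actual beliefs
print(best_response(product_of_marginals(joint)))  # best response to the marginals
```

Here "bet_on_match" maximizes expected utility under the correlated conjecture, while "safe" is the best response to the product of the marginals, so optimizing against your actual beliefs does not guarantee optimizing against the marginals.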

Let it be known: I'm way more likely to respond to (and thereby algorithmically signal-boost) criticisms of AI doomerism that I think are dumb than those that I think are smart, because the dumb objections are easier to answer. Caveat emptor.

An attempt at rephrasing a shard theory critique of utility function reasoning, while restricting myself to things I basically agree with:

Yes, there are representation theorems that say coherent behaviour is optimizing some utility function. And yes, for the sake of discussion let's say this extends to reward functions in the setting of sequential decision-making (even tho I don't remember seeing a theorem for that). But: just because there's a mapping, doesn't mean that we can pull back a uniform measure on utility/reward functions to get a reasonable mea... (read more)

Here are two EA-themed podcasts that I think someone could make. Maybe that someone is you!

  1. More or Less, but EA (or for forecasting)

More or Less is a BBC Radio program. They take some number that's circulating around the news, and provide context like "Is that literally true? How could someone know that? What is that actually measuring? Is that a big number? Does that mean what you think it means?" - stuff like that. They spend about 10 minutes on each number, and usually include interviews with experts in the field. IMO, someone could do this for numb... (read more)

6DanielFilan
If you're reading this, you might wonder: how do I actually make a podcast? Well, here's the basic technical stuff to get started.
  1. Buy a decent microphone, e.g. the Blue Yeti (costs ~$100). This will make you not sound bad.
  2. If you're going to be talking to people who aren't physically near you, use some service that will record both of you talking. I recommend Zencastr (free for how I use it).
  3. Record some talking (this is the hard part). My strong advice is that if you're doing this remotely, you should both be wearing wired headphones. Please do this in a non-echoey, non-noisy space if you can. Kitchen is bad, sound-isolated place with blankets is good.
  4. Do some minimal editing. Don't try to delete every um and ah, that will take way too long. You can use the computer program "audacity" for this (free), or ask me who I pay to do my editing.
  5. Optionally, make transcripts by uploading your edited audio files to rev.com (~$1 per minute of audio). You'll then have to re-listen to the audio and fix mistakes in the transcript. If you do this, you will probably want to make a website to put transcripts on, which will maybe involve using Github Pages or Squarespace (or maybe you just put transcripts on a pre-existing Medium/Substack/blog?).
  6. Think of a name and logo for your podcast. Your logo needs to be exactly square and high-res.
  7. Use a podcast hosting service. I like libsyn (~$10/month for basic plan). Upload your audio files there, write descriptions and episode titles. You should now have an RSS feed.
  8. Submit your RSS feed to Google Podcasts, Apple Podcasts, and Spotify. This will involve googling how to do this, you might make some errors, and then it will take ages for Apple to list your podcast.
Once you've done all this and dealt with the inevitable hiccups, you now have a podcast! Congratulations! It is certainly possible to do all of this better, but you at least have the basics.
1Quadratic Reciprocity
What do you see as the main value of idea 2? 
2DanielFilan
More easily digestible discussion / analysis of AI alignment ideas. Also it might be fun to listen to.

A sad fact is that good methods to elicit accurate probabilities of the outcome of some future process, e.g. who will win the next election, give you an incentive to influence that outcome, e.g. by campaigning and voting for the candidate you said was more likely to win. But with mind uploading and the 'right' theory of personal identity, we can fix this!

First, suppose that you think of all psychological descendants of your current self as 'you', but you don't think of descendants of your past self as 'you'. So, if you were to make a copy of yourself tomor... (read more)

3Zac Hatfield-Dodds
Objections might include:
  1. That's mindcrime and/or murder, which is bad.
  2. Acausal trade is in fact a thing.
  3. blah blah technical feasibility
2DanielFilan
Why murder? No sims are being deleted in this proposal.
2DanielFilan
Ok, a much simpler way is to put yourself in storage right after making the prediction and revive you after the event happens (e.g. by not having the copy of you that hangs out between the prediction and the event). Then you don't need the weird theory of identity.
2Dagon
I'm having trouble supposing this.  Aren't ALL descendants of my past selves "me", including the me who is writing this comment?  I'm good with differing degrees of "me-ness", based on some edit-distance measure that hasn't been formalized, but that's not based on path, it's based on similarity.  My intuition is that it's symmetrical.
2DanielFilan
I'm sympathetic to the idea this is a silly assumption, I just think it buys you a neat result.

Suppose there are two online identities, and you want to verify that they're associated with the same person. It's not too hard to verify this: for instance, you could tell one of them something secretly, and ask the other what you told the first. But how do you determine that two online identities are different people? It's not obvious how you do this with anything like cryptographic keys etc.

One way to do it if the identities always do what's causal-decision-theoretically correct is to have the two identities play a prisoner's dilemma with each other, an... (read more)

2DanielFilan
Here's one way you can do it: Suppose we're doing public key cryptography, and every person is associated with one public key. Then when you write things online you could use a linkable ring signature. That means that you prove that you're using a private key that corresponds to one of the known public keys, and you also produce a hash of your keypair, such that (a) the world can tell you're one of the known public keys but not which public key you are, and (b) the world can tell that the key hash you used corresponds to the public key you 'committed' to when writing the proof.
2DanielFilan
Actually I'm being silly, you don't need ring signatures, just signatures that are associated with identities and also used for financial transfers.
2DanielFilan
Note that for this to work you need a strong disincentive against people sharing their private keys. One way to do this would be if the keys were also used for the purpose of holding cryptocurrency.
2ChristianKl
If you want to search for literature the relevant term is Sybil attack. 

Blog post request: a summary of all the UFO stuff and what odds I should put on alien visitations of earth.

'Seminar' announcement: me talking quarter-bakedly about products, co-products, deferring, and transparency. 3 pm PT tomorrow (actually 3:10 because that's how time works at Berkeley).

I was daydreaming during a talk earlier today (my fault, the talk was great), and noticed that one diagram in Dylan Hadfield-Menell's off-switch paper looked like the category-theoretic definition of the product of two objects. Now, in category theory, the 'opposite' of a product is a co-product, which in set theory is the disjoint union. So if the product of two actions is d... (read more)

4DanielFilan
I do not have many ideas here, so it might mostly be me talking about the category-theoretic definition of products and co-products.

Avoid false dichotomies when reciting the litany of Tarski.

Suppose I were arguing about whether it's morally permissible to eat vegetables. I might stop in the middle and say:

If it is morally permissible to eat vegetables, I desire to believe that it is morally permissible to eat vegetables. If it is morally impermissible to eat vegetables, I desire to believe that it is morally impermissible to eat vegetables. Let me not become attached to beliefs I may not want.

But this ignores the possibility that it's neither morally permissible nor morally impermi... (read more)

6DanielFilan
Alternate title: negation is a little tricky.

An interesting tension: it's kind of obvious from a micro-econ view that group houses should have Pigouvian taxes on uCOVIDs[*] (where I pay housemates for the chance I get them sick) rather than caps on how many uCOVIDs everyone can incur per week - and of course both of these are better than "just sort of be reasonable" or having no system. But uCOVID caps are nice in that they make it significantly easier to coordinate with other houses - it's much easier to figure out how risky interacting with somebody is when they can just tell you their cap, rather ... (read more)

5Zack_M_Davis
It's μCOVID, with a μ!
2DanielFilan
Sorry, I maybe should have cared enough to copy-paste but didn't.
2Ben Pace
(Which you get using option-m on a mac.)
6Zack_M_Davis
(Or M-x insert-char GREEK SM<tab> L<tab>M<tab> in Emacs.)
2DanielFilan
Even easier in Helm-mode!

FYI: I am not using the dialogue matching feature. If you want to dialogue with me, your best bet is to ask me. I will probably say no, but who knows.

Research project idea: formalize a set-up with two reinforcement learners, each training the other. I think this is what's going on in baby care. Specifically: a baby is learning in part by reinforcement learning: they have various rewards they like getting (food, comfort, control over environment, being around people). Some of those rewards are dispensed by you: food, and whether you're around them, smiling and/or mimicking them. Also, you are learning via RL: you want the baby to be happy, nourished, rested, and not cry (among other things). And the baby... (read more)

2mako yass
This will always multiply error, every time, until you have a society, at which point the agents aren't really doing naked RL any more because they need to be resilient enough to not get parasitized/dutchbooked.

Blog posts I could write up in the next few days:

  • My EDC as of late Dec 2022 [EDIT: here]
  • thoughts on media I consumed in 2022 (would include Kindle books + stuff I watched on Netflix and Amazon Prime Video)
4DanielFilan
I could also do my "cover" of the "you should start a blog" genre of post. [EDIT: done]
2DanielFilan
A rationalist cover of Paul Washer's shocking youth message sermon (worth watching if you're interested in sermons).
2Mitchell_Porter
I guess rationalist salvation is now a matter of degree: "What fraction of your multiverse measure will experience a future optimized according to true human values?"
2Dagon
I always enjoy a good EDC discussion.  I've switched this year from trying to use a phone/small tablet out in the world, to just admitting that I really do prefer a full keyboard/trackpad and an OS that's designed for it.  Right now that's a Surface Laptop 3 (13.5" screen, under 3 lbs).  It doesn't go everywhere with me, but it's around often enough that I don't try to type more than a sentence or two on my phone.

An argument for stock-picking:

I'm not sure whether I can pick stocks better than the market. But if I can, then money is more valuable to me in that world, since I have better-than-market opportunities in that world but only par-with-market opportunities in the EMH world. So I should buy stocks that look good to me, at least for a while, to check whether I'm in the world where I can do that, because it's a transfer from a world where money is less valuable to me to one where money is more valuable.

I think this argument goes thru if you assume market returns are equal in both worlds, which I think I think.

2Mark Xu
I agree market returns are equal in expectation, but you're exposing yourself to more risk for the same expected returns in the "I pick stocks" world, so risk-adjusted returns will be lower.
2DanielFilan
According to Michael Dickens, if you pick like 50 and they're not too correlated it's not actually all that much more risk. Which sort of makes sense - it's like how you can accurately estimate population averages of stuff by taking a relatively small random sample and looking at the sample average.
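A rough sketch of why this might hold, under simplifying assumptions that are mine rather than Michael Dickens's: every stock has the same volatility `sigma`, every pair has the same correlation `rho`, and the portfolio is equal-weighted. The portfolio variance is then sigma² · (1/n + (1 - 1/n) · rho), which approaches sigma² · rho as n grows. The numbers below are hypothetical.

```python
import math


def portfolio_volatility(n, sigma, rho):
    """Volatility of an equal-weighted portfolio of n stocks, each with
    volatility sigma and common pairwise correlation rho (assumed model)."""
    variance = sigma ** 2 * (1 / n + (1 - 1 / n) * rho)
    return math.sqrt(variance)


# Hypothetical inputs: 30% volatility per stock, pairwise correlation 0.2.
sigma, rho = 0.3, 0.2
for n in (1, 10, 50, 500):
    print(n, round(portfolio_volatility(n, sigma, rho), 4))
print("limit", round(sigma * math.sqrt(rho), 4))
```

With these inputs, 50 stocks already gets most of the way from single-stock volatility to the fully diversified limit. The remaining risk comes from the correlation term, which no number of stocks removes.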

Results from an experiment I just found about inside vs outside view thinking (but haven't read the actual study, just the abstract: beware!)

Contrary to expectation, participants who assigned more importance to inside factors estimated longer completion times, and participants who gave greater weight to outside factors showed higher degrees of confidence in their estimates.

Excerpts from a FB comment I made about the principle of charity. Quote blocks are a person that I'm responding to, not me. Some editing for coherence has been made. tl;dr: it's fine to conclude that people are acting selfishly, and even to think that it's likely that they're acting selfishly on priors regarding the type of situation they're in.


The essence of charitable discourse is assuming that even your opponents have internally coherent and non-selfish reasons for what they do.

If this were true, then one shouldn't engage in charitable discourse. P... (read more)

2Pattern
Whether it is more charitable to assume someone is or isn't selfish can depend on context.
2Dagon
I think you (and Wikipedia and Scott) are limiting your ideas of what the principle really means. _IF_ you only care about rationality, it's about assuming rationality. For those of us in conversations where we _ALSO_ care about intent, nuance, and connotation, it can include assuming goodwill and best intentions of your conversational partners. In all cases, the assumption is only a prior - you're getting a lot of evidence in the discussion, and you don't need to cling to a false belief when shown that your opponent and their statements are not correct or useful.

A failure of an argument against sola scriptura (cross-posted from Superstimulus)

Recently, Catholic apologist Joe Heschmeyer has produced a couple of videos arguing against the Protestant view of the Bible - specifically, the claims of Sola Scriptura and Perspicuity (capitalized because I'll want to refer to them as premises later). "Sola Scriptura" has been operationalized a few different ways, but one way that most Protestants would agree on is (taken from the Westminster confession):

The whole counsel of God, concerning all things necessary for [...] m

... (read more)
2DanielFilan
Another way of maintaining Sola Scriptura and Perspicuity in the face of Protestant disagreement about essential doctrines is the possibility that all of this is cleared up in the deuterocanonical books that Catholics believe are scripture but Protestants do not. That said, this will still rule out Protestantism, and it's not clear that the deuterocanon in fact clears everything up.