All of bideup's Comments + Replies

bideup43

seems to me to be a crux between "man, we're probably all going to die" and "we're really really fucked"

Sorry, what's the difference between these two positions? Is the second one meant to be a more extreme version of the first?

2Eli Tyre
Yes.
bideup30

What’s the difference between “Alice is falling victim to confusions/reasoning mistakes about X” and “Alice disagrees with me about X”?

I feel like using the former puts undue social pressure on observers to conclude that you’re right, and makes it less likely they correctly adjudicate between the perspectives.

(Perhaps you can empathise with me here, since arguably certain people taking this sort of tone is one of the reasons AI x-risk arguments have not always been vetted as carefully as they should have been!)

3[anonymous]
I suspect that, for Alex Turner, writing the former instead of the latter is a signal that he thinks he has identified the specific confusion/reasoning mistake his interlocutor is engaged in, likely as a result of having seen closely analogous arguments in the past from other people who turned out (or even admitted) to be confused about these matters after conversations with him.
bideup10

I learned maths mostly by teachers at school writing on a whiteboard, university lecturers writing on a blackboard or projector, and to a lesser extent friends writing on pieces of paper.

There was a tiny supplement of textbook-reading at school and a large supplement of printed-notes-reading at university.

I would guess only a tiny fraction learn exclusively via typed materials. If you have any kind of teacher, how could you? Nobody shows you how to rearrange an equation by live-typing LaTeX.

bideup52

In Texas Hold ‘Em, the most popular form of poker, there is no drawing or discarding, just betting and folding.

This seems like strong evidence that those parts are where the skill lies — somebody came up with a version that removed the other parts, and everyone switched to it.

Not sure how that affects the metaphor. For me I think it weakened the punch, since I had to stop and remember that there exist forms of poker with drawing and discarding.

bideup31

Right, I understand it now, thanks. I missed the labels on the x axis.

bideup10

I found your bar chart more confusing than illuminating. Does it make sense to mark the bottom 20% of people, and those people’s 43% probability of staying in the bottom 20%, as two different fractions of the same bar? The 43% is 43% of the 20%, not of the original 100%.
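A quick check on the arithmetic, assuming the 43% is conditional on starting in the bottom quintile (my reading of the chart, not a figure from the report):

$$0.43 \times 0.20 = 0.086$$

i.e. only about 8.6% of the whole population both starts and stays in the bottom quintile, so marking the 43% against the full bar overstates it.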

1James Stephen Brown
Hi bideup, thanks for your comment. The graph is simplified from one in the Pew Report with the left bar representing the lower quintile and the right representing the upper. I see what you mean, but the intention of pointing to the 20% mark is to show where it should be given 100% social mobility. Perhaps the omission of the central quintiles didn’t help.
bideup10

If many more people are extremely happy all the time than extremely depressed all the time, the bunch of people you describe would be managing their beliefs rationally. And indeed I think that’s probably the case.

bideup150

Can anybody confirm whether Paul is likely systematically silenced re OpenAI?

AnnaSalamon1112

I don't know the answer, but it would be fun to have a twitter comment with a zillion likes asking Sam Altman this question.  Maybe someone should make one?

I mean, if Paul doesn't confirm that he is not under any non-disparagement obligations to OpenAI like Cullen O'Keefe did, we have our answer.

In fact, given this asymmetry of information situation, it makes sense to assume that Paul is under such an obligation until he claims otherwise.

bideup10

I’m an adult from the UK and learnt the word faucet like last year

bideup10

Thanks. Do you use this system for reading list(s) too?

3Steven Byrnes
Yeah some of my to-do items are of the form "skim X". Inside the "card" I might have a few words about how I originally came across X and what I'm hoping to get out of skimming it.
bideup30

When you say you use a kanban-style system, does that just refer to the fact that there are columns that you drag items between, or does it specifically mean that you also make use of an 'in progress' column?

If so, do you have one for each 'todo' column, or what?

And do you have a column for the 'capture' aspect of GTD, or do you do something else for that?

4Steven Byrnes
It just refers to the fact that there are columns that you drag items between. I don't even really know how a "proper" kanban works. If a new task occurs to me in the middle of something else, I'll temporarily put it in a left (high-priority) column, just so I don't forget it, and then later when I'm at my computer and have a moment to look at it, I might decide to drag it to a right (low-priority) column instead of doing it.
bideup30

Are you interested in these debates in order to help form your own views, or convince others?

I feel like debates are inferior to reading people's writings for the former purpose, and for the latter they deal collateral damage by making the public conversation more adversarial.

3the gears to ascension
for bengio and turner, the former. for bezos vs connor, definitely the latter, but the public conversation is already adversarial which is why I care to respond in a way that seeks to establish truthseeking in a hostile context, ie, we can honor the reasonable claims but must dishonor the way the reasonable claims are entered in order to get any use out of that. Connor is reasonably good at this but needs to tone down some traits that I don't know how to advise further on. bengio and turner would hopefully do it in text I guess, yeah.
bideup61

I keep reading the title as Attention: SAEs Scale to GPT-2 Small.

Thanks for the heads up.

bideup32

I think what I was thinking of is that words can have arbitrary consequences and be arbitrarily high cost.

In the apologising case, making the right social API call might be an action of genuine significance. E.g. it might mean taking the hit on lowering onlookers' opinion of my judgement, where if I'd argued instead that the person I wronged was talking nonsense I might have got away with preserving it.

John's post is about how you can gain respect for apologising, but it does often have costs too, and I think the respect is partly for being willing to pay them.

bideup30

Words are a type of action, and I guess apologising and then immediately moving on to defending yourself is not the sort of action which signals sincerity.

1M. Y. Zuo
Well, technically it does take energy and time to move the vocal cords, mouth, tongue, etc., but it's such a low-cost action that even doing something as simple as treating someone to lunch will outweigh it a hundredfold.
bideup21

Explaining my downvote:

This comment contains ~5 negative statements about the post and the poster without explaining what it is that the commenter disagrees with.

As such it seems to disparage without moving the conversation forward, and is not the sort of comment I'd like to see on LessWrong.

2Mikhail Samin
My comment was a reply to a comment on ITT. I made it in the hope someone would be up for the bet. I didn’t say I disagree with the OP's claims on alignment; I said I don’t think they’d be able to pass an ITT. I didn’t want to talk about specifics of what the OP doesn’t seem to understand about Yudkowsky’s views, as the OP could then reread some of what Yudkowsky’s written more carefully, and potentially make it harder for me to distinguish them in an ITT. I’m sorry if it seemed disparaging. The comment explained what I disagree with in the post: the claim that the OP would be good at passing an ITT. It wasn’t intended as being negative about the OP, as, indeed, I think 20 people are on the right order of magnitude of the number of people who’d be substantially better at it, which is the bar of being in the top 0.00000025% of Earth population at this specific thing. (I wouldn’t claim I’d pass that bar.) If people don’t want to do any sort of betting, I’d be up for a dialogue on what I think Yudkowsky thinks that would contradict some of what’s written in the post, but I don’t want to spend >0.5h on a comment no one will read.
bideup30

The second footnote seems to be accidentally duplicated as the intro. Kinda works though.

2johnswentworth
WOW I missed that typo real hard. Thanks for mentioning.
bideup157

"Not invoking the right social API call" feels like a clarifying way to think about a specific conversational pattern that I've noticed that often leads to a person (e.g. me) feeling like they're virtuosly giving up ground, but not getting any credit for it.

It goes something like:

Alice: You were wrong to do X and Y.

Bob: I admit that I was wrong to do X and I'm sorry about it, but I think Y is unfair.

discussion continues about Y and Alice seems not to register Bob's apology

It seems like maybe bundling in your apology for X with a protest against Y just ... (read more)

3PoignardAzur
I think "API calls" are the wrong way to word it. It's more that an apology is a signal; to make it effective, you must communicate that it's a real signal reflecting your actual internal processes, and not a result of a surface-level "what words can I say to appear maximally virtuous" process. So for instance, if you say a sentence equivalent to "I admit that I was wrong to do X and I'm sorry about it, but I think Y is unfair", then you're not communicating that you underwent the process of "I realized I was wrong, updated my beliefs based on it, and wondered if I was wrong about other things". A simple fix would be "I admit I was wrong to do X, and I'm sorry about it. Let me think about Y for a moment." And then actually think about Y, because if you did one thing wrong, you probably did other things wrong too.
4M. Y. Zuo
Typically people show genuine sincerity by their actions, not just by words...  So focusing on the 'right social API calls' seems a bit tangential.
9Kei
It also helps to dedicate a complete sentence (or multiple sentences if the action you're apologizing for wasn't just a minor mistake) to your apology. When apologizing in-person, you can also pause for a bit, giving your conversational partner the opportunity to respond if they want to. When you immediately switch into the next topic, as in your example apology above, it looks like you're trying to distract from the fact that you were wrong, and also makes it less likely your conversational partner internalizes that you apologized.
bideup43

Is it true that scaling laws are independent of architecture? I don’t know much about scaling laws but that seems surely wrong to me.

e.g. how does RNN scaling compare to transformer scaling

2Vladimir_Nesov
The relevant laws describe how perplexity determines compute and data needed to get it by a training run that tries to use as little compute as possible and is otherwise unconstrained on data. The claim is this differs surprisingly little across different architectures. This is different from what historical trends in algorithmic progress measure, since those results are mostly not unconstrained on data (which also needs to be from sufficiently similar distributions to compare architectures), and fail to get through the initial stretch of questionable scaling at low compute. It's still probably mostly selection effect, but see Mamba's scaling laws (Figure 4 in the paper) where dependence of FLOPs on perplexity only ranges about 6x across GPT-3, LLaMA, Mamba, Hyena, and RWKV. Also, the graphs for different architectures don't like intersecting, suggesting some "compute multiplier" property of how efficient an architecture is across a wide range of compute compared to another architecture. The question is if any of these compute multipliers significantly change at greater scale, once you clear the first 1e20 FLOPs or so.
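To make the "compute multiplier" framing concrete, here is a minimal sketch with made-up numbers (not data from the Mamba paper): fit power laws $L(C) = a C^{-b}$ to two hypothetical architectures and read off the constant compute ratio between them.

```python
import numpy as np

# Made-up (compute, loss) points for two hypothetical architectures,
# each following an assumed power law L(C) = a * C**(-b).
compute = np.array([1e18, 3e18, 1e19, 3e19, 1e20])
loss_a = 6.0 * compute ** -0.05
loss_b = 6.0 * (1.5 * compute) ** -0.05  # same law, but B behaves as if it had 1.5x the compute

def fit_power_law(c, l):
    """Least-squares fit of log L = log a - b log C; returns (a, b)."""
    slope, log_a = np.polyfit(np.log(c), np.log(l), 1)
    return np.exp(log_a), -slope

a_a, b_a = fit_power_law(compute, loss_a)
a_b, b_b = fit_power_law(compute, loss_b)

# Compute multiplier: the ratio of compute the two architectures need to reach
# the same loss. If the fitted log-log lines are parallel (same exponent),
# this ratio is constant across scale, i.e. the curves never intersect.
target_loss = 6.0 * 1e19 ** -0.05
c_needed_a = (a_a / target_loss) ** (1 / b_a)
c_needed_b = (a_b / target_loss) ** (1 / b_b)
print(f"compute multiplier of B over A: {c_needed_a / c_needed_b:.2f}")  # ~1.5
```

The open question in the comment above is whether such a fitted multiplier stays roughly constant once you extrapolate well past the compute range you measured.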
bideup22

Your example of a strong syllogism (‘if A, then B. A is true, therefore B is true’) isn’t one.

It’s instead of the form ‘If A, then B. A is false, therefore B is false’, which is not logically valid (and also not a Jaynesian weak syllogism).

If Fisher lived to 100 he would have become a Bayesian

Fisher died at the age of 72

———————————————————————————————————

Fisher died a Frequentist

You could swap the conclusion with the second premise and weaken the new conclusion to ‘Fisher died before 100’, or change the premise to ‘Unless Fisher lived to 100 he would not have become a Bayesian’.
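For reference, written out side by side (standard names, as I understand them):

$$\text{Modus ponens (valid):}\quad \frac{A \to B \qquad A}{B} \qquad\qquad \text{Denying the antecedent (invalid):}\quad \frac{A \to B \qquad \neg A}{\neg B}$$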

2Jan Christian Refsgaard
crap, you are right, this was one of the last things we changed before publishing because our previous examples were too combative :(. I will fix it later today.
bideup107

Augmenting humans to do better alignment research seems like a pretty different proposal to building artificial alignment researchers.

The former is about making (presumed-aligned) humans more intelligent, which is a biology problem, while the latter is about making (presumed-intelligent) AIs aligned, which is a computer science problem.

8Noosphere89
I think my crux is that if we assume that humans are scalable in intelligence without the assumption that they become misaligned, then it becomes much easier to argue that we'd be able to align AI without having to go through the process, for the reason sketched out by jdp: https://www.lesswrong.com/posts/JcLhYQQADzTsAEaXd/?commentId=7iBb7aF4ctfjLH6AC
bideup30

I don’t think that that’s the view of whoever wrote the paragraph you’re quoting, but at this point we’re doing exegesis

2faul_sname
"We don't currently have any way of getting any system to learn to robustly optimize for any specific goal once it enters an environment very different from the one it learned in" is my own view, not Nate's. Like I think the MIRI folks are concerned with "how do you get an AGI to robustly maximize any specific static utility function that you choose". I am aware that the MIRI people think that the latter is inevitable. However, as far as I know, we don't have even a single demonstration of "some real-world system that robustly maximizes any specific static utility function, even if that utility function was not chosen by anyone in particular", nor do we have any particular reason to believe that such a system is practical. And I think Nate's comment makes it pretty clear that "robustly maximize some particular thing" is what he cares about.
bideup52

Hm, I think that paragraph is talking about the problem of getting an AI to care about a specific particular thing of your choosing (here diamond-maximising), not any arbitrary particular thing at all with no control over what it is. The MIRI-esque view thinks the former is hard and the latter happens inevitably.

2faul_sname
I don't think we have any way of getting an AI to "care about" any arbitrary particular thing at all, by the "attempt to maximize that thing, self-correct towards maximizing that thing if the current strategies are not working" definition of "care about". Even if we relax the "and we pick the thing it tries to maximize" constraint.
bideup30

Right, makes complete sense in the case of LLM-based agents, I guess I was just thinking about much more directly goal-trained agents.

bideup50

I like the distinction but I don’t think either aimability or goalcraft will catch on as Serious People words. I’m less confident about aimability (doesn’t have a ring to it) but very confident about goalcraft (too Germanic, reminiscent of fantasy fiction).

Is words-which-won’t-be-co-opted what you’re going for (a la notkilleveryoneism), or should we brainstorm words-which-could-plausibly-catch on?

1RogerDearnaley
I would say "metaethics", but sadly the philosophers of Ethics already used that one for something else. How about "Social Ethical System Design" or "Alignment Ethical Theory" for 'goalcraft', and "Pragmatic Alignment" for 'aimability'?
bideup10

Perhaps, or perhaps not? I might be able to design a gun which shoots bullets in random directions (not on random walks), without being able to choose the direction.

Maybe we can back up a bit, and you could give some intuition for why you expect goals to go on random walks at all?

My default picture is that goals walk around during training and perhaps during a reflective process, and then stabilise somewhere.

2avturchin
My intuition: imagine an LLM-based agent. It has a fixed prompt and some context text, and uses this iteratively. The context part can change, and as it changes, it affects the interpretation of the fixed part of the prompt. Examples are Waluigi and other attacks. This causes goal drift. This may have bad consequences, as a robot suddenly turns into Waluigi and starts killing everyone around it at random. But long-term planning and deceptive alignment require a very fixed goal system.
bideup30

I think that’s a reasonable point (but fairly orthogonal to the previous commenter’s one)

bideup52

A gun which is not easily aimable doesn't shoot bullets on random walks.

Or in less metaphorical language, the worry is that mostly that it's hard to give the AI the specific goal you want to give it, not so much that it's hard to make it have any goal at all. I think people generally expect that naively training an AGI without thinking about alignment will get you a goal-directed system, it just might not have the goal you want it to.

2faul_sname
At least some people are worried about the latter, for a very particular meaning of the word "goal". From that post: I think to some extent this is a matter of "yes, I see that you've solved the problem in practical terms, and yes, every time we try to implement the theoretically optimal solution it fails due to Goodharting, but we really want the theoretically optimal solution", which is... not universally agreed, to say the least. But it is a concern some people have.
2avturchin
If we find that AI can stop its random walk on a goal X, we can use this as an aimability instrument, and find a way to manipulate the position of X.
6Roko
The practical effect of very inaccurate guns in the past was that guns mattered less and battles were often won by bayonet charges or morale. So I think it's fair to conclude that Aimability just makes AI matter a lot more.
2MondSemmel
Friendship is Optimal, a My Little Pony fan fiction about an AGI takeover (?) scenario. 39k words. (I don't know the details, haven't read it.)
bideup62

I like the idea of a public research journal a lot, interested to see how this pans out!

bideup1510

You seem to be operating on a model that says “either something is obvious to a person, or it’s useful to remind them of it, but not both”, whereas I personally find it useful to be reminded of things that I consider obvious, and I think many others do too. Perhaps you don’t, but could it be the case that you’re underestimating the extent to which it applies to you too?

I think one way to understand it is to disambiguate ‘obvious’ a bit and distinguish what someone knows from what’s salient to them.

If someone reminds me that sleep is important and I thank ... (read more)

bideup82

I think it falls into the category of 'advice which is of course profoundly obvious but might not always occur to you', in the same vein as 'if you have a problem, you can try to solve it'.

When you're looking for something you've lost, it's genuinely helpful when somebody says 'where did you last have it?', and not just for people with some sort of looking-for-stuff-atypicality.

0AlphaAndOmega
I will regard with utter confusion someone who doesn't immediately think of the last place they saw something when they've lost it. It's fine to state the obvious on occasion, it's not always obvious to everyone, and like I said in the parent comment, this post seems to be liked/held useful by a significant number of LW users. I contend that's more of a property of said users. This does not make the post a bad thing or constitute a moral judgement!
bideup12

I think I practice something similar to this with selfishness: a load-bearing part of my epistemic rationality is having it feel acceptable that I sometimes (!) do things for selfish rather than altruistic reasons.

You can make yourself feel that selfish acts are unacceptable and hope this will make you very altruistic and not very selfish, but in practice it also makes you come up with delusional justifications as to why selfish acts are in fact altruistic.

From an impartial standpoint we can ask how much of the latter is worth it for how much of the former. I think one of life's repeated lessons is that sacrificing your epistemics for instrumental reasons is almost always a bad idea.

bideup10

Do people actually disapprove of and disagree with this comment, or do they disapprove of the use of said 'poetic' language in the post? If the latter, perhaps they should downvote the post and upvote the comment for honesty.

Perhaps there should be a react for "I disapprove of the information this comment revealed, but I'm glad it admitted it".

7Ege Erdil
As I said, I think it's not just that the language is poetic. There is an implicit inference that goes like:

1. People who would not voluntarily undergo surgery without long-term adverse effects on their health to improve the life of a stranger are evil.
2. Most researchers who would be in a position to know the state of the evidence on the long-term adverse health effects of kidney donation don't personally donate one of their kidneys.
3. Most researchers are unlikely to be evil.
4. So it's unlikely that most researchers believe kidney donation has no long-term adverse health effects.

I'm saying that there is no definition of the word "evil" that makes statements (1) and (3) simultaneously true. Either you adopt a narrow definition, in which case (3) is true but (1) is false; or you adopt a broad definition, in which case (1) is true but (3) is false. This is not a point about stylistic choices, it's undermining one of the key arguments the post offers for its position. The post is significantly stronger if it can persuade us that even established experts in the field agree with the author and the hypothesis being advanced is in some sense "mainstream", even if it's implicitly held.
bideup74

LLMs calculate pdfs, regardless of whether they calculate ‘the true’ pdf.
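A minimal sketch of the sense in which they do, using the standard Hugging Face transformers API with GPT-2 (my example, not something from the thread):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The cat sat on the", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # logits for the next token

probs = torch.softmax(logits, dim=-1)  # normalised distribution over the vocabulary
print(probs.sum().item())              # ~1.0: it's a pdf (strictly a pmf), whether or not it's 'the true' one
```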

bideup10

Sometimes I think trying to keep up with the endless stream of new papers is like watching the news - you can save yourself time and become better informed by reading up on history (ie classic papers/textbooks) instead.

This is a comforting thought, so I’m a bit suspicious of it. But also it’s probably more true for a junior researcher not committed to a particular subfield than someone who’s already fully specialised.

bideup*99

Sometimes such feelings are your system 1 tracking real/important things that your system 2 hasn’t figured out yet.

bideup1310

I’d like to see more posts using this format, including for theoretical research.

bideup1013

I vote singular learning theory gets priority (if there was ever a situation where one needed to get priority). I intuitively feel like research agendas or communities need an acronym more than concepts. Possibly because in the former case the meaning of the phrase becomes more detached from the individual meaning of the words than it does in the latter.

bideup399

Just wanted to say that I am a vegan and I’ve appreciated this series of posts.

I think the epistemic environment of my IRL circles has always been pretty good around veganism, and personally I recoil a bit from discussion of specific people or groups’ epistemic virtues or lack thereof (not sure if I think it’s unproductive or just find it aversive), so this particular post is of less interest to me personally. But I think your object-level discussion of the trade-offs of veganism has been consistently fantastic and I wanted to thank you for the contribution!

bideup20

Are Self Control and Freedom.to for different purposes or the same? Should I try multiple app/website blockers till I find one that's right for me, or is there an agreed upon best one that I can just adopt with no experimentation?

4Raemon
Freedom is overall best (it syncs across your devices and can block apps on desktop), but self control had a different mechanism that was harder to circumvent
4jacquesthibs
Used to be. Now Freedom is a lot better. They have an app to block you from your phone’s and tablet’s apps/websites (on top of your computer’s websites and apps). Self Control only blocks websites on mac. Self Control is free, but Freedom is worth paying for imo. And you can create custom block lists and scheduled block time in Freedom. I used to use Cold Turkey, which I liked, but Freedom is much better.
bideup10

Well, the joke does give a fair bit of information about both your politics and how widespread you think they are on LW. It might be very reasonable for someone to update their beliefs about LW politics based on seeing it. Then to what extent their conclusion mind-kills them is somewhat independent of the joke.

(I agree it’s a fairly trivial case, mostly discussing it out of interest in how our norms should work.)

1Oliver Sourbut
Yeah, interesting. FWIW I've never voted in the US (I'm British), and I've observed and discussed politics (broadly construed) being mind-killing. I weakly assess LW consensus to be 'obviously major candidates are all terrible'. Of course internet people don't know these facts unless they bother to check, which is an unreasonably high bar! But I do expect LW readers to understand mind-killing, and consider it common knowledge. Trying to learn from this thread. With the OP invoking recent US presidents as a topic of in-context flippancy and humour, it didn't even cross my mind that the joke wouldn't come across as being entirely about the ability of deepfakes to influence people's opinions (I could have punctuated it with any number of flippant fake observations, and it didn't seem important). Then the only real explanation for downvotes was mind-killed responses, but you've helped me realise this all wasn't obvious, and in hindsight I should have predicted that - thanks. Incidentally, this reminds me of the (folk?) claim about normativity along the lines of, 'most people don't believe the news, while believing most other people do believe the news'. Normally I think it's part of the mind-killing process that people much too frequently respond to things on the basis of some imagined third-party response. But regarding the question of 'is this political-flavoured sentence potentially mind-killingly potent' I can see why it'd be worth adopting a precautionary principle. (But then why is not the OP punished? After all, it's literally a politically-flavoured infohazard in multiple ways which I won't spell out. I happen to think it's an on-balance good one, but I also happen to think my throwaway remark was an on-balance good and harmless one.)
bideup83

My guess is that it's not that people are downvoting because they think you made a political statement which they oppose and they are mind-killed by it. Rather they think you made a political joke which has the potential to mind-kill others, and they would prefer you didn't.

That's why I downvoted, at least. The topic you mentioned doesn't arouse strong passions in me at all, and probably doesn't arouse strong passions in the average LW reader that much, but it does arouse strong passions in quite a large number of people, and when those people are here, I'd prefer such passions weren't aroused.

1Oliver Sourbut
Aha, thanks. That makes some sense! I generally don't expect people to get mind-killed by (what seem like) obvious jokes, but I guess now you mention it, I should probably entertain that as possible (but maybe not on LW?)
1blake.crypto
Mind-killed?
bideup53

Even now I would like it if you added an edit at the start to make it clearer what you’re doing! Before reading the replying comment and realising the context, I was mildly shocked by such potentially inflammatory speculation and downvoted.

bideup30

On the other hand, even the smallest of small towns in the UK has a wide variety of ethnic food. I think pretty much anywhere with a restaurant has a Chinese and an Indian, and usually a lot more.

bideup*21

Meta point: I think the forceful condescending tone is a bit inappropriate when you’re talking about a topic that you don’t necessarily know that much about.

You’ve flatly asserted that the entirety of game theory is built on an incorrect assumption, and whether or not you’re correct about that, it doesn’t seem like you’re that clued up on game theory.

Eliezer just about gets away with his tone because he knows whereof he speaks. But I would prefer it if he showed more humility, and I think if you’re writing about a topic while you’re learning the basic... (read more)

2Isaac King
I agree! The condescending tone was not intentional, I think it snuck in due to some lingering frustration with some people I've argued with in the past. (e.g. those who say that two-boxing is rational even when they admit it gets them less money.) I've tried to edit that out, let me know if there are any passages that still strike you as unnecessarily rude.
bideup20

Being able to deduce a policy from beliefs doesn’t mean that common knowledge of beliefs is required.

The common knowledge of policy thing is true but is external to the game. We don’t assume that players in prisoner’s dilemma know each others policies. As part of our analysis of the structure of the game, we might imagine that in practice some sort of iterative responding-to-each-other’s-policy thing will go on, perhaps because players face off regularly (but myopically), and so the policies selected will be optimal wrt each other. But this isn’t really a... (read more)

3TekhneMakre
Sure, I didn't say it was. I'm saying it's sufficient (given some assumptions), which is interesting. Sure, who's saying so? It's analyzed this way in the literature, and I think it's kind of natural; how else would you make the game be genuinely perfect information (in the intuitive sense), including the other agent, without just picking a policy?