LESSWRONG
LW

All of David Lorell's Comments + Replies

I would guess not. Some more guesses: If there's an increase in testosterone, I think it's pretty mild and likely to revert by some homeostatic mechanism. My understanding is that while sensitivity to DHT causes male pattern baldness on the scalp, it contributes to masculinizing hair-growth elsewhere (like facial hair. I'm pretty sure DHT is the major driver of male-type body-hair development in puberty.) So a woman taking 5-AR inhibitors might actually find the effect to be feminizing. I haven't looked into it much, but for the same reason it seems plausible that men on 5-AR inhibitors will have less intense beards than they might have otherwise. (But more head hair.)

johnswentworth's Shortform

David Lorell11d150

This seems basically right to me, yup. And, as you imply, I also think the rat-depression kicked in for me around the same time likely for similar reasons (though for me an at-least-equally large thing that roughly-coincided was the unexpected, disappointing and stressful experience of the funding landscape getting less friendly for reasons I don't fully understand.) Also some part of me thinks that the model here is a little too narrow but not sure yet in what way(s).

Eli's shortform feed

David Lorell13d40

Unrelated to the actual content of your post, but regarding your "pseudo-depression," I've written a bit about something that sounds damn close to what you describe, which I've been calling "rat depression." Listless but not "sad" is right on the mark.

The Value Proposition of Romantic Relationships

David Lorell1mo62

Fair enough, re: romantic movies showing female preferences*. (Though I don't watch many romance movies and would guess my gestalt impression is therefore more made up of romantic elements in the non-romance movies I do watch...)

*...maybe See below.

Two main thoughts:
1) I think I've lost track of what "male-coded" means and am not sure why it matters. I know that the women I'm closest too see it similarly to me. (Obvious selection effects there, of course.)
2) This aside you're replying to is a pet theory I haven't given much thought to that both men and wom... (read more)

6Viliam1mo

I think the stereotypes for choosing a (heterosexual) partner are the following: * The ideal man is strong, street-smart, agenty, high-status, rich. * The ideal woman is pretty, emotionally sensitive, sexually inexperienced. How this relates to supporting each other? The attributes of the ideal woman are unrelated to needing support. Actually, when a woman needs support, that is a perfect opportunity for a man to demonstrate his strength and resources; this is what the "damsel in distress" trope is about. But when a man needs support... well, apparently he is not strong/smart/agenty enough to help himself, so it kinda ruins his value on the dating market. So the situation is not symmetrical. Loyal partners will support each other, but for the man, it has a flavor of "by supporting you, I demonstrate my value, which makes our relationship stronger", while for the woman it has a flavor of "you lost some of your value, but I will support you loyally anyway". (That is the "she's not very happy about it" part. She now has to work harder than before, to get less of what she wanted.)

The Value Proposition of Romantic Relationships

David Lorell1mo329

I see it as a promise of intent on an abstract level moreso than a guarantee of any particular capability. Maybe more like, "I've got you, wherever/however I am able." And that may well look like traditional gendered roles of physical protection on one side and emotional support on the other, but doesn't have to.

I have sometimes tried to point at the core thing by phrasing it, not very romantically, as an adoption of and joint optimization of utility functions. That's what I mean, at least, when I make this "I got you" promise. And depending on the situati... (read more)

4Viliam1mo

I think this is evidence that supports the hypothesis that "I got you" is male-coded. Romantic movies are mostly made for women; it makes sense that they would portray female preferences.

The Value Proposition of Romantic Relationships

David Lorell1mo232

Not a full response to everything but:

As I mentioned in private correspondence, I think at least the "willingness to be vulnerable" is downstream of a more important thing which is upstream of other important things besides "willingness to be vulnerable." The way I've articulated that node so far is, "A mutual happy promise of, 'I got you' ". (And I still don't think that's quite all of the thing which you quoted me trying to describe.)

Willingness to be vulnerable is a thing that makes people good (or at least comfortable) at performance, public speaking, and truth or dare, but it's missing the expectation/hope that the other will protect and uplift that vulnerable core.

5johnswentworth1mo

I find the "mutual happy promise of 'I got you'" thing... suspicious. For starters, I think it's way too male-coded. Like, it's pretty directly evoking a "protector" role. And don't get me wrong, I would strongly prefer a woman who I could see as an equal, someone who would have my back as much as I have hers... but that's not a very standard romantic relationship. If anything, it's a type of relationship one usually finds between two guys, not between a woman and <anyone else her age>. (I do think that's a type of relationship a lot of guys crave, today, but romantic relationships are a relatively difficult place to satisfy that craving.) And the stereotypes do mostly match the relationships I see around me, in this regard. Even in quite equal happy relationships, like e.g. my parents, even to the extent the woman does sometimes have the man's back she's not very happy about it. To be comfortable opening up, one does need to at least trust that the other person will not go on the attack, but there's a big gap between that and active protection.

4Mo Putera1mo

Do you think Thane's value prop identification captures it?

Orienting Toward Wizard Power

David Lorell1mo40

I'm very glad you're in a better place now! It sounds like there was a lot going on for you and agree that, in circumstances like yours, bupropion is probably not the right starting point.

$500 + $500 Bounty Problem: Does An (Approximately) Deterministic Maximal Redund Always Exist?

David Lorell2mo72

Note to bounty hunters, since it's come up twice: An "approximately deterministic function of X" is one where, conditional on X, the entropy of F(X) is very small, though possibly nonzero. I.e. you have very nearly pinned down the value that F(X) takes on once you learn X. For conceptual intuition on the graphical representation, X approximately mediating between F(X) and F(X) (two random variables which always take on the same value as each other) means as always that there is approximately no further update on the value of either given the other, (despit... (read more)

5johnswentworth2mo

(Note that F, in David's notation here, is a stochastic function in general.)

Orienting Toward Wizard Power

David Lorell2mo62

Well but also kind of yes? Like agreed with what you said, but also the hypothesis is that there's a certain kind of depression-manifestation which is somewhat atypical and that we've seen bupropion work magic on.

*And that this sounds a lot like that manifestation. So it might be particularly good at giving John in particular (and me, and others) the Wizard spirit back.

4Garrett Baker2mo

This is true, but I read amitlevy49's comment as having an implicit "and therefore anyone who wants that kind of natural drive should take bupropion". I probably should've given more information in my response.

Orienting Toward Wizard Power

David Lorell2mo7118

Disclaimer: I am not a doctor and this is not medical advice. Do your own research.

In short: I experienced something similar. Garrett and I call it "Rat(ionalist) Depression." It manifested as similar to a loss/lessening of Will To Wizard Power as John uses the term here. Importantly: I wasn't "sad", or pessimistic about the future (AI risk aside,) or most other classical signs of depression; I was considered pretty well emotionally put-together by myself and my friends (throughout, and this has never stopped being true.) But at some point for reason... (read more)

3Lorxus6d

Reporting back in after having laid hands on some bupropion. I'm two days in and: * it's definitely doing anything helpful with respect to my senses of opportunity, pointfulness, allowedness/permission, and task initiation/switching/completion, among other subtle senses * it meshes just fine with my focus meds so far * the increase in anxiety has been real but thus far manageable * but also there's some weird effect where it's making me way more forgetful/absent-minded by way of that massively improved power at task-initiation and task-switching and sharpness of task-completion? I've already nearly mislaid things like pens, a pocket knife, my phone, assorted drinks.... and such because I set it down, finished a task, and then moved on to the next task * weak improvement to hedonic tone mostly flavored like ~internal burning/dynamoness? in a good way More to come, maybe.

Regex2mo*191

Took buproprion for years and while it did help with executive function, I was also half-insane that entire time (literal years from like 2015 to 2021). I guess it was hypomania? And to expand on 'half-insane' - one aspect of what I mean is was far too willing to accept ideas on a dime, and accepted background assumptions conspiracy theories made while only questioning their explicit claims. Misinformation feels like information! Overall there was a lack of sense of grounding to base conclusions on in the first place. I will note this still describes me so... (read more)

Hopenope2mo*130

As a doctor, I can tell you that even if you don’t have anxiety, it’s possible to develop some while taking bupropion/welbutrin. I used it personally and experienced the most severe anxiety I’ve ever had. It is also associated with a higher chance of seizures, and if you daydream a lot, it may make them worse. However, on the positive side, it often decreases inattention. Generally i like the drug , but it is not a first-line treatment for depression, and for good reasons.

3jmh2mo

Do you have an indications that those without the clinical signs of depression (or at least doctor approved state) won't become acclimated to the drug in a way those that perhaps need it for a balanced state don't? I suppose asking a bit differently here, what are the gears here that do the work and how well one might think they they match up with one's own system that is in place? Interesting though about using it to improve one's performance rather than just as an antidepressant or aid to quit smoking. the wiki has some good info but interesting that it doesn't have a strong effect on dopamine so makes me wonder if looking more at what norepinephrine does, or perhaps the ratio between norepinephrine and dopamine. Any consideration on the use of other NDRI drugs rather than Bupropion. I've not looked into much at this point but Bupropion does have some side effects I would not be too interested in experiencing.

Raemon2mo134

I also had a pretty similar experience.

Orienting Toward Wizard Power

David Lorell2mo53

🫡 I have pitched him. (Also agreed strongly on point 1. And tentatively agree on your point about the primary bottleneck.)

stavros2mo1712

I'm curious about this pitch :)

johnswentworth's Shortform

David Lorell7mo70

Sounds plausible. Is that 50% of coding work that the LLMs replace of a particular sort, and the other 50% a distinctly different sort?

johnswentworth's Shortform

David Lorell7mo63

My impression is that they are getting consistently better at coding tasks of a kind that would show up in the curriculum of an undergrad CS class, but much more slowly improving at nonstandard or technical tasks.

johnswentworth's Shortform

David Lorell7mo134

I do use LLMs for coding assistance every time I code now, and I have in fact noticed improvements in the coding abilities of the new models, but I basically endorse this. I mostly make small asks of the sort that sifting through docs or stack-overflow would normally answer. When I feel tempted to make big asks of the models, I end up spending more time trying to get the LLMs to get the bugs out than I'd have spent writing it all myself, and having the LLM produce code which is "close but not quite and possibly buggy and possibly subtly so" that I then hav... (read more)

5Nathan Helm-Burger7mo

I find them quite useful despite being buggy. I spend about 40% of my time debugging model code, 50% writing my own code, and 10% prompting. Having a planning discussion first with s3.6, and asking it to write code only after 5 or more exchanges works a lot better. Also helpful is asking for lots of unit tests along the way yo confirm things are working as you expect.

johnswentworth's Shortform

David Lorell8mo93

I think that "getting good" at the "free association" game is in finding the sweet spot / negotiation between full freedom of association and directing toward your own interests, probably ideally with a skew toward what the other is interested in. If you're both "free associating" with a bias toward your own interests and an additional skew toward perceived overlap, updating on that understanding along the way, then my experience says you'll have a good chance of chatting about something that interests you both. (I.e. finding a spot of conversation which b... (read more)

David Lorell9mo20

wiggitywiggitywact := fact about the world which requires a typical human to cross a large inferential gap.

David Lorell9mo20

wact := fact about the world
mact := fact about the mind
aact := fact about the agent more generally

vwact := value assigned by some agent to a fact about the world

2johnswentworth9mo

Spitballing: * "local fact" vs "global fact" (to evoke local/global variables) * "local fact" vs "interoperable fact" * "internal fact" vs "interoperable fact" * "fact valence" for the value stuff

David Lorell9mo20

Seems accurate to me. This has been an exercise in the initial step(s) of CCC, which indeed consist of "the phenomenon looks this way to me. It also looks that way to others? Cool. What are we all cottoning on to?"

David Lorell9mo20

Wait. I thought that was crossing the is-ought gap. As I think of it, the is ought gap refers to the apparent type-clash and unclear evidential entanglement between facts-about-the-world and values-an-agent-assigns-to-facts-about-the-world. And also as I think of it, "should be" always is short hand for "should be according to me" though possibly means some kind of aggregated thing but also ground out in subjective shoulds.

So "how the external world is" does not tell us "how the external world should be" .... except in so far as the external world has beco... (read more)

2johnswentworth9mo

Needs jargon also needs jargon ...

David Lorell9mo20

We have at least one jury rigged idea! Conceptually. Kind of.

David Lorell9mo20

Yeeeahhh.... But maybe it's just awkwardly worded rather than being deeply confused. Like: "The learned algorithms which an adaptive system implements may not necessarily accept, output, or even internally use data(structures) which have any relationship at all to some external environment." "Also what the hell is 'reference'."

David Lorell9mo20

Seconded. I have extensional ideas about "symbolic representations" and how they differ from.... non-representations.... but I would not trust this understanding with much weight.

David Lorell9mo20

Seconded. Comments above.

David Lorell9mo20

Indeed, our beliefs-about-values can be integrated into the same system as all our other beliefs, allowing for e.g. ordinary factual evidence to become relevant to beliefs about values in some cases.

Super unclear to the uninitiated what this means. (And therefore threateningly confusing to our future selves.)

Maybe: "Indeed, we can plug 'value' variables into our epistemic models (like, for instance, our models of what brings about reward signals) and update them as a result of non-value-laden facts about the world."

David Lorell9mo20

But clearly the reward signal is not itself our values.

Ahhhh

Maybe: "But presumably the reward signal does not plug directly into the action-decision system."?

Or: "But intuitively we do not value reward for its own sake."?

David Lorell9mo20

It does seem like humans have some kind of physiological “reward”, in a hand-wavy reinforcement-learning-esque sense, which seems to at least partially drive the subjective valuation of things.

Hrm... If this compresses down to, "Humans are clearly compelled at least in part by what 'feels good'." then I think it's fine. If not, then this is an awkward sentence and we should discuss.

David Lorell9mo20

an agent could aim to pursue any values regardless of what the world outside it looks like;

Without knowing what values are, it's unclear that an agent could aim to pursue any of them. The implicit model here is that there is something like a value function in DP which gets passed into the action-decider along with the world model and that drives the agent. But I think we're saying something more general than that.

David Lorell9mo20

but the fact that it makes sense to us to talk about our beliefs

Better terminology for the phenomenon of "making sense" in the above way?

David Lorell9mo20

“learn” in the sense that their behavior adapts to their environment.

I want a new word for this. "Learn" vs "Adapt" maybe. Learn means updating of symbolic references (maps) while Adapt means something like responding to stimuli in a systematic way.

We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap

David Lorell9mo40

Not quite what we were trying to say in the post. Rather than tradeoffs being decided on reflection, we were trying to talk about the causal-inference-style "explaining away" which the reflection gives enough compute for. In Johannes's example, the idea is that the sadist might model the reward as coming potentially from two independent causes: a hardcoded sadist response, and "actually" valuing the pain caused. Since the probability of one cause, given the effect, goes down when we also know that the other cause definitely obtained, the sadist might lower... (read more)

We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap

David Lorell9mo40

Suppose you have a randomly activated (not dependent on weather) sprinkler system, and also it rains sometimes. These are two independent causes for the sidewalk being wet, each of which are capable of getting the job done all on their own. Suppose you notice that the sidewalk is wet, so it definitely either rained, sprinkled, or both. If I told you it had rained last night, your probability that the sprinklers went on (given that it is wet) should go down, since they already explain the wet sidewalk. If I told you instead that the sprinklers went on last ... (read more)

4Raemon9mo

Okay, I think one crystallization here for me is that "explaining away" is a matter of degree. (I think I found the second half of the comment less helpful, but the combo of the first half + John's response is helpful both for my own updating, and seeing where you guys are currently at)

... Wait, our models of semantics should inform fluid mechanics?!?

David Lorell10mo60

This is fascinating and I would love to hear about anything else you know of a similar flavor.

Seconded!!

5JenniferRM10mo

Alright! I'm going to try to stick to "biology flavored responses" and "big picture stuff" here, maybe? And see if something conversational happens? <3 (I attempted several responses in the last few days and each sketch turned into a sprawling messes that became a "parallel comment". Links and summaries at the bottom.) The thing that I think unifies these two attempts at comments is a strong hunch that "human language itself is on the borderland of being anti-epistemic". Like... like I think humans evolved. I think we are animals. I think we individually grope towards learning the language around us and always fail. We never "get to 100%". I think we're facing a "streams of invective" situation by default. I think prairie dogs have some kind of chord-based chirp system that works like human natural language noun phrases do because noun-phrases are convergently useful. And they are flexible-and-learned enough for them to have regional dialects. I think elephants have personal names to help them manage moral issues and bad-actor-detection that arise in their fission-fusion social systems, roughly as humans do, because personal names are convergently useful for managing reputation and tracking loyalty stuff in very high K family systems. I think humans evolved under Malthusian conditions and that there's lots of cannibalism in our history and that we use social instincts to manage groups that manage food shortages (who semi-reliably go to war when hungry). If you're not tracking such latent conflict somehow then you're missing something big. I think human languages evolve ON TOP of human speech capacities, and I follow McWhorter in thinking that some languages are objectively easy (because of being learned by many as a second language (for trade or slavery or due to migration away from the horrors of history or whatever)) and others are objectively hard (because of isolation and due to languages naturally becoming more difficult over time, after a disruption-ca

Most People Don't Realize We Have No Idea How Our AIs Work

David Lorell2y133

Anecdotal 2¢: This is very accurate in my experience. Basically every time I talk to someone outside of tech/alignment about AI risk, I have to go through the whole "we don't know what algorithms the AI is running to do what it does. Yes, really." thing. Every time I skip this accidentally, I realize after a while that this is where a lot of confusion is coming from.

On Trust

David Lorell2y*73

1. "Trust" does seem to me to often be an epistemically broken thing that rides on human-peculiar social dynamics and often shakes out to gut-understandings of honor and respect and loyalty etc.

2. I think there is a version that doesn't route through that stuff. Trust in the "trust me" sense is a bid for present-but-not-necessarily-permanent suspension of disbelief, where the stakes are social credit. I.e. When I say, "trust me on this," I'm really saying something like, "All of that anxious analysis you might be about to do to determine if X is true... (read more)

5AnthonyC2y

I think this is a big part of it. It can also include, "I have information I'm not supposed to share, don't know how to share, or don't have time to share."

Why Not Subagents?

David Lorell2y140

Some nits we know about but didn't include in the problems section:

P[mushroom->anchovy] = 0. The current argument does not handle the case where subagents believe that there is a probability of 0 on one of the possible states. It wouldn't be possible to complete the preferences exactly as written, then.
Indifference. If anchovy were placed directly above mushroom in the preference graph above (so that John is truly indifferent between them), then that might require some special handling. But also it might just work if the "Value vs Utility" issue is work

... (read more)

Would more model evals teams be good?

David Lorell2y31

Might be worth thinking about / comparing how and why things went wrong to produce the 2007/8 GFC. iirc credit raters had misaligned incentives that rhyme with this question/post.

evhub's Shortform

David Lorell3yΩ9140

Disclaimer: At the time of writing, this has not been endorsed by Evan.

I can give this a go.

Unpacking Evan's Comment:
My read of Evan's comment (the parent to yours) is that there are a bunch of learned high-level-goals ("strategies") with varying levels of influence on the tactical choices made, and that a well-functioning end-to-end credit-assignment mechanism would propagate through action selection ("thoughts directly related to the current action" or "tactics") all the way to strategy creation/selection/weighting. In such a system, strategies which dec... (read more)

2TurnTrout3y

Thanks for the story! I may comment more on it later.

Where I agree and disagree with Eliezer

David Lorell3y10

This feels like stepping on a rubber duck while tip-toeing around sleeping giants but:

Don't these analogies break if/when the complexity of the thing to generate/verify gets high enough? That is, unless you think the difficulty of verification of arbitrarily complex plans/ideas is asymptotic to some human-or-lower level of verification capability (which I doubt you do) then at some point humans can't even verify the complex plan.

So, the deeper question just seems to be takeoff speeds again: If takeoff is too fast, we don't have enough time to use "weak" AG... (read more)

Don't leave your fingerprints on the future

David Lorell3y41

But I'm not really accusing y'all of saying "try to produce a future that has no basis in human values." I am accusing this post of saying "there's some neutral procedure for figuring out human values, we should use that rather than a non-neutral procedure."

My read was more "do the best we can to get through the acute risk period in a way that lets humanity have the time and power to do the best it can at defining/creating a future full of value." And that's in response and opposed to positions like "figure out / decide what is best for humanity (or a procedure that can generate the answer to that) and use that to shape the long term future."

Don't leave your fingerprints on the future

David Lorell3y10

The point is that as moral attitudes/thoughts change, societies or individuals which exist long enough will likely come to regret permanently structuring the world according to the morality of a past age. The Roman will either live to regret it, or the society that follows the Roman will come to regret it even if the Roman dies happy, or the AI is brainwashing everyone all the time to prevent moral progress. The analogy breaks down a bit with the third option since I'd guess most people today would not accept it as a success and it's today's(ish) morals that might get locked in, not ancient Rome's.