This seems basically right to me, yup. And, as you imply, I also think the rat-depression kicked in for me around the same time, likely for similar reasons (though for me an at-least-equally-large thing that roughly coincided was the unexpected, disappointing, and stressful experience of the funding landscape getting less friendly, for reasons I don't fully understand). Also some part of me thinks that the model here is a little too narrow, but I'm not sure yet in what way(s).
Unrelated to the actual content of your post, but regarding your "pseudo-depression," I've written a bit about something that sounds damn close to what you describe, which I've been calling "rat depression." Listless but not "sad" is right on the mark.
Fair enough, re: romantic movies showing female preferences*. (Though I don't watch many romance movies and would guess my gestalt impression is therefore more made up of romantic elements in the non-romance movies I do watch...)
*...maybe. See below.
Two main thoughts:
1) I think I've lost track of what "male-coded" means and am not sure why it matters. I know that the women I'm closest to see it similarly to me. (Obvious selection effects there, of course.)
2) This aside you're replying to is a pet theory I haven't given much thought to: that both men and wom...
I see it as a promise of intent on an abstract level moreso than a guarantee of any particular capability. Maybe more like, "I've got you, wherever/however I am able." And that may well look like traditional gendered roles of physical protection on one side and emotional support on the other, but doesn't have to.
I have sometimes tried to point at the core thing by phrasing it, not very romantically, as an adoption of and joint optimization of utility functions. That's what I mean, at least, when I make this "I got you" promise. And depending on the situati...
Not a full response to everything but:
As I mentioned in private correspondence, I think at least the "willingness to be vulnerable" is downstream of a more important thing which is upstream of other important things besides "willingness to be vulnerable." The way I've articulated that node so far is, "A mutual happy promise of, 'I got you' ". (And I still don't think that's quite all of the thing which you quoted me trying to describe.)
Willingness to be vulnerable is a thing that makes people good (or at least comfortable) at performance, public speaking, and truth or dare, but it's missing the expectation/hope that the other will protect and uplift that vulnerable core.
I'm very glad you're in a better place now! It sounds like there was a lot going on for you, and I agree that, in circumstances like yours, bupropion is probably not the right starting point.
Note to bounty hunters, since it's come up twice: An "approximately deterministic function of X" is one where, conditional on X, the entropy of F(X) is very small, though possibly nonzero. I.e. you have very nearly pinned down the value that F(X) takes on once you learn X. For conceptual intuition on the graphical representation, X approximately mediating between F(X) and F(X) (two random variables which always take on the same value as each other) means as always that there is approximately no further update on the value of either given the other (despit...
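In symbols, the condition I have in mind (my notation, just to pin it down, not anything official from the post) is that the conditional entropy is small:

```latex
% "F(X) is an approximately deterministic function of X":
H\big(F(X) \mid X\big) \approx 0
% This is the same quantity as the conditional mutual information between the
% two copies of F(X) given X, so X approximately mediates between them:
I\big(F(X) ;\, F(X) \mid X\big) = H\big(F(X) \mid X\big) \approx 0
```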
Well but also kind of yes? Like agreed with what you said, but also the hypothesis is that there's a certain kind of depression-manifestation which is somewhat atypical and that we've seen bupropion work magic on.
*And that this sounds a lot like that manifestation. So it might be particularly good at giving John in particular (and me, and others) the Wizard spirit back.
Disclaimer: I am not a doctor and this is not medical advice. Do your own research.
In short: I experienced something similar. Garrett and I call it "Rat(ionalist) Depression." It manifested as something like a loss/lessening of Will To Wizard Power, as John uses the term here. Importantly: I wasn't "sad", or pessimistic about the future (AI risk aside), or showing most other classical signs of depression; I was considered pretty well emotionally put-together by myself and my friends (throughout, and this has never stopped being true). But at some point for reason...
Took bupropion for years, and while it did help with executive function, I was also half-insane that entire time (literal years, from like 2015 to 2021). I guess it was hypomania? And to expand on 'half-insane' - one aspect of what I mean is that I was far too willing to accept ideas on a dime, and accepted the background assumptions conspiracy theories made while only questioning their explicit claims. Misinformation feels like information! Overall there was a lack of a sense of grounding to base conclusions on in the first place. I will note this still describes me so...
As a doctor, I can tell you that even if you don’t have anxiety, it’s possible to develop some while taking bupropion/Wellbutrin. I used it personally and experienced the most severe anxiety I’ve ever had. It is also associated with a higher chance of seizures, and if you daydream a lot, it may make that worse. However, on the positive side, it often decreases inattention. Generally I like the drug, but it is not a first-line treatment for depression, and for good reasons.
I also had a pretty similar experience.
🫡 I have pitched him. (Also agreed strongly on point 1. And tentatively agree on your point about the primary bottleneck.)
I'm curious about this pitch :)
Sounds plausible. Is the 50% of coding work that the LLMs replace of a particular sort, and the other 50% of a distinctly different sort?
My impression is that they are getting consistently better at coding tasks of a kind that would show up in the curriculum of an undergrad CS class, but much more slowly improving at nonstandard or technical tasks.
I do use LLMs for coding assistance every time I code now, and I have in fact noticed improvements in the coding abilities of the new models, but I basically endorse this. I mostly make small asks of the sort that sifting through docs or stack-overflow would normally answer. When I feel tempted to make big asks of the models, I end up spending more time trying to get the LLMs to get the bugs out than I'd have spent writing it all myself, and having the LLM produce code which is "close but not quite and possibly buggy and possibly subtly so" that I then hav...
I think that "getting good" at the "free association" game is in finding the sweet spot / negotiation between full freedom of association and directing toward your own interests, probably ideally with a skew toward what the other is interested in. If you're both "free associating" with a bias toward your own interests and an additional skew toward perceived overlap, updating on that understanding along the way, then my experience says you'll have a good chance of chatting about something that interests you both. (I.e. finding a spot of conversation which b...
wiggitywiggitywact := fact about the world which requires a typical human to cross a large inferential gap.
wact := fact about the world
mact := fact about the mind
aact := fact about the agent more generally
vwact := value assigned by some agent to a fact about the world
Seems accurate to me. This has been an exercise in the initial step(s) of CCC, which indeed consist of "the phenomenon looks this way to me. It also looks that way to others? Cool. What are we all cottoning on to?"
Wait. I thought that was crossing the is-ought gap. As I think of it, the is-ought gap refers to the apparent type-clash and unclear evidential entanglement between facts-about-the-world and values-an-agent-assigns-to-facts-about-the-world. And also as I think of it, "should be" is always shorthand for "should be according to me", though it possibly means some kind of aggregated thing that also grounds out in subjective shoulds.
So "how the external world is" does not tell us "how the external world should be" .... except in so far as the external world has beco...
We have at least one jury-rigged idea! Conceptually. Kind of.
Yeeeahhh.... But maybe it's just awkwardly worded rather than being deeply confused. Like: "The learned algorithms which an adaptive system implements may not necessarily accept, output, or even internally use data(structures) which have any relationship at all to some external environment." "Also what the hell is 'reference'."
Seconded. I have extensional ideas about "symbolic representations" and how they differ from.... non-representations.... but I would not trust this understanding with much weight.
Seconded. Comments above.
Indeed, our beliefs-about-values can be integrated into the same system as all our other beliefs, allowing for e.g. ordinary factual evidence to become relevant to beliefs about values in some cases.
Super unclear to the uninitiated what this means. (And therefore threateningly confusing to our future selves.)
Maybe: "Indeed, we can plug 'value' variables into our epistemic models (like, for instance, our models of what brings about reward signals) and update them as a result of non-value-laden facts about the world."
But clearly the reward signal is not itself our values.
Ahhhh
Maybe: "But presumably the reward signal does not plug directly into the action-decision system."?
Or: "But intuitively we do not value reward for its own sake."?
It does seem like humans have some kind of physiological “reward”, in a hand-wavy reinforcement-learning-esque sense, which seems to at least partially drive the subjective valuation of things.
Hrm... If this compresses down to, "Humans are clearly compelled at least in part by what 'feels good'." then I think it's fine. If not, then this is an awkward sentence and we should discuss.
an agent could aim to pursue any values regardless of what the world outside it looks like;
Without knowing what values are, it's unclear that an agent could aim to pursue any of them. The implicit model here is that there is something like a value function in DP which gets passed into the action-decider along with the world model and that drives the agent. But I think we're saying something more general than that.
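To be concrete about the implicit model I mean, here's a minimal sketch of that DP-style picture (all names hypothetical; this is an illustration, not anything from the post):

```python
# Sketch of the implicit picture: an "action-decider" that takes a world model
# and a value function and picks actions by one-step lookahead.

def decide_action(state, actions, world_model, value_fn):
    """Pick the action whose predicted outcomes the value function scores highest."""
    def expected_value(action):
        # world_model(state, action) -> iterable of (next_state, probability) pairs
        return sum(prob * value_fn(next_state)
                   for next_state, prob in world_model(state, action))
    return max(actions, key=expected_value)
```

The point being that, in this picture, the "values" just are whatever value_fn is, and that's exactly the thing left unspecified.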
but the fact that it makes sense to us to talk about our beliefs
Better terminology for the phenomenon of "making sense" in the above way?
“learn” in the sense that their behavior adapts to their environment.
I want a new word for this. "Learn" vs "Adapt" maybe. Learn means updating of symbolic references (maps) while Adapt means something like responding to stimuli in a systematic way.
Not quite what we were trying to say in the post. Rather than tradeoffs being decided on reflection, we were trying to talk about the causal-inference-style "explaining away" which the reflection gives enough compute for. In Johannes's example, the idea is that the sadist might model the reward as coming potentially from two independent causes: a hardcoded sadist response, and "actually" valuing the pain caused. Since the probability of one cause, given the effect, goes down when we also know that the other cause definitely obtained, the sadist might lower...
Suppose you have a randomly activated (not dependent on weather) sprinkler system, and also it rains sometimes. These are two independent causes for the sidewalk being wet, each of which are capable of getting the job done all on their own. Suppose you notice that the sidewalk is wet, so it definitely either rained, sprinkled, or both. If I told you it had rained last night, your probability that the sprinklers went on (given that it is wet) should go down, since they already explain the wet sidewalk. If I told you instead that the sprinklers went on last ...
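If it helps, here's a toy numerical version of the same setup (the priors and the "wet if it rained or sprinkled" rule are made-up illustration values, not anything from the post):

```python
# Toy check of "explaining away": two independent causes of a wet sidewalk.
from itertools import product

P_RAIN, P_SPRINKLER = 0.3, 0.3  # independent prior probabilities of each cause

def joint():
    """Enumerate the joint distribution over (rain, sprinkler, wet)."""
    for rain, sprinkler in product([True, False], repeat=2):
        p = (P_RAIN if rain else 1 - P_RAIN) * (P_SPRINKLER if sprinkler else 1 - P_SPRINKLER)
        wet = rain or sprinkler  # either cause suffices on its own
        yield rain, sprinkler, wet, p

def prob_sprinkler(given):
    """P(sprinkler | given), where `given` is a predicate on (rain, sprinkler, wet)."""
    num = sum(p for r, s, w, p in joint() if s and given(r, s, w))
    den = sum(p for r, s, w, p in joint() if given(r, s, w))
    return num / den

print(prob_sprinkler(lambda r, s, w: w))        # P(sprinkler | wet)       ~ 0.59
print(prob_sprinkler(lambda r, s, w: w and r))  # P(sprinkler | wet, rain) = 0.30
```

The second probability drops back to the prior: once you know it rained, the wet sidewalk carries no additional evidence about the sprinkler.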
This is fascinating and I would love to hear about anything else you know of a similar flavor.
Seconded!!
Anecdotal 2¢: This is very accurate in my experience. Basically every time I talk to someone outside of tech/alignment about AI risk, I have to go through the whole "we don't know what algorithms the AI is running to do what it does. Yes, really." thing. Every time I skip this accidentally, I realize after a while that this is where a lot of confusion is coming from.
1. "Trust" does seem to me to often be an epistemically broken thing that rides on human-peculiar social dynamics and often shakes out to gut-understandings of honor and respect and loyalty etc.
2. I think there is a version that doesn't route through that stuff. Trust in the "trust me" sense is a bid for present-but-not-necessarily-permanent suspension of disbelief, where the stakes are social credit. I.e. When I say, "trust me on this," I'm really saying something like, "All of that anxious analysis you might be about to do to determine if X is true...
Some nits we know about but didn't include in the problems section:
Might be worth thinking about / comparing how and why things went wrong in the run-up to the 2007/8 GFC. IIRC, credit raters had misaligned incentives that rhyme with this question/post.
Disclaimer: At the time of writing, this has not been endorsed by Evan.
I can give this a go.
Unpacking Evan's Comment:
My read of Evan's comment (the parent to yours) is that there are a bunch of learned high-level-goals ("strategies") with varying levels of influence on the tactical choices made, and that a well-functioning end-to-end credit-assignment mechanism would propagate through action selection ("thoughts directly related to the current action" or "tactics") all the way to strategy creation/selection/weighting. In such a system, strategies which dec...
This feels like stepping on a rubber duck while tip-toeing around sleeping giants but:
Don't these analogies break if/when the complexity of the thing to generate/verify gets high enough? That is, unless you think the difficulty of verifying arbitrarily complex plans/ideas asymptotes to some human-or-lower level of verification capability (which I doubt you do), at some point humans can't even verify the complex plan.
So, the deeper question just seems to be takeoff speeds again: If takeoff is too fast, we don't have enough time to use "weak" AG...
But I'm not really accusing y'all of saying "try to produce a future that has no basis in human values." I am accusing this post of saying "there's some neutral procedure for figuring out human values, we should use that rather than a non-neutral procedure."
My read was more "do the best we can to get through the acute risk period in a way that lets humanity have the time and power to do the best it can at defining/creating a future full of value." And that's in response and opposed to positions like "figure out / decide what is best for humanity (or a procedure that can generate the answer to that) and use that to shape the long term future."
The point is that as moral attitudes/thoughts change, societies or individuals which exist long enough will likely come to regret permanently structuring the world according to the morality of a past age. The Roman will either live to regret it, or the society that follows the Roman will come to regret it even if the Roman dies happy, or the AI is brainwashing everyone all the time to prevent moral progress. The analogy breaks down a bit with the third option since I'd guess most people today would not accept it as a success and it's today's(ish) morals that might get locked in, not ancient Rome's.
I would guess not. Some more guesses: If there's an increase in testosterone, I think it's pretty mild and likely to revert by some homeostatic mechanism. My understanding is that while sensitivity to DHT causes male pattern baldness on the scalp, it contributes to masculinizing hair-growth elsewhere (like facial hair. I'm pretty sure DHT is the major driver of male-type body-hair development in puberty.) So a woman taking 5-AR inhibitors might actually find the effect to be feminizing. I haven't looked into it much, but for the same reason it seems plausible that men on 5-AR inhibitors will have less intense beards than they might have otherwise. (But more head hair.)