LESSWRONG

All of cubefox's Comments + Replies

Our Reality: A Simulation Run by a Paperclip Maximizer

Dreams exhibit many incoherencies. You can notice them and become "lucid". Video games are also incoherent. They don't obey some simple but extremely computationally demanding laws. They instead obey complicated laws that are not very computationally demanding. They cheat with physics for efficiency reasons, and those cheats are very obvious. Our real physics, however, hasn't uncovered such apparent cheats. Physics doesn't seem incoherent; it doesn't resemble a video game or a dream.

Wei Dai's Shortform

cubefox18h51

Can this be summarized as "don't optimize for what you believe is good too hard, as you might be mistaken about what is good"?

Vale's Shortform

cubefox2d30

It was both: in the system prompt, the model was instructed to end the conversation if in disagreement with the user. You could also ask it to end the conversation. It would presumably send an end-of-conversation token, which then made the text box disappear.

jenn's Shortform

cubefox11d20

i kinda thought that ey's anti-philosophy stance was a bit extreme but this is blackpilling me pretty hard lmao

He actually cites reflective equilibrium here:

Closest antecedents in academic metaethics are Rawls and Goodman's reflective equilibrium, Harsanyi and Railton's ideal advisor theories, and Frank Jackson's moral functionalism.

Mo Putera's Shortform

cubefox11d20

If Thurston is right here and mathematicians want to understand why some theorem is true (rather than to just know the truth values of various conjectures), and if we "feel the AGI" ... then it seems future "mathematics" will consist in "mathematicians" asking future ChatGPT to explain math to them. Whether something is true, and why. There would be no research anymore.

The interesting question is, I think, whether less-than-fully-general systems, like reasoning LLMs, could outperform humans in mathematical research. Or whether this would require a full AGI that is also smarter than mathematicians. Because if we had the latter, it would likely be an ASI that is better than humans in almost everything, not just mathematics.

ryan_greenblatt's Shortform

cubefox11d30

I think when people use the term "gradual disempowerment" predominantly in one sense, people will also tend to understand it in that sense. And I think that sense will be rather literal and not the one specifically of the original authors. Compare the term "infohazard" which is used differently (see comments here) from how Yudkowsky was using it.

Leon Lang's Shortform

cubefox13d1716

Unrelated to vagueness, they can also just change the framework again at any time.

How to Defend the Indefensible

cubefox13d30

Reminds me of Schopenhauer's posthumously published manuscript The Art of Being Right: 38 Ways to Win an Argument.

Richard Ngo's Shortform

cubefox16d*50

In Richard Jeffrey's utility theory there is actually a very natural distinction between positive and negative motivations/desires. A plausible axiom is $U(\top) = 0$ (the tautology has zero desirability: you already know it's true). Together with the main axiom^[1], this implies that the negation of any proposition with positive utility has negative utility, and vice versa. This is intuitive: if something is good, its negation is bad, and the other way round. In particular, if $U(X) = U(\neg X)$ (indifference between $X$ and $\neg X$), then $U(X) = U(\neg X) = 0$.

More generally, $U (\neg X) = - (P (X) / P (\neg X$ ... (read more)
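The sign relationship can be checked numerically. The sketch below assumes that the "main axiom" is Jeffrey's averaging axiom for mutually exclusive propositions, $U(A \lor B) = (P(A)U(A) + P(B)U(B)) / (P(A) + P(B))$; the function names and the example numbers are illustrative and not taken from the comment.

```python
def utility_of_disjunction(p_a, u_a, p_b, u_b):
    """Jeffrey's averaging axiom for mutually exclusive A and B (assumed here)."""
    return (p_a * u_a + p_b * u_b) / (p_a + p_b)


def utility_of_negation(p_x, u_x):
    """Solve U(T) = 0 for U(not X), using T = (X or not X) and P(not X) = 1 - P(X)."""
    if not 0.0 < p_x < 1.0:
        raise ValueError("P(X) must be strictly between 0 and 1")
    return -(p_x / (1.0 - p_x)) * u_x


# Hypothetical example: X is fairly unlikely and desirable.
p_x, u_x = 0.25, 3.0
u_not_x = utility_of_negation(p_x, u_x)

# Plugging both back into the averaging axiom recovers U(T) = 0 (up to rounding).
u_top = utility_of_disjunction(p_x, u_x, 1.0 - p_x, u_not_x)
```

Under these assumptions a positive `u_x` always yields a negative `u_not_x`, and indifference (`u_x == u_not_x`) is only possible at zero.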

Six reasons why objective morality is nonsense

cubefox17d20

conducive to well-being

That in itself isn't a good definition, because it doesn't distinguish ethics from, e.g., medicine... and it doesn't tell you whose well-being. De facto, people are ethically obliged to do things which are against their well-being and refrain from doing some things which promote their own well-being... I can't rob people to pay my medical bills.

Promoting your own well-being only would be egoism, while ethics seems to be more similar to altruism.

Whose desires?

I guess of all beings that are conscious. Perhaps relative to their degr... (read more)

2TAG14d

Well, yes. (You don't have to start from a tabula rasa and then proceed in baby steps, since there is a lot of prior art.) But "desires" is not how "ethics" is defined in standard dictionaries or philosophy. It's not "the" definition. That's irrelevant. Rival theories still need shared connotation.

Paper

cubefox18d20

Related: zettelkasten, a different note-taking method, where each card gets an address.

Six reasons why objective morality is nonsense

cubefox18d20

Many attempts at establishing an objective morality try to argue from considerations of human well-being. OK, but who decided that human well-being is what is important? We did!

That's a rather minimal amount of subjectivism. Everything downstream of that can be objective, so it's really a compromise position.

It's also possible (and I think very probable) that "ethical" means something like "conducive to well-being". Similar to how "tree" means something like "plant with a central wooden trunk". Imagine someone objecting: "OK, but who decided that tr... (read more)

2TAG17d

That in itself isn't a good definition, because it doesn't distinguish ethics from, e.g., medicine... and it doesn't tell you whose well-being. De facto, people are ethically obliged to do things which are against their well-being and refrain from doing some things which promote their own well-being... I can't rob people to pay my medical bills. (People also receive objective punishments, which makes an objective approach domestics justifiable). Whose desires? Why? They could also be seen as attempts to find different denotations of a term with shared connotation. Disagreement, as opposed to talking-past, requires some commonality. (And utilitarianism is a terrible theory of obligation, a standard objection which its rationalist admirers have no novel response to.)

Taxonomy of possibility

cubefox19d21

That's some careful analysis!

Two remarks:

1

"Can" is the opposite of "unable". "Unable" means that the change involves granting ability to they who would act, i.e. teaching a technique, providing a tool, fixing the body, or altering the environment.

That's a good characterization, though arguably not a definition, as it relies on "ability", which is circular. I can do something = I have the ability to do something. I can = I'm able to.

But we can use the initial principle (it really needs a name) which doesn't mention ability:

You do a thing iff you can

cubefox19d4-2

Your headline overstates the results. The last common ancestor of birds and mammals probably wasn't exactly unintelligent. (In contrast to our last common ancestor with the octopus, as the article discusses.)

Introduction to Representing Sentences as Logical Statements

cubefox22d20

"the" supposes there's exactly one canonical choice for what object in the context is indicated by the predicate. When you say "the cat" there's basically always a specific cat from context you're talking about. "The cat is in the garden" is different from "There's exactly one cat in the garden".

Yes, we have a presupposition that there is exactly one cat. But that presupposition is the same regardless of the actual number of cats (regardless of the context), because the "context" here is a feature of the external world ("territory"), while the belief is... (read more)

1Towards_Keeperhood22d

Thanks for clarifying. I mean I do think it can happen in my system that you allocate an object for something that's actually 0 or >1 objects, and I don't have a procedure for resolving such map-territory mismatches yet, though I think it's imaginable to have a procedure that defines new objects and tries to edit all the beliefs associated with the old object. I definitely haven't described how we determine when to create a new object to add to our world model, but one could imagine an algorithm checking when there's some useful latent for explaining some observations, and then constructing a model for that object, and then creating a new object in the abstract reasoning engine. Yeah there's still open work to do for how a correspondence between the constant symbol for our object and our (e.g. visual) model of the object can be formalized and used, but I don't see why it wouldn't be feasible. I agree that we end up with a map that doesn't actually fit the territory, but I think it's fine if there's an unresolvable mismatch somewhere. There's still a useful correspondence in most places. (Sure logic would collapse from a contradiction but actually it's all probabilistic somehow anyways.) Although of course we don't have anything to describe that the territory is different from the map in our system yet. This is related to embedded agency, and further work on how to model your map as possibly not fitting the territory and how that can be used is still necessary.

Introduction to Representing Sentences as Logical Statements

cubefox23d20

What I was saying was that we can, from our subjective perspective, only "point" to or "refer" to objects in a certain way. In terms of predicate logic the two ways of referring are via a) individual constants and b) variable quantification. The first corresponds to direct reference, where the reference always points to exactly one object. Mental objects can presumably be referred to directly. For other objects, like physical ones, quantifiers have to be used. Like "at least one" or "the" (the latter only presupposes there is exactly one object satisfying ... (read more)

1Towards_Keeperhood23d

Thanks. I'm still not quite understanding what you're thinking though. "the" supposes there's exactly one canonical choice for what object in the context is indicated by the predicate. When you say "the cat" there's basically always a specific cat from context you're talking about. "The cat is in the garden" is different from "There's exactly one cat in the garden". I mean there has to be some possibility for revising your world model if you notice that there are actually 2 objects for something where you previously thought there's only one. I agree that "Superman" and "the superhero" denote the same object (assuming you're in the right context for "the superhero"). (And yeah to some extent names also depend a bit on context. E.g. if you have 2 friends with the same name.) Yeah I didn't mean this as a formal statement. Formal would be: {exists x: apple(x) AND location(x, on=Table342)} CAUSES {exists x: apple(x) AND see(SelfPerson, x)}

Introduction to Representing Sentences as Logical Statements

cubefox23d30

I think object identification is important if we want to analyze beliefs instead of sentences. For beliefs we can't take a third person perspective and say "it's clear from context what is meant". Only the agent knows what he means when he has a belief (or she). So the agent has to have a subjective ability to identify things. For "I" this is unproblematic, because the agent is presumably internal and accessible to himself and therefore can be subjectively referred to directly. But for "this" (and arguably also for terms like "tomorrow") the referred objec... (read more)

1Towards_Keeperhood23d

I'm not exactly sure what you're saying here, but in case the following helps: Indicators like "here"/"tomorrow"/"the object I'm pointing to" don't get stored directly in beliefs. They are pointers used for efficiently identifying some location/time/object from context, but what gets saved in the world model is the statement where those pointers were substituted for the referent they were pointing to.

I think I still don't understand what you're trying to say, but some notes:

* In my system, experiences aren't objects, they are facts. E.g. the fact "cubefox sees an apple".
* CAUSES relates facts, not objects.
* You can say "{(the fact that) there's an apple on the table} causes {(the fact that) I see an apple}"
* Even though we don't have an explicit separate name in language for every apple we see, our minds still track every apple as a separate object which can be identified.

Btw, it's very likely not what you're talking about, but you actually need to be careful sometimes when substituting referent objects from indicators, in particular in cases where you talk about the world model of other people. E.g. if you have the beliefs:

1. Mia believes Superman can fly.
2. Superman is Clark Kent.

This doesn't imply that "Mia believes Clark Kent can fly", because Mia might not know (2). But essentially you just have a separate world model "Mia's beliefs" in which Superman and Clark Kent are separate objects, and you just need to be careful to choose the referent of names (or likewise with indicators) relative to whose belief scope you are in.

xpostah's Shortform

cubefox23d30

Yeah. I proposed a while ago that all the AI content was becoming so dominant that it should be hived off to the Alignment Forum while LessWrong is for all the rest. This was rejected.

Introduction to Representing Sentences as Logical Statements

cubefox24d50

Maybe I missed it, but what about indexical terms like "I", "this", "now"?

1Towards_Keeperhood23d

Yep I did not cover those here. They are essentially shortcodes for identifying objects/times/locations from context. Related quote: ("The laptop" is pretty similar to "This laptop".) (Though "this" can also act as a complementizer, as in "This is why I didn't come", though I think in that function it doesn't count as indexical. The section related to complementizers is the "statement connectives" section.)

xpostah's Shortform

cubefox24d20

There is still the possibility on the front page to filter out the AI tag completely.

3samuelshadrach23d

Yes but then it becomes a forum within a forum kinda thing. You need a critical mass of users who all agree to filter out the AI tag, and not have to preface their every post with "I don't buy your short timelines worldview, I am here to discuss something different". Building critical mass is difficult unless the forum is conducive to it. There is ultimately only one upvote button and one front-page so the forum will get taken over by the top few topics that its members are paying attention to. I don't think there's anything wrong with a forum that's mostly focussed on AI xrisk and transhumanist stuff. Better to do one thing well than half ass ten things. But it also means I may need to go elsewhere.

OpenAI lost $5 billion in 2024 (and its losses are increasing)

cubefox25d30

That difference is rather extreme. It seems LLM companies have a strong winner-take-all market tendency. Similar to Google (web search) or Amazon (online retail) in the past. It seems now much more likely to me that ChatGPT has basically already won the LLM race, similar to how Google won the search engine race in the past. Gemini outperforming ChatGPT in a few benchmarks likely won't make a difference.

Show, not tell: GPT-4o is more opinionated in images than in text

cubefox26d46

[...] because it is embedded natively, deep in the architecture of our omnimodal GPT‑4o model, 4o image generation can use everything it knows to apply these capabilities in subtle and expressive ways [...] Unlike DALL·E, which operates as a diffusion model, 4o image generation is an autoregressive model natively embedded within ChatGPT.

source

2Tao Lin24d

Yeah they may be the same weights. The above quote does not absolutely imply the same weights generate the text and images IMO, just that it's based on the 4o and sees the whole prompt. OpenAI's audio generation is also 'native', but it's served as a separate model on the API with different release dates, and you can't mix audio and some function calling in chatgpt in a way that's consistent with them not actually being the same weights.

testingthewaters's Shortform

cubefox26d20

To operationalise this: a decision theory usually assumes that you have some number of options, each with some defined payout. Assuming payouts are fixed, all decision theories simply advise you to pick the outcome with the highest utility.

The theories typically assume that each choice option has a number of known mutually exclusive (and jointly exhaustive) possible outcomes. And to each outcome the agent assigns a utility and a probability. So uncertainty is in fact modelled insofar as the agent can assign subjective probabilities to those outcomes occurr... (read more)
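The setup described here (options, each with mutually exclusive and jointly exhaustive outcomes carrying a probability and a utility) can be sketched as a plain expected-utility calculation. The option names and numbers below are made up for illustration, and this is only the textbook expected-utility rule, not any particular decision theory from the thread.

```python
from typing import Dict, List, Tuple

Outcome = Tuple[float, float]  # (subjective probability, utility)


def expected_utility(outcomes: List[Outcome]) -> float:
    """Probability-weighted utility of one option's outcomes."""
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        # Outcomes are meant to be mutually exclusive and jointly exhaustive.
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)


def best_option(options: Dict[str, List[Outcome]]) -> str:
    """Name of the option with the highest expected utility."""
    return max(options, key=lambda name: expected_utility(options[name]))


# Hypothetical example: outcomes are (rain, no rain) for each option.
options = {
    "take umbrella": [(0.3, 5.0), (0.7, 4.0)],
    "leave umbrella": [(0.3, -10.0), (0.7, 6.0)],
}
choice = best_option(options)
```

With fixed, certain payouts every option has a single outcome of probability 1, and the rule reduces to picking the highest utility, which is the degenerate case the quoted passage describes.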

Grok3 On Kant On AI Slavery

cubefox1mo125

(This is off-topic but I'm not keen on calling LLMs "he" or "she". Grok is not a man, nor a woman. We shouldn't anthropomorphize language models. We already have an appropriate pronoun for those: "it")

1Knight Lee1mo

Animals get called he or she, so why can't AI? From a utilitarian point of view, there's not much downside in calling them he/she since humans are very capable of distrusting someone even if they're seen as another human. Meanwhile the advantage of talking politely about AI, is that the AI will predict humans to keep our promises to them. That said, I tend to use "it" when referring to AI because everyone else does, and I don't want to become the kind of person who argues over pronouns (yet here I am, sorry). Preferably, don't imagine the AI to have the gender you might be attracted to, to avoid this.

On the Implications of Recent Results on Latent Reasoning in LLMs

cubefox1mo30

There is also Deliberation in Latent Space via Differentiable Cache Augmentation by Liu et al. and Efficient Reasoning with Hidden Thinking by Shen et al.

Probability Theory Fundamentals 102: Source of the Sample Space

cubefox1mo20

I think picking axioms is not necessary here and in any case inconsequential.

By picking your axioms you logically pinpoint what you are talking about in the first place. Have you read Highly Advanced Epistemology 101 for Beginners? I'm noticing that our inferential distance is larger than it should be otherwise.

I have read it a while ago, but he overstates the importance of axiom systems. E.g. he wrote:

You need axioms to pin down a mathematical universe before you can talk about it in the first place. The axioms are pinning down what the heck this 'NUM

... (read more)

deep's Shortform

cubefox1mo20

I wouldn't generally dismiss an "embarrassing & confusing public meltdown" when it comes from a genius. Because I'm not a genius while he or she is. So it's probably me who is wrong rather than him. Well, unless the majority of comparable geniuses agrees with me rather than with him. Though geniuses are rare, and majorities are hard to come by. I still remember an (at the time) "embarrassing and confusing meltdown" by some genius.

Probability Theory Fundamentals 102: Source of the Sample Space

cubefox1mo20

My point is that if your picking of particular axioms is entangled with reality, then you are already using a map to describe some territory. And then you can just as well describe this territory more accurately.

I think picking axioms is not necessary here and in any case inconsequential. "Bachelors are unmarried" is true whether or not I regard it as some kind of axiom. It seems the same holds for tautologies and probabilistic laws. Moreover, I think neither of them is really "entangled" with reality, in the sense that they are compatible with an... (read more)

2Ape in the coat1mo

By picking your axioms you logically pinpoint what you are talking about in the first place. Have you read Highly Advanced Epistemology 101 for Beginners? I'm noticing that our inferential distance is larger than it should be otherwise. No, you are missing the point. I'm not saying that this phrase has to be an axiom itself. I'm saying that you need to somehow axiomatically define your individual words, assign them meaning and only then, in regards to these language axioms the phrase "Bachelors are unmarried" is valid. You've drawn the graph yourself, how meaning is downstream of reality. This is the kind of entanglement we are talking about. The choice of axioms is motivated by our experience with stuff in the real world. Everything else is beside the point. Yes. That's, among other things, what not being instrumentally exploitable "in principle" means. Epistemic rationality is a generalisation of instrumental rationality the same way arithmetic is a generalisation from the behaviour of individual objects in reality. The kind of beliefs that are not exploitable in any case other than literally adversarial cases such as a mindreader specifically rewarding people who do not have such beliefs. I think the problem is that you keep using the word Truth to mean both Validity and Soundness and therefore do not notice when you switch from one to another. Validity depends only on the axioms. As long as you are talking about some set of axioms in which P is defined in such a way that P(A) ≥ P(A&B) is a valid theorem, no appeal to reality is needed. Likewise, you can talk about a set of axioms where P(A) ≤ P(A&B). These two statements remain valid in regards to their axioms. But the moment you claim that this has something to do with the way beliefs - a thing from reality - are supposed to behave you start talking about soundness, and therefore require a connection to reality. As soon as pure mathematical statements mean something you are in the domain of map-territory relatio

Doing principle-of-charity better

cubefox1mo30

Somewhat related: A critique of "bad faith".

Policy for LLM Writing on LessWrong

cubefox1mo20

Do you really have access to the GPT-4 base (foundation) model? Why? It's not publicly available.

Probability Theory Fundamentals 102: Source of the Sample Space

cubefox1mo*20

Yes, the meaning of a statement depends causally on empirical facts. But this doesn't imply that the truth value of "Bachelors are unmarried" depends less than completely on its meaning. Its meaning (M) screens off the empirical facts (E) from its truth value (T). The causal graph looks like this:

E —> M —> T

If this graph is faithful, it follows that E and T are conditionally independent given M: $E \perp T \mid M$. So if you know M, E gives you no additional information about T.

And the same is the case for all "analytic" statements, where the truth value only d... (read more)
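The screening-off claim for the chain E → M → T can be illustrated with a toy joint distribution. All probabilities below are invented for the example, and the variables are binary stand-ins; the only structural assumption is that the joint factorizes as P(E) · P(M | E) · P(T | M).

```python
from itertools import product

# Hypothetical conditional probability tables for the chain E -> M -> T.
p_e = {0: 0.6, 1: 0.4}
p_m_given_e = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
p_t_given_m = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.05, 1: 0.95}}

# Joint distribution over (e, m, t), built from the chain factorization.
joint = {
    (e, m, t): p_e[e] * p_m_given_e[e][m] * p_t_given_m[m][t]
    for e, m, t in product((0, 1), repeat=3)
}


def prob(pred):
    """Total probability of all (e, m, t) triples satisfying pred."""
    return sum(p for (e, m, t), p in joint.items() if pred(e, m, t))


def p_t1_given(m=None, e=None):
    """P(T=1 | the given values of M and/or E); None means 'not conditioned on'."""
    def cond(e_, m_, t_):
        return (m is None or m_ == m) and (e is None or e_ == e)
    return prob(lambda e_, m_, t_: cond(e_, m_, t_) and t_ == 1) / prob(cond)


# Without M, E is informative about T ...
marginal_gap = abs(p_t1_given(e=0) - p_t1_given(e=1))
# ... but once M is fixed, additionally learning E changes nothing.
screened_gap = abs(p_t1_given(m=1, e=0) - p_t1_given(m=1, e=1))
```

Here `marginal_gap` is clearly positive while `screened_gap` is zero up to floating-point rounding, which is the sense in which M screens off E from T.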

2Ape in the coat1mo

I think we are in agreement here. My point is that if your picking of particular axioms is entangled with reality, then you are already using a map to describe some territory. And then you can just as well describe this territory more accurately. Rationality is about systematic ways to arrive at correct map-territory correspondence. Even if in your particular situation no one is exploiting you, the fact that you are exploitable in principle is bad. But to know about what is exploitable in principle we generalize from all the individual acts of exploitation. It all has to be grounded in reality in the end. You've said yourself, meaning is downstream of experience. So in the end you have to appeal to reality while trying to justify it.

Probability Theory Fundamentals 102: Source of the Sample Space

cubefox1mo20

It seems clear to me that statements expressing logical or probabilistic laws like $P(A \lor B) = P(A) + P(B) - P(A \land B)$ or $\neg(A \land \neg A)$ are "analytic". Similar to "Bachelors are unmarried".

The truth of a statement in general is determined by two things: its meaning and what the world is like. But for some statements the latter part is irrelevant, and their meanings alone are sufficient to determine their truth or falsity.

2Ape in the coat1mo

As soon as you have your axioms you can indeed analytically derive theorems from them. However, the way you determine which axioms to pick, is entangled with reality. It's an especially clear case with probability theory where the development of the field was motivated by very practical concerns. The reason why some axioms appear to us appropriate for logic of beliefs and some don't, is because we know what beliefs are from experience. We are trying to come up with a mathematical model approximating this element of reality - an intensional definition for an extensional referent that we have. Being Dutch-bookable is considered irrational because you systematically lose your bets. Likewise, continuing to believe that a particular outcome can happen in a setting where it, in fact, can't and another agent could've already figured it out with the same limitations you have, is irrational for the same reason. Indeed. There are, in fact, some real-world reasons why the words "bachelor" and "unmarried" have these meanings in the English language. In both "why these particular words for these particular meanings?" and "why these meanings deserved designating any words at all" senses. The etymology of the English language and the existence of the institution of marriage in the first place, both of which are the results of social dynamics of humans whose psyche has evolved in a particular way. I hope the previous paragraph does a good enough job showing, how meaning of a statement is, in fact, connected to what the world is like. Truth is a map-territory correspondence. We can separately talk about its two components: validity and soundness. As long as we simply conceptualize some mathematical model, logically pinpointing it for no particular reason, then we are simply dealing with tautologies and there is only validity. Drawing maps for the sake of drawing maps, without thinking about territory. But the moment we want our model to be about something, we encounter soundness. Whic

Probability Theory Fundamentals 102: Source of the Sample Space

cubefox1mo20

Not to remove all limitations: I think the probability axioms are a sort of "logic of sets of beliefs". If the axioms are violated the belief set seems to be irrational. (Or at least the smallest incoherent subset that, if removed, would make the set coherent.) Conventional logic doesn't work as a logic for belief sets, as the preface and lottery paradox show, but subjective probability theory does work. As a justification for the axioms: that seems a similar problem to justifying the tautologies / inference rules of classical logic. Maybe an instrumental ... (read more)

2Ape in the coat1mo

Well yes, they are. But how do you know which axioms are the correct axioms for logic of sets of beliefs? How come violation of some axioms seems to be irrational, while violation of other axioms does not? What do you even mean by "rational" if not "systematic way to arrive at map-territory correspondence"? You see, in any case you have to ground your mathematical model in reality. Natural numbers may be logically pinpointed by arithmetical axioms, but a question of whether some action with particular objects behaves like addition of natural numbers is a matter of empiricism. The reason we came up with a notion of natural numbers, in the first place, is because we've encountered a lot of stuff in reality whose behavior generalizes this way. And the same things with logic of beliefs. First we encounter some territory, then we try to approximate it with a map. What I'm trying to say is that if you are already trying to make a map that corresponds to some territory, why not make the one that corresponds better? You can declare that any consistent map is "good enough" and stop your inquiry there, but surely you can do better. You can declare that any consistent map following several simple conditions is good enough - that's a step in the right direction, but still there is a lot of room for improvement. Why not figure out the most accurate map that we can come up with? Well, yes, it's harder than the subjective probability approach you are talking about. We are trying to pinpoint a more specific target: a probabilistic model for a particular problem, instead of just some probabilistic model. No, not really. We can do a lot before we go down this particular rabbit hole. I hope my next post will make it clear enough.

Probability Theory Fundamentals 102: Source of the Sample Space

cubefox1mo20

Well, technically P(Ω)=1 is an axiom, so you do need a sample space if you want to adhere to the axioms.

For a propositional theory this axiom is replaced with $P(\top) = 1$, i.e. a tautology in classical propositional logic receives probability 1.

But sure, if you do not care about accurate beliefs and systematic ways to arrive at them at all, then the question is, indeed, not interesting. Of course then it's not clear what use is probability theory for you, in the first place.

Degrees of belief adhering to the probability calculus at any point in time rules... (read more)

2Ape in the coat1mo

What is even the motivation for it? If you are not interested in your map representing a territory, why demand that your map be coherent? And why not assume some completely different axioms? Surely, there are many potential ways to logically pinpoint things. Why this one in particular? Why not allow P(Mary is a feminist and bank teller) > P(Mary is a feminist)? Why not simply remove all the limitations from the function P?
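One standard way to make the question about P(A and B) > P(A) concrete is a Dutch book, which this thread mentions elsewhere. The sketch below is only an illustration under stated assumptions: the agent treats its credence in a proposition as a fair price for a bet paying 1 if the proposition is true, and is willing both to buy and to sell at that price. The credences 0.3 and 0.5 are invented.

```python
def agent_net_payoff(price_a, price_ab, a, b):
    """Net payoff to an agent who buys a unit bet on 'A and B' at price_ab
    and sells a unit bet on 'A' at price_a, given the truth values a and b."""
    payoff = price_a - price_ab  # cash received for the sold bet minus cash paid
    if a and b:
        payoff += 1.0            # the bought bet on 'A and B' pays out
    if a:
        payoff -= 1.0            # the sold bet on 'A' has to be paid out
    return payoff


def payoff_table(price_a, price_ab):
    """Agent's net payoff in each of the four truth-value combinations of A and B."""
    return {
        (a, b): agent_net_payoff(price_a, price_ab, a, b)
        for a in (True, False)
        for b in (True, False)
    }


# Hypothetical incoherent credences: P(A and B) = 0.5 > P(A) = 0.3.
incoherent = payoff_table(price_a=0.3, price_ab=0.5)
# Coherent credences for comparison: P(A and B) = 0.3 <= P(A) = 0.5.
coherent = payoff_table(price_a=0.5, price_ab=0.3)
```

With the incoherent prices the agent loses money in every row of the table, whatever A and B turn out to be; with the coherent prices this particular pair of bets no longer guarantees a loss. Whether that kind of instrumental argument is the right justification for the axioms is exactly what the two commenters are disputing.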

Probability Theory Fundamentals 102: Source of the Sample Space

cubefox1mo20

And how would you know which worlds are possible and which are not?

Yes, that's why I only said "less arbitrary".

Regarding "knowing": In subjective probability theory, the probability over the "event" space is just about what you believe, not about what you know. You could theoretically believe to degree 0 in the propositions "the die comes up 6" or "the die lands at an angle". Or that the die comes up as both 1 and 2 with some positive probability. There is no requirement that your degrees of belief are accurate relative to some external standard. It is... (read more)

2Ape in the coat1mo

I don't think I can agree even with that. Previously we arbitrarily assumed that a particular sample space corresponds to a problem. Now we are arbitrarily assuming that a particular set of possible worlds corresponds to a problem. In the best case we are exactly as arbitrary as before and have simply renamed our set. In the worst case we are making a lot of extra unfalsifiable assumptions about metaphysics. Well, technically P(Ω)=1 is an axiom, so you do need a sample space if you want to adhere to the axioms. But sure, if you do not care about accurate beliefs and systematic ways to arrive at them at all, then the question is, indeed, not interesting. Of course then it's not clear what use is probability theory for you, in the first place.

Probability Theory Fundamentals 102: Source of the Sample Space

cubefox1mo20

A less arbitrary way to define a sample space is to take the set of all possible worlds. Each event, e.g. a die roll, corresponds to the disjunction of possible worlds where that event happens. The possible worlds can differ in a lot of tiny details, e.g. the exact position of a die on the table. Even just an atom being different at the other end of the galaxy would constitute a different possible world. A possible world is a maximally specific way the world could be. So two possible worlds are always mutually exclusive. And the set of all possible worlds ... (read more)
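A toy version of this construction: worlds are maximally specific descriptions, and an event is the set of worlds in which it happens. The two-feature "worlds" and the uniform credence below are assumptions made purely so the example is small enough to run; nothing in the comment fixes either choice.

```python
from fractions import Fraction
from itertools import product

# Toy possible worlds: each fixes the die face and one otherwise irrelevant detail.
worlds = list(product(range(1, 7), ("atom here", "atom there")))

# Assumed credence over worlds (uniform only for illustration).
weight = {w: Fraction(1, len(worlds)) for w in worlds}


def event(pred):
    """An event is the set (disjunction) of worlds in which pred holds."""
    return frozenset(w for w in worlds if pred(w))


def prob(ev):
    """Distinct worlds are mutually exclusive, so their weights simply add."""
    return sum((weight[w] for w in ev), Fraction(0))


six = event(lambda w: w[0] == 6)
even = event(lambda w: w[0] % 2 == 0)
everything = frozenset(worlds)
```

Set union, intersection and complement on these events play the role of "or", "and" and "not", so `prob(six | even) == prob(six) + prob(even) - prob(six & even)` holds automatically, and `prob(everything)` is 1. The sketch does not address the objection raised in the reply, namely how one decides which worlds belong in `worlds` in the first place.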

4Ape in the coat1mo

And how would you know which worlds are possible and which are not? How would Albert and Barry use the framework of "possible worlds" to help them resolve their disagreement? This simply passes the buck of the question from "What is the sample space corresponding to a particular problem?" to "What is the event space corresponding to a particular problem?". You've renamed your variables, but the substance of the issue is still the same. How would you know whether P(1)+P(2)+P(3)+P(4)+P(5)+P(6)=1 or P(1)+P(2)+P(3)+P(4)+P(5)=1 for a die roll?

Human alignment

cubefox1mo30

I think the main problem from this evolutionary perspective is not so much entertainment and art, but low fertility. Not having children.

Solving willpower seems easier than solving aging

cubefox1mo30

A drug that fixes akrasia without major side-effects would indeed be the Holy Grail. Unfortunately I don't think caffeine does anything of that sort. For me it increases focus, but it doesn't combat weakness of will, avoidance behavior, ugh fields. I don't know about other existing drugs.

6Mateusz Bagiński1mo

Some amphetamines kinda solve akrasia-in-general to some extent (much more so than caffeine), at least for some people. I'm not claiming that they're worth it.

Why Were We Wrong About China and AI? A Case Study in Failed Rationality

cubefox1mo40

I think the main reason is that until a few years ago, not much AI research came out of China. Gwern highlighted this repeatedly.

1thedudeabides1mo

Exactly. @gwern was wrong. And yet...

Human alignment

cubefox1mo30

I agree with the downvoters that the thesis of this post seems crazy. But aren't entertainment and art superstimuli? Aren't they forms of wireheading?

A Critique of “Utility”

cubefox1mo31

Hedonic and desire theories are perfectly standard, we had plenty of people talking about them here, including myself. Jeffrey's utility theory is explicitly meant to model (beliefs and) desires. Both are also often discussed in ethics, including over at the EA Forum. Daniel Kahneman has written about hedonic utility. To equate money with utility is a common simplification in many economic contexts, where expected utility is actually calculated, e.g. when talking about bets and gambles. Even though it isn't held to be perfectly accurate. I didn't encounter... (read more)

Richard_Kennaway's Shortform

cubefox1mo20

A more ambitious task would be to come up with a model that is more sophisticated than decision theory, one which tries to formalize your previous comment about intent and prediction/belief.

2Dagon1mo

I think it's a different level of abstraction. Decision theory works just fine if you separate the action of predicting a future action from the action itself. Whether your prior-prediction influences your action when the time comes will vary by decision theory. I think, for most problems we use to compare decision theories, it doesn't matter much whether considering, planning, preparing, replanning, and acting are correlated time-separated decisions or whether it all collapses into a sum of "how to act at point-in-time". I haven't seen much detailed exploration of decision theory X embedded agents or capacity/memory-limited ongoing decisions, but it would be interesting and important, I think.

Mo Putera's Shortform

cubefox1mo60

Interesting. This reminds me of a related thought I had: Why do models with differential equations work so often in physics but so rarely in other empirical sciences? Perhaps physics simply is "the differential equation science".

Which is also related to the frequently expressed opinion that philosophy makes little progress because everything that gets developed enough to make significant progress splits off from philosophy. Because philosophy is "the study of ill-defined and intractable problems".

Not saying that I think these views are accurate, though they do have some plausibility.

1Mo Putera1mo

(To be honest, to first approximation my guess mirrors yours.)

Gunnar_Zarncke's Shortform

cubefox1mo82

It seems to be only "deception" if the parent tries to conceal the fact that he or she is simplifying things.

2Gunnar_Zarncke1mo

As we use the term, yes. But the point (and I should have made that clearer) is that any mismodeling by the parent of the child's interests and future environment will not be visible to the child, or even to someone reading the thoughts of the well-meaning parent. So many parents want the best for their child, but model the future of the child wrongly (mostly by status quo bias; the problem is different for AI).

LWLW's Shortform

cubefox1mo42

There is also the related problem of intelligence being negatively correlated with fertility, which leads to a dysgenic trend. Even if preventing people below a certain level of intelligence from having children were realistically possible, it would make another problem more severe: the fertility of smarter people is far below replacement, leading to quickly shrinking populations. Though fertility is likely partially heritable, and would go up again after some generations, once the descendants of the (currently rare) high-fertility people start to dominate.

A Critique of “Utility”

cubefox1mo*40

This seems to be a relatively balanced article which discusses several concepts of utility with a focus on their problems, while acknowledging some of their use cases. I don't think the downvotes are justified.

1mako yass1mo

These are not concepts of utility that I've ever seen anyone explicitly espouse, especially not here, the place to which it was posted.

Richard_Kennaway's Shortform

cubefox1mo20

That's an interesting perspective. Only it doesn't seem to fit into the simplified but neat picture of decision theory. There everything is sharply divided into two kinds of statements: either a statement we can make true at will (an action we can currently decide to perform), to which we therefore do not need to assign any probability (have a belief about it happening), or an outcome, which we can't make true directly and which is at most a consequence of our action. We can assign probabilities to outcomes, conditional on our available actions, and a value, which lets us com... (read more)

2Dagon1mo

Decision theory is fine, as long as we don't think it applies to most things we colloquially call "decisions". In terms of instantaneous discrete choose-an-action-and-complete-it-before-the-next-processing-cycle, it's quite a reasonable topic of study.
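
The "neat picture" discussed in this thread can be sketched in a few lines, assuming it refers to the usual expected-utility comparison: actions get no probability, outcomes get a probability conditional on each action plus a value. The umbrella scenario, the numbers, and the names `P`, `V` and `expected_utility` are all hypothetical and chosen only for illustration.

```python
# Hypothetical example: probabilities of outcomes conditional on each action.
# Actions themselves carry no probability; we simply choose one.
P = {
    "take_umbrella": {"dry": 0.95, "wet": 0.05},
    "no_umbrella":   {"dry": 0.60, "wet": 0.40},
}

# Hypothetical values assigned to outcomes.
V = {"dry": 1.0, "wet": -2.0}

def expected_utility(action):
    """Sum of outcome values weighted by their probability given the action."""
    return sum(p * V[outcome] for outcome, p in P[action].items())

# The instantaneous "decision" is just picking the action with the highest score.
best = max(P, key=expected_utility)
```

In this toy model there is no room for an intention or a belief about one's own future action, which is the gap the comments point at.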

Why White-Box Redteaming Makes Me Feel Weird

cubefox1mo52

Maybe this is avoided by KV caching?

4nielsrolf1mo

I think that's plausible but not obvious. We could imagine different implementations of inference engines that cache on different levels, e.g. kv-cache, cache of only matrix multiplications, cache of specific vector products that the matrix multiplications are composed of, all the way down to caching just the logic table of a NAND gate. Caching NANDs is basically the same as doing the computation, so if we assume that doing the full computation can produce experiences then I think it's not obvious which level of caching would not produce experiences anymore.
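
The idea of caching at different levels can be illustrated with a toy sketch. It is not a model of a real inference engine or of KV caching; it only assumes Python's standard `functools.lru_cache`, and the functions `nand`, `xor` and `xor_cached` are hypothetical stand-ins for a low-level and a higher-level unit of computation.

```python
from functools import lru_cache

# Lowest level: cache the truth table of a single NAND gate.
# After at most four distinct calls the gate is a pure table lookup.
@lru_cache(maxsize=None)
def nand(a: bool, b: bool) -> bool:
    return not (a and b)

# A higher-level computation built only from NAND gates (standard XOR circuit).
def xor(a: bool, b: bool) -> bool:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

# Higher level: cache whole results of the composed computation.
# A cache hit here skips every gate evaluation underneath.
@lru_cache(maxsize=None)
def xor_cached(a: bool, b: bool) -> bool:
    return xor(a, b)

# Evaluate every input pair twice; the second pass hits the top-level cache.
results = [xor_cached(a, b) for _ in range(2)
           for a in (False, True) for b in (False, True)]
```

With only `nand` cached, the wiring of the circuit is still executed on every call; with `xor_cached`, a repeated input skips it entirely. Both return the same values, which is why the outputs alone cannot distinguish the levels.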

Richard_Kennaway's Shortform

cubefox1mo20

This is not how many decisions feel to me - many decisions are exactly a belief (complete with Bayesian uncertainty). A belief in future action, to be sure, but it's distinct in time from the action itself.

But if you only have a belief that you will do something in the future, you still have to decide, when the time comes, whether to carry out the action or not. So your previous belief doesn't seem to be an actual decision, but rather just a belief about a future decision -- about which action you will pick in the future.

See Spohn's example about belie... (read more)

2Dagon1mo

Correct. There are different levels of abstraction of predictions and intent, and observation/memory of past actions which all get labeled "decision". I decide to attend a play in London next month. This is an intent and a belief. It's not guaranteed. I buy tickets for the train and for the show. The sub-decisions to click "buy" on the websites are in the past, and therefore committed. The overall decision has more evidence, and gets more confident. The cancelation window passes. Again, a bit more evidence. I board the train - that sub-decision is in the past, so is committed, but there's STILL some chance I won't see the play. Anything you call a "decision" that hasn't actually already happened is really a prediction or an intent. Even DURING an action, you only have intent and prediction. While the impulse is traveling down my arm to click the mouse, the power could still go out and I don't buy the ticket. There is past, which is pretty immutable, and future, which cannot be known precisely. I think this is compatible with Spohn's example (at least the part you pasted), and contradicts OP's claim that "you did not make a decision" for all the cases where the future is uncertain. ALL decisions are actually predictions, until they are in the past tense. One can argue whether that's a p(1) prediction or a different thing entirely, but that doesn't matter to this point. "If, on making a decision, your next thought is “Was that the right decision?” then you did not make a decision." is actually good directional advice in many cases, but it's factually simply incorrect.

Richard_Kennaway's Shortform

cubefox1mo20

Decision screens off thought from action. When you really make a decision, that is the end of the matter, and the actions to carry it out flow inexorably.

Yes, but that arguably means we only make decisions about which things to do now. Because we can't force our future selves to follow through, to inexorably carry out something. See here:

Our past selves can't simply force us to do certain things, the memory of a past "commitment" is only one factor that may influence our present decision making, but it doesn't replace a decision. Otherwise, always whe... (read more)

2Richard_Kennaway1mo

My left hand cannot force my right hand to do anything either. Instead, they work harmoniously together. Likewise my present, past, and future. Not only is the sage one with causation, he is one with himself. That is an example of dysfunctional decision-making. It is possible to do better. I always do the dishes today.