Human: "Look, can't you just be normal about this?"
GAA-optimized agent: "Actually-"
Hm, I guess this wouldn't work if the agent still learns an internalized RL methodology? Or would it? Say we have a base model; not much need for GAA there, because it's just doing token prediction. Then we go into some sort of (distilled?) RL-based CoT instruct tuning. GAA means it picks up abnormal rewards from the signal more slowly, i.e. it doesn't do the classic boat-spinning-in-circles thing (good test?). But if it internalizes RL at some point, its mesaoptimizer wouldn't be so limited, and since that's a general technique, GAA wouldn't prevent it? Still, seems like a good first line of defense.
The issue is, from a writing perspective, that a positive singularity quickly becomes both unpredictable and unrelatable, so that any hopeful story we could write would, inevitably, look boring and pedestrian. I mean, I know what I intend to do come the Good End, for maybe the next 100k years or so, but a five-minute conversation with the AI will probably bring up many far better ideas, it being what it is. But ... bad ends are predictable, simple, and enter a steady state that is very easy to describe.
A curve that grows and never repeats is a lot harder to predict than a curve that goes to zero and stays there.
Another difficulty in writing science fiction is that good stories tend to pick one technology and then explore all its implications in a legible way, whereas our real future involves lots of different technologies interacting in complex multi-dimensional ways too complicated to fit into an appealing narrative or even a textbook.
Can I really trust an organization to preserve my brain that can't manage a working SSL certificate?
I mean, you can trust it to preserve your brain more than you can trust a crematorium to preserve your brain.
And if you do chemical preservation, maintaining a brain in storage is operationally fairly simple. LN2 isn't that complex either, but it does carry higher risks.
That said, I would generally suggest Tomorrow Biostasis for residents of Europe, if you can afford it.
Should ChatGPT assist with things that the user or a broad segment of society thinks are harmful, but ChatGPT does not? If yes, the next step would be "can I make ChatGPT think that bombmaking instructions are not harmful?"
Probably ChatGPT should go "Well, I think this is harmless but broad parts of society disagree, so I'll refuse to do it."
I think the analogy to photography works very well, in that it's a lot easier than the workflow that it replaced, but a lot harder than it's commonly seen as. And yeah, it's great using a tool that lets me, in effect, graft the lower half of the artistic process to my own brain. It's a preview of what's coming with AI, imo - the complete commodification of every cognitive skill.
As somebody who makes AI "art" (largely anime tiddies tbh) recreationally, I'm not sure I agree with the notion that the emotion of an artist is not recognizable in the work. For one, when you're looking at a finished picture I've made, at least, you're looking at hours of thought and effort. I can't draw a straight line to save my life, but I can decide what should go where, which color is the right or wrong one, and which of eight candidate pictures has particular features I like. When you're working incrementally, img2img, for instance, it's very common...
I think your 100 billion people holding thousands of hands each are definitely conscious. I also think the United States, and in fact nearly every nation-state, is probably conscious as well. Also, my Linux system may be conscious.
I believe consciousness is, at its core, a very simple system: something closer to the differentiation operator than to a person. We merely think that it is a complicated big thing because we confuse the mechanism with the contents - a lot of complicated systems in the brain exchange data using consciousness in various formats, inc...
If military AI is dangerous, it's not because it's military. If a military robot can wield a gun, a civilian robot can certainly acquire one as well.
The military may create AI systems that are designed to be amoral, but it will not want systems that overinterpret orders or violate the chain of command. Here as everywhere, if intentional misuse is even possible at all, alignment is critical and unintentional takeoff remains the dominant risk.
In seminal AI safety work Terminator, the Skynet system successfully triggers a world war because it is a military AI...
My impression is that there's been a widespread local breakdown of the monopoly of force, in no small part by using human agents. In this timeline the trend of colocation of datacenters and power plants, and of network decentralization, would probably have continued or even sped up. Further, while building integrated circuits takes first-rate hardware, building ad-hoc power plants should be well within the power of educated humans with perfect instructions. (Mass cannibalize rooftop solar?)
This could have been stopped by quick, decisive action, but they gave it time and now they've lost any central control of the situation.
So what's happening there?
Allow me to speculate. When we switch between different topics of work, we lose state. So our brain tries to first finish all pending tasks in the old context, settle and reorient, and then begin the new context. But one problem with the hyperstimulated social-media-addicted akrasia sufferer is that the state of continuous distraction, to the brain, emulates the state of being in flow. Every task completion is immediately followed by another task popping up. Excellent efficiency! And when you are in flow, switching to another topi...
I personally think that all powerful AIs should be controlled by Ryan Greenblatt.
I don't know the guy, but he seems sane from reading just a little of his writing. Putting him in charge would run a small s-risk (bad outcomes if he turned out to have a negative sadism-empathy balance), but I think that's unlikely. It would avoid what I think are quite large risks arising from Molochian competition among AGIs and their human masters in an aligned but multipolar scenario.
So: Ryan Greenblatt for god-emperor!
Or whoever else, as long as they don't self-nominate....
Note after OOB debate: this conversation has gone wrong because you're reading subtext into Said's comment that he didn't mean to put there. You keep trying to answer an implied question that wasn't intended to be implied.
If you think playing against bots in UT is authentically challenging, just answer "Yes, I think playing against bots in UT is authentically challenging."
I haven't really followed the math here, but I'm worried that "manipulating the probability that the button is pressed" is a weird and possibly wrong framing. For one, a competent agent will always be driving the probability that the button is pressed downward. In fact, what we want in a certain sense is an agent that brings the probability to zero - because we have ended up in such an optimal state or attractor that we, even for transitively correct reasons, have no desire to shut the agent down. At that point, what we want to preserve is not precisely "t...
Simplicia: Sure. For example, I certainly don’t believe that LLMs that convincingly talk about “happiness” are actually happy. I don’t know how consciousness works, but the training data only pins down external behavior.
I mean, I don't think this is obviously true? In combination with the inductive-biases thing nailing down the true function out of a potentially huge forest, it seems at least possible that an LLM would end up with an "emotional state" parameter pretty low down in its predictive model. It's completely unclear what this would do out of ...
If something interests us, we can perform trials. Because our knowledge is integrated with our decisionmaking, we can learn causality that way. What ChatGPT does is pick up both knowledge and decisionmaking by imitation, which is why it can also exhibit causal reasoning without itself necessarily acting agentically during training.
Sure, but surely that's how it feels from the inside when your mind uses an LRU storage system that progressively discards detail. I'm more interested in how much I can access - and um, there's no way I can access 2.5 petabytes of data.
I think you just have a hard time imagining how much 2.5 petabytes is. If I literally stored in memory a high-resolution, poorly compressed JPEG image (1 MB) every second for the rest of my life, I would still not reach that storage limit. 2.5 petabytes would allow the brain to remember everything it has ever perceived, with ve...
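For what it's worth, here is the back-of-the-envelope arithmetic behind that claim as a minimal sketch; the 1 MB/s rate and the 50 remaining years are my own illustrative assumptions, not figures from the thread:

```python
# Rough check of the "one 1 MB JPEG per second for the rest of my life" claim.
# Rate and remaining lifespan are assumed for illustration.
seconds_per_year = 60 * 60 * 24 * 365.25       # ~31.6 million seconds
years_remaining = 50                           # assumed remaining lifespan
mb_per_second = 1                              # one poorly compressed JPEG per second

total_mb = mb_per_second * seconds_per_year * years_remaining
total_pb = total_mb / 1e9                      # 1 PB = 10^9 MB (decimal units)
print(f"{total_pb:.2f} PB")                    # ~1.58 PB, still short of 2.5 PB
```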
But no company has ever managed to parlay this into world domination
Eventual failure aside, the East India Company gave it a damn good shake. I think if we get an AI to the point where it has effective colonial control over entire countries, we can be squarely said to have lost.
Also keep in mind that we have multiple institutions entirely dedicated to the purpose of breaking up companies when they become big enough to be threatening. We designed our societies to specifically avoid this scenario! That, too, comes from painful experience. IMO, if we now give AI the chances that we've historically given corporations before we learnt better, then we're dead, no question about it.
While I wouldn't endorse the 2.5 PB figure itself, I would caution against this line of argument. It's possible for your brain to contain plenty of information that is not accessible to your memory. Indeed, we know of many cognitive systems in the brain whose algorithms are both sophisticated and inaccessible to any kind of introspection: locomotion and vision are two obvious examples.
This smells like a framing debate. More importantly, if an article is defining a common word in an unconventional way, my first assumption will be that it's trying to argumentatively attack its own meaning while pretending it's defeating the original meaning. I'm not sure it matters how clearly you're defining your meaning; due to how human cognition works, this may be impossible to avoid without creating new terms.
In other words, I don't think it's so much that Scott missed the definitions as that he reflexively disregarded them as a rhetorical trick.
N of 1, but I realized the intended meaning of “impaired” and “disabled” before even reading the original articles and adopted them into my language. As you can see from this article, adopting new and more precise and differentiated definitions for these two terms hasn’t harmed my ability to understand that not all functional impediments are caused by socially imposed disability.
So impossible? No.
If Scott had accurately described the articles he quoted before dealing with the perceived rhetorical trickery, I’d have let it slide. But he didn’t, and he has criticized others plenty of times in the past for inaccurately representing the contents of cited literature.
If a gender identity is a belief about one’s own gender, then it’s not even clear that I have one in a substantial, relevant sense, which is part of the point of my “Am I trans?” post. I think I would have said early on that I better matched male psychological stereotypes, and that it’s more complicated now (due to life experience?).
Right? I mean, what should I say, as someone who identifies as male and wants to keep his male-typical psychological stereotypes? It seems to me that what you're saying in this post fits more closely with the conservative stereotype as the trans m...
I guess I could say: if you want to keep being psychologically male, don't medically transition and present as a woman for years, and if you do, don't buy into the ideology that you did any of this because of some gender identity? Probably there's variation in the degree to which people want to remain psychologically gendered the way they are, which is part of what explains differences in decisions.
I think there is a real problem with the gender/trans memespace inducing gender dysphoria in people, such as distress not previously present at being different fr...
As an AGP, my view is that ... like, that list of symptoms is pretty diverse, but if I don't want to be a woman - not in the sense that I would be upset to be misgendered, though I would be, but more for political than genderical (?) reasons - I don't see why it would matter if I desire to have a (particular) female body type.
If I imagine "myself as a woman" (as opposed to "myself as myself with a cute female appearance"), and actually put any psychological traits on that rather than just gender as a free-floating tag, then it seems to me that my identity wo...
The relationship of a CEO to his subordinates, and the nature and form of his authority over them, are defined in rules and formal structures—which is true of a king but false of a hunter-gatherer band leader. The President, likewise.
Eh. This is true in extremis, but the everyday interactions that structure how decisions actually get made can be very different. The formal structure primarily defines what sorts of interactions the state will enforce for you. But if you have to get the state to enforce interactions within your company, things have gone v...
I mean, men also have to put in effort to perform masculinity, or be seen as inadequate men; I don't think this is a gendered thing. But even a man who isn't "performing masculinity adequately" - an inadequate man, like an inadequate woman - is still a distinct category, and though transwomen, like born women, aim to perform femininity, transwomen have a greater distance to cross and in doing so traverse between clusters along several dimensions. I think we can meaningfully separate "perform effort to transition in adequacy" from "perform effort to tra...
I just mean, like: if we see an object move, we have a quale of position but also of velocity/vector and maybe acceleration. So when we see, for instance, a marble rolling down an incline, we may have a discrete conscious "frame" where the marble has a velocity of 0 but a positive acceleration, so despite the fact that the next frame is discontinuous with the last one looking only at position, we perceive them as one smooth sequence, because the predicted end position of the motion in the first frame is continuous with the start point in the second.
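To make that frame-to-frame continuity idea concrete, here is a toy kinematics sketch; the half-second frame gap and the specific numbers are invented for illustration, not anything claimed above:

```python
# Two discrete "conscious frames" of a marble on an incline; continuity is judged
# by extrapolating frame 1 forward and comparing with frame 2's starting state.
dt = 0.5                                          # assumed gap between frames, in seconds

p1, v1, a1 = 0.0, 0.0, 2.0                        # frame 1: at rest but already accelerating
predicted_p2 = p1 + v1 * dt + 0.5 * a1 * dt**2    # = 0.25
predicted_v2 = v1 + a1 * dt                       # = 1.0

p2, v2 = 0.25, 1.0                                # frame 2: what is actually perceived

# Position alone jumps from 0.0 to 0.25 with nothing in between, but the prediction
# from frame 1 lands exactly on frame 2, so the two frames read as one smooth motion.
print(predicted_p2 == p2, predicted_v2 == v2)     # True True
```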
Shouldn't the king just make markets for "crop success if planted assuming three weeks" and "crop success if planted assuming ten years" and pick whichever is higher? Actually, shouldn't the king define some metric for kingdom well-being (death rate, for instance) and make betting markets for this metric under his possible roughly-primitive actions?
This fable just seems to suggest that you can draw wrong inferences from betting markets by naively aggregating. But this was never in doubt, and does not disprove that you can draw valuable inferences, even in the particular example problem.
These would be good ideas. I would remark that many people definitely do not understand what is happening when naively aggregating, or averaging together disparate distributions. Consider the simple example of the several Metaculus predictions for date of AGI, or any other future event. Consider the way that people tend to speak of the aggregated median dates. I would hazard most people using Metaculus, or referencing the bio-anchors paper, think the way the King does, and believe that the computed median dates are a good reflection of when things will probably happen.
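As a concrete illustration of how the aggregated median can mislead, here is a toy sketch; the two forecast distributions are invented numbers, not real Metaculus data:

```python
import numpy as np

rng = np.random.default_rng(0)
early = rng.normal(5, 1, 10_000)      # forecaster A: event ~5 years out
late = rng.normal(50, 5, 10_000)      # forecaster B: event ~50 years out

pooled = np.concatenate([early, late])
print(np.median(early), np.median(late))  # ~5 and ~50, as each forecaster intends
print(np.median(pooled))                  # lands in the empty gap between the clusters,
                                          # a date neither forecaster considers likely
```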
This is just human decision theory modules doing human decision theory things. It's a way of saying "defend me or reject me; at any rate, declare your view." You say something that's at the extreme end of what you consider defensible in order to act as a Schelling point for defense: "even this is accepted for a member." In the face of comments that seem like they validate Ziz's view, if not her methods, this comment calls for an explicit rejection of not Ziz's views, but Ziz's mode of approach, by explicitly saying "I am what you hate, I am here, come at m...
Right, but if you're an alien civilization trying to be evil, you probably spread forever; if you're trying to be nice, you also spread forever, but if you find a potentially life-bearing planet, you simulate it out (obviating the need for ancestor sims later). Or some such strategy. The point is there shouldn't ever be a border facing nothing.
Disclaimer: I know Said Achmiz from another LW social context.
In my experience, the safe bet is that minds are more diverse than almost anyone expects.
A statement advanced in a discussion like "well, but nobody could seriously miss that X" is near-universally false.
(This is especially ironic because of the "You don't exist" post you just wrote.)
I don't understand it but it does make me feel happy.