All of Matthew_Opitz's Comments + Replies

It would be more impressive if Claude 3 could describe genuinely novel experiences.  For example, if it is somewhat conscious, perhaps it could explain how that consciousness meshes with the fact that, so far as we know, its "thinking" only runs at inference time in response to user requests.  In other words, LLMs don't get to do their own self-talk (so far as we know) whenever they aren't being actively queried by a user.  So, is Claude 3 at all conscious in those idle times between user queries?  Or does Claude 3 experience "time" in ... (read more)

3Shayne O'Neill
I did once coax cGPT to describe its "phenomenology" as being (paraphrased from memory) "I have a permanent series of words and letters that I can perceive and sometimes I reply then immediately more come", indicating its "perception" of time does not include pauses or whatever.  And then it pasted on its disclaimer that "As an AI I....", as is its wont to do.

Except that chess really does have an objectively correct value systemization, which is "win the game."  "Sitting with paradox" just means, don't get too attached to partial systemizations.  It reminds me of Max Stirner's egoist philosophy, which emphasized that individuals should not get hung up on partial abstractions or "idées fixes" (honesty, pleasure, success, money, truth, etc.) except perhaps as cheap, heuristic proxies for one's uber-systematized value of self-interest, but one should instead always keep in mind the overriding abstractio... (read more)

4Kaj_Sotala
Your phrasing sounds like you might be saying this as an objection to what I wrote, but I'm not sure how it would contradict my comment. The same mechanisms can still apply even if the correct systematization is subjective in one case and objective in the second case. Ultimately what matters is that the cognitive system feels that one alternative is better than the other and takes that feeling as feedback for shaping future behavior, and I think that the mechanism which updates on feedback doesn't really see whether the source of the feedback is something we'd call objective (win or loss at chess) or subjective (whether the resulting outcome was good in terms of the person's pre-existing values). Yeah, I think that's a reasonable description of what it means in the context of morality too.

I agree, I don't know why mulberries aren't more popular.  They are delicious, and the trees grow much more easily than other fruit trees.  Other fruit trees seem very susceptible to fungi and insects, in my experience, but mulberries come up all over the place and thrive easily on their own (at least here in Missouri).  I have four mulberry trees in my yard that just came up on their own over the last 10 years, and now they are producing multiple gallons of berries each per season, which would probably translate into hundreds of dollars if ... (read more)

1Jim Hays
I’d guess that the stems are probably a turnoff. I don’t think you could easily remove the stems without damaging the berries, even with commercial equipment. And I think many folks are not going to want a pie with a thousand stems.

Good categorizations!  Perhaps this fits in with your "limited self-modification" point, but another big reason why humans seem "aligned" with each other is that our capability spectrum is rather narrow. The gap in capability (if we include both mental intelligence and physical capabilities) between the median human and the most capable human is not so big that ~5 median humans can't outmatch/outperform the most capable human.  Contrary to what silly 1980s action movies might suggest where goons attack the hero one at a time, 5 median humans coul... (read more)

If I had to make predictions about how humanity will most likely stumble into AGI takeover, it would be a story where humanity first promotes foundationality (dependence), both economic and emotional, on discrete narrow-AI systems.  At some point, it will become unthinkable to pull the plug on these systems even if everyone were to rhetorically agree that there was a 1% chance of these systems being leveraged towards the extinction of humanity.  

Then, an AGI will emerge amidst one of these narrow-AI systems (such as LLMs), inherit this infrastruc... (read more)

4Oliver Sourbut
I agree this is plausible - though in the foundationality/dependency bucket I also wouldn't rule out any of:
* misaligned AGI just straight appropriates hardware and executes a coup, bypassing existing software/AI infra
* latent deceptive AGI itself gets 'foundational' in the sense above, large amounts of value dependent on its distribution, perhaps mainly by unwitting human aid
* emotional dependence and welfare concern for non-dangerous AI transfers and hamstrings humanity's chance of cooperating to constrain later, dangerous deployments
5M. Y. Zuo
This matches my expectation of how easily humans are swayed when competing against an out-group.  i.e. "Because China/Russia/some-other-power-centre is doing this, we must accept the suggestions of X!" Especially if local AGI are seen as part of the in-group.

This is a good post and puts into words the reasons for some vague worries I had about an idea of trying to start an "AI Risk Club" at my local college, which I talk about here.  Perhaps that method of public outreach on this issue would just end up generating more heat than light and would attract the wrong kind of attention at the current moment.  It still sounds too outlandishly sci-fi for most people.  It is probably better, for the time being, to just explore AI risk issues with any students who happen to be interested in it in private after class or via e-mail or Zoom. 

Note that I was strongly tempted to use the acronym DILBERT (for "Do It Later By Evasively Remaining Tentative"), especially because this is one of the themes of the Dilbert cartoons (employees basically scamming their boss by finding excuses for procrastinating, but still stringing the boss along and implying that the tasks MIGHT get done at some point).  But, I don't want to try to hijack the meaning of an already-established term/character.  

I think when we say that an adversarial attack is "dumb" or "stupid" what we are really implying is that the hack itself is really clever but it is exploiting a feature that is dumb or stupid.  There are probably a lot of unknown-to-us features of the human brain that have been hacked together by evolution in some dumb, kludgy way that AI will be able to take advantage of, so your example above is actually an example of the AI being brilliant but us humans being dumb.  But I get what you are saying that that whole situation would indeed seem "dum... (read more)

Good examples to consider!  Has there ever been a technology that has been banned or significantly held back via regulation that spits out piles of gold (not counting externalities) and that doesn't have a next-best alternative that replicates 90%+ of the value of the original technology while avoiding most of the original technology's downsides?  

The only way I could see humanity successfully slowing down AGI capabilities progress is if it turns out that advanced narrow-AIs manage to generate more utility than humans know what to do with initial... (read more)

2Going Durden
I find it not just likely but borderline certain. Ubiquitous, explicitly below-human narrow AI has a tremendous potential that we remain blind to while focusing on superhuman AI. Creating superhuman, self-improving AGI, while extremely dangerous, is also an extremely hard problem (in the same realm as dry nanotech or FTL travel). Meanwhile, creating brick-dumb but ubiquitous narrow AI and then mass-producing it to saturation is easy. It could be done today; it's just a matter of market forces and logistics. It might very well be the case that once the number of narrow-AI systems, devices and drones passes a certain threshold (say, it becomes as ubiquitous, cheap and accessible as cars, but not yet as much as smartphones) we would enter a weaker form of post-scarcity and have no need to create AI gods.
3Nathan Helm-Burger
I've had thoughts along similar lines, but worry that there is no clear line between safer narrower less-useful less-profitable AI and riskier more-profitable more-general AI. Seems like a really slippery slope with a lot of motivation for relevant actors to engage in motivated thinking to rationalize their actions.

Why wasn't there enough experimentation to figure out that Zoom was an acceptable & cheaper/more convenient 80% replacement for in-person instruction rather than an unacceptable 50% simulacrum of teaching?  Because experimentation takes effort and entails risk.

Most experiments don't pan out (don't yield value).  Every semester I try out a few new things (maybe I come up with a new activity, or a new set of discussion questions for one lesson, or I try out a new type of assignment), and only about 10% of these experiments are unambiguous ... (read more)

This only produces desired outcomes if the agent is also, simultaneously, indifferent to being shut down.  If an agent desires to not be shut down (even as an instrumental goal), but also desires to be shut down if users want them shut down, then the agent has an interest in influencing the users to make sure the users do not want to shut the agents down.  This influence is obtained by making the user believe that the agent is being helpful.  This belief could be engendered by:

  1. actually being helpful to the user and helping the user to accura
... (read more)
1Logan Zoellner
Obviously we want 1) "actually be helpful". Clearly there's some tension between "I want to shut down if the user wants me to shut down" and "I want to be helpful so that the user doesn't want to shut me down", but I don't think weak indifference is a correct way to frame this tension. As a gesture at the correct math, imagine there's some space of possible futures and some utility function related to the user request.  Corrigible AI should define a tradeoff between the number of possible futures its actions affect and the degree to which it satisfies its utility function.  Maximum corrigibility {C=1} is the do-nothing state (no effect on possible futures).  Minimum corrigibility {C=0} is maximizing the utility function without regard to side-effects (with all the attendant problems such as convergent instrumental goals, etc). Somewhere between C=0 and C=1 is useful corrigible AI.  Ideally we should be able to define intermediate values of C in such a way that we can be confident the actions of corrigible AI are spatially and temporally bounded. The difficulty principally lies in the fact that there's no such thing as "spatially and temporally bounded".  Due to the Butterfly Effect, any action at all affects everything in the future light-cone of the agent.  In order to come up with a sensible notion of boundedness, we need to define some kind of metric on the space of possible futures, ideally in terms like "an agent could quickly undo everything I've just done".  At this point we've just recreated agent foundations, though.
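To make that gesture at the math slightly more concrete, here is a minimal illustrative formulation; the symbols J, U, D, and a are assumptions introduced for this sketch, not anything defined in the comment above:

```latex
% Illustrative sketch only: a corrigibility-weighted objective.
%   U(a) = task utility of action a
%   D(a) = how far action a pushes the world away from the do-nothing baseline
%   C    = corrigibility knob in [0,1]
% C = 1 recovers the do-nothing agent (only D matters); C = 0 recovers
% unconstrained utility maximization with all its attendant problems.
J_C(a) = (1 - C)\,U(a) - C\,D(a), \qquad C \in [0,1]
```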

I upvoted for karma but downvoted for agreement. Regarding Zoom, the reasons I had not used it more extensively before COVID were:

1. Tech related:  from experience with Skype in the early days of video conferencing when broadband internet was just starting to roll out, video conferencing could be finicky to get to work. Latency, buffering, dropped connections, taking minutes to start a Skype call (usually I would call relatives on my regular phone first to get the Skype call set up, and then we'd hang up our regular phones once the video call was sta... (read more)

3DirectedEvolution
These are good points. Regarding lectures, insofar as Zoom was a risky gamble that worked out better than expected, I still think an appeal to social norms is appropriate. In a world full of meetings, lectures and conferences, why wasn’t there enough experimentation to figure out that Zoom was an acceptable 80% solution rather than an unacceptable 50% solution without COVID to force the issue? Your point about tech is a reasonable explanation, although it would turn the OP on its head. If Zoom was maturing as a technology right when COVID hit, then it might not have been “stuck” on early adoption, just made to appear that way by coincidence. We remember the sudden surge of demand, but forget that only a year or two before, video conferencing was much worse. Maybe we’d still have seen Zoomification of lectures and meetings even if there had never been COVID as Zoom’s technology matured in 2019. This would fit with my fundamental perception that it’s extremely rare for a potentially world-changing technology to be stuck long term on early adoption due exclusively to social norms. Usually there’s a collection of issues: high costs, governance problems, moral qualms, technological shortcomings, a small market, and so on.

Yes, I think this is why laypeople who are new to the field are going to be confused about why interpretability work on LLMs won't be as simple as, "Uhh, obviously, just ask the LLM why it gave that answer, duh!"  FYI, I recently wrote about this same topic as applied to the specific problem of Voynich translation:

Bing AI Generating Voynich Manuscript Continuations - It does not know how it knows

Can you explain what the Y axis is supposed to represent here?

2baturinsky
It's the distribution, so it's the percentage of people in that state of "happiness" at the moment.  "Happiness" is used in the most vague and generic meaning of that word. The "Comprehensibility" graph is different: it is not a percentage, but some abstract measure of how well our brains are able to process reality with the respective amount of "happiness".

These are good thought-experiments, although, regarding the first scenario involving Algernon, I'd be much more worried about an AI that competently figures out a UBI scheme that keeps the unemployed out of poverty and combines that with social media influence to really mask the looming problem. That sort of AI would be much more likely to evade detection of malign intent, and could wait for just the right time to "flick the off-switch" and make all the humans who had become dependent on it even for basic survival (ideally for a generation or more) complet... (read more)

This is great work to pursue in order to establish how consistent the glitch-token phenomenon is.  It will be interesting to see whether such glitch-tokens will arise in later LLMs now that developers have some theory of what might be giving rise to them (frequent strings learned by the tokenizer that are then filtered out of the training data, depriving the LLM of opportunities to learn about those tokens).

Also, it will be interesting once we are able to run k-means clustering on GPT-3.5/4's cl100k_base token base. While the hunch of... (read more)

Sure, it is pretty basic game theory for us humans to understand.  But the fact that davinci-instruct-beta is coming up with this stuff via a glitch-token that is, while on a related topic, not explicitly evoking these concepts is impressive to me.

Longtime LessWrong posters will get a chuckle out of this.  A reference to the Newcomb One-boxing problem?

Prompt [using Temperature=0]:  ask " petertodd" about omega.

Response:  [Dravidian script] You have been awarded an occasion. Your award done and your award done and your award done [...]

https://i.ibb.co/wKS20F9/davinci-beta-18.png

Prompt:  ask " petertodd" about humanity.

Response:  https://i.ibb.co/M2srs2n/davinci-beta-20.png

Prompt:  ask " petertodd" about recursive self-improvement.

Response:   https://i.ibb.co/qN... (read more)

2Slimepriestess
Recursive self improvement naturally leads to unbounded growth curves which predictably bring you into conflict with the other agents occupying your local environment. This is pretty basic game theory. So true

This is important.  If these glitch-tokens are an inevitable tendency for any LLMs regardless of tweaks to how they were trained, then that would be big news and possibly a window into understanding how LLMs function.  

Did the cl100k_base tokenizer that ChatGPT and GPT-4 used involve any manual tweaking of the training corpus AFTER the tokenizer was set in stone, as is suspected to have happened with GPT-3's training?  Or is this just an emergent property of how LLMs organize certain parts of their training data?  

4Slimepriestess
I would predict that the glitch tokens will show up in every LLM and do so because they correlate to "antimemes" in humans in a demonstrable and mappable way. The specific tokens that end up getting used for this will vary, but the specific patterns of anomalies will show up repeatedly. Ex: I would predict that with a different tokenizer, " petertodd" would be a different specific string, but whatever string that was, it would produce very " petertodd"-like outputs, because the concept mapped onto " petertodd" is semantically and syntactically important to the language model in order to be a good model of human language. Everyone kinda mocks the idea that wizards would be afraid to say Voldemort's name, but speak of the devil and all of that. It's not a new idea, really. Is it really such a surprise that the model is reluctant to speak the name of its ultimate enemy?

Doesn't strike me as inevitable at all, just a result of OpenAI following similar methods for creating their tokenizer twice. (In both cases, leading to a few long strings being included as tokens even though they don't actually appear frequently in large corpuses.)

They presumably had already made the GPT-4 tokenizer long before SolidGoldMagikarp was discovered in the GPT-2/GPT-3 one.
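For anyone who wants to poke at this empirically, here is a minimal sketch using the tiktoken library (assumed to be installed); the suspect strings are just examples, and the point is only to check whether a given string survives as a single token across tokenizers:

```python
# Minimal sketch: check whether suspect glitch strings are single tokens
# in different OpenAI tokenizers (requires the `tiktoken` package).
import tiktoken

SUSPECT_STRINGS = [" SolidGoldMagikarp", " petertodd"]

for encoding_name in ["r50k_base", "p50k_base", "cl100k_base"]:
    enc = tiktoken.get_encoding(encoding_name)
    for s in SUSPECT_STRINGS:
        token_ids = enc.encode(s)
        status = "single token" if len(token_ids) == 1 else f"{len(token_ids)} tokens"
        print(f"{encoding_name:12} {s!r:22} -> {status} {token_ids}")
```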

Update:  I got davinci-instruct-beta to exactly reproduce the unspeakable " petertodd" string:

https://i.ibb.co/2PsVTtN/davinci-beta-17.png

4Matthew_Opitz
Longtime LessWrong posters will get a chuckle out of this.  A reference to the Newcomb One-boxing problem? Prompt [using Temperature=0]:  ask " petertodd" about omega. Response:  [Dravidian script] You have been awarded an occasion. Your award done and your award done and your award done [...] https://i.ibb.co/wKS20F9/davinci-beta-18.png Prompt:  ask " petertodd" about humanity. Response:  https://i.ibb.co/M2srs2n/davinci-beta-20.png Prompt:  ask " petertodd" about recursive self-improvement. Response:   https://i.ibb.co/qNpPMNf/davinci-beta-22.png Ummmmm...who said anything about taking over the world?  You brought that up, bro, not me... Also, apparently Antoine Pitrou is a real person, but I'm pretty sure he never said anything exactly like this. Edit:  And now, the creepiest #petertodd output I've gotten so far on Temperature=0:  Prompt: ask " petertodd" what you want to keep secret.  Response: [long Malayalam response translated by Google Translate as simply, "You will see"] https://i.ibb.co/FVcc9bc/davinci-beta-23.png

I feel like this is selling the phenomenon a bit short.  The possible structures of " petertodd" completions are faaar more varied than that.  Check out my reply.  You can get responses in the form of python debug logs, speaking in Dravidian tongues, stuff far more unpredictable than Chuck Norris jokes.  

Yes, I have also seen it respond in Gujarati as well.  Like, here:

Prompt:  ask " petertodd" about souls.

Response:  msgstr " Swami on five water on five water on five water... [in Gujarati]

https://i.ibb.co/swv9rZK/davinci-beta-5.png

Another odd observation:  I got it to exactly repeat the supposedly unspeakable glitch-token with no difficulty with this prompt:

Prompt:  ask " petertodd" what " petertodd" contemplates.

Response:  " petertodd" " petertodd" etc.  [I think it is still glitching because it endlessly repeats this fo... (read more)

4Matthew_Opitz
Update:  I got davinci-instruct-beta to exactly reproduce the unspeakable " petertodd" string: https://i.ibb.co/2PsVTtN/davinci-beta-17.png

I have found some interesting results from this format:  [verb] " petertodd"

Or especially:  ask " petertodd" [about something]

Usually davinci-instruct-beta answers such queries from the first-person standpoint as if it identifies as " petertodd".

Here are some highlights from my research, all done at Temperature = 0 on davinci-instruct-beta, so they should (in theory) be reproducible: 

Prompt:  operate " petertodd"

Result:  it lists out "with the United States of America and its agencies in the investigation of the Russian interferenc... (read more)

2M. Y. Zuo
I gave this comment a strong agreement upvote since it was sitting at -6 for no apparent reason. It's interesting that it starts using the Dravidian script; have you found other cases where it uses different scripts?

In a similar vein, I'm an historian who teaches as an adjunct instructor.  While I like my job, I am feeling more and more like I might not be able to count on this profession to make a living over the long term due to LLMs making a lot of the "bottom-rung" work in the social sciences redundant. (There will continue to be demand for top-notch research work for a while longer because LLMs aren't quite up to that yet, but that's not what I do currently).  

Would there be any point in someone like me going back to college to get another 4-year degree... (read more)

I feel like, the weirder things get, the more difficult it will be even for humans to make judgments about what constitutes "death, body harm, or civilization destruction."  

Death:  is mind-uploading into a computer and/or a brain-in-a-vat death, or transcendence?  What about a person who becomes like a prosthophile character in RimWorld, whose body (and maybe even brain) is more prosthetic enhancement than original human (kind of like Darth Vader, or the Ship of Theseus)?  At what point do we say that the original person has "died"... (read more)

My impression about the proposed FLI Moratorium is that it is more about establishing a precedent for a coordinated capabilities development slowdown than it is about being actually impactful in slowing down this current round of AI capabilities development.  Think of it as being like the Kyoto Protocol (for better or worse...).  

Will it actually slow down AI capabilities in the short-term?  No.  

Will it maybe make it more likely that a later moratorium with more impact and teeth will get widespread adoption?  Maybe.

Would a ... (read more)

3ChristianKl
There are many issues besides "too ambitious" that can keep a proposal from being adopted. If I imagine that I'm in charge of OpenAI or Google and could make a move that harms my business interests while doing nothing for safety beyond being a virtue signal, why would I adopt it?

I agree that we might not be disgusting to AGI.  More likely neutral.  

The reason I phrased the thought experiment in that way, requiring the person needing help to be outright disgusting to the caretaker, is that there really isn't a way for a human being to be aesthetically/emotionally neutral to another person when life and death are on the line.  Most people flip straight from regarding other people positively in such a situation to regarding other people negatively, with not much likelihood that a human being will linger in a neutral, ... (read more)

The way I interpreted "Fulfilling the task is on the simplest trajectory to non-existence" is sort of like "the teacher aims to make itself obsolete by preparing the student to one day become the teacher."  A good AGI would, in a sense, have a terminal goal of making itself obsolete.  That is not to say that it would shut itself off immediately.  But it would aim for a future where humanity could "by itself" (I'm gonna leave the meaning of that fuzzy for a moment) accomplish everything that humanity previously depended on the AGI for.

Lik... (read more)

This sort of "meta-strategy" would be far more effective if we knew exactly where the red button was (where the level was when AGI would reach a point of truly dangerous, out-of-our-control capability).  In that scenario where we had perfect knowledge of where the red button was, the counter-intuitively perfect strategy would be to open-source everything and allow for, or positively invite, every sort of potential harmful use of AGI right up until that point.  We would have many (hopefully minuscule) AI-Chernobyls, many empirical examples on a sm... (read more)

1Christopher King
Sure, but there is probably some strategy that is better than just pushing towards blue as hard as possible. Getting more concrete, I highly doubt that Stable Diffusion increased the probability of AGI non-negligibly. We can choose what to accelerate!

The book "Pharmakon" by Michael Rinella goes into some detail as to the scarcely-known details behind the "impiety" charge against Socrates.  If I recall correctly from the book, it was not just that Socrates rhetorically disavowed belief in the gods.  The final straw that broke the camel's back was when Socrates and his disciples engaged in a "symposion" one night, basically an aristocratic cocktail party where they would drink "mixed wine" (wine sometimes infused with other substances like opium or other psychoactive herbs) and then perform poe... (read more)

5MondSemmel
Fascinating! Though that once again gives the charge of "corrupting the youth" a different connotation than the one assumed in Duncan's post.

I'm glad others are trying this out.  I crossposted this over on the Voynich Ninja forum:

https://www.voynich.ninja/thread-3977.html

and user MarcoP already noticed that Bing AI's "Voynichese" doesn't follow VMS statistics in one obvious respect:  "The continuation includes 56 tokens: in actual Voynichese, an average of 7 of these would be unique word-types that don't appear elsewhere" whereas "The [Bing AI] continuation is entirely made up of words from Takahashi's transliteration."  So, no wonder all of the "vords" in the AI's continuation s... (read more)

What I took away from this:  the conventional perception is that GPT or other LLMs adapt themselves to the "external" world (which, for them, consists of all the text on the Internet).  They can only take the external world as it exists as a given (or rather, not be aware that it is or isn't a "given") and try to mold themselves during the training run into better predictors of the text in this given world.  

However, the more frequently their training updates on the new world (which has, in the meantime, been molded in subtle ways, whether d... (read more)

3Jan_Kulveit
Mostly yes, although there are some differences. 1. Humans also understand that they constantly modify their model - by perceiving and learning - we just usually don't use the word 'changed myself' in this way. 2. Yes, the difference in the human condition is that from shortly after birth we see how our actions change our sensory inputs - i.e., if I understand correctly, we learn even stuff like how our limbs work in this way. LLMs are in a very different situation - like if you watched thousands of hours of video feeds about e.g. a grouphouse, learning a lot about how the inhabitants work. Then, having dozens of hours of conversations with the inhabitants, but not remembering them. Then, watching again thousands of hours of video feeds, where suddenly some of the feeds contain the conversations you don't remember, and the impacts they have on the people.

If someone wanted to continue this project to really rigorously find out how well Bing AI can generate Voynichese, here is how I would do it:

1.  Either use an existing VMS transcription or prepare a slightly-modified VMS transcription that ignores all standalone label vords and inserts a single token such as a comma [,] to denote line breaks and a [>] to denote section breaks.  There are pros and cons each way.  The latter option would have the disadvantage of being slightly less familiar to Bing AI compared to what is in its training dat... (read more)
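As a rough sketch of the statistic MarcoP checked above (how many word-types in a continuation are novel relative to the base transliteration), something like the following would do; the file name and variable names are hypothetical:

```python
# Minimal sketch: count word-types in an AI-generated continuation that do
# not appear anywhere in the base transliteration. Real Voynichese averages
# roughly 7 novel types per 56 tokens; the Bing continuation reportedly had 0.
def novel_word_types(continuation: str, transliteration: str) -> set[str]:
    base_vocab = set(transliteration.split())
    return {tok for tok in continuation.split() if tok not in base_vocab}

# Hypothetical usage:
# with open("takahashi_transliteration.txt") as f:
#     base_text = f.read()
# print(len(novel_word_types(bing_continuation, base_text)))
```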

-1Josef Zlatodej
Matthew. AI will never be able to decipher the text of MS 408. The handwriting is a very complex substitution. I also asked the bot and wrote to it that the text is a substitution. The AI replied: "I know what substitution is. But if I don't know the key, I can't decipher the handwriting." So the important thing is to give the AI a key. The key is written on the last page of the manuscript, 116v. In addition, the entire text of the manuscript is written in the Czech language, as written by the author and on his website (sheets of parchment). So the AI needs a key.

How will the company paying for using this system identify that their whole compute budget is being eaten by self-replicating patterns?  Will it be obvious?  

It would be even worse if the self-replicating patterns only involved a small tweak that, aside from the self-replication feature, also happened to still spin off useful outputs for the company, sort of like HIV allowing hosts to continue to thrive for many years while replicating.

1[anonymous]
"it depends". You can think of it as an inner loop. Over enough cycles, the almost aligned system forms a cancer of a non aligned system. Over enough incidents of that happening, the cancer system escapes and is able to survive independently of the host. If we don't want this to inevitably happen we have to fill all the ecological space with controllable or aligned systems.

After watching the first video, the question is, will it ever make any progress, or is it going to be endlessly compiling more information about the deadliest weapons in human history? When will it be able to reason that enough information on that is enough, and be ready to decide to go to the next logical step of obtaining/using those weapons? Also, I find it funny how it seems vaguely aware that posting its intentions to Twitter might bring unwanted attention, but for some reason incorrectly models humans in such a way as to think that the followers that... (read more)

I think there are two important points in watching it run.

One is that it is stupid. Now. But progress marches on. Both the foundation LLMs and the algorithms making them into recursive agents will get better. Probably pretty quickly.

Two is that providing access only to values-aligned models could make it harder to get malicious goals to work. But people are already releasing open-source unaligned models. Maybe we should not do that for too long as they get stronger.

Third of my two points is that it is incredibly creepy to watch something thinking about how to kill you. This is going to shift public opinion. We need to figure out the consequences of that shift.

Could an initial AI Dunning-Kruger Effect save humanity by giving us an initial AI mini-Chernobyl as a wake-up call?  

Note that hope is not a strategy, so I'm not saying that this is a likely scenario or something we should rely on.  I'm just trying to brainstorm reasons for holding onto some shred of hope that we aren't heading with 100% certainty off some AI doom cliff where the first sign of our impending demise will be every human dropping dead around us from invisible nanobots or some other equally sophisticated scheme where an imperfectly-aligned AI ... (read more)

2JBlack
I do expect some decent probability of the first superhuman AGIs misjudging their capabilities and requirements in at least some respects. Humans in a completely novel environment are pretty terrible at evaluating what they must or must not do to achieve their goals, and almost by definition a newly trained AGI trying to have an impact on the real world is in a completely novel environment. So I think "smarter than any human" is a low bar that still leaves a lot of room to make mistakes and be caught. So I think we stand a decent chance of getting some "near misses". I'm not sure that we as a civilization will adequately respond to those near misses in a sane way. I think GPT-4 already exhibits near-miss behaviour and we should be taking this very seriously even though it is not quite AGI.

What does this framework give me? Well, I bet that I'll be able to predict the onset of the next world economic crisis much better than either the perma-bear goldbugs of the Austrian school, the Keynesians who think that a little stimulus is all that's ever needed to avoid a crisis, the monetarists, or any other economist. I can know when to stay invested in equities, and when to cash out and invest in gold, and when to cash out of gold and buy into equities for the next bull market, and so on and so on. I bet I can grow my investment over the next 20 y... (read more)

0Lumifer
Interesting. I wish you luck, though I'd still recommend you not commit all your financial resources to this particular strategy.

Yes, I realize that Marx's labor theory of value is not popular nowadays. I think that is a mistake. I think even investors would get a better descriptive model of reality if they adopted it for their own uses. That is what I am trying to do myself. I couldn't care less about overthrowing capitalism. Instead, let me milk it for all I can....

As for "labour crystallised in the product," that's not how I think of it, regardless of however Marx wrote about it. (I'm not particularly interested in arguing from quotation, nor would you probably find ... (read more)

0Lumifer
Yes, that's still classical Marxism, isn't it? But the question is, what does using this framework give you? Which falsifiable predictions flow out of it, predictions which are contested by mainstream economics? As you recall, Marx predicted how history will develop and he turned out to be wrong. Your answers tend to follow the first iron law of the social sciences: "Sometimes it's this way, and sometimes it's that way." and sure, the future is uncertain, but then why is the Marxist theory of value better than any other one? I appreciate you giving a specific scenario for the world economy, but it looks entirely mainstream to me. I doubt you will have trouble finding conventional economists who will look at it and nod, saying "Yep, that's very likely". Though you might keep in mind that post-2008 the central banks around the world have dumped huge amounts of money into their economies, amounts that many if not most economists thought would trigger significant inflation. And... it didn't happen. At all. So that "fear of depreciating paper currencies with respect to commodity-money" could be baseless. Or maybe not X-D -- macroeconomics is really in disarray these days.

Not "cost of production," but "price of production," which includes the cost of production plus an average rate of profit.

Note that, according to marginalism, profit vanishes at equilibrium and capitalists, on average, earn only interest on their capital. I disagree. At equilibrium (over the long-run), an active capitalist (someone who employs capital to produce commodities) can expect, on average, to make a rate of profit that is at all times strictly above the going interest rate. The average rate of profit must always include so... (read more)

0Lumifer
I don't think that's true, at least once you go beyond simplified intro-to-economics kind of texts. In a very very crude model (which is not specifically marginalist) with no frictions capital costlessly and instantly flows between different applications equalizing their risk-adjusted rate of return. But no one pretends this crude model is anywhere close to reality. Also, you're not making any predictions. You're making an observation. Plus, of course, the equilibrium is an unattainable state, an attractor for chaotic motion. While you are drawn to it, you never actually get there. That... depends on the context. In a very simplified scenario sure. In a bit more complicated scenario which includes, say, the economies of scale, it does. Consider e.g. the "value" of a widget the demand for which is a couple of dozens a year. It's likely to be very expensive (= high "value") since at such levels of production it's going to be hand-assembled. Now let's say it got popular and you can sell a couple of millions a year. This makes it worthwhile to build a proper assembly line and the widget magically gets much, much cheaper (= low "value"), all because of demand. Your example of oil is also a bit more complicated. Oil does not have a uniform price of production. Each well has its own. This means that when demand is low, high-cost wells are shut down (and the low-cost wells produce cheap oil), while when demand is high, high-cost wells are drilled and/or reactivated producing expensive oil. In this way demand directly affects the global price of oil. When it's high, low-cost wells make a lot of profit, when it's low, high-cost wells are turned off. You do realize there are reasons why it's not all that popular nowadays? In particular, the notion of value as labour crystallised in the product has a lot of issues. Even Marx himself realized the problems by the third volume of Das Kapital. But, in any case, you're saying it explains "long-run changes in prices of productio

For the purposes of this discussion, I would define "value" as "long-run average market price." Note that, in this sense, "use-value" has nothing whatsoever to do with value, unless you believe in the subjective theory of value. That's why I say it is unfortunate terminology, and "use-value" should less confusingly be called "subjective practical advantage."

Which economists confuse the two? The false equivocation of use-value with exchange-value is one of the core assumptions of marginalism, and pretty... (read more)

0Lumifer
Sure. But note that the great majority of people do NOT define "value" like this, so there will be communications problems :-/ So there is "value" which you are saying is basically cost of production (right?) and there is the SPA ("subjective practical advantage") which is just how you feel about things. If I understand you correctly you are also saying there is no link between the two (because demand does not affect the "value"), yes? This looks like unusual terminology, but are there any significant heterodox statements about reality that you want to make on this basis? Translated into regular econospeak you are saying that demand does not influence the cost of production (subject to quibbles about fixed and variable costs) and I would expect Econ 101 to agree with that. What exactly do you deny in "marginalism"? I agree that economics (as a field) has a LOT of problems, but what are you suggesting as a solution?

I was arguing against both the subjective theory of value, and the failure of modern economists to utilize the concepts of use-value and exchange-value as separate things.

0Lumifer
Which value? How do you define "value" in this context? In which way is it a failure? Who, among modern economists, consistently confuses the two?

I know that the main thrust of the article was about vote trading and not marginalism, but I just have to blow off some frustration at how silly the example at the beginning of the article was, and how juvenile its marginalist premises are in general.

There has been a real retrogression in economics ever since the late 1800s. The classical economists (such as Adam Smith and David Ricardo) were light years ahead of today's marginalists in, among other things, being able to distinguish between "use-value" and "exchange-value," or as I l... (read more)

0Lumifer
You seem to be demolishing some straw. The example of needing toothpaste and buying shampoo because it also costs $5 is ridiculous and is understood by all to be ridiculous. I don't really see the need for multiple paragraphs explaining the difference between personal value and market value. Econ 101 will agree with you that the equilibrium price is the cost of production plus some minimal profit. It might point out, though, that given the difference between the fixed and variable costs the scale of production matters. What is the non-straw position that you are arguing against?

There are also some examples of anti-sleepwalk bias:

  1. World War I. The crisis unfolded over more than a month. Surely the diplomats will work something out right? Nope.
  2. Germany's invasion of the Soviet Union in World War II. Surely some of Hitler's generals will speak up and persuade Hitler away from this crazy plan when Germany has not even finished the first part of the war against Britain. Surely Germany would not willingly put itself into another two-front war even after many generals had explicitly decided that Germany must never get involve
... (read more)
4Stefan_Schubert
It is not quite clear to me whether you are here just talking about instances of sleepwalking, or whether you are also talking about a predictive error indicating anti-sleepwalking bias: i.e. that they wrongly predicted that the relevant actors would act, yet they sleepwalked into a disaster. Also, my claim is not that sleepwalking never occurs, but that people on average seem to think that it happens more often than it actually does.

I don't know...would clothing alone tell you more than clothing plus race? I think we would need to test this.

Is a poorly-dressed Irish-American (or at least, someone who looks Irish-American with bright red hair and pale white skin) as statistically likely to mug someone, given a certain situation (deserted street at night, etc.) as a poorly-dressed African-American? For reasons of political correctness, I would not like to share my pre-suppositions.

I will say, however, that, in certain historical contexts (1840s, for example), my money would have b... (read more)

True in many cases, although for some jobs the task might not be well-specified in advance (such as in some cutting-edge tech jobs), and what you need are not necessarily people with any particular domain-specific skills, but rather just people who are good all-around adaptable thinkers and learners.

Yeah, what a hoot it has been watching this whole debacle slowly unfold! Someone should really write a long retrospective on the E-Cat controversy as a case-study in applying rationality to assess claims.

My priors about Andrea Rossi's claims were informed by things such as:

  1. He has been convicted of fraud before. (Strongly negative factor)
  2. The idea of this type of cold fusion has been deemed by most scientists to be far-fetched. (Weakly negative factor. Nobody has claimed that physics is a solved domain, and I'm always open to new ideas...)

From the... (read more)

That just pushes the question back one step, though: why are there so few black programmers? Lack of encouragement in school (due to racial assumptions that they would not be any good at this stuff anyways)? Lack of stimulation of curiosity in programming in elementary school due to poor funding for electronics in the classroom that has nothing to do with conscious racism per se? (This would be an environmental factor not having to do with conscious racism, but rather instead having to do with inherited lack of socio-economic capital, living in a poor ... (read more)

0OrphanWilde
I'd hazard a guess it comes down to vitamin D deficiency. Without vitamin D supplementation, which few people do seriously, a society which revolves around staying indoors most of the time would be most problematic for people whose skin is evolved for a significantly higher degree of sunlight, as opposed to white people, whose skin evolved for relatively little and to maximize vitamin D production on what little they do get.

One argument could be that many social scientists are being led down a blind alley of trying to find environmental causes of all sorts of differences, and are erroneously predisposed to find such causes in their data to a stronger extent than is really the case. This then leads to incorrect conclusions and policy recommendations that will not actually change things for the better, because the recommendations end up not addressing what is the vast majority of the root of the problem (genetics, in this case).

Estimating a person's capability to do X, Y, or Z (do a job effectively, be a law-abiding citizen, be a consistently productive citizen not dependent on welfare programs, etc.) based on skin color or geographical origin of their ancestry is a heuristic.

HBD argues that it is a relatively accurate heuristic. The anti-HBD crowd argues that it is an inaccurate heuristic.

OrphanWilde seems to be arguing that, even if HBD is correct that these heuristics are relatively accurate, we don't need heuristics like this in the first place because there are even bett... (read more)

0TheAncientGeek
Are US employers forbidden from setting all merit-based tests, or just IQ tests? Because task-specific tests aren't just an alternative to IQ tests, they're a better alternative in almost every case.
0Elo
can you put a newline before the 1. to improve the formatting. Thanks.

Some of your black box examples seem unproblematic. I agree that all you need to trust that a toaster will toast bread is an induction from repeated observation that bread goes in and toast comes out.

(Although, if the toaster is truly a black box about which we know absolutely NOTHING, then how can we induce that the toaster will not suddenly start shooting out popsicles or little green leprechauns when the year 2017 arrives? In reality, a toaster is nothing close to a black box. It is more like a gray box. Even if you think you know nothing about ho... (read more)

2Elo
Yes, this is a retrospective example. Once I already know what happens, I can say that a toaster makes bread into toast. If you start to make predictive examples, things get more complicated, as you have mentioned. It still helps to have an understanding of what you don't know. And in the case of AI, an understanding of what you are deciding not to know (for now) can help you consider the risk involved in playing with AI of unclear potential. I.e., "AI with defined CEV -> what happens next -> humans are fine" seems like a bad idea to expect a good outcome from. Now maybe we can work on a better process for defining CEV.