All of Sune's Comments + Replies

Sune70

I use ChatGPT and Claude to try to learn Macedonian, because there is very little learning material available for that language. For example, they can (with a few errors sometimes) explain grammatical concepts or give me sentences to translate. I have not found a good way of storing a description of my abilities and weaknesses across conversations, but within a conversation they are good at adapting the difficulty of the questions to the quality of my answers.
Unfortunately I’m not aware of any tools that can pronounce or transcribe Macedonian. …

Sune80

What is the definition of a Dyson Swarm? Is it really easier to define, or just easier to see that we are not there, only because we are not close yet?

6Noosphere89
Unfortunately, I fear this applies to basically everything I could in principle make a benchmark around, mostly because of my own limited abilities.
Sune10

That is assuming you live sufficiently long. The point of life insurance is to make sure you leave something for your kids/spouse if you die soon.

2kqr
It is under no such assumption! If you have sufficient wealth you will leave something even if you die early, by virtue of already having the wealth. If it's easier, think of it as the child guarding the parent's money and deciding whether to place a hedging bet on their parent's death or not -- using said parent's money. Using the same Kelly formula we'll find there is some parental wealth at which it pays more to let it compound instead of using it to pay for premia.
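To make the claim concrete, here is a minimal sketch of that comparison under assumptions I am inventing (the death probability, premium, payout, and the value of the parent's future earnings are all made up): it maximizes expected log wealth and looks for the wealth level at which skipping the policy starts to win.

```python
import numpy as np

# Toy Kelly-style comparison: insure vs. self-insure, from the family's point of view.
# All numbers are made-up assumptions, not actuarial data.
q = 0.01            # probability the parent dies this year
H = 500_000.0       # present value of the parent's future earnings
payout = 500_000.0  # insurance payout on death
premium = 7_500.0   # yearly premium (note: > q * payout, so negative EV in money terms)

def expected_log_wealth(wealth: float, insured: bool) -> float:
    c = premium if insured else 0.0
    p = payout if insured else 0.0
    die = np.log(wealth - c + p)   # future earnings lost; payout (if any) received
    live = np.log(wealth - c + H)  # future earnings keep coming
    return q * die + (1 - q) * live

for w in [50_000, 200_000, 500_000, 1_000_000, 5_000_000]:
    better = "insure" if expected_log_wealth(w, True) > expected_log_wealth(w, False) else "self-insure"
    print(f"wealth {w:>9,}: {better}")
# With these made-up numbers the answer flips from "insure" to "self-insure"
# somewhere between 200k and 500k: above that, letting the premium compound wins.
```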
Sune30

Why does the plot start at 3x3 instead of 2x2? Of course, it is not common to have games with only one choice, but for Chicken that is what you end up with when removing one option. You could even start the investigation at 2x1 options.

2niplav
You're right. I'll rerun the analysis and include 2x2 games as well.
Sune10

Retracting the comment because I have seen a couple of counterexamples, including myself!

Sune180

The Alcor page has not been updated since 15th December 2022, when a person who died in August 2022 (as well as later data) was added, so if he was signed up there, we should not expect it to be mentioned yet. For CI, the latest update was for a patient who died 29th February 2024, but I can’t see any indication of when that post was made.

5Prometheus
"To the best of my knowledge, Vernor did not get cryopreserved. He has no chance to see the future he envisioned so boldly and imaginatively. The near-future world of Rainbows End is very nearly here... Part of me is upset with myself for not pushing him to make cryonics arrangements. However, he knew about it and made his choice." https://maxmore.substack.com/p/remembering-vernor-vinge 
2Celarix
This doesn't really raise my confidence in Alcor, an organization that's supposed to keep bodies preserved for decades or centuries.
Sune32

My point is that potential parents often care about non-existing people: their potential kids. And once they bring these potential kids into existence, those kids might start caring about a next generation. Similarly, some people/minds will want to expand because that is what their company does, or they would like the experience of exploring a new planet/solar system/galaxy or would like the status of being the first to settle there.

Sune10

Which non-existing person are you referring to?

-2jbash
You can choose or not choose to create more "minds". If you create them, they will exist and have experiences. If you don't create them, then they won't exist and won't have experiences. That means that you're free to not create them based on an "outside" view. You don't have to think about the "inside" experiences of the minds you don't create, because those experiences don't and will never exist. That's still true even on a timeless view; they never exist at any time or place. And it includes not having to worry about whether or not they would, if they existed, find anything meaningful[1]. If you do choose to create them, then of course you have to be concerned with their inner experiences. But those experiences only matter because they actually exist. ---------------------------------------- 1. I truly don't understand why people use that word in this context or exactly what it's supposed to, um, mean. But pick pretty much any answer and it's still true. ↩︎
Sune36

Beyond a certain point, I doubt that the content of the additional minds will be interestingly novel.

Somehow people keep finding meaning in falling in love and starting a family, even when billions of people have already done that before. We also find meaning in pursuing careers that are very similar to what millions of people have done before, or in traveling to destinations that have been visited by millions of tourists. The more similar an activity is to something our ancestors did, the more meaningful it seems.

From the outside, all this looks grabby, but from the inside it feels meaningful.

-2jbash
... but a person who doesn't exist doesn't have an "inside".
Sune1625

There has been enough discussion about timelines that it doesn’t make sense to provide evidence about it in a post like this. Most people on this site have already formed views about timelines, and for many, these are much shorter than 30 years. Hopefully, readers of this site are ready to change their views if strong evidence in either direction appears, but I don’t think it is fair to expect a post like this to also include evidence about timelines.

jessicata16-3

The post is phrased as "do you think it's a good idea to have kids given timelines?". I've said why I'm not convinced timelines should be relevant to having kids. I think if people are getting their views by copying Eliezer Yudkowsky and copying people who copy his views (which I'm not sure if OP is doing) then they should get better epistemology.

Sune10

There is a huge amount of computation going on in this story and, as far as I can tell, not even a single experiment. The end hints that there might be some learning from the protagonist’s experience, at least it is telling its story many times. But I would expect a lot more experimenting, for example with different probe designs and with how much posthumans like different possible negotiated results.

I can see in the story that it makes sense not to experiment with posthumans’ reactions to scenarios, since it might take a long time to send them to the frontier and …

4Richard_Ngo
A mix of deliberate and blind spot. I'm assuming that almost everything related to physical engineering and technological problems has been worked out, and so the stuff remaining is mostly questions about how (virtual) minds and civilizations play out (which are best understood via simulation) and questions about what other universes and other civilizations might look like. But even if the probes aren't running extensive experiments, they're almost certainly learning something from each new experience of colonizing a solar system, and I should have incorporated that somehow.
Sune117

An alternative reason for building telescopes would be to receive updates and more efficient strategies for expanding that were found after the probe was sent out.

4Daniel Kokotajlo
Yeah that's what I assumed the rationale was.
Sune10

How did this happen?! I guess not by rationalists directly trying to influence the pope? But I’m curious to know the process leading up to this.

1Dirichlet-to-Neumann
The pope has advisors. Some may even be young! The Catholic Church has a long intellectual tradition even if it's very different from the one on lesswrong, and it has always been wary of potential misuses of new technologies. So nothing really surprising here for those who are used to Vatican-speak.
Sune10

What does respect mean in this case? That is a word I don’t really understand and seems to be a combination of many different concepts being mixed together.

Sune32

This is also just another way of saying “willing to be vulnerable” (from my answer below) or maybe “decision to be vulnerable”. Many of these answers are just saying the same thing in different words.

Sune84

My favourite definition of trust is “willingness to be vulnerable” and I think this answers most of the questions in the post. For example it explains why trust is a decision that can exist independently from your beliefs: if you think someone is genuinely on your side with probability 95%, you can choose to trust them, by doing something that benefits you in 95% of cases and hurts you in the 5% of cases, or you can decide not to, by taking actions that are better in the 5% of cases. Similarly for trusting a statement about the world.

I think this definition co…
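As a toy illustration of the point, with payoffs I am making up: the belief can stay fixed at 95% while the decision to trust flips depending on what is at stake.

```python
# Made-up payoffs: the belief (95%) is held fixed, but whether trusting is the better
# decision depends on the stakes in each branch, so the decision is separate from the belief.
p_ally = 0.95

def expected_value(trust: bool, downside: float) -> float:
    if trust:   # act as if they are on your side: good if they are, costly if not
        return p_ally * 10 + (1 - p_ally) * downside
    return p_ally * 5 + (1 - p_ally) * 0   # hedge: less upside, no bad branch

print(expected_value(True, -100), expected_value(False, -100))  # 4.5 vs 4.75 -> hedge
print(expected_value(True, -20), expected_value(False, -20))    # 8.5 vs 4.75 -> trust
```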

1Jacob G-W
Yes, this is pretty much how I see trust. It is an abstraction over how much I would think that the other person will do what I would want them to do. Trusting someone means that I don't have to double-check their work and we can work closer and faster together. If I don't trust someone to do something, I have to spend much more time verifying that the thing that they are doing is correct.
Sune10

whilst the Jews (usually) bought their land fair and square, the owners of the land were very rarely the ones who lived and worked on it.

I have heard this before but never understood what it meant. Did the people who worked the land respect the ownership of the previous owners, for example by paying rent or by being employed by the previous owners, but they just did not respect the sale? Or did the people who worked the land consider themselves to be owners, or did they not have the same concept of ownership as we do today?

4Yair Halberstadt
Just read in Morris's Righteous Victims:
1Yair Halberstadt
It seems you've stepped on quite a land mine here, and the following is mostly just vague guesses. As far as I can make out it dates back to the Ottoman land code of 1858 where for various reasons a lot of land was declared owned by the government, which would collect a tax in lieu of rent. So in one case the Ottoman empire sold a large tract of land to a Lebanese Effendi, who then sold it to the Yishuv. There was a village on this land which had been settled for some 60+ years, and despite protests to the Ottoman government the villagers were all evicted by the Jewish settlers. It seems the villagers paid a tithe. I suppose at first they would have paid the tithe to the Ottoman government, which would have seemed normal to them (and more like a tax), then they switched to paying a Lebanese Effendi, which wouldn't have made any difference to them either way. And then suddenly they were sold again, and evicted off their land, which would have felt very wrong given they'd been living there all their life, and viewed it as their land.
Sune90

If someone accidentally uses “he” when they meant “she” or vice versa when talking about a person whose gender they know, it is likely because the speaker’s first language does not distinguish between he and she. This could be Finnish, Estonian, Hungarian and some Turkic languages, and probably also other languages. I haven’t actually used it, but I noticed it with a Finnish speaker.

[This comment is no longer endorsed by its author]
1Sune
Retracting the comment because I have seen a couple of counterexamples, including myself!
3Celarix
Counterpoint: it could also be because the speaker thinks male is default and automatically thinks of an unknown person as male.
1Cole Wyeth
Interesting! I will attempt to verify this and then add it to the list.
Sune*141

The heading of this question is misleading, but I assume I should answer the question and ignore the heading.

P(Global catastrophic risk) What is the probability that the human race will make it to 2100 without any catastrophe that wipes out more than 90% of humanity?

4Screwtape
Yes, answer the question not the heading.
Sune92

You don’t really need the producers to be “idle”, you just have to ensure that if something important shows up, they are ready to work on that. Instead of having idle producers, you can just have them work on lower priority tasks. Has this also been modelled in queueing theory?

6Viliam
You need to make sure that when something important shows up, (1) the system will clearly recognize that this happened, and (2) the producers will actually be able to abandon the lower-priority task quickly. I have seen companies trying to implement this, but what often actually happens is that the manager responsible for the lower-priority task just keeps assigning work to the employees anyway. The underlying cause is that the manager's incentives are misaligned with the company goals -- his bonus depends on getting the lower-priority task done. (How would you set up his incentives?)
Dagon113

Definitely, along with switching costs (if you drop a low-priority to work on a high-priority item, there's some delay and some waste involved).  In many systems, the switching delay/cost is high enough that it's best to just leave some nodes idle.  In others, the low-priority things can be arranged such that dropping/setting-aside is pretty painless.
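A rough simulation of this tradeoff, with all parameters invented by me: one producer, random arrivals of high-priority jobs, and a fixed switching cost paid whenever the producer has to drop low-priority work to pick up a high-priority job.

```python
import random

# One producer; high-priority jobs arrive at random; low-priority work is assumed infinite.
# If the producer fills idle time with low-priority work, it pays `switch_cost` before
# starting the next high-priority job. We measure the average high-priority waiting time.
random.seed(0)

def mean_wait(switch_cost: float, fill_idle_time: bool, n_jobs: int = 200_000,
              arrival_rate: float = 0.5, service_rate: float = 1.0) -> float:
    t = 0.0        # current time
    free_at = 0.0  # when the producer finishes its current high-priority job
    total_wait = 0.0
    for _ in range(n_jobs):
        t += random.expovariate(arrival_rate)  # next high-priority arrival
        start = max(t, free_at)
        if fill_idle_time and free_at < t:
            start += switch_cost               # producer was busy with low-priority work
        total_wait += start - t
        free_at = start + random.expovariate(service_rate)
    return total_wait / n_jobs

for s in [0.0, 0.2, 1.0]:
    print(f"switch cost {s}: idle policy {mean_wait(s, False):.2f}, "
          f"fill-with-low-priority policy {mean_wait(s, True):.2f}")
# With zero switching cost the two policies give the same high-priority wait; the gap
# grows with the switching cost, and the question is whether the extra low-priority
# output is worth it.
```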

Sune30

I have a question that tricks GPT-4, but if I post it I’m afraid it’s going to end up in the training data for GPT-5. I might post it once there is a GPT-n that solves it.

5quetzal_rainbow
You can publish a hash of this question.
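A minimal sketch of that commitment scheme, assuming Python's standard library: post the digest now, and reveal the exact question text (which must match byte-for-byte) once some GPT-n solves it.

```python
import hashlib

question = "My secret GPT-4-tricking question goes here."  # placeholder, not the real question
digest = hashlib.sha256(question.encode("utf-8")).hexdigest()
print(digest)  # publish this string now; keep the question itself private
```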
Answer by Sune31

You can use ChatGPT 3.5 for free with chat history turned off. This way your chats should not be used as training data.

Sune113

The corporate structure of OpenAI was set up as an answer to concerns (about AGI and control over AGIs) which were raised by rationalists. But I don’t think rationalists believed that this structure was a sufficient solution to the problem, anymore than non-rationalists believed it. The rationalists that I have been speaking to were generally mostly sceptical about OpenAI.

6dr_s
Oh, I mean, sure, scepticism about OpenAI was already widespread, no question. But in general it seems to me like there's been too many attempts to be too clever by half from people at least adjacent in ways of thinking to rationalism/EA (like Elon) that go "I want to avoid X-risk but also develop aligned friendly AGI for myself" and the result is almost invariably that it just advances capabilities more than safety. I just think sometimes there's a tendency to underestimate the pull of incentives and how you often can't just have your cake and eat it. I remain convinced that if one wants to avoid X-risk from AGI the safest road is probably to just strongly advocate for not building AGI, and putting it in the same bin as "human cloning" as a fundamentally unethical technology. It's not a great shot, but it's probably the best one at stopping it. Being wishy-washy doesn't pay off.
Sune93

They were not loyal to the board, but it is not clear if they were loyal to The Charter since they were not given any concrete evidence of a conflict between Sam and the Charter.

Sune10

I don’t understand how this is a meaningful attitude to your own private economy. But wanting to donate to someone who needs it more is also a way to spend your money. This would be charity, possibly EA.

Sune40

I have noticed a separate disagreement about what capitalism means, between me and a family member.

I used to think of it as how you handle your private economy. If you are a capitalist, it means that when you have a surplus, you save it up and use it (as capital) to improve your future, i.e. you invest it. The main alternative is to be a consumer, who simply spends it all.

My family member sees capitalism as something like big corporations that advertise and make you spend money on things you don’t need. She sees consumerism and capitalism as basically the same thing, while I see them as complete opposites.

2Viliam
Or have it taken away, and given to someone who is better at spending.
Sune10

Ok, it looks like he was invited into OpenAI’s office for some reason at least: https://twitter.com/sama/status/1726345564059832609

Sune*245

It seems the sources are supporters of Sam Altman. I have not seen any indication of this from the board’s side.

1Sune
Ok, it looks like he was invited into OpenAI’s office for some reason at least: https://twitter.com/sama/status/1726345564059832609
Sune3329

It seems this was a surprise to almost everyone even at OpenAI, so I don’t think it is evidence that there isn’t much information flow between LW and OpenAI.

Sune167

There seems to be an edit error after “If I just stepped forward privately, I tell the people I”. If this post wasn’t about the bystander effect, I would just have hoped someone else would have pointed it out!

1Screwtape
Yep, that sure was an edit error. Thank you for pointing it out!
Sune10

Corollary: don’t trust yourself!

Sune30

Most cryptocurrencies have slow transactions. For AIs, who think and react much faster than humans, the latency would be more of a problem, so I would expect AIs to find a better solution than current cryptocurrencies.

8lc
Current cryptocurrencies are useful because they might be the only vaguely legal way to make the financial agreements that the AI wants, and AIs might have an easier time extending and using them than humans. It's not about it being a good information platform, it's about it avoiding the use of institutional intermediaries that the government pretends are illegal.
Sune138

I don’t find it intuitive at all. It would be intuitive if you started by telling a story describing the situation and asked the LLM to continue the story, and you then sampled randomly from the continuations and counted how many of the continuations would lead to a positive resolution of the question. This should be well-calibrated (assuming the details included in the prompt were representative and that there isn’t a bias in which types of endings such stories have in the training data for the LLM). But this is not what is happening. Instead the model outpu…
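For contrast, here is a sketch of the sampling procedure described above; `sample_continuation` and `leads_to_positive_resolution` are hypothetical stand-ins for an LLM sampling call and a judging step, since the point is only that the probability comes from counting sampled endings rather than from asking the model to print a number.

```python
def sample_continuation(story_prompt: str) -> str:
    raise NotImplementedError("stand-in for a temperature-1 LLM sampling call")

def leads_to_positive_resolution(continuation: str) -> bool:
    raise NotImplementedError("stand-in for judging how the sampled story ends")

def estimate_probability(story_prompt: str, n_samples: int = 100) -> float:
    # Count how many sampled continuations resolve the question positively.
    hits = sum(
        leads_to_positive_resolution(sample_continuation(story_prompt))
        for _ in range(n_samples)
    )
    # Calibrated only if the prompt is representative and sampling is unbiased.
    return hits / n_samples
```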

4dynomight
Thanks, you've 100% convinced me. (Convincing someone that something that (a) is known to be true and (b) they think isn't surprising, actually is surprising is a rare feat, well done!)
4justinpombrio
Yeah, exactly. For example, if humans had a convention of rounding probabilities to the nearest 10% when writing them, then baseline GPT-4 would follow that convention and it would put a cap on the maximum calibration it could achieve. Humans are badly calibrated (right?) and baseline GPT-4 is mimicking humans, so why is it well calibrated? It doesn't follow from its token stream being well calibrated relative to text.
Sune218

Two possible variations of the game that might be worth experimenting with:

  1. Let the adversaries have access to a powerful chess engine. That might make it a better test for what malicious AIs are capable of.
  2. Make the randomisation such that there might not be an honest C. For example, if there is a 1/4 chance that no player C is honest, each adversary would still think that one of the other adversaries might be honest, so they would want to gain player A’s trust, and hence end up being helpful. I think the player Cs might improve player A’s chances of winni…
2Dweomite
For variant 1, do you mean you'd give only the dishonest advisors access to an engine, while the honest advisor has to do without? I'd expect that's an easy win for the dishonest advisors, for the same reason it would be an easy win if the dishonest advisors were simply much better at chess than the honest advisor.

Contrariwise, if you give all advisors access to a chess engine, that seems to me like it might significantly favor the honest advisor, for a couple of reasons:

A. Off-the-shelf engines are going to be more useful for generating honest advice; that is, I expect the honest advisor will be able to leverage it more easily.

  * The honest advisor can just ask for a good move and directly use it; dishonest advisors can't directly ask for good-looking-but-actually-bad moves, and so need to do at least some of the search themselves.
  * The honest advisor can consult the engine to find counter-moves for dishonest recommendations that show why they're bad; dishonest advisors have no obvious way to leverage the engine at all for generating fake problems with honest recommendations. (It might be possible to modify a chess engine, or create a custom interface in front of it, that would make it more useful for dishonest advisors; but this sounds nontrivial.)

B. A lesson I've learned from social deduction board games is that the pro-truth side generally benefits from communicating more details. Fabricating details is generally more expensive than honestly reporting them, and also creates more opportunities to be caught in a contradiction. Engine assistance seems like it will let you ramp up the level of detail in your advice:

  * You can give quantitative scores for different possible moves (adding at least a few bits of entropy per recommendation)
  * You can analyze (and therefore discuss) a larger number of options in the same amount of time. (though perhaps you can shorten time controls to compensate)
  * Note that the player can ask advisors for more detail
Ericf141

Agree that closer to reality would be one advisor, who has a secret goal, and player A just has to muddle through against an equal-skill bot while deciding how much advice to take. And playing like 10 games in a row, so the EV of 5 wins can be accurately evaluated against.

Plausible goals to decide randomly between:

  1. Player wins
  2. Player loses
  3. Game is a draw
  4. Player loses their Queen (i.e. opponent still has their queen after all immediate trades and forcing moves are completed)
  5. Player loses on time
  6. Player wins, delivering checkmate with a bishop or knight move
  7. M…
Sune20

Why select a deterministic game with complete information for this? I suspect that in games like poker or backgammon it would be easier for the adversarial advisors to fool the player, and that these games are a better model of the real-world scenario.

1Zane
Agreed that it could be a bit more realistic that way, but the main constraint here is that we need a game where there are three distinct levels of players who always beat each other. The element of luck in games like poker and backgammon makes that harder to guarantee (as suggested by the stats Joern_Stoller brought up). And another issue is that it'll be harder to find a lot of skilled players at different levels from any game that isn't as popular as chess is - even if we find an obscure game that would in theory be a better fit for the experiment, we won't be able to find any Cs for it.
2aphyer
For an entertainingly thematic choice, I'd recommend Twilight Struggle.
2Joe Collman
I'm not sure about poker, but I think for backgammon it'd be harder to get three levels where C beats B beats A reliably. I'm not a backgammon expert, but I could win games against experts - it's enough to be competent and lucky. A may also learn too fast - becoming competent is much faster for backgammon than for chess. (needing a larger sample size due to randomness makes A learning more of a problem - this may apply with poker too??) I have a lot more experience and skill at chess, but it's still pretty simple to find players who'll beat me 90% of the time.
Sune18-1

This seems like the kind of research that can have a huge impact on capabilities, and a much smaller and more indirect impact on alignment/safety. What is your reason for doing it and publishing it?

1[anonymous]
Speaking for myself, I think this research was worth publishing because its benefits to understanding LLMs outweigh its costs from advancing capabilities. In particular, the reversal curse shows us how LLM cognition differs from human cognition in important ways, which can help us understand the "psychology" of LLMs.

I don't think this finding will advance capabilities a lot because:

  * It doesn't seem like a strong impediment to LLM performance (as indicated by the fact that people hadn't noticed it until now).
  * Many facts are presented in both directions during training, so the reversal curse is likely not a big deal in practice.
  * Bidirectional LLMs (e.g. BERT) likely do not suffer from the reversal curse.[1] If solving the reversal curse conferred substantial capabilities gains, people could have taken advantage of this by switching from autoregressive LLMs to bidirectional ones.

1. ^ Since they have to predict "_ is B" in addition to "A is _".
Sune10

How about “prediction sites”? Although that could include other things like 538. Not sure if you want to exclude them.

Sune40

In case you didn’t see the author’s comment below: there is now a patreon button!

Sune10

Sorry, my last comment wasn’t very constructive. I was also confusing two different criticisms:

  1. that some changes in predicted probabilities are due to the deadline getting closer and you need to make sure not to claim that as news, and
  2. that deadlines are not in the headlines and not always in the graphs either.

About 2): I don’t actually think this is much of a problem, if you ensure that the headline is not misleading and that the information about deadlines is easily available. However, if the headline does not contain a deadline, and the deadline is r…

Sune63

I think this is a great project! Have you considered adding a donation button or using Patreon to allow readers to support the project?

I do have one big issue with the current way the information is presented: one of the most important things to take into account when making and interpreting predictions is the timeframe of the question. For example, if you are asking about the probability that Putin loses power, then the probability would likely be twice as high if you consider a 2-year timeframe compared to a 1-year timeframe, assuming the probability …

2vandemonian
Thank you! I've fixed the last headline now. I agree that it was being driven by the resolution date and therefore misleading.

I'm curious what you think of e.g. the Putin headline. Of the 8 Putin markets, some resolve by July, some by October, and most by 2024 -- but all of them show ~90% or better odds (well, Hypermind is at 88%). So even if technically correct, would including the resolution dates really add value? Especially on a data-ink ratio basis? Most prediction markets used on the site resolve by 2024 (unless specified otherwise on the chart). It would feel redundant to include "by 2024" in every single headline?

My subjective impression is that odds start collapsing as the resolution date gets near, more so than some linear decline as the market window slowly closes. Some markets do trend down (peace odds), but others sideways (Putin, Crimea), and others trend up (Russian territory gain). So my current sense is that as long as headlines don't overweight markets closing within say ~2 months, it's probably okay?

Anyway am still very open to feedback on this. Just trying to navigate the trade-off of keeping things simple on an already cluttered site... Thanks again, it was a stimulating post and I think this is going to be an important issue for me to get right. Will keep thinking about it.

Oh, I've also added a Patreon button now. Warning: no benefits yet! Purely to support the mission
4romeostevensit
+1 was looking for a patreon etc.
3Forged Invariant
One thing that I have seen on manifold is markets that will resolve at a random time, with a distribution such that at any time, their expected duration (from the current day, conditional on not having already resolved) is 6 months. They do not seem particularly common, and are not quite equivalent to a market with a deadline exactly 6 months in the future. (I can't seem to find the market.)
Sune64

Shouldn’t you get a notification when there are reactions to your post? At least in the batched notifications. The urgency/importance of reactions is somewhere between replies, where you get the notification immediately, and karma changes, where the default is that it is batched.

Sune10

Can you only react with -1 of a reaction if someone else has already reacted with the +1 version of the reaction?

2Ruby
Actually if you first +1 to apply it yourself, you can then hover and then downvote it. But it will only show up if you hover.
Sune72

Most of the reactions are either positive or negative, but if a comment has several reactions, I find it difficult to see immediately which are positive and which are negative. I’m not sure if this is a disadvantage, because it is slightly harder to get people’s overall valuation of the comment, or if it is actually an advantage because you can’t get the pleasure/pain of learning the overall reaction to your comment without first learning the specific reasons for it.

Another issue: if we (as readers of the reactions) tend to group reactions into positive and neg…

2Kenoubi
The obvious way to quickly and intuitively illustrate whether reactions are positive or negative would seem to be color; another option would be grouping them horizontally or vertically with some kind of separator. The obvious way to quickly and intuitively make it visible which reactions were had by more readers would seem to be showing a copy of the same icon for each person who reacted a certain way, not a number next to the icon. I make no claim that either of these changes would be improvements overall. Clearly the second would require a way to handle large numbers of reactions to the same comment. The icons could get larger or smaller depending on number of that reaction, but small icons would get hard to recognize. Falling back to numbers isn't great either, since it's exactly in the cases where that fallback would happen that the number of a particular reaction has become overwhelmingly high. I think it matters that there are a lot of different reactions possible compared to, say, Facebook, and at the same time, unlike many systems with lots of different reactions, they aren't (standard Unicode) emoji, so you don't get to just transfer existing knowledge of what they mean. And they have important semantic (rather than just emotive) content, so it actually matters if one can quickly tell what they mean. And they partially but not totally overlap with karma and agreement karma; it seems a bit inelegant and crowded to have both, but there are benefits that are hard to achieve with only one. It's a difficult problem.
4Ben
I think that the situation of someone spamming all the "bad" reactions on a post they don't like is essentially the same as the upvote system that already exists. If a post has a fair amount of karma, then a copy of 10 different negative reacts might not mean much.
Sune40

Testing comment. Feel free to react to this however you like, I won’t interpret the reactions as giving feedback to the comment.

Reply5555544333333332222222222222222222111
1Kenoubi
I think this comment demonstrates that the list of reacts should wrap, not extend arbitrarily far to the right.
6Sune
Shouldn’t you get a notification when there are reactions to your post? At least in the batched notifications. The urgency/importance of reactions is somewhere between replies, where you get the notification immediately, and karma changes, where the default is that it is batched.
1Sune
Can you only react with -1 of a reaction if someone else has already reacted with the +1 version of the reaction?
2Raemon
oh no
Sune10

I don't follow the construction. Alice doesn't know x and S when choosing f. If she is taking the preimage for all 2^n values of x, each with a random S, she will have many overlapping preimages.

2interstice
Yes. But I don't see why that's a problem? Which preimage a given x would be assigned to is random. The hope is that repeated trials would give the same preimage frequently enough for it to be a meaningful partition of the input space. How well it would work depends on the details of the ECC but I suspect it would work reasonably well in many cases. You could also just apply the decoder directly to the string x but I thought that might be a bit more unnatural since in reality Bob will never see the full string.
Sune10

I tried and failed to formalize this. Let me sketch the argument, to show where I ran into problems.

Consider a code  with a corresponding decoding function , and assume that  .

For any function  we can define . We then choose  randomly from the  such functions. We want the code to be such that for random  and random  the information  is enough to deduce , with hi…

Sune32

This question is non-trivial even for . Here it becomes: let Alice choose a probability  (which has to be of the form  but this is irrelevant for large ) and Bob observes the binomially distributed number . With which distribution should Alice choose  to maximize the capacity of this channel.
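A rough numerical take on this, with the missing details filled in by assumption: Alice picks p from a finite grid, Bob observes k ~ Binomial(N, p), and the Blahut-Arimoto iteration searches for the input distribution over p that maximizes the mutual information (the channel capacity).

```python
import numpy as np
from scipy.stats import binom

N = 20                                   # number of observations Bob gets (assumed)
ps = np.linspace(0.0, 1.0, 129)          # Alice's possible choices of p (assumed grid)
W = binom.pmf(np.arange(N + 1)[None, :], N, ps[:, None])  # W[i, k] = P(k | p_i)

def kl_rows(q: np.ndarray) -> np.ndarray:
    # D( W(.|p_i) || output distribution ) for each input p_i
    out = q @ W
    ratio = np.where(W > 0, W / out, 1.0)
    return np.sum(W * np.log(ratio), axis=1)

q = np.full(len(ps), 1 / len(ps))        # start from the uniform input distribution
for _ in range(500):                     # Blahut-Arimoto updates
    q = q * np.exp(kl_rows(q))
    q /= q.sum()

capacity_bits = float(q @ kl_rows(q)) / np.log(2)
print(f"capacity ≈ {capacity_bits:.3f} bits with N = {N}")
# The optimizing distribution piles extra weight near p = 0 and p = 1, where the
# binomial observation is least noisy, rather than being uniform.
```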

Sune31

"STEM-level" is a type error: STEM is not a level, it is a domain. Do you mean STEM at highschool-level? At PhD-level? At the level of all of humanity put together but at 100x speed? 

3Rob Bensinger
The definition I give in the post is "AI that has the basic mental machinery required to do par-human reasoning about all the hard sciences". In footnote 3, I suggest the alternative definition "AI that can match smart human performance in a specific hard science field, across all the scientific work humans do in that field". By 'matching smart human performance... across all the scientific work humans do in that field' I don't mean to require that there literally be nothing humans can do that the AI can't match. I do expect this kind of AI to quickly (or immediately) blow humans out of the water, but the threshold I have in mind is more like: STEM-level AGI is AI that's at least as scientifically productive as a human scientist who makes a variety of novel, original contributions to a hard-science field that requires understanding the physical world well. E.g., it can go toe-to-toe with highly productive human scientists on applying its abstract theories to real-world phenomena, using scientific ideas to design new tech, designing physical experiments, operating equipment, and generating new ideas that turn out to be true and that importantly advance the frontiers of our knowledge. The way I'm thinking about the threshold, AI doesn't have to be Nobel-prize-level, but it has to be "fully doing science". I'd also be happy with a definition like 'AI that can reason about the physical world in general', but I think that emphasizing hard-science tasks makes it clearer why I'm not thinking of GPT-4 as 'reasoning about the physical world in general' in the relevant sense.
Sune11

Seems difficult to mark answers to this question.

The type of replies you get, and the skills you are testing, would also depend on how long the subject spends on the test. Did you have a particular time limit in mind?

1Max H
I think timeboxing it to 3 hours or so would be a good standard; maybe a bit more if you're totally unfamiliar with poker. I don't think judging responses would be particularly difficult; even if we don't know what actually happened for certain, you can still judge whether someone used valid rules of inference to reach a plausible estimate. (Judging well requires rationality skills too, of course - rationalists should be more easily convinced of true propositions than false ones, and be able to distinguish invalid reasoning from valid reasoning.) Also, I suspect that most strong rationalists would independently converge to the same probability estimate for approximately the same reasons, if they looked into the matter, which could serve as a baseline.
Sune2-1

This seems to be a copy of an existing one month old post: https://www.lesswrong.com/posts/CvfZrrEokjCu3XHXp/ai-practical-advice-for-the-worried
