How can I remove an estimate I created with an accidental click? (Such an accidental click is easy to make on mobile, especially because the way reactions work there has habituated me to tapping to reveal hidden information without expecting the tap to perform an action.)
This doesn't work. (Recording is Linux Firefox; same thing happens in Android Chrome.)
An error is logged when I click a second time (and not when I click on a different probability):
[GraphQL error]: Message: null value in column "prediction" of relation "ElicitQuestionPredictions" violates not-null constraint, Location: line 2, col 3, Path: MakeElicitPrediction instrument.ts:129:35
Sorry about that, a fix is in progress. Unmaking a prediction will no longer crash. The UI will incorrectly display the cancelled prediction in the leftmost bucket; that will be fixed in a few minutes without you needing to re-do any predictions.
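(For the curious: the error above is the unmake action writing a null probability into a column that has a NOT NULL constraint. A hypothetical sketch of the general shape of such a fix, treating "unmake" as deleting the row rather than nulling it out; the db helper and column names here are made up for illustration, and the real resolver may work differently:)

// Hypothetical sketch only: "db" is a made-up query helper, and the real
// MakeElicitPrediction resolver may handle cancellation differently.
async function makeOrUnmakePrediction(
  db: { run: (sql: string, params: unknown[]) => Promise<void> },
  questionId: string,
  userId: string,
  prediction: number | null
): Promise<void> {
  if (prediction === null) {
    // Cancelling: delete the row instead of inserting a null prediction,
    // which would violate the NOT NULL constraint from the error above.
    await db.run(
      'DELETE FROM "ElicitQuestionPredictions" WHERE "questionId" = $1 AND "userId" = $2',
      [questionId, userId]
    );
  } else {
    await db.run(
      'INSERT INTO "ElicitQuestionPredictions" ("questionId", "userId", "prediction") VALUES ($1, $2, $3)',
      [questionId, userId, prediction]
    );
  }
}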
FWIW I would prefer to have to click to show the distribution, so I could vote before being anchored.
You can change this in your user settings! It's in the Site Customization section; it's labelled "Hide other users' Elicit predictions until I have predicted myself". (Our Claims feature is no longer linked to Elicit, but this setting carries over from back when it was.)
Bug report: It does have the amusing property that, if you hover over a part of the claim where others have left votes, the text underneath vanishes. Normally it would be replaced with the names of the users who voted, but now it shows no text. This doesn't reveal the key identity bits, but does reveal non-zero bits about what others think.
I'm somewhat surprised to see the distribution of predictions for 75% on FrontierMath. Does anyone want to bet money on this at, say, 2:1 odds (my two dollars that this won't happen against your one that it will)?
(Edit: I guess the wording doesn’t exclude something like AlphaProof, which I wasn’t considering. I think I might bet 1:1 odds if systems targeted at math are included, as opposed to general purpose models?)
oh, I like this feature a lot!
what's the plan for how scoring / checking resolved predictions will work?
I don't think I would want any kind of site-wide scoring, since I explicitly don't want people to have to worry too much about operationalization.
I do think if people keep using the feature, we'll add some way to resolve things, and resolved markets should have some way of showing whose predictions were right, and send people notifications. And maybe we'll also show people's predictions on their profiles at some point.
If the thing really takes off, of course there is a whole world of integrating prediction things more deeply into LW, but that seems somewhat unlikely given Manifold's existence and success.
We don't have any plans yet; we might circle back in a year and build a leaderboard, or we might not. (It's also possible for third-parties to do that with our API). If we do anything like that, I promise the scoring will be incentive-compatible.
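To illustrate what incentive-compatible means here: any proper scoring rule, such as the Brier or log score, rewards you most (in expectation) for reporting your true probability. A toy sketch in TypeScript, purely illustrative and not anything that's actually built:

// Toy illustration of a proper (incentive-compatible) scoring rule.
// Lower Brier scores are better; your expected score is minimized
// by reporting your true probability.
interface ResolvedPrediction {
  probability: number; // the probability the user gave, in [0, 1]
  outcome: boolean;    // how the question actually resolved
}

function brierScore(predictions: ResolvedPrediction[]): number {
  const total = predictions.reduce(
    (sum, p) => sum + (p.probability - (p.outcome ? 1 : 0)) ** 2,
    0
  );
  return total / predictions.length;
}

// Example: confident-and-right is penalized much less than confident-and-wrong.
console.log(
  brierScore([
    { probability: 0.9, outcome: true },
    { probability: 0.75, outcome: false },
  ])
);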
. . . Okay, I'll bite.
Edit: And-
Kinda. There's source code here and you can poke around the API in GraphiQL. (We don't promise not to change things without warning.) When you fetch the HTML content of a post or comment, it will contain elements that look like <div data-elicit-id="tYHTHHcAdR4W4XzHC">Prediction</div>
(the attribute name is a holdover from when we had an offsite integration with Elicit). For example, your prediction "Somebody (possibly Screwtape) builds an integration between Fatebook.io and the LessWrong prediction UI by the end of July 2025" has ID tYHTHHcAdR4W4XzHC. A GraphQL query to get the results:
query GetPrediction {
  ElicitBlockData(questionId: "tYHTHHcAdR4W4XzHC") {
    _id
    predictions {
      createdAt
      creator { displayName }
    }
  }
}
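If you want to run that query from a script rather than from GraphiQL, here's a minimal sketch using fetch; the /graphql endpoint path and the exact response shape are assumptions on my part, so check them against GraphiQL before relying on them.

// Minimal sketch: POST the query above to the GraphQL endpoint.
// The endpoint path and response shape are assumptions; verify in GraphiQL.
const query = `
  query GetPrediction {
    ElicitBlockData(questionId: "tYHTHHcAdR4W4XzHC") {
      _id
      predictions {
        createdAt
        creator { displayName }
      }
    }
  }
`;

async function fetchPrediction(): Promise<void> {
  const response = await fetch("https://www.lesswrong.com/graphql", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  });
  const { data } = await response.json();
  // List who has predicted on this question and when.
  for (const p of data.ElicitBlockData.predictions) {
    console.log(p.creator.displayName, p.createdAt);
  }
}

fetchPrediction().catch(console.error);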
One prediction I'm interested in that's related to o3 is how long it will take for an AI to achieve a superhuman Elo rating on Codeforces.
OpenAI claims that o3 achieved a Codeforces Elo of 2727, which is in the 99.9th percentile, but the best human competitor in the world right now has an Elo of 3985. If an AI achieved an Elo of 4000 or more, it would be the best entity in the world at competitive programming, and that would be the "AlphaGo moment" for the field.
An AI system replicating itself seems very unlikely, because AI labs are presumably (and hopefully) guarding against that in particular. However, there are many other dangerous things an AI system could do that aren't self-replication, and they are often worse. It also seems that if self-replication does happen, we're in deep trouble: the labs were trying their hardest to prevent it and failed, leaving us with a self-replicating, non-aligned AI.
2024 is drawing to a close, which means it's an opportune time to make predictions about 2025. It's also a great time to put probabilities on those predictions, so we can later prove our calibration (or lack thereof).
We just shipped a LessWrong feature to make this easy. Simply highlight a sentence in your comment and click the crystal-ball icon on the toolbar to turn it into a prediction that everyone (who's logged in) can put probability estimates on. The result will look like this:
Some more probabilities that seem cool to elicit (basically all about AI, because that's what's on my mind, but it would be great to have some less AI-focused predictions from others)[1]:
Unless otherwise specified, assume all predictions are about the state of the world at midnight PT, Dec 31st, 2025. Also, some things won't be perfectly operationalized; assume that I'll judge resolution using my best judgement.