
Infra-Bayesian physicalism: a formal theory of naturalized induction

A theory of physics is mathematically quite similar to a cellular automaton. This theory will usually be incomplete, something that we can represent in infra-Bayesianism by Knightian uncertainty. So, the "cellular automaton" has underspecified time evolution.

What evidence is there that incomplete models with Knightian uncertainty are a way to turn rough models of physics into loss functions? Can the ideas behind it be applied to regular Bayesianism?
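To make the quoted idea of "underspecified time evolution" concrete, here is a minimal sketch and not infra-Bayesianism's actual formalism. It assumes a 1D binary cellular automaton whose rule table has some entries left open, treats every consistent completion as an equally admissible hypothesis with no probabilities attached, and scores a starting state by its worst-case loss over those hypotheses. The table `PARTIAL_RULE`, the helper names, and the choice of loss are all hypothetical.

```python
from itertools import product

# Partial rule table for a 1D binary cellular automaton (radius 1, wrap-around).
# None marks an entry the "theory" leaves unspecified. The known entries follow
# Rule 110; leaving two entries open is an illustrative assumption.
PARTIAL_RULE = {
    (1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): None, (1, 0, 0): 0,
    (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): None, (0, 0, 0): 0,
}

def completions(partial):
    """Yield every fully specified rule consistent with the partial one."""
    unknown = [k for k, v in partial.items() if v is None]
    for bits in product((0, 1), repeat=len(unknown)):
        rule = dict(partial)
        rule.update(zip(unknown, bits))
        yield rule

def step(state, rule):
    """Apply one synchronous update with periodic boundary conditions."""
    n = len(state)
    return tuple(
        rule[(state[(i - 1) % n], state[i], state[(i + 1) % n])]
        for i in range(n)
    )

def run(state, rule, steps):
    """Evolve a state for a fixed number of steps under one complete rule."""
    for _ in range(steps):
        state = step(state, rule)
    return state

def reachable(state, partial, steps):
    """Set of final states over all completions (no probabilities assigned)."""
    return {run(state, rule, steps) for rule in completions(partial)}

def worst_case_loss(state, partial, steps, loss):
    """Knightian-style evaluation: the maximum loss over admissible rules."""
    return max(loss(s) for s in reachable(state, partial, steps))

if __name__ == "__main__":
    start = (0, 0, 0, 1, 0)
    print(reachable(start, PARTIAL_RULE, steps=1))
    print(worst_case_loss(start, PARTIAL_RULE, steps=1, loss=sum))
```

A regular Bayesian version would instead put a prior over the completions and average the loss, which is one way to see what the set-valued treatment changes.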

Unifying Bargaining Notions (2/2)

bargo2y10

bolution

solution

Logical Probability of Goldbach’s Conjecture: Provable Rule or Coincidence?

bargo2y30

https://arxiv.org/abs/2211.06738 is related

2avturchin2y

Thanks, will check

Self-Reference Breaks the Orthogonality Thesis

bargo2y32

Sure, I mean that it is an implementation of what you mentioned in the third-to-last paragraph.

Self-Reference Breaks the Orthogonality Thesis

bargo2y50

Congratulations, you discovered Active Inference!

4lsusr2y

Do you mean the free energy principle?

Curiosity as a Solution to AGI Alignment

bargo2y21

Solomonoff Induction and Machine Learning. How would you formulate this in terms of a machine that can only predict future observations?

Curiosity as a Solution to AGI Alignment

bargo2y10

I want to provide feedback, but can't see the actual definition of the objective function in either of the cases. Can you write down a sketch of how this would be implemented using existing primitives (SI, ML) so I can argue against what you're really intending?

Some preliminary thoughts:

Curiosity (obtaining information) is an instrumental goal, so I'm not sure if making it more important will produce more or less aligned systems. How will you trade off curiosity and satisfaction of human values?
It's difficult to specify correctly - depending on what you me...

1Harsha G.2y

Good point. Will have to think on this more. What do you mean by (SI, ML)? Can you link me to articles around this so I can define this better?

bargo2y10

Hmm, can you elaborate on what you mean in the last sentence?

2Ericf2y

Making choices between domains in pursuit of abstract goals: Say I have an agent with the goal of "win $ in online poker" and read/write access to the internet. Obviously that agent will simulate millions of games, and play thousands of hands online to learn more about poker and get better. What I don't expect to ever see (without explicit coding by a human) is that "win $ at poker" AI looking up instructional YouTube videos to learn from human experts, or telling its handlers to set up additional hardware for it, or writing child AI programs with different strategies and having them play against each other, or trading crypto during a poker game because that is another way to "win $," or even coding and launching a new poker-playing website. I would barely expect it to find new sites where it could play, and be able to join those sites.


All of bargo's Comments + Replies