All of Matthias G. Mayer's Comments + Replies

It is correct, but I agree the reasoning is not written out properly. I'll write y for 

We are in the case . So if , then , where the last inequality is true because it's equivalent to . So we have , which is a contradiction to the choice of y.

A direct application would need that you have an uncountable variable. You might want to do this if you have enough evidence to say this confidently. As a simple example imagine a real-valued graph where all your data points lie almost on the identity diagonal. You might then want to infer a variable which is the identity.

As a more general application, we want to model infinities because the world is probably infinite in some aspects. We then want a theorem that tells us, that even if the underlying model is infinite, if you have enough data points then you are close enough, like with the Strong law of Large numbers, for example.

Answer by Matthias G. Mayer
30

I'm working on the FFS framework in general. I'm currently writing up decidability of finite temporal inference. After this I will probably start working on efficient finite temporal inference which is what you're referencing if I understood correctly.

I'm also working on extending the framework to the infinite setting and am almost finished except for conditional orthogonality for uncountable sets.

I quite like the name Logical Time Theory, under which I will probably publish those results in a month or so.

1David Reber
  Hmm, what would be the intuition/application behind the uncountable setting? Like, when would one want that (I don't mind if it's niche, I'm just struggling to come up with anything)?

Here, we need to find a variable W such that 

  1. P(W|X,Y) is deterministic, because X and Y already fully describe our sample space. This means P(W|X,Y) is either 0 or 1
  2. Z and W are independent
  3. X and W are dependent
  4. Y and W are dependent

I think your arguments in Section 3 to rule out Graph 3 can't be correct if you accept Graph 2.

To see this, note that there is a symmetry between  and . Namely, if we use FFS temporal inference, then we know that  and  are both before  (and  ).(here we even have&nb... (read more)

3Magdalena Wache
Good point, you are right! I just edited the post and now say that graph 2, 3, and 4 only have instantiations that are equivalent to graph 1.  fixed, thanks!

Two features are orthogonal if their  norm is zero

Just as a side note about terminology: It is a bit imprecise that you use innerproduct and norm interchangeably.

Innerproduct is the function  and the norm is 

The internals of a system of course determine its functional behavior. But there might be different systems that differ only in what they actually do. E.g. different sort algorithms all end up with a sorted list but sort it differently. Likewise, a pathfinding algorithm like Dijkstra is different than checking every possible path and picking the best one. 

Looking only at functional behavior strips you of your ability to make predictions. You only know what has already happened. You can't generalize to new inputs.

This is the actual crux of why we care ... (read more)

1Ilio
Thanks, that clarifies your aims a lot. Did you gave some thoughts on how your approach would deal with cases of embodied cognition and uses of external memories?

What if you had some computation that could be interpreted (e.g. decrypted with two different keys) as either a simulation full of happy people, or a simulation full of depressed people? I think an adequate theory of experience is able to look at the encrypted computation (or any computation) and decide directly if there is suffering happening there.

Also, what is the difference between normal computation and encrypted computation? I feel like looking at a process that you haven't programmed yourself is not really that different than looking at an encrypted... (read more)

1Tamsin Leake
i suspect that those two kinds of computation in fact have a profoundly different shape, such that you can't have something that can convert into either in a simple manner. if i am wrong about this, then alignment is harder than i thought, and i don't know what to think about encrypted computations in such a situation — i guess nuke them just to be safe? we can figure out some parts of normal programs, and sometimes possibly even large parts. whereas, an encrypted computation should be guaranteed to be un-figureout-able without exponential compute. the same way i could figure out some of the meaning of a large text that's been translated into dutch, but i'd likely be completely unable to figure out the meaning of a large text that's been encrypted through say AES.

This is not what I meant (I've edited the post to make that clearer). I am looking for a way to naturally express that a result of a computation changes how the computation progresses. In a*(b+c) + (1-a)*(b-c) you compute both (b+c)and (b-c) . This is not what actually happens in the program.

The curried node is an interesting idea but breaks down if we move away from this toy example. If both branches contain subgraphs with a different amount of nodes and different connections between them then currying does not work (or is very unnatural).

(Currying is a nice idea so yes)

Turing completeness regards only the functional behavior of a class of computational systems. I want to look at the internals, what the system is actually doing, and find abstractions in there: Modularity, search processes, and steering mechanisms for instance.

So it’s not about finding yet another framework whose expressiveness is equivalent to Turing completeness. It’s about finding a framework to express the actual computation.

4Ilio
In what sense is the functional behavior different from the internals/actual computations? Could you provide a few toy examples?