What's a perfect agent? No one is infallible, except the Pope.
How do you reconcile
When faced with new evidence, an intelligent agent should update on it and then forget it.
with
We can use the actual data we gather to introspect on our faulty reasoning.
given that you have discarded the data which led to the faulty reasoning? How do you know when it's safe to discard? In your example
I'd hate to forget how I stacked the deck, but only in those cards that are actually in play.
If you forget the discarded cards, and later realize that you may have an incorrect map of the deck, aren't you SOL?
an intelligent agent should update on it and then forget it.
Should being the operative word. This refers to a "perfect" agent (emphasis added in text; thanks!).
People don't do this, as well they shouldn't, because we update poorly and need the original data to compensate.
If you forget the discarded cards, and later realize that you may have an incorrect map of the deck, aren't you SOL?
If I remember the cards in play, I don't care about the discarded ones. If I don't, the discarded cards could help a bit, but that's not the heart of my problem.
if you really care about the values on that list, then there are linear aggregations
Of course existence doesn't mean that we can actually find these coefficients. Even if you have only 2 well-defined value functions, finding an optimal tradeoff between them is generally computationally hard.
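A linear aggregation of two value functions is just a weighted sum. Here is a minimal sketch; the outcomes, values, and weights are invented for illustration:

```python
# Two toy value functions over a tiny outcome set (invented numbers).
def comfort(outcome):
    return {"walk": 1.0, "drive": 3.0}[outcome]

def thrift(outcome):
    return {"walk": 5.0, "drive": 0.0}[outcome]

def aggregate(outcome, w_comfort=1.0, w_thrift=1.0):
    # Linear aggregation: fixed positive weights collapse the two
    # values into a single function to optimize. Finding weights that
    # reflect the "right" tradeoff is the hard part.
    return w_comfort * comfort(outcome) + w_thrift * thrift(outcome)

best = max(["walk", "drive"], key=aggregate)
```

Different weights pick out different Pareto-optimal outcomes: with equal weights the aggregation prefers "walk", while w_comfort=2.0, w_thrift=0.5 prefers "drive".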
Suppose that at time t the world is in a state Wt, and that the agent may look at it and make an observation Ot. Objectively, the surprise of this observation would be Sobj = S(Ot|Wt) = -log Pr(Ot|Wt).
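The quoted surprise formula is easy to compute directly. A minimal sketch, using the base-2 logarithm so surprise is measured in bits (the post leaves the log base unspecified):

```python
import math

def surprise(p):
    # Surprise of an observation with probability p: -log2(p), in bits.
    # A certain observation (p = 1) has zero surprise; rarer ones more.
    return -math.log2(p)
```

An observation with probability 1/8 carries 3 bits of surprise.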
One note on philosophy of probability: if the world is in state Wt, what does it mean to say that an observation Ot has some probability given Wt? Surely all observations have probability 1 if the state of the world is exhaustively known.
Philosophically, yes.
Practically, it may be useful to distinguish between a coin and a toss. The coin has persisting features which make it either fair or loaded for a long time, with correlation between past and future. The toss is transient, and essentially all information about it is lost when I put the coin away - except through the memory of agents.
So yes, the toss is a feature of the present state of the world. But it has the very special property that, given the bias of the coin, the toss is independent of the past and the future. It's sometimes more useful to treat a feature like that as an observation external to the world, but of course it "really" isn't.
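The point that tosses are correlated only through the persistent bias can be illustrated with a two-hypothesis coin. This is a minimal sketch; the particular biases (0.5 and 0.9) and the uniform prior are made-up numbers:

```python
# Two hypotheses about the coin, equally likely a priori: fair (0.5)
# or loaded toward heads (0.9). These numbers are invented.
biases = {0.5: 0.5, 0.9: 0.5}          # P(heads | bias) -> prior P(bias)

# Marginally, tosses are correlated through the persistent bias:
# seeing heads raises P(loaded), which raises P(heads next time).
p_h = sum(b * p for b, p in biases.items())        # P(first toss heads)
p_hh = sum(b * b * p for b, p in biases.items())   # P(both tosses heads)
p_h2_given_h1 = p_hh / p_h

# Given the bias, the tosses are independent: P(H2 | H1, bias) is just
# the bias itself, with no dependence on earlier tosses.
```

Seeing one head makes a second head more likely than the 0.7 base rate, but only because it shifts belief about the coin, not because the tosses influence each other.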
Thanks for the post; I particularly enjoyed the one-sentence takeaway at the end. One criticism though: you use mathematical notation like I(Wt;Mt) without saying what it denotes. Even though that can be inferred from the surroundings, it would be less likely to confuse if you stated it explicitly.
I'm trying to balance between introducing terminology to new readers and not boring those who've read my previous posts. Thanks for the criticism, I'll use it (and its upvotes) to correct my balance.
This one was fun for the math. Thank you. The practical advice is pretty prosaic - study the things you're most uncertain about.
Well, thank you!
Yes, I do this more for the math and the algorithms than for advice for humans.
Still, the advice is perhaps not so trivial: study not what you're most uncertain about (highest entropy given what you know) but those things with entropy generated by what you care about. And even this advice is incomplete - there's more to come.
When the new memory state M_{t+1} is generated by a Bayesian update from the previous one M_t and the new observation O_t, it's a sufficient statistic of these information sources for the world state W_t, so that M_{t+1} keeps all the information about the world that was remembered or observed: I(W_t; M_{t+1}) = I(W_t; M_t, O_t). As this is all the information available, other ways to update can only have less information. The amount of information gained by a Bayesian update is I(W_t; M_{t+1}) - I(W_t; M_t) = I(W_t; O_t | M_t), and because the observation only depends on the world, this equals H(O_t | M_t) - H(O_t | W_t).
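The information gained by observing, I(W;O|M) = H(O|M) - H(O|W), can be checked numerically on a toy example. This sketch assumes a uniform prior over a binary world and an observation channel that reports the world state correctly with probability 0.8; both numbers are invented:

```python
import math

def h(dist):
    # Shannon entropy in bits of a probability mass function.
    return -sum(p * math.log2(p) for p in dist if p > 0)

# Toy world: W uniform on {0, 1}; observation channel P(O = W) = 0.8.
p_w = [0.5, 0.5]
p_o_given_w = [[0.8, 0.2], [0.2, 0.8]]

# Joint distribution and marginal of O.
p_wo = [[p_w[w] * p_o_given_w[w][o] for o in (0, 1)] for w in (0, 1)]
p_o = [sum(p_wo[w][o] for w in (0, 1)) for o in (0, 1)]

# Information gained by observing O, starting from the prior:
# expected subjective surprise minus expected objective surprise.
h_o = h(p_o)
h_o_given_w = sum(p_w[w] * h(p_o_given_w[w]) for w in (0, 1))
gain = h_o - h_o_given_w

# The Bayesian posterior P(W = 1 | O) is a function of O alone, and
# here it is one-to-one with O, so it keeps all of O's information
# about W: the posterior is a sufficient statistic.
posterior = [p_wo[1][o] / p_o[o] for o in (0, 1)]
```

With these numbers the observation yields about 0.28 bits about the world state.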
That seems to be a convoluted way of defining a Markov process. It would be preferable if you used standard terminology and provided references to frame the discourse within the theory.
I explained this in my non-standard introduction to reinforcement learning.
We can define the world as having the Markov property, i.e. as a Markov process. But when we split the world into an agent and its environment, we lose the Markov property for each of them separately.
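A small enumeration can show how part of a Markov process fails to be Markov on its own. In this made-up example the full world is a deterministic three-state cycle (certainly Markov), while an observer sees only whether the state equals 1:

```python
# The full world: a Markov chain cycling 1 -> 2 -> 3 -> 1, with the
# start state drawn uniformly (invented toy dynamics).
def step(x):
    return x % 3 + 1

# The observer sees only a piece of the world: y = 1 iff x == 1.
def view(x):
    return 1 if x == 1 else 0

def p_y3_is_1(y1, y2):
    # P(Y3 = 1 | Y1 = y1, Y2 = y2), by enumerating start states.
    consistent = [x for x in (1, 2, 3)
                  if view(x) == y1 and view(step(x)) == y2]
    hits = [x for x in consistent if view(step(step(x))) == 1]
    return len(hits) / len(consistent)

# The observed process is not Markov: given Y2 = 0, the distribution
# of Y3 still depends on Y1, though the full chain is Markov.
```

Given Y2 = 0, the next observation is 1 with certainty if Y1 was 0, and 0 with certainty if Y1 was 1, so the observed slice of the world remembers more than its last value.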
I'm using non-standard notation and terminology because they are needed for the theory I'm developing in these posts. In future posts I'll try to link more to the handful of researchers who do publish on this theory. I did publish one post relating the terminology I'm using to more standard research.
When a light bulb observes voltage, it emits light, regardless of whether it did so a second earlier. When the light bulb's internal attributes entangle with the voltage, they lose all information about what came before.
This example is false. An incandescent light bulb has a memory: its temperature. The temperature both determines the amount of light currently emitted by the bulb, and also the electrical resistance of the filament (higher when hot), which means that even the connected electrical circuit is affected by the state of the bulb — turning on the bulb produces a high “inrush current”.
A much better example would be a LED (not an LED light bulb, which likely contains a stateful power supply circuit), which is stateless for most practical purposes. (For example, once upon a time, there was networking hardware which could be snooped optically — the activity indicator LEDs were simply connected to the data lines and therefore transmitted them as visible light. Modern equipment typically uses programmed blinking intervals instead.)
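The filament's thermal memory can be sketched with a toy simulation. All constants here are invented for illustration, not real filament physics; the point is only that the current drawn depends on the bulb's temperature history:

```python
# Toy model of an incandescent filament with thermal state.
def currents(voltage, steps, ambient=300.0):
    temp = ambient
    out = []
    for _ in range(steps):
        resistance = 0.1 * temp          # hotter filament, higher resistance
        current = voltage / resistance
        out.append(current)
        # Heating from dissipated power, cooling toward ambient.
        temp += 0.01 * voltage * current - 0.05 * (temp - ambient)
    return out

draw = currents(voltage=230.0, steps=50)
# draw[0] is the cold-start "inrush" current; later values are lower
# because the warmed filament has higher resistance.
```

A stateless LED model would have no such temperature variable, so its current would be the same function of voltage at every step.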
I've been enjoying this series, but feel like I could get more out of it if I had more of an information theory background. Is there a particular textbook you would recommend? Thanks
Thanks!
The best book is doubtlessly Elements of Information Theory by Cover and Thomas. It's very clear (to someone with some background in math or theoretical computer science) and lays very strong introductory foundations before giving a good overview of some of the deeper aspects of the theory.
It's fortunate that many concepts of information theory share some of their mathematical meaning with the everyday meaning. This way I can explain the new theory (popularized here for the first time) without defining these concepts.
I'm planning another sequence where these and other concepts will be expressed in the philosophical framework of this community. But I should've realized that some readers would be interested in a complete mathematical introduction. That book is what you're looking for.