Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

gwern comments on Worse Than Random - Less Wrong

25 Post author: Eliezer_Yudkowsky 11 November 2008 07:01PM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (99)

Sort By: Old

You are viewing a single comment's thread. Show more comments above.

Comment author: gwern 11 November 2009 05:38:48PM 4 points [-]

Here's a shot, assuming they couldn't ever coordinate even at the beginning. Each agent takes a hash function & runs it on some salient data they have access to. This could be their ID number (since they may not know the ID numbers of any of the other agents but may know their own), or maybe a memory they are certain they will have different from agents. If an agent is uncertain what hash function to use on what data, they can just pick the Schelling point, or pick randomly (which would still be better). We know there must be some difference between the agent and all the others, else in what sense are there multiple agents?

The hash output is then used as the index into the array of all possible solutions. The agent works forward through the solutions and if they run out of solutions without finding a valid one, they go back to their index and work the other way.

This is an improvement over each agent choosing a random index, because each agent will get a different hash on its unique identity. There may still be 'collisions' between agents, but a good hash will have fewer such collisions than each agent choosing randomly. And the subsequent search procedure guarantees that every possibility will eventually be visited by an agent.

Simpler version: each agent takes their common PRNG and uses their identity as the seed, instead of, say, every agent picking '1' as the seed.

Comment author: Douglas_Knight 13 November 2009 04:07:41AM 0 points [-]

There may still be 'collisions' between agents, but a good hash will have fewer such collisions than each agent choosing randomly.

I don't know what you're suggesting, but it seems to me that you're just being specific about how to use the randomness, rather than using a different method. That's clear in your simpler version. But even in the first version, a hash pretty much is a PRNG, and you don't seem to use it in any additional way.

If I understand correctly, you're saying the method with the hash is a better use of randomness than the method with the PRNG. That seems wrong to me. I think the expected number of collisions is equal, the only difference might be that the hash method has fewer random stages, thus more variance.

Comment author: gwern 13 November 2009 04:44:28PM *  2 points [-]

Either 2 agents are identical and have identical environments, or they don't.

  • If they do, then it's hopeless: they must duplicate each other's work because if they didn't they would be different, and where could that difference come from if not the agent or the environment? N agents will perform N-1 useless searches of the array. The Fates will it.
  • If they don't, then the difference must be either in the agent or the environment or both

    1. We might call any difference in the environment a random number generator. The agent might see a flickering light or be in a virtual world with little dust devils randomly racing around or just have access to a stream of bits - whatever.

      The RNG will give us a starting index. But RNGs can be non-identical but still 'collide' for the first number generated, leading to a wasted agent. (I think this is the birthday paradox: length! / length^n * (length - n)!)

    2. Going back to the first point, either the random number generator is the same for 2 agents or it is different. If the RNG is the same, then the agents are right back in the identical-situation: using the RNG cannot help them act differently (choose a different index) from any of the other agents. So it is useless in this situation.

      The difference in the agent is useful, though. This difference could be childhood memories, or what-have-you. Let's just say it's expressed as a number m, which is less than the array length. This number will be unique among agents since otherwise the 2 agents sharing the number would be the same. We can then come up with a injective mapping from the agent's number to the indexes of the array, using a hash, for example, or perhaps just taking the number as a straight index.

      We get no collisions this way, while just using a random number permits collisions. Thus, going by m is better than going by the first random number.

    3. But what if there are differences in both the agent & environment? Then 2 agents might be identical but in different environments, and so m would be worse than the random number because the 2 agents will have the same m and so will repeat each other though there was a way for them to not repeat each other.

      In this case, we can take m and take our first random number, and xor them together. A random bitstream XORed with a non-random bitstream yields a random bitstream, and also for 2 random bitstreams. (2 non-random bitstreams won't become random, but if one bitstream is the agent and the other the environment, then we're in the hopeless situation.)

      Note also that if we're in case #2, where the RNG is the same among all agents but m isn't, then we can instruct every agent to do the XORing, and there still won't be any collisions among agents.

      Now we are no worse off than the just-random-number agent, because we get all the potential randomness of the number, and also all the potential value of m.

Just accepting a random number is not the best strategy: agents may have information about themselves and thus potential other agents that they can exploit. A smart agent can in effect fall back on randomness (by using XOR), but it can also do better than an agent which only ever uses randomness by exploiting the knowledge embodied by m.

EDIT: realized that 'surjective' meant the exact opposite of what I thought, and that I wanted 'injective'

Comment author: Douglas_Knight 14 November 2009 02:39:59AM 0 points [-]

We get no collisions this way, while just using a random number permits collisions. Thus, going by m is better than going by the first random number.

If the only way you use the unique data is to feed it into a hash, you might as well be using a random number. If you get different results from the hash than from randomness, you've broken the hash.

Comment author: pengvado 14 November 2009 03:04:37AM *  1 point [-]

If you get different results from the hash than from randomness, you've broken the hash.

That was my first impression too. But... isn't a hash considered to be cryptographically broken if you have a process that finds any collisions at all? Distinguishing based on the frequency of collisions (if that frequency is high enough to measure) is superfluous.

edit: removed the rest of my argument, which was just equivalent to truncating hash values.

Comment author: Douglas_Knight 14 November 2009 03:38:03AM 3 points [-]

If you get different results from the hash than from randomness, you've broken the hash.

That was my first impression too. But... isn't a hash considered to be cryptographically broken if you have a process that finds any collisions at all? Distinguishing based on the frequency of collisions (if that frequency is high enough to measure) is superfluous.

Yes, if you have few enough agents / work-flow that you're in P, then it is extremely unlikely that there will be absolute collisions, whether collisions of random numbers or hash values. But that chance is the same. You can break a hash through dumb luck! If you have lots of agents, then cryptographic security of the hash doesn't apply.

But we're not talking about absolute collisions of hashes of id numbers. In talking about hashes, the assumption is that the space of id numbers and hash values are big and the space of problems we're working on is not larger. When we truncate the hash values to get work items, that's when we get collisions, at exactly the same rate as if it were random.

Comment author: gwern 14 November 2009 03:55:41PM *  1 point [-]

If the only way you use the unique data is to feed it into a hash, you might as well be using a random number. If you get different results from the hash than from randomness, you've broken the hash.

The randomness is not as good. Even if the randomness is different from agent to agent, we still can get collisions. If we take the unique aspect of the agent, then by definition it isn't shared by other agents and so we can avoid any collision with other agents:

We can then come up with a injective mapping from the agent's number to the indexes of the array, using a hash, for example, or perhaps just taking the number as a straight index.

A hash doesn't have to collide; it only has to have collisions (by the Pigeonhole Principle) if the end hash is 'smaller' than the input. If I'm using SHA512 on data that is always less than 512 bits, then I'll never get any collisions. (Let's assume SHA512 is a perfect hash; if this bothers you, replace 'SHA512' with '$FAVORITE_PERFECT_HASH'.) But using the hash isn't essential: we just need a mapping from agent to index. 'Hash' was the first term I thought of.

(And the XOR trick lets us make use of a injective mapping regardless of whether it's the randomness or agent-related-data that is unique; we XOR them together and get something that is unique if either was.)

Comment author: pengvado 16 November 2009 06:48:56AM 3 points [-]

A hash doesn't have to collide; it only has to have collisions (by the Pigeonhole Principle) if the end hash is 'smaller' than the input. If I'm using SHA512 on data that is always less than 512 bits, then I'll never get any collisions.

How do you know which aspect of the agent is unique, without prior communication? If it's merely that agents have so many degrees of freedom that there's a negligible probability that any two agents are identical in all aspects, then your hash output is smaller than its input. Also, you can't use the 2^512 figure for SHA-512 unless you actually want to split a 2^512 size array. If you only have, say, 20 choices to split, then 20 is the size that counts for collision frequency, no matter what hash algorithm you use.

we XOR them together and get something that is unique if either was.

If hash-of-agent outputs are unique and your RNG is random, then the XOR is just random, not guaranteed-unique.

Comment author: gwern 16 November 2009 04:00:16PM *  1 point [-]

How do you know which aspect of the agent is unique, without prior communication? If it's merely that agents have so many degrees of freedom that there's a negligible probability that any two agents are identical in all aspects, then your hash output is smaller than its input.

If your array is size 20, say, then why not just take the first x bits of your identity (where 2^x=20)? (Why 'first', why not 'last'? This is another Schelling point, like choice of injective mapping.)

If hash-of-agent outputs are unique and your RNG is random, then the XOR is just random, not guaranteed-unique.

This is a good point; I wasn't sure whether it was true when I was writing it, but since you've pointed it out, I'll assume it is. But this doesn't destroy my argument: you don't do any worse by adopting this more complex strategy. You still do just as well as a random pick.

(Come to think of it: so what if you have to use the full bitstring specifying your uniqueness? You'll still do better on problems the same size as your full bitstring, and if your mapping is good, the collisions will be as 'random' as the RNGs and you won't do worse.)