I thought you were referring to degenerate gases when you mentioned nontrivial behavior in solid state systems since that is the most obvious case where you get behavior that cannot be easily explained by the "obvious" model (the canonical ensemble). If you were thinking of something else, I'm curious to know what it was.
I'm having a hard time parsing your suggestion. The "dropout" method introduces entropy to "the model itself" (the conditional probabilities in the model), but it seems that's not what you're suggesting. You can also introduce entropy to the inputs, which is another common thing to do during training to make the model more robust. However, there's no way to introduce 1 bit of entropy per "1 bit of information" contained in the input, since there's no way to measure the amount of information in the input without already having a model of the input. I think systematically injecting noise into the input based on a given model is not functionally different from injecting noise into the model itself, at least in the ideal case where the noise is injected evenly.
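To make the two options concrete, here is a minimal sketch, assuming a single linear layer and NumPy. The function names, the toy weights, and the noise levels are all hypothetical choices for illustration, not anything from your proposal. It applies dropout to the layer's activations (noise in "the model itself") and, separately, Gaussian noise to the inputs.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`,
    then rescale the survivors so the expected value is unchanged."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

def add_input_noise(inputs, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma` to the inputs."""
    return inputs + rng.normal(0.0, sigma, size=inputs.shape)

rng = np.random.default_rng(0)

x = np.ones((4, 3))        # toy batch: 4 examples, 3 features (assumed)
w = np.full((3, 2), 0.5)   # toy weights for one linear layer (assumed)

# Option A: noise on the inputs, clean model.
noisy_input_out = add_input_noise(x, sigma=0.1, rng=rng) @ w

# Option B: clean inputs, noise inside the model (dropout on the layer output).
dropout_out = dropout(x @ w, rate=0.5, rng=rng)
```

Neither function needs to know how much information the input carries; both just apply a fixed noise level, which is the point I was making above.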
You said that "if you add 1 bit of information, you have added 1 bit of entropy". I can't tell if you're equating the two phrases or if you're suggesting adding 1 bit of entropy for every 1 bit of information. In either case, I don't know what it means. Information and entropy are negations of one another, and the two have opposing effects on certainty-of-an-outcome. If you're equating the two, then I suspect you're referring to something specific that I'm not seeing. If you're suggesting adding entropy for a given amount of information, it may help if you explain which probabilities are impacted. To which probabilities would you suggest adding entropy, and which probabilities have information added to them?
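Here is the sense in which I see the two as opposed, as a small sketch assuming Shannon entropy in bits and a fair coin as the toy distribution (the function name and the example are mine, purely for illustration). Learning the outcome of the flip removes entropy from the distribution rather than adding it.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

prior = [0.5, 0.5]       # fair coin before observing the flip
posterior = [1.0, 0.0]   # same coin after observing heads

# Information gained by the observation, measured as the drop in entropy.
info_gained = entropy_bits(prior) - entropy_bits(posterior)
```

In this toy case the prior has 1 bit of entropy, the posterior has none, and the observation supplies exactly the 1 bit that was removed.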
1) Any nontrivial density of states, especially the van Hove singularities in semiconductors.
2) I don't mean a model like 'consider an FCC lattice populated by one of 10 types of atoms. Here are the transition rates...', where the model is made of microstates and you need to do statistics to get probabilities out. I mean a model more like 'Each cigarette smoked increases the annual risk of lung cancer by 0.001%', where the output is, naturally, simply a distribution over outcomes. (Such models include the others as special cases.)
In particular, I'm working und...
A putative new idea for AI control; index here.
Noise versus preference and complexity
Error versus bias versus preference
Preference versus prejudice (and bias)
Known prejudices
Revisiting complexity