
passive_fist comments on A Limited But Better Than Nothing Way To Assign Probabilities to Statements of Logic, Arithmetic, etc. - Less Wrong Discussion

Post author: alex_zag_al 22 November 2013 09:14PM


Comment author: passive_fist 23 November 2013 01:16:05AM 2 points

What you're referring to are called 'features' in the machine learning community. A 'feature' is usually an easily computable property of an input data point that provides some information about which class the data point belongs to. Models are then built by specifying a probability distribution over features. An optimal feature is one that gives the maximal amount of information after all previous features have been taken into account. For instance, if a feature takes one bit to describe, it is optimal if, once all other features have been taken into account, it provides one bit of information (or, equivalently, partitions the probability distribution exactly in half). If a feature is not optimal, there are various measures for determining how informative it is and whether it should be used at all.
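The "one bit of description, one bit of information" condition can be made concrete as information gain: the expected reduction in class entropy from observing the feature. The sketch below is illustrative (the distributions and function names are my own, not from any particular library); it shows that a binary feature which splits the class distribution exactly in half yields a full bit of gain.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_gain(p_class, p_feature, p_class_given_feature):
    """Expected reduction in class entropy after observing a binary feature.

    p_class: prior class distribution, e.g. [0.5, 0.5]
    p_feature: distribution over the feature's two values
    p_class_given_feature: class distribution conditional on each feature value
    """
    h_prior = entropy(p_class)
    h_posterior = sum(pf * entropy(cond)
                      for pf, cond in zip(p_feature, p_class_given_feature))
    return h_prior - h_posterior

# A one-bit feature that partitions the prior exactly in half:
# each feature value pins down the class completely.
gain = information_gain(
    p_class=[0.5, 0.5],
    p_feature=[0.5, 0.5],
    p_class_given_feature=[[1.0, 0.0], [0.0, 1.0]],
)
print(gain)  # 1.0 — the feature delivers its full bit
```

A less informative feature (say, one whose conditional class distributions are [0.9, 0.1] and [0.1, 0.9]) would yield a gain strictly between 0 and 1 bit under the same calculation.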

Ordinary reasoning in mathematics is just a special case of Bayesian reasoning, as has been pointed out numerous times in the sequences. There has been a lot of work on optimal feature selection and how to derive good features (for example, using the Bayesian Information Criterion or BIC). It might be useful to extend your idea to incorporate those developments.
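For a concrete sense of how the BIC trades off fit against complexity, here is a minimal sketch using the standard formula BIC = k·ln(n) − 2·ln(L̂). The likelihood values below are made up for illustration; the point is only the mechanics of the penalty.

```python
import math

def bic(log_likelihood, num_params, num_samples):
    """Bayesian Information Criterion (lower is better).
    Complexity is penalized by k * ln(n), so the penalty per extra
    parameter grows with the sample size."""
    return num_params * math.log(num_samples) - 2.0 * log_likelihood

# Hypothetical comparison on n = 50 samples: a 2-parameter model versus
# a 5-parameter model whose extra parameters buy only a small
# likelihood improvement.
bic_simple = bic(log_likelihood=-100.0, num_params=2, num_samples=50)
bic_complex = bic(log_likelihood=-98.0, num_params=5, num_samples=50)
print(bic_simple < bic_complex)  # True: BIC favors the simpler model
```

Here the complex model gains 2 nats of log-likelihood but pays 3·ln(50) ≈ 11.7 in penalty, so BIC prefers the simple model; with a large enough likelihood improvement the preference would flip.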