This post is part of the Asymptotic Logical Uncertainty series. In this post, I will give an algorithm, BenfordLearner, that passes the Benford test. This algorithm further satisfies whenever is an irreducible pattern with probability .

In the following algorithm, we define to be 1 whenever .

BenfordLearner(N)
 1: P=0
 2: M=N
 3: for j=0 to N
 4:   M_Y=0
 5:   for Y a TM expressible in K_Y <log N bits
 6:     M_X=N
 7:     for Y a TM expressible in K_X <log N bits
 8:       if UTM(X,N) and UTM(Y,N) both accept in time T(N)
 9:         A=0
10:         R=0
11:         i=1
12:         while i<=N
13:           if UTM(X,i) and UTM(Y,i) both accept in time T(i)
14:             if UTM(L,i) accepts in time T(N)
15:               A=A+1
16:             else if UTM(L,i) rejects in time T(N)
17:               R=R+1
18:             else
19:               i=N
20:           i=i+1
21:         F=A/(A+R)
22:         Q=A+R
23:         if max(K_X,|F-j/N|*sqrt(Q)/K_Y/sqrt(log log Q))<M_X
24:           M_X=max(K_X,|F-j/N|*sqrt(Q)/K_Y/sqrt(log log Q))
25:     if M_X>M_Y
26:       M_Y=M_X
27:   if M_Y<M
28:     M=M_Y
29:     P=j/N
30: return P

Let be the set of all Turing machines expressible in at most bits such that accepts in time at most . The encoding of Turing machines must be prefix-free, which in particular means that no Turing machine is encoded in 0 bits. Let denote the set of rational numbers of the form with .

For and Turing machines, let be the number of bits necessary to encode . Let be the subset of natural numbers which are accepted by both and in time at most . Let be the greatest number less than or equal to such that for every in the first elements of , halts in time . Let be the proportion of the first elements of which accepts. Let

Lemma: The output of is in

Proof: The algorithm has three for loops, the outer ranging over and the inner two ranging over and respectively, both restricted to Turing machines expressible in bits. The condition on line 8 means that and effectively range over all Turing machines in , and ranges over .

The inner while loop will increment the variables or a total of exactly times. Thus, is set to in line 22. Similarly, is sent to in line 21. Clearly and are and respectively. Therefore, the expression on lines 23 and 24 is

Considering the for loops from inner to outer, we minimize this quantity in , maximize it in , and find of the form minimizing the whole quantity. The returned is therefore a minimizer of

Our goal is to quickly assign logical probabilities. The code is not optimized to run efficiently. The following proposition is just to ensure that the run time is not far off from .

Proposition: The run time of is in .

Simulating on any input for time steps can be done in time in time for some fixed constant by simulating for steps \cite{Hennie1966}. The bulk of the run time comes from simulating Turing machines on lines 8, 13, 14, and 16. Each of these lines takes at most time, and we enter each of these lines at most times. Therefore, the program runs in time .

In the next post, we will use the above Lemma and the notion of irreducible patterns to show that this algorithm passes the Benford Test.

New Comment
1 comment, sorted by Click to highlight new comments since:

Typos: "expressible it * bits" should be "expressible in * bits"