next up previous
Next: Appendix Up: Computer Search Programs Previous: Initial Search Program

Meta-Program

The second program that I wrote to find PDIs had a completely different architecture from the first, and thus was able to include several improvements. The main difference is that it allows a wider range of command line arguments, and then generates, compiles and runs code to find PDIs meeting the desired parameters. It uses much of the code from the initial program as a base. However, since it actually generates code, I was able to turn the recursive algorithm of the first program into a faster iterative one without losing any of the flexibility. Also, being iterative, I was able to include several enhancements that would have been extremely difficult to implement in the first version.

The meta-program allows a much richer set of command line options. As opposed to the original program, which needed to be told everything explicitly,you can specify incomplete values and it will automatically calculate the remaining ones necessary. For instance, if I wanted to find all 10 digit base 10 PDIs, it will automatically calculate the necessary powers that need to be checked. If instead, I wanted to find all order-10 PDIs in base 10, it will calculate the minimum and maximum number of digits to check. In general, the algorithm for calculating the number of digits provides better bounds than the one that calculates exponents, but they do both work.

To calculate the number of digits, I used the following two principles: For the maximum number, I used Stewart's formula that said that $C
\leq (t-1)B^{t} - 1$ to calculate $C$, then determined the number of digits in $C$. For the minimum number, I reasoned that no PDI exists containing entirely of 1's and 0's except for 1 itself. Therefore, a very naive low bound can be established by calculating the number of digits in $2^{t}$, where $t$ is the power that we are checking.

To calculate the minimum and maximum exponents, I used the following principles: The PDI with the largest possible power would be $2^{t} + 1$, so I tried increasing values of $t$ until I reached one that produced a result with the desired number of digits. The PDI with the smallest possible power would be $\mathrm{numDigits}*(B-1)^{t}$. Therefore, I tried increasing values of $t$ until I reached one that produced a number that was too large.

The first optimization that the meta-program made besides the switch from recursion to iteration was calculating both the maximum and minimum number of nines that could be in a number and still have it potentially be a PDI that met the other criteria already established (i. e. the number of digits and exponents have already been calculated at this point). In order to calculate the minimum number of nines, I used a generalization of a formula presented by Deimel and Jones ([6]): Let

\begin{eqnarray*}
m & = & \mathrm{number of digits} \\
k & = & \mathrm{desi...
... \\
v & = & \mathrm{number of digits that aren't } b - 1
\end{eqnarray*}



If a combination of digits is to make up a PDI, then the following must be true:

\begin{eqnarray*}
v(b-2)^{k} + (m-v)(b-1)^{k} & \geq & b^{m-1}\\
v(b-2)^{k} + m...
...c{b^{m-1} - m(b-1)^{k}}{(b-2)^{k} - (b-1)^{k}}
\right \rfloor\\
\end{eqnarray*}



A simpler formula was used to calculate the maximum number of $b-1$ digits that could be in a PDI: Using the same variables as above, if a PDI were all $b-1$s, then it would be $m(b-1)^{k}$. If this number has too many digits, then we repeatedly subtract $(b-1)^{k}$ from it until it has the desired number of digits (or until we reach 0). If $s$ is the number of times we subtracted from it, then the maximum number of nines would be $b-s$. While a more direct formula could be derived using logarithms, the GMP library does not provide logarithmic functions, and this method was faster than the alternative of developing and testing my own log functions.

In a similar way, the Meta-program also calculated the maximum number of 2s that could appear in a PDI (this was only considered for bases greater than 4). It relied on the maximum number of $b-1$ digits already being calculated. Again using the variables provided above, this routine assumes that the maximum number of digits are in fact $b-1$, and that the rest are all 2. Therefore, if it is a PDI, the number would be:

\begin{displaymath}(m-v)(b-1)^{k} + v2^{k}.\end{displaymath}

The number of digits in this number is calculated, and if there aren't enough digits, then there must be too many 2s. However, since we've already calculated the maximum number of $b-1$ digits possible, the maximum value that can replace a 2 would be $b-2$. Therefore, we subtract $2^{k}$ the total and add $(b-2)^{k}$. This process is repeated until we either reach a number that has at least the desired number of digits, or until we've removed all 2s from the number. One thing worth noting is that although this optimization is only made for bases greater than 4, for base 2, 3, or 4 it is hardly needed since finding PDIs in these bases is already extremely fast compared to the other bases.

The final optimization that was made in the meta program was the insertion of a check inside the loop that generates the appropriate number of 4s. By the time it reaches that loop, the only other digits left to generate are 3,2,1,or 0. Therefore, after testing if a number with no digits lower than 4 is a PDI, it checks to see if all remaining slots were filled with 3s, if there would be enough digits in the number for it to be a PDI. If there are, then the testing continues into the lower loops which distribute these other digits, otherwise it breaks out into a higher loop, since there is no possible way to distribute the remaining digits and have a number be a PDI.

The general result of these improvements was a reduction by approximately one third in the execution time of my program. The majority of the checks don't add any extra overhead in the runtime since they're constants that are fixed at compilation time. The one exception to this is the check inside the 4-for loop as described above. The placement of this check was made purely by guesswork, and it may actually have hurt the performance of the program, particularly for lower orders where it will most likely fail nearly all of the time. However, as noted above, for orders in the high 20's - low 30s, the execution time was in fact approximately cut in third.


next up previous
Next: Appendix Up: Computer Search Programs Previous: Initial Search Program
Scott Moore 2002-04-03