This is a response to the post We Write Numbers Backward, in which lsusr argues that little-endian numerical notation is better than big-endian.[1] I believe this is wrong, and big-endian has a significant advantage not considered by lsusr.
Lsusr describes reading the number "123" in little-endian, using the following algorithm:
- Read the first digit, multiply it by its order of magnitude (one), and add it to the total. (Running total: ??? one.)
- Read the second digit, multiply it by its order of magnitude (ten), and add it to the total. (Running total: ??? twenty one.)
- Read the third digit, multiply it by its order of magnitude (one hundred), and add it to the total. (Arriving at three hundred and twenty one.)
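Here is a rough Python sketch of that little-endian reading procedure (my own illustration, not code from lsusr's post; the function name is mine):

```python
def read_little_endian(digits: str) -> int:
    """Read a little-endian digit string, least significant digit first.
    "123" is read as three hundred and twenty one."""
    total = 0
    magnitude = 1
    for d in digits:
        total += int(d) * magnitude  # digit times its order of magnitude
        magnitude *= 10              # the next digit is worth ten times more
    return total

print(read_little_endian("123"))  # 321
```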
He compares it with two algorithms for reading a big-endian number. The first uses the same process as for a little-endian number, but runs from right to left. I agree with him that this is worse than the little-endian algorithm, because it is easier to read a number in the same direction as the surrounding text, which in English is left to right.
The other big-endian algorithm is the one I observe myself as usually using. For "321", it is:
- Count the digits (three), and convert that into an order of magnitude (one hundred). (Running total: ??? hundred ???.)
- Read the first digit, multiply it by its order of magnitude (one hundred), and add it to the total. (Running total: three hundred ???.)
- Read the second digit, multiply it by its order of magnitude (ten), and add it to the total. (Running total: three hundred and twenty ???.)
- Read the third digit, multiply it by its order of magnitude (one), and add it to the total. (Arriving at three hundred and twenty one.)
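Sketched the same way (again, my own illustration), this algorithm counts the digits first to find the leading order of magnitude, then reads left to right:

```python
def read_big_endian(digits: str) -> int:
    """Read a big-endian digit string, most significant digit first.
    "321" is read as three hundred and twenty one."""
    magnitude = 10 ** (len(digits) - 1)  # count the digits first
    total = 0
    for d in digits:
        total += int(d) * magnitude  # digit times its order of magnitude
        magnitude //= 10             # the next digit is worth ten times less
    return total

print(read_big_endian("321"))  # 321
```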
The point raised by lsusr against the big-endian algorithm is that we must count a number's digits before we can start reading them. He doesn't say explicitly why he dislikes this, but I can see three reasons:
- It makes the algorithm more complex.
- It makes the algorithm slower.
- It means we cannot begin the algorithm if we only have access to the beginning of a number.
Making the algorithm more complex is bad, but not very bad, because it remains fairly simple. Having to count all of a number's digits is slow, which is a real problem, but we can usually solve it by separating groups of digits with commas or by using exponential notation. Finally, having access to only the beginning of a number is not a very common situation in day-to-day life.
These problems might not be very important, but they are still problems, and if they were the only consideration, little-endian would be better. So what other advantage does big-endian have over little-endian?
Though it is rare that we cannot process the entire representation of a number, we often have no need to. Numbers represent quantities, and sometimes we only want an approximation, not an exact quantity.
For example, if I look up the population of India, Wikipedia will tell me it was estimated to be 1,428,627,663 people (in big-endian notation), but I will usually have no reason to think of it as anything more precise than "about 1.4 billion". Running the big-endian algorithm only partially gives us exactly that: an estimate of the number to some order of magnitude.
By contrast, running the little-endian algorithm partially gives us the number's value modulo a power of ten, which in most situations is completely useless. Besides, since that population figure is actually an estimate from 2023, we can be pretty sure the least significant digits aren't even accurate.
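To make the contrast concrete, here is a sketch of both partial runs (the function names are mine, purely for illustration):

```python
def partial_big_endian(digits: str, k: int) -> int:
    """Run the big-endian algorithm on only the first k digits,
    giving an estimate of the number to that order of magnitude."""
    magnitude = 10 ** (len(digits) - 1)
    total = 0
    for d in digits[:k]:
        total += int(d) * magnitude
        magnitude //= 10
    return total

def partial_little_endian(digits: str, k: int) -> int:
    """Run the little-endian algorithm on only the first k digits,
    giving the number's value modulo 10**k."""
    total = 0
    magnitude = 1
    for d in digits[:k]:
        total += int(d) * magnitude
        magnitude *= 10
    return total

print(partial_big_endian("1428627663", 2))     # 1400000000, "about 1.4 billion"
print(partial_little_endian("3667268241", 3))  # 663, i.e. 1428627663 mod 1000
```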
What if you are not a person, but a computer, converting a string into an integer? In that case, having a simpler and faster algorithm is important, having to start with only the beginning of a string (what the user has typed so far) is plausible, and knowing the number's approximate value is useless. So in this case the little-endian algorithm is much better than the big-endian one.
But there is another algorithm that can be used by a computer for parsing big-endian numbers. Operating on "321":
- Read the first digit and add it to the total. (Running total: three.)
- Read the second digit. Multiply the total by ten, and add the digit to the total. (Running total: thirty two.)
- Read the third digit. Multiply the total by ten, and add the digit to the total. (Arriving at three hundred and twenty one.)
This algorithm operates sequentially on the string, and it is even simpler and faster than the little-endian algorithm, because it doesn't have to keep track of the order of magnitude. So for computers, too, reading big-endian is easier.[2]
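A sketch of this sequential procedure, which is essentially Horner's method (the function name is mine):

```python
def parse_big_endian(digits: str) -> int:
    """Parse a big-endian digit string left to right: each new digit
    shifts the running total up one decimal place, so no separate
    order-of-magnitude counter is needed."""
    total = 0
    for d in digits:
        total = total * 10 + int(d)
    return total

print(parse_big_endian("321"))  # 321
```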
So why do humans read numbers with the previous algorithm instead of this one? For the same reason we prefer big-endian to little-endian: successive approximations that home in on a number are more useful than a computation whose intermediate state is useless.
Lsusr's article ends by claiming the inventor of Arabic numerals knew little-endian numbers were better, and used them, because Arabic is written right-to-left. But positional decimal notation was not invented by the Arabs. It was invented by the Hindus, and brought to Europe by the Arabs. And the Hindus used the Brahmi script, which is written left-to-right. Therefore, the inventor of the Hindu-Arabic numeric system used big-endian notation.
One aspect neither of you has explicitly addressed is the speaking of numbers; speaking, after all, predates writing. We say "one billion, four hundred twenty-eight million, [...]".
Given that this is what we say, the first two pieces of information we need are "one" and "billion". More generally, we need the first 1-3 digits (the leftmost comma-separated group), then the magnitude, and then we can proceed to read off all remaining digits.
Given that the magnitude is not explicitly written down, we get it by counting the digits. If the digits are comma-separated into groups of 3 (and "right-justified", so that if there are 3n+1 or 3n+2 digits, then the extra 1-2 are the leftmost group), then it's generally possible to get the magnitude from your "peripheral vision" (as opposed to counting them one by one) for numbers less than, say, 1 billion, which are what you'd most often encounter; like, "52" vs "52,193" vs "52,193,034", you don't need to count carefully to distinguish those. (It gets harder around 52,193,034,892 vs 52,193,034,892,110, but manually handling those numbers is rare.) So if getting the magnitude is a mostly free operation, then you might as well just present the digits left-to-right for people who read left-to-right.
Now, is it sensible that we speak "one billion, four hundred twenty-eight million, [...]"? Seems fine to me. It presents the magnitude and the most significant digits first (and essentially reminds you of the magnitude every 3 digits), and either the speaker or the listener can cut it off at any point and have an estimate accurate to as many digits as they care for. (That is essentially the use case of "partially running the algorithm" you describe.) I think I'd hate listening to "six hundred sixty-three, six hundred twenty-seven thousand, four hundred twenty-eight million, and one billion", or even suffixes of it like "four hundred twenty-eight million and one billion". Tell me the important part first!