%20-%20Donald%20Knuth.jpg)
In Book 2, Chapter 3, Page 5 of The Art of Computer Programming, Donald Kuth describes how in 1959 he tried to write the "Super-random" number generator. Given a 10-digit seed number, the program would process it through a bewlidering series of arithmetic operations to produce the next number in the random series. As Knuth says:
The machine language program corresponding to this algorithm was intended to be so complicated that a person reading the listing of it without explanatory comments wouldn't know what the program was doing .
When Knuth ran the program he found that it either converged quickly to the number 6065038420 or it would generate a uselessly short loop of numbers. As a result, Knuth says:
The moral of the story is that random numbers should not be generated by a method chosen at random. Some theory should be used.
Inspired by Knuth's tale, in 2022 I created an equally stupid "Super-random" generator that has no theory behind it, but I'll wager that it does (by accident) produce high-quality random numbers. If I'm right, the brute-force combination of famous algorithms and other random seed data can't go wrong. It's like casting all of the currently most popular actors into a big-budget movie and betting it can't go wrong.
My super-random number generating algorithm goes something like this:
Seed and Reseed (every 8K)
- State buffer is filled with 64-bytes from the .NET Random class.
- The current Process working set size is xor folded into the buffer.
- The computer's tick count since boot is folded.
- A new Guid's bytes are folded.
- The buffer's MD5 hash is folded.
- The buffer's SHA1 hash is folded.
- The buffer's XxHash64 hash is folded.
- The buffer's XxHash32 hash is folded.
- Utc time tick count is folded.
- MD5 hash of the leading 256 UTF8 character bytes of a random Wikipedia article is folded.
- The XxHash64 of https://dayspedia.com/time/online/ containing a clock is folded.
- 17 bytes of crypto RNG bytes are folded.
Step every 64 bytes
Similar to the Seed process, but the state buffer is updated by xor folding in the following order.
- New Guid
- 5 bytes of .NET Random (only on Saturday and Sunday)
- XxHash32
- MD5
- Timer tick count since boot
- XxHash64
- SHA1
- Utc time tick count
- If state[0] >= 0x80 then 17 bytes of crypto RNG
- The previous state buffer rotated 11 bytes (to create a Cipher Block Chaining effect)
The 64-byte state buffer is seeded and recalculated by such a cruel barrage of hashing and bit mangling that I'll be utterly shocked if it
doesn't produce "Super-random" numbers.
But I'll just have to assume I'm correct until I can figure out how to pass a large block of output
data into a randomness quality test (I can't find an online test that looks usable and respectable)
.
Update — On Christmas day 2023 I found the Random Bistream Tests site where you can feed in a string of '1' and '0' and they are passed through a series of tests. Passing a binary string as a parameter is really weird and I had to write some script to convert the Yotta output into the correct format. A discussion of the results is in a section below.
What makes this random number generator so stupid are the following points:
- It has absolutely no theory behind it (despite Kunth's insistence).
- It's really inefficient. I haven't measured the performance, but I'm guessing it runs at about 1% of the speed of popular pseudo-random number generators.
In future hobby time I hope to make the "Super-random" generator even stupider by adding more randomness from weather measurements, stock prices, the orbital positions of satellites, the pixels on public webcams, and whatever else I might find available.
Web API
The stupid algorithm is internally called the YottaRandom project and all of the source code is available in the following Azure DevOps repository:
https://dev.azure.com/orthogonal/YottaRandom
A web API with Swagger help can be found at the following address:
https://www.orthogonal.com.au/yotta/swagger/index.html
Test Results
The Random Bistream Tests web page is a bit vague about how it works exactly, but I get the impression that runs the NIST test suite converted to JavaScript. What!?! I find it hard to believe that an accurate translation would be possible, but there are JavaScript obsessives out there who could probably do it, somehow. I'll just assume the tests are reliable.
Anyway, I generated a dozen strings of 1 million ones and zeroes, pasted them into the page (via a tiny textbox) and hit run.
The first test produced 12 Passed and 2 Error … what the hell is an Error? Ignoring the errors because I guessed they were web site internal problems, I thought getting 12 out of 12 successful test passes was a fabulous outcome.
However, following tests on different data strings produced confusing results. Sometimes the Runs Test would give Failed and I couldn't find any pattern. A few times the Random Excursions Test produced Failed instead of Error and I couldn't figure why.
The good news is that a few test runs produced 14 out of 14 Passed. Overall the results are a bit confusing due to lack of knowledge of the testing internals and some random failures, but it pleases me to imagine that the testing is sensible and my stupid random generator is producing very high quality random bits. I would still like to find another randomness testing facility that is more professional to perform more stress tests on the stupid generator.