![]() |
stringhash |
|
| Home « Computers « stringhash | ||
Overview
For many years I've been wondering how the standard string class GetHashCode method behaved in the .NET Framework. I wondered how well distributed the hash code integers were and if they were collision resistant. Collision resistance is highly desirable when the integers are used as keys in collections.
I spent a few mornings over the 2005 Christmas holidays writing a small forms application that hashes strings using the standard string GetHashCode and some other algorithms and compares their distribution and collisions. The first tab in the window hashes strings one at a time and shows the resulting hash values, or you can click Run to start a timer loop that hashes hundreds of automatically generated strings per second. The integer hash values are scaled down and plotted inside a pair of 256x256 grids to visually show their distribution. The second tab runs a collision test against all 2-character strings of letters and digits.
Framework 1.1
Shown below is the C code for the Framework versions 1.0 and 1.1 string GetHashCode method. This code is implemented in native code and is clearly designed to be fast rather than robust. Next to the code is a thumbnail of my old Framework 1.1 version of stringhash that shows the hash distribution (on the left) for random strings of length 5 to 6. You can see that the distribution is far from random and looks seriously flawed. My initial tests show that the standard string hash has a dreadfully poor distribution for strings less than 6 characters in length, and it has many collisions for 2-character strings.
Framework 2.0
After the arrival of Framework 2.0 I recompiled stringhash to see if and how String GetHashCode had changed. It turns out that the code is no longer native code and can be displayed using Reflector.
You can see that roughly the same number of points from the Framework 2.0 String GetHashCode have a much better distribution that they did in Framework 1.x. There are no 2-character collisions any more.
Download
| stringhash_src.zip (201KB) - Zip of the full Visual Studio 2005 solution and source code. |
|
| stringhash_exe.zip (22KB) - Zip of the executable files. Place them in any folder and run stringhash.exe. No installer is needed and no files are created at runtime. |
Links
An Illustrated Guide to Cryptographic Hashes