Ahh.. the age-old question... Yes, there are better algorithms...
You see, CRC32 generates a 32-bit value... 32 bits are not enough to guarantee that our 15-meg file
gets a unique checksum..
You see, 32 bits gives us 4,294,967,296 possible checksums, while a 15-meg file
(15*1024*1024 bytes = 125,829,120 bits) has 2^125,829,120 possible contents, a number with
roughly 38 million digits.
As you can see there are vastly more distinct 15-meg files than checksums, so there is a
possibility that our CRC32 will generate the same code for two completely different files..
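If you want to see the numbers for yourself, here is a little Python sketch (I am assuming
"15 megs" means 15*1024*1024 bytes, that part is just my guess at the part size):

import math

# Compare the number of possible CRC32 values with the number of
# possible 15-meg files (assuming 15 megs = 15 * 1024 * 1024 bytes).
checksums = 2 ** 32                 # 4,294,967,296 possible CRC32 values
file_bits = 15 * 1024 * 1024 * 8    # 125,829,120 bits in a 15-meg file

# The number of distinct 15-meg files is 2**file_bits; just count its
# decimal digits instead of printing the whole thing.
digits = int(file_bits * math.log10(2)) + 1

print(f"possible CRC32 values: {checksums:,}")
print(f"possible 15-meg files: a number with {digits:,} digits")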
However this is not really a problem in practice.. Usually when files get corrupted there are
not a lot of changes (a few bits or bytes), and that is not enough for a corrupted file to
still produce the original checksum by accident... Another thing that saves us is that SFV
files are generated over relatively small files (15-meg parts), not complete ISOs (700 megs)...
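Here is a rough Python sketch of the idea behind an SFV check: flip a single bit and the
CRC32 no longer matches (the 1000 zero bytes are just a stand-in for a real file)...

import zlib

# Compute CRC32 over the original data, then over a copy with one flipped bit.
original = bytes(1000)              # 1000 zero bytes standing in for a file
corrupted = bytearray(original)
corrupted[500] ^= 0x01              # flip a single bit, simulating corruption

print(f"original : {zlib.crc32(original):08X}")
print(f"corrupted: {zlib.crc32(bytes(corrupted)):08X}")  # different checksum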
But as I said, better algorithms do exist... For example, large files (ironically, Linux
ISOs) are verified with the MD5 algorithm as opposed to SFV... MD5 produces 128 bits and
has better avalanching than CRC32 (this means even a single bit of difference will make a
significant impact on the output)... Why is MD5 not used everywhere? SFV became popular
first; MD5 is catching on, but it is not yet as widely used as SFV.
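If you want to see the avalanching in action, here is a small Python sketch (hashlib ships
with Python; the sample string is just a made-up input):

import hashlib

# Flip a single bit in the input and watch the whole MD5 digest change.
data_a = b"The quick brown fox jumps over the lazy dog"
data_b = bytearray(data_a)
data_b[0] ^= 0x01                             # one-bit difference in the input

print(hashlib.md5(data_a).hexdigest())        # 9e107d9d372bb6826bd81d3542a419d6
print(hashlib.md5(bytes(data_b)).hexdigest()) # completely different digest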
Further reading recommended:
RFC 1320 describes the MD4 Message-Digest Algorithm.
RFC 1321 describes the MD5 one.
Also you might want to head to your local bookstore and pick up some compression books (CRC
checks are used a lot in compression) and networking books (again, CRC is popular there)..
Books on discrete math might also be a good help (data integrity issues, how data maps onto
the checksum space, error detection, automatic error correction). A local university/college
is always a plus... ;D