Checksums are an outdated concept

Today we take a brief look at checksums. As always, I will keep it as short as possible to avoid confusion and get straight to the point.

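In short: what are checksums?
A checksum is a short value computed from a piece of data. It is used to determine whether two copies of that data are the same, for example whether the file you downloaded matches the file the author published.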
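
As a minimal sketch of what "computing a checksum" means in practice, the following uses Python's standard hashlib module to produce a SHA-256 digest of a file. The function name sha256_of_file and the chunk size are my own choices for illustration, not something any platform prescribes.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks so large downloads do not have to fit in memory.
    `sha256_of_file` and `chunk_size` are illustrative names, not a standard API.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with identical content give the same digest, and changing a single byte gives a completely different one. That is all a checksum tells you.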

Outdated and no one uses it
First of all, by "no one" I mean the masses, not people with an academic or scientific background, coders, script kiddies and so on. I mean the average Jimmy.

The major issue with checksums is that you need tools, scripts or something third-party to check them. Most operating systems do not display them directly, or only through hidden CLI tools that normal users typically never use (sha256sum on Linux, shasum on macOS, certutil or Get-FileHash on Windows). The problem is simply awareness, and knowing how to work with what the operating system provides.

GitHub and the missing open source concern
GitHub has many concerns; some are well known, others are a matter of perspective. Personally, I think the platform should directly integrate functions such as reproducible builds and display checksums on every binary. As of today this does not exist, or you need CI, bots, Actions and other workarounds that are typically not provided by GitHub itself: community workarounds for platform flaws that GitHub has refused to address for a decade now.

Why are no checksums shown on every download to avoid confusion? Well, my speculation is that everyone thinks the platform is secure and that data breaches cannot happen that easily. In truth, it might be more that fewer people ask for it, because if the server is breached, the binaries and the displayed checksums can both be replaced, which is not impossible.

We simply need reproducible systems that show whether the actual data on the server has been touched, along with better indicators.

Let's assume you check the checksums, so what then?
Another problem with checksums is that even if you check them, you can only compare what you downloaded against what the server displays. Your verification relies on visually comparing the displayed value with the hash you calculated yourself. In times of AI and ML this system is heavily outdated, given that some checksum algorithms can be manipulated more or less easily (MD5 and SHA-1, for example, have known practical collision attacks).
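
To make the limitation concrete, here is a rough sketch of the comparison step in Python. The function name matches_published is hypothetical, and the sketch assumes the published value is a SHA-256 hex string copied from the download page. Note what it can and cannot prove: it confirms that your download matches what the server displays, nothing more.

```python
import hashlib
import hmac


def matches_published(data: bytes, published_hex: str) -> bool:
    """Check downloaded bytes against a published SHA-256 hex digest.

    `matches_published` is an illustrative helper. It replaces the error-prone
    visual comparison, but it still trusts that the published value is genuine.
    """
    actual = hashlib.sha256(data).hexdigest()
    # Normalise whitespace and case, since pages show digests in either case.
    expected = published_hex.strip().lower()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual, expected)
```

If an attacker replaces both the binary and the displayed checksum, this function still returns True, which is exactly the problem described above.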

The average user never compiles things himself, and built-in updater solutions usually show only the changelog, without any verification or indicators. This can once again lead to problems, because you have to trust entirely that the server is not already delivering malware. The only protection is the robustness of the algorithm against manipulation, together with security measures on the download server that prevent a breach in the first place.

Other systems
We have several problems and not many solutions. What we need is a build system that makes it easy to verify binaries against the given code, cross-reference them, and then re-check them on a daily, weekly or monthly basis.
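
The periodic re-check part of that idea could look roughly like the sketch below. It assumes a manifest, here just a Python dict mapping file names to the SHA-256 digests recorded at release time; the manifest format and the name find_changed_files are my own invention for illustration. It does not cover the harder part, rebuilding the binary from source and comparing the result.

```python
import hashlib
from pathlib import Path


def find_changed_files(directory, manifest):
    """Return the names of files that are missing or no longer match the manifest.

    `manifest` maps file names to expected SHA-256 hex digests (an assumed,
    hypothetical format). Run this on a schedule to detect touched files.
    """
    changed = []
    for name, expected in sorted(manifest.items()):
        path = Path(directory) / name
        if not path.is_file():
            # A file that disappeared counts as changed.
            changed.append(name)
            continue
        actual = hashlib.sha256(path.read_bytes()).hexdigest()
        if actual != expected:
            changed.append(name)
    return changed
```

An empty list means every recorded file is still byte-for-byte identical. The manifest itself must be stored somewhere the attacker cannot reach, otherwise the check proves nothing.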

Alternatives that could be enforced on the client side as well as on the server side are HMACs with a cryptographic hash function such as SHA-256. These could be cross-referenced from time to time, and the result shown as some sort of traffic-light logo, to make it easier for beginners to understand whether a file has been manipulated or not.
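
A minimal sketch of that idea, using Python's standard hmac module with SHA-256. The names make_tag and traffic_light and the three colour values are hypothetical. The sketch also assumes that publisher and client already share a secret key, and how that key is distributed safely is left open here.

```python
import hashlib
import hmac


def make_tag(key: bytes, data: bytes) -> str:
    """Compute an HMAC-SHA256 tag over `data` with the shared secret `key`."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def traffic_light(key: bytes, data: bytes, expected_tag):
    """Map a verification result to a colour a beginner can read at a glance.

    green  = tag matches, file is unchanged
    red    = tag does not match, file or tag was manipulated
    yellow = no tag available, nothing could be verified
    """
    if expected_tag is None:
        return "yellow"
    actual = make_tag(key, data)
    # Constant-time comparison, as recommended for comparing MACs.
    return "green" if hmac.compare_digest(actual, expected_tag) else "red"
```

Unlike a plain checksum, an attacker who replaces the file cannot produce a matching tag without the key, so swapping both the binary and the displayed value no longer works.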
