MD5算法是一种摘要算法,它可以从多个字节组成的串中计算出由32个字节构成的“特征串”。对于超过32字节的串来说,MD5计算得出的值必然是其一个子集,所以必然存在两个(或更多)不同的串能够得出相同MD5值的情况,这种情况就叫做MD5碰撞。
几位密码学家使用 “构造前缀碰撞法”(chosen-prefix collisions)来进行攻击(是王小云所使用的攻击方法的改进版本),他们所使用的计算机是一台Sony PS3,且仅用了不到两天。如果仅仅是想要生成MD5 相同而内容不同的文件的话,在任何主流配置的电脑上用几秒钟就可以完成。
他们的结论:MD5 算法不应再被用于任何软件完整性检查或代码签名的用途。
验证软件完整性时可能出现的问题:
A.文件不完整的情况
a.感染病毒
b.植入木马/后门/人为篡改
c.传输故障
B.可能出现的问题
a.如果有第三方在验证软件完整性时截取软件代码,使用快速MD5碰撞生成器,在短时间内伪造一份相同的MD5,并恶意篡改软件,那么安全性 将会大大下降
b.当软件过大时,在验证过程中所需的时间也会大大增加,对于第三方而言,攻击的成功概率也会增加
c.网站链接中的Vulnerability analysis也给出了一些问题分析:
On the other hand, there is the viewpoint of the relying party, i.e. the user downloading hashed or signed code who needs some guarantee that this software can be trusted. This relying party can not be sure anymore that the published hash value or the digital signature is valid for only the executable file he downloaded. There might very well be a sibling file with the same hash value or digital signature, while only one of these siblings went through the proper hashing or signing procedure. Especially when the software integrity verification takes place under the hood, with the user not knowing that the operating system or some hidden application is silently verifying digital signatures on software to be installed, the user may be more easily lured into installing malware.
Note that it is not necessary for an attacker to build both executables from source code. It is perfectly well possible to take as the first file any executable from any source, and as the second file produce a second executable as malware. Then a byte block to be appended to both files can be found such that the resulting files have the same MD5 hash value. If an attacker can then get the first file to be signed, e.g. by the original software vendor, this signature will also be valid for the attacker-constructed malware.