【Cryptanalysis (Monoalphabetic Substitution)】
1. Equipment
(1) Operating system version: Windows 10
(2) CPU instruction set: x64
(3) Software: MATLAB R2020a
2. Process
- Problem background analysis
Cryptanalysis (monoalphabetic substitution):
Ciphertext 1: UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZVUEPHZHMDZSHZQWSFPAPPDTSVPQUZWYMXUZUHSXEPYEPOPPZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
Ciphertext 2: JXQCEFMPJASOQMDPQABCSTYSMGRQBTQOASKOAOUWCPQBDPMEEASIVMWPOQVJXQVQCSORWBQKMMYVIQAOXQPVASBFPAOJCOARQHFQPCQSOQASBQAOXXAVCJVMGSABZASJATQVJXQYSMGRQBTQGQTACSDPMEKMMYVASBDMPEARQBWOAJCMSQSAKRQVWVJIMRQAPSAKMWJIXCSTVXAJGQXAZQSMMFFMPJWSCJIMQHFQPCQSOQCSBACRIRCDQG0OASVJWBIARRJXQFRAOQVCSIXQGMPRBASBRQAPSDPMEFQMFRQGQGCRRSQZQPEQQJCSMWRCDQJCEQLWVJKIPQABCSTJXQCPKMMYVGQOASARVMBQZQRMFMWPASARICOARVYCRRVASBRQAPSXMGJMZCQGASBCSJQPFPQJIXQGMPRBAPMWSBWVCSBCDDQPQSJGAIVGQOASRQAPSIXQFAVJKIPQABCSTKMMYVCSJXCVGAIGQGMSJPQFQAJIXQECVJAYQVMMIXQPVASBOASKWCRBMSJXQCPAOXCQZQEQSJV
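The problem calls for cryptanalysis of a monoalphabetic substitution cipher, that is, a cipher that uses one fixed substitution table for both encryption and decryption. Monoalphabetic ciphers include the general substitution cipher, the shift cipher, the affine cipher, and others. In the general substitution cipher considered here, a substitution table is built first. To encrypt, each plaintext character is looked up in the table and replaced by the corresponding character. Once every character has been replaced, the result is a string with no apparent meaning: the ciphertext. The key of a substitution cipher is its substitution table.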
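The table lookup described above can be sketched in a few lines. This is a Python illustration rather than the article's MATLAB, and the names `make_table`, `shift_table`, `encrypt`, and `decrypt` are hypothetical helpers introduced only for this example. The random table stands in for an arbitrary key, and the shift cipher appears as the special case where the table is a rotated alphabet.

```python
import random
import string

ALPHABET = string.ascii_uppercase

def make_table(seed):
    """Hypothetical helper: build a substitution table (the key) by shuffling A..Z."""
    shuffled = list(ALPHABET)
    random.Random(seed).shuffle(shuffled)
    return dict(zip(ALPHABET, shuffled))

def shift_table(k):
    """Shift cipher as a special case: every letter moves k places forward."""
    return {ch: ALPHABET[(i + k) % 26] for i, ch in enumerate(ALPHABET)}

def encrypt(plaintext, table):
    """Replace each letter by its table entry; non-letters are dropped."""
    return "".join(table[ch] for ch in plaintext.upper() if ch in table)

def decrypt(ciphertext, table):
    """Invert the table and look every ciphertext letter up in it."""
    inverse = {c: p for p, c in table.items()}
    return "".join(inverse[ch] for ch in ciphertext)

table = make_table(seed=2020)            # the seed is an arbitrary choice
cipher = encrypt("it was disclosed yesterday", table)
print(cipher)                            # looks like a meaningless string
print(decrypt(cipher, table))            # ITWASDISCLOSEDYESTERDAY
print(encrypt("abc", shift_table(3)))    # DEF
```

Because the table is a permutation of the alphabet, decryption only needs the inverse lookup, which is why the table itself serves as the key.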
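Letter frequencies reflect the statistical properties of a language. Counting over a large body of text gives the probability with which each letter occurs in that language, and therefore a probability distribution over its alphabet. For example, Beker's 1982 count over a sample of 100,362 letters gave the single-letter probability distribution table used for comparison in this analysis.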
- Solution
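Ciphertext 1 is analyzed first. MATLAB's tabulate function is used to build a frequency table for the data in the vector ciphertext. The code is as follows:
code:
% Transpose to a column so that tabulate treats each character as one value
ciphertext=['UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZVUEPHZHMDZSHZQWSFPAPPDTSVPQUZWYMXUZUHSXEPYEPOPPZSZUFPOMBZWPFUPZHMDJUDTMOHMQ']';
tabulate(ciphertext)
The letter frequencies in the ciphertext are as follows: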
Value | Count | Percent |
---|---|---|
U | 10 | 8.33% |
Z | 14 | 11.67% |
Q | 4 | 3.33% |
S | 10 | 8.33% |
O | 8 | 6.67% |
V | 5 | 4.17% |
H | 7 | 5.83% |
X | 5 | 4.17% |
M | 8 | 6.67% |
P | 17 | 14.17% |
G | 2 | 1.67% |
E | 6 | 5.00% |
W | 4 | 3.33% |
F | 4 | 3.33% |
D | 5 | 4.17% |
B | 2 | 1.67% |
T | 3 | 2.50% |
A | 2 | 1.67% |
I | 1 | 0.83% |
Y | 2 | 1.67% |
J | 1 | 0.83% |
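The same table can be cross-checked outside MATLAB. The sketch below is a Python stand-in, not the author's code. It assumes `collections.Counter` as a rough equivalent of `tabulate`. Since `Counter` keeps letters in order of first appearance, the rows come out in the same order as the table above.

```python
from collections import Counter

# Ciphertext 1, split over three lines only for readability.
ciphertext = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMET"
    "SXAIZVUEPHZHMDZSHZQWSFPAPPDTSVPQUZWYMXUZ"
    "UHSXEPYEPOPPZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)

counts = Counter(ciphertext)   # letter -> number of occurrences
total = len(ciphertext)        # 120 letters in total

print("Value | Count | Percent |")
for letter, n in counts.items():
    # Same columns as the tabulate output: value, count, percentage.
    print(f"{letter} | {n} | {100 * n / total:.2f}% |")
```

Sorting `counts.most_common()` instead would list the letters from most to least frequent, which is often the more convenient view when comparing against a reference distribution.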
For an easier visual comparison, I also plotted these frequencies as a bar chart.
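Comparing these frequencies with the single-letter probability distribution table (the most frequent ciphertext letters, P and Z, correspond to e and t) leads to the following plaintext: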
it was disclosed yesterday that several informal but direct contacts have been made with political representatives of the viet cong in moscow
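Once some table entries are known, applying them is a plain lookup. The sketch below is a Python illustration, not the author's code. The partial key was read off by aligning the opening words of the plaintext above with the start of ciphertext 1, and `apply_key` is a hypothetical helper that prints `_` for letters whose mapping has not been decided yet.

```python
# Partial key (ciphertext letter -> plaintext letter), read off from
# "itwasdisclosed" aligned with the first 14 ciphertext letters.
PARTIAL_KEY = {
    "U": "i", "Z": "t", "Q": "w", "S": "a", "O": "s",
    "V": "d", "H": "c", "X": "l", "M": "o", "P": "e",
}

def apply_key(ciphertext, key):
    """Substitute known letters; mark undecided ones with an underscore."""
    return "".join(key.get(ch, "_") for ch in ciphertext)

prefix = "UZQSOVUOHXMOPVGPOZPEVSGZWSZ"   # first 27 letters of ciphertext 1
print(apply_key(prefix, PARTIAL_KEY))    # itwasdisclosed_este_da_t_at
```

The gaps in `_este_da_t_at` suggest "yesterday that", which in turn fixes G, E, and W. Working back and forth between partial output and guessed words in this way is how the rest of the table gets filled in.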
Ciphertext 2 is analyzed in the same way:
code:
% Transpose to a column so that tabulate treats each character as one value
ciphertext=['JXQCEFMPJASOQMDPQABCSTYSMGRQBTQOASKQAOUWCPQBDPMEEASIVMWPOQVJXQVQCSORWBQKMMYVJQAOXQPVASBFPAOJCOARQHFQPCQSOQASBQAOXXAVCJVMGSABZASJATQVJXQYSMGRQBTQGQTACSDPMEKMMYVASBDMPEARQBWOAJCMSQSAKRQVWVJMRQAPSAKMWJJXCSTVXAJGQXAZQSMMFFMPJWSCJIJMQHFQPCQSOQCSBACRIRCDQGOOASVJWBIARRJXQFRAOQVCSJXQGMPRBASBRQAPSDPMEFQMFRQGQGCRRSQZQPEQQJCSMWRCDQJCEQLWVJKIPQABCSTJXQCPKMMYVGQOASARVMBQZQRMFMWPASARIJCOARVYCRRVASBRQAPSXMGJMZCQGASBCSJQPFPQJJXQGMPRBAPMWSBWVCSBCDDQPQSJGAIVGQOASRQAPSJXQFAVJKIPQABCSTKMMYVCSJXCVGAIGQGMSJPQFQAJJXQECVJAYQVMMJXQPVASBOASKWCRBMSJXQCPAOXCQZQEQSJV']';
tabulate(ciphertext)
The letter frequencies in the ciphertext are as follows:
Value | Count | Percent |
---|---|---|
J | 39 | 7.17% |
X | 20 | 3.68% |
Q | 73 | 13.42% |
C | 35 | 6.43% |
E | 10 | 1.84% |
F | 13 | 2.39% |
M | 39 | 7.17% |
P | 31 | 5.70% |
A | 51 | 9.38% |
S | 47 | 8.64% |
O | 20 | 3.68% |
D | 9 | 1.65% |
B | 24 | 4.41% |
T | 8 | 1.47% |
Y | 8 | 1.47% |
G | 18 | 3.31% |
R | 28 | 5.15% |
K | 10 | 1.84% |
U | 1 | 0.18% |
W | 14 | 2.57% |
I | 9 | 1.65% |
V | 28 | 5.15% |
H | 2 | 0.37% |
Z | 6 | 1.10% |
L | 1 | 0.18% |
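These frequencies were likewise plotted as a bar chart.
Comparing them with the single-letter probability distribution table (the most frequent ciphertext letter, Q, corresponds to e) leads to the following plaintext: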
The importance of reading knowledge can be acquired from many sources, these include books, teachers and practical experience and each has its own advantages. The knowledge we gain from books and formal education enables us to learn about things that we have no opportunity to experience in daily life. We can study all the places in the world and learn from people we will never meet in our life time, just by reading their books, we can also develop our analytical skills and learn how to view and interpret the world around us in different ways, we can learn the past by reading books in this way we won't repeat the mistakes of others and can build on their achievements.
3. Summary and takeaways
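I first wrote the letter-frequency function in C. It read the text one character at a time, recorded each character in the corresponding tag (counter) slot, and finally printed the results with a for loop. The process was cumbersome and the way the statistics were displayed was not ideal. After consulting the official MATLAB documentation, I learned that the tabulate function can do this frequency count, so I switched to MATLAB.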
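The original C code is not shown, so the following is only a guess at the approach described, written in Python for brevity. It uses one counter slot per letter, indexed by the letter's offset from `A`, and an output loop at the end. The name `count_letters` is hypothetical.

```python
def count_letters(text):
    """Count A..Z with a fixed array of 26 slots, one per letter."""
    counts = [0] * 26
    for ch in text:                          # read character by character
        if "A" <= ch <= "Z":
            counts[ord(ch) - ord("A")] += 1  # slot index = offset from 'A'
    return counts

counts = count_letters("UZQSOVUOHXMOPV")     # first 14 letters of ciphertext 1
for i, n in enumerate(counts):               # output loop, skipping unused slots
    if n:
        print(chr(ord("A") + i), n)
```

Compared with `tabulate`, this version prints letters in alphabetical order and needs the percentage computed separately, which may be part of why the table-based output felt more convenient.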
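Working through this exercise gave me my first practical understanding of frequency analysis as a method for breaking classical ciphers. In natural language, some letters of the alphabet occur more often than others, and frequency analysis assumes that the cipher does not hide this statistical information. In a simple substitution cipher, for example, each letter is just replaced by another letter, so the most frequent letter in the ciphertext most likely stands for E. Matching the remaining letters by their frequencies in the same way then leads to the decryption of the ciphertext.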
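The rank-matching idea can be sketched as follows, again in Python rather than MATLAB. The string `ENGLISH_ORDER` is an assumed, commonly quoted ordering of English letters from most to least frequent. It is not Beker's table, and other sources order the rarer letters differently. `initial_guess` is a hypothetical helper.

```python
from collections import Counter

# Assumed frequency ordering of English letters (most frequent first).
ENGLISH_ORDER = "etaoinshrdlcumwfgypbvkjxqz"

def initial_guess(ciphertext):
    """Pair ciphertext letters, ranked by count, with ENGLISH_ORDER."""
    ranked = [letter for letter, _ in Counter(ciphertext).most_common()]
    return dict(zip(ranked, ENGLISH_ORDER))

ciphertext = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMET"
    "SXAIZVUEPHZHMDZSHZQWSFPAPPDTSVPQUZWYMXUZ"
    "UHSXEPYEPOPPZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)
guess = initial_guess(ciphertext)
print(guess["P"], guess["Z"])   # e t
```

For ciphertext 1 the two top-ranked guesses (P as e, Z as t) agree with the plaintext reported earlier, but the third (U as a) does not, since U actually stands for i. With only 120 letters the counts are too noisy for pure rank matching, so the guess is a starting point that still has to be refined against likely words.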
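I am new to information security, so there may be mistakes here. Corrections are welcome.
Because of the limits of the text format, the project implementing the algorithms in this article cannot be shown here. It has been uploaded and can be downloaded from the link below.
Project files: Monoalphabetic substitution cryptanalysis, principles explained and algorithm implementation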