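Preface
There is plenty of material online about Il2cpp. In short, Il2cpp is the build backend Unity uses in place of the original Mono virtual machine: the C# code is first compiled to IL (intermediate language), which is then translated into C++ source files and compiled to native code, improving runtime performance and making the game harder to reverse. Builds based on Mono are extremely easy to reverse, so most new games on the market now ship with Il2cpp. There are also many Il2cpp reversing tutorials, but they are all much the same: run the well-known Il2cppDumper tool and you are done, which teaches very little. In practice, because the tool is so famous, many game vendors have adopted countermeasures, so following those tutorials fails most of the time. I therefore decided to study Il2cpp attack and defense techniques, and found an Il2cpp CTF challenge to practice on. Challenge source: n1ctf-2018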
baby unity3d
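The task is clear: enter the correct flag. Since we already know this is a Unity program built with Il2cpp, we go straight for its libil2cpp.so and global-metadata.dat files and try to parse them with Il2cppDumper, which of course fails. The cause must lie in these two files: at least one of them is encrypted, so it cannot be parsed normally. This challenge is fairly basic and only encrypts global-metadata.dat. Open global-metadata.dat in 010 Editor: a normal global-metadata.dat starts with the four bytes AF 1B B1 FA, while this one clearly does not, so it has been encrypted. Once it is decrypted the parsing will work, and the flag follows from there.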
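The header check done by eye in 010 Editor can also be scripted. The following is a minimal sketch; the function name `has_metadata_magic` is made up for illustration, and the only fact it relies on is the magic AF 1B B1 FA quoted above.

```python
import struct

# Magic of an unencrypted global-metadata.dat: bytes AF 1B B1 FA,
# which read as a little-endian 32-bit integer is 0xFAB11BAF.
IL2CPP_MAGIC = 0xFAB11BAF


def has_metadata_magic(data: bytes) -> bool:
    """Return True if data starts with the normal metadata magic."""
    if len(data) < 4:
        return False
    (magic,) = struct.unpack_from("<I", data, 0)
    return magic == IL2CPP_MAGIC
```

Reading the first four bytes of the file and passing them to `has_metadata_magic` is enough to tell a plain file from an encrypted or modified one.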
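There are two ways to decrypt global-metadata.dat: dump the decrypted result from memory, or analyze the encryption algorithm. For the first approach, here is a frida script: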
function frida_Memory(pattern)
{
    Java.perform(function ()
    {
        console.log("Header signature: " + pattern);
        // Scan every readable memory range for the magic bytes.
        var addrArray = Process.enumerateRanges("r--");
        for (var i = 0; i < addrArray.length; i++)
        {
            var addr = addrArray[i];
            Memory.scan(addr.base, addr.size, pattern,
            {
                onMatch: function (address, size)
                {
                    console.log("Found " + pattern + " at address: " + address.toString());
                    console.log(hexdump(address,
                    {
                        offset: 0,
                        length: 64,
                        header: true,
                        ansi: true
                    }));
                    // If 0x108 / 0x10C do not work, try 0x100 / 0x104.
                    var DefinitionsOffset = address.add(0x108);
                    var DefinitionsOffset_size = DefinitionsOffset.readInt();
                    var DefinitionsCount = address.add(0x10C);
                    var DefinitionsCount_size = DefinitionsCount.readInt();
                    // The two header fields together give the size of global-metadata.
                    var global_metadata_size = DefinitionsOffset_size + DefinitionsCount_size;
                    console.log("Size:", global_metadata_size);
                    // get_self_process_name() is a helper that is not shown in this listing.
                    var file = new File("/data/data/" + get_self_process_name() + "/global-metadata.dat", "wb");
                    file.write(address.readByteArray(global_metadata_size));
                    file.flush();
                    file.close();
                    console.log("Dump finished...");
                },
                onComplete: function ()
                {
                    //console.log("Scan finished")
                }
            });
        }
    });
}
// Header signature of global-metadata.dat. Wrapped in a function so that
// setImmediate receives a callback instead of the call's return value.
setImmediate(function () { frida_Memory("AF 1B B1 FA"); });
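The overall flow is: locate the start of the file in memory by its magic number, parse the file header to compute the file size, and then dump it. The script only works if the decrypted global-metadata.dat still carries the normal magic AF 1B B1 FA (otherwise it cannot be located) and its header fields are correct (otherwise the size cannot be computed). The script is a useful reference, but it does not help with this challenge: it runs without finding a start address, so even after decryption there is no AF 1B B1 FA in memory. This generic dump approach is therefore out. The only option is to find the function that loads global-metadata.dat and dump after it finishes decrypting, which means analyzing how global-metadata.dat is loaded.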
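The size calculation in the frida script can be reproduced offline on a memory dump. This is only a sketch: `metadata_size` is a made-up name, and the assumption (taken from the script's own comment) is that the two 32-bit fields at 0x108 and 0x10C hold the offset and size of the last section, so their sum is the file size. For other metadata versions the fields may sit at 0x100 and 0x104 instead.

```python
import struct


def metadata_size(buf: bytes, field_offset: int = 0x108) -> int:
    """Estimate the metadata size from one (offset, size) header pair.

    Assumes the pair at field_offset describes the last section in the
    file, so offset + size is the total length.
    """
    section_offset, section_size = struct.unpack_from("<ii", buf, field_offset)
    return section_offset + section_size
```

If the result is negative or implausibly large, the header is probably still encrypted or the field offsets are wrong for that version.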
global-metadata.dat loading flow
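The article IL2CPP Tutorial: Finding loaders for obfuscated global-metadata.dat files describes the global-metadata.dat loading flow in detail and is well worth reading. In brief, libil2cpp.so contains a function il2cpp_init, which is the first function in the loader call chain. The whole chain looks like this: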
il2cpp_init
-> il2cpp::vm::Runtime::Init
-> il2cpp::vm::MetadataCache::Initialize
-> il2cpp::vm::MetadataLoader::LoadMetadataFile
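We can search libil2cpp.so for il2cpp_init, or for other keywords from the call chain, to locate one of these functions. The simplest way is to search for the string global-metadata.dat, which leads directly to MetadataCache::Initialize. That does not work here, because the challenge author deliberately encrypted the global-metadata.dat string, so the search finds nothing. Instead we search for il2cpp_init and follow the code downward, comparing against the source, until we reach MetadataCache::Initialize.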
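What if none of the functions above can be found? As the article also mentions, libunity.so resolves the il2cpp_init symbol to obtain its address, so that lookup can serve as a starting point; see the article for details. The article also has an example in which il2cpp_init was ROT-5 encoded (each letter shifted forward by 5), turning the function name into nq2huu_nsny. I found that searching for nq2huu_nsny also gets hits in some cases I looked at myself, so nq2huu_nsny is worth trying too.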
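The letter shift and the search for both spellings can be sketched as follows. The helper names `shift_letters` and `find_candidates` are made up for illustration, and the search is a plain byte search over the file contents rather than a proper ELF symbol lookup.

```python
def shift_letters(name: str, shift: int = 5) -> str:
    """Shift every ASCII letter forward by `shift`, leaving other characters alone."""
    out = []
    for ch in name:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def find_candidates(blob: bytes, names):
    """Return {name: first byte offset} for each name that occurs in blob."""
    hits = {}
    for name in names:
        pos = blob.find(name.encode("ascii"))
        if pos != -1:
            hits[name] = pos
    return hits
```

Calling `find_candidates` on the bytes of libil2cpp.so or libunity.so with `["il2cpp_init", shift_letters("il2cpp_init")]` shows which spelling, if either, is present.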
il2cpp_init
int __fastcall il2cpp_init(int a1)
{
  setlocale(6, "");
  return sub_4C4770(a1, "v2.0.50727");
}
sub_4C4770
int __fastcall sub_4C4770(int a1)
{
  .......
  v1 = nullsub_3();
  v2 = nullsub_1(v1);
  v3 = sub_514E34(v2);
  dword_695A80 = (int)"2.0";
  v4 = sub_4F8468(v3);
  v5 = sub_5171B8(v4);
  v6 = sub_4B5564(v5);
  v7 = sub_501A60(v6);
  v8 = sub_4FA8B8(v7);
  v9 = sub_4E0D84(v8);
  sub_4D566C(v9);
  memset(&dword_695AB0, 0, 0x13Cu);
  v10 = sub_5017E4("mscorlib.dll");
  dword_695AB0 = il2cpp_assembly_get_image_0(v10);
  dword_695AB4 = ((int (*)(void))il2cpp_class_from_name_0)();
  dword_695ABC = il2cpp_class_from_name_0(dword_695AB0, "System", "Void");
  dword_695AC0 = il2cpp_class_from_name_0(dword_695AB0, "System", "Boolean");
  dword_695AB8 = il2cpp_class_from_name_0(dword_695AB0, "System", "Byte");
  dword_695AC4 = il2cpp_class_from_name_0(dword_695AB0, "System", &unk_5BA5E1);
  dword_695AC8 = il2cpp_class_from_name_0(dword_695AB0, "System", "Int16");
  dword_695ACC = il2cpp_class_from_name_0(dword_695AB0, "System", &unk_5BA5E7);
  dword_695AD0 = il2cpp_class_from_name_0(dword_695AB0, "System", "Int32");
  ......
Stepping into them one by one shows that sub_4B5564 is in fact MetadataCache::Initialize:
void sub_4B5564()
{
  void *v0; // r4
  int v1; // r4
  unsigned int v2; // r7
  int v3; // r0
  int v4; // lr
  int v5; // r2
  int v6; // r4
  int v7; // r3
  _DWORD *v8; // r1
  int v9; // r6
  int v10; // r0
  unsigned int v11; // r3
  int v12; // r7
  int v13; // r1
  unsigned int v14; // r1
  unsigned int v15; // r9
  int v16; // r6
  unsigned __int16 v17; // r0
  unsigned __int16 *v18; // r6
  int v19; // t1
  _DWORD *v20; // r7
  unsigned __int16 v21; // r4
  int v22; // r2
  int v23; // r1
  int v24; // r0
  int v25; // r6
  int v26; // r7
  int v27; // r1
  int v28; // [sp+8h] [bp-48h]
  unsigned int v29; // [sp+Ch] [bp-44h]
  int v30; // [sp+10h] [bp-40h]
  int v31; // [sp+14h] [bp-3Ch]
  int v32[2]; // [sp+18h] [bp-38h] BYREF
  int v33; // [sp+20h] [bp-30h] BYREF
  int v34; // [sp+24h] [bp-2Ch]
  double v35; // [sp+28h] [bp-28h] BYREF
  int v36; // [sp+30h] [bp-20h]

  v0 = (void *)sub_4B5518("CLKFIL\rMETIDITI\nDIT", 19);
  dword_6959CC = sub_513060();
  free(v0);
  dword_6959D0 = dword_6959CC;
  v28 = dword_6959CC + *(_DWORD *)(dword_6959CC + 184);
  if ( *(_DWORD *)(dword_6959CC + 188) >= 0x44u )
  {
    v1 = dword_6959CC + *(_DWORD *)(dword_6959CC + 184);
    v2 = 0;
    do
    {
      sub_5019F8(v1);
      v1 += 68;
      ++v2;
    }
    while ( v2 < *(_DWORD *)(dword_6959D0 + 188) / 0x44u );
  }
  dword_6959D4 = sub_5169D4(*(_DWORD *)(dword_6959C4 + 24), 4);
  dword_6959D8 = sub_5169D4(*(_DWORD *)(dword_6959D0 + 164) / 0x68u, 4);
  dword_6959DC = sub_5169D4(*(_DWORD *)(dword_6959D0 + 52) / 0x38u, 4);
  dword_6959E0 = sub_5169D4(*(_DWORD *)(dword_6959C4 + 32), 4);
  dword_6959E4 = *(_DWORD *)(dword_6959D0 + 180) / 0x18u;
  v3 = sub_5169D4(dword_6959E4, 28);
  dword_6959E8 = v3;
  if ( dword_6959E4 >= 1 )
  {
    v4 = dword_6959CC;
    v5 = 0;
    v6 = dword_6959D0;
    v7 = 12;
    v8 = (_DWORD *)(*(_DWORD *)(dword_6959D0 + 176) + dword_6959CC + 12);
    while ( 1 )
    {
      v9 = v3 + v7;
      ++v5;
      *(_DWORD *)(v9 - 12) = v4 + *(_DWORD *)(v6 + 24) + *(v8 - 3);
      *(_DWORD *)(v9 - 8) = *(v8 - 2);
      *(_DWORD *)(v9 - 4) = *(v8 - 1);
      *(_DWORD *)(v3 + v7) = *v8;
      *(_DWORD *)(v9 + 4) = v8[1];
      *(_DWORD *)(v9 + 12) = v8[2];
      if ( v5 >= dword_6959E4 )
        break;
      v7 += 28;
      v8 += 6;
      v6 = dword_6959D0;
      v4 = dword_6959CC;
      v3 = dword_6959E8;
    }
  }
  sub_4B5A28();
  v35 = 0.0;
  v36 = 0;
  v10 = dword_6959D0;
  if ( *(_DWORD *)(dword_6959D0 + 188) >= 0x44u )
  {
    v11 = 0;
    v31 = dword_6959CC + *(_DWORD *)(dword_6959D0 + 160);
    do
    {
      v12 = 0;
      v13 = *(_DWORD *)(v28 + 68 * v11);
      if ( v13 != -1 )
        v12 = dword_6959E8 + 28 * v13;
      v30 = v12;
      v14 = *(_DWORD *)(v12 + 12);
      if ( v14 )
      {
        v15 = 0;
        v29 = v11;
        do
        {
          v16 = v31 + 104 * (*(_DWORD *)(v12 + 8) + v15);
          v19 = *(unsigned __int16 *)(v16 + 80);
          v18 = (unsigned __int16 *)(v16 + 80);
          v17 = v19;
          if ( v19 )
          {
            v20 = (_DWORD *)(v31 + 104 * (*(_DWORD *)(v12 + 8) + v15) + 52);
            v21 = 0;
            do
            {
              v22 = *(_DWORD *)(dword_6959D0 + 48);
              v34 = *v20 + v21;
              v23 = *(_DWORD *)(dword_6959CC + v22 + 56 * v34 + 24);
              if ( v23 == -1 )
              {
                v33 = 0;
              }
              else
              {
                v33 = *(_DWORD *)(*(_DWORD *)(dword_6959C0 + 4) + 4 * v23);
                if ( v33 )
                {
                  sub_4B5CFC(&v35, &v33);
                  v17 = *v18;
                }
              }
              ++v21;
            }
            while ( v21 < (unsigned int)v17 );
            v12 = v30;
            v14 = *(_DWORD *)(v30 + 12);
          }
          ++v15;
        }
        while ( v15 < v14 );
        v11 = v29;
        v10 = dword_6959D0;
      }
      ++v11;
    }
    while ( v11 < *(_DWORD *)(v10 + 188) / 0x44u );
  }
  v24 = dword_6959C4;
  if ( *(int *)(dword_6959C4 + 16) >= 1 )
  {
    v25 = 0;
    v26 = 0;
    do
    {
      v27 = *(_DWORD *)(v24 + 20);
      v32[1] = *(_DWORD *)(*(_DWORD *)(v24 + 36) + 12 * *(_DWORD *)(v27 + v25));
      v32[0] = *(_DWORD *)(*(_DWORD *)(dword_6959C0 + 20) + 4 * *(_DWORD *)(v27 + v25 + 4));
      sub_4B5CFC(&v35, v32);
      v24 = dword_6959C4;
      v25 += 12;
      ++v26;
    }
    while ( v26 < *(_DWORD *)(dword_6959C4 + 16) );
  }
  sub_4C70FC(&v35);
  if ( LODWORD(v35) )
    operator delete((void *)LODWORD(v35));
}
Here the call sub_4B5518("CLKFIL\rMETIDITI\nDIT", 19); is where the encrypted string is decrypted into global-metadata.dat:
_BYTE *__fastcall sub_4B5518(char *a1, int a2)
{
  _BYTE *result; // r0
  int v5; // r1
  _BYTE *v6; // r2
  char v7; // t1

  result = malloc(a2 + 1);
  if ( a2 >= 1 )
  {
    v5 = a2;
    v6 = result;
    do
    {
      v7 = *a1++;
      --v5;
      *v6++ = (v7 - 2) ^ 0x26;
    }
    while ( v5 );
  }
  result[a2] = 0;
  return result;
}
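To confirm what the routine produces, the same transform, (byte - 2) XOR 0x26, can be replayed in Python. This is a sketch: `decode_string` is a made-up name, and the input assumes the 19-byte literal is CLKFIL\rMETIDITI\nDIT, that is, it contains a carriage return and a newline rather than the plain letters r and n.

```python
def decode_string(enc: bytes) -> str:
    """Apply (byte - 2) ^ 0x26 to every byte, as sub_4B5518 does."""
    return bytes(((b - 2) & 0xFF) ^ 0x26 for b in enc).decode("ascii")


print(decode_string(b"CLKFIL\rMETIDITI\nDIT"))  # prints global-metadata.dat
```

The two control characters are what decode to the `-` and `.` in the file name.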
And sub_513060 is actually MetadataLoader::LoadMetadataFile:
int __fastcall sub_513060(const char *a1)
{
  void *v2; // r0
  int v3; // r4
  int v4; // r6
  size_t v5; // r5
  int v6; // r8
  unsigned int *v7; // r2
  int v8; // r1
  void *v9; // r0
  void *v10; // r0
  unsigned int *v12; // r2
  int v13; // r1
  unsigned int *v14; // r2
  int v15; // r1
  int v16; // [sp+Ch] [bp-4Ch] BYREF
  int v17[2]; // [sp+10h] [bp-48h] BYREF
  int v18; // [sp+18h] [bp-40h] BYREF
  int v19[2]; // [sp+1Ch] [bp-3Ch] BYREF
  int v20; // [sp+24h] [bp-34h] BYREF
  int v21; // [sp+28h] [bp-30h] BYREF
  int v22[2]; // [sp+2Ch] [bp-2Ch] BYREF
  int v23[2]; // [sp+34h] [bp-24h] BYREF

  sub_4C5B40(&v20);
  v19[0] = (int)"Metadata";
  v19[1] = 8;
  v22[0] = v20;
  v22[1] = *(_DWORD *)(v20 - 12);
  sub_4C7F74(&v21, v22, v19);
  v2 = (void *)(v20 - 12);
  if ( (_UNKNOWN *)(v20 - 12) != &unk_6A25F4 )
  {
    v7 = (unsigned int *)(v20 - 4);
    __dmb(0xBu);
    do
      v8 = __ldrex(v7);
    while ( __strex(v8 - 1, v7) );
    __dmb(0xBu);
    if ( v8 <= 0 )
      j_operator delete(v2);
  }
  v17[0] = (int)a1;
  v17[1] = strlen(a1);
  v23[0] = v21;
  v23[1] = *(_DWORD *)(v21 - 12);
  sub_4C7F74(&v18, v23, v17);
  v3 = 0;
  v16 = 0;
  v4 = sub_4CDA80(&v18, 3, 1, 1, 0, &v16);
  if ( !v16 )
  {
    v5 = sub_4CDE4C(v4, &v16);
    if ( !v16 )
    {
      v6 = sub_5163A8(v4, 0, 0);
      sub_4CDCF4(v4, &v16);
      if ( v16 )
      {
        v3 = 0;
        sub_516540(v6, 0);
      }
      else
      {
        v3 = sub_512FDC(v6, v5);
      }
    }
  }
  v9 = (void *)(v18 - 12);
  if ( (_UNKNOWN *)(v18 - 12) != &unk_6A25F4 )
  {
    v12 = (unsigned int *)(v18 - 4);
    __dmb(0xBu);
    do
      v13 = __ldrex(v12);
    while ( __strex(v13 - 1, v12) );
    __dmb(0xBu);
    if ( v13 <= 0 )
      j_operator delete(v9);
  }
  v10 = (void *)(v21 - 12);
  if ( (_UNKNOWN *)(v21 - 12) != &unk_6A25F4 )
  {
    v14 = (unsigned int *)(v21 - 4);
    __dmb(0xBu);
    do
      v15 = __ldrex(v14);
    while ( __strex(v15 - 1, v14) );
    __dmb(0xBu);
    if ( v15 <= 0 )
      j_operator delete(v10);
  }
  return v3;
}
Comparing against the source shows that sub_512FDC is the decryption function:
char *__fastcall sub_512FDC(int a1, size_t size)
{
  char *result; // r0
  size_t v5; // r2

  result = (char *)malloc(size);
  if ( size )
  {
    v5 = 0;
    do
    {
      *(_DWORD *)&result[v5 & 0xFFFFFFFC] = *(_DWORD *)(a1 + (v5 & 0xFFFFFFFC)) ^ dword_5DCF6C[(v5 + v5 / 0x84) % 0x84];
      v5 += 4;
    }
    while ( v5 < size );
  }
  return result;
}
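Write the decryption script:
import struct

# Read the whole encrypted metadata file.
with open('global-metadata.dat', 'rb') as f:
    a = f.read()

# The 0x84 (132) key words copied from dword_5DCF6C.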
key = [0xF83DA249, 0x15D12772, 0x40C50697, 0x984E2B6B, 0x14EC5FF8, 0xB2E24927,
0x3B8F77AE, 0x472474CD, 0x5B0CE524, 0xA17E1A31, 0x6C60852C, 0xD86AD267, 0x832612B7, 0x1CA03645, 0x5515ABC8,
0xC5FEFF52, 0xFFFFAC00, 0x0FE95CB6, 0x79CF43DD, 0xAA48A3FB, 0xE1D71788, 0x97663D3A, 0xF5CFFEA7, 0xEE617632,
0x4B11A7EE, 0x040EF0B5, 0x0606FC00, 0xC1530FAE, 0x7A827441, 0xFCE91D44, 0x8C4CC1B1, 0x7294C28D, 0x8D976162,
0x8315435A, 0x3917A408, 0xAF7F1327, 0xD4BFAED7, 0x80D0ABFC, 0x63923DC3, 0xB0E6B35A, 0xB815088F, 0x9BACF123,
0xE32411C3, 0xA026100B, 0xBCF2FF58, 0x641C5CFC, 0xC4A2D7DC, 0x99E05DCA, 0x9DC699F7, 0xB76A8621, 0x8E40E03C,
0x28F3C2D4, 0x40F91223, 0x67A952E0, 0x505F3621, 0xBAF13D33, 0xA75B61CC, 0xAB6AEF54, 0xC4DFB60D, 0xD29D873A,
0x57A77146, 0x393F86B8, 0x2A734A54, 0x31A56AF6, 0x0C5D9160, 0xAF83A19A, 0x7FC9B41F, 0xD079EF47, 0xE3295281,
0x5602E3E5, 0xAB915E69, 0x225A1992, 0xA387F6B2, 0x7E981613, 0xFC6CF59A, 0xD34A7378, 0xB608B7D6, 0xA9EB93D9,
0x26DDB218, 0x65F33F5F, 0xF9314442, 0x5D5C0599, 0xEA72E774, 0x1605A502, 0xEC6CBC9F, 0x7F8A1BD1, 0x4DD8CF07,
0x2E6D79E0, 0x6990418F, 0xCF77BAD9, 0xD4FE0147, 0xFEF4A3E8, 0x85C45BDE, 0xB58F8E67, 0xA63EB8D7, 0xC69BD19B,
0xDA442DCA, 0x3C0C1743, 0xE6F39D49, 0x33568804, 0x85EB6320, 0xDA223445, 0x36C4A941, 0xA9185589, 0x71B22D67,
0xF59A2647, 0x3C8B583E, 0xD7717DED, 0xDF05699C, 0x4378367D, 0x1C459339, 0x85133B7F, 0x49800CE2, 0x3666CA0D,
0xAF7AB504, 0x4FF5B8F1, 0xC23772E3, 0x3544F31E, 0x0F673A57, 0xF40600E1, 0x7E967417, 0x15A26203, 0x5F2E34CE,
0x70C7921A, 0xD1C190DF, 0x5BB5DA6B, 0x60979C75, 0x4EA758A4, 0x078FE359, 0x1664639C, 0xAE14E73B, 0x2070FF03]
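with open('decrypt', 'wb') as fp:
    n = 0
    # Assumes the file length is a multiple of 4 bytes.
    while n < len(a):
        # XOR each little-endian 32-bit word with the key word chosen by its byte offset.
        num = struct.unpack("<I", a[n:n + 4])[0]
        num = num ^ key[(n + n // 0x84) % 0x84]
        d = struct.pack("<I", num)
        fp.write(d)
        n = n + 4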
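After decryption Il2cppDumper still fails. Opening the decrypted file in 010 Editor shows that the magic number is wrong; changing it to AF 1B B1 FA fixes that. It turns out the author removed the magic-number check from the loader, so the magic can be changed freely, which also defeats the generic frida dump script mentioned earlier.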
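The magic can also be patched by script instead of by hand in a hex editor. This is a minimal sketch; `patch_magic` is a made-up helper name, and it simply overwrites the first four bytes of the already decrypted data.

```python
# Expected magic of a normal global-metadata.dat.
MAGIC = bytes.fromhex("AF1BB1FA")


def patch_magic(data: bytes) -> bytes:
    """Return a copy of data whose first four bytes are the normal magic."""
    if len(data) < 4:
        raise ValueError("data is too short to contain a header")
    return MAGIC + data[4:]
```

Reading the decrypted file, passing its contents through `patch_magic`, and writing the result back gives a file that Il2cppDumper accepts for this challenge.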
Solution
After Il2cppDumper finishes parsing, the dump contains the following methods:
public Void .ctor() { }
// RVA: 0x518834 VA: 0xc575a834
private Void Start() { }
// RVA: 0x518838 VA: 0xc575a838
private Void Update() { }
// RVA: 0x51883c VA: 0xc575a83c
public Void Click() { }
// RVA: 0x518a24 VA: 0xc575aa24
private Boolean CheckFlag(String input) { }
// RVA: 0x518b54 VA: 0xc575ab54
public static String AESEncrypt(String text, String password, String iv) { }
// RVA: 0x518ee4 VA: 0xc575aee4
public static String AESDecrypt(String text, Byte[] password, Byte[] iv) { }
// RVA: 0x5191f0 VA: 0xc575b1f0
private static Void .cctor() { }
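With a longer dump it can help to collect the RVAs programmatically. The sketch below assumes, as in the listing above, that each `// RVA:` comment belongs to the method declared on the following line; `method_rvas` and the two regular expressions are made up for illustration and are not part of Il2cppDumper.

```python
import re

RVA_RE = re.compile(r"//\s*RVA:\s*(0x[0-9a-fA-F]+)")
METHOD_RE = re.compile(r"(\S+)\(.*\)\s*\{\s*\}")


def method_rvas(dump_text: str) -> dict:
    """Map method name -> RVA, pairing each RVA comment with the next method line."""
    rvas = {}
    pending = None
    for line in dump_text.splitlines():
        m = RVA_RE.search(line)
        if m:
            pending = int(m.group(1), 16)
            continue
        m = METHOD_RE.search(line)
        if m and pending is not None:
            rvas[m.group(1)] = pending
            pending = None
    return rvas
```

Overloaded methods share a name and would overwrite each other in this simple mapping, so it is only meant for quick lookups such as finding CheckFlag.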
The offset of CheckFlag is 0x518a24. Load libil2cpp.so into IDA, press G to jump there, and inspect the function:
int __fastcall sub_518A24(int a1, int a2)
{
  int v3; // r0
  int v4; // r4

  if ( !byte_69C825 )
  {
    sub_4B82BC(1279);
    byte_69C825 = 1;
  }
  v3 = dword_698140;
  if ( (*(_BYTE *)(dword_698140 + 178) & 1) != 0 && !*(_DWORD *)(dword_698140 + 96) )
  {
    il2cpp_runtime_class_init_0();
    v3 = dword_698140;
  }
  v4 = sub_518B54(
         *(_DWORD *)(v3 + 80),
         a2,
         *(_DWORD *)(*(_DWORD *)(v3 + 80) + 4000),
         *(_DWORD *)(*(_DWORD *)(v3 + 80) + 2364));
  if ( (*(_BYTE *)(dword_696FB8 + 178) & 1) != 0 && !*(_DWORD *)(dword_696FB8 + 96) )
    il2cpp_runtime_class_init_0();
  return sub_7D644(0, v4, dword_69B7F0, 0);
}
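The sub_518B54 function here is actually AESEncrypt (RVA 0x518b54 in the method list). The function names could be restored with Il2cppDumper's IDA script, but I skipped that step. The logic is roughly: encrypt the input and compare the result with the stored encrypted flag. So we only need to print the AES key and the encrypted flag, then decrypt it.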
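Other approaches
Several more advanced tools can help with reversing Il2cpp and make this challenge easy to solve:
- Zygisk-Il2CppDumper: a Magisk module that dumps function names and offsets at runtime. It used to require setting up an Android development environment yourself; the author has since switched to GitHub Actions, so you can fork the repo, fill in the package name, and build it directly.
- frida-il2cpp-bridge: a frida library with a rich feature set. Besides dumping function names and offsets at runtime, it also supports tracing, hooking, and more, and is well worth trying.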
Reference
- unity3d il2cpp principles and reverse-engineering analysis
- IL2CPP Tutorial: Finding loaders for obfuscated global-metadata.dat files
- Unity IL2CPP analysis
- Baby unity3D
- Il2cpp source code