Google CTF 2024
Table of Contents
Google CTF 2024 #
Reverse #
not_obfuscate #
Description:
True story: I had planned on learning to write LLVM passes and use them for obfuscation for this year’s gCTF challenge.However, when I compiled this code with optimizations enabled, I decided I didn’t need to write an obfuscator. It’s a crackme, you know the drill. GL HF!
analysis #
After a quick check, I found the code flow is effortless:
First, it reads 32 chars from the console, then converts it from hex bytes. It saves these bytes as a rc4 key. So we know that it takes the 16-byte key from our input.
That key is just used for rc4 to decrypt a given buffer, a.k.a flag.
After that, it converts 16 bytes to an array, also a 4x4 matrix (the matrix is that thing I don’t think about when doing the challenge, so I got stuck while analyzing and guessing that thing).
So, here is the code flow of the challenge:
matrix_input = convert_fromhex()
matrix_temp = func(matrix_input,given_matrix_1)
...// a whole bunch of code with `matrix_temp`...
result = func(matrix_temp,given_matrix_2)
result = substitute(result) // like a AES subbyte
-> compare `result` with the given result [0x345,0x1,0x215...]
To be honest, I didn’t solve the challenge but after a quick check of the src, I can’t believe that the challenge flow is not hard at all, it’s about the math problem. So I decided to test it all.
Okay, just think, if we know that a 16-word array is a matrix, so do_something
is probably an operation of the matrix, like add, multiply, inverse,…vv.
So I decided to test it, this one is the input matrix.
import numpy as np
def array_to_matrix(a:list):
return np.matrix([a[0:4],a[4:8],a[8:12],a[12:16]])
inp = [0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201]
inp_matrix = array_to_matrix(inp)
buf1 = [0x49, 0x324, 0x7, 0x22, 0x24b, 0x210, 0x123, 0x17, 0x2b, 0x329, 0x21, 0x3, 0x241, 0x344, 0x330, 0x32a]
given_matrix_1 = array_to_matrix(buf1)
##test
result = inp_matrix+given_matrix_1
result = np.remainder(result,0x400)
print(result)
result = inp_matrix*given_matrix_1
result = np.remainder(result,0x400)
print(result)
Here’s result in python:
[[586 293 520 547]
[ 76 17 804 536]
[556 298 546 516]
[ 66 325 305 299]]
[[256 417 635 870]
[256 417 635 870]
[256 417 635 870]
[256 417 635 870]]
result in IDA:
[586, 293, 520, 547, 76, 17, 804, 536, 556, 298, 546, 516, 66, 325, 305, 299]
So, do_something
seems like a matrix add
operation. But when I take another add
, it seems incorrect:
tmp1 = array_to_matrix([0x248, 0x10a, 0x323, 0x230, 0x125, 0x149, 0x206, 0x132, 0x217, 0x10b, 0x312, 0x220, 0x1a, 0x228, 0x46, 0x307])
tmp2 = array_to_matrix([0x308, 0x227, 0x133, 0x1, 0x120, 0x139, 0x205, 0x100, 0x145, 0x221, 0x29, 0x11a, 0x318, 0x13, 0x11a, 0x16])
print(np.remainder(tmp1,tmp2))
Result:
[[584 266 189 0]
[ 5 16 1 50]
[210 267 7 262]
[ 26 1 70 5]]
Result in IDA:
[323, 804, 6, 561, 581, 549, 11, 562, 780, 812, 827, 826, 805, 571, 259, 784]
Even add
operation in a special way, I will explain in rns:
So the correct add
is:
def add(a,b):
ret = 0
ret = ((a>>8) + (b>>8))%4
ret <<= 4
ret |= (((a>>4)&0xf)+((b>>4)&0xf))%5
ret <<= 4
ret |= ((a&0xf) + (b&0xf))%13
return ret
print([add(x,y) for x,y in zip(tmp1,tmp2)]) # [323, 804, 6, 561, 581, 549, 11, 562, 780, 812, 827, 826, 805, 571, 259, 784]
Then we know:
add(inp,matrix1)
...whole bunch of code...
add(inp,matrix2)
I found another matrix that I didn’t know while doing this challenge (it looked like an unknown xmmword, so I formatted and renamed it):
Do you think a whole bunch of code means just a simple operation? Yes, that’s it, the matrix multiplied with the other given matrix I found.
Testing
temp_matrix = inp_matrix+given_matrix_1
temp_matrix = np.remainder(temp_matrix,0x400)
after_shit = np.remainder(temp_matrix*another_matrix,0x400)
print(after_shit)
[[ 833 772 481 37]
[ 162 395 832 1005]
[ 708 856 200 72]
[ 651 559 943 164]]
Result in IDA:
Python>result = [idaapi.get_word(0x7FFDDB69FAA0 + i) for i in range(0,32,2)]
Python>print(result)
[584, 266, 803, 560, 293, 329, 518, 306, 535, 267, 786, 544, 26, 552, 70, 775]
It’s not working, am I missing something?
After a quick check, I found that another_matrix
is used for later, and I see that just a simple xor operation but in the form of substitution.
def rns_xor_with_buf2(buf):
ret = [0]*16
for i in range(16):
num = buf[i] # decode before xor
a,b,c = num&0xf,(num>>4)&0xf,(num>>8)&0xf
d,e,f = buf2[i]&0xf,(buf2[i]>>4)&0xf,(buf2[i]>>8)&0xf
idx = 0x410 * ((5 * a + b)&0xff) + 0x104 * c + 4 * ((5 * d + e)&0xff) + f
ret[i] = given_words[idx]
return ret
I know it’s xor after checking the src code before it xor before it’s pack and decode the matrix. (matrix using a custom base in range 260, instead of 256)
Keep in mind this, the matrix uses rns
, which is like another base.
rns #
Residue number system A residue numeral system (RNS) is a numeral system representing integers by their values modulo several pairwise coprime integers called the moduli. This representation is allowed by the Chinese remainder theorem, which asserts that, if M is the product of the moduli, there is, in an interval of length M, exactly one integer having any given set of modular values. The arithmetic of a residue numeral system is also called multi-modular arithmetic. https://en.wikipedia.org/wiki/Residue_number_system https://personal.utdallas.edu/~ivor/ce6305/m5p.pdf
conv_from_inp[i] = (hex_byte % 13u) | (16 * ((hex_byte % 5u) | (16 * hex_byte) & 0x30));
which means:
input_in_hex = [0x61,0x61,0x61,0x61,...] #'aaaaaaaaa'
matrix = rns(input_in_hex) #[rns(0x61),rns(0x61),rns(0x61),...]
#encode -> [0x12,0x12,......], in base 260
# and the `given_words` also known as `xor_table`
xor_table = [rns(byte_a^ byte_b
not
xor_table = [rns(byte_a) ^ rns(byte_b)]
That’s the reason why I got stuck for a long time…
So, we know the property of xor, then just xor again and we get the original buffer.
to_cmp = after_shit ^ another_matrix
=> after_shit = to_cmp ^ another_matrix
Just get the xor_table from binary, and then I use this function to operate xor.
I’ve tried this one but it didn’t work when I do it again, even if I patch these words into IDA and let it run, still got the wrong answer.
buf2 = [0x1a, 0x12c, 0x24c, 0x11c, 0x12a, 0x137, 0x210, 0x111, 0x242, 0x31b, 0x126, 0x327, 0x329, 0x20b, 0x3, 0x220]
# `another_buffer` a.k.a `buf2`
def rns_xor_with_buf2(buf):
ret = [0]*16
for i in range(16):
num = buf[i]
a,b,c = num&0xf,(num>>4)&0xf,(num>>8)&0xf
d,e,f = buf2[i]&0xf,(buf2[i]>>4)&0xf,(buf2[i]>>8)&0xf
idx = 0x410 * ((5 * a + b)&0xff) + 0x104 * c + 4 * ((5 * d + e)&0xff) + f
ret[i] = given_words[idx]
return ret
So I write another function to xor but just using brute foce, and it work:
def another_xor(buf):
res = []
for i in range(16):
res.append(brute(buf2[i],buf[i]))
return res
def brute(b2,to_cmp):
for hex_byte in range(256):
num = 0
num += hex_byte % 4
num <<= 4
num += hex_byte % 5
num <<= 4
num += hex_byte % 13
a,b,c = num&0xf,(num>>4)&0xf,(num>>8)&0xf
d,e,f = b2&0xf,(b2>>4)&0xf,(b2>>8)&0xf
idx = 0x410 * ((5 * a + b)&0xff) + 0x104 * c + 4 * ((5 * d + e)&0xff) + f
if given_words[idx] == to_cmp:
# print(hex(num),num)
return num
return None
def main():
#recover
recovered = another_xor(to_cmp_w)
print(recovered)
#test
tmp = rns_xor_with_buf2(recovered) # also check in IDA
print(tmp==to_cmp_w)
[771, 282, 38, 267, 39, 299, 819, 576, 770, 843, 529, 32, 837, 323, 315, 257]
True
Now, we have completed step 1, and here’s the flow we got:
tmp = (inp + matrix_1)
sus(tmp) #Whole bunch of Code
tmp = add(tmp,matrix_2)
tmp = xor(tmp, another_matrix)
if tmp == buf:
#we got it
and we have: matrix_1
, matrix_2
, another_matrix
, and buf
Whole bunch of Code? #
The feeling…
RULE 35: It just a matrix operation
Before we reverse, I figure out how it is possible. I mean, How can a simple operation be a bunch of code?
Here’s a simple matrix multiply from programiz
but I’m using built-in type: long double
instead of int
:
#include <iostream>
using namespace std;
int main()
{
long double a[10][10], b[10][10], mult[10][10];
int r1, c1, r2, c2, i, j, k;
cout << "Enter rows and columns for first matrix: ";
cin >> r1 >> c1;
cout << "Enter rows and columns for second matrix: ";
cin >> r2 >> c2;
while (c1!=r2)
{
cout << "Error! column of first matrix not equal to row of second.";
cout << "Enter rows and columns for first matrix: ";
cin >> r1 >> c1;
cout << "Enter rows and columns for second matrix: ";
cin >> r2 >> c2;
}
// Storing elements of first matrix.
cout << endl << "Enter elements of matrix 1:" << endl;
for(i = 0; i < r1; ++i)
for(j = 0; j < c1; ++j)
{
cout << "Enter element a" << i + 1 << j + 1 << " : ";
cin >> a[i][j];
}
// Storing elements of second matrix.
cout << endl << "Enter elements of matrix 2:" << endl;
for(i = 0; i < r2; ++i)
for(j = 0; j < c2; ++j)
{
cout << "Enter element b" << i + 1 << j + 1 << " : ";
cin >> b[i][j];
}
// Initializing elements of matrix mult to 0.
for(i = 0; i < r1; ++i)
for(j = 0; j < c2; ++j)
{
mult[i][j]=0;
}
// Multiplying matrix a and b and storing in array mult.
for(i = 0; i < r1; ++i)
for(j = 0; j < c2; ++j)
for(k = 0; k < c1; ++k)
{
mult[i][j] += a[i][k] * b[k][j];
}
// Displaying the multiplication of two matrix.
cout << endl << "Output Matrix: " << endl;
for(i = 0; i < r1; ++i)
for(j = 0; j < c2; ++j)
{
cout << " " << mult[i][j];
if(j == c2-1)
cout << endl;
}
return 0;
}
So, I compiled with Clang++ and see the different between without optimize and optimize flag enabled:
We easily see that optimization makes the code harder. Even if it is just a built-in type, the challenge is using a custom one, which is almost impossible.
It’s just a point of view from a reverser, here’s more about clang optimize:
https://www.incredibuild.com/blog/compiling-with-clang-optimization-flags https://news.ycombinator.com/item?id=28207207 https://www.reddit.com/r/C_Programming/comments/conavx/clangs_optimizer_is_ridiculously_smart_like/
In the challenge, we see it as SIMD:
https://ftp.cvut.cz/kernel/people/geoff/cell/ps3-linux-docs/CellProgrammingTutorial/BasicsOfSIMDProgramming.html https://www.codeproject.com/Articles/5298048/Using-SIMD-to-Optimize-x86-Assembly-Code-in-Array
We know that the code just doing some simple operations but is optimized.
After a day I figure out what is it. I finally found it just matrix multiply (as I guess) but it was difficult to find the multiplicand (matrix A)
So, to save time I decided to get it from src code (will be updated later).
Here is the complete flow:
A = Matrix_rns([208, 120, 107, 19, 138, 163, 11, 174, 217, 67, 60, 143, 13, 232, 181, 2],conv=True) # from source
B = Matrix_rns([0x49, 0x324, 0x7, 0x22, 0x24b, 0x210, 0x123, 0x17, 0x2b, 0x329, 0x21, 0x3, 0x241, 0x344, 0x330, 0x32a],ls=True) #from IDA
C = Matrix_rns([0x308, 0x227, 0x133, 0x1, 0x120, 0x139, 0x205, 0x100, 0x145, 0x221, 0x29, 0x11a, 0x318, 0x13, 0x11a, 0x16],ls=True)#from IDA
D = Matrix_rns([0x1a, 0x12c, 0x24c, 0x11c, 0x12a, 0x137, 0x210, 0x111, 0x242, 0x31b, 0x126, 0x327, 0x329, 0x20b, 0x3, 0x220],ls=True)#from IDA
#### THE FLOW
inp_matrix = Matrix_rns([0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201, 0x201],ls=True) # test input with 'aaaaaaaaaaaaaaaaaaaa'
temp = inp_matrix.add(B)
# print(temp.as_list())
temp = temp.mul(A)
# print(temp.as_list())
temp = temp.add(C)
# print(temp.as_list())
result = temp.xor(D)
And here’s the way that I operate xor to get the previous buffer before compare (expected_to_cmp):
from given_words import given_words,L
buf2 = [0x1a, 0x12c, 0x24c, 0x11c, 0x12a, 0x137, 0x210, 0x111, 0x242, 0x31b, 0x126, 0x327, 0x329, 0x20b, 0x3, 0x220]
buf1 = [0 for _ in range(16)]
aftershit = [0x143, 0x324, 0x6, 0x231, 0x245, 0x225, 0xb, 0x232, 0x30c, 0x32c, 0x33b, 0x33a, 0x325, 0x23b, 0x103, 0x310]
to_cmp_w = [0x346,1,0x215,0x4C,0x144,0x44,0x10C,0x301,0x125,0x30,0x309,0x31C,0x42,0x328,0x103,0x332]
assert len(to_cmp_w)==16
def get_index(value):
#rebuild from src
m = [13,5,4]
res = [0]*3
index = 0
for i in range(3):
res[2 - i] = (value >> (4 * i)) & 0xF
for i in range(3):
index += res[2-i]
if i+1 < 3:
index *= m[1-i]
# index &= 0xff
return index
def rns_xor_with_buf2(buf):
ret = [0]*16
for i in range(16):
num = buf[i]
a,b,c = num&0xf,(num>>4)&0xf,(num>>8)&0xf
d,e,f = buf2[i]&0xf,(buf2[i]>>4)&0xf,(buf2[i]>>8)&0xf
idx = 0x410 * ((5 * a + b)&0xff) + 0x104 * c + 4 * ((5 * d + e)&0xff) + f
ret[i] = given_words[idx]
return ret
def another_xor(buf):
res = []
for i in range(16):
res.append(brute(buf2[i],buf[i]))
return res
def brute(b2,to_cmp):
nums = []
for hex_byte in range(256):
num = 0
num += hex_byte % 4
num <<= 4
num += hex_byte % 5
num <<= 4
num += hex_byte % 13
a,b,c = num&0xf,(num>>4)&0xf,(num>>8)&0xf
d,e,f = b2&0xf,(b2>>4)&0xf,(b2>>8)&0xf
idx = 0x410 * ((5 * a + b)&0xff) + 0x104 * c + 4 * ((5 * d + e)&0xff) + f
if given_words[idx] == to_cmp:
# print(hex(num),num)
nums.append(num)
print(nums)
return nums[-1]
def main():
brute(buf2[4],324)
#recover
recovered = another_xor(to_cmp_w)
print(recovered)
#test
tmp = rns_xor_with_buf2(to_cmp_w)
print(tmp)
if __name__=='__main__':
main()
I tried both ways: brute and get xor table, which got me 2 different matrices.
[771, 282, 38, 267, 39, 299, 819, 576, 770, 843, 529, 32, 837, 323, 315, 257]
[771, 282, 38, 267, 76, 299, 819, 576, 770, 843, 529, 32, 837, 323, 315, 257]
I’ve reimplement the rns matrix in python, just need the matrix invert operation.
Here invert from @d4rk9n1ght in sagemath.
def convert_to_1mod(arr):
result = []
for i in arr:
a,b,c = i&0xf, (i>>4)&0xf, (i>>8)&0xf
res = crt([c,b,a], [4,5,13])
result.append(res)
return result
def convert_to_3mod(arr):
result = []
cp = [int(i) for i in arr]
for i in cp:
a,b,c = i %4, i%5, i%13
hehe = a
hehe <<= 4
hehe |= b
hehe <<= 4
hehe |= c
result.append(hehe)
return result
from sage.all import *
gl = GL(3, Zmod(4*5*13))
arr = [48, 3, 803, 838 ,568, 823, 795, 581 ,297, 802, 8, 816, 304, 43, 284, 546]
arr = convert_to_1mod(arr)
A = [arr[:4], arr[4:8], arr[8:12], arr[12:]]
A = Matrix(Zmod(4 * 5 * 13), A)
print(A.is_invertible())
inv_A = A.inverse()
print(A * inv_A)
# print(inv_A)
inv_A_arr = list(inv_A[0]) + list(inv_A[1]) + list(inv_A[2]) + list(inv_A[3])
# print(inv_A_arr)
final = convert_to_3mod(inv_A_arr)
print(final)
output
[811, 276, 795, 9, 295, 561, 306, 825, 295, 309, 842, 844, 580, 842, 279, 313]
Here’s script that recover the input:
A_inv = Matrix_rns([811, 276, 795, 9, 295, 561, 306, 825, 295, 309, 842, 844, 580, 842, 279, 313],ls = True)
print(A.mul(A_inv))
# aftershit = Matrix_rns([0x248, 0x10a, 0x323, 0x230, 0x125, 0x149, 0x206, 0x132, 0x217, 0x10b, 0x312, 0x220, 0x1a, 0x228, 0x46, 0x307],ls=True)
x = aftershit.mul(A_inv)
x = x.sub(B)
x = [x.decode(i) for i in x.as_list()]
print(bytes(x).hex())
output
[[273, 0, 0, 0], [0, 273, 0, 0], [0, 0, 273, 0], [0, 0, 0, 273]]
fcedd5ab42f188b49760fca0d51e6fb1
So, I finally found the correct input but still the wrong key for decryption. It is probably the problem from 2 matrices that I found above.
Also, I tried to brute force the byte in the matrix and got the result:
A_inv = Inverse(Matrix_rns([208, 120, 107, 19, 138, 163, 11, 174, 217, 67, 60, 143, 13, 232, 181, 2],conv=True).as_list())
A_inv = Matrix_rns(A_inv,ls = True)
print("A_inv: ",A_inv.matrix) ### A_inv is correct
enc_flag = b'\xe8M0a\x19\t\x8e\xac\x99\x8a\x8e:\x0f\xf1\xf6%w9t-h#\xc7{~\xa4\xcb\xbe\xee\xc7E\xff\x1c\x8f-m\xcb\xf5b'
for hex_byte in range(256):
num = 0
num += hex_byte % 4
num <<= 4
num += hex_byte % 5
num <<= 4
num += hex_byte % 13
aftershit.matrix[3][1] = num ## the wrong number is at matrix[3][1], I've test by each number of matrix.
x = aftershit.mul(A_inv)
x = x.sub(B)
x = [x.decode(i) for i in x.as_list()]
try:
#print(bytes(x).hex())
arc = ARC4(bytes(x))
flag = arc.decrypt(enc_flag)
if b'CTF' in flag or b'ctf' in flag:
print(flag)
except:
#print(x)
pass
output
A: [[48, 3, 803, 838], [568, 823, 795, 581], [297, 802, 8, 816], [304, 43, 284, 546]]
B: [[73, 804, 7, 34], [587, 528, 291, 23], [43, 809, 33, 3], [577, 836, 816, 810]]
C: [[776, 551, 307, 1], [288, 313, 517, 256], [325, 545, 41, 282], [792, 19, 282, 22]]
D: [[26, 300, 588, 284], [298, 311, 528, 273], [578, 795, 294, 807], [809, 523, 3, 544]]
A_inv: [[811, 276, 795, 9], [295, 561, 306, 825], [295, 309, 842, 844], [580, 842, 279, 313]]
b'Flag: ctf{I_pr0mize_its_jUsT_mAtriCeS}\x00'
Finally got flag: ctf{I_pr0mize_its_jUsT_mAtriCeS}
In my point, it’s the easiest challenge, and not interested at all, but at least, I know that I need to learn more, more, and more math. The new meta of Reverse. Also, I will update the remaining challenge later, pls enjoy.