Let's figure out how to back up these gigantic 128-bit registers...

Maintaining the integrity and state of the CPU and stack is always vital whenever we’re discussing Omnified processes, as these usually contain code concerned with the seamless overhaul of one gameplay system or another. Because this code gets injected into a running process’s memory, we need to make sure that anything we’re using for our own calculations is restored to how it was when we’re done doing our thing (save for the desired outcome or affected part of the injection, of course).

This seemingly neurotic need to preserve CPU state is, without a doubt, at its most pressing when we’re injecting foreign code into a process (as is my wont when playing games on my streams), since it is critical that everything is identical to how it was prior to our code executing (lest a crash doth follow). But, code injection aside, when writing assembly you’ll often need to temporarily store data elsewhere in order to make use of whatever register said data was occupying.

Most of the time, this is a simple affair. Well, it is when we’re talking about general-purpose registers. Backing up the “globs” of data found in 128-bit SSE registers (or their wider AVX counterparts), however? Not as clear-cut. Let me show you how I do it.

But first, an overview!

Push It! Pop It!

As you may very well know, it’s easy to temporarily save the data held in a general-purpose register and then later restore it. We simply use the push instruction to push the data onto the stack, and then later use the pop instruction to pop said data off the stack and back into the register.

Pushing and Popping Some Stuff

// We need to use rax, rbx, rcx. Back them up!
push rax
push rbx
push rcx
// rax, rbx, and rcx are essentially backed up.
mov rax,[someAddress]
mov rbx,[rax+20]
mov rcx,[rbx+10]
movss xmm0,[rcx+4]
// Do whatever we need with these registers...
pop rcx
pop rbx
pop rax
// rax, rbx, and rcx should now have their original values.

Pretty simple. It happens all the time.

Very often, however, we might want to back up some data from a register that isn’t so general-purpose, such as an SSE register. Perhaps we want to back up the xmm0 used in the previous example before writing to it, something you’d always want to do if injecting code that makes use of SSE registers into a running process (unless you love those crashes, baby).

Illegal Pushing and Popping!

// Same as before...
push xmm0 // <-- This will fail!
movss xmm0,[rcx+4]
// Do whatever we need with xmm0...
pop xmm0 // <-- This too! Fail!

That isn’t going to assemble, buddy! You lose!

I would be lying if I said I never tried the above before. Hey, I bet you have too! The push instruction is just so convenient! Why can’t it work for everything? Sadly, attempting to back up a beefy SSE register in this fashion can only fail horribly.

Looking at the documentation for the instruction, it’s clear, more or less, that we’re limited in what we can “push”: general-purpose registers, memory operands, and an even more limited range of immediate values (in 64-bit mode, nothing wider than 32 bits, which then gets sign-extended to 64).
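
To make that a bit more concrete, here’s a small sketch of the three kinds of operands push will happily take in 64-bit mode. The particular operands are just my picks for illustration, and the final add is only there to throw the three values away again (note that, unlike pop, it does change the flags).

```asm
// Three operands that push will accept.
push rax              // A general-purpose register.
push qword ptr [rsp]  // A memory operand (here, a copy of the rax we just pushed).
push 0x10             // An immediate, sign-extended to 64 bits.
// Three 8-byte values are now sitting on the stack. Toss them all at once.
add rsp,0x18          // 3 * 8 = 0x18 bytes.
```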

So How Can I Push an SSE Register?

If we understand the mechanics behind how the push and pop instructions actually work, then it becomes rather clear as to how we can back up the data in SSE registers such as xmm0, xmm1, or what have you.

When we push a register, we’re taking what’s in that register and placing it on the stack: the region of memory that essentially acts as temporary storage, the top of which is pointed to by the rsp register.

How does it place the register’s value on top of the stack? By subtracting the size of the value in bytes (eight, for a 64-bit register) from the stack pointer’s address, since the stack grows toward lower addresses. This causes rsp to point to the next piece of memory that’s not currently occupied by data that may still be needed (and if it is needed, then someone screwed up bad).

Business is then concluded by writing the value to where rsp now points. When we pop the value back into the register, we’re essentially doing what’s described above, but in reverse: the value is read from where rsp points, and then the same number of bytes is added back to rsp.
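
Spelled out by hand, and assuming 64-bit mode, a push and pop of rax would look something like the sketch below. Treat it as an approximation rather than a drop-in replacement: sub and add modify the CPU flags, whereas the real push and pop leave them alone.

```asm
// A rough manual equivalent of "push rax".
sub rsp,8       // Make room. The stack grows downward, so we subtract.
mov [rsp],rax   // Write rax into the eight bytes we just reserved.
// Do whatever we need with rax...
// A rough manual equivalent of "pop rax".
mov rax,[rsp]   // Read the saved value back into rax.
add rsp,8       // Give back the eight bytes we reserved.
```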

So, if we want to back up one of our SSE registers, we need only do exactly what the push instruction itself does, bearing in mind the difference in the amount of memory required to store an SSE register’s value (sixteen bytes vs a general-purpose register’s eight bytes).

Backing Up an SSE Register

// We need 16 bytes of space to write to on the stack.
sub rsp,0x10 // 0x10 = 16 of course.
// And then we just dump our SSE register onto the stack.
movdqu [rsp],xmm0
// Do what needs to be done with xmm0...
movdqu xmm0,[rsp]
add rsp,0x10
// xmm0 now has its original value, and the stack pointer is pointing to
// where it was. All is well.

You see me do this all the time in my Omnified hacking framework.
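
The same pattern stretches to cover as many SSE registers as you need; just reserve sixteen bytes per register. Here’s a sketch that backs up two of them (xmm1 is my own addition for illustration). I’m sticking with movdqu rather than movdqa because movdqa faults if the address isn’t 16-byte aligned, and when injecting code we generally can’t assume that rsp is.

```asm
// Reserve 32 bytes: 16 for each register we want to keep safe.
sub rsp,0x20
movdqu [rsp],xmm0
movdqu [rsp+0x10],xmm1
// Do what needs to be done with xmm0 and xmm1...
movdqu xmm1,[rsp+0x10]
movdqu xmm0,[rsp]
add rsp,0x20
// Both registers have their original values, and rsp is back where it started.
```

One thing to keep in mind: sub and add change the CPU flags. If the code that runs after your injection depends on the flags being as they were, lea rsp,[rsp-0x20] and lea rsp,[rsp+0x20] should move the stack pointer just the same without touching them.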

Now, with this knowledge, go forth and conquer the world, with the surety attainable only when one is confident in the integrity of those many SSE registers that helped them along in their journey.