Emulation is awesome!

"Beep, start broadcasting on channel 11."

"STARTING BROADCAST ON CHANNEL 11"

"For the hackers still out there in 20xx, this is a record about the dozen hours I've spent experimenting with the qiling framework which is awesome."

It's a unicorn dragon, get it?

It's modular and straightforward to extend.

Unfortunately, whenever I've tried to use it for something I'm interested in, I get the dreaded:

syscall not implemented error

Now, I enjoy coding as much as any other hacker, but in the robot apocalypse I have limited time to spend implementing syscalls. After all, my security cameras can't watch themselves.

So I've committed the grave sin of spending time to automate something I probably shouldn't have.

I wrote some code to figure out how much time I need to spend writing code.

To do this, I want to figure out which syscalls a function uses. As a bonus, by counting how many times each syscall appears, I can see which ones are used most often and prioritize emulating those first.

I forked the Syscaller Binary Ninja plugin and added a few modules to do the following:

  1. Identify library calls a particular function makes.
  2. Count the number of syscalls a function (and its subroutines) uses.
  3. Put the two together, then follow library calls and enumerate syscalls used.

Identifying library calls

To identify library calls, I unnecessarily implemented a breadth-first search that follows the callees of a function and keeps track of calls to any function Binary Ninja tags as imported.

The relevant code is in the Libcaller module and looks like this:

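from collections import deque

import binaryninja as bn  # assumed alias, matching the bn.log_info call below

def traverse_breadth_first(self, function):
    # FIFO queue: popleft() gives breadth-first order
    # (list.pop() takes the newest entry, which would be depth-first).
    queue = deque([function])
    self.visited.append(function)

    while queue:
        current_func = queue.popleft()

        # Record the function if Binary Ninja tagged its symbol as imported.
        matched_library_call = current_func.symbol in self.imported_functions
        if matched_library_call:
            bn.log_info("[*] Traversing {} at 0x{:x}".format(current_func.name, current_func.start))
            self.libcalls.append(current_func.name)

        # Enqueue callees not seen yet so each function is visited once.
        for callee in current_func.callees:
            if callee not in self.visited:
                queue.append(callee)
                self.visited.append(callee)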
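
The same traversal can be tried outside Binary Ninja. This is a minimal sketch, assuming the call graph is a plain dictionary of function names; `find_library_calls`, `call_graph` and `imported` are hypothetical names for illustration, not part of the plugin.

```python
from collections import deque


def find_library_calls(call_graph, imported, root):
    """Walk call_graph breadth-first from root and return the imported names reached."""
    queue = deque([root])
    visited = {root}
    libcalls = []

    while queue:
        current = queue.popleft()

        # A name in the imported set stands in for Binary Ninja's import tag.
        if current in imported:
            libcalls.append(current)

        for callee in call_graph.get(current, []):
            if callee not in visited:
                visited.add(callee)
                queue.append(callee)

    return libcalls


# Hypothetical call graph: caller -> list of callees.
call_graph = {"foo": ["alloc", "puts"], "alloc": ["malloc"], "bar": ["free"]}
imported = {"malloc", "puts", "free"}

print(find_library_calls(call_graph, imported, "foo"))
```

Only functions reachable from the starting point are reported, so `free` (called only by `bar`) does not show up when starting from `foo`.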

Counting the number of syscalls

The Syscaller module uses Binary Ninja's low-level intermediate language (LLIL) to identify syscalls and add comments with the syscall name at the code location.

To count the number of syscalls, I simply keep a dictionary keyed by the comment text and increment the count whenever the same syscall comment is found.

This tracking is performed on a per-function basis, and the per-function counts are summed at the end.

This gives us the total syscall count for a given function and all of the functions and library functions that it calls.
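
A minimal sketch of that summary step, assuming the per-function results are held as `collections.Counter` objects. The function names and counts below are made up for illustration, and `summarize` is a hypothetical helper rather than the plugin's code.

```python
from collections import Counter


def summarize(per_function_counts):
    """Add up the per-function syscall counts into one overall Counter."""
    total = Counter()
    for counts in per_function_counts.values():
        total.update(counts)  # Counter.update adds counts instead of replacing them
    return total


# Hypothetical per-function results (syscall name -> times seen).
per_function = {
    "foo": Counter(),
    "alloc": Counter(),
    "malloc": Counter({"brk": 2, "mmap": 1}),
    "puts": Counter({"write": 1}),
    "fwrite": Counter({"write": 3}),
}

total = summarize(per_function)

# Most frequently used syscalls first: the ones worth emulating first.
for name, count in total.most_common():
    print(f"{name}: {count}")
```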

For example, if I had the C code:

#include <stdlib.h>

char *alloc(void) {
    char *heap = malloc(30);
    return heap;
}

void foo(void) {
    char *heap = alloc();
}

All the syscalls used when calling foo => alloc => malloc would be counted.

Checking for an existing implementation in the framework

To check which syscalls are already implemented in the framework, we can try to import the Python function that implements each one.

However, we don't know which syscalls we need to import until runtime.

I did some digging in the framework code to see how it resolves syscalls and found a utility function, ql_get_module_function, which lets us import Qiling functions at runtime.

Using this utility function, I can pass the module path where the POSIX syscalls are implemented and get the syscall function implementation.

Qiling's syscall implementations seem to follow the naming convention ql_syscall_<syscall_name>, for example ql_syscall_read.

I wrote a small snippet of Python code to check whether the syscall exists in the framework:

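# Import paths are my best understanding of the framework layout; adjust if your version differs.
from qiling import exception
from qiling.utils import ql_get_module_function

module = "qiling.os.posix.syscall"

# Each entry is expected to be the full function name, e.g. "ql_syscall_read".
for syscall in syscalls:
    try:
        ql_get_module_function(module, syscall)
    except exception.QlErrorModuleFunctionNotFound:
        print(f"{syscall} needs to be implemented!")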
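
The lookup-by-name idea can also be tried without the framework installed. The sketch below is a stand-in built on the standard library's `importlib` and `getattr`, not Qiling's own code; `missing_functions` and its `prefix` parameter are hypothetical, and the `os` module is used only so the example runs anywhere.

```python
import importlib


def missing_functions(module_path, names, prefix=""):
    """Return the names whose prefixed function is absent from the given module."""
    module = importlib.import_module(module_path)
    return [
        name
        for name in names
        if not callable(getattr(module, prefix + name, None))
    ]


# Stand-in demo against the standard library instead of the emulator framework.
print(missing_functions("os", ["getpid", "no_such_call"]))
```

With the framework available, the same helper could presumably be pointed at the POSIX syscall module with `prefix="ql_syscall_"`, under the naming convention described above.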

Limitations

This approach has a few limitations:

  1. I don't handle library calls that call a function from another library. For example, if libcrypto.so uses a function from libc.so, I don't follow it.
  2. I don't follow wrapper functions that receive the syscall number as an argument and make the syscall on the caller's behalf.
  3. I only check that an implementation function exists for a syscall. That implementation could be incomplete. For example, the 32-bit socketcall syscall takes an integer code as an argument that determines which action the syscall performs. Not all of these actions may be implemented in the emulator framework.

Given the above limitations and my overall laziness, the information output by the syscall enumerator plugin is incomplete. Not to mention, the usability of the UI leaves much to be desired.

But, maybe now I'll be motivated enough to implement some syscalls for something I'm working on. So long as there aren't too many.

Or maybe I'll just keep watching the CCTV.

"Beep, stop broadcasting"

"ENDING BROADCAST"