Saturday, November 8, 2008

Hooking library calls on Mac using DYLD_INSERT_LIBRARIES

Mac offers a way to override functions in a shared library with the DYLD_INSERT_LIBRARIES environment variable (similar to LD_PRELOAD on Linux). When you write a twin of a function that is defined in an existing shared library, put it in a shared library of your own, and list that library in DYLD_INSERT_LIBRARIES, your function is used instead of the original one. This is my simple test. Here I've replaced f() in mysharedlib.dylib with f() in openhook.dylib.


$ cat mysharedlib.h
void f();
$ cat mysharedlib.c
#include <stdio.h>
#include "mysharedlib.h"

void f()
{
    printf("hello");
}
$ cat main.c
#include <stdio.h>
#include "mysharedlib.h"

int main()
{
    f();
    return 0;
}
$ cat openhook.c
#include <stdio.h>
#include <dlfcn.h>
#include <unistd.h>
#include "mysharedlib.h"

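typedef void (*fType)();
static fType real_f = NULL;

void f()
{
    if ( ! real_f)
    {
        /* Look up the original f() the first time the hook is called. */
        void* handle = dlopen("mysharedlib.dylib", RTLD_NOW);
        if (handle) real_f = (fType)dlsym(handle, "f");
        if ( ! real_f)
        {
            /* Don't call through a NULL pointer if the lookup failed. */
            printf("NG");
            return;
        }
    }
    printf("--------zzz--------");
    real_f();
}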
$ cat bat
#!/bin/bash
gcc -flat_namespace -dynamiclib -o openhook.dylib openhook.c
gcc -dynamiclib -o mysharedlib.dylib mysharedlib.c
gcc mysharedlib.dylib main.c
export DYLD_FORCE_FLAT_NAMESPACE=
export DYLD_INSERT_LIBRARIES=openhook.dylib
./a.out
$ ./bat
--------zzz--------hello

You also need to define DYLD_FORCE_FLAT_NAMESPACE (it doesn't matter what value it has). In general, forcing a flat namespace can make the command (in this case a.out) unstable, because it increases the chance of symbol name conflicts. In my opinion that is not a big problem if you use it only for debugging.
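Both dlopen() and dlsym() can fail, and dlerror() reports why. Below is a minimal sketch of a checked lookup, not part of the test above. To stay self-contained it resolves the C library's abs() from the running process with dlopen(NULL, ...) instead of f() from mysharedlib.dylib, and the names lookup_symbol and hooked_abs are made up for illustration.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

typedef int (*abs_type)(int);

/* Hypothetical helper: look up `name` in the images already loaded into
   this process. A real hook would pass the library path to dlopen()
   instead of NULL. Prints the dlerror() text and returns NULL on failure. */
void *lookup_symbol(const char *name)
{
    void *handle;
    void *sym;
    const char *err;

    assert(name != NULL);
    handle = dlopen(NULL, RTLD_NOW);
    if (!handle) {
        err = dlerror();
        fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return NULL;
    }
    sym = dlsym(handle, name);
    if (!sym) {
        err = dlerror();
        fprintf(stderr, "dlsym(%s) failed: %s\n", name, err ? err : "unknown error");
    }
    return sym;
}

/* Wrapper in the style of the hook: resolve once, then forward.
   Returns -1 if abs() could not be resolved. */
int hooked_abs(int value)
{
    static abs_type real_abs = NULL;

    if (!real_abs) {
        real_abs = (abs_type)lookup_symbol("abs");
        if (!real_abs)
            return -1;
    }
    return real_abs(value);
}
```

Calling hooked_abs(-5) resolves abs() on the first call and reuses the cached pointer afterwards. On some Linux systems you may need to link with -ldl.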

You can use the same technique to override a method in a C++ class. Say there's a method named "fff" in a class AAA, like

class AAA
{
public:
    int m;
    AAA(){m = 1234;}
    void fff(int a);
};

To override it, you first need to know the mangled symbol name of the method.

$ nm somelibrary.dylib | grep "T "
00000ed6 T __ZN3AAA3fffEi

Then what you need to define is _ZN3AAA3fffEi. Don't forget to remove the first '_', because the compiler on Mac prepends one to every C symbol name. If you see multiple symbols in the shared library and are not sure which one to override, you can check by demangling a symbol like

$ c++filt __ZN3AAA3fffEi
AAA::fff(int)
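To see how such a name is put together, here is a rough sketch in C. It assumes the Itanium C++ ABI mangling that gcc uses, where a simple member function is encoded as _ZN, then each name component prefixed by its length, then E, then the parameter codes (i for int). The helpers strip_macho_underscore and decode_nested_name are hypothetical, and they handle only this simple form (no templates, namespaces with qualifiers, or const methods).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns the name to define in C or to pass to dlsym(): the symbol as
   printed by nm, without the leading '_' that Mac adds to C symbols. */
const char *strip_macho_underscore(const char *nm_symbol)
{
    assert(nm_symbol != NULL);
    return (nm_symbol[0] == '_') ? nm_symbol + 1 : nm_symbol;
}

/* Decodes "_ZN<len><id>...E<params>" into "id::id" in `out`.
   Returns 0 on success, -1 if the name is not in this simple form
   or `out` is too small. The parameter codes after 'E' are ignored. */
int decode_nested_name(const char *mangled, char *out, size_t out_size)
{
    const char *p = mangled;
    size_t used = 0;

    assert(mangled != NULL && out != NULL);
    if (out_size == 0 || strncmp(p, "_ZN", 3) != 0)
        return -1;
    p += 3;
    out[0] = '\0';

    while (*p != 'E') {
        size_t len = 0;
        size_t sep = used ? 2 : 0;      /* room for "::" between components */

        if (!isdigit((unsigned char)*p))
            return -1;                  /* qualifiers, templates: not handled */
        while (isdigit((unsigned char)*p))
            len = len * 10 + (size_t)(*p++ - '0');
        if (strlen(p) < len || used + sep + len + 1 > out_size)
            return -1;
        if (sep) {
            memcpy(out + used, "::", 2);
            used += 2;
        }
        memcpy(out + used, p, len);
        used += len;
        out[used] = '\0';
        p += len;
    }
    return used ? 0 : -1;
}
```

For real work c++filt is the better tool; the sketch only shows why the symbol above reads as 3 characters "AAA", 3 characters "fff", and an int parameter.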

Now you can override it like this.

$ cat mysharedlib.h
class AAA
{
public:
    int m;
    AAA(){m = 1234;}
    void fff(int a);
};
$ cat mysharedlib.cpp
#include <stdio.h>
#include "mysharedlib.h"

void AAA::fff(int a)
{
    printf("--ORIGINAL:%d--", a);
}
$ cat main.cpp
#include <stdio.h>
#include "mysharedlib.h"

int main()
{
    AAA a;
    printf("--------main1--------");
    a.fff(50);
    printf("--------main2--------");
    return 0;
}
$ cat openhook.cpp
#include <stdio.h>
#include <dlfcn.h>
#include <unistd.h>
#include "mysharedlib.h"

typedef void (*AAAfffType)(AAA*, int);
static AAAfffType real_AAAfff = NULL;

extern "C"
{

void _ZN3AAA3fffEi(AAA* a, int b)
{
    printf("--------AAA::fff--------");
    printf("%d, %d", b, a->m);
    if ( ! real_AAAfff)
    {
        /* Look up the original AAA::fff(int) by its mangled name. */
        void* handle = dlopen("mysharedlib.dylib", RTLD_NOW);
        if (handle) real_AAAfff = (AAAfffType)dlsym(handle, "_ZN3AAA3fffEi");
        if ( ! real_AAAfff) return;  /* lookup failed, nothing to forward to */
        printf("OK");
    }
    real_AAAfff(a, b);
}

}
$ cat bat
#!/bin/bash

gcc -flat_namespace -dynamiclib -lstdc++ -o openhook.dylib openhook.cpp
gcc -dynamiclib -lstdc++ -o mysharedlib.dylib mysharedlib.cpp
gcc -lstdc++ mysharedlib.dylib main.cpp
export DYLD_FORCE_FLAT_NAMESPACE=
export DYLD_INSERT_LIBRARIES=openhook.dylib
./a.out
$ ./bat
--------main1----------------AAA::fff--------50, 1234OK--ORIGINAL:50----------main2--------

Note that the first argument of the function call is the this pointer, just like Python passes self to a bound method. C++ just does it implicitly. I believe this is specific to the compiler implementation (in this case gcc), and there may be cases where it is not true. Please use this technique at your own risk.
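Written out by hand in plain C, the same idea looks roughly like the sketch below. It only illustrates the explicit object argument and is not what gcc literally generates. The names AAA_init and AAA_fff are made up, and unlike the real fff() the function returns a + m so that the result can be checked.

```c
#include <assert.h>
#include <stdio.h>

/* C stand-in for class AAA: only the data member lives in the struct. */
struct AAA {
    int m;
};

/* Stand-in for the constructor AAA(){m = 1234;}. */
void AAA_init(struct AAA *self)
{
    assert(self != NULL);
    self->m = 1234;
}

/* Stand-in for void AAA::fff(int a): the object arrives as an explicit
   first argument, which is what C++ passes implicitly as `this`.
   Returns a + m (an addition for this sketch only). */
int AAA_fff(struct AAA *self, int a)
{
    assert(self != NULL);
    printf("%d, %d\n", a, self->m);
    return a + self->m;
}
```

Here `struct AAA a; AAA_init(&a); AAA_fff(&a, 50);` plays the role of `AAA a; a.fff(50);` in C++.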

Here I assumed you have access to the header file that declares the function, so that you know (the size of) the argument data being passed to the function. If the sizes of the arguments are all known (int, float, pointer, ...), you can wrap the function even without the header. If not, you'll need to write assembler to modify the stack to pass the arguments to the original function.
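As a rough illustration of the case without a header, the sketch below treats AAA as an incomplete type on the hook side: a pointer and an int are all the wrapper has to pass along. Everything here is hypothetical. set_real_fff stands in for the dlsym() lookup, original_fff stands in for the function in the original library, and the last_a field exists only so the forwarding can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hook side: no header available, so AAA is only an incomplete type.
   That is enough, because the hook never looks inside the object. */
struct AAA;
typedef void (*fff_type)(struct AAA *, int);

static fff_type real_fff = NULL;   /* a real hook would get this from dlsym() */
static int hook_calls = 0;

/* Stand-in for the dlsym() lookup. */
void set_real_fff(fff_type fn)
{
    real_fff = fn;
}

/* The wrapper: count the call, then forward both arguments unchanged. */
void hooked_fff(struct AAA *self, int a)
{
    hook_calls++;
    if (real_fff)
        real_fff(self, a);
}

int hook_call_count(void)
{
    return hook_calls;
}

/* Library side (stand-in for the original shared library), which does
   know the layout. last_a is an extra field added for this sketch. */
struct AAA {
    int m;
    int last_a;
};

void original_fff(struct AAA *self, int a)
{
    assert(self != NULL);
    self->last_a = a;
}
```

The hook side compiles before the layout of struct AAA is known, which is the point: only the sizes of the arguments matter to the wrapper.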

3 comments:

Karan Vasudeva said...

Awesome, thank you. This has saved me a quite a bit of messing about with mach_star for my present purposes.

hohehohe2 [at] gmail.com said...

You are welcome :)

Anonymous said...

Thanks man