
As the title says, I want to trace all function calls in my application (from inside the application itself).

I tried using _penter, but I get either a "recursion limit reached" error or, when I try to prevent the recursion, an access violation.

Is there any way to achieve this?

Update

What I tried:

extern "C"
{
    void __declspec(naked) _cdecl _penter()
    {
        _asm {
            push    eax
            push    ecx
            push    edx
            mov     ecx, [esp + 0Ch]
            push    ecx
            mov     ecx, offset Context::Instance
            call    Context::addFrame
            pop     edx
            pop     ecx
            pop     eax
            ret
        }
    }
}

class Context
{
 public:
    __forceinline void addFrame(const void* addr) throw() {}

    static thread_local Context Instance;
};

Sadly, this still causes a stack overflow due to recursion.

  • 3
    Trace them in what sense? Log them? Step through them? Something else? Commented Jan 27, 2018 at 16:45
  • 1
    In each function, put a line that writes some information to a common log file. Commented Jan 27, 2018 at 16:47
  • 1
    What's the real problem you're trying to solve -- as stated it sounds a bit XY. Also, the fact that you're hitting a recursion limit when using _penter sounds a bit strange. If you do want to inject tracing code into your source you could always have a look at clang's tooling library. Commented Jan 27, 2018 at 17:32
  • 1
    @G.M. Call any non-inlined function from _penter, compiler will insert another _penter there, and you’ll get endless recursion. Commented Jan 28, 2018 at 9:03
  • 2
    "but I get either a recursion limit reached error": most likely the compiler also inserts a call to _penter inside the implementation of Context::addFrame, which then calls Context::addFrame again, recursively. You need to implement Context::addFrame in a separate C++ file compiled without the /Gh option. Commented Jan 28, 2018 at 15:08

3 Answers

4

Your approach is correct: the /Gh and /GH compiler switches together with the _penter and _pexit functions are the way to go.

I think there are errors in your implementation of these functions. This is very low-level stuff: for 32-bit builds you have to use __declspec(naked), and for 64-bit builds you have to use assembler. Both are quite tricky to implement correctly.

Take a look at this repository for an example of how to do it right: https://github.com/tyoma/micro-profiler Specifically, look at this source file: https://github.com/tyoma/micro-profiler/blob/master/collector/src/hooks.asm As you can see, they decided to use assembler for both platforms, and from there they call a C++ function to record the call information. Also note how the C++ collector implementation uses __forceinline to avoid recursion.


2 Comments

I tried doing it like in that repo but sadly to no avail (I updated the question with code sample). Could the fact that I use inline assembly instead of nasm be the problem?
@user1233963 That should work too, here’s an example: github.com/OSRDrivers/penter/blob/master/penterlib/penterlib.c One possible problem is thread_local: I’m not sure you’ll get the desired behavior when taking the address of that variable in assembler code.
3

"but I get either a recursion limit reached error"

This can happen if the compiler also inserts a call to _penter inside the implementation of Context::addFrame, which then calls Context::addFrame again, recursively.

But what about __forceinline, you may ask? It does nothing here. __forceinline tells the C/C++ compiler to insert a copy of the function body at each place the function is called from code that this compiler generates. The compiler cannot insert a copy of the function body into code it does not compile itself. So when a function marked __forceinline is called from assembler code, it is called in the usual way, not expanded in place. Your __forceinline therefore has no effect and serves no purpose.

You need to implement Context::addFrame (and all functions it calls) in a separate C++ file (say, context.cpp) compiled without the /Gh option.

You can set /Gh for all files in the project except context.cpp.
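To make the recursion easier to see, here is a small portable simulation. It is only a sketch and does not use /Gh at all: the "instrumented" code calls hook() by hand, the way the compiler would insert a call to _penter. All names in it (sim, hook, record, kDepthLimit, tracedCallOverflows) are made up for illustration.

```cpp
#include <cstddef>

// Portable simulation, no /Gh involved. hook() plays the role of _penter,
// record() plays the role of Context::addFrame. All names are hypothetical.
namespace sim {

constexpr int kDepthLimit = 1000;   // stand-in for the real stack limit
int hookCalls = 0;                  // how many times the hook body ran
bool overflowed = false;            // set when the simulated stack "overflows"

void hook(const void* addr, bool recorderIsInstrumented, int depth = 0);

// If the recorder was compiled with /Gh, the first thing it does is
// call the hook again, exactly like any other instrumented function.
void record(const void* addr, bool instrumented, int depth) {
    if (instrumented) {
        hook(addr, instrumented, depth + 1);
    }
    // ... the real recording work would happen here ...
}

void hook(const void* addr, bool recorderIsInstrumented, int depth) {
    if (depth >= kDepthLimit) {     // simulated stack overflow
        overflowed = true;
        return;
    }
    ++hookCalls;
    record(addr, recorderIsInstrumented, depth);
}

// Simulates entering one traced function; returns true if the hook
// recursed until the depth limit was hit.
bool tracedCallOverflows(bool recorderIsInstrumented) {
    hookCalls = 0;
    overflowed = false;
    int dummy = 0;
    hook(&dummy, recorderIsInstrumented);
    return overflowed;
}

} // namespace sim
```

With recorderIsInstrumented set to true the hook re-enters itself until the limit is reached; with false it runs exactly once per traced call, which is the situation you get when context.cpp is compiled without /Gh.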

If the project has too many .cpp files for that, you can set /Gh for the whole project. But how do you then remove it for the single file context.cpp? There is one unorthodox way: copy the <cmdline> for this file and set a custom build tool for it, with Command Line set to CL.exe <cmdline> $(InputFileName) (do not forget to remove /Gh) and Outputs set to $(IntDir)\$(InputName).obj. It is unorthodox, but it works perfectly.

So in context.cpp you can have the following code:

class Context
{
public:
    void __fastcall addFrame(const void* addr);

    int _n;

    static thread_local Context Instance;
};

thread_local Context Context::Instance;

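void __fastcall Context::addFrame(const void* addr)
{
// Prints the decorated (mangled) name of this function at compile time;
// the asm files need that name to call it.
#pragma message(__FUNCDNAME__)

    DbgPrint("%p>%u\n", addr, _n++);
}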

If Context::addFrame calls any other internal function (explicitly or implicitly), put that function in this file too, so that it is also compiled without /Gh.
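One way to keep the recorder from calling anything else is to have it write into a fixed-size per-thread buffer. The following is only a sketch of that idea, not the code from this answer: TraceContext, kCapacity, count() and frame() are hypothetical names, and in a real /Gh build these definitions would still have to live in the file compiled without /Gh.

```cpp
#include <cstddef>

// Hypothetical variant of Context: a per-thread ring buffer of return
// addresses. addFrame touches only its own members and calls nothing.
class TraceContext {
public:
    static constexpr std::size_t kCapacity = 4096;  // assumed buffer size

    // Stores the address, overwriting the oldest entry once the buffer is full.
    void addFrame(const void* addr) noexcept {
        frames_[count_ % kCapacity] = addr;
        ++count_;
    }

    // Total number of frames recorded so far (may exceed kCapacity).
    std::size_t count() const noexcept { return count_; }

    // Entry stored in slot i of the ring buffer.
    const void* frame(std::size_t i) const noexcept {
        return frames_[i % kCapacity];
    }

    static thread_local TraceContext Instance;

private:
    const void* frames_[kCapacity] = {};
    std::size_t count_ = 0;
};

thread_local TraceContext TraceContext::Instance;
```

Formatting and output (DbgPrint, symbol lookup and so on) can then be done later, outside the hook, by reading the buffer.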

It is better to implement _penter in a separate asm file rather than as inline asm (which is not supported on x64 anyway).

So for x86 you can create code32.asm ( ml /c /Cp $(InputFileName) -> $(InputName).obj):

.686p

.MODEL flat

extern ?addFrame@Context@@QAIXPBX@Z:proc
extern ?Instance@Context@@2V12@A:byte

_TEXT segment 'CODE'

__penter proc
    push edx
    push ecx
    mov edx,[esp+8]
    lea ecx,?Instance@Context@@2V12@A
    call ?addFrame@Context@@QAIXPBX@Z
    pop ecx
    pop edx
    ret
__penter endp

_TEXT ends
end

Note: you only need to save ecx and edx (they matter if you use __fastcall functions outside context.cpp).

For x64, create code64.asm ( ml64 /c /Cp $(InputFileName) -> $(InputName).obj):

extern ?addFrame@Context@@QEAAXPEBX@Z:proc
extern ?Instance@Context@@2V12@A:byte

_TEXT segment 'CODE'

_penter proc
    mov [rsp+8],rcx
    mov [rsp+16],rdx
    mov [rsp+24],r8
    mov [rsp+32],r9
    mov rdx,[rsp]
    sub rsp,28h
    lea rcx,?Instance@Context@@2V12@A
    call ?addFrame@Context@@QEAAXPEBX@Z
    add rsp,28h
    mov r9,[rsp+32]
    mov r8,[rsp+24]
    mov rdx,[rsp+16]
    mov rcx,[rsp+8]
    ret
_penter endp

_TEXT ENDS
end

1 Comment

Compiling context.cpp without the /Gh flag worked! Thanks a lot!
1

Here is what I use

Configuration Properties > C/C++ > Command Line

Add the compiler options to the Additional Options box:

Add flag /Gh for the _penter hook
Add flag /GH for the _pexit hook

Code I use for tracing / logging:

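#include <windows.h>
#include <dbghelp.h>    // SymInitialize, SymFromAddr, SYMBOL_INFO; link with dbghelp.lib
#include <intrin.h>     // _ReturnAddress
#include <stdlib.h>     // calloc

// myprintf is not defined in this snippet; it is presumably the author's own printf-style logger.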

extern "C"  void __declspec(naked) __cdecl _penter(void) {
    __asm {
        push ebp;               // standard prolog
        mov ebp, esp;
        sub esp, __LOCAL_SIZE
        pushad;                 // save registers
    }
    // _ReturnAddress always returns the address directly after the call, but that is not the start of the function!
    PBYTE addr;
    addr = (PBYTE)_ReturnAddress() - 5;

    SYMBOL_INFO* mysymbol;
    HANDLE       process;
    process = GetCurrentProcess();
    SymInitialize(process, NULL, TRUE);
    mysymbol = (SYMBOL_INFO*)calloc(sizeof(SYMBOL_INFO) + 256 * sizeof(char), 1);
    mysymbol->MaxNameLen = 255;
    mysymbol->SizeOfStruct = sizeof(SYMBOL_INFO);
    SymFromAddr(process, (DWORD64)((void*)addr), 0, mysymbol);
    myprintf("Entered Function: %s [0x%X]\n", mysymbol->Name, addr);

    _asm {
        popad;              // restore regs
        mov esp, ebp;       // standard epilog
        pop ebp;
        ret;
    }
}

extern "C"  void __declspec(naked) __cdecl _pexit(void) {
    __asm {
        push ebp;               // standard prolog
        mov ebp, esp;
        sub esp, __LOCAL_SIZE
        pushad;                 // save registers
    }
    // _ReturnAddress always returns the address directly after the call, but that is not the start of the function!
    PBYTE addr;
    addr = (PBYTE)_ReturnAddress() - 5;

    SYMBOL_INFO* mysymbol;
    HANDLE       process;
    process = GetCurrentProcess();
    SymInitialize(process, NULL, TRUE);
    mysymbol = (SYMBOL_INFO*)calloc(sizeof(SYMBOL_INFO) + 256 * sizeof(char), 1);
    mysymbol->MaxNameLen = 255;
    mysymbol->SizeOfStruct = sizeof(SYMBOL_INFO);
    SymFromAddr(process, (DWORD64)((void*)addr), 0, mysymbol);
    myprintf("Exit Function: %s [0x%X]\n", mysymbol->Name, addr);

    _asm {
        popad;              // restore regs
        mov esp, ebp;       // standard epilog
        pop ebp;
        ret;
    }
}
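The "- 5" in the code above relies on the hook call being encoded as a direct near call, which on x86 is one opcode byte (E8) followed by a 4-byte relative offset. Here is a small portable sketch of that arithmetic, assuming the hook call is the very first instruction of the function; the helper names are hypothetical and not part of any Windows API.

```cpp
#include <cstddef>
#include <cstdint>

// Size of an x86 direct near call: opcode E8 + 32-bit relative offset.
constexpr std::size_t kNearCallSize = 5;

// Given the return address seen inside the hook, step back over the call
// instruction to get the address where that call starts.
inline const std::uint8_t* callSiteFromReturnAddress(const std::uint8_t* returnAddress) {
    return returnAddress - kNearCallSize;
}

// Sanity check: does the byte at the computed call site look like E8?
// If the hook were reached through an indirect call, this would be false
// and the "- 5" adjustment would not apply.
inline bool isDirectNearCall(const std::uint8_t* callSite) {
    return callSite[0] == 0xE8;
}
```

For example, if a function's bytes start with E8 00 00 00 00 55 8B EC, the return address points at the 55 byte and callSiteFromReturnAddress gives back the address of the E8 byte, which is the start of the function.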

1 Comment

Thanks for the code. It worked for me with minor tweaks. But watch out! You allocate heap memory and then never deallocate it. If there are really many calls, then the application will allocate all the available memory and then crash (like in my case). I've changed the code to char buffer[sizeof(SYMBOL_INFO) + MAX_NAME_LENGTH + 1]; mysymbol = reinterpret_cast<SYMBOL_INFO*>(&buffer[0]); mysymbol->MaxNameLen = MAX_NAME_LENGTH;.
