While debugging some native code, I stepped into the assembly of calloc() on macOS (ARMv8) and saw an interesting technique: it seems to check whether the instructions at some label are defined and, if they are not, it doesn't jump there:

    adrp   x8, 436866               
    ldrb   w8, [x8, #0x6c8]         
    tbnz   w8, #0x0, 0x18c2b6068      ; <+120>

And at address 0x18c2b6068 I see:

    malloc_logger:
    udf    #0x0                     
    udf    #0x0                     
    udf    #0x0                     

So the immediate value 0x0 seems to work as a flag for whether code is defined here.

But who made this piece of code undefined? The compiler? Why not just remove these instructions completely? When would it actually be defined?

5 Comments
  • stackoverflow.com/questions/5167557/… Commented Nov 20 at 17:02
  • My guess is the byte being tested is meant as data rather than code. This would be a typical way to load and test a bool variable, for instance. It may just be that your debugger or dumper has chosen (or been told) to display those bytes as disassembly rather than numerical values or characters. Commented Nov 20 at 17:11
  • Loading an instruction to test whether it's undefined wouldn't make any sense. Also note the code only tests bit 0 of the value loaded, so for instance it would catch any ALU instruction with an even numbered destination register. To see if it's a UDF, you'd need to load a 32-bit word (not just a byte as here) and test whether bits 16-31 are all zero. Commented Nov 20 at 17:13
  • @NateEldredge, I see.. But if it's just a variable (which makes more sense), what could the label malloc_logger mean? Commented Nov 20 at 17:21
  • @StanislavBashkyrtsev: It could just be the name of the variable. Labels (symbols) can identify addresses of data as well as code. Commented Nov 20 at 18:20

1 Answer

Human error on your part.

Right off the bat, we can tell that the load and the jump don't target the same address: adrp generates addresses aligned to 0x1000 bytes, so the bottom 12 bits of the load address are 0x6c8, whereas the bottom 12 bits of the jump target are 0x068. They can't possibly be the same.
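
The alignment argument can be checked mechanically. A minimal sketch in Python, using only the two values from the question; the helper name `low12` is made up for illustration:

```python
PAGE_MASK = 0xFFF  # adrp yields a 0x1000-aligned base, so the low 12 bits come from the ldrb offset alone

def low12(addr: int) -> int:
    """Return the offset of addr within its 0x1000-byte page."""
    return addr & PAGE_MASK

load_offset = 0x6C8        # immediate of the ldrb in the question
jump_target = 0x18C2B6068  # target of the tbnz in the question

print(hex(low12(load_offset)))   # 0x6c8
print(hex(low12(jump_target)))   # 0x68

# If the load and the jump hit the same address, these would have to be equal.
same_address_possible = low12(load_offset) == low12(jump_target)
print(same_address_possible)     # False
```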

Going deeper: you're on macOS 26.0.1 (25A362) and this is the function you're looking at:

;-- __malloc_zone_calloc:
0x1802f9ff0      085435b0       adrp x8, sym._ctr_des
0x1802f9ff4      08dd41f9       ldr x8, [x8, 0x3b8]
0x1802f9ff8      095435f0       adrp x9, 0x1ead7c000
0x1802f9ffc      296943f9       ldr x9, [x9, 0x6d0]
0x1802fa000      1f0100eb       cmp x8, x0
0x1802fa004      200940fa       ccmp x9, 0, 0, eq
0x1802fa008      81000054       b.ne 0x1802fa018
0x1802fa00c      085c36d0       adrp x8, 0x1ece7c000
0x1802fa010      083540f9       ldr x8, [x8, 0x68]
0x1802fa014      000140f9       ldr x0, [x8]
0x1802fa018      085435d0       adrp x8, 0x1ead7c000
0x1802fa01c      08215b39       ldrb w8, [x8, 0x6c8]
0x1802fa020      48020037       tbnz w8, 0, 0x1802fa068
0x1802fa024      085435d0       adrp x8, 0x1ead7c000
0x1802fa028      085143f9       ldr x8, [x8, 0x6a0]
0x1802fa02c      e80100b5       cbnz x8, 0x1802fa068
0x1802fa030      086840b9       ldr w8, [x0, 0x68]
0x1802fa034      1f310071       cmp w8, 0xc
0x1802fa038      89010054       b.ls 0x1802fa068
0x1802fa03c      1f410071       cmp w8, 0x10
0x1802fa040      e3000054       b.lo 0x1802fa05c
0x1802fa044      045440f9       ldr x4, [x0, 0xa8]
0x1802fa048      e8031eaa       mov x8, x30
0x1802fa04c      e843c1da       xpaci x8
0x1802fa050      038542d3       ubfx x3, x8, 2, 0x20
0x1802fa054      50c68dd2       mov x16, 0x6e32
0x1802fa058      90081fd7       braa x4, x16
0x1802fa05c      031040f9       ldr x3, [x0, 0x20]
0x1802fa060      f03688d2       mov x16, 0x41b7
0x1802fa064      70081fd7       braa x3, x16
0x1802fa068      01000014       b sym.__malloc_zone_calloc_instrumented_or_legacy

To prove that, let's look at these three instructions:

0x1802fa018      085435d0       adrp x8, 0x1ead7c000
0x1802fa01c      08215b39       ldrb w8, [x8, 0x6c8]
0x1802fa020      48020037       tbnz w8, 0, 0x1802fa068
  1. The ldrb is exactly the same as in your snippet; that's how I found this in the first place.
  2. The adrp has a page delta encoding of 0x6aa82000, which your disassembler shows as the page count 0x6aa82 in decimal, i.e. 436866. Together, these two are almost guaranteed to uniquely identify your dyld_shared_cache, since there are over 3000 libraries merged in there (you'll notice that the full delta of 0x6aa826c8 is almost 1.8GB away from the instruction performing the load).
  3. The bottom 14 bits of the tbnz addresses match (i.e. 0x18c2b6068 & 0x3fff == 0x1802fa068 & 0x3fff == 0x2068).
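
These three observations can be reproduced from the raw encodings in the listing. A rough sketch in Python, assuming the standard A64 field layouts for ADRP, LDRB (unsigned immediate) and TBNZ; the helper names are made up for illustration and are not part of any tool mentioned here:

```python
def word(hex_bytes: str) -> int:
    """Turn the little-endian byte column of the listing into a 32-bit word."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")

def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def adrp_pages(insn: int) -> int:
    """Signed 21-bit page count: immhi (bits 5-23) followed by immlo (bits 29-30)."""
    immlo = (insn >> 29) & 0x3
    immhi = (insn >> 5) & 0x7FFFF
    return sign_extend((immhi << 2) | immlo, 21)

def ldrb_offset(insn: int) -> int:
    """Unsigned 12-bit offset in bits 10-21 (not scaled for byte loads)."""
    return (insn >> 10) & 0xFFF

def tbnz_target(insn: int, pc: int) -> int:
    """Branch target: pc plus the signed 14-bit word offset in bits 5-18."""
    return pc + 4 * sign_extend((insn >> 5) & 0x3FFF, 14)

pages = adrp_pages(word("085435d0"))
page_base = (0x1802FA018 & ~0xFFF) + (pages << 12)
var_addr = page_base + ldrb_offset(word("08215b39"))
target = tbnz_target(word("48020037"), 0x1802FA020)

print(pages, hex(pages << 12))  # 436866 0x6aa82000
print(hex(var_addr))            # 0x1ead7c6c8
print(hex(target))              # 0x1802fa068
print(hex(target & 0x3FFF), hex(0x18C2B6068 & 0x3FFF))  # 0x2068 0x2068
```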

This tells us that your cache is running with an ASLR slide of 0xbfbc000 versus the unslid image on disk. So your slid address of 0x18c2b6068 corresponds to unslid 0x1802fa068, which is the last instruction of the function above. That makes sense: tb[n]z has rather limited bits for jump distance, so it is usually used for jumps within a function. It's also what my disassembler is showing.
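
The slide arithmetic can be sketched in a few lines of Python. The addresses are the ones quoted above; the 16 KiB page size is an assumption that the 14-bit comparison relies on, and the variable names are made up for illustration:

```python
PAGE_16K = 0x4000  # assumed page size; a slide that is a multiple of it leaves the low 14 bits untouched

slid_target = 0x18C2B6068     # tbnz target seen in the debugger
unslid_target = 0x1802FA068   # last instruction of __malloc_zone_calloc on disk
function_start = 0x1802F9FF0  # first instruction of __malloc_zone_calloc on disk

slide = slid_target - unslid_target
offset_in_function = unslid_target - function_start

print(hex(slide))            # 0xbfbc000
print(slide % PAGE_16K)      # 0, so the slide is page-aligned
print(offset_in_function)    # 120
```

The last value agrees with the `<+120>` annotation next to the tbnz in the question's snippet.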

So how did you get to malloc_logger? Well that's the next load after that:

0x1802fa024      085435d0       adrp x8, 0x1ead7c000
0x1802fa028      085143f9       ldr x8, [x8, 0x6a0]
0x1802fa02c      e80100b5       cbnz x8, 0x1802fa068

The variable that's loaded for the tbnz check is malloc_slowpath, and if I go there and print a bunch of surrounding stuff as instructions, I see this:

[0x1ead7c6c8]> pd -10
         ;-- _malloc_logger:
         0x1ead7c6a0      00000000       invalid
         0x1ead7c6a4      00000000       invalid
         0x1ead7c6a8      00000000       invalid
         ;-- _malloc_tracing_enabled:
         0x1ead7c6ac  ~   00000000       invalid
         ;-- _malloc_interposition_compat:
         0x1ead7c6af      00             unaligned
         0x1ead7c6b0      00000000       invalid
         ;-- _malloc_sec_transition_policy:
         0x1ead7c6b4      00000000       invalid
         ;-- _malloc_sec_transition_early_malloc_support:
         0x1ead7c6b8      00000000       invalid
         0x1ead7c6bc      00000000       invalid
         ;-- _malloc_check_start:
         0x1ead7c6c0      00000000       invalid
[0x1ead7c6c8]> pd 10
         ;-- _malloc_slowpath:
         0x1ead7c6c8      00000000       invalid
         0x1ead7c6cc      00000000       invalid
         ;-- _lite_zone:
         0x1ead7c6d0      00000000       invalid
         0x1ead7c6d4      00000000       invalid
         ;-- _malloc_zero_on_free_sample_period:
         0x1ead7c6d8      00000000       invalid
         0x1ead7c6dc      00000000       invalid
         ;-- ___mach_stack_logging_shared_memory_address:
         0x1ead7c6e0      00000000       invalid
         0x1ead7c6e4      00000000       invalid
         ;-- _stack_logging_enable_logging:
         0x1ead7c6e8      00000000       invalid
         0x1ead7c6ec      00000000       invalid

So the first load is for malloc_slowpath, which is not too far from malloc_logger, the target of the load right after that. But either way, we are in the DATA segment here: these are variables, not functions, and there is no code here!
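
To illustrate why zeroed data can look like code, here is a small Python sketch. It uses the UDF test described in the comments under the question (bits 16-31 of the 32-bit word all zero) and contrasts it with what the ldrb/tbnz pair actually checks. The function names are hypothetical, and that the debugger printed `udf #0x0` for this reason is an inference, not something confirmed here:

```python
def is_udf(insn: int) -> bool:
    """True if a 32-bit word matches the UDF pattern: bits 16-31 all zero."""
    return (insn & 0xFFFFFFFF) >> 16 == 0

def udf_immediate(insn: int) -> int:
    """The 16-bit immediate of a UDF word (bits 0-15)."""
    return insn & 0xFFFF

def tbnz_bit0_taken(byte: int) -> bool:
    """What `ldrb` + `tbnz w8, #0` tests: bit 0 of a single byte."""
    return (byte & 1) != 0

zeroed_variable = bytes(4)  # four zero bytes, like the variables in the dump
as_word = int.from_bytes(zeroed_variable, "little")

print(is_udf(as_word), udf_immediate(as_word))  # True 0
print(tbnz_bit0_taken(zeroed_variable[0]))      # False

# A byte-sized flag set to 1 flips the branch but still matches the UDF pattern as a word.
flag_set = bytes([1, 0, 0, 0])
print(tbnz_bit0_taken(flag_set[0]))                 # True
print(is_udf(int.from_bytes(flag_set, "little")))   # True
```

The second pair of prints shows that the bit-0 test and the UDF pattern are independent of each other, which fits the reading that the code is testing a boolean variable rather than inspecting instructions.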

But hey, did you know that this code is open source?

4 Comments

; <+120> looks like GDB syntax for an address 120 bytes into the current function. Does that make any sense? And GDB's disassembly by default omits a hexdump of the machine code. But on macOS I guess LLDB is more likely; does it use that syntax, too?
Okay, that actually makes more sense! The jump target is 120 bytes from the top of the function.
Thanks! I like how you figured it all out from just a couple of lines of assembly and my incorrect description of the problem :) It took me a couple of days to research all the things that you reference. Really cool! What I couldn't figure out is how you work with the "unslid" code. Plain objdump -d /System/Volumes/Preboot/Cryptexes/OS/System/Library/dyld/dyld_shared_cache_arm64e doesn't work. Did you use some special utils for this?
I used radare2. It has a dsc:// protocol for the dyld shared cache, and a R_DYLDCACHE_FILTER env var to filter for only the libraries you want (plus their dependencies). So what I used was R_DYLDCACHE_FILTER=libsystem_malloc.dylib r2 dsc:///System/Cryptexes/OS/System/Library/dyld/dyld_shared_cache_arm64e.
