Advanced Apple Debugging & Reverse Engineering

Fourth Edition · iOS 16, macOS 13.3 · Swift 5.8, Python 3 · Xcode 14

22. Script Bridging Classes & Hierarchy
Written by Walter Tyree


You’ve learned the essentials of working with LLDB’s Python module, as well as how to correct any errors using Python’s pdb debugging module.

Now you’ll explore the main players within the lldb Python module for a good overview of the essential classes.

You’ll be building a more complex LLDB Python script as you learn about these classes. You’ll create a regex breakpoint that only stops after the scope in which the breakpoint hit has finished executing. This is useful when exploring initialization and accessor-type methods, and you want to examine the object that’s being returned after the function executes.

In this chapter, you’ll learn how to create the functionality behind this script while learning about the major classes within the LLDB module. You’ll continue on with this script in the next chapter by exploring how to add optional arguments to tweak the script based on your debugging needs.

The Essential Classes

Within the lldb module, there are several important classes:

  • lldb.SBDebugger: The “bottleneck” class you’ll use to access instances of other classes inside your custom debugging script.

    There will always be one reference to an instance of this class passed in as the debugger function parameter to your script. This class is responsible for handling input commands into LLDB, and can control where and how it displays the output.

  • lldb.SBTarget: Responsible for the executable being debugged in memory, the debug files, and the physical file for the executable resident on disk.

    In a typical debugging session, you’ll use the instance of SBDebugger to get the selected SBTarget. From there, you’ll be able to access the majority of other classes through SBTarget.

  • lldb.SBProcess: Handles memory access (reading/writing) as well as the multiple threads within the process.

  • lldb.SBThread: Manages the stack frames (SBFrames) within that particular thread, and also manages control logic for stepping.

  • lldb.SBFrame: Manages local variables (given through debugging information) as well as any registers frozen at that particular frame.

  • lldb.SBModule: Represents a particular executable. You’ve learned about modules when exploring dynamic libraries; a module can include the main executable or any dynamically loaded code (like the Foundation framework).

    You can obtain a complete list of the modules loaded into your executable using the image list command.

  • lldb.SBFunction: This represents a generic function — the code — that is loaded into memory. This class has a one-to-one relationship with the SBFrame class.

Got it? No? Don’t worry about it! Once you see how these classes interact with each other, you’ll have a better understanding of their place inside your program.
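As a rough illustration of the SBTarget and SBModule relationship, the sketch below collects the same names that image list prints. The helper name loaded_module_names is hypothetical; it assumes target is an lldb.SBTarget and only uses GetNumModules(), GetModuleAtIndex(), GetFileSpec() and GetFilename().

```python
def loaded_module_names(target):
    """Return the file name of every module loaded into `target`.

    `target` is assumed to be an lldb.SBTarget, for example the value
    returned by debugger.GetSelectedTarget() inside a command function.
    """
    names = []
    for index in range(target.GetNumModules()):
        module = target.GetModuleAtIndex(index)           # lldb.SBModule
        names.append(module.GetFileSpec().GetFilename())  # SBFileSpec -> str
    return names
```

Once the function is defined in the script interpreter, something like script print(loaded_module_names(lldb.target)) should list the main executable along with the frameworks it has loaded.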

This diagram is a simplified version of how the major LLDB Python classes interact with each other.

If there’s no direct path from one class to another, you can still get there through other properties and methods, many of which aren’t shown in the diagram, that point to an instance (or all instances) of a class.

That being said, the entry point into the majority of these objects will be an instance of SBDebugger, passed into your scripts as a function parameter called debugger. From there, you’ll likely go after the SBTarget through GetSelectedTarget() to access all the other instances.
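The path described above can be sketched as a single walk from the debugger down to the current frame. This is a minimal sketch, not the chapter's script: describe_selection is a hypothetical helper, and it assumes a process is running and stopped so that every step returns a valid object.

```python
def describe_selection(debugger):
    """Walk SBDebugger -> SBTarget -> SBProcess -> SBThread -> SBFrame.

    Assumes `debugger` is an lldb.SBDebugger with a stopped process;
    no validity checks are made in this sketch.
    """
    target = debugger.GetSelectedTarget()    # SBDebugger -> SBTarget
    process = target.GetProcess()            # SBTarget   -> SBProcess
    thread = process.GetSelectedThread()     # SBProcess  -> SBThread
    frame = thread.GetSelectedFrame()        # SBThread   -> SBFrame
    module = frame.GetModule()               # SBFrame    -> SBModule
    return {
        "executable": target.GetExecutable().GetFilename(),
        "pid": process.GetProcessID(),
        "thread_id": thread.GetThreadID(),
        "function": frame.GetFunctionName(),
        "module": module.GetFileSpec().GetFilename(),
    }
```

In a real script, each object would normally be checked with IsValid() before use, since any of these calls can return an invalid placeholder when nothing is selected.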

Exploring the lldb Module Through… LLDB

Since you’ll be incrementally building a reasonably complex script over the next two chapters, you’ll need a way to reload your LLDB script without having to stop, rerun and attach to a process. You’ll create an alias for reloading the ~/.lldbinit script while running LLDB.

command alias reload_script command source ~/.lldbinit

(lldb) script lldb.debugger
<lldb.SBDebugger; proxy of <Swig Object of type 'lldb::SBDebugger *' at 0x113f2f990> >
(lldb) script lldb.target
<lldb.SBTarget; proxy of <Swig Object of type 'lldb::SBTarget *' at 0x1142daae0> >
(lldb) script print (lldb.target)
Meh
(lldb) script print (lldb.process)
SBProcess: pid = 47294, state = stopped, threads = 7, executable = Meh
(lldb) script print (lldb.thread)
thread #1: tid = 0x13a921, 0x000000010fc69ab0 Meh`ViewController.viewDidLoad(self=0x00007fa8c5b015f0) -> () at ViewController.swift:13, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
(lldb) script print (lldb.frame)
frame #0: 0x000000010fc69ab0 Meh`ViewController.viewDidLoad(self=0x00007fa8c5b015f0) -> () at ViewController.swift:13
(lldb) script help(lldb.target)
(lldb) script help(lldb.SBTarget)
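The help() output for these classes runs long. When only the method names are needed, plain Python introspection can give a more compact view. The helper below is a hypothetical convenience, not part of the lldb module; it works on any Python class.

```python
def public_methods(cls):
    """Return the sorted names of callables on `cls` that don't start with '_'."""
    return sorted(
        name
        for name in dir(cls)
        if not name.startswith("_") and callable(getattr(cls, name))
    )
```

After defining it in the script interpreter, script print(public_methods(lldb.SBTarget)) should print just the method names. Properties such as num_locations are not callable, so they are left out of this list.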

Learning & Finding Documentation on Script Bridging Classes

Learning this stuff isn’t easy. You’re faced with the learning curve of the LLDB Python module, as well as learning Python along the way.

Easy Reading

I frequently find myself scouring the class documentation to see what the different classes can do for me with their APIs. However, doing that in the LLDB Terminal makes my eyes water. I typically jump to the online documentation because I am a sucker for basic Cascading Style Sheet(s) with more colors than just the background color and text color.

command regex gdocumentation 's/(.+)/script import os; os.system("open https:" + chr(47) + chr(47) + "lldb.llvm.org" + chr(47) + "python_api" + chr(47) + "lldb.%1.html")/'
(lldb) gdocumentation SBTarget
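The one-line regex command is hard to read because the forward slashes are spelled as character code 47, presumably to keep them from clashing with the / delimiters of the s/// substitution. The same idea reads more easily as ordinary Python. This is only a sketch: doc_url and open_docs are hypothetical names, and the base URL assumes the online reference keeps one page per class under python_api.

```python
import webbrowser

# Assumption: class pages live at <base>/lldb.<ClassName>.html.
LLDB_DOCS_BASE = "https://lldb.llvm.org/python_api"


def doc_url(class_name, base=LLDB_DOCS_BASE):
    """Build the documentation URL for a script bridging class."""
    return "{}/lldb.{}.html".format(base.rstrip("/"), class_name)


def open_docs(class_name):
    """Open the class page in the default browser."""
    webbrowser.open(doc_url(class_name))
```

Passing a different base makes it possible to point the helper at a local copy of the documentation instead.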

Documentation for the More Serious

If you’re one of those developers who really, really needs to master LLDB’s Python module, or if you have plans to build a commercial product which interacts with LLDB, you’ll need to take a more serious approach for digging through the lldb module APIs and documentation.

mdfind SBProcess -onlyin ~/websites/lldb
SBTarget site:lists.llvm.org/pipermail/lldb-dev/
SBTarget site:discourse.llvm.org

Creating the BreakAfterRegex Command

It’s time to create the command you were promised you’d build at the beginning of this chapter!

import lldb

def breakAfterRegex(debugger, command, result, internal_dict):
  print ("yay. basic script setup with input: {}".format(command))

def __lldb_init_module(debugger, internal_dict):
  debugger.HandleCommand('command script add -f BreakAfterRegex.breakAfterRegex bar')

command script add -f BreakAfterRegex.breakAfterRegex bar
command script import ~/lldb/BreakAfterRegex.py
(lldb) reload_script
(lldb) bar UIViewController test -a -b
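Everything typed after bar reaches the function as one raw string in the command parameter. If the pieces are needed separately, one option is to split the string with the standard shlex module. The sketch below is an illustration rather than the chapter's approach, and split_command is a hypothetical name; posix=False is used so that backslashes in a regular expression are kept.

```python
import shlex


def split_command(command):
    """Split the raw command string into whitespace-separated tokens.

    posix=False keeps backslashes intact, which matters when the
    command contains an escaped regular expression.
    """
    return shlex.split(command, posix=False)
```

With the input above, split_command would return the four tokens UIViewController, test, -a and -b. A pattern that contains an escaped space would still be split at that space, so this only suits simple inputs.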

def breakAfterRegex(debugger, command, result, internal_dict):
  # 1
  target = debugger.GetSelectedTarget()
  breakpoint = target.BreakpointCreateByRegex(command)

  # 2
  if not breakpoint.IsValid() or breakpoint.num_locations == 0:
    result.AppendWarning(
      "Breakpoint isn't valid or hasn't found any hits.")
  else:
    result.AppendMessage("{}".format(breakpoint))

  # 3
  breakpoint.SetScriptCallbackFunction(
    "BreakAfterRegex.breakpointHandler")
(lldb) script help(lldb.SBBreakpoint)
(lldb) gdocumentation SBBreakpoint
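To see what a regex breakpoint actually resolved to, the locations on the SBBreakpoint can be walked one by one. The helper below is a hypothetical sketch; it assumes breakpoint is an lldb.SBBreakpoint and relies on GetNumLocations(), GetLocationAtIndex(), GetAddress(), GetSymbol() and GetName().

```python
def location_symbol_names(breakpoint):
    """Return the symbol name behind each location of `breakpoint`."""
    names = []
    for index in range(breakpoint.GetNumLocations()):
        location = breakpoint.GetLocationAtIndex(index)  # SBBreakpointLocation
        symbol = location.GetAddress().GetSymbol()       # SBAddress -> SBSymbol
        names.append(symbol.GetName())
    return names
```

Printing this list right after creating the breakpoint is a quick way to check that a regular expression did not match more functions than intended.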
def breakpointHandler(frame, bp_loc, internal_dict):
  function_name = frame.GetFunctionName()
  print("stopped in: {}".format(function_name))
  return True

(lldb) reload_script
(lldb) bar somereallylongmethodthatapplehopefullydidntwritesomewhere
warning: Breakpoint isn't valid or hasn't found any hits.
(lldb) bar NSObject.init\]
SBBreakpoint: id = 3, regex = 'NSObject.init\]', locations = 2

(lldb) finish
(lldb) po $arg1
<_CFXNotificationNameWildcardObjectRegistration: 0x61000006e8c0>
def breakpointHandler(frame, bp_loc, internal_dict):
  # 1
  '''The function called when the regular
  expression breakpoint gets triggered
  '''

  # 2
  thread = frame.GetThread()
  process = thread.GetProcess()
  debugger = process.GetTarget().GetDebugger()

  # 3
  function_name = frame.GetFunctionName()

  # 4
  debugger.SetAsync(False)

  # 5
  thread.StepOut()

  # 6
  output = evaluateReturnedObject(debugger,
                                  thread,
                                  function_name)
  if output is not None:
    print(output)

  return False
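The handler switches the debugger to synchronous mode so that StepOut() finishes before the next line runs, and it returns False so that LLDB resumes instead of stopping at the breakpoint. The handler above leaves the debugger in synchronous mode afterward. If restoring the previous setting is wanted, one possible variation is a small context manager. This is an optional sketch, not part of the chapter's script; synchronous is a hypothetical name, and it only uses GetAsync() and SetAsync() on SBDebugger.

```python
from contextlib import contextmanager


@contextmanager
def synchronous(debugger):
    """Run a block with the debugger in synchronous mode, then restore it."""
    was_async = debugger.GetAsync()   # remember the current mode
    debugger.SetAsync(False)          # block until each command completes
    try:
        yield debugger
    finally:
        debugger.SetAsync(was_async)  # put the original mode back
```

Inside the handler, the stepping call would then sit in a with synchronous(debugger): block.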
def evaluateReturnedObject(debugger, thread, function_name):
  '''Grabs the reference from the return register
  and returns a string from the evaluated value.
  TODO ObjC only
  '''

  # 1
  res = lldb.SBCommandReturnObject()

  # 2
  interpreter = debugger.GetCommandInterpreter()
  target = debugger.GetSelectedTarget()
  frame = thread.GetSelectedFrame()
  parent_function_name = frame.GetFunctionName()

  # 3
  expression = 'expression -lobjc -O -- $arg1'


  # 4
  interpreter.HandleCommand(expression, res)

  # 5
  if res.HasResult():
    # 6
    output = '{}\nbreakpoint: '\
      '{}\nobject: {}\nstopped: {}'.format(
        '*' * 80,
        function_name,
        res.GetOutput().replace('\n', ''),
        parent_function_name)
    return output
  else:
    # 7
    return None
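When the expression fails, HasResult() is false and the function above quietly returns None. While developing the script, it can help to see why the expression failed. The sketch below is a hypothetical helper that assumes res is an lldb.SBCommandReturnObject and uses its Succeeded(), HasResult(), GetOutput() and GetError() methods.

```python
def result_text(res):
    """Return the command output, or the error text if the command failed."""
    if res.Succeeded() and res.HasResult():
        return res.GetOutput().strip()
    return "error: {}".format(res.GetError().strip())
```

Printing result_text(res) right after interpreter.HandleCommand(expression, res) shows either the object description or the message LLDB produced.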
(lldb) br del
About to delete all breakpoints, do you want to do that?: [Y/n] Y
All breakpoints removed. (1 breakpoint)
(lldb) bar NSObject.init\]

(lldb) bar NSURL(\(\w+\))?\ init
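Before handing a pattern to bar, it can be tried against a few symbol names in plain Python. Treat this as an approximation: LLDB uses its own regular expression engine, which may differ from Python's re module in places. The function name and the sample symbols in the usage note are illustrative only.

```python
import re


def matching_symbols(pattern, symbol_names):
    """Return the names in `symbol_names` that `pattern` matches anywhere."""
    regex = re.compile(pattern)
    return [name for name in symbol_names if regex.search(name)]
```

For example, given the names -[NSURL init], -[NSURL(NSURL) initFileURLWithPath:] and -[NSURL absoluteString], the second pattern above keeps the first two, because the (\(\w+\))? group makes the category name optional. The earlier NSObject.init\] pattern matches -[NSObject init] but not +[NSObject initialize], since the closing bracket must follow init directly.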

Key Points

  • Use lldb.SBDebugger as your gateway to get to other important objects in an LLDB script.
  • You can use the GetSelectedTarget() method on lldb.SBDebugger to get the lldb.SBTarget, which represents the code being executed and debugged.
  • The lldb.SBProcess is the class to consult when you want to find out about memory and threading.
  • Stack frames and stepping logic come from the lldb.SBThread class.
  • The lldb.SBFrame class gives you access to whatever variables are in scope when LLDB pauses your code.
  • The lldb.SBModule class gives you access to all of the loaded dependencies of your code.
  • Most of the script bridging classes have methods that give you a handle on the other classes, even when no direct path exists.
  • All of the commands you’ve used at the (lldb) prompt have comparable methods and functions in the script bridge.
  • Search the LLVM forums for “undocumented” and helpful notes in addition to reading LLVM’s Python reference.
  • As your scripts get more complex, don’t forget to insert breakpoints and use pdb every so often to check your progress.

Where to Go From Here?

You’ve begun your quest to create Python LLDB scripts of real-world complexity. In the next chapter, you’ll take this script even further and add some cool options to customize this script.

© 2024 Kodeco Inc.
