Disclaimer: No Pythons were harmed in the making of this post.
Version used: CPython 3.8.10 on Ubuntu 20.04 x86_64 (WSL)
Snakes can be scary, but Python is a good snek. Python is our friend!
As all good friends must do, we will gaslight it into believing that 2 + 2 = 5.
Because Python is an interpreted scripting language, some parts of its runtime can be changed directly.
For example, we can easily replace built-in functions such as int():
int = lambda x: 1
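A minimal sketch of what that assignment really does: it only rebinds the name int in the current namespace, while the genuine type stays available in the builtins module. The variable names (real_int, shadowed, restored) are made up for this illustration.

```python
import builtins

real_int = builtins.int      # keep a handle on the genuine type
int = lambda x: 1            # shadow the built-in name, as in the post
shadowed = int("42")         # calls the lambda, so the argument is ignored

int = real_int               # put the original back
restored = int("42")         # the real conversion works again
```

So the "replacement" is closer to hiding the function behind a new name than to changing it.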
Good, we are getting intimate! But how do we modify arithmetic operations such as + and -? When we implement our own class, it’s quite straightforward. By overriding __add__ and __radd__, we can decide exactly what happens when instances (objects) of our class are added. For example, we can make a Kitten class, and then make kitten1 + kitten2 return the string "Huge pile of cat hair".
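The Kitten idea could look roughly like this. It is only a sketch: the constructor argument and the kitten names are invented for the example.

```python
class Kitten:
    """Toy class whose instances can be 'added' together."""

    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        # Runs for `self + other`.
        return "Huge pile of cat hair"

    def __radd__(self, other):
        # Runs for `other + self` when `other` cannot handle a Kitten itself.
        return "Huge pile of cat hair"


kitten1 = Kitten("Mittens")
kitten2 = Kitten("Whiskers")

pile = kitten1 + kitten2   # dispatches to Kitten.__add__
mixed = 2 + kitten1        # int gives up on a Kitten, so Kitten.__radd__ runs
```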
When it comes to built-in types (e.g. int), though, Python does not provide us with a means of changing the default behavior of +. The implementation is buried inside the CPython interpreter’s C code, so we’ll need to work a little harder.
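We can watch Python refuse. The sketch below tries the obvious trick of assigning to int.__add__ and records the error instead of crashing. The helper name try_patch_int_add is made up, and the exact wording of the message differs between CPython versions, so it is not reproduced here.

```python
def try_patch_int_add():
    """Try to replace int.__add__; return the error text, or None on success."""
    try:
        int.__add__ = lambda self, other: 5
    except TypeError as exc:
        # CPython does not allow setting attributes on built-in types.
        return str(exc)
    return None


message = try_patch_int_add()
still_sane = (2 + 2 == 4)   # addition is untouched
```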
Locating the Function
Looking through the CPython source code, in Include/abstract.h, we find the following function declaration:
/* Returns the result of adding o1 and o2, or NULL on failure.
In theory, every int addition operation arrives here!
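Before reaching for gdb, the dis module gives a hint about the path an addition takes. As far as I can tell, on CPython 3.8 a + inside a function compiles to a BINARY_ADD instruction, whose handler calls PyNumber_Add for non-string operands, while newer versions emit a generic BINARY_OP instead. The function name add below is just an example.

```python
import dis


def add(a, b):
    return a + b


# Collect the opcode names the compiler produced for `a + b`.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
add_ops = [name for name in opnames if name in ("BINARY_ADD", "BINARY_OP")]
```

This only shows the bytecode side; confirming that the C function is actually reached still needs the debugger.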
We can check that with gdb, but first, we’ll need to set up a few things.
Setup
Things we’ll need:
- gdb
- A version of Python with debug symbols
- CPython source code that matches the version
In Ubuntu, the Python 3 debug symbols are shipped in the python3-dbg package.
Note: You may need to edit your APT configuration and add debug symbol sources.
➜ ~ sudo apt install -y python3-dbg
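Once the package is installed, a quick check from inside the interpreter can tell whether we are really running a debug build. This is a best-effort sketch that assumes python3-dbg is built with --with-pydebug, which is my understanding of the Ubuntu package. The helper name is_debug_build is made up. Note that this checks for a debug build, not for debug symbols as such.

```python
import sys
import sysconfig


def is_debug_build():
    """Best-effort check for a CPython build configured with --with-pydebug."""
    # sys.gettotalrefcount only exists in debug builds of CPython.
    has_refcount_api = hasattr(sys, "gettotalrefcount")
    # Py_DEBUG is recorded in the build configuration; it may be None or 0.
    py_debug_flag = bool(sysconfig.get_config_var("Py_DEBUG"))
    return has_refcount_api or py_debug_flag


debug_build = is_debug_build()
```

Running this in python3-dbg and in the regular python3 should give different answers if the assumption holds.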
Good! Next, we need the CPython source code for this exact version. We can download it either from the git repo (v3.8.10 tag) or from the official Python website. Let’s download and extract it.
➜ ~ wget https://www.python.org/ftp/python/3.8.10/Python-3.8.10.tar.xz
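Since the source must match the interpreter, it can help to derive the URL from the running version rather than typing it by hand. The sketch below simply follows the URL pattern of the wget command above. The helper name source_tarball_url is made up, and pre-release or distro-patched version strings may not map onto a real download.

```python
import platform


def source_tarball_url(version):
    """Build the python.org source tarball URL for a final release version."""
    return f"https://www.python.org/ftp/python/{version}/Python-{version}.tar.xz"


running_version = platform.python_version()   # e.g. "3.8.10"
url = source_tarball_url(running_version)
```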
Let’s see if we can get gdb to debug Python with source code support. The dir Python-3.8.10/Python command tells gdb to add that directory to its source search path, so it will be able to show source code.
➜ ~/pymod gdb python3-dbg
We got a breakpoint on main, awesome!
Debugging the Interpreter
Now that we have everything set up correctly, let’s see if we land in that PyNumber_Add function we saw earlier!
We’ll close gdb and run Python again:
➜ ~ python3-dbg
Now we can attach to it with gdb and set a breakpoint on our target function.
Note: You may need to set ptrace_scope to 0.
➜ ~ gdb -p `pidof python3-dbg`
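If several interpreters are running, pidof may print more than one PID. A small sketch, meant to be pasted into the Python session you want to debug, prints the exact attach command and reports the current ptrace_scope setting. The helper names are made up, and the /proc path is the Yama setting found on typical Linux systems, so it may not exist elsewhere.

```python
import os
from pathlib import Path

PTRACE_SCOPE = Path("/proc/sys/kernel/yama/ptrace_scope")


def read_ptrace_scope():
    """Return the Yama ptrace_scope value, or None if it cannot be read."""
    try:
        return int(PTRACE_SCOPE.read_text().strip())
    except (OSError, ValueError):
        return None


def gdb_attach_command():
    """Build the gdb command that attaches to this very interpreter."""
    return f"gdb -p {os.getpid()}"


print(gdb_attach_command())
print("ptrace_scope:", read_ptrace_scope())
```

A value of 0 means any process of the same user may attach; higher values are more restrictive.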
Let’s do a simple addition:
0x1234 + 0x5678
We get a breakpoint hit! Let’s look back at the function definition.
Breakpoint 1, PyNumber_Add (v=0x7f6196567f00, w=0x7f619649a800) at ../Objects/abstract.c:957
It receives two pointers of the type PyObject *. Without looking into the PyObject struct too much yet, we’ll try to find our values somewhere in the memory pointed to by these arguments. We will use x/10wx v to examine 10 words (4-byte values, so a total of 40 bytes), in hex format, in the memory pointed to by v.
(gdb) x/10wx v
Our value, 0x1234, is at an offset of 24 bytes, that is, 6 words (4 bytes each).
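The same memory peek can be approximated from inside Python with ctypes, which is handy for double-checking the offset. This is a CPython-only sketch: it relies on id() returning the object's address and on 4-byte digits, both of which are implementation details. The helper name dump_words is made up, and unlike the gdb command it stops at the end of the object instead of reading a fixed 10 words.

```python
import ctypes
import sys


def dump_words(obj):
    """Return the object's own memory as a list of 4-byte words (CPython only)."""
    n_words = sys.getsizeof(obj) // 4          # stay inside the object
    array = (ctypes.c_uint32 * n_words).from_address(id(obj))
    return list(array)


value = int("1234", 16)                        # built at runtime: 0x1234
words = dump_words(value)

# Byte offsets of every word that holds our value.
offsets = [4 * index for index, word in enumerate(words) if word == 0x1234]
print([hex(word) for word in words], offsets)
```

On a typical 64-bit build I would expect the value to show up at byte offset 24, matching the gdb output, and int.__basicsize__ should report the same offset.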
Let’s print it directly for both arguments.
We need to cast v from PyObject * to unsigned int *, add the offset, and dereference the pointer (with *).
(gdb) p/x * ((unsigned int *) v + 6)
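For comparison, here is the same cast, offset and dereference written with ctypes, as a rough CPython-only equivalent of the gdb expression. The helper name read_uint_at_word is made up, and the word index is derived from int.__basicsize__ rather than hard-coded, on the assumption that it equals the offset of the first digit (6 words on a typical 64-bit build).

```python
import ctypes


def read_uint_at_word(obj, word_index):
    """Mimic gdb's  * ((unsigned int *) obj + word_index)  for a live object."""
    pointer = ctypes.cast(id(obj), ctypes.POINTER(ctypes.c_uint))
    return pointer[word_index]


v = int("1234", 16)
w = int("5678", 16)

# Offset of the first digit, expressed in unsigned-int-sized words.
digit_index = int.__basicsize__ // ctypes.sizeof(ctypes.c_uint)

v_digit = read_uint_at_word(v, digit_index)
w_digit = read_uint_at_word(w, digit_index)
print(hex(v_digit), hex(w_digit))
```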
Good, now let’s modify the result!
Here is the interesting part from the function:
PyObject *
We can set up a simple hook using breakpoint commands.
This feature allows you to run gdb commands whenever the debugger reaches a certain breakpoint.
We’ll combine this with a conditional breakpoint: if v and w both contain 2, write 5 to the returned result object.
Using Ctrl-x + a, we can switch to source code view and find the target line to set the breakpoint on. In my case, it is abstract.c:959.
(gdb) b abstract.c:959 if ((* ((unsigned int *) v + 6) == 2) && (* ((unsigned int *) w + 6) == 2))
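The "write 5 to the result object" step is just a memory write into the object's first digit. The sketch below mimics that single step with ctypes on a throwaway integer. It is not the gdb hook itself, only an illustration of the write. The helper name overwrite_first_digit is made up, and the code assumes CPython with 4-byte digits. It deliberately uses a freshly created large integer: pointing it at a small cached integer would change that value everywhere in the process.

```python
import ctypes


def overwrite_first_digit(obj, new_value):
    """Write new_value into the first digit of a CPython int object, in place."""
    digit = ctypes.c_uint32.from_address(id(obj) + int.__basicsize__)
    digit.value = new_value


result = int("1000")        # created at runtime, not a shared small-int object
overwrite_first_digit(result, 5)
print(result)               # the object that used to be 1000
```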
Let’s see what happens.
1 + 1
We are successfully intercepting number addition operations with gdb!
In the next part, we will look at implementing this hook from within Python, without the use of a debugger.