Brian is young, Brian is happy-go-lucky, Brian does not back up his source code.
So Brian lost the source code of his addition program (yes, he is not very smart to need a program for additions!). Only add.pyc, the compiled version, remains.

Image: Maggie Smith / FreeDigitalPhotos.net

How the Python compilation process works

When you invoke a Python program (by running python ./myprogram.py, for example), the Python interpreter loads the file content into memory and compiles it:

  • first, it transforms the source into an execution tree called the Abstract Syntax Tree (AST)
  • then it walks this execution tree to generate Python bytecode (this is not machine code: it needs to be executed by the Python interpreter)
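The two steps above can be reproduced by hand with the standard library. This is a minimal sketch written for Python 3 (the interactive examples in this post use Python 2); the file name "<example>" and the variable names are only illustrative.

```python
import ast

source = "add = lambda a, b: a + b"

# Step 1: source text -> Abstract Syntax Tree
tree = ast.parse(source)

# Step 2: AST -> code object holding the Python bytecode
code = compile(tree, "<example>", "exec")

# The interpreter can then execute the bytecode
namespace = {}
exec(code, namespace)
result = namespace["add"](2, 3)
print(result)
```

The raw bytecode is available as code.co_code, but its exact content depends on the Python version.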

Coming back to Brian's program, it could look like the following:

  1. the following source program
    >>> add = lambda a, b: a+b
    >>> print add
    <function <lambda> at 0xb74ee764>
    
  2. will result in this bytecode
    >>> add.func_code.co_code
    '|\x00\x00|\x01\x00\x17S'
    

    Here is the same in a human-readable form:

    >>> import dis
    >>> dis.dis(add)
      1           0 LOAD_FAST                0 (a)
                  3 LOAD_FAST                1 (b)
                  6 BINARY_ADD
                  7 RETURN_VALUE
    

    About Python bytecode: it is made of opcodes (LOAD_FAST, BINARY_ADD, ...) and their arguments, much like assembly language.
    You can find a Python bytecode reference at http://docs.python.org/library/dis.html#python-bytecode-instructions.
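To look at opcodes and their arguments programmatically rather than as printed text, the dis module can also yield them one by one. The sketch below assumes Python 3.4 or later, where dis.get_instructions exists. The opcode names it prints vary between Python versions, so they may not match the Python 2 listing above.

```python
import dis

add = lambda a, b: a + b

# Each instruction carries an offset, an opcode name and a decoded argument
names = []
for instr in dis.get_instructions(add):
    names.append(instr.opname)
    print(instr.offset, instr.opname, instr.argrepr)
```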

Where does Reblok fit in?

Reblok can take a Python binary file (with a .pyc or .pyo extension) or a compiled object and rebuild the corresponding execution tree. To do so, Reblok interprets the bytecode as the Python interpreter would but, instead of executing the instructions, it regenerates the execution tree. Our previous bytecode gives the following result:

0 LOAD_FAST                0 (a)              =>  (VAR, a)
3 LOAD_FAST                1 (b)              =>  (VAR, b)
6 BINARY_ADD                                  =>  (ADD, (VAR, a), (VAR, b))
7 RETURN_VALUE                                =>  (RET, (ADD, (VAR, a), (VAR, b)))

resulting in:

(RET,
 (ADD,
  (VAR, a),
  (VAR, b)
 )
)

Reblok opcode documentation can be found at http://devedge.bour.cc/resources/reblok/doc/sources/ast.html
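The idea of interpreting bytecode without executing it can be illustrated with a toy stack machine. This is only a sketch of the principle, not Reblok's actual code: the instruction list is a hand-written copy of the listing above, and the function name rebuild_tree and the string tags "VAR", "ADD" and "RET" are stand-ins chosen for this example.

```python
# Hand-written copy of the bytecode listing: (opcode name, argument) pairs
INSTRUCTIONS = [
    ("LOAD_FAST", "a"),
    ("LOAD_FAST", "b"),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]

def rebuild_tree(instructions):
    """Walk the instructions, pushing tree nodes instead of runtime values."""
    stack = []
    for opname, arg in instructions:
        if opname == "LOAD_FAST":
            # A real interpreter would push the variable's value
            stack.append(("VAR", arg))
        elif opname == "BINARY_ADD":
            # Pop both operands and push the node describing their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(("ADD", left, right))
        elif opname == "RETURN_VALUE":
            return ("RET", stack.pop())
        else:
            raise NotImplementedError(opname)
    raise ValueError("no RETURN_VALUE found")

tree = rebuild_tree(INSTRUCTIONS)
print(tree)
```

The stack holds descriptions of values rather than values, which is why the final node mirrors the nested tree shown above.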

Brian is safe

Once that is done, Brian only needs to walk through the Rebloked execution tree to recreate his program's source code:

# Assumes `opcodes` is Reblok's opcode module and `AST` is the tree Reblok produced.
def do_opcode(instr):
    # Each node is a tuple: (opcode, operand, ...)
    if instr[0] == opcodes.RET:
        return "return " + do_opcode(instr[1]) + "\n"
    elif instr[0] == opcodes.ADD:
        return do_opcode(instr[1]) + " + " + do_opcode(instr[2])
    elif instr[0] == opcodes.VAR:
        return instr[1]

print(do_opcode(AST))
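A self-contained variant of the walker above makes it possible to check the round trip by compiling the regenerated source and calling it. This is a sketch under assumptions: it uses plain string tags instead of Reblok's opcodes constants, the function name to_source is made up for this example, and the def add(a, b) wrapper is added by hand since the tree only describes the function body.

```python
def to_source(node):
    """Turn a (tag, operands...) tuple back into Python source text."""
    kind = node[0]
    if kind == "RET":
        return "return " + to_source(node[1])
    if kind == "ADD":
        # Parentheses keep nested expressions unambiguous
        return "(" + to_source(node[1]) + " + " + to_source(node[2]) + ")"
    if kind == "VAR":
        return node[1]
    raise NotImplementedError(kind)

tree = ("RET", ("ADD", ("VAR", "a"), ("VAR", "b")))
body = to_source(tree)

# Wrap the regenerated body in a function and run it to verify the result
source = "def add(a, b):\n    " + body + "\n"
namespace = {}
exec(source, namespace)
print(source)
print(namespace["add"](2, 3))
```

Always parenthesizing binary operations produces noisier output than the original source, but it avoids having to track operator precedence.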

Rebuilding a program with loops, conditions and classes is only slightly more difficult than this :)

I'll conclude with a fun little game. Attached to this post is a compiled Python program embedding a secret passphrase.
It's up to you to decode it and find the passphrase. Post your discoveries in the comments.
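As a possible starting point, here is a sketch of how a code object can be pulled out of a .pyc file with the standard library. It builds its own add.pyc in a temporary directory instead of using the attached file. The 16-byte header is an assumption that holds for CPython 3.7 and later; older versions use a shorter header, and a .pyc can generally only be loaded by the Python version that produced it.

```python
import marshal
import os
import py_compile
import tempfile
import types

# Build a small .pyc to experiment with
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "add.py")
with open(src_path, "w") as handle:
    handle.write("add = lambda a, b: a + b\n")
pyc_path = py_compile.compile(src_path, cfile=os.path.join(workdir, "add.pyc"))

HEADER_SIZE = 16  # assumption: CPython 3.7+ (magic number, flags, mtime/hash, size)

with open(pyc_path, "rb") as handle:
    handle.read(HEADER_SIZE)      # skip the header
    code = marshal.load(handle)   # the rest is a marshalled code object

# The recovered code object can be executed or disassembled
namespace = {}
exec(code, namespace)
result = namespace["add"](2, 3)
print(result)
```

From there, dis.dis(code) shows the bytecode of the recovered module.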