5

Does the Python Virtual Machine require a CPU to execute the bytecode? Is the bytecode converted into machine code before the CPU gets involved in the process?

    Can you clarify what you mean by "does it require"? Every program that runs on standard hardware requires a CPU, as the CPU is what executes the operating system and any software on it. Commented May 19, 2020 at 12:41

2 Answers

6

In order to run an application on any computer, its code must always be somehow converted to machine code and then be executed by the CPU. The question is rather when and how this happens.

Let me try and show you how Python effectively executes bytecode.

Compiler vs Interpreter

Imagine the CPU in your computer understands nothing but Latin. You want to send it a letter with detailed instructions or a request, but you do not speak Latin. So, you will engage a translator: someone who translates your "English" letter (or whatever language you use) to Latin for you.

Compiled languages like C or Rust take your entire letter, translate all of it to Latin and really polish it. The outcome is a translated letter that is highly poetic and uses sophisticated language. An interpreter like Python, on the other hand, translates one word or one sentence at a time; it is more like a live interpreter on the news who translates what someone says in a foreign language as they speak.

Bytecode

The full translation process from languages like C, Rust, or Python to machine code is quite complex and requires careful analysis of the original program code. To avoid having to analyse your program code over and over again, the Python interpreter does it just once and then generates bytecode, which is a very close representation of your Python code, but split up into its basic elements.
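As a small illustration of "analyse once, execute many times", the sketch below uses the built-in compile() and exec() functions. The source string and the names code_obj, namespace, and results are made up for this example.

```python
# Compile the source text once; the result is a code object holding bytecode.
source = "y = (x + 1) * (x - 1)"
code_obj = compile(source, "<example>", "exec")

# Run the same code object several times without analysing the source again.
results = []
for x in (2, 5, 10):
    namespace = {"x": x}          # the variables the code runs with
    exec(code_obj, namespace)     # executes the already-compiled bytecode
    results.append(namespace["y"])

print(results)   # prints [3, 24, 99]
```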

Let's take a look at a very simple Python function:

def f(x):
    y = (x + 1)*(x - 1)
    return y

The computation in this function comprises several calculations, which all have to be performed in the correct order. The bytecode reflects this:

    LOAD_VAR     x    # x+1
    LOAD_CONST   1
    ADD
    LOAD_VAR     x    # x-1
    LOAD_CONST   1
    SUBTRACT
    MULTIPLY          # ()*()
    STORE_VAR    y    # y = ...
    LOAD_VAR     y
    RETURN

Indeed, the bytecode in Python is usually a very close representation of the Python code itself, just broken up into pieces of 'atomic' simple operations.

Internally, each bytecode instruction has a numeric value that fits into a single byte, hence the name. For instance, LOAD_VAR = 124, LOAD_CONST = 100, ADD = 23, etc. (The instruction names used here are simplified; CPython calls them LOAD_FAST, BINARY_ADD, and so on, and the exact numbers depend on the Python version.) The local variables and constant values are also expressed through numbers. Thus, if we assign x = 01 and y = 02, the above code becomes:

  124,  01, 100,  01,  23, 124,  01, 100,  01,  
   24,  20, 125,  02, 124,  02,  83

Executing Bytecode

Below you will find a minimalistic interpreter for 'Python bytecode' that is capable of executing the function we defined at the beginning. The actual bytecode interpreter of Python is written in C and thus compiled to highly efficient machine code. But the principle is exactly the same.

It uses a stack to hold intermediate values. That is, the result of each operation is appended to a list. An operation that further processes these results takes them off the end of the list, does something (like adding them together), and then appends the result back to the list. For operations like subtraction or division you have to be careful to keep the operands in the right order.
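To make the stack discipline concrete, here is a hand-run of (x + 1)*(x - 1) using a plain Python list as the stack. The value x = 5 is an arbitrary choice for illustration.

```python
x = 5
stack = []

stack.append(x)                           # push x          -> [5]
stack.append(1)                           # push 1          -> [5, 1]
stack.append(stack.pop() + stack.pop())   # add             -> [6]

stack.append(x)                           # push x          -> [6, 5]
stack.append(1)                           # push 1          -> [6, 5, 1]
second = stack.pop()                      # popped first, but it is the right operand
first = stack.pop()                       # popped second, but it is the left operand
stack.append(first - second)              # subtract        -> [6, 4]

stack.append(stack.pop() * stack.pop())   # multiply        -> [24]
print(stack)                              # prints [24]
```

Addition and multiplication do not care about operand order, which is why only the subtraction needs the two named temporaries.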

It is convenient to arrange the bytecode into pairs of instructions and arguments. Some instructions (like ADD) do not have an argument, so we just use 0 in that case. But the code used here is still the bytecode presented above.
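The pairing can be done mechanically. The following is only a sketch for this toy example: the helper to_pairs and the set HAS_ARG are hypothetical names, and the assumption is that only the load and store instructions (100, 124, 125) are followed by an argument in the flat listing above.

```python
# Opcodes that are followed by an argument in the flat listing
# (assumption: only the load/store instructions of this toy example).
HAS_ARG = {100, 124, 125}

def to_pairs(flat):
    """Turn a flat list of numbers into (instruction, argument) pairs."""
    pairs = []
    i = 0
    while i < len(flat):
        instr = flat[i]
        if instr in HAS_ARG:
            pairs.append((instr, flat[i + 1]))   # the next number is the argument
            i += 2
        else:
            pairs.append((instr, 0))             # no argument: use 0 as a filler
            i += 1
    return pairs

flat_bytecode = [124, 1, 100, 1, 23, 124, 1, 100, 1,
                 24, 20, 125, 2, 124, 2, 83]
pairs = to_pairs(flat_bytecode)
print(pairs)
```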

def execute(bytecode, consts, vars):
    stack = []                            # holds intermediate values
    for (instr, arg) in bytecode:
        if instr == 20:                   # MULTIPLY
            stack.append(stack.pop() * stack.pop())
        elif instr == 23:                 # ADD
            stack.append(stack.pop() + stack.pop())
        elif instr == 24:                 # SUBTRACT (operand order matters)
            second = stack.pop()
            first  = stack.pop()
            stack.append(first - second)
        elif instr == 83:                 # RETURN
            return stack.pop()
        elif instr == 100:                # LOAD_CONST
            stack.append( consts[arg] )
        elif instr == 124:                # LOAD_VAR
            stack.append( vars[arg] )
        elif instr == 125:                # STORE_VAR
            vars[arg] = stack.pop()

my_bytecode = [
  (124, 1), (100, 1), (23, 0), (124, 1), (100, 1), 
   (24, 0), (20, 0), (125, 2), (124, 2),  (83, 0)
]
my_consts = [ None, 1 ]
my_vars   = [ None, 5, 0 ]   # slot 0 is unused; slot 1 holds x (here 5), slot 2 holds y
print(execute(my_bytecode, my_consts, my_vars))   # prints 24

You can inspect the real bytecode of a function, its constant values, and the order in which its local variables are defined (the latter two are tuples, not lists) using:

print(f.__code__.co_code)      # prints the bytecode
print(f.__code__.co_consts)    # prints (None, 1)
print(f.__code__.co_varnames)  # prints ('x', 'y')

A tad more convenient is to use the inspect and dis modules, of course.
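For example, the standard-library dis module can print a readable listing of the real bytecode. The exact instruction names and numbers vary between Python versions, so no output is shown here; the variable names is introduced only for this sketch.

```python
import dis

def f(x):
    y = (x + 1) * (x - 1)
    return y

# Print a human-readable listing of the function's bytecode.
dis.dis(f)

# Or walk over the instructions programmatically.
names = [ins.opname for ins in dis.get_instructions(f)]
print(names)
```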


10 Comments

Hey, does the PVM convert the bytecode into machine code, which the CPU then executes, OR is the bytecode executed on the PVM? Which is correct? Please answer.
@NirajRaut The devil is in the details. From a conceptual point of view, both are kind of correct, but saying that the bytecode is executed by the PVM describes better what is actually happening. The PVM executes the bytecode, but the CPU itself executes the PVM.
Hey, but the CPU executes the PVM first and converts it into machine code, and then the PVM executes the bytecode, right? Please tell me.
Does the Python virtual machine send machine code to the CPU, so that the CPU takes care of processing and displaying the results on screen, or does the PVM execute the bytecode and send only the result to the CPU so that the CPU can display it? Is there any interaction between the PVM and the CPU, and if so, what do they communicate? Do they communicate just to pass the machine code, or just to pass the result? Since the PVM is a virtual CPU, I am not sure whether it can display messages on screen without interacting with the CPU.
@SteveFreed The bytecode instructions themselves are not pushed onto the stack, only their operands (values). So, in x+1, both x and 1 are pushed to the stack, but the ADD instruction is just executed. That being said, CPython actually uses a number of clever tricks internally to ensure that it runs the same on all architectures.
1

The PVM is nothing but software that converts the bytecode to machine code for the given operating system. Hence, Python is called an interpreted language, with the PVM being the interpreter. To answer your question: yes, the code is eventually converted into machine code by the PVM. Read more here.

2 Comments

"eventually converted into python code": you may want to have this corrected.
Does the PVM even execute the machine code, or is it left to the OS?
