I'm following Chapter 7.5 of the CPython internals guide, which covers adding an operator and uses the "almost equal" operator (~=) as its example. I followed the prescribed steps and did some additional troubleshooting, but Python still does not recognize the new operator. Here is a detailed breakdown of what I did after successfully installing and building the CPython project:

  1. Grammar File Changes:

    • Updated Grammar/Grammar on line 141 to include ~=:
      comp_op: '<'|'>'|'=='|'>='|'<='|'<>'|'!='|'in'|'not' 'in'|'is'|'is' 'not'|'~='
      
    • Modified Grammar/python.gram on line 413 by adding | ale_bitwise_or under the compare_op_bitwise_or_pair[CmpopExprPair*] section.
    • On line 425, added a new rule:
      ale_bitwise_or[CmpopExprPair*]: '~=' a=bitwise_or { _PyPegen_cmpop_expr_pair(p, AlE, a) }
      
  2. Token Definitions:

    • Added TILDEEQUAL '~=' to Grammar/Tokens on line 56.

  3. AST Definitions:

    • Updated Parser/Python.asdl on line 102 to include AlE:
      cmpop = Eq | NotEq | Lt | LtE | Gt | GtE | Is | IsNot | In | NotIn | AlE
      
  4. Manual Code Additions:

    • I expected Parser/token.c to be regenerated automatically, but it was not, so I manually added:
      case '~':
          switch (c2) {
              case '=': return TILDEEQUAL;
          }
      break;
      
    • Defined TILDEEQUAL manually in token.h, because it was not found there either.
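
Since steps 2 and 4 rely on the generated token files matching Grammar/Tokens, one way to confirm whether regeneration really ran is to compare the two. The following is a minimal sketch, not part of the CPython tooling: the function names (parse_grammar_tokens, defined_in_header, stale_tokens) are hypothetical, and the sample strings and token numbers are made up for illustration. It assumes only the layout of Grammar/Tokens (a name, optionally followed by a quoted literal, with # comments) and the `#define NAME number` lines in token.h.

```python
import re


def parse_grammar_tokens(text):
    """Map token names to their literal (or None) from Grammar/Tokens-style text."""
    tokens = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        literal = fields[1].strip("'") if len(fields) > 1 else None
        tokens[fields[0]] = literal
    return tokens


def defined_in_header(text):
    """Collect the names that appear in '#define NAME <number>' lines."""
    return set(re.findall(r"^#define\s+([A-Z_]+)\s+\d+", text, flags=re.MULTILINE))


def stale_tokens(grammar_text, header_text):
    """Names declared in Grammar/Tokens but absent from the generated header."""
    declared = parse_grammar_tokens(grammar_text)
    defined = defined_in_header(header_text)
    return sorted(name for name in declared if name not in defined)


# Illustrative input only; the numbers are not the real token values.
grammar_sample = "# operators\nTILDE '~'\nTILDEEQUAL '~='\nOP\n"
header_sample = "#define TILDE 31\n#define OP 54\n"

print(stale_tokens(grammar_sample, header_sample))  # ['TILDEEQUAL']
```

Running the same functions on the real Grammar/Tokens and token.h contents before any manual edits would show whether the regeneration step picked up the new token. A non-empty result suggests the generator did not run.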

After rebuilding the project multiple times and even re-cloning, I continue to receive a syntax error when running the following Python code:

import ast
ast.parse("1 ~= 1")
1 ~= 1

The error is:

SyntaxError: invalid syntax

I'm using Visual Studio for building the project solution, editing in VSCode, and running /PCbuild/build.bat --regen from the VSCode terminal. My environment is Python 3.9.19+ (heads/3.9-dirty:40d77b9367) on a 64-bit debug build.

Question:

  1. What might I be missing in the process that prevents Python from recognizing the new ~= operator syntax?
  • It's probably colliding with the ~ integer operator with higher precedence. Try an arbitrary other symbol. Commented Apr 19, 2024 at 17:31
  • @thatotherguy: Generally tokenizers that actually recognize a token will prefer the longer token in cases of ambiguity (they're greedy), there is no precedence between operators to work out. The fact the OP had to manually modify the generated token.c/token.h indicates they probably failed to fully define the token, or failed to rerun the code that generates the tokenizing code from the definition (which may not actually be run in a normal build, since it rarely changes, and they may just check in the generated files so the common build case can avoid that work). Commented Apr 19, 2024 at 17:43
  • @ShadowRanger: What files must I execute in order to generate the proper token.c? Shouldn't this happen with PCBuild/build.bat --regen? I tried building the solution to see if that fixed anything, hoping that it would build some things that might've been necessary for the generation. With no success ofc, these things seem separate. Commented Apr 19, 2024 at 17:50
  • @thatotherguy: I've just tried the steps: change Grammar, python.gram, Tokens and Python.asdl and then build.bat --regen. This time I used the '$' operator (i.e. "$="). It still doesn't automatically generate token.c, it's supposed to be added to the PyToken_TwoChars function. So something is going wrong there... Commented Apr 19, 2024 at 17:52
  • I really know nothing about builds on Windows, but there's one thing definitely missing from your post. Please add build.bat --regen output (and, ideally, try running it from PCBuild directory as CWD, not as PCBuild/build.bat). If I guess correctly and regen.vcxproj is some terribly-looking analogue of CMake input file or Makefile and <Warning> tags are emitted as CLI output, you should be getting some output from that command. Commented Apr 20, 2024 at 0:33
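
The point in the comments about greedy tokenizing can be checked from Python with the standard tokenize module. This is a small sketch run on a stock, unmodified interpreter; the helper name operator_tokens is hypothetical. A known multi-character operator such as >>= comes back as a single token, while ~= is split into ~ and =, because the stock tokenizer has no such token to prefer.

```python
import io
import tokenize


def operator_tokens(source):
    """Return the operator strings the tokenizer finds in the given source."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.OP
    ]


# A defined three-character operator is matched whole (longest match wins).
print(operator_tokens("a >>= 1\n"))

# On an unmodified interpreter '~=' is not a token, so it is split in two.
print(operator_tokens("a ~= 1\n"))
```

If the second call still reports two separate operators on the custom build, the new token has not reached the tokenizer, so the problem is in token generation and not in operator precedence.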
