3

I am refactoring C++ code using the Python bindings of the Clang compiler (cindex). Using that, I analyze the AST and prepare changes. I end up with a list of operations similar to the following:

DELETE line 10
INSERT line 5 column 32: <<  "tofu"
REPLACE from line 31 colum 6 to line 33 column 82 with: std::cout << "Thanks SO"
...

My question is how to turn these into actual file changes.

Doing it directly with python seems tedious: patches need to be applied in the right order and checked for consistency. It looks quite hard and error-prone.

I also can’t find a good library to help (clang does have something called a Rewriter, but it isn't wrapped in Python. I'd really like to avoid C++ for refactoring if possible).

Maybe an idea could be to generate patches and apply them with git, maybe? But even that seems a bit tedious.

Any ideas?

3
  • 1
    What you have exhibited is a string of patches; why can't you just apply them in order? [Maybe you are making several overlapping patches to the AST?] Would it be more useful to just modify the AST for each change as you make them, then spit out the revised AST? Then you can't have an out-of-order patching problem. Commented Aug 10, 2016 at 22:27
  • You have to be careful about order as changes can invalidate line numbers. I agree modifying the AST would be the best option, unfortunately cindex does not support it. Commented Aug 10, 2016 at 22:33
  • 1
    If the tool you have won't do the job well, you can choose another tool. Check my bio for a tool (DMS) that can parse C++, build and modify C++ ASTs (we did that avoid exactly silly patching problems when composing changes) and then regenerate valid source code. It isn't in Python, though. It isn't in C++ either; DMS is an array of DSLs that cooperate to let you specify transformations. Commented Aug 30, 2016 at 4:00

1 Answer 1

3

So I rolled out my own. The code is almost certainly buggy and not very pretty, but I'm posting it in the hope it might help someone until a better solution is found.

class PatchRecord(object):
    """ Record patches, validate them, order them, and apply them """

    def __init__(self):
        # output of readlines for each patched file
        self.lines = {}
        # list of patches for each patched file
        self.patches = {}

    class Patch(object):
        """ Abstract base class for editing operations """

        def __init__(self, filename, start, end):
            self.filename = filename
            self.start = start
            self.end = end

        def __repr__(self):
            return "{op}: {filename} {start}/{end} {what}".format(
                op=self.__class__.__name__.upper(),
                filename=self.filename,
                start=format_place(self.start),
                end=format_place(self.end),
                what=getattr(self, "what", ""))

        def apply(self, lines):
            print "Warning: applying no-op patch"

    class Delete(Patch):

        def __init__(self, filename, extent):
            super(PatchRecord.Delete, self).__init__(
                filename, extent.start, extent.end)
            print "DELETE: {file} {extent}".format(file=self.filename,
                                                   extent=format_extent(extent))

        def apply(self, lines):
            lines[self.start.line - 1:self.end.line] = [
                lines[self.start.line - 1][:self.start.column - 1] +
                lines[self.end.line - 1][self.end.column:]]

    class Insert(Patch):

        def __init__(self, filename, start, what):
            super(PatchRecord.Insert, self).__init__(filename, start, start)
            self.what = what
            print "INSERT {where} {what}".format(what=what, where=format_place(self.start))

        def apply(self, lines):
            line = lines[self.start.line - 1]
            lines[self.start.line - 1] = "%s%s%s" % (
                line[:self.start.column],
                self.what,
                line[self.start.column:])

    class Replace(Patch):

        def __init__(self, filename, extent, what):
            super(PatchRecord.Replace, self).__init__(
                filename, extent.start, extent.end)
            self.what = what
            print "REPLACE: {where} {what}".format(what=what,
                                                   where=format_extent(extent))

        def apply(self, lines):
            lines[self.start.line - 1:self.end.line] = [
                lines[self.start.line - 1][:self.start.column - 1] +
                self.what +
                lines[self.end.line - 1][self.end.column - 1:]]

    # Convenience functions for creating patches
    def delete(self, filename, extent):
        self.patches[filename] = self.patches.get(
            filename, []) + [self.Delete(filename, extent)]

    def insert(self, filename, where, what):
        self.patches[filename] = self.patches.get(
            filename, []) + [self.Insert(filename, where, what)]

    def replace(self, filename, extent, what):
        self.patches[filename] = self.patches.get(
            filename, []) + [self.Replace(filename, extent, what)]

    def _pos_to_tuple(self, position):
        """ Convert a source location to a tuple for use as a sorting key """
        return (position.line, position.column)

    def sort(self, filename):
        """ Sort patches by extent start """
        self.patches[filename].sort(key=lambda p: self._pos_to_tuple(p.start))

    def validate(self, filename):
        """Try to insure patches are consistent"""
        print "Checking patches for %s" % filename
        self.sort(filename)
        previous = self.patches[filename][0]
        for p in self.patches[filename][1:]:
            assert(self._pos_to_tuple(p.start) >
                   self._pos_to_tuple(previous.start))

    def _apply(self, filename):
        self.sort(filename)
        lines = self._getlines(filename)
        for p in reversed(self.patches[filename]):
            print p
            p.apply(lines)

    def _getlines(self, filename):
        """ Get source file lines for editing """
        if not filename in self.lines:
            with open(filename) as f:
                self.lines[filename] = f.readlines()
        return self.lines[filename]

    def apply(self):
        for filename in self.patches:
            self.validate(filename)
            self._apply(filename)
            # with open(filename+".patched","w") as output:
            with open(filename, "w") as output:
                output.write("".join(self._getlines(filename)))

Just create a PatchRecord object, add changes using the create, replace and delete methods, and apply them with apply when you're ready.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.