I'm working on improving Python 3.X support for PyFilesystem. It's an abstraction for filesystems. Each filesystem object has an open method that returns a file-like object.

The problem I'm facing is that the open method works like open on Python 2.X, but I would like it to work like io.open, which returns one of a number of binary or text mode streams.

What I could use is a way of taking a Python 2.X file-like object and returning an appropriate io stream object that reads/writes to the underlying file-like object (but handles buffering/unicode etc. if required).

I was thinking something like the following:

def make_stream(file_object, mode, buffering, encoding):
    # return an io instance

I can't see any straightforward way of doing that with the stdlib. But it strikes me as something the io module must be doing under the hood, since it's a software layer that provides the buffering/unicode functionality.

1 Answer

Python 2 includes the same io library (from 2.6 onwards).

Use from io import open to work the same across Python versions.

Your API should then offer an open() equivalent (called open() or make_stream()) that uses the io class library to provide the same functionality.
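To make the target behaviour concrete, here is a small sketch of what io.open hands back for different modes. The scratch file path is only there for illustration; the class names are those of the standard io module.

```python
from io import open  # available on Python 2.6+ and 3.x
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w", encoding="utf-8") as f:
    f.write(u"caf\u00e9")
    text_class = type(f).__name__      # TextIOWrapper

with open(path, "rb") as f:
    binary_class = type(f).__name__    # BufferedReader
    data = f.read()                    # undecoded UTF-8 bytes

with open(path, "rb", buffering=0) as f:
    raw_class = type(f).__name__       # FileIO (unbuffered raw stream)

with open(path, "r", encoding="utf-8") as f:
    text = f.read()                    # decoded unicode text
```

An open() equivalent for your own file-like objects would return the same buffered and text classes, with only the raw layer swapped out.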

All that you need to do is create a class that implements the io.RawIOBase ABC, then use the other classes provided by the library to add buffering and text handling as needed:

import io

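class MyFileObjectWrapper(io.RawIOBase):
    def __init__(self, fileobj):
        super(MyFileObjectWrapper, self).__init__()
        self._fileobj = fileobj  # the wrapped file-like object

    def close(self):
        if not self.closed:
            self._fileobj.close()  # close the underlying file
        # 'closed' is a read-only property; the base class close() sets it
        super(MyFileObjectWrapper, self).close()

    # ... etc. for what is needed (e.g. readable(), readinto(), write())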

def open(fileobj, mode='r', buffering=-1, encoding=None, errors=None, newline=None):
    # Mode parsing and validation adapted from Modules/_io/_iomodule.c
    reading = writing = appending = updating = False
    text = binary = universal = False

    for c in mode:
        if c == 'r':
            reading = True
        elif c == 'w':
            writing = True
        elif c == 'a':
            appending = True
        elif c == '+':
            updating = True
        elif c == 't':
            text = True
        elif c == 'b':
            binary = True
        elif c == 'U':
            universal = reading = True
        else:
            raise ValueError('invalid mode: {0!r}'.format(mode))

    rawmode = []
    if reading:   rawmode.append('r')
    if writing:   rawmode.append('w')
    if appending: rawmode.append('a')
    if updating:  rawmode.append('+')
    rawmode = ''.join(rawmode)

    if universal and (writing or appending):
        raise ValueError("can't use U and writing mode at once")

    if text and binary:
        raise ValueError("can't have text and binary mode at once")

    if reading + writing + appending != 1:
        raise ValueError("must have exactly one of read/write/append mode")

    if binary:
        if encoding is not None:
            raise ValueError("binary mode doesn't take an encoding argument")
        if errors is not None:
            raise ValueError("binary mode doesn't take an errors argument")
        if newline is not None:
            raise ValueError("binary mode doesn't take a newline argument")

    raw = MyFileObjectWrapper(fileobj)

    if buffering == 1:
        buffering = -1
        line_buffering = True
    else:
        line_buffering = False

    if buffering < 0:
        buffering = io.DEFAULT_BUFFER_SIZE

    if not buffering:
        if not binary:
            raise ValueError("can't have unbuffered text I/O")

        return raw

    if updating:
        buffered_class = io.BufferedRandom
    elif writing or appending:
        buffered_class = io.BufferedWriter
    elif reading:
        buffered_class = io.BufferedReader

    buffer = buffered_class(raw, buffering)

    if binary:
        return buffer

    return io.TextIOWrapper(buffer, encoding, errors, newline, line_buffering)

The above code is mostly adapted from the Modules/_io/_iomodule.c io_open function, but with the raw file object replaced by the MyFileObjectWrapper subclass of the io.RawIOBase ABC.
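For the raw wrapper itself, the following is a minimal sketch rather than a definitive implementation. It assumes the underlying object offers read(size), write(data) and close(); RawWrapper is a hypothetical name, and io.BytesIO merely stands in for one of your file-like objects. It is written against Python 3, so Python 2 may need small adjustments (for example in how the buffer passed to write() is converted to bytes).

```python
import io


class RawWrapper(io.RawIOBase):
    """Hypothetical adapter exposing a plain file-like object as a raw stream."""

    def __init__(self, fileobj):
        super(RawWrapper, self).__init__()
        self._f = fileobj

    def readable(self):
        return hasattr(self._f, "read")

    def writable(self):
        return hasattr(self._f, "write")

    def readinto(self, b):
        # Fill the caller's buffer with at most len(b) bytes; 0 means EOF.
        data = self._f.read(len(b))
        n = len(data)
        b[:n] = data
        return n

    def write(self, b):
        # The buffered layer may pass a memoryview, so copy it to bytes.
        self._f.write(bytes(b))
        return len(b)

    def close(self):
        if not self.closed:
            super(RawWrapper, self).close()  # marks this stream as closed
            self._f.close()


# Reading: raw wrapper -> buffering -> text decoding.
source = io.BytesIO(b"caf\xc3\xa9\nsecond line\n")  # stand-in file-like object
reader = io.TextIOWrapper(io.BufferedReader(RawWrapper(source)), encoding="utf-8")
lines = reader.readlines()

# Writing: text encoding -> buffering -> raw wrapper.
sink = io.BytesIO()
writer = io.TextIOWrapper(io.BufferedWriter(RawWrapper(sink)), encoding="utf-8")
writer.write(u"h\u00e9llo")
writer.flush()
written = sink.getvalue()
```

Only readinto() and write() touch the underlying object; RawIOBase derives read() and readall() from readinto(), and the buffered and text classes supply everything else.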

6 Comments

Yes, I know. From Python 2.6 onwards. But I still need to provide the io interface for file-like objects. These aren't actual file objects. The data comes from a variety of different sources.
@WillMcGugan: Right, I misunderstood you then. The io library open function is just a factory. It returns instances of a series of classes. There is nothing magical about that; you can implement the same thing in your own library.
@WillMcGugan: You can use the io library abstract base classes as a template, of course, to make your objects match expectations. You can also reuse the io buffer and text wrapper classes, if you provide your own raw file object implementation.
I had considered that, but it looks like a lot of work to provide the shim classes for each combination of mode/buffering. Guess I was hoping for something simpler.
You can also refer to the 2.6 implementation. It was ported to C in 2.7.