59

From the documentation for the standard library json module:

json.dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)

Serialize obj as a JSON formatted stream to fp (a .write()-supporting file-like object) using this conversion table.

What exactly does this description mean? What object types are ".write()-supporting", and "file-like"?

6
  • 5
    Well, you answered your question yourself: really any object that supports proper invocations of either read(), write(), or both, is considered to be a file-like object. It really can be any object you like - the joy of duck typing. Commented Dec 5, 2010 at 15:43
  • I don't think there's any official standard. Most interfaces should specify exactly the functionality they require. If you want to know if some other thing supports what is needed, you'll have to look at the docs for it or read its source code. Commented Dec 5, 2010 at 18:46
  • 3
    "File-like object" is, in fact, precisely defined. Commented Nov 14, 2019 at 16:21
  • In more recent versions of the documentation, this passage has been updated for json.load not to mention "file-like objects": "Deserialize fp (a .read()-supporting text file or binary file containing a JSON document) to a Python object using this conversion table." However, the json.dump description still mentions the concept. The documentation also hasn't used the simplejson name in approximately forever. Commented Jan 19, 2023 at 6:24
  • 1
    I edited the question to show an example from the current documentation that does mention (and link) the concept, and to remove a section about motivation for the question (since it doesn't make sense with the changed example). If you are here because you are trying to figure out how to handle JSON from a web request (or close a question as a duplicate), like with OP's original motivation, please refer to the canonical: How can I parse (read) and use JSON?. Commented Jan 19, 2023 at 6:39

7 Answers 7

33

From the glossary:

file-like object

A synonym for file object

and a file object is

file object

An object exposing a file-oriented API (with methods such as read() or write()) to an underlying resource. Depending on the way it was created, a file object can mediate access to a real on-disk file or to another type of storage or communication device (for example standard input/output, in-memory buffers, sockets, pipes, etc.). File objects are also called file-like objects or streams.

There are actually three categories of file objects: raw binary files, buffered binary files and text files. Their interfaces are defined in the io module. The canonical way to create a file object is by using the open() function.

Sign up to request clarification or add additional context in comments.

4 Comments

"with methods such as" ... I suppose that really brings us back to the original question: what exactly is this set of methods? Is e.g. "mode" a required attribute of such an object, or not?
@KlaasvanSchelven That, unfortunately, may depend on the underlying operating system or file system.
I suppose the question is rhetorical at this point ;-) The point being that we have different opinions on the meaning of "precisely defined"
"That, unfortunately, may depend on the underlying operating system or file system." Then it's not an API. Or at least not a single API. An API is a rigid interface; it's not conditional.
24

The IO Class Hierarchy section in the IO documentation contains a table listing the built-in and stub methods for the different types of file-like objects.

Basically, there is a hierarchy of abstract base classes:

To implement a file-like object, you would subclass one of the three descendants of IOBase, but not IOBase itself. See this answer for attempting to determine which of these a given file-like object is.

Each of these classes provides various stub methods and mixins:

Class Stub Methods Mixins
IOBase fileno, seek, truncate close, closed, __enter__, __exit__, flush, isatty, __iter__, __next__, readable, readline, readlines, seekable, tell, writable, writelines
RawIOBase readinto, write read, readall
BufferedIOBase detach, read, read1, write readinto, readinto1
TextIOBase detach, read, readline, write encoding, errors, newlines

The documentation for these methods can be found in the documentation for the classes, linked above.

1 Comment

18

In Python, a file object is an object exposing an API having methods for performing operations typically done on files, such as read() or write().

In the question's example: simplejson.load(fp, ...), the object passed as fp is only required to have a read() method, callable in the same way as a read() on a file (i.e. accepting an optional parameter size and returning either a str or a bytes object).

This does not need to be a real file, though, as long as it has a read() method.

A file-like object is just a synonym for file-object. See Python Glossary.

Comments

14

File-like objects are mainly StringIO objects, connected sockets and, well, actual file objects.

If everything goes well, urllib.urlopen() returns a file-like object supporting the necessary methods.

1 Comment

the necessary methods... this is the most important aspect. Which methods?
7

This is the API for all file-like objects in the Python standard library (as of 3.10.5).

# All file-like objects inherit the IOBase interface:
# Documented at https://docs.python.org/3/library/io.html#io.IOBase .
    close() -> None
    closed() -> bool # Implemented as @property `closed`
    fileno() -> int
    flush() -> None
    isatty() -> bool
    readable() -> bool
    readline(size: int = -1) -> Union[str, bytes]
    readlines(hint: Union[int, None] = None) -> list
    seek(pos: int, whence: int = io.SEEK_SET) -> int # SEEK_SET is 0
    seekable() -> bool
    tell() -> int
    truncate(pos: int = None) -> int # The parameter is named "size" in class FileIO
    writable() -> bool
    writelines(lines: list) -> None
    __del__() -> None
# Documented at https://docs.python.org/3/library/io.html#class-hierarchy .
    __enter__()
    __exit__(*args) -> None:
    __iter__()
    __next__() -> Union[str, bytes]
# Documented in paragraph at https://docs.python.org/3/library/io.html#io.IOBase .
# Note that while the documentation claims that the method signatures 
# of `read` and `write` vary, all file-like objects included in the Python 
# Standard Library have the following exact method signatures for `read` and `write`:
    read(size: int = -1) -> Union[str, bytes]
    write(b: Union[str, bytes]) -> int # The parameter is named "s" in TextIOBase

Specific file-like objects may implement more than this, but this is the subset of methods that are common to ALL file-like objects.

1 Comment

This is true for the standard library only. In many places in the Python world (even on docs.python.org) "file-like object" is a far looser notion. In many cases, having a read(size: int) method will be sufficient.
3

There is no "exactly" for file-likeness

The notion of an object being "file-like" is one of the most disappointing spots of Python today, in my view.

Remember Python 2?

The idea goes back to Python 2 (and probably Python 1) where duck-typing was the norm and type hints did not even exist.

In such a setting, it makes perfect sense if the meaning of "file-like" is strongly context-dependent: All the caller needs is an object that will return some bytes data (or string data) upon a call to read()? Then any such object can be considered file-like for this particular caller.

Other callers will want a write() operation that somehow sensibly consumes the data they hand over to it. Still others may want a seek() as well, and perhaps a read() in addition. And so on. If these callers get an object with those methods, they are happy and do not care at all what the actual type of those objects is.

Today's Python 3 use ticks differently

Modern Python relies a lot more on knowing about types and so we have started to often declare which ones are used where. And rightly so, because the more you make use of third-party components, the more difficult it becomes to manage duck typing successfully. Type hints are concise and informative documentation in such a setting.

But now we have conflicting explanations of what "file-like" is: old ones that are minimal and talk about methods only, newer ones that make many more requirements and may even mention specific types.

Not nice. But hey: Python is more than 30 years old. Given that, its number of warts is impressively low!
"file-like" is definitely one of them.

You can roll your own explicit definition of 'file-like'

The stdlib's typing module has the notion of Protocol by which you can define a type in (static) duck-typing fashion: As a list of method signatures.

If you like, you can define a Protocol (or several) for your own code's notion(s) of file-likeness. The bare minimum, for an object supporting read() in text mode, might look like so:

class FileLike(typing.Protocol):
    def read(self) -> str:
        ...

You can then use it in type hints:

def process_file(input: FileLike):
    ...

Comments

-1

simplejson has the calls loads and dumps that consumes and produce strings instead of file like objects.

This link has an example in the context of StringIO and simplejson for both file-like and string objects.

http://svn.red-bean.com/bob/simplejson/tags/simplejson-1.3/docs/index.html

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.