16

In Python 3.4, is it possible to open an SQLite3 database from an io.BytesIO stream?

Something akin to:

with open("temp.db", "rb") as openf:
    byte_stream = io.BytesIO(openf.read())
sqlite3.connect(byte_stream)

The short story is: I have a stream (byte_stream) that is the sqlite database file. I can't do the following for security reasons (can't create an unencrypted file):

with open("temp.db", "wb") as openf:
    openf.write(byte_stream.getvalue())  # write() needs bytes, not the BytesIO object
sqlite3.connect("temp.db")

Is there some lower-level API for sqlite3 that I haven't been able to find? I assume that sqlite3.connect simply calls open() at some point and opens the file as a byte stream anyway. I'm simply trying to skip that open() step.

4
  • How did you get the byte stream? Commented Sep 17, 2015 at 1:43
  • The SQLite3 python bindings use the SQLite3 C library, and likely the sqlite3_open_v2 function, which takes a file name (or a VFS URL, but that's pretty advanced and IDK if Python exposes that API). Commented Sep 17, 2015 at 3:29
  • @ColonelThirtyTwo is correct: if you check the implementation, open() is never actually called. See Modules/_sqlite/connection.c:102 for the actual implementation (in C). Commented Sep 17, 2015 at 4:02
  • Yeah, I was looking around at the C implementation, trying to see if there was some way to do it. Didn't find anything at first glance, but I was hoping someone else knew something. As for the VFS URL, that's exposed but not really documented. See docs.python.org/3.5/library/sqlite3.html#sqlite3.connect (I think.) Commented Sep 17, 2015 at 4:52

4 Answers

8

Since Python 3.11, sqlite3 has a method called deserialize on the connection object which takes the bytes of a database and deserializes it. You can use it for your purpose like so:

import sqlite3


with open("temp.db", "rb") as openf:
    db_bytes = openf.read()

conn = sqlite3.connect(":memory:")
conn.deserialize(db_bytes)

Note that as mentioned in the documentation:

This method is only available if the underlying SQLite library has the deserialize API.


1 Comment

This is the best answer. Worked perfectly for me.
6

This is not possible with Python's sqlite3 module in versions before 3.11 (which added Connection.deserialize).

If you were using APSW, you could try writing your own virtual file system.

3 Comments

I was afraid of that. What about creating an in-memory database sqlite3.connect(':memory:') and then somehow replacing that with the data from the stream? Or importing the data from the stream? Is something like that possible?
No, you cannot get at the underlying bytes of an in-memory database.
The link is no longer valid.
3

It's not quite opening it so you can run SQL queries, but you can use https://github.com/uktrade/stream-sqlite to get at all the rows in the file.

If you have an io.BytesIO instance, you can use iter to convert it to an iterable of bytes that stream-sqlite needs:

import io

import httpx
from stream_sqlite import stream_sqlite

url = 'https://www.parlgov.org/data/parlgov-development.db'
byte_stream = io.BytesIO(httpx.get(url).read())
bytes_iter = iter(lambda: byte_stream.read(4096), b'')

for table_name, pragma_table_info, rows in stream_sqlite(bytes_iter, max_buffer_size=1_048_576):
    for row in rows:
        print(row)

(Disclaimer: I was heavily involved in the development of stream-sqlite)
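
The iter(callable, sentinel) idiom used above can be seen in isolation with only the standard library. A small sketch, with a made-up 10-byte payload and a chunk size of 4 chosen just to make the chunking visible:

```python
import io

# Stand-in payload; a real SQLite file would be read the same way
byte_stream = io.BytesIO(b"abcdefghij")

# iter(callable, sentinel) keeps calling the lambda until it returns b"",
# which is what read() gives back once the stream is exhausted
chunks = list(iter(lambda: byte_stream.read(4), b""))
print(chunks)  # [b'abcd', b'efgh', b'ij']
```

The last chunk may be shorter than the requested size, and the iterator can only be consumed once because it advances the stream position.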


0

You can use SQLite's sqlite3_deserialize API, which APSW exposes as Connection.deserialize and which accepts the serialized bytes of a SQLite database:

import apsw
import httpx

url = "https://data.api.trade.gov.uk/v1/datasets/uk-trade-quotas/versions/v1.0.366/data?format=sqlite"

with apsw.Connection(':memory:') as db:
    # deserialize is used to replace a connected database with an in-memory
    # database. In this case, we replace the "main" database, which is the
    # `:memory:` one above
    db.deserialize('main', httpx.get(url).read())
    cursor = db.cursor()
    cursor.execute('SELECT * FROM quotas;')
    print(cursor.fetchall())

The above doesn't use an io.BytesIO instance as in the question, but you can call read on one to get the bytes if you have one:

import io

import apsw
import httpx

# For working example purposes, constructing io.BytesIO from an HTTP response
url = "https://data.api.trade.gov.uk/v1/datasets/uk-trade-quotas/versions/v1.0.366/data?format=sqlite"
byte_stream = io.BytesIO(httpx.get(url).read())

with apsw.Connection(':memory:') as db:
    db.deserialize('main', byte_stream.read())
    cursor = db.cursor()
    cursor.execute('SELECT * FROM quotas;')
    print(cursor.fetchall())

