I've started experimenting with DuckDB but can't figure out how to release memory again. With a loop like the one below, I would have expected memory to be freed either when process() returns or after con.close() is called. On my system neither seems to be the case: when I look at the memory used by the process, memory is still occupied (very likely by DuckDB, since this is essentially all I'm doing, unless deltalake or pyarrow have issues).

import duckdb
import deltalake

for src in ["a", "b", "c"]:
    def process():
        con = duckdb.connect(":memory:")
        dt = deltalake.DeltaTable(f"s3a://{src}", storage_options=deltalake_storage_options)
        pa = dt.to_pyarrow_table()
        r1 = con.from_arrow(pa)
        duckdb.sql("select * from r1").write_parquet("/tmp/test.parquet")
    process()

As some of the tables I'm dealing with are very large, I cannot keep all of them in memory at the same time. How can I release all of the memory allocated to DuckDB?

2 Answers

You might have luck using

con.from_arrow(pa).write_parquet("/tmp/test.parquet")

That doesn't use the "default" connection, which the methods called directly on the module (e.g. duckdb.sql) do.

Would con.close() at the end of process() help?

2 Comments

Unfortunately not. I think I have the same problem that someone reported here: github.com/duckdb/duckdb/issues/9033
I had a similar issue and managed to get code that emitted a pyarrow.RecordBatchReader of smaller pyarrow tables that could be written to parquet one at a time. This dramatically reduced the memory footprint for me. Perhaps something similar could work for you. gist.github.com/iangow/2d8f7be06fea688ec9b84bc45c6c473a
