23

I would like to obtain the byte size of a blob.

I am using Postgresql and would like to obtain the size using an SQL query. Something like this:

SELECT sizeof(field) FROM table;

Is this possible in Postgresql?

Update: I have read the postgresql manual and could not find an appropriate function to calculate the file size. Also, the blob is stored as a large object.

2
  • Please read the manual before posting such a question: postgresql.org/docs/current/static/functions.html Commented Apr 16, 2012 at 6:51
  • 1
    @DanielVérité: it does look to be a dupe, but in fairness, I couldn't find that question when I searched before posting my answer. Who calls them "lobjects", honestly ? ;-) My function is uncannily like yours from that question, though in my defence, if I'd copied it I would have copied the error handling too! Commented Apr 17, 2012 at 3:29

6 Answers 6

24

Not that I've used large objects, but looking at the docs: http://www.postgresql.org/docs/current/interactive/lo-interfaces.html#LO-TELL

I think you have to use the same technique as some file system APIs require: seek to the end, then tell the position. PostgreSQL has SQL functions that appear to wrap the internal C functions. I couldn't find much documentation, but this worked:

CREATE OR REPLACE FUNCTION get_lo_size(oid) RETURNS bigint
VOLATILE STRICT
LANGUAGE 'plpgsql'
AS $$
DECLARE
    fd integer;
    sz bigint;
BEGIN
    -- Open the LO; N.B. it needs to be in a transaction otherwise it will close immediately.
    -- Luckily a function invocation makes its own transaction if necessary.
    -- The mode x'40000'::int corresponds to the PostgreSQL LO mode INV_READ = 0x40000.
    fd := lo_open($1, x'40000'::int);
    -- Seek to the end.  2 = SEEK_END.
    PERFORM lo_lseek(fd, 0, 2);
    -- Fetch the current file position; since we're at the end, this is the size.
    sz := lo_tell(fd);
    -- Remember to close it, since the function may be called as part of a larger transaction.
    PERFORM lo_close(fd);
    -- Return the size.
    RETURN sz;
END;
$$; 

Testing it:

-- Make a new LO, returns an OID e.g. 1234567
SELECT lo_create(0);

-- Populate it with data somehow
...

-- Get the length.
SELECT get_lo_size(1234567);

It seems the LO functionality is designed to be used mostly through the client or through low-level server programming, but at least they've provided some SQL visible functions for it, which makes the above possible. I did a query for SELECT relname FROM pg_proc where relname LIKE 'lo%' to get myself started. Vague memories of C programming and a bit of research for the mode x'40000'::int and SEEK_END = 2 value were needed for the rest!

Sign up to request clarification or add additional context in comments.

3 Comments

To avoid "result out of range" errors and make it work with large objects larger than 2GB, use lo_seek64 and lo_tell64.
How does the performance compare to the other answers?
As lo_seek64 already returns the current position - is there any need to execute lo_tell?
23

You could change your application to store the size when you create the large object. Otherwise you can use a query such as:

select sum(length(lo.data)) from pg_largeobject lo
where lo.loid=XXXXXX

You can use also the large object API functions, as suggested in a previous post, they work ok, but are an order of magnitude slower than the select method suggested above.

1 Comment

Unfortunately, the pg_largeobject catalog is no longer publicly accessible since PostgreSQL 9.0: postgresql.org/docs/current/static/catalog-pg-largeobject.html.
10
select pg_column_size(lo_get(lo_oid)) from table;

Gives you the size in bytes.

If you want pretty printing:

select pg_size_pretty(pg_column_size(lo_get(lo_oid))::numeric) from table;

4 Comments

somehow it is always 4 bytes more than it actually is, why?
@wutzebaer the first four bytes of a bytea column are the size of the rest of it
i.e. you should use octet_length (as per other answers) not pg_column_size
The works, but looks overkill - why should the DB read the complete BLOB (likely multiple hundreds of MB) to throw them away after counting the length - especially if lo_seek (end) just returns its position directly from the internally stored length?
6

Try length() or octet_length()

Comments

6

This is my solution:

select
lo.loid,
pg_size_pretty(sum(octet_length(lo.data)))
from pg_largeobject lo
where lo.loid in (select pg_largeobject.loid from pg_largeobject)
group by lo.loid;

Comments

3

If the type of the blob column is oid

SELECT length(lo_get(blob_column)) FROM table;

lo_get ( loid oid [, offset bigint, length integer ] ) → bytea

Extracts the large object's contents, or a substring thereof.

https://www.postgresql.org/docs/current/lo-funcs.html

length ( bytea ) → integer

Returns the number of bytes in the binary string.

https://www.postgresql.org/docs/current/functions-binarystring.html

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.