
Using the idea from this code for changing image size dynamically in a loop, I've run into a problem. There are a couple of ways to get an image's size in bytes, but only one gives accurate results, and it requires the file to be saved to disk. If I save to disk on every iteration and read the file back, each iteration takes twice the effort. Is there any way to read the image size accurately without doing that?

from io import BytesIO
import os
import sys

from PIL import Image

image = Image.open(image_path)
size_kb = os.stat(image_path).st_size  # size of the file on disk, in bytes
buffer = BytesIO()
image.save(buffer, format="jpeg", quality=100, optimize=True)  # Not written to disk; the buffer acts like a saved file
size_kb2 = buffer.getbuffer().nbytes  # size of the in-memory encode, in bytes

Printing the three values with print(size_kb, size_kb2, sys.getsizeof(image.tobytes())) gives me three different results for the same image. Only os.stat is accurate (it matches the size shown by the Linux OS).

I do not want to save the image to disk and read it back, because that will take a lot of time.

Whole code:

STEP = 32
MIN_SIZE = 32

def resize_under_kb(image: Image.Image, size_kb: float, desired_size: float) -> Image.Image:
    '''
    Resize the image until it is under the given size in KB
    args:
        image: PIL Image object
        size_kb: Current size of the image in KB
        desired_size: Final desired size in KB, as asked by the user
    '''
    size = image.size
    new_width_height = max(size) - STEP  # Decrease the pixels for the first pass

    while new_width_height > MIN_SIZE and size_kb > desired_size:  # stop at the minimum dimension or at the desired size
        image = image.resize((new_width_height, new_width_height))  # keep resizing until the desired size is reached

        buffer = BytesIO()
        image.save(buffer, format="jpeg", quality=100, optimize=True)  # Not written to disk; the buffer acts like a saved file
        size_kb = buffer.getbuffer().nbytes / 1024  # nbytes is in bytes; convert to KB

        size = image.size  # Current resized pixels
        new_width_height = max(size) - STEP  # Dimensions for the next iteration

    return image
  • Accurately, I think not. When writing to disk there can be small differences depending on your filesystem, and converting the image to bytes might not include any metadata. On a side note, resizing the image multiple times will cause a loss of quality; it is better to keep the original and do one big resize. Commented Apr 21, 2022 at 7:21
  • Please look up the purpose of sys.getsizeof. That's merely telling you the amount of RAM used by a specific object. Commented Apr 21, 2022 at 7:47
  • @ChristophRackwitz yes, But isn't it proportional to the size of image in a way? I mean number of pixels * memory taken by each bit? Just saying Commented Apr 21, 2022 at 7:57
  • Saving JPEGs with quality=100 is unlikely to be sensible if trying to reduce the size of an image! Please be clearer about your actual intention. What are you really trying to do with what type of images and what type of pixel dimensions and what type of sizes in bytes? Commented Apr 21, 2022 at 8:01
  • @Eumel Is there a way to approximate the size from the resized dimensions while keeping the aspect ratio? Someone with zero knowledge of image processing who wants to keep an image under a specific size can't know this. In case you're wondering whether this is even a problem, try applying for any exam in India. Each application has its own limits on the size an image must fall within. At least 10M such forms are filled in every year. Commented Apr 21, 2022 at 8:02
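
The two suggestions in the comments above (resize once from the original, and keep the aspect ratio) can be sketched roughly as follows. This is only an illustration, not the asker's code: the helper names jpeg_size_kb and fit_under_kb, the default quality of 85, and the step of 32 pixels are all assumptions made for the example, and the image is assumed to be in a mode JPEG supports (RGB or L).

```python
from io import BytesIO

from PIL import Image


def jpeg_size_kb(image, quality=85):
    """Size in KB of `image` encoded as JPEG in memory (hypothetical helper)."""
    buffer = BytesIO()
    image.save(buffer, format="JPEG", quality=quality, optimize=True)
    return buffer.getbuffer().nbytes / 1024


def fit_under_kb(original, desired_kb, step=32, min_side=32, quality=85):
    """Shrink until the JPEG encode fits in `desired_kb` or `min_side` is reached.

    Every candidate is resized from `original`, so resampling losses do not
    accumulate, and both sides are scaled by the same factor.
    """
    width, height = original.size
    long_side = max(width, height)
    candidate = original
    while jpeg_size_kb(candidate, quality) > desired_kb and long_side - step >= min_side:
        long_side -= step
        scale = long_side / max(width, height)
        new_size = (max(1, round(width * scale)), max(1, round(height * scale)))
        candidate = original.resize(new_size)  # always start from the original
    return candidate
```

Nothing is written to disk here; each candidate is measured by encoding it into a BytesIO buffer. A binary search over the long side would need fewer encodes than a fixed step, at the cost of slightly more code.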

1 Answer


This code:

size_kb = os.stat(image_path).st_size

gives the number of bytes an existing JPEG takes on disk.


This code:

buffer = BytesIO()
image.save(buffer, format="jpeg", quality = 100, optimize = True) # Does not save but acts like an image saved to disc
size_kb2 = (buffer.getbuffer().nbytes)

gives the number of bytes an image would take on disk if saved by PIL's current JPEG encoder, with its own Huffman tables, quality and chroma subsampling, and without allowing for file-system minimum block sizes.

This could be vastly different from the size you read from disk originally because that might have been created by different software, with different tradeoffs of speed and quality. It could even differ between two versions of PIL.
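
As a rough check of the point above, the sketch below encodes one image into a buffer and into a temporary file with identical settings, then compares the two byte counts. The encoded_size helper, the synthetic gradient image and the quality values are assumptions for illustration only.

```python
import os
import tempfile
from io import BytesIO

from PIL import Image


def encoded_size(image, **save_kwargs):
    """Byte length of `image` encoded as JPEG in memory (hypothetical helper)."""
    buffer = BytesIO()
    image.save(buffer, format="JPEG", **save_kwargs)
    return buffer.getbuffer().nbytes


# Synthetic 256x256 test image; any RGB image would do.
image = Image.radial_gradient("L").convert("RGB")

in_memory = encoded_size(image, quality=85, optimize=True)

# Write the same image with the same settings to a real file and compare.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "check.jpg")
    image.save(path, format="JPEG", quality=85, optimize=True)
    on_disk = os.stat(path).st_size

print(in_memory, on_disk)
```

With the same Pillow version and the same save arguments, the two numbers should agree, so the buffer is a reliable measure of what PIL itself would write. Changing quality, or comparing against a file produced by other software, changes the result.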


This code:

len(image.tobytes())

tells you the number of bytes your image is taking as currently decompressed in memory, without taking account of other data structures required for it and without taking account of metadata (comments, GPS data, copyright, manufacturer lens data and settings data).
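
A small sketch of why the decompressed size says nothing about the JPEG file size: for an 8-bit RGB image it is simply width × height × 3, whatever the content. The blank 640×480 image is an assumption chosen for the example.

```python
import sys

from PIL import Image

image = Image.new("RGB", (640, 480))  # 8-bit RGB: 3 bytes per pixel
raw = image.tobytes()

raw_bytes = len(raw)               # 640 * 480 * 3 = 921600
object_bytes = sys.getsizeof(raw)  # raw_bytes plus the bytes-object overhead

print(raw_bytes, object_bytes)
```

sys.getsizeof reports the memory footprint of the Python bytes object, so it is slightly larger than len(raw), and neither number is related to how well the pixels compress.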
