13

I'm trying to get dimensions from a picture directly from the web using this code:

string image = @"http://www.hephaestusproject.com/.../csharp3.png";
byte[] imageData = new WebClient().DownloadData(image);
MemoryStream imgStream = new MemoryStream(imageData);
Image img = Image.FromStream(imgStream);

int wSize = img.Width;
int hSize = img.Height;

It works but the performance is terrible because I need to download many images just to get their dimensions.

Is there a more efficient way to do the same thing?

1
  • 2
    no. not really. you need the image to know its size. you MIGHT be able to open PART of the stream, depending on the file type, and parse the meta data directly.... but I doubt that's going to fit what you need; and generally, once the web request is sent, you're going to get the whole image in one go. Commented May 5, 2015 at 13:36

2 Answers 2

14

In case this is helpful to those coming along later, it does seem that this is indeed possible. A brief review of the JPG, PNG and GIF image formats shows that they all generally have a header at the beginning of the file that contains the image dimensions.

Reddit uses an algorithm to download successive 1024-byte chunks to determine image dimensions without downloading an entire image. The code is in Python but it is in the _fetch_image_size method here: https://github.com/reddit/reddit/blob/35c82a0a0b24441986bdb4ad02f3c8bb0a05de57/r2/r2/lib/media.py#L634

It uses a separate parser in their ImageFile class and successively attempts to parse the image and retrieve the size as more bytes are downloaded. I've loosely translated this to C# in the code below, heavily leveraging the image-parsing code at https://stackoverflow.com/a/112711/3838199.

There may be some cases where retrieving the entire file is necessary but I suspect that applies to a relatively small subset of JPEG images (perhaps progressive images). In my casual testing it seems most image sizes are retrieved via the first 1024-byte retrieval; indeed this chunk size could probably be smaller.

using System;
using System.Collections.Generic;
using System.Drawing; // note: add reference to System.Drawing assembly
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

namespace Utilities
{
    // largely credited to https://stackoverflow.com/a/112711/3838199 for the image-specific code
    public static class ImageUtilities
    {
        private const string ErrorMessage = "Could not read image data";
        private const int ChunkSize = 1024;

        private static readonly HttpClient Client = new HttpClient();
        private static readonly Dictionary<byte[], Func<BinaryReader, Size>> ImageFormatDecoders = new Dictionary<byte[], Func<BinaryReader, Size>>()
        {
            { new byte[]{ 0x42, 0x4D }, DecodeBitmap},
            { new byte[]{ 0x47, 0x49, 0x46, 0x38, 0x37, 0x61 }, DecodeGif },
            { new byte[]{ 0x47, 0x49, 0x46, 0x38, 0x39, 0x61 }, DecodeGif },
            { new byte[]{ 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }, DecodePng },
            { new byte[]{ 0xff, 0xd8 }, DecodeJfif },
        };

        /// <summary>
        /// Retrieve the dimensions of an online image, downloading as little as possible
        /// </summary>
        public static async Task<Size> GetWebDimensions(Uri uri)
        {
            var moreBytes = true;
            var currentStart = 0;
            byte[] allBytes = { };

            while (moreBytes)
            {
                try
                {
                    var newBytes = await GetSomeBytes(uri, currentStart, currentStart + ChunkSize - 1).ConfigureAwait(false);
                    if (newBytes is null || newBytes.Length < ChunkSize)
                        moreBytes = false;
                    
                    if(new bytes != null)
                        allBytes = Combine(allBytes, newBytes);

                    return GetDimensions(new BinaryReader(new MemoryStream(allBytes)));
                }
                catch
                {
                    currentStart += ChunkSize;
                }
            }

            return new Size(0, 0);
        }

        private static async Task<byte[]?> GetSomeBytes(Uri uri, int startRange, int endRange)
        {
            var request = new HttpRequestMessage { RequestUri = uri };
            request.Headers.Range = new RangeHeaderValue(startRange, endRange);
            try
            {
                var response = await Client.SendAsync(request).ConfigureAwait(false);
                return await response.Content.ReadAsByteArrayAsync().ConfigureAwait(false);
            }
            catch 
            {

            }
            return new byte[] { };
        }

        /// <summary>
        /// Gets the dimensions of an image.
        /// </summary>
        /// <returns>The dimensions of the specified image.</returns>
        /// <exception cref="ArgumentException">The image was of an unrecognized format.</exception>    
        public static Size GetDimensions(BinaryReader binaryReader)
        {
            int maxMagicBytesLength = ImageFormatDecoders.Keys.OrderByDescending(x => x.Length).First().Length;

            byte[] magicBytes = new byte[maxMagicBytesLength];

            for (int i = 0; i < maxMagicBytesLength; i += 1)
            {
                magicBytes[i] = binaryReader.ReadByte();

                foreach (var kvPair in ImageFormatDecoders)
                {
                    if (magicBytes.StartsWith(kvPair.Key))
                    {
                        return kvPair.Value(binaryReader);
                    }
                }
            }

            throw new ArgumentException(ErrorMessage, nameof(binaryReader));
        }

        // from https://stackoverflow.com/a/415839/3838199
        private static byte[] Combine(byte[] first, byte[] second)
        {
            byte[] ret = new byte[first.Length + second.Length];
            Buffer.BlockCopy(first, 0, ret, 0, first.Length);
            Buffer.BlockCopy(second, 0, ret, first.Length, second.Length);
            return ret;
        }

        private static bool StartsWith(this byte[] thisBytes, byte[] thatBytes)
        {
            for (int i = 0; i < thatBytes.Length; i += 1)
            {
                if (thisBytes[i] != thatBytes[i])
                {
                    return false;
                }
            }
            return true;
        }

        private static short ReadLittleEndianInt16(this BinaryReader binaryReader)
        {
            byte[] bytes = new byte[sizeof(short)];
            for (int i = 0; i < sizeof(short); i += 1)
            {
                bytes[sizeof(short) - 1 - i] = binaryReader.ReadByte();
            }
            return BitConverter.ToInt16(bytes, 0);
        }

        private static int ReadLittleEndianInt32(this BinaryReader binaryReader)
        {
            byte[] bytes = new byte[sizeof(int)];
            for (int i = 0; i < sizeof(int); i += 1)
            {
                bytes[sizeof(int) - 1 - i] = binaryReader.ReadByte();
            }
            return BitConverter.ToInt32(bytes, 0);
        }

        private static Size DecodeBitmap(BinaryReader binaryReader)
        {
            binaryReader.ReadBytes(16);
            int width = binaryReader.ReadInt32();
            int height = binaryReader.ReadInt32();
            return new Size(width, height);
        }

        private static Size DecodeGif(BinaryReader binaryReader)
        {
            int width = binaryReader.ReadInt16();
            int height = binaryReader.ReadInt16();
            return new Size(width, height);
        }

        private static Size DecodePng(BinaryReader binaryReader)
        {
            binaryReader.ReadBytes(8);
            int width = binaryReader.ReadLittleEndianInt32();
            int height = binaryReader.ReadLittleEndianInt32();
            return new Size(width, height);
        }

        private static Size DecodeJfif(BinaryReader binaryReader)
        {
            while (binaryReader.ReadByte() == 0xff)
            {
                byte marker = binaryReader.ReadByte();
                short chunkLength = binaryReader.ReadLittleEndianInt16();

                if (marker == 0xc0 || marker == 0xc1 || marker == 0xc2)
                {
                    binaryReader.ReadByte();

                    int height = binaryReader.ReadLittleEndianInt16();
                    int width = binaryReader.ReadLittleEndianInt16();
                    return new Size(width, height);
                }

                binaryReader.ReadBytes(chunkLength - 2);
            }

            throw new ArgumentException(ErrorMessage);
        }
    }
}

Tests:

using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Utilities;
using System.Drawing;

namespace Utilities.Tests
{
    [TestClass]
    public class ImageUtilitiesTests
    {
        [TestMethod]
        public void GetPngDimensionsTest()
        {
            string url = "https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png";
            Uri uri = new Uri(url);
            var actual = ImageUtilities.GetWebDimensions(uri);
            Assert.AreEqual(new Size(272, 92), actual);
        }

        [TestMethod]
        public void GetJpgDimensionsTest()
        {
            string url = "https://upload.wikimedia.org/wikipedia/commons/e/e0/JPEG_example_JPG_RIP_050.jpg";
            Uri uri = new Uri(url);
            var actual = ImageUtilities.GetWebDimensions(uri);
            Assert.AreEqual(new Size(313, 234), actual);
        }

        [TestMethod]
        public void GetGifDimensionsTest()
        {
            string url = "https://upload.wikimedia.org/wikipedia/commons/a/a0/Sunflower_as_gif_websafe.gif";
            Uri uri = new Uri(url);
            var actual = ImageUtilities.GetWebDimensions(uri);
            Assert.AreEqual(new Size(250, 297), actual);
        }
    }
}
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks so much for the above piece of code. It was very useful. However, in my tests, the code above seems to return 0,0 as width & height for all the png & GIF images. Maybe there are additional markers that need to be included? If you updated this code or know about additional markers for png and GIF , can you post those? Thanks
As I mentioned I'm leveraging the image parsing code from stackoverflow.com/a/112711/3838199 and perhaps it doesn't work in all instances. However I've added my unit tests and though they are admittedly far from exhaustive they show particular images where this routine works perfectly. Also note that this code is currently in use in a large production application scraping the size of images on over 800 million web sites.
5

In a word, no.

In a few more words, you would have to rely on there being a resource containing details of the dimensions of images on a given server. In 99.99% of cases that will simply not exist.

3 Comments

Given the other answer, this just seems factually incorrect. (Maybe just delete this answer?)
You mean the other answer that the OP suggested didn't work for many cases? ;)
Well, the OP also accepted it. My point was that the answer "no" doesn't seem correct given that there are (imperfect) possibilities.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.