5

Is there any way, in Python, to automatically detect the colors in a certain area of a PDF and either translate them to RGB or compare them to the legend to identify the color?

1
  • Maybe you could convert the PDF into a picture format (BMP for instance) and analyze that. Commented Apr 30, 2015 at 19:58

2 Answers

4

Felipe's approach didn't work for me, but I came up with this:

#!/usr/bin/env python
# -*- Encoding: UTF-8 -*-

import minecart

colors = set()

with open("file.pdf", "rb") as file:
    document = minecart.Document(file)
    page = document.get_page(0)
    for shape in page.shapes:
        if shape.fill:
            colors.add(shape.fill.color.as_rgb())

# print(...) with a single argument works on both Python 2 and Python 3
for color in colors:
    print(color)

This will print a neat list of all unique RGB values in the first page of your document (you could extend it to all pages, of course).
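
As a rough sketch of the "all pages" extension mentioned above: the helper below walks every page instead of only the first. The function name `collect_fill_colors` is made up for illustration, and it assumes the document object exposes `iter_pages()` as described in the minecart README; it has not been checked against every minecart version.

```python
def collect_fill_colors(document):
    """Return the set of unique RGB fill colors found on all pages.

    `document` is expected to behave like a minecart.Document:
    it must offer iter_pages(), and each page must offer .shapes.
    """
    colors = set()
    for page in document.iter_pages():
        for shape in page.shapes:
            # Shapes that are only stroked have no fill, so skip them
            if shape.fill:
                colors.add(tuple(shape.fill.color.as_rgb()))
    return colors
```

To use it, open the PDF in binary mode as in the snippet above and call `collect_fill_colors(minecart.Document(file))` inside the `with` block, then loop over the returned set.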

2

Depending on where you want to extract the information from, you can use minecart. It has really robust support for colors and allows easy conversion to RGB. Though you can't input a coordinate and get the color value there, if you are trying to get color information from a shape you could do something like the following:

import minecart
doc = minecart.Document(open("my-doc.pdf", "rb"))
page = doc.get_page(0)
BOX = (.5 * 72,  # left bounding box edge
       9 * 72,   # bottom bounding box edge
       1 * 72,   # right bounding box edge
       10 * 72)  # top bounding box edge
for shape in page.shapes:
    # Skip shapes outside the box and shapes that have no fill
    if shape.check_in_bbox(BOX) and shape.fill:
        r, g, b = shape.fill.color.as_rgb()
        # do stuff with r, g, b
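
For the "compare them to the legend" part of the question, one possible way to "do stuff with r, g, b" is to pick the legend entry whose color is closest. This is only a sketch: `nearest_legend_entry` and the `LEGEND` values are hypothetical, and it assumes the RGB components are on a 0 to 1 scale, as PDF color operands are, so check that against your own output. Squared Euclidean distance in RGB is a simple heuristic, not a perceptual color difference.

```python
def nearest_legend_entry(rgb, legend):
    """Return the legend label whose color is closest to rgb.

    Closeness is squared Euclidean distance in RGB space.
    """
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(rgb, legend[label]))
    return min(legend, key=distance)


# Hypothetical legend: replace with the colors used in your document.
LEGEND = {
    "low": (0.0, 0.0, 1.0),
    "medium": (1.0, 1.0, 0.0),
    "high": (1.0, 0.0, 0.0),
}

print(nearest_legend_entry((0.9, 0.1, 0.05), LEGEND))  # high
```

An exact equality test against the legend colors may also be enough if the shapes and the legend use identical color values.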

[Disclaimer: I'm the author of minecart]

1 Comment

Getting this error: KeyError: 'DeviceN' @Felipe
