How to detect colors in a PDF with Python

Question

Is there any way, in Python, to automatically detect the colors in a certain area of a PDF and either translate them to RGB or compare them to the legend to identify the color?

Maybe you could convert the PDF into a picture format (BMP for instance) and analyze that.
– WoJ, Commented Apr 30, 2015 at 19:58

Community · Accepted Answer · 2017-05-23 10:31:07Z

4

Felipe's approach didn't work for me, but I came up with this:

#!/usr/bin/env python
# -*- Encoding: UTF-8 -*-

import minecart

colors = set()

with open("file.pdf", "rb") as pdf_file:
    document = minecart.Document(pdf_file)
    page = document.get_page(0)  # first page of the document
    for shape in page.shapes:
        # Only shapes that are filled carry a fill color
        if shape.fill:
            colors.add(shape.fill.color.as_rgb())

for color in colors:
    print(color)

This will print all the unique RGB values found on the first page of your document (you could extend it to all pages, of course).
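
Extending this to every page could look like the sketch below. It assumes minecart's `Document.iter_pages()` for walking the pages; the helper name `unique_fill_colors` is made up for illustration, and it accepts any iterable of page-like objects so it can be tried without a PDF at hand.

```python
def unique_fill_colors(pages):
    """Collect the distinct fill colors of all shapes on the given pages.

    `pages` is any iterable of objects exposing `.shapes`, for example the
    pages yielded by minecart's `document.iter_pages()` (an assumption here).
    """
    colors = set()
    for page in pages:
        for shape in page.shapes:
            # Shapes that are only stroked have no fill, so skip them.
            if shape.fill:
                colors.add(tuple(shape.fill.color.as_rgb()))
    return colors
```

With a document opened as in the script above, `unique_fill_colors(document.iter_pages())` should give the set of colors for the whole file rather than just the first page.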

edited May 23, 2017 at 10:31

Community Bot

1 · 1 silver badge

answered May 17, 2016 at 9:20

黄雨伞

1,994 · 1 gold badge · 17 silver badges · 19 bronze badges

Felipe · Accepted Answer · 2015-06-30 14:58:56Z

2

Depending on where you want to extract the information from, you can use minecart. It has really robust support for colors and allows easy conversion to RGB. Though you can't input a coordinate and get the color value there, if you are trying to get color information from a shape you could do something like the following:

import minecart

# Bounding box in PDF points (72 points per inch)
BOX = (.5 * 72,  # left bounding box edge
       9 * 72,   # bottom bounding box edge
       1 * 72,   # right bounding box edge
       10 * 72)  # top bounding box edge

with open("my-doc.pdf", "rb") as pdf_file:
    doc = minecart.Document(pdf_file)
    page = doc.get_page(0)
    for shape in page.shapes:
        # Skip shapes without a fill (e.g. stroke-only paths)
        if shape.fill and shape.check_in_bbox(BOX):
            r, g, b = shape.fill.color.as_rgb()
            # do stuff with r, g, b

[Disclaimer: I'm the author of minecart]
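
For the "compare them to the legend" part of the question, one possible way to use the r, g, b values is to pick the legend entry closest to them. The sketch below is not part of minecart: `nearest_legend_entry` and `LEGEND` are hypothetical names, the legend colors are invented, and it assumes `as_rgb()` yields three floats between 0 and 1. Euclidean distance in RGB is a crude similarity measure, but it is usually enough when the legend colors are clearly distinct.

```python
import math


def nearest_legend_entry(rgb, legend):
    """Return the legend label whose color is closest to `rgb`.

    `rgb` and the legend values are (r, g, b) tuples on the same scale;
    closeness is plain Euclidean distance in RGB space.
    """
    return min(legend, key=lambda label: math.dist(rgb, legend[label]))


# Hypothetical legend; replace it with the colors of your own document.
LEGEND = {
    "red": (1.0, 0.0, 0.0),
    "green": (0.0, 1.0, 0.0),
    "blue": (0.0, 0.0, 1.0),
}

print(nearest_legend_entry((0.9, 0.1, 0.05), LEGEND))  # red
```

`math.dist` needs Python 3.8 or newer; on older versions the distance can be computed by hand with a sum of squared differences.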

answered Jun 30, 2015 at 14:58

Felipe

3,159 · 2 gold badges · 30 silver badges · 46 bronze badges

1 Comment

Hridoy_089 Over a year ago

Getting this error: KeyError: 'DeviceN' @Felipe
