30

I have tried every solution I could find online to print a page that is open in Selenium (Python) to a PDF. The print pop-up shows up, but after a second or two it goes away and no PDF is saved.

Here is the code I am trying, based on the code at https://stackoverflow.com/a/43752129/3973491.

I am coding on a Mac running macOS Mojave 10.14.5.

from selenium import webdriver
from selenium.webdriver.support.select import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.options import Options
from selenium.common.exceptions import WebDriverException
import time
import json

options = Options()
appState = {
    "recentDestinations": [
        {
            "id": "Save as PDF",
            "origin": "local"
        }
    ],
    "selectedDestinationId": "Save as PDF",
    "version": 2
}

profile = {'printing.print_preview_sticky_settings.appState': json.dumps(appState)}
# profile = {'printing.print_preview_sticky_settings.appState':json.dumps(appState),'savefile.default_directory':downloadPath}
options.add_experimental_option('prefs', profile)
options.add_argument('--kiosk-printing')
CHROMEDRIVER_PATH = '/usr/local/bin/chromedriver'

driver = webdriver.Chrome(options=options, executable_path=CHROMEDRIVER_PATH)
driver.implicitly_wait(5)
driver.get(url)
driver.execute_script('window.print();')

$ chromedriver --v
ChromeDriver 75.0.3770.90 (a6dcaf7e3ec6f70a194cc25e8149475c6590e025-refs/branch-heads/3770@{#1003})

Any hints or solutions as to how I can print the open HTML page to a PDF? I have spent hours trying to make this work. Thank you!


Update on 2019-07-11:

My question has been identified as a duplicate, but a) the other question seems to be using JavaScript code, and b) its answer does not solve the problem raised in this question, which may have to do with more recent software versions. The Chrome version being used is 75.0.3770.100 (Official Build) (64-bit), and chromedriver is ChromeDriver 75.0.3770.90, on macOS Mojave. The script is running on Python 3.7.3.

Update on 2019-07-11:

Changed the code to

from selenium import webdriver
import json

chrome_options = webdriver.ChromeOptions()
settings = {
    "appState": {
        "recentDestinations": [{
            "id": "Save as PDF",
            "origin": "local",
            "account": "",
        }],
        "selectedDestinationId": "Save as PDF",
        "version": 2
    }
}
prefs = {'printing.print_preview_sticky_settings': json.dumps(settings)}
chrome_options.add_experimental_option('prefs', prefs)
chrome_options.add_argument('--kiosk-printing')
CHROMEDRIVER_PATH = '/usr/local/bin/chromedriver'
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=CHROMEDRIVER_PATH)
driver.get("https://google.com")
driver.execute_script('window.print();')
driver.quit()

And now, nothing happens. Chrome launches, loads the URL, and the print dialog appears, but then nothing seems to happen: there is nothing in the default printer queue and no PDF either. I even searched for the PDF by looking through "Recent Files" on the Mac.

10
  • no PDF saved, where did you check? It should be saved in your user Downloads folder. Commented Jul 10, 2019 at 7:01
  • @Kamal - I tried this again and noticed that Chrome was sending an actual printout to my default printer. I was not at the same location as the printer, so I did not notice what actually happened. I have now cleared the print queue of the jobs left over from the numerous times I tried printing to PDF and it appeared that nothing happened. So I suspect that the "Save as PDF" option is not getting selected, and I do not know how to select it. Commented Jul 10, 2019 at 11:57
  • Please refer to this answer. In your code, you are calling webdriver.Chrome(options=options.., but correct syntax is webdriver.Chrome(chrome_options=options... And somehow, with webdriver.ChromeOptions print is working faster than with webdriver.chrome.options.Options, so I would suggest you to try that. Commented Jul 11, 2019 at 1:22
  • Possible duplicate of Set Selenium ChromeDriver UserPreferences to Save as PDF Commented Jul 11, 2019 at 1:23
  • 1
    @GregW.F.R glad it worked. I have not used this in a long time. But yes that is the way to instantiate a chrome driver instance. Commented Jul 19, 2021 at 20:33

7 Answers 7

27

The answer here worked when I did not have any other printer set up in my OS. But when I had another default printer, it did not work.

I don't understand why, but making a small change this way seems to work.

from selenium import webdriver
import json

chrome_options = webdriver.ChromeOptions()
settings = {
    "recentDestinations": [{
        "id": "Save as PDF",
        "origin": "local",
        "account": "",
    }],
    "selectedDestinationId": "Save as PDF",
    "version": 2
}
prefs = {'printing.print_preview_sticky_settings.appState': json.dumps(settings)}
chrome_options.add_experimental_option('prefs', prefs)
chrome_options.add_argument('--kiosk-printing')
CHROMEDRIVER_PATH = '/usr/local/bin/chromedriver'
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=CHROMEDRIVER_PATH)
driver.get("https://google.com")
driver.execute_script('window.print();')
driver.quit()
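
A commenter on this answer asks about controlling where the PDF is saved. A minimal sketch, assuming the `savefile.default_directory` preference (it appears commented out in the question and is used in another answer on this page) is honored by your Chrome version; `build_print_prefs` is a hypothetical helper name, not part of Selenium:

```python
import json
import os


def build_print_prefs(download_dir):
    """Return a Chrome 'prefs' dict that preselects "Save as PDF".

    build_print_prefs is a hypothetical helper, not a Selenium API.
    Assumption: Chrome honors 'savefile.default_directory' for the
    print-to-PDF output; this may vary by Chrome version and OS.
    """
    app_state = {
        "recentDestinations": [
            {"id": "Save as PDF", "origin": "local", "account": ""}
        ],
        "selectedDestinationId": "Save as PDF",
        "version": 2,
    }
    return {
        # The appState value is passed as a JSON string, not a nested dict.
        "printing.print_preview_sticky_settings.appState": json.dumps(app_state),
        "savefile.default_directory": os.path.abspath(download_dir),
    }


prefs = build_print_prefs("pdf_output")
print(prefs["savefile.default_directory"])
```

The returned dict could be passed to `chrome_options.add_experimental_option('prefs', ...)` in place of the `prefs` dict above, keeping the `--kiosk-printing` argument as is.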

5 Comments

Thank you @Kamal. This approach does work, but it printed to the last used printer. I did some searching, and I wonder whether installing cups-pdf as a printer and making it the last used printer would give the desired outcome: print-to-PDF using Python.
Sorry I couldn't test my solution on Linux, it worked on Windows 10 for me.
got it. Will work on this some more and see if I can come up with something.
Worked on Linux for me. Would be nice if we could control the download location, however.
10

You can use the following code to print PDFs in A5 size with CSS backgrounds enabled:

import os
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import json
import time

chrome_options = webdriver.ChromeOptions()

settings = {
    "recentDestinations": [{
        "id": "Save as PDF",
        "origin": "local",
        "account": ""
    }],
    "selectedDestinationId": "Save as PDF",
    "version": 2,
    "isHeaderFooterEnabled": False,
    "mediaSize": {
        "height_microns": 210000,
        "name": "ISO_A5",
        "width_microns": 148000,
        "custom_display_name": "A5"
    },
    "customMargins": {},
    "marginsType": 2,
    "scaling": 175,
    "scalingType": 3,
    "scalingTypePdf": 3,
    "isCssBackgroundEnabled": True
}

mobile_emulation = { "deviceName": "Nexus 5" }
chrome_options.add_experimental_option("mobileEmulation", mobile_emulation)
chrome_options.add_argument('--enable-print-browser')
#chrome_options.add_argument('--headless')

prefs = {
    'printing.print_preview_sticky_settings.appState': json.dumps(settings),
    'savefile.default_directory': '<path>'
}
chrome_options.add_argument('--kiosk-printing')
chrome_options.add_experimental_option('prefs', prefs)

for dirpath, dirnames, filenames in os.walk('<source path>'):
    for fileName in filenames:
        print(fileName)
        driver = webdriver.Chrome("./chromedriver", options=chrome_options)
        driver.get(f'file://{os.path.join(dirpath, fileName)}')
        time.sleep(7)
        driver.execute_script('window.print();')
        driver.close()
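
The loop above closes the browser right after `window.print()`, so the file may not be fully written when the window goes away. A sketch of a polling wait that could stand in for a fixed sleep, using only the standard library; `wait_for_new_pdf` is a hypothetical helper, and it assumes Chrome writes the PDF into the directory given in `savefile.default_directory`:

```python
import os
import time


def wait_for_new_pdf(directory, known_files, timeout=30.0, poll=0.5):
    """Wait until a new, non-empty .pdf appears in `directory`.

    Hypothetical helper (not part of Selenium). `known_files` is the set
    of file names present before printing. Returns the new file's path,
    or None if nothing shows up within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    last_sizes = {}
    while time.monotonic() < deadline:
        for name in os.listdir(directory):
            if name in known_files or not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(directory, name)
            size = os.path.getsize(path)
            # Treat the file as complete once its size is non-zero and
            # unchanged between two consecutive polls.
            if size > 0 and last_sizes.get(name) == size:
                return path
            last_sizes[name] = size
        time.sleep(poll)
    return None
```

In the loop, this would mean recording `set(os.listdir(out_dir))` before `driver.execute_script('window.print();')` and calling `wait_for_new_pdf(out_dir, before)` before `driver.close()`, where `out_dir` is whatever path was set in the prefs.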

2 Comments

This solution worked great for me. savefile.default_directory takes both forward and backslash paths (on Windows 10). However, this fails more often than it succeeds for me because the browser closes before the file is fully written. This can be solved by adding sleep(5) before driver.close() or some more intelligent structure.
It seems like headless is commented out, and with headless on it doesn't work. Any idea how to make it work in a headless browser?
6

Here is the solution I use with Windows :

  • First download the ChromeDriver here : http://chromedriver.chromium.org/downloads and install Selenium

  • Then run this code (based on the accepted answer, slightly modified to work on Windows):

    import json
    from selenium import webdriver
    chrome_options = webdriver.ChromeOptions()
    settings = {"recentDestinations": [{"id": "Save as PDF", "origin": "local", "account": ""}], "selectedDestinationId": "Save as PDF", "version": 2}
    prefs = {'printing.print_preview_sticky_settings.appState': json.dumps(settings)}
    chrome_options.add_experimental_option('prefs', prefs)
    chrome_options.add_argument('--kiosk-printing')
    browser = webdriver.Chrome(r"chromedriver.exe", options=chrome_options)
    browser.get("https://google.com/")
    browser.execute_script('window.print();')
    browser.close()    
    

4 Comments

This is such a minimal revision ("Per the selenium documentation, specify the windows driver locations (e.g., chromedriver.exe) rather than the linux driver locations when running on windows") that it should simply be a comment on the accepted answer. Furthermore, It appears that you simply minified the accepted answer to make the code look different.
@RobHall Comments are sometimes cleared after years; also sometimes it's hard to extract information from multiple comments, thus this answer. I properly cited the source ("based on the accepted answer"); the devil is really in the details, I spent a lot of time trying and failing before it finally worked, so my goal was really to put a ready-to-use code for Windows as an answer.
I tried searching for the saved file but can't find it anywhere. Any idea where the file goes after being saved as a PDF?
The saved file should be in Downloads. Does anyone know if I can add a delay so the page loads properly, or if I can change the default download location?
3

This solution is not very good, but you can take a screenshot and convert it to a PDF with Pillow:

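from io import BytesIO

from PIL import Image
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome(executable_path='path to your driver')
driver.get('your url here')
# find_element(By.TAG_NAME, ...) replaces find_element_by_tag_name, which newer Selenium releases removed
png = driver.find_element(By.TAG_NAME, 'body').screenshot_as_png
# Convert to RGB first: Pillow cannot write an image with an alpha channel to PDF
img = Image.open(BytesIO(png)).convert('RGB')
img.save('filename.pdf', "PDF", quality=100)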

9 Comments

Thank you for your answer. The issue with this approach is that it does not work for multi-page webpages. Only a portion of information is captured. But it is a good solution for short pages and does not entail popups.
what do you mean by multi-page webpages?
meaning web pages that need scrolling to see the complete webpage and when printed as PDF fit on 3-4 sheets of papers.
you can use this code: stackoverflow.com/a/57608276/10661593 , and at the end save as pdf. P.s. I didn't understand a bit, sorry. Do you want to fit the entire page on 1 sheet when printing? or how
so what I ideally want to be able to do - is print a page as pdf. on a Mac, when you do that, the PDF generated can run into many pages - assuming PDF is created for letter or A4 sized printing. if I shrink the page a lot and take a screenshot that does not serve the purpose. although, now I understand that Selenium does not control the dialog boxes of the browser, and hence cannot print page as PDF. apparently, puppeteer or pyppeteer in python can do that but I do not know how to use that software yet. the link you shared, seems to talk about screenshot and not pdf...
3

You can try to use the selenium-print package.

It uses selenium's execute_cdp_cmd function behind the scenes, which is fairly easy to use. The parameters can be found here.

import base64
import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

options = webdriver.ChromeOptions()
service = Service()
driver = webdriver.Chrome(service=service, options=options)
driver.get('http://localhost:3000')
time.sleep(2)  # crude wait for the page to finish rendering
# Page.printToPDF returns the PDF as a base64-encoded string under "data"
pdf = driver.execute_cdp_cmd("Page.printToPDF", {"printBackground": True})
pdf_data = base64.b64decode(pdf["data"])
with open("test.pdf", "wb") as f:
    f.write(pdf_data)
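
`Page.printToPDF` also accepts layout parameters. A sketch, assuming the Chrome DevTools Protocol parameter names `paperWidth` and `paperHeight` (both in inches), `landscape`, and `printBackground`; `a5_print_params` and `save_cdp_pdf` are hypothetical helper names introduced here for illustration:

```python
import base64


def a5_print_params(landscape=False):
    """Build Page.printToPDF parameters for an A5 page.

    CDP takes paper dimensions in inches; A5 is 148 mm x 210 mm.
    """
    mm_per_inch = 25.4
    return {
        "paperWidth": round(148 / mm_per_inch, 2),
        "paperHeight": round(210 / mm_per_inch, 2),
        "landscape": landscape,
        "printBackground": True,
    }


def save_cdp_pdf(result, path):
    """Decode the base64 'data' field of a Page.printToPDF result to a file."""
    pdf_bytes = base64.b64decode(result["data"])
    with open(path, "wb") as f:
        f.write(pdf_bytes)
    return len(pdf_bytes)
```

With a live driver this would be used as `save_cdp_pdf(driver.execute_cdp_cmd("Page.printToPDF", a5_print_params()), "test.pdf")`. Because no print dialog is involved, the output path and file name are entirely under the script's control.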

2 Comments

The only answer that worked flawlessly, and it lets you choose the file name.
This was exactly what I was looking for, thanks!
0

In scenarios where a website restricts PDF generation to system-level printing and PDF-related APIs are either inaccessible or non-functional, the only effective means of saving the document is by triggering the system print function. It should be noted that this workaround is only feasible in non-headless mode, as neither CDP print nor page.pdf methods are applicable in such cases.

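import json

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

# Note: self.download_path and user_agent below are defined elsewhere in the answerer's code
chrome_options = Options()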
print_settings = {
    "recentDestinations": [{"id": "Save as PDF", "origin": "local", "account": ""}],
    "selectedDestinationId": "Save as PDF",
    "version": 2,
    "isHeaderFooterEnabled": False,
    "isLandscapeEnabled": False
}
prefs = {
    "download.prompt_for_download": False,
    "download.default_directory": self.download_path,
    "savefile.default_directory": self.download_path,
    'printing.print_preview_sticky_settings.appState': json.dumps(print_settings),
    "credentials_enable_service": False,
    "profile.password_manager_enabled": False
}
chrome_options.add_experimental_option("prefs", prefs)
chrome_options.add_argument("--kiosk-printing")
# chrome_options.add_argument("--headless=new")
chrome_options.add_argument("--avoid-stats=true")
chrome_options.add_argument(f'user-agent={user_agent}')
chrome_options.add_argument("--disable-blink-features=AutomationControlled") 
chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
chrome_options.add_experimental_option('useAutomationExtension', False)
chrome_options.add_argument("--disable-gpu")

service = Service()
driver = webdriver.Chrome(service=service, options=chrome_options)

-6

I would suggest downloading the page source HTML, which can be done like so in VB.NET:

Dim Html As String = webdriver.PageSource

I'm not sure how it is done in Python, but I'm sure it's very similar. Once you have done that, you can select the parts of the page you want to save using an HTML parser or by parsing it manually with string-parsing code. Once the HTML for the part you want to save is stored in a string, use an HTML-to-PDF converter library or program. There are lots of these for programming languages like C# and VB.NET. I don't know of any for Python, but I'm sure some exist; just do some research. (Some are free and some are expensive.)
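
In Selenium's Python bindings the page source is available as `driver.page_source`. A rough sketch of the "select the parts you want" step using only the standard library's `html.parser`; `TextById` is a hypothetical class and the sample HTML is made up for illustration. The HTML-to-PDF conversion step is not shown:

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TextById(HTMLParser):
    """Collect the text inside the first element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # > 0 while inside the target element
        self.done = False   # set once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


# With a live driver, `html` would be driver.page_source instead.
html = ('<html><body><div id="main"><p>Hello<br>world</p></div>'
        '<p>footer</p></body></html>')
parser = TextById("main")
parser.feed(html)
print(" ".join(parser.chunks))
```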

1 Comment

I've been using the converter approach and it is not great. The most common converter, wkhtmltopdf, lives in the 13th century, so either you put your medieval armour, forget all about flex and grid and go back to <table> layouting or you'll get zilch. Alternatives are even worse. Speaking of the 13th century, vb.net?!? In general, I don't hold a candle for two types of SO responses: 1) "Here's something I threw together and never actually tried. Good luck!", and 2) "Why would you want to do that?". Yours is type 1. Not as bad as type 2, but still a time waster.
