How to save an opened page as PDF in Selenium (Python)

Question

I have tried every solution I could find online for printing a page that is open in Selenium (Python) to a PDF. The print pop-up shows up, but it goes away after a second or two and no PDF is saved.

Here is the code I am trying, based on https://stackoverflow.com/a/43752129/3973491.

I am coding on a Mac running Mojave 10.14.5.

from selenium import webdriver
from selenium.webdriver.support.select import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.options import Options
from selenium.common.exceptions import WebDriverException
import time
import json

options = Options()
appState = {
    "recentDestinations": [
        {
            "id": "Save as PDF",
            "origin": "local"
        }
    ],
    "selectedDestinationId": "Save as PDF",
    "version": 2
}

profile = {'printing.print_preview_sticky_settings.appState': json.dumps(appState)}
# profile = {'printing.print_preview_sticky_settings.appState':json.dumps(appState),'savefile.default_directory':downloadPath}
options.add_experimental_option('prefs', profile)
options.add_argument('--kiosk-printing')
CHROMEDRIVER_PATH = '/usr/local/bin/chromedriver'

driver = webdriver.Chrome(options=options, executable_path=CHROMEDRIVER_PATH)
driver.implicitly_wait(5)
driver.get(url)  # url: address of the page to print (not shown in the question)
driver.execute_script('window.print();')

$chromedriver --v
ChromeDriver 75.0.3770.90 (a6dcaf7e3ec6f70a194cc25e8149475c6590e025-refs/branch-heads/3770@{#1003})

Any hints or solutions for printing the open HTML page to a PDF would be appreciated. I have spent hours trying to make this work. Thank you!

Update on 2019-07-11:

My question has been flagged as a duplicate, but (a) the other question appears to use JavaScript code, and (b) its answer does not solve the problem raised here, possibly because of more recent software versions. I am using Chrome Version 75.0.3770.100 (Official Build) (64-bit) and ChromeDriver 75.0.3770.90 on macOS Mojave. The script runs on Python 3.7.3.

Update on 2019-07-11:

I changed the code to:

from selenium import webdriver
import json

chrome_options = webdriver.ChromeOptions()
settings = {
    "appState": {
        "recentDestinations": [{
            "id": "Save as PDF",
            "origin": "local",
            "account": "",
        }],
        "selectedDestinationId": "Save as PDF",
        "version": 2
    }
}
prefs = {'printing.print_preview_sticky_settings': json.dumps(settings)}
chrome_options.add_experimental_option('prefs', prefs)
chrome_options.add_argument('--kiosk-printing')
CHROMEDRIVER_PATH = '/usr/local/bin/chromedriver'
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=CHROMEDRIVER_PATH)
driver.get("https://google.com")
driver.execute_script('window.print();')
driver.quit()

And now nothing happens. Chrome launches and loads the URL, and the print dialog appears, but then nothing seems to happen: there is nothing in the default printer queue and no PDF either. I even searched for PDF files under "Recent Files" on the Mac.

"No PDF saved": where did you check? It should be saved in your user Downloads folder.
– Kamal, Commented Jul 10, 2019 at 7:01
@Kamal - I tried this again and noticed that Chrome was sending an actual printout to my default printer. I was not at the printer's location, so I did not notice what had happened. I have since cleared the print queue that built up from the many times I tried printing to PDF when it appeared that nothing happened. So I suspect that the "Save as PDF" option is not being selected, and I do not know how to select it.
– jim70, Commented Jul 10, 2019 at 11:57
Please refer to this answer. In your code you are calling webdriver.Chrome(options=options.., but the correct syntax is webdriver.Chrome(chrome_options=options... Also, printing somehow works faster with webdriver.ChromeOptions than with webdriver.chrome.options.Options, so I suggest you try that.
– Kamal, Commented Jul 11, 2019 at 1:22
Possible duplicate of Set Selenium ChromeDriver UserPreferences to Save as PDF
– Kamal, Commented Jul 11, 2019 at 1:23
@GregW.F.R Glad it worked. I have not used this in a long time, but yes, that is the way to instantiate a Chrome driver instance.
– jim70, Commented Jul 19, 2021 at 20:33

Kamal · Accepted Answer · 2019-07-18 08:22:32Z

27

The answer here worked when I had no other printer set up in my OS. But when I had another default printer, it did not work.

I don't understand why, but the following small change seems to work.

from selenium import webdriver
import json

chrome_options = webdriver.ChromeOptions()
# Print-preview state that preselects the "Save as PDF" destination.
settings = {
    "recentDestinations": [{
        "id": "Save as PDF",
        "origin": "local",
        "account": "",
    }],
    "selectedDestinationId": "Save as PDF",
    "version": 2
}
# The pref key ends in ".appState"; its value is the JSON-encoded settings.
prefs = {'printing.print_preview_sticky_settings.appState': json.dumps(settings)}
chrome_options.add_experimental_option('prefs', prefs)
# --kiosk-printing confirms the print preview automatically.
chrome_options.add_argument('--kiosk-printing')
CHROMEDRIVER_PATH = '/usr/local/bin/chromedriver'
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=CHROMEDRIVER_PATH)
driver.get("https://google.com")
driver.execute_script('window.print();')
driver.quit()
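
Newer Selenium 4 releases no longer accept the chrome_options= and executable_path= keyword arguments, so the same preferences can be passed as sketched below. This is a minimal sketch, not a recipe verified on every platform: the helper names build_pdf_prefs and make_pdf_driver are hypothetical, the optional savefile.default_directory preference is borrowed from another answer on this page, and the code assumes Selenium can locate a chromedriver on its own.

```python
import json


def build_pdf_prefs(save_dir=None):
    """Return Chrome prefs that preselect "Save as PDF" in print preview."""
    app_state = {
        "recentDestinations": [
            {"id": "Save as PDF", "origin": "local", "account": ""}
        ],
        "selectedDestinationId": "Save as PDF",
        "version": 2,
    }
    prefs = {
        # As in the answers on this page, the value is passed as a JSON string.
        "printing.print_preview_sticky_settings.appState": json.dumps(app_state)
    }
    if save_dir is not None:
        # Assumption: this pref controls where the PDF is written.
        prefs["savefile.default_directory"] = str(save_dir)
    return prefs


def make_pdf_driver(save_dir=None):
    """Start Chrome with kiosk printing (hypothetical helper, Selenium 4 style)."""
    # Imported here so build_pdf_prefs can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", build_pdf_prefs(save_dir))
    options.add_argument("--kiosk-printing")
    return webdriver.Chrome(service=Service(), options=options)
```

Usage would be driver = make_pdf_driver(), then driver.get(...) and driver.execute_script('window.print();') exactly as in the answer.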

answered Jul 18, 2019 at 8:22

Kamal

5 Comments

jim70 Over a year ago

Thank you @Kamal. This approach does work, but it printed to the last-used printer. After some searching, I wonder whether installing cups-pdf as a printer and making it the last-used printer would give the desired outcome: print-to-PDF using Python.

Kamal Over a year ago

Sorry I couldn't test my solution on Linux, it worked on Windows 10 for me.

jim70 Over a year ago

got it. Will work on this some more and see if I can come up with something.

Rob Hall Over a year ago

Worked on Linux for me. Would be nice if we could control the download location, however.

iMath Over a year ago

@RobHall The solution stackoverflow.com/a/60548793/1485853

Gaurav Toshniwal · Answer · 2020-11-25 13:03:08Z

10

You can use the following code to print PDFs in A5 size with CSS backgrounds enabled:

import os
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import json
import time

chrome_options = webdriver.ChromeOptions()

settings = {
    "recentDestinations": [{
        "id": "Save as PDF",
        "origin": "local",
        "account": ""
    }],
    "selectedDestinationId": "Save as PDF",
    "version": 2,
    "isHeaderFooterEnabled": False,
    "mediaSize": {
        "height_microns": 210000,
        "name": "ISO_A5",
        "width_microns": 148000,
        "custom_display_name": "A5"
    },
    "customMargins": {},
    "marginsType": 2,
    "scaling": 175,
    "scalingType": 3,
    "scalingTypePdf": 3,
    "isCssBackgroundEnabled": True
}

mobile_emulation = { "deviceName": "Nexus 5" }
chrome_options.add_experimental_option("mobileEmulation", mobile_emulation)
chrome_options.add_argument('--enable-print-browser')
#chrome_options.add_argument('--headless')

prefs = {
    'printing.print_preview_sticky_settings.appState': json.dumps(settings),
    'savefile.default_directory': '<path>'
}
chrome_options.add_argument('--kiosk-printing')
chrome_options.add_experimental_option('prefs', prefs)

for dirpath, dirnames, filenames in os.walk('<source path>'):
    for fileName in filenames:
        print(fileName)
        driver = webdriver.Chrome("./chromedriver", options=chrome_options)
        driver.get(f'file://{os.path.join(dirpath, fileName)}')
        time.sleep(7)
        driver.execute_script('window.print();')
        driver.close()

answered Nov 25, 2020 at 13:03

Gaurav Toshniwal

2 Comments

Mark Tielemans Over a year ago

This solution worked great for me. savefile.default_directory takes both forward and backslash paths (on Windows 10). However, this fails more often than it succeeds for me because the browser closes before the file is fully written. This can be solved by adding sleep(5) before driver.close() or some more intelligent structure.

user3691763 Over a year ago

It seems like headless is commented out, and with headless on it doesn't work. Any idea how to make it work in a headless browser?
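
For the timing problem raised in the first comment above, one option is to poll the save directory instead of sleeping for a fixed time. This is a sketch under assumptions: wait_for_pdf is a hypothetical helper, it treats a .pdf file whose size is unchanged between two polls as finished, and it assumes the directory passed in is the one Chrome saves to (for example, the savefile.default_directory value).

```python
import os
import time


def wait_for_pdf(directory, known=(), timeout=30.0, poll=0.5):
    """Return the path of a new, fully written PDF in `directory`.

    `known` holds the file names that were present before printing started.
    Raises TimeoutError if no stable PDF shows up within `timeout` seconds.
    """
    known = set(known)
    deadline = time.monotonic() + timeout
    last_sizes = {}
    while time.monotonic() < deadline:
        for name in os.listdir(directory):
            if name in known or not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(directory, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished between listdir and getsize
            # Same non-zero size on two consecutive polls: assume writing is done.
            if size > 0 and last_sizes.get(name) == size:
                return path
            last_sizes[name] = size
        time.sleep(poll)
    raise TimeoutError(f"no finished PDF appeared in {directory!r}")
```

In the loop from the answer, this would mean recording before = set(os.listdir(out_dir)) ahead of window.print() and calling wait_for_pdf(out_dir, known=before) before driver.close().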

Basj · Answer · 2020-04-03 13:28:33Z

6

Here is the solution I use on Windows:

First, download ChromeDriver from http://chromedriver.chromium.org/downloads and install Selenium.

Then run this code (based on the accepted answer, slightly modified to work on Windows):

import json
from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
settings = {"recentDestinations": [{"id": "Save as PDF", "origin": "local", "account": ""}], "selectedDestinationId": "Save as PDF", "version": 2}
prefs = {'printing.print_preview_sticky_settings.appState': json.dumps(settings)}
chrome_options.add_experimental_option('prefs', prefs)
chrome_options.add_argument('--kiosk-printing')
browser = webdriver.Chrome(r"chromedriver.exe", options=chrome_options)
browser.get("https://google.com/")
browser.execute_script('window.print();')
browser.close()

answered Apr 3, 2020 at 13:28

Basj

4 Comments

Rob Hall Over a year ago

This is such a minimal revision ("Per the selenium documentation, specify the windows driver locations (e.g., chromedriver.exe) rather than the linux driver locations when running on windows") that it should simply be a comment on the accepted answer. Furthermore, it appears that you simply minified the accepted answer to make the code look different.

Basj Over a year ago

@RobHall Comments are sometimes cleared after years; also sometimes it's hard to extract information from multiple comments, thus this answer. I properly cited the source ("based on the accepted answer"); the devil is really in the details, I spent a lot of time trying and failing before it finally worked, so my goal was really to put a ready-to-use code for Windows as an answer.

Raspberry Lemon Over a year ago

I tried searching for the saved file but can't find it anywhere. Any idea where the file goes after being saved as PDF?

UserBlanko Over a year ago

The saved file would be in Downloads. Does anyone know if I can add a delay so the page loads properly, or if I can change the default download location?
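
On the two questions in the last comment: the save location can be set with the savefile.default_directory preference used in another answer on this page, and a load delay can be expressed as a wait on a page condition rather than a fixed sleep. The sketch below polls a JavaScript condition through execute_script. wait_for_js is a hypothetical helper, and the condition string is whatever signals "ready" on your page. Selenium's own WebDriverWait offers the same pattern; the plain loop is used here only to keep the sketch dependency-free.

```python
import time


def wait_for_js(driver, script, timeout=15.0, poll=0.25):
    """Poll `script` in the page until it returns a truthy value.

    `driver` is any Selenium WebDriver; only execute_script is used.
    Returns the truthy value, or raises TimeoutError after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = driver.execute_script(script)
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met after {timeout} s: {script}")
        time.sleep(poll)
```

For example, wait_for_js(browser, "return document.querySelector('#content') !== null") would go between browser.get(...) and the window.print() call, where '#content' stands in for an element that exists once your page has rendered.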

Alex · Answer · 2019-07-18 14:24:52Z

3

This solution is not very good, but you can take a screenshot and convert it to PDF with Pillow:

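from selenium import webdriver
from selenium.webdriver.common.by import By
from io import BytesIO
from PIL import Image

driver = webdriver.Chrome(executable_path='path to your driver')
driver.get('your url here')
# Screenshot only the <body> element and keep the PNG bytes in memory.
png = driver.find_element(By.TAG_NAME, 'body').screenshot_as_png
# Convert to RGB first: Pillow's PDF writer may reject images with an alpha channel.
img = Image.open(BytesIO(png)).convert('RGB')
img.save('filename.pdf', "PDF", quality=100)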

edited Jul 18, 2019 at 14:24

answered Jul 18, 2019 at 14:15

Alex

9 Comments

jim70 Over a year ago

Thank you for your answer. The issue with this approach is that it does not work for multi-page webpages. Only a portion of information is captured. But it is a good solution for short pages and does not entail popups.

Alex Over a year ago

What do you mean by multi-page webpages?

jim70 Over a year ago

Meaning web pages that need scrolling to be seen in full and that, when printed as PDF, span 3-4 sheets of paper.

Alex Over a year ago

You can use this code: stackoverflow.com/a/57608276/10661593, and save as PDF at the end. P.S. I didn't quite understand, sorry. Do you want to fit the entire page on one sheet when printing, or something else?

jim70 Over a year ago

What I ideally want is to print a page as a PDF. On a Mac, the PDF generated that way can run to many pages, assuming it is created for Letter or A4 printing. Shrinking the page a lot and taking a screenshot does not serve the purpose. That said, I now understand that Selenium does not control the browser's dialog boxes and hence cannot print a page as PDF. Apparently puppeteer, or pyppeteer in Python, can do that, but I do not know how to use that software yet. The link you shared seems to be about screenshots, not PDFs...


L Tyrone · Answer · 2024-03-30 03:38:11Z

3

You can try to use the selenium-print package.

It uses Selenium's execute_cdp_cmd function behind the scenes, which is fairly easy to call directly, as in the code below. The parameters can be found here.

import base64
import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

options = webdriver.ChromeOptions()
service = Service()
driver = webdriver.Chrome(service=service, options=options)
driver.get('http://localhost:3000')
time.sleep(2)  # give the page time to render
# Page.printToPDF returns the PDF as a base64 string under the "data" key.
pdf = driver.execute_cdp_cmd("Page.printToPDF", {"printBackground": True})
pdf_data = base64.b64decode(pdf["data"])
with open("test.pdf", "wb") as f:
    f.write(pdf_data)
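
The same call can be wrapped in a small helper that also sets page orientation and paper size. This is a sketch under assumptions: save_page_as_pdf is a hypothetical name, the A4 defaults are an illustrative choice, and only the printBackground, landscape, paperWidth and paperHeight parameters of Page.printToPDF are used. Because the command does not go through the print dialog, it is often paired with headless Chrome, but that is worth verifying on your own Chrome version.

```python
import base64


def save_page_as_pdf(driver, path, landscape=False,
                     paper_width_in=8.27, paper_height_in=11.69):
    """Write the current page to `path` via the DevTools Page.printToPDF command.

    `driver` must be a Chromium-based Selenium driver exposing execute_cdp_cmd.
    Paper size is in inches; the defaults here are A4 (an illustrative choice).
    Returns the number of bytes written.
    """
    params = {
        "printBackground": True,
        "landscape": landscape,
        "paperWidth": paper_width_in,
        "paperHeight": paper_height_in,
    }
    result = driver.execute_cdp_cmd("Page.printToPDF", params)
    pdf_bytes = base64.b64decode(result["data"])
    with open(path, "wb") as f:
        f.write(pdf_bytes)
    return len(pdf_bytes)
```

With the driver from the answer, save_page_as_pdf(driver, "test.pdf", landscape=True) would replace the last four lines. The file name and location are chosen by the script, not by the browser.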

edited Mar 30, 2024 at 3:38

L Tyrone

answered Mar 29, 2024 at 14:15

michellewin

2 Comments

Marcos Lima Mar 17 at 19:17

The only answer that worked flawlessly, and it lets you choose the file name.

Samueljh1 Apr 19 at 7:58

This was exactly what I was looking for, thanks!

André · Answer · 2025-05-27 09:15:56Z

In scenarios where a website restricts PDF generation to system-level printing and PDF-related APIs are either inaccessible or non-functional, the only effective means of saving the document is by triggering the system print function. It should be noted that this workaround is only feasible in non-headless mode, as neither CDP print nor page.pdf methods are applicable in such cases.

import json
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

# Fragment from inside a class method: self.download_path and user_agent
# are defined elsewhere in the surrounding code.
chrome_options = Options()
print_settings = {
    "recentDestinations": [{"id": "Save as PDF", "origin": "local", "account": ""}],
    "selectedDestinationId": "Save as PDF",
    "version": 2,
    "isHeaderFooterEnabled": False,
    "isLandscapeEnabled": False
}
prefs = {
    "download.prompt_for_download": False,
    "download.default_directory": self.download_path,
    "savefile.default_directory": self.download_path,
    'printing.print_preview_sticky_settings.appState': json.dumps(print_settings),
    "credentials_enable_service": False,
    "profile.password_manager_enabled": False
}
chrome_options.add_experimental_option("prefs", prefs)
chrome_options.add_argument("--kiosk-printing")
# Headless stays disabled: this workaround needs the real print preview.
# chrome_options.add_argument("--headless=new")
chrome_options.add_argument("--avoid-stats=true")
chrome_options.add_argument(f'user-agent={user_agent}')
chrome_options.add_argument("--disable-blink-features=AutomationControlled")
chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
chrome_options.add_experimental_option('useAutomationExtension', False)
chrome_options.add_argument("--disable-gpu")

service = Service()
driver = webdriver.Chrome(service=service, options=chrome_options)

Benjamin Loison · Answer · 2024-07-08 21:39:57Z

-6

I would suggest downloading the page source HTML, which can be done like so in VB.NET:

Dim Html As String = webdriver.PageSource

I'm not sure how it is done in Python, but I'm sure it's very similar. Once you have done that, you can select the parts of the page you want to save, using an HTML parser or by parsing it manually with string-handling code. Once the HTML for the part you want to save is stored in a string, use an HTML-to-PDF converter library or program. There are lots of these for languages like C# and VB.NET. I don't know of any for Python, but I'm sure some exist; just do some research (some are free and some are expensive).
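
In Selenium's Python bindings, the counterpart of the VB.NET line is the driver.page_source property. A minimal sketch of that first step follows. save_page_source is a hypothetical helper, and the HTML-to-PDF conversion itself is left out because the answer does not name a specific library.

```python
def save_page_source(driver, path, encoding="utf-8"):
    """Write the current page's HTML (driver.page_source) to `path`.

    Returns the HTML string so it can be parsed or trimmed before conversion.
    """
    html = driver.page_source  # Python counterpart of webdriver.PageSource
    with open(path, "w", encoding=encoding) as f:
        f.write(html)
    return html
```

The returned string is the HTML as currently held by the browser, so any selection of sub-sections would happen on it before handing the result to a converter.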

edited Jul 8, 2024 at 21:39

Benjamin Loison

answered Apr 29, 2021 at 5:10

Jacob Leeson

1 Comment

Ricardo Over a year ago

I've been using the converter approach and it is not great. The most common converter, wkhtmltopdf, lives in the 13th century, so either you put your medieval armour, forget all about flex and grid and go back to <table> layouting or you'll get zilch. Alternatives are even worse. Speaking of the 13th century, vb.net?!? In general, I don't hold a candle for two types of SO responses: 1) "Here's something I threw together and never actually tried. Good luck!", and 2) "Why would you want to do that?". Yours is type 1. Not as bad as type 2, but still a time waster.
