218

I want to use PhantomJS in Python. I searched for this problem but couldn't find a proper solution.

os.popen() looked like a good choice, but I couldn't pass arguments to it.

Using subprocess.Popen() may be a workable solution for now. I want to know whether there's a better one.

Is there a way to use PhantomJS in Python?

2
  • My answer below tells you how to do it. Looking at your question, that's exactly what Selenium does: a subprocess.Popen call, but with some extended features to make the API seamless. Commented Mar 20, 2015 at 17:29
  • @flyer: You should probably consider changing the accepted answer, see below. Thank you. Commented Dec 24, 2015 at 9:27

8 Answers

390

The easiest way to use PhantomJS in Python is via Selenium. The simplest installation method is:

  1. Install NodeJS.
  2. Use Node's package manager to install PhantomJS: npm -g install phantomjs-prebuilt
  3. Install selenium (in your virtualenv, if you are using one).

After installation, using PhantomJS is as simple as:

from selenium import webdriver

driver = webdriver.PhantomJS() # or add to your PATH
driver.set_window_size(1024, 768) # optional
driver.get('https://google.com/')
driver.save_screenshot('screen.png') # save a screenshot to disk
sbtn = driver.find_element_by_css_selector('button.gbqfba')
sbtn.click()

If your system path environment variable isn't set correctly, you'll need to specify the exact path as an argument to webdriver.PhantomJS(). Replace this:

driver = webdriver.PhantomJS() # or add to your PATH

... with the following:

driver = webdriver.PhantomJS(executable_path='/usr/local/lib/node_modules/phantomjs/lib/phantom/bin/phantomjs')
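
A minimal sketch of locating the binary before constructing the driver, so that the same script works whether or not phantomjs is on the PATH. The helper names resolve_phantomjs and make_driver are hypothetical, and the sketch assumes a Selenium release that still provides webdriver.PhantomJS (it was deprecated in the 3.x series and is absent from Selenium 4).

```python
import shutil


def resolve_phantomjs(candidates=("phantomjs",), fallback=None):
    """Return the first candidate found on PATH, or `fallback` if none is."""
    for name in candidates:
        path = shutil.which(name)  # None when the name is not on PATH
        if path:
            return path
    return fallback


def make_driver(fallback=None):
    """Build a PhantomJS driver, passing an explicit executable_path."""
    # Imported lazily so the lookup helper works without Selenium installed.
    from selenium import webdriver

    path = resolve_phantomjs(fallback=fallback)
    if path is None:
        raise RuntimeError("phantomjs not found on PATH and no fallback given")
    return webdriver.PhantomJS(executable_path=path)
```

Calling make_driver(fallback='/usr/local/lib/node_modules/phantomjs/lib/phantom/bin/phantomjs') would then use the PATH entry when there is one and the npm location otherwise.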


26 Comments

This worked beautifully, and probably saved me days. Thank you. If one wants the whole rendered page back as source, it's driver.page_source.
This does work beautifully, and I'm pleasantly surprised because phantomjs.org/faq.html says "not a Node.js module" --yet the npm wrapper at npmjs.org/package/phantomjs makes it behave for this purpose. In my case I wanted to do this: bodyStr= driver.find_element_by_tag_name("body").get_attribute("innerHTML") and ...it worked!
I agree that ghost has crazy dependencies, and I actually failed to get it up and running even after installing millions of X11 related libraries. Ghost is a horror story.
@phabtar You need to pass the path to phantomjs as the first argument to PhantomJS ... or fix your windows syspath to be able to see phantomjs.
Under Windows, I did not have to install phantomJS via node and npm. Downloading the binary from phantomjs.org/download.html and putting the phantomjs.exe into a location in my PATH (e.g. c:\Windows\System32) or vice versa (putting it anywhere and adding the folder to PATH) was enough to make it work in Python.
86

PhantomJS recently dropped Python support altogether. However, PhantomJS now embeds Ghost Driver.

A new project has since stepped up to fill the void: ghost.py. You probably want to use that instead:

from ghost import Ghost
ghost = Ghost()

# open pages through the session returned by ghost.start()
with ghost.start() as session:
    page, extra_resources = session.open("http://jeanphi.me")
    assert page.http_status == 200 and 'jeanphix' in session.content

11 Comments

Even though support is dropped, I found that installing npm (node package manager), using it to install the latest phantomjs (with webdriver support), and installing selenium in Python is way easier than trying to get PyQT or PySide to work properly. What's nice about phantom is that it is truly headless and requires no UI/X11-related libs to work.
I added an answer below explaining my preferred solution after trying to use ghost.py and hating my life
Pykler's "hating my life" isn't an understatement. If someone would change the "correct answer" for this question to Pykler's I would have saved a day's effort.
@YPCrumble: unfortunately, only the OP can do that; change the accepted answer.
After trying a bunch of different approaches this morning, @Pykler solution ended up working the smoothest.
41

Now that GhostDriver comes bundled with PhantomJS, it has become even more convenient to use it through Selenium.

I tried the Node installation of PhantomJS, as suggested by Pykler, but in practice I found it to be slower than the standalone installation of PhantomJS. I guess the standalone installation didn't provide these features earlier, but as of v1.9 it does.

  1. Install PhantomJS (http://phantomjs.org/download.html). If you are on Linux, the following instructions will help: https://stackoverflow.com/a/14267295/382630
  2. Install Selenium using pip.

Now you can use it like this:

import selenium.webdriver
driver = selenium.webdriver.PhantomJS()
driver.get('http://google.com')
# do some processing

driver.quit()
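
If the processing step raises, driver.quit() is never reached and the phantomjs process can be left running. A minimal sketch of one way to guard against that is a small context manager; the name phantom_session and its factory parameter are hypothetical, and the default assumes a Selenium version that still ships webdriver.PhantomJS.

```python
from contextlib import contextmanager


@contextmanager
def phantom_session(factory=None):
    """Yield a driver and always call quit() on exit, even after an error."""
    if factory is None:
        # Imported lazily; assumes a Selenium version that has PhantomJS.
        from selenium import webdriver
        factory = webdriver.PhantomJS
    driver = factory()
    try:
        yield driver
    finally:
        driver.quit()  # ends the phantomjs process started by the driver
```

With this in place, the example above becomes a `with phantom_session() as driver:` block containing the driver.get(...) call and the processing, with no explicit quit().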

2 Comments

special thanks for pointing to SO answer concerning PhantomJS installation on Ubuntu, it helped me.
a quick way to install Selenium I just learned is, on Windows, type: C:\Python34\Scripts\pip.exe install Selenium.
9

Here's how I test javascript using PhantomJS and Django:

mobile/test_no_js_errors.js:

var page = require('webpage').create(),
    system = require('system'),
    url = system.args[1],
    status_code;

page.onError = function (msg, trace) {
    console.log(msg);
    trace.forEach(function(item) {
        console.log('  ', item.file, ':', item.line);
    });
};

page.onResourceReceived = function(resource) {
    if (resource.url == url) {
        status_code = resource.status;
    }
};

page.open(url, function (status) {
    if (status == "fail" || status_code != 200) {
        console.log("Error: " + status_code + " for url: " + url);
        phantom.exit(1);
    }
    phantom.exit(0);
});

mobile/tests.py:

import subprocess
from django.test import LiveServerTestCase

class MobileTest(LiveServerTestCase):
    def test_mobile_js(self):
        args = ["phantomjs", "mobile/test_no_js_errors.js", self.live_server_url]
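        # universal_newlines=True returns text, so the comparison also holds on Python 3
        result = subprocess.check_output(args, universal_newlines=True)
        self.assertEqual(result, "")  # No output means no error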

Run tests:

manage.py test mobile
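
When the script exits with a non-zero status, subprocess.check_output raises CalledProcessError and the test fails with a generic error. A minimal sketch of a wrapper that returns both the exit code and the printed messages, so the test can report them, might look like this. The helper name run_phantom_script is hypothetical, and subprocess.run assumes Python 3.5 or later.

```python
import subprocess


def run_phantom_script(cmd, timeout=60):
    """Run cmd (a list of arguments) and return (exit_code, combined_output)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout
        universal_newlines=True,   # return text rather than bytes
        timeout=timeout,           # raises subprocess.TimeoutExpired if exceeded
    )
    return proc.returncode, proc.stdout
```

Inside test_mobile_js this could be used as `code, output = run_phantom_script(args)` followed by `self.assertEqual(code, 0, output)`, which shows the JavaScript error messages in the failure report.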

3 Comments

Thanks. I used subprocess.Popen to call the phantomjs script and it worked :)
You do see how this is limited right? All you are doing is making a shell call to execute phantomjs - you are not actually using a "proper" interface through which you may properly handle exceptions, blocking, etc.
@kamelkev: I see how this is limited. The upside is that this method allows me to use Django's bootstrapping features to set up a test database with the correct content for each test. And yes, it could be combined with the other answers to get the best of both worlds.
6

The answer by @Pykler is great but the Node requirement is outdated. The comments in that answer suggest the simpler answer, which I've put here to save others time:

  1. Install PhantomJS

    As @Vivin-Paliath points out, it's a standalone project, not part of Node.

    Mac:

    brew install phantomjs
    

    Ubuntu:

    sudo apt-get install phantomjs
    

    etc

  2. Set up a virtualenv (if you haven't already):

    virtualenv mypy  # doesn't have to be "mypy". Can be anything.
    . mypy/bin/activate
    

    If your machine has both Python 2 and 3, you may need to run virtualenv-3.6 mypy or similar.

  3. Install selenium:

    pip install selenium
    
  4. Try a simple test, like this borrowed from the docs:

    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys
    
    driver = webdriver.PhantomJS()
    driver.get("http://www.python.org")
    assert "Python" in driver.title
    elem = driver.find_element_by_name("q")
    elem.clear()
    elem.send_keys("pycon")
    elem.send_keys(Keys.RETURN)
    assert "No results found." not in driver.page_source
    driver.close()
    

2 Comments

How to install PhantomJS on windows ? It doesn't seem to work using pip command.
Pip is a python package installer, so it works with selenium, which is available as a python package. PhantomJS is not a python package so won't work with pip. I did a quick google for "PhantomJS install windows" and there are good hits.
5

This is what I do, with Python 3.3. I was processing huge lists of sites, so failing on the timeout was vital for the job to run through the entire list.

import subprocess

# placeholder: replace with the path to your js file for phantom
js_file = "your_script.js"
command = ["phantomjs", "--ignore-ssl-errors=true", js_file]
process = subprocess.Popen(command, stdout=subprocess.PIPE)

# make sure phantomjs has time to download/process the page
# but if we get nothing after 30 sec, just move on
try:
    output, errors = process.communicate(timeout=30)
except subprocess.TimeoutExpired as e:
    print("\t\tException: %s" % e)
    process.kill()
    # collect whatever was written before the kill, so output is always defined
    output, errors = process.communicate()

# output is bytes; decode to utf-8 to save heartache
phantom_output = ''
for out_line in output.splitlines():
    phantom_output += out_line.decode('utf-8')
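
For a long list of sites, the same pattern can be wrapped in a function so that one timeout does not stop the whole run. This is only a sketch under assumptions: the names run_with_timeout and run_all are hypothetical, and each command is expected to be an argument list such as ["phantomjs", "--ignore-ssl-errors=true", "your_script.js", url].

```python
import subprocess


def run_with_timeout(cmd, timeout):
    """Return decoded stdout of cmd, or None if it exceeds `timeout` seconds."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        output, _ = process.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        process.kill()
        process.communicate()  # reap the killed process and close the pipe
        return None
    return output.decode("utf-8")


def run_all(commands, timeout=30):
    """Run each command in turn; a timeout yields None instead of aborting."""
    return [run_with_timeout(cmd, timeout) for cmd in commands]
```

The returned list lines up with the input list, so entries that are None mark the sites that timed out and can be retried or logged later.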

1 Comment

Thanks, I was able to alter it to taste for my purpose.
5

If using Anaconda, install with:

conda install PhantomJS

in your script:

from selenium import webdriver
driver=webdriver.PhantomJS()

works perfectly.

2 Comments

As of now, default channels don't contain PhantomJS for linux64
damn, i love conda <3 that was so easy. i'm on osx.
2

In case you are using Buildout, you can easily automate the installation processes that Pykler describes using the gp.recipe.node recipe.

[nodejs]
recipe = gp.recipe.node
version = 0.10.32
npms = phantomjs
scripts = phantomjs

That part installs node.js as a binary (at least on my system) and then uses npm to install PhantomJS. Finally, it creates an entry point bin/phantomjs, which you can pass to the PhantomJS webdriver. (To install Selenium, you need to specify it in your egg requirements or in the Buildout configuration.)

from selenium import webdriver

driver = webdriver.PhantomJS('bin/phantomjs')

1 Comment

Another way to automate the installation process with Buildout is to use gp.recipe.phantomjs, which configures phantomjs and casperjs.
