0

I have successfully installed the SpiderMonkey JS engine on my Linux machine (Ubuntu). My goal is to have it execute Ajax (JavaScript) scripts and return the result to my Python script, as part of an object-oriented web scraper. I'm having a hard time getting all of this working.

I'm now at the point where typing js in my terminal lets me start executing JavaScript. After some Googling I found this little snippet on Stack Overflow:

import urllib2
import spidermonkey
js = spidermonkey.Runtime()
js_ctx = js.new_context()
script = urllib2.urlopen('http://etherhack.co.uk/hashing/whirlpool/js/whirlpool.js').read()
js_ctx.eval_script(script)
js_ctx.eval_script('var s="abc"')
js_ctx.eval_script('print(HexWhirlpool(s))')

but it failed to run, with an error saying that the module spidermonkey cannot be found.

I'm a bit lost now. Anyone able to help?

4
  • Did you also install this: code.google.com/p/python-spidermonkey ? Commented Feb 22, 2012 at 21:55
  • Yes, I did: easy_install python-spidermonkey, but it returns an error: RuntimeError: No package configuration found for: nspr. I tried to fix that by running apt-get install libnspr-dev pkg-config, which gave this error: Package libnspr-dev is not available, but is referred to by another package. This may mean that the package is missing, has been obsoleted, or is only available from another source. E: Package 'libnspr-dev' has no installation candidate. I'm officially stuck now. Commented Feb 22, 2012 at 22:06
  • An alternative would be to use the QtWebKit + PySide bindings for Python; I've had great success with them. You'll also get a more holistic treatment of the HTML and JavaScript interactions, since everything runs 'in a real browser'. Browsers do a bit of data massaging to make sure that invalid but 'pretty-close' HTML still renders correctly, and doing this by hand is much harder. This solution is a lot heavier than what you're shooting for, but I wouldn't do it any other way at this point. Commented Feb 23, 2012 at 0:53
  • "Distrust all claims for one true way" - Unix Philosophy ;) Commented Mar 31, 2012 at 8:36

3 Answers

1

I also tried easy_install python-spidermonkey with no luck, because the libnspr-dev package is not available.

So I built the package from source. These are the instructions from the project page, with the fixes I needed on Debian Stretch:

Building

  1. Check out the Python-Spidermonkey module from the SVN repository (I downloaded it as a source archive; direct link).
  2. Unpack it and cd to ./python-spidermonkey/trunk
  3. Run CPPFLAGS="-Wno-format-security" python setup.py build (these flags are for Debian).
  4. The error jsemit.h:508:32: error: expected ‘(’ before ‘)’ token uintN decltype); means that decltype cannot be used as a variable name (it is a reserved keyword since C++11). Fix it by renaming the identifier:

    sed -e 's/decltype/dectyp/' -i.ORIG ./js/src/jsemit.h

    sed -e 's/decltype/dectyp/' -i.ORIG ./js/src/jsemit.cpp

  5. The error jsemit.cpp:6490:1: error: narrowing conversion of ‘-1’ from ‘int’ to ‘uint8 {aka unsigned char}’ inside { } [-Wnarrowing] means an illegal narrowing conversion. Recompile that file manually with -Wno-narrowing:

    cd js/src

    g++ -o Linux_All_DBG.OBJ/jsemit.o -c -Wall -Wno-narrowing -Wno-format -MMD -g3 -DXP_UNIX -DSVR4 -DSYSV -D_BSD_SOURCE -DPOSIX_SOURCE -DHAVE_LOCALTIME_R -DHAVE_VA_COPY -DVA_COPY=va_copy -DPIC -fPIC -DDEBUG -DDEBUG_user -DEDITLINE -ILinux_All_DBG.OBJ jsemit.cpp

  6. The error spidermonkey.c:1:2: error: #error Do not use this file, it is the result of a failed Pyrex compilation. points to a problem with Pyrex. There is a patch for it; apply it like this:

    wget -O - https://storage.googleapis.com/google-code-attachments/python-spidermonkey/issue-14/comment-4/cinit.patch | patch -p1 ./spidermonkey.pyx
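The rename in step 4 can also be scripted from Python, for example when it is part of a larger build script. This is only a minimal sketch that mirrors the two sed commands: the helper name rename_identifier is made up for illustration, and like sed without the g flag it changes only the first match on each line.

```python
import io
import shutil


def rename_identifier(path, old="decltype", new="dectyp", backup_suffix=".ORIG"):
    """Replace the first occurrence of `old` on each line of `path`.

    Mirrors `sed -e 's/old/new/' -i.ORIG path`: an untouched copy is kept
    next to the file with `backup_suffix` appended.
    Returns the number of lines that were changed.
    """
    shutil.copy2(path, path + backup_suffix)  # backup, like sed's -i.ORIG

    # latin-1 maps every byte to one character, so the file round-trips
    # unchanged even if it contains non-ASCII bytes; newline="" keeps the
    # original line endings.
    with io.open(path, encoding="latin-1", newline="") as handle:
        lines = handle.readlines()

    changed = 0
    patched = []
    for line in lines:
        new_line = line.replace(old, new, 1)  # first match only, as in sed
        if new_line != line:
            changed += 1
        patched.append(new_line)

    with io.open(path, "w", encoding="latin-1", newline="") as handle:
        handle.writelines(patched)
    return changed
```

Calling rename_identifier("js/src/jsemit.h") and rename_identifier("js/src/jsemit.cpp") from the trunk directory should then have the same effect as the two sed lines.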

Installation

Become root with su, then run python setup.py install.

Running

  1. By default, the setup script installs libjs.so to /usr/local/lib/, so I ran ln -s /usr/local/lib/libjs.so /usr/lib/libjs.so (but the solution from Seagal82 is the better choice).

Without this step, Python keeps failing on import with ImportError: libjs.so: cannot open shared object file: No such file or directory

  2. I also got ImportError: cannot import name Runtime after from spidermonkey import Runtime. The cause was probably leftover easy_install data in ~/.local/lib/python2.7/site-packages/spidermonkey/. After removing it, everything runs smoothly.
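To check for this kind of leftover before deleting anything, it can help to list every location on the import path that could satisfy import spidermonkey. The following is a rough sketch, not part of python-spidermonkey: find_module_copies is a hypothetical helper, and it only looks at top-level directories and .py/.pyc/.so files, so modules packed inside .egg archives are not detected.

```python
import os
import sys


def find_module_copies(name, search_paths=None):
    """List files and directories that could satisfy `import name`.

    Results are returned in search-path order, so the first hit is the
    one Python would most likely pick up.
    """
    if search_paths is None:
        search_paths = sys.path
    hits = []
    for directory in search_paths:
        directory = directory or os.getcwd()  # '' on sys.path means cwd
        if not os.path.isdir(directory):
            continue
        for entry in sorted(os.listdir(directory)):
            full = os.path.join(directory, entry)
            stem = entry.split(".", 1)[0]
            is_package = entry == name and os.path.isdir(full)
            is_module = stem == name and entry.endswith((".py", ".pyc", ".so"))
            if is_package or is_module:
                hits.append(full)
    return hits
```

If find_module_copies("spidermonkey") returns more than one path, the earlier entries shadow the later ones, which would match the stale-copy symptom described above.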

1

I recently got a task that involves web scraping, and for the JavaScript part I want to try python-spidermonkey to see whether it works for me.

I seem to have run into a similar situation: after I thought I had finished installing python-spidermonkey, I ran the script above and got this error:

Traceback (most recent call last):
  File "spidermonkeytest.py", line 2, in <module>
    import spidermonkey
ImportError: libjs.so: cannot open shared object file: No such file or directory

After some searching on Google, I found what is probably the solution at the end of this page: http://hi.baidu.com/peizhongyou/item/ec1575c3f0e00e31e80f2e48

I set up the following:

$ sudo vi /etc/ld.so.conf.d/libjs.so.conf

Add this line:

/usr/local/lib/

Save and exit, then run ldconfig:

$ sudo ldconfig

After that I could run the script provided above by @Synbitz Prowduczions. I don't know if this is the answer you need, but I hope it helps.
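A quick way to confirm that the loader can now see libjs.so, without importing spidermonkey itself, is to ask ctypes from the standard library. This is only a sketch: check_shared_library is a hypothetical helper name, and on Linux ctypes.util.find_library relies on external tools such as ldconfig, so its answer depends on the state of the system.

```python
import ctypes
import ctypes.util


def check_shared_library(name):
    """Return (resolved_name, loadable) for a library such as "js".

    resolved_name is what ctypes.util.find_library reports for the
    library (None if it cannot be found); loadable tells whether
    ctypes.CDLL could actually open it.
    """
    resolved = ctypes.util.find_library(name)
    if resolved is None:
        return None, False
    try:
        ctypes.CDLL(resolved)
    except OSError:
        # Found by name, but the dynamic loader still refuses to open it.
        return resolved, False
    return resolved, True
```

Running check_shared_library("js") before and after sudo ldconfig should show whether adding /usr/local/lib/ to the loader configuration made a difference.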

0

Try the libnspr4 packages instead (on Ubuntu the development files are in libnspr4-dev). If that doesn't work, you can always download NSPR from Mozilla and build the code yourself.

It is not difficult to build the library yourself: after untarring the source, run ./configure && make && make install. If you build it yourself, the files will likely end up in

/usr/local/{include,lib}

You can also try Googling for "YOUR_OS_NAME install nspr4".
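The RuntimeError quoted in the question's comments ("No package configuration found for: nspr") appears to come from a pkg-config lookup, so after installing or building NSPR it is worth checking that pkg-config can see it. A minimal sketch follows; pkg_config_has is a hypothetical helper, and the /usr/local/lib/pkgconfig directory mentioned afterwards is an assumption about where a self-built NSPR would place its .pc file.

```python
import os
import subprocess


def pkg_config_has(package, extra_dirs=()):
    """Return True if `pkg-config --exists package` succeeds.

    extra_dirs are prepended to PKG_CONFIG_PATH for this call only,
    which is useful for libraries installed under /usr/local.
    """
    env = dict(os.environ)
    if extra_dirs:
        parts = list(extra_dirs)
        if env.get("PKG_CONFIG_PATH"):
            parts.append(env["PKG_CONFIG_PATH"])
        env["PKG_CONFIG_PATH"] = os.pathsep.join(parts)
    try:
        return subprocess.call(["pkg-config", "--exists", package], env=env) == 0
    except OSError:
        # pkg-config itself is not installed.
        return False
```

For example, pkg_config_has("nspr", ["/usr/local/lib/pkgconfig"]) returning True would suggest that the python-spidermonkey build can find NSPR once PKG_CONFIG_PATH includes that directory.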

  • I believe someone wrote a C/C++ header file translator for Python ctypes, although I can't say much more because I don't use Python.
  • SpiderMonkey also has its own implementation of ctypes, modeled after Python's. So technically, if you know JavaScript, you could forgo Python altogether, since all you want is to do some Ajax. You would need to brush up on NSPR or C runtime sockets to meet your project's requirements using only SpiderMonkey.

Or a web search for Python + AJAX might turn up exactly what you need.
