-1

Every day I need to open a webpage, copy the text on the page and paste it into an Excel file. Is there a way that I can automate this process using Python, without bothering to open a web browser?

Thanks to everyone who provided an answer. Would it be possible to show me an example?

Thanks.

1
  • 3
    If the question is can you do this then the answer is yes, but the point of SO isn't to get other people to do the work for you. Commented May 31, 2013 at 11:06

5 Answers

1

You could use a technique called web scraping. There is even an open-source framework written in Python called Scrapy, which is built specifically for crawling and screen scraping.

Searching Google for a phrase such as "web scraping using python" should be enough to get you started.

There is some good information in the following post: "Anyone know of a good Python based web crawler that I could use?"

1 Comment

This is direct and suitable for a newbie like me :)
1

Sure, simply use urllib2 to open your webpage, then have a look at the content with BeautifulSoup and then just stick that data into the Excel file with xlwt. Easy!
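A minimal sketch of the first two steps (fetching the page and pulling out its text), assuming Python 3, where `urllib2` became `urllib.request`. To keep it free of third-party installs, the standard library's `html.parser` stands in for BeautifulSoup here. The names `TextExtractor`, `extract_text`, and `fetch_html`, and the URL in the usage comment, are made up for illustration. Writing the result to a workbook with xlwt is not shown.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.lines.append(text)


def extract_text(html):
    """Return the non-empty text fragments of an HTML string, in page order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.lines


def fetch_html(url):
    """Download a page and decode it (UTF-8 is assumed if no charset is declared)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Usage (placeholder URL, substitute the page you actually need):
# for line in extract_text(fetch_html("https://example.com/")):
#     print(line)
```

With BeautifulSoup installed, `BeautifulSoup(html, "html.parser").get_text()` could replace the hand-written extractor.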

2 Comments

Thanks for the reply and the links, which are useful for study.
Instead of using urllib2, you could try the excellent "requests" library. It handles much of the heavy lifting for you. docs.python-requests.org/en/latest
1

Yes, you can do this.

I would suggest:

  • Read up on urllib and urllib2 for getting the page in Python.
  • Investigate lxml for parsing the content from your page.
  • Take a look at this page on Python Excel manipulation.
  • Attempt to write some code to do what you wish.
  • If you don't succeed immediately then ask for some help and provide code examples.

Good luck

1 Comment

Thanks for the details, the links, and the bullet points. Professional!
1

You can do the same in Excel itself on a small scale (importing data into Excel from the web): from the Excel Ribbon, select 'Data' > 'From Web'. If you are set on using Python, try https://datanitro.com/. DataNitro is an excellent Python-Excel integration. Here is a demo: http://scriptogr.am/richie/post/python-for-excel-using-datanitro

2 Comments

Another point of view. Thanks.
Unfortunately DataNitro isn't free, unless you're a student. It costs $99 otherwise.
0

Yes, there is. Use urllib2 to pull the HTML from the web, then parse the HTML for the values you need (with the BeautifulSoup module and regular expressions), and finally save the result as a CSV file, which can be opened in Excel.
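As a rough illustration of the parse-and-save steps, here is a sketch that uses only regular expressions and the standard `csv` module. It assumes the values you want sit in a plain HTML table, which may not be true of your page. The helper names `table_rows` and `write_csv` are hypothetical, and regexes are only a crude way to read simple, well-formed markup; a real parser such as BeautifulSoup is more robust.

```python
import csv
import re
from html import unescape

# Crude patterns for simple, well-formed tables (an assumption about the page).
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.IGNORECASE | re.DOTALL)
CELL_RE = re.compile(r"<t[dh]\b[^>]*>(.*?)</t[dh]>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def table_rows(html):
    """Return table rows as lists of cell strings, with inner tags removed."""
    rows = []
    for row_html in ROW_RE.findall(html):
        cells = []
        for cell_html in CELL_RE.findall(row_html):
            text = unescape(TAG_RE.sub("", cell_html))
            cells.append(" ".join(text.split()))  # collapse whitespace
        if cells:
            rows.append(cells)
    return rows


def write_csv(rows, path):
    """Write rows to a CSV file that Excel can open."""
    # "utf-8-sig" adds a byte-order mark, which helps Excel detect UTF-8.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        csv.writer(f).writerows(rows)


# Usage, given the page's HTML as a string named html_text:
# write_csv(table_rows(html_text), "page_data.csv")
```

The HTML string itself would come from the download step, for example `urllib.request.urlopen` on Python 3.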
