
I need to scrape some webpages and extract content from them. I'm planning to select some specific keywords and map the data that has some relationship between them, but I have no idea how to do that. Could anyone suggest some algorithms for it?

For example, I need to download some webpages about apples, map the relevant data about apples to them, and store it in a database, so that if someone needs specific information about apples, I can provide it quickly and accurately.

It would also be helpful to point out useful libraries. I'm planning to do this in Python.
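To make the "extract content from webpages" part concrete, here is a minimal standard-library sketch of pulling the visible text out of an HTML page (the class and function names are illustrative, not from any particular library):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects the visible text of a page, skipping script/style blocks."""
    def __init__(self):
        super().__init__()
        self._skip = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = False

    def handle_data(self, data):
        # Keep only non-empty text outside script/style tags
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

def page_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)

html = "<html><body><p>Apples are rich in fiber.</p></body></html>"
print(page_text(html))  # Apples are rich in fiber.
```

In practice a dedicated parser such as BeautifulSoup or lxml is more robust against malformed markup, but the idea is the same: strip the markup first, then run keyword extraction on the plain text.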

  • There is one well-known algorithm for this; I suggest searching for it on Google. – Commented May 14, 2011 at 12:33

2 Answers


Have a look at the NLTK, Pattern, or Orange modules.

As a start, "Programming Collective Intelligence: Building Smart Web 2.0 Applications" by Toby Segaran is a good book to read.
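As a rough illustration of the kind of keyword extraction these libraries support, here is a standard-library sketch (NLTK would replace the naive regex tokenizer and the hand-written stopword set with proper tokenizers and stopword corpora; the function name and stopword list here are made up for the example):

```python
import re
from collections import Counter

# A tiny hand-written stopword list; NLTK ships a much larger one.
STOPWORDS = {"the", "a", "an", "in", "of", "and", "is", "are", "to"}

def keywords(text, n=5):
    """Return the n most frequent non-stopword tokens in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)

doc = ("Apples are a popular fruit. Apple trees are cultivated "
       "worldwide, and apples are eaten raw.")
print(keywords(doc, 3))  # 'apples' will rank first with count 2
```

Raw frequency is only a starting point; stemming (so "apple" and "apples" count together) and weighting schemes like TF-IDF usually give better keywords.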




You could try algorithms based on term frequency–inverse document frequency (TF-IDF). In Java I would recommend Solr; in fact, you could run Solr and access it from Python via one of its Python client libraries.

