Categorizing records in Java

Question

I have a list of books, each of which belongs to a category.

Flying a Plane - Aviation
Painting a picture - Art
1001 Recipes - Cooking

I have a large enough sample set of data. I need to categorize my newer books using some algorithm. I know it will never be 100% accurate, but a good guess is good enough for me.

What should I use to implement something like this? Should I go with Classifier4J and its Vector Classifier?

Are there other tools that I should look at, like Weka? It would be great if someone could point me to some articles or examples to get me started.

Thanks

Have a look at this: java-text-classification-problem. You are doing almost exactly the same thing.
– 16dots, Commented Jun 7, 2012 at 16:36

Artem Oboturov · Accepted Answer · 2012-06-08 12:47:26Z

1

There's a Coursera course called Machine Learning: https://www.coursera.org/course/ml. If you treat your problem as classification, you should train N One-vs-All classifiers, where N is the number of your classes (= categories). To train a classifier, use one of the algorithms described in the Natural Language Processing class, https://www.coursera.org/course/nlp. Normally this comes down to measuring similarity to the existing classes; see http://nlp.stanford.edu/IR-book/html/htmledition/text-classification-and-naive-bayes-1.html. All of this can be done in Apache Mahout with its Bayesian classifier: https://cwiki.apache.org/confluence/display/MAHOUT/Bayesian.
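To make the Naive Bayes idea from the IR-book chapter concrete, here is a minimal sketch in plain Java. It is not Mahout's API and not a production implementation. The class name TitleClassifier, its train and classify methods, and the simple lowercase word tokenizer are all hypothetical names made up for illustration. It assumes a multinomial model over title words with add-one (Laplace) smoothing. Naive Bayes scores all categories directly, so this sketch does not need the One-vs-All setup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical minimal multinomial Naive Bayes over title words. */
class TitleClassifier {
    // category -> number of training titles seen for it
    private final Map<String, Integer> docCounts = new HashMap<>();
    // category -> (word -> occurrences in that category's titles)
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    // category -> total number of word occurrences in that category
    private final Map<String, Integer> totalWords = new HashMap<>();
    // every distinct word seen during training
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    /** Lowercases the text and splits it on anything that is not a-z or 0-9. */
    private static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Records one labeled example, e.g. train("Flying a Plane", "Aviation"). */
    public void train(String title, String category) {
        totalDocs++;
        docCounts.merge(category, 1, Integer::sum);
        Map<String, Integer> counts =
                wordCounts.computeIfAbsent(category, c -> new HashMap<>());
        for (String word : tokenize(title)) {
            counts.merge(word, 1, Integer::sum);
            totalWords.merge(category, 1, Integer::sum);
            vocabulary.add(word);
        }
    }

    /** Returns the highest-scoring category, or null if nothing was trained. */
    public String classify(String title) {
        List<String> words = tokenize(title);
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String category : docCounts.keySet()) {
            // log prior: share of training titles in this category
            double score = Math.log((double) docCounts.get(category) / totalDocs);
            Map<String, Integer> counts = wordCounts.get(category);
            int total = totalWords.getOrDefault(category, 0);
            for (String word : words) {
                // add-one smoothing keeps unseen words from zeroing the score
                int count = counts.getOrDefault(word, 0);
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = category;
            }
        }
        return best;
    }
}
```

Trained on only the three sample titles from the question, classify("Painting a Landscape") picks Art, because "painting" has been seen only in that category. With a real data set you would likely want better tokenization (stop words, stemming) and more features than the title alone, which is where libraries such as Mahout or Weka come in.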


Mridang Agarwalla · Accepted Answer · 2012-06-13 12:17:16Z

1

LingPipe seems to be a good solution and appears to work well. The classification demo included with LingPipe is a good place to begin:

http://alias-i.com/lingpipe/demos/tutorial/classify/read-me.html
