I see that the Configuration class in Hadoop is Writable (http://hadoop.apache.org/docs/current/api/org/apache/hadoop/conf/Configuration.html). However, I do not see any exposed method that can be used to add a Writable object; there are many methods to set and get primitive types such as int and long. Say I have my own Writable object and I want to add it to the configuration for all my mappers and reducers to use. How do I do this?

Thanks,

Venkat

2 Answers

The configuration is not really meant for passing entire objects. It is better suited to simple parameters that are needed to set up the Mappers/Reducers. Think of the conf as the place where you set variables at the beginning of the job. If you change the configuration in the middle of a run, the change most likely won't be there at the end, because the conf is not meant for passing data dynamically.

What you are looking for, if you want to pass entire objects between nodes, is the Distributed Cache. Technically speaking it distributes files, but you can use standard object serialization to write your objects into them. (See the Distributed Cache documentation.)
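To illustrate the "standard object serialization" part, here is a minimal sketch using only the JDK. The class `CachedObjectFile` and the payload `LookupSettings` are hypothetical names made up for this example, and the code deliberately leaves out the Hadoop side: it only shows writing an object to a file and reading it back, which is the piece you would do before registering the file with the Distributed Cache and after retrieving it in a task.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

public class CachedObjectFile {

    /** Hypothetical payload; any Serializable class would work the same way. */
    public static class LookupSettings implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String name;
        public final int threshold;

        public LookupSettings(String name, int threshold) {
            this.name = name;
            this.threshold = threshold;
        }
    }

    /** Driver side: write the object to a file that can later be shipped to the tasks. */
    public static void save(LookupSettings settings, Path file) throws IOException {
        try (OutputStream raw = Files.newOutputStream(file);
             ObjectOutputStream out = new ObjectOutputStream(raw)) {
            out.writeObject(settings);
        }
    }

    /** Task side: read the object back from the local copy of the file. */
    public static LookupSettings load(Path file) throws IOException, ClassNotFoundException {
        try (InputStream raw = Files.newInputStream(file);
             ObjectInputStream in = new ObjectInputStream(raw)) {
            return (LookupSettings) in.readObject();
        }
    }
}
```

In a real job, `save` would run in the driver before job submission, and `load` would typically run once in the Mapper's or Reducer's `setup` method. How the file gets registered and located depends on your Hadoop version, so check the Distributed Cache documentation for the release you are on.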

*Apologies for linking to different Hadoop versions; their pages are a bit muddled, and it is sometimes hard to find what you need.

5 Comments

Thanks for your time and reply. So, does Hadoop convert the primitives to their wrappers while serializing the configuration? This is sort of unrelated to my original question, but I am just curious.
Convert primitives to the wrappers? I don't understand your question.
I am sorry if I was not clear. How does Hadoop serialize Configuration and the primitive types inside it?
grepcode.com/file_/repo1.maven.org/maven2/…. Scroll toward the bottom of the linked page until you find the "write" method. The values are stored as key/value pairs of strings.
It is XML. You can serialize your object to JSON and set it as a string if you really need to.

You can check the HBase sources (starting from HBase 0.94.6): look at the MultiTableInputFormat.setConf() method and the corresponding TableMapReduceUtil code (for example, initTableMapperJob()). They pass Scan objects through the configuration. The earlier TableInputFormat.setConf() method uses a very similar mechanism. Usually only minimal attributes are passed through the config, but this case is probably closer to yours.
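As far as I understand it, the general idea there is to turn the object into a string and store that string under a configuration key. Below is a rough sketch of that idea using only the JDK; it is not the HBase code. `ObjectInConfig` and `RangeSpec` are hypothetical names, and `RangeSpec` only mimics the `write(DataOutput)` / `readFields(DataInput)` pair of Hadoop's Writable interface so that the example compiles without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Base64;

public class ObjectInConfig {

    /** Hypothetical payload with the same two methods a Hadoop Writable declares. */
    public static class RangeSpec {
        public String startKey = "";
        public int limit;

        public void write(DataOutput out) throws IOException {
            out.writeUTF(startKey);
            out.writeInt(limit);
        }

        public void readFields(DataInput in) throws IOException {
            // Fields must be read in the same order they were written.
            startKey = in.readUTF();
            limit = in.readInt();
        }
    }

    /** Serialize the object to bytes, then Base64-encode them into a plain string. */
    public static String encode(RangeSpec spec) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            spec.write(out);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    /** Reverse of encode: Base64-decode the string and repopulate a fresh object. */
    public static RangeSpec decode(String encoded) throws IOException {
        byte[] raw = Base64.getDecoder().decode(encoded);
        RangeSpec spec = new RangeSpec();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            spec.readFields(in);
        }
        return spec;
    }
}
```

With Hadoop available, the driver would store the result with `conf.set(key, ObjectInConfig.encode(spec))` and a task would recover it in `setup` via `ObjectInConfig.decode(context.getConfiguration().get(key))`, where `key` is any property name you choose. Keep the object small: the whole configuration is shipped to every task.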

Hope it will help.
