
I am writing a Hadoop program. I know that I can pass arguments directly to Hadoop using args[]; currently it looks like this:

ToolRunner.run(new Configuration(), new RunDear(), args); 

But if there are many arguments, can I instead put them in a configuration file like the one below and pass that to Hadoop? Where should this file be located: on the local file system or in HDFS?

sample_size 200
input_genotype_file /data/genotypes.txt
input_phenotype_file /data/phenotypes.txt
output_directory /output
mtry 200
ntree 3000
distance 0 (0 = euclidean, 1 = mahalanobis)
variable_important 0 (0 = information gain, 1 = permutation)
etc.

3 Answers


You can put the file into the distributed cache and then pass the file's name to your tasks through the job configuration.
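A minimal sketch of that approach, assuming the parameter file has already been copied to HDFS at /config/params.txt and that the mapper parses the key/value lines itself (the class names and the dear.params.file property below are made up for illustration):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class DearDriver {

    // Task side: each task reads the cached copy from its working directory.
    public static class DearMapper extends Mapper<Object, Text, Text, Text> {
        @Override
        protected void setup(Context context) throws IOException {
            String name = context.getConfiguration().get("dear.params.file");
            try (BufferedReader in = new BufferedReader(new FileReader(name))) {
                String line;
                while ((line = in.readLine()) != null) {
                    String[] kv = line.trim().split("\\s+", 2); // e.g. "sample_size 200"
                    // store kv[0] / kv[1] wherever the job needs them
                }
            }
        }
    }

    // Driver side: ship the parameter file and tell the tasks its name.
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "RunDear");
        job.setJarByClass(DearDriver.class);
        // The file must already be on HDFS; "#params.txt" symlinks it into each task's cwd.
        job.addCacheFile(new URI("/config/params.txt#params.txt"));
        job.getConfiguration().set("dear.params.file", "params.txt");
        job.setMapperClass(DearMapper.class);
        // ... set input/output formats and paths as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The fragment after the # makes the cached file appear under that name in each task's working directory, so the mapper can open it like an ordinary local file.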




You can use conf.addResource(new Path("/path/to/local/file")). The file has to be in Hadoop's XML configuration format; its properties are merged into the job configuration, which is shipped to each and every task.
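A short sketch of what that could look like, assuming the parameters are rewritten in the standard <configuration>/<property> XML layout and that RunDear (the Tool from the question) builds its Job from the Configuration it is given; the file name dear-params.xml and its location are made up:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.ToolRunner;

public class DearLauncher {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // dear-params.xml is a local file written in the usual layout:
        // <configuration>
        //   <property><name>sample_size</name><value>200</value></property>
        //   ...
        // </configuration>
        conf.addResource(new Path("file:///home/user/dear-params.xml"));

        // The properties are now part of the configuration, so the Tool and
        // every task it launches can read them back with the usual getters:
        int sampleSize = conf.getInt("sample_size", 200);
        String genotypes = conf.get("input_genotype_file");

        System.exit(ToolRunner.run(conf, new RunDear(), args));
    }
}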



You can write a wrapper class that reads these arguments from the file, puts them into the args array, and then passes that array on.
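For example, a hypothetical wrapper that loads a local Java properties file (one key=value per line) and flattens it into the args[] array; the "-key value" layout is only an assumption about how RunDear parses its arguments:

import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;

public class DearWrapper {
    public static void main(String[] args) throws Exception {
        // args[0] points to a local properties file, e.g. "sample_size=200" per line.
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream(args[0])) {
            props.load(in);
        }

        // Flatten the properties into the args[] array that RunDear already expects,
        // e.g. ["-sample_size", "200", "-mtry", "200", ...].
        List<String> expanded = new ArrayList<>();
        for (String key : props.stringPropertyNames()) {
            expanded.add("-" + key);
            expanded.add(props.getProperty(key));
        }

        int rc = ToolRunner.run(new Configuration(), new RunDear(),
                                expanded.toArray(new String[0]));
        System.exit(rc);
    }
}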

Comments

OK, so in this case the configuration file lives on the local file system, right?
What if I want to run this program on AWS?
