I have a Python program that takes two files as input and calculates the similarity between them. I want to use every possible combination of files in a directory as input. How can this be done in Python, building on the script that I already have?
I know there are tools such as glob that iterate over all the files in a directory. However, how can I also create all of the different file combinations?
Also, as @hcwhsa and @Ashish Nitin Patil suggested, how can itertools be combined with glob?
Thank you for any insight.
Further detail:
My code requires two input files in the same format (I have a directory of approximately 50 of these files). Each input has three tab-separated columns (value1, value2, weight). With this information I calculate the Jaccard coefficient as found here:
def compute_jaccard_index(set_1, set_2):
    return len(set_1.intersection(set_2)) / float(len(set_1.union(set_2)))
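For illustration, here is a minimal sketch of how one such file might be turned into a set before calling the function above. How the sets are actually built is not shown in the question, so this is an assumption: `load_pairs` is a hypothetical helper that keeps each (value1, value2) pair and ignores the weight column.

```python
def load_pairs(path):
    """Read a tab-separated file and return its (value1, value2) pairs as a set.

    Assumption: every data line has exactly three columns
    (value1, value2, weight); the weight is ignored here.
    """
    pairs = set()
    with open(path, 'r') as infile:
        for line in infile:
            fields = line.rstrip('\r\n').split('\t')
            if len(fields) == 3:  # skip blank or malformed lines
                pairs.add((fields[0], fields[1]))
    return pairs


def compute_jaccard_index(set_1, set_2):
    # Same formula as in the question: |intersection| / |union|
    return len(set_1.intersection(set_2)) / float(len(set_1.union(set_2)))
```

Note that, like the original function, this raises `ZeroDivisionError` if both sets are empty.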
I want to calculate this coefficient for all the possible combinations of files in the directory. As of now, I open each file by hard-coding its name:
with open('input_file1', 'r') as infile_A:
    with open('input_file2', 'r') as infile_B:
My goal is to iterate the function over all possible combinations of files in the directory.
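One way glob and itertools could be combined is sketched below. The name `file_pairs` and the default pattern are hypothetical, chosen only for illustration. `glob.glob` lists the files and `itertools.combinations(paths, 2)` yields each unordered pair exactly once.

```python
import glob
import itertools
import os


def file_pairs(directory, pattern='*'):
    """Return every unordered pair of distinct regular files matching pattern."""
    paths = sorted(
        p for p in glob.glob(os.path.join(directory, pattern))
        if os.path.isfile(p)  # drop sub-directories that match the pattern
    )
    # combinations(paths, 2) gives (a, b) once, never (b, a) and never (a, a)
    return list(itertools.combinations(paths, 2))
```

Each returned pair can then replace the two hard-coded names, for example `for path_a, path_b in file_pairs('my_directory'):` followed by the two `open` calls. With 50 files this gives 50 × 49 / 2 = 1225 pairs. If ordered pairs are needed instead, `itertools.permutations(paths, 2)` would be the counterpart.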
glob also. With your solution, all possible combinations of input1 and input2 will be created and used directly by the program? That is the main question - I am sorry if I did not express myself clearly.
filenames = [os.path.join(path, entry) for entry in entries if os.path.isfile(os.path.join(path, entry)) and entry.split('.')[-1] == 'py']