I am trying to list all files within a directory that contain the string I specify as part of their names. I want to vary this string with each iteration of the loop. The code I am using is:

from subprocess import Popen
from subprocess import call

species_array = ["homo_sapiens", "pan_troglodytes", "pongo_abelii", "gorilla_gorilla", "macaca_mulatta", "callithrix_jacchus", "bos_taurus", "canis_familiaris", "equus_caballus", "felis_catus", "ovis_aries", "sus_scrofa", "oryctolagus_cuniculus", "rattus_norvegicus", "mus_caroli", "mus_pahari", "mus_musculus"]
run_length = (len(species_array) - 5)
path = "/homes/varshith/maf_files/1/testmafs/HAL_Files/"
for i in range (run_length):
    s = Popen("find", path, "-name", *species_array[i+1]*)
    print s.communicate()[0]

The file should contain species_array[i+1] as part of its name. Thanks in advance.

  • see stackoverflow.com/questions/3207219/… Commented Jan 21, 2015 at 10:41
  • Just use glob or fnmatch. Also, why use i+1? Do you not want the first element? Commented Jan 21, 2015 at 10:43
  • I don't want the first. And all of these glob functions only take into account a string that doesn't change over the entire program. I am looking for code that lets me find a variable substring in a filename. Commented Jan 21, 2015 at 10:46

2 Answers

To run find without a shell (shell=False, the default), you need to pass the command and its arguments as a list. check_output will work for your case. You can also slice the list instead of using range, and use str.format to wrap each species name in *:

from subprocess import check_output

species_array = ["homo_sapiens", "pan_troglodytes", "pongo_abelii", "gorilla_gorilla", "macaca_mulatta", "callithrix_jacchus", "bos_taurus", "canis_familiaris", "equus_caballus", "felis_catus", "ovis_aries", "sus_scrofa", "oryctolagus_cuniculus", "rattus_norvegicus", "mus_caroli", "mus_pahari", "mus_musculus"]
path = "/homes/varshith/maf_files/1/testmafs/HAL_Files/"
# [1:-4] selects indices 1..12, the same elements as the original i+1 loop
for ele in species_array[1:-4]:
    s = check_output(["find", path, "-name", "*{0}*".format(ele)])
    print s

For Python 2.6, where check_output is not available, use Popen:

from subprocess import Popen, PIPE

species_array = ["homo_sapiens", "pan_troglodytes", "pongo_abelii", "gorilla_gorilla", "macaca_mulatta", "callithrix_jacchus", "bos_taurus", "canis_familiaris", "equus_caballus", "felis_catus", "ovis_aries", "sus_scrofa", "oryctolagus_cuniculus", "rattus_norvegicus", "mus_caroli", "mus_pahari", "mus_musculus"]
path = "/homes/varshith/maf_files/1/testmafs/HAL_Files/"
# [1:-4] selects indices 1..12, the same elements as the original i+1 loop
for ele in species_array[1:-4]:
    s = Popen(["find", path, "-name", "*{0}*".format(ele)],stdout=PIPE,stderr=PIPE)
    out,err = s.communicate()
    print(out,err)
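
If spawning find is not a requirement, the same recursive name match can be sketched in pure Python with os.walk and fnmatch, both from the standard library. The helper name find_by_substring is hypothetical, and unlike a plain find -name this version reports only files, not directories. It also assumes the substring contains no glob metacharacters such as [ or ?.

```python
import fnmatch
import os


def find_by_substring(root, substring):
    """Return paths of files under root whose names contain substring.

    Hypothetical helper: roughly `find root -type f -name "*substring*"`.
    """
    pattern = "*{0}*".format(substring)
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # fnmatch.filter keeps only the names matching the shell-style pattern
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Calling find_by_substring(path, ele) inside the loop would then take the place of the subprocess call.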

4 Comments

There is an error saying that check_output is not defined! I am using Python 2.6.6
check_output() was introduced in 2.7. You can use Popen() with communicate() in place of check_output().
I am getting an error like: s = Popen(["find", path, "-name", "{}".format(ele)],stdout=PIPE,stderr=PIPE) ValueError: zero length field name in format
Got it now! Thanks Padraic :D

Your loop is more roundabout than it needs to be. Python is much more expressive than that:

1) You can skip the first element by starting the range at 1:

for i in range(1, len(species_arr) - 4):

...then use i instead of i+1 inside your loop.

2) Even easier (and more idiomatic) is to use list slicing:

for species in species_arr[1:-4]:

3) You can format strings in Python using the format() method.

Here is an example employing those concepts:

species_arr = [
    "homo_sapiens", 
    "pan_troglodytes", 
    "pongo_abelii", 
    "gorilla_gorilla", 
    "macaca_mulatta", 
    "callithrix_jacchus", 
    "bos_taurus", 
    "canis_familiaris", 
    "equus_caballus", 
    "felis_catus", 
    "ovis_aries", 
    "sus_scrofa", 
    "oryctolagus_cuniculus", 
    "rattus_norvegicus", 
    "mus_caroli", 
    "mus_pahari", 
    "mus_musculus"
]

chop_from_end = 4 

for species in species_arr[1:-chop_from_end]:
    fname = "*{0}*".format(species)
    print fname

--output:--
*pan_troglodytes*
*pongo_abelii*
*gorilla_gorilla*
*macaca_mulatta*
*callithrix_jacchus*
*bos_taurus*
*canis_familiaris*
*equus_caballus*
*felis_catus*
*ovis_aries*
*sus_scrofa*
*oryctolagus_cuniculus*
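
As a quick check that the slice selects the same elements as the index arithmetic in the question, the sketch below compares the two on a stand-in list. The placeholder names s0 to s16 are made up; only the length of 17 matters.

```python
# Stand-in for the 17-element species list; the names are placeholders.
species_arr = ["s{0}".format(n) for n in range(17)]

# The question's loop: i runs over range(len - 5) and reads element i + 1.
by_index = [species_arr[i + 1] for i in range(len(species_arr) - 5)]

# The slice: skip the first element, drop the last four.
by_slice = species_arr[1:-4]

print(by_index == by_slice)  # True
print(len(by_slice))         # 12
```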

The format() method was introduced in Python 3.0 and backported to Python 2.6 in a more limited form: in 2.6 you must number the fields explicitly ({0} rather than {}). If for some reason your install does not have the format() method, you can use the old way:

 fname = "*%s*" % species

See additional format() examples here:

https://docs.python.org/3/library/string.html#format-examples
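
To make the comparison concrete, here is a small sketch of the pattern-building step written both ways. The explicit index in {0} is what Python 2.6 needs; the bare {} form relies on automatic field numbering, which only arrived in 2.7 and 3.1.

```python
species = "pan_troglodytes"

# Explicit field index: works on Python 2.6 and later.
with_format = "*{0}*".format(species)

# Old-style % formatting: works on Python 2 and 3 alike.
with_percent = "*%s*" % species

print(with_format)                  # *pan_troglodytes*
print(with_format == with_percent)  # True
```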

4) Here's what you can do with the glob module:

import glob
import os.path
import pprint

base_dir = '/Users/7stud/python_programs/dir1'

names = ['a', 'b', 'c']

for name in names: 
    fname = "*{0}*".format(name)
    path = os.path.join(base_dir, fname)
    pprint.pprint(glob.glob(path))
    print '-' * 20

--output:--
['/Users/7stud/python_programs/dir1/__pycache__',
 '/Users/7stud/python_programs/dir1/a.txt',
 '/Users/7stud/python_programs/dir1/aa.txt',
 '/Users/7stud/python_programs/dir1/ab.txt',
 '/Users/7stud/python_programs/dir1/ba.txt']
--------------------
['/Users/7stud/python_programs/dir1/ab.txt',
 '/Users/7stud/python_programs/dir1/b.txt',
 '/Users/7stud/python_programs/dir1/ba.txt']
--------------------
['/Users/7stud/python_programs/dir1/__pycache__']
--------------------

Or, as a dict of name, matches pairs:

results = dict(
    (
      name,
      glob.iglob(os.path.join(base_dir, "*{0}*".format(name)))
    )
    for name in names
)

for name, _iter in results.items():
    print "{0}:".format(name)
    pprint.pprint(list(_iter))

--output:--
a:
['/Users/7stud/python_programs/dir1/__pycache__',
 '/Users/7stud/python_programs/dir1/a.txt',
 '/Users/7stud/python_programs/dir1/aa.txt',
 '/Users/7stud/python_programs/dir1/ab.txt',
 '/Users/7stud/python_programs/dir1/ba.txt']
c:
['/Users/7stud/python_programs/dir1/__pycache__']
b:
['/Users/7stud/python_programs/dir1/ab.txt',
 '/Users/7stud/python_programs/dir1/b.txt',
 '/Users/7stud/python_programs/dir1/ba.txt']
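
One caveat with the dict above: glob.iglob returns one-shot iterators, so each entry can be consumed only once, and the patterns also match directories such as __pycache__. A possible variation, sketched below with a hypothetical helper name files_matching, stores sorted lists and keeps only regular files.

```python
import glob
import os.path


def files_matching(base_dir, names):
    """Map each name to a sorted list of regular files whose names contain it.

    Hypothetical helper; non-recursive, like the glob examples above.
    """
    results = {}
    for name in names:
        pattern = os.path.join(base_dir, "*{0}*".format(name))
        # glob.glob returns a list, so the result can be reused;
        # os.path.isfile drops directories that happen to match.
        results[name] = sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
    return results
```

Because the values are plain lists, they can be printed, counted, or looped over more than once.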
