Trying to find a way to filter out parts of a string python

Question

I'm trying to filter out strings in file names that appear in a for loop

from os import listdir
from os.path import isfile, join

if search == "List":
    onlyfiles = [f for f in listdir("path") if isfile(join("path", f))]
    for i in onlyfiles:
        print(i)

Now it outputs all the filenames, as expected and wanted, but I want to filter out the .json at the end of the file name, as well as a few other elements in the name, so that I can see just the file name.

For example, given filename-IDENTIFIER.json, I want to filter "-IDENTIFIER.json" out of the for loop's output.

Thanks for any help

If it always has a dash, you can split the file name using the dash as a separator.
– Selcuk, Commented Jan 16, 2019 at 14:35
@meowgoesthedog it can be a number or letters or a combination
– Flexy, Commented Jan 16, 2019 at 15:02

Ivo Merchiers · Accepted Answer · 2019-01-16 15:05:48Z

2

There are a few approaches here, depending on how much your data can vary. Let's try to build a get_filename(f) function.

Quick and dirty

If you know that f always ends in exactly the same way, you can simply remove those characters. Here we have to remove the last 16 characters (the length of "-IDENTIFIER.json"). It's useful to know that in Python a string behaves like an (immutable) sequence of characters, so you can index and slice it just like a list.

def get_filename(f: str) -> str:
    return f[:-16]

This will however fail if the Identifier or suffix changes in length.
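One way to make the slice a little less fragile is to check the suffix before cutting. This is only a minimal sketch: it assumes the suffix is literally "-IDENTIFIER.json" (the placeholder from the question), and the SUFFIX constant and the endswith guard are additions for illustration, not part of the original answer.

```python
SUFFIX = "-IDENTIFIER.json"  # assumed fixed suffix, 16 characters long


def get_filename(f: str) -> str:
    # Only slice when the name really ends with the expected suffix;
    # otherwise return it unchanged instead of silently chopping characters.
    if f.endswith(SUFFIX):
        return f[:-len(SUFFIX)]
    return f


print(get_filename("filename-IDENTIFIER.json"))  # filename
print(get_filename("notes.txt"))                 # notes.txt
```

Using len(SUFFIX) instead of a hard-coded 16 keeps the slice in sync if the suffix is ever edited.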

Varying lengths

If the suffix varies in length, then you should split the string on a fixed delimiter and return the relevant part. In this case you want to split on -.

def get_filename(f: str) -> str:
    return f.split("-")[0]

Note, however, that this will fail if the filename itself also contains a -. You can fix that by dropping the last part and rejoining all the earlier pieces, as follows.

def get_filename(f: str) -> str:
    return "-".join(f.split("-")[:-1])
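An alternative that avoids the split and rejoin is str.rpartition, which splits once on the last occurrence of the separator. The sketch below is an illustration rather than part of the original answer. It differs in one respect: when there is no dash at all, it returns the name unchanged (extension included), whereas the join version returns an empty string.

```python
def get_filename(f: str) -> str:
    # rpartition splits on the LAST "-" and returns (head, sep, tail).
    head, sep, _tail = f.rpartition("-")
    # Without any dash, sep is "" and head is "", so fall back to f itself.
    return head if sep else f


print(get_filename("my-file-IDENTIFIER.json"))  # my-file
print(get_filename("filename-abc123.json"))     # filename
```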

Using regexes to match the format

The most general approach would be to use Python regular expressions (the re module) to select the relevant part. These let you target a specific pattern very precisely. The exact regex that you'll need will depend on the complexity of your strings.
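As a rough illustration of the regex route, the sketch below assumes the format is name, dash, identifier, then .json, where the identifier consists only of letters and digits (as the asker described in the comments). The pattern and the PATTERN name are assumptions for this example and would need adjusting for other formats.

```python
import re

# Assumed format: <name>-<identifier>.json, where the identifier is made of
# letters and/or digits and contains no dash.
PATTERN = re.compile(r"^(?P<name>.+)-[A-Za-z0-9]+\.json$")


def get_filename(f: str) -> str:
    match = PATTERN.match(f)
    # Return the name unchanged when it does not follow the assumed format.
    return match.group("name") if match else f


print(get_filename("filename-IDENTIFIER.json"))  # filename
print(get_filename("my-file-abc123.json"))       # my-file
print(get_filename("readme.txt"))                # readme.txt
```

Because .+ is greedy, only the last dash-separated piece is treated as the identifier, so dashes inside the name itself are preserved.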

edited Jan 16, 2019 at 15:05

answered Jan 16, 2019 at 14:41

Ivo Merchiers


2 Comments

Flexy Over a year ago

so using this technique the filename can only contain one "-", correct?

Ivo Merchiers Over a year ago

It did, but I've added an alternative way that can deal with multiple "-".

Filipe Aleixo · Accepted Answer · 2019-01-16 14:41:12Z

0

Split the string on "-" and get the first element:

filename = f.split("-")[0]

This will break if the filename itself contains "-", though.

answered Jan 16, 2019 at 14:41

Filipe Aleixo


farkas · Accepted Answer · 2019-01-16 14:47:32Z

0

This should work:

i.split('-')[0].split('.')[0]

Case 1: filename-IDENTIFIER.json

It takes the substring before the dash, so the output will be filename

Case 2: filename.json

There is no dash in the string, so the first split does nothing (the full string will be in the 0th element), then it takes the substring before the dot. The output will be filename

Case 3: filename

Nothing to split, so the output will be filename

If it's always .json and -IDENTIFIER, then it's safer to use this:

i.split('-IDENTIFIER')[0].split('.json')[0]

Case 4: filename-blabla.json

If the filename has an extra dash in it, that won't be a problem; the output will be filename-blabla
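The four cases can be checked side by side with a short script. The function names strip_generic and strip_exact are hypothetical wrappers introduced only for this sketch; they wrap the two expressions shown in this answer.

```python
def strip_generic(name: str) -> str:
    # First expression: cut at the first dash, then at the first dot.
    return name.split('-')[0].split('.')[0]


def strip_exact(name: str) -> str:
    # Second expression: cut only at the literal "-IDENTIFIER" and ".json".
    return name.split('-IDENTIFIER')[0].split('.json')[0]


for name in ["filename-IDENTIFIER.json", "filename.json",
             "filename", "filename-blabla.json"]:
    print(name, "->", strip_generic(name), "|", strip_exact(name))
```

For filename-blabla.json the first function returns filename, while the second returns filename-blabla, which is the difference Case 4 describes.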

answered Jan 16, 2019 at 14:47

farkas


1 Comment

Flexy Over a year ago

filenames will always have the -IDENTIFIER, so I don't need the '.' split, but good to know in case.
