I am trying to get the highest version from a set of strings in Python. I tried sorting the list, but of course that doesn't work directly, because Python compares the strings character by character rather than by the number they contain.

So I am trying to use a regex instead, but somehow it doesn't match.

The Strings look like this:

topic_v10_ext2
topic_v20_ext2
topic_v2_ext2
topic_v5_ext2
topic_v7_ext2

My Regex looks like this.

version_no = re.search("(?:_v([0-9]+))?", v.name)

I was thinking about saving the names in a list and then looking for the highest v_xx in the list to return. For now I am doing this in two for loops, which runs in 2*O(log(n)), which I believe is not optimal. How can I get the highest version in a fast and simple way?

  • Please show the exact data format of the input strings. Are they in a file, a list, or some other data structure? Is the value after ext part of the version number? Commented Feb 22, 2019 at 14:24
  • What is meant by "string representation"? Why can't the topic_v… be strings? Commented Feb 22, 2019 at 14:25
  • Your regex will match literally anywhere because you've made the entire thing optional with the ? at the end. Commented Feb 22, 2019 at 14:26
  • @RoryDaulton no, only the _vXX is part of the Version. But the ext is optional and not always present. Commented Feb 22, 2019 at 14:38
  • @PrinceOfCreation They are a different Object which has a .name Key which can be read as a String in a for loop. Commented Feb 22, 2019 at 14:38

2 Answers

You can use sorted or list.sort with key:

sorted(l, key=lambda x:int(x.split('_')[1][1:]), reverse=True)
['topic_v20_ext2',
 'topic_v10_ext2',
 'topic_v7_ext2',
 'topic_v5_ext2',
 'topic_v2_ext2']
  • x.split('_'): returns the string split into a list, e.g.: ['topic', 'v20', 'ext2']
  • Since the version is the key for sorting, select it with x.split('_')[1]
  • The selected 'v20' has an unwanted leading character 'v', so slice it with [1:] to keep only the digits.
  • Finally, convert the digits to int for numerical ordering.

Also, sorted returns the result in ascending order by default. Since you need descending order, use reverse=True.
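
If only the single highest version is needed, a full sort is not required. The following is a minimal sketch, assuming the names always follow the topic_vNN[_ext] layout shown in the question; the helper name version_number is made up for illustration. It reuses the same key with max, which makes one pass over the list:

```python
names = [
    "topic_v10_ext2",
    "topic_v20_ext2",
    "topic_v2_ext2",
    "topic_v5_ext2",
    "topic_v7_ext2",
]

def version_number(name):
    # 'topic_v20_ext2' -> ['topic', 'v20', 'ext2'] -> 'v20' -> '20' -> 20
    # Assumes the second '_'-separated field is always 'v' followed by digits.
    return int(name.split("_")[1][1:])

# max() with a key scans the list once instead of sorting it.
highest = max(names, key=version_number)
print(highest)
```

For objects that expose the string through a .name attribute, as described in the comments on the question, the key could be written as lambda obj: version_number(obj.name).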

5 Comments

Though mind explaining the x:int(x.split('_')[1][1:] part a bit once you have the time?
@DerekHaynes I've edited the answer. Please let me know if any part is unclear.
Yeah man! Best man of the day! This explained it really clear for me and for many to come. Thanks a lot.
Is there also a way to look for a regex inside the split like: int(x.name.split('_')[1] "Regex to start and end at the number"). The version should be in the format of "v_00" but maybe something arrives like "version_00"
@DerekHaynes That can be a completely different scenario, since it contains _ and thus split may not work. If you just want to extract the digits, you can use lambda x: re.findall(r'\d+', x)[0].

It could also work with regular expressions, as first tried:

import re
v = 'topic_v7_ext2'
version_no = re.search("^[^_]*_v([0-9]+)", v)
print(version_no.group(1))

That expression searches from the beginning of the string (^), consumes all characters other than _ (I hope your topics can't contain one, otherwise both answers are wrong), then matches '_v' and captures the version number.
There is no need to match _ext, so it doesn't matter if it's there or not!
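
To make the difference concrete, here is a small sketch. It first shows why the pattern from the question seems not to match: the whole group is optional, so re.search succeeds with an empty match at position 0 and the capture group stays empty. It then uses a non-optional pattern as a key for max. The names VERSION_RE and version_number, the extra entry without _ext, and the choice of -1 for names without a version are assumptions made for illustration:

```python
import re

names = [
    "topic_v10_ext2",
    "topic_v20_ext2",
    "topic_v2_ext2",
    "topic_v5",  # hypothetical entry without the optional _ext part
    "topic_v7_ext2",
]

# Pattern from the question: the trailing '?' makes the whole group optional,
# so the search stops at index 0 with an empty match and no captured version.
loose = re.search("(?:_v([0-9]+))?", "topic_v10_ext2")
print(repr(loose.group(0)), loose.group(1))

# Non-optional pattern: '_v' followed by one or more digits.
VERSION_RE = re.compile(r"_v([0-9]+)")

def version_number(name):
    match = VERSION_RE.search(name)
    # Names without a version sort below every real version (assumption).
    return int(match.group(1)) if match else -1

highest = max(names, key=version_number)
print(highest)
```

Unlike the anchored expression above, this pattern does not require the topic part to be free of underscores, but it would pick up the first _v followed by digits wherever it occurs in the name.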
