strings = ["1 asdf 2", "25etrth", "2234342 awefiasd"] #and so on
What is the easiest way to get [1, 25, 2234342]?
How can this be done without a regex module or expression like (^[0-9]+)?
One could write a helper function to extract the prefix:
def numeric_prefix(s):
    n = 0
    for c in s:
        if not c.isdigit():
            return n
        else:
            n = n * 10 + int(c)
    return n
Example usage:
>>> strings = ["1asdf", "25etrth", "2234342 awefiasd"]
>>> [numeric_prefix(s) for s in strings]
[1, 25, 2234342]
Note that this produces the correct output (zero) when the input string has no numeric prefix, including the case of the empty string.
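The edge cases mentioned here can be checked directly. The sketch below restates the helper so that it runs on its own; the sample inputs are illustrative and, apart from "1 asdf 2", are not taken from the question.

```python
def numeric_prefix(s):
    # Accumulate leading digits; stop at the first non-digit character.
    n = 0
    for c in s:
        if not c.isdigit():
            return n
        n = n * 10 + int(c)
    return n

print(numeric_prefix(""))          # 0 (empty string)
print(numeric_prefix("asdf"))      # 0 (no leading digits)
print(numeric_prefix("007abc"))    # 7 (leading zeros are absorbed)
print(numeric_prefix("1 asdf 2"))  # 1 (stops at the first non-digit)
```

One consequence of returning zero is that "0abc" and "abc" become indistinguishable in the output.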
Working from Mikel's solution, one could write a more concise definition of numeric_prefix:
import itertools
def numeric_prefix(s):
    n = ''.join(itertools.takewhile(lambda c: c.isdigit(), s))
    return int(n) if n else 0
new = []
for item in strings:
    # Joins every digit in the string, not only the leading ones.
    new.append(int(''.join(i for i in item if i.isdigit())))
print(new)
[1, 25, 2234342]
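This loop keeps every digit in the string, so it gives the expected result only when no digits appear after the prefix. A minimal sketch of the difference, using the list from the question; the helper names all_digits and leading_digits are hypothetical and exist only for this comparison.

```python
from itertools import takewhile

strings = ["1 asdf 2", "25etrth", "2234342 awefiasd"]

def all_digits(s):
    # Keeps every digit found anywhere in the string.
    return ''.join(c for c in s if c.isdigit())

def leading_digits(s):
    # Stops at the first non-digit character.
    return ''.join(takewhile(str.isdigit, s))

print([int(all_digits(s)) for s in strings])      # [12, 25, 2234342]
print([int(leading_digits(s)) for s in strings])  # [1, 25, 2234342]
```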
"1 asdf 1" what should (or do you expect) the result to be? 1 or 11? These are your strings..."1 asdf 2" problem.[int(x) for x in ["".join([c for c in s if c.isdigit()]) for s in strings]]Basic usage of regular expressions:
import re
strings = ["1asdf", "25etrth", "2234342 awefiasd"]
# Raw string so that \d reaches the regex engine unchanged.
regex = re.compile(r'^(\d*)')
for s in strings:
    mo = regex.match(s)
    print('%s -> %s' % (s, mo.group(0)))
1asdf -> 1
25etrth -> 25
2234342 awefiasd -> 2234342
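The matches printed here are still strings, and a pattern that allows zero digits can match the empty string, which int() rejects. A sketch of one way to get integers, assuming that strings without a numeric prefix should simply be skipped; the names LEADING_DIGITS and leading_ints are hypothetical.

```python
import re

LEADING_DIGITS = re.compile(r'[0-9]+')

def leading_ints(strings):
    """Return the integer prefix of each string, skipping strings without one."""
    result = []
    for s in strings:
        mo = LEADING_DIGITS.match(s)  # match() only matches at the start of s
        if mo:
            result.append(int(mo.group(0)))
    return result

print(leading_ints(["1asdf", "25etrth", "2234342 awefiasd", "asdf"]))
# [1, 25, 2234342]
```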
Building on sahhhm's answer, you can fix the "1 asdf 1" problem by using takewhile.
from itertools import takewhile
def isdigit(char):
    return char.isdigit()
numbers = []
for string in strings:
    result = takewhile(isdigit, string)
    resultstr = ''.join(result)
    if resultstr:
        number = int(resultstr)
        if number:
            numbers.append(number)
None in the result list.
So you only want the leading digits? And you want to avoid regexes? There is probably something shorter, but this is the obvious solution.
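nlist = []
for s in strings:
    # Skip empty strings and strings that do not start with a digit.
    if not s or not s[0].isdigit(): continue
    for i, c in enumerate(s):
        if not c.isdigit():
            nlist.append(int(s[:i]))
            break
    else:
        # No break: the whole string is digits.
        nlist.append(int(s))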
"asdf"in the input? Should it be0, or not appear in the result list?