
I have a data frame with one column, and I am trying to run a function over each row of that column and store the results in a new column. First I ran my regular expression on a single string to make sure I get the results I expect:

    # Import dependencies
    import pandas as pd
    from pandas import ExcelWriter
    from pandas import ExcelFile
    import re

    # Test the pattern on a single string
    s = "64\"X36\"X60\" STACKED STONE AREAWELL BOMAN KEMP"
    z = re.search(
        # The pattern is split into adjacent raw strings so it fits on several lines
        r"((\d*[\.|-]?\d+(\/\d*)?)\s*"
        r"((?:cms?|in|inch|inches|mms?)\b|(?:[\"|\'|\”])|\s?)\s*"
        r"[x|X]\s*){0,2}"
        r"(\d*[\.|-]?\d+(\/\d*)?)\s*"
        r"((?:cms?|in|inch|inches|mms?)\b|(?:[\"|\'|\”])|\s?)",
        s,
        flags=re.I,
    )

    print(z.group(0))

My result is 64"X36"X60", which is exactly what I want. However, when I apply this as a function to the data frame:

    def patterns(row):
        # Returns the Match object itself, not the matched text
        return re.search(
            r"((\d*[\.|-]?\d+(\/\d*)?)\s*"
            r"((?:cms?|in|inch|inches|mms?)\b|(?:[\"|\'|\”])|\s?)\s*"
            r"[x|X]\s*){0,2}"
            r"(\d*[\.|-]?\d+(\/\d*)?)\s*"
            r"((?:cms?|in|inch|inches|mms?)\b|(?:[\"|\'|\”])|\s?)",
            row["Description"],
            flags=re.I,
        )

    # Apply the function to each row
    df["Dimensions"] = df.apply(patterns, axis=1)

I get results in a format like this:

    <re.Match object; span=(0, 11), match='52"X36"X72"'>

So I think I am not structuring my function correctly. In the single-string test, when I add

print(z.group(0))

it reads only the matched text, which is exactly what I need. Can anyone pinpoint how I should adjust the function to give me the same result for each row?

I tried adding .group(0) at the end of the function, but this is the error I get when I execute it with:

df["Dimensions"] = df.apply(patterns, axis=1)

Error: AttributeError: ("'NoneType' object has no attribute 'group'", 'occurred at index 65')

  • Add .group(0) at the end of the return statement in the patterns function: return re.search(r"((\d ... s?)", row["Description"], flags=re.I).group(0) Commented Jan 25, 2019 at 13:08
  • @John Jefferson Bautista - thank you for the response, John. I have tried that too, but I get the error message: "AttributeError: ("'NoneType' object has no attribute 'group'", 'occurred at index 65')". I just posted the full error in the original post. Commented Jan 25, 2019 at 13:35

1 Answer


The error was thrown because re.search returned None: there is no matching string in that row, and None has no .group method. Add a condition that returns something else when no match is found. The code below returns the string "None" in that case.

    def patterns(row):
        s = re.search(
            r"((\d*[\.|-]?\d+(\/\d*)?)\s*"
            r"((?:cms?|in|inch|inches|mms?)\b|(?:[\"|\'|\”])|\s?)\s*"
            r"[x|X]\s*){0,2}"
            r"(\d*[\.|-]?\d+(\/\d*)?)\s*"
            r"((?:cms?|in|inch|inches|mms?)\b|(?:[\"|\'|\”])|\s?)",
            row["Description"],
            flags=re.I,
        )

        # re.search returns None when nothing matches, so guard before calling .group
        return s.group(0) if s else "None"
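
To see the guard working without a data frame, here is a minimal standalone sketch. It reuses the pattern from the question, compiled once, and runs it over two sample descriptions, one of which has no dimension. The names DIMENSION_RE and extract_dimensions, the default parameter, and the second sample string are hypothetical and not part of the original post.

```python
import re

# Same pattern as in the question, split into adjacent raw-string pieces
# and compiled once so it is not rebuilt for every row.
DIMENSION_RE = re.compile(
    r"((\d*[\.|-]?\d+(\/\d*)?)\s*"
    r"((?:cms?|in|inch|inches|mms?)\b|(?:[\"|\'|\”])|\s?)\s*"
    r"[x|X]\s*){0,2}"
    r"(\d*[\.|-]?\d+(\/\d*)?)\s*"
    r"((?:cms?|in|inch|inches|mms?)\b|(?:[\"|\'|\”])|\s?)",
    flags=re.I,
)


def extract_dimensions(text, default=None):
    """Return the matched dimension text, or `default` when nothing matches."""
    match = DIMENSION_RE.search(text)
    # search() yields None on no match, so only call .group(0) on a real match
    return match.group(0) if match else default


# Hypothetical sample data: the second description contains no digits at all
descriptions = ['52"X36"X72" WINDOW WELL', "STACKED STONE AREAWELL"]
results = [extract_dimensions(d) for d in descriptions]
print(results)
```

Assuming the data frame from the question, the same helper could be applied to the column directly with df["Dimensions"] = df["Description"].apply(extract_dimensions), which avoids passing whole rows with axis=1. Returning None rather than the string "None" is a design choice here: pandas treats it as a missing value, so the unmatched rows can be found later with isna().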