I'm trying to split a string on a regular expression inside a lambda function, but the string is not being split. I'm sure the regular expression itself works; see the regex test at https://regex101.com/r/ryRio6/1

from pyspark.sql.functions import col,split
import re

r = re.compile(r"(?=\s\w+=)")
adsample = sc.textFile("hdfs://hostname/user/hdfs/sample/Log18Dec.txt")
splitted_sample = adsample.flatMap(lambda (x): ((v) for v in r.split(x)))

for m in splitted_sample.collect():
    print(m)

I'm not sure where I'm going wrong.

Sample line from the file:

|RECEIVE|Low| eventId=139569 msg=W4N Alert :: Critical : Interface Utilization for GigabitEthernet0/1 90.0 % in=2442 out=0 categorySignificance=/Normal categoryBehavior=/Communicate/Query categoryDeviceGroup=/Application

The regex should match the space before each key.

Expected output:

|RECEIVE|Low|
eventId=139569
msg=W4N Alert :: Critical : Interface Utilization for GigabitEthernet0/1 90.0 %
in=2442
out=0
categorySignificance=/Normal
categoryBehavior=/Communicate/Query
categoryDeviceGroup=/Application
  • Can you share the data in Log18Dec.txt, and output you are expecting? Commented Dec 20, 2017 at 7:06
  • Or can you at least tell us what you expect (by this I mean "Can you describe what your regex is supposed to match?"), and what you get? Commented Dec 20, 2017 at 7:07
  • @Oli, rohikulky edited with sample line and desired output Commented Dec 20, 2017 at 7:29

1 Answer

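from pyspark.sql.functions import col,split
import re

# Original pattern, which only matches an empty string (pure lookahead):
#r = re.compile(r"(?=\s\w+=)")
adsample = sc.textFile("hdfs://hostname/user/hdfs/sample/Log18Dec.txt")
# Consume the whitespace before each key and keep only the "key=" check in the lookahead.
# A raw string avoids invalid-escape warnings; "lambda x" works on Python 2 and 3.
splitted_sample = adsample.flatMap(lambda x: (v for v in re.split(r'\s+(?=\w+=)', x)))

for m in splitted_sample.collect():
    print(m)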
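
A likely reason the original pattern did not split: `(?=\s\w+=)` is a pure lookahead, so every match is an empty string, and before Python 3.7 `re.split` did not split on empty matches. Moving the whitespace out of the lookahead makes each match at least one character wide. The sketch below checks this on the sample line from the question without needing Spark; the helper name `split_fields` and the constant `FIELD_SEPARATOR` are made up for illustration and are not part of the original code.

```python
import re

# Consume the whitespace itself and use the lookahead only to check for "key=".
# Every match is at least one character wide, so the split does not depend on
# how the Python version treats empty matches.
FIELD_SEPARATOR = re.compile(r"\s+(?=\w+=)")


def split_fields(line):
    """Split one log line into its header and key=value fields (hypothetical helper)."""
    return FIELD_SEPARATOR.split(line)


# Sample line copied from the question.
sample = (
    "|RECEIVE|Low| eventId=139569 msg=W4N Alert :: Critical : "
    "Interface Utilization for GigabitEthernet0/1 90.0 % in=2442 out=0 "
    "categorySignificance=/Normal categoryBehavior=/Communicate/Query "
    "categoryDeviceGroup=/Application"
)

for field in split_fields(sample):
    print(field)
```

Assuming the same `adsample` RDD as above, the helper could be passed straight to Spark as `adsample.flatMap(split_fields)`, since `flatMap` accepts any function that returns an iterable.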