
I have the following string:

s = {$deletedFields:name:[standardizedSkillUrn,standardizedSkill],entityUrn:urn:li:fs_skill:(ACoAAA0C3rkBDZ7qyoWoEmj9CxUv3QW6brC836w,25),name:Political Campaigns,$type:com.linkedin.voyager.identity.profile.Skill},{$deletedFields:[standardizedSkillUrn,standardizedSkill],entityUrn:urn:li:fs_skill:(ACoAAA0C3rkBDZ7qyoWoEmj9CxUv3QW6brC836w,28),name:Politics,$type:com.linkedin.voyager.identity.profile.Skill},name:
{$deletedFields:[standardizedSkillUrn,standardizedSkill],entityUrn:urn:li:fs_skill:(ACoAAA0C3rkBDZ7qyoWoEmj9CxUv3QW6brC836w,27),name:Political Consulting,$type:com.linkedin.voyager.identity.profile.Skill},
{$deletedFields:[standardizedSkillUrn,standardizedSkill],entityUrn:urn:li:fs_skill:(ACoAAA0C3rkBDZ7qyoWoEmj9CxUv3QW6brC836w,26),name:Grassroots Organizing,$type:com.linkedin.voyager.identity.profile.Skill},
{$deletedFields:[],profileId:ACoAAA0C3rkBDZ7qyoWoEmj9CxUv3QW6brC836w,elements:[urn:li:fs_skill:(ACoAAA0C3rkBDZ7qyoWoEmj9CxUv3QW6brC836w,25),urn:li:fs_skill:(ACoAAA0C3rkBDZ7qyoWoEmj9CxUv3QW6brC836w,26),urn:li:fs_skill:(ACoAAA0C3rkBDZ7qyoWoEmj9CxUv3QW6brC836w,27),urn:li:fs_skill:(ACoAAA0C3rkBDZ7qyoWoEmj9CxUv3QW6brC836w,28)],paging:urn:li:fs_profileView:ACoAAA0C3rkBDZ7qyoWoEmj9CxUv3QW6brC836w,skillView,paging,$type:com.linkedin.voyager.identity.profile.SkillView,$id:urn:li:fs_profileView:ACoAAA0C3rkBDZ7qyoWoEmj9CxUv3QW6brC836w,skillView},
{$deletedFields:[]

I want to grab these entries:

name:Political Campaigns

name:Politics

name:Political Consulting

name:Grassroots Organizing

so that I end up with a list like this:

name = [Political Campaigns, Politics, Political Consulting, Grassroots Organizing]

The string above comes from a file I want to scrape.

Keep in mind that name has many occurrences in the file. Is there a way to match fs_skill, skip the garbage value that follows it, and then look for the name: near it and grab the string that comes after it?

  • Hi @hacke, what have you tried so far? Commented May 21, 2017 at 9:01
  • I tried re.findall(r'name:(.*?)},' , s), but name has many occurrences in my file, so I was not able to get only what I want. Commented May 21, 2017 at 9:08

1 Answer

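# Split on commas; keep fields that start with 'name' and whose value begins with a letter.
# pair[5:6] (rather than pair[5]) avoids an IndexError when a field is exactly 'name:'.
data = [pair[5:] for pair in s.split(',') if pair[:4] == 'name' and pair[5:6].isalpha()]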

Output:

['Political Campaigns', 'Politics', 'Political Consulting', 'Grassroots Organizing']

Try the snippet above; I hope it helps.
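
The question also asks about anchoring on fs_skill and then picking up the name: right after it. A minimal sketch of that idea with a regular expression follows; it is an alternative to the split-based snippet, not part of the original answer. The pattern assumes each skill entry looks like fs_skill:(...),name:VALUE and that VALUE contains no comma or closing brace. The sample string is a shortened, made-up stand-in for the real data (IDs abbreviated), and SKILL_NAME is a name chosen here for illustration.

```python
import re

# Shortened, made-up sample in the same shape as the question's string.
s = (
    "{$deletedFields:[standardizedSkillUrn,standardizedSkill],"
    "entityUrn:urn:li:fs_skill:(ACoAAA0C,25),name:Political Campaigns,"
    "$type:com.linkedin.voyager.identity.profile.Skill},"
    "{$deletedFields:[],profileId:ACoAAA0C,"
    "elements:[urn:li:fs_skill:(ACoAAA0C,25),urn:li:fs_skill:(ACoAAA0C,28)],"
    "name:,"
    "entityUrn:urn:li:fs_skill:(ACoAAA0C,28),name:Politics,"
    "$type:com.linkedin.voyager.identity.profile.Skill}"
)

# Match "fs_skill:(" + anything up to ")" + ",name:" and capture the value
# up to the next comma or closing brace.
SKILL_NAME = re.compile(r"fs_skill:\([^)]*\),name:([^,}]+)")

names = SKILL_NAME.findall(s)
print(names)  # ['Political Campaigns', 'Politics']
```

Because the pattern requires ,name: immediately after the closing parenthesis, the fs_skill URNs inside elements:[...] and any stray name: elsewhere in the file are skipped.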


4 Comments

Can you explain what you did? I need to apply the same logic to a large file, which may contain more occurrences of name and those garbage strings.
I split the string on ',', then checked that the first four characters are 'name' and that the 6th character is a letter. If both conditions hold, I take the part after the ':'.
Can I get your email address, Pramod?
I sent you an email and would be glad if you could help out with this. Thanks, the answer worked.
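
The split-and-filter steps described in the comments can also be written as an explicit loop, which may be easier to adapt for a larger file. This is only a sketch: extract_names and the sample string are hypothetical names and data introduced here, and the check is slightly stricter than the one-liner because it tests for the full 'name:' prefix.

```python
def extract_names(text):
    """Return the values of comma-separated fields of the form name:<letters...>."""
    names = []
    for pair in text.split(','):
        # Step 1: keep only fields that start with the key 'name:'.
        if not pair.startswith('name:'):
            continue
        value = pair[5:]
        # Step 2: keep the value only if it begins with a letter; this drops
        # empty values and garbage such as 'name:{...'.
        if value[:1].isalpha():
            names.append(value)
    return names


# Made-up sample with one real name and several decoys.
sample = (
    "{$deletedFields:name:[a,b],entityUrn:urn:li:fs_skill:(X,25),"
    "name:Political Campaigns,$type:Skill},name:,name:{x"
)
print(extract_names(sample))  # ['Political Campaigns']
```

Note that this approach assumes a name never contains a comma; a value such as "Sales, Marketing" would be cut at the comma.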
