I have a long string:

query = """PREFIX pht: <http://datalab.rwth-aachen.de/vocab/pht/>
         PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

         SELECT ?Age, ?SexTypes, ?Chest_Pain_Type, ?trestbpsD, ?cholD,
                    ?Fasting_Glucose_Level, ?Resting_ECG_Type, ?thalachD,
                    ?Exercise_Induced_Angina, ?oldpeakD, ?caD, ?Slope, ?Thallium_Scintigraphy, ?Diagnosis
                      WHERE {?URI a sct:125676002. }"""

Now I need to create a list of all the substrings that start with '?'. The list should look like this:

schema = ['Age', 'Sex', 'Chest_Pain_Type', 'Trestbps', 'Chol', 'Fasting_Glucose_Level', 'Resting_ECG_Type', 'ThalachD', 
             'Exercise_Induced_Angina', 'OldpeakD', 'CaD', 'Slope', 'Thallium_Scintigraphy', 'Diagnosis']

I tried str.startswith(str, beg=0, end=len(string)),

but it's not working as I expected. How can I do this in Python?

  • Why is ?URI not in the result? Commented Mar 20, 2018 at 10:52
  • vbar, nice catch! Actually, I don't need the ?URI. I wanted to explain that, but then thought it would make the question more complex. Commented Mar 20, 2018 at 11:12
  • Yes, it rather does increase the complexity... :-) Regular expression can find all words starting with '?' (see below), but if you want to skip some of them depending on a larger context, you'll need some more steps... Commented Mar 20, 2018 at 11:29

1 Answer

Using regex:

import re
query = """PREFIX pht: <http://datalab.rwth-aachen.de/vocab/pht/>
         PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 

         SELECT ?Age, ?SexTypes, ?Chest_Pain_Type, ?trestbpsD, ?cholD, 
                    ?Fasting_Glucose_Level, ?Resting_ECG_Type, ?thalachD, 
                    ?Exercise_Induced_Angina, ?oldpeakD, ?caD, ?Slope, ?Thallium_Scintigraphy, ?Diagnosis
                      WHERE {?URI a sct:125676002. }"""

# A raw string avoids invalid-escape warnings for "\?" and "\w".
# The capture group makes findall return the names without the leading "?".
print(re.findall(r"\?(\w+)", query))

Output:

['Age', 'SexTypes', 'Chest_Pain_Type', 'trestbpsD', 'cholD', 'Fasting_Glucose_Level', 'Resting_ECG_Type', 'thalachD', 'Exercise_Induced_Angina', 'oldpeakD', 'caD', 'Slope', 'Thallium_Scintigraphy', 'Diagnosis', 'URI']
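
For comparison, here is a rough sketch of the same idea without a regular expression, closer to the str.startswith attempt in the question. The helper name question_mark_words and the set of punctuation characters it strips are assumptions made for illustration; unlike the regex, it relies on the variables being separated by whitespace.

```python
def question_mark_words(text):
    """Collect whitespace-separated tokens that start with '?' (hypothetical helper)."""
    names = []
    for token in text.split():
        # Drop surrounding punctuation such as "," or "{" before testing the token.
        token = token.strip("{}(),.;")
        if token.startswith("?"):
            names.append(token[1:])  # remove the leading "?"
    return names


sample = "SELECT ?Age, ?SexTypes WHERE {?URI a sct:125676002. }"
print(question_mark_words(sample))  # ['Age', 'SexTypes', 'URI']
```

str.startswith only tests the beginning of the string it is called on, so the text has to be split into tokens first; called on the whole query it simply returns False.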

4 Comments

thanks so much for saving my day! Sorry to bug but what if I want all the occurrences before the 'WHERE'? Can we somehow limit this? I am pretty new to Python.
Sure. You can trim the query string to exclude content after 'WHERE'. Ex: query = query[:query.find("WHERE")]
Hi Rakesh, that worked perfectly! I already accepted the answer, but since I don't have enough reputation it probably isn't being reflected.
Sorry, I now did! I'm pretty new to StackOverflow too. Thanks :)
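
The trimming suggested in the comments can be sketched as a small function. The name select_variables and the shortened query are assumptions for illustration. This version uses str.partition instead of str.find, because find returns -1 when "WHERE" is missing and the slice query[:-1] would then silently drop the last character.

```python
import re


def select_variables(query):
    """Return variable names (without '?') that appear before the first 'WHERE'."""
    # partition() yields the whole string as `head` when "WHERE" is absent.
    head, _, _ = query.partition("WHERE")
    return re.findall(r"\?(\w+)", head)


# Shortened, hypothetical query in the same shape as the one in the question.
query = """SELECT ?Age, ?SexTypes, ?Diagnosis
           WHERE {?URI a sct:125676002. }"""

print(select_variables(query))  # ['Age', 'SexTypes', 'Diagnosis']
```

Note that the match on "WHERE" is case-sensitive, so a query written with a lowercase "where" would not be trimmed by this sketch.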
