1

I know this is a basic question, but I'm having difficulty parsing some text.

To show how the system should work, take the following example:

> set title "Hello world" 

I should therefore get:

["set title", "Hello world"] 

The problem is that I need to split the string so that when I enter, for example:

> plot("data.txt"); 

it gives me:

["plot", "data.txt"] 

I have tried the following:

while True:
    command = raw_input(">")
    parse = command.split("' '")

    if parse[0] == "set title":
        title = parse[1]

But this does not work; it does not even recognise that I am entering "set title".

Any ideas?

  • I don't get what you mean. If you split plot("data.txt") on the space, you'd get ['plot("data.txt")'], since it doesn't contain any spaces. Why would the parentheses disappear in the result? I believe you don't want to "split"; you want to parse the command line into tokens. That's generally done with regexes. Commented May 10, 2014 at 17:10
  • @Bakuriu Sorry, it would be plot 'data.txt', not the one in the post. My bad. Commented May 10, 2014 at 17:11
  • You should first start by designing a sane syntax/notation. How would the parser know not to split "set title" but to split the others? Commented May 10, 2014 at 17:12
  • You can use shlex.split(); it will preserve spaces inside quoted strings. But that's not enough. A simple operation such as a "split" on a separator has no way of knowing that it shouldn't split on the space in set title. You need somewhat more complex logic to check the tokens. Commented May 10, 2014 at 17:14
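As a rough illustration of the last comment, here is a minimal sketch of shlex.split() from the standard library. The helper name tokenize is hypothetical and not from the thread. Quoted text stays together as one token, but "set" and "title" still come back as separate tokens, so the extra checking the comment mentions is still needed.

```python
import shlex

def tokenize(command):
    # shlex.split follows shell-like quoting rules: quoted text stays
    # in one token and the surrounding quotes are removed.
    return shlex.split(command)

print(tokenize('set title "Hello world"'))  # ['set', 'title', 'Hello world']
print(tokenize("plot 'data.txt'"))          # ['plot', 'data.txt']
```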

3 Answers

1

You don't need split. You need re:

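import re

def parse(command):
    # Group 1: the command words; group 2: the text inside the double quotes.
    regex = r'(.*) "(.*)"'
    match = re.match(regex, command)
    if match is None:
        # re.match returns None when the input lacks a quoted argument.
        raise ValueError('cannot parse command: %r' % command)
    return list(match.groups())

if __name__ == '__main__':
    command = 'set title "Hello world"'
    print(parse(command))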

This prints:

['set title', 'Hello world']
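The asker's comment says the plot command would actually be written as plot 'data.txt', with single quotes. A possible variation on the regex idea that accepts either quote style might look like the sketch below. The names COMMAND_RE and parse_quoted are hypothetical, and returning None for input that does not match is just one possible choice.

```python
import re

# Command words, whitespace, then a quoted argument. The backreference \2
# requires the closing quote to be the same character as the opening one.
COMMAND_RE = re.compile(r'''^(.*?)\s+(["'])(.*)\2\s*$''')

def parse_quoted(command):
    match = COMMAND_RE.match(command)
    if match is None:
        return None  # no quoted argument found
    return [match.group(1), match.group(3)]

print(parse_quoted('set title "Hello world"'))  # ['set title', 'Hello world']
print(parse_quoted("plot 'data.txt'"))          # ['plot', 'data.txt']
```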

0
split("' '") will split on the literal three-character sequence single quote, space, single quote, which doesn't appear in your command strings.

I think you will need to approach this more like:

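# Split off the first word only; the rest of the line stays intact.
command, content = command.split(" ", 1)
if command == "plot":
    plot(content[1:-1])  # strip the surrounding quotes
elif command == "set":
    item, content = content.split(" ", 1)
    if item == "title":
        title = content[1:-1]  # strip the surrounding quotes
...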

Note the second argument to split, which limits how many splits are made: 'set title "foo"'.split(" ", 1) == ['set', 'title "foo"']. Precisely how you implement this will depend on the range of things you want to be able to parse.
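The approach above could be fleshed out roughly as follows. This is only a sketch: the helper names strip_quotes and handle, the state dictionary, and the error handling are assumptions for illustration, and plotting is replaced by simply recording the file name.

```python
def strip_quotes(text):
    # Drop one pair of matching surrounding quotes, if present.
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        return text[1:-1]
    return text

def handle(line, state):
    # Split off the first word only; the remainder is parsed per command.
    parts = line.strip().split(" ", 1)
    if len(parts) < 2:
        raise ValueError("expected a command and an argument: %r" % line)
    command, content = parts
    if command == "plot":
        state["plot"] = strip_quotes(content)
    elif command == "set":
        sub = content.split(" ", 1)
        if len(sub) == 2 and sub[0] == "title":
            state["title"] = strip_quotes(sub[1])
        else:
            raise ValueError("unknown setting: %r" % content)
    else:
        raise ValueError("unknown command: %r" % command)
    return state

print(handle('set title "Hello world"', {}))  # {'title': 'Hello world'}
print(handle("plot 'data.txt'", {}))          # {'plot': 'data.txt'}
```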


0

To split the string on spaces, you need to use

parse = command.split(' ')

For the input "set title", parse will be a list that looks like this:

['set', 'title']

where parse[0] == 'set' and parse[1] == 'title'

If you want to test whether your string starts with "set title", either check the command string itself or check the first two elements of parse.
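Both checks might be sketched like this. The function names are hypothetical, and the second version uses split() with no argument, which splits on any run of whitespace instead of on single spaces.

```python
def is_set_title(command):
    # Option 1: test the raw string. The trailing space avoids matching
    # a longer word such as "set titlebar".
    return command.startswith("set title ")

def is_set_title_tokens(command):
    # Option 2: split on whitespace and compare the first two elements.
    parse = command.split()
    return parse[:2] == ["set", "title"]

print(is_set_title('set title "Hello world"'))         # True
print(is_set_title_tokens('set title "Hello world"'))  # True
print(is_set_title_tokens("plot 'data.txt'"))          # False
```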
