1

I am using an existing Perl script to process a text file produced by a database query over which I have no control.

The data contains fields separated by '|', but some fields contain '||'. There are no empty fields. There may be spaces on either side of the field separator which I would also like to remove.

I cannot find a simple way to achieve this, apart from changing each '||' to something else and putting it back after the split, which seems a bit heavy-handed.

The file is substantial (typically up to about 100M).

Using split(/ *\| */, $line) works, except that it also splits on '||'.

Any thoughts, please?

3 Answers

3
split /\s*(?<!\|)\|(?!\|)\s*/

1 Comment

Simple when someone shows me how. Works well, and is faster than my remove and replace (not surprisingly).
3

You can use negative look-behind and look-ahead to ensure there are no | symbols adjacent to the | you're splitting on:

split / \s* (?<!\|) \| (?!\|) \s* /x
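As a rough illustration of how that pattern behaves, here is a minimal sketch. The helper name split_fields and the sample line are made up for the example, and it assumes that a doubled pipe only ever appears inside a field (a run of three or more pipes would not be split at all).

```perl
use strict;
use warnings;

# Hypothetical helper: split one line on single '|' separators,
# leaving any '||' inside a field untouched.
sub split_fields {
    my ($line) = @_;
    chomp $line;
    # The split pattern only strips whitespace next to separators,
    # so trim the two ends of the line separately.
    $line =~ s/^\s+|\s+$//g;
    # (?<!\|) : the pipe must not be preceded by another pipe
    # (?!\|)  : the pipe must not be followed by another pipe
    return split /\s*(?<!\|)\|(?!\|)\s*/, $line;
}

# Sample data invented for this sketch.
my @fields = split_fields("alpha | beta||gamma |delta\n");
print join(',', map { "[$_]" } @fields), "\n";
# prints: [alpha],[beta||gamma],[delta]
```

For a large file this can be called once per line inside the usual while (my $line = <$fh>) loop, so the whole file never has to be held in memory.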

Comments

1

Look at using Text::CSV or Tie::Handle::CSV to run through the file. If the text file has been generated properly, fields that contain || will be quoted.
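A minimal sketch of the Text::CSV route, assuming the module is installed from CPAN and that fields containing || really are wrapped in double quotes. The helper name parse_quoted_line and the sample line are invented for the example. If the pipes are not quoted, the parser treats || as an empty field between two separators, so this approach only helps with properly quoted data.

```perl
use strict;
use warnings;
use Text::CSV;    # CPAN module, not part of core Perl

# '|' as the separator; allow_whitespace strips spaces around it.
my $csv = Text::CSV->new({
    sep_char         => '|',
    allow_whitespace => 1,
    binary           => 1,
}) or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

# Hypothetical helper: parse one line, returning its fields,
# or an empty list if the line cannot be parsed.
sub parse_quoted_line {
    my ($line) = @_;
    chomp $line;
    $csv->parse($line) or return;
    return $csv->fields;
}

# Sample data invented for this sketch: the middle field is quoted.
my @fields = parse_quoted_line(qq{alpha | "beta||gamma" | delta\n});
print join(',', map { "[$_]" } @fields), "\n";
```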

1 Comment

I think my problem started because there isn't any protection on the user input. I guess the next thing will be someone using a single pipe, and then I'll have to have a rethink! Thanks for your time.
