Parsing a text file that has an unusual delimiter using Python

Question

In supporting a legacy system, I'm faced with a field data collector that stores data in the following format:

# This is a comment <- because it starts at the beginning of the file
# This is a comment <- see above
# 1. Item one <- not a comment because it starts with 1.
# Description of Item 1 <- not a comment as it is after a line that starts with a number
data point 1
data point 2
data point etc
3 <-- represents number of data points under Item one

# 2. Item two <-- not a comment
# Description of item 2 <-- not a comment
data point 1
data point ..
data point 100
100
#3. Item three <--- not a comment
# Item three description
0

I'm not sure of the correct way to parse this file so that each item ends up in its own list. Note that the data sometimes, but not always, has a stray blank line between two items.

What is the correct way to parse such a file?

Community · Accepted Answer · 2017-05-23 12:08:11Z

1

I would do this in three steps:

1. Remove all comments from the start of the file.
2. Split on a regular expression to find all the other comments in the file (see here for an example of how to split using a regular expression).
3. Parse the remaining lines.
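
The three steps above could be sketched roughly as follows. This is only a minimal illustration, not a tested parser for the real collector files. It assumes that the "<-" annotations in the question are not part of the actual data, that every item has a title line, a description line, and a final count line, and that item headers look like "# 1." or "#1.". The names `ITEM_START`, `parse_items`, and `SAMPLE` are made up for the example. For step 2 it slices the text at each header position instead of calling `re.split`, which avoids relying on zero-width splits.

```python
import re

# Assumed item header: "#", an optional space, digits, then a dot.
ITEM_START = re.compile(r"^# ?\d+\.", re.MULTILINE)


def parse_items(text):
    """Parse the collector file into one dict per item (hypothetical helper)."""
    # Step 1: discard the leading comments, i.e. everything before the first header.
    first = ITEM_START.search(text)
    if first is None:
        return []
    body = text[first.start():]

    # Step 2: cut the body into one chunk per item at each header position.
    starts = [m.start() for m in ITEM_START.finditer(body)]
    ends = starts[1:] + [len(body)]
    chunks = [body[a:b] for a, b in zip(starts, ends)]

    # Step 3: parse each chunk, skipping stray blank lines.
    items = []
    for chunk in chunks:
        lines = [ln.strip() for ln in chunk.splitlines() if ln.strip()]
        # Assumes: title line, description line, data lines, then the count.
        title, description, *rest = lines
        *data, count = rest
        items.append({
            "title": title.lstrip("# "),
            "description": description.lstrip("# "),
            "data": data,
            "count": int(count),
        })
    return items


# Made-up sample in the layout described in the question.
SAMPLE = """\
# This is a comment
# This is a comment
# 1. Item one
# Description of Item 1
data point 1
data point 2
data point etc
3

# 2. Item two
# Description of item 2
data point 1
data point 2
2
#3. Item three
# Item three description
0
"""

for item in parse_items(SAMPLE):
    print(item["title"], item["count"], item["data"])
```

Comparing `len(item["data"])` with `item["count"]` gives a cheap sanity check that a chunk was split in the right place.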

edited May 23, 2017 at 12:08

answered Mar 11, 2013 at 18:12

N_A


CSᵠ · Accepted Answer · 2013-03-11 18:12:58Z

1

You could use a regular expression and split on: ^(?=\# ?\d+\.) (with multiline mode enabled, so that ^ matches at the start of every line)

Explained example here: http://regex101.com/r/gB3xD1
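
In Python, that split might look something like the sketch below. It assumes Python 3.7 or later, since earlier versions of `re.split` do not split on a pattern that matches an empty string, and it needs `re.MULTILINE` so that `^` matches at every line start. `ITEM_BOUNDARY` and `split_items` are names invented for this example.

```python
import re

# Pattern from the answer; the lookahead makes the match zero-width,
# so each item header stays attached to its own chunk.
ITEM_BOUNDARY = re.compile(r"^(?=\# ?\d+\.)", re.MULTILINE)


def split_items(text):
    """Return (preamble, item_chunks) for the given file contents."""
    chunks = ITEM_BOUNDARY.split(text)  # zero-width split: Python 3.7+
    # The first chunk is whatever precedes the first item header
    # (the leading comments, or an empty string if there are none).
    return chunks[0], chunks[1:]


preamble, item_chunks = split_items(
    "# a leading comment\n# 1. Item one\n# Description\nx\n1\n#2. Item two\n# Description\n0\n"
)
for chunk in item_chunks:
    print(repr(chunk))
```

Each chunk still contains its header, description, data lines, and count, so a second pass over `chunk.splitlines()` is needed to pull those apart.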

answered Mar 11, 2013 at 18:12

CSᵠ
