Usually I use Python to read CSV files whose structure looks like this:

date1, value1_1, value1_2, value1_3, ...
date2, value2_1, value2_2, value2_3, ...
...

In this case, each line is one complete record, and I just use numpy.loadtxt() to read them.

But today my colleague gave me a file with a block structure, which looks like this:

date1
value1_1, value1_2
value1_3, ...
date2
...

And this gives me a headache...

Does anyone have a good solution for this? Is there a function I can use to deal with this file, or do I have to write a reading_messed_files() function myself?

  • It might be easier to fix this at the source and ask your colleague if she or he can give you a standard CSV-formatted file instead. Commented Dec 14, 2015 at 4:09
  • Are these still line-delimited strings? Is there some structure? Commented Dec 14, 2015 at 4:12
  • Without a decent example of what this file looks like, we can't make a guess on how to parse it. Commented Dec 14, 2015 at 4:27
  • The second file consists of line-delimited strings. In the CSV case, each line contains one complete piece of data, whereas in the second file, lines 1, 2 and 3 together make up one complete piece of data. Commented Dec 14, 2015 at 4:33

1 Answer

This isn't a full answer, but it's a little long for a comment.

NumPy's CSV readers, such as loadtxt and genfromtxt, accept any iterable of lines as input. Typically you pass a filename, which is opened and read line by line, but the input can also be a list of lines or a generator that yields one line at a time.
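As a quick illustration of that point, here is a minimal sketch that feeds loadtxt a list of strings and then a generator instead of a filename. The sample numbers are made up for the demonstration.

```python
import numpy as np

# Made-up sample lines standing in for the contents of a file.
lines = ["1, 2, 3", "4, 5, 6"]

# A plain list of strings is accepted in place of a filename.
from_list = np.loadtxt(lines, delimiter=",")

# So is a generator that yields one line at a time.
from_gen = np.loadtxt((line for line in lines), delimiter=",")
```

Both calls should give the same 2x3 float array, which is what makes it possible to preprocess the lines before loadtxt ever sees them.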

So you could open the file, read it line by line, rework the blocks into normal csv lines, and pass them on to loadtxt.

I remember examples that use this to read multiple files (with the same columns), to skip lines, or to read blocks. Other examples process the lines to replace awkward delimiters.
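To give a rough idea of what such preprocessing can look like, here is a small sketch in plain Python. The helper name clean, the semicolon delimiter, and the two in-memory "files" are all assumptions made up for the example.

```python
import io
from itertools import chain

def clean(lines, skip=1):
    """Yield comma-delimited lines, dropping the first `skip` lines and blanks."""
    for i, line in enumerate(lines):
        line = line.strip()
        if i < skip or not line:
            continue
        # Turn an awkward delimiter (semicolons here) into commas.
        yield line.replace(";", ",")

# Two in-memory "files" with the same columns (made-up data).
file_a = io.StringIO("x;y\n1;2\n3;4\n")
file_b = io.StringIO("x;y\n5;6\n")

# Chain the cleaned lines of both files into one stream of rows.
rows = list(chain(clean(file_a), clean(file_b)))
```

Instead of collecting the rows into a list, the chained generator could be handed to loadtxt (with delimiter=',') in place of a filename.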

I frequently demonstrate loadtxt using a list of lines derived from a cut-n-paste example.

I'm thinking of something like:

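import numpy as np

delimiter = ','   # assumed; set this to match the file

def foo(afile):
    # Prepend the most recent single-field (header) line to each data line.
    header = None
    for line in afile:
        strings = [s.strip() for s in line.split(delimiter)]
        if len(strings) == 1:
            if strings[0]:   # ignore blank lines
                header = strings[0]
        else:
            yield delimiter.join([header] + strings)

# filename is the path to the block-structured file
with open(filename) as f:
    A = np.loadtxt(foo(f), delimiter=delimiter, ....)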
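If every block should end up as a single row, which is how the comments under the question describe the file, a variation on the same idea is to collect all value lines under a date into one record. The following is only a sketch under assumptions: the name merge_blocks and the sample data are made up, and any line with exactly one field is treated as a date header, so a value line holding a single value would be misread.

```python
import io

def merge_blocks(lines, delimiter=","):
    """Yield one delimited line per block: the header followed by all its values."""
    record = None
    for line in lines:
        fields = [s.strip() for s in line.split(delimiter) if s.strip()]
        if not fields:
            continue                      # blank line
        if len(fields) == 1:              # assumed to be a header (date) line
            if record is not None:
                yield delimiter.join(record)
            record = fields
        elif record is not None:
            record.extend(fields)         # value line belonging to the current block
    if record is not None:
        yield delimiter.join(record)      # flush the last block

# Made-up sample in the block layout from the question.
sample = io.StringIO(
    "2015-12-13\n"
    "1.0, 2.0\n"
    "3.0, 4.0\n"
    "2015-12-14\n"
    "5.0, 6.0\n"
    "7.0, 8.0\n"
)
rows = list(merge_blocks(sample))
```

Each entry in rows is then an ordinary CSV line with the date first, so the generator could be passed to loadtxt or genfromtxt directly, with a dtype or converter chosen to handle the date column.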