
Thanks everyone, that really helped. As you all pointed out, my main problem was that the columns were separated by four spaces rather than a tab!

I have a text file in the following format:

string001    124.342
string002    235.111
string003    552.145

There is also a blank line at the end of the file.

I just want to read this into an array. I first tried numpy's loadtxt, and when that didn't work I switched to genfromtxt, but I couldn't get that to work either. This is my latest attempt:

y = np.genfromtxt('1400list.txt',delimiter="\t", dtype=[('mystring','S10'),('myint','i8')])

print y

But I get the error:

rows = np.array(data, dtype=[('', _) for _ in dtype_flat])
ValueError: size of tuple must match number of fields.

Could anyone please help me to figure this one out?

Thank you!

  • Which programming language are you using? Commented Sep 5, 2012 at 7:37
  • Please add the language you are using to the question. Commented Sep 5, 2012 at 7:38
  • What language are you using? Add a tag for the language if you want help! Commented Sep 5, 2012 at 7:38

3 Answers


Your code works fine for me with Python 2.7 and numpy 1.5.1 (although I suppose you want a float dtype rather than an integer one for the second column):

#!/usr/bin/env python
import numpy

y = numpy.genfromtxt('1400list.txt', delimiter='\t',
                     dtype=[('A', 'S10'), ('B', 'i8')])
print y

The output is:

vicent@deckard:/tmp$ python prova.py 
[('string001', 124L) ('string002', 235L) ('string003', 552L)]
vicent@deckard:/tmp$

Please make sure that the columns in your data file are separated by tabs rather than spaces.
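One way to check which separator a file actually uses is to inspect each line for a tab character. The following is only a minimal sketch: describe_separators is a hypothetical helper name, and the sample lines are made up to mirror the format in the question.

```python
def describe_separators(lines):
    """Return (line number, separator kind) for every non-blank line."""
    report = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text.strip():
            continue  # skip blank lines, such as the one at the end of the file
        if "\t" in text:
            kind = "tab"
        elif " " in text:
            kind = "spaces"
        else:
            kind = "none"
        report.append((number, kind))
    return report


# Made-up sample: the first line uses a tab, the second uses four spaces.
sample = ["string001\t124.342\n", "string002    235.111\n", "\n"]
print(describe_separators(sample))
```

To run it against a real file, pass the open file object instead of the sample list, for example describe_separators(open('1400list.txt')).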



This should work:

items = []
with open("text.txt") as f:  # the file is closed automatically
    for line in f:
        # Columns are assumed to be separated by exactly four spaces.
        arr = line.split("    ")
        if len(arr) == 2:
            items.append((arr[0], float(arr[1])))

Note that the numbers are floats, not ints. Also note that the last line of the file is blank, hence the len(arr) == 2 check.
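Splitting on exactly four spaces will skip any line that uses a tab or a different number of spaces. A possible variant, assuming neither column contains whitespace of its own, is to call split() with no argument so that any run of whitespace separates the fields. The name parse_lines and the sample data are hypothetical, used only for illustration.

```python
def parse_lines(lines):
    """Parse 'name value' lines into (str, float) tuples, skipping blank lines."""
    items = []
    for line in lines:
        fields = line.split()  # no argument: split on any run of spaces or tabs
        if len(fields) == 2:
            items.append((fields[0], float(fields[1])))
    return items


# Made-up sample mixing a tab, four spaces, and a trailing blank line.
sample = ["string001\t124.342\n", "string002    235.111\n", "\n"]
print(parse_lines(sample))
```

With a real file, the same function can be called as parse_lines(open("text.txt")), since a file object yields its lines one at a time.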



Make sure you have the proper delimiter, that is, that your columns are actually separated by tab characters and not by spaces.

As an alternative, if your file has a fixed-width layout, you could pass a tuple of integers as the delimiter, giving the width of each field. In your case, that would be:

np.genfromtxt("text.txt", delimiter=(14,7), dtype=[('mystring','S10'),('myint','float')])

(Note that I corrected your dtype to use a float for the second field.) The genfromtxt documentation gives more details.
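If the file might contain either tabs or spaces, another option is to leave out the delimiter argument altogether: by default genfromtxt treats any run of consecutive whitespace as the separator. The sketch below assumes a reasonably recent numpy on Python 3 and reads made-up sample data from an in-memory buffer instead of a file; the field names 'mystring' and 'myfloat' and the 'U10' string type are choices made for this illustration.

```python
import io

import numpy as np

# Made-up sample mirroring the question's file: four spaces between the
# columns and a blank line at the end.
sample = io.StringIO(
    "string001    124.342\n"
    "string002    235.111\n"
    "string003    552.145\n"
    "\n"
)

# No delimiter argument: any run of whitespace (spaces or tabs) splits fields.
y = np.genfromtxt(sample, dtype=[('mystring', 'U10'), ('myfloat', 'f8')])

print(y['mystring'])
print(y['myfloat'])
```

The same call should work with a file name in place of the buffer, provided none of the string values contains whitespace.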
