0

I wrote a script that reads lines from a text file. The file looks like this (it has 20 lines):

['39', '40', '39', '38', '35', '38', '39', '39', '42', '37', '40', '41', '37', '39', '39', '40', '38', '40', '39', '40']

['39', '40', '39', '38', '36', '39', '40', '39', '42', '38', '40', '41', '38', '39', '39', '40', '38', '40', '39', '41']

['39', '40', '40', '38', '36', '39', '40', '39', '43', '38', '40', '41', '38', '39', '39', '40', '38', '40', '39', '41']

I wrote this script to produce a new file that looks like this:

39 40 39 38 35 38 39 39 42 37 40 41 37 39 39 40 38 40 39 40

39 40 39 38 36 39 40 39 42 38 40 41 38 39 39 40 38 40 39 41

39 40 40 38 36 39 40 39 43 38 40 41 38 39 39 40 38 40 39 41

This is the script I wrote:

#!/usr/bin/python3
# -*- coding: utf-8 -*-

fichier=open("data.txt", "r")
#resultat=open("data_entier.txt", "w")

j=0;

while j < 20:
    lignes= fichier.readline()
    for i in range(len(lignes)):
         lignes[i] = int(lignes[i])

    print(lignes)

    j+=1

fichier.close()

The error I get is:

ValueError: invalid literal for int() with base 10: '['

  • Looks like all you want to do is replace every occurrence of [, ', , and ] with an empty string, correct? For that, you don't need to convert anything to int. Commented Jul 19, 2017 at 12:05
  • Yeah, exactly. How can I do that? Commented Jul 19, 2017 at 12:07
  • Where did data.txt come from? It seems you chose to dump lists there initially, rather than the format you actually wanted... Commented Jul 19, 2017 at 12:07
  • @DrissAourir stackoverflow.com/questions/6116978/… Commented Jul 19, 2017 at 12:08
  • @Chris_Rands Good job breaking down the XY. :) At this point, yes, the best suggestion would be to generate the data into the file properly in the first place, so you don't have to face this kind of issue. Ultimately the file should have contained just the numbers, without their list representations. Commented Jul 19, 2017 at 12:16

4 Answers

4

The problem is that when you call readline on your file, you get a line like this:

"['39', '40', '39', '38', '35', '38', '39', '39', '42', '37', '40', '41', '37', '39', '39', '40', '38', '40', '39', '40']\n"

As you can see, the first character of the string is [. So the numbers are not structured the way you expect. Since you already have a list represented as a string, consider using literal_eval from the ast module:

>>> from ast import literal_eval
>>> d = literal_eval(d)  # d holds the line read from the file
>>> d
['39', '40', '39', '38', '35', '38', '39', '39', '42', '37', '40', '41', '37', '39', '39', '40', '38', '40', '39', '40']

Now you actually have a list of strings, which you can convert to ints. A simple way to do that:

>>> converted_to_ints = map(int, d)
>>> print(list(converted_to_ints))
[39, 40, 39, 38, 35, 38, 39, 39, 42, 37, 40, 41, 37, 39, 39, 40, 38, 40, 39, 40]

Note that in Python 3, map returns a map object, which is an iterator rather than a list. That is why list is called when printing. You can read about it here:

https://docs.python.org/3/library/functions.html#map
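
One practical consequence of map returning an iterator is that it can only be consumed once. A minimal sketch, using a shortened sample list made up for illustration:

```python
values = ['39', '40', '39']  # shortened sample data (hypothetical)

converted = map(int, values)
first_pass = list(converted)   # consumes the iterator
second_pass = list(converted)  # nothing is left to yield

print(first_pass)   # [39, 40, 39]
print(second_pass)  # []
```

If the converted values are needed more than once, storing the result of list(...) in a variable avoids this surprise.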

Area of Improvement

Based on the contents of the file you are reading, it would be best not to structure the data like this. Instead of writing the list representation into the file, write just the contents of the list. That avoids the need for the solution above, and a simple:

with open('file.txt') as f:
    data = f.read().splitlines()  # splitlines() drops the newline characters
    for line in data:
        pass  # perform operations on line

would suffice.
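
As a rough sketch of that suggestion, the data could be written as space-separated numbers and read back with only split() and int(). The sample rows and the file name data_plain.txt are assumptions made for this example, not part of the original question:

```python
import os
import tempfile

rows = [[39, 40, 39], [39, 40, 40]]  # made-up sample data

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "data_plain.txt")

# Write each row as space-separated numbers instead of str(list).
with open(path, "w") as f:
    for row in rows:
        f.write(" ".join(str(n) for n in row) + "\n")

# Reading it back needs only split() and int().
with open(path) as f:
    loaded = [[int(x) for x in line.split()] for line in f]

print(loaded)  # [[39, 40, 39], [39, 40, 40]]
```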


2

You are not evaluating the line as a list: every line is just a string that happens to start with a '['. So you are iterating over the characters of the line.
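
A small sketch of what happens in the original loop, assuming a line shaped like those in the question (the sample string is shortened for illustration):

```python
# A shortened sample line (hypothetical), shaped like those in data.txt.
line = "['39', '40', '39']\n"

# Indexing or iterating over a string yields single characters, not numbers.
first_chars = list(line[:5])
print(first_chars)  # ['[', "'", '3', '9', "'"]

# int() cannot parse '[', which is exactly the reported error.
try:
    int(line[0])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)  # invalid literal for int() with base 10: '['
```

Even if the conversion succeeded, the assignment lignes[i] = ... would fail, because Python strings are immutable.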

If the file looks the way you describe, however, you can easily evaluate each line with ast.literal_eval:

from ast import literal_eval

with open("data.txt", "r") as fichier:
    for line, _ in zip(fichier, range(20)):
        the_list = literal_eval(line)
        the_list = [int(x) for x in the_list]
        print(the_list)

Here zip is used to limit the loop to the first 20 lines. If you want to process all the lines, you can simply use:

with open("data.txt", "r") as fichier:
    for line in fichier:
        the_list = literal_eval(line)
        the_list = [int(x) for x in the_list]
        print(the_list)
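
Since the question asks for a new file with space-separated numbers, the same idea can be extended to write the output instead of printing lists. This is only a sketch: convert_lines is a hypothetical helper name, and io.StringIO objects stand in for the real input and output files so the example is self-contained.

```python
from ast import literal_eval
import io


def convert_lines(source, target):
    """Write each list-literal line from source to target as space-separated numbers."""
    for line in source:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        numbers = [int(x) for x in literal_eval(line)]
        target.write(" ".join(str(n) for n in numbers) + "\n")


# In-memory stand-ins for the input and output files (shortened sample data).
source = io.StringIO("['39', '40', '39']\n\n['39', '40', '40']\n")
target = io.StringIO()
convert_lines(source, target)

result = target.getvalue()
print(result, end="")
```

With real files, the two StringIO objects could be replaced by open("data.txt") and open("data_entier.txt", "w"), the names that appear in the question's script.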


0

You're reading a line, so you can simply replace the chars you don't want.

string = "['39', '40', '39', '38', '35', '38', '39', '39', '42', '37', '40', '41', '37', '39', '39', '40', '38', '40', '39', '40']"
s = string.replace(',','').replace('[','').replace(']','').replace("'","")
print(s)
#output: 39 40 39 38 35 38 39 39 42 37 40 41 37 39 39 40 38 40 39 40

1 Comment

If you are going to go down that route, and knowing what each line looks like, you could have just done string[1:-2]. This will remove the first [ and the trailing ]\n.
0

Although the ast.literal_eval solutions presented by idjaw and Willem Van Onsem seem like the obvious best fit in this case, let me present another solution:

numbers_text = "['39', '40', '39', '38', '35', '38', '39', '39', '42', '37', '40', '41', '37', '39', '39', '40', '38', '40', '39', '40']"

Instead of chaining multiple replace operations, you can use str.translate to get rid of multiple characters in one pass, by providing the third argument to str.maketrans:

If there is a third argument, it must be a string, whose characters will be mapped to None in the result.

Afterwards, we can either use a simple list comprehension to convert the separate numbers from str to int:

numbers_int = [int(x) for x in numbers_text.translate(str.maketrans("","","[',]")).split()]

Or make use of map:

numbers_int = list(map(int, numbers_text.translate(str.maketrans("","","[',]")).split()))

Both will result in a new list of int:

[39, 40, 39, 38, 35, 38, 39, 39, 42, 37, 40, 41, 37, 39, 39, 40, 38, 40, 39, 40]
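
If the goal is the text format from the question rather than a list of int, the same translation table can be applied on its own. A minimal sketch, assuming a raw line as it would come from the file (shortened here); the names raw_line, delete_table, and cleaned are chosen for this example:

```python
# One raw line as it would come from the file, shortened for illustration.
raw_line = "['39', '40', '39']\n"

# The third argument of str.maketrans lists the characters to delete.
delete_table = str.maketrans("", "", "[',]")

# strip() removes the trailing newline left over from the file.
cleaned = raw_line.translate(delete_table).strip()
print(cleaned)  # 39 40 39
```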

3 Comments

nit: you will most likely have a trailing \n in the string at first.
@idjaw: yes, but it will be removed by split(): "1 2 3\n".split() -> ['1', '2', '3']
Ah right. Based on your solution that is correct. I had split(',') in my head, which would not have done that.
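
The difference discussed in these comments can be checked directly. A short sketch with made-up sample strings:

```python
# split() with no argument splits on any whitespace and drops the newline.
by_whitespace = "1 2 3\n".split()
print(by_whitespace)  # ['1', '2', '3']

# split(',') splits only on commas, so the newline stays in the last item.
by_comma = "1,2,3\n".split(",")
print(by_comma)  # ['1', '2', '3\n']
```

Note that int() ignores surrounding whitespace, so int('3\n') still evaluates to 3; the leftover newline mainly matters when the pieces are used as strings.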
