-1

I am using this code to find a string in Python:

buildSucceeded = "Build succeeded."
datafile = r'C:\PowerBuild\logs\Release\BuildAllPart2.log'

with open(datafile, 'r') as f:
    for line in f:
        if buildSucceeded in line:
            print(line)

I am quite sure the string is in the file, yet the code prints nothing.

If I just print the file line by line, the output shows a 'NUL' character between each pair of "valid" characters.

EDIT 1: The problem was the encoding of the file on Windows. I changed the encoding following this post and it worked: Why doesn't Python recognize my utf-8 encoded source file?

Anyway the file looks like this:

Line 1.
Line 2.
...
Build succeeded.
    0 Warning(s)
    0 Error(s)
...

I am currently testing with the Sublime editor on Windows, which shows a 'NUL' character between each "real" character. That is very odd.

Running the script from the Python command line gives this output:

C:\Dev>python readFile.py
Traceback (most recent call last):
  File "readFile.py", line 7, in <module>
    print(line)
  File "C:\Program Files\Python35\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\xfe' in position 1: character maps to <undefined>
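
For reference, both symptoms can be reproduced in memory. This is only a sketch under assumptions: it supposes the log is UTF-16 little-endian with a byte order mark, and that Python decoded it with a single-byte code page such as cp1252 (the real default depends on the system). The variable names are made up for the illustration.

```python
import codecs

marker = "Build succeeded."

# Assumed on-disk form of the log: BOM (FF FE) followed by 2 bytes per character.
raw = codecs.BOM_UTF16_LE + marker.encode("utf-16-le")

# Decoding with a single-byte code page (assumed: cp1252) keeps every byte
# as its own character, so each letter is followed by a NUL.
misread = raw.decode("cp1252")
found_when_misread = marker in misread
nul_count = misread.count("\x00")

# Printing to a cp437 console means encoding to cp437, which has no
# mapping for '\xfe' (the second BOM byte), hence the traceback.
try:
    misread.encode("cp437")
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

# Decoding with the right codec consumes the BOM and restores the text.
decoded = raw.decode("utf-16")
found_when_decoded = marker in decoded

print(found_when_misread, nul_count, found_when_decoded)
print(encode_error)
```

If these assumptions hold, the search fails only because the NULs sit between the letters, not because the text is missing from the file.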

Thanks for your help anyway...

3
  • 1. "Quite sure" isn't enough I'm afraid. 2. Try to use strip in if buildSucceeded in line.strip() in order to remove trailing '\n'. Commented Mar 24, 2017 at 18:19
  • Try for line in f: instead of splitting the entire file. Then you can strip out the nul chars before you print. Commented Mar 24, 2017 at 18:19
  • Welcome to StackOverflow. Please read and follow the posting guidelines in the help documentation. Minimal, complete, verifiable example applies here. We cannot effectively help you until you post your MCVE code and accurately describe the problem. Read the file line by line, print each line as you read it, and see what you actually have. If this fails, then chop the data file down to a few lines, reproduce the problem, and post the output here. Commented Mar 24, 2017 at 18:21

2 Answers

0

If your file is not that big, you can do a simple find. Otherwise, I would check the file to confirm that it really contains the string, check the file location for spelling mistakes, and try to narrow down the problem.

f = open(datafile, 'r')
lines = f.read()
answer = lines.find(buildSucceeded)

Also note that if it does not find the string, answer will be -1.
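
As a quick illustration of how str.find reports its result, here is a small sketch on an in-memory string. The sample text and the second search string are invented for the example; they are not taken from the real log.

```python
# Hypothetical stand-in for the file contents.
sample_text = "Line 1.\nBuild succeeded.\n    0 Warning(s)\n"

# find returns the index of the first match...
hit = sample_text.find("Build succeeded.")

# ...and -1 when the substring is absent.
miss = sample_text.find("Build failed.")

print(hit, miss)
```

A result of -1 only says that the exact character sequence was not found in the decoded text, so it cannot by itself tell a missing string apart from a file read with the wrong encoding.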


0

As explained, the problem was related to encoding. The website linked below gives a very good explanation of how to convert a file from one encoding to another.

I used the last example (the Python 3 one, which is my case) and it worked as expected:

buildSucceeded = "Build succeeded."
datafile = 'C:\\PowerBuild\\logs\\Release\\BuildAllPart2.log'

# Open both input and output streams (named src/dst to avoid shadowing the built-in input()).
src = open(datafile, "r", encoding="utf-16")
dst = open("output.txt", "w", encoding="utf-8")

# Stream chunks of unicode data.
with src, dst:
    while True:
        # Read a chunk of data.
        chunk = src.read(4096)
        if not chunk:
            break
        # Remove vertical tabs.
        chunk = chunk.replace("\u000B", "")
        # Write the chunk of data.
        dst.write(chunk)

# Read the converted copy back with the same encoding it was written in.
with open('output.txt', 'r', encoding='utf-8') as f:
    for line in f:
        if buildSucceeded in line:
            print(line)

Source: http://blog.etianen.com/blog/2013/10/05/python-unicode-streams/
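
The intermediate output.txt is probably not required if the only goal is to search the log. A minimal sketch, assuming the log really is UTF-16 as the conversion above suggests: pass the encoding to open and search the decoded lines directly. The helper find_lines and the sample file are hypothetical and exist only for this illustration.

```python
import os
import tempfile


def find_lines(path, needle, encoding="utf-16"):
    """Return the lines of `path` that contain `needle`, decoded with `encoding`."""
    with open(path, "r", encoding=encoding) as f:
        return [line.rstrip("\n") for line in f if needle in line]


# Hypothetical sample log, written as UTF-16 so the demo is self-contained.
sample = "Line 1.\nLine 2.\nBuild succeeded.\n    0 Warning(s)\n    0 Error(s)\n"

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "sample.log")
    with open(log_path, "w", encoding="utf-16") as f:
        f.write(sample)
    matches = find_lines(log_path, "Build succeeded.")

print(matches)
```

With the real log, the call would be find_lines(datafile, buildSucceeded). If the file turns out to use a different encoding, the encoding argument is the only thing that needs to change.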
