I opened an 8 MB file in Python because I wanted to batch-change various types of file names. I loaded the file into a string and used the string method replace to replace everything. I then noticed that only half of the file was being replaced, as if Python wasn't fully opening the file.

Is there some kind of string size limit or maximum file size limit that I must stay within in Python?

Refer to the code in "Python search and replace not replacing properly".

I have changed to the suggested code. The buffer is an 8 MB HTML file with over 150k lines. The replacement code works; it's just that it's not replacing everything. For example, one error that is a pain:

When I attempt to replace the string ff10 with FF-10, it gets changed to FF-010.

4 Comments
  • You can open a file of any size, but reading the whole file at once can raise a MemoryError, since a 32-bit system can typically allocate only about 2 GB per process, or you may simply not have enough memory. Commented Aug 20, 2011 at 20:08
  • Show the code that's giving you the problem; that way you can get a more useful answer than one that simply tells you whether your guess is right or not. :) Commented Aug 20, 2011 at 20:10
  • Your code is buggy. The case x==1 will always match first, so you end up with FF-010. Use proper string replacement functions or read up on regexps and/or longest prefix match. Commented Aug 20, 2011 at 21:12
  • Are you using Windows? Are you opening the file in binary mode? If not, try to … Commented Feb 26, 2012 at 22:46
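
The FF-010 symptom mentioned in the comments can be reproduced with a small sketch. The question's actual code isn't shown here, so the replacement table below is hypothetical: it assumes a shorter pattern such as ff1 gets replaced before ff10 is ever tried. One possible fix, under that assumption, is a single regex pass that tries longer keys first.

```python
import re

# Hypothetical replacement table; the real one from the question is not shown.
REPLACEMENTS = {"ff1": "FF-01", "ff10": "FF-10"}


def replace_sequential(text, table):
    """Naive approach: one str.replace call per key, in table order."""
    for old, new in table.items():
        text = text.replace(old, new)
    return text


def replace_longest_first(text, table):
    """Single pass using a regex alternation that tries longer keys first."""
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: table[m.group(0)], text)


sample = "ff1.html ff10.html"
naive_result = replace_sequential(sample, REPLACEMENTS)
fixed_result = replace_longest_first(sample, REPLACEMENTS)
print(naive_result)  # FF-01.html FF-010.html
print(fixed_result)  # FF-01.html FF-10.html
```

In the sequential version, ff1 matches inside ff10 and leaves the trailing 0 behind, so the ff10 rule never gets a chance to fire. Doing everything in one pass also means already-replaced text is never scanned again.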

1 Answer

No, there is no reachable maximum on the size of a file Python can open. 8 MB is tiny in modern terms. You made a mistake somewhere.

People regularly load gigabytes of data into memory. Depending on your computer's RAM and whether you have a 64- or 32-bit OS and processor, the practical maximum for you may be anywhere from about 1 GB upward before you get a MemoryError.

As a test, I just loaded a 350 MB file into a string. It took only a few seconds. I then wrote it back out to a file. That took a little longer. I then hashed the file. The two are identical.
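
A test along those lines could look roughly like the sketch below. This is not the exact script I ran: the function names are made up for illustration, SHA-256 is just one reasonable choice of hash, and the demo uses a small generated stand-in file (8 MB) rather than a 350 MB one.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so the check itself uses little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip(src_path, dst_path):
    """Read src fully into memory, write it to dst, and compare hashes."""
    with open(src_path, "rb") as f:
        data = f.read()  # the whole file as a single bytes object
    with open(dst_path, "wb") as f:
        f.write(data)
    return len(data), sha256_of(src_path) == sha256_of(dst_path)


# Generated stand-in file of 8 MiB; swap in a real path to test a real file.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.bin")
    dst = os.path.join(tmp, "dst.bin")
    with open(src, "wb") as f:
        f.write(os.urandom(1024) * 8192)
    size, identical = round_trip(src, dst)

print(size, identical)
```

If the reported size matches the size on disk and the hashes agree, the read itself is not where data is going missing.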

Python has no problems with large strings, until you hit the limit of your RAM, operating system, or processor.

You say you "went through and loaded the file into a string" -- that sounds like the first place you could have made a mistake. To load a file into a string, you just do fileobject.read(). If you did it some other way, that could be the problem.
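
For reference, here is a minimal sketch of reading a whole file with a single read() call, replacing, and checking that nothing was left behind. The helper name replace_in_file and the UTF-8 encoding are assumptions for illustration; the demo file is generated to roughly match the 150k-line HTML file described in the question.

```python
import os
import tempfile


def replace_in_file(path, old, new, encoding="utf-8"):
    """Read the whole file with read(), replace, and write it back.

    Returns the number of occurrences that were replaced.
    """
    # newline="" keeps the original line endings untouched on read and write.
    with open(path, "r", encoding=encoding, newline="") as f:
        text = f.read()  # one call reads the entire file
    count = text.count(old)
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write(text.replace(old, new))
    return count


# Generated 150,000-line stand-in for the HTML file in the question.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "big.html")
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write('<a href="ff10.htm">link</a>\n' * 150_000)
    replaced = replace_in_file(path, "ff10", "FF-10")
    with open(path, "r", encoding="utf-8", newline="") as f:
        remaining = f.read().count("ff10")

print(replaced, remaining)  # 150000 0
```

Counting occurrences before and after like this is a quick way to tell whether the read is incomplete or the replacement logic is at fault.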

4 Comments

@nobody see my comment on your question
I did a test and added the results to my answer.
@Niklas depending on your computer, you can get a MemoryError at sizes smaller than 2 GB, as I mentioned.
@Peter Trivial edits are discouraged. I appreciate it when people correct errors, but the change you made didn't affect anyone's understanding of the question.
