I am trying to write a specific number of bytes of a string to a file. In C, this would be trivial: since each character is 1 byte, I would simply write however many characters from the string I want.

In Python, however, every character or string is apparently an object of varying size, and I have not been able to find out how to slice a string with byte-level precision.

Things I have tried:

Bytearray: (For $, read >>>, which messes up the formatting.)

$ barray = bytearray('a')
$ import sys
$ sys.getsizeof(barray[0])
24

So turning a character into a bytearray doesn't turn it into an array of bytes as I expected and it's not clear to me how to isolate individual bytes.

Slicing byte objects as described here:

$ value = b'a'
$ sys.getsizeof(value[:1])
34 

Again, a size of 34 is clearly not 1 byte.

memoryview:

$ value = b'a'  
$ mv = memoryview(value)  
$ sys.getsizeof(mv[0])  
34  
$ sys.getsizeof(mv[0][0])  
34  

ord():

$ n = ord('a')  
$ sys.getsizeof(n)  
24  
$ sys.getsizeof(n[0])  

Traceback (most recent call last):  
  File "<pyshell#29>", line 1, in <module>  
    sys.getsizeof(n[0])  
TypeError: 'int' object has no attribute '__getitem__'  

So how can I slice a string to a particular number of bytes? As in C, I don't care whether the slice preserves whole characters; the result just has to be the same each time.

1 Answer

Make sure the string is a byte string. In Python 2.7 a plain str already is one; a unicode string has to be encoded first.

Then slice the string object and write the result to a file opened in binary mode.

In [26]: s = '一二三四'

In [27]: len(s)
Out[27]: 12

In [28]: with open('test', 'wb') as f:
   ....:     f.write(s[:2])
   ....:

In [29]: !ls -lh test
-rw-r--r--  1 satoru  wheel     2B Aug 24 08:41 test
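
For readers on Python 3, where a plain string holds text rather than bytes, the same idea might look roughly like the sketch below. The helper name write_first_bytes, the choice of UTF-8, and the temporary file path are all assumptions made for illustration and are not part of the original answer.

```python
import os
import tempfile


def write_first_bytes(text, n, path, encoding="utf-8"):
    """Encode text and write only its first n bytes to path (hypothetical helper)."""
    data = text.encode(encoding)      # bytes object: len(data) counts bytes
    with open(path, "wb") as f:       # binary mode, so no newline translation
        f.write(data[:n])             # slicing bytes slices at byte boundaries
    return data[:n]


s = u"一二三四"
encoded = s.encode("utf-8")           # each of these characters takes 3 bytes in UTF-8

# Assumed location: a throwaway file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "test")
written = write_first_bytes(s, 2, path)
file_size = os.path.getsize(path)

print(len(encoded), file_size)
```

Cutting after 2 bytes splits the first character, so the file is not valid UTF-8 on its own. That matches the question's requirement that the output only needs to be the same each time.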

6 Comments

Wow. I did not realize write() and splitting a string like that could do that. What threw me off was that sys.getsizeof(s[:1]) was returning 34. Thank you!
@user124384 What sys.getsizeof tells you is the size of the whole string object, including its header overhead, not the length of the underlying byte array.
I see. How is it possible to get the size of the underlying byte array? As I showed in my examples, just using getsizeof() on a bytearray doesn't seem to do it.
@user124384 Since a string is just a byte array in Python 2, why not just use len?
Ahh. I knew len() returned the number of characters, but I didn't realize a character was just one byte as in C. Thank you again!
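
The difference the comments describe can be checked with a small sketch like the one below. It uses bytes literals so that it behaves the same way on Python 2.7 and Python 3; the variable names are illustrative, and the exact sys.getsizeof figure is deliberately not shown because it varies between interpreter versions and platforms.

```python
import sys

value = b"abc"
payload_len = len(value)              # number of bytes in the payload: 3

first = value[:1]                     # a new bytes object holding one byte
object_size = sys.getsizeof(first)    # whole object: header plus payload

barray = bytearray(b"abc")
item = barray[0]                      # indexing a bytearray yields an int (97),
                                      # which is why getsizeof(barray[0]) reports
                                      # the size of an int object, not 1

print(payload_len, len(first), object_size, item)
```

In short, len() answers "how many bytes will be written", while sys.getsizeof() answers "how much memory does this Python object occupy".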