
A script produces a list of bytes objects (each 4 bytes long). I want to save the list to a .csv file using the csv module and read it back later in another script. Here is the code, based on what I learned from Python's official documentation:

import struct
import csv

k   = 0x100000
rng = range(0, k)
x1 = [b''] * k
x = 0xffffffff

for i in rng:
    x1[i]   = struct.pack("<L", x)
    x -= 1

print(x1[0])              # b'\xff\xff\xff\xff'

List = x1

with open("test.csv", 'w', newline='') as rF:
    wr = csv.writer(rF, dialect='excel')
    for i in List:
        wr.writerow(i)

When I open the resulting test.csv in Notepad, I see four columns of 8-bit integers instead of one column of byte strings. The first few lines of test.csv are:

255,255,255,255
254,255,255,255
253,255,255,255
252,255,255,255
251,255,255,255
250,255,255,255
       .
       .
       .

What am I doing wrong? Is there a way to get a csv file with one column of byte strings? For example:

b'\xff\xff\xff\xff'
b'\xfe\xff\xff\xff'
b'\xfd\xff\xff\xff'
          .
          .
          .

Actually, I do not care how the bytes are stored in the csv. I only care about getting them back into a list of bytes objects using csv.reader in another script, and I want loading to be as fast as possible.

    CSV can't save binary data. Think about using raw binary file instead of csv. Commented Nov 17, 2018 at 17:14

1 Answer

Writing the list through a pandas DataFrame gives a single column:

import pandas as pd
import struct

k   = 0x100000
rng = range(0, k)
x1 = [b''] * k
x = 0xffffffff

for i in rng:
    x1[i]   = struct.pack("<L", x)
    x -= 1

df = pd.DataFrame()
df["data"] = x1
df.to_csv("test.csv", index=False, header=None)

This writes the text of each item's bytes literal, one per line. Sample output:

b'\xff\xff\xff\xff'
b'\xfe\xff\xff\xff'
b'\xfd\xff\xff\xff'
b'\xfc\xff\xff\xff'
b'\xfb\xff\xff\xff'
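
For context, the four integer columns in the question arise because csv.writer.writerow treats its argument as an iterable of fields, and iterating over a bytes object yields integers. A small illustration using only the standard library (the variable names are illustrative, not from the question):

```python
import csv
import io

record = b'\xfe\xff\xff\xff'

# Iterating over bytes yields one int per byte.
print(list(record))

# Passing the bytes object directly: each byte becomes its own field.
buf = io.StringIO()
csv.writer(buf).writerow(record)
four_columns = buf.getvalue()

# Wrapping it in a list: the whole object becomes a single field,
# written as the text of its bytes literal.
buf = io.StringIO()
csv.writer(buf).writerow([record])
one_column = buf.getvalue()

print(four_columns.strip())
print(one_column.strip())
```

Wrapping each item in a one-element list is what produces a single column; the field is still text, not raw binary data.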

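You can also use pandas, instead of csv, to read the file back. Pass header=None because the file was written without a header row. Note that the values come back as strings, not as bytes objects.

df = pd.read_csv("test.csv", header=None)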
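
Whichever reader is used, each field in this file is the text of a bytes literal rather than a bytes object. A minimal sketch of one way to convert that text back, assuming ast.literal_eval is acceptable for the input (the sample string below is made up to mirror the sample output above, and csv.reader stands in for whatever reads the file):

```python
import ast
import csv
import io

# Two lines as they would appear in the file: bytes literals stored as text.
sample = "b'\\xff\\xff\\xff\\xff'\nb'\\xfe\\xff\\xff\\xff'\n"

# Parse each single-field row and evaluate the literal back into bytes.
rows = csv.reader(io.StringIO(sample))
values = [ast.literal_eval(row[0]) for row in rows]

print(values)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for text read from a file.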

Alternative

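from pprint import pprint

# Write the 4-byte records back to back. No newline separator is used,
# because a packed value can itself contain the byte 0x0a (b'\n').
with open("test.csv", "wb") as f:
    for i in x1:
        f.write(i)

# Reading file: split the contents into 4-byte records
with open("test.csv", "rb") as f:
    data = f.read()
y = [data[j:j + 4] for j in range(0, len(data), 4)]
pprint(y[:10])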

Output

[b'\xff\xff\xff\xff',
 b'\xfe\xff\xff\xff',
 b'\xfd\xff\xff\xff',
 b'\xfc\xff\xff\xff',
 b'\xfb\xff\xff\xff',
 b'\xfa\xff\xff\xff',
 b'\xf9\xff\xff\xff',
 b'\xf8\xff\xff\xff',
 b'\xf7\xff\xff\xff',
 b'\xf6\xff\xff\xff']
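
If the data must go through csv.writer and csv.reader from the standard library, one option is to store each item as hexadecimal text, which contains no commas, quotes, or newlines. The following is a sketch under that assumption; the helper names save_bytes_csv and load_bytes_csv and the file name test_hex.csv are illustrative and do not come from the question.

```python
import csv
import os
import struct
import tempfile


def save_bytes_csv(path, items):
    """Write each bytes item as one hex-encoded field per row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for item in items:
            # The one-element list keeps the item in a single column.
            writer.writerow([item.hex()])


def load_bytes_csv(path):
    """Read the rows back and decode each hex field into bytes."""
    with open(path, newline="") as f:
        return [bytes.fromhex(row[0]) for row in csv.reader(f)]


# A small sample that includes a value whose low byte is 0x0a.
values = [struct.pack("<L", 0xffffffff - n) for n in range(0x100)]

path = os.path.join(tempfile.gettempdir(), "test_hex.csv")
save_bytes_csv(path, values)
restored = load_bytes_csv(path)

print(restored[:3])
```

Each 4-byte item becomes 8 characters of text, so the file is larger than a raw binary dump, but it stays a valid one-column csv.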

4 Comments

Is there any other way to do it without having to use a third party library?!
@PouJa Updated the answer with an alternative. See if that works.
Thank you so much for helping. I will definitely accept the answer once I make sure it works. Just wondering how to recover it into a list of bytes items in another script?
@PouJa I have updated the answer with code for reading, plus a slight change to the alternative write approach.
