
I wrote a Python script that reads data from shared memory and converts it from bytes to floats. The main problem is that it is very slow.

This is how I init the shared memory:

  def _shared_mem_init(self):

        warnings.filterwarnings("ignore")
        path = "/tmp"

        # shared memory header
        key = ipc.ftok(path, 0x3110)
        self.shm = ipc.SharedMemory(key, 0, 0)
        self.shm.attach(0, 0)
        # shared memory X values
        key_x = ipc.ftok(path, 0x3111)
        self.shm_x = ipc.SharedMemory(key_x, 0, 0)
        self.shm_x.attach(0, 0)
        # shared memory Y values
        key_y = ipc.ftok(path, 0x3112)
        self.shm_y = ipc.SharedMemory(key_y, 0, 0)
        self.shm_y.attach(0, 0)
        # shared memory Z values
        key_z = ipc.ftok(path, 0x3113)
        self.shm_z = ipc.SharedMemory(key_z, 0, 0)
        self.shm_z.attach(0, 0)
        # shared memory R values
        key_r = ipc.ftok(path, 0x3114)
        self.shm_r = ipc.SharedMemory(key_r, 0, 0)
        self.shm_r.attach(0, 0)
        # shared memory G values
        key_g = ipc.ftok(path, 0x3115)
        self.shm_g = ipc.SharedMemory(key_g, 0, 0)
        self.shm_g.attach(0, 0)
        # shared memory B values
        key_b = ipc.ftok(path, 0x3116)
        self.shm_b = ipc.SharedMemory(key_b, 0, 0)
        self.shm_b.attach(0, 0)

        self.shm.write(byte_true, 0)
        print("shared Memory init")

It reads the XYZ and RGB values from shared memory, and after I get the data I convert it from bytes to floats:

 def next_point_cloud(self):

        # read 4 bytes from header - Data Lines
        buf = self.shm.read(4, 5)
        data_lines = int.from_bytes(buf, "little")

        # read all data
        buff_x2 = self.shm_x.read(4 * data_lines, 0)
        buff_y2 = self.shm_y.read(4 * data_lines, 0)
        buff_z2 = self.shm_z.read(4 * data_lines, 0)
        buff_r2 = self.shm_r.read(data_lines, 0)
        buff_g2 = self.shm_g.read(data_lines, 0)
        buff_b2 = self.shm_b.read(data_lines, 0)

        # split all data
        buff_x_breakdown = [buff_x2[i:i + 4] for i in range(0, 4 * data_lines, 4)]
        buff_y_breakdown = [buff_y2[i:i + 4] for i in range(0, 4 * data_lines, 4)]
        buff_z_breakdown = [buff_z2[i:i + 4] for i in range(0, 4 * data_lines, 4)]
        buff_r_breakdown = [buff_r2[i] for i in range(0, data_lines, 1)]
        buff_g_breakdown = [buff_g2[i] for i in range(0, data_lines, 1)]
        buff_b_breakdown = [buff_b2[i] for i in range(0, data_lines, 1)]

        xyz = np.zeros((data_lines, 3))
        colors = np.zeros((data_lines, 3))

        for i in range(data_lines):
            xyz[i, 0] = struct.unpack('f', buff_x_breakdown[i])[0]
            xyz[i, 1] = struct.unpack('f', buff_y_breakdown[i])[0]
            xyz[i, 2] = struct.unpack('f', buff_z_breakdown[i])[0]

            colors[i, 0] = float(buff_r_breakdown[i]) / 255.0
            colors[i, 1] = float(buff_g_breakdown[i]) / 255.0
            colors[i, 2] = float(buff_b_breakdown[i]) / 255.0

        self.pcdA.points = o3d.utility.Vector3dVector(xyz)
        self.pcdA.colors = o3d.utility.Vector3dVector(colors)

So my question is: is there a way in Python to write code that runs faster than the for loop I wrote?

1 Answer


You can use np.frombuffer to construct a NumPy array from a bytes object:

>>> buffer = b''.join(struct.pack('f', x) for x in [1.0, 2.0, 3.0])
>>> np.frombuffer(buffer, dtype=np.float32, count=3)
array([1., 2., 3.], dtype=float32)

So in your case this should be:

xyz = np.stack(
    [
        np.frombuffer(self.shm_x.read(4*data_lines, 0), dtype=np.float32, count=data_lines),
        np.frombuffer(self.shm_y.read(4*data_lines, 0), dtype=np.float32, count=data_lines),
        np.frombuffer(self.shm_z.read(4*data_lines, 0), dtype=np.float32, count=data_lines),
    ],
    axis=1,
)

and the same for colors (using np.uint8 rather than np.byte, so the 0-255 channel values stay non-negative):

colors = (1./255) * np.stack(
    [
        np.frombuffer(self.shm_r.read(data_lines, 0), dtype=np.uint8, count=data_lines),
        np.frombuffer(self.shm_g.read(data_lines, 0), dtype=np.uint8, count=data_lines),
        np.frombuffer(self.shm_b.read(data_lines, 0), dtype=np.uint8, count=data_lines),
    ],
    axis=1,
)
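
Putting it together, next_point_cloud could look roughly like this (a sketch only: it keeps the attribute names from your question, reads every buffer before converting, as your original loop does, and casts to float64 to match Vector3dVector's double storage):

def next_point_cloud(self):
    # header: number of points, 4 bytes at offset 5
    buf = self.shm.read(4, 5)
    data_lines = int.from_bytes(buf, "little")

    # read every buffer first, as the original code does
    buff_x = self.shm_x.read(4 * data_lines, 0)
    buff_y = self.shm_y.read(4 * data_lines, 0)
    buff_z = self.shm_z.read(4 * data_lines, 0)
    buff_r = self.shm_r.read(data_lines, 0)
    buff_g = self.shm_g.read(data_lines, 0)
    buff_b = self.shm_b.read(data_lines, 0)

    # reinterpret the raw bytes as arrays, with no per-point Python loop
    xyz = np.stack(
        [
            np.frombuffer(buff_x, dtype=np.float32, count=data_lines),
            np.frombuffer(buff_y, dtype=np.float32, count=data_lines),
            np.frombuffer(buff_z, dtype=np.float32, count=data_lines),
        ],
        axis=1,
    ).astype(np.float64)

    colors = (1.0 / 255.0) * np.stack(
        [
            np.frombuffer(buff_r, dtype=np.uint8, count=data_lines),
            np.frombuffer(buff_g, dtype=np.uint8, count=data_lines),
            np.frombuffer(buff_b, dtype=np.uint8, count=data_lines),
        ],
        axis=1,
    )

    self.pcdA.points = o3d.utility.Vector3dVector(xyz)
    self.pcdA.colors = o3d.utility.Vector3dVector(colors)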

2 Comments

Hey, I noticed that with this method I lose some data and colors, unlike the slower method from my question. Do you have any idea what the problem might be?
The only difference I can spot is that in your original code, you first read from all buffers, then process the data. That makes those reads very close in time (if that matters). In my version, I do np.frombuffer(self.shm_x.read...), so it goes (read r, process r), (read g, process g), (read b, process b). Even though np.frombuffer is fast, this puts read r and read b further apart in time. Perhaps your buffers get updated in the meantime? If that is the case, however, then your original code seems brittle too. It would be best to lock the buffers during reading.
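
(For reference, a minimal sketch of what "lock the buffers during reading" could look like with a SysV semaphore, assuming the writer process opens the same semaphore and acquires it around its own writes; the key value 0x3117 and the method names are hypothetical:)

import sysv_ipc as ipc  # assumption: same package as the shared-memory calls above

def _sem_init(self):
    # hypothetical: reader and writer agree on one extra key for a mutex-style semaphore
    self.sem = ipc.Semaphore(ipc.ftok("/tmp", 0x3117), ipc.IPC_CREAT, initial_value=1)

def _read_buffers(self, data_lines):
    # hold the semaphore only long enough to copy the raw bytes out
    self.sem.acquire()
    try:
        return (
            self.shm_x.read(4 * data_lines, 0),
            self.shm_y.read(4 * data_lines, 0),
            self.shm_z.read(4 * data_lines, 0),
            self.shm_r.read(data_lines, 0),
            self.shm_g.read(data_lines, 0),
            self.shm_b.read(data_lines, 0),
        )
    finally:
        self.sem.release()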
