Given a structured numpy array, I want to remove certain columns by name without copying the array. I know I can do this:

names = list(a.dtype.names)
if name_to_remove in names:
    names.remove(name_to_remove)
a = a[names]

But this creates a temporary copy of the array which I want to avoid because the array I am dealing with might be very large.

Is there a good way to do this?

  • If you want to avoid using the "names" list, you may write a lambda function which does this operation. Commented May 6, 2016 at 18:45
  • The problem is that a[names] creates a copy of the original array, assigns it to a and only then deletes the original array. I want to avoid that copy. Maybe I should clarify my question somehow? Commented May 6, 2016 at 18:59
  • You are talking about creation of the "names" list right? Commented May 6, 2016 at 19:01
  • In general I don't think it's possible for much the same reason that you can't remove arbitrary rows or columns from a 2D numpy array without generating a copy. Structured numpy arrays are backed by contiguous blocks of memory, where elements in adjacent fields reside at adjacent addresses. If you wanted to remove an arbitrary field from the middle of the array, you would need to "shift over" the elements in all of the fields after it, which would require a copy. Commented May 6, 2016 at 19:43
  • @ali_m: Actually the fields do not have to be adjacent. See my answer. Commented May 6, 2016 at 21:28

1 Answer

You can create a new data type containing just the fields that you want, with the same field offsets and the same itemsize as the original array's data type, and then use this new data type to create a view of the original array. The numpy.dtype constructor accepts arguments in many formats; the relevant one is described in the section of the documentation called "Specifying and constructing data types". Scroll down to the subsection that begins with

{'names': ..., 'formats': ..., 'offsets': ..., 'titles': ..., 'itemsize': ...}
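As a minimal sketch of that dictionary form (the field layout here is an assumption chosen to match the example array used further down), the following builds a data type that keeps only two of four fields while preserving their byte offsets and the overall itemsize:

```python
import numpy as np

# Assumed original layout: four packed 8-byte fields, so itemsize is 32.
full = np.dtype([('x', '<f8'), ('y', '<f8'), ('i', '<i8'), ('j', '<i8')])

# Keep only 'x' and 'i'. Each kept field stays at its original byte offset,
# and 'itemsize' pads the record out to the original 32 bytes.
partial = np.dtype({'names': ['x', 'i'],
                    'formats': ['<f8', '<i8'],
                    'offsets': [full.fields['x'][1], full.fields['i'][1]],
                    'itemsize': full.itemsize})

print(partial.itemsize)        # size of one record in bytes
print(partial.fields['i'][1])  # byte offset of field 'i'
```

Because the offsets and itemsize match the original, each record of the new data type lines up with a record of the old one, which is what makes a view possible.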

Here are a couple of convenience functions that use this idea.

import numpy as np


def view_fields(a, names):
    """
    `a` must be a numpy structured array.
    `names` is the collection of field names to keep.

    Returns a view of the array `a` (not a copy).
    """
    dt = a.dtype
    formats = [dt.fields[name][0] for name in names]
    offsets = [dt.fields[name][1] for name in names]
    itemsize = a.dtype.itemsize
    newdt = np.dtype(dict(names=names,
                          formats=formats,
                          offsets=offsets,
                          itemsize=itemsize))
    b = a.view(newdt)
    return b


def remove_fields(a, names):
    """
    `a` must be a numpy structured array.
    `names` is the collection of field names to remove.

    Returns a view of the array `a` (not a copy).
    """
    dt = a.dtype
    keep_names = [name for name in dt.names if name not in names]
    return view_fields(a, keep_names)

For example,

In [297]: a
Out[297]: 
array([(10.0, 13.5, 1248, -2), (20.0, 0.0, 0, 0), (30.0, 0.0, 0, 0),
       (40.0, 0.0, 0, 0), (50.0, 0.0, 0, 999)], 
      dtype=[('x', '<f8'), ('y', '<f8'), ('i', '<i8'), ('j', '<i8')])

In [298]: b = remove_fields(a, ['i', 'j'])

In [299]: b
Out[299]: 
array([(10.0, 13.5), (20.0, 0.0), (30.0, 0.0), (40.0, 0.0), (50.0, 0.0)], 
      dtype={'names':['x','y'], 'formats':['<f8','<f8'], 'offsets':[0,8], 'itemsize':32})

Verify that b is a view (not a copy) of a by changing b[0]['x']...

In [300]: b[0]['x'] = 3.14

and seeing that a is also changed:

In [301]: a[0]
Out[301]: (3.14, 13.5, 1248, -2)
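Another way to check this, sketched here with a zero-filled stand-in array rather than the array from the session above, is to ask numpy directly whether the two arrays share memory. The construction of the new data type is inlined so the snippet runs on its own:

```python
import numpy as np

# Stand-in array with the same field layout as the example above.
a = np.zeros(5, dtype=[('x', '<f8'), ('y', '<f8'), ('i', '<i8'), ('j', '<i8')])

# Same construction as in view_fields, inlined: keep 'x' and 'y' only.
dt = a.dtype
keep = ['x', 'y']
newdt = np.dtype({'names': keep,
                  'formats': [dt.fields[n][0] for n in keep],
                  'offsets': [dt.fields[n][1] for n in keep],
                  'itemsize': dt.itemsize})
b = a.view(newdt)

# Both checks should report that b is backed by a's buffer.
print(np.shares_memory(a, b))
print(b.base is a)

# Writing through the view changes the original array.
b['x'][0] = 3.14
print(a['x'][0])
```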

3 Comments

Unfortunately, this doesn't work for the object dtype: TypeError: Cannot change data-type for object array.
Since NumPy 1.16, multi-field indexing (a[names]) returns a view, so it does the same thing as view_fields.
@mapft, it's not a good idea to post a new error in a comment. Errors are best addressed with context (minimal reproducible example) and traceback.
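To illustrate the comment about NumPy 1.16, here is a small sketch assuming NumPy 1.16 or later and a zero-filled stand-in array. Indexing with a list of field names gives a view that keeps the original itemsize, and numpy.lib.recfunctions.repack_fields can be used when a compact copy is wanted instead:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Stand-in array with the same field layout as the answer's example.
a = np.zeros(3, dtype=[('x', '<f8'), ('y', '<f8'), ('i', '<i8'), ('j', '<i8')])

# In NumPy >= 1.16 this is a view: the dtype keeps the original
# offsets and itemsize, with padding where 'i' and 'j' used to be.
b = a[['x', 'y']]
print(np.shares_memory(a, b))
print(b.dtype.itemsize)

# repack_fields returns a packed copy without the padding bytes.
packed = rfn.repack_fields(b)
print(packed.dtype.itemsize)
```

The view costs no extra memory but each record still spans the full original width; the repacked copy is smaller per record but no longer tied to a.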
