I like to add attributes to pandas DataFrame columns, for example to manage labels or units.

import pandas as pd

df = pd.DataFrame([[1, 2], [5, 6]], columns=['A', 'B'])
df['A'].units = 'm/s'

Accessing the column's units with df['A'].units returns m/s.

However, the attribute gets lost after any DataFrame to Series operation, such as adding a new column:

df['C'] = [3, 8]
df['A'].units

AttributeError: 'Series' object has no attribute 'units'

Is there a way to keep the attributes, or an alternative way to add columns?

  • Yes! Create your own class with the DataFrame as an attribute of your class. Manage all column attributes in your class. Commented Dec 23, 2016 at 17:15
  • @piRSquared, could you give an explicit example? I would like the object itself to return the DataFrame when it is called... Commented Jan 12, 2017 at 8:13
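
The wrapper-class idea from the first comment could look roughly like the sketch below. It is only one possible reading of the suggestion: the class name `UnitFrame`, the `units` mapping, and the choice to return the DataFrame from `__call__` are all assumptions made for illustration, not something given in the thread.

```python
import pandas as pd


class UnitFrame:
    """Hypothetical wrapper: a DataFrame plus a per-column units mapping."""

    def __init__(self, df, units=None):
        self.df = df
        self.units = dict(units or {})  # column name -> unit string

    def __getitem__(self, key):
        # Column access is forwarded to the wrapped DataFrame.
        return self.df[key]

    def __setitem__(self, key, value):
        # Adding or replacing a column does not touch the units mapping.
        self.df[key] = value

    def __call__(self):
        # Calling the object returns the underlying DataFrame.
        return self.df

    def __repr__(self):
        return repr(self.df)


uf = UnitFrame(
    pd.DataFrame([[1, 2], [5, 6]], columns=['A', 'B']),
    units={'A': 'm/s'},
)
uf['C'] = [3, 8]           # add a column through the wrapper
print(uf.units['A'])       # units are stored on the wrapper, not on the Series
print(list(uf().columns))  # uf() gives back the plain DataFrame
```

Because the units live on the wrapper rather than on the Series objects, they do not depend on which Series object pandas hands back for a column.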

1 Answer

_metadata is not part of the public API, so this is not a stable way of doing it. Still, for now:

In [8]: df = pd.DataFrame([[1, 2], [5, 6]], columns=['A', 'B'])

In [9]: df['A']._metadata
Out[9]: ['name']

In [10]: df['A']._metadata.append({'units': 'm/s'})

In [11]: df['C'] = [3, 8]

In [12]: df['A']._metadata
Out[12]: ['name', {'units': 'm/s'}]
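
As a possible alternative that avoids private attributes, newer pandas versions (1.0 and later) expose a public `attrs` dictionary on DataFrame and Series, which the pandas documentation marks as experimental. The sketch below assumes the units are kept on the DataFrame under a hypothetical `'units'` key, mapping column names to unit strings; that layout is an illustration, not a pandas convention.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [5, 6]], columns=['A', 'B'])

# Assumption: units are stored in df.attrs under a hypothetical 'units' key,
# as a mapping from column name to unit string.
df.attrs['units'] = {'A': 'm/s'}

df['C'] = [3, 8]  # adding a column in place leaves df.attrs untouched

print(df.attrs['units']['A'])
```

Whether `attrs` is carried over to new objects produced by other operations depends on the operation and the pandas version, so it is worth checking for the operations you rely on.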