I'm using pandas to explore a dataset and want to count every pair of values across two columns of a DataFrame with value_counts(subset=...). The output keeps showing blank spaces that shouldn't be there, which makes it unreadable. Does anyone know how to avoid this?

Attempted:

import pandas as pd

# Column names for the headerless CSV file
clm = [ 'class', 'cap-shape', 'cap-surface', 'cap-color', 'bruises?', 'odor', 'gill-attachment', 'gill-spacing', 'gill-size', 'gill-color', 'stalk-shape', 'stalk-root', 'stalk-surface-above-ring', 'stalk-surface-below-ring', 'stalk-color-above-ring', 'stalk-color-below-ring', 'veil-type', 'veil-color', 'ring-number', 'ring-type', 'spore-print-color', 'population', 'habitat' ]

myTable = pd.read_csv(file_path, header=None, names=clm)

# Count each (class, cap-color) pair
print(myTable.value_counts(subset=['class', 'cap-color']))

Result:

result table

There shouldn't be any blank spaces; each row should have a 'p' or an 'e'.

  • You can add dropna=False Commented Mar 17, 2023 at 1:21
  • myTable.value_counts(subset = ['class', 'cap-color']).reset_index() Commented Mar 17, 2023 at 1:22
  • dropna=False does not change the result at all. Commented Mar 17, 2023 at 1:23
  • reset_index() worked! thank you! Commented Mar 17, 2023 at 1:29
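
A likely reason reset_index() helps: value_counts(subset=...) returns a Series indexed by a MultiIndex, and when pandas prints a MultiIndex it leaves repeated outer-level labels blank by default. The values are still present; only the display hides them. Below is a minimal sketch of that behaviour. The small DataFrame is a made-up stand-in for the mushroom data, and the names df, counts and flat are only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the real data: only the two columns of interest.
df = pd.DataFrame({
    "class": ["p", "p", "e", "e", "e", "p"],
    "cap-color": ["n", "n", "n", "w", "w", "w"],
})

# A Series whose index is a MultiIndex of (class, cap-color) pairs.
counts = df.value_counts(subset=["class", "cap-color"])

# Moving the index levels into ordinary columns means every row
# carries its own 'p' or 'e' when printed.
flat = counts.reset_index(name="count")
print(flat)
```

Passing name="count" just gives the counts column a predictable label regardless of pandas version.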

1 Answer

In value_counts(), set the parameter normalize=False so that raw counts are returned rather than proportions (False is already the default value):

myTable.value_counts(subset=['class', 'cap-color'], normalize=False)
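
As a hedged check, the sketch below compares the default call with normalize=False on a made-up DataFrame (df is a hypothetical stand-in for myTable). It also shows one way to keep the MultiIndex while printing every label, using the pandas display option display.multi_sparse. This is offered as an alternative, not as part of the original answer.

```python
import pandas as pd

# Hypothetical stand-in for the real data.
df = pd.DataFrame({
    "class": ["p", "p", "e", "e", "e", "p"],
    "cap-color": ["n", "n", "n", "w", "w", "w"],
})

default_counts = df.value_counts(subset=["class", "cap-color"])
explicit_counts = df.value_counts(subset=["class", "cap-color"], normalize=False)

# normalize=False is the default, so both calls give the same Series.
same = default_counts.equals(explicit_counts)
print(same)

# Turning off sparsified display repeats the outer label on every row.
with pd.option_context("display.multi_sparse", False):
    dense_text = default_counts.to_string()
print(dense_text)
```

The option only affects how the index is printed; the underlying counts are unchanged either way.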