I have been experimenting with pandas. I have figured out how to read data with it, but I have run into trouble writing my output and need your help!

This is my simplified code:

import pandas as pd

df = pd.DataFrame("prices", "product1", "product2", "product3")
prices = df.prices
product1 = df.product1
product2 = df.product2
product3 = df.product3

prices = ["Price Option 1", "Price Option 2", "Price Option 3"]
product1 = [1,2,3]
product2 = [4,5,6]
product3 = [7,8,9]

df.to_csv("output.csv") 

I expected the output to look something like:

prices    product1    product2    product3
price1       1           2           3
price2       4           5           6
price3       7           8           9

But instead I get this error:

Traceback (most recent call last):
  File "output.py", line 3, in <module>
    df = pd.DataFrame("prices", "product1", "product2", "product3")
  File "C:\Python27\lib\site-packages\pandas-0.14.1-py2.7-win32.egg\pandas\core\frame.py", line 194, in __init__
    dtype = self._validate_dtype(dtype)
  File "C:\Python27\lib\site-packages\pandas-0.14.1-py2.7-win32.egg\pandas\core\generic.py", line 108, in _validate_dtype
    dtype = np.dtype(dtype)
TypeError: data type "product3" not understood

And I'm not quite sure why... Help is much appreciated!

  • Your code doesn't run: there is a problem with the way you are constructing the df, never mind writing it to csv. Is the df you are trying to create the one shown as your expected output? Commented Oct 27, 2014 at 18:35
  • Hi Ed, thanks for your reply. Yes, this doesn't work; the expected output is what I thought my code would yield. It seems like some sort of data validation needs to occur, but I am having trouble finding out how to go about it, and I can't seem to find tutorials addressing this issue. Commented Oct 27, 2014 at 18:38

1 Answer

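You are constructing your df incorrectly. The positional arguments of pd.DataFrame are data, index, columns and dtype, so "product3" ends up being passed as the dtype, which is what the TypeError is complaining about. Pass the column names with the columns keyword instead; the following would work: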

In [3]:

df = pd.DataFrame(columns=["prices", "product1", "product2", "product3"])
df.prices = ["Price Option 1", "Price Option 2", "Price Option 3"]
df.product1 = [1,2,3]
df.product2 = [4,5,6]
df.product3 = [7,8,9]
df
Out[3]:
           prices  product1  product2  product3
0  Price Option 1         1         4         7
1  Price Option 2         2         5         8
2  Price Option 3         3         6         9

As would putting your data into a dict:

In [4]:

data = {'prices':["Price Option 1", "Price Option 2", "Price Option 3"], 'product1':[1,2,3], 'product2':[4,5,6], 'product3':[7,8,9]}
pd.DataFrame(data)
Out[4]:
           prices  product1  product2  product3
0  Price Option 1         1         4         7
1  Price Option 2         2         5         8
2  Price Option 3         3         6         9

You can then write to csv as normal with df.to_csv("output.csv").
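
For completeness, here is a minimal sketch of the write step using the dict-based frame from above. The index=False argument is my own addition (the question never says whether the 0, 1, 2 row labels are wanted in the file), and the example returns the CSV as a string instead of writing a file so the result is easy to inspect:

```python
import pandas as pd

data = {
    "prices": ["Price Option 1", "Price Option 2", "Price Option 3"],
    "product1": [1, 2, 3],
    "product2": [4, 5, 6],
    "product3": [7, 8, 9],
}
# Pass the column order explicitly so it does not depend on dict ordering.
df = pd.DataFrame(data, columns=["prices", "product1", "product2", "product3"])

# With no path given, to_csv returns the CSV text instead of writing a file.
# index=False (an assumption about the desired output) drops the row labels.
csv_text = df.to_csv(index=False)
csv_lines = csv_text.splitlines()

# To write the same content to disk, pass a path instead:
# df.to_csv("output.csv", index=False)
```

Leaving out index=False keeps the row labels as an unnamed first column, which is what the plain df.to_csv("output.csv") call would produce.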

2 Comments

Ahh okay I see the slight difference there with how to define the column correctly as an argument. Thanks so much!
I think your fundamental misunderstanding is how to assign values to the columns: it should be done either by attribute or by key, so via df.product1 or df['product1'].
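
To illustrate the point in that last comment, here is a small sketch (the placeholder zeros and the variable names unchanged, by_key and by_attr are made up for the example). Rebinding a local name, as the question's code does, leaves the frame untouched, while assigning through the frame by key or by attribute updates the column:

```python
import pandas as pd

df = pd.DataFrame({"product1": [0, 0, 0]})

# This is what the question's code does: pull the column into a local name,
# then rebind that name to a new list. The DataFrame is not affected.
product1 = df.product1
product1 = [1, 2, 3]
unchanged = df["product1"].tolist()

# Assigning by key goes through the DataFrame and replaces the column.
df["product1"] = [1, 2, 3]
by_key = df["product1"].tolist()

# Assigning by attribute also works, but only for a column that already exists.
df.product1 = [4, 5, 6]
by_attr = df["product1"].tolist()
```

Key-based assignment is probably the safer habit, since it also works for new columns and for names that clash with DataFrame methods.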
