Python/Pandas: How to create a table of results with new variables and values calculated from an existing dataframe

Question

I want to create a cross table/table/dataframe (whatever the name) like this:

____________________
Performance  "value" (this value must come from the x vector, whose formula is evaluated on the dataset and returns this value)
____________________
LTFU         "value" (this value must come from the y vector, whose formula is evaluated on the dataset and returns this value)
____________________

Please note that the Performance and LTFU values are generated by functions applied to a .csv dataset in Python. Performance and LTFU don't exist in the .csv dataset; both should be created just to let me build a summary of performance.

What I get now is as below:

import pandas as pd
performance=pd.read_csv("https://www.dropbox.com/s/08kuxi50d0xqnfc/demo.csv?dl=1")

x=performance["idade"].sum()
y=performance["idade"].mean()

l = "Performance"
k = "LTFU"

def test(y):
    return pd.DataFrame({'a':y, 'b':x})

test([l,k])

         a        b
0   Performance   x vector value here (it shows 1300, it is correct)
1   LTFU          y vector value here (it shows 1300, it is wrong, it should show 14.130434782608695 instead, according to the instruction of y vector)

You can copy and paste the above code into your Python IDE, test it, and then come back to me with your solution. Please show me an example that produces the table of results I want.

My text has been distorted here. I am posting a screenshot.
– MGB.py, Commented Feb 11, 2018 at 9:48
What are you trying to do? Are you trying to get the same format to save as CSV / txt? Or are you trying to summarize this dataframe to reuse?
– Deena, Commented Feb 11, 2018 at 11:50
@Deena, I am trying to summarize this dataframe with new variables. What I want is to calculate those 2 values from other variables within the dataset. I want to get new values generated by another calculation. Please note that Performance and LTFU don't exist in the csv dataset. They are new variables created just to summarize what I want.
– MGB.py, Commented Feb 11, 2018 at 12:13
I am confused: test([l,k]) returns a DataFrame. So do you need to write it to a file? Or do you need to create another DataFrame from the csv (Performance 1300, LTFU 60), add it to test([l,k]) and write it back?
– jezrael, Commented Feb 12, 2018 at 9:39
Yeah, Jezrael. I need to create another table (or dataframe, whatever the name) which contains the Performance value, which is correct at 1300 according to my function above, and also the LTFU value (which must not be 1300, because the function that generates this value is different from the one for Performance). Did you get it?
– MGB.py, Commented Feb 12, 2018 at 9:44

jezrael · Accepted Answer · 2018-02-13 10:01:55Z

0

You need to assign the output to a DataFrame and then write it to a file with DataFrame.to_csv:

l = "Performance"
k = "LTFU"

#changed input to 2 scalar values
def test(l1,k1):
    #changed a to list [l1, k1]
    #changed b to list [x, y]
    return pd.DataFrame({'a':[l1, k1], 'b':[x, y]})

df1 = test(l,k)
print (df1)
             a            b
0  Performance  1300.000000
1         LTFU    14.130435
df1.to_csv('file.csv', index=False, header=False, sep=' ')
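
To make the difference between the two versions of test easier to see, here is a minimal sketch. It assumes a small made-up dataframe in place of demo.csv (the four idade values are hypothetical, chosen only so the sum and mean are easy to check). When 'b' receives a single scalar, pandas repeats it on every row, which is why the original table showed the sum twice; passing one value per label gives the intended table.

```python
import pandas as pd

# Hypothetical stand-in for demo.csv; only the "idade" column matters here.
performance = pd.DataFrame({"idade": [10, 20, 30, 40]})

x = performance["idade"].sum()   # 100
y = performance["idade"].mean()  # 25.0

labels = ["Performance", "LTFU"]

# Original pattern: 'b' is one scalar, so pandas repeats it for each label.
broadcast = pd.DataFrame({"a": labels, "b": x})

# Fixed pattern: 'b' is a list with one value per label.
summary = pd.DataFrame({"a": labels, "b": [x, y]})

print(broadcast)
print(summary)
```

No file needs to be written for this; summary is already the two-row table and can be displayed or passed on directly.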

edited Feb 13, 2018 at 10:01

answered Feb 12, 2018 at 12:31

jezrael


28 Comments

MGB.py Over a year ago

Your code writes to a .csv and gives me the same wrong result for the y vector formula. I don't want to write the output to a .csv. I just want that simple table as output, nothing more, but it needs to be corrected so that the y value is right and does not replicate the x vector formula. Is there anyone who understands what I want? Please help me! I need that table output to be shown in a Dash Python dashboard.

jezrael Over a year ago

@MGB.py I have a few questions. 1. df1 = test([l,k]); print(df1) creates a dataframe (table) generated from the data in demo.csv. What needs to be done with it? Display only? Or something else? 2. Why are the values in df1 incorrect? If demo.csv contains only the data from the question, what is the expected output? (I cannot see all the data in the file, so I am trying to simplify it.) 3. What is the input to the Dash Python dashboard? A dataframe? A dictionary? Something else? Please be patient, I am trying to help you. But your way of thinking and mine are different, so it seems we don't understand each other. Thank you.

MGB.py Over a year ago

I am editing the question to show you the whole demo dataset so that we speak the same language. Wait.

jezrael Over a year ago

@MGB.py Sure. My main question is why df1 = test([l,k]); print(df1) is not your expected output. Is it because of the 0, 1 index column and the a, b header row? Or something else?

MGB.py Over a year ago

So you had forgotten what you did before. It happens! OK, thanks.


Manjit Ullal · Accepted Answer · 2018-02-11 16:19:09Z

0

Your requirement does not fit the definition of a pandas data frame. You already have the values, so maybe you can produce the output in other ways.
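
As one possible "other way", here is a sketch that builds the two-row summary straight from the column with Series.agg, without a helper function. The dataframe is again a made-up stand-in for demo.csv, and the column names metric and value are hypothetical choices. The final list of dicts is only an assumption about a convenient shape to hand to a dashboard table; check what the component you use actually expects.

```python
import pandas as pd

# Hypothetical stand-in for demo.csv; only the "idade" column matters here.
performance = pd.DataFrame({"idade": [10, 20, 30, 40]})

summary = (
    performance["idade"]
    .agg(["sum", "mean"])                            # Series indexed by 'sum', 'mean'
    .rename({"sum": "Performance", "mean": "LTFU"})  # relabel the two rows
    .rename_axis("metric")                           # name the index before resetting it
    .reset_index(name="value")                       # two columns: metric, value
)
print(summary)

# Plain Python structure: one dict per row of the summary.
records = summary.to_dict("records")
```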

answered Feb 11, 2018 at 16:19

Manjit Ullal


1 Comment

MGB.py Over a year ago

It looks like you did not understand my concern. Yesterday I edited my question to make it clearer. "You already have the values, so maybe you can produce the output in other ways": which values? The second 1300 in the generated table above should not appear there; it is a replication of the first 1300, which is correct according to the function (formula) I created.
