Update values in DataFrame based on criteria with calculation

Question

My end goal is to adjust raw stock price data for a 20:1 stock split.

From raw_data I extracted the relevant ticker ('IPL') and date (< '2008-10-01') using the code below:

raw_data[(raw_data['ticker'] == 'IPL') & (raw_data['date'] < '2008-10-01')]

The result dataframe is below:

     ticker    date      open   high    low      close  volume     return
687     IPL 2008-01-02  117.00  118.48  116.81  117.16  150971.0    NaN
2146    IPL 2008-01-03  117.16  123.82  116.80  120.96  240929.0    0.032434
3617    IPL 2008-01-04  123.06  127.24  120.20  125.60  329834.0    0.038360
5156    IPL 2008-01-07  125.60  126.21  121.61  121.63  266578.0    -0.031608
6731    IPL 2008-01-08  119.70  121.93  118.75  119.58  362860.0    -0.016854
... ... ... ... ... ... ... ... ...
259572  IPL 2008-09-10  126.00  130.50  125.10  129.00  1046421.0   -0.030075
260940  IPL 2008-09-11  133.50  134.55  131.82  132.50  599706.0    0.027132
262251  IPL 2008-09-12  136.00  142.00  134.03  139.01  475591.0    0.049132
263608  IPL 2008-09-15  139.00  143.00  135.50  139.93  390052.0    0.006618
264980  IPL 2008-09-16  136.00  137.40  131.11  132.00  489557.0    -0.056671

I have tried iterating with for loops and using .loc[], but I am completely stuck.

I have also tried the code below, with both & and the and keyword:

for i, row in raw_data.iterrows():
    close_val = ['close']
    if raw_data[(raw_data['ticker'] =='IPL') and (raw_data['date']<'2008-10-01')]:
        close_val = ['close'] * 0.05
    df.at[i,'close'] = close_val

But I get the following error:

"ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

Essentially, I need to multiply all prices (open, high, low, close) prior to 2008-09-17 by 0.05 and divide volume by 0.05.

Does this answer your question? How to update values in a specific row in a Python Pandas DataFrame?
– Bruno Mello, Commented Apr 13, 2020 at 10:52
Not really, unfortunately. I think it got me closer, but I am still having trouble.
– daveskis, Commented Apr 13, 2020 at 23:05

Arne · Accepted Answer · 2020-04-14 00:54:36Z

Pandas lets you treat DataFrame columns (i.e. Series) as vectors: if you multiply a column by a number, pandas multiplies every element of that column by that number. This also works with whole DataFrames, so you can select any subframe you like (e.g. by indexing with a list of column names) and multiply it by a scalar, as follows (assuming the DataFrame with the dates of interest that you extracted from the raw data is called df):

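# Scale every price column by the split factor (1/20 = 0.05)
df[['open', 'high', 'low', 'close']] = 0.05 * df[['open', 'high', 'low', 'close']]
# Dividing volume by 0.05 is the same as multiplying it by 20
df['volume'] = 20 * df['volume']
df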

Result:

        ticker  date        open    high    low     close   volume      return
687     IPL     2008-01-02  5.850   5.9240  5.8405  5.8580  3019420.0   NaN
2146    IPL     2008-01-03  5.858   6.1910  5.8400  6.0480  4818580.0   0.032434
3617    IPL     2008-01-04  6.153   6.3620  6.0100  6.2800  6596680.0   0.038360
5156    IPL     2008-01-07  6.280   6.3105  6.0805  6.0815  5331560.0   -0.031608
6731    IPL     2008-01-08  5.985   6.0965  5.9375  5.9790  7257200.0   -0.016854
...
259572  IPL     2008-09-10  6.300   6.5250  6.2550  6.4500  20928420.0  -0.030075
260940  IPL     2008-09-11  6.675   6.7275  6.5910  6.6250  11994120.0  0.027132
262251  IPL     2008-09-12  6.800   7.1000  6.7015  6.9505  9511820.0   0.049132
263608  IPL     2008-09-15  6.950   7.1500  6.7750  6.9965  7801040.0   0.006618
264980  IPL     2008-09-16  6.800   6.8700  6.5555  6.6000  9791140.0   -0.056671
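
If the aim is to adjust the matching rows inside raw_data itself rather than in an extracted subset, a boolean mask combined with .loc is one way to do it. The following is only a minimal sketch and not part of the original answer: the small raw_data frame is made-up sample data, and the cutoff date 2008-09-17 and the factor 0.05 are taken from the question. The two conditions are combined with & (element-wise); the Python keyword and would try to reduce each Series to a single True/False, which is what triggers the "truth value ... is ambiguous" error.

```python
import pandas as pd

# Made-up sample data standing in for raw_data (an assumption for illustration)
raw_data = pd.DataFrame({
    "ticker": ["IPL", "IPL", "IPL", "XYZ"],
    "date": pd.to_datetime(["2008-09-15", "2008-09-16", "2008-09-17", "2008-09-15"]),
    "open": [139.00, 136.00, 6.50, 10.00],
    "high": [143.00, 137.40, 6.70, 10.50],
    "low": [135.50, 131.11, 6.40, 9.80],
    "close": [139.93, 132.00, 6.60, 10.20],
    "volume": [390052.0, 489557.0, 9000000.0, 1000.0],
})

price_cols = ["open", "high", "low", "close"]

# Element-wise AND with &; each condition needs its own parentheses
mask = (raw_data["ticker"] == "IPL") & (raw_data["date"] < pd.Timestamp("2008-09-17"))

# .loc with a mask writes into raw_data itself and leaves all other rows untouched
raw_data.loc[mask, price_cols] = raw_data.loc[mask, price_cols] * 0.05
raw_data.loc[mask, "volume"] = raw_data.loc[mask, "volume"] / 0.05

print(raw_data)
```

Because the assignment goes through raw_data.loc, rows for other tickers and for dates on or after the cutoff keep their original values.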


2 Comments

daveskis Over a year ago

Is it possible to overwrite the original dataframe (raw data) permanently?

Arne Over a year ago

Yes, you could for example do raw_data = df (note that raw_data then contains only the filtered rows), or save the new DataFrame to a file with pandas.DataFrame.to_csv(). See the documentation here: pandas.pydata.org/pandas-docs/stable/reference/api/… If you pass a filename that already exists, that file will be overwritten.
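
As a rough sketch of the file-based option, the adjusted frame can be written with DataFrame.to_csv() and read back with pandas.read_csv(). The file name adjusted_prices.csv and the two-row df below are placeholders invented for this example; only the close values are copied from the result table in the answer.

```python
import os
import tempfile

import pandas as pd

# Placeholder frame standing in for the adjusted data (an assumption for illustration)
df = pd.DataFrame({
    "ticker": ["IPL", "IPL"],
    "date": pd.to_datetime(["2008-09-15", "2008-09-16"]),
    "close": [6.9965, 6.6000],
})

# Hypothetical file name; a temporary directory keeps the example self-contained
path = os.path.join(tempfile.mkdtemp(), "adjusted_prices.csv")

# index=False skips the row labels; an existing file at this path would be overwritten
df.to_csv(path, index=False)

# parse_dates restores the date column as datetime64 when reading the file back
restored = pd.read_csv(path, parse_dates=["date"])
print(restored)
```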
