My end goal is to adjust raw stock price data for a 20:1 stock split.

From raw_data I extracted the relevant ticker ('IPL') and date (< '2008-10-01') using the code below:

raw_data[(raw_data['ticker'] =='IPL') & (raw_data['date']<'2008-10-01')]

The resulting DataFrame is below:

     ticker    date      open   high    low      close  volume     return
687     IPL 2008-01-02  117.00  118.48  116.81  117.16  150971.0    NaN
2146    IPL 2008-01-03  117.16  123.82  116.80  120.96  240929.0    0.032434
3617    IPL 2008-01-04  123.06  127.24  120.20  125.60  329834.0    0.038360
5156    IPL 2008-01-07  125.60  126.21  121.61  121.63  266578.0    -0.031608
6731    IPL 2008-01-08  119.70  121.93  118.75  119.58  362860.0    -0.016854
... ... ... ... ... ... ... ... ...
259572  IPL 2008-09-10  126.00  130.50  125.10  129.00  1046421.0   -0.030075
260940  IPL 2008-09-11  133.50  134.55  131.82  132.50  599706.0    0.027132
262251  IPL 2008-09-12  136.00  142.00  134.03  139.01  475591.0    0.049132
263608  IPL 2008-09-15  139.00  143.00  135.50  139.93  390052.0    0.006618
264980  IPL 2008-09-16  136.00  137.40  131.11  132.00  489557.0    -0.056671

I have tried iterating with for loops and using .loc[], but I am completely stuck.

I have also tried the code below, with both & and "and" as the operator:

for i, row in raw_data.iterrows():
    close_val = ['close']
    if raw_data[(raw_data['ticker'] =='IPL') and (raw_data['date']<'2008-10-01')]:
        close_val = ['close'] * 0.05
    df.at[i,'close'] = close_val

But I get the following error:

"ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

Essentially, I need to multiply all prices (open, high, low, close) prior to 2008-09-17 by 0.05 and divide volume by 0.05 (i.e. multiply it by 20).

1 Answer

Pandas lets you treat DataFrame columns (i.e. Series) as vectors. If you multiply a column by a number, pandas multiplies every element in that column by that number. This also works with whole DataFrames, so you can select any subframe you like (e.g. by indexing with a list of column names) and multiply it by a scalar, as follows (assuming the DataFrame with the dates of interest that you extracted from the raw data is called df).

df[['open', 'high', 'low', 'close']] = 0.05 * df[['open', 'high', 'low', 'close']]
df['volume'] = 20 * df['volume']
df

Result:

        ticker  date        open    high    low     close   volume      return
687     IPL     2008-01-02  5.850   5.9240  5.8405  5.8580  3019420.0   NaN
2146    IPL     2008-01-03  5.858   6.1910  5.8400  6.0480  4818580.0   0.032434
3617    IPL     2008-01-04  6.153   6.3620  6.0100  6.2800  6596680.0   0.038360
5156    IPL     2008-01-07  6.280   6.3105  6.0805  6.0815  5331560.0   -0.031608
6731    IPL     2008-01-08  5.985   6.0965  5.9375  5.9790  7257200.0   -0.016854
...
259572  IPL     2008-09-10  6.300   6.5250  6.2550  6.4500  20928420.0  -0.030075
260940  IPL     2008-09-11  6.675   6.7275  6.5910  6.6250  11994120.0  0.027132
262251  IPL     2008-09-12  6.800   7.1000  6.7015  6.9505  9511820.0   0.049132
263608  IPL     2008-09-15  6.950   7.1500  6.7750  6.9965  7801040.0   0.006618
264980  IPL     2008-09-16  6.800   6.8700  6.5555  6.6000  9791140.0   -0.056671
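
If you would rather adjust the rows inside raw_data itself instead of a separately extracted frame, one option is to build a boolean mask and assign through .loc. The following is only a minimal sketch under assumptions: the miniature raw_data frame is made up for illustration (the 'XYZ' ticker and the 2008-09-17 row are hypothetical), and the cutoff date 2008-09-17 is taken from the last sentence of the question. Note that the two conditions are combined with & (element-wise) and not with Python's and, which is what leads to the "truth value ... is ambiguous" error.

```python
import pandas as pd

# Hypothetical miniature of raw_data; only the first two IPL rows come from the question.
raw_data = pd.DataFrame({
    "ticker": ["IPL", "IPL", "IPL", "XYZ"],
    "date": pd.to_datetime(["2008-01-02", "2008-01-03", "2008-09-17", "2008-01-02"]),
    "open": [117.00, 117.16, 6.80, 50.00],
    "high": [118.48, 123.82, 6.90, 51.00],
    "low": [116.81, 116.80, 6.50, 49.00],
    "close": [117.16, 120.96, 6.60, 50.50],
    "volume": [150971.0, 240929.0, 9000000.0, 1000.0],
})

price_cols = ["open", "high", "low", "close"]

# Element-wise AND of two boolean Series: use &, with each condition in parentheses.
mask = (raw_data["ticker"] == "IPL") & (raw_data["date"] < "2008-09-17")

# Assigning through .loc changes raw_data in place, and only on the masked rows.
raw_data.loc[mask, price_cols] = raw_data.loc[mask, price_cols] * 0.05
raw_data.loc[mask, "volume"] = raw_data.loc[mask, "volume"] / 0.05

print(raw_data)
```

Rows for other tickers, and IPL rows on or after the cutoff date, are left untouched, so no loop over iterrows() is needed.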
2 Comments

Is it possible to overwrite the original dataframe (raw data) permanently?
Yes, you could for example do raw_data = df (note that this replaces raw_data with only the extracted rows), or save the new DataFrame to a file with pandas.DataFrame.to_csv(). See the documentation here: pandas.pydata.org/pandas-docs/stable/reference/api/… If you pass a filename that already exists, that file will be overwritten.
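
As a rough sketch of the file route mentioned in this comment: the snippet below writes a small frame to CSV, writes it again to the same path, and reads it back. The file name adjusted_prices.csv, the temporary directory, and the two-row frame are all made up for illustration; substitute your own adjusted DataFrame and path.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the adjusted DataFrame.
df = pd.DataFrame({
    "ticker": ["IPL", "IPL"],
    "date": pd.to_datetime(["2008-01-02", "2008-01-03"]),
    "close": [5.858, 6.048],
})

# Hypothetical file name, placed in a temporary directory so nothing real is touched.
path = os.path.join(tempfile.mkdtemp(), "adjusted_prices.csv")

# index=False leaves the row labels out of the file.
df.to_csv(path, index=False)

# Writing to the same path again replaces the existing file (the default mode is "w").
df.to_csv(path, index=False)

# Read the file back, parsing the date column as datetimes again.
restored = pd.read_csv(path, parse_dates=["date"])
print(restored)
```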
