0

I'm looking to insert information into a existing dataframe, this dataframe shape is 2001 rows × 13 columns, however, only the first column has information.

I have 12 more columns, but these are not the same dimension as the main dataframe, so I'd like to insert this additional columns into the main one using a conditional. Example dataframe:

enter image description here

This in an example, I want to insert the var column into the 2001 × 13 dataframe, using the date as a conditional and in case there is no date, it skips the row or simply adds a 0.
I'm really new to python and programming in general.

3
  • Is the first column the date? Commented Jun 16, 2020 at 18:52
  • Can you not just remove the rows with empty date? Commented Jun 16, 2020 at 18:54
  • Does this answer your question? Python Pandas update a dataframe value from another dataframe Commented Jun 16, 2020 at 18:55

1 Answer 1

0

Without a minimal working example it is hard to provide you with clear recommendations, but I think what you are looking for is the .loc a pd.DataFrame. What I would recommend you doing is the following:

  • Selection of rows with .loc works better in your case if the dates are first converted to date-time, so a first step is to make this conversion as:
# Pandas is quite smart about guessing date format. If this fails, please check the
# documentation https://docs.python.org/3/library/datetime.html to learn more about
# format strings.
df['date'] = pd.to_datetime(df['date'])

# Make this the index of your data frame.
df.set_index('date', inplace=True)
  • It is not clear how you intend to use conditionals/what is the content of your other columns. Using .loc this is pretty straightforward
# At Feb 1, 2020, add a value to columns 'var'.
df.loc['2020-02-01', 'var'] = 0.727868
  • This could also be used for ranges:
# Assuming you have a second `df2` which as a datetime columns 'date' with the
# data you wish to add to `df`. This will only work if all df2['date'] are found
# in df.index. You can workout the logic for your case.
df.loc[df2['date'], 'var2'] = df2['vals']

If the logic is to complex and the dataframe is not too large, iterating with .iterrows could be easier, specially if you are beginning with Python.

for idx, row in df.iterrows():
    if idx in list_of_other_dates:
        df.loc[i, 'var'] = (some code here)

Please clarify a bit your problem and you will get better answers. Do not forget to check the documentation.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.