2

I have a table in a CSV file:

    date        group
0   2015-01-02  WODKA
1   2015-01-02  PIWO
2   2015-01-02  2015-01-02
3   2015-01-03  WODKA
4   2015-01-03  PIWO
5   2015-01-03  2015-01-03
6   2015-01-03  WODKA
7   2015-01-03  PIWO

And I would like to convert all the dates from the column "group" to the word "sum". But my code does not work...

import pandas as pd
import numpy as np
from datetime import datetime as dt

x = pd.read_csv("C:\\Users\\dell\\Desktop\\list_1.csv", sep=';')
x.group = x.group.replace(dt, 'sum')
1
  • Why do you think that would work? dt is the datetime class object. Do you have a bunch of references to that class object in your group column? Commented Oct 25, 2017 at 20:54
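To illustrate the point of this comment, here is a minimal sketch (the sample values are copied from the question; the variable names are only illustrative). It checks that `dt` is a class and that the column holds plain strings, so there is no cell that `replace(dt, 'sum')` could match:

```python
import inspect
from datetime import datetime as dt

import pandas as pd

# Sample values taken from the question's "group" column.
group = pd.Series(["WODKA", "PIWO", "2015-01-02"], name="group")

# dt is the datetime class itself, not a date value.
dt_is_class = inspect.isclass(dt)

# Every cell read from the CSV is a plain string, including the date-like one.
all_strings = all(isinstance(value, str) for value in group)

print(dt_is_class, all_strings)
```

Because `replace` looks for cells equal to the value it is given, passing the class leaves the date-like strings untouched.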

2 Answers

5

We can update the rows where group can be converted to a datetime:

In [40]: df.loc[pd.to_datetime(df['group'], errors='coerce').notnull(), 'group'] = 'sum'

In [41]: df
Out[41]:
         date  group
0  2015-01-02  WODKA
1  2015-01-02   PIWO
2  2015-01-02    sum
3  2015-01-03  WODKA
4  2015-01-03   PIWO
5  2015-01-03    sum
6  2015-01-03  WODKA
7  2015-01-03   PIWO
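For readers who want to run this outside an IPython session, here is a self-contained sketch of the same idea. The DataFrame is rebuilt by hand from the question's sample, and the explicit `format="%Y-%m-%d"` is an assumption added here to make parsing deterministic; the one-liner above leaves the format to pandas, which is what lets it accept other date layouts.

```python
import pandas as pd

# Rebuild the sample from the question (normally this comes from read_csv).
df = pd.DataFrame({
    "date": ["2015-01-02"] * 3 + ["2015-01-03"] * 5,
    "group": ["WODKA", "PIWO", "2015-01-02",
              "WODKA", "PIWO", "2015-01-03", "WODKA", "PIWO"],
})

# errors="coerce" turns every value that is not a date into NaT
# instead of raising, so notna() marks exactly the date-like rows.
parsed = pd.to_datetime(df["group"], format="%Y-%m-%d", errors="coerce")
is_date = parsed.notna()

# Overwrite only the rows that parsed as dates.
df.loc[is_date, "group"] = "sum"
print(df)
```

Keeping the boolean mask in its own variable makes it easy to inspect which rows will be changed before overwriting them.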

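Or use a regex (note: the first solution is much more flexible, as it supports different date formats). Recent pandas versions need regex=True here, because str.replace treats the pattern as a literal string by default:

In [46]: df['sum'] = df['group'].str.replace(r'^\d{4}-\d{2}-\d{2}', 'sum', regex=True)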

In [47]: df
Out[47]:
         date       group    sum
0  2015-01-02       WODKA  WODKA
1  2015-01-02        PIWO   PIWO
2  2015-01-02  2015-01-02    sum
3  2015-01-03       WODKA  WODKA
4  2015-01-03        PIWO   PIWO
5  2015-01-03  2015-01-03    sum
6  2015-01-03       WODKA  WODKA
7  2015-01-03        PIWO   PIWO
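One detail of the regex approach worth checking is anchoring. The sketch below uses a hypothetical value, "2015-01-03 promo", which is not in the question's data, to show the difference between a pattern anchored only at the start and one anchored at both ends:

```python
import pandas as pd

# "2015-01-03 promo" is a made-up value used only to show the difference.
group = pd.Series(["WODKA", "PIWO", "2015-01-02", "2015-01-03 promo"],
                  name="group")

# Anchored at both ends: only cells that are exactly a YYYY-MM-DD date change.
anchored = group.str.replace(r"^\d{4}-\d{2}-\d{2}$", "sum", regex=True)

# Anchored at the start only: the date prefix of a longer string is replaced too.
unanchored = group.str.replace(r"^\d{4}-\d{2}-\d{2}", "sum", regex=True)

print(anchored.tolist())
print(unanchored.tolist())
```

For the data in the question both patterns give the same result; the trailing `$` only matters if a cell can contain text after the date.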

9 Comments

Works great, that was it! There are still many lessons ahead of me :) Thanks!
@TomaszPrzemski, glad I could help :)
@TomaszPrzemski I suggest you accept MaxU's answer instead of mine, since it is well documented in the pandas API. If you like my answer, you can upvote it. We need to consider future visitors and point them to the 100% right answer.
@Wen, thank you! I think it's up to the OP to choose the best answer. :) PS: you have my upvote already...
@MaxU Only justice, because in my project I used your solution :)
4

Or use a trick based on the special character '-' (note: I recommend MaxU's answer):

df.group.replace({'-':np.nan},regex=True).fillna('sum')
Out[449]: 
0    WODKA
1     PIWO
2      sum
3    WODKA
4     PIWO
5      sum
6    WODKA
7     PIWO
Name: group, dtype: object
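This trick relies on the hyphen appearing only in the date strings. As a sketch of the caveat, the example below uses a different but equivalent formulation (`str.contains` plus `mask` rather than `replace` plus `fillna`) and a hypothetical group name, "COCA-COLA", that is not in the question's data:

```python
import pandas as pd

# "COCA-COLA" is a made-up group name used to show a false positive.
group = pd.Series(["WODKA", "COCA-COLA", "2015-01-02"], name="group")

# Flag every cell that contains a literal hyphen anywhere.
has_hyphen = group.str.contains("-", regex=False)

# Replace flagged cells with "sum"; unflagged cells keep their value.
result = group.mask(has_hyphen, "sum")

print(result.tolist())
```

Any group name containing a hyphen is overwritten as well, which is one reason the date-parsing answer is the safer choice for general data.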

8 Comments

This reminds me of our last discussion that prompted a question in itself :). Where are you learning these tricks, I've not seen anything like them when I'm searching for my own problems?
Source code, not docs, then? Do they explicitly state as comments in the source code that such approaches will give the desired results, or do you identify these approaches yourself from understanding the code?
@roganjosh Man, if you knew me better, you would find that I never fight over anything related to points (reputation) on SO. :-)
@roganjosh I am asking it as a question :-) stackoverflow.com/questions/46944650/…