two different csv file data manipulation using pandas

Question

I have two data frame df1 and df2

df1 has following data (N Rows)

  Time(s)   sv-01 sv-02 sv-03  Val1  val2  val3 
   1339.4   1     4      12     1.6   0.6   1.3
   1340.4   1     12     4     -0.5  0.5   1.4
   1341.4   1     6      8      0.4   5     1.6
   1342.4   2     5     14      1.2   3.9    11
   ......           .....      ....   ...    ..

df2 has following data which has more rows than df1

Time(msec)   channel  svid    value-1   value-2 valu-03
1000          1       2       0          5       1
1000          2       5       1          4       2
1000          3       2       3          4       7
.....         .....................................
1339400      1        1       1.6        0.4     5.3
1339400      2        12      0.5       1.8      -4.4
1339400      3        4       -0.20     1.6      -7.9
1340400      1        1      0.3       0.3      1.5
1340400      2        6      2.3      -4.3      1.0
1340400      3        4      2.0       1.1      -0.45
1341400      1       1       2         2.1      0
1341400      2       8       3.4      -0.3       1
1341400      3       6       0         4.1      2.3
....       ....      ..      ...        ...      ...

What I am trying to achieve is

1.first multiplying Time(s) column by 1000 so that it matches with df2 millisecond column.

2.In df1 sv 01,02 and 03 are in independent column but those sv are present in same column under svid.

So goal is when time of df1(after changing) is matching with time of df2 copy next three consecutive lines i.e copy all matched lines of that time instant.

Basically I want to iterate the time of df1 in df2 time column and if there is a match copy three next rows and copy to a new df.

I have seen examples using pandas merge function but in my case both have different header.

Thanks.

jezrael · Accepted Answer · 2017-12-07 07:21:37Z

1

I think you need double boolean indexing - first df2 with isin, for multiple is used mul:

And then count values per groups by cumcount and filter first 3:

df = df2[df2['Time(msec)'].isin(df1['Time(s)'].mul(1000))]
df = df[df.groupby('Time(msec)').cumcount() < 3]
print (df)
    Time(msec)  channel  svid  value-1  value-2  valu-03
3      1339400        1     1      1.6      0.4     5.30
4      1339400        2    12      0.5      1.8    -4.40
5      1339400        3     4     -0.2      1.6    -7.90
6      1340400        1     1      0.3      0.3     1.50
7      1340400        2     6      2.3     -4.3     1.00
8      1340400        3     4      2.0      1.1    -0.45
9      1341400        1     1      2.0      2.1     0.00
10     1341400        2     8      3.4     -0.3     1.00
11     1341400        3     6      0.0      4.1     2.30

Detail:

print (df.groupby('Time(msec)').cumcount())
3     0
4     1
5     2
6     0
7     1
8     2
9     0
10    1
11    2
dtype: int64

edited Dec 7, 2017 at 7:21

answered Dec 7, 2017 at 7:16

jezrael

868k103 gold badges1.4k silver badges1.3k bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

jezrael Over a year ago

@Poka - yes, no problem ;)

jezrael Over a year ago

Can you check df = df2[df2['Time(msec)'].isin(df1['Time(s)'].round(1).mul(1000).astype(int))] ?

jezrael Over a year ago

Hmm, then the best is convert column in df1 to ms for match data and then use df = pd.merge(df1, df, left_on='Time(s)', right_on='Time(msec)', how='right').

Poka Over a year ago

@ Jezrael. df2 has all time instant of df1. So'' df = df2[df2['Time(msec)'].isin(df1['Time(s)'].mul(1000))]'' shall give total number of df1 time instant but is outputting less than df1 number. It is jumping from 59 to 63 after some iteration as shown 5455 32945600 5457 32945800 5459 32946000 5463 32946400 . It is not taking 3294200 when both dataframe has this value

jezrael Over a year ago

There is not problem with floats accuracy, so no match?

|

Collectives™ on Stack Overflow

two different csv file data manipulation using pandas

1 Answer 1

6 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

6 Comments

Your Answer

Sign up or log in

Post as a guest

Related