391 questions
0
votes
0
answers
995
views
Pandas datetime matching nearest date - indexing error 'dtype='int64')' is an invalid key'
I have a DataFrame, D1 that looks as follows:
close Date Symbol ICO_to
2.71 6/12/2017 18:00 MYST 5/30/2017
2.18 6/13/2017 18:00 MYST 5/30/2017
2.1 6/14/2017 18:00 MYST 5/30/2017
2.17 6/15/...
2
votes
1
answer
1k
views
Pandas Matching nearest Datetime values in 2 columns - type integer/long error
I have a DataFrame, D1:
Date Symbol ICO_to
5/28/2017 18:00 MYST 5/30/2017
5/29/2017 18:00 MYST 5/30/2017
5/30/2017 18:00 MYST 5/30/2017
6/1/2017 18:00 MYST 5/30/2017
6/2/...
1
vote
2
answers
197
views
Efficient way of writing multiple conditions for filtering data using loc or iloc
I have written the code like below to filter out the records from the column named 'Document Type' which contains around 25 categorical values.
salesdf.loc[(salesdf['Document type'] != 'AVC') &
(...
0
votes
1
answer
171
views
Creating new row values within same pandas dataframe based on difference of other row values
Below is an existing df
data = np.array([['','Market','Product Code','Week','Sales','Units'],
['Total Customers',123,1,500,400],
['Total Customers',123,2,400,320],
...
2
votes
2
answers
4k
views
Pandas Apply and Loc - efficiency and indexing
I want to find the first value after each row that meets a certain criterion. So for example I want to find the first rate/value (not necessarily the first row after) after the current row that ...
4
votes
1
answer
16k
views
Apply loc for 2 columns values Pandas
I'm trying to use loc on a dataframe with two column conditions:
If I do paises_cpm = df.loc[a] it works, but if I do paises_cpm = df.loc[a,b] I receive an error: IndexingError: Unalignable boolean Series ...
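A minimal sketch of one likely cause, assuming `a` and `b` are both boolean Series over the rows (the excerpt does not show how they are built; the `pais` and `cpm` columns below are hypothetical). In `df.loc[a, b]` the second argument selects columns, so two row conditions are usually combined with `&` instead:

```python
import pandas as pd

# Hypothetical data; the original question's columns are not shown.
df = pd.DataFrame({"pais": ["AR", "BR", "AR", "CL"],
                   "cpm": [1.0, 2.5, 3.0, 0.5]})

a = df["pais"] == "AR"   # row condition 1 (assumed)
b = df["cpm"] > 2.0      # row condition 2 (assumed)

# df.loc[a, b] would treat b as a column selector and fail to align.
# Combining both row masks element-wise keeps rows where both hold.
paises_cpm = df.loc[a & b]
```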
13
votes
4
answers
16k
views
Select row using the length of list in pandas cell [duplicate]
I have a table df
a b c
1 x y [x]
2 x z [c,d]
3 x t [e,f,g]
Just wondering how to select the row using the length of c column
such as
df.loc[len(df.c) >1]
I know ...
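A small sketch of one way to do this, rebuilding the table from the excerpt. `len(df.c)` returns the number of rows in the column (a single integer), so a per-row length is needed instead; mapping `len` over the column is one option:

```python
import pandas as pd

# Table from the excerpt, with column c holding Python lists.
df = pd.DataFrame(
    {"a": ["x", "x", "x"],
     "b": ["y", "z", "t"],
     "c": [["x"], ["c", "d"], ["e", "f", "g"]]},
    index=[1, 2, 3],
)

# Per-row list length, then a boolean mask for .loc.
mask = df["c"].map(len) > 1
result = df.loc[mask]
```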
5
votes
1
answer
9k
views
Data frame filtered with loc removing index
Consider the data frame
df = pd.DataFrame(numpy.random.randint(0,10,size=(5, 4)), columns=list('ABCD'))
df
A B C D
0 5 8 0 4
1 7 4 9 0
2 8 1 1 8
3 2 7 6 6
4 4 3 3 0
I would ...
0
votes
1
answer
76
views
How to create a new column in python under multiple conditions? [duplicate]
I would like to create a new column under the following condition:
So basically I have two columns Majoy car and Major housetype. I would let all the 'nocar' within Majoy car AND 'Rented' within Major ...
2
votes
3
answers
65
views
interacting over a dataframe with functions
If I have a data frame like this:
NEG_00_04 NEG_04_08 NEG_08_12 NEG_12_16 NEG_16_20 NEG_20_24 \
datum_von
2017-10-12 ...
5
votes
1
answer
2k
views
Apply pandas.to_numeric to selected subset of columns using loc in pandas DataFrame
How to apply pandas.to_numeric to a subset of DataFrame selected using .loc[]? E.g. consider this DataFrame:
df = pd.DataFrame(index=pd.Index([1, 2, 3]))
df['X'] = ['a', 'a', 'b']
df['Y'] = [1, 2, 3]
...
1
vote
3
answers
69
views
Selection over different columns after a groupby
I am new to pandas and hence please treat this question with patience
I have a Df with year, state and population data collected over many years and across many states
I want to find the max pop ...
30
votes
2
answers
5k
views
Why/How does Pandas use square brackets with .loc and .iloc?
So .loc and .iloc are not your typical functions. They somehow use [ and ] to surround the arguments so that it is comparable to normal array indexing. However, I have never seen this in another ...
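A toy illustration of the general Python mechanism, not pandas' actual implementation: square brackets call an object's `__getitem__` method, so an attribute that returns such an object can be indexed with `[...]`. The `Box` and `Indexer` names here are hypothetical:

```python
class Indexer:
    """Helper object whose square-bracket access is handled by __getitem__."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        # obj[key] is translated by Python into obj.__getitem__(key).
        return self._data[key]


class Box:
    """Container exposing a .loc attribute that returns an Indexer."""

    def __init__(self, data):
        self._data = data

    @property
    def loc(self):
        # .loc is an attribute, not a method call, so it is followed by [...].
        return Indexer(self._data)


box = Box([10, 20, 30])
second = box.loc[1]      # attribute access, then __getitem__(1)
tail = box.loc[1:]       # a slice object is passed as the key
```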
1
vote
1
answer
1k
views
Loc function on conditions in pandas returns Nan
I genuinely don't get why it returns NaN
I have a df and I need to create one more column based on other column values; this method always worked.
train.loc[(train.region == 'Latin America') & (...
71
votes
1
answer
210k
views
python pandas loc - filter for list of values [duplicate]
This should be incredibly easy, but I can't get it to work.
I want to filter my dataset on two or more values.
#this works, when I filter for one value
df.loc[df['channel'] == 'sale']
#if I have ...
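A common approach for this kind of filter is `Series.isin`, sketched here on made-up data (only the `channel` column and the `'sale'` value come from the excerpt; `'fullprice'`, `'other'`, and `amount` are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"channel": ["sale", "fullprice", "other", "sale"],
                   "amount": [1, 2, 3, 4]})

# Keep rows whose channel matches any value in the list.
wanted = ["sale", "fullprice"]
filtered = df.loc[df["channel"].isin(wanted)]
```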
1
vote
3
answers
990
views
Grouping by each value in a column of a dataframe in python
I have a dataframe with 7 columns, as follows:
Bank Name | Number | Firstname | Lastname | ID | Date1 | Date2
B1 | 1 | ABC | EFG | 12 | Somedate | Somedate
B2 | 2 ...
2
votes
1
answer
541
views
Pandas df.loc comparing-floats-condition never works
df[['gc_lat', 'gc_lng']] = df[['gc_lat', 'gc_lng']].apply(pd.to_numeric, errors='ignore')
df_realty[['lat', 'lng']] = df_realty[['lat', 'lng']].apply(pd.to_numeric, errors='ignore')
for index, row in ...
2
votes
1
answer
17k
views
Pandas loc does not work to subset DataFrame when using a variable
I am fairly new to Python, especially pandas. I have a DataFrame called KeyRow which is from a bigger df:
KeyRow=df.loc[df['Order'] == UniqueOrderName[i]]
Then I make a nested loop
for i in range (0,...
0
votes
0
answers
150
views
How turn iterrows if statements into vectorized function or other faster method in Pandas
I have been searching for a way to replace iterrows with vectorization and coming up blank. I have this code, which I believe to be working correctly using iterrows, but it is taking forever.
sm_state ...
15
votes
3
answers
16k
views
Pandas dataframe creating multiple rows at once via .loc
I can create a new row in a dataframe using .loc():
>>> df = pd.DataFrame({'a':[10, 20], 'b':[100,200]}, index='1 2'.split())
>>> df
a b
1 10 100
2 20 200
>>> df....
2
votes
2
answers
11k
views
Select rows in dataFrame with the same index using python
I want to ask how to select rows that have the same index number in a DataFrame. Example:
df=
A, B, C,
0 1. 2. 1.
1 2. 2. 2.
2 2. 2. 2.
3 3. 3. 4.
A, B, C,
0 1. 2. 1.
1 2. 2. 2.
2 2. 2. ...
1
vote
3
answers
3k
views
Pandas categorizing age variable into groups
I have a dataframe df with age and I am working on categorizing the file into age groups with 0s and 1s.
df:
User_ID | Age
35435 22
45345 36
63456 18
63523 55
I tried the ...
4
votes
1
answer
2k
views
Most efficient way of joining dataframes in pandas: loc or join?
Suppose I have two dataframes; one holds transactions, trans and the other holds product information, prod, and I want to join the product prices, the variable price, on to the transaction data frame, ...
1
vote
1
answer
651
views
.loc The same column at once with pandas
This might be a rather useless question but I would like to learn how to do
.loc for the same column sliced by rows at the same time. Let's imagine I have this df:
k1 = pd.DataFrame([1,2,3,4])
k2 = pd....
8
votes
1
answer
37k
views
Overwriting Nan values with .loc in Pandas [duplicate]
I tried to solve the required task with the following code line:
df['Age'][np.isnan(df["Age"])] = rand1
But this raises a "SettingWithCopyWarning" and I think locating the Nan values in the dataframe ...
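A minimal sketch of the single-step `.loc` form, which avoids the chained indexing that triggers the warning. The data and the value of `rand1` are hypothetical; only the `Age` column and the name `rand1` come from the excerpt:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Age": [22.0, np.nan, 36.0, np.nan]})
rand1 = 30.0  # hypothetical replacement value

# Row mask and column label in one .loc call: a single assignment
# on the original frame rather than on an intermediate selection.
df.loc[df["Age"].isna(), "Age"] = rand1
```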
1
vote
1
answer
10k
views
Python pandas DataFrame loc selection for a range of rows and columns
Here is a head() of my DataFrame df:
Temperature DewPoint Pressure
Date
2010-01-01 00:00:00 46.2 37.5 1.0
...
4
votes
1
answer
5k
views
Pandas loc() method with boolean array on axis 1
I am experimenting with the Pandas loc() method, used with boolean arrays as arguments.
I created a small dataframe to play with:
col1 col2 col3 col4
0 a 1 2 3
1 ...
3
votes
1
answer
3k
views
How to add a string to every even row in a pandas dataframe column series?
I am new to pandas.
I want to add a new column to a pandas dataframe df and assign "Start" to every odd row and "Stop" to every even row.
However, when I do df.iloc[1::2, :] = "Start", I am ...
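One possible sketch, assuming rows are counted from 1 so that the "odd" rows sit at positions 0, 2, 4 and so on. The `marker` and `value` column names are hypothetical. Note that `df.iloc[1::2, :] = "Start"` assigns to every column of the selected rows, which is probably not what is wanted when only a new column should be filled:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": [10, 20, 30, 40, 50]})

# Positions 0, 2, 4, ... are the 1st, 3rd, 5th, ... rows.
positions = np.arange(len(df))
df["marker"] = np.where(positions % 2 == 0, "Start", "Stop")
```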
0
votes
1
answer
3k
views
How to change specific cell values in a pandas dataframe column series based on multiple conditions? [duplicate]
I am trying to replace all values in a pandas dataframe column df.column_A if they fall within the range of 1 to 10.
However, when I do:
df.loc[(1 < df.column_A < 10), "Column_A"] = 1
...
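A sketch of the usual fix, on made-up data. Python evaluates `1 < s < 10` as `(1 < s) and (s < 10)`, which asks for the truth value of a whole Series and raises `ValueError`. The two comparisons can be written separately and combined with `&`. The excerpt mixes `column_A` and `Column_A`; the lowercase name is assumed here:

```python
import pandas as pd

df = pd.DataFrame({"column_A": [0, 1, 5, 9, 10, 15]})

# Element-wise range test: strictly between 1 and 10.
mask = (df["column_A"] > 1) & (df["column_A"] < 10)
df.loc[mask, "column_A"] = 1
```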
134
votes
3
answers
65k
views
Why use loc in Pandas?
Why do we use loc for pandas dataframes? it seems the following code with or without using loc both compiles and runs at a similar speed:
%timeit df_user1 = df.loc[df.user_id=='5561']
100 loops, best ...
0
votes
0
answers
520
views
Changing column values Pandas Python LOC ILOC SETVALUE Difficulty
I encounter a strange problem in a pretty big piece of code. Normally, I use .loc to change particular items in a certain column within a loop while using a row_index variable as help. Let's assume to ...
2
votes
1
answer
66
views
Updating Panel slice
I need to update a panel slice with some values retrieved from a dataframe. Even though I don't get any error, it doesn't work. What is wrong?
df = pd.DataFrame(np.random.rand(10, 4),
...
3
votes
2
answers
14k
views
Subtract one row from another in Pandas DataFrame
I am trying to subtract one row from another in a Pandas DataFrame. I have multiple descriptor columns preceding one numerical column, forcing me to set the index of the DataFrame on the two ...
0
votes
0
answers
148
views
Dataframe loc - unexpected behaviour
I have a dataframe df which looks like this:
Order Type Quantity
2015-04-30 Buy 200
2015-05-06 Buy 168
2015-05-08 Sell 368
2015-06-04 Buy ...
1
vote
1
answer
524
views
Pandas, Using .loc on a cell from another row
I am looking to manipulate a large set of data based on a couple of conditionals. One is based on the same row whereas the other is based on a cell from a different row.
For example I have a df like ...
19
votes
3
answers
52k
views
Use of loc to update a dataframe python pandas
I have a pandas dataframe (df) with the column structure :
month a b c d
This dataframe has data for, say, Jan, Feb, Mar, Apr. A, B, C, D are numeric columns. For the month of Feb, I want to recalculate ...
2
votes
1
answer
4k
views
Returning subset of each group from a pandas groupby object
I have the multilevel dataframe that looks like:
date_time name note value
list index
1 0 2015-05-22 05:37:59 Tom 129 ...
0
votes
1
answer
1k
views
pandas.DataFrame.xs() returns error on multi index when `level` arg used
I have a potential pandas bug, or maybe I've just been staring at this too long. I have not had issues using xs on a multi index before. Code is below and I've verified that the error occurs on both ...
1001
votes
7
answers
884k
views
How are iloc and loc different?
Can someone explain how these two methods of slicing are different? I've seen the docs
and I've seen previous similar questions (1, 2), but I still find myself unable to understand how they are ...
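A short sketch of the core distinction, using a made-up Series whose integer labels differ from its positions: `.loc` looks up index labels, while `.iloc` looks up integer positions.

```python
import pandas as pd

# Labels 49..46 sit at positions 0..3.
s = pd.Series(["a", "b", "c", "d"], index=[49, 48, 47, 46])

by_label = s.loc[48]        # the element whose label is 48
by_position = s.iloc[1]     # the element at position 1

# Label slices include both endpoints; position slices exclude the stop.
label_slice = s.loc[49:47]
position_slice = s.iloc[0:2]
```

Here both scalar lookups return the same element, but the two slices differ in length because `.loc` includes the end label.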
0
votes
1
answer
320
views
Cryptic warning pops up when doing pandas assignment with loc and iloc
There is a statement in my code that goes:
df.loc[i] = [df.iloc[0][0], i, np.nan]
where i is an iteration variable that I used in the for loop that this statement is residing in, np is my imported ...
3
votes
2
answers
3k
views
KeyError when using s.loc and s.first_valid_index()
I have data similar to this post: pandas: Filling missing values within a group
That is, I have data in a number of observation sessions, and there is a focal individual for each session. That focal ...