I need to select a particular column from a DataFrame of simulated stock prices and find its mean.

The following variables were defined earlier:

T  = 1
dt = 1/1000

so T/dt evaluates to 1000.0, which is a float.

Indexing the DataFrame directly with this value raises an error:

StockPrice[T/dt].mean()   # KeyError: 1000.0

However, casting the index to int first works fine:

StockPrice[int(T/dt)].mean()

So I am trying to understand what the standard practice is when subsetting DataFrames with variables that hold integer values but have a float datatype. Should we cast them to int before using them, or is there an alternative?

  • Which version of pandas are you using? What is the exact error message? Commented Feb 15, 2016 at 18:43
  • Pandas version: 0.17.1. Snippet from the error message:

        1974 # get column
        1975 if self.columns.is_unique:
        1976     return self._get_item_cache(key)
        1977
        1978 # duplicate columns & possible reduce dimensionality
        KeyError: 1000.0

    Commented Feb 15, 2016 at 18:51

2 Answers

Given that stock prices are a continuous variable, you may be better off using a range to capture the relevant prices around your target price. That range can be as wide or as narrow as needed.

A pandas Series has a .between() method, which evaluates to True or False for each value in the Series depending on whether it falls within the range (both bounds are included by default). You can then use this 'criteria' mask in a boolean indexing operation to pull out the relevant values.

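import numpy as np
import pandas as pd

np.random.seed(1)

# 1000 simulated prices, uniform on [0, 10000)
df = pd.DataFrame(np.random.rand(1000,1),columns=['stockprice'])*10000.

epsilon = 100.  # half-width of the band around the target
dt = 1000.      # target price

# Boolean mask: True where the price lies within [dt-epsilon, dt+epsilon]
criteria = df['stockprice'].between(dt-epsilon,dt+epsilon)
print(df[criteria])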
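
Since the question asks for a mean, the same mask can feed straight into .mean(). The following is only a sketch built on the toy data above; the names target, in_band, band_mean and n_in_band are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

np.random.seed(1)

# Same toy data as above: 1000 simulated prices, uniform on [0, 10000)
df = pd.DataFrame(np.random.rand(1000, 1), columns=['stockprice']) * 10000.

target = 1000.   # hypothetical target price
epsilon = 100.   # hypothetical half-width of the band

# Boolean mask of prices inside [target - epsilon, target + epsilon]
in_band = df['stockprice'].between(target - epsilon, target + epsilon)

# Mean of only the selected prices, and how many were selected
band_mean = df.loc[in_band, 'stockprice'].mean()
n_in_band = int(in_band.sum())

print(n_in_band, band_mean)
```

Note that this selects rows by value, which is a different operation from selecting a column by label as in the question.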

1 Comment

While this code may answer the question, providing additional context regarding why and/or how this code answers the question improves its long-term value.

You should cast to int. In pandas 0.17.1 I get this warning:

FutureWarning: scalar indexers for index type Int64Index should be integers and not floating point

It's a feature, not a bug.

Besides, float indexers still seem to work for a Series but not for a DataFrame, so the future is almost here.
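
As a rough sketch of the cast, assuming the asker's StockPrice has one integer-labelled column per time step (the question does not show how it was built, so the frame below is a hypothetical stand-in): compute the label once as an int and reuse it. Rounding before the cast is an extra precaution, because int() truncates and a float ratio can land just below the intended integer.

```python
import numpy as np
import pandas as pd

T = 1
dt = 1 / 1000

# Convert the float ratio to an integer label once, rounding first
n_steps = int(round(T / dt))

# Hypothetical stand-in for StockPrice: 5 simulated paths (rows),
# one integer-labelled column per time step 0..n_steps
rng = np.random.default_rng(0)
StockPrice = pd.DataFrame(rng.random((5, n_steps + 1)) * 100,
                          columns=range(n_steps + 1))

# Label-based selection with an integer key
mean_at_T = StockPrice[n_steps].mean()

# Position-based alternative: the last column, whatever its label is
same_by_position = StockPrice.iloc[:, -1].mean()

# Why round() first: int() alone truncates toward zero
truncated = int(0.3 / 0.1)
rounded = int(round(0.3 / 0.1))
print(n_steps, truncated, rounded)
```

If only the final time step is needed, the .iloc form avoids computing a label from floats at all.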

1 Comment

Thanks this is helpful!
