I'm working on capacity prediction models for lithium-ion batteries.

I have 10 datasets from 10 different batteries, each containing the capacity and multiple features. Each dataset is time-dependent. In the end, I want to predict the capacity at a specific time.

To do so, I want to build one model using all the data, but I'm not sure how to proceed with 10 datasets from 10 different measurements. Can I merge the 10 datasets into one and then divide the complete dataset into training, test and validation sets? I'm unsure because the time stamps are the same in every dataset.

  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. Commented Oct 19, 2022 at 11:29

1 Answer

I suggest stacking the datasets one under another, like this:

Index          Columns
time1 battery1  feature1 feature2 ... y_true=capacity
time1 battery2  feature1 feature2 ... y_true
...
time2 battery1  feature1 feature2 ... y_true
time2 battery2  feature1 feature2 ... y_true
...
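
The layout above can be sketched with pandas. This is only a minimal illustration: the `make_battery_frame` helper, the column names (`time`, `feature1`, `capacity`) and the three toy batteries are assumptions standing in for the real measurements.

```python
import pandas as pd

# Hypothetical stand-in for one battery's measurements; every battery
# shares the same time stamps, as described in the question.
def make_battery_frame(offset):
    return pd.DataFrame({
        "time": [0, 1, 2],
        "feature1": [0.1 + offset, 0.2 + offset, 0.3 + offset],
        "capacity": [2.0 - offset, 1.9 - offset, 1.8 - offset],
    })

frames = {f"battery{i}": make_battery_frame(0.01 * i) for i in range(1, 4)}

# Stack the frames; the dict keys become a "battery" index level,
# which is then moved into the (time, battery) index.
stacked = (
    pd.concat(frames, names=["battery", "row"])
    .reset_index(level="battery")
    .reset_index(drop=True)
    .sort_values(["time", "battery"], kind="stable")
    .set_index(["time", "battery"])
)

print(stacked)
```

Sorting by `time` first keeps all batteries for one time stamp next to each other, which matters for the time-based split discussed further down.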

Then you can one-hot encode the battery ID and include it as a feature (or not, depending on whether you want to find outliers among the batteries).
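
One possible way to do the encoding is `pandas.get_dummies`. The toy frame, its column names and the `is` prefix below are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical stacked data with the battery ID as an ordinary column.
df = pd.DataFrame({
    "time": [0, 0, 1, 1],
    "battery": ["battery1", "battery2", "battery1", "battery2"],
    "feature1": [0.10, 0.12, 0.20, 0.22],
    "capacity": [2.00, 1.98, 1.90, 1.88],
})

# Replace the "battery" column with one 0/1 indicator column per battery.
encoded = pd.get_dummies(df, columns=["battery"], prefix="is", dtype=int)

print(encoded.columns.tolist())
```

Keep in mind that a model trained with these indicator columns has no column for a battery it has never seen, so leaving the ID out may be the better choice if the model should generalize to new batteries.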

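Just be careful when using TimeSeriesSplit from sklearn. It splits by row position, so on the stacked data it can cut through the middle of the batch of batteries that share a time stamp. Group the rows by time stamp first and then split on the unique time stamps, so that every row for a given time lands on the same side of the split. Otherwise you will have look-ahead bias.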
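
A minimal sketch of that idea, assuming a `time` column and two toy batteries: run `TimeSeriesSplit` over the unique time stamps rather than over the rows, then map each fold back to row positions. The variable names and `n_splits=2` are illustrative choices, not a recommendation.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import TimeSeriesSplit

# Hypothetical stacked data: 6 time stamps, 2 batteries per time stamp.
df = pd.DataFrame({
    "time": np.repeat(np.arange(6), 2),
    "battery": ["battery1", "battery2"] * 6,
})

# Split the unique time stamps, not the rows themselves.
unique_times = np.sort(df["time"].unique())

folds = []
for train_t, test_t in TimeSeriesSplit(n_splits=2).split(unique_times):
    # Map the selected time stamps back to row positions in df.
    train_mask = df["time"].isin(unique_times[train_t]).to_numpy()
    test_mask = df["time"].isin(unique_times[test_t]).to_numpy()
    folds.append((np.flatnonzero(train_mask), np.flatnonzero(test_mask)))

for train_idx, test_idx in folds:
    print("train times:", df["time"].iloc[train_idx].unique(),
          "test times:", df["time"].iloc[test_idx].unique())
```

Each pair in `folds` holds positional row indices, so it can be used with `df.iloc[...]` or passed as a custom `cv` iterable to sklearn utilities that accept one.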
