
I’m fetching ENTSO-E imbalance prices/volumes with entsoe-py and the parser crashes because the <position> field contains a thousands-separator comma (e.g. "1,346"), which int() can’t parse.

Environment:

  • Windows 10, Python 3.11.9

  • pandas 2.2.x

  • entsoe-py 0.6.10 (also repro’d on latest as of Nov 2025)

  • Locale is en-GB; requests go to the official Transparency Platform API via EntsoePandasClient

Minimal repro:

import keyring
import pandas as pd
from entsoe import EntsoePandasClient

ENTSOE_TOKEN = keyring.get_password("baringa-entsoe", "token")
client = EntsoePandasClient(api_key=ENTSOE_TOKEN)

start = pd.Timestamp('2024-01-01 00:00:00', tz='UTC')
end   = pd.Timestamp('2024-12-31 23:59:59', tz='UTC')

# France example (happens on other countries/years too)
df = client.query_imbalance_volumes(country_code='FR', start=start, end=end)
print(df.shape)

Traceback (excerpt):

File "...\entsoe\parsers.py", line 665, in _parse_imbalance_volumes_timeseries
    position = int(point.find('position').text)
ValueError: invalid literal for int() with base 10: '1,346'

On runs where the parser doesn’t crash, I occasionally see a different error instead:

ValueError: Index contains duplicate entries, cannot reshape
# from df.set_index(['position','category']).unstack()
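
For reference, the second error can be reproduced without the API. The sketch below uses synthetic rows (not real ENTSO-E data; the column names position, category and quantity are my assumption about what the intermediate frame looks like). It shows the reshape failing on a repeated key, how to list the colliding rows, and that keeping the first row per key lets the unstack go through. Whether "first" is semantically right is exactly what I'm unsure about.

```python
import pandas as pd

# Synthetic rows for illustration only; column names are assumed.
rows = pd.DataFrame(
    {
        "position": [1, 1, 1, 2],
        "category": ["A04", "A04", "A05", "A04"],
        "quantity": [10.0, 10.0, 3.0, 7.0],
    }
)
keys = ["position", "category"]

# The key (1, "A04") appears twice, so the reshape is ambiguous and pandas refuses.
try:
    rows.set_index(keys)["quantity"].unstack()
    reshape_error = None
except ValueError as exc:
    reshape_error = str(exc)
print(reshape_error)

# List every row that takes part in a key collision (keep=False marks all copies).
collisions = rows[rows.duplicated(subset=keys, keep=False)]
print(collisions)

# Keeping the first row per key makes the index unique, so unstack succeeds.
deduped = rows.drop_duplicates(subset=keys, keep="first")
wide = deduped.set_index(keys)["quantity"].unstack()
print(wide)
```

If the colliding rows carry different quantities, keep="first" silently discards data, so inspecting collisions first seems safer than deduping blindly.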

What I’ve tried / Notes

  • Cleaning Quantity post-hoc doesn’t help (crash occurs inside the parser before I get a dataframe).

  • Timestamps are tz='UTC'; switching to Etc/UTC doesn’t change the behavior.

  • Looks like the XML returned by the API sometimes includes <position> with commas (1,346) rather than a plain integer. I can’t see an option in entsoe-py to sanitize this or request a different number format.

  • The duplicate-index error seems to come from multiple <TimeSeries> sharing the same (timestamp, position, category) combo in the ZIP payload (not my main blocker, but mentioning for completeness).
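
To make the sanitizing idea concrete, here is a minimal sketch of stripping commas from <position> elements in the raw XML text before any integer parsing happens. The XML fragment is hand-written for illustration (real responses are namespaced and much larger, and the <note> element is made up to show that other fields are left alone). strip_position_commas is a hypothetical helper name. I have not confirmed a supported way to feed cleaned text back into entsoe-py, so that step is left out.

```python
import re
import xml.etree.ElementTree as ET

# Hand-written fragment, not an actual API response.
SAMPLE_XML = """<TimeSeries>
  <Period>
    <Point><position>1,346</position><quantity>12.5</quantity></Point>
    <Point><position>1,347</position><note>keep, this</note></Point>
  </Period>
</TimeSeries>"""

# Capture the opening tag, the text content, and the closing tag separately.
POSITION_RE = re.compile(r"(<position>)([^<]*)(</position>)")


def strip_position_commas(xml_text: str) -> str:
    """Remove commas inside <position> elements only; other text is untouched."""
    return POSITION_RE.sub(
        lambda m: m.group(1) + m.group(2).replace(",", "") + m.group(3),
        xml_text,
    )


clean = strip_position_commas(SAMPLE_XML)
positions = [int(p.text) for p in ET.fromstring(clean).iter("position")]
print(positions)  # [1346, 1347]
```

Restricting the substitution to <position> avoids touching commas elsewhere in the document, which a blanket replace(",", "") would not.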

Questions

  1. Is there a recommended way in entsoe-py to handle locale/thousands separators in <position>?

    • e.g., a documented flag, or a known version that doesn’t parse <position> with int() directly?
  2. If not, what’s the cleanest workaround?

    • Monkey-patch the parser to strip commas before int()?

    • Pre-download the ZIP, sanitize the XML (strip commas inside <position>), then call the internal parser?

    • Another approach I’m missing?

  3. Any guidance on the “Index contains duplicate entries” when unstacking on ['position','category']?

    • Is deduping on ['timestamp','position','category'] and keeping the first row (drop_duplicates with keep='first') the right approach, or is there a better semantic grouping?
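
For question 2, this is the kind of monkey-patch I have in mind, as a sketch only. It assumes the parser looks up int by name at call time, as the traceback line suggests, so defining a module-level int in entsoe.parsers would shadow the built-in for that module alone. lenient_int and patch_int are hypothetical helper names, and the demo runs against a stand-in module rather than entsoe-py itself.

```python
import builtins
import types


def lenient_int(value, *args, **kwargs):
    """Behave like int(), but drop thousands-separator commas from strings first."""
    if isinstance(value, str):
        value = value.replace(",", "").strip()
    return builtins.int(value, *args, **kwargs)


def patch_int(module):
    """Shadow the built-in int inside one module's namespace only."""
    module.int = lenient_int


# Stand-in for a parser module that calls int() on element text.
fake = types.ModuleType("fake_parsers")
exec("def parse(text):\n    return int(text)", fake.__dict__)

patch_int(fake)
print(fake.parse("1,346"))  # 1346

# Assumed application to the real library (not run here):
#   import entsoe.parsers
#   patch_int(entsoe.parsers)
```

One caveat I can see: any isinstance(x, int) check inside the patched module would break, because the name no longer refers to a type. The built-in int everywhere else is unaffected.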
