I have raw data that looks like this:

25023,Zwerg+M%C3%BCtze,0,1,986,3780
25871,red+earth,0,1,38,8349
25931,K4m%21k4z3,90,1,1539,2530

It is saved as a .txt file: https://de205.die-staemme.de/map/player.txt

The "characters" starting with % are encoded Unicode characters, as far as I can tell.

I found the following table about it: https://www.i18nqa.com/debug/utf8-debug.html

Here is my code so far:

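import urllib.request

# Python 3: urlretrieve lives in urllib.request
urllib.request.urlretrieve(url, pfad + "player.txt")

f = open(pfad + "player.txt", "r", encoding="utf-8")
raw = f.read().split("\n")  # read the file contents before splitting into lines
f.close()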

Python does not convert the %-sequences. They are read as if they were separate characters.

Is there a way to convert these characters without calling .replace like 200 times?

Thank you very much in advance for help and/or useful hints!

1 Answer

The % sequences are URL encoding (percent-encoding) of UTF-8 bytes; use urllib.parse.unquote to decode the string.

>>> raw = """25023,Zwerg+M%C3%BCtze,0,1,986,3780
... 25871,red+earth,0,1,38,8349
... 25931,K4m%21k4z3,90,1,1539,2530"""
>>> import urllib.parse
>>> print(urllib.parse.unquote(raw))
25023,Zwerg+Mütze,0,1,986,3780
25871,red+earth,0,1,38,8349
25931,K4m!k4z3,90,1,1539,2530
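
As the output above shows, unquote leaves the + signs untouched. If the + is meant to stand for a space (an assumption about this data, as is the idea that the second field is the player name), urllib.parse.unquote_plus decodes both. The sketch below also splits each line on commas before decoding, so that a name containing an encoded comma (%2C) would not be mistaken for a field separator. The helper name parse_players is hypothetical.

```python
from urllib.parse import unquote_plus

raw = """25023,Zwerg+M%C3%BCtze,0,1,986,3780
25871,red+earth,0,1,38,8349
25931,K4m%21k4z3,90,1,1539,2530"""


def parse_players(text):
    """Split each line on commas, then decode only the name field.

    Assumes the second field is the URL-encoded player name and that
    '+' represents a space in it.
    """
    rows = []
    for line in text.splitlines():
        if not line:
            continue  # skip empty lines, e.g. a trailing newline
        fields = line.split(",")
        fields[1] = unquote_plus(fields[1])
        rows.append(fields)
    return rows


players = parse_players(raw)
for row in players:
    print(row)
```

With the sample data, the name fields come out as "Zwerg Mütze", "red earth" and "K4m!k4z3"; the numeric fields stay as strings and can be converted with int() if needed.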

1 Comment

This is exactly what I was searching for. Thank you very much!
