I'm making a mod for a game where the majority of the files are XMLs, the text of which is Simplified Chinese. My goal is to replace all of the Simplified Chinese in the files with Traditional, followed by an English translation. I'm using the Cloud Translate API from Google to do that part, and it all works fine. At first I was just doing a find and replace on the Chinese text and then adding English to the end of string, but the issue with that is that I'm getting extra English translations whenever the Chinese text occurs more than once.
In an effort to fix that I read more of the XML documentation for Python, and I started trying to use tree.write, but that's where I'm getting stuck. When I use it, the XML file has the UTF codes for the Chinese characters, rather than the actual characters. If I open the file in a web browser, the characters render correctly, but at this point I'm just unsure if they'll still work with the game if they're not writing into the XML normally.
Here's an example XML I'm working with:
<Texts Type="Story">
<List>
<Text Name="TradeAuction">
<DisplayName>拍卖会</DisplayName>
<Desc>[NAME]来到了[PLACE],发现此地有个拍卖行。</Desc>
<Selections.0.Display>参与拍卖</Selections.0.Display>
<Selections.1.Display>离去</Selections.1.Display>
</Text>
</List>
</Texts>
My code which works but sometimes duplicates English translations:
import lxml.etree as ET
from google.cloud import translate_v2 as translate
import pinyin
translator = translate.Client()
tgt = "zh-TW"
tt = "en"
with open('/home/dave/zh-TW-final/Settings/MapStories/MapStory_Auction.xml', 'r', encoding="utf-8") as f:
tree = ET.parse(f)
root = tree.getroot()
for elem in root.iter('Text'):
print(elem.text)
for child in elem:
txt = child.text
ttxt = translator.translate(txt, target_language=tgt)
etxt = translator.translate(txt, target_language=tt)
with open('/home/dave/zh-TW-final/Settings/MapStories/MapStory_Auction.xml', 'r') as n:
new = n.read().replace(txt, ttxt['translatedText'] + '(' + etxt['translatedText'] + ')', 1)
with open('/home/dave/zh-TW-final/Settings/MapStories/MapStory_Auction.xml', 'w') as n:
n.write(new)
The output of that looks like this:
<Texts Type="Story">
<List>
<Text Name="TradeAuction">
<DisplayName>拍賣會(auctions)</DisplayName>
<Desc>[NAME]來到了[PLACE],發現此地有個拍賣行。([NAME] came to [PLACE] and found an auction house here.)</Desc>
<Selections.0.Display>參與拍賣(Participate in the auction)</Selections.0.Display>
<Selections.1.Display>離去(Leave)</Selections.1.Display>
</Text>
</List>
</Texts>
And here's my tree.write code:
import lxml.etree as ET
from google.cloud import translate_v2 as translate
import pinyin
translator = translate.Client()
tgt = "zh-TW"
tt = "en"
with open('/home/dave/zh-TW/Settings/MapStories/MapStory_Auction.xml', 'r', encoding="utf-8") as f:
tree = ET.parse(f)
root = tree.getroot()
for elem in root.iter('Text'):
print(elem.text)
for child in elem:
print(child.text)
txt = child.text
ttxt = translator.translate(txt, target_language=tgt)
etxt = translator.translate(txt, target_language=tt)
child.text = ttxt['translatedText'] + "(" + etxt['translatedText'] + ")"
tree.write('/home/dave/zh-TW-final/Settings/MapStories/MapStory_Auction.xml')
And the output from that looks like this:
<Texts Type="Story">
<List>
<Text Name="TradeAuction">
<DisplayName>拍賣會(auctions)</DisplayName>
<Desc>[NAME]來到了[PLACE],發現此地有個拍賣行。([NAME] came to [PLACE] and found an auction house here.)</Desc>
<Selections.0.Display>參與拍賣(Participate in the auction)</Selections.0.Display>
<Selections.1.Display>離去(Leave)</Selections.1.Display>
</Text>
</List>
</Texts>
Any help would be appreciated. I think once I figure this out I should be able to fly through the rest of the translating.