I need to translate part of the content of HTML pages. I have many HTML documents as a list of files, and a map of translations, like this:
List<File> files
Map<String, String> translations
Only strings inside specific tags (p, h1..h6, li) have to be translated. I want to end up with the same files as at the beginning, but with the strings replaced.
Two solutions that don't work:
- Plain string replacement: I don't want to translate strings inside comments or JavaScript. Another problem is that one original string can be a substring of another original string.
- Parsing libraries like Jsoup: they clean up and fix the DOM structure, and I want the HTML structure to stay unmodified.
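To illustrate the first problem, here is a minimal sketch of the naive replacement. The class name NaiveReplaceDemo, the method naiveTranslate, and the sample strings are hypothetical, made up for this example and not taken from my real files:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NaiveReplaceDemo {

    // Naive approach: replace every occurrence of each original string,
    // anywhere in the file, regardless of the surrounding markup.
    static String naiveTranslate(String html, Map<String, String> translations) {
        String result = html;
        for (Map.Entry<String, String> entry : translations.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; LinkedHashMap keeps insertion order,
        // so "Home" is replaced before "Home page".
        Map<String, String> translations = new LinkedHashMap<>();
        translations.put("Home", "Start");
        translations.put("Home page", "Startseite");

        String html = "<!-- Home --><script>var t = 'Home';</script><p>Home page</p>";

        // prints: <!-- Start --><script>var t = 'Start';</script><p>Start page</p>
        System.out.println(naiveTranslate(html, translations));
    }
}
```

The comment and the script are changed although they should not be, and the paragraph ends up as "Start page" instead of "Startseite", because the shorter key matched inside the longer one first.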
Any solutions?