I have a Java string like this:

    String str = "<table><tr><td>ALL,</td></tr><tr><td></td></tr><tr><td> Please find attached file for this week<tr><td></td></tr><tr><td>Thanks</td></tr><tr><td>Support Team</td></tr>";

I want output like this:

All,

Please find attached file for this week.

Thanks, Support Team

  • That's not going to be easy, as the HTML is not well-formed, unless you use regexes, which is not recommended. Commented Jun 22, 2023 at 9:55
  • You should provide more context in the question. Do you need to format the string with newlines (\n)? Given that you are dealing with a <table>, I think you are looking for a templating engine such as FreeMarker or Thymeleaf. Commented Jun 22, 2023 at 9:56
  • Does this answer your question? how to convert HTML text to plain text? Commented Jun 22, 2023 at 10:55
  • Does this answer your question? Remove HTML tags from a String Commented Jun 22, 2023 at 11:24

1 Answer

You should really use a proper HTML parser, but if you want something quick and dirty and your HTML is well-formed, you can use the classes in the javax.swing.text.html package:

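    import java.io.StringReader;
    import javax.swing.text.Document;
    import javax.swing.text.html.HTMLDocument;
    import javax.swing.text.html.HTMLEditorKit;

    public static String stripTags(String content) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        Document doc = new HTMLDocument();
        // Read from the string directly to avoid a platform-dependent byte encoding.
        kit.read(new StringReader(content), doc, 0);
        // The document's text is the content with the markup removed.
        return doc.getText(0, doc.getLength());
    }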
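
To get closer to the layout asked for in the question, the stripped text still needs some whitespace cleanup. The following is only a rough sketch of one way to do that: the class name `HtmlToText` and the method `toPlainText` are made up for illustration, and separating the non-blank lines with one empty line is an assumption about the desired format. It relies on the Swing parser's leniency with the unclosed tags in the sample string, so the exact line breaks may differ from the output shown in the question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.stream.Collectors;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

public class HtmlToText {

    // Hypothetical helper: strips markup with Swing's HTML parser,
    // then keeps only the non-blank lines, separated by one empty line.
    public static String toPlainText(String html) {
        try {
            HTMLEditorKit kit = new HTMLEditorKit();
            HTMLDocument doc = new HTMLDocument();
            kit.read(new StringReader(html), doc, 0);
            String raw = doc.getText(0, doc.getLength());

            return Arrays.stream(raw.split("\\R"))   // split on any line break
                    .map(String::trim)               // drop leading/trailing spaces
                    .filter(line -> !line.isEmpty()) // skip the empty table cells
                    .collect(Collectors.joining("\n\n"));
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException("Could not parse HTML", e);
        }
    }

    public static void main(String[] args) {
        String str = "<table><tr><td>ALL,</td></tr><tr><td></td></tr><tr><td> Please find attached file for this week<tr><td></td></tr><tr><td>Thanks</td></tr><tr><td>Support Team</td></tr>";
        System.out.println(toPlainText(str));
    }
}
```

This does not add the missing punctuation or merge "Thanks" and "Support Team" onto one line; that part of the desired output is not present in the HTML and would have to be handled separately.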