Remove only a part of a string [duplicate]

Question

If I have the following string:

< asd="testJava"><a href="/title/text/">BLA BLA <asddead>

How can I get only the string BLA BLA?

I tried split, but it removes all the characters; I only need to remove the parts between "<" and ">". Once I have the string, I'm going to add it to an ArrayList with array.add(). Can someone help me with the code that removes those parts? Thank you!

I'm going to use this in Java. I need to remove the HTML code and keep only the string.
– user3704449, Commented Jun 3, 2014 at 19:24

kajacx · Accepted Answer · 2014-06-03 19:39:15Z

2

Use a regex to replace everything from < to > with nothing:

String newText = oldText.replaceAll("<[^>]*>", "").trim();

Two more notes:

This wouldn't work on something like <a href="foo>com">BLA BLA</a>, since the regex would stop at the > inside foo>com rather than at the one that actually closes the tag. In such a case, I would recommend a proper HTML / XML parser.
Add .trim() to remove any whitespace before or after your text. Without it, <img> <br> BLA BLA would not resolve to 'BLA BLA'; the spaces that sat between the tags would remain in front of it.
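To make both notes concrete, here is a minimal sketch that wraps the one-liner in a helper and collects the results in an ArrayList, as the question describes. The class name TagStripper and the method name stripTags are made up for illustration. The last call shows the attribute-with-> case, where the regex leaves debris behind.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative, not from the answer.
class TagStripper {

    // Removes every run that starts at '<' and ends at the next '>',
    // then trims leading and trailing whitespace.
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", "").trim();
    }

    public static void main(String[] args) {
        List<String> labels = new ArrayList<>();

        // The string from the question.
        labels.add(stripTags("< asd=\"testJava\"><a href=\"/title/text/\">BLA BLA <asddead>"));

        // Whitespace left between tags is removed by trim().
        labels.add(stripTags("<img> <br> BLA BLA"));

        System.out.println(labels); // prints [BLA BLA, BLA BLA]

        // Known limitation: a '>' inside an attribute value ends the match early,
        // so part of the tag survives.
        System.out.println(stripTags("<a href=\"foo>com\">BLA BLA</a>")); // prints com">BLA BLA
    }
}
```

If input like the last case can occur, the regex is the wrong tool and an HTML parser is the safer choice.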

edited Jun 3, 2014 at 19:39

answered Jun 3, 2014 at 19:25

kajacx


Adam Yost · Accepted Answer · 2014-06-03 19:25:59Z

1

Ignoring the implications of expanding this solution to a full HTML parser... you could use replaceAll with a regex.

str = str.replaceAll("<[^>]*>","");

This should replace all the HTML tags with nothing, leaving just your label, BLA BLA.

answered Jun 3, 2014 at 19:25

Adam Yost
