0

I wrote code to read news from an XML feed, and I need to display the description of each item in my list view. I used this piece of code to remove the HTML tags that exist inside the description tag:

    else if ("description".equals(tagName)) {
        sourcedescription = parser.nextText();
        description = Html.fromHtml(sourcedescription).toString();
        Log.d("msg", description);
        feedDescription.add(description);
    }

For some items I managed to display the description without tags, i.e. in a readable form, but for other items, which contain {iframe} {/iframe} tags, I failed to remove all the tags. I think this tag appears in the description of the items that have "no description":

<description><![CDATA[<p>{iframe height="600"}<a href="http://admreg.yu.edu.jo/index.php?option=com_content&view=article&id=606:------20132014&catid=87:2011-01-25-18-12-08&Itemid=438">http://admreg.yu.edu.jo/index.php?option=com_content&view=article&id=606:------20132014&catid=87:2011-01-25-18-12-08&Itemid=438</a><span style="line-height: 1.3em;">{/iframe}</span></p>]]></description>

My question is: how can I remove the iframe tags using regular expressions?

2

4 Answers

2

A possible solution would be:

    String regexp = "\\{/?iframe.*?\\}";
    String text = "<description><![CDATA[<p>{iframe height=\"600\"}<a href=\"http://admreg.yu.edu.jo/index.php?option=com_content&view=article&id=606:------20132014&catid=87:2011-01-25-18-12-08&Itemid=438\">http://admreg.yu.edu.jo/index.php?option=com_content&view=article&id=606:------20132014&catid=87:2011-01-25-18-12-08&Itemid=438</a><span style=\"line-height: 1.3em;\">{/iframe}</span></p>]]></description>";
    System.out.println(text.replaceAll(regexp, ""));

If you also want to remove the content inside the iframe tags, use this regexp instead:

    text.replaceAll("\\{iframe .*?\\}.*?\\{/iframe\\}", "")
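One caveat with the content-removing pattern: in Java, `.` does not match line terminators by default, so a block whose content spans several lines would be left untouched. Below is a minimal sketch, assuming the feed can contain such multi-line blocks. The class and method names are made up for illustration, and the required space after `iframe` is dropped so that a bare `{iframe}` also matches.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
final class IframeBlockRemover {
    // DOTALL lets '.' match line breaks, so multi-line blocks are removed too.
    // \b stops the pattern from matching longer names such as {iframes ...}.
    private static final Pattern BLOCK =
            Pattern.compile("\\{iframe\\b.*?\\}.*?\\{/iframe\\}", Pattern.DOTALL);

    // Removes every {iframe ...}...{/iframe} block, markers and content included.
    static String removeBlocks(String text) {
        return BLOCK.matcher(text).replaceAll("");
    }
}
```

Calling `IframeBlockRemover.removeBlocks(text)` in place of the `replaceAll` call should behave the same on single-line input and additionally handle blocks that contain line breaks.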

2

Use these regexes:

\{iframe[^\}]*\}   // to delete the opening tag
\{/iframe[^\}]*\}  // to delete the closing tag

These regexes won't delete what is inside the iframe.
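In Java source, these patterns have to be written as string literals with doubled backslashes. Here is a small sketch that merges both into one pattern by making the slash optional. The class and method names are illustrative only.

```java
// Hypothetical helper showing the two patterns above as one Java string literal.
final class IframeMarkers {
    // "/?" makes the slash optional, so one pattern covers {iframe ...} and {/iframe}.
    // [^}]* consumes any attributes up to the closing brace.
    static String strip(String text) {
        return text.replaceAll("\\{/?iframe[^}]*\\}", "");
    }
}
```

As with the two separate patterns, only the markers are removed. Whatever sits between them stays in the string.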

1

Note: use a parser if you have the option. That said, for a quick and dirty fix:

    str.replaceAll("\\{/?iframe.*?\\}", "");

To also remove the content between these tags:

    str.replaceAll("\\{iframe.*?\\}.*?\\{/iframe\\}", "");
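Whichever pattern is used, it is probably best applied to the raw description before any HTML-to-text conversion, so that the link between the markers is removed together with them. Below is a rough sketch of that ordering. `stripTags` is a crude regex stand-in for Android's `Html.fromHtml(...).toString()` so that the example runs on a plain JVM, and all names are hypothetical.

```java
// Hypothetical pipeline: remove iframe blocks first, then convert to plain text.
final class DescriptionCleaner {
    // Crude stand-in for Html.fromHtml(...).toString(); an assumption, not the Android API.
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", "");
    }

    static String clean(String rawDescription) {
        // 1. Drop each {iframe ...}...{/iframe} block, including the link inside it.
        //    (?s) lets '.' match line breaks inside a block.
        String withoutBlocks =
                rawDescription.replaceAll("(?s)\\{iframe.*?\\}.*?\\{/iframe\\}", "");
        // 2. Convert what is left to plain text.
        return stripTags(withoutBlocks).trim();
    }
}
```

In the sample from the question the closing marker sits inside a `<span>`, so step 1 leaves a stray `</span>` behind. The tag-stripping step then removes it, and the description ends up empty.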

2 Comments

OK, this works, thanks :) But what is inside this tag still appears: admreg.yu.edu.jo/…
I mean the content of <a href="...">...</a>, which is the value of href.
0

HTML is not a regular language. Don't use RegEx with it, or you'll die.

3 Comments

Ok, I'll try not doing it any more.
LOL this was a funny comment!!
@SameerSawla this is an answer to a deleted comment
