I've got an XML feed coming from Twitter which I want to transform using XSLT. What I want the XSLT to do is replace every URL occurring in a Twitter message with an HTML link. I've already created the following XSLT template using this and this topic here on Stack Overflow. How can I achieve this? If I use the template as below I get an infinite loop, but I don't see where. As soon as I comment out the call to the 'replaceAll' template everything seems to work, but then of course no content of the Twitter message gets replaced. I'm new to XSLT, so every bit of help is welcome.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
    <xsl:output method="text" omit-xml-declaration="yes" indent="yes"  encoding="utf-8" />
    <xsl:param name="html-content-type" />
    <xsl:variable name="urlRegex" select="8"/>
    <xsl:template match="statuses">
        <xsl:for-each select="//status[position() &lt; 2]">
            <xsl:variable name="TwitterMessage" select="text" />
            <xsl:call-template name="replaceAll">
                <xsl:with-param name="text" select="$TwitterMessage"/>
                <xsl:with-param name="replace" select="De"/> <!--This should become an regex to replace urls, maybe something like the rule below?-->
                <xsl:with-param name="by" select="FOOOO"/> <!--Here I want the matching regex value to be replaced with valid html to create an href-->
                <!--<xsl:value-of select="replace(text,'^http://(.*)\.com','#')"/>
                <xsl:value-of select="text"/>-->
            </xsl:call-template>
            <!--<xsl:value-of select="text"/>-->
            <!--<xsl:apply-templates />-->
        </xsl:for-each>
    </xsl:template>

    <xsl:template name="replaceAll">
        <xsl:param name="text"/>
        <xsl:param name="replace"/>
        <xsl:param name="by"/>
        <xsl:choose>
            <xsl:when test="contains($text,$replace)">
                <xsl:value-of select="substring-before($text,$replace)"/>
                <xsl:value-of select="$by"/>
                <xsl:call-template name="replaceAll">
                    <xsl:with-param name="text" select="substring-after($text,$replace)"/>
                    <xsl:with-param name="replace" select="$replace"/>
                    <xsl:with-param name="by" select="$by"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="$text"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

EDIT: This is an example of the XML feed.

<?xml version="1.0" encoding="UTF-8"?>
<statuses type="array">
<status>
  <created_at>Mon May 16 14:17:12 +0000 2011</created_at>
  <id>10000000000000000</id>
  <text>This is an message from Twitter http://bit.ly/xxxxx http://yfrog.com/xxxxx</text>
</status>
</statuses>

This is just the basic XML that Twitter outputs at a URL like the one below:

http://twitter.com/statuses/user_timeline.xml?screen_name=yourtwitterusername

This text:

This is an message from Twitter http://bit.ly/xxxxx http://yfrog.com/xxxxx

Should be converted to:

This is an message from Twitter <a href="http://bit.ly/xxxxx">http://bit.ly/xxxxx</a> <a href="http://yfrog.com/xxxxx">http://yfrog.com/xxxxx</a>
  • Have you considered that you might be using the wrong technology? XSLT is great at transforming the structure of XML, but terrible at modifying its content! For this sort of task I would use something like Linq-to-XML so that I can use C# code for making these changes. Commented May 23, 2011 at 12:16
  • @ColinE, good point! The only problem is that I'm working with a standard CMS component that provides this data. But I will consider this with the project team. Do you have any other ideas on how to solve this using the mentioned technologies? Commented May 23, 2011 at 12:22
  • Could you provide a bit of your XML input? Commented May 23, 2011 at 12:30
  • @empo, added an example. Commented May 23, 2011 at 12:41
  • This is not clear. Could you please provide just the source text and the resulting text you want, and explain the rules for the replacement operation? I would recommend using XSLT 2.0, which together with XPath 2.0 has support for regular expression processing. Commented May 23, 2011 at 12:54

2 Answers


So, your question isn't about XSLT. What you want is to find out the best option for manipulating a text string in XPath. If you are using a standalone XSLT engine, you can probably use XPath 2.0, which just about has the power you need, though with regexes it will get a bit fiddly. If you are running this from an engine with EXSLT support, you will need to look up which functions are available there. If you are running this from PHP, it is generally best to hand text manipulation over to the PHP code; you do that by writing a PHP function that does what you want and calling it from the XSLT using php:function('f-name', inputs ...) as the XPath expression.

As far as regexes go, I guess you are looking for something pretty much along these lines:

send (https?://.*?)(?=[.,:;)]*($|\s)) to <a href="$1">$1</a>.

It's fine if it doesn't match every possible URL; you only need to handle incoming data as well as Twitter's own munging does. Checking for punctuation at the end (the [.,:;)] character class in the regex) is really the only tricky thing that your users will expect you to handle.
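For the XPath 2.0 route, a minimal sketch using xsl:analyze-string might look like the following. It assumes an XSLT 2.0 processor (Saxon, for example) and the feed structure shown in the question. XPath 2.0 regexes have no lookahead, so the pattern below is a simplified stand-in for the one above rather than a direct port: it matches a run of non-whitespace characters and requires the final character not to be trailing punctuation.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- html output so the generated <a> elements are serialized as markup -->
  <xsl:output method="html" encoding="utf-8"/>

  <!-- Only the first status is processed, as in the question -->
  <xsl:template match="/statuses">
    <xsl:apply-templates select="status[1]/text"/>
  </xsl:template>

  <xsl:template match="status/text">
    <!-- Assumed pattern: URL = scheme + non-whitespace run whose last
         character is not one of . , : ; ) -->
    <xsl:analyze-string select="." regex="https?://\S*[^\s.,:;)]">
      <xsl:matching-substring>
        <!-- Inside this branch the context item is the matched URL -->
        <a href="{.}"><xsl:value-of select="."/></a>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Everything between URLs is copied through unchanged -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

Run against the sample feed, this should wrap both http://bit.ly/xxxxx and http://yfrog.com/xxxxx in links and leave the rest of the message untouched.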

1 Comment

In the end I ended up not using XSLT but JavaScript, which works for now. Not the most elegant solution, but for the moment the easiest since I'm up against a deadline for this project. Your answer was closest because the regex does exactly what I needed.

Generally, I wouldn't implement a new replace function. I'd use the one provided by EXSLT. If your XSLT processor supports EXSLT, you just need to set up the stylesheet as follows:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:regexp="http://exslt.org/regular-expressions"
                extension-element-prefixes="regexp"
                version="1.0">

Otherwise, download and import the stylesheet from EXSLT.

For a global replace you can use the function as follows:

<xsl:value-of select="regexp:replace(string($TwitterMessage), 'yourpattern', 'g', 'yourreplacement')" />

Sorry for the general answer, but I'm not able to test XSLT at the moment.
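If neither EXSLT nor XSLT 2.0 is available, the recursive template from the question can still do plain substring replacement once the infinite loop is removed. The most likely cause of that loop is that select="De" and select="FOOOO" are XPath paths to child elements named De and FOOOO, not strings. Both come back empty, contains($text, '') is true, and substring-after($text, '') returns the whole text again, so the template keeps calling itself with the same input. The sketch below quotes the literals and adds a guard against an empty search string; the guard is an addition of mine, not part of the original template.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="utf-8"/>

  <xsl:template match="/statuses">
    <xsl:call-template name="replaceAll">
      <xsl:with-param name="text" select="status[1]/text"/>
      <!-- Quoted string literals; unquoted De would select a <De> child element -->
      <xsl:with-param name="replace" select="'De'"/>
      <xsl:with-param name="by" select="'FOOOO'"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="replaceAll">
    <xsl:param name="text"/>
    <xsl:param name="replace"/>
    <xsl:param name="by"/>
    <xsl:choose>
      <!-- Guard: an empty $replace always "matches" and would recurse forever -->
      <xsl:when test="string-length($replace) &gt; 0 and contains($text, $replace)">
        <xsl:value-of select="substring-before($text, $replace)"/>
        <xsl:value-of select="$by"/>
        <!-- Recurse on the remainder, which is strictly shorter than $text -->
        <xsl:call-template name="replaceAll">
          <xsl:with-param name="text" select="substring-after($text, $replace)"/>
          <xsl:with-param name="replace" select="$replace"/>
          <xsl:with-param name="by" select="$by"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

This only replaces fixed strings, so it cannot find URLs by pattern. For that, XSLT 1.0 still needs a regex extension such as the EXSLT function above.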

4 Comments

Thanks for your answer. Will get back to you about this tomorrow.
That greedy regex is going to mess things up badly, not to mention the .com.
@Nicholas: yes probably. I was just reusing the OP regex without taking care of it. It's better to remove its reference.
Since there was a deadline on this project I ended up doing this with JavaScript. Not the most elegant solution, but it works for now.
