This seems like it should be simple: using an Ant task, can I use replaceregexp to replace every occurrence of a repeated character, but only inside certain strings in a file?

File contents:

Blah blah blah <ac:link> words_with_underscores_to_turn_to_spaces</link>
Blah blah blah Blah blah blah Blah blah blah Blah blah blah
Words_with_underscores_that_I_want_to_keep. Blah blah blah Blah blah blah. 

Result wanted is:

Blah blah blah <ac:link> words with underscores to turn to spaces</link> 
Blah blah blah Blah blah blah Blah blah blah Blah blah blah 
Words_with_underscores_that_I_want_to_keep. Blah blah blah Blah blah blah. 

I can use replaceregexp to match <ac:link.*?/link> and limit the replacements to those strings, but how do I then tell it to replace every underscore it finds inside the match, no matter where it falls? The lines with underscores don't always have the same number of words.

I also tried a copy task approach, like this:

  <copy todir=".\test_output">
    <filterchain>
      <tokenfilter>
        <containsregex pattern="(ac:link.*?link)" flags="gi"/>
        <replacestring from="_" to=" "/>
      </tokenfilter>
    </filterchain>
    <fileset dir=".\underscore_test_output" includes="**/*.txt"/>
  </copy>

That replaces the underscores with spaces in the links and copies the links into a new file, but it drops the rest of the source file, because containsregex only passes through the lines that match. Any ideas?

1 Answer

Using a <scriptfilter> is an excellent way to have conditional logic in a <filterchain>.

In the script below, a <filetokenizer/> treats the entire input file as a single token. This allows the JavaScript to match tags across newlines.

Ant script

<copy todir="${out.dir}">
  <fileset dir="${basedir}" includes="test.txt"/>
  <filterchain>
    <tokenfilter>
      <filetokenizer/>
      <scriptfilter language="javascript"><![CDATA[
        var originalFile = self.getToken();
        var originalFileIndex = 0;
        var transformedFile = '';
        var keepGoing = true;

        // The "ac:" vs. no "ac:" discrepancy between the opening and closing
        // tags comes from the sample text in the question.
        var openingTagFormat = '<ac:link>';
        var closingTagFormat = '</link>';

        while (keepGoing) {
          var openingAcLinkBeginIndex = originalFile.indexOf(openingTagFormat, originalFileIndex);
          keepGoing = openingAcLinkBeginIndex > -1;
          if (keepGoing) {
            var openingAcLinkEndIndex = openingAcLinkBeginIndex + openingTagFormat.length;
            var closingAcLinkBeginIndex = originalFile.indexOf(closingTagFormat, openingAcLinkEndIndex);
            keepGoing = closingAcLinkBeginIndex > -1;
            if (keepGoing) {
              transformedFile += originalFile.slice(originalFileIndex, openingAcLinkEndIndex);
              var closingAcLinkEndIndex = closingAcLinkBeginIndex + closingTagFormat.length;
              var stringBetweenAcLinkTags = originalFile.slice(openingAcLinkEndIndex, closingAcLinkBeginIndex);
              transformedFile += stringBetweenAcLinkTags.replace(/_/g, ' ');
              transformedFile += originalFile.slice(closingAcLinkBeginIndex, closingAcLinkEndIndex);
              originalFileIndex = closingAcLinkEndIndex;
            }
          }
        }

        transformedFile += originalFile.substring(originalFileIndex);

        self.setToken(transformedFile);
      ]]></scriptfilter>
    </tokenfilter>
  </filterchain>
</copy>

Output

Blah blah blah <ac:link> words with underscores to turn to spaces</link>
Blah blah blah Blah blah blah Blah blah blah Blah blah blah
Words_with_underscores_that_I_want_to_keep. Blah blah blah Blah blah blah.
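
The index-based loop above could probably be condensed into a single regular-expression replace with a callback. The following is only a sketch under some assumptions: the function name replaceUnderscoresInLinks is hypothetical, the tag formats are the ones from the question's sample text, and the script engine that Ant uses is assumed to support a function as the second argument to String.replace.

```javascript
// Hypothetical helper; the name is illustrative and not part of Ant.
function replaceUnderscoresInLinks(text) {
  // [\s\S]*? lazily matches any characters, including newlines, so a link
  // that spans several lines is still handled as one match.
  var linkPattern = /(<ac:link>)([\s\S]*?)(<\/link>)/g;
  return text.replace(linkPattern, function (match, open, body, close) {
    // Only the text between the tags is rewritten; the tags are kept as-is.
    return open + body.replace(/_/g, ' ') + close;
  });
}

// Sample input modeled on the text in the question.
var sample =
  'Blah <ac:link> words_with_underscores</link>\n' +
  'Words_to_keep. Blah blah.';
var result = replaceUnderscoresInLinks(sample);
```

Inside the <scriptfilter>, the function would presumably be called as self.setToken(replaceUnderscoresInLinks(String(self.getToken()))). The String(...) conversion is a defensive assumption in case the engine hands back a Java string rather than a JavaScript one.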