I am trying to clean up an XML file using sed.

I need to remove all <DistanceMeters>123.123</DistanceMeters>.

I've been trying to use this command, without success:

sed 's/(<DistanceMeters>)[.]*?(<\/DistanceMeters>)/ /g' file.txc

Example node:

<Trackpoint><Time>2014-02-12T18:18:49+11:00</Time>
<Position><LatitudeDegrees>35.209656</LatitudeDegrees><LongitudeDegrees>28.99924</LongitudeDegrees></Position>
<AltitudeMeters>586.99994</AltitudeMeters>
<DistanceMeters>148.30713</DistanceMeters>
<Cadence>4</Cadence>
<Extensions><TPX xmlns="http://www.garmin.com/xmlschemas/ActivityExtension/v2" CadenceSensor="Bike"><Speed>0.043145742</Speed></TPX></Extensions></Trackpoint>

To make things a little more confusing, the source file is all on a single line.

Thanks.

1 Answer

If each DistanceMeters element is on its own line, just drop the lines that contain it:

awk '!/DistanceMeters/' file

Output:

<Trackpoint><Time>2014-02-12T18:18:49+11:00</Time>
<Position><LatitudeDegrees>35.209656</LatitudeDegrees><LongitudeDegrees>28.99924</LongitudeDegrees></Position>
<AltitudeMeters>586.99994</AltitudeMeters>
<Cadence>4</Cadence>
<Extensions><TPX xmlns="http://www.garmin.com/xmlschemas/ActivityExtension/v2" CadenceSensor="Bike"><Speed>0.043145742</Speed></TPX></Extensions></Trackpoint>

To remove the element from inside a line (as in your single-line file), you can do:

awk '{gsub(/<DistanceMeters>[^>]*>/,"")}1' file
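
In awk, sub() replaces only the first match in each record, whereas gsub() replaces every match. With the whole file on one line, that difference decides whether one element or all of them are removed. A small sketch of the two, using a made-up sample line (the variable names line, first_only and all_removed are only for illustration):

```shell
# Hypothetical one-line sample containing two elements to remove.
line='<a><DistanceMeters>1.5</DistanceMeters><b/><DistanceMeters>2.5</DistanceMeters></a>'

# sub() replaces only the first match in each record.
first_only=$(printf '%s\n' "$line" | awk '{sub(/<DistanceMeters>[^>]*>/,"")}1')

# gsub() replaces every match in each record.
all_removed=$(printf '%s\n' "$line" | awk '{gsub(/<DistanceMeters>[^>]*>/,"")}1')

printf 'sub:  %s\n' "$first_only"
printf 'gsub: %s\n' "$all_removed"
```

The trailing 1 is just an always-true pattern that makes awk print each (modified) record.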

Or with sed:

sed 's/<DistanceMeters>[^>]*>//g' file

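Neither command relies on the greedy .* here. Because [^>]* cannot match past a >, each match ends at the closing tag of its own element, so lines with multiple <DistanceMeters> blocks are handled correctly. A greedy .* would instead swallow everything between the first opening tag and the last closing tag on the line. Note that this assumes the element contains only text, with no nested tags.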
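
To see the difference, here is a minimal sketch that runs the bounded pattern and a greedy .* pattern over the same single-line input. The sample data and the variable names (sample, bounded, greedy) are made up for the demonstration:

```shell
# Hypothetical single-line sample with two DistanceMeters elements.
sample='<Trackpoint><DistanceMeters>148.30713</DistanceMeters><Cadence>4</Cadence></Trackpoint><Trackpoint><DistanceMeters>150.2</DistanceMeters><Cadence>5</Cadence></Trackpoint>'

# Bounded pattern: [^>]* cannot cross a '>', so each match ends at its own closing tag.
bounded=$(printf '%s\n' "$sample" | sed 's/<DistanceMeters>[^>]*>//g')

# Greedy pattern: .* runs to the LAST closing tag on the line,
# deleting everything between the first opening tag and that point.
greedy=$(printf '%s\n' "$sample" | sed 's/<DistanceMeters>.*<\/DistanceMeters>//g')

printf 'bounded: %s\n' "$bounded"
printf 'greedy:  %s\n' "$greedy"
```

With the bounded pattern both Trackpoint nodes survive with their Cadence values; with the greedy one, the first node's Cadence and the second node's opening tag are lost along with the distances.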
