I need to remove nodes from a file if they don't have a certain tag. How can I only keep nodes which have a name tag using awk, sed or grep?

Input:

    <node user="user1">
      <tag k="name" v="name1"/>
    </node>
    <node user="user2">
      <tag k="network" v="nw1"/>
    </node>

Desired output:

    <node user="user1">
      <tag k="name" v="name1"/>
    </node>
  • Please add your desired output for that sample input to your question. Commented Nov 27, 2016 at 11:07
  • No, you absolutely don't do this using "awk, sed or grep". You do this using xsltproc or xmlstarlet. Please read up on these tools. Also, please ask a question. Currently this is a programming assignment, not a question. Commented Nov 27, 2016 at 11:07
  • @Tomalak I need to use awk, sed or grep. Commented Nov 27, 2016 at 11:29
  • No, you don't. Give me one valid reason why you need to use command line tool X, while the much more appropriate command line tool Y, which is also installed (!), can't be used. Commented Nov 27, 2016 at 11:30
  • That's highly unlikely. They want a problem solved, and if they are paying you then I am pretty sure they want it solved properly. If you come to a car repair shop you don't get to tell the mechanic which tools to use, and in the same way your customer is not telling you which tools to use. So... I don't think that you are really in that situation. Commented Nov 27, 2016 at 12:02

3 Answers

If your file's really that simple, with GNU awk for multi-char RS:

$ awk -v RS='</node>\n' '/v="name1"/{printf "%s%s", $0, RT}' file
    <node user="user1">
      <tag k="name" v="name1"/>
    </node>
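
If GNU awk is not available (RT is a gawk extension, and a multi-character RS is not guaranteed by POSIX), buffering each node line by line might work with any POSIX awk. This is only a sketch: it assumes the layout from the question, with every `<node ...>` and `</node>` on its own line, and it keys on `k="name"` rather than on one particular value. The helper name `keep_named_nodes` and the temporary sample file are made up for illustration. Lines outside a node are dropped.

```shell
# Hypothetical helper: print only the <node> blocks that contain a name tag.
keep_named_nodes() {
  awk '
    /<node[ >]/               { buf = ""; keep = 0; innode = 1 }  # a node starts
    innode                    { buf = buf $0 "\n" }               # buffer its lines
    innode && /<tag k="name"/ { keep = 1 }                        # name tag seen
    /<\/node>/                { if (keep) printf "%s", buf; innode = 0; keep = 0 }
  ' "$1"
}

# Sample input from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
    <node user="user1">
      <tag k="name" v="name1"/>
    </node>
    <node user="user2">
      <tag k="network" v="nw1"/>
    </node>
EOF

keep_named_nodes "$sample"
```

Because each kept node is printed together with its own trailing newline, several matching nodes come out one after another without running together.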

6 Comments

It is not working, the output of your code is: tag k="name" v="name1"/>
No it's not, see the answer. If whatever you are running is producing that output then you are not running the command in my answer against the input in your question.
I suggest replacing RT with RS (typo).
Thanks, but my output is still: tag k="name" v="name1"/> </node>
Okay. With my nawk I miss </node> if I use RT.

With GNU grep:

grep -Poz '.*<node .*\n.*<tag .*v="name1".*\n.*</node>' file.xml

Output:

   <node user="user1">
      <tag k="name" v="name1"/>
    </node>
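
With -z, grep terminates each match with a NUL byte instead of a newline, so several matching nodes run together on the terminal. One possible way around that, assuming GNU grep (for -P and -z) and tr, is to translate the NUL bytes into newlines. The sketch below also matches on `k="name"` instead of one specific value. The helper name `named_nodes` and the three-node sample file are assumptions for illustration.

```shell
# Hypothetical helper: print every <node> block that has a name tag,
# with matches separated by newlines instead of NUL bytes.
named_nodes() {
  grep -Poz '.*<node .*\n.*<tag .*k="name".*\n.*</node>' "$1" | tr '\0' '\n'
}

# Sample input with two matching nodes, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
    <node user="user1">
      <tag k="name" v="name1"/>
    </node>
    <node user="user2">
      <tag k="network" v="nw1"/>
    </node>
    <node user="user3">
      <tag k="name" v="name3"/>
    </node>
EOF

named_nodes "$sample"
```

Like the original pattern, this only matches nodes that span exactly three lines with a single tag line in the middle.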

2 Comments

Yes it works, but how can I insert a Newline if I have several output nodes?
With grep I don't know. I suggest using Ed Morton's awk solution and replacing %s%s with %s%s\n. See my comment on his solution. There's a small typo in his answer.

Some hints with xmlstarlet and this file (file.xml):

<root>
   <node user="user1">
      <tag k="name" v="name1"/>
    </node>
    <node user="user2">
      <tag k="network" v="nw1"/>
    </node>
   <node user="user3">
      <tag k="foo" v="bar"/>
    </node>
</root>

Get the v attribute of every tag:

xmlstarlet sel -t -v '//root/node/tag/@v' file.xml

Output:

name1
nw1
bar

Delete the node whose tag has the attribute v="name1":

xmlstarlet ed -d '//root/node[tag[@v="name1"]]' file.xml

Output:

<?xml version="1.0"?>
<root>
  <node user="user2">
    <tag k="network" v="nw1"/>
  </node>
  <node user="user3">
    <tag k="foo" v="bar"/>
  </node>
</root>

Delete the two nodes whose tags have the attribute v="name1" or v="bar":

xmlstarlet ed -d '//root/node[tag[@v="name1"]]' -d '//root/node[tag[@v="bar"]]' file.xml

Output:

<?xml version="1.0"?>
<root>
  <node user="user2">
    <tag k="network" v="nw1"/>
  </node>
</root>
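
To get what the question asks for (keep only the nodes that have a name tag), the same `ed -d` approach can probably be inverted with the XPath `not()` function, so that every node lacking a `tag` with `k="name"` is deleted. This is a sketch, assuming xmlstarlet is installed and the input is wrapped in a root element as in file.xml above. The helper name `keep_named_nodes` and the temporary sample file are made up for illustration.

```shell
# Hypothetical helper: delete every <node> that has no <tag k="name" .../> child.
keep_named_nodes() {
  xmlstarlet ed -d '//root/node[not(tag[@k="name"])]' "$1"
}

# Same content as file.xml above, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<root>
   <node user="user1">
      <tag k="name" v="name1"/>
    </node>
    <node user="user2">
      <tag k="network" v="nw1"/>
    </node>
   <node user="user3">
      <tag k="foo" v="bar"/>
    </node>
</root>
EOF

# Only run the edit when the tool is actually available.
if command -v xmlstarlet >/dev/null 2>&1; then
  keep_named_nodes "$sample"
else
  echo "xmlstarlet is not installed" >&2
fi
```

For this sample, only the user1 node should remain inside root.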
