I'm working with a WordPress XML dump and, for whatever reason, WordPress has exported every user in our database as an "author" of each post. To make the XML file easier to work with, I would like to remove all of the author nodes except for one.
Here's an example of what I have:
<rss version="2.0" xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:wp="http://wordpress.org/export/1.2/">
<wp:author>
<wp:author_id>35622</wp:author_id>
<wp:author_login>some_username_1</wp:author_login>
<wp:author_email>[email protected]</wp:author_email>
<wp:author_display_name><![CDATA[some_username_1]]></wp:author_display_name>
<wp:author_first_name><![CDATA[]]></wp:author_first_name>
<wp:author_last_name><![CDATA[]]></wp:author_last_name>
</wp:author>
<wp:author>
<wp:author_id>35290</wp:author_id>
<wp:author_login>my_unique_username</wp:author_login>
<wp:author_email>[email protected]</wp:author_email>
<wp:author_display_name><![CDATA[my_unique_username]]></wp:author_display_name>
<wp:author_first_name><![CDATA[]]></wp:author_first_name>
<wp:author_last_name><![CDATA[]]></wp:author_last_name>
</wp:author>
<wp:author>
<wp:author_id>35289</wp:author_id>
<wp:author_login>some_username_2</wp:author_login>
<wp:author_email>[email protected]</wp:author_email>
<wp:author_display_name><![CDATA[some_username_2]]></wp:author_display_name>
<wp:author_first_name><![CDATA[]]></wp:author_first_name>
<wp:author_last_name><![CDATA[]]></wp:author_last_name>
</wp:author>
<wp:author>
<wp:author_id>33404</wp:author_id>
<wp:author_login>some_username_3</wp:author_login>
<wp:author_email>[email protected]</wp:author_email>
<wp:author_display_name><![CDATA[some_username_3]]></wp:author_display_name>
<wp:author_first_name><![CDATA[]]></wp:author_first_name>
<wp:author_last_name><![CDATA[]]></wp:author_last_name>
</wp:author>
...and so on, for a few thousand more entries.
I would like to remove all of the nodes except for this one:
<wp:author>
<wp:author_id>35290</wp:author_id>
<wp:author_login>my_unique_username</wp:author_login>
<wp:author_email>[email protected]</wp:author_email>
<wp:author_display_name><![CDATA[my_unique_username]]></wp:author_display_name>
<wp:author_first_name><![CDATA[]]></wp:author_first_name>
<wp:author_last_name><![CDATA[]]></wp:author_last_name>
</wp:author>
I'm attempting to do this in a shell script, but I've never used xmlstarlet before and I'm not really sure where to start, so I would appreciate any help.
Update: the example now includes the root element of the data, and here is the solution that I found:
xmlstarlet ed -d "//wp:author[wp:author_id != '35290']" file.xml > out.xml