I have an XML file that contains numerous <Episode></Episode> elements, each with the following structure:

<Episode>
  <id>4195462</id>
  <Combined_episodenumber>8</Combined_episodenumber>
  <Combined_season>2</Combined_season>
  <DVD_chapter></DVD_chapter>
  <DVD_discid></DVD_discid>
  <DVD_episodenumber></DVD_episodenumber>
  <DVD_season></DVD_season>
  <Director>Jay Karas</Director>
  <EpImgFlag>2</EpImgFlag>
  <EpisodeName>Karl's Wedding</EpisodeName>
  <EpisodeNumber>8</EpisodeNumber>
  <FirstAired>2011-11-08</FirstAired>
  <GuestStars>Katee Sackhoff|Carla Gallo</GuestStars>
  <IMDB_ID></IMDB_ID>
  <Language>en</Language>
  <Overview>Karl Hevacheck, aka the Human Genius, gets married.</Overview>
  <ProductionCode>209</ProductionCode>
  <Rating>7.6</Rating>
  <RatingCount>20</RatingCount>
  <SeasonNumber>2</SeasonNumber>
  <Writer>Kevin Etten</Writer>
  <absolute_number></absolute_number>
  <filename>episodes/211751/4195462.jpg</filename>
  <lastupdated>1362547148</lastupdated>
  <seasonid>471254</seasonid>
  <seriesid>211751</seriesid>
</Episode>

I've figured out how to pull the value out of a single tag like so:

  value=$(grep -m 1 "<Rating>" path_to_file | sed 's/<.*>\(.*\)<\/.*>/\1/')

but I can't find a way to verify that I am looking at the correct episode, i.e. to check that this is the branch with <Combined_season>2</Combined_season> and <EpisodeNumber>8</EpisodeNumber>, before saving the values of specific elements. I know this can somehow be done using a combination of sed and awk, but I can't seem to figure it out. Any help on how I can do this would be greatly appreciated.

  • Use a proper XML parser, not sed or awk! Commented May 7, 2013 at 10:18
  • @sudo_O This function is part of a much larger bash program, so I was hoping I could just use one of these... why is this such a bad idea? Commented May 7, 2013 at 10:19
  • You can still call your XML parser from your bash script. It's a bad idea because XML is a structured format, while sed and awk typically work with line-oriented files. You will just give yourself a headache by using the wrong tool for the job. Commented May 7, 2013 at 10:24
  • @sudo_O Simple enough, I'll just use PHP then... if you'd like, copy this to an answer and I'll accept it. Commented May 7, 2013 at 10:26

1 Answer

Use a proper XML parser, not sed or awk. You can still call your XML parser from your bash script, just like you would with sed or awk. It's a bad idea to use sed or awk because XML is a structured format, while sed and awk typically work with line-oriented files. You will just give yourself a headache by using the wrong tool for the job. I suggest using a dedicated tool or a language such as PHP, Python or Perl (or any other language not starting with p) that has libraries for parsing XML.
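
For illustration, here is a minimal sketch of the parser approach, assuming Python 3 and its standard-library xml.etree.ElementTree module (PHP or Perl would work just as well). The wrapping <Data> root element, the values of the first sample episode, and the function name episode_field are made up for the example; only the second episode's values come from the question.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample. The <Data> root and the first episode are
# invented for illustration; the real file's root element is unknown.
SAMPLE = """<Data>
  <Episode>
    <Combined_season>2</Combined_season>
    <EpisodeNumber>7</EpisodeNumber>
    <Rating>7.1</Rating>
  </Episode>
  <Episode>
    <Combined_season>2</Combined_season>
    <EpisodeNumber>8</EpisodeNumber>
    <Rating>7.6</Rating>
  </Episode>
</Data>"""


def episode_field(xml_text, season, episode, field):
    """Return the text of `field` for the matching episode, or None."""
    root = ET.fromstring(xml_text)
    # iter() walks all descendants, so the root element's name does not matter.
    for ep in root.iter("Episode"):
        if (ep.findtext("Combined_season") == str(season)
                and ep.findtext("EpisodeNumber") == str(episode)):
            return ep.findtext(field)
    return None


if __name__ == "__main__":
    print(episode_field(SAMPLE, 2, 8, "Rating"))
```

Because the season and episode number are checked on the same <Episode> element before any value is read, there is no risk of picking up the <Rating> of a different episode. A script like this could be called from bash with command substitution, the same way the grep/sed pipeline in the question is.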
