13

I need to parse a text file that contains lines like this:

1|Song Title|Release date||"ignore me"|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0

which is the song number, followed by the song title, the release date, a website that I need to ignore, and a series of 0's and 1's which could represent a vector of genres.

I need a way to separate this data and ignore the field that holds the website, while at the same time creating a new instance of a Song object which has (int songNumber, string songTitle, vector* genres, string releaseDate).

Thanks!

3 Answers

19

The C++ String Toolkit Library (StrTk) has the following solution to your problem:

#include <string>
#include <deque>
#include "strtk.hpp"

struct song_type
{
   unsigned int id;
   std::string release_date;
   std::string url;
   char genre[8];
};

strtk_parse_begin(song_type)
 strtk_parse_type(id)
 strtk_parse_type(release_date)
 strtk_parse_type(url)
 strtk_parse_type(genre[0])
 strtk_parse_type(genre[1])
 strtk_parse_type(genre[2])
 strtk_parse_type(genre[3])
 strtk_parse_type(genre[4])
 strtk_parse_type(genre[5])
 strtk_parse_type(genre[6])
 strtk_parse_type(genre[7])
strtk_parse_end()

int main()
{
   std::deque<song_type> song_list;

   strtk::for_each_line("songs.txt",
                        [&song_list](const std::string& line)
                        {
                           song_type s;
                           if (strtk::parse(line,"|",s))
                              song_list.push_back(s);
                        });

   return 0;
}

More examples can be found here.

4
  • Define a class Song that holds the data in the form you require, as you stated above
  • implement Song::operator>>(std::istream&) to populate the class by parsing the above data from an input stream (the stream must be non-const, since reading modifies it)
  • read the file line by line using std::getline
  • for each line, construct a std::istringstream and then use your operator>> to fill in the fields in an instance of Song.

It's straightforward to tokenize the stringstream with the '|' character as a separator, which would be the bulk of the work.

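#include <iostream>
#include <sstream>
#include <string>

int main()
{
   std::string token;
   std::string line("1|Song Title|Release date||\"ignore me\"|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0");

   // Split the line on '|' and print each token on its own line.
   std::istringstream iss(line);
   while ( std::getline(iss, token, '|') )
   {
      std::cout << token << std::endl;
   }
   return 0;
}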

Code lifted from here.
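
As a rough sketch of how the tokenizing loop above could fill an object directly instead of printing: the Song struct, its field names, and parse_song_line are hypothetical names chosen for illustration, and the field layout (number, title, release date, one empty field, website, then flags) is assumed from the sample line in the question.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical Song type; the field names follow the question, not an existing API.
struct Song {
    int number = 0;
    std::string title;
    std::string release_date;
    std::vector<int> genres;  // one 0/1 flag per genre, in file order
};

// Parses one '|'-separated line, assuming the layout of the sample line:
//   number|title|release date|<empty>|website|flag|flag|...
// Returns false if a leading field is missing, the number does not parse,
// or a flag is something other than "0" or "1". On failure, out is untouched.
bool parse_song_line(const std::string& line, Song& out) {
    std::istringstream iss(line);
    std::string token;
    Song song;

    if (!std::getline(iss, token, '|')) return false;
    try {
        song.number = std::stoi(token);
    } catch (const std::exception&) {
        return false;
    }
    if (!std::getline(iss, song.title, '|')) return false;
    if (!std::getline(iss, song.release_date, '|')) return false;

    // Skip the empty field and the website field.
    if (!std::getline(iss, token, '|')) return false;
    if (!std::getline(iss, token, '|')) return false;

    // Every remaining token is treated as a genre flag.
    while (std::getline(iss, token, '|')) {
        if (token != "0" && token != "1") return false;
        song.genres.push_back(token == "1" ? 1 : 0);
    }
    out = song;
    return true;
}
```

Calling parse_song_line once per line read with std::getline, and pushing each successfully parsed Song into a std::vector<Song>, would be one way to load the whole file. Lines with Windows line endings would need the trailing '\r' stripped first, since the last flag would otherwise fail the "0"/"1" check.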

2 Comments

how do I know that the 1 and the Song title are separated from each other for example?
@Edward - see EDIT on how to parse each input line into tokens, each of which is std::string. You could modify this to put the tokens directly into the song number int and your vector of genres if your input is known to be well-formed.
3

You'd typically do this by overloading operator>> for the type of object:

#include <istream>
#include <string>

struct song_data {
    std::string number;
    std::string title;
    std::string release_date;
    // ...
};

std::istream &operator>>(std::istream &is, song_data &s_d) {        
    std::getline(is, s_d.number, '|');
    std::getline(is, s_d.title, '|');
    std::getline(is, s_d.release_date, '|');
    std::string ignore;
    std::getline(is, ignore, '|');
    // ...
    return is;
}

Depending on whether there are more fields you might want to ignore (especially trailing fields) it can sometimes be more convenient to read the entire line into a string, then put that into an istringstream, and parse the individual fields from there. In particular, this can avoid extra work reading more fields you don't care about, instead just going on to the next line when you've parsed out the fields you care about.
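
A minimal sketch of that line-at-a-time variant, assuming one song per line and that only the first three fields matter. The song_data here is a cut-down copy for illustration, and marking a short line as a stream failure is a design choice of this sketch rather than anything the file format requires.

```cpp
#include <istream>
#include <sstream>
#include <string>

struct song_data {
    std::string number;
    std::string title;
    std::string release_date;
};

// Reads exactly one line per song, then parses only the leading fields.
// Whatever follows the release date stays unread in the temporary
// istringstream and is discarded along with it.
std::istream& operator>>(std::istream& is, song_data& s_d) {
    std::string line;
    if (!std::getline(is, line)) {
        return is;  // end of file or read error: s_d is left untouched
    }

    std::istringstream fields(line);
    song_data parsed;
    if (std::getline(fields, parsed.number, '|') &&
        std::getline(fields, parsed.title, '|') &&
        std::getline(fields, parsed.release_date, '|')) {
        s_d = parsed;
    } else {
        is.setstate(std::ios::failbit);  // line had fewer than three fields
    }
    return is;
}
```

Because the operator always consumes a whole line, a following extraction starts cleanly at the next song regardless of how many trailing fields the previous line had.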

Edit: I would probably handle the genres by adding a std::vector<bool> genres; member and reading the 0's and 1's into that vector. I'd then add an enumeration specifying which genre each position in the vector denotes, so (for example) testing whether a particular song is classified as "country" would look something like:

enum { jazz, country, hiphop, classic_rock, progressive_rock, metal /*, ... */};

if (songs[i].genres[country])
    process_country(songs[i]);   // placeholder name, like process_hiphop below

if (songs[i].genres[hiphop])
    process_hiphop(songs[i]);

Of course, the exact genres and their order are something I don't know, so I just made up a few possibilities -- you'll (obviously) have to use the genres (and order) defined for the file format.
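
Reading the flags into such a vector might look roughly like the following. The genre names and their order are made up here as well, parse_genres is a hypothetical helper, and the input is assumed to be just the flag portion of a line (for example "0|1|1|0").

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical genre order; the real order is defined by the file format.
// genre_count is a sentinel equal to the number of genres listed before it.
enum genre : std::size_t { jazz, country, hiphop, classic_rock, genre_count };

// Converts the flag portion of a line ("0|1|1|0") into one bool per genre.
// Missing trailing flags default to false; flags beyond genre_count are ignored.
std::vector<bool> parse_genres(const std::string& flags) {
    std::vector<bool> genres(genre_count, false);
    std::istringstream iss(flags);
    std::string token;
    for (std::size_t i = 0; i < genre_count && std::getline(iss, token, '|'); ++i) {
        genres[i] = (token == "1");
    }
    return genres;
}
```

Sizing the vector from the enumeration keeps indexing such as genres[country] in range even when a line carries fewer flags than expected.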

As far as dealing with hundreds of songs goes, the usual way would be (as implied above) to create something like std::vector<song_data> songs; and then, using a stream extraction operator like the one above, copy the data from the file to the vector:

std::copy(std::istream_iterator<song_data>(infile),
          std::istream_iterator<song_data>(),
          std::back_inserter(songs));

If you're likely to look up songs primarily by name (for one example), you might prefer to use std::map<std::string, song_data> songs. This will make it easy to do something like:

songs["new song"].release_date = Today;
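
One detail worth knowing with std::map is that operator[] inserts a default-constructed entry when the key is missing, so it suits updates better than pure lookups. The sketch below shows one way to build the index and to look a title up without inserting; index_by_title and find_song are hypothetical helper names, and song_data is a cut-down copy for illustration.

```cpp
#include <map>
#include <string>
#include <vector>

struct song_data {
    std::string number;
    std::string title;
    std::string release_date;
};

// Builds a title -> song index. If two songs share a title, the later one wins.
std::map<std::string, song_data> index_by_title(const std::vector<song_data>& songs) {
    std::map<std::string, song_data> by_title;
    for (const song_data& s : songs) {
        by_title[s.title] = s;
    }
    return by_title;
}

// Looks a song up without inserting; returns nullptr if the title is unknown.
const song_data* find_song(const std::map<std::string, song_data>& by_title,
                           const std::string& title) {
    auto it = by_title.find(title);
    return it == by_title.end() ? nullptr : &it->second;
}
```

If duplicate titles are possible in the file, a std::multimap or a key that combines title and number may be a better fit than letting later entries overwrite earlier ones.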

4 Comments

Yes I think the free operator>> is a better bet than using a member function
Right now I have a song object with 4 params: the number, title, release date and a vector of genres. Doing it this way, how would I create a new instance of the song object, and is there a good way to handle the genres of 1's and 0's? Also, would this be okay as is if there were hundreds of lines like this, or would it have to be edited?
@Edward - if the list of genre flags is the last thing on your line, you could just read until you run out of data on that line, appending to a vector of genres as you go using push_back. But it's not clear to me what structure is implied in that list of 0s and 1s: is the flag for a given genre always in the same place? There's no issue with the size of your data provided it does not exhaust your computer's memory; correctly constructed loops for file input and for genre parsing will work fine.
It's supposed to be that if the [0] element is a 1 then it would be country, for example, and if position [1] is a 1 then it would be something like rock. A song can be both rock and country at the same time, and so on. There are 18 genres it can be. And what is the difference between what you have as a struct and what I have as a movie object?
