I have a binary file to read from and inside the file are non-fixed lengths of data but they do have start and stop sequences.

Start Sequence is 0x1B 0x5B 0x30 0x48
Stop Sequence is 0x1b 0x5B 0x31 0x48

This particular file has 28 entries in it, though the number of entries can vary from file to file.

I've read the binary file into a vector to the size of the file

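// open positioned at the end so tellg() reports the file size
ifstream datafile("myfile.bin", ios_base::in|ios_base::binary|ios_base::ate);
vector<char> buff;
streamsize size = datafile.tellg();
buff.resize(size);

datafile.seekg(0, ios_base::beg); // rewind before reading
datafile.read(buff.data(), size);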

Now I've tried to iterate over the vector byte by byte (as that is how it is stored in the vector, right?), but that's not quite what I want.

It would be nice to read over the vector writing the data to another (temp) variable and then stop writing to it when I see the Stop Sequence. Then continue on with the rest of the vector, writing to another variable until the next Stop Sequence is seen etc. Like writing to a vector<vector<char>> ?

Below is the iteration I do for byte-by-byte.

for (vector<char>::iterator it = buff.begin(); it != buff.end(); ++it)
{
  if (*it == 0x1B)
  {
    // found ESC char
  }
}

How might I set up reading from the binary file, writing the bytes up until the Stop Sequence and then repeating for the rest of the file?

  • If I understand correctly you want to read the whole file into memory and then extract the bytes between each set of start/end markers into their own buffers? Sounds easy enough, what part of it are you having trouble with? Commented Jun 20, 2018 at 0:20

2 Answers

I wrote some sample code that scans a given vector of bytes and stores the runs of bytes found in between the start/stop sequences into a vector of vectors of bytes.

Haven't really tested it, but it does compile :-)

void findSequences( vector< char >& buff, vector< vector< char > > *dataRuns )
{
  char startSequence[] = { 0x1B, 0x5B, 0x30, 0x48 };
  char endSequence[] = { 0x1b, 0x5B, 0x31, 0x48 };

  bool findingStart = true;

  vector< char >::iterator it = buff.begin();
  vector< char >::iterator itEnd = buff.end();
  while ( it != itEnd )
  {
    vector< char >::iterator findIt;
    if ( findingStart )
      findIt = search( it, itEnd, startSequence, startSequence + 4 );
    else
      findIt = search( it, itEnd, endSequence, endSequence + 4 );

    if ( findIt != itEnd )
    {
      if ( findingStart )
      {
        it = findIt + 4;
        findingStart = false;
      }
      else
      {
        dataRuns->push_back( vector< char >( it, findIt ) );
        it = findIt + 4;
        findingStart = true;
      }
    }
    else
    {
      // failed to find a start or stop sequence

      break;
    }
  }
}
1 Comment

This works nicely. I modified it slightly because I still required the start/end sequences in the vector. Thanks!!
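
For anyone who needs the same variant, here is a rough sketch of how each stored run could keep its start/stop sequences. The name findSequencesWithMarkers is hypothetical and this is not the commenter's actual modification; it assumes entries are never nested and simply ignores an entry that has no stop sequence.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Hypothetical helper (not from the answer above): like findSequences, but
// every stored run includes its own start and stop markers.
std::vector<std::vector<char>> findSequencesWithMarkers(const std::vector<char>& buff)
{
    static const std::array<char, 4> startSeq = { 0x1B, 0x5B, 0x30, 0x48 };
    static const std::array<char, 4> stopSeq  = { 0x1B, 0x5B, 0x31, 0x48 };

    std::vector<std::vector<char>> runs;
    auto it = buff.begin();
    const auto end = buff.end();

    while (it != end)
    {
        // Locate the next start marker; bytes before it are skipped.
        auto first = std::search(it, end, startSeq.begin(), startSeq.end());
        if (first == end)
            break;

        // Locate the stop marker that follows this start marker.
        auto stop = std::search(first + 4, end, stopSeq.begin(), stopSeq.end());
        if (stop == end)
            break; // assumption: an unterminated entry is ignored

        // Copy the range [start marker, end of stop marker) as one run.
        auto last = stop + 4;
        runs.emplace_back(first, last);
        it = last;
    }
    return runs;
}
```

Returning the result by value instead of filling an out-parameter is only a stylistic choice for this sketch; the search logic is the same std::search approach as in the answer.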
The format looks erroneous to me. What if your data contains the begin/end sequences? How do you encode them?

You rely too much on the STL. You don't have to read the input into a vector. Write a function to extract the tokens from the stream using istream::get and istream::unget. This will likely be the most complex function you have to write. The tokens your function must return are:

  • data-begin: your begin escape sequence.
  • data: a data byte.
  • data-end: your end escape sequence.
  • done: end of stream.

This function will make data extraction trivial:

bool reader_t::get_data( std::vector< char >& d ) // returns false on end of stream
{
  d.clear();

  get_token();

  if ( _tok == done )
    return false; // end of stream

  if ( _tok != data_beg )
    throw "data begin expected";

  while ( get_token() == data )
    d.push_back( _c );

  if ( _tok != data_end )
    throw "data end expected";

  return true;
}

Processing the whole stream is trivial, too:

int main()
{
  std::ifstream is { R"(d:\temp\test.bin)", std::ios::binary }; // binary mode: no newline translation
  if ( !is )
    return 0;

  reader_t r { is };
  std::vector< char > v;
  try
  {
    while ( r.get_data( v ) )
      ;// process v;
  }
  catch ( const char* e )
  {
    std::cout << e;
  }

  return 0;
}

This is what your reader should look like:

class reader_t
{
  std::istream& _is;

  enum token_t
  {
    data_beg,
    data_end,
    data,
    done
  };

  token_t _tok;
  char _c;

  token_t get_token();

public:

  reader_t( std::istream& a_is );
  bool get_data( std::vector< char >& d ); // returns false on end of stream
};

Here's a demo written in a hurry - no warranty.

1 Comment

I would hope the start/end sequences would not be part of the data itself, but it is also not my protocol that is being used. Thanks for the explanation of code example!
