If by "streams" you're talking about C++ iostreams, they're already buffered at a reasonable size and the cost of inserting into that buffer is very low. The standard library is mature; beating it at its own game is very hard. and you'll need exploitable specifics you can take advantage of to get anything worthwhile. That said:
How big your output buffer should be (with the degenerate case being a single-element buffer, i.e. no buffering) depends on the overhead of a buffer flush. That overhead has a fixed cost and a size-related cost -- and the size-related part isn't simply linear, given cache effects. The more expensive the fixed overhead, the more a bigger buffer helps amortize it. For instance, if a buffer flush can trigger zero-copy I/O, it can be dramatically cheaper to buffer an entire largish serialization; but if the output operation is going to copy from your buffer, buffer sizes around a quarter of your L1 cache size are a decent choice when the fixed cost of a flush is low.
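For concreteness, here's a minimal sketch of steering that buffer size on a plain ofstream via pubsetbuf. The 8 KiB figure is just the illustrative "quarter of a typical 32 KiB L1d" choice from above, and the file name and item count are made up -- measure before trusting either number. Note that setbuf on a filebuf is implementation-defined except in the "no buffering" case, so call it before opening the file (and before any I/O) for the common implementations to honor it.

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

int main() {
    constexpr std::size_t kBufSize = 8 * 1024;  // ~1/4 of a typical 32 KiB L1d (illustrative)
    std::vector<char> buf(kBufSize);

    std::ofstream out;
    // Install our buffer before open(): pubsetbuf after I/O has started is
    // implementation-defined for filebufs.
    out.rdbuf()->pubsetbuf(buf.data(), static_cast<std::streamsize>(buf.size()));
    out.open("items.bin", std::ios::binary);

    for (int i = 0; i < 1'000'000; ++i) {
        // Each insert is a cheap copy into buf; the flush happens once per 8 KiB.
        out.write(reinterpret_cast<const char*>(&i), sizeof(i));
    }
    return 0;  // the ofstream destructor flushes the tail
}
```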
None of this matters at all unless the time serialization takes puts it on a critical path, i.e. makes it something a user is waiting on -- and for something like this, that's hard to produce unless you're talking about millions of items and up. Even then, if you haven't already worked on it, it's almost certain there's more waste in how you produce an individual serialization than in the buffering scheme you choose -- and even then, never forget what you're racing. Is it I/O bandwidth? Sending your serialized stream through a low-grade compressor could easily save more time than anything you could do up front.
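If I/O bandwidth really is what you're racing, a cheap compression pass over the serialized bytes can pay for itself. A hedged sketch, assuming zlib is on hand (link with -lz) and that the serialization already sits in one contiguous buffer; the 8 MiB size and the names are illustrative only:

```cpp
#include <zlib.h>
#include <cstdio>
#include <string>
#include <vector>

int main() {
    // Stand-in for a largish serialization destined for a slow link.
    std::string serialized(8 * 1024 * 1024, 'x');

    uLongf destLen = compressBound(serialized.size());
    std::vector<Bytef> compressed(destLen);

    // Z_BEST_SPEED: a "low-grade" level-1 pass -- cheap on CPU, but often
    // enough to shrink the bytes you push through the I/O bottleneck.
    int rc = compress2(compressed.data(), &destLen,
                       reinterpret_cast<const Bytef*>(serialized.data()),
                       serialized.size(), Z_BEST_SPEED);
    if (rc != Z_OK) return 1;

    std::printf("%zu -> %lu bytes\n", serialized.size(), destLen);
    return 0;
}
```

Whether that wins depends entirely on the ratio of your CPU headroom to your link speed, so benchmark it against the uncompressed path before committing.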