I have a stream of binary data, and I assume no prior knowledge about the patterns to expect in it. I want to encode it online as a sequence of symbols.
Each symbol can stand for a run of raw bits or for a sequence of other symbols, hence the representation is hierarchical.
The output should take little space, but it does not need to be optimal. The algorithm does need to be online: as more input arrives, the representation needs to adapt. Approximation is allowed, and very desirable if a parameter controls the trade-off between the accuracy of the representation and the update runtime and space usage.
Example: 0001110011001100 => ABCDCDC => ABEEC
000 => A
111 => B
00 => C
11 => D
CD => E
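
To make the hierarchy concrete, here is a minimal Python sketch of how I would expect decoding to work with the rule table above. The names `RULES` and `expand` are made up for illustration, and this covers only the decoding side, not the online construction I am asking about.

```python
# Illustrative rule table taken from the example; the names are hypothetical.
RULES = {"A": "000", "B": "111", "C": "00", "D": "11", "E": "CD"}


def expand(symbols: str, rules: dict[str, str]) -> str:
    """Recursively expand symbols until only raw bits ('0'/'1') remain."""
    out = []
    for s in symbols:
        if s in rules:
            # A symbol may map to bits or to other symbols, so recurse.
            out.append(expand(rules[s], rules))
        else:
            # Anything without a rule is treated as a raw bit.
            out.append(s)
    return "".join(out)


# Both levels of the hierarchy should decode to the same bit string.
assert expand("ABCDCDC", RULES) == expand("ABEEC", RULES)
```

The second-level string `ABEEC` is shorter only because `E` reuses `C` and `D`, which is the kind of reuse I would like the online algorithm to discover by itself.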