I'm building a script that reads 381 bytes from a file and attempts to decode them. I am interested in 348 of those bytes, which I am labelling "presets". Each 3-byte chunk of the presets ByteString can be decoded into a single Int16, and "values" below are the 116 Int16 I am interested in...

decodeFile :: FilePath -> IO [Maybe PresetValue]
decodeFile filename =
  do h <- openFile (dir ++ filename) ReadMode
     header  <- h `BL.hGet` 32
     presets <- h `BL.hGet` 348
     f7      <- h `BL.hGet` 1
     let values = Bin.runGet getPresets presets
     hClose h
     return values

getPresets = do
  empty <- Bin.isEmpty
  if empty
    then return []
    else do p  <- getAndDecodeTriple
            ps <- getPresets
            return (p:ps)

getAndDecodeTriple = do
  b1 <- Bin.getWord8
  b2 <- Bin.getWord8
  b3 <- Bin.getWord8
  return $ decode (b1,b2,b3)

The problem I am having is decoding a 3-byte chunk, given that I know how it was encoded in C++.

Here is the C++ encoding:

void SysexReader::sx_encode(int val, char* dest)
{
    char encode;
    
    // Encode Byte 1 (4 bits of payload)
    encode = 0x40 | ((val >> 12) & 0x000F);
    *dest++ = encode;
    
    // Encode Byte 2 (6 bits of payload)
    encode = (val >> 6) & 0x003F;
    *dest++ = encode;
    
    // Encode Byte 3 (6 bits of payload)
    encode = val & 0x003F;
    *dest = encode;
}

Here is the C++ encoding translated to Haskell...

type Encoding a  = (a,a,a)
type PresetValue = Int16

encode :: Integral a => PresetValue -> Encoding a
encode val =
  let f = fromIntegral
  in (f $ enc1 val, f $ enc2 val, f $ enc3 val)
  where
    enc1 = or40 . and000F . (flip shiftR 12)
      where and000F = (0x000F .&.)
            or40    = (0x40 .|.)
    enc2 = enc3 . flip shiftR 6
    enc3 = (0x003F .&.)

My attempt at decoding uses the fact that I have the encoding procedure, and it assumes that a PresetValue can only be in the range 0 to 127:

--    (3 Sysex Bytes) -> (Preset Value)   --
-------------------------------------------------------
decode :: Integral a => (a,a,a) -> Maybe PresetValue
decode encoded =
  case match of
    [value] -> Just value
    []      -> Nothing  --error "encode not surjective"
    many    -> error "encode not injective"
  where
    match = filter (\x -> encode x == encoded) [0..127]

Unfortunately I can't decode all values, as you can see from the 116-entry list below containing Nothing in many places.

[Just 14,Just 84,Just 97,Just 117,Just 114,Just 117,Just 115,Just 32,Just 73,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Nothing,Nothing,Just 0,Nothing,Nothing,Nothing,Just 0,Nothing,Nothing,Just 0,Just 0,Nothing,Nothing,Just 0,Just 1,Nothing,Just 0,Nothing,Nothing,Just 0,Just 0,Just 0,Just 1,
Just 0,Just 0,Nothing,Just 5,Just 0,Just 1,Just 0,Just 0,Just 0,Nothing,Nothing,
Just 3,Just 2,Just 0,Just 0,Nothing,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Nothing,Nothing,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Nothing]

What am I doing wrong? I feel like it must be the types I am using to represent each chunk from the incoming file. Or maybe I'm losing information using fromIntegral.

I've been a developer for a while and have never posted a question on here and always fought through for an answer, but I'm really lost on this one. Thanks.

  • Plug a `mod` 256 somewhere? Commented May 15, 2021 at 21:11
  • I just tried that. Did not work; I got a bunch of Nothing. Why might that work? That might lead me to an answer. Commented May 15, 2021 at 21:43

1 Answer

It might be better to use openBinaryFile in place of openFile. This shouldn't make a difference here, since I believe hGet ignores whether the file has been opened in text or binary mode, but it's good practice.
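As a minimal sketch of that suggestion, the three reads from the question could be wrapped in withBinaryFile, which opens the file in binary mode and closes the handle afterwards. The name readSections is hypothetical; the sizes 32, 348, and 1 are the ones used in the question.

```haskell
import qualified Data.ByteString.Lazy as BL
import System.IO (IOMode (ReadMode), withBinaryFile)

-- | Read the 32-byte header, the 348 preset bytes and the 1-byte trailer.
-- 'readSections' is a hypothetical name; the sizes come from the question.
readSections :: FilePath -> IO (BL.ByteString, BL.ByteString, BL.ByteString)
readSections path =
  withBinaryFile path ReadMode $ \h -> do
    header  <- BL.hGet h 32   -- bytes 0..31
    presets <- BL.hGet h 348  -- 116 triples of encoded presets
    trailer <- BL.hGet h 1    -- the final byte (called f7 in the question)
    return (header, presets, trailer)
```

If the file is shorter than expected, hGet returns fewer bytes rather than failing, so checking the lengths of the results is a cheap sanity test.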

It would also be better to use a Word16 in place of your Int16. The C++ code takes an int but packs exactly 16 bits of it (4 + 6 + 6) into the three bytes, so the decoded value is most naturally treated as an unsigned 16-bit quantity. Again, if you really are only dealing with presets in the range [0..127] it shouldn't matter, but it seems like good practice.
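To illustrate why the signedness matters once the top bit of the 16-bit payload is set, here is a small sketch. The helper pack16 and the two constants are hypothetical names used only for this illustration.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Int (Int16)
import Data.Word (Word16)

-- | Pack a 4-bit, a 6-bit and a 6-bit field into 16 bits, most significant
-- field first. Hypothetical helper mirroring the layout of the encoder.
pack16 :: Word16 -> Word16 -> Word16 -> Word16
pack16 hi mid lo = (hi `shiftL` 12) .|. (mid `shiftL` 6) .|. lo

-- All 16 payload bits set, read as an unsigned value: 65535.
asUnsigned :: Word16
asUnsigned = pack16 0xF 0x3F 0x3F

-- The same bit pattern reinterpreted as a signed value: -1.
asSigned :: Int16
asSigned = fromIntegral asUnsigned
```

Any payload whose first 4-bit field is 8 or more lands in the negative half of Int16, which is easy to misread as corruption.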

There's nothing obviously wrong with your code that I can see, but it's pretty much impossible to duplicate your problem without access to the input file. I might suggest using a better implementation of decode:

decode :: (Word8, Word8, Word8) -> Maybe PresetValue
decode (a,b,c)
  |  0x40 <= a && a <= 0x4f
  && b <= 0x3f && c <= 0x3f
  = Just $ (fromIntegral a .&. 0xf) `shiftL` 12 .|. fromIntegral b `shiftL` 6 .|. fromIntegral c
decode _ = Nothing

which handles all possible encoded preset values from 0 to 65535 (provided PresetValue is a Word16; with Int16, values above 32767 come out negative). If you still get Nothing values in your decode, then the encoded file is probably corrupt.
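One way to gain confidence in a decoder like this is to check that it inverts the encoder over the whole 16-bit range. The following is a sketch under the assumption that values are represented as Word16; encode16, decode16, and roundTripsEverywhere are hypothetical names, with encode16 following the bit layout of the C++ sx_encode.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word16, Word8)

-- | Encoder with the same bit layout as the C++ sx_encode: 4, 6 and 6 bits.
encode16 :: Word16 -> (Word8, Word8, Word8)
encode16 v =
  ( 0x40 .|. fromIntegral ((v `shiftR` 12) .&. 0x0F)
  , fromIntegral ((v `shiftR` 6) .&. 0x3F)
  , fromIntegral (v .&. 0x3F)
  )

-- | Decoder that rejects any triple the encoder could not have produced.
decode16 :: (Word8, Word8, Word8) -> Maybe Word16
decode16 (a, b, c)
  | 0x40 <= a && a <= 0x4F && b <= 0x3F && c <= 0x3F =
      Just ((fromIntegral a .&. 0x0F) `shiftL` 12
              .|. fromIntegral b `shiftL` 6
              .|. fromIntegral c)
  | otherwise = Nothing

-- | True when decoding undoes encoding for every 16-bit value.
roundTripsEverywhere :: Bool
roundTripsEverywhere =
  all (\v -> decode16 (encode16 v) == Just v) [minBound .. maxBound]
```

A brute-force search restricted to [0..127] cannot find a match for a triple such as (0x40, 0x02, 0x00), whereas decode16 maps it to 128.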

It looks like the first bad value is at index 19 of the list, corresponding to bytes 57-59 (0x39-0x3B) of the presets, or accounting for the 32-byte header, bytes 89-91 (0x59-0x5B) of the file. It might be helpful to open the file in a hex editor and see what three bytes are at that offset that are giving you trouble.
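The offset arithmetic and the search for suspicious triples can also be done in code. This is only a sketch: all names are hypothetical, the 32-byte header size is taken from the question, and wellFormed flags only triples that fail the range checks, so it will not flag a valid encoding of a value above 127.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)

-- | Size of the header that precedes the preset bytes (from the question).
headerSize :: Int
headerSize = 32

-- | File offsets of the three bytes that make up preset number i (0-based).
tripleOffsets :: Int -> (Int, Int, Int)
tripleOffsets i = (start, start + 1, start + 2)
  where
    start = headerSize + 3 * i

-- | Split the preset bytes into triples; leftover bytes are dropped.
triples :: BL.ByteString -> [(Word8, Word8, Word8)]
triples = go . BL.unpack
  where
    go (a : b : c : rest) = (a, b, c) : go rest
    go _                  = []

-- | True when the triple is something the encoder could have produced.
wellFormed :: (Word8, Word8, Word8) -> Bool
wellFormed (a, b, c) = 0x40 <= a && a <= 0x4F && b <= 0x3F && c <= 0x3F

-- | Index and raw bytes of every triple that fails the range checks.
badTriples :: BL.ByteString -> [(Int, (Word8, Word8, Word8))]
badTriples presets =
  [ (i, t) | (i, t) <- zip [0 ..] (triples presets), not (wellFormed t) ]
```

For example, tripleOffsets 19 gives (89, 90, 91), matching the offsets worked out by hand.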


3 Comments

A better implementation (of decode) is a kind understatement. The only thing I changed was to add a where f = fromIntegral in order to get the formatting right. It turns out some values were indeed way above 127.
Is there a chance the endianness assumed in the solution above would be different than the endianness used in the C++ implementation and could it affect the result? If so, how could the above solution be modified to take endianness into account?
Endianness shouldn't play a role here. The C++ implementation writes out three consecutive bytes, containing bit fields of size 4, 6, and 6 starting with the most significant bits, and the Haskell code reads it back in the same way, using the endian-free getWord8 function. If you were getting multibyte integers directly, there could be a potential endian issue, but not with getWord8.
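To make the distinction concrete, here is a small sketch using Data.Binary.Get on a two-byte sample. The sample bytes and the names are made up for illustration; they are not taken from the file in the question.

```haskell
import Data.Binary.Get (getWord16be, getWord16le, getWord8, runGet)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word16, Word8)

-- | Two arbitrary bytes used only for this illustration.
sample :: BL.ByteString
sample = BL.pack [0x12, 0x34]

-- Reading byte by byte gives the bytes in file order on any machine.
bytewise :: (Word8, Word8)
bytewise = runGet ((,) <$> getWord8 <*> getWord8) sample  -- (0x12, 0x34)

-- Reading a multibyte integer forces a choice of byte order.
bigEndian, littleEndian :: Word16
bigEndian    = runGet getWord16be sample  -- 0x1234
littleEndian = runGet getWord16le sample  -- 0x3412
```

Because the decoder in the answer only ever consumes single bytes and combines them with explicit shifts, the byte order is fixed by the code rather than by the platform.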
