I think my solution is more complex than it should be, but I'm not familiar enough with Rust yet to know a better way. Below is the only solution I got working after much trial and error.
Context
- Read in an XML file that describes the record structure of a binary file. This step is skipped in the working code below and is simply simulated; see the comments in the code blocks.
- One of the fields each record provides is a data_type description, which is stored as a &str or String depending on how the Rust struct holding the records is defined.
- Read the records from the binary file, using the XML description to decide which Rust type each parsed &[u8] is converted into so that it can be used later. This data is also simulated in the code below, since parsing the binary file is not the focus.
Example of a Simplified XML Description File
Assume many more records and many more data types. This is not used in the working code.
use quick_xml;

let xml_desc = r#"
<Records>
    <RecordDesc>
        <name>Single</name>
        <number>1</number>
        <location unit="byte">0</location>
        <data_type>IEEE754LSBSingle</data_type>
        <length unit="byte">4</length>
    </RecordDesc>
    <RecordDesc>
        <name>Double</name>
        <number>1</number>
        <location unit="byte">5</location>
        <data_type>IEEE754LSBDouble</data_type>
        <length unit="byte">8</length>
    </RecordDesc>
</Records>
"#;

// quick_xml/serde used to get that string of xml into the rust records
#[derive(Serialize, Deserialize, Debug)]
pub struct Records {
    pub records: Vec<Record>,
}

#[derive(Serialize, Deserialize, Debug)]
pub struct Record {
    ...,
    pub data_type: &str, // could also be String, but this example doesn't use this struct
    ...,
}

// For each record use data_type to cast into type rust can use
Working Code
First Macro
Creates functions that convert the &mut &[u8] into specific Rust equivalent types. Example output shown in first_example.
macro_rules! type_cast_function {
    ($func_name:ident, $endian:ident, $output_type:ty ) => {
        fn $func_name(input: &mut &[u8]) -> $output_type {
            let (int_bytes, _) = input.split_at(std::mem::size_of::<$output_type>());
            <$output_type>::$endian(int_bytes.try_into().unwrap())
        }
    };
}
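One thing worth noting about the generated function: it takes &mut &[u8] but discards the second half of split_at, so the caller's slice is never advanced. If the intent is to read several fields in a row from one buffer, a variant along these lines might be closer to what is wanted. This is only a sketch; read_le_f32_advancing is a hypothetical name, and returning Option instead of panicking on short input is an assumption about the desired behaviour.

```rust
use std::convert::TryInto;

/// Hypothetical variant of the generated reader: decodes a little-endian f32
/// from the front of `input` and advances `input` past the consumed bytes.
/// Returns None (and leaves `input` untouched) if fewer than 4 bytes remain.
fn read_le_f32_advancing(input: &mut &[u8]) -> Option<f32> {
    const N: usize = std::mem::size_of::<f32>();
    if input.len() < N {
        return None;
    }
    let (head, rest) = input.split_at(N);
    // Move the caller's cursor forward so the next read starts after this field.
    *input = rest;
    Some(f32::from_le_bytes(head.try_into().ok()?))
}
```

Calling it twice on the same cursor then yields two consecutive values, and a third call on an exhausted slice yields None.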
Second Macro
Creates impl blocks for unwrapping each specific value from the variants in DataTypes. Example output shown in first_example.
macro_rules! create_unwrap_impl_for_type {
    ($unwrap_name:ident, $variant:path, $output_type:ty) => {
        impl DataTypes {
            pub fn $unwrap_name(self) -> $output_type {
                match self {
                    $variant(val) => val,
                    _ => panic!(),
                }
            }
        }
    };
}
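As a possible alternative to generating one panicking unwrap method per variant, the standard TryFrom trait can express the same extraction and lets the caller handle a mismatch. The following is a minimal sketch, not a drop-in replacement: it re-declares a copy of the DataTypes enum so that it is self-contained, and the choice of handing the original value back as the error type is an assumption.

```rust
use std::convert::TryFrom;

// Copy of the question's enum, repeated here so the sketch stands alone.
#[derive(Debug, PartialEq)]
pub enum DataTypes {
    IEEE754LSBSingle(f32),
    IEEE754LSBDouble(f64),
}

impl TryFrom<DataTypes> for f32 {
    // Assumption: on a variant mismatch, return the untouched value to the caller.
    type Error = DataTypes;

    fn try_from(value: DataTypes) -> Result<Self, Self::Error> {
        match value {
            DataTypes::IEEE754LSBSingle(v) => Ok(v),
            other => Err(other),
        }
    }
}

impl TryFrom<DataTypes> for f64 {
    type Error = DataTypes;

    fn try_from(value: DataTypes) -> Result<Self, Self::Error> {
        match value {
            DataTypes::IEEE754LSBDouble(v) => Ok(v),
            other => Err(other),
        }
    }
}
```

With this in place, f32::try_from(value) returns Ok for the Single variant and Err for anything else, so no per-type method names are needed.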
Create Enum for Various Data Types
Note: Case is reflective of the case in the xml_desc
#[derive(Debug)]
pub enum DataTypes {
    // 4 Bytes
    IEEE754LSBSingle(f32),
    // 8 Bytes
    IEEE754LSBDouble(f64),
}
First Example
Matches data_type: &str descriptions and generates the relevant function and impl block for unwrapping the value for each match to be used elsewhere.
fn first_example() {
    // Simulated Data that would come from parsing the binary file
    let mut data: &[u8] = &[172, 152, 111, 195];
    let mut data2: &[u8] = &[172, 152, 111, 195, 117, 93, 133, 192];

    // Simulated looping through records with different types
    for dtype in ["IEEE754LSBSingle", "IEEE754LSBDouble"] {
        match dtype {
            "IEEE754LSBSingle" => {
                create_unwrap_impl_for_type!(unwrap_le_f32, DataTypes::IEEE754LSBSingle, f32);
                /*
                outputs:
                impl DataTypes {
                    pub fn unwrap_le_f32(self) -> f32 {
                        match self {
                            DataTypes::IEEE754LSBSingle(val) => val,
                            _ => panic!(),
                        }
                    }
                }
                */
                type_cast_function!(read_le_f32, from_le_bytes, f32);
                /*
                outputs:
                fn read_le_f32(input: &mut &[u8]) -> f32 {
                    let (int_bytes, _) = input.split_at(std::mem::size_of::<f32>());
                    f32::from_le_bytes(int_bytes.try_into().unwrap())
                }
                */
                let single = DataTypes::IEEE754LSBSingle(read_le_f32(&mut data)).unwrap_le_f32();
                println!("First Example\tIEEE754LSBSingle {:?}", single);
            },
            "IEEE754LSBDouble" => {
                create_unwrap_impl_for_type!(unwrap_le_f64, DataTypes::IEEE754LSBDouble, f64);
                /*
                outputs:
                impl DataTypes {
                    pub fn unwrap_le_f64(self) -> f64 {
                        match self {
                            DataTypes::IEEE754LSBDouble(val) => val,
                            _ => panic!(),
                        }
                    }
                }
                */
                type_cast_function!(read_le_f64, from_le_bytes, f64);
                /*
                outputs:
                fn read_le_f64(input: &mut &[u8]) -> f64 {
                    let (int_bytes, _) = input.split_at(std::mem::size_of::<f64>());
                    f64::from_le_bytes(int_bytes.try_into().unwrap())
                }
                */
                let double = DataTypes::IEEE754LSBDouble(read_le_f64(&mut data2)).unwrap_le_f64();
                println!("First Example\tIEEE754LSBDouble {:?}", double);
            },
            _ => panic!(),
        };
    }
}
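For comparison, the same string-to-type dispatch can probably be written without any macros, as one plain function that returns the enum. The sketch below is only one way to do it under stated assumptions: take and parse_value are hypothetical helper names, the enum is re-declared so the block is self-contained, and unknown type names or short input produce None rather than a panic.

```rust
use std::convert::TryInto;

// Copy of the question's enum, repeated here so the sketch stands alone.
#[derive(Debug, PartialEq)]
pub enum DataTypes {
    IEEE754LSBSingle(f32),
    IEEE754LSBDouble(f64),
}

/// Hypothetical helper: copies the first N bytes of `input` into a fixed-size
/// array, or returns None if `input` is shorter than N bytes.
fn take<const N: usize>(input: &[u8]) -> Option<[u8; N]> {
    input.get(..N)?.try_into().ok()
}

/// Hypothetical dispatcher: picks the decoder from the data_type string.
pub fn parse_value(dtype: &str, input: &[u8]) -> Option<DataTypes> {
    match dtype {
        "IEEE754LSBSingle" => {
            Some(DataTypes::IEEE754LSBSingle(f32::from_le_bytes(take::<4>(input)?)))
        }
        "IEEE754LSBDouble" => {
            Some(DataTypes::IEEE754LSBDouble(f64::from_le_bytes(take::<8>(input)?)))
        }
        // Unknown type name: let the caller decide how to report it.
        _ => None,
    }
}
```

Adding a new data type then means adding one enum variant and one match arm, with no generated function names to keep track of.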
One Macro to Rule Them All
One macro that creates both the function and the impl block by calling the other two macros. This is the only difference between the first_example above and the second_example below.
macro_rules! generate_casting_extraction_functions {
    ($func_name:ident, $endian:ident, $unwrap_name:ident, $variant:path, $output_type:ty) => {
        create_unwrap_impl_for_type!($unwrap_name, $variant, $output_type);
        type_cast_function!($func_name, $endian, $output_type);
    }
}
Second Example
Matches data_type: &str descriptions as in first_example, but each match arm calls generate_casting_extraction_functions! once instead of invoking the two macros separately.
fn second_example() {
    // Simulated Data that would come from parsing the binary file
    let mut data: &[u8] = &[172, 152, 111, 195];
    let mut data2: &[u8] = &[172, 152, 111, 195, 117, 93, 133, 192];

    // Simulated looping through records with different types
    for dtype in ["IEEE754LSBSingle", "IEEE754LSBDouble"] {
        match dtype {
            "IEEE754LSBSingle" => {
                // Same output as first_example
                generate_casting_extraction_functions!(read_le_f32_2, from_le_bytes, unwrap_le_f32_2, DataTypes::IEEE754LSBSingle, f32);
                let single = DataTypes::IEEE754LSBSingle(read_le_f32_2(&mut data)).unwrap_le_f32_2();
                println!("Second Example\tIEEE754LSBSingle {:?}", single);
            },
            "IEEE754LSBDouble" => {
                // Same output as first_example
                generate_casting_extraction_functions!(read_le_f64_2, from_le_bytes, unwrap_le_f64_2, DataTypes::IEEE754LSBDouble, f64);
                let double = DataTypes::IEEE754LSBDouble(read_le_f64_2(&mut data2)).unwrap_le_f64_2();
                println!("Second Example\tIEEE754LSBDouble {:?}", double);
            },
            _ => panic!(),
        };
    }
}

fn main() {
    first_example();
    second_example();
}
<Records> is a kind of index telling where each data type is in the binary file? Do you know all of the possible data types ahead of time? If so, you should have serde deserialize data_type as an enum instead, rather than using strings.

Records is an index of the descriptions of each data contained in the binary file. Where they're located, what types they are, human readable information about them, etc. All potential types are known and the beginnings of them are shown in the DataTypes enum described. Types may or may not vary from each record. I think I'd run into an issue with the serde deserialize data_type with the DataTypes enum because wouldn't it need the data in the associated type before it would be available?
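On the concern in the last comment: one way around it might be to keep two enums, a fieldless "kind" enum that only names the type (and therefore needs no data to construct) and the existing DataTypes enum that carries the decoded value. The sketch below illustrates the kind half using the standard FromStr trait so that it has no dependencies; DataKind and size_in_bytes are hypothetical names. With serde, a unit-variant enum like this could presumably derive Deserialize instead, but that is not shown or verified here.

```rust
use std::str::FromStr;

/// Hypothetical fieldless enum: names the type of a record without holding data,
/// so it can be built from the XML description before the binary file is read.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DataKind {
    Ieee754LsbSingle,
    Ieee754LsbDouble,
}

impl FromStr for DataKind {
    type Err = String;

    // Maps the data_type text from the XML description to a kind.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "IEEE754LSBSingle" => Ok(DataKind::Ieee754LsbSingle),
            "IEEE754LSBDouble" => Ok(DataKind::Ieee754LsbDouble),
            other => Err(format!("unknown data_type: {}", other)),
        }
    }
}

impl DataKind {
    /// Number of bytes a value of this kind occupies in the binary file.
    pub fn size_in_bytes(self) -> usize {
        match self {
            DataKind::Ieee754LsbSingle => 4,
            DataKind::Ieee754LsbDouble => 8,
        }
    }
}
```

The record description would then store a DataKind, and only the later parsing step would turn a DataKind plus bytes into a DataTypes value.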