Module serde

§Using Avro in Rust, the Serde way.

Avro is a schema-based format, which means it requires a few extra steps to use compared to a schemaless data format like JSON.

§Schemas

It’s strongly recommended to derive the schemas for your types using the AvroSchema derive macro. The macro uses the Serde attributes to generate a matching schema and checks that no attributes are used that are incompatible with the Serde implementation in this crate. See the trait documentation for details on how to change the generated schema.

Alternatively, you can write your own schema. If you go down this path, it is recommended you start with the schema derived by AvroSchema and then modify it to fit your needs.
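As a rough illustration of what a hand-written schema can look like, the sketch below stores an Avro JSON schema in a Rust constant. The record name `Foo` and its fields `a`, `b`, and `c` are hypothetical, and the constant name `FOO_SCHEMA_JSON` is made up for this example. The JSON follows the Avro specification's record syntax; it is not the output of the AvroSchema derive macro.

```rust
// Hypothetical hand-written Avro schema for a record with a 64-bit integer,
// a string, and a bytes field. The names are illustrative only.
const FOO_SCHEMA_JSON: &str = r#"
{
  "type": "record",
  "name": "Foo",
  "fields": [
    {"name": "a", "type": "long"},
    {"name": "b", "type": "string"},
    {"name": "c", "type": "bytes"}
  ]
}
"#;
```

A string like this would then be handed to the crate's schema parser. Comparing it with the schema derived for the same type is a cheap way to spot mismatches in field names or types.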

§Performance pitfall

One performance pitfall with Serde is (de)serializing bytes. The Serialize and Deserialize implementations for types such as Vec<u8>, &[u8] and Cow<[u8]> all use the array-of-integers representation. Normally this can be fixed with the serde_bytes crate, but this crate also needs some extra information. Therefore, you need to use the bytes, bytes_opt, fixed, fixed_opt, slice, and slice_opt modules of this crate instead.
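To make the difference concrete, the following sketch hand-rolls the two Avro binary encodings as described in the Avro specification. It does not use this crate, and the function names `write_long`, `encode_as_bytes`, and `encode_as_int_array` are hypothetical helpers written for this illustration. It only compares wire size; the per-element work during (de)serialization is a separate cost.

```rust
/// Zig-zag encode a signed 64-bit integer and write it as a variable-length
/// integer (7 bits per byte, least-significant group first), as Avro does
/// for `int` and `long`.
fn write_long(n: i64, out: &mut Vec<u8>) {
    let mut z = ((n << 1) ^ (n >> 63)) as u64;
    while z >= 0x80 {
        out.push((z as u8 & 0x7F) | 0x80);
        z >>= 7;
    }
    out.push(z as u8);
}

/// Avro `bytes`: a long length followed by the raw bytes.
fn encode_as_bytes(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    write_long(data.len() as i64, &mut out);
    out.extend_from_slice(data);
    out
}

/// Avro `array` of `int`: a block count, each item as a zig-zag varint,
/// then a zero count that marks the end of the array.
fn encode_as_int_array(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    if !data.is_empty() {
        write_long(data.len() as i64, &mut out);
        for &b in data {
            write_long(i64::from(b), &mut out);
        }
    }
    write_long(0, &mut out);
    out
}
```

For the four-byte input `b"Data"`, the bytes encoding takes 5 bytes while the array encoding takes 10, because every value of 64 or above needs two bytes once zig-zag encoded.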

§Using existing schemas

If you have schemas that are already being used in other parts of your software stack, generating types from the schema can be very useful. There is a third-party crate rsgen-avro that implements this.

§Serializing data

Writing data is very simple. Use T::get_schema() to get the schema for the type you want to serialize. It is recommended to keep this schema around as long as possible as generating the schema is quite expensive. Then create a Writer with your schema and use the append_ser() function to serialize your data.
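One way to keep a schema around is to build it once and hand out a shared reference afterwards. The sketch below shows that pattern with the standard library's `OnceLock`. The `build_schema` function is a hypothetical stand-in for an expensive call such as T::get_schema(), and it returns a plain `String` so the example stays free of crate dependencies; `BUILD_COUNT` exists only to show how often the build runs.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how often the (hypothetical) expensive build runs.
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for an expensive schema construction.
fn build_schema() -> String {
    BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
    String::from(r#"{"type":"record","name":"Foo","fields":[]}"#)
}

// Builds the schema on first use and returns the same instance afterwards.
fn cached_schema() -> &'static String {
    static SCHEMA: OnceLock<String> = OnceLock::new();
    SCHEMA.get_or_init(build_schema)
}
```

Every caller of `cached_schema()` gets the same instance, so the build cost is paid once per process rather than once per writer.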

§Deserializing data

Reading data is both simpler and more complex than writing. On the one hand, you don’t need to generate a schema, as the Avro file has it embedded. But you can’t directly deserialize from a Reader. Instead, you have to iterate over the Values in the reader and deserialize from those via from_value.

§Putting it all together

The following example combines everything shown so far and is meant as a quick reference for the Serde interface:

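// Import paths assume the crate-root re-exports; adjust them to your apache_avro version.
use apache_avro::{AvroSchema, Reader, Writer, from_value};
use serde::{Deserialize, Serialize};
use std::io::Cursor;

#[derive(AvroSchema, Serialize, Deserialize, PartialEq, Debug)]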
struct Foo {
    a: i64,
    b: String,
    // Otherwise it will be serialized as an array of integers
    #[avro(with)]
    #[serde(with = "apache_avro::serde::bytes")]
    c: Vec<u8>,
}

// Creating this schema is expensive, reuse it as much as possible
let schema = Foo::get_schema();
// A writer needs the schema of the type that is going to be written
let mut writer = Writer::new(&schema, Vec::new())?;

let foo = Foo {
    a: 42,
    b: "Hello".to_string(),
    c: b"Data".to_vec(),
};

// Serialize as many items as you want.
writer.append_ser(&foo)?;
writer.append_ser(&foo)?;
writer.append_ser(&foo)?;

// Always flush
writer.flush()?;
// Or consume the writer
let data = writer.into_inner()?;

// The reader does not need a schema as it's included in the data
let reader = Reader::new(Cursor::new(data))?;
// The reader is an iterator
for result in reader {
    let value = result?;
    let new_foo: Foo = from_value(&value)?;
    assert_eq!(new_foo, foo);
}

Modules§

bytes
Efficient (de)serialization of Avro bytes values.
bytes_opt
Efficient (de)serialization of optional Avro bytes values.
fixed
Efficient (de)serialization of Avro fixed values.
fixed_opt
Efficient (de)serialization of optional Avro fixed values.
slice
Efficient (de)serialization of Avro bytes/fixed borrowed values.
slice_opt
Efficient (de)serialization of optional Avro bytes/fixed borrowed values.

Traits§

AvroSchema
Trait for types that serve as an Avro data model.
AvroSchemaComponent
Trait for types that serve as fully defined components inside an Avro data model.

Functions§

from_value
Interpret a Value as an instance of type D.
to_value
Interpret a serializable instance as a Value.