// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements.  See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership.  The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License.  You may obtain a copy of the License at
//
//   http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied.  See the License for the
// specific language governing permissions and limitations
// under the License.

//! # Mapping the Serde data model to the Avro data model
//!
//! When manually mapping Rust types to an Avro schema, or the reverse, it is important to understand
//! how the different data models are mapped. When mapping from an Avro schema to Rust types,
//! see [the documentation for the reverse](super::avro_data_model_to_serde).
//!
//! Only the schemas generated by the [`AvroSchema`] derive and the mapping as defined here are
//! supported. Any other behavior might change in a minor version.
//!
//! The following list is based on [the data model defined by Serde](https://serde.rs/data-model.html):
//! - **14 primitive types**
//!     - `bool` => [`Schema::Boolean`]
//!     - `i8`, `i16`, `i32`, `u8`, `u16` => [`Schema::Int`]
//!     - `i64`, `u32` => [`Schema::Long`]
//!     - `u64` => [`Schema::Fixed`]`(name: "u64", size: 8)`
//!         - This is not a `Schema::Long` because that is a signed 64-bit integer and cannot
//!           represent values above `i64::MAX`.
//!     - `i128` => [`Schema::Fixed`]`(name: "i128", size: 16)`
//!     - `u128` => [`Schema::Fixed`]`(name: "u128", size: 16)`
//!     - `f32` => [`Schema::Float`]
//!     - `f64` => [`Schema::Double`]
//!     - `char` => [`Schema::String`]
//!         - Only one character is allowed; the deserializer returns an error for strings with more than one character.
//! - **string** => [`Schema::String`]
//! - **byte array** => [`Schema::Bytes`] or [`Schema::Fixed`]
//! - **option** => [`Schema::Union([Schema::Null, _])`](crate::schema::Schema::Union)
//! - **unit** => [`Schema::Null`]
//! - **unit struct** => [`Schema::Record`] with the unqualified name equal to the name of the struct and zero fields
//! - **unit variant** => See [Enums](#enums)
//! - **newtype struct** => [`Schema::Record`] with the unqualified name equal to the name of the struct and one field
//! - **newtype variant** => See [Enums](#enums)
//! - **seq** => [`Schema::Array`]
//! - **tuple**
//!     - For tuples with one element, the schema will be the schema of the only element
//!     - For tuples with more than one element, the schema will be a [`Schema::Record`] with as many fields as there are elements.
//!       The schema must have the attribute `org.apache.avro.rust.tuple` with the value set to `true`.
//!     - **Note:** Serde (de)serializes arrays (`[T; N]`) as tuples. To (de)serialize an array as a
//!       [`Schema::Array`] use [`apache_avro::serde::array`].
//! - **tuple_struct** => [`Schema::Record`] with the unqualified name equal to the name of the struct and as many fields as there are elements
//!     - **Note:** Tuple structs with 0 or 1 elements will also be (de)serialized as a [`Schema::Record`]. This
//!       is different from normal tuples.
//! - **tuple_variant** => See [Enums](#enums)
//! - **map** => [`Schema::Map`]
//!     - **Note:** the key type of the map will be (de)serialized as a [`Schema::String`]
//! - **struct** => [`Schema::Record`]
//! - **struct_variant** => See [Enums](#enums)
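//!
//! The integer and `char` entries in the list above can be illustrated with a std-only sketch. The
//! helper names (`u64_to_fixed`, `fixed_to_u64`, `char_from_string`) are hypothetical and not part
//! of this crate. The little-endian byte order and the rejection of empty strings are assumptions
//! made for the example; they are not stated by the mapping itself.
//!
//! ```rust
//! use std::convert::TryFrom;
//!
//! // Hypothetical helper: store a `u64` in the 8 bytes of an Avro `fixed`.
//! // Little-endian byte order is an assumption of this sketch.
//! fn u64_to_fixed(value: u64) -> [u8; 8] {
//!     value.to_le_bytes()
//! }
//!
//! // Hypothetical helper: read the `u64` back from the 8 bytes.
//! fn fixed_to_u64(bytes: [u8; 8]) -> u64 {
//!     u64::from_le_bytes(bytes)
//! }
//!
//! // Hypothetical helper mirroring the `char` rule: accept a string only if it
//! // holds a single character. Rejecting the empty string is an assumption.
//! fn char_from_string(s: &str) -> Result<char, String> {
//!     let mut chars = s.chars();
//!     match (chars.next(), chars.next()) {
//!         (Some(c), None) => Ok(c),
//!         _ => Err(format!("expected exactly one character, got {:?}", s)),
//!     }
//! }
//!
//! fn main() {
//!     // `u32::MAX` does not fit in a 32-bit `int`, but it fits in a 64-bit `long`.
//!     assert!(i32::try_from(u32::MAX).is_err());
//!     assert!(i64::try_from(u32::MAX).is_ok());
//!
//!     // `u64::MAX` does not fit in a signed 64-bit `long`, but it survives 8 raw bytes.
//!     assert!(i64::try_from(u64::MAX).is_err());
//!     assert_eq!(fixed_to_u64(u64_to_fixed(u64::MAX)), u64::MAX);
//!
//!     // A single character is accepted, a longer string is rejected.
//!     assert_eq!(char_from_string("a"), Ok('a'));
//!     assert!(char_from_string("ab").is_err());
//! }
//! ```
//!
//! The same reasoning explains why `u32` maps to [`Schema::Long`] rather than [`Schema::Int`]: the
//! target must be wide enough for every value of the Rust type.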
//!
//! ## Enums
//!
//! ### Externally tagged
//! This is the default enum representation for Serde. It can be mapped in three ways to the Avro data model.
//! For all three options, the index of each variant in the Rust enum must match the index of the
//! corresponding symbol or branch in the Avro schema.
//! - As a [`Schema::Enum`]. This is only possible for enums that have only unit variants.
//! - As a [`Schema::Union`] with a [`Schema::Record`] for every variant:
//!     - **unit_variant** => [`Schema::Record`] named as the variant and with no fields.
//!     - **newtype_variant** => [`Schema::Record`] named as the variant and with one field.
//!       The schema must have the attribute `org.apache.avro.rust.union_of_records` with the value set to `true`.
//!     - **tuple_variant** => [`Schema::Record`] named as the variant and with as many fields as there are elements.
//!     - **struct_variant** => [`Schema::Record`] named as the variant and with a field for every field of the struct variant.
//! - As a [`Schema::Union`] without the wrapper [`Schema::Record`]. All schemas in the union must be unique:
//!     - **unit_variant** => [`Schema::Null`].
//!     - **newtype_variant** => The schema of the inner type.
//!     - **tuple_variant** => [`Schema::Record`] named as the variant and with as many fields as there are elements.
//!     - **struct_variant** => [`Schema::Record`] named as the variant and with a field for every field of the struct variant.
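//!
//! The two constraints above (matching indices and unique union branches) can be checked by hand
//! with a std-only sketch. The enum `Suit`, the constant `SUIT_SYMBOLS` and both helper functions
//! are hypothetical names for illustration and are not part of this crate. Comparing branches by
//! their type name is a simplification that is enough for unnamed types such as `null`, `int`
//! and `string`.
//!
//! ```rust
//! use std::collections::HashSet;
//!
//! // Hypothetical enum with only unit variants. Declaration order defines the variant index.
//! #[derive(Clone, Copy)]
//! enum Suit {
//!     Spades,
//!     Hearts,
//!     Diamonds,
//!     Clubs,
//! }
//!
//! // Symbols of a matching Avro enum schema, listed in the same order as the Rust variants.
//! const SUIT_SYMBOLS: [&str; 4] = ["Spades", "Hearts", "Diamonds", "Clubs"];
//!
//! // Looks up the Avro symbol by the index of the Rust variant.
//! fn symbol_for(suit: Suit) -> &'static str {
//!     SUIT_SYMBOLS[suit as usize]
//! }
//!
//! // Returns `true` when no branch type name occurs twice.
//! fn branches_are_unique(branches: &[&str]) -> bool {
//!     let mut seen = HashSet::new();
//!     branches.iter().all(|branch| seen.insert(*branch))
//! }
//!
//! fn main() {
//!     // The variant index and the symbol index line up.
//!     assert_eq!(symbol_for(Suit::Spades), "Spades");
//!     assert_eq!(symbol_for(Suit::Hearts), "Hearts");
//!     assert_eq!(symbol_for(Suit::Diamonds), "Diamonds");
//!     assert_eq!(symbol_for(Suit::Clubs), "Clubs");
//!
//!     // `enum E { A, B(i32), C(String) }` without wrapper records: null, int, string.
//!     assert!(branches_are_unique(&["null", "int", "string"]));
//!     // `enum F { A, B }` without wrapper records would need `null` twice.
//!     assert!(!branches_are_unique(&["null", "null"]));
//! }
//! ```
//!
//! An enum like `F` in the last assertion is therefore a candidate for [`Schema::Enum`] or for the
//! union of records, not for the union without wrapper records.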
//!
//! ### Internally tagged
//! This enum representation is used by Serde if the attribute `#[serde(tag = "...")]` is used.
//! It maps to a [`Schema::Record`]. The record must have a field named after the value of the
//! `tag` attribute. A field that is not used by every variant must have a `default` set.
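//!
//! Which record fields need a `default` follows from the rule above: every field that is missing
//! from at least one variant. A minimal std-only sketch of that check is shown here. The function
//! `fields_needing_default` and the example field names are hypothetical and not part of this crate.
//!
//! ```rust
//! use std::collections::BTreeSet;
//!
//! // Takes the field names of every variant and returns the names of the record
//! // fields that are not present in all variants, in sorted order.
//! fn fields_needing_default(variants: &[&[&str]]) -> BTreeSet<String> {
//!     // Collect the union of all field names across the variants.
//!     let mut all_fields = BTreeSet::new();
//!     for fields in variants {
//!         for field in fields.iter() {
//!             all_fields.insert(field.to_string());
//!         }
//!     }
//!     // Keep only the names that at least one variant does not use.
//!     all_fields
//!         .into_iter()
//!         .filter(|name| !variants.iter().all(|fields| fields.contains(&name.as_str())))
//!         .collect()
//! }
//!
//! fn main() {
//!     // Hypothetical enum:
//!     //   #[serde(tag = "type")]
//!     //   enum Shape {
//!     //       Circle { label: String, radius: f64 },
//!     //       Rect { label: String, width: f64, height: f64 },
//!     //   }
//!     let circle: &[&str] = &["label", "radius"];
//!     let rect: &[&str] = &["label", "width", "height"];
//!
//!     let needed = fields_needing_default(&[circle, rect]);
//!     let needed: Vec<String> = needed.into_iter().collect();
//!
//!     // `label` is used by both variants, the other fields are not.
//!     assert_eq!(needed, vec!["height", "radius", "width"]);
//! }
//! ```
//!
//! The field named by the `tag` attribute is written for every variant, so it is not part of
//! this check.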
//!
//! ### Adjacently tagged
//! This enum representation is used by Serde if the attributes `#[serde(tag = "...", content = "...")]` are used.
//! It maps to a [`Schema::Record`] with two fields. One field must be named as the value of the `tag`
//! attribute and use the [`Schema::Enum`] schema. The other field must be named as the value of the
//! `content` attribute and use the [`Schema::Union`] schema.
//!
//! ### Untagged
//! This enum representation is used by Serde if the attribute `#[serde(untagged)]` is used. It maps
//! to a [`Schema::Union`] with the following schemas:
//!   - **unit_variant** => [`Schema::Null`].
//!   - **newtype_variant** => The schema of the inner type.
//!   - **tuple_variant** => [`Schema::Record`] named as the variant and with as many fields as there are elements.
//!   - **struct_variant** => [`Schema::Record`] named as the variant and with a field for every field of the struct variant.
//!
//! [`AvroSchema`]: crate::AvroSchema
//! [`Schema::Array`]: crate::schema::Schema::Array
//! [`Schema::Boolean`]: crate::schema::Schema::Boolean
//! [`Schema::Bytes`]: crate::schema::Schema::Bytes
//! [`Schema::Double`]: crate::schema::Schema::Double
//! [`Schema::Enum`]: crate::schema::Schema::Enum
//! [`Schema::Fixed`]: crate::schema::Schema::Fixed
//! [`Schema::Float`]: crate::schema::Schema::Float
//! [`Schema::Int`]: crate::schema::Schema::Int
//! [`Schema::Long`]: crate::schema::Schema::Long
//! [`Schema::Map`]: crate::schema::Schema::Map
//! [`Schema::Null`]: crate::schema::Schema::Null
//! [`Schema::Record`]: crate::schema::Schema::Record
//! [`Schema::String`]: crate::schema::Schema::String
//! [`Schema::Union`]: crate::schema::Schema::Union
//! [`apache_avro::serde::array`]: crate::serde::array