The current version of Avro is 1.4.0. The current version of libavro is 21:0:0. This document was created 2010-09-02.

1. Introduction to Avro

Avro is a data serialization system.

Avro provides:

  • Rich data structures.

  • A compact, fast, binary data format.

  • A container file, to store persistent data.

  • Remote procedure call (RPC).

This document will focus on the C implementation of Avro. To learn more about Avro in general, visit the Avro website.

2. Introduction to Avro C

    ___                      ______
   /   |_   ___________     / ____/
  / /| | | / / ___/ __ \   / /
 / ___ | |/ / /  / /_/ /  / /___
/_/  |_|___/_/   \____/   \____/

A C program is like a fast dance on a newly waxed dance floor by people carrying razors.

(walra%moacs11 @ nl.net) 94/03/18
— Waldi Ravens

The C implementation has been tested on Mac OS X and Linux but, over time, the number of supported OSes should grow. Please let us know if you’re using Avro C on other systems. There are no dependencies on external libraries. We embedded Jansson into Avro C for parsing JSON into schema structures.

The C implementation supports:

  • binary encoding/decoding of all primitive and complex data types

  • storage to an Avro Object Container File

  • schema resolution, promotion and projection

  • validating and non-validating mode for writing Avro data

The C implementation is lacking:

  • RPC

To learn about the API, take a look at the examples and reference files later in this document.

We’re always looking for contributions so, if you’re a C hacker, please feel free to submit patches to the project.

3. Reference Counting

Avro C does reference counting for all schema and data objects. When the number of references drops to zero, the memory is freed.

For example, to create and free a string, you would use:

avro_datum_t string = avro_string("This is my string");

...
avro_datum_decref(string);

Things get a little more complicated when you consider more elaborate schema and data structures.

For example, let’s say that you create a record with a single string field:

avro_datum_t example = avro_record("Example", NULL);
avro_datum_t solo_field = avro_string("Example field value");

avro_record_set(example, "solo", solo_field);

...
avro_datum_decref(example);

In this example, the solo_field datum would not be freed since it has two references: the original reference and a reference inside the Example record. The avro_datum_decref(example) call drops the number of references to one. If you are finished with the solo_field datum, then you need to call avro_datum_decref(solo_field) to release the last reference and free it.
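
The counting rule described above can be sketched in plain C without Avro. This is only an illustration under stated assumptions, not Avro C's implementation: struct node, node_new, node_incref, node_decref, node_set_child and live_nodes are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object type for illustration; not Avro C's struct avro_obj_t. */
struct node {
        unsigned long refcount;
        char *text;             /* owned copy of the payload */
        struct node *child;     /* optional reference held on another node */
};

static int live_nodes = 0;      /* number of nodes that have not been freed */

/* Create a node holding a copy of text; the caller owns one reference. */
struct node *node_new(const char *text)
{
        struct node *n = malloc(sizeof(*n));
        if (!n)
                return NULL;
        n->text = malloc(strlen(text) + 1);
        if (!n->text) {
                free(n);
                return NULL;
        }
        strcpy(n->text, text);
        n->refcount = 1;
        n->child = NULL;
        live_nodes++;
        return n;
}

/* Take one more reference on n. */
struct node *node_incref(struct node *n)
{
        if (n)
                n->refcount++;
        return n;
}

/* Drop one reference; free the node when the count reaches zero. */
void node_decref(struct node *n)
{
        if (!n)
                return;
        assert(n->refcount > 0);
        if (--n->refcount == 0) {
                node_decref(n->child);  /* release the reference we held */
                free(n->text);
                free(n);
                live_nodes--;
        }
}

/* Store child inside parent; the parent takes its own reference. */
void node_set_child(struct node *parent, struct node *child)
{
        node_incref(child);
        node_decref(parent->child);
        parent->child = child;
}
```

With this sketch, storing a child in a parent leaves the child with two references. Dropping the parent brings the child back to one reference, and the child is freed only after its creator also calls node_decref on it.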

4. Wrap It and Give It

You’ll notice that some datatypes can be "wrapped" and "given". This allows C programmers the freedom to decide who is responsible for the memory. Let’s take strings for example.

To create a string datum, you have three different methods:

avro_datum_t avro_string(const char *str);
avro_datum_t avro_wrapstring(const char *str);
avro_datum_t avro_givestring(const char *str);

If you use avro_string, then Avro C will make a copy of your string and free it when the datum is dereferenced. In some cases, especially when dealing with large amounts of data, you want to avoid this memory copy. That’s where avro_wrapstring and avro_givestring can help.

If you use avro_wrapstring, then Avro C will do no memory management at all. It will just save a pointer to your data and it’s your responsibility to free the string.

Warning
When using avro_wrapstring, do not free the string before you dereference the string datum with avro_datum_decref().

Lastly, if you use avro_givestring then Avro C will free the string later when the datum is dereferenced. In a sense, you are "giving" responsibility for freeing the string to Avro C.

Warning
Don’t "give" Avro C a string that you haven’t allocated from the heap with e.g. malloc or strdup.

For example, don’t do this:

avro_datum_t bad_idea = avro_givestring("This isn't allocated on the heap");
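
The three ownership modes can be sketched in plain C. This is a simplified model, not how Avro C stores strings internally: enum str_ownership, struct str_holder, str_holder_init, str_holder_release and frees_by_holder are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for illustration; these are not Avro C types. */
enum str_ownership {
        STR_COPIED,     /* like avro_string: holder copies and frees its copy */
        STR_WRAPPED,    /* like avro_wrapstring: caller keeps ownership */
        STR_GIVEN       /* like avro_givestring: caller hands over a heap pointer */
};

struct str_holder {
        const char *s;
        enum str_ownership ownership;
};

static int frees_by_holder = 0; /* how many times the holder freed a string */

/* Attach s to the holder according to the ownership mode.
 * Returns 0 on success, -1 if the copy could not be allocated. */
int str_holder_init(struct str_holder *h, const char *s, enum str_ownership o)
{
        assert(h != NULL && s != NULL);
        if (o == STR_COPIED) {
                char *copy = malloc(strlen(s) + 1);
                if (!copy)
                        return -1;
                strcpy(copy, s);
                h->s = copy;
        } else {
                h->s = s;       /* wrapped or given: just keep the pointer */
        }
        h->ownership = o;
        return 0;
}

/* Release the holder; free the string only if the holder owns it. */
void str_holder_release(struct str_holder *h)
{
        if (h->ownership != STR_WRAPPED) {
                free((char *)h->s);
                frees_by_holder++;
        }
        h->s = NULL;
}
```

In this model a wrapped string may live on the stack or in static storage, because the holder never frees it. A given string must come from malloc (or strdup), because the holder passes it to free on release.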

5. Schema Validation

To write a datum, use the following function:

int avro_write_data(avro_writer_t writer,
                    avro_schema_t writers_schema, avro_datum_t datum);

If you pass in a writers_schema, then your datum will be validated before it is sent to the writer. This check ensures that your data has the correct format. If you are certain your datum is correct, you can pass a NULL value for writers_schema and Avro C will not validate before writing.

Note
Data written to an Avro Object Container File is always validated.
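
The "validate only when a schema is supplied" behaviour can be modelled with a few lines of plain C. This is a toy model, assuming a schema and a datum that each carry a single type tag. enum kind, struct schema, struct value, value_matches, write_value and values_written are hypothetical names, not part of Avro C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a schema and a datum; not Avro C types. */
enum kind { KIND_INT, KIND_STRING };
struct schema { enum kind expected; };
struct value { enum kind actual; };

static int values_written = 0;  /* counts values that reached the writer */

/* Return 1 when the value has the type the schema expects, 0 otherwise. */
int value_matches(const struct schema *s, const struct value *v)
{
        return s->expected == v->actual;
}

/* Write v. When schema is non-NULL the value is validated first;
 * when schema is NULL the check is skipped entirely.
 * Returns 0 on success, -1 if validation rejected the value. */
int write_value(const struct schema *schema, const struct value *v)
{
        assert(v != NULL);
        if (schema != NULL && !value_matches(schema, v))
                return -1;
        values_written++;       /* stands in for the real encoding step */
        return 0;
}
```

Passing NULL trades safety for speed: a mismatched value is written without complaint, so skip validation only when the datum is known to be correct.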

6. Examples

I’m not even supposed to be here today!

— Dante Hicks

Imagine you’re a freelance hacker in Leonardo, New Jersey and you’ve been approached by the owner of the local Quick Stop Convenience store. He wants you to create a contact database in case he needs to call employees to work on their day off.

You might build a simple contact system using Avro C like the following…

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to you under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
 * implied.  See the License for the specific language governing
 * permissions and limitations under the License.
 */

#include <avro.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

avro_schema_t person_schema;
int64_t id = 0;

/* A simple schema for our tutorial */
#define PERSON_SCHEMA \
"{\"type\":\"record\",\
  \"name\":\"Person\",\
  \"fields\":[\
     {\"name\": \"ID\", \"type\": \"long\"},\
     {\"name\": \"First\", \"type\": \"string\"},\
     {\"name\": \"Last\", \"type\": \"string\"},\
     {\"name\": \"Phone\", \"type\": \"string\"},\
     {\"name\": \"Age\", \"type\": \"int\"}]}"

/* Parse schema into a schema data structure */
void init_schema(void)
{
        avro_schema_error_t error;
        if (avro_schema_from_json(PERSON_SCHEMA, sizeof(PERSON_SCHEMA),
                                  &person_schema, &error)) {
                fprintf(stderr, "Unable to parse person schema\n");
                exit(EXIT_FAILURE);
        }
}

/* Create a datum to match the person schema and save it */
void
add_person(avro_file_writer_t db, const char *first, const char *last,
           const char *phone, int32_t age)
{
        avro_datum_t person = avro_record("Person", NULL);

        avro_datum_t id_datum = avro_int64(++id);
        avro_datum_t first_datum = avro_string(first);
        avro_datum_t last_datum = avro_string(last);
        avro_datum_t age_datum = avro_int32(age);
        avro_datum_t phone_datum = avro_string(phone);

        if (avro_record_set(person, "ID", id_datum)
            || avro_record_set(person, "First", first_datum)
            || avro_record_set(person, "Last", last_datum)
            || avro_record_set(person, "Age", age_datum)
            || avro_record_set(person, "Phone", phone_datum)) {
                fprintf(stderr, "Unable to create Person datum structure\n");
                exit(EXIT_FAILURE);
        }

        if (avro_file_writer_append(db, person)) {
                fprintf(stderr,
                        "Unable to write Person datum to the database file\n");
                exit(EXIT_FAILURE);
        }

        /* Decrement all our references to prevent memory from leaking */
        avro_datum_decref(id_datum);
        avro_datum_decref(first_datum);
        avro_datum_decref(last_datum);
        avro_datum_decref(age_datum);
        avro_datum_decref(phone_datum);
        avro_datum_decref(person);

        fprintf(stdout, "Successfully added %s, %s id=%"PRId64"\n", last, first, id);
}

int print_person(avro_file_reader_t db, avro_schema_t reader_schema)
{
        int rval;
        avro_datum_t person;

        rval = avro_file_reader_read(db, reader_schema, &person);
        if (rval == 0) {
                int64_t i64;
                int32_t i32;
                char *p;
                avro_datum_t id_datum, first_datum, last_datum, phone_datum,
                    age_datum;

                if (avro_record_get(person, "ID", &id_datum) == 0) {
                        avro_int64_get(id_datum, &i64);
                        fprintf(stdout, "%"PRId64" | ", i64);
                }
                if (avro_record_get(person, "First", &first_datum) == 0) {
                        avro_string_get(first_datum, &p);
                        fprintf(stdout, "%15s | ", p);
                }
                if (avro_record_get(person, "Last", &last_datum) == 0) {
                        avro_string_get(last_datum, &p);
                        fprintf(stdout, "%15s | ", p);
                }
                if (avro_record_get(person, "Phone", &phone_datum) == 0) {
                        avro_string_get(phone_datum, &p);
                        fprintf(stdout, "%15s | ", p);
                }
                if (avro_record_get(person, "Age", &age_datum) == 0) {
                        avro_int32_get(age_datum, &i32);
                        fprintf(stdout, "%"PRId32, i32);
                }
                fprintf(stdout, "\n");

                /* We no longer need this memory */
                avro_datum_decref(person);
        }
        return rval;
}

int main(void)
{
        int rval;
        avro_file_reader_t dbreader;
        avro_file_writer_t db;
        avro_schema_t projection_schema, first_name_schema, phone_schema;
        int64_t i;
        const char *dbname = "quickstop.db";

        /* Initialize the schema structure from JSON */
        init_schema();

        /* Delete the database if it exists */
        unlink(dbname);
        /* Create a new database */
        rval = avro_file_writer_create(dbname, person_schema, &db);
        if (rval) {
                fprintf(stderr, "There was an error creating %s\n", dbname);
                exit(EXIT_FAILURE);
        }
        /* Add people to the database */
        add_person(db, "Dante", "Hicks", "(555) 123-4567", 32);
        add_person(db, "Randal", "Graves", "(555) 123-5678", 30);
        add_person(db, "Veronica", "Loughran", "(555) 123-0987", 28);
        add_person(db, "Caitlin", "Bree", "(555) 123-2323", 27);
        add_person(db, "Bob", "Silent", "(555) 123-6422", 29);
        add_person(db, "Jay", "???", "(555) 123-9182", 26);
        avro_file_writer_close(db);

        fprintf(stdout, "\nNow let's read all the records back out\n");

        /* Read all the records and print them */
        if (avro_file_reader(dbname, &dbreader)) {
                fprintf(stderr, "Error opening file %s\n", dbname);
                exit(EXIT_FAILURE);
        }
        for (i = 0; i < id; i++) {
                if (print_person(dbreader, NULL)) {
                        fprintf(stderr, "Error printing person\n");
                        exit(EXIT_FAILURE);
                }
        }
        avro_file_reader_close(dbreader);

        /* You can also use projection to decode only the data you are
           interested in.  This is particularly useful when you have
           huge data sets and you're only interested in particular fields,
           e.g. your contacts' first name and phone number */
        projection_schema = avro_schema_record("Person", NULL);
        first_name_schema = avro_schema_string();
        phone_schema = avro_schema_string();
        avro_schema_record_field_append(projection_schema, "First",
                                        first_name_schema);
        avro_schema_record_field_append(projection_schema, "Phone",
                                        phone_schema);

        /* Read only the fields you're interested in */
        fprintf(stdout,
                "\n\nUse projection to print only the First name and phone numbers\n");
        if (avro_file_reader(dbname, &dbreader)) {
                fprintf(stderr, "Error opening file %s\n", dbname);
                exit(EXIT_FAILURE);
        }
        for (i = 0; i < id; i++) {
                if (print_person(dbreader, projection_schema)) {
                        fprintf(stderr, "Error printing person\n");
                        exit(EXIT_FAILURE);
                }
        }
        avro_file_reader_close(dbreader);
        avro_schema_decref(first_name_schema);
        avro_schema_decref(phone_schema);
        avro_schema_decref(projection_schema);

        /* We don't need this schema anymore */
        avro_schema_decref(person_schema);
        return 0;
}

When you compile and run this program, you should get the following output:

Successfully added Hicks, Dante id=1
Successfully added Graves, Randal id=2
Successfully added Loughran, Veronica id=3
Successfully added Bree, Caitlin id=4
Successfully added Silent, Bob id=5
Successfully added ???, Jay id=6

Avro is compact. Here is the data for all 6 people.
| 02 0A 44 61 6E 74 65 0A | 48 69 63 6B 73 1C 28 35 |   ..Dante.Hicks.(5
| 35 35 29 20 31 32 33 2D | 34 35 36 37 40 04 0C 52 |   55) 123-4567@..R
| 61 6E 64 61 6C 0C 47 72 | 61 76 65 73 1C 28 35 35 |   andal.Graves.(55
| 35 29 20 31 32 33 2D 35 | 36 37 38 3C 06 10 56 65 |   5) 123-5678<..Ve
| 72 6F 6E 69 63 61 10 4C | 6F 75 67 68 72 61 6E 1C |   ronica.Loughran.
| 28 35 35 35 29 20 31 32 | 33 2D 30 39 38 37 38 08 |   (555) 123-09878.
| 0E 43 61 69 74 6C 69 6E | 08 42 72 65 65 1C 28 35 |   .Caitlin.Bree.(5
| 35 35 29 20 31 32 33 2D | 32 33 32 33 36 0A 06 42 |   55) 123-23236..B
| 6F 62 0C 53 69 6C 65 6E | 74 1C 28 35 35 35 29 20 |   ob.Silent.(555)
| 31 32 33 2D 36 34 32 32 | 3A 0C 06 4A 61 79 06 3F |   123-6422:..Jay.?
| 3F 3F 1C 28 35 35 35 29 | 20 31 32 33 2D 39 31 38 |   ??.(555) 123-918
| 32 34 .. .. .. .. .. .. | .. .. .. .. .. .. .. .. |   24..............

Now let's read all the records back out
1 |           Dante |           Hicks |  (555) 123-4567 | 32
2 |          Randal |          Graves |  (555) 123-5678 | 30
3 |        Veronica |        Loughran |  (555) 123-0987 | 28
4 |         Caitlin |            Bree |  (555) 123-2323 | 27
5 |             Bob |          Silent |  (555) 123-6422 | 29
6 |             Jay |             ??? |  (555) 123-9182 | 26


Use projection to print only the First name and phone numbers
          Dante |  (555) 123-4567 |
         Randal |  (555) 123-5678 |
       Veronica |  (555) 123-0987 |
        Caitlin |  (555) 123-2323 |
            Bob |  (555) 123-6422 |
            Jay |  (555) 123-9182 |

The Quick Stop owner was so pleased, he asked you to create a movie database for his RST Video store.
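
The hex dump in the output above is easier to read once you know how the Avro binary encoding represents numbers and strings: an int or long is zig-zag encoded and then written as a variable-length integer, and a string is a long byte count followed by the raw bytes. The following is a minimal standalone sketch of that encoding as described in the Avro specification. It does not use Avro C, and encode_long, decode_long and encode_string are hypothetical helper names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Zig-zag encode value, then write it as a variable-length integer.
 * out must have room for 10 bytes. Returns the number of bytes written. */
size_t encode_long(int64_t value, uint8_t *out)
{
        /* Zig-zag maps 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ... */
        uint64_t n = ((uint64_t)value << 1) ^ (value < 0 ? UINT64_MAX : 0);
        size_t len = 0;

        while (n > 0x7F) {
                out[len++] = (uint8_t)((n & 0x7F) | 0x80); /* more bytes follow */
                n >>= 7;
        }
        out[len++] = (uint8_t)n;
        return len;
}

/* Decode one long from in. Returns the number of bytes consumed,
 * or 0 if the input is truncated or longer than 10 bytes. */
size_t decode_long(const uint8_t *in, size_t in_len, int64_t *value)
{
        uint64_t n = 0;
        size_t i;

        for (i = 0; i < in_len && i < 10; i++) {
                n |= (uint64_t)(in[i] & 0x7F) << (7 * i);
                if (!(in[i] & 0x80)) {
                        /* Undo the zig-zag mapping */
                        *value = (int64_t)((n >> 1) ^ (~(n & 1) + 1));
                        return i + 1;
                }
        }
        return 0;
}

/* Encode a string as a long byte count followed by the bytes themselves.
 * out must have room for 10 + strlen(s) bytes. Returns bytes written. */
size_t encode_string(const char *s, uint8_t *out)
{
        size_t n = strlen(s);
        size_t len = encode_long((int64_t)n, out);

        memcpy(out + len, s, n);
        return len + n;
}
```

Encoding the long 1 followed by the string "Dante" with these helpers yields the bytes 02 0A 44 61 6E 74 65, which is how the dump begins. Encoding 32 yields 40, the byte that closes the first record with Dante's age.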

7. Reference files

7.1. avro.h

The avro.h header file contains the complete public API for Avro C. The documentation is rather sparse right now but we’ll be adding more information soon.

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to you under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
 * implied.  See the License for the specific language governing
 * permissions and limitations under the License.
 */
#ifndef AVRO_H
#define AVRO_H
#ifdef __cplusplus
extern "C" {
#define CLOSE_EXTERN }
#else
#define CLOSE_EXTERN
#endif

#include <stdio.h>
#include <stdint.h>

enum avro_type_t {
        AVRO_STRING,
        AVRO_BYTES,
        AVRO_INT32,
        AVRO_INT64,
        AVRO_FLOAT,
        AVRO_DOUBLE,
        AVRO_BOOLEAN,
        AVRO_NULL,
        AVRO_RECORD,
        AVRO_ENUM,
        AVRO_FIXED,
        AVRO_MAP,
        AVRO_ARRAY,
        AVRO_UNION,
        AVRO_LINK
};
typedef enum avro_type_t avro_type_t;

enum avro_class_t {
        AVRO_SCHEMA,
        AVRO_DATUM
};
typedef enum avro_class_t avro_class_t;

struct avro_obj_t {
        avro_type_t type;
        avro_class_t class_type;
        unsigned long refcount;
};

#define avro_classof(obj)     ((obj)->class_type)
#define is_avro_schema(obj)   (obj && avro_classof(obj) == AVRO_SCHEMA)
#define is_avro_datum(obj)    (obj && avro_classof(obj) == AVRO_DATUM)

#define avro_typeof(obj)      ((obj)->type)
#define is_avro_string(obj)   (obj && avro_typeof(obj) == AVRO_STRING)
#define is_avro_bytes(obj)    (obj && avro_typeof(obj) == AVRO_BYTES)
#define is_avro_int32(obj)    (obj && avro_typeof(obj) == AVRO_INT32)
#define is_avro_int64(obj)    (obj && avro_typeof(obj) == AVRO_INT64)
#define is_avro_float(obj)    (obj && avro_typeof(obj) == AVRO_FLOAT)
#define is_avro_double(obj)   (obj && avro_typeof(obj) == AVRO_DOUBLE)
#define is_avro_boolean(obj)  (obj && avro_typeof(obj) == AVRO_BOOLEAN)
#define is_avro_null(obj)     (obj && avro_typeof(obj) == AVRO_NULL)
#define is_avro_primitive(obj)(is_avro_string(obj) \
                             ||is_avro_bytes(obj) \
                             ||is_avro_int32(obj) \
                             ||is_avro_int64(obj) \
                             ||is_avro_float(obj) \
                             ||is_avro_double(obj) \
                             ||is_avro_boolean(obj) \
                             ||is_avro_null(obj))
#define is_avro_record(obj)   (obj && avro_typeof(obj) == AVRO_RECORD)
#define is_avro_enum(obj)     (obj && avro_typeof(obj) == AVRO_ENUM)
#define is_avro_fixed(obj)    (obj && avro_typeof(obj) == AVRO_FIXED)
#define is_avro_named_type(obj)(is_avro_record(obj) \
                              ||is_avro_enum(obj) \
                              ||is_avro_fixed(obj))
#define is_avro_map(obj)      (obj && avro_typeof(obj) == AVRO_MAP)
#define is_avro_array(obj)    (obj && avro_typeof(obj) == AVRO_ARRAY)
#define is_avro_union(obj)    (obj && avro_typeof(obj) == AVRO_UNION)
#define is_avro_complex_type(obj) (!(is_avro_primitive(obj)))
#define is_avro_link(obj)     (obj && avro_typeof(obj) == AVRO_LINK)

typedef struct avro_reader_t_ *avro_reader_t;
typedef struct avro_writer_t_ *avro_writer_t;

/*
 * schema
 */
typedef struct avro_obj_t *avro_schema_t;

avro_schema_t avro_schema_string(void);
avro_schema_t avro_schema_bytes(void);
avro_schema_t avro_schema_int(void);
avro_schema_t avro_schema_long(void);
avro_schema_t avro_schema_float(void);
avro_schema_t avro_schema_double(void);
avro_schema_t avro_schema_boolean(void);
avro_schema_t avro_schema_null(void);

avro_schema_t avro_schema_record(const char *name, const char *space);
avro_schema_t avro_schema_record_field_get(const avro_schema_t
                                           record, const char *field_name);
int avro_schema_record_field_append(const avro_schema_t record,
                                    const char *field_name,
                                    const avro_schema_t type);

avro_schema_t avro_schema_enum(const char *name);
int avro_schema_enum_symbol_append(const avro_schema_t
                                   enump, const char *symbol);

avro_schema_t avro_schema_fixed(const char *name, const int64_t len);
avro_schema_t avro_schema_map(const avro_schema_t values);
avro_schema_t avro_schema_array(const avro_schema_t items);

avro_schema_t avro_schema_union(void);
int avro_schema_union_append(const avro_schema_t
                             union_schema, const avro_schema_t schema);

avro_schema_t avro_schema_link(avro_schema_t schema);

typedef struct avro_schema_error_t_ *avro_schema_error_t;
int avro_schema_from_json(const char *jsontext,
                          const int32_t len,
                          avro_schema_t * schema, avro_schema_error_t * error);
int avro_schema_to_json(avro_schema_t schema, avro_writer_t out);

int avro_schema_to_specific(avro_schema_t schema, const char *prefix);

const char *avro_schema_name(const avro_schema_t schema);
avro_schema_t avro_schema_copy(avro_schema_t schema);
int avro_schema_equal(avro_schema_t a, avro_schema_t b);

avro_schema_t avro_schema_incref(avro_schema_t schema);
void avro_schema_decref(avro_schema_t schema);

/*
 * io
 */
avro_reader_t avro_reader_file(FILE * fp);
avro_writer_t avro_writer_file(FILE * fp);
avro_reader_t avro_reader_memory(const char *buf, int64_t len);
avro_writer_t avro_writer_memory(const char *buf, int64_t len);

int avro_read(avro_reader_t reader, void *buf, int64_t len);
int avro_skip(avro_reader_t reader, int64_t len);
int avro_write(avro_writer_t writer, void *buf, int64_t len);

void avro_writer_reset(avro_writer_t writer);
int64_t avro_writer_tell(avro_writer_t writer);
void avro_writer_flush(avro_writer_t writer);

void avro_writer_dump(avro_writer_t writer, FILE * fp);
void avro_reader_dump(avro_reader_t reader, FILE * fp);

void avro_reader_free(avro_reader_t reader);
void avro_writer_free(avro_writer_t writer);

/*
 * datum
 */

/* constructors */
typedef struct avro_obj_t *avro_datum_t;
avro_datum_t avro_string(const char *str);
avro_datum_t avro_wrapstring(const char *str);
avro_datum_t avro_givestring(const char *str);
avro_datum_t avro_bytes(const char *buf, int64_t len);
avro_datum_t avro_wrapbytes(const char *buf, int64_t len);
avro_datum_t avro_givebytes(const char *buf, int64_t len);
avro_datum_t avro_int32(int32_t i);
avro_datum_t avro_int64(int64_t l);
avro_datum_t avro_float(float f);
avro_datum_t avro_double(double d);
avro_datum_t avro_boolean(int8_t i);
avro_datum_t avro_null(void);
avro_datum_t avro_record(const char *name, const char *space);
avro_datum_t avro_enum(const char *name, int i);
avro_datum_t avro_fixed(const char *name, const char *bytes,
                        const int64_t size);
avro_datum_t avro_wrapfixed(const char *name, const char *bytes,
                            const int64_t size);
avro_datum_t avro_givefixed(const char *name, const char *bytes,
                            const int64_t size);
avro_datum_t avro_map(void);
avro_datum_t avro_array(void);
avro_datum_t avro_union(int64_t discriminant, const avro_datum_t datum);

/* getters */
int avro_string_get(avro_datum_t datum, char **p);
int avro_bytes_get(avro_datum_t datum, char **bytes, int64_t * size);
int avro_int32_get(avro_datum_t datum, int32_t * i);
int avro_int64_get(avro_datum_t datum, int64_t * l);
int avro_float_get(avro_datum_t datum, float *f);
int avro_double_get(avro_datum_t datum, double *d);
int avro_boolean_get(avro_datum_t datum, int8_t * i);

int avro_fixed_get(avro_datum_t datum, char **bytes, int64_t * size);
int avro_record_get(const avro_datum_t record, const char *field_name,
                    avro_datum_t * value);
int avro_map_get(const avro_datum_t datum, const char *key,
                 avro_datum_t * value);
int avro_array_get(const avro_datum_t datum, int64_t index, avro_datum_t * value);

/* setters */
int avro_string_set(avro_datum_t datum, const char *p);
int avro_givestring_set(avro_datum_t datum, const char *p);
int avro_wrapstring_set(avro_datum_t datum, const char *p);

int avro_bytes_set(avro_datum_t datum, const char *bytes, const int64_t size);
int avro_givebytes_set(avro_datum_t datum, const char *bytes,
                       const int64_t size);
int avro_wrapbytes_set(avro_datum_t datum, const char *bytes,
                       const int64_t size);

int avro_int32_set(avro_datum_t datum, const int32_t i);
int avro_int64_set(avro_datum_t datum, const int64_t l);
int avro_float_set(avro_datum_t datum, const float f);
int avro_double_set(avro_datum_t datum, const double d);
int avro_boolean_set(avro_datum_t datum, const int8_t i);

int avro_fixed_set(avro_datum_t datum, const char *bytes, const int64_t size);
int avro_givefixed_set(avro_datum_t datum, const char *bytes,
                       const int64_t size);
int avro_wrapfixed_set(avro_datum_t datum, const char *bytes,
                       const int64_t size);

int avro_record_set(const avro_datum_t record, const char *field_name,
                    const avro_datum_t value);
int avro_map_set(const avro_datum_t map, const char *key,
                 const avro_datum_t value);
int avro_array_append_datum(const avro_datum_t array_datum,
                            const avro_datum_t datum);

/* reference counting */
avro_datum_t avro_datum_incref(avro_datum_t value);
void avro_datum_decref(avro_datum_t value);

void avro_datum_print(avro_datum_t value, FILE * fp);

int avro_datum_equal(avro_datum_t a, avro_datum_t b);

int avro_schema_match(avro_schema_t writers_schema,
                      avro_schema_t readers_schema);

int avro_schema_datum_validate(avro_schema_t
                               expected_schema, avro_datum_t datum);

int avro_read_data(avro_reader_t reader,
                   avro_schema_t writer_schema,
                   avro_schema_t reader_schema, avro_datum_t * datum);
int avro_skip_data(avro_reader_t reader, avro_schema_t writer_schema);
int avro_write_data(avro_writer_t writer,
                    avro_schema_t writer_schema, avro_datum_t datum);
int64_t avro_size_data(avro_writer_t writer,
                       avro_schema_t writer_schema, avro_datum_t datum);

/* File object container */
typedef struct avro_file_reader_t_ *avro_file_reader_t;
typedef struct avro_file_writer_t_ *avro_file_writer_t;

int avro_file_writer_create(const char *path, avro_schema_t schema,
                            avro_file_writer_t * writer);
int avro_file_writer_open(const char *path, avro_file_writer_t * writer);
int avro_file_reader(const char *path, avro_file_reader_t * reader);

int avro_file_writer_append(avro_file_writer_t writer, avro_datum_t datum);
int avro_file_writer_sync(avro_file_writer_t writer);
int avro_file_writer_flush(avro_file_writer_t writer);
int avro_file_writer_close(avro_file_writer_t writer);

int avro_file_reader_read(avro_file_reader_t reader,
                          avro_schema_t readers_schema, avro_datum_t * datum);
int avro_file_reader_close(avro_file_reader_t reader);

CLOSE_EXTERN
#endif

7.2. test_avro_data.c

Another good way to learn how to encode/decode data in Avro C is to look at the test_avro_data.c unit test. This simple unit test checks that all the Avro types can be encoded/decoded correctly.

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to you under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
 * implied.  See the License for the specific language governing
 * permissions and limitations under the License.
 */

#include "avro_private.h"
#include <inttypes.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

char buf[4096];
avro_reader_t reader;
avro_writer_t writer;

typedef int (*avro_test) (void);

void init_rand(void)
{
        srand(time(NULL));
}

double rand_number(double from, double to)
{
        double range = to - from;
        return from + ((double)rand() / (RAND_MAX + 1.0)) * range;
}

int64_t rand_int64(void)
{
        return (int64_t) rand_number(LONG_MIN, LONG_MAX);
}

int32_t rand_int32(void)
{
        return (int32_t) rand_number(INT_MIN, INT_MAX);
}

void
write_read_check(avro_schema_t writers_schema,
                 avro_schema_t readers_schema, avro_datum_t datum, char *type)
{
        avro_datum_t datum_out;
        int validate;

        for (validate = 0; validate <= 1; validate++) {

                reader = avro_reader_memory(buf, sizeof(buf));
                writer = avro_writer_memory(buf, sizeof(buf));

                /* Validating read/write */
                if (avro_write_data
                    (writer, validate ? writers_schema : NULL, datum)) {
                        fprintf(stderr, "Unable to write %s validate=%d\n",
                                type, validate);
                        exit(EXIT_FAILURE);
                }
                int64_t size =
                    avro_size_data(writer, validate ? writers_schema : NULL,
                                   datum);
                if (size != avro_writer_tell(writer)) {
                        fprintf(stderr,
                                "Unable to calculate size %s validate=%d (%"PRId64" != %"PRId64")\n",
                                type, validate, size, avro_writer_tell(writer));
                        exit(EXIT_FAILURE);
                }
                if (avro_read_data
                    (reader, writers_schema, readers_schema, &datum_out)) {
                        fprintf(stderr, "Unable to read %s validate=%d\n", type,
                                validate);
                        exit(EXIT_FAILURE);
                }
                if (!avro_datum_equal(datum, datum_out)) {
                        fprintf(stderr,
                                "Unable to encode/decode %s validate=%d\n",
                                type, validate);
                        exit(EXIT_FAILURE);
                }

                avro_reader_dump(reader, stderr);
                avro_datum_decref(datum_out);
                avro_reader_free(reader);
                avro_writer_free(writer);
        }
}

static int test_string(void)
{
        unsigned int i;
        const char *strings[] = { "Four score and seven years ago",
                "our father brought forth on this continent",
                "a new nation", "conceived in Liberty",
                "and dedicated to the proposition that all men are created equal."
        };
        avro_schema_t writer_schema = avro_schema_string();
        for (i = 0; i < sizeof(strings) / sizeof(strings[0]); i++) {
                avro_datum_t datum = avro_wrapstring(strings[i]);
                write_read_check(writer_schema, NULL, datum, "string");
                avro_datum_decref(datum);
        }
        avro_schema_decref(writer_schema);
        return 0;
}

static int test_bytes(void)
{
        char bytes[] = { 0xDE, 0xAD, 0xBE, 0xEF };
        avro_schema_t writer_schema = avro_schema_bytes();
        avro_datum_t datum = avro_wrapbytes(bytes, sizeof(bytes));

        write_read_check(writer_schema, NULL, datum, "bytes");
        avro_datum_decref(datum);
        avro_schema_decref(writer_schema);
        return 0;
}

static int test_int32(void)
{
        int i;
        avro_schema_t writer_schema = avro_schema_int();
        for (i = 0; i < 100; i++) {
                avro_datum_t datum = avro_int32(rand_int32());
                write_read_check(writer_schema, NULL, datum, "int");
                avro_datum_decref(datum);
        }
        avro_schema_decref(writer_schema);
        return 0;
}

static int test_int64(void)
{
        int i;
        avro_schema_t writer_schema = avro_schema_long();
        for (i = 0; i < 100; i++) {
                avro_datum_t datum = avro_int64(rand_int64());
                write_read_check(writer_schema, NULL, datum, "long");
                avro_datum_decref(datum);
        }
        avro_schema_decref(writer_schema);
        return 0;
}

static int test_double(void)
{
        int i;
        avro_schema_t schema = avro_schema_double();
        for (i = 0; i < 100; i++) {
                avro_datum_t datum = avro_double(rand_number(-1.0E10, 1.0E10));
                write_read_check(schema, NULL, datum, "double");
                avro_datum_decref(datum);
        }
        avro_schema_decref(schema);
        return 0;
}

static int test_float(void)
{
        int i;
        /* Use the float schema and datum so this test covers "float", not "double". */
        avro_schema_t schema = avro_schema_float();
        for (i = 0; i < 100; i++) {
                avro_datum_t datum = avro_float((float) rand_number(-1.0E10, 1.0E10));
                write_read_check(schema, NULL, datum, "float");
                avro_datum_decref(datum);
        }
        avro_schema_decref(schema);
        return 0;
}

static int test_boolean(void)
{
        int i;
        avro_schema_t schema = avro_schema_boolean();
        for (i = 0; i <= 1; i++) {
                avro_datum_t datum = avro_boolean(i);
                write_read_check(schema, NULL, datum, "boolean");
                avro_datum_decref(datum);
        }
        avro_schema_decref(schema);
        return 0;
}

static int test_null(void)
{
        avro_schema_t schema = avro_schema_null();
        avro_datum_t datum = avro_null();
        write_read_check(schema, NULL, datum, "null");
        avro_datum_decref(datum);
        avro_schema_decref(schema);
        return 0;
}

static int test_record(void)
{
        avro_schema_t schema = avro_schema_record("person", NULL);
        avro_datum_t datum = avro_record("person", NULL);
        avro_datum_t name_datum, age_datum;

        avro_schema_record_field_append(schema, "name", avro_schema_string());
        avro_schema_record_field_append(schema, "age", avro_schema_int());

        name_datum = avro_wrapstring("Joseph Campbell");
        age_datum = avro_int32(83);

        avro_record_set(datum, "name", name_datum);
        avro_record_set(datum, "age", age_datum);

        write_read_check(schema, NULL, datum, "record");

        avro_datum_decref(name_datum);
        avro_datum_decref(age_datum);
        avro_datum_decref(datum);
        avro_schema_decref(schema);
        return 0;
}

static int test_enum(void)
{
        enum avro_languages {
                AVRO_C,
                AVRO_CPP,
                AVRO_PYTHON,
                AVRO_RUBY,
                AVRO_JAVA
        };
        avro_schema_t schema = avro_schema_enum("language");
        avro_datum_t datum = avro_enum("language", AVRO_C);

        avro_schema_enum_symbol_append(schema, "C");
        avro_schema_enum_symbol_append(schema, "C++");
        avro_schema_enum_symbol_append(schema, "Python");
        avro_schema_enum_symbol_append(schema, "Ruby");
        avro_schema_enum_symbol_append(schema, "Java");

        write_read_check(schema, NULL, datum, "enum");
        avro_datum_decref(datum);
        avro_schema_decref(schema);
        return 0;
}

static int test_array(void)
{
        int i, rval;
        avro_schema_t schema = avro_schema_array(avro_schema_int());
        avro_datum_t datum = avro_array();

        for (i = 0; i < 10; i++) {
                avro_datum_t i32_datum = avro_int32(i);
                rval = avro_array_append_datum(datum, i32_datum);
                /* Drop our reference; after a successful append the array holds its own. */
                avro_datum_decref(i32_datum);
                if (rval) {
                        exit(EXIT_FAILURE);
                }
        }

        write_read_check(schema, NULL, datum, "array");
        avro_datum_decref(datum);
        avro_schema_decref(schema);
        return 0;
}

static int test_map(void)
{
        avro_schema_t schema = avro_schema_map(avro_schema_long());
        avro_datum_t datum = avro_map();
        int64_t i = 0;
        char *nums[] =
            { "zero", "one", "two", "three", "four", "five", "six", NULL };
        while (nums[i]) {
                avro_datum_t i_datum = avro_int64(i);
                avro_map_set(datum, nums[i], i_datum);
                avro_datum_decref(i_datum);
                i++;
        }
        write_read_check(schema, NULL, datum, "map");
        avro_datum_decref(datum);
        avro_schema_decref(schema);
        return 0;
}

static int test_union(void)
{
        avro_schema_t schema = avro_schema_union();
        avro_datum_t union_datum;
        avro_datum_t datum;

        avro_schema_union_append(schema, avro_schema_string());
        avro_schema_union_append(schema, avro_schema_int());
        avro_schema_union_append(schema, avro_schema_null());

        datum = avro_wrapstring("Follow your bliss.");
        /* Discriminant 0 selects the first branch appended above (string). */
        union_datum = avro_union(0, datum);

        write_read_check(schema, NULL, union_datum, "union");
        avro_datum_decref(union_datum);
        avro_datum_decref(datum);
        avro_schema_decref(schema);
        return 0;
}

static int test_fixed(void)
{
        char bytes[] = { 0xD, 0xA, 0xD, 0xA, 0xB, 0xA, 0xB, 0xA };
        avro_schema_t schema = avro_schema_fixed("msg", sizeof(bytes));
        avro_datum_t datum = avro_wrapfixed("msg", bytes, sizeof(bytes));
        write_read_check(schema, NULL, datum, "fixed");
        avro_datum_decref(datum);
        avro_schema_decref(schema);
        return 0;
}

int main(void)
{
        unsigned int i;
        struct avro_tests {
                char *name;
                avro_test func;
        } tests[] = {
                { "string", test_string },
                { "bytes", test_bytes },
                { "int", test_int32 },
                { "long", test_int64 },
                { "float", test_float },
                { "double", test_double },
                { "boolean", test_boolean },
                { "null", test_null },
                { "record", test_record },
                { "enum", test_enum },
                { "array", test_array },
                { "map", test_map },
                { "fixed", test_fixed },
                { "union", test_union }
        };

        init_rand();
        for (i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) {
                struct avro_tests *test = tests + i;
                fprintf(stderr, "**** Running %s tests ****\n", test->name);
                if (test->func() != 0) {
                        return EXIT_FAILURE;
                }
        }
        return EXIT_SUCCESS;
}
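
The int and long tests above check that a value survives a write followed by a read. In the Avro binary encoding, int and long values are written with variable-length zig-zag coding. The sketch below shows that round trip without linking against libavro. It is an illustration only, not the library's implementation: the sketch_* function names and the 10-byte buffer are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of libavro): zig-zag encode a signed
 * 64-bit value, then write it as a varint, 7 bits per byte, low bits first.
 * Returns the number of bytes written; buf must hold at least 10 bytes. */
static size_t sketch_encode_long(int64_t value, uint8_t *buf)
{
        /* Zig-zag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ... */
        uint64_t n = ((uint64_t) value << 1) ^ (value < 0 ? UINT64_MAX : 0);
        size_t len = 0;

        while (n > 0x7F) {
                /* High bit set means "more bytes follow". */
                buf[len++] = (uint8_t) ((n & 0x7F) | 0x80);
                n >>= 7;
        }
        buf[len++] = (uint8_t) n;
        return len;
}

/* Hypothetical helper: decode what sketch_encode_long() wrote.
 * Returns the number of bytes consumed, or 0 if the input is
 * truncated or longer than a 64-bit value allows. */
static size_t sketch_decode_long(const uint8_t *buf, size_t buf_len,
                                 int64_t *value)
{
        uint64_t n = 0;
        size_t len = 0;
        unsigned shift = 0;

        while (len < buf_len && shift < 64) {
                uint8_t byte = buf[len++];
                n |= (uint64_t) (byte & 0x7F) << shift;
                if ((byte & 0x80) == 0) {
                        /* Undo zig-zag: odd values are negative. */
                        if (n & 1) {
                                *value = -(int64_t) (n >> 1) - 1;
                        } else {
                                *value = (int64_t) (n >> 1);
                        }
                        return len;
                }
                shift += 7;
        }
        return 0;
}

/* Returns 1 if value survives an encode/decode round trip, else 0. */
static int sketch_roundtrip_ok(int64_t value)
{
        uint8_t buf[10];
        int64_t decoded = 0;
        size_t written = sketch_encode_long(value, buf);
        size_t consumed = sketch_decode_long(buf, written, &decoded);

        return written == consumed && decoded == value;
}

/* Checks a few small values against their expected byte sequences. */
static void sketch_self_check(void)
{
        uint8_t buf[10];

        assert(sketch_encode_long(0, buf) == 1 && buf[0] == 0x00);
        assert(sketch_encode_long(-1, buf) == 1 && buf[0] == 0x01);
        assert(sketch_encode_long(64, buf) == 2 && buf[0] == 0x80 && buf[1] == 0x01);
        assert(sketch_roundtrip_ok(INT64_MIN));
        assert(sketch_roundtrip_ok(INT64_MAX));
}
```

Calling sketch_self_check() from a test function, in the same way main() above calls each test_* function, exercises the helpers. Small magnitudes of either sign take a single byte, which is why zig-zag is applied before the varint step.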