Package org.apache.avro.hadoop.io
Class AvroSequenceFile
java.lang.Object
org.apache.avro.hadoop.io.AvroSequenceFile
A wrapper around a Hadoop SequenceFile that also supports reading and writing Avro data.
The vanilla Hadoop SequenceFile contains a header followed by a sequence of records. A record consists of a key and a value. The key and value must either:
- implement the Writable interface, or
- be accepted by a Serialization registered with the SerializationFactory.
Since Avro data are Plain Old Java Objects (e.g., Integer for data with schema "int"), they do not implement Writable. Furthermore, a Serialization implementation cannot determine whether an object instance of type CharSequence that also implements Writable should be serialized using Avro or WritableSerialization.
The solution implemented in AvroSequenceFile is to:
- wrap Avro key data in an AvroKey object,
- wrap Avro value data in an AvroValue object,
- configure and register AvroSerialization with the SerializationFactory, which will accept only objects that are instances of either AvroKey or AvroValue, and
- store the Avro key and value schemas in the SequenceFile header.
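The dispatch problem and its wrapper-based fix can be modeled without any Hadoop or Avro dependency. The sketch below is only an illustration: Writable, Serialization, AvroKey, AvroValue, and WritableText are hypothetical stand-ins defined locally, not the real library types, and the "first registered serialization that accepts the class wins" rule is an assumption about how the factory chooses.

```java
import java.util.Arrays;
import java.util.List;

final class SerializationDispatchSketch {
    /** Hypothetical stand-in for Hadoop's Writable interface. */
    interface Writable {}

    /** Hypothetical stand-in for a Serialization: it decides by class alone. */
    interface Serialization {
        String name();
        boolean accept(Class<?> c);
    }

    /** Stand-in for the AvroKey wrapper. */
    static final class AvroKey<T> {
        final T datum;
        AvroKey(T datum) { this.datum = datum; }
    }

    /** Stand-in for the AvroValue wrapper. */
    static final class AvroValue<T> {
        final T datum;
        AvroValue(T datum) { this.datum = datum; }
    }

    /** The ambiguous case from the text: a CharSequence that is also Writable. */
    static final class WritableText implements Writable, CharSequence {
        private final String s;
        WritableText(String s) { this.s = s; }
        @Override public int length() { return s.length(); }
        @Override public char charAt(int i) { return s.charAt(i); }
        @Override public CharSequence subSequence(int a, int b) { return s.subSequence(a, b); }
        @Override public String toString() { return s; }
    }

    /** Accepts anything that implements the Writable stand-in. */
    static final Serialization WRITABLE = new Serialization() {
        @Override public String name() { return "WritableSerialization"; }
        @Override public boolean accept(Class<?> c) { return Writable.class.isAssignableFrom(c); }
    };

    /** Accepts only the wrapper types, never bare Avro data. */
    static final Serialization AVRO = new Serialization() {
        @Override public String name() { return "AvroSerialization"; }
        @Override public boolean accept(Class<?> c) {
            return AvroKey.class.isAssignableFrom(c) || AvroValue.class.isAssignableFrom(c);
        }
    };

    /** Returns the name of the first registered serialization that accepts c, or null. */
    static String pick(List<Serialization> registered, Class<?> c) {
        for (Serialization s : registered) {
            if (s.accept(c)) {
                return s.name();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Serialization> registered = Arrays.asList(WRITABLE, AVRO);
        System.out.println(pick(registered, Integer.class));       // bare Avro "int" datum
        System.out.println(pick(registered, WritableText.class));  // Writable CharSequence
        System.out.println(pick(registered, AvroKey.class));       // wrapped Avro key
        System.out.println(pick(registered, AvroValue.class));     // wrapped Avro value
    }
}
```

In this model a bare Integer is accepted by nothing, a Writable CharSequence goes unambiguously to the Writable path, and only the wrapper types reach the Avro path, which is the separation the list above describes.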
Nested Class Summary
Modifier and Type | Class | Description
static class | AvroSequenceFile.Reader | A reader for SequenceFiles that may contain Avro data.
static class | AvroSequenceFile.Writer | A writer for an uncompressed SequenceFile that supports Avro data.
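Field Summary
Field | Description
METADATA_FIELD_KEY_SCHEMA | The SequenceFile.Metadata field for the Avro key writer schema.
METADATA_FIELD_VALUE_SCHEMA | The SequenceFile.Metadata field for the Avro value writer schema.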
Method Summary
Modifier and Type | Method | Description
static SequenceFile.Writer | createWriter(AvroSequenceFile.Writer.Options options) | Creates a writer from a set of options.
Field Details
METADATA_FIELD_KEY_SCHEMA
The SequenceFile.Metadata field for the Avro key writer schema.
METADATA_FIELD_VALUE_SCHEMA
The SequenceFile.Metadata field for the Avro value writer schema.
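To illustrate how these two metadata fields let a reader recover the writer schemas from the header, here is a minimal sketch that models the header metadata as a plain map. The key strings, the map representation, and the helper names writeHeader and writerSchema are assumptions made for this example and are not taken from the documentation above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class HeaderSchemaSketch {
    // Assumed key strings; the documentation names the fields but not their values.
    static final String METADATA_FIELD_KEY_SCHEMA = "avro.key.schema";
    static final String METADATA_FIELD_VALUE_SCHEMA = "avro.value.schema";

    /** Writer side: record both writer schemas, as JSON text, in the header metadata. */
    static Map<String, String> writeHeader(String keySchemaJson, String valueSchemaJson) {
        Map<String, String> metadata = new LinkedHashMap<String, String>();
        metadata.put(METADATA_FIELD_KEY_SCHEMA, keySchemaJson);
        metadata.put(METADATA_FIELD_VALUE_SCHEMA, valueSchemaJson);
        return metadata;
    }

    /** Reader side: look up a writer schema; null means the field is absent. */
    static String writerSchema(Map<String, String> metadata, String field) {
        return metadata.get(field);
    }

    public static void main(String[] args) {
        // Avro's JSON form of a primitive schema is the quoted type name.
        Map<String, String> header = writeHeader("\"string\"", "\"int\"");
        System.out.println(writerSchema(header, METADATA_FIELD_KEY_SCHEMA));
        System.out.println(writerSchema(header, METADATA_FIELD_VALUE_SCHEMA));
    }
}
```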
Method Details
createWriter
public static SequenceFile.Writer createWriter(AvroSequenceFile.Writer.Options options) throws IOException
Creates a writer from a set of options.
Since there are different implementations of Writer depending on the compression type, this method constructs the subclass that matches the compression type given in the options.
Parameters:
options - The options for the writer.
Returns:
A new writer instance.
Throws:
IOException - If the writer cannot be created.
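The factory behavior described for createWriter can be sketched as a switch over the compression type. This is a self-contained model, not the library's code: the Writer classes, the Options builder, and the describe method are hypothetical stand-ins, and defaulting to NONE is an assumption of this sketch.

```java
final class WriterFactorySketch {
    /** Compression modes, modeled after the three SequenceFile variants. */
    enum CompressionType { NONE, RECORD, BLOCK }

    /** Stand-in for the uncompressed writer. */
    static class Writer {
        String describe() { return "uncompressed"; }
    }

    /** Stand-in for a writer that compresses each record's value. */
    static class RecordCompressWriter extends Writer {
        @Override String describe() { return "record-compressed"; }
    }

    /** Stand-in for a writer that compresses blocks of records. */
    static class BlockCompressWriter extends Writer {
        @Override String describe() { return "block-compressed"; }
    }

    /** Hypothetical fluent options holder carrying only the compression type. */
    static final class Options {
        private CompressionType compressionType = CompressionType.NONE; // assumed default

        Options withCompressionType(CompressionType type) {
            this.compressionType = type;
            return this;
        }

        CompressionType getCompressionType() {
            return compressionType;
        }
    }

    /** Picks the Writer subclass that matches the compression type in the options. */
    static Writer createWriter(Options options) {
        switch (options.getCompressionType()) {
            case RECORD:
                return new RecordCompressWriter();
            case BLOCK:
                return new BlockCompressWriter();
            default:
                return new Writer();
        }
    }

    public static void main(String[] args) {
        Options options = new Options().withCompressionType(CompressionType.BLOCK);
        System.out.println(createWriter(options).describe());
    }
}
```

Callers depend only on the base Writer type, so the choice of compression stays a configuration detail rather than something that changes the calling code.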