Class AvroKeyValueRecordWriter<K,V>

java.lang.Object
org.apache.hadoop.mapreduce.RecordWriter<K,V>
org.apache.avro.mapreduce.AvroKeyValueRecordWriter<K,V>
Type Parameters:
K - The type of key to write.
V - The type of value to write.
All Implemented Interfaces:
Syncable

public class AvroKeyValueRecordWriter<K,V> extends RecordWriter<K,V> implements Syncable
Writes key/value pairs to an Avro container file.

Each entry in the Avro container file will be a generic record with two fields, named 'key' and 'value'. The input types may be basic Writable objects like Text or IntWritable, or they may be AvroWrapper subclasses (AvroKey or AvroValue). Writable objects will be converted to their corresponding Avro types when written to the generic record key/value pair.

  • Constructor Details

    • AvroKeyValueRecordWriter

      public AvroKeyValueRecordWriter(AvroDatumConverter<K,?> keyConverter, AvroDatumConverter<V,?> valueConverter, GenericData dataModel, CodecFactory compressionCodec, OutputStream outputStream, int syncInterval) throws IOException
      Constructor.
      Parameters:
      keyConverter - A key to Avro datum converter.
      valueConverter - A value to Avro datum converter.
      dataModel - The data model for key and value.
      compressionCodec - A compression codec factory for the Avro container file.
      outputStream - The output stream to write the Avro container file to.
      syncInterval - The sync interval for the Avro container file.
      Throws:
      IOException - If the record writer cannot be opened.
    • AvroKeyValueRecordWriter

      public AvroKeyValueRecordWriter(AvroDatumConverter<K,?> keyConverter, AvroDatumConverter<V,?> valueConverter, GenericData dataModel, CodecFactory compressionCodec, OutputStream outputStream) throws IOException
      Constructor.
      Parameters:
      keyConverter - A key to Avro datum converter.
      valueConverter - A value to Avro datum converter.
      dataModel - The data model for key and value.
      compressionCodec - A compression codec factory for the Avro container file.
      outputStream - The output stream to write the Avro container file to.
      Throws:
      IOException - If the record writer cannot be opened.
  • Method Details

    • getWriterSchema

      public Schema getWriterSchema()
      Gets the writer schema for the key/value pair generic record.
      Returns:
      The writer schema used for entries of the Avro container file.
    • write

      public void write(K key, V value) throws IOException
      Specified by:
      write in class RecordWriter<K,V>
      Throws:
      IOException
    • close

      public void close(TaskAttemptContext context) throws IOException
      Specified by:
      close in class RecordWriter<K,V>
      Throws:
      IOException
    • sync

      public long sync() throws IOException
      Return the current position as a value that may be passed to DataFileReader.seek(long). Forces the end of the current block, emitting a synchronization marker.
      Specified by:
      sync in interface Syncable
      Throws:
      IOException - - if an error occurred while attempting to sync.