Package org.apache.avro.mapred

Run Hadoop MapReduce jobs over Avro data, with map and reduce functions written in Java.

Class Summary
AvroAsTextInputFormat An InputFormat for Avro data files, which converts each datum to string form in the input key.
AvroCollector<T> A collector for map and reduce output.
AvroInputFormat<T> An InputFormat for Avro data files.
AvroJob Setters to configure jobs for Avro data.
AvroKey<T> The wrapper of keys for jobs configured with AvroJob.
AvroKeyComparator<T> The RawComparator used by jobs configured with AvroJob.
AvroMapper<IN,OUT> A mapper for Avro data.
AvroOutputFormat<T> An OutputFormat for Avro data files.
AvroRecordReader<T> A RecordReader for Avro data files.
AvroReducer<K,V,OUT> A reducer for Avro data.
AvroSerialization<T> The Serialization used by jobs configured with AvroJob.
AvroTextOutputFormat<K,V> The equivalent of TextOutputFormat for writing to Avro Data Files with a "bytes" schema.
AvroUtf8InputFormat An InputFormat for text files.
AvroValue<T> The wrapper of values for jobs configured with AvroJob.
AvroWrapper<T> The wrapper of data for jobs configured with AvroJob.
FsInput Adapt an FSDataInputStream to SeekableInput.
Pair<K,V> A key/value pair.
SequenceFileInputFormat<K,V> An InputFormat for sequence files.
SequenceFileReader<K,V> A FileReader for sequence files.
SequenceFileRecordReader<K,V> A RecordReader for sequence files.
 

Package org.apache.avro.mapred Description

Run Hadoop MapReduce jobs over Avro data, with map and reduce functions written in Java.

Avro data files do not contain the key/value pairs that Hadoop's MapReduce API expects, only a sequence of values. This package therefore provides a layer on top of Hadoop's MapReduce API.
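
The mismatch can be pictured with a small plain-Java model. This is only an illustrative sketch: `ValueSequenceSketch`, `Wrapped`, `Entry`, and `asRecords` are hypothetical names, not classes from this package or from Hadoop. It mirrors the convention used by AvroInputFormat, where each datum travels in the key position (inside an AvroWrapper) and the value position is empty (NullWritable).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical plain-Java model; these are NOT the Avro or Hadoop classes. */
class ValueSequenceSketch {

    /** Stand-in for a wrapper that carries one datum in the key position. */
    static final class Wrapped<T> {
        final T datum;
        Wrapped(T datum) { this.datum = datum; }
    }

    /** Stand-in for a key/value record as a MapReduce framework expects it. */
    static final class Entry<K, V> {
        final K key;
        final V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    /** Presents a bare sequence of values as records: datum in the key, nothing in the value. */
    static <T> List<Entry<Wrapped<T>, Void>> asRecords(List<T> values) {
        List<Entry<Wrapped<T>, Void>> out = new ArrayList<Entry<Wrapped<T>, Void>>();
        for (T v : values) {
            out.add(new Entry<Wrapped<T>, Void>(new Wrapped<T>(v), null));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry<Wrapped<String>, Void>> records = asRecords(Arrays.asList("a", "b", "c"));
        for (Entry<Wrapped<String>, Void> r : records) {
            System.out.println(r.key.datum + " -> " + r.value);
        }
    }
}
```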

In all cases, input and output paths are set and jobs are submitted as with standard Hadoop jobs:
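
The standard Hadoop steps referred to here can be sketched as follows, assuming the old `org.apache.hadoop.mapred` API and Hadoop on the classpath. `SubmitSketch`, `setPaths`, and `submit` are hypothetical helper names introduced for illustration; the job passed in is assumed to be otherwise fully configured.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

/** Hypothetical helper: path setup and submission are the same for Avro and non-Avro jobs. */
class SubmitSketch {

    /** Sets the input file (or directory) and the output directory on the job. */
    static void setPaths(JobConf job, String input, String output) {
        FileInputFormat.setInputPaths(job, new Path(input));
        FileOutputFormat.setOutputPath(job, new Path(output));
    }

    /** Submits the job and blocks until it finishes. */
    static void submit(JobConf job) throws IOException {
        JobClient.runJob(job);
    }
}
```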

For jobs whose input and output are Avro data files:
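
A minimal sketch of this case, assuming Avro and Hadoop on the classpath: a word count whose input is an Avro file of strings and whose output is an Avro file of (word, count) pairs. The class names `AvroToAvroSketch`, `MapImpl`, and `ReduceImpl` are hypothetical, and the choice of word count is only for illustration. The input and output schemas are set through AvroJob, and the mapper and reducer subclass AvroMapper and AvroReducer.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroMapper;
import org.apache.avro.mapred.AvroReducer;
import org.apache.avro.mapred.Pair;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Reporter;

class AvroToAvroSketch {

    /** Emits a (word, 1) pair for every token of each input string. */
    public static class MapImpl extends AvroMapper<Utf8, Pair<Utf8, Long>> {
        @Override
        public void map(Utf8 line, AvroCollector<Pair<Utf8, Long>> collector,
                        Reporter reporter) throws IOException {
            StringTokenizer tokens = new StringTokenizer(line.toString());
            while (tokens.hasMoreTokens()) {
                collector.collect(new Pair<Utf8, Long>(new Utf8(tokens.nextToken()), 1L));
            }
        }
    }

    /** Sums the counts for one word; also usable as the combiner. */
    public static class ReduceImpl extends AvroReducer<Utf8, Long, Pair<Utf8, Long>> {
        @Override
        public void reduce(Utf8 word, Iterable<Long> counts,
                           AvroCollector<Pair<Utf8, Long>> collector,
                           Reporter reporter) throws IOException {
            long sum = 0;
            for (long count : counts) {
                sum += count;
            }
            collector.collect(new Pair<Utf8, Long>(word, sum));
        }
    }

    /** Builds the job configuration; paths and submission are left to the caller. */
    static JobConf configure() {
        JobConf job = new JobConf(AvroToAvroSketch.class);
        Schema string = Schema.create(Schema.Type.STRING);
        Schema pair = Pair.getPairSchema(string, Schema.create(Schema.Type.LONG));

        AvroJob.setInputSchema(job, string);   // schema of the input data file
        AvroJob.setOutputSchema(job, pair);    // schema of the output data file
        AvroJob.setMapperClass(job, MapImpl.class);
        AvroJob.setCombinerClass(job, ReduceImpl.class);
        AvroJob.setReducerClass(job, ReduceImpl.class);
        return job;
    }
}
```

In this sketch no map output schema is set explicitly, on the assumption that AvroJob falls back to the output schema for the intermediate data when both are the same pair schema.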

For jobs whose input is an Avro data file and which use an AvroMapper, but whose reducer is a non-Avro Reducer and whose output is a non-Avro format:
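
A hedged sketch of this case, assuming Avro and Hadoop on the classpath: the mapper is an AvroMapper that emits Pair data, while the reducer is a plain Hadoop Reducer that receives AvroKey and AvroValue and writes text. The names `AvroToTextSketch`, `MapImpl`, and `ReduceImpl` are hypothetical. The call to `AvroJob.setMapOutputSchema` reflects an assumption that the intermediate pair schema must be stated explicitly because no Avro output schema is set in this configuration.

```java
import java.io.IOException;
import java.util.Iterator;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroMapper;
import org.apache.avro.mapred.AvroValue;
import org.apache.avro.mapred.Pair;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextOutputFormat;

class AvroToTextSketch {

    /** Avro-side mapper: emits (string, 1) for every input string. */
    public static class MapImpl extends AvroMapper<Utf8, Pair<Utf8, Long>> {
        @Override
        public void map(Utf8 datum, AvroCollector<Pair<Utf8, Long>> collector,
                        Reporter reporter) throws IOException {
            collector.collect(new Pair<Utf8, Long>(datum, 1L));
        }
    }

    /** Plain Hadoop reducer: unwraps AvroKey/AvroValue and writes Writable output. */
    public static class ReduceImpl extends MapReduceBase
            implements Reducer<AvroKey<Utf8>, AvroValue<Long>, Text, LongWritable> {
        public void reduce(AvroKey<Utf8> key, Iterator<AvroValue<Long>> values,
                           OutputCollector<Text, LongWritable> output,
                           Reporter reporter) throws IOException {
            long sum = 0;
            while (values.hasNext()) {
                sum += values.next().datum();
            }
            output.collect(new Text(key.datum().toString()), new LongWritable(sum));
        }
    }

    static JobConf configure() {
        JobConf job = new JobConf(AvroToTextSketch.class);
        Schema string = Schema.create(Schema.Type.STRING);

        AvroJob.setInputSchema(job, string);
        // Assumption: intermediate data is described by a pair schema set here.
        AvroJob.setMapOutputSchema(job,
                Pair.getPairSchema(string, Schema.create(Schema.Type.LONG)));
        AvroJob.setMapperClass(job, MapImpl.class);

        job.setReducerClass(ReduceImpl.class);       // non-Avro reducer
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        job.setOutputFormat(TextOutputFormat.class); // non-Avro output
        return job;
    }
}
```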

For jobs whose input is a non-Avro data file and which use a non-Avro Mapper, but whose reducer is an AvroReducer and whose output is an Avro data file:
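
A hedged sketch of this case, assuming Avro and Hadoop on the classpath and plain text input: a standard Hadoop Mapper wraps its output in AvroKey and AvroValue so that an AvroReducer can consume it. `TextToAvroSketch`, `MapImpl`, and `ReduceImpl` are hypothetical names, and counting identical lines is only an illustrative task.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroReducer;
import org.apache.avro.mapred.AvroValue;
import org.apache.avro.mapred.Pair;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;

class TextToAvroSketch {

    /** Plain Hadoop mapper: wraps each line as an AvroKey with a count of 1 as AvroValue. */
    public static class MapImpl extends MapReduceBase
            implements Mapper<LongWritable, Text, AvroKey<Utf8>, AvroValue<Long>> {
        public void map(LongWritable offset, Text line,
                        OutputCollector<AvroKey<Utf8>, AvroValue<Long>> output,
                        Reporter reporter) throws IOException {
            output.collect(new AvroKey<Utf8>(new Utf8(line.toString())),
                           new AvroValue<Long>(1L));
        }
    }

    /** Avro-side reducer: sums the counts and emits a (line, count) pair. */
    public static class ReduceImpl extends AvroReducer<Utf8, Long, Pair<Utf8, Long>> {
        @Override
        public void reduce(Utf8 key, Iterable<Long> counts,
                           AvroCollector<Pair<Utf8, Long>> collector,
                           Reporter reporter) throws IOException {
            long sum = 0;
            for (long count : counts) {
                sum += count;
            }
            collector.collect(new Pair<Utf8, Long>(key, sum));
        }
    }

    static JobConf configure() {
        JobConf job = new JobConf(TextToAvroSketch.class);
        Schema pair = Pair.getPairSchema(Schema.create(Schema.Type.STRING),
                                         Schema.create(Schema.Type.LONG));

        job.setInputFormat(TextInputFormat.class);  // non-Avro input
        job.setMapperClass(MapImpl.class);          // non-Avro mapper

        AvroJob.setMapOutputSchema(job, pair);      // schema of the shuffled data
        AvroJob.setReducerClass(job, ReduceImpl.class);
        AvroJob.setOutputSchema(job, pair);         // schema of the output data file
        return job;
    }
}
```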

For jobs whose input is a non-Avro data file and which use a non-Avro Mapper and no reducer, i.e., a map-only job:
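
A hedged sketch of the map-only case, assuming Avro and Hadoop on the classpath and plain text input. With no reducer there is no shuffle, so the mapper writes each datum wrapped in an AvroWrapper as the key with NullWritable as the value, and the number of reduce tasks is set to zero. `MapOnlySketch` and `MapImpl` are hypothetical names.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroWrapper;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;

class MapOnlySketch {

    /** Converts each text line into an Avro string datum written straight to the output. */
    public static class MapImpl extends MapReduceBase
            implements Mapper<LongWritable, Text, AvroWrapper<Utf8>, NullWritable> {
        public void map(LongWritable offset, Text line,
                        OutputCollector<AvroWrapper<Utf8>, NullWritable> output,
                        Reporter reporter) throws IOException {
            output.collect(new AvroWrapper<Utf8>(new Utf8(line.toString())),
                           NullWritable.get());
        }
    }

    static JobConf configure() {
        JobConf job = new JobConf(MapOnlySketch.class);
        job.setInputFormat(TextInputFormat.class);  // non-Avro input
        job.setMapperClass(MapImpl.class);          // non-Avro mapper
        job.setNumReduceTasks(0);                   // map-only job
        AvroJob.setOutputSchema(job, Schema.create(Schema.Type.STRING));
        return job;
    }
}
```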



Copyright © 2011 The Apache Software Foundation. All Rights Reserved.