org.apache.avro.mapred (Avro 1.4.1 API)

Package org.apache.avro.mapred

Run Hadoop MapReduce jobs over Avro data, with map and reduce functions written in Java.

Class Summary
AvroCollector<T>	A collector for map and reduce output.
AvroInputFormat<T>	An `InputFormat` for Avro data files.
AvroJob	Setters to configure jobs for Avro data.
AvroKey<T>	The wrapper of keys for jobs configured with `AvroJob`.
AvroKeyComparator<T>	The `RawComparator` used by jobs configured with `AvroJob`.
AvroMapper<IN,OUT>	A mapper for Avro data.
AvroOutputFormat<T>	An `OutputFormat` for Avro data files.
AvroRecordReader<T>	A `RecordReader` for Avro data files.
AvroReducer<K,V,OUT>	A reducer for Avro data.
AvroSerialization<T>	The `Serialization` used by jobs configured with `AvroJob`.
AvroUtf8InputFormat	An `InputFormat` for text files.
AvroValue<T>	The wrapper of values for jobs configured with `AvroJob`.
AvroWrapper<T>	The wrapper of data for jobs configured with `AvroJob`.
FsInput	Adapt an `FSDataInputStream` to `SeekableInput`.
Pair<K,V>	A key/value pair.
SequenceFileInputFormat<K,V>	An `InputFormat` for sequence files.
SequenceFileReader<K,V>	A `FileReader` for sequence files.
SequenceFileRecordReader<K,V>	A `RecordReader` for sequence files.

Package org.apache.avro.mapred Description

Run Hadoop MapReduce jobs over Avro data, with map and reduce functions written in Java.

Avro data files do not contain key/value pairs as expected by Hadoop's MapReduce API, but rather just a sequence of values. Thus we provide here a layer on top of Hadoop's MapReduce API which eliminates the key/value distinction.
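One way to picture such a layer is an adapter that carries each datum in the key slot of a key/value callback and leaves the value slot empty, so user code only ever sees the datum. The sketch below models that idea with the Java standard library only. Every type in it (`KeyValueMapper`, `DatumMapper`, `Wrapper`) is a hypothetical stand-in invented for illustration, not an Avro or Hadoop class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, stdlib-only model; none of these types belong to Avro or Hadoop. */
public class ValueOnlyLayerSketch {

    /** Stand-in for a key/value style map callback. */
    public interface KeyValueMapper<K, V, O> {
        void map(K key, V value, List<O> output);
    }

    /** Stand-in for a value-only map callback: one datum in, no key/value split. */
    public interface DatumMapper<I, O> {
        void map(I datum, List<O> output);
    }

    /** Stand-in for a wrapper that carries a datum in the key slot. */
    public static final class Wrapper<T> {
        private final T datum;
        public Wrapper(T datum) { this.datum = datum; }
        public T datum() { return datum; }
    }

    /** Adapts a value-only mapper to the key/value interface; the value slot is ignored. */
    public static <I, O> KeyValueMapper<Wrapper<I>, Void, O> adapt(final DatumMapper<I, O> user) {
        return new KeyValueMapper<Wrapper<I>, Void, O>() {
            @Override
            public void map(Wrapper<I> key, Void ignored, List<O> output) {
                user.map(key.datum(), output);
            }
        };
    }

    /** Feeds a plain sequence of values through the adapted mapper, as a framework would. */
    public static <I, O> List<O> run(List<I> data, DatumMapper<I, O> user) {
        KeyValueMapper<Wrapper<I>, Void, O> mapper = adapt(user);
        List<O> output = new ArrayList<O>();
        for (I datum : data) {
            mapper.map(new Wrapper<I>(datum), null, output);
        }
        return output;
    }

    public static void main(String[] args) {
        // The user-supplied function never mentions keys or values.
        List<Integer> lengths = run(
            Arrays.asList("avro", "hadoop"),
            (String datum, List<Integer> out) -> out.add(datum.length()));
        System.out.println(lengths); // prints [4, 6]
    }
}
```

The user-facing `DatumMapper` has a single input parameter, which is the property the description above attributes to this package's layer.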

To use this for jobs whose input and output are Avro data files:

Call AvroJob.setInputSchema(org.apache.hadoop.mapred.JobConf, org.apache.avro.Schema) and AvroJob.setOutputSchema(org.apache.hadoop.mapred.JobConf, org.apache.avro.Schema) with your job's input and output schemas.
Subclass AvroMapper and specify it as your job's mapper with AvroJob.setMapperClass(org.apache.hadoop.mapred.JobConf, java.lang.Class).
Subclass AvroReducer and specify it as your job's reducer, and perhaps combiner, with AvroJob.setReducerClass(org.apache.hadoop.mapred.JobConf, java.lang.Class) and AvroJob.setCombinerClass(org.apache.hadoop.mapred.JobConf, java.lang.Class).
Specify input files with FileInputFormat.setInputPaths(org.apache.hadoop.mapred.JobConf, java.lang.String).
Specify an output directory with FileOutputFormat.setOutputPath(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.fs.Path).
Run your job with JobClient.runJob(org.apache.hadoop.mapred.JobConf).
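
The steps above can be sketched as a small word-count job. This is a minimal illustration, not a reference implementation. It assumes the Avro and Hadoop jars are on the classpath and that the input is an Avro data file of strings. The class names `AvroWordCountSketch`, `TokenMapper` and `SumReducer` are hypothetical. The `map` and `reduce` signatures and the `Pair` constructor are written as recalled for the 1.4-era API and should be checked against the class pages.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroMapper;
import org.apache.avro.mapred.AvroReducer;
import org.apache.avro.mapred.Pair;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Reporter;

/** Hypothetical example class; the names here are not part of Avro. */
public class AvroWordCountSketch {

    /** Emits a (word, 1) pair for every whitespace-separated token of an input string. */
    public static class TokenMapper extends AvroMapper<Utf8, Pair<Utf8, Long>> {
        @Override
        public void map(Utf8 line, AvroCollector<Pair<Utf8, Long>> collector,
                        Reporter reporter) throws IOException {
            StringTokenizer tokens = new StringTokenizer(line.toString());
            while (tokens.hasMoreTokens()) {
                collector.collect(new Pair<Utf8, Long>(new Utf8(tokens.nextToken()), 1L));
            }
        }
    }

    /** Sums the counts for one word; its output type matches its input, so it can also serve as combiner. */
    public static class SumReducer extends AvroReducer<Utf8, Long, Pair<Utf8, Long>> {
        @Override
        public void reduce(Utf8 word, Iterable<Long> counts,
                           AvroCollector<Pair<Utf8, Long>> collector,
                           Reporter reporter) throws IOException {
            long sum = 0;
            for (long count : counts) {
                sum += count;
            }
            collector.collect(new Pair<Utf8, Long>(word, sum));
        }
    }

    /** Builds the job configuration following the steps listed above, without running it. */
    public static JobConf configure(String inputPaths, String outputDir) {
        JobConf job = new JobConf(AvroWordCountSketch.class);
        job.setJobName("avro-wordcount-sketch");

        // Input and output schemas: strings in, (string, long) pairs out.
        AvroJob.setInputSchema(job, Schema.create(Schema.Type.STRING));
        AvroJob.setOutputSchema(job, new Pair<Utf8, Long>(new Utf8(""), 0L).getSchema());

        // Mapper, combiner and reducer.
        AvroJob.setMapperClass(job, TokenMapper.class);
        AvroJob.setCombinerClass(job, SumReducer.class);
        AvroJob.setReducerClass(job, SumReducer.class);

        // Input files (comma-separated string) and output directory.
        FileInputFormat.setInputPaths(job, inputPaths);
        FileOutputFormat.setOutputPath(job, new Path(outputDir));
        return job;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: input path(s), args[1]: output directory that must not exist yet.
        JobClient.runJob(configure(args[0], args[1]));
    }
}
```

Keeping `configure` separate from `runJob` is a design choice of this sketch: it lets the configuration be inspected or unit-tested without submitting a job.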
