Package org.apache.avro.mapred

Run Hadoop MapReduce jobs over Avro data, with map and reduce functions written in Java.

Avro data files do not contain the key/value pairs that Hadoop's MapReduce API expects, but rather just a sequence of values. This package therefore provides a layer on top of Hadoop's MapReduce API.

In all cases, input and output paths are set and jobs are submitted as with standard Hadoop jobs:
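The sentence above can be sketched as follows, assuming the classic org.apache.hadoop.mapred API and that the Hadoop jars are on the classpath. The class name SubmitSketch, the job name, and the use of args[0] and args[1] as paths are illustrative choices, not part of this package.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

/** Hypothetical driver: only the path setup and submission common to all cases. */
public class SubmitSketch {

  /** Builds a JobConf with just the input and output paths set. */
  public static JobConf configure(String input, String output) {
    JobConf job = new JobConf(SubmitSketch.class);
    job.setJobName("avro-mapred-sketch"); // illustrative name
    FileInputFormat.setInputPaths(job, new Path(input));
    FileOutputFormat.setOutputPath(job, new Path(output));
    return job;
  }

  public static void main(String[] args) throws IOException {
    JobConf job = configure(args[0], args[1]);
    // Avro-specific settings (schemas, mapper, reducer) would be applied here.
    JobClient.runJob(job); // blocks until the job completes
  }
}
```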

For jobs whose input and output are Avro data files:
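A minimal sketch of this case, assuming a word count over an Avro file of strings that writes (word, count) pairs. The class names AvroWordCount, TokenMapper and SumReducer are hypothetical. The AvroJob, AvroMapper, AvroReducer and Pair calls are the ones this package is understood to provide, so check them against the version in use.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroMapper;
import org.apache.avro.mapred.AvroReducer;
import org.apache.avro.mapred.Pair;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Reporter;

/** Hypothetical example: Avro strings in, Avro (word, count) pairs out. */
public class AvroWordCount {

  /** Schema of the (string, long) pairs emitted by the mapper and the reducer. */
  public static final Schema PAIR_SCHEMA = Pair.getPairSchema(
      Schema.create(Schema.Type.STRING), Schema.create(Schema.Type.LONG));

  /** Emits (token, 1) for every whitespace-separated token of an input string. */
  public static class TokenMapper extends AvroMapper<Utf8, Pair<Utf8, Long>> {
    @Override
    public void map(Utf8 text, AvroCollector<Pair<Utf8, Long>> collector,
                    Reporter reporter) throws IOException {
      StringTokenizer tokens = new StringTokenizer(text.toString());
      while (tokens.hasMoreTokens()) {
        collector.collect(new Pair<Utf8, Long>(new Utf8(tokens.nextToken()), 1L));
      }
    }
  }

  /** Sums the counts for one word; also usable as the combiner. */
  public static class SumReducer extends AvroReducer<Utf8, Long, Pair<Utf8, Long>> {
    @Override
    public void reduce(Utf8 word, Iterable<Long> counts,
                       AvroCollector<Pair<Utf8, Long>> collector,
                       Reporter reporter) throws IOException {
      long sum = 0;
      for (long count : counts) {
        sum += count;
      }
      collector.collect(new Pair<Utf8, Long>(word, sum));
    }
  }

  /** Applies the Avro-specific settings to an existing JobConf. */
  public static void configure(JobConf job) {
    AvroJob.setInputSchema(job, Schema.create(Schema.Type.STRING));
    AvroJob.setOutputSchema(job, PAIR_SCHEMA);
    AvroJob.setMapperClass(job, TokenMapper.class);
    AvroJob.setCombinerClass(job, SumReducer.class);
    AvroJob.setReducerClass(job, SumReducer.class);
  }

  public static void main(String[] args) throws IOException {
    JobConf job = new JobConf(AvroWordCount.class);
    configure(job);
    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    JobClient.runJob(job);
  }
}
```

The mapper's output here has the same pair schema as the job's output, so no separate map output schema is set in this sketch.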

For jobs whose input is an Avro data file and which use an AvroMapper, but whose reducer is a non-Avro Reducer and whose output is a non-Avro format:
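One way this case might look, assuming Avro string input, plain text output, and a standard Hadoop Reducer that receives the mapper's pairs as AvroKey and AvroValue. The class names AvroInTextOut, TokenMapper and SumReducer are hypothetical, and setting the map output schema explicitly with AvroJob.setMapOutputSchema is an assumption of this sketch.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroMapper;
import org.apache.avro.mapred.AvroValue;
import org.apache.avro.mapred.Pair;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextOutputFormat;

/** Hypothetical example: Avro input and AvroMapper, non-Avro reducer and output. */
public class AvroInTextOut {

  /** Emits (token, 1) for every whitespace-separated token of an input string. */
  public static class TokenMapper extends AvroMapper<Utf8, Pair<Utf8, Long>> {
    @Override
    public void map(Utf8 text, AvroCollector<Pair<Utf8, Long>> collector,
                    Reporter reporter) throws IOException {
      StringTokenizer tokens = new StringTokenizer(text.toString());
      while (tokens.hasMoreTokens()) {
        collector.collect(new Pair<Utf8, Long>(new Utf8(tokens.nextToken()), 1L));
      }
    }
  }

  /** Standard Hadoop reducer: input is AvroKey/AvroValue, output is Writables. */
  public static class SumReducer extends MapReduceBase
      implements Reducer<AvroKey<Utf8>, AvroValue<Long>, Text, LongWritable> {
    public void reduce(AvroKey<Utf8> word, Iterator<AvroValue<Long>> counts,
                       OutputCollector<Text, LongWritable> out,
                       Reporter reporter) throws IOException {
      long sum = 0;
      while (counts.hasNext()) {
        sum += counts.next().datum();
      }
      out.collect(new Text(word.datum().toString()), new LongWritable(sum));
    }
  }

  /** Applies the job settings for this case to an existing JobConf. */
  public static void configure(JobConf job) {
    AvroJob.setInputSchema(job, Schema.create(Schema.Type.STRING));
    // Assumption: the shuffle needs the schema of the mapper's output pairs.
    AvroJob.setMapOutputSchema(job, Pair.getPairSchema(
        Schema.create(Schema.Type.STRING), Schema.create(Schema.Type.LONG)));
    AvroJob.setMapperClass(job, TokenMapper.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    job.setOutputFormat(TextOutputFormat.class);
  }

  public static void main(String[] args) throws IOException {
    JobConf job = new JobConf(AvroInTextOut.class);
    configure(job);
    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    JobClient.runJob(job);
  }
}
```

A combiner is left out on purpose: it would have to consume and produce AvroKey and AvroValue, so this reducer class could not double as one.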

For jobs whose input is a non-Avro data file and which use a non-Avro Mapper, but whose reducer is an AvroReducer and whose output is an Avro data file:
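A sketch of this case under stated assumptions: plain text input read with TextInputFormat, a standard Hadoop Mapper that emits AvroKey and AvroValue, and an AvroReducer that writes (word, count) pairs. The class names TextInAvroOut, TokenMapper and SumReducer are hypothetical, and the sketch relies on the map output schema defaulting to the job's output schema when none is set.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroReducer;
import org.apache.avro.mapred.AvroValue;
import org.apache.avro.mapred.Pair;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;

/** Hypothetical example: text input and non-Avro mapper, AvroReducer and Avro output. */
public class TextInAvroOut {

  /** Schema of the (string, long) pairs written to the output file. */
  public static final Schema PAIR_SCHEMA = Pair.getPairSchema(
      Schema.create(Schema.Type.STRING), Schema.create(Schema.Type.LONG));

  /** Standard Hadoop mapper whose output key and value are AvroKey and AvroValue. */
  public static class TokenMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, AvroKey<Utf8>, AvroValue<Long>> {
    public void map(LongWritable offset, Text line,
                    OutputCollector<AvroKey<Utf8>, AvroValue<Long>> out,
                    Reporter reporter) throws IOException {
      StringTokenizer tokens = new StringTokenizer(line.toString());
      while (tokens.hasMoreTokens()) {
        out.collect(new AvroKey<Utf8>(new Utf8(tokens.nextToken())),
                    new AvroValue<Long>(1L));
      }
    }
  }

  /** Sums the counts for one word and emits an Avro pair. */
  public static class SumReducer extends AvroReducer<Utf8, Long, Pair<Utf8, Long>> {
    @Override
    public void reduce(Utf8 word, Iterable<Long> counts,
                       AvroCollector<Pair<Utf8, Long>> collector,
                       Reporter reporter) throws IOException {
      long sum = 0;
      for (long count : counts) {
        sum += count;
      }
      collector.collect(new Pair<Utf8, Long>(word, sum));
    }
  }

  /** Applies the job settings for this case to an existing JobConf. */
  public static void configure(JobConf job) {
    job.setInputFormat(TextInputFormat.class);
    job.setMapperClass(TokenMapper.class);
    AvroJob.setReducerClass(job, SumReducer.class);
    AvroJob.setOutputSchema(job, PAIR_SCHEMA);
  }

  public static void main(String[] args) throws IOException {
    JobConf job = new JobConf(TextInAvroOut.class);
    configure(job);
    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    JobClient.runJob(job);
  }
}
```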

For jobs whose input is a non-Avro data file and which use a non-Avro Mapper and no reducer, i.e., a map-only job:
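A map-only job of this kind might be set up roughly as below, assuming plain text input and an output file of Avro strings. The class names MapOnlyTextToAvro and LineMapper are hypothetical. The sketch assumes that, with zero reduce tasks, the mapper's output goes straight to the Avro output format, so its key is an AvroWrapper and its value a NullWritable.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroWrapper;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;

/** Hypothetical example: copies each text line into an Avro file of strings. */
public class MapOnlyTextToAvro {

  /** Standard Hadoop mapper that wraps each line in an AvroWrapper. */
  public static class LineMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, AvroWrapper<Utf8>, NullWritable> {
    public void map(LongWritable offset, Text line,
                    OutputCollector<AvroWrapper<Utf8>, NullWritable> out,
                    Reporter reporter) throws IOException {
      out.collect(new AvroWrapper<Utf8>(new Utf8(line.toString())),
                  NullWritable.get());
    }
  }

  /** Applies the job settings for this case to an existing JobConf. */
  public static void configure(JobConf job) {
    job.setInputFormat(TextInputFormat.class);
    job.setMapperClass(LineMapper.class);
    job.setNumReduceTasks(0); // map-only: no shuffle, no reducer
    AvroJob.setOutputSchema(job, Schema.create(Schema.Type.STRING));
  }

  public static void main(String[] args) throws IOException {
    JobConf job = new JobConf(MapOnlyTextToAvro.class);
    configure(job);
    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    JobClient.runJob(job);
  }
}
```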

Copyright © 2009-2012 The Apache Software Foundation. All Rights Reserved.