public final class AvroJob extends Object
When using Avro data as MapReduce keys and values, data must be wrapped in a suitable AvroWrapper implementation. MapReduce keys must be wrapped in an AvroKey object, and MapReduce values must be wrapped in an AvroValue object.
Suppose you would like to write a line count mapper that reads from a text file. If, instead of using a Text output key and an IntWritable output value, you would like to use Avro data with schemas of "string" and "int", respectively, you may parameterize your mapper with the AvroKey<CharSequence> and AvroValue<Integer> types. Then use the setMapOutputKeySchema() and setMapOutputValueSchema() methods to set the writer schemas for the records you will generate.
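
The line count scenario above can be sketched as follows. This is a minimal illustration, not the library's own sample: the class names LineCountExample and LineCountMapper and the helper method configure are hypothetical, and the code assumes Hadoop 2 or later (for Job.getInstance) with the default text input format supplying LongWritable offsets and Text lines.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroValue;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical example class; the names are illustrative only.
public class LineCountExample {

  // Emits each input line as an Avro "string" key with an Avro "int" count of 1.
  public static class LineCountMapper
      extends Mapper<LongWritable, Text, AvroKey<CharSequence>, AvroValue<Integer>> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      context.write(new AvroKey<CharSequence>(line.toString()),
                    new AvroValue<Integer>(1));
    }
  }

  // Builds a job whose map output schemas match the mapper's type parameters.
  public static Job configure(Configuration conf) throws IOException {
    Job job = Job.getInstance(conf, "line count"); // assumes Hadoop 2+
    job.setMapperClass(LineCountMapper.class);
    AvroJob.setMapOutputKeySchema(job, Schema.create(Schema.Type.STRING));
    AvroJob.setMapOutputValueSchema(job, Schema.create(Schema.Type.INT));
    return job;
  }
}
```

A driver would still need to set input and output paths, a reducer, and the job output schemas before submitting the job.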
Modifier and Type | Field and Description |
---|---|
static String | CONF_OUTPUT_CODEC: The configuration key for a job's output compression codec. |
Modifier and Type | Method and Description |
---|---|
static Schema | getInputKeySchema(Configuration conf): Gets the job input key schema. |
static Schema | getInputValueSchema(Configuration conf): Gets the job input value schema. |
static Schema | getMapOutputKeySchema(Configuration conf): Gets the map output key schema. |
static Schema | getMapOutputValueSchema(Configuration conf): Gets the map output value schema. |
static Schema | getOutputKeySchema(Configuration conf): Gets the job output key schema. |
static Schema | getOutputValueSchema(Configuration conf): Gets the job output value schema. |
static void | setDataModelClass(Job job, Class<? extends GenericData> modelClass): Sets the job data model class. |
static void | setInputKeySchema(Job job, Schema schema): Sets the job input key schema. |
static void | setInputValueSchema(Job job, Schema schema): Sets the job input value schema. |
static void | setMapOutputKeySchema(Job job, Schema schema): Sets the map output key schema. |
static void | setMapOutputValueSchema(Job job, Schema schema): Sets the map output value schema. |
static void | setOutputKeySchema(Job job, Schema schema): Sets the job output key schema. |
static void | setOutputValueSchema(Job job, Schema schema): Sets the job output value schema. |
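
The setters take a Job while the getters take a Configuration, so a schema set at job setup time can be read back from the job's configuration (for example, from a task's context). A minimal sketch of that pairing, assuming Hadoop 2 or later for Job.getInstance; the class name SchemaRoundTripExample and the method configureAndRead are hypothetical:

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Hypothetical example class; the names are illustrative only.
public class SchemaRoundTripExample {

  // Sets the job output key schema, then reads it back from the configuration.
  public static Schema configureAndRead() throws IOException {
    Job job = Job.getInstance(new Configuration()); // assumes Hadoop 2+
    AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.STRING));

    // Inside a mapper or reducer, the same lookup would typically use
    // context.getConfiguration() instead of job.getConfiguration().
    return AvroJob.getOutputKeySchema(job.getConfiguration());
  }
}
```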
public static final String CONF_OUTPUT_CODEC
The configuration key for a job's output compression codec.
See also: CodecFactory
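
As a rough sketch of how this key might be used: the example below stores a codec name under CONF_OUTPUT_CODEC in the job configuration. It assumes the value is a codec name that CodecFactory recognizes (here "deflate", taken from DataFileConstants.DEFLATE_CODEC), and it also assumes that the codec is only consulted when output compression is enabled on the job; verify both points against your Avro version. The class name OutputCodecExample and the method enableDeflate are hypothetical.

```java
import org.apache.avro.file.DataFileConstants;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical example class; the names are illustrative only.
public class OutputCodecExample {

  // Requests deflate-compressed Avro output for the given job.
  public static void enableDeflate(Job job) {
    // Assumption: compression must be switched on for the codec key to matter.
    FileOutputFormat.setCompressOutput(job, true);
    // Store the codec name ("deflate") under the AvroJob configuration key.
    job.getConfiguration().set(AvroJob.CONF_OUTPUT_CODEC,
                               DataFileConstants.DEFLATE_CODEC);
  }
}
```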
public static void setInputKeySchema(Job job, Schema schema)
job - The job to configure.
schema - The input key schema.

public static void setInputValueSchema(Job job, Schema schema)
job - The job to configure.
schema - The input value schema.

public static void setMapOutputKeySchema(Job job, Schema schema)
job - The job to configure.
schema - The map output key schema.

public static void setMapOutputValueSchema(Job job, Schema schema)
job - The job to configure.
schema - The map output value schema.

public static void setOutputKeySchema(Job job, Schema schema)
job - The job to configure.
schema - The job output key schema.

public static void setOutputValueSchema(Job job, Schema schema)
job - The job to configure.
schema - The job output value schema.

public static void setDataModelClass(Job job, Class<? extends GenericData> modelClass)
job - The job to configure.
modelClass - The job data model class.

public static Schema getInputKeySchema(Configuration conf)
conf - The job configuration.

public static Schema getInputValueSchema(Configuration conf)
conf - The job configuration.

public static Schema getMapOutputKeySchema(Configuration conf)
conf - The job configuration.

public static Schema getMapOutputValueSchema(Configuration conf)
conf - The job configuration.

public static Schema getOutputKeySchema(Configuration conf)
conf - The job configuration.

public static Schema getOutputValueSchema(Configuration conf)
conf - The job configuration.

Copyright © 2009-2014 The Apache Software Foundation. All Rights Reserved.