public final class AvroJob extends Object
When using Avro data as MapReduce keys and values, data must be wrapped in a suitable AvroWrapper implementation. MapReduce keys must be wrapped in an AvroKey object, and MapReduce values must be wrapped in an AvroValue object.
Suppose you would like to write a line count mapper that reads from a text file. If, instead of a Text output key and an IntWritable output value, you would like to use Avro data with schemas of "string" and "int", respectively, you may parameterize your mapper with the AvroKey<CharSequence> and AvroValue<Integer> types. Then, use the setMapOutputKeySchema() and setMapOutputValueSchema() methods to set the writer schemas for the records you will generate.
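
The mapper described above could be sketched as follows, assuming the Avro MapReduce and Hadoop libraries are on the classpath. The names LineCountExample, LineCountMapper, and configure are hypothetical and chosen only for illustration; AvroKey and AvroValue are taken from the org.apache.avro.mapred package.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroValue;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical holder class for this sketch.
public class LineCountExample {

  /** Emits (line, 1) for every input line, wrapping both in Avro wrappers. */
  public static class LineCountMapper
      extends Mapper<LongWritable, Text, AvroKey<CharSequence>, AvroValue<Integer>> {

    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      // The key is the line text ("string" schema); the value is a count of 1 ("int" schema).
      context.write(new AvroKey<CharSequence>(line.toString()), new AvroValue<Integer>(1));
    }
  }

  /** Registers the mapper and the writer schemas for its output on the given job. */
  public static void configure(Job job) {
    job.setMapperClass(LineCountMapper.class);
    AvroJob.setMapOutputKeySchema(job, Schema.create(Schema.Type.STRING));
    AvroJob.setMapOutputValueSchema(job, Schema.create(Schema.Type.INT));
  }
}
```

A driver would call LineCountExample.configure(job) before submitting the job; input paths, the reducer, and the output format are left out of this sketch.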
Modifier and Type | Field and Description |
---|---|
static String | CONF_OUTPUT_CODEC: The configuration key for a job's output compression codec. |
Modifier and Type | Method and Description |
---|---|
static Schema | getInputKeySchema(org.apache.hadoop.conf.Configuration conf): Gets the job input key schema. |
static Schema | getInputValueSchema(org.apache.hadoop.conf.Configuration conf): Gets the job input value schema. |
static Schema | getMapOutputKeySchema(org.apache.hadoop.conf.Configuration conf): Gets the map output key schema. |
static Schema | getMapOutputValueSchema(org.apache.hadoop.conf.Configuration conf): Gets the map output value schema. |
static Schema | getOutputKeySchema(org.apache.hadoop.conf.Configuration conf): Gets the job output key schema. |
static Schema | getOutputValueSchema(org.apache.hadoop.conf.Configuration conf): Gets the job output value schema. |
static void | setInputKeySchema(org.apache.hadoop.mapreduce.Job job, Schema schema): Sets the job input key schema. |
static void | setInputValueSchema(org.apache.hadoop.mapreduce.Job job, Schema schema): Sets the job input value schema. |
static void | setMapOutputKeySchema(org.apache.hadoop.mapreduce.Job job, Schema schema): Sets the map output key schema. |
static void | setMapOutputValueSchema(org.apache.hadoop.mapreduce.Job job, Schema schema): Sets the map output value schema. |
static void | setOutputKeySchema(org.apache.hadoop.mapreduce.Job job, Schema schema): Sets the job output key schema. |
static void | setOutputValueSchema(org.apache.hadoop.mapreduce.Job job, Schema schema): Sets the job output value schema. |
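
The setters take the Job being configured, while the getters read from a Configuration, such as the one returned by job.getConfiguration(). A minimal sketch of that round trip, assuming a job that reads Avro keys and writes Avro key/value pairs; the class SchemaSetup and its method names are hypothetical, and the choice of AvroKeyInputFormat and AvroKeyValueOutputFormat is an assumption made for this example only.

```java
import org.apache.avro.Schema;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.avro.mapreduce.AvroKeyInputFormat;
import org.apache.avro.mapreduce.AvroKeyValueOutputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Hypothetical helper class for this sketch.
public class SchemaSetup {

  /** Declares the input and output schemas on the job. */
  public static void configure(Job job) {
    Schema stringSchema = Schema.create(Schema.Type.STRING);
    Schema intSchema = Schema.create(Schema.Type.INT);

    // Input: Avro container files whose records are read as keys.
    job.setInputFormatClass(AvroKeyInputFormat.class);
    AvroJob.setInputKeySchema(job, stringSchema);

    // Output: Avro key/value pairs.
    job.setOutputFormatClass(AvroKeyValueOutputFormat.class);
    AvroJob.setOutputKeySchema(job, stringSchema);
    AvroJob.setOutputValueSchema(job, intSchema);
  }

  /** Reads a schema back from a configuration, as a task might do at run time. */
  public static Schema outputValueSchema(Configuration conf) {
    return AvroJob.getOutputValueSchema(conf);
  }
}
```

After SchemaSetup.configure(job), calling SchemaSetup.outputValueSchema(job.getConfiguration()) returns the "int" schema that was set.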
public static final String CONF_OUTPUT_CODEC
See also: CodecFactory
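
Because CONF_OUTPUT_CODEC is a configuration key, the codec is chosen by storing a codec name under that key in the job's Configuration. A minimal sketch, assuming the name comes from Avro's DataFileConstants (here DEFLATE_CODEC) and assuming the output format only consults the codec when Hadoop output compression is enabled; the class CodecSetup and the method useDeflate are hypothetical.

```java
import org.apache.avro.file.DataFileConstants;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical helper class for this sketch.
public class CodecSetup {

  /** Requests deflate-compressed Avro output for the given job. */
  public static void useDeflate(Job job) {
    // Assumption: compression must be switched on for the codec setting to take effect.
    FileOutputFormat.setCompressOutput(job, true);
    // Store the codec name under the key defined by AvroJob.
    job.getConfiguration().set(AvroJob.CONF_OUTPUT_CODEC, DataFileConstants.DEFLATE_CODEC);
  }
}
```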
public static void setInputKeySchema(org.apache.hadoop.mapreduce.Job job, Schema schema)
job - The job to configure.
schema - The input key schema.

public static void setInputValueSchema(org.apache.hadoop.mapreduce.Job job, Schema schema)
job - The job to configure.
schema - The input value schema.

public static void setMapOutputKeySchema(org.apache.hadoop.mapreduce.Job job, Schema schema)
job - The job to configure.
schema - The map output key schema.

public static void setMapOutputValueSchema(org.apache.hadoop.mapreduce.Job job, Schema schema)
job - The job to configure.
schema - The map output value schema.

public static void setOutputKeySchema(org.apache.hadoop.mapreduce.Job job, Schema schema)
job - The job to configure.
schema - The job output key schema.

public static void setOutputValueSchema(org.apache.hadoop.mapreduce.Job job, Schema schema)
job - The job to configure.
schema - The job output value schema.

public static Schema getInputKeySchema(org.apache.hadoop.conf.Configuration conf)
conf - The job configuration.

public static Schema getInputValueSchema(org.apache.hadoop.conf.Configuration conf)
conf - The job configuration.

public static Schema getMapOutputKeySchema(org.apache.hadoop.conf.Configuration conf)
conf - The job configuration.

public static Schema getMapOutputValueSchema(org.apache.hadoop.conf.Configuration conf)
conf - The job configuration.

public static Schema getOutputKeySchema(org.apache.hadoop.conf.Configuration conf)
conf - The job configuration.

public static Schema getOutputValueSchema(org.apache.hadoop.conf.Configuration conf)
conf - The job configuration.

Copyright © 2009-2013 The Apache Software Foundation. All Rights Reserved.