public final class AvroJob extends Object
When using Avro data as MapReduce keys and values, data must be wrapped in a suitable AvroWrapper implementation. MapReduce keys must be wrapped in an AvroKey object, and MapReduce values must be wrapped in an AvroValue object.
Suppose you would like to write a line count mapper that reads from a text file. If, instead of emitting a Text key and an IntWritable value, you would like to emit Avro data with schemas of "string" and "int", respectively, you may parametrize your mapper with the AvroKey<CharSequence> and AvroValue<Integer> types. Then use the setMapOutputKeySchema() and setMapOutputValueSchema() methods to set the writer schemas for the records you will generate.
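The line count scenario above might look like the following minimal sketch. The class names LineCountExample and LineCountMapper, the configure helper, and the job name are hypothetical and chosen for illustration only; the sketch assumes the Avro and Hadoop MapReduce libraries are on the classpath.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroValue;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical example class; not part of the Avro API.
public class LineCountExample {

    /** Emits (line, 1) per input line, wrapped in AvroKey/AvroValue instead of Text/IntWritable. */
    public static class LineCountMapper
            extends Mapper<LongWritable, Text, AvroKey<CharSequence>, AvroValue<Integer>> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            context.write(new AvroKey<CharSequence>(line.toString()),
                          new AvroValue<Integer>(1));
        }
    }

    /** Creates a job and registers the writer schemas for the mapper's output records. */
    public static Job configure(Configuration conf) throws IOException {
        Job job = Job.getInstance(conf, "line count"); // job name is arbitrary
        job.setMapperClass(LineCountMapper.class);
        AvroJob.setMapOutputKeySchema(job, Schema.create(Schema.Type.STRING));
        AvroJob.setMapOutputValueSchema(job, Schema.create(Schema.Type.INT));
        return job;
    }

    public static void main(String[] args) throws IOException {
        Job job = configure(new Configuration());
        // Read the schemas back from the job's own configuration.
        System.out.println(AvroJob.getMapOutputKeySchema(job.getConfiguration()));
        System.out.println(AvroJob.getMapOutputValueSchema(job.getConfiguration()));
    }
}
```

The sketch configures only the map output side; input paths, input format, reducer, and output format are left out and would still be needed for a complete job.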
Modifier and Type | Field and Description |
---|---|
static String | CONF_OUTPUT_CODEC: The configuration key for a job's output compression codec. |
Modifier and Type | Method and Description |
---|---|
static Schema | getInputKeySchema(Configuration conf): Gets the job input key schema. |
static Schema | getInputValueSchema(Configuration conf): Gets the job input value schema. |
static Schema | getMapOutputKeySchema(Configuration conf): Gets the map output key schema. |
static Schema | getMapOutputValueSchema(Configuration conf): Gets the map output value schema. |
static Schema | getOutputKeySchema(Configuration conf): Gets the job output key schema. |
static Schema | getOutputValueSchema(Configuration conf): Gets the job output value schema. |
static void | setDataModelClass(Job job, Class<? extends GenericData> modelClass): Sets the job data model class. |
static void | setInputKeySchema(Job job, Schema schema): Sets the job input key schema. |
static void | setInputValueSchema(Job job, Schema schema): Sets the job input value schema. |
static void | setMapOutputKeySchema(Job job, Schema schema): Sets the map output key schema. |
static void | setMapOutputValueSchema(Job job, Schema schema): Sets the map output value schema. |
static void | setOutputKeySchema(Job job, Schema schema): Sets the job output key schema. |
static void | setOutputValueSchema(Job job, Schema schema): Sets the job output value schema. |
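The setters take a Job while the getters take a Configuration, so a schema stored at job setup time can be read back from the job's configuration. A minimal sketch of that round trip follows, assuming Avro and Hadoop on the classpath; the class name SchemaRoundTrip and its helper method are hypothetical.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Hypothetical example class; not part of the Avro API.
public class SchemaRoundTrip {

    /** Stores the job output schemas, then reads the key schema back from the job's configuration. */
    static Schema configureAndReadBack(Job job) {
        AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.STRING));
        AvroJob.setOutputValueSchema(job, Schema.create(Schema.Type.INT));
        // Use job.getConfiguration(): the Job works on its own copy of the
        // Configuration it was created from.
        return AvroJob.getOutputKeySchema(job.getConfiguration());
    }

    public static void main(String[] args) throws IOException {
        Job job = Job.getInstance(new Configuration());
        System.out.println(configureAndReadBack(job));
    }
}
```

Reading from job.getConfiguration() rather than from the Configuration originally passed to Job.getInstance matters here, because the job copies that object and later changes are made to the copy.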
public static final String CONF_OUTPUT_CODEC
See also: CodecFactory
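A minimal sketch of setting this key on a job follows. It assumes that the value is one of the codec names known to CodecFactory (here the "snappy" name exposed as DataFileConstants.SNAPPY_CODEC) and that the Avro output format in use consults the key only when output compression is enabled on the job, which should be verified against the Avro version in use. The class name OutputCodecExample and its helper are hypothetical.

```java
import java.io.IOException;

import org.apache.avro.file.DataFileConstants;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical example class; not part of the Avro API.
public class OutputCodecExample {

    /** Requests compressed output and names the Avro codec to use for it. */
    static void enableSnappy(Job job) {
        // Assumption: the codec key is honored only when output compression is on.
        FileOutputFormat.setCompressOutput(job, true);
        job.getConfiguration().set(AvroJob.CONF_OUTPUT_CODEC, DataFileConstants.SNAPPY_CODEC);
    }

    public static void main(String[] args) throws IOException {
        Job job = Job.getInstance(new Configuration());
        enableSnappy(job);
        System.out.println(job.getConfiguration().get(AvroJob.CONF_OUTPUT_CODEC));
    }
}
```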
public static void setInputKeySchema(Job job, Schema schema)
job - The job to configure.
schema - The input key schema.

public static void setInputValueSchema(Job job, Schema schema)
job - The job to configure.
schema - The input value schema.

public static void setMapOutputKeySchema(Job job, Schema schema)
job - The job to configure.
schema - The map output key schema.

public static void setMapOutputValueSchema(Job job, Schema schema)
job - The job to configure.
schema - The map output value schema.

public static void setOutputKeySchema(Job job, Schema schema)
job - The job to configure.
schema - The job output key schema.

public static void setOutputValueSchema(Job job, Schema schema)
job - The job to configure.
schema - The job output value schema.

public static void setDataModelClass(Job job, Class<? extends GenericData> modelClass)
job - The job to configure.
modelClass - The job data model class.

public static Schema getInputKeySchema(Configuration conf)
conf - The job configuration.

public static Schema getInputValueSchema(Configuration conf)
conf - The job configuration.

public static Schema getMapOutputKeySchema(Configuration conf)
conf - The job configuration.

public static Schema getMapOutputValueSchema(Configuration conf)
conf - The job configuration.

public static Schema getOutputKeySchema(Configuration conf)
conf - The job configuration.

public static Schema getOutputValueSchema(Configuration conf)
conf - The job configuration.

Copyright © 2009–2022 The Apache Software Foundation. All rights reserved.