Package org.apache.avro.mapreduce
Class AvroJob
java.lang.Object
org.apache.avro.mapreduce.AvroJob
Utility methods for configuring jobs that work with Avro.
When using Avro data as MapReduce keys and values, data must be wrapped in a suitable AvroWrapper implementation. MapReduce keys must be wrapped in an AvroKey object, and MapReduce values must be wrapped in an AvroValue object.
Suppose you would like to write a line count mapper that reads from a text file. If, instead of using Text as the output key and IntWritable as the output value, you would like to use Avro data with schemas of "string" and "int", respectively, you may parameterize your mapper with AvroKey<CharSequence> and AvroValue<Integer> types. Then use the setMapOutputKeySchema() and setMapOutputValueSchema() methods to set the writer schemas for the records you will generate.
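The line count mapper described above might look roughly like the following. This is a minimal sketch, not the library's reference example: the class names LineCountExample and LineCountMapper, the configure helper, and the job name are hypothetical, and it assumes the mapper reads its input through a text input format that supplies LongWritable offsets and Text lines.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroValue;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical container class for this sketch.
public class LineCountExample {

  // Emits (line, 1) for every input line, wrapped in AvroKey / AvroValue.
  public static class LineCountMapper
      extends Mapper<LongWritable, Text, AvroKey<CharSequence>, AvroValue<Integer>> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      context.write(new AvroKey<CharSequence>(line.toString()), new AvroValue<Integer>(1));
    }
  }

  // Hypothetical helper: registers the mapper and its Avro writer schemas.
  public static Job configure(Configuration conf) throws IOException {
    Job job = Job.getInstance(conf, "line-count");
    job.setMapperClass(LineCountMapper.class);
    // The "string" and "int" schemas mentioned in the text above.
    AvroJob.setMapOutputKeySchema(job, Schema.create(Schema.Type.STRING));
    AvroJob.setMapOutputValueSchema(job, Schema.create(Schema.Type.INT));
    return job;
  }
}
```

The schemas can be read back from the job's configuration with getMapOutputKeySchema() and getMapOutputValueSchema().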
-
Field Summary
Modifier and Type | Field | Description
static final String | CONF_OUTPUT_CODEC | The configuration key for a job's output compression codec.
-
Method Summary
Modifier and Type | Method | Description
static Schema | getInputKeySchema(Configuration conf) | Gets the job input key schema.
static Schema | getInputValueSchema(Configuration conf) | Gets the job input value schema.
static Schema | getMapOutputKeySchema(Configuration conf) | Gets the map output key schema.
static Schema | getMapOutputValueSchema(Configuration conf) | Gets the map output value schema.
static Schema | getOutputKeySchema(Configuration conf) | Gets the job output key schema.
static Schema | getOutputValueSchema(Configuration conf) | Gets the job output value schema.
static void | setDataModelClass(Job job, Class<? extends GenericData> modelClass) | Sets the job data model class.
static void | setInputKeySchema(Job job, Schema schema) | Sets the job input key schema.
static void | setInputValueSchema(Job job, Schema schema) | Sets the job input value schema.
static void | setMapOutputKeySchema(Job job, Schema schema) | Sets the map output key schema.
static void | setMapOutputValueSchema(Job job, Schema schema) | Sets the map output value schema.
static void | setOutputKeySchema(Job job, Schema schema) | Sets the job output key schema.
static void | setOutputValueSchema(Job job, Schema schema) | Sets the job output value schema.
-
Field Details
-
CONF_OUTPUT_CODEC
The configuration key for a job's output compression codec. This takes one of the strings registered in CodecFactory.
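As an illustration, the codec key can be set directly on the job's configuration. The sketch below is written under assumptions: the class name OutputCodecExample, the withDeflate helper, and the job name are hypothetical, and the call to FileOutputFormat.setCompressOutput reflects the assumption that the Avro output formats only consult the codec key when output compression is enabled, which may vary by version.

```java
import java.io.IOException;

import org.apache.avro.file.DataFileConstants;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical container class for this sketch.
public class OutputCodecExample {

  // Hypothetical helper: requests the "deflate" codec for the job's Avro output.
  public static Job withDeflate(Configuration conf) throws IOException {
    Job job = Job.getInstance(conf, "codec-example");
    // Assumption: compression must be switched on for the codec key to take effect.
    FileOutputFormat.setCompressOutput(job, true);
    // DEFLATE_CODEC is the string "deflate", one of the names CodecFactory recognizes.
    job.getConfiguration().set(AvroJob.CONF_OUTPUT_CODEC, DataFileConstants.DEFLATE_CODEC);
    return job;
  }
}
```

The key must be set on job.getConfiguration() rather than on the Configuration passed to Job.getInstance, because the job works on its own copy.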
-
Method Details
-
setInputKeySchema
Sets the job input key schema.
Parameters:
job - The job to configure.
schema - The input key schema.
-
setInputValueSchema
Sets the job input value schema.
Parameters:
job - The job to configure.
schema - The input value schema.
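A job that reads Avro container files might set its input key schema as sketched below. The class name InputSchemaExample, the configure helper, and the job name are hypothetical. The sketch assumes the input is read with AvroKeyInputFormat, which is understood to treat the input key schema as the reader schema for each record.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.avro.mapreduce.AvroKeyInputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Hypothetical container class for this sketch.
public class InputSchemaExample {

  // Hypothetical helper: reads Avro container files using the given reader schema.
  public static Job configure(Configuration conf, Schema readerSchema) throws IOException {
    Job job = Job.getInstance(conf, "input-schema-example");
    // Records arrive in the mapper as AvroKey<T> keys with NullWritable values.
    job.setInputFormatClass(AvroKeyInputFormat.class);
    // Stores the schema in the job configuration under the input key schema setting.
    AvroJob.setInputKeySchema(job, readerSchema);
    return job;
  }
}
```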
-
setMapOutputKeySchema
Sets the map output key schema.
Parameters:
job - The job to configure.
schema - The map output key schema.
-
setMapOutputValueSchema
Sets the map output value schema.
Parameters:
job - The job to configure.
schema - The map output value schema.
-
setOutputKeySchema
Sets the job output key schema.
Parameters:
job - The job to configure.
schema - The job output key schema.
-
setOutputValueSchema
Sets the job output value schema.
Parameters:
job - The job to configure.
schema - The job output value schema.
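The two output schema setters are typically used together on the reduce side. The following sketch sums per-line counts and writes them as Avro key/value pairs. The class names LineCountOutputExample and SumReducer, the configure helper, and the job name are hypothetical, and the choice of AvroKeyValueOutputFormat is an assumption about how the job writes its results.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroValue;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.avro.mapreduce.AvroKeyValueOutputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical container class for this sketch.
public class LineCountOutputExample {

  // Sums the counts for each distinct line.
  public static class SumReducer extends
      Reducer<AvroKey<CharSequence>, AvroValue<Integer>, AvroKey<CharSequence>, AvroValue<Integer>> {
    @Override
    protected void reduce(AvroKey<CharSequence> line, Iterable<AvroValue<Integer>> counts,
        Context context) throws IOException, InterruptedException {
      int sum = 0;
      for (AvroValue<Integer> count : counts) {
        sum += count.datum(); // unwrap the Avro datum
      }
      context.write(line, new AvroValue<Integer>(sum));
    }
  }

  // Hypothetical helper: registers the reducer and the job output schemas.
  public static Job configure(Configuration conf) throws IOException {
    Job job = Job.getInstance(conf, "line-count-output");
    job.setReducerClass(SumReducer.class);
    job.setOutputFormatClass(AvroKeyValueOutputFormat.class);
    AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.STRING));
    AvroJob.setOutputValueSchema(job, Schema.create(Schema.Type.INT));
    return job;
  }
}
```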
-
setDataModelClass
Sets the job data model class.
Parameters:
job - The job to configure.
modelClass - The job data model class.
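For instance, a job that should handle records as generic data rather than through generated or reflected classes could select GenericData as its data model. This is a sketch under assumptions: the class name DataModelExample, the useGenericData helper, and the job name are hypothetical.

```java
import java.io.IOException;

import org.apache.avro.generic.GenericData;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Hypothetical container class for this sketch.
public class DataModelExample {

  // Hypothetical helper: asks the job to use the generic data model.
  public static Job useGenericData(Configuration conf) throws IOException {
    Job job = Job.getInstance(conf, "data-model-example");
    // Any subclass of GenericData is accepted, e.g. SpecificData or ReflectData.
    AvroJob.setDataModelClass(job, GenericData.class);
    return job;
  }
}
```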
-
getInputKeySchema
Gets the job input key schema.
Parameters:
conf - The job configuration.
Returns:
The job input key schema, or null if not set.
-
getInputValueSchema
Gets the job input value schema.
Parameters:
conf - The job configuration.
Returns:
The job input value schema, or null if not set.
-
getMapOutputKeySchema
Gets the map output key schema.
Parameters:
conf - The job configuration.
Returns:
The map output key schema, or null if not set.
-
getMapOutputValueSchema
Gets the map output value schema.
Parameters:
conf - The job configuration.
Returns:
The map output value schema, or null if not set.
-
getOutputKeySchema
Gets the job output key schema.
Parameters:
conf - The job configuration.
Returns:
The job output key schema, or null if not set.
-
getOutputValueSchema
Gets the job output value schema.
Parameters:
conf - The job configuration.
Returns:
The job output value schema, or null if not set.
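Because each getter returns null when the corresponding schema has not been set, callers generally need a null check. A minimal sketch, in which the class name SchemaLookupExample and the describe helper are hypothetical:

```java
import org.apache.avro.Schema;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.hadoop.conf.Configuration;

// Hypothetical container class for this sketch.
public class SchemaLookupExample {

  // Returns the Avro type name of the job output value schema, or "unset" if none was configured.
  public static String describe(Configuration conf) {
    Schema schema = AvroJob.getOutputValueSchema(conf);
    return schema == null ? "unset" : schema.getType().getName();
  }
}
```

Inside a task, the configuration would usually come from context.getConfiguration(). On the client side, use job.getConfiguration() so that values written by the setters are visible.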
-