public class AvroMultipleInputs extends Object
This class supports Avro-MapReduce jobs that have multiple input paths with a different Schema and AvroMapper for each path.
Usage:
Case 1: (ReflectData based inputs)
// Enable ReflectData usage across the job.
AvroJob.setReflect(job);
Schema type1Schema = ReflectData.get().getSchema(Type1Record.class);
// The mapper class comes before the schema, matching the addInputPath signature.
AvroMultipleInputs.addInputPath(job, inputPath1, Type1AvroMapper.class, type1Schema);
Where Type1AvroMapper would be implemented as
class Type1AvroMapper extends AvroMapper<Type1Record, Pair<ComparingKeyRecord, CommonValueRecord>>
Schema type2Schema = ReflectData.get().getSchema(Type2Record.class);
// The mapper class comes before the schema, matching the addInputPath signature.
AvroMultipleInputs.addInputPath(job, inputPath2, Type2AvroMapper.class, type2Schema);
Where Type2AvroMapper would be implemented as
class Type2AvroMapper extends AvroMapper<Type2Record, Pair<ComparingKeyRecord, CommonValueRecord>>
Case 2: (SpecificData based inputs)
Schema type1Schema = Type1Record.SCHEMA$;
// The mapper class comes before the schema, matching the addInputPath signature.
AvroMultipleInputs.addInputPath(job, inputPath1, Type1AvroMapper.class, type1Schema);
Where Type1AvroMapper would be implemented as
class Type1AvroMapper extends AvroMapper<Type1Record, Pair<ComparingKeyRecord, CommonValueRecord>>
Schema type2Schema = Type2Record.SCHEMA$;
// The mapper class comes before the schema, matching the addInputPath signature.
AvroMultipleInputs.addInputPath(job, inputPath2, Type2AvroMapper.class, type2Schema);
Where Type2AvroMapper would be implemented as
class Type2AvroMapper extends AvroMapper<Type2Record, Pair<ComparingKeyRecord, CommonValueRecord>>
Note on InputFormat: The InputFormat used will always be
AvroInputFormat when using this class.
Note on collector outputs: When using this class, ensure that
all mapper implementations involved emit the same key and value
record types, as set by
AvroJob.setOutputSchema(JobConf, Schema) or
AvroJob.setMapOutputSchema(JobConf, Schema).
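The constraint in the note above can be illustrated with a small, library-free model. This is only a sketch of the idea, not Avro's implementation: `MultipleInputsSketch`, `KeyValue`, the record classes, and the path strings are all hypothetical stand-ins, and the `addInputPath` here is a toy method that merely echoes the real one's name. Each input path is bound to its own mapper, the mappers accept different input record types, and every mapper emits the same output type.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Library-free model of per-path mappers. None of these names are Avro or Hadoop APIs.
class MultipleInputsSketch {

    // Hypothetical stand-ins for two differently shaped input records.
    static final class Type1Record {
        final String id; final int amount;
        Type1Record(String id, int amount) { this.id = id; this.amount = amount; }
    }
    static final class Type2Record {
        final String id; final String label;
        Type2Record(String id, String label) { this.id = id; this.label = label; }
    }

    // The single output shape every mapper must emit (models Pair<Key, Value>).
    static final class KeyValue {
        final String key; final String value;
        KeyValue(String key, String value) { this.key = key; this.value = value; }
        @Override public String toString() { return key + "\t" + value; }
    }

    // Registry binding each input path to the mapper for that path's record type.
    private final Map<String, Function<Object, KeyValue>> mappersByPath = new HashMap<>();

    // Toy counterpart of addInputPath: the record type plays the role of the schema.
    <T> void addInputPath(String path, Class<T> recordType, Function<T, KeyValue> mapper) {
        mappersByPath.put(path, datum -> mapper.apply(recordType.cast(datum)));
    }

    // Dispatch a datum to the mapper registered for the path it was read from.
    KeyValue map(String path, Object datum) {
        Function<Object, KeyValue> mapper = mappersByPath.get(path);
        if (mapper == null) {
            throw new IllegalArgumentException("No mapper registered for path: " + path);
        }
        return mapper.apply(datum);
    }

    // Two inputs with different record types, both mapped to the same output type.
    static List<KeyValue> demo() {
        MultipleInputsSketch job = new MultipleInputsSketch();
        job.addInputPath("/data/type1", Type1Record.class,
                r -> new KeyValue(r.id, "amount=" + r.amount));
        job.addInputPath("/data/type2", Type2Record.class,
                r -> new KeyValue(r.id, "label=" + r.label));

        List<KeyValue> out = new ArrayList<>();
        out.add(job.map("/data/type1", new Type1Record("k1", 7)));
        out.add(job.map("/data/type2", new Type2Record("k1", "blue")));
        return out;
    }

    public static void main(String[] args) {
        for (KeyValue kv : demo()) {
            System.out.println(kv);
        }
    }
}
```

Because both mappers return `KeyValue`, a downstream consumer can group records from either path under the same key, which is the reason the real mappers must agree on the output schema.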
| Constructor and Description |
|---|
| AvroMultipleInputs() |
| Modifier and Type | Method and Description |
|---|---|
| static void | addInputPath(org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.fs.Path path, Class<? extends AvroMapper> mapperClass, Schema inputSchema) |
public static void addInputPath(org.apache.hadoop.mapred.JobConf conf,
org.apache.hadoop.fs.Path path,
Class<? extends AvroMapper> mapperClass,
Schema inputSchema)
Parameters:
conf - The configuration of the job
path - Path to be added to the list of inputs for the job
inputSchema - Schema to use for this path
mapperClass - AvroMapper class to use for this path

Copyright © 2009–2020 The Apache Software Foundation. All rights reserved.