public class AvroMultipleInputs extends Object
This class supports Avro-MapReduce jobs that have multiple input paths with a different Schema and AvroMapper for each path.
Usage:
Case 1: (ReflectData based inputs)
    // Enable ReflectData usage across the job.
    AvroJob.setReflect(job);

    Schema type1Schema = ReflectData.get().getSchema(Type1Record.class);
    AvroMultipleInputs.addInputPath(job, inputPath1, Type1AvroMapper.class, type1Schema);

Where Type1AvroMapper would be implemented as

    class Type1AvroMapper extends AvroMapper<Type1Record, Pair<ComparingKeyRecord, CommonValueRecord>>

    Schema type2Schema = ReflectData.get().getSchema(Type2Record.class);
    AvroMultipleInputs.addInputPath(job, inputPath2, Type2AvroMapper.class, type2Schema);

Where Type2AvroMapper would be implemented as

    class Type2AvroMapper extends AvroMapper<Type2Record, Pair<ComparingKeyRecord, CommonValueRecord>>

Case 2: (SpecificData based inputs)

    Schema type1Schema = Type1Record.SCHEMA$;
    AvroMultipleInputs.addInputPath(job, inputPath1, Type1AvroMapper.class, type1Schema);

Where Type1AvroMapper would be implemented as

    class Type1AvroMapper extends AvroMapper<Type1Record, Pair<ComparingKeyRecord, CommonValueRecord>>

    Schema type2Schema = Type2Record.SCHEMA$;
    AvroMultipleInputs.addInputPath(job, inputPath2, Type2AvroMapper.class, type2Schema);

Where Type2AvroMapper would be implemented as

    class Type2AvroMapper extends AvroMapper<Type2Record, Pair<ComparingKeyRecord, CommonValueRecord>>
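
The usage cases name the mapper classes without showing a body. The following is a minimal sketch of what a reflect-based Type1AvroMapper might look like. The fields on Type1Record, ComparingKeyRecord and CommonValueRecord, the wrapper class Type1MapperSketch, and the helper toPair are all assumptions made for illustration; none of them are defined by this documentation.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroMapper;
import org.apache.avro.mapred.Pair;
import org.apache.avro.reflect.ReflectData;
import org.apache.hadoop.mapred.Reporter;

public class Type1MapperSketch {

  // Hypothetical record shapes; the real fields are not given in this documentation.
  public static class Type1Record { public String id; public int amount; }
  public static class ComparingKeyRecord { public String id; }
  public static class CommonValueRecord { public String source; public long amount; }

  // Schemas derived by reflection, matching the ReflectData case above.
  static final Schema KEY_SCHEMA = ReflectData.get().getSchema(ComparingKeyRecord.class);
  static final Schema VALUE_SCHEMA = ReflectData.get().getSchema(CommonValueRecord.class);

  // Hypothetical helper: converts one input record into the shared key/value pair.
  public static Pair<ComparingKeyRecord, CommonValueRecord> toPair(Type1Record datum) {
    ComparingKeyRecord key = new ComparingKeyRecord();
    key.id = datum.id;
    CommonValueRecord value = new CommonValueRecord();
    value.source = "type1";
    value.amount = datum.amount;
    return new Pair<>(key, KEY_SCHEMA, value, VALUE_SCHEMA);
  }

  public static class Type1AvroMapper
      extends AvroMapper<Type1Record, Pair<ComparingKeyRecord, CommonValueRecord>> {
    @Override
    public void map(Type1Record datum,
                    AvroCollector<Pair<ComparingKeyRecord, CommonValueRecord>> collector,
                    Reporter reporter) throws IOException {
      // Emit the same Pair type that every other mapper in the job emits.
      collector.collect(toPair(datum));
    }
  }
}
```

A Type2AvroMapper would follow the same pattern, reading Type2Record and emitting the same Pair<ComparingKeyRecord, CommonValueRecord> type.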
Note on InputFormat: The InputFormat used will always be AvroInputFormat when using this class.
Note on collector outputs: When using this class, all mapper implementations involved must emit the same key and value record types, as set by AvroJob.setOutputSchema(JobConf, Schema) or AvroJob.setMapOutputSchema(JobConf, Schema).
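
To illustrate the note on collector outputs, here is a hedged sketch of a driver that declares one shared map output schema and registers two input paths whose mappers both emit that type. The class name SharedOutputSchemaSketch, the two input schemas, the string/long pair schema, and the empty mapper bodies are assumptions chosen for illustration. Only AvroJob.setMapOutputSchema and AvroMultipleInputs.addInputPath come from the API described here.

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroMapper;
import org.apache.avro.mapred.AvroMultipleInputs;
import org.apache.avro.mapred.Pair;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class SharedOutputSchemaSketch {

  // Hypothetical input schemas, one per input path.
  static final Schema TYPE1 = SchemaBuilder.record("Type1Record").fields()
      .requiredString("id").requiredInt("amount").endRecord();
  static final Schema TYPE2 = SchemaBuilder.record("Type2Record").fields()
      .requiredString("id").requiredString("label").endRecord();

  // The single key/value schema that both mappers must emit (assumed: string key, long value).
  static final Schema SHARED_OUTPUT = Pair.getPairSchema(
      Schema.create(Schema.Type.STRING), Schema.create(Schema.Type.LONG));

  // Mapper bodies omitted: AvroMapper's default map() collects nothing.
  // Both declare the same output type, which is the constraint the note describes.
  public static class Type1AvroMapper extends AvroMapper<GenericRecord, Pair<CharSequence, Long>> {}
  public static class Type2AvroMapper extends AvroMapper<GenericRecord, Pair<CharSequence, Long>> {}

  public static JobConf configure(Path inputPath1, Path inputPath2) {
    JobConf job = new JobConf();
    // One map output schema for the whole job, regardless of how many inputs there are.
    AvroJob.setMapOutputSchema(job, SHARED_OUTPUT);
    // Argument order follows the method signature: conf, path, mapperClass, inputSchema.
    AvroMultipleInputs.addInputPath(job, inputPath1, Type1AvroMapper.class, TYPE1);
    AvroMultipleInputs.addInputPath(job, inputPath2, Type2AvroMapper.class, TYPE2);
    return job;
  }
}
```

A real job would still need a reducer, an output path, and a call that submits the job; those are left out to keep the sketch focused on the shared schema.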
Constructor and Description |
---|
AvroMultipleInputs() |

Modifier and Type | Method and Description |
---|---|
static void | addInputPath(org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.fs.Path path, Class<? extends AvroMapper> mapperClass, Schema inputSchema) |
public static void addInputPath(org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.fs.Path path, Class<? extends AvroMapper> mapperClass, Schema inputSchema)

Parameters:
conf - The configuration of the job
path - Path to be added to the list of inputs for the job
inputSchema - Schema to use for this path
mapperClass - AvroMapper class to use for this path

Copyright © 2009–2020 The Apache Software Foundation. All rights reserved.