Package org.apache.avro.mapred
Class AvroAsTextInputFormat
- All Implemented Interfaces:
InputFormat<Text,Text>
An InputFormat for Avro data files, which converts each datum to string form in the input key. The input value is always empty. The string representation is JSON.
This InputFormat is useful for applications that wish to process Avro data using tools like MapReduce Streaming.
By default, when pointed at a directory, this will silently skip over any
files in it that do not have the .avro extension. To instead include all files,
set the avro.mapred.ignore.inputs.without.extension property to false.
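The filtering rule described above can be illustrated with a small plain-Java sketch. It is not the Avro source and uses no Hadoop or Avro classes: the class name AvroExtensionFilterSketch, the method selectInputs, and the sample file names are hypothetical, and the boolean parameter only models the value of the avro.mapred.ignore.inputs.without.extension property.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration of the extension filter; hypothetical, not the Avro implementation. */
class AvroExtensionFilterSketch {
    // Extension named in the class description.
    static final String AVRO_EXTENSION = ".avro";

    /**
     * Returns the file names that would be kept as input.
     * ignoreWithoutExtension models avro.mapred.ignore.inputs.without.extension:
     * true mirrors the documented default, false includes every file.
     */
    static List<String> selectInputs(List<String> fileNames, boolean ignoreWithoutExtension) {
        if (!ignoreWithoutExtension) {
            return new ArrayList<>(fileNames);
        }
        List<String> kept = new ArrayList<>();
        for (String name : fileNames) {
            if (name.endsWith(AVRO_EXTENSION)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> dir = Arrays.asList("events.avro", "notes.txt", "events.json");
        System.out.println(selectInputs(dir, true));   // prints [events.avro]
        System.out.println(selectInputs(dir, false));  // prints [events.avro, notes.txt, events.json]
    }
}
```

In a real job the switch is the configuration property itself, set to false on the JobConf, rather than a method argument.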
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.mapred.FileInputFormat
FileInputFormat.Counter
-
Field Summary
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES
-
Constructor Summary
-
Method Summary
Modifier and Type | Method
RecordReader<Text,Text> | getRecordReader(InputSplit split, JobConf job, Reporter reporter)
protected FileStatus[] | listStatus(JobConf job)
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, isSplitable, makeSplit, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
-
Constructor Details
-
AvroAsTextInputFormat
public AvroAsTextInputFormat()
-
-
Method Details
-
listStatus
protected FileStatus[] listStatus(JobConf job) throws IOException
- Overrides:
listStatus in class FileInputFormat<Text,Text>
- Throws:
IOException
-
getRecordReader
public RecordReader<Text,Text> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException
- Specified by:
getRecordReader in interface InputFormat<Text,Text>
- Specified by:
getRecordReader in class FileInputFormat<Text,Text>
- Throws:
IOException
-