Class AvroSequenceFile.Writer.Options

java.lang.Object
org.apache.avro.hadoop.io.AvroSequenceFile.Writer.Options
Enclosing class:
AvroSequenceFile.Writer

public static class AvroSequenceFile.Writer.Options extends Object
A helper class to encapsulate the options that can be used to construct a Writer.
  • Constructor Details

    • Options

      public Options()
      Creates a new Options instance with default values.
  • Method Details

    • withFileSystem

      public AvroSequenceFile.Writer.Options withFileSystem(FileSystem fileSystem)
      Sets the filesystem the SequenceFile should be written to.
      Parameters:
      fileSystem - The filesystem.
      Returns:
      This options instance.
    • withConfiguration

      public AvroSequenceFile.Writer.Options withConfiguration(Configuration conf)
      Sets the Hadoop configuration.
      Parameters:
      conf - The configuration.
      Returns:
      This options instance.
    • withOutputPath

      public AvroSequenceFile.Writer.Options withOutputPath(Path outputPath)
      Sets the output path for the SequenceFile.
      Parameters:
      outputPath - The output path.
      Returns:
      This options instance.
    • withKeyClass

      public AvroSequenceFile.Writer.Options withKeyClass(Class<?> keyClass)
      Sets the class of the key records to be written.

      If the keys are Avro data, call withKeySchema(org.apache.avro.Schema) instead to specify the writer schema; that method sets the key class to AvroKey automatically.

      Parameters:
      keyClass - The key class.
      Returns:
      This options instance.
    • withKeySchema

      public AvroSequenceFile.Writer.Options withKeySchema(Schema keyWriterSchema)
      Sets the writer schema of the key records when using Avro data.

      The key class will automatically be set to AvroKey, so there is no need to call withKeyClass(Class) when using this method.

      Parameters:
      keyWriterSchema - The writer schema for the keys.
      Returns:
      This options instance.
    • withValueClass

      public AvroSequenceFile.Writer.Options withValueClass(Class<?> valueClass)
      Sets the class of the value records to be written.

      If the values are Avro data, call withValueSchema(org.apache.avro.Schema) instead to specify the writer schema; that method sets the value class to AvroValue automatically.

      Parameters:
      valueClass - The value class.
      Returns:
      This options instance.
    • withValueSchema

      public AvroSequenceFile.Writer.Options withValueSchema(Schema valueWriterSchema)
      Sets the writer schema of the value records when using Avro data.

      The value class will automatically be set to AvroValue, so there is no need to call withValueClass(Class) when using this method.

      Parameters:
      valueWriterSchema - The writer schema for the values.
      Returns:
      This options instance.
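
Because every with* method returns this options instance, the calls can be chained. The following is a minimal sketch rather than a definitive usage pattern. It assumes the Avro (avro-mapred) and Hadoop libraries are on the classpath and that AvroSequenceFile.createWriter(Options) from the enclosing AvroSequenceFile class is available. The class name AvroOptionsExample, the method names, and the record values are hypothetical.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.hadoop.io.AvroSequenceFile;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroValue;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

// Hypothetical example class; not part of the Avro API.
final class AvroOptionsExample {

  /** Builds options for a SequenceFile with Avro string keys and Avro long values. */
  static AvroSequenceFile.Writer.Options avroOptions(Configuration conf, Path output)
      throws IOException {
    return new AvroSequenceFile.Writer.Options()
        .withFileSystem(FileSystem.getLocal(conf)) // local filesystem, for illustration only
        .withConfiguration(conf)
        .withOutputPath(output)
        // Setting a schema also sets the key/value class to AvroKey/AvroValue,
        // so withKeyClass and withValueClass are not called here.
        .withKeySchema(Schema.create(Schema.Type.STRING))
        .withValueSchema(Schema.create(Schema.Type.LONG));
  }

  /** Writes one record using the options above (assumes AvroSequenceFile.createWriter). */
  static void writeOne(Configuration conf, Path output) throws IOException {
    try (SequenceFile.Writer writer =
        AvroSequenceFile.createWriter(avroOptions(conf, output))) {
      writer.append(new AvroKey<CharSequence>("answer"), new AvroValue<Long>(42L));
    }
  }
}
```

To write non-Avro keys or values, replace the corresponding schema call with withKeyClass or withValueClass and pass the Writable class instead.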
    • withBufferSizeBytes

      public AvroSequenceFile.Writer.Options withBufferSizeBytes(int bytes)
      Sets the write buffer size in bytes.
      Parameters:
      bytes - The desired buffer size.
      Returns:
      This options instance.
    • withReplicationFactor

      public AvroSequenceFile.Writer.Options withReplicationFactor(short replicationFactor)
      Sets the desired replication factor for the file.
      Parameters:
      replicationFactor - The replication factor.
      Returns:
      This options instance.
    • withBlockSizeBytes

      public AvroSequenceFile.Writer.Options withBlockSizeBytes(long bytes)
      Sets the desired size of the file blocks.
      Parameters:
      bytes - The desired block size in bytes.
      Returns:
      This options instance.
    • withProgressable

      public AvroSequenceFile.Writer.Options withProgressable(Progressable progressable)
      Sets an object to report progress to.
      Parameters:
      progressable - A progressable object to track progress.
      Returns:
      This options instance.
    • withCompressionType

      public AvroSequenceFile.Writer.Options withCompressionType(SequenceFile.CompressionType compressionType)
      Sets the type of compression.
      Parameters:
      compressionType - The type of compression for the output file.
      Returns:
      This options instance.
    • withCompressionCodec

      public AvroSequenceFile.Writer.Options withCompressionCodec(CompressionCodec compressionCodec)
      Sets the compression codec to use when compression is enabled.
      Parameters:
      compressionCodec - The compression codec.
      Returns:
      This options instance.
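
The buffer, replication, block size, and compression setters above can be combined on one options instance. The sketch below is illustrative only: the class name TunedOptionsExample and the specific values (64 KiB buffer, replication factor 3, 64 MiB blocks) are assumptions chosen for the example, not recommendations from the Avro documentation. The codec is created through Hadoop's ReflectionUtils so that it receives the configuration.

```java
import org.apache.avro.hadoop.io.AvroSequenceFile;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.util.ReflectionUtils;

// Hypothetical example class; not part of the Avro API.
final class TunedOptionsExample {

  /** Applies illustrative storage and compression settings to the given options. */
  static AvroSequenceFile.Writer.Options tune(
      AvroSequenceFile.Writer.Options options, Configuration conf) {
    // Instantiate the codec with the configuration so it is fully initialized.
    CompressionCodec codec = ReflectionUtils.newInstance(DefaultCodec.class, conf);
    return options
        .withBufferSizeBytes(64 * 1024)            // write buffer size, in bytes
        .withReplicationFactor((short) 3)          // replicas per block
        .withBlockSizeBytes(64L * 1024 * 1024)     // file block size, in bytes
        .withCompressionType(SequenceFile.CompressionType.BLOCK)
        .withCompressionCodec(codec);
  }
}
```

The getters (getBufferSizeBytes, getReplicationFactor, getBlockSizeBytes, getCompressionType, getCompressionCodec) read back the values set here.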
    • withMetadata

      public AvroSequenceFile.Writer.Options withMetadata(SequenceFile.Metadata metadata)
      Sets the metadata that should be stored in the file header.
      Parameters:
      metadata - The file metadata.
      Returns:
      This options instance.
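
As a sketch of attaching header metadata, the example below builds a Hadoop SequenceFile.Metadata object and passes it to withMetadata. It assumes withMetadata accepts a SequenceFile.Metadata, matching the type returned by getMetadata(). The class name MetadataOptionsExample and the "created.by" key and its value are hypothetical.

```java
import org.apache.avro.hadoop.io.AvroSequenceFile;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Hypothetical example class; not part of the Avro API.
final class MetadataOptionsExample {

  /** Returns the given options with one illustrative metadata entry attached. */
  static AvroSequenceFile.Writer.Options withCreator(
      AvroSequenceFile.Writer.Options options, String creator) {
    SequenceFile.Metadata metadata = new SequenceFile.Metadata();
    // Metadata entries are Text name/value pairs stored in the file header.
    metadata.set(new Text("created.by"), new Text(creator));
    return options.withMetadata(metadata);
  }
}
```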
    • getFileSystem

      public FileSystem getFileSystem()
      Gets the filesystem the SequenceFile should be written to.
      Returns:
      The filesystem to write to.
    • getConfiguration

      public Configuration getConfiguration()
      Gets the Hadoop configuration.
      Returns:
      The Hadoop configuration.
    • getConfigurationWithAvroSerialization

      public Configuration getConfigurationWithAvroSerialization()
      Gets the Hadoop configuration with Avro serialization registered.
      Returns:
      The Hadoop configuration with Avro serialization registered.
    • getOutputPath

      public Path getOutputPath()
      Gets the output path for the SequenceFile.
      Returns:
      The output path.
    • getKeyClass

      public Class<?> getKeyClass()
      Gets the class of the key records.
      Returns:
      The key class.
    • getValueClass

      public Class<?> getValueClass()
      Gets the class of the value records.
      Returns:
      The value class.
    • getBufferSizeBytes

      public int getBufferSizeBytes()
      Gets the desired size of the buffer used when flushing records to disk.
      Returns:
      The buffer size in bytes.
    • getReplicationFactor

      public short getReplicationFactor()
      Gets the desired number of replicas to store for each block of the file.
      Returns:
      The replication factor for the blocks of the file.
    • getBlockSizeBytes

      public long getBlockSizeBytes()
      Gets the desired size of the file blocks.
      Returns:
      The size of a file block in bytes.
    • getProgressable

      public Progressable getProgressable()
      Gets the object to report progress to.
      Returns:
      A progressable object to track progress.
    • getCompressionType

      public SequenceFile.CompressionType getCompressionType()
      Gets the type of compression.
      Returns:
      The compression type.
    • getCompressionCodec

      public CompressionCodec getCompressionCodec()
      Gets the compression codec.
      Returns:
      The compression codec.
    • getMetadata

      public SequenceFile.Metadata getMetadata()
      Gets the SequenceFile metadata to store in the header.
      Returns:
      The metadata to store in the file header.