org.apache.avro.file
Class DataFileWriter<D>

java.lang.Object
  extended by org.apache.avro.file.DataFileWriter<D>
All Implemented Interfaces:
Closeable, Flushable

public class DataFileWriter<D>
extends Object
implements Closeable, Flushable

Stores in a file a sequence of data conforming to a schema. The schema is stored in the file with the data. Each datum in a file is of the same schema. Data is written with a DatumWriter. Data is grouped into blocks. A synchronization marker is written between blocks, so that files may be split. Blocks may be compressed. Extensible metadata is stored in the file header. Files may be appended to.
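
The following is a minimal sketch of the write path described above, not a definitive usage pattern. It assumes a schema consisting of a single long and an in-memory stream in place of a real file; the class name RoundTripSketch is hypothetical. The data is read back with DataFileStream to show that the schema travels with the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;

// Hypothetical helper class, for illustration only.
final class RoundTripSketch {
  /** Writes three longs to an in-memory data file and reads them back. */
  static List<Long> roundTrip() throws IOException {
    // Assumed schema: every datum is a single long.
    Schema schema = Schema.create(Schema.Type.LONG);
    ByteArrayOutputStream out = new ByteArrayOutputStream();

    // The writer is constructed unopened; create() opens it.
    DataFileWriter<Long> writer =
        new DataFileWriter<Long>(new GenericDatumWriter<Long>());
    writer.create(schema, out);
    writer.append(1L);
    writer.append(2L);
    writer.append(3L);
    writer.close();

    // The reader needs no schema up front; it is taken from the file.
    DataFileStream<Long> in = new DataFileStream<Long>(
        new ByteArrayInputStream(out.toByteArray()),
        new GenericDatumReader<Long>());
    List<Long> result = new ArrayList<Long>();
    while (in.hasNext()) {
      result.add(in.next());
    }
    in.close();
    return result;
  }
}
```

Calling RoundTripSketch.roundTrip() should return the three values in the order they were appended.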

See Also:
DataFileReader

Constructor Summary
DataFileWriter(DatumWriter<D> dout)
          Construct a writer, not yet open.
 
Method Summary
 void append(D datum)
          Append a datum to the file.
 void appendAllFrom(DataFileStream<D> otherFile, boolean recompress)
          Appends data from another file.
 DataFileWriter<D> appendTo(File file)
          Open a writer appending to an existing file.
 void close()
          Close the file.
 DataFileWriter<D> create(Schema schema, File file)
          Open a new file for data matching a schema.
 DataFileWriter<D> create(Schema schema, OutputStream outs)
          Open a new file for data matching a schema.
 void flush()
          Flush the current state of the file.
 DataFileWriter<D> setCodec(CodecFactory c)
          Configures this writer to use the given codec.
 DataFileWriter<D> setMeta(String key, byte[] value)
          Set a metadata property.
 DataFileWriter<D> setMeta(String key, long value)
          Set a metadata property.
 DataFileWriter<D> setMeta(String key, String value)
          Set a metadata property.
 DataFileWriter<D> setMetaInternal(String key, String value)
           
 DataFileWriter<D> setSyncInterval(int syncInterval)
          Set the synchronization interval for this file, in bytes.
 long sync()
          Return the current position as a value that may be passed to DataFileReader.seek(long).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DataFileWriter

public DataFileWriter(DatumWriter<D> dout)
Construct a writer, not yet open.

Method Detail

setCodec

public DataFileWriter<D> setCodec(CodecFactory c)
Configures this writer to use the given codec. May not be reset after writes have begun.


setSyncInterval

public DataFileWriter<D> setSyncInterval(int syncInterval)
Set the synchronization interval for this file, in bytes. Valid values range from 32 to 2^30. Suggested values are between 2K and 2M. Invalid values throw IllegalArgumentException.

Parameters:
syncInterval - the approximate number of uncompressed bytes to write in each block
Returns:
this DataFileWriter
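
As an illustration of setCodec and setSyncInterval, the sketch below chains both calls before opening the writer, since the codec may not be changed once writing has begun. The deflate level (6), the 64K interval, the long schema, and the class name WriterConfigSketch are assumptions chosen for the example, not recommendations from this page.

```java
import java.io.IOException;
import java.io.OutputStream;

import org.apache.avro.Schema;
import org.apache.avro.file.CodecFactory;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumWriter;

// Hypothetical helper class, for illustration only.
final class WriterConfigSketch {
  /** Configures codec and sync interval, then opens the writer. */
  static DataFileWriter<Long> openConfigured(OutputStream out)
      throws IOException {
    Schema schema = Schema.create(Schema.Type.LONG); // assumed schema
    return new DataFileWriter<Long>(new GenericDatumWriter<Long>())
        .setCodec(CodecFactory.deflateCodec(6)) // assumed compression level
        .setSyncInterval(64 * 1024)             // within the suggested 2K to 2M range
        .create(schema, out);
  }

  /** Returns true if the given interval is rejected as invalid. */
  static boolean rejects(int syncInterval) {
    try {
      new DataFileWriter<Long>(new GenericDatumWriter<Long>())
          .setSyncInterval(syncInterval);
      return false;
    } catch (IllegalArgumentException e) {
      return true;
    }
  }
}
```

Under the range stated above, rejects(16) should be true, because 16 is below the minimum of 32, while rejects(4096) should be false.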

create

public DataFileWriter<D> create(Schema schema,
                                File file)
                         throws IOException
Open a new file for data matching a schema.

Throws:
IOException

create

public DataFileWriter<D> create(Schema schema,
                                OutputStream outs)
                         throws IOException
Open a new file for data matching a schema.

Throws:
IOException

appendTo

public DataFileWriter<D> appendTo(File file)
                           throws IOException
Open a writer appending to an existing file.

Throws:
IOException
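
A minimal sketch of appending to an existing file, assuming a long schema and a caller-supplied file that may be overwritten. The class name AppendToSketch is hypothetical. The second writer is given no schema; appendTo takes it from the existing file.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;

// Hypothetical helper class, for illustration only.
final class AppendToSketch {
  /** Creates a file with one datum, reopens it to append another, reads both. */
  static List<Long> writeThenAppend(File file) throws IOException {
    Schema schema = Schema.create(Schema.Type.LONG); // assumed schema

    // First session: create the file and write one datum.
    DataFileWriter<Long> first =
        new DataFileWriter<Long>(new GenericDatumWriter<Long>());
    first.create(schema, file);
    first.append(1L);
    first.close();

    // Second session: reopen the same file and append to it.
    DataFileWriter<Long> second =
        new DataFileWriter<Long>(new GenericDatumWriter<Long>());
    second.appendTo(file);
    second.append(2L);
    second.close();

    // Read everything back to confirm both data are present.
    DataFileReader<Long> reader =
        new DataFileReader<Long>(file, new GenericDatumReader<Long>());
    List<Long> result = new ArrayList<Long>();
    while (reader.hasNext()) {
      result.add(reader.next());
    }
    reader.close();
    return result;
  }
}
```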

setMetaInternal

public DataFileWriter<D> setMetaInternal(String key,
                                         String value)

setMeta

public DataFileWriter<D> setMeta(String key,
                                 byte[] value)
Set a metadata property.


setMeta

public DataFileWriter<D> setMeta(String key,
                                 String value)
Set a metadata property.


setMeta

public DataFileWriter<D> setMeta(String key,
                                 long value)
Set a metadata property.
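
The setMeta overloads can be sketched as follows. The keys "origin" and "batch", their values, and the class name MetadataSketch are hypothetical. The properties are set before create is called and read back through DataFileStream.getMetaString and DataFileStream.getMetaLong.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;

// Hypothetical helper class, for illustration only.
final class MetadataSketch {
  /** Stores two metadata properties and returns them as "origin:batch". */
  static String writeAndReadMeta() throws IOException {
    Schema schema = Schema.create(Schema.Type.LONG); // assumed schema
    ByteArrayOutputStream out = new ByteArrayOutputStream();

    DataFileWriter<Long> writer =
        new DataFileWriter<Long>(new GenericDatumWriter<Long>());
    // Set metadata before opening the file; both keys are made up.
    writer.setMeta("origin", "example").setMeta("batch", 42L);
    writer.create(schema, out);
    writer.append(7L);
    writer.close();

    DataFileStream<Long> in = new DataFileStream<Long>(
        new ByteArrayInputStream(out.toByteArray()),
        new GenericDatumReader<Long>());
    String result = in.getMetaString("origin") + ":" + in.getMetaLong("batch");
    in.close();
    return result;
  }
}
```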


append

public void append(D datum)
            throws IOException
Append a datum to the file.

Throws:
IOException

appendAllFrom

public void appendAllFrom(DataFileStream<D> otherFile,
                          boolean recompress)
                   throws IOException
Appends data from another file. otherFile must have the same schema. Data blocks are copied without deserializing the data. If the codecs of the two files are compatible, data blocks are copied directly without decompression. If the codecs are not compatible, blocks from otherFile are decompressed and then recompressed using this file's codec.

If the recompress flag is set, all blocks are decompressed and then recompressed using this file's codec. This is useful when the two files have compatible compression codecs but different codec options. For example, one might append a file compressed with deflate at compression level 1 to a file with deflate at compression level 7. If recompress is false, blocks are copied without changing the compression level. If it is true, they are converted to the new compression level.

Parameters:
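otherFile - the file whose data blocks are appended; it must have the same schema as this file
recompress - if true, all blocks are decompressed and then recompressed using this file's codec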
Throws:
IOException
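
A minimal sketch of appendAllFrom, assuming two in-memory files that share a long schema and the default codec, so that blocks can be copied directly. The class name AppendAllFromSketch is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;

// Hypothetical helper class, for illustration only.
final class AppendAllFromSketch {
  private static final Schema SCHEMA = Schema.create(Schema.Type.LONG);

  /** Writes the given values to a new in-memory data file. */
  private static byte[] write(long... values) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    DataFileWriter<Long> writer =
        new DataFileWriter<Long>(new GenericDatumWriter<Long>());
    writer.create(SCHEMA, out);
    for (long v : values) {
      writer.append(v);
    }
    writer.close();
    return out.toByteArray();
  }

  private static DataFileStream<Long> open(byte[] bytes) throws IOException {
    return new DataFileStream<Long>(
        new ByteArrayInputStream(bytes), new GenericDatumReader<Long>());
  }

  /** Appends the blocks of one file to another and returns the merged data. */
  static List<Long> merge() throws IOException {
    byte[] other = write(1L, 2L);

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    DataFileWriter<Long> writer =
        new DataFileWriter<Long>(new GenericDatumWriter<Long>());
    writer.create(SCHEMA, out);
    writer.append(10L);
    DataFileStream<Long> source = open(other);
    writer.appendAllFrom(source, false); // false: do not force recompression
    source.close();
    writer.close();

    DataFileStream<Long> in = open(out.toByteArray());
    List<Long> result = new ArrayList<Long>();
    while (in.hasNext()) {
      result.add(in.next());
    }
    in.close();
    return result;
  }
}
```

Passing true for recompress instead would force every copied block through this file's codec, which matters only when the two files differ in codec or codec options.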

sync

public long sync()
          throws IOException
Return the current position as a value that may be passed to DataFileReader.seek(long). Forces the end of the current block, emitting a synchronization marker.

Throws:
IOException
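
The relationship between sync() and DataFileReader.seek(long) can be sketched as follows, assuming a long schema and a caller-supplied file that may be overwritten. The class name SyncSeekSketch is hypothetical. The position returned by sync() is recorded while writing and later used to jump straight to the data written after it.

```java
import java.io.File;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;

// Hypothetical helper class, for illustration only.
final class SyncSeekSketch {
  /** Returns the first datum written after the recorded sync position. */
  static long firstAfterSync(File file) throws IOException {
    Schema schema = Schema.create(Schema.Type.LONG); // assumed schema

    DataFileWriter<Long> writer =
        new DataFileWriter<Long>(new GenericDatumWriter<Long>());
    writer.create(schema, file);
    writer.append(1L);
    writer.append(2L);
    // Ends the current block and returns the position of the next one.
    long position = writer.sync();
    writer.append(3L);
    writer.close();

    DataFileReader<Long> reader =
        new DataFileReader<Long>(file, new GenericDatumReader<Long>());
    reader.seek(position); // skip the block holding 1 and 2
    long value = reader.next();
    reader.close();
    return value;
  }
}
```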

flush

public void flush()
           throws IOException
Flush the current state of the file.

Specified by:
flush in interface Flushable
Throws:
IOException

close

public void close()
           throws IOException
Close the file.

Specified by:
close in interface Closeable
Throws:
IOException


Copyright © 2010 The Apache Software Foundation