Avro C#
Avro.IO.Parsing.Symbol Class Reference (abstract)

Symbol is the base of all symbols (terminals and non-terminals) of the grammar.

Inheritance diagram for Avro.IO.Parsing.Symbol:
Avro.IO.Parsing.Symbol.Alternative
Avro.IO.Parsing.Symbol.ImplicitAction
Avro.IO.Parsing.Symbol.IntCheckAction
Avro.IO.Parsing.Symbol.Repeater
Avro.IO.Parsing.Symbol.Root
Avro.IO.Parsing.Symbol.Sequence
Avro.IO.Parsing.Symbol.Terminal
Avro.IO.Parsing.Symbol.DefaultStartAction
Avro.IO.Parsing.Symbol.ErrorAction
Avro.IO.Parsing.Symbol.FieldAdjustAction
Avro.IO.Parsing.Symbol.FieldOrderAction
Avro.IO.Parsing.Symbol.ResolvingAction
Avro.IO.Parsing.Symbol.SkipAction
Avro.IO.Parsing.Symbol.UnionAdjustAction
Avro.IO.Parsing.Symbol.WriterUnionAction
Avro.IO.Parsing.Symbol.EnumLabelsAction

Classes

class  Alternative
 Alternative symbol.

class  DefaultStartAction
 The default start action.

class  EnumLabelsAction
 The enum labels action.

class  ErrorAction
 The error action.

class  FieldAdjustAction
 The field adjust action.

class  FieldOrderAction
 The field order action.

class  Fixup
 Fixup symbol.

class  ImplicitAction
 Implicit action.

class  IntCheckAction
 Int check action.

class  Repeater
 Repeater symbol.

class  ResolvingAction
 The resolving action.

class  Root
 Root symbol.

class  Sequence
 Sequence symbol.

class  SkipAction
 The skip action.

class  Terminal
 Terminal symbol.

class  UnionAdjustAction
 The union adjust action.

class  WriterUnionAction
 The writer union action.
 

Public Types

enum  Kind {
  Terminal, Root, Sequence, Repeater,
  Alternative, ImplicitAction, ExplicitAction
}
 The type of symbol.
 

Public Member Functions

virtual int FlattenedSize ()
 Returns the flattened size.
 

Static Public Member Functions

static Symbol NewRoot (params Symbol[] symbols)
 A convenience method to construct a root symbol.
 
static Symbol NewSeq (params Symbol[] production)
 A convenience method to construct a sequence.
 
static Symbol NewRepeat (Symbol endSymbol, params Symbol[] symsToRepeat)
 A convenience method to construct a repeater.
 
static Symbol NewAlt (Symbol[] symbols, string[] labels)
 A convenience method to construct a union.
 

Protected Member Functions

 Symbol (Kind kind)
 Constructs a new symbol of the given kind.
 
 Symbol (Kind kind, Symbol[] production)
 Constructs a new symbol of the given kind and production.
 
virtual Symbol Flatten (IDictionary< Sequence, Sequence > map, IDictionary< Sequence, IList< Fixup > > map2)
 Flatten the given sub-array of symbols into a sub-array of symbols.
 

Static Protected Member Functions

static Symbol Error (string e)
 A convenience method to construct an ErrorAction.
 
static Symbol Resolve (Symbol w, Symbol r)
 A convenience method to construct a ResolvingAction.
 
static void Flatten (Symbol[] input, int start, Symbol[] output, int skip, IDictionary< Sequence, Sequence > map, IDictionary< Sequence, IList< Fixup > > map2)
 Flattens the given sub-array of symbols into a sub-array of symbols. Every Sequence in the input is replaced by its production, recursively. If a non-Sequence symbol internally holds other symbols, those internal symbols are flattened as well. When flattening is done, the only places where Sequence symbols may remain are the productions of a Repeater or Alternative, or the symToParse and symToSkip of a UnionAdjustAction or SkipAction.
 
static int FlattenedSize (Symbol[] symbols, int start)
 Returns the amount of space required to flatten the given sub-array of symbols.
 

Properties

Kind SymKind [get]
 The kind of this symbol.
 
Symbol[] Production [get]
 The production for this symbol. If this symbol is a terminal, this is null. Otherwise it holds the sequence of symbols that forms the production for this symbol. The sequence is stored in the reverse order of production, which makes it easy to copy onto the parsing stack.
 
static Symbol Null = new Terminal("null") [get]
 The terminal symbols for the grammar.
 
static Symbol Boolean = new Terminal("boolean") [get]
 Boolean.
 
static Symbol Int = new Terminal("int") [get]
 Int.
 
static Symbol Long = new Terminal("long") [get]
 Long.
 
static Symbol Float = new Terminal("float") [get]
 Float.
 
static Symbol Double = new Terminal("double") [get]
 Double.
 
static Symbol String = new Terminal("string") [get]
 String.
 
static Symbol Bytes = new Terminal("bytes") [get]
 Bytes.
 
static Symbol Fixed = new Terminal("fixed") [get]
 Fixed.
 
static Symbol Enum = new Terminal("enum") [get]
 Enum.
 
static Symbol Union = new Terminal("union") [get]
 Union.
 
static Symbol ArrayStart = new Terminal("array-start") [get]
 ArrayStart.
 
static Symbol ArrayEnd = new Terminal("array-end") [get]
 ArrayEnd.
 
static Symbol MapStart = new Terminal("map-start") [get]
 MapStart.
 
static Symbol MapEnd = new Terminal("map-end") [get]
 MapEnd.
 
static Symbol ItemEnd = new Terminal("item-end") [get]
 ItemEnd.
 
static Symbol WriterUnion = new WriterUnionAction() [get]
 WriterUnion.
 
static Symbol FieldAction = new Terminal("field-action") [get]
 FieldAction - a pseudo terminal used by parsers.
 
static Symbol RecordStart = new ImplicitAction(false) [get]
 RecordStart.
 
static Symbol RecordEnd = new ImplicitAction(true) [get]
 RecordEnd.
 
static Symbol UnionEnd = new ImplicitAction(true) [get]
 UnionEnd.
 
static Symbol FieldEnd = new ImplicitAction(true) [get]
 FieldEnd.
 
static Symbol DefaultEndAction = new ImplicitAction(true) [get]
 DefaultEndAction.
 
static Symbol MapKeyMarker = new Terminal("map-key-marker") [get]
 MapKeyMarker.
 

Detailed Description

Symbol is the base of all symbols (terminals and non-terminals) of the grammar.

Member Enumeration Documentation

◆ Kind

The type of symbol.

Enumerator
Terminal 

terminal symbols which have no productions

Root 

Start symbol for some grammar.

Sequence 

non-terminal symbol which is a sequence of one or more other symbols

Repeater 

non-terminal to represent the contents of an array or map

Alternative 

non-terminal to represent the union

ImplicitAction 

non-terminal action symbol which is automatically consumed

ExplicitAction 

non-terminal action symbol which is explicitly consumed

Member Function Documentation

◆ Error()

static Symbol Avro.IO.Parsing.Symbol.Error ( string  e)
static protected

A convenience method to construct an ErrorAction.

Parameters
e

◆ Flatten() [1/2]

virtual Symbol Avro.IO.Parsing.Symbol.Flatten ( IDictionary< Sequence, Sequence >  map,
IDictionary< Sequence, IList< Fixup > >  map2 
)
protected virtual

◆ Flatten() [2/2]

static void Avro.IO.Parsing.Symbol.Flatten ( Symbol[]  input,
int  start,
Symbol[]  output,
int  skip,
IDictionary< Sequence, Sequence >  map,
IDictionary< Sequence, IList< Fixup > >  map2 
)
inline static protected

Flattens the given sub-array of symbols into a sub-array of symbols. Every Sequence in the input is replaced by its production, recursively. If a non-Sequence symbol internally holds other symbols, those internal symbols are flattened as well. When flattening is done, the only places where Sequence symbols may remain are the productions of a Repeater or Alternative, or the symToParse and symToSkip of a UnionAdjustAction or SkipAction.

Why is this done? We want our parsers to be fast. If we left the grammars unflattened, then the parser would be constantly copying the contents of nested Sequence productions onto the parsing stack. Instead, because of flattening, we have a long top-level production with no Sequences unless the Sequence is absolutely needed, e.g., in the case of a Repeater or an Alternative.

Well, this is not exactly true when recursion is involved. Where there is a recursive record, that record will be "inlined" once, but any internal (i.e., recursive) references to that record will be a Sequence for the record. That Sequence will not inline itself any further; it will refer to itself as a Sequence. The same is true for any records nested in this outer recursive record. Recursion is rare, and we want things to be fast in the typical case, which is why we do the flattening optimization.

The algorithm uses a few tricks to handle recursive symbol definitions. To avoid infinite recursion on recursive symbols, we keep a map from each original Sequence to its flattened Sequence. Before fully constructing the flattened symbol for a Sequence, we insert an empty output Sequence into the map and then start filling in its production. If the same Sequence is encountered again due to recursion, we simply return the (still empty) output Sequence from the map. Then we actually fill out the production for the Sequence. As part of the flattening process we copy the productions of Sequences into larger arrays. If the original Sequence has not been fully constructed yet, we copy a run of nulls. A Fixup remembers each of those null patches. The fix-ups are finally filled in once the symbols that occupy those patches are known.

Parameters
input  The array of input symbols to flatten.
start  The position where the input sub-array starts.
output  The output array that receives the flattened list of symbols. It should have sufficient space to receive the expanded sub-array of symbols.
skip  The position where the output sub-array starts.
map  A map of symbols which have already been expanded. Useful for handling recursive definitions and for caching.
map2  A map to store the list of fix-ups.
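
The core of the flattening step can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the Avro implementation: the types Sym and SeqSym and the class Flattener are hypothetical names, the grammar is assumed to contain no recursive symbols, and the map / map2 fix-up machinery described above is left out entirely.

```csharp
using System;

// Hypothetical toy model; these are NOT the Avro.IO.Parsing types.
public class Sym
{
    public string Name { get; }
    public Sym(string name) { Name = name; }
}

// Stand-in for a Sequence: a symbol that holds a production.
public sealed class SeqSym : Sym
{
    public Sym[] Production { get; }
    public SeqSym(params Sym[] production) : base("seq") { Production = production; }
}

public static class Flattener
{
    // Number of output slots needed for symbols[start..] once expanded.
    public static int FlattenedSize(Sym[] symbols, int start)
    {
        int size = 0;
        for (int i = start; i < symbols.Length; i++)
        {
            size += symbols[i] is SeqSym s ? FlattenedSize(s.Production, 0) : 1;
        }
        return size;
    }

    // Copies input[start..] into output beginning at index skip,
    // replacing every nested sequence by its production, recursively.
    // Assumption: the grammar is acyclic, so no map or fix-ups are needed.
    public static void Flatten(Sym[] input, int start, Sym[] output, int skip)
    {
        int j = skip;
        for (int i = start; i < input.Length; i++)
        {
            if (input[i] is SeqSym s)
            {
                Flatten(s.Production, 0, output, j);
                j += FlattenedSize(s.Production, 0);
            }
            else
            {
                output[j++] = input[i];
            }
        }
    }

    // Convenience wrapper: sizes the output array, then fills it.
    public static Sym[] FlattenAll(params Sym[] input)
    {
        var output = new Sym[FlattenedSize(input, 0)];
        Flatten(input, 0, output, 0);
        return output;
    }
}
```

In this sketch, flattening x, seq(a, seq(b, c)), y yields the five plain symbols x, a, b, c, y in a single array, which is the property that lets a parser copy one long production instead of repeatedly expanding nested ones.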

◆ FlattenedSize() [1/2]

virtual int Avro.IO.Parsing.Symbol.FlattenedSize ( )
virtual

Returns the flattened size.

Reimplemented in Avro.IO.Parsing.Symbol.Sequence.

◆ FlattenedSize() [2/2]

static int Avro.IO.Parsing.Symbol.FlattenedSize ( Symbol[]  symbols,
int  start 
)
inline static protected

Returns the amount of space required to flatten the given sub-array of symbols.

Parameters
symbols  The array of input symbols.
start  The index where the sub-array starts.
Returns
The number of symbols that will be produced if one expands the given input.
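
The relationship between the two FlattenedSize overloads can be sketched as follows. This is a hedged illustration only: Node, Leaf, Seq and SizeCounter are hypothetical names rather than Avro types, and the sketch assumes an acyclic grammar, since a self-referencing sequence would recurse forever here.

```csharp
using System;

// Hypothetical toy model; these are NOT the Avro.IO.Parsing types.
public abstract class Node
{
    // A symbol that is not a sequence occupies one slot when flattened.
    public virtual int FlattenedSize() => 1;
}

public sealed class Leaf : Node
{
    public string Name { get; }
    public Leaf(string name) { Name = name; }
}

public sealed class Seq : Node
{
    public Node[] Production { get; }
    public Seq(params Node[] production) { Production = production; }

    // A sequence occupies as many slots as its expanded production.
    public override int FlattenedSize() => SizeCounter.FlattenedSize(Production, 0);
}

public static class SizeCounter
{
    // Sums the flattened sizes of symbols[start..].
    public static int FlattenedSize(Node[] symbols, int start)
    {
        int result = 0;
        for (int i = start; i < symbols.Length; i++)
        {
            result += symbols[i].FlattenedSize();
        }
        return result;
    }
}
```

The virtual method lets each kind of symbol report its own footprint, while the static method adds those footprints up for a sub-array, so a caller can allocate the output array before flattening.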

◆ NewRepeat()

static Symbol Avro.IO.Parsing.Symbol.NewRepeat ( Symbol  endSymbol,
params Symbol[]  symsToRepeat 
)
static

A convenience method to construct a repeater.

Parameters
endSymbol  The end symbol.
symsToRepeat  The symbols to repeat in the repeater.

◆ NewSeq()

static Symbol Avro.IO.Parsing.Symbol.NewSeq ( params Symbol[]  production)
static

A convenience method to construct a sequence.

Parameters
production  The constituent symbols of the sequence.

◆ Resolve()

static Symbol Avro.IO.Parsing.Symbol.Resolve ( Symbol  w,
Symbol  r 
)
static protected

A convenience method to construct a ResolvingAction.

Parameters
w  The writer symbol.
r  The reader symbol.

Property Documentation

◆ Production

Symbol [] Avro.IO.Parsing.Symbol.Production
get

The production for this symbol. If this symbol is a terminal, this is null. Otherwise it holds the sequence of symbols that forms the production for this symbol. The sequence is stored in the reverse order of production, which makes it easy to copy onto the parsing stack.

Please note that this is final (read-only), so the production for a symbol should be known before that symbol is constructed. This requirement cannot be met for recursive symbols (e.g. a record that holds a union, one branch of which is the record itself). To resolve this problem, we initialize the symbol with an array of nulls and fill in the symbols later. Not clean, but it works. The other option is to not make this field final, but keeping it final, and thus keeping the symbol immutable, gives some comfort. See the various generators for how records are generated.
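
Why a reversed production is convenient for a stack-based parser can be shown with a small sketch. It is an assumption-laden illustration: ReverseProductionDemo and its methods are hypothetical, symbols are plain strings, and a Stack<string> stands in for whatever stack structure the real parser uses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration; not part of Avro.IO.Parsing.
public static class ReverseProductionDemo
{
    // Stores a production in reverse order, as the text describes.
    public static string[] Reverse(params string[] inOrder)
    {
        var reversed = (string[])inOrder.Clone();
        Array.Reverse(reversed);
        return reversed;
    }

    // Pushes a reversed production front to back, then returns the
    // order in which the symbols come off the stack.
    public static List<string> PushThenPop(string[] reversedProduction)
    {
        var stack = new Stack<string>();
        foreach (var sym in reversedProduction)
        {
            stack.Push(sym); // straight copy, no index arithmetic
        }

        var popped = new List<string>();
        while (stack.Count > 0)
        {
            popped.Add(stack.Pop());
        }
        return popped;
    }
}
```

Because the last symbol pushed is the first one popped, copying the reversed array onto the stack in storage order leaves the first symbol of the production on top, so the parser consumes symbols in grammar order without reversing anything at parse time.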


The documentation for this class was generated from the following file: