Built-in fields specification
Destructify comes with a wide range of built-in field types, so you can describe the most common structures right out of the box.
Common attributes
All fields are subclasses of Field and therefore share some attributes by default. The following attributes can be set on every field:
-
Field.name
The field name. This is set automatically by the Structure’s metaclass when it is initialized.
-
Field.default
The field’s default value. If provided, it is used when the Structure is initialized. If it is not provided, the field determines its own default value.
You can set it to one of the following:
- A callable with zero arguments
- A callable taking a ParsingContext.f object
- A value
All of the following are valid usages of the default attribute:
    Field(default=None)
    Field(default=3)
    Field(default=lambda: datetime.datetime.now())
    Field(default=lambda c: c.value)
You can check whether a default is set using the Field.has_default attribute. The default given a context is obtained by calling Field.get_default(context).
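The three accepted forms of a default (a value, a zero-argument callable, a callable taking the context object) can be illustrated without Destructify. The sketch below is not the library's implementation: resolve_default is a hypothetical helper, and SimpleNamespace merely stands in for a ParsingContext.f object.

```python
import datetime
import inspect
from types import SimpleNamespace


def resolve_default(default, f=None):
    """Hypothetical helper: turn a value or callable into a concrete default."""
    if not callable(default):
        return default  # a plain value, including None
    # Zero-argument callables are called as-is; others receive the context object.
    if len(inspect.signature(default).parameters) == 0:
        return default()
    return default(f)


f = SimpleNamespace(value=7)  # stand-in for a ParsingContext.f object

print(resolve_default(None))                  # None
print(resolve_default(3))                     # 3
print(resolve_default(lambda c: c.value, f))  # 7
print(type(resolve_default(lambda: datetime.datetime.now())).__name__)  # datetime
```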
-
Field.override
Using Field.override, you can change the value of the field in a structure just before it is written to a stream. This is useful if you wish to derive a field’s value from some other property in the structure. For instance, you can set a length field based on the actual length of another field.
You can set it to one of the following:
- A value
- A callable taking a ParsingContext.f object and the current value of the field
For instance:
    Field(override=3)
    Field(override=lambda c, v: c.value if v is None else v)
You can check whether an override is set using the Field.has_override attribute. The override given a context is obtained by calling Field.get_overridden_value(value, context). Note, however, that you probably want to call Field.get_final_value() instead.
-
Field.decoder
-
Field.encoder
Sometimes, the value of a field differs from the value stored in the binary structure. This can happen, for instance, if the value in the structure is off by one. Rather than setting Field.override while writing, you can use Field.encoder and Field.decoder to change the way a value is written to and read from the stream, respectively.
You can set each to a callable taking the current value of the field:
    Field(decoder=lambda v: v * 2, encoder=lambda v: v // 2)
Field.decoder is used when reading from the stream. It is called from Field.decode_value(). Field.encoder is used when writing to the stream. It is called from Field.encode_value().
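As an illustration of the off-by-one case, the following sketch uses plain functions rather than Destructify fields. It assumes a hypothetical format in which a one-byte count is stored as count minus one, so that the values 1 to 256 fit in a single byte.

```python
def encode(value: int) -> int:
    """Applied when writing: the stream stores the count minus one."""
    return value - 1


def decode(raw: int) -> int:
    """Applied when reading: undo the off-by-one."""
    return raw + 1


stored = bytes([encode(256)])  # 256 would not fit in one byte without the encoder
restored = decode(stored[0])

print(stored)    # b'\xff'
print(restored)  # 256
```

An encoder and decoder pair should be each other's inverse, so that a value survives a write followed by a read.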
-
Field.offset
-
Field.skip
The absolute offset of the field in the stream (in the case of offset), or the offset of the field relative to the previous field (in the case of skip). offset can be a negative value to indicate an offset from the end of the stream.
You can’t set both at the same time. You can set each to one of the following:
- A callable with zero arguments
- A callable taking a ParsingContext.f object
- A string that represents the field name that contains the value
- An integer
Fields are always processed in the order they are defined, so a field that follows a field with one of these attributes set continues from the then-current position.
When you set offset or skip, StructureOptions.alignment is ignored for this field.
The value of skip is automatically accounted for when using len(Structure). If offset is set, len(Structure) is no longer possible.
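The difference between an absolute offset, a negative offset and a relative skip maps onto the three seek modes of a standard Python stream. The sketch below uses only io.BytesIO; the byte layout is made up for illustration and does not involve Destructify.

```python
import io

stream = io.BytesIO(b"HEADpayload-TAIL")

# Absolute offset of 4: position is counted from the start of the stream.
stream.seek(4, io.SEEK_SET)
payload = stream.read(7)

# Skip of 1: position is counted from where the previous read ended.
stream.seek(1, io.SEEK_CUR)
tail = stream.read(4)

# Negative offset of -4: position is counted from the end of the stream.
stream.seek(-4, io.SEEK_END)
tail_again = stream.read()

print(payload, tail, tail_again)  # b'payload' b'TAIL' b'TAIL'
```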
-
Field.lazy
A lazy field is not parsed from the stream during the parsing of the bytes; its parsing is deferred until the value is evaluated. This is done by returning a Proxy object from the lazy-object-proxy module that references the offset of the field in the stream and the stream itself. The first time the Proxy object is evaluated, the stream is read and the data is parsed. This Proxy object can be used in almost the same way as an actual value.
This requires that the stream remains open until all lazy fields have been parsed. Additionally, the stream must be seekable to find the appropriate data.
Note that specifying lazy does not prevent the parser from parsing the field anyway and returning the actual value rather than a Proxy object. Some cases where this happens:
- The lazy attribute has no effect when a value cannot be retrieved lazily, i.e. Field.seek_end() returns None and the next field defines no absolute offset. In this case, the field must still be parsed to determine its full length, and is therefore parsed immediately.
- When lazy fields are referenced and subsequently parsed during parsing, the Structure will be built with the actual value rather than the Proxy object.
Additionally, lazy fields that have an absolute offset set (to an integer value) can be referenced during parsing, even if they are defined later.
This attribute has no effect when writing to a stream; a lazy value will be resolved by Structure.to_stream().
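The idea of deferring a read until the value is needed can be sketched with the standard library alone. LazyBytes below is a hypothetical stand-in, not the Proxy object from lazy-object-proxy and not Destructify's code; it only shows why the stream must stay open and seekable.

```python
import io


class LazyBytes:
    """Hypothetical stand-in for a lazy proxy: remembers where the data
    lives and reads it from the stream on first access only."""

    def __init__(self, stream, offset, length):
        self._stream = stream
        self._offset = offset
        self._length = length
        self._value = None
        self.resolved = False

    @property
    def value(self):
        if not self.resolved:
            position = self._stream.tell()  # remember the current position
            self._stream.seek(self._offset)
            self._value = self._stream.read(self._length)
            self._stream.seek(position)     # restore it afterwards
            self.resolved = True
        return self._value


stream = io.BytesIO(b"\x05hello-rest")
lazy = LazyBytes(stream, offset=1, length=5)

print(lazy.resolved)  # False: nothing has been read yet
print(lazy.value)     # b'hello'
print(lazy.resolved)  # True
```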
BytesField
-
class destructify.BytesField(*args, length=None, terminator=None, step=1, terminator_handler='consume', strict=True, padding=None, **kwargs)
A BytesField can be used to read bytes from a stream. It is most commonly used as a base class for other fields, as it covers the most common use cases.
There are three typical ways to use this field:
- Setting a BytesField.length to read a specified number of bytes from a stream.
- Setting a BytesField.terminator to read until the specified byte from a stream.
- Setting both BytesField.length and BytesField.terminator to first read the specified number of bytes from a stream and then find the terminator in these bytes.
-
length
This specifies the length of the field. This is the amount of data that is read from the stream and written to the stream. The length may also be negative to indicate an unbounded read, i.e. until the end of the stream.
You can set this attribute to one of the following:
- A callable with zero arguments
- A callable taking a ParsingContext.f object
- A string that represents the field name that contains the length
- An integer
For instance:
    class StructureWithLength(Structure):
        length = UnsignedByteField()
        value = BytesField(length='length')
The length given a context is obtained by calling FixedLengthField.get_length(value, context).
When the class is initialized on a Structure, and the length property is specified using a string, the default implementation of the Field.override on the named attribute of the Structure is changed to match the length of the value in this Field.
Continuing the above example, the following works automatically:
    >>> bytes(StructureWithLength(value=b"123456"))
    b'\x06123456'
However, explicitly specifying the length would override this:
    >>> bytes(StructureWithLength(length=1, value=b"123456"))
    b'\x01123456'
This behaviour can be changed by manually specifying a different Field.override on length.
-
strict
This boolean (defaults to True) enables raising errors in the following cases:
- A StreamExhaustedError when there are not sufficient bytes to completely fill the field while reading.
- A StreamExhaustedError when the terminator is not found while reading.
- A WriteError when there are not sufficient bytes to fill the field while writing and padding is not set.
- A WriteError when the field must be padded, but the bytes that are to be written are not a multiple of the size of padding.
- A WriteError when there are too many bytes to fit in the field while writing.
- A WriteError when the terminator is missing from the value, when using the terminator_handler include.
Disabling BytesField.strict is not recommended, as this may cause inadvertent errors.
-
padding
When set, this value is used to pad the bytes to fill the entire field while writing, and is stripped from the value while reading. Padding is removed right to left and must be aligned to the end of the value (which matters for multibyte paddings).
While writing in strict mode, if the remaining bytes are not a multiple of the length of this value, a WriteError is raised. If strict mode is not enabled, the padding will simply be appended to the value and chopped off wherever required. However, this can’t be parsed back by Destructify (as the padding is not aligned to the end of the structure).
This can only be set when length is used.
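The padding rules can be made concrete with a small stand-alone sketch. The functions pad and unpad are hypothetical names for illustration, and ValueError stands in for the WriteError that Destructify raises in strict mode.

```python
def pad(value: bytes, length: int, padding: bytes) -> bytes:
    """Fill `value` up to `length` bytes with whole copies of `padding`."""
    remaining = length - len(value)
    if remaining < 0:
        raise ValueError("value does not fit in the field")
    if remaining % len(padding) != 0:
        # Mirrors the strict-mode rule: the gap must be a multiple of the padding.
        raise ValueError("remaining bytes are not a multiple of the padding size")
    return value + padding * (remaining // len(padding))


def unpad(data: bytes, padding: bytes) -> bytes:
    """Strip whole copies of `padding` from the right, aligned to the end."""
    while data.endswith(padding):
        data = data[:-len(padding)]
    return data


print(pad(b"ab", 6, b"\r\n"))           # b'ab\r\n\r\n'
print(unpad(b"ab\r\n\r\n", b"\r\n"))    # b'ab'
print(unpad(b"a\n\r\n", b"\r\n"))       # b'a\n' (the lone '\n' is not padding)
```

With a two-byte padding, a three-byte value in a six-byte field leaves three bytes to fill, which is why pad(b"abc", 6, b"\r\n") raises in this sketch.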
-
terminator
The terminator to read until. It can be multiple bytes.
When this is set, padding is ignored while reading from a stream, but may be used to pad bytes that are written.
-
step
The size of the steps for finding the terminator. This is useful if you have a multi-byte terminator that is aligned. For instance, when reading NULL-terminated UTF-16 strings, you’d expect two NULL bytes aligned to two bytes (from the start). Defaults to 1.
Example usage:
    >>> class TerminatedStructure(Structure):
    ...     foo = BytesField(terminator=b'\0')
    ...     bar = BytesField(terminator=b'\r\n')
    ...
    >>> TerminatedStructure.from_bytes(b"hello\0world\r\n")
    <TerminatedStructure: TerminatedStructure(foo=b'hello', bar=b'world')>
-
terminator_handler
A string defining what to do with the terminator as soon as it is encountered. You have three options:
- consume: This is the default handler. It consumes the terminator, leaving it off the resulting value.
- include: This handler includes the entire terminator in the resulting value. You must also write it back yourself.
- until: This handler is only available when you are not using length. It reads up to, but not including, the terminator. This means that the next field will include the terminator.
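How the step size and the three handlers interact can be sketched with a plain stream reader. read_until is a hypothetical function, not part of Destructify, and it assumes the terminator length is a multiple of step; EOFError stands in for StreamExhaustedError.

```python
import io


def read_until(stream, terminator, step=1, handler="consume"):
    """Read `step` bytes at a time until the data ends with `terminator`."""
    data = b""
    while True:
        chunk = stream.read(step)
        if len(chunk) < step:
            raise EOFError("terminator not found")
        data += chunk
        if data.endswith(terminator):
            break
    if handler == "include":
        return data  # terminator stays part of the value
    if handler == "until":
        # Rewind so that the next read starts at the terminator.
        stream.seek(-len(terminator), io.SEEK_CUR)
    return data[:-len(terminator)]


utf16 = "hi".encode("utf-16-le") + b"\0\0" + b"rest"

aligned = read_until(io.BytesIO(utf16), b"\0\0", step=2)
unaligned = read_until(io.BytesIO(utf16), b"\0\0", step=1)

print(aligned)    # b'h\x00i\x00'
print(unaligned)  # b'h\x00i' (cut in the middle of the second character)
```

With step=1 the two NULL bytes are found one byte too early, because the high byte of "i" is itself a NULL byte. Stepping by two keeps the search aligned to whole UTF-16 code units.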
This class can easily be extended to add functionality. For instance, StringField is a subclass of this field.
FixedLengthField
-
class destructify.FixedLengthField(length, *args, **kwargs)
This class is identical to BytesField, but specifies the length as a required first argument. It is intended to read exactly BytesField.length bytes.
TerminatedField
-
class destructify.TerminatedField(terminator=b'\x00', *args, **kwargs)
This class is identical to BytesField, but specifies the terminator as its first argument, defaulting to a single NULL-byte. It is intended to continue reading until BytesField.terminator is hit.
StringField
-
class destructify.StringField(*args, encoding=None, errors='strict', **kwargs)
The StringField is a subclass of BytesField that converts the resulting bytes object to a str object, given the encoding and errors attributes.
See BytesField for all available attributes.
-
encoding
The encoding of the string. This defaults to the value set on the StructureOptions, which defaults to utf-8, but can be any encoding supported by Python.
-
errors
The error handler for encoding/decoding failures. Defaults to Python’s default of strict.
-
IntegerField
-
class destructify.IntegerField(length, byte_order=None, *args, signed=False, **kwargs)
The IntegerField is used for fixed-length representations of integers.
Note
The IntegerField is not to be confused with the IntField, which is based on StructField.
-
length
The length (in bytes) of the field. When writing a number that is too large to be held in this field, you will get an OverflowError.
-
byte_order
The byte order (i.e. endianness) of the bytes in this field. If you do not specify this, you must specify a byte_order on the structure.
-
signed
Boolean indicating whether the integer is to be interpreted as a signed or unsigned integer.
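The effect of length, byte_order and signed can be seen with Python's built-in int.from_bytes and int.to_bytes, which behave analogously. This sketch does not use Destructify, and whether IntegerField relies on these built-ins internally is an assumption.

```python
raw = b"\xff\xfe"  # a two-byte field

big_unsigned = int.from_bytes(raw, "big", signed=False)
little_unsigned = int.from_bytes(raw, "little", signed=False)
big_signed = int.from_bytes(raw, "big", signed=True)

print(big_unsigned)     # 65534
print(little_unsigned)  # 65279
print(big_signed)       # -2

# Writing a value that does not fit in two bytes raises OverflowError.
try:
    (65536).to_bytes(2, "big")
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)       # True
```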
-
VariableLengthIntegerField
-
class destructify.VariableLengthIntegerField(*, name=None, default=NOT_PROVIDED, override=NOT_PROVIDED, decoder=None, encoder=None, offset=None, skip=None, lazy=False)
Implementation of a variable-length quantity structure.
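For readers unfamiliar with variable-length quantities, the sketch below shows the common encoding used in, for instance, MIDI files: seven data bits per byte, most significant group first, with the high bit set on every byte except the last. That this is the exact variant Destructify implements is an assumption; encode_vlq and decode_vlq are hypothetical names.

```python
def encode_vlq(value: int) -> bytes:
    """Encode a non-negative integer as a variable-length quantity."""
    if value < 0:
        raise ValueError("only non-negative integers can be encoded")
    groups = [value & 0x7F]  # last byte: continuation bit cleared
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: continuation bit set
        value >>= 7
    return bytes(reversed(groups))


def decode_vlq(data: bytes):
    """Return (value, number of bytes consumed)."""
    value = 0
    for index, byte in enumerate(data):
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # continuation bit cleared: this was the last byte
            return value, index + 1
    raise ValueError("data ended before the quantity was complete")


print(encode_vlq(127))               # b'\x7f'
print(encode_vlq(128))               # b'\x81\x00'
print(decode_vlq(b"\x81\x00rest"))   # (128, 2)
```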
BitField
-
class destructify.BitField(length, *args, realign=False, **kwargs)
A subclass of FixedLengthField, reading bits rather than bytes. The field writes and reads integers.
When using the BitField, you must be careful to align the field to whole bytes. You can use multiple BitFields consecutively without any problem, but the following would raise errors:
    class MultipleBitFields(Structure):
        bit0 = BitField(length=1)
        bit1 = BitField(length=1)
        byte = FixedLengthField(length=1)
You can fix this by ensuring all consecutive bit fields align to a byte in total, or, alternatively, you can specify realign on the last BitField to realign to the next byte.
-
length
The number of bits to read.
-
realign
This specifies whether the stream must be realigned to entire bytes after this field. If set, after bits have been read, bits are skipped until the next whole byte. This means that the intermediate bits are ignored. When writing with this boolean set, the field is padded with zero-bits until the next byte boundary.
Note that this means that the following:
    class BitStructure(Structure):
        foo = BitField(length=5, realign=True)
        bar = FixedLengthField(length=1)
Results in this parsing structure:
    76543210  76543210
    fffff     bbbbbbbb
Thus, ignoring bits 2-0 from the first byte.
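The layout above can be reproduced with plain bit operations. This is a sketch of the arithmetic only, using made-up input bytes; it does not use Destructify.

```python
data = bytes([0b10110_101, 0x41])  # first byte: fffff = 10110, ignored bits = 101

# Reading: foo is the five most significant bits of the first byte;
# bits 2-0 are skipped because of the realignment.
foo = data[0] >> 3
bar = data[1:2]

# Writing: the three skipped bits are padded with zero-bits.
written = bytes([foo << 3]) + bar

print(foo)      # 22
print(bar)      # b'A'
print(written)  # b'\xb0A'
```

Note that the ignored bits are lost in a read followed by a write: the first byte comes back as 0xb0 rather than 0xb5.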
A BitField has some important gotchas and exceptions to normal fields:
- StructureOptions.alignment is ignored when two BitFields follow each other, and the previous field does not specify realign.
- Field.skip and Field.offset must be specified in entire bytes, and require the field to be aligned.
- Field.lazy does not work, due to complexities with parsing partial bytes.
- len(BitField) returns the value in bits rather than in bytes.
- len(Structure) works properly, but requires that all fields are aligned, including the last field.
-
ConstantField
-
class destructify.ConstantField(value, base_field=None, *args, **kwargs)
The ConstantField is intended to read/write a specific magic string from and to a stream. If anything else is read or written, an exception is raised. Note that the Field.default is also set to the magic.
-
value
The magic bytes that must be checked against.
-
base_field
The field to read the value from. If this is not set, and value is a bytes object, a FixedLengthField is used by default. If the value is any other kind of object, you must specify this yourself.
-
StructField
-
class destructify.StructField(format=None, byte_order=None, *args, multibyte=True, **kwargs)
The StructField enables you to use Python struct constructs if you wish to. Note that using complex formats in this field somewhat defeats the purpose of this module.
-
format
The format to be passed to the struct module. See Struct Format Strings in the Python manual for information on how to construct these.
You do not need to include the byte order in this attribute. If you do, it acts as a default for the byte_order attribute if you do not specify one.
-
byte_order
The byte order to use for the struct. If this is not specified, and none is provided in the format attribute, it defaults to the byte_order specified in the meta of the destructify.structures.Structure.
-
multibyte
When set to False, the Python representation of this field is the first result of the tuple as returned by the struct module. Otherwise, the tuple is the result.
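The difference between the tuple and its first element is a property of Python's struct module itself, as the following stand-alone sketch shows. It uses only the standard library; the input bytes are made up.

```python
import struct

raw = b"\x01\x02\x03\x04"

# struct.unpack always returns a tuple, even for a single-value format.
pair = struct.unpack(">HH", raw)    # two big-endian unsigned shorts
single = struct.unpack(">I", raw)   # one big-endian unsigned int

print(pair)       # (258, 772)
print(single)     # (16909060,)
print(single[0])  # 16909060, the first result of the tuple

# The format also determines how many bytes are consumed.
print(struct.calcsize(">HH"))  # 4
```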
-
Subclasses of StructField¶
This project also provides several default implementations for the different types of structs. For each of the formats described in Struct Format Strings, there is a single-byte class. Note that you must specify your own
Each of the classes is listed in the table below.
Hint
Use an IntegerField when you know the number of bytes you need to parse. The classes below are typically used for system structures, while the IntegerField is typically used for network structures.
| Base class | Format |
|---|---|
| CharField | c |
| ByteField | b |
| UnsignedByteField | B |
| BoolField | ? |
| ShortField | h |
| UnsignedShortField | H |
| IntField | i |
| UnsignedIntField | I |
| LongField | l |
| UnsignedLongField | L |
| LongLongField | q |
| UnsignedLongLongField | Q |
| SizeField | n |
| UnsignedSizeField | N |
| HalfPrecisionFloatField | e |
| FloatField | f |
| DoubleField | d |
StructureField
-
class destructify.StructureField(structure, *args, length=None, **kwargs)
The StructureField is intended to create a structure that nests other structures. You can use this for complex structures, or combine it with, for instance, an ArrayField to create arrays of structures, or with a SwitchField to create type-based structures.
-
length
The length of this structure. This allows you to limit the structure’s length. This is particularly useful when you have a Structure that contains an unbounded read, but the encapsulating structure limits this.
You can set it to one of the following:
- A callable with zero arguments
- A callable taking a ParsingContext.f object
- A string that represents the field name that contains the size
- An integer
When specified using a string, this field does not override the value of the referenced field due to complications in calculating the length.
During reading and writing, if the specified length is larger than the structure, the remaining bytes are skipped. If it is shorter, the structure parsing will break.
Example usage:
    >>> class Sub(Structure):
    ...     foo = FixedLengthField(length=11)
    ...
    >>> class Encapsulating(Structure):
    ...     bar = StructureField(Sub)
    ...
    >>> s = Encapsulating.from_bytes(b"hello world")
    >>> s
    <Encapsulating: Encapsulating(bar=<Sub: Sub(foo=b'hello world')>)>
    >>> s.bar
    <Sub: Sub(foo=b'hello world')>
    >>> s.bar.foo
    b'hello world'
This field provides the ParsingContext of the substructure in FieldContext.subcontext.
ArrayField
-
class destructify.ArrayField(base_field, count=None, length=None, until=None, *args, **kwargs)
A field that repeats the provided base field multiple times. The implementation will build a structure-like parsing context with field names that are the element indexes.
-
base_field
The field that is to be repeated.
-
count
This specifies the number of repetitions of the base field.
You can set it to one of the following:
- A callable with zero arguments
- A callable taking a ParsingContext.f object
- A string that represents the field name that contains the size
- An integer
The count given a context is obtained by calling ArrayField.get_count(value, context).
When this attribute is set using a string, and the referenced field does not have an override set, the override of the referenced field will be set to take the length of the value of this field.
When writing, the count must exactly match the number of items in the provided iterable.
Example usage:
    >>> class ArrayStructure(Structure):
    ...     count = UnsignedByteField()
    ...     foo = ArrayField(TerminatedField(terminator=b'\0'), count='count')
    ...
    >>> s = ArrayStructure.from_bytes(b"\x02hello\0world\0")
    >>> s.foo
    [b'hello', b'world']
-
length
This specifies the size of the field, if you do not know the count of the fields, but do know the size.
You can set it to one of the following:
- A callable with zero arguments
- A callable taking a ParsingContext.f object
- A string that represents the field name that contains the size
- An integer
The length given a context is obtained by calling ArrayField.get_length(value, context).
You can specify a negative length if you want to read until the stream ends. Note that this is currently implemented by swallowing a StreamExhaustedError from the base field.
When specified using a string, this field does not override the value of the referenced field due to complications in calculating the length.
When writing using a positive length, the number of bytes written must be exactly the specified length.
-
until
This is a function taking a context and the value of the most recently parsed element. If this function returns true, the parsing stops.
This function is ignored during writing.
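The stop condition described for until can be sketched as a loop around an element reader. Everything below is hypothetical: read_array is not a Destructify function, the list of items parsed so far stands in for the context, and keeping the element that triggers the stop is an assumption of this sketch.

```python
import io


def read_array(stream, read_element, until):
    """Read elements until `until(context, value)` returns a true value."""
    items = []
    while True:
        item = read_element(stream)
        if item == b"":
            raise EOFError("stream ended before the stop condition was met")
        items.append(item)
        # The items parsed so far act as a stand-in for the parsing context.
        if until(items, item):
            return items


stream = io.BytesIO(b"abc\0def")
result = read_array(
    stream,
    read_element=lambda s: s.read(1),
    until=lambda context, value: value == b"\0",
)

print(result)         # [b'a', b'b', b'c', b'\x00']
print(stream.read())  # b'def'
```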
-
ConditionalField
-
class destructify.ConditionalField(base_field, condition, *args, fallback=None, **kwargs)
A field that may or may not be present. When the condition evaluates to true, the base_field field is parsed; otherwise the field is None.
-
base_field
The field that is conditionally present.
-
condition
This specifies the condition on whether the field is present.
You can set it to one of the following:
- A callable with zero arguments
- A callable taking a ParsingContext.f object
- A string that represents the field name that evaluates to true or false. Note that b'\0' evaluates to true.
- A value that is to be evaluated
The condition given a context is obtained by calling ConditionalField.get_condition(value, context).
-
fallback
The value that is used in the structure when loading from the stream and no value was present in the stream. Defaults to None, but could be any value.
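The remark that b'\0' evaluates to true follows from Python's truthiness rules: any non-empty bytes object is truthy, regardless of its content. The short sketch below shows the difference between a flag read as bytes and the same flag read as an integer; it is plain Python and does not use Destructify.

```python
flag_as_bytes = b"\0"                               # a one-byte flag read as bytes
flag_as_int = int.from_bytes(flag_as_bytes, "big")  # the same byte read as an integer

print(bool(flag_as_bytes))  # True: a non-empty bytes object is truthy
print(bool(flag_as_int))    # False: the integer 0 is falsy
print(bool(b""))            # False: only empty bytes are falsy
```

When a condition refers to a field by name, the referenced field should therefore produce an integer or boolean rather than raw bytes if a zero byte is meant to disable the conditional field.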
-
SwitchField
-
class destructify.SwitchField(cases, switch, *args, other=None, **kwargs)
The SwitchField can be used to represent various types depending on some other value. You set the different cases using a dictionary of value-to-field-types in the cases attribute. The switch value defines the case that is applied. If none is found, an error is raised, unless other is set.
-
switch
This specifies the switch, i.e. the key for cases.
You can set it to one of the following:
- A callable with zero arguments
- A callable taking a ParsingContext.f object
- A string that represents the field name that evaluates to the value of the condition
- A value that is to be evaluated
-
other
The ‘default’ case that is used when the switch is not part of the cases. If not specified, and an unknown value is encountered, an exception is raised.
Hint
It is easy to confuse Field.default with other, though their purposes are entirely different.
Example:
    class ConditionalStructure(Structure):
        type = EnumField(IntegerField(1), enum=Types)
        perms = SwitchField(cases={
            Types.FIRST: StructureField(Structure1),
            Types.SECOND: StructureField(Structure2),
        }, other=StructureField(Structure0), switch='type')
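The lookup that a switch performs is essentially a dictionary dispatch with an optional fallback. The sketch below illustrates that idea in plain Python; the message layout (a type byte followed by a payload) and all function names are made up for illustration and are not part of Destructify.

```python
import struct


def parse_short(payload):
    return struct.unpack(">H", payload)[0]


def parse_text(payload):
    return payload.decode("ascii")


def parse_raw(payload):
    return payload


# Maps the switch value (here: the first byte) to the parser for that case.
CASES = {1: parse_short, 2: parse_text}


def parse_message(data, other=None):
    type_, payload = data[0], data[1:]
    parser = CASES.get(type_, other)  # fall back to `other` for unknown types
    if parser is None:
        raise ValueError(f"no case for type {type_} and no fallback set")
    return parser(payload)


print(parse_message(b"\x01\x01\x00"))               # 256
print(parse_message(b"\x02hi"))                     # hi
print(parse_message(b"\x09zz", other=parse_raw))    # b'zz'
```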
-
EnumField
-
class destructify.EnumField(base_field, enum, *args, **kwargs)
A field that takes the value as evaluated by the base_field and parses it as the provided enum.
While writing, the value can be an enum member of the specified enum, a string referencing an enum member, or the value that is to be written. Note that a string that is not a valid enum member is passed to the field directly.
During parsing, a value must be a valid enum member, or the enum must properly handle the case of missing members.
-
base_field
The field that returns the value that is provided to the enum.Enum.
-
enum
The enum.Enum class.
You can also use an EnumField to handle flags:
    >>> class Permissions(enum.IntFlag):
    ...     R = 4
    ...     W = 2
    ...     X = 1
    ...
    >>> class EnumStructure(Structure):
    ...     perms = EnumField(UnsignedByteField(), enum=Permissions)
    ...
    >>> EnumStructure.from_bytes(b"\x05")
    <EnumStructure: EnumStructure(perms=<Permissions.R|X: 5>)>
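One way for an enum to handle missing members, as mentioned above, is the _missing_ hook of Python's enum module. The sketch below uses only the standard library; the Kind enum and its UNKNOWN member are hypothetical and chosen for illustration.

```python
import enum


class Kind(enum.Enum):
    FIRST = 1
    SECOND = 2
    UNKNOWN = -1

    @classmethod
    def _missing_(cls, value):
        # Called by the enum machinery when no member has this value.
        return cls.UNKNOWN


class Permissions(enum.IntFlag):
    R = 4
    W = 2
    X = 1


print(Kind(1) is Kind.FIRST)           # True: lookup by value
print(Kind(99) is Kind.UNKNOWN)        # True: handled by _missing_
print(Kind["SECOND"] is Kind.SECOND)   # True: lookup by member name

perms = Permissions(5)
print(Permissions.R in perms, Permissions.W in perms)  # True False
```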
-