Changes between Version 5 and Version 6 of StructuredBinaryData

2012-05-08T04:57:59Z
Sean Bartell

Requirements and Existing Tools


  • StructuredBinaryData

    v5 v6  
    88[ GSoC project page].
     10== Requirements ==
     12* View on different levels; for instance, view the integer and sequence of
     13  bytes comprising a string if necessary.
     14* Check whether files are consistent.
     15* Handle broken files.
     16* Don’t try to read the whole file at once.
     17* Allow full modifications. Ideally, allow creation of a whole filesystem from scratch.
    1019== Existing Tools ==
    1423=== [ Construct] ===
    16 TODO: looks promising. Also look at issues and forks.
     25A Python library for creating declarative structure definitions. Each instance
     26of the `Construct` class has a name, and knows how to read from a stream, write
     27to a stream, and determine its length. Some predefined `Construct` subclasses
     28use an arbitrary Python function evaluated at runtime, or behave differently
     29depending on whether sub‐`Construct`s throw exceptions. `Const` uses a
     30sub‐`Construct` and makes sure the value is correct. Also has lazy
     33Unfortunately, if you change the size of a structure, you still have to change
     34everything else manually.
     36TODO: look at issues and forks.
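
The description of Construct above can be illustrated with a minimal sketch in plain Python. This is not Construct's actual API: the `Field`, `Const`, and `Struct` classes, their method names, and the sample header layout are all hypothetical, chosen only to show the idea of named objects that each know how to read from a stream, write to a stream, and report their length.

```python
import io
import struct


class Field:
    """A named field that can read, write, and report its size (hypothetical)."""

    def __init__(self, name, fmt):
        self.name = name
        self.fmt = fmt  # a struct-module format string, e.g. ">H"

    def parse(self, stream):
        data = stream.read(self.size())
        if len(data) != self.size():
            raise ValueError(f"{self.name}: unexpected end of stream")
        return struct.unpack(self.fmt, data)[0]

    def build(self, value, stream):
        stream.write(struct.pack(self.fmt, value))

    def size(self):
        return struct.calcsize(self.fmt)


class Const:
    """Wraps a sub-field and checks that its value is the expected one."""

    def __init__(self, sub, expected):
        self.name = sub.name
        self.sub = sub
        self.expected = expected

    def parse(self, stream):
        value = self.sub.parse(stream)
        if value != self.expected:
            raise ValueError(
                f"{self.name}: expected {self.expected!r}, got {value!r}"
            )
        return value

    def build(self, value, stream):
        # The stored value is ignored; the constant is always written.
        self.sub.build(self.expected, stream)

    def size(self):
        return self.sub.size()


class Struct:
    """An ordered sequence of fields, parsed into a dict keyed by field name."""

    def __init__(self, name, *fields):
        self.name = name
        self.fields = fields

    def parse(self, stream):
        return {f.name: f.parse(stream) for f in self.fields}

    def build(self, value, stream):
        for f in self.fields:
            f.build(value.get(f.name), stream)

    def size(self):
        return sum(f.size() for f in self.fields)


# A made-up header: 2-byte magic, 1-byte version, 4-byte big-endian length.
header = Struct(
    "header",
    Const(Field("magic", ">H"), 0xCAFE),
    Field("version", "B"),
    Field("length", ">I"),
)

raw = bytes([0xCA, 0xFE, 0x01, 0x00, 0x00, 0x00, 0x10])
parsed = header.parse(io.BytesIO(raw))

out = io.BytesIO()
header.build(parsed, out)  # round-trips back to the original bytes
```

In this sketch nothing ties the `length` field to the data it describes, which mirrors the limitation noted above: if a structure changes size, the dependent values still have to be updated by hand.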
    1838=== [ BinData] ===
    20 TODO: looks promising.
     40Makes good use of Ruby syntax; mostly has the same features as Construct.
    22 === [ Wireshark Generic Dissector] ===
     42=== Imperative DSLs ===
    24 The length and real value of a field can depend on all previous fields and use
    25 complex expressions. Structures can contain `if`/`while`/`continue`/`break`/…
    26 statements.
     44DSLs in this category are used in an obvious, deterministic manner, and complex
     45structures can’t be edited. They are simple imperative languages in which
     46fields, structures, bitstructures, and arrays can be defined. The length,
     47decoded value, and presence of fields can be determined by expressions using
     48any previously decoded field, and structures can use
     49`if`/`while`/`continue`/`break` and similar statements. Structures can inherit
     50from other structures, meaning that the parent’s fields are present at the
     51beginning of the child. Statements can move to a different offset in the input
     52data. There may be a real programming language that can be used along with the
     55 [ PyFFI]::
     56  Lets you create or modify files instead of just reading them. Fields can
     57  refer to blocks of data elsewhere in the file. Uses an XML format.
     58 [ Synalyze It!]::
     59  Not completely imperative; if you declare optional structs where part of the
     60  data is constant, the correct struct will be displayed. Has a Graphviz export
     61  of file structure. Uses an XML format.
     62 Other free::
     63  [ Wireshark Generic Dissector].
     64 Other proprietary::
     65  [ Hex Editor Neo].
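
As a rough illustration of the imperative style described in this section, the following Python sketch decodes a made-up record step by step. The record layout, the `parse_record` function, and the flag value are hypothetical and not taken from any of the tools listed; the point is only that the length and presence of later fields are computed from fields decoded earlier, using ordinary `if` and `while` statements.

```python
import struct


def parse_record(data, offset=0):
    """Decode one hypothetical record imperatively, field by field."""
    record = {}

    # Fixed-size fields come first.
    flags, name_len = struct.unpack_from(">BB", data, offset)
    offset += 2
    record["flags"] = flags

    # The length of `name` depends on a previously decoded field.
    record["name"] = data[offset:offset + name_len].decode("ascii")
    offset += name_len

    # The presence of `timestamp` depends on a previously decoded field.
    if flags & 0x01:
        (record["timestamp"],) = struct.unpack_from(">I", data, offset)
        offset += 4

    # A loop reads one-byte items until a zero terminator is found.
    items = []
    while True:
        (item,) = struct.unpack_from("B", data, offset)
        offset += 1
        if item == 0:
            break
        items.append(item)
    record["items"] = items

    return record, offset


with_timestamp = bytes([0x01, 0x03]) + b"abc" + bytes([0, 0, 0, 2, 5, 6, 0])
without_timestamp = bytes([0x00, 0x02]) + b"hi" + bytes([7, 0])

first, first_end = parse_record(with_timestamp)
second, second_end = parse_record(without_timestamp)
```

Reading works naturally in this style, but the code only describes how to decode. Writing a modified record back would need a separate routine that keeps `name_len` and the flag bits consistent, which is one way to see why complex structures are hard to edit with such tools.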
    2867=== Less interesting tools ===
     69 Simple formats in hex editors::
     70  These support static fields and dynamic lengths only:
     71  [ FlexHex],
     72  [ HexEdit],
     73  [ Hex Workshop], and
     74  [ Okteta].
     75 Simple formats elsewhere::
     76  [ ffe],
     77  [ Node Packet], and
     78  [ Scapy]
     79  can only handle trivial structures.
     80  [ Python’s struct] and
     81  [ VStruct]
     82  use concise string formats to describe simple structures.
     83  [ Hachoir]
     84  uses Python for most things.
     85 Protocol definition formats::
     86  [ ASN.1],
     87  [ MIDL],
     88  [ Piqi],
     89  and other IPC implementations go in the other direction: they generate a
     90  binary format from a text description of a structure. ASN.1 in particular
     91  has many features.
    3092 [ Wireshark] and [ tcpdump]::
    3193  As the Construct wiki notes, you would expect these developers to have some
    3294  sort of DSL, but they just use C for everything. Wireshark does use ASN.1,
    3395  Diameter, and MIDL for protocols developed with them.
    34  [ Okteta]::
    35   Has an XML format for simple structures, where the length of a field can
    36   depend on a previous value. Also has an on‐line database of structures, but
    37   it isn’t very popular—there are only nine submissions!
    38  Other simple formats::
    39   [ ffe] can only handle trivial
    40   structures. [ Python’s struct]
    41   and [ VStruct] use concise string formats
    42   to describe simple structures.
    43  Other hex editors::
    44   [ Beye], [ Bless], and
    45   [ GHex] lack interesting features.
    46  Protocol definition formats::
    47   [ ASN.1],
    48   [ MIDL],
    49   and other IPC implementations go in the other direction: they generate a
    50   binary format from a text description of a structure. ASN.1 in particular
    51   has many features.
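
For the "concise string formats" mentioned for Python's struct under the simple formats, a short example using the standard library `struct` module may help. The header layout and the names `HEADER_FORMAT`, `magic`, `version`, and `payload_size` are invented for illustration.

```python
import struct

# "<"  little-endian, no alignment padding
# "4s" 4-byte string, "H" unsigned 16-bit, "I" unsigned 32-bit
HEADER_FORMAT = "<4sHI"

packed = struct.pack(HEADER_FORMAT, b"DATA", 2, 1024)
magic, version, payload_size = struct.unpack(HEADER_FORMAT, packed)
header_size = struct.calcsize(HEADER_FORMAT)
```

The format string is compact, but it can only describe a fixed sequence of fixed-size fields. Anything whose length depends on an earlier value has to be handled in surrounding code.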