12 | | * View on different levels; for instance, view the integer and sequence of |
13 | | bytes comprising a string if necessary. |
14 | | * Check whether files are consistent. |
15 | | * Handle broken files. |
16 | | * Don’t try to read the whole file at once. |
17 | | * Allow full modifications. Ideally, allow creation of a whole filesystem from scratch. |
| 12 | * Work in HelenOS—this means the code must be in C and/or an easily ported |
| 13 | language like Lua. |
| 14 | * View on different layers; for instance, switch between viewing the formatted |
| 15 | date and time for a FAT directory entry, the integers, and the original |
| 16 | bytes. |
| 17 | * Check whether data is valid; handle broken data reasonably well. |
| 18 | * Parse pieces of the data lazily; don’t try to read everything at once. |
| 19 | * Work in both directions (parsing and building) without requiring too much |
| 20 | extra effort. |
| 21 | * Support full modifications. Ideally, allow creation of a whole filesystem |
| 22 | from scratch. |
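The "view on different layers" requirement can be illustrated with the FAT date example mentioned above. The following is only a minimal sketch, not part of any existing library: the names `fat_date_integer`, `fat_date_decode`, `fat_date_is_valid`, and `fat_date_format` are hypothetical. It relies on the standard FAT date layout (16-bit little-endian; bits 15-9 are years since 1980, bits 8-5 the month, bits 4-0 the day).

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Layer 1 is the two raw bytes as stored on disk. */

/* Layer 2: the integer stored in those bytes (FAT is little-endian). */
static uint16_t fat_date_integer(const uint8_t bytes[2])
{
	return (uint16_t)(bytes[0] | (bytes[1] << 8));
}

/* Layer 3: the calendar date encoded in that integer. */
struct fat_date {
	int year;
	int month;
	int day;
};

static struct fat_date fat_date_decode(uint16_t raw)
{
	struct fat_date d;
	d.year = 1980 + (raw >> 9);     /* bits 15-9 */
	d.month = (raw >> 5) & 0x0f;    /* bits 8-5 */
	d.day = raw & 0x1f;             /* bits 4-0 */
	return d;
}

/* Basic range check only; it does not know month lengths. */
static bool fat_date_is_valid(struct fat_date d)
{
	return d.month >= 1 && d.month <= 12 && d.day >= 1 && d.day <= 31;
}

/* Formatted view: YYYY-MM-DD. Returns what snprintf returns. */
static int fat_date_format(struct fat_date d, char *buf, size_t size)
{
	return snprintf(buf, size, "%04d-%02d-%02d", d.year, d.month, d.day);
}
```

For the bytes `0x21 0x42`, the integer layer is `0x4221` and the formatted layer is `2013-01-01`. A viewer could let the user switch between these three representations of the same field.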
| 23 | |
| 24 | == Interesting formats == |
| 25 | |
| 26 | These formats will be interesting and/or difficult to handle. I will keep them |
| 27 | in mind when designing the library. |
| 28 | |
| 29 | * Filesystem allocation tables, which should be kept consistent with the actual |
| 30 | usage of the disk. |
| 31 | * Filesystem logs, which should be applied to the rest of the disk before |
| 32 | interpreting it. |
| 33 | * Formats where the whole file can have either endianness depending on a field |
| 34 | in the header. |
| 35 | * The [http://www.blender.org/development/architecture/blender-file-format/ Blender file format] |
| 36 | is especially dynamic. When Blender saves a file, it just copies the |
| 37 | structures from memory and translates the pointers. Since each Blender |
| 38 | version and architecture will have different structures, the output file |
| 39 | includes a header describing the fields and binary layout of each structure. |
| 40 | When the file is loaded, the header is read first and the structures are |
| 41 | translated as necessary. |
| 42 | * If the language is powerful enough, it might be possible to have a native |
| 43 | description of Zlib and other compression formats. |
| 44 | * It could be interesting to parse ARM or x86 machine code. |
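To make the endianness item in the list above concrete, here is a small sketch of a parser whose byte order is chosen by a header field. The file layout is invented for illustration (byte 0 is `'L'` or `'B'`, bytes 1 to 4 hold one 32-bit value), and `read_u32` and `parse_file` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum byte_order { ORDER_LITTLE, ORDER_BIG };

/* Decode four bytes as an unsigned 32-bit integer in the given order. */
static uint32_t read_u32(const uint8_t *p, enum byte_order order)
{
	if (order == ORDER_LITTLE) {
		return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
		    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
	}
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical layout: byte 0 is 'L' (little-endian) or 'B' (big-endian),
 * followed by one 32-bit value. Returns false if the data is truncated or
 * the byte order flag is unknown.
 */
static bool parse_file(const uint8_t *data, size_t size, uint32_t *value)
{
	enum byte_order order;

	if (size < 5)
		return false;
	if (data[0] == 'L')
		order = ORDER_LITTLE;
	else if (data[0] == 'B')
		order = ORDER_BIG;
	else
		return false;

	/* Every later field must be decoded with the order chosen here. */
	*value = read_u32(data + 1, order);
	return true;
}
```

The point for the library design is that the byte order is a piece of context established early and threaded through the rest of the parse, rather than a fixed property of each field type.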
45 | | structures can’t be edited. They are simple imperative languages in which |
46 | | fields, structures, bitstructures, and arrays can be defined. The length, |
47 | | decoded value, and presence of fields can be determined by expressions using |
48 | | any previously decoded field, and structures can use |
49 | | `if`/`while`/`continue`/`break` and similar statements. Structures can inherit |
50 | | from other structures, meaning that the parent’s fields are present at the |
51 | | beginning of the child. Statements can move to a different offset in the input |
52 | | data. There may be a real programming language that can be used along with the |
53 | | DSL. |
| 71 | edits (changing the length of a structure) are difficult or impossible. They |
| 72 | are simple imperative languages in which fields, structures, bitstructures, and |
| 73 | arrays can be defined. The length, decoded value, and presence of fields can be |
| 74 | determined by expressions using any previously decoded field, and structures |
| 75 | can use `if`/`while`/`continue`/`break` and similar statements. Structures can |
| 76 | inherit from other structures, meaning that the parent’s fields are present at |
| 77 | the beginning of the child. Statements can move to a different offset in the |
| 78 | input data. There may be a real programming language that can be used along |
| 79 | with the DSL. |
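As a rough illustration of what such a description expresses, the hand-written C below parses a made-up record in which the length of one field and the presence of another both depend on previously decoded fields. The record layout and the names `struct record` and `parse_record` are assumptions for this sketch; none of the tools listed here is being quoted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: flags, name_len, name[name_len], optional extra. */
struct record {
	uint8_t flags;
	uint8_t name_len;
	const uint8_t *name;   /* points into the input buffer */
	bool has_extra;
	uint16_t extra;        /* little-endian, present only if flags bit 0 is set */
};

/* Returns the number of bytes consumed, or -1 if the input is truncated. */
static long parse_record(const uint8_t *data, size_t size, struct record *out)
{
	size_t pos = 0;

	if (size < 2)
		return -1;
	out->flags = data[pos++];
	out->name_len = data[pos++];

	/* Length determined by a previously decoded field. */
	if (size - pos < out->name_len)
		return -1;
	out->name = data + pos;
	pos += out->name_len;

	/* Presence determined by a previously decoded field. */
	out->has_extra = (out->flags & 0x01) != 0;
	out->extra = 0;
	if (out->has_extra) {
		if (size - pos < 2)
			return -1;
		out->extra = (uint16_t)(data[pos] | (data[pos + 1] << 8));
		pos += 2;
	}
	return (long)pos;
}
```

Code in this style only goes in one direction and mixes layout with control flow, which is part of why length-changing edits are hard: changing `name` means updating `name_len` and shifting everything after it.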
63 | | [http://wsgd.free.fr/ Wireshark Generic Dissector]. |
| 98 | [http://www-old.bro-ids.org/wiki/index.php/BinPAC_Userguide BinPAC], |
| 99 | [https://metacpan.org/module/Data::ParseBinary Data::ParseBinary], |
| 100 | [http://datascript.berlios.de/DataScriptLanguageOverview.html DataScript], |
| 101 | [http://www.dataworkshop.de/ DataWorkshop], |
| 102 | [http://wsgd.free.fr/ Wireshark Generic Dissector], |
| 103 | [http://metafuzz.rubyforge.org/binstruct/ Metafuzz BinStruct], and |
| 104 | [http://www.padsproj.org/ PADS]. |
65 | | [http://www.hhdsoftware.com/doc/hex-editor/language-reference-overview.html Hex Editor Neo]. |
| 106 | [http://www.sweetscape.com/010editor/#templates 010 Editor], |
| 107 | [http://www.nyangau.org/be/be.htm Andys Binary Folding Editor], |
| 108 | [https://www.technologismiki.com/prod.php?id=31 Hackman Suite], |
| 109 | [http://www.hhdsoftware.com/doc/hex-editor/language-reference-overview.html Hex Editor Neo], |
| 110 | [http://apps.tempel.org/iBored/ iBored], and |
| 111 | [https://www.x-ways.net/winhex/templates.html#User_Templates WinHex]. |
| 147 | |
| 148 | == Miscellaneous ideas == |
| 149 | |
| 150 | === Code exporter === |
| 151 | |
| 152 | A tool could generate C code to read and write data given a specification. A |
| 153 | separate file could be used to specify which types should be used and which |
| 154 | things should be read lazily or strictly. |
| 155 | |
| 156 | === Diff === |
| 157 | |
| 158 | A diff tool could show differences in the interpreted data. |
| 159 | |
| 160 | === Space‐filling curves === |
| 161 | |
| 162 | [http://corte.si/posts/visualisation/binvis/index.html Space‐filling curves] |
| 163 | look cool, but this project is about ''avoiding'' looking at raw binary data. |