KeyedBits

KeyedBits is a binary data serialization format created by Alex Nichol in September 2011. The first known implementation, KBKit, uses Objective-C datatypes to represent KeyedBits-encoded data. The main focus of the KeyedBits format is to represent common data structures in binary form without extensive overhead. Many datatypes can be encoded so that only one or two bytes of overhead are needed for proper decoding.
Capabilities
KeyedBits supports serialization of several common datatypes, including arrays, associative arrays, strings, and integers.
All integers are stored as signed little-endian values. The current version of KeyedBits supports both 32-bit and 64-bit integers, allowing a range from -9223372036854775808 to 9223372036854775807.
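The integer layout can be illustrated with a minimal Python sketch. It covers only the integer payload; the type byte that would precede it is omitted, and the rule of choosing 4 bytes when the value fits and 8 bytes otherwise is an assumption made for illustration, not something the format description confirms. The names encode_int and decode_int are hypothetical.

```python
import struct

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def encode_int(value: int) -> bytes:
    """Pack an integer as signed little-endian.

    Assumption: use 4 bytes when the value fits in 32 bits, 8 bytes otherwise.
    """
    if INT32_MIN <= value <= INT32_MAX:
        return struct.pack("<i", value)   # signed 32-bit, little-endian
    if INT64_MIN <= value <= INT64_MAX:
        return struct.pack("<q", value)   # signed 64-bit, little-endian
    raise OverflowError("value does not fit in a signed 64-bit integer")


def decode_int(payload: bytes) -> int:
    """Unpack a 4-byte or 8-byte signed little-endian payload."""
    formats = {4: "<i", 8: "<q"}
    if len(payload) not in formats:
        raise ValueError("payload must be 4 or 8 bytes long")
    return struct.unpack(formats[len(payload)], payload)[0]
```

For example, encode_int(1) yields the bytes 01 00 00 00, with the least significant byte first, while a value such as 2**31 no longer fits in 32 bits and is packed into 8 bytes.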
Floating-point values are stored as decimal, human-readable strings. This is at least partially because, although floating-point formats are standardized, their binary representation can differ between CPUs. Encoding floating-point values as strings ensures that KeyedBits data can be read on any platform.
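A rough Python sketch of the string-based approach follows. The exact text layout is an assumption: the sketch uses Python's repr(), which produces a short decimal string that converts back to the same float, and it appends a null terminator as described for strings elsewhere in this article. The preceding type byte is omitted, and the function names are hypothetical.

```python
def encode_float(value: float) -> bytes:
    """Encode a float as a null-terminated decimal ASCII string (assumed layout)."""
    # repr() gives a decimal string that float() parses back to the same value.
    return repr(value).encode("ascii") + b"\x00"


def decode_float(payload: bytes) -> float:
    """Read the text up to the first null byte and parse it as a float."""
    text = payload.split(b"\x00", 1)[0]
    return float(text.decode("ascii"))
```

Because the payload is plain text, the decoder does not depend on the byte order or floating-point hardware of the machine that produced it.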
Basic Format
KeyedBits data is simply the "serialized" version of a single object. By serializing an array or associative array, one can serialize multiple objects in one chunk of KeyedBits data. Different datatypes may be encoded very differently, sharing only the single-byte type identifier.
The first byte of an object's serialized data indicates the type of data that follows. This type identifier also indicates the method by which the decoder can calculate the length of the upcoming object data. In all cases, strings are terminated by a null character, whereas raw data is preceded by a length field. Arrays are terminated by NULL elements, represented by a 0 type field.
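The decoding rules above can be sketched in Python as follows. Only the 0 terminator comes from the description; the other type identifier values (TYPE_STRING, TYPE_DATA, TYPE_ARRAY) and the 4-byte little-endian length field are hypothetical choices made so the example runs, and may not match the real format.

```python
import struct

TYPE_TERMINATOR = 0x00  # from the text: a 0 type field ends an array
TYPE_STRING = 0x01      # hypothetical identifier
TYPE_DATA = 0x02        # hypothetical identifier
TYPE_ARRAY = 0x03       # hypothetical identifier


def decode(buf: bytes, pos: int = 0):
    """Decode one object starting at pos; return (object, next position)."""
    kind = buf[pos]
    pos += 1
    if kind == TYPE_STRING:
        # Strings carry no length: scan forward to the null terminator.
        end = buf.index(b"\x00", pos)
        return buf[pos:end].decode("utf-8"), end + 1
    if kind == TYPE_DATA:
        # Raw data is preceded by a length field (assumed: 4 bytes, unsigned).
        (length,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        return buf[pos:pos + length], pos + length
    if kind == TYPE_ARRAY:
        # Elements follow back to back until a 0 type byte is reached.
        items = []
        while buf[pos] != TYPE_TERMINATOR:
            item, pos = decode(buf, pos)
            items.append(item)
        return items, pos + 1
    raise ValueError(f"unknown type identifier {kind:#x}")
```

Under these assumptions, the five bytes 03 01 61 00 00 decode to an array holding the single string "a": one type byte for the array, one for the string, the character itself, the string terminator, and the array terminator.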
KeyedBits allows for data recursion, meaning that a collection (array) could contain one or more other collections. In the case of several arrays being directly contained within each other, the corresponding KeyedBits data may be ended by a sequence of several null terminators.
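The run of trailing terminators is easy to see in a small encoder sketch. As before, only the 0 terminator is taken from the description; the identifier values TYPE_STRING and TYPE_ARRAY are hypothetical, and the encoder handles just strings and arrays to keep the example short.

```python
TYPE_TERMINATOR = 0x00  # from the text: a 0 type field ends an array
TYPE_STRING = 0x01      # hypothetical identifier
TYPE_ARRAY = 0x03       # hypothetical identifier


def encode(obj) -> bytes:
    """Encode a string or a (possibly nested) list of strings."""
    if isinstance(obj, str):
        return bytes([TYPE_STRING]) + obj.encode("utf-8") + b"\x00"
    if isinstance(obj, list):
        body = b"".join(encode(item) for item in obj)
        return bytes([TYPE_ARRAY]) + body + bytes([TYPE_TERMINATOR])
    raise TypeError(f"unsupported type: {type(obj).__name__}")


# Three arrays nested directly inside each other.
nested = encode([[[]]])
```

Here nested consists of three array type bytes followed by three terminator bytes, one closing each level of nesting.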
Advantages
Data Conservation
Uncompressed KeyedBits data is relatively compact, with little overhead. Even compared with a widely used format such as JSON, KeyedBits generally produces less encoded data for the same content. In most cases, only one or two bytes of overhead are necessary to represent a large amount of data. In other words, KeyedBits makes itself as transparent as possible, using most of the encoded space to store the data itself.
Binary
KeyedBits is a binary serialization format, so representing raw data is relatively straightforward. Unlike in many human-readable formats, raw bytes can be embedded directly in KeyedBits data, provided that a type field and a length field precede them. Human-readable formats usually express binary data as hexadecimal or base64, and both of these alternatives take more space than the binary data itself.
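The size difference can be checked with a short Python comparison. The framing used for the binary case (a one-byte type identifier of 0x02 and a 4-byte little-endian length) is an assumption for illustration; the hexadecimal and base64 figures come from Python's standard library.

```python
import base64
import binascii


def encoded_sizes(raw: bytes) -> dict:
    """Return the size in bytes of several encodings of the same raw data."""
    # Assumed binary framing: 1 type byte + 4-byte length + the data as is.
    framed = bytes([0x02]) + len(raw).to_bytes(4, "little") + raw
    return {
        "raw": len(raw),
        "framed_binary": len(framed),
        "hex": len(binascii.hexlify(raw)),      # two characters per byte
        "base64": len(base64.b64encode(raw)),   # four characters per 3 bytes
    }


sizes = encoded_sizes(bytes(300))
# sizes == {"raw": 300, "framed_binary": 305, "hex": 600, "base64": 400}
```

For 300 bytes of input, the framed binary form adds 5 bytes, while base64 adds 100 and hexadecimal doubles the size.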
Disadvantages
Part of KeyedBits' overhead reduction comes from its strictly encoded strings. All string objects are encoded as null-terminated UTF-8 byte strings. This disallows the use of other character encodings, such as UTF-16.
Associative arrays take this a step further, allowing only ASCII data to be used as keys. This allows the highest bit of each ASCII character to be used to indicate the last character in the string. Although this saves a byte of overhead for every entry, it could be a significant problem for people who wish to encode user-defined keys. An obvious way to overcome this issue would be to represent an associative array by other means, possibly with a plain array.
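The high-bit marker can be sketched in Python as follows. The function names are hypothetical, and rejecting empty keys is an assumption made here because an empty key has no final character to mark; the format description does not say how that case is handled.

```python
def encode_key(key: str) -> bytes:
    """Encode an ASCII key, setting the high bit on its last character."""
    raw = key.encode("ascii")  # raises UnicodeEncodeError for non-ASCII keys
    if not raw:
        # Assumption: empty keys are rejected, since nothing can carry the bit.
        raise ValueError("key must contain at least one character")
    return raw[:-1] + bytes([raw[-1] | 0x80])


def decode_key(buf: bytes, pos: int = 0):
    """Read a key starting at pos; return (key, next position)."""
    chars = bytearray()
    while True:
        byte = buf[pos]
        pos += 1
        chars.append(byte & 0x7F)   # strip the marker bit
        if byte & 0x80:             # high bit set: this was the last character
            return chars.decode("ascii"), pos
```

With this scheme the key "id" occupies exactly two bytes, 69 E4 in hexadecimal, with no terminator or length field. A key containing a character outside ASCII cannot be encoded at all, which is the limitation described above.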
 