Friday, February 6, 2015

Something to replace BSON.

I have been working on my own implementation of the BSON spec for about three years now, and I have come to the conclusion that the design seems haphazard and organic. Parsing is harder and more intensive than it should be. The format is poorly designed for scanning. The specific types require too much knowledge to handle properly. The maintainers even keep making small changes to the specification without bumping the version number.

In order to solve my problems with it, I have decided to create my own specification and call it WatSON. Well, technically, I am calling it “왓슨”, but I don’t know how to type that on my English (US) keyboard.

Here are some of my design goals.

  • Keys are defined at the start of the document to eliminate repetitive keys.
  • Arrays don’t have keys at all.
  • A type and its value are stored together, rather than being separated by the key name.
  • Containers and strings have 64-bit size markers. Other types have 1-byte size markers. The format must have a simple rule for skipping elements that don’t match a known type.
  • Document format is little-endian.
  • String formats support Snappy compression.
  • Format supports header information to influence how the document is read.
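
To make the size-marker goal concrete, here is a minimal Python sketch of how a reader could skip elements it does not recognize. Nothing in it is the actual WatSON layout: the type codes, the `WIDE_FLAG` bit that tells the reader whether a 64-bit or 1-byte size follows, and the function names are all assumptions made up for illustration. The only parts taken from the goals above are the little-endian byte order and the two size-marker widths.

```python
import struct

# Hypothetical type codes; the post does not define any.
TYPE_INT64 = 0x01
TYPE_STRING = 0x81
# Assumption: a set high bit means a 64-bit size marker follows,
# otherwise a 1-byte size marker follows.
WIDE_FLAG = 0x80


def encode_element(type_code: int, payload: bytes) -> bytes:
    """Prefix a payload with its type byte and a little-endian size marker."""
    if type_code & WIDE_FLAG:
        header = struct.pack("<BQ", type_code, len(payload))
    else:
        header = struct.pack("<BB", type_code, len(payload))
    return header + payload


def iter_elements(buf: bytes):
    """Yield (type_code, payload) pairs without interpreting any payload."""
    offset = 0
    while offset < len(buf):
        type_code = buf[offset]
        offset += 1
        if type_code & WIDE_FLAG:
            (size,) = struct.unpack_from("<Q", buf, offset)
            offset += 8
        else:
            size = buf[offset]
            offset += 1
        yield type_code, buf[offset:offset + size]
        offset += size


def decode_known(buf: bytes) -> list:
    """Decode the types this reader knows and silently skip the rest."""
    values = []
    for type_code, payload in iter_elements(buf):
        if type_code == TYPE_INT64:
            values.append(struct.unpack("<q", payload)[0])
        elif type_code == TYPE_STRING:
            values.append(payload.decode("utf-8"))
        # Any other type code is skipped; its size marker already
        # told iter_elements how far to advance.
    return values
```

Because the size is readable from the type byte alone, a reader never needs a per-type table just to step over an element, which is the property the skipping rule is after.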

Much of the design is influenced by my experience dealing with atoms in the MP4 file format.

For more detail on how things are coming together: