Map-Based Codecs

Map-based codec derivation is enabled with this import:

sourceimport io.bullet.borer.derivation.MapBasedCodecs.*

With these codecs case classes are encoded as CBOR/JSON maps with the member name as key, i.e. the popular “standard” way for serializing to and from JSON.

Here is an example relying on map-based codecs:

sourceimport io.bullet.borer.{Json, Codec}
import io.bullet.borer.derivation.MapBasedCodecs.*

case class Foo(int: Int, string: String) derives Codec
case class Bar(foo: Foo, d: Double) derives Codec

val foo = Foo(int = 42, string = "yeah")
val bar = Bar(foo, d = 1.234)

// Json.encode(bar).toByteArray or
Json.encode(bar).toUtf8String ==>
"""{"foo":{"int":42,"string":"yeah"},"d":1.234}"""

Default ADT Encoding

The default encoding for ADT super-types is a single-entry map, with the key being the type id and the value becoming the encoding of the actual ADT subtype instance.

For example, a Dog instance of the following ADT:

sourcesealed trait Animal
case class Dog(age: Int, name: String)                      extends Animal
case class Cat(weight: Double, color: String, home: String) extends Animal
case class Fish(color: String)                              extends Animal
case object Yeti                                            extends Animal

would be encoded to JSON like this:

{ "Dog" :
  {
    "age":2,
    "name":"Rex"
  }
}

Here is the code (Note the explicit upcast to the ADT super-type Animal!):

sourceimport io.bullet.borer.Json

val animal: Animal = Dog(2, "Rex")

// Json.encode(animal).toByteArray or
Json.encode(animal).toUtf8String ==> """{"Dog":{"age":2,"name":"Rex"}}"""

Alternative “Flat” Encoding of ADTs

The default encoding for ADTs (single-entry maps) cleanly separates the encoding of the ADT super-type from the encoding of the ADT sub-types by introducing a dedicated “envelope” layer holding the type ID in an efficient form. This encoding can be written, stored and read in a very efficient manner and is therefore recommended for all application where you have full control over the both ends of the encoding “channel”.

However, especially when interoperating with other systems, it is sometimes required to support another encoding style, which carries the type ID in a special map entry, e.g. like this:

{
  "_type": "Dog",
  "age":2,
  "name":"Rex"
}

This alternative ADT encoding can be switched to by making the result of a call to AdtEncodingStrategy.flat() implicitly available:

sourceimport io.bullet.borer.{AdtEncodingStrategy, Json, Codec}
import io.bullet.borer.derivation.MapBasedCodecs.*

// this enables the flat ADT encoding
given AdtEncodingStrategy =
  AdtEncodingStrategy.flat(typeMemberName = "_type")

given Codec[Animal] = deriveAllCodecs[Animal]

val animal: Animal = Dog(2, "Rex")

// Json.encode(animal).toByteArray or
Json.encode(animal).toUtf8String ==> """{"_type":"Dog","age":2,"name":"Rex"}"""

While this “flat” ADT encoding has the benefit that the encoding can be decoded into the ADT super-type or the respective ADT sub-type (provided that the decoder simply ignores surplus members, as borer does) it also has a number of significant drawbacks:

  • It requires all ADT sub-types to be encoded to maps, i.e. it violates separation of concerns between the super- and sub-types. This can be annoying, e.g. when unary case classes could otherwise be stored more efficiently in an “unwrapped” form, e.g. with the CompactMapBasedCodecs.

  • It conflates two conceptually well separated and distinct name spaces: The one of the envelope (ADT super-type) and the one of the sub-type. This always leads to the risk of name collisions, which gives rise to type member names like $type or __type or the like, which make the hacky “smell” of the approach even more visible.

  • Due the unordered nature of JSON objects/maps it makes efficient single-pass parsing impossible in the general case, since the information, how to interpret an object’s members (the type of the object) might be delayed until the very end of the input file.
    This requires either potentially unbounded caching or a second pass over the input. borer’s approach is the former (caching) with a configurable bound on the cache size (triggering an exception, when exceeded).

  • It’s slightly less efficient storage-wise (depending on the length of the typeMemberName).

For these reasons borer’s default ADT encoding relying on single-element maps is generally the preferred choice.

Default Values

Map-based codecs support missing and extra members.

By default a member whose value matches the default value is not written to the output at all during encoding. This can be changed by bringing a customized io.bullet.borer.derivation.DerivationConfig instance into scope.

During decoding the type’s Decoder will use the potentially defined default value for all missing members, e.g.:

sourceimport io.bullet.borer.{Json, Codec}
import io.bullet.borer.derivation.MapBasedCodecs.*

case class Dog(age: Int, name: String = "<unknown>") derives Codec

Json
  .decode("""{ "age": 4 }""" getBytes "UTF8")
  .to[Dog]
  .value ==> Dog(age = 4)

Extra members (i.e. map keys present in the encoding but not defined as a case class member or @key) are simply ignored.

Also, Encoder/Decoder type classes can implement the Encoder.DefaultValueAware / Decoder.DefaultValueAware trait in order to alter their behavior in the presence of a default value.

This is used, for example, by the pre-defined encoder and decoder for Option[T], which change their encoding/decoding strategy, if a None default value is defined for a case class member. In this case the optional value will only be written if it’s defined (and then without any wrapping structure). If the option is undefined nothing is written at all.

Here is the actual implementation of borer’s Encoder for Option:

sourcegiven forOption[T: Encoder]: Encoder.DefaultValueAware[Option[T]] =
  new Encoder.DefaultValueAware[Option[T]] {

    def write(w: Writer, value: Option[T]) =
      value match
        case Some(x) => w.writeToArray(x)
        case None    => w.writeEmptyArray()

    def withDefaultValue(defaultValue: Option[T]): Encoder[Option[T]] =
      if (defaultValue eq None)
        new Encoder.PossiblyWithoutOutput[Option[T]] {
          def producesOutputFor(value: Option[T]) = value ne None
          def write(w: Writer, value: Option[T]) =
            value match
              case Some(x) => w.write(x)
              case None    => w
        }
      else this
  }

Correspondingly, during decoding the presence of the member yields a defined option instance holding the decoded value and None, if the member is missing.

This behavior should match the intution of what an Option[T] case class member would behave like when written to a JSON representation.

Customized Member Keys

borer supports customizing the name of case class members in the encoding with the same @key annotation, that is also used for custom ADT type-ids (see Array-Based Codec and the @key sources here for more info).

Simply annotate a case class member do provide a custom name:

sourceimport io.bullet.borer.{Json, Codec}
import io.bullet.borer.derivation.key
import io.bullet.borer.derivation.MapBasedCodecs.*

case class Dog(age: Int, @key("the-name") name: String) derives Codec

Json.encode(Dog(1, "Lolle")).toUtf8String ==>
"""{"age":1,"the-name":"Lolle"}"""

CompactMapBasedCodecs

As a slight variant of the MapBasedCodecs borer also provides CompactMapBasedCodecs which are identical except for the encoding of unary case classes, i.e. case classes with only a single member.

The CompactMapBasedCodecs encode them in an unwrapped form, i.e. directly as the value of the single member. The name of the sole case class therefore doesn’t appear in the encoding at all.