Map-Based Codecs
Map-based codec derivation is enabled with this import:
sourceimport io.bullet.borer.derivation.MapBasedCodecs.*
With these codecs case classes are encoded as CBOR/JSON maps with the member name as key, i.e. the popular “standard” way for serializing to and from JSON.
Here is an example relying on map-based codecs:
sourceimport io.bullet.borer.{Json, Codec}
import io.bullet.borer.derivation.MapBasedCodecs.*
case class Foo(int: Int, string: String) derives Codec
case class Bar(foo: Foo, d: Double) derives Codec
val foo = Foo(int = 42, string = "yeah")
val bar = Bar(foo, d = 1.234)
// Json.encode(bar).toByteArray or
Json.encode(bar).toUtf8String ==>
"""{"foo":{"int":42,"string":"yeah"},"d":1.234}"""
Default ADT Encoding
The default encoding for ADT super-types is a single-entry map, with the key being the type id and the value becoming the encoding of the actual ADT subtype instance.
For example, a Dog
instance of the following ADT:
sourcesealed trait Animal
case class Dog(age: Int, name: String) extends Animal
case class Cat(weight: Double, color: String, home: String) extends Animal
case class Fish(color: String) extends Animal
case object Yeti extends Animal
would be encoded to JSON like this:
{ "Dog" :
{
"age":2,
"name":"Rex"
}
}
Here is the code (Note the explicit upcast to the ADT super-type Animal
!):
sourceimport io.bullet.borer.Json
val animal: Animal = Dog(2, "Rex")
// Json.encode(animal).toByteArray or
Json.encode(animal).toUtf8String ==> """{"Dog":{"age":2,"name":"Rex"}}"""
Alternative “Flat” Encoding of ADTs
The default encoding for ADTs (single-entry maps) cleanly separates the encoding of the ADT super-type from the encoding of the ADT sub-types by introducing a dedicated “envelope” layer holding the type ID in an efficient form. This encoding can be written, stored and read in a very efficient manner and is therefore recommended for all application where you have full control over the both ends of the encoding “channel”.
However, especially when interoperating with other systems, it is sometimes required to support another encoding style, which carries the type ID in a special map entry, e.g. like this:
{
"_type": "Dog",
"age":2,
"name":"Rex"
}
This alternative ADT encoding can be switched to by making the result of a call to AdtEncodingStrategy.flat()
implicitly available:
sourceimport io.bullet.borer.{AdtEncodingStrategy, Json, Codec}
import io.bullet.borer.derivation.MapBasedCodecs.*
// this enables the flat ADT encoding
given AdtEncodingStrategy =
AdtEncodingStrategy.flat(typeMemberName = "_type")
given Codec[Animal] = deriveAllCodecs[Animal]
val animal: Animal = Dog(2, "Rex")
// Json.encode(animal).toByteArray or
Json.encode(animal).toUtf8String ==> """{"_type":"Dog","age":2,"name":"Rex"}"""
While this “flat” ADT encoding has the benefit that the encoding can be decoded into the ADT super-type or the respective ADT sub-type (provided that the decoder simply ignores surplus members, as borer does) it also has a number of significant drawbacks:
-
It requires all ADT sub-types to be encoded to maps, i.e. it violates separation of concerns between the super- and sub-types. This can be annoying, e.g. when unary case classes could otherwise be stored more efficiently in an “unwrapped” form, e.g. with the CompactMapBasedCodecs.
-
It conflates two conceptually well separated and distinct name spaces: The one of the envelope (ADT super-type) and the one of the sub-type. This always leads to the risk of name collisions, which gives rise to type member names like
$type
or__type
or the like, which make the hacky “smell” of the approach even more visible. -
Due the unordered nature of JSON objects/maps it makes efficient single-pass parsing impossible in the general case, since the information, how to interpret an object’s members (the type of the object) might be delayed until the very end of the input file.
This requires either potentially unbounded caching or a second pass over the input. borer’s approach is the former (caching) with a configurable bound on the cache size (triggering an exception, when exceeded). -
It’s slightly less efficient storage-wise (depending on the length of the
typeMemberName
).
For these reasons borer’s default ADT encoding relying on single-element maps is generally the preferred choice.
Default Values
Map-based codecs support missing and extra members.
By default a member whose value matches the default value is not written to the output at all during encoding. This can be changed by bringing a customized io.bullet.borer.derivation.DerivationConfig
instance into scope.
During decoding the type’s Decoder
will use the potentially defined default value for all missing members, e.g.:
sourceimport io.bullet.borer.{Json, Codec}
import io.bullet.borer.derivation.MapBasedCodecs.*
case class Dog(age: Int, name: String = "<unknown>") derives Codec
Json
.decode("""{ "age": 4 }""" getBytes "UTF8")
.to[Dog]
.value ==> Dog(age = 4)
Extra members (i.e. map keys present in the encoding but not defined as a case class member or @key
) are simply ignored.
Also, Encoder
/Decoder
type classes can implement the Encoder.DefaultValueAware
/ Decoder.DefaultValueAware
trait in order to alter their behavior in the presence of a default value.
This is used, for example, by the pre-defined encoder and decoder for Option[T]
, which change their encoding/decoding strategy, if a None
default value is defined for a case class member. In this case the optional value will only be written if it’s defined (and then without any wrapping structure). If the option is undefined nothing is written at all.
Here is the actual implementation of borer’s Encoder
for Option
:
sourcegiven forOption[T: Encoder]: Encoder.DefaultValueAware[Option[T]] =
new Encoder.DefaultValueAware[Option[T]] {
def write(w: Writer, value: Option[T]) =
value match
case Some(x) => w.writeToArray(x)
case None => w.writeEmptyArray()
def withDefaultValue(defaultValue: Option[T]): Encoder[Option[T]] =
if (defaultValue eq None)
new Encoder.PossiblyWithoutOutput[Option[T]] {
def producesOutputFor(value: Option[T]) = value ne None
def write(w: Writer, value: Option[T]) =
value match
case Some(x) => w.write(x)
case None => w
}
else this
}
Correspondingly, during decoding the presence of the member yields a defined option instance holding the decoded value and None
, if the member is missing.
This behavior should match the intution of what an Option[T]
case class member would behave like when written to a JSON representation.
Customized Member Keys
borer supports customizing the name of case class members in the encoding with the same @key
annotation, that is also used for custom ADT type-ids (see Array-Based Codec and the @key
sources here for more info).
Simply annotate a case class member do provide a custom name:
sourceimport io.bullet.borer.{Json, Codec}
import io.bullet.borer.derivation.key
import io.bullet.borer.derivation.MapBasedCodecs.*
case class Dog(age: Int, @key("the-name") name: String) derives Codec
Json.encode(Dog(1, "Lolle")).toUtf8String ==>
"""{"age":1,"the-name":"Lolle"}"""
CompactMapBasedCodecs
As a slight variant of the MapBasedCodecs
borer also provides CompactMapBasedCodecs
which are identical except for the encoding of unary case class
es, i.e. case class
es with only a single member.
The CompactMapBasedCodecs
encode them in an unwrapped form, i.e. directly as the value of the single member. The name of the sole case class
therefore doesn’t appear in the encoding at all.