BSON Serialization with MongoDB C# Driver

The MongoDB C# Driver consists of two parts:

  • BSON Serialization support
  • The Driver itself

In this post we will look at the most important components of BSON Serialization and how it works under the covers. So let’s pull the Git repository and drill into the code.

High-Level API: ToJson and ToBson

Going top-down, the high-level serialization API consists of two handy extension methods: ToJson and ToBson. They can be used on an arbitrary object and hide the complexity of the underlying machinery.

There is an extensive set of unit tests for the C# Driver. Most of my code snippets are based on those tests:

// C is a test class with an int member N and an ObjectId member Id.
[Test]
public void TestToJson()
{
    var c = new C { N = 1, Id = ObjectId.Empty };
    var json = c.ToJson();
    Assert.That(json, Is.EqualTo(
        "{ \"N\" : 1, \"_id\" : ObjectId(\"000000000000000000000000\") }"));
}

[Test]
public void TestToBson()
{
    var c = new C { N = 1, Id = ObjectId.Empty };
    var bson = c.ToBson();
    // 29 bytes: int32 length, int32 element "N", ObjectId element "_id", terminating zero
    var expected = new byte[] { 29, 0, 0, 0, 16, 78, 0, 1, 0, 0, 0, 7, 95, 105, 100, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
    Assert.IsTrue(expected.SequenceEqual(bson));
}
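To see where the 29 bytes expected by TestToBson come from, the same document can be assembled by hand following the BSON specification. This is only a sketch that uses the .NET base class library and not the driver; the names BsonLayoutSketch and BuildExpectedBytes are made up for illustration.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not part of the driver: builds the bytes of
// { "N" : 1, "_id" : ObjectId("000000000000000000000000") } by hand.
public static class BsonLayoutSketch
{
    public static byte[] BuildExpectedBytes()
    {
        byte[] elements;
        using (var body = new MemoryStream())
        using (var bodyWriter = new BinaryWriter(body))
        {
            // Element 1: type 0x10 (int32), name "N" as a C string, 4-byte little-endian value
            bodyWriter.Write((byte)0x10);
            WriteCString(bodyWriter, "N");
            bodyWriter.Write(1);

            // Element 2: type 0x07 (ObjectId), name "_id", 12 raw bytes (all zero here)
            bodyWriter.Write((byte)0x07);
            WriteCString(bodyWriter, "_id");
            bodyWriter.Write(new byte[12]);

            // A single zero byte terminates the document
            bodyWriter.Write((byte)0x00);
            bodyWriter.Flush();
            elements = body.ToArray();
        }

        using (var document = new MemoryStream())
        using (var documentWriter = new BinaryWriter(document))
        {
            // The int32 length prefix counts itself (4 bytes) plus everything after it
            documentWriter.Write(4 + elements.Length);
            documentWriter.Write(elements);
            documentWriter.Flush();
            return document.ToArray();
        }
    }

    // BSON element names are UTF-8 strings terminated by a zero byte
    private static void WriteCString(BinaryWriter writer, string value)
    {
        writer.Write(Encoding.UTF8.GetBytes(value));
        writer.Write((byte)0x00);
    }
}
```

Calling BsonLayoutSketch.BuildExpectedBytes() yields 4 + 7 + 17 + 1 = 29 bytes, which is why the array in the test starts with 29, 0, 0, 0.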

Looking inside the implementation, we immediately see that these are just thin wrappers around a couple of other classes, with the call to BsonSerializer.Serialize as the central point:

// ToBson
using (var buffer = new BsonBuffer())
{
    using (var bsonWriter = BsonWriter.Create(buffer, settings))
    {
        BsonSerializer.Serialize(bsonWriter, nominalType, obj, options);
    }
    return buffer.ToByteArray();
}

// ToJson
using (var stringWriter = new StringWriter())
{
    using (var bsonWriter = BsonWriter.Create(stringWriter, settings))
    {
        BsonSerializer.Serialize(bsonWriter, nominalType, obj, options);
    }
    return stringWriter.ToString();
}

We’ll see the purpose of BsonWriter later.
For deserialization there are no extension methods, so you need to call BsonSerializer directly:

// BSON
var fromBson = BsonSerializer.Deserialize<C>(bsonBytes);
Assert.AreEqual(1, fromBson.N);

// JSON
var fromJson = BsonSerializer.Deserialize<C>(jsonString);
Assert.AreEqual(1, fromJson.N);

This is pretty much it: a couple of straightforward functions that can cover 80% of use cases. For the other 20% we need to understand how it works underneath.

There is also an API for (de)serialization in the form of a DOM, aka BsonDocument. Although a BsonDocument is something completely different from a raw BSON byte stream, its serialization is implemented using the same design concepts: dependency injection in action.

Middle-Level: BsonReader, BsonWriter and BsonSerializer

Stepping one level down, we get to BsonReader and BsonWriter. These are actually class families with specific implementations for three related formats: BSON, JSON and BsonDocument. Browsing through the code, it is not difficult to identify their responsibility: to (de)serialize individual elements while moving through the incoming or outgoing buffer, much like System.IO.BinaryReader or System.Xml.XmlReader. For example, the BsonBinaryReader/Writer pair implements the BSON specification for individual elements, and JsonReader/Writer do the same for JSON, including Mongo-specific extensions like ObjectIds and the $-notation for field names.

[Test]
public void TestRegularExpressionStrict()
{
    var json = "{ \"$regex\" : \"pattern\", \"$options\" : \"imxs\" }";
    using (var bsonReader = BsonReader.Create(json))
    {
        Assert.AreEqual(BsonType.RegularExpression, bsonReader.ReadBsonType());
        var regex = bsonReader.ReadRegularExpression();
        Assert.AreEqual("pattern", regex.Pattern);
        Assert.AreEqual("imxs", regex.Options);
        Assert.AreEqual(BsonReaderState.Done, bsonReader.State);
    }
    var settings = new JsonWriterSettings { OutputMode = JsonOutputMode.Strict };
    Assert.AreEqual(json, BsonSerializer.Deserialize<BsonRegularExpression>(new StringReader(json)).ToJson(settings));
}

In this context, the responsibility of BsonSerializer is to orchestrate the individual calls to the readers and writers during serialization and to compose the result.
All in all, the whole high-level process can be drawn this way:

[Figure: BSON Serialization overview]
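The split of responsibilities described above can also be sketched in code: one routine decides what gets written and in which order, while interchangeable writers decide how each element looks in the target format. Everything below is a simplified, hypothetical stand-in (SketchWriter, JsonSketchWriter, BinarySketchWriter, Point and PointSketchSerializer are invented names); the driver's real BsonWriter family has a much richer API.

```csharp
using System.IO;
using System.Text;

// Hypothetical, heavily simplified stand-in for the BsonWriter family.
public abstract class SketchWriter
{
    public abstract void WriteStartDocument();
    public abstract void WriteInt32(string name, int value);
    public abstract void WriteEndDocument();
}

// Knows only how to format elements as JSON text.
public sealed class JsonSketchWriter : SketchWriter
{
    private readonly StringBuilder _text = new StringBuilder();
    private bool _first;

    public override void WriteStartDocument() { _text.Append("{ "); _first = true; }

    public override void WriteInt32(string name, int value)
    {
        if (!_first) _text.Append(", ");
        _text.Append('"').Append(name).Append("\" : ").Append(value);
        _first = false;
    }

    public override void WriteEndDocument() { _text.Append(" }"); }

    public override string ToString() { return _text.ToString(); }
}

// Knows only how to lay out elements as BSON bytes.
public sealed class BinarySketchWriter : SketchWriter
{
    private readonly MemoryStream _elements = new MemoryStream();
    private readonly BinaryWriter _writer;

    public BinarySketchWriter() { _writer = new BinaryWriter(_elements); }

    // Nothing to emit yet: the length prefix is only known at the end
    public override void WriteStartDocument() { }

    public override void WriteInt32(string name, int value)
    {
        _writer.Write((byte)0x10);                  // BSON type tag for int32
        _writer.Write(Encoding.UTF8.GetBytes(name));
        _writer.Write((byte)0x00);                  // element names are C strings
        _writer.Write(value);                       // 4 bytes, little-endian
    }

    public override void WriteEndDocument() { _writer.Write((byte)0x00); }

    public byte[] ToByteArray()
    {
        _writer.Flush();
        var body = _elements.ToArray();
        using (var document = new MemoryStream())
        using (var documentWriter = new BinaryWriter(document))
        {
            documentWriter.Write(4 + body.Length);  // length prefix includes itself
            documentWriter.Write(body);
            documentWriter.Flush();
            return document.ToArray();
        }
    }
}

public sealed class Point { public int X; public int Y; }

// Plays the role of the serializer: it orchestrates the writer calls
// without knowing which output format is being produced.
public static class PointSketchSerializer
{
    public static void Serialize(SketchWriter writer, Point point)
    {
        writer.WriteStartDocument();
        writer.WriteInt32("X", point.X);
        writer.WriteInt32("Y", point.Y);
        writer.WriteEndDocument();
    }
}
```

Passing a JsonSketchWriter to PointSketchSerializer.Serialize for a Point with X = 1 and Y = 2 produces the text { "X" : 1, "Y" : 2 }, while passing a BinarySketchWriter produces a 19-byte BSON document. The serializing routine is identical in both cases.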

Low-Level: Serializers and Conventions

Stepping down once again, let’s look at the individual components under BsonSerializer:

[Figure: BSON Serialization classes]

BsonSerializer contains a collection of serialization providers that can be used to look up a particular serializer. A serializer looks something like this:

[Figure: IBsonSerializer]

It is hard to overlook the naming ambiguity between BsonSerializer and IBsonSerializer. Nevertheless, the two serve very different purposes. The first is a static class and the central point of the whole serialization logic, while the second is an interface with numerous implementations for a whole bunch of types and normally should not be used directly.

From this definition of IBsonSerializer we can identify its purpose: to create an object of the specified type using a particular BsonReader and, vice versa, to write an object out using a particular BsonWriter. So the control flow is as follows:

  • BsonSerializer is called to (de)serialize a specific type using a particular reader or writer
  • It then asks a serialization provider (a registry of particular serializers) whether there is a serializer registered for the requested type
  • If there is one, the serializer triggers the actual low-level process and orchestrates the calls to readers and writers

There are two predefined serialization providers: BsonDefaultSerializationProvider and BsonClassMapSerializationProvider. The default provider is always used first and delivers serializers for most native .NET types and specialized BSON types (like ObjectId or JavaScript). If there is no predefined serializer for the requested type, the ClassMap provider is used to engage the BsonClassMapSerializer. This is a very powerful facility for handling user-defined types. The most important aspect here is the configuration of object-to-BSON mappings.
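The lookup order described above, default provider first and class map provider as the fallback, can be sketched as a simple provider chain. All types below are hypothetical stand-ins invented for this illustration; the driver's real provider and serializer interfaces have different, richer signatures.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a serializer; Describe only reports who handled the value.
public interface ISketchSerializer
{
    string Describe(object value);
}

// Hypothetical stand-in for a serialization provider.
public interface ISketchSerializationProvider
{
    // Returns null when this provider has no serializer for the type.
    ISketchSerializer GetSerializer(Type type);
}

public sealed class Int32SketchSerializer : ISketchSerializer
{
    public string Describe(object value) { return "int32:" + value; }
}

public sealed class ClassMapSketchSerializer : ISketchSerializer
{
    public string Describe(object value) { return "classmap:" + value.GetType().Name; }
}

// Knows a fixed set of built-in types (only int in this sketch).
public sealed class DefaultSketchProvider : ISketchSerializationProvider
{
    public ISketchSerializer GetSerializer(Type type)
    {
        return type == typeof(int) ? new Int32SketchSerializer() : null;
    }
}

// Fallback that accepts any class, standing in for class-map based serialization.
public sealed class ClassMapSketchProvider : ISketchSerializationProvider
{
    public ISketchSerializer GetSerializer(Type type)
    {
        return type.IsClass ? new ClassMapSketchSerializer() : null;
    }
}

// A sample user-defined type for the fallback path.
public sealed class SketchCustomer { }

public static class SketchSerializerRegistry
{
    // Order matters: the default provider is asked first.
    private static readonly List<ISketchSerializationProvider> Providers =
        new List<ISketchSerializationProvider>
        {
            new DefaultSketchProvider(),
            new ClassMapSketchProvider()
        };

    public static ISketchSerializer LookupSerializer(Type type)
    {
        foreach (var provider in Providers)
        {
            var serializer = provider.GetSerializer(type);
            if (serializer != null) return serializer;
        }
        throw new InvalidOperationException("No serializer found for type " + type.FullName);
    }
}
```

In this sketch SketchSerializerRegistry.LookupSerializer(typeof(int)) is answered by the default provider, whereas typeof(SketchCustomer) falls through to the class map provider. A type that neither provider knows results in an exception.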

The mapping is handled by BsonClassMap, which contains all metadata for the requested type, such as serializable member names and their order, the id field and id generation strategy, discriminator fields for polymorphic types and a lot more. It works out of the box with reasonable behavior, but is also highly customizable:

BsonClassMap.RegisterClassMap<Employee>(cm =>
{
    cm.MapIdProperty(e => e.EmployeeId);
    cm.MapProperty(e => e.FirstName).SetElementName("fn");
    cm.MapProperty(e => e.LastName).SetElementName("ln");
    cm.MapProperty(e => e.DateOfBirth).SetElementName("dob").SetSerializer(new DateOfBirthSerializer());
    cm.MapProperty(e => e.Age).SetElementName("age");
});

It is nice to see that the implementation of all customization concepts is not gathered in a single place, but distributed over individual components, aka conventions. Every convention is responsible for one mapping aspect and can be applied to the BsonClassMap to update its current configuration. For example, the MemberNameElementNameConvention can be applied to a MemberMap of a BsonClassMap to set the corresponding BSON element name, which by default is the same as the class member name unless overridden by the AttributeConvention using BsonElementAttribute.

class TestClass
{
    [BsonElement("fn")]
    public string FirstName;
}

[Test]
public void TestOptsInMembers()
{
    var convention = AttributeConventionPack.Instance;
    var classMap = new BsonClassMap<TestClass>();
    new ConventionRunner(convention).Apply(classMap);

    Assert.AreEqual(1, classMap.DeclaredMemberMaps.Count());
    Assert.AreEqual("fn", classMap.GetMemberMap("FirstName").ElementName);
}
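To illustrate how conventions layer on top of each other, here is a minimal sketch in which a name convention sets the default element name and an attribute convention overrides it afterwards. It uses only reflection from the base class library; SketchElementAttribute, SketchMemberMap, the two convention classes and SketchConventionRunner are hypothetical names and do not mirror the driver's actual convention interfaces.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical counterpart of BsonElementAttribute.
[AttributeUsage(AttributeTargets.Field)]
public sealed class SketchElementAttribute : Attribute
{
    public string Name { get; private set; }
    public SketchElementAttribute(string name) { Name = name; }
}

// Hypothetical member map: holds the mapping metadata for one field.
public sealed class SketchMemberMap
{
    public FieldInfo Member { get; private set; }
    public string ElementName { get; set; }
    public SketchMemberMap(FieldInfo member) { Member = member; }
}

public interface ISketchMemberConvention
{
    void Apply(SketchMemberMap map);
}

// Default rule: the element name equals the member name.
public sealed class MemberNameSketchConvention : ISketchMemberConvention
{
    public void Apply(SketchMemberMap map) { map.ElementName = map.Member.Name; }
}

// Override rule: an attribute on the member wins over the default.
public sealed class AttributeSketchConvention : ISketchMemberConvention
{
    public void Apply(SketchMemberMap map)
    {
        var attribute = (SketchElementAttribute)Attribute.GetCustomAttribute(
            map.Member, typeof(SketchElementAttribute));
        if (attribute != null) map.ElementName = attribute.Name;
    }
}

public sealed class SketchPerson
{
    [SketchElement("fn")]
    public string FirstName;
    public string LastName;
}

public static class SketchConventionRunner
{
    // Applies the conventions in order to every public instance field
    // and returns a member name -> element name lookup.
    public static Dictionary<string, string> MapElementNames(
        Type type, params ISketchMemberConvention[] conventions)
    {
        var result = new Dictionary<string, string>();
        foreach (var field in type.GetFields(BindingFlags.Public | BindingFlags.Instance))
        {
            var map = new SketchMemberMap(field);
            foreach (var convention in conventions) convention.Apply(map);
            result[field.Name] = map.ElementName;
        }
        return result;
    }
}
```

Running SketchConventionRunner.MapElementNames(typeof(SketchPerson), new MemberNameSketchConvention(), new AttributeSketchConvention()) maps FirstName to "fn" and leaves LastName as "LastName". Because each rule lives in its own class, reordering or replacing a convention changes the outcome without touching the others.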

Conclusion

The whole class structure is very powerful. You can do pretty much everything you want by plugging into it at the appropriate place:

  • Manipulate the serialization process using IBsonSerializationOptions
  • Fine-tune the BSON structure through manual configuration of BsonClassMap
  • Implement and register your own conventions for user-defined types
  • Implement a serializer for your very special types and register it with your own serialization provider

All in all, it is very nice to see good separation of concerns in action, with a few exceptions.
