So, most cloud services charge by some unit of size (Amazon charges by the gigabyte). If you’re working for a company that uses these services, then, well, duh... send around the smallest bits of data you can, right?
Right now, all the serialization in Ellemy goes through one little interface.
    public interface ISerializer
    {
        object Deserialize(string input, Type desiredType);
        object DeserializeObject(string input);
        string Serialize(object input);
    }
I hate the DeserializeObject thing, but, well, that’s for another time.
I currently have one concrete implementation of it, which uses JSON. I won’t show it here, because that’s not the point of the post.
So, I’ve been looking at Google Protocol Buffers and the .NET project for it (protobuf-net), and I figured I’d write an ISerializer that uses protocol buffers. It looks pretty nice: apparently about half the size of JSON serialization and a bit quicker too. The not-so-nice part is that the data is not self-describing, so you MUST know the type of the object being requested. Honestly though, that’s not a problem for most apps I write. Just send the type in the message, and we’re good, right?
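To get a feel for where the size savings come from, it helps to hand-encode a single field. This is only a sketch using the BCL, not protobuf-net itself, and the WireFormatSketch class and its method names are made up for illustration. It writes an unsigned integer field the way the protobuf wire format does: one key (field number plus wire type) followed by the value as a base-128 varint.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of protobuf-net or Ellemy.
public static class WireFormatSketch
{
    // Base-128 varint: 7 payload bits per byte, with the high bit set
    // on every byte except the last one.
    public static byte[] EncodeVarint(ulong value)
    {
        var bytes = new List<byte>();
        while (value >= 0x80)
        {
            bytes.Add((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }
        bytes.Add((byte)value);
        return bytes.ToArray();
    }

    // A varint field is its key followed by its value.
    // The key is (fieldNumber << 3) | wireType, and wire type 0 means "varint".
    public static byte[] EncodeUInt32Field(int fieldNumber, uint value)
    {
        const int varintWireType = 0;
        var result = new List<byte>();
        result.AddRange(EncodeVarint((ulong)((fieldNumber << 3) | varintWireType)));
        result.AddRange(EncodeVarint(value));
        return result.ToArray();
    }
}
```

Calling WireFormatSketch.EncodeUInt32Field(3, 150) gives three bytes (18-96-01 in hex), where the JSON fragment "SomeIntProperty":150 takes 21 characters. The property name never goes over the wire, which is also exactly why the receiver has to know the type up front.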
So, taking a look at the protobuf-net wiki, I see that we’ll need data contracts for our messages.
Given a class like so.
    public class SomeThingToSerialize
    {
        public string SomeStringProperty { get; set; }
        public Guid SomeGuidProperty { get; set; }
        public int SomeIntProperty { get; set; }
    }
We’d need a class like this so that protobuf-net can do its thing.
    [ProtoContract]
    public class SomeThingToSerialize
    {
        [ProtoMember(1)]
        public string SomeStringProperty { get; set; }
        [ProtoMember(2)]
        public Guid SomeGuidProperty { get; set; }
        [ProtoMember(3)]
        public int SomeIntProperty { get; set; }
    }
I don’t want to muddy my problem domains with serialization-specific attributes, so I think what I’ll do is generate the classes for the messages on the fly. Let’s play in CodeDom a bit!
Test
    [TestFixture]
    public class ProtocolBufferGenerator_tests
    {
        private ProtocolBufferDataContractGenerator _generator;

        [SetUp]
        public void arrange()
        {
            _generator = new ProtocolBufferDataContractGenerator();
        }

        [Test]
        public void message_is_decorated_with_ProtoContract_attribute()
        {
            var random = new Random();
            var testThing = new SomeThingToSerialize
            {
                SomeGuidProperty = Guid.NewGuid(),
                SomeIntProperty = random.Next(),
                SomeStringProperty = "Blah"
            };
            var protoClass = _generator.GenerateProtoFor(testThing);
            // Look for the attribute by type name on the generated class.
            var foundProtoContractAttribute = protoClass.GetType()
                .GetCustomAttributes(false)
                .Any(attribute => attribute.GetType().Name == "ProtoContractAttribute");

            Assert.IsTrue(foundProtoContractAttribute);
        }
    }
Yeah, just what the test says: it checks that the object this ProtocolBufferDataContractGenerator generates is decorated with the correct attribute. Let’s make it so.
1: public class ProtocolBufferDataContractGenerator
2: {
3: private const string _codeContractsNamespace = "Ellemy.CQRS.Serializers.GoogleProtocolBuffers.Contracts";
4:
5:
6: public object GenerateProtoFor<T>(T thing)
7: {
8: var nameSpace = new CodeNamespace(_codeContractsNamespace);
9: nameSpace.Imports.Add(new CodeNamespaceImport("ProtoBuf"));
10: nameSpace.Imports.Add(new CodeNamespaceImport(thing.GetType().Namespace));
11: var @class = new CodeTypeDeclaration(thing.GetType().Name)
12: {
13: IsClass = true,
14: Attributes = MemberAttributes.Public
15: };
16: var protoContractAttribute = new CodeAttributeDeclaration("ProtoContract");
17: @class.CustomAttributes.Add(protoContractAttribute);
18: nameSpace.Types.Add(@class);
19: var compileUnit = new CodeCompileUnit();
20: compileUnit.Namespaces.Add(nameSpace);
21: compileUnit.ReferencedAssemblies.Add("protobuf-net.dll");
22: var thingAssembly = thing.GetType().Assembly;
23: var assemblyToAdd = thingAssembly.GetName().Name + ".dll";
24: compileUnit.ReferencedAssemblies.Add(assemblyToAdd);
25: var parameters = new CompilerParameters {GenerateInMemory = true};
26:
27: var provider = new CSharpCodeProvider();
28: var results = provider.CompileAssemblyFromDom(parameters,compileUnit);
29: if(results.Errors.Count != 0)
30: {
31: throw new InvalidOperationException(results.Errors[0].ErrorText);
32: }
33:
34: return
35: results.CompiledAssembly.CreateInstance(string.Format("{0}.{1}", _codeContractsNamespace,thing.GetType().Name));
36:
37:
38: }
39:
40:
41: }
Lots of code there, but it does work! If you’re not familiar with CodeDom, it’s actually not very complicated. We’re just using C# to generate C#, and the code is fairly self-documenting.
Pretty neat that we’re leaving the assembly in memory, but I’m thinking that in the future we might wanna save that assembly and not pay the cost of generating that class multiple times. We’ll get to that soon enough.
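One cheap way to avoid regenerating the same class over and over, short of saving the assembly to disk, would be to cache the generated type per message type. The sketch below is an assumption about how that could look, not something that exists in the code yet: ContractTypeCache is a made-up name, and the delegate passed in stands in for whatever actually builds the contract type (the CodeDom generator in this post).

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical cache for illustration; not part of Ellemy or protobuf-net.
public sealed class ContractTypeCache
{
    private readonly ConcurrentDictionary<Type, Type> _contracts =
        new ConcurrentDictionary<Type, Type>();
    private readonly Func<Type, Type> _generate;

    // generate: builds the contract type for a message type (the expensive step).
    public ContractTypeCache(Func<Type, Type> generate)
    {
        if (generate == null) throw new ArgumentNullException("generate");
        _generate = generate;
    }

    // Returns the cached contract type, generating it on the first request.
    public Type GetContractTypeFor(Type messageType)
    {
        return _contracts.GetOrAdd(messageType, _generate);
    }

    // Creates a fresh contract instance.
    // Assumes the contract type has a public parameterless constructor.
    public object CreateContractFor(Type messageType)
    {
        return Activator.CreateInstance(GetContractTypeFor(messageType));
    }
}
```

One thing to keep in mind: under contention, GetOrAdd can run the factory more than once for the same key and throw away the extra result, so the generator has to tolerate an occasional duplicate compile.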
So far, we’re creating an empty class decorated with the ProtoContract attribute, which is perfectly worthless since we don’t have the properties. Let’s fix that.
First, make sure we’re adding properties on the message.
    [Test]
    public void properties_are_added()
    {
        // _protoClass holds the object returned by GenerateProtoFor;
        // presumably it is now created in the fixture's SetUp (not shown).
        var expectedNumberOfProperties = typeof(SomeThingToSerialize).GetProperties().Length;
        var actualNumberOfProperties = _protoClass.GetType().GetProperties().Length;
        Assert.AreEqual(expectedNumberOfProperties, actualNumberOfProperties);
    }
To make it pass I wrote this little method and called it on line 19 of the GenerateProtoFor method (the exact spot doesn’t matter, as long as it’s before the compile on line 28).
    private void AddProperties<T>(T thing, CodeTypeDeclaration @class)
    {
        // Order by name so the generated members come out in a predictable order.
        foreach (var propertyInfo in thing.GetType().GetProperties().OrderBy(p => p.Name))
        {
            // Private backing field, e.g. _SomeIntProperty.
            var field = new CodeMemberField
            {
                Type = new CodeTypeReference(propertyInfo.PropertyType),
                Attributes = MemberAttributes.Private,
                Name = "_" + propertyInfo.Name
            };
            @class.Members.Add(field);

            // Public property with the same name and type as the original.
            var @property = new CodeMemberProperty
            {
                Name = propertyInfo.Name,
                HasGet = true,
                HasSet = true,
                Type = new CodeTypeReference(propertyInfo.PropertyType),
                Attributes = MemberAttributes.Public,
            };
            var getter = new CodeSnippetStatement(String.Format("return _{0};", propertyInfo.Name));
            @property.GetStatements.Add(getter);
            var setter = new CodeSnippetStatement(String.Format("_{0} = value;", propertyInfo.Name));
            @property.SetStatements.Add(setter);
            @class.Members.Add(@property);
        }
    }
Ok, so now we need to decorate each property with a ProtoMember attribute. Here’s the test.
    [Test]
    public void every_property_on_the_message_is_decorated_with_a_ProtoMember_attribute()
    {
        foreach (var propertyInfo in _protoClass.GetType().GetProperties())
        {
            var foundProtoMemberAttribute = propertyInfo
                .GetCustomAttributes(false)
                .Any(attribute => attribute.GetType().Name == "ProtoMemberAttribute");
            Assert.IsTrue(foundProtoMemberAttribute);
        }
    }
And now let’s make it pass. I wrote this little method and call it from the loop in AddProperties.
    private void AddProtoMemberAttribute(CodeMemberProperty property, int memberNumber)
    {
        // Emits [ProtoMember(memberNumber)] on the generated property.
        var protoBuffAttribute = new CodeAttributeDeclaration("ProtoMember");
        var attributeArgument = new CodeAttributeArgument(new CodePrimitiveExpression(memberNumber));
        protoBuffAttribute.Arguments.Add(attributeArgument);
        property.CustomAttributes.Add(protoBuffAttribute);
    }
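The post doesn’t show how the member number gets fed in from the loop, so here’s a rough sketch of that wiring. It assumes a simple counter that starts at 1 (protobuf field numbers have to be positive) and goes up by one per property; ProtoMemberNumbering is a made-up name, and the real code may well do this differently.

```csharp
using System.CodeDom;
using System.Collections.Generic;

// Hypothetical helper for illustration; the actual loop lives in AddProperties.
public static class ProtoMemberNumbering
{
    // Tags each generated property with [ProtoMember(n)],
    // numbering from 1 in the order the properties are passed in.
    public static void Apply(IEnumerable<CodeMemberProperty> properties)
    {
        var memberNumber = 1; // protobuf field numbers start at 1
        foreach (var property in properties)
        {
            var attribute = new CodeAttributeDeclaration("ProtoMember");
            attribute.Arguments.Add(
                new CodeAttributeArgument(new CodePrimitiveExpression(memberNumber)));
            property.CustomAttributes.Add(attribute);
            memberNumber++;
        }
    }
}
```

Since the numbers follow the iteration order, whatever order the properties arrive in effectively becomes part of the wire format.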
Sweet, now we’re generating the DataContracts! We’re still not actually serializing objects though. We’re just making stuff that Google Protocol Buffers can work with.
Let’s add a new test.
    [TestFixture]
    public class using_the_GoogleProtocolBuffer_serializer
    {
        private Serializer _serializer;

        [SetUp]
        public void Arrange()
        {
            _serializer = new Serializer();
        }

        [Test]
        public void serialize_an_non_DataContract_class()
        {
            var testThing = new TestThing { Guid = Guid.NewGuid(), Int = 1, String = "Some String" };
            var output = _serializer.Serialize(testThing);
            Assert.IsNotNullOrEmpty(output);
            Console.WriteLine(output);
        }
    }
Ok, so not much going on here, we’re just testing that the output is actually not null, and (by extension), that the Serializer class doesn’t throw an exception.
Here’s what I did to make it work.
1: public class Serializer : ISerializer
2: {
3: private readonly ProtocolBufferDataContractGenerator _protocolBufferDataContractGenerator;
4:
5: public Serializer()
6: {
7: _protocolBufferDataContractGenerator = new ProtocolBufferDataContractGenerator();
8: }
9: public object Deserialize(string input, Type desiredType)
10: {throw new NotImplementedException("patience is a virtue"); }
11:
12: public object DeserializeObject(string input)
13: {
14: throw new NotSupportedException("nope, dis dont werk ");
15: }
16: public string Serialize(object input)
17: {
18: var t = _protocolBufferDataContractGenerator.GenerateProtoFor(input);
19: foreach (var property in input.GetType().GetProperties())
20: {
21: var setterForT = t.GetType().GetProperty(property.Name);
22: var value = property.GetValue(input, null);
23: setterForT.SetValue(t, value,null);
24: }
25: string data;
26: using (var writer = new MemoryStream())
27: {
28: ProtoBuf.Serializer.NonGeneric.Serialize(writer, t);
29: writer.Position = 0;
30: using (var reader = new StreamReader(writer,Encoding.ASCII))
31: {
32: data = reader.ReadToEnd();
33: }
34: }
35: return data;
36: }
37: }
On line 18, I simply get an instance of the DataContract class (we just saw how it gets generated). On lines 19-24, I loop through the properties of the input via reflection and copy each value onto the contract instance.
On lines 26-34, we’re just using protobuf-net to serialize the object and return the result as a string.
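The copy-by-name step is the part that’s easy to get subtly wrong, so here is a slightly more defensive sketch of it on its own. Everything in it is illustrative: PropertyCopier, SampleMessage and SampleContract are made-up names, and the contract class is written by hand here instead of being generated.

```csharp
using System;
using System.Reflection;

// Hypothetical message and contract types, for illustration only.
public class SampleMessage
{
    public string Name { get; set; }
    public int Count { get; set; }
}

public class SampleContract
{
    public string Name { get; set; }
    public int Count { get; set; }
}

public static class PropertyCopier
{
    // Copies each readable public instance property on source to the
    // same-named writable property on target. Properties without a
    // compatible match on target are skipped instead of throwing.
    public static void Copy(object source, object target)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (target == null) throw new ArgumentNullException("target");

        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance;
        foreach (var sourceProperty in source.GetType().GetProperties(flags))
        {
            // Skip write-only properties and indexers.
            if (!sourceProperty.CanRead || sourceProperty.GetIndexParameters().Length > 0) continue;

            var targetProperty = target.GetType().GetProperty(sourceProperty.Name, flags);
            if (targetProperty == null || !targetProperty.CanWrite) continue;
            if (!targetProperty.PropertyType.IsAssignableFrom(sourceProperty.PropertyType)) continue;

            targetProperty.SetValue(target, sourceProperty.GetValue(source, null), null);
        }
    }
}
```

With that, PropertyCopier.Copy(message, contract) covers the serialize direction and PropertyCopier.Copy(contract, message) covers the trip back, so the same helper could serve both Serialize and Deserialize.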
Not horribly complicated, but we’re still not done, because we can’t yet deserialize.
New test.
    [Test]
    public void serialize_then_deserialize()
    {
        var testThing = new TestThing { Guid = Guid.NewGuid(), Int = 1, String = "Some String" };
        var output = _serializer.Serialize(testThing);
        var result = (TestThing)_serializer.Deserialize(output, typeof(TestThing));
        Assert.AreEqual(testThing.Guid, result.Guid);
        Assert.AreEqual(testThing.String, result.String);
        Assert.AreEqual(testThing.Int, result.Int);
        Assert.AreEqual(testThing.Enum1, result.Enum1);
    }
Yeah, now we’re testing that it actually works. Let’s make this pass.
    public object Deserialize(string input, Type desiredType)
    {
        var bytes = Encoding.ASCII.GetBytes(input);
        var @event = Activator.CreateInstance(desiredType);
        using (var stream = new MemoryStream(bytes))
        {
            // The generator needs an instance to build the contract type from.
            var thisSucksINeedToFixIt = Activator.CreateInstance(desiredType);
            var protobufferType = _protocolBufferDataContractGenerator.GenerateProtoFor(thisSucksINeedToFixIt).GetType();
            var protobuffer = ProtoBuf.Serializer.NonGeneric.Deserialize(protobufferType, stream);

            // Copy each value from the deserialized contract back onto the real message.
            foreach (var propertyInfo in protobuffer.GetType().GetProperties())
            {
                var setter = desiredType.GetProperty(propertyInfo.Name);
                var value = propertyInfo.GetValue(protobuffer, null);
                setter.SetValue(@event, value, null);
            }
        }
        return @event;
    }
Not too bad, huh?
Except for that thisSucksINeedToFixIt variable. I should defer to some IoC container or something there, but honestly, I don’t think events should have dependencies. Yeah, argument for a different day.
Ok, so the test passes except for the Guid assertion. Gonna go take a look at that now.
Ok, after a lot of googling I realized that the issue was my ASCII encoding: the protobuf output is raw binary, and bytes that aren’t valid ASCII don’t survive the trip through a string. I doubt anyone reading this blog cares too much, but what I did to fix it was to use the BitConverter.ToString() method when serializing and some code I ripped off from Stack Overflow to deserialize. If you’re interested, yeah, go git the code!
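For anyone who does care, here’s roughly what that fix amounts to. This is a sketch and not the code from the repo: BinaryText and its method names are invented for illustration. It hex-encodes the bytes with BitConverter.ToString, parses them back by hand, and includes the lossy ASCII round trip for comparison.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not the code from the Ellemy repo.
public static class BinaryText
{
    // BitConverter.ToString produces "0A-FF-10", so strip the dashes.
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", string.Empty);
    }

    // Parses two hex characters per byte.
    public static byte[] FromHex(string hex)
    {
        if (hex.Length % 2 != 0) throw new FormatException("Hex string must have an even length.");
        var bytes = new byte[hex.Length / 2];
        for (var i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    // The lossy path: ASCII decoding replaces any byte above 0x7F with '?',
    // so the original value cannot be recovered afterwards.
    public static byte[] RoundTripThroughAscii(byte[] bytes)
    {
        return Encoding.ASCII.GetBytes(Encoding.ASCII.GetString(bytes));
    }
}
```

A Guid is 16 more or less random bytes, so about half of them land above 0x7F and get mangled, which is why that was the assertion that failed. The catch with hex is that it doubles the payload, which eats into the size win that started all this. If ISerializer has to stay string-based, Convert.ToBase64String would cost only about a third extra.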
So there we go, I like it, but there’s still a lot of work to go before it’s ready for prime time.
Some things I have to get done.
- Handle versioning well (the order is very important: with the current implementation, adding a new property in the middle of a class renumbers the ones after it, which would break old stuff)
- Maybe even output .proto files, and have the GoogleProtocolBufferGenerator use them if they exist, that way it’ll work by default with conventions, but allow customization, and it’ll save on the overhead of generating the contract.
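To make the versioning problem concrete, here’s a small sketch of it, plus one possible way out. It assumes field numbers are handed out in alphabetical property order, which is what the OrderBy in AddProperties suggests; FieldNumbering and both method names are made up, and the "existing" map stands in for whatever gets persisted (a .proto file, for instance).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of Ellemy or protobuf-net.
public static class FieldNumbering
{
    // Assumed current behaviour: number properties by alphabetical position.
    public static Dictionary<string, int> ByAlphabeticalOrder(IEnumerable<string> propertyNames)
    {
        return propertyNames
            .OrderBy(name => name, StringComparer.Ordinal)
            .Select((name, index) => new { Name = name, Number = index + 1 })
            .ToDictionary(x => x.Name, x => x.Number);
    }

    // Possible alternative: keep every number that was already assigned
    // and give new properties the next free number.
    public static Dictionary<string, int> PreservingExisting(
        IDictionary<string, int> existing, IEnumerable<string> propertyNames)
    {
        var result = new Dictionary<string, int>(existing);
        var next = result.Count == 0 ? 1 : result.Values.Max() + 1;
        foreach (var name in propertyNames.OrderBy(n => n, StringComparer.Ordinal))
        {
            if (!result.ContainsKey(name))
            {
                result[name] = next++;
            }
        }
        return result;
    }
}
```

With SomeGuidProperty, SomeIntProperty and SomeStringProperty, alphabetical numbering gives 1, 2 and 3. Add a SomeHashProperty and SomeIntProperty silently moves from 2 to 3, so old messages decode into the wrong fields. With the preserving version, the new property just gets 4 and everything already on the wire keeps its number.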
Have fun with it! Thoughts?