XmlSerializer namespaces

XML is a data language: it lets you describe and carry data in one file, which is awesome for inter-system communication. Another nice feature is the ability to version your data. This is done using namespaces.

<myRoot xmlns="http://porse.org/data/2019/01/11">
    <myElement />
    <yourElement />
    <herElement />
</myRoot>

See, all the data belongs to the namespace http://porse.org/data/2019/01/11 so I know what to expect - especially if I also produce a schema definition.

So in a month's time or so, I'll probably release v2, which introduces the field <ourElement />. The other elements are unaltered, so I could do something like the following:

<myRoot xmlns="http://porse.org/data/2019/02/03" xmlns:v1="http://porse.org/data/2019/01/11">
    <v1:myElement />
    <v1:yourElement />
    <v1:herElement />
    <ourElement />
</myRoot>

Notice that the "default" namespace changed to /2019/02/03, v1 was explicitly named, and I reused the elements my-/your-/her from v1 in my "new" myRoot type. And here's the point of my post: it's not obvious how to make System.Xml.Serialization.XmlSerializer produce the output from above.

Let's assume we write these classes and have XmlSerializer serialize them:

 

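[XmlType(Namespace = "http://porse.org/data/2019/01/11")]
public class Data { ... }

[XmlType(Namespace = "http://porse.org/data/2019/02/03")]
public class Data2 { ... }

[XmlRoot("myRoot", Namespace = "http://porse.org/data/2019/02/03")]
public class MyRoot
{
    // XmlSerializer only serializes public read/write members, hence the setters.
    // The v1 elements name their namespace explicitly; otherwise they would
    // inherit the namespace of the containing class.
    [XmlElement("myElement", Namespace = "http://porse.org/data/2019/01/11")]
    public Data My { get; set; }

    [XmlElement("yourElement", Namespace = "http://porse.org/data/2019/01/11")]
    public Data Your { get; set; }

    [XmlElement("herElement", Namespace = "http://porse.org/data/2019/01/11")]
    public Data Her { get; set; }

    [XmlElement("ourElement")]
    public Data2 Our { get; set; }
}

XmlSerializer ser = new XmlSerializer(typeof(MyRoot));
ser.Serialize(outStream, instanceOfMyRoot);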

The result would be something like this:

<myRoot xmlns="http://porse.org/data/2019/02/03">
    <myElement xmlns="http://porse.org/data/2019/01/11" />
    <yourElement xmlns="http://porse.org/data/2019/01/11" />
    <herElement xmlns="http://porse.org/data/2019/01/11" />
    <ourElement />
</myRoot>

Which is technically correct, but it's hard on the eyes and the bandwidth. To make it behave, you need to tell the serializer that you have more namespaces in the mix. For ad-hoc serialization you can pass an XmlSerializerNamespaces instance to Serialize, or supply an XmlAttributeOverrides to the XmlSerializer constructor. That only works when you create and call the serializer yourself, though. If, for instance, you are calling web services using svcutil-generated classes, they will use the default serializer constructor without any overrides.

But you can declare the namespaces on the class itself:

[XmlRoot("myRoot", Namespace="http://porse.org/data/2019/02/03")]
public class MyRoot
{
    [XmlNamespaceDeclarations]
    public XmlSerializerNamespaces MyCustomNamespaces;

    public MyRoot()
    {
        MyCustomNamespaces = new XmlSerializerNamespaces();
        MyCustomNamespaces.Add("v1", "http://porse.org/data/2019/01/11");
    }
    ... etc
}

When the serializer happens upon a MyRoot instance, it knows to look for a field or property with the XmlNamespaceDeclarationsAttribute and insert these. The field MyCustomNamespaces will not be serialized to the output, and we'll save 114 bytes in the transmission and gain a lot of readability.
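For reference, here is a minimal end-to-end sketch of the approach. Data and Data2 are empty stand-ins (their real members are elided above), the Ns constants and the ToXml helper are names made up for the example, and only one v1 element is included to keep it short. The v1 element states its namespace on the XmlElement attribute, which is an assumption about how the classes are wired up.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical holder for the two namespace URIs used in the post.
public static class Ns
{
    public const string V1 = "http://porse.org/data/2019/01/11";
    public const string V2 = "http://porse.org/data/2019/02/03";
}

// Empty stand-ins; the real types would carry the actual data.
[XmlType(Namespace = Ns.V1)]
public class Data { }

[XmlType(Namespace = Ns.V2)]
public class Data2 { }

[XmlRoot("myRoot", Namespace = Ns.V2)]
public class MyRoot
{
    // Picked up by the serializer as prefix declarations; not written as data.
    [XmlNamespaceDeclarations]
    public XmlSerializerNamespaces MyCustomNamespaces = new XmlSerializerNamespaces();

    [XmlElement("myElement", Namespace = Ns.V1)]
    public Data My { get; set; } = new Data();

    [XmlElement("ourElement")]
    public Data2 Our { get; set; } = new Data2();

    public MyRoot()
    {
        MyCustomNamespaces.Add("v1", Ns.V1);
    }
}

public static class NamespaceDemo
{
    // Serializes with the plain constructor, the way generated client code would.
    public static string ToXml(MyRoot root)
    {
        var ser = new XmlSerializer(typeof(MyRoot));
        using (var writer = new StringWriter())
        {
            ser.Serialize(writer, root);
            return writer.ToString();
        }
    }
}
```

Calling NamespaceDemo.ToXml(new MyRoot()) should give a root element that declares xmlns:v1 once and a child written as v1:myElement, rather than a repeated xmlns attribute on every v1 element.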

 

 

 

 

Roslyn - Extract Interface

I took it upon myself to write a CodeFix that would save our team some time. Specifically, we need to extract the interfaces of the classes that comprise the kernel of our system and use these in our DI system. I wrote a DiagnosticAnalyzer which compares the interface (if present) with the class and reports any discrepancies. The CodeFix loops over all the classes and produces one big interfaces.cs file with all the interfaces (yes, generate everything in one file, and yes, I know, but such is the requirement).

I took a brief look at the C# SyntaxFactory and immediately dismissed it as being too verbose. SyntaxGenerator has everything we need.

var gen = SyntaxGenerator.GetGenerator(interfacesfile);

// loop over INamedTypeSymbols and for each call
gen.InterfaceDeclaration( ... ) 

But not so fast ... Let's just have a look at INamedTypeSymbol.GetMembers().

It produces all the members of a type: properties, constructors, fields, methods and operators. We don't want all of that, so we need to filter the members to get only what we need:

var methodsForInterface = theClass.GetMembers()
    .OfType<IMethodSymbol>()
    .Where(p => p.MethodKind == MethodKind.Ordinary)
    .Where(p => !p.IsStatic && p.DeclaredAccessibility == Accessibility.Public)
    .ToList();

MethodKind == MethodKind.Ordinary filters out constructors, property getters and what have you. Also, naturally, only the public instance methods are to be in the interface.
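To see what the filter keeps, here is a small sketch that compiles a throwaway class in memory and runs the same query against its symbol. The Foo class, the MemberFilterDemo name and the in-memory compilation are all made up for the illustration; it assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class MemberFilterDemo
{
    // Hypothetical class with one member of each kind the filter should drop.
    private const string Source = @"
public class Foo
{
    private int _count;
    public Foo() { }
    public int Count { get { return _count; } }
    public static Foo Create() { return new Foo(); }
    public void Increment() { _count++; }
    private void Reset() { _count = 0; }
}";

    public static List<string> PublicInstanceMethodNames()
    {
        var tree = CSharpSyntaxTree.ParseText(Source);
        var compilation = CSharpCompilation.Create(
            "FilterDemo",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

        INamedTypeSymbol theClass = compilation.GetTypeByMetadataName("Foo");

        return theClass.GetMembers()
            .OfType<IMethodSymbol>()
            // Drops the constructor and the get_Count accessor.
            .Where(m => m.MethodKind == MethodKind.Ordinary)
            // Drops the static Create and the private Reset.
            .Where(m => !m.IsStatic && m.DeclaredAccessibility == Accessibility.Public)
            .Select(m => m.Name)
            .ToList();
    }
}
```

Of the members above, only Increment survives both Where clauses.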

Right.. The result is a List<IMethodSymbol> which is easily enumerated, and SyntaxGenerator has a nice little method:

public Microsoft.CodeAnalysis.SyntaxNode 
MethodDeclaration (Microsoft.CodeAnalysis.IMethodSymbol method, 
System.Collections.Generic.IEnumerable<Microsoft.CodeAnalysis.SyntaxNode> statements = null);

And since we're not interested in the method body we can leave the statements part out. But not all is good. As it turns out, the above method cuts a few corners and leaves out stuff that I want left in. 

 

var mSym = theMethodSymbol;
var synGen = syntaxGenerator;

var methodDecl = synGen.MethodDeclaration(mSym.Name,
    returnType: synGen.TypeExpression(mSym.ReturnType),
    // typeParameters: TODO
    parameters: mSym.Parameters.Select(p => synGen.ParameterDeclaration(p))
    );

Ok, let's just skip the type-parameters for now and concentrate on the parameters. As stated above, the SyntaxGenerator.ParameterDeclaration(IParameterSymbol) method cuts an important corner as well. It leaves out any potential "Explicit Default values", so if for instance your method looks like this:

public void YourMethod(int parm1, bool parm2 = false, string parm3 = null){ ... }

the interface would end up looking like this:

void YourMethod(int parm1, bool parm2, string parm3);

Again the long way...  

parameters: mSym.Parameters
    .Select(p => synGen.ParameterDeclaration(p.Name,
        type: synGen.TypeExpression(p.Type),
        initializer: p.HasExplicitDefaultValue
            ? synGen.LiteralExpression(p.ExplicitDefaultValue)
            : null,
        refKind: p.RefKind))
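Put together, the long way might look like the sketch below. It is not the actual CodeFix: the Foo class, the IFoo name, the DefaultValueDemo wrapper and the AdhocWorkspace are assumptions made to get a self-contained example, and it needs the Microsoft.CodeAnalysis.CSharp.Workspaces NuGet package.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Editing;

public static class DefaultValueDemo
{
    // Hypothetical class using the method signature from the text.
    private const string Source = @"
public class Foo
{
    public void YourMethod(int parm1, bool parm2 = false, string parm3 = null) { }
}";

    public static string GenerateInterface()
    {
        var tree = CSharpSyntaxTree.ParseText(Source);
        var compilation = CSharpCompilation.Create(
            "DefaultValueDemo",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

        IMethodSymbol mSym = compilation.GetTypeByMetadataName("Foo")
            .GetMembers("YourMethod")
            .OfType<IMethodSymbol>()
            .Single();

        using (var workspace = new AdhocWorkspace())
        {
            var synGen = SyntaxGenerator.GetGenerator(workspace, LanguageNames.CSharp);

            // Build each parameter by hand so the default value is carried over.
            var parameters = mSym.Parameters.Select(p => synGen.ParameterDeclaration(
                p.Name,
                type: synGen.TypeExpression(p.Type),
                initializer: p.HasExplicitDefaultValue
                    ? synGen.LiteralExpression(p.ExplicitDefaultValue)
                    : null,
                refKind: p.RefKind));

            var methodDecl = synGen.MethodDeclaration(
                mSym.Name,
                parameters: parameters,
                returnType: synGen.TypeExpression(mSym.ReturnType));

            var iface = synGen.InterfaceDeclaration(
                "IFoo",
                accessibility: Accessibility.Public,
                members: new[] { methodDecl });

            return iface.NormalizeWhitespace().ToFullString();
        }
    }
}
```

The generated text should contain the interface with parm2 = false and parm3 = null intact.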

Getting closer... Now, type-parameters.. What if your class is a Generic one? Foo<T>.. We would want our interface to also be generic: IFoo<T>  that's the typeParameters that I left TODO a few lines back. It's actually pretty easy as it's just names, but what if T is constrained to certain types?!

For good measure:

    typeParameters: mSym.TypeParameters.Select(t => t.Name)

If there are constraints:

// A plain loop: a lazy Select with a statement lambda would neither compile nor run.
foreach (var t in mSym.TypeParameters)
{
    var specialKind = SpecialTypeConstraintKind.None;
    if (t.HasConstructorConstraint)
        specialKind |= SpecialTypeConstraintKind.Constructor;
    if (t.HasReferenceTypeConstraint)
        specialKind |= SpecialTypeConstraintKind.ReferenceType;
    if (t.HasValueTypeConstraint)
        specialKind |= SpecialTypeConstraintKind.ValueType;

    methodDecl = synGen.WithTypeConstraint(methodDecl, t.Name, specialKind,
        t.ConstraintTypes.Select(ct => synGen.TypeExpression(ct)).ToArray());
}

Finally: attributes ... Like a lot of VS teams, we still depend on JetBrains ReSharper and make use of attributes like [NotNull], [CanBeNull] and [ItemNotNull] to help our IntelliSense a bit. Well, attributes are not included in MethodDeclaration. They've been factored out, and you need to adorn your methodDecl and parmDecls yourself.

var attrs = mSym.GetAttributes();
if (attrs.Any())
    methodDecl = synGen.AddAttributes(methodDecl, attrs.Select(a => synGen.Attribute(a)));

and likewise within the parameters: mSym.Parameters.Select(...) 

 

Now, the same goes for public properties: theClass.GetMembers().OfType<IPropertySymbol>().Etc().AndSoForth().

I'll leave it to you to glue everything together. Next up, I'll try to put the generated interface in a file.

Roslyn - Semantics

I've been getting to know Roslyn the past couple of weeks. It's a little frustrating as the documentation has been cleansed of any useful examples, but luckily there is stackoverflow and gitter and google :o)

I initially began writing about a concrete problem I had to solve: Extract the interface of a class and put it somewhere else in the solution. And came to the conclusion, that I spent too much time explaining the difference between syntax and semantics. Eventually I deleted everything and started writing this.

Semantics is the bird's-eye view of your code, that is to say the definitions that make up your code and how they fit together. We're talking namespaces, interfaces, types and methods. The actual implementation of a method, which may indeed involve types, is irrelevant to the semantic model.

Syntax is the representation of your code as the compiler sees it. It contains everything, even the white space (or lack thereof) between keywords in your code. When coding Roslyn analyzers/CodeFixes you will want to know whether you're trying to fix the semantics of your code or the syntax. A good example is mine from above: extract the interface of a class. The extraction of an interface is an exercise in semantics (how to interact with my class at a high level).

Now, I tried to extract an interface from syntax, but inevitably had to fail, because the syntax tree lacks a higher understanding of types and their origin - which led to missing namespace imports - as well as understanding of partial classes - which led to missing methods in the interfaces.
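The partial-class problem is easy to reproduce. The sketch below uses a made-up class Foo split over two in-memory "files": each syntax tree only knows about its own half, while the symbol from the compilation has both methods. The PartialClassDemo name and the sources are hypothetical; it assumes the Microsoft.CodeAnalysis.CSharp NuGet package.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class PartialClassDemo
{
    private const string PartA = "public partial class Foo { public void A() { } }";
    private const string PartB = "public partial class Foo { public void B() { } }";

    // Syntax view: count the method declarations visible in the first file only.
    public static int MethodsSeenInFirstTree()
    {
        var treeA = CSharpSyntaxTree.ParseText(PartA);
        return treeA.GetRoot()
            .DescendantNodes()
            .OfType<MethodDeclarationSyntax>()
            .Count();
    }

    // Semantic view: one symbol for Foo, merged from both files.
    public static List<string> MethodsSeenOnSymbol()
    {
        var trees = new[]
        {
            CSharpSyntaxTree.ParseText(PartA),
            CSharpSyntaxTree.ParseText(PartB)
        };
        var compilation = CSharpCompilation.Create(
            "PartialDemo",
            trees,
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

        return compilation.GetTypeByMetadataName("Foo")
            .GetMembers()
            .OfType<IMethodSymbol>()
            .Where(m => m.MethodKind == MethodKind.Ordinary)
            .Select(m => m.Name)
            .ToList();
    }
}
```

The first tree sees a single method, while the symbol reports both A and B, which is exactly the information an interface extractor needs.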

With that out of the way, hopefully I can now share what I learned while building my interface extractor.

 

 

Getting to grips with RX

Reactive Extensions - Rx for short - is a pretty neat little framework by Erik Meijer. I've sort of been avoiding it, don't really know why, but Copenhagen .Net User Group (CNUG) recently had Tamir Dresher do an introduction to the tech (and blatantly plug his upcoming book on the subject), and it was a bit of an eye opener, so I've decided to dedicate some time to get to the bottom of this and blog about it.

The Limitations of IEnumerable

We all know IEnumerable and its trusty sidekicks, foreach(var foo in bar) and .Select/.Where/.Any/.All/.etc. It's wonderfully convenient to have an interface where going through a collection/set/array/list is a matter of letting the compiler do its magic.

foreach(var foo in bar)
{
    // do stuff with foo
}

But what if bar contains something exotic like a bunch of Task<T>s that are currently crunching away on healthy portions of data? We can't really be sure when any one of them will return, and we certainly can't count on them returning in the same order we enumerate them, so we will be wasting time, even if we employ Parallel.ForEach.

Ideally we would like to be able to respond to the tasks completing as events, such that when the first Task completes, we immediately respond to it. We would also like to know when the last Task completes, in order to be able to shutdown gracefully.
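Without Rx, one common way to get there with plain tasks is a Task.WhenAny loop. The sketch below is only meant to show how much plumbing that takes; CompletionOrderDemo and its two methods are hypothetical names, and DelayedValue just simulates work with Task.Delay.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class CompletionOrderDemo
{
    // Collects results as the tasks finish, not in the order they were listed.
    public static async Task<List<int>> CollectInCompletionOrderAsync(IEnumerable<Task<int>> tasks)
    {
        var pending = tasks.ToList();
        var results = new List<int>();

        while (pending.Count > 0)
        {
            // Wait for whichever task completes next and react to it immediately.
            Task<int> finished = await Task.WhenAny(pending);
            pending.Remove(finished);
            results.Add(await finished);
        }

        // Falling out of the loop is our "last task completed" signal.
        return results;
    }

    // Simulated workload: returns the value after the given delay.
    public static async Task<int> DelayedValue(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }
}
```

Feeding it DelayedValue(1, 300), DelayedValue(2, 100) and DelayedValue(3, 200) should normally hand back the shortest delay first, though the exact order depends on timing. Note that the list bookkeeping and the end-of-work detection are all ours to write.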

Another scenario: what if bar is in fact not a collection with a fixed number of elements? Think of a mailbox or maybe a performance counter. New elements keep popping into it, which makes foreach-ing over the collection sort of impossible. We're forced to employ a different strategy, probably involving queues and maybe OnNew-events which we need to subscribe to, and then we have to keep track of which threads are now accessing which parts of the application.

IObservable to the Rescue

Rx is an implementation of the Observer pattern (the Reactive part: you react to what gets pushed at you) with LINQ operators on top (the eXtensions). IObservable is mathematically dual to IEnumerable, which could be translated into something like 'the same, but seen from the other side'. Instead of pulling foos out of bar, let bar push foos out to you.

In the case of an IObservable of Task<T>, the result would be to get the Tasks in the order they complete. In the case of the mailbox or performance counter, the IObservable implementation would simply respond to additions by emitting them to anyone listening without further notice.

bar.Subscribe(foo => { /* do stuff with foo */ });

Oh, and don't get me wrong, it's not really that simple - but the gist of it is this: Rx provides a pattern for solving concurrency-issues in a short and elegant way.
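To make the push model a bit more concrete, here is a hand-rolled mailbox built on nothing but the IObservable<T> and IObserver<T> interfaces that ship with .NET. It is a toy, not how Rx implements things: Mailbox, Deliver, Close and CollectingObserver are made-up names, and there is no thread safety or error handling.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mailbox that pushes each new mail to whoever is listening.
public sealed class Mailbox : IObservable<string>
{
    private readonly List<IObserver<string>> _observers = new List<IObserver<string>>();

    public IDisposable Subscribe(IObserver<string> observer)
    {
        _observers.Add(observer);
        return new Unsubscriber(_observers, observer);
    }

    // New element arrives: push it out instead of waiting to be enumerated.
    public void Deliver(string mail)
    {
        foreach (var observer in _observers.ToArray())
            observer.OnNext(mail);
    }

    // Tell the listeners that nothing more will arrive.
    public void Close()
    {
        foreach (var observer in _observers.ToArray())
            observer.OnCompleted();
        _observers.Clear();
    }

    private sealed class Unsubscriber : IDisposable
    {
        private readonly List<IObserver<string>> _observers;
        private readonly IObserver<string> _observer;

        public Unsubscriber(List<IObserver<string>> observers, IObserver<string> observer)
        {
            _observers = observers;
            _observer = observer;
        }

        public void Dispose() => _observers.Remove(_observer);
    }
}

// Minimal observer that just remembers what it was sent.
public sealed class CollectingObserver : IObserver<string>
{
    public List<string> Received { get; } = new List<string>();
    public bool Completed { get; private set; }

    public void OnNext(string value) => Received.Add(value);
    public void OnError(Exception error) { }
    public void OnCompleted() => Completed = true;
}
```

Subscribing a CollectingObserver and calling Deliver twice followed by Close leaves it with both mails and Completed set. Disposing the subscription first means later mails never reach it. The lambda-style Subscribe used earlier comes from the Rx library itself and is not part of these two interfaces.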

I'll be examining Rx in detail over the next couple of weeks.

A few links

http://reactivex.io/
http://josemigueltorres.net/index.php/ienumerableiobservable-duality/

 

 

 

 

The NUnit upgrade

Recently we decided that an upgrade was long overdue: our unit test project was using NUnit 2.6.4, and v3.4.1 had been around for a couple of weeks.

So, we did what anyone would, and let NuGet do the heavy lifting. Of course there were a few breaking changes, but nothing we couldn't handle. 30 minutes and 8 commits later, we were officially back to 2016. But TeamCity didn't agree. It seemed that after an NUnit run, the nunit-agent.exe process was never killed off, keeping pesky references to the assets in the bin-folder thus preventing the next build from succeeding.

So, a bit of digging and we found this: https://github.com/nunit/nunit-console/issues/43. TL;DR: the above happens if nunit-console output is redirected to anything but stdout, as TeamCity does. Fix: probably NUnit 3.5, but they're taking it seriously.

Screengrab from github/nunit/issue/43

Oh, well. Back to 2.6.4, and now I've written this to remind me why our unit test project was left behind in 2015.

---
Update 27.nov.2016
NUnit released v3.5 recently, and we've tried it out - still doesn't work for us. We get intermittent "AppDomainUnloadExceptions" presumably because something in our test fixtures is being kept alive for longer than the timeout.