TDD and the Silver Bullet Express

So, there has been a lot of talk about TDD in the past half a year. It (possibly) started with David Heinemeier Hansson’s articles, TDD is Dead. Long Live Testing and Test-induced design damage. Those led to the ‘TDD is Dead’ Google Hangouts between Hansson, Martin Fowler and Kent Beck, which I highly recommend watching. And in the midst of all of it, as a response to the GOTO Fail / Heartbleed incidents, ex-Googler Mike Bland posted Goto Fail, Heartbleed, and Unit Testing Culture.

Hansson’s sentiments resonate with me. He is not anti-testing; he merely asks you to consider the trade-offs of the TDD approach because, despite what TDD zealots might claim, they do exist. Unit-testing can prevent bugs. Unit-testing can also unnecessarily complicate/mangle your design and cost time/money that you might not have. And, bugs may or may not be acceptable, depending on what it is you’re implementing.

The GOTO Fail bug led to a furor of responses which, in enlightened hindsight (or rather, in knee-jerk rhetoric), have provided an excuse for every man and his dog to bleat on about how their favourite technique/tool/preference/plush toy could have prevented it. Bland’s article above attempts to demonstrate how unit-tests could have prevented GOTO Fail / Heartbleed. He’s simply trying to remind developers that they aren’t hopelessly subjected to the whims of software (i.e. the indifferent and feeble ‘bugs happen / shit happens’ attitude) and do have tools at their disposal which can help…

I don’t take issue with the article and I actually don’t disagree that unit-testing might have been an/the appropriate measure. I take issue with the TDD fanatics who misinterpreted it in the worst way possible. It’s all very easy, after a bug has been exposed, to be Captain Hindsight and write a unit-test which reveals it. I find it incredibly insulting to collective human intelligence that zealots think they can extrapolate from this singular demonstration that all people everywhere should be operating under TDD in earnest, regardless of all other factors, and that it would prevent any/all bugs like this and lead to developer salvation worldwide. Maybe unit-tests would have been a good fit for this chunk of security-related code? It does not mean that TDD is the most suitable solution for all codebases everywhere, or even a viable/sensible solution. It does not mean that I should be aiming for 100% code coverage if I’m writing a local community website which is used by 10 people a year.

Because you know what else might have prevented the GOTO Fail bug?

  • Static code analysis, which would reveal dead code after the second accidental GOTO… and for the record, this would be infinitely cheaper than unit-testing.
  • Using a language which offers try/catch/finally and does not require the goto/cleanup pattern which is idiomatic in C.
  • Using a functional language which would have eliminated this imperative sprawl altogether
  • Mandating the use of braces around if/else code blocks (not that I agree with this)
  • Regulations/protocol to ensure that developers are more careful when merging code (as that appears to have been the source of this fault).
  • Code reviews
  • Community code review poetry sessions, where speakers attempt to shoehorn all code into iambic pentameter

It’s not just the TDD fanatics who were being sanctimonious. In fact, that event caused all sorts of fanatics everywhere to come out of the woodwork and engage in bleating. And this brings me to my more general gripe with the software world… the existence of these silver bullets which promise to solve all your problems, and the rampant disingenuous brain-dead evangelism which soon follows. Dijkstra says that if you’re exposed to BASIC in your nascent days of programming, you’ll be mentally mutilated beyond hope of regeneration. I say that if you blindly latch onto anything touted as a silver bullet, you’re in the same boat. Let’s take a moment to reflect on these panaceas which have been misappropriated.

TDD, BDD, DDD, user driven development, XP, RUP, Agile, Scrum, unit-testing, design patterns, OOP, immutability, the Cloud, NoSQL, ORMs, CMM, IDEs, convention over configuration, XML over/for everything, IOC containers, prototyping, static typing, dynamic typing, basically every application/web framework ever invented… I’ve only worked for a few years as a full time developer, but I’ve already heard these pearls of wisdom, often from senior people:

  • If you’re not using TDD, you can’t call yourself a software engineer.
  • What? Automated user interface testing. That’s not real testing!
  • You need to be using XML. It’s not declarative otherwise.
  • How can you be developing C# without IOC containers?
  • How are people going to understand code you write without UML?
  • (In the context of OOP), private methods are just classes waiting to burst free.
  • How can software developers possibly be self-managing without daily standups with the PM?

Now here’s the thing. These methodologies are not useless – I personally use some of them on a daily basis. Many of them actually have useful and great ideas! Ideas which can fantastically benefit you from a cost/value perspective, if, IF they are applied properly, with discretion and judgement. Often they’re formulated by smart people who aren’t terribly dogmatic themselves! But somewhere down the line, they’ve been warped and abused…

These bullets are good in intention. They’re generally groups of related best practices / techniques / technologies / ideas which, when applied correctly to the appropriate problem, will yield benefits (on average). It’s easy to see why they exist. They usually begin as a reaction to address something which is perceived as nasty. They’re a symptom of the state of the software world during a particular period in time. And it’s easy to see how they become popular. A few people start using them to great success, and they quickly spread because they adeptly solve the problem they were engineered to solve. Even more so if books are being written about them and marketing departments are getting behind them. Architects/senior devs/managers catch wind of them and start using/prescribing them too. They work well in some cases, and the adopters become very excited about them and tell their friends. Junior devs join a company and start using the bullets which are already being used. They like them, as they provide some sort of direction in the scary jungle… it’s probably better than blindly spraying code all over the place, right?

So where do the problems begin? What makes these bullets such an alluring target for abuse, warping them into annoying fads which proliferate our airspace? What is it about them which brings out the worst in people? Why must I tolerate the existence of people who refer to themselves as ‘Agile warriors’ or ‘Agile ninjas’?

  • People often lack the judgement not to apply the same hammer to every problem in cargo cult fashion. From their point of view, if it worked for the last project, there’s no reason not to use it again for their next project/job… and this time round, it should be even faster/easier! There’s often little consideration as to whether the new problem is even remotely similar to the previous problem, or misunderstanding as to what makes a solution a good fit for a problem. Bullets get misappropriated. A good example is JS templating frameworks… it’s relatively easy to learn something like Knockout… it’s slightly harder to have the prudence to know when to use it, and when not to.
  • People tend to project their own tasks/constraints onto others. They don’t consider that the problem they’re solving might be vastly different to the gamut of problems that other people are solving. A zealot who measures a man’s worth based on his OOP skills probably isn’t even considering embedded/game programmers, who often have little need for it. Hell, I think web development has less use for OOP than many other areas (but that’s another argument for another day).
  • It’s easier to feel invested in your existing trove of knowledge and defend it vehemently than to learn new ways. Provided that a technique you’ve used in the past resulted in an at least tolerable experience, you’ll probably favour it over new unknown techniques which you / your company have to invest time/money in learning. And besides, there simply isn’t enough time in the day to evaluate all the bullets.
  • Most devs tend to get locked into a particular ecosystem (whatever they happen to use at work) and ignore everything outside it… this leads to myopic adherence to certain bullets (think Microsoft vs Unix stack, relational vs NoSQL etc.). This kind of insularity is particularly prevalent in Microsoft land (although things are improving nowadays!). And when the minimal group paradigm shows us that even the slightest distinction between groups leads to discrimination, it’s easy to see how fanaticism starts!
  • Devs often have a tendency to take things to a logical extreme, leading them to religiously adhere to every aspect of a bullet, rather than retaining the useful parts and discarding the chaff. Just because I’m utilising four XP practices doesn’t mean I have to use the other eight.
  • While these bullets do generally solve the problem they were intended to solve, they often create other problems which didn’t originally exist; case in point is OOP, which can solve many problems but can also wreak havoc when in the hands of architecture astronauts. These trade-offs are often ignored or swept under the rug.
  • Some ideas aren’t optimal on an individual basis, but on average, produce a better result. E.g. I’m not fond of the rule that a method should be no longer than ‘x’ lines. Even if you ignore the arbitrary limit and treat it as a fuzzy guideline, it’s still dangerously non-optimal in many cases which might be better served by your own judgement. However, if you’re on a team full of people whose judgement leads them to write tangled thousand line methods everywhere, then having a rule like this is probably preferable, for the greater good.
  • Misattribution of benefits. The best example of this is people saying that TDD improves their design and using that as a blanket argument for applying TDD to all parts of a codebase all the time. Yes, TDD forces you to think about your design/problem (to some extent). And yes, doing that will probably improve your design. But that doesn’t mean that TDD is a prerequisite for thinking about your design/problem up front! If you find yourself unable to think about design without it, you should probably re-evaluate that crutch.
  • Consultants, marketing departments and tech authors all tend to benefit when the technology/trend they know/write about is thriving. It’s how they make a living, so they have every reason to get behind it, regardless of all other factors. And let’s face it, a well reasoned and objective evaluation of competing solutions in an area isn’t going to sell as well as an extreme evangelistic acclamation of a single solution.

Fortunately, I work at a company (or at least in a team) where fads do not run amok. Instead, we attempt to use our judgement in order to make decisions which are sensible with consideration of the task at hand and people involved. And because this has proven to be effective for us, it clearly means it’ll be effective for everyone. So I’m going to add to the development methodology dog-pile with one of my own. I call it the Use Your Brain methodology, and here are its tenets which I shall furiously impose on you:

  • Test your code if you think it needs to be tested. Would tests offer much value in this case? Would a failure in that piece of code affect many people? What would a failure cost? Would a failure be catastrophic in consequences – does the code deal with security, credentials, money? Does it tick enough boxes in your ‘criteria of things which tend to require testing’ to justify the development/maintenance cost of writing tests? Think about it.
  • Use whatever frameworks you feel are appropriate.
  • Document code if it makes sense to document it.
  • Refactor if you feel it’s necessary. What will it cost you if you don’t?
  • Substance over style. Think about it.
  • Manage tasks/estimations/quality in a way which makes sense. Think about it.
  • Operate in a way which makes sense given the strengths/weaknesses of your team.

Zzz, I’m getting bored of writing about this banal process – you get the picture. Now, I’m going to proceed to shove it down everyone’s throats! :) And in the meantime, I’m awaiting the Next Big Thing.

Posted in General, Rants

NuGet issues with Nested Solutions / Branches

NuGet is a decent tool if you use it in exactly the way Microsoft envisages… but unfortunately, like many products/frameworks in their ecosystem, it suffers from Microsoft Tunnel Vision™. At work, I’m attempting to properly solve a problem which has been plaguing us for a while now, once and for all. You might find it referred to elsewhere as the nested solution/branch problem, the HintPath problem, the per-solution problem, the repositoryPath problem etc… it seems people have been complaining about it, and to little avail.

The fundamental issue is that NuGet packages are stored relative to the solution (usually in $(SolutionDir)\packages), and references to DLLs in those packages are project relative. So, if you have a project which is added to multiple solutions which live in different folders, the project’s assembly references will be valid in one solution, but invalid in the other. To illustrate, here’s our typical code structure:

/MyFooBar
    MyFooBar.sln

    /ReusableLibrary
        ReusableLibrary.sln

        /UsefulWidgets
            UsefulWidgets.csproj
        /packages
            /NLog
                NLog.dll
    /Foo
        Foo.csproj
    /packages
        /NLog
            NLog.dll

Our common reusable library, ReusableLibrary.sln, exists in its own branch. Since we want to use it in MyFooBar.sln, we branch it into a folder at the source level. Our library projects were already members of ReusableLibrary.sln and are now also members of MyFooBar.sln. NuGet ‘packages’ folders ideally should not be checked in. This is an extremely common and sensible branching structure.

But look at what happens when UsefulWidgets is made to reference NLog.dll, acquired via NuGet. If added in the context of ReusableLibrary.sln, UsefulWidgets will have a reference to ..\packages\NLog\NLog.dll. But when the library is opened in the context of MyFooBar.sln, that packages folder will not exist, and the reference will be invalid. NuGet will pull in the same NLog package at /MyFooBar/packages/NLog, for which the correct reference would be ..\..\packages\NLog\NLog.dll. This seems like a pretty big design flaw and oversight to me. I’m not too sure how other people deal with this, but these are the candidate solutions we’ve evaluated.

The checked-in packages solution

This is the solution we used for over a year until recently. It involves checking in all NuGet packages and disabling package restore. This ensures that, upon getting latest, the packages always exist where projects expect them to exist. Problems with this approach include:

  • Developers need to remember to check in packages. The moment they forget, the build breaks.
  • Packages must only be added to projects in the context of the innermost solution they belong to. That is to say, if I want to add a NuGet package to UsefulWidgets.csproj, I need to do it in the context of ReusableLibrary.sln. Otherwise, the new package would be added to the outer packages folder which does not exist when ReusableLibrary is opened in isolation in its own branch (or anywhere else it has been branched). Annoying.
  • Someone might accidentally enable package restore.
  • Checking in NuGet packages is generally frowned upon. I’m sure this has been discussed at length elsewhere, so I won’t get into it.

The HintPath hack solution (ding ding ding!)

This is the solution we’ve just switched to. It involves using NuGet as normal without changing anything, and without checking in packages. But whenever an assembly reference is added to a project as part of a NuGet package being added, you must manually edit the project file and change the HintPath (the path to the DLL) from being project relative to solution relative. In the above case, the reference would change from ..\packages\NLog\NLog.dll to $(SolutionDir)packages\NLog\NLog.dll. Provided package restore is working as it should, this reference should always be valid. From our short trial period, this seems like the least bad solution. Problems include:

  • The obvious hassle of having to hack all NuGet originated references. Fortunately, this could be automated via a Powershell script (or perhaps as part of the build process). And unless you’re adding references galore, it’s not terribly onerous.
  • In newer versions of NuGet, package restore only happens in the context of Visual Studio. So in order to get your build server to fetch packages on the fly as is required, you’ll need to edit your project files and add a task to invoke NuGet and make it do exactly what VS is doing (a call to nuget.exe restore).
  • In some cases (e.g. using MSBuild to build a project in isolation), the $(SolutionDir) MSBuild property might not be defined, so the HintPath will not resolve.
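
To make the HintPath edit concrete, here is a rough sketch of what the relevant bits of UsefulWidgets.csproj might look like afterwards. The Reference/HintPath part is just the change described above. The SolutionDir fallback is an assumption on my part (one possible way of coping with the last problem in the list), and the fallback path itself is hypothetical, so adjust it to whatever suits your layout.

```xml
<!-- Sketch only: fragment of UsefulWidgets.csproj -->
<PropertyGroup>
  <!-- Assumed fallback for builds outside a solution (e.g. msbuild UsefulWidgets.csproj),
       where $(SolutionDir) is empty or '*Undefined*'. The path chosen here is hypothetical. -->
  <SolutionDir Condition="'$(SolutionDir)' == '' Or '$(SolutionDir)' == '*Undefined*'">$(MSBuildProjectDirectory)\..\</SolutionDir>
</PropertyGroup>
<ItemGroup>
  <Reference Include="NLog">
    <!-- Was: ..\packages\NLog\NLog.dll (project relative) -->
    <HintPath>$(SolutionDir)packages\NLog\NLog.dll</HintPath>
  </Reference>
</ItemGroup>
```

Note that $(SolutionDir) ends with a trailing slash when Visual Studio supplies it, which is why there is no separator before "packages".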

The single shared repository solution

NuGet allows you to specify repository paths via solution level nuget.config files. So you could set things up so that all solutions put their NuGet packages in the inner repository (i.e. ReusableLibrary\packages). So, you’d have a single stable repository which exists in all contexts. This solution is not viable for us because:

  • The repository becomes polluted with packages which are only relevant to outer solutions. Essentially, the inner library repository ends up holding the set of all packages used by the inner library and everything which uses the library.
  • This approach fails as soon as you have another nested solution whose projects have the ‘must work in multiple contexts’ requirement and does not lie on the repository’s root path (i.e. it does not contain the repository). Consider what happens if you branch in AnotherReusableLibrary at the same level as ReusableLibrary.
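
For reference, the kind of solution level nuget.config this approach implies would look roughly like the sketch below. The repositoryPath key is a standard NuGet setting; where the file lives and the relative path used are assumptions based on the folder structure shown earlier, and exactly where NuGet looks for the file can vary between versions.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: nuget.config assumed to sit next to MyFooBar.sln -->
<configuration>
  <config>
    <!-- Assumed to be relative to this file; points the outer solution at the inner packages folder -->
    <add key="repositoryPath" value="ReusableLibrary\packages" />
  </config>
</configuration>
```

ReusableLibrary.sln needs no such file, since its default packages folder is already the shared one.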

The ‘Don’t Use NuGet’ solution

Owing to our frustration with this issue, we were about ready to abandon NuGet altogether! After all, the old school method of manually checking in DLLs to a stable location might be slightly slow, archaic and cumbersome, but at least it’s reliable – it simply works without any hidden surprises. And it’s not like we have a huge number of dependencies anyway.

But, I like NuGet too much. It’s a step in the right direction for the M$ stack and it pleases me to be able to have the nice things that Ruby/Java/Linux/whatever developers have had for a long long time prior. Moreover, MVC is virtually married to NuGet nowadays. Divorcing them every time you create a web project would be a PITA.

Posted in ASP .NET, C#, General, Version Control

Some wacky HTML generation code I wrote the other week

On a site I’m working on, we offer content authors a multi-line textbox whose content is later rendered to end-users by way of replacing newline-like tokens with <br />s. It was then decided that we needed to add support for bullet points. So we needed something to take the ‘markup’ on the left and convert it to the HTML on the right (or similar):

[Image: example markup (left) and the resulting HTML (right)]

Owing to extreme time constraints and the fact that we wanted to exclude all other forms of rich content (and other reasons), I decided to roll my own quick and nasty HTML generation code rather than use/customise something like CKEditor/Markdown. So here it is – I felt a little dirty writing it but hey, it took next to no time, it works like a dream and is even unit-tested (bonus!). My only issue with it is that we’re using <br /> elements instead of <p> elements with margins, but on the other hand, content authors will seldom be using line breaks and it’s something which can be easily changed later.

Note that RegexReplace(), StringJoin(), IsNullOrEmptyOrWhiteSpace() and Remove() are all extensions which we’ve added as we prefer a more functional/fluent way of programming. Excuse the lack of constants :D

static string ConvertMarkupToHtml(string markup)
{
  if (markup == null)
    return null;

  return markup
    // "* [something]" will eventually be translated to <li>[something]</li>, so remove the space now
    .Replace("* ", "*")
    // Replace *[something] linebreak with <ul><li>[something]</li></ul>
    .RegexReplace(@"(\*(?<content>[^\r\n]*)(\r\n|\n))", "<ul><li>${content}</li></ul>")
    // Replace *[something]END with <ul><li>[something]</li></ul>END
    .RegexReplace(@"(\*(?<content>[^\r\n]*)$)", "<ul><li>${content}</li></ul>")
    // Unescape asterisks
    .Replace("**", "*")
    // Replace newline like tokens with <br />s
    .Replace("\r\n", "<br />")
    .Replace("\n", "<br />")
    // Combine adjacent lists
    .Remove("</ul><ul>")
    // Surround all content which is not HTML or inside HTML with <p></p>
    // ... RegexSplit includes the delimiters (bits of HTML) as part of its result - so this returns <br />s, <ul></ul>s and raw content
    .RegexSplit(@"(<br />|<ul>.+?</ul>)")
    .Where(c => !c.IsNullOrEmptyOrWhiteSpace())
    // ... Let markup pass through - wrap non-markup in <p></p>
    .Select(chunk => chunk.StartsWith("<br />") || chunk.StartsWith("<ul>") ? chunk : "<p>" + chunk + "</p>")
    .StringJoin()
    // Whenever 1-2 <br />s exists between two paragraphs, amalgamate the paragraphs preserving the <br />s
    // ... (<br />){1, 2} doesn't work because of the space: .NET then treats the braces as literal text. (<br />){1,2} would be fine
    .RegexReplace("</p>(?<brs>(<br />|<br /><br />))<p>", "${brs}")
    // Whenever 3+ <br />s exist between two paragraphs, remove two of them to compensate for p element spacing
    .RegexReplace("</p><br /><br /><br />(?<brs>(<br />)*)<p>", "</p>${brs}<p>")
    // Whenever some number of <br />s exists between a </p> and a <ul>, remove one of them
    .RegexReplace("</p><br />(?<brs>(<br />)*)<ul>", "</p>${brs}<ul>");
}
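
For anyone wanting to run the above, here is a minimal sketch of what the extension methods could look like. These are not our actual implementations (which aren't shown in this post), just plausible stand-ins built on Regex and String, assuming that StringJoin() concatenates without a separator and that Remove() strips every occurrence of a substring.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-ins for the extensions used by ConvertMarkupToHtml()
public static class StringExtensions
{
    public static string RegexReplace(this string input, string pattern, string replacement)
    {
        return Regex.Replace(input, pattern, replacement);
    }

    public static string[] RegexSplit(this string input, string pattern)
    {
        // Regex.Split includes text matched by capturing groups in its result,
        // which is how the delimiters (<br />s and <ul></ul>s) survive the split
        return Regex.Split(input, pattern);
    }

    public static string Remove(this string input, string value)
    {
        // Removes every occurrence of value
        return input.Replace(value, string.Empty);
    }

    public static bool IsNullOrEmptyOrWhiteSpace(this string input)
    {
        return string.IsNullOrWhiteSpace(input);
    }

    public static string StringJoin(this IEnumerable<string> values)
    {
        // Assumed behaviour: concatenate with no separator
        return string.Concat(values);
    }
}
```

The string overload of Remove() doesn't clash with the built-in String.Remove(int), as the compiler only falls back to the extension when no instance method accepts the argument.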

Posted in C#, Web

Provisioning SharePoint Lookup Fields Programmatically

There’s a large number of blog posts floating around on the topic of creating lookup fields (whether programmatically, via XML, or via the UI), as there are many ways of doing so with many variations, each with its own set of limitations. I’ve come up with a reasonably generic and flexible way of doing this with very few real drawbacks. It takes the form of the extension method below which I’ve modified slightly for public consumption.

using System;
using Microsoft.SharePoint;

public static class SPFieldCollectionExt
{
  public static SPFieldLookup CreateLookupField(this SPFieldCollection fields, string title, string internalName,
    SPField lookupListTargetField, bool addToAllContentTypes, Action<SPFieldLookup> fieldActions)
  {
    string caml = String.Format(
      "<Field Type=\"Lookup\" DisplayName=\"{0}\" Name=\"{0}\" List=\"{1}\" />",
      internalName, "{" + lookupListTargetField.ParentList.ID.ToString().ToUpper() + "}");

    string nameOfCreatedField = fields.AddFieldAsXml(caml, true,
      addToAllContentTypes ? SPAddFieldOptions.AddToAllContentTypes : SPAddFieldOptions.AddToDefaultContentType);

    return fields.UpdateField<SPFieldLookup>(nameOfCreatedField, f =>
    {
      f.Title = title;
      f.LookupWebId = lookupListTargetField.ParentList.ParentWeb.ID;
      f.LookupField = lookupListTargetField.InternalName;
  
      if (fieldActions != null)
        fieldActions(f);
    });
  }


  static T UpdateField<T>(this SPFieldCollection fields, string fieldInternalName, Action<T> fieldActions) where T : SPField
  {
    var field = (T)fields.GetFieldByInternalName(fieldInternalName);
    fieldActions(field);
    field.Update();
    return (T)fields.GetFieldByInternalName(fieldInternalName);
  }
}

Which can be called using code like this:

site.RootWeb.Lists["Pages"].Fields.CreateLookupField("My Lookup Field", "MyLookupField", lookupTargetField, true, f => 
{
  f.AllowMultipleValues = true;
  f.Required = true;
});

Advantages:

  • This method allows a lookup field to target a list outside the current web. Most methods (particularly CAML oriented ones) don’t as they rely on setting List to a web relative URL rather than a list ID.
  • This method allows any number of field properties to be configured in a strongly typed fashion by way of a lambda (see the final advantage below). And Update() is called automatically afterwards.
  • This method uses AddFieldAsXml() instead of Add() or AddLookup(). This means:
    • You can add a list field to all content types available in that list. Basically, without extra work, the latter two methods cannot be used to achieve what you’d accomplish in the UI by ticking the below box:
      [Image: the relevant checkbox in the list column creation UI]
    • It can be used to add custom field types – though this would only matter if you were subclassing SPFieldLookup. More generally speaking, a CreateField() wrapper (one which can create any type of field) should call AddFieldAsXml() for this reason.
  • This abstracts away lookup/CAML nastiness. E.g. it’s not sufficient to supply the list ID – despite its uniqueness, one must still provide the LookupWebId. But callers of this method need not worry about such SP idiosyncrasies.
  • It uses a programmatic approach – some will disagree here, but one day I’ll write a blog post which explains why I consider this approach superior. Of course, this is a disadvantage if in your codebase, you do everything via XML.

Disadvantages:

  • The lookup target list must exist prior to calling this. But this is most likely a non-issue – if you’re adding a lookup field programmatically, you’re probably also provisioning the list programmatically… therefore, you have control over when this happens.
  • This method will only add a lookup to existing content types. Content types added after the fact will need to be updated.

Posted in C#, SharePoint

And on the Seventh Day, Monkeys Created SharePoint…

So lately at work I’ve been developing some reasonably complex document/picture library driven components, resulting in me writing some SPDocumentLibrary/SPFolder helpers for our SharePoint framework. And there’s nothing like a bit of alone time with the object model to remind you just what a monstrous abortion it is… it’s actually quite surreal how much lunacy Microsoft has managed to pack into SP.

Now, let’s play a game! Document/picture libraries consist of folders and files. The top level folder is the root folder which of course does not have a parent… so if you were to invoke the SPFolder.ParentFolder property on a root folder, what would you expect in return? Is it:

    1. An exquisite exception, baked with love from SharePoint? This isn’t a bad guess as SP does tend to be rather ‘exception happy’.
    2. Null
    3. Uh… refrigerator!

Option 2 is the obviously preferable answer, but option 1 would be adequate (provided that there was some sort of HasParentFolder property/method in place). But of course (!), the answer is option 3. MSDN says that the parent folder of a root folder is itself… familial weirdness aside, it should then hold that SPDocumentLibrary.RootFolder.UniqueID == SPDocumentLibrary.RootFolder.ParentFolder.UniqueID. But this invariant doesn’t hold and it’s actually far worse… calling ParentFolder on a root folder really gives you a peculiar half-baked SPFolder instance whose ParentList is zeroed out despite its Exists property being set to true and whose parent folder is itself. Hmmm…

Conversely, if you call SPWeb.GetFolder(strUrl) for a non-existent folder, you’ll get a ‘properly’ half-baked SPFolder instance with Exists set to false… I mean they wouldn’t do anything completely outrageous like simply returning null now would they?

Conversely (in another dimension), invoking SPFolder.SubFolders["IDontExist"] will result in an exception being raised. Make up your mind!

I’ve come across plenty of other funniness in SPFolder alone, but I’ll omit it for now as I’m trying to keep my posts short nowadays (and I’ve blogged about object model oddities before). Most of the object model reads like it was written by capricious bipolar developers who have never seen an API in their lives. And since it’s strangely comforting to hear about other people in the same boat with the same gripes, here’s some literature :D

http://adrianoconnor.net/2011/01/i-hate-developing-sharepoint-a-rant/

http://www.stum.de/2010/07/16/is-the-sharepoint-object-model-too-weak-for-excellent-applications/

http://www.idiotsyncrasies.com/2007/11/note-to-self-sharepoint-queries.aspx

http://readmystuff.wordpress.com/2009/08/01/top-5-software-abominations-of-all-time/

Posted in Rants, SharePoint

Strongly Typed URLs in SharePoint

So it seems that the current flurry of advancement in our SP framework at work is leading me to write another topical blog post. For the longest time (at least a year), I’d been wanting to improve the way we handle URLs during site collection authoring – I’d go as far as to say the user experience for content authors was even a little hostile. For something as conceptually simple as pointing a web part at a list, the content author workflow would be:

  1. Edit the page on which the web part resides
  2. Encounter a web part property which has the words ‘Site Collection Relative List URL’ in the name.
  3. Call upon Jedi training (the product training manual we present to them) and recollect the format of a site collection URL and how a list URL can be obtained.
  4. … Find the list in question, go to its Settings page and snag the list URL (in absolute form) from there.
  5. Take the URL, manually make it site relative, and paste that into the web part property textbox.

So when more than one content author complained about this process, I had the perfect excuse to go off and spend time making it right. And it sure did take time, because what I had to do was fairly obvious and simple in idea, but painful in execution (yay, SharePoint!). I developed an EditorPart (a custom authoring UI for web parts) as well as a custom SP field, both of which present a similar UI which allows content authors to either enter an absolute URL or to select a site collection resource (e.g. a list, web, page, picture etc.) via the OOTB asset picker. These ideas are not new and have been implemented by others in the past with varying degrees of success. But it’s not those I want to write about – I want to write about the infrastructure behind them, the URL classes which back them.

Normally in web development, one is concerned with absolute and server relative URLs and not much else. But in SharePoint land, it often makes sense to have the notion of a site collection relative URL; a URL like ~/MyWeb/Pages/default.aspx which refers to a resource in a site collection and is portable across site collections, regardless of whether they’re root site collections or mounted on a managed path (e.g. /sites/sc). Previously we represented all URLs as strings from the moment they’re born (at the point of user input or in code) until the moment they’re used to accomplish some data access or are reified through rendering. We generally had little use for the cruddy .NET Uri class.

As part of the above tasks, I went off and created a bunch of URL classes to represent all URL types and input schemes. As one might expect, this included types like AbsoluteUrl, SiteRelativeUrl and ServerRelativeUrl, but also input schemes like:

  • AbsoluteOrSiteRelativeUrl – for the scenario described above, where the user can enter one or the other.
  • AbsoluteOrServerRelativeUrl – useful for dealing with many SP controls like the publishing image field, which return URLs which are absolute (if external) or server relative (if the user has picked an image in the site collection)

Each is an immutable type which takes a string in its constructor and balks if it’s malformed. The idea is that you create these from strings as early as possible and reify them as late as possible. Having URLs which are strongly rather than stringly typed is advantageous for these reasons:

  • You instantly know what type of URL you’re dealing with – there’s no repeated dodgy detection code which is invoked on magic strings.
  • One can write succinct well-defined conversion methods between different URL types. Furthermore, these methods (as well as your URL utility methods) are simply defined as instance or extension methods on the types, so they’re easy to find.
  • The concept of nullity is unambiguous. If a SiteRelativeUrl object is null, it’s conceptually null; if it’s not null, it’s conceptually non-null. Compare that with a string which represents the same thing: a null string obviously corresponds to a null URL, but what is an empty string? Does it signify the absence of a URL? Or is it a URL which corresponds to the root of the site collection or the root of the current context?
  • Variable names become less ambiguous/confusing/verbose – the C# type keeps track of the URL type rather than the variable name… so rather than an absoluteOrSiteRelativeWebUrl of type string, you have a webUrl of type AbsoluteOrSiteRelativeUrl
  • The type system stops you from attempting nonsensical conversions and assignments.
  • The value of the URL is handled by code. Previously we had defensive code like siteRelativeUrl.EnsureStartsWith("~") scattered everywhere. This existed firstly to correct content authors’ input and coerce it into the expected form, but also because, five layers deep in code when passing around magic strings, it wasn’t always clear whether the URL had already been properly defined/sanitised at that point. It’s easy to forget or not know, especially when you consider that one can arrive at a point via multiple code paths which may or may not do their own checking. So it became easier to be defensive absolutely everywhere to be safe. This sort of fuzziness and lack of precision disappears when you use types which automatically verify integrity for you.
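
To make the idea concrete, here is a minimal sketch of what two such types might look like. It is not our actual implementation: the validation rules (a site collection relative URL is "~" or starts with "~/", a server relative URL starts with "/") and the ToServerRelative method name are assumptions made purely for illustration.

```csharp
using System;

// Hypothetical sketch of an immutable, validated server relative URL.
public sealed class ServerRelativeUrl
{
    public string Value { get; }

    public ServerRelativeUrl(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        // Assumed rule: server relative URLs start with a forward slash.
        if (!value.StartsWith("/", StringComparison.Ordinal))
            throw new FormatException("Not a server relative URL: " + value);
        Value = value;
    }

    public override string ToString() => Value;
}

// Hypothetical sketch of an immutable, validated site collection relative URL.
public sealed class SiteRelativeUrl
{
    public string Value { get; }

    public SiteRelativeUrl(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        // Assumed rule: either "~" (the site collection root) or "~/...".
        if (value != "~" && !value.StartsWith("~/", StringComparison.Ordinal))
            throw new FormatException("Not a site collection relative URL: " + value);
        Value = value;
    }

    // The conversion lives on the type itself, so it is easy to find.
    // siteCollectionPath is the server relative path of the site collection,
    // e.g. "/" for a root site collection or "/sites/sc" for a managed path.
    public ServerRelativeUrl ToServerRelative(string siteCollectionPath)
    {
        if (siteCollectionPath == null) throw new ArgumentNullException(nameof(siteCollectionPath));
        string root = siteCollectionPath.TrimEnd('/'); // "/" becomes ""
        string rest = Value.Substring(1);              // drop the leading "~"
        string combined = root + rest;
        return new ServerRelativeUrl(combined.Length == 0 ? "/" : combined);
    }

    public override string ToString() => Value;
}
```

With these in place, new SiteRelativeUrl("~/MyWeb/Pages/default.aspx").ToServerRelative("/sites/sc") yields a ServerRelativeUrl, not a string, and passing "/MyWeb" to the SiteRelativeUrl constructor fails immediately instead of five layers down.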

Before embarking upon writing these URL classes (and converting our entire sprawling codebase to use them), I was a little skeptical as to how much it might achieve and whether it would buy us enough to justify its existence. I’m generally a minimalist and am not inclined to vomit forth a plethora of sugar-coated OOP classes or impose design patterns which unnecessarily complicate the required mental model of a developer reading my code. But I think the benefits are manifold, and as this re-architecture has only recently come to fruition, I’m interested to see how my fellow devs find it.

Posted in ASP .NET, C#, SharePoint, Web

SharePoint Pain Points

Having worked with the developer nightmare that is Microsoft SharePoint for a couple of years and being one of the architects of a framework which sits on top of it, I’ve come to know a lot about various parts of it. SharePoint is notorious for being awful to work with, and much of the effort which has gone into our framework has been around making it easier, faster and less painful to use. Rather than an all out rant, I thought I’d post about some of the shadier parts of SharePoint, with war stories interspersed for… posterity? Comedy? I don’t know…

PublishingWeb.IsPublishingWeb()

Let’s start off with my favourite one! IsPublishingWeb() seems like an honest enough method right? You call it on an SPWeb to figure out if it is a publishing web or not. Simple! And it always works… except for when it doesn’t. Turns out that it sometimes returns the wrong result and it’s non-deterministic about the whole shebang.

We tend to provision site collections using object model code in a feature activation event-receiver. About a year ago, a colleague and I spent a couple of days trying to figure out why our provisioning code was failing in one particular environment, but worked everywhere else. We tried everything, thinking it was something complex like a fundamental flaw in our deployment scripts or an issue related to SP multi-tenanted environments. Nope, it was simply some overly defensive code which called this method as a sanity check – unnecessary since we actually knew we were dealing with a publishing web, but understandable nonetheless. Having gone completely around the twist, we whipped up a console app which would create publishing webs and call this method afterwards, and we executed two instances of it simultaneously against our, until then, untainted environment. We observed that under concurrent site creation, this bug chooses to rear its ugly head. I think this was the point when I lost what little faith I had in the SP object model.

Linq to SharePoint

Linq to SharePoint is Microsoft’s attempt at providing a more modern way of querying SP resources without paying the performance penalty of the object model, and without resorting to CAML. Seems like a good idea, right? In the past, many of our SP sites made use of this technology for reading data from lists in a very simple manner and it worked fine for the most part. But sporadically, and usually once every month or two, these reads would mysteriously begin to fail with a horrible stack trace and it could only be resolved by recycling app pools. For the longest time, Microsoft ignored our complaints blaming our setup and code, despite the fact that it has been observed by other people too. We simply ended up replacing all L2SP code with object model code – problem solved! Avoid like the plague.

Custom Field Types

I’ve recently gone through the ordeal of implementing a custom SP field type. If you’ve ever done the same, and you’ve tried to add properties to it, you will also have tried to make use of SPField.GetCustomProperty() and SetCustomProperty(). You will also have realised that the bloody methods don’t work!!! There’s a variety of workarounds, some of which involve doing crazy stuff like storing property values temporarily in thread local storage. Needless to say, one should not have to do this!!! It really is quite amazing that M$ developers manage to botch something as simple as storing key/value data against an object, but I suppose incompetency knows no bounds.

Web Provisioned Event Receiver

We make heavy use of event-receivers to initialise SPWebs after creation. There’s a funny bug where a WebProvisioned event-receiver will be called twice if you create the web through object model code, but once if you create it through the UI. Obviously the desirable behaviour is generally to execute it once, and this can be achieved by setting SPWebEventProperties.Cancel to true.

SPWeb AllProperties vs Properties

You can store key/value data against an SPWeb in two ways: via the AllProperties hash table or via the Properties SPPropertyBag. Now for some godawful reason, SPPropertyBag internally lowercaseifies the key you use. Microsoft has declared Properties to be a legacy thing, presumably because of the above weirdness and maybe because of other things too? Unfortunately, when they introduced AllProperties, they also introduced some funny interactions between the two, leading to some bizarre behaviour which is exhibited depending on whether your keys are lower case and on the order in which you update SPWeb and SPWeb.Properties. Someone has blogged about it in detail here.

CssRegistration.After

The SharePoint CssRegistration control allows you to render CSS links. I originally used it as it exposes an After property which purportedly allows you to specify that one .css file should be included after another – a common use case is to include your own CSS after corev4.css. IIRC, this works fine for two CSS files but fails for more. In fact, when using this property to impose ordering on >2 CSS links, the result seems completely arbitrary! If it was at least deterministically wrong, then it would be correctable!

Array prototype extension

Once upon a time, I decided to extend the JavaScript array prototype with an indexWhere function… basically a function which would return the index of the element (if any) in an array which satisfies a predicate. Turns out SharePoint doesn’t like this, and it triggers mysterious errors from within the bowels of its JS. I’ve since learned that extending built-in prototypes is generally not a good idea, but it’s not unheard of either. All of this essentially means that you can’t easily use libs like Modernizr and Prototype with SP as they use this technique.

SPWebConfigModification

I’ve blogged about the horrors within previously.

MSDN

Yup, MSDN is a disaster. Documentation on the SharePoint object model is at best scant and vague, and quite often it’s completely incorrect and full of lies. Let’s have a look at the docs for a fundamental method, SPWeb.GetList(string strUrl):

According to Visual Studio => strUrl: A string that contains the site-relative URL for a list, for example, /Lists/Announcements.

According to MSDN => strUrl: The server-relative URL to the root folder of a list, such as /sites/sitecollection/subsite/Lists/Announcements.

Problems here! Firstly, the VS and MSDN docs contradict each other. Secondly, the VS example of a site-relative URL is incorrect – M$ convention is that they do not begin with forward slashes. Thirdly, both are wrong. I’ve written unit-tests against the SP object model which tell me that GetList() works with (at least) server-relative URLs and absolute URLs. This is merely one example; MSDN is rife with this rubbish… is it any surprise that most SP devs resort to a decompiler and skip the docs altogether?

Posted in SharePoint

Thoughts on Knockout JS

I’m lazy and I think it’s high time I revived this blog so here goes. In the past six months at work, I’ve used Knockout JS in a couple of projects, including a (very) rich client app where I had the freedom to go nuts and develop it however I desired. Despite lack of experience in the area, I’d basically already made up my mind that I wanted to use some sort of templating framework and that I’d be mad not to, given that higher level declarative code is generally more succinct and maintainable than reams of low level spidery DOM manipulation. Without a huge amount of deliberation, I opted for Knockout.

I’ve found Knockout to be an intuitive framework which does a few key things and does them well. So here are some of my scatterbrained thoughts about it, coming from the perspective of someone whose previous JS framework experience included only jQuery, jQuery UI and Underscore. Won’t make much sense to anyone who hasn’t used it.

Overall architecture
Knockout JS is often described as an MVVM framework but really, it’s more like a VVM framework. The view model is a JS object consisting of KO observables, the view is HTML decorated with KO data-bind attributes etc., and the model is generally your own ‘stuff on the server’, so to speak. Commonly, changes made by users via the UI trigger AJAX GETs/POSTs from client to server. Most web apps have some sort of initial model which resides on the server and should be displayed to the user upon page load. KO doesn’t prescribe any specific method of transforming this server-side model into a KO view model, but it seems to me that there are at least three ways:

1.) Do not render the initial model server-side… render nothing, and have your JS fetch the initial model upon page load via AJAX calls. This is elegant in the sense that all model reading/writing can be done client side in the same way via the same AJAX calls, whether it’s the initial view model load or anything which happens thereafter… the server does not require a view model and does not have to render anything. But this might not be realistic for large models where it will cause noticeable delay.

2.) Server-side render a representation of the model into a hidden field (say as JSON). Have your JS digest and interpret this model and create a view model accordingly. This might provide a better user experience for large complex models but of course requires you to write JS to translate the JSONified server-side model into a KO view model. A variation on this would be to construct a view model server side which is identical to your KO view model, and serialise it into a hidden field. Then on page load, the JS merely needs to deserialise this into an ‘initial view model’ without additional transformation.

3.) Server-side render the markup corresponding to the original model. *Somehow* reverse engineer it into a view model (essentially, do the opposite of what data binding does). Perhaps there are plugins out there for this?

In my app, I took the approach of delegating as much as possible to KO. That meant using built in bindings like click/checked etc. instead of their JS/jQuery counterparts, and writing KO bindings to wrap jQuery Sortable/Draggable etc., leaving very little raw jQuery event-handling code. Given my skepticism around the mixing of frameworks, I was initially worried that I’d hit a brick wall, but it turned out fine in the end. If I were to do things again though, I’d most likely handle mouse events outside KO. I’m not sure what most people tend to do here…

Unlike some other frameworks, KO does not impose any sort of code organisation/structure. While I prefer this flexibility, one of my colleagues was a bit apprehensive about it. And understandably so, as one can see that it would be quite easy to write some very messy and unnecessarily complex view models. However, there are KO plugins out there which deal with this sort of thing. I chose a very OOP-like approach when writing my view model, with my types (main app.js, view model, helpers) defined across many JS files using the module pattern.

Performance
Generally KO’s performance tax was never an issue for me, but one place where it did become an issue was with KO templates. I ran into an issue where KO would over-zealously re-render an entire collection of objects simply because an unimportant observable in one of them changed. It makes sense from KO’s point of view- it follows a simple rule which states that it should re-render something when anything it conceivably depends on has changed. But it’d be nice if you could control how KO does this. And since it’s easy to unintentionally create complex chains of dependencies, it’d be neat if KO provided a visualisation or debug dump of dependencies.

Composite Bindings
One obvious feature missing from KO is composite bindings. There were a few places in my markup where I would apply the same combination of five or so KO bindings with mostly the same arguments. This is problematic not only because it bloats markup, but also because these sets of identical bindings must be maintained in several different places in parallel. One slightly kludgy solution would be to abstract it away as a reusable HTML template. Another solution (which I attempted) was to roll my own custom binding which applies the bindings itself… this /almost/ worked but I had issues with one of the bindings which I was manipulating in hacky unsupported ways. The ideal solution would be a composite binding.

Computables vs Manual Subscriptions
One gotcha is that computed observables are often evaluated by Knockout itself earlier than you might expect. And if those observables rely on the view model being properly initialised prior to evaluation, this can lead to funny errors when it’s not. This and other things led to me sometimes converting computables into observables fuelled by manual subscriptions.

Posted in JavaScript, Web

Why All Developers Should Learn Functional Programming

Last Friday, I turned up to work completely unaware that I’d be spending the rest of the day at the CodeMania programming conference – it was a nice surprise to say the least. The most poignant presentation for me, was Ivan Towlson’s ‘How to Not Write a For Loop’, as he touched on many ideas I’ve been thinking about in the last year since I embraced functional programming myself. Through examples written in imperative C#, Linq and F#, he put forward a fairly compelling argument for why programmers should consider using functional style constructs instead of just banging out the same old for/foreach loops to solve every problem in existence. Compelling enough to prompt me to blog about my strong opinions on the matter.

Functional programming has existed for over half a century but has never really gained much traction until relatively recently. There are niche areas where it clearly shines (real time systems, parallelism etc.), but for the most part, it has been largely confined to obscure languages and dusty academic tomes. Part of the reason for that is that in the past, it has required programmers to step away from the mainstream languages they tend to be comfortable with, to drop everything they know about programming in the traditional imperative style, and to adopt a completely new mindset, in the wake of a steep learning curve full of monads and other mathematical morsels.

But then something great happened… Microsoft released Linq. Now, criticise them all you like, but Linq really is a very well thought-out library. It’s not that C# is the first popular language to incorporate functional-style operations (Python has had them for aeons) or that Linq is even the best example. It’s the fact that Microsoft exposed a massive sector of the programming world to a new way of thinking by injecting functionally inspired, yet accessible constructs into C#. And Linq is an utter joy to use.

So here are a couple of everyday operations which are far better expressed in a functional manner – the operation of finding a human called Aberforth, and the operation of finding all humans who are unlucky enough to have one-eared cats. Firstly, in C# imperative:

Human humanCalledAberforth = null;
foreach (Human human in humans)
{
  if (human.Name == "Aberforth")
  {
    humanCalledAberforth = human;
    break;
  }
}

var unfortunateHumans = new List<Human>();
foreach (Human human in humans)
{
  foreach (Cat cat in human.PetCats)
  {
    if (cat.NumEars == 1)
    {
      unfortunateHumans.Add(human);
      break; // add each human once, even if they own several one-eared cats
    }
  }
}

And the Linq version is below. To understand it, you need to know the following:

  • a => b is lambda shorthand for a function which takes one argument (a) and returns b.
  • Since the below Linq methods are functional in nature, they take some input, perform some computation, and return output, without altering the input which is treated as immutable.
  • FirstOrDefault() searches through a sequence and finds the first element which satisfies a predicate, or returns the default value (null in this case) if it can’t find one.
  • Where() filters a list, based on a predicate. Any() returns true if there are any elements in a sequence which satisfy the specified predicate.

var humanCalledAberforth = humans.FirstOrDefault(
  human => human.Name == "Aberforth");
var unfortunateHumans = humans.Where(
  human => human.PetCats.Any(cat => cat.NumEars == 1) );

Observe that the imperative example is huuuge. A random programmer cannot glance at it and instantly understand its intention – they firstly must mentally process about five times as much code. Furthermore, like most list operations written in an imperative style, it’s packed with boilerplate ifs and loops and temporary variables which aren’t at all representative of the problem the code is actually solving. On the other hand, as soon as I see FirstOrDefault() in the Linq version, I know that the entire operation returns the first human which satisfies some condition… and then I look at the condition, and I’m done.

Essentially, what you realise when first coding in this style is that the majority of things you usually would do with loops can be better represented in terms of higher level standard list operations (projecting, filtering, ordering, aggregation etc.), which are conveniently first class citizens in functional languages/libraries. So for the same reason that you use if and for instead of goto, and foreach instead of for, it makes sense here to use functional list operations instead of overly general Computer Science 101 control flow primitives. The tiny cost is that you have to invest a little time learning about them.
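
As a rough illustration of those standard list operations, here is a small self-contained sketch which reuses the Human and Cat shapes from the example above. The class definitions and the method names CatOwnerNames and TotalEars are made up for this post, not taken from any real codebase.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-ins for the types used in the earlier example (assumed shapes).
public class Cat
{
    public int NumEars { get; set; }
}

public class Human
{
    public string Name { get; set; }
    public List<Cat> PetCats { get; set; } = new List<Cat>();
}

public static class ListOperations
{
    // Filtering, ordering and projecting: the names of all cat owners, sorted.
    public static List<string> CatOwnerNames(IEnumerable<Human> humans) =>
        humans.Where(human => human.PetCats.Any())
              .OrderBy(human => human.Name, StringComparer.Ordinal)
              .Select(human => human.Name)
              .ToList();

    // Flattening and aggregation: the total number of cat ears owned by everyone.
    public static int TotalEars(IEnumerable<Human> humans) =>
        humans.SelectMany(human => human.PetCats)
              .Sum(cat => cat.NumEars);
}
```

Each method reads as a pipeline of named operations, and neither needs a temporary list or a loop counter. The imperative equivalents would each need at least one loop plus bookkeeping variables.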

Now I’m not insisting that all programmers write absolutely everything in a functional style, as there are many cases when an imperative style leads to more comprehensible (or more performant) code. I simply think it’s beneficial that programmers learn either a purely functional language, or at least a functional style library. Either will change the way you think about programming. I know that learning a bit of Python and Haskell, and using Linq on a daily basis has permanently warped my mind for the better.

Posted in C#, Functional Programming

JSON DateTime Serialisation Gotchas

DateTimes are a bit nasty, really. They appear deceptively elementary and unthreatening, leading generations of programmers to misuse them, or even grossly underestimate them and attempt to roll their own datetime libraries, only to end up in no man’s land with war stories galore. At work, we recently encountered a problem where the dates on our front-end lagged the dates in our database by a day. This was immediately indicative of a time zone issue, but we weren’t sure exactly where it was coming from.

It turns out that the discrepancy was partially a result of our MVC3 web service serialising datetimes using its default JavaScriptSerializer, and then our front-end deserialising them using DataContractSerializer – they use different formats. But worse still, after doing a bit of testing, I discovered that JavaScriptSerializer’s serialisation and deserialisation operations aren’t inverse. That’s to say that if you use it to serialise a DateTime of a non-UTC kind (i.e. local or unspecified), then deserialise the result, you’ll end up with a different date to the one you started with.

Here’s what happens when you serialise a DateTime of 1/1/2000 12am with its Kind set to UTC and Local respectively. A bigger number means a later date/time. Note that NZ is UTC+13 in January (daylight saving time), so 1/1/2000 Local is equivalent to 31/12/1999 11am UTC. As this is earlier than 1/1/2000 UTC, the Local number is smaller than the UTC one.

        JavaScriptSerializer        DataContractSerializer
UTC     \/Date(946684800000)\/      \/Date(946684800000)\/
Local   \/Date(946638000000)\/      \/Date(946638000000+1300)\/

Prior to serialisation, both serialisers convert the DateTime to UTC if it’s Local/Unspecified – and in the case of NZ, this means subtracting time. The difference is that the DataContractSerializer keeps track of this subtraction by way of a timezone annotation, so that you can reliably deserialise it later. And it shows, because the DataContractSerializer will spit your original DT back at you if you point it at either of its serialised forms above. The same can’t be said for the JavaScriptSerializer, which treats everything as UTC, and hence, only succeeds with UTC DTs.
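
The numbers in the table are milliseconds since the Unix epoch (1 January 1970, midnight UTC), and the arithmetic can be checked without involving either serialiser. The sketch below uses DateTimeOffset with an explicit offset so that it doesn’t depend on the machine’s time zone. The EpochMillis class and its method name are hypothetical, and ToUnixTimeMilliseconds() requires .NET Framework 4.6 or later.

```csharp
using System;

public static class EpochMillis
{
    // Milliseconds since 1970-01-01T00:00:00Z for midnight on 1 January 2000,
    // interpreted at the given offset from UTC.
    public static long MidnightJan2000(TimeSpan utcOffset) =>
        new DateTimeOffset(2000, 1, 1, 0, 0, 0, utcOffset).ToUnixTimeMilliseconds();
}
```

MidnightJan2000(TimeSpan.Zero) gives 946684800000 and MidnightJan2000(TimeSpan.FromHours(13)) gives 946638000000. The difference is 46800000 ms, i.e. exactly the 13 hours which get subtracted.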

This won’t be an issue for people who always convert their DateTimes to UTC before sending them across the wire. It caught us out because we were transmitting Unspecified DateTimes which are treated the same as local dates, causing the JavaScriptSerializer to convert them to UTC, subtracting 13 hrs in the process. Since we wanted to pass around semantically unspecified DateTimes (i.e. unknown time zone), it didn’t make sense to convert to UTC. So we decided to instead serialise them as strings in the format YYYY-MM-DDThh:mm:ss, a variation on a W3C suggested format found here.
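
For what it’s worth, a round trip through that string format might look something like the sketch below. The class and method names are invented for illustration and this isn’t our production code. Note that the W3C notation YYYY-MM-DDThh:mm:ss translates to the .NET custom format string yyyy-MM-dd'T'HH:mm:ss, since in .NET a lower case hh means a 12-hour clock.

```csharp
using System;
using System.Globalization;

public static class UnspecifiedDateTimeFormat
{
    // .NET equivalent of the W3C-style YYYY-MM-DDThh:mm:ss (24-hour clock).
    private const string Format = "yyyy-MM-dd'T'HH:mm:ss";

    // Writes the date/time fields exactly as they are; no time zone conversion.
    public static string Serialise(DateTime value) =>
        value.ToString(Format, CultureInfo.InvariantCulture);

    // The text carries no offset, so the result has Kind == Unspecified.
    public static DateTime Deserialise(string text) =>
        DateTime.ParseExact(text, Format, CultureInfo.InvariantCulture, DateTimeStyles.None);
}
```

Because neither direction applies an offset, serialising a DateTime and then deserialising the result returns the same date and time you started with, which is exactly the property JavaScriptSerializer lacked for us.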

Posted in C#, Web