Skip to content

you've been HAACKED - Phil Haack
Syndicate content you've been HAACKED
...and you like it.
Updated: 3 hours 57 min ago

Hazards of Converting Binary Data To A String

Mon, 01/30/2012 - 08:15

Back in November, someone asked a question on StackOverflow about converting arbitrary binary data (in the form of a byte array) to a string. I know this because I make it a habit to read randomly selected questions in StackOverflow written in November 2011. Questions about text encodings in particular really turn me on.

In this case, the person posing the question was encrypting data into a byte array and converting that data into a string. The conversion code he used was similar to the following:

string text = System.Text.Encoding.UTF8.GetString(data);

That isn’t exactly their code, but this is a pattern I’ve seen in the past. In fact, I have a story about this I want to tell you in a future blog post. But I digress.

The infamous Jon Skeet answers:

You should absolutely not use an Encoding to convert arbitrary binary data to text. Encoding is for when you've got binary data which genuinely is encoded text - this isn't.

Instead, use Convert.ToBase64String to encode the binary data as text, then decode usingConvert.FromBase64String.

Yes! Absolutely. Totally agree. As a general rule of thumb, agreeing with Jon Skeet is a good bet.

Not to give you the impression that I’m stalking Skeet, but I did notice that this wasn’t the first time Skeet answered a question about using encodings to convert binary data to text. In response to an earlier question he states:

Basically, treating arbitrary binary data as if it were encoded text is a quick way to lose data. When you need to represent binary data in a string, you should use base64, hex or something similar.

This perked my curiosity. I’ve always known that if you need to send binary data in text format, base64 encoding is the safe way to do so. But I didn’t really understand why the other encodings were unsafe. What are the cases in which you might lose data?

Round Tripping UTF-8 Encoded Strings

Well let’s look at one example. Imagine you’re receiving a stream of bytes and you store it as a UTF-8 string and pop it in the database. Later on, you need to relay that data so you take it out, encode it back to bytes, and send it on its merry way.

The following code simulates that scenario with a byte array containing a single byte, 128.

var data = new byte[] { 128 };
string text = Encoding.UTF8.GetString(data);
var bytes = Encoding.UTF8.GetBytes(text);

Console.WriteLine("Original:\t" + String.Join(", ", data));
Console.WriteLine("Round Tripped:\t" + String.Join(", ", bytes));

The first line of code creates a byte array with a single byte. The second line converts it to a UTF-8 string. The third line takes the string and converts it back to a byte array.

If you drop that code into the Main method of a Console app, you’ll get the following output.

Original:      128
Round Tripped: 239, 191, 189

WTF?! The data was changed and the original value is lost!

If you try it with 127 or less, it round trips just fine. What’s going on here?

UTF-8 Variable Width Encoding

To understand this, it’s helpful to understand what UTF-8 is in the first place. UTF-8 is a format that encodes each character in a string with one to four bytes. It can represent every unicode character, but is also backwards compatible with ASCII.

ASCII is an encoding that represents each character with seven bits of a single byte, and thus consists of 128 possible characters. The high order bit in standard ASCII is always zero. Why only 7-bits and not the full eight?

Because seven bits ought to be enough for anybody:

When you counted all possible alphanumeric characters (A to Z, lower and upper case, numeric digits 0 to 9, special characters like "% * / ?" etc.) you ended up a value of 90-something. It was therefore decided to use 7 bits to store the new ASCII code, with the eighth bit being used as a parity bit to detect transmission errors.

UTF-8 takes advantage of this decision to create a scheme that’s both backwards compatible with the ASCII characters, but also able to represent all unicode characters by leveraging the high order bit that ASCII ignores. Going back to Wikipedia:

UTF-8 is a variable-width encoding, with each character represented by one to four bytes. If the character is encoded by just one byte, the high-order bit is 0 and the other bits give the code value (in the range 0..127).

This explains why bytes 0 through 127 all round trip correctly. Those are simply ASCII characters.

But why does 128 expand into multiple bytes when round tripped?

If the character is encoded by a sequence of more than one byte, the first byte has as many leading "1" bits as the total number of bytes in the sequence, followed by a "0" bit, and the succeeding bytes are all marked by a leading "10" bit pattern.

How do you represent 128 in binary? 10000000

Notice that it’s marked with a leading 10 bit pattern which means it’s a continuation character. Continuation of what?

…the first byte never has 10 as its two most-significant bits. As a result, it is immediately obvious whether any given byte anywhere in a (valid) UTF‑8 stream represents the first byte of a byte sequence corresponding to a single character, or a continuation byte of such a byte sequence.

So in answer to the question of why does 128 expand into multiple bytes when round tripped, I don’t really know other than a single byte of 128 isn’t a valid UTF-8 character. So in all likelihood, the behavior shouldn’t be defined. it’s the Unicode Replacement Character used for invalid data (Thanks to RichB for the answer in the comments!).

I’ve noticed a lot of invalid ITF-8 values expand into these three bytes. But that’s beside the point. The point is that using UTF-8 encoding to store binary data is a recipe for data loss and heartache.

What about Windows-1252?

Going back to the original question, you’ll note that the code didn’t use UTF-8 encoding. I took some liberties in describing his approach. What he did was use  System.Text.Encoding.Default. This could be different things on different machines, but on my machine it’s the Windows-1252 character encoding also known as “Western European Latin”.

This is a single byte encoding and when I ran the same round trip code against this encoding, I could not find a data-loss scenario. Wait, could Jon be wrong?

To prove this to myself, I wrote a little program that cycles through every possible byte and round trips it.

using System;
using System.Linq;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        var encoding = Encoding.GetEncoding(1252);
        for (int b = Byte.MinValue; b <= Byte.MaxValue; b++)
        {
            var data = new[] { (byte)b };
            string text = encoding.GetString(data);
            var roundTripped = encoding.GetBytes(text);

            if (!roundTripped.SequenceEqual(data))
            {
                Console.WriteLine("Rount Trip Failed At: " + b);
                return;
            }
        }

        Console.WriteLine("Round trip successful!");
        Console.ReadKey();
    }
}

The output of this program shows that you can encode every byte, then decode it, and get the same result every time.

So in theory, it could be safe to use Windows-1252 encoding of binary data, despite what Jon said.

But I still wouldn’t do it. Not just because I believe Jon more than my own eyes and code. If it were me, I’d still use Base64 encoding because it’s known to be safe.

There are five unmapped code points in Windows-1252. You never know if those might change in the future. Also, there’s just too much risk of corruption. If you were to store this string in a file that converted its encoding to Unicode or some other encoding, you’d lose data (as we saw earlier).

Or if you were to pass this string to some unmanaged API (perhaps inadverdently) that expected a null terminated string, it’s possible this string would include an embedded null character and be truncated.

In other words, the safest bet is to listen to Jon Skeet as I’ve said all along. The next time I see Jon, I’ll have to ask him if there are other reasons not to use Windows-1252 to store binary data other than the ones I mentioned.

Tags: , , , ,


Categories: Blogs

Getting Older

Mon, 01/23/2012 - 19:22

Birthdays are a funny thing, aren’t they? Let’s look at this tweet for example,

It's @haacked's birthday. Give him crap about getting old.

No gifts, please. Especially not what Charlie suggests.

Of course I’m getting older. We’re all getting older. Every second of every day and twice on Monday. Every femtosecond even. Perhaps the only time we’re not getting older is the moment within a Planck time interval. But once that interval is up, yep, you’re older.

Yet people apparently live their lives completely oblivious to this fact until they’re next birthday comes along. As the chronometer slides the next number into place, the realization dawns, “Damn! I’m older!” What? You didn’t know this?!

Feeling Older

The odd thing to me is that I don’t really feel older, mentally. I mean, I consciously know I’m older, but I feel like there’s this smooth continuum from my first memory to now. While the things I spend time thinking about have changed, the way I think about others and about myself feels like it hasn’t changed. I’m the same person then as I am now, and that kind of blows my mind.

For example, I still think fart jokes are funny.

In my mind, old people tell you how they used to walk miles uphill both ways to get to school. But I realize that these days, old people tell you about how they used to have to use their phone to connect online at 1200 baud. And there was no internet!!! OMG! What the hell were we connecting to?

Rather than feeling older, I am observing the evidence that I’m older. For example, I used the word “baud” in this blog post. Another example is how injuries now take much longer to heal. I have two kids, a four year old and a two year old and I’m pretty sure that if I were to slice them clean in half, that’d only put them out of commission for a week. They’d heal up and have no scars! Meanwhile, if I get a paper cut on a finger I can pretty much kiss that finger goodbye. Write it off as a loss and start practicing typing with two bloody stumps for hands.

Getting Experienced

But it’s not just physical. I do notice that while I don’t feel older, I do have the benefit of many more years of experience to draw upon. But more importantly, I’m finally actually paying attention to that. Go figure.

Last week, we had our GitHub summit and Friday was our field trip day to a distillery then a bar. This was the night set aside to party hard. Which is amazing to me because the night before I’m pretty sure we as a company consumed enough alcohol to bring elephants to extinction.

But I drew upon my experience and took it easy because I had a flight early the next morning and I did not want to be sick on an airplane. Contrast this to a few years before at Tech-Ed Hong Kong when I was out with some local friends and at 5:00 AM I had to leave the bar early to catch a flight. For the first time in my life, I contemplated suicide.

Some might call that getting wiser. I call it pain avoidance.

Knowing Less

The other evidence of my getting older is that I know a lot less now than I did when I was younger. Certainly that can’t be true in the absolute sense since I don’t have alzheimer’s (that I’m aware of anyways). But I remember as a young programmer I knew everything!

I knew the right way to do all things in all situations with absolute conviction. But these days, I’m not so sure. About anything. All I have is the breadth of my experience and pattern matching at my disposal. Each new situation is simply a pattern matching exercise against my database of experience followed by an experiment to see if what I thought I knew produces good results.

The great thing about this approach is when you know everything, you have nothing to learn. But now, I’m constantly learning. Many of my experiments fail because many of my experiences are no longer relevant today. The world changes. Quickly. But each experiment is an opportunity to learn.

Staying Young

So yeah, I’m getting older, but I’ve found a loophole. Remember the kids I mentioned slicing in half? I’m not going to do that because I’m worried I’d end up with four of them then and two are already a handful.

These two do a great job of making me feel young because they will laugh at every fart joke I can come up with.

So thanks for all the birthday wishes on Twitter, Facebook, and elsewhere. Here’s to getting older!

Tags: , , ,


Categories: Blogs

Comparing Strings in Unit Tests

Sat, 01/14/2012 - 08:29

Suppose you have a test that needs to compare strings. Most test frameworks do a fine job with their default equality assertion. But once in a while, you get a case like this:

[Fact]
public void SomeTest()
{
    Assert.Equal("Hard \tto\ncompare\r\n", "Hard  to\r\ncompare\n");
}

Let’s pretend the first value in the above test is the expected value and the second value is the value you obtained by calling some method.

Clearly, this test fails. So you look at the output and this is what you see:

test-output

It’s pretty hard to compare those strings by looking at them. Especially if they are two huge strings.

This is why I typically write an extension method against string used to better output a string comparison. Here’s an example of a test using my helper.

[Fact]
public void Fact()
{
    "Hard  to\rcompare\n".ShouldEqualWithDiff("Hard \tto\ncompare\r\n");
}

And here’s an example of the output.

test-compare-output

At the very top, the assert message is the same as before. I deferred to the existing Assert.Equal method in xUnit (typically Assert.AreEqual in other test frameworks) to output the error message.

Underneath the existing message are headings for three columns: the character index, the expected character, and the actual character. For each character I print out the int value and the actual character.

Of course in some cases, I don’t print out the actual value. If I were to do that for new line characters and tab characters, it’d screw up the formatting. So instead, I special case those characters and print out the escape sequence in C# for those characters.

This makes it easy to compare two strings and see every difference when a test fails. Even the hidden ones.

This is a simple quick and dirty implementation available in a Gist. For example, it doesn’t do any real DIFF comparisons and try to line up similarities. That’d be a nice improvement to make at some point. If you can improve this, feel free to fork the gist and send me a pull request.

Tags: , , ,


Categories: Blogs

A Really Empty ASP.NET MVC 3 Project Template

Wed, 01/11/2012 - 01:33

In the ASP.NET MVC 3 Uservoice site, one of the most voted up items is a suggestion to include an empty project template. No, a really empty project template.

You see, ASP.NET MVC 3 includes an “empty” project template, but it’s not empty enough for many people. So in this post, I’ll give you a much emptier one. It’s not completely empty. If you really wanted it completely empty, just choose the ASP.NET Empty Web Application template.

The Results

I’ll show you the results first, and then talk about how I made it. After installing my project template, every time you create a new ASP.NET MVC 3 project, you’ll see a new entry named “Really Empty”

mvc3-empty-proj-template

Select that and you end up with the following directory structure.

mvc3-proj-template-expanded

I removed just about everything. I kept the Views directory because the Web.config file that’s required is not obvious and there’s special logic related to the Views directory. I also kept the Controllers directory, since that’s where the tooling is going to put controllers anyways. I also kept the Global.asax and Web.config files which are typically necessary for an ASP.NET MVC project.

I debated removing the AssemblyInfo.cs file, but decided to trim it down and keep it.

Building Custom Project Templates

I wrote about building a custom ASP.NET MVC 3 project template a long time ago. However, I’ve improved on what I did quite a bit. Now, I have a single install.cmd file you can run and it’ll determine whether you’re on x64 or x86 and run the correct registry script. The install.cmd and uninstall.cmd batch files are there for convenience and call into a PowerShell script that does the real work.

UPDATE 1/12/2012: Thanks to Tim Heuer, we have an even better installation experience. He refactored the project to output a VSIX file. All you need to do is double click the extension file to install the project template. I’ve uploaded the extension file to GitHub here.

I tried uploading it to the gallery, but it wouldn’t let me. I’ll follow up on that.

History

If you’re wondering why the product team hasn’t included this all along, it’s for a lot of reasons. There was (at least when I was there) internal debate about how empty to make it. For example, when you create a new project with my empty template, and hit F5, you get an error. Not a great experience for most people.

Honestly, I’m all for it, but there are many other higher priority items for the team to work on. So I figured I’d do it myself and put it up on GitHub.

Installation

Installation is really simple. If you like to build things from source, grab the source from my GitHub repository and run the build.cmd batch file. Then double click the resulting VSIX file. Be sure to read the README for more details.

If you don’t yet know how to use Git to grab a repository, don’t worry, just navigate to the downloads page and download the VSIX file I’ve conveniently uploaded.

Contribute!

Hey, if you think you can help me make this better, please go fork it and send me a pull request. Let me know if I include too little or too much.

I’ve already posted a few things that could use improvement in the README. If you'd like to help make this better, consider one of the following. :)

  • Make script auto-detect whether VS is running or not and do the right thing
  • Test this on an x86 machine
  • Write an installer for this

Let me know if you find this useful.

Tags: , , ,


Categories: Blogs

Recognition Compensation

Thu, 01/05/2012 - 23:26

Mary Poppendieck writes the following in Unjust Deserts (pdf), a paper on compensation systems (emphasis mine),

There is no greater de-motivator than a reward system that is perceived to be unfair. It doesn’t matter if the system is fair or not. If there is a perception of unfairness, then those who think that they have been treated unfairly will rapidly lose
their motivation.

Written over seven years ago, the paper is just as insightful and applicable today. For example, let’s apply it to the recent dust-up about the legitimacy and fairness of the Microsoft MVP Program.

I think the MVP program means well. It’s not trying to be a conspiracy or filch you of your just desserts. But if you think about the MVP program as a compensation system, it becomes very clear why people feel disillisioned.

What compensation am I talking about?

  1. An MSDN Subscription
  2. Privileged access to product teams and not yet public information (under NDA)
  3. A yearly summit which provides hotel rooms and access to product team members as well as a nice party.

Not only is it a compensation system, but the means by which compensation is doled out is perceived to be arbitrary and hidden. It’s a recipe for mistrust.

Intrinsic Motivations

Mary goes on to point out,

In the same way, once employees get used to receiving financial rewards for meeting goals, they begin to work for the rewards, not the intrinsic motivation that comes from doing a good job and helping their company be successful.

Someone asked me what I thought about the MVP program recently and I said I think Microsoft’s actually a great company, but I don’t think you should seek out recognition from Microsoft or any other corporation for your community contributions. I think that provides the wrong incentives to build community.

If you run an open source project, don’t do it to receive recognition from Microsoft. Or any other corporation for that matter (except maybe you’re own). Do it to scratch an itch! Do it because it’s fun. Do it to show cool stuff to your peers. Worry about their recognition more than some corporation.

If you answer questions about a technology on StackOverflow, do it because you enjoy sharing your knowledge with others (and you want the SO points!), not because it’s on a checklist to receive an MVP award.

Just as Mary points out, when you start to frame these activities as means to receive an extrinsic reward, you become disillusioned. So whether the program exists or not, we should strive on our part to not feel a sense of entitlement to the program and focus on our intrinsic motivations.

Fixing It

I covered what I think we should strive for. But what do I think Microsoft should do? Several things.

So far, I glossed over the the fact that recognition from Microsoft isn’t the only reason people want the award. There are material benefits. MVPs are part of a privileged group that gets early access to what Microsoft is doing, which might provide a real competitive advantage. Why wouldn’t you seek that out?

Open Development

Let’s tackle the first thing first, privileged “early access”. Well there’s one easy solution to that. Do you know why NuGet doesn’t have an “early access” program? Drew Miller nails it on Twitter:

Know how you avoid the need for a privileged group of folks under NDA that inevitably is seen as special and superior? Develop in the open.

NuGet sidesteps the whole question of a recognition program by developing in the open. The same is true for the Azure SDK. When active development occurs in a public repository, the whole concept of “early access program” makes no sense.

Not only that, but recognition in an open source project doesn’t come from some corporation. It comes from the maintainers of a project and from the folks in the project’s community that you’ve helped. You can point to the reason people are recognizing you.

Better Free Tools

The other reason folks want an MVP is to have access to the professional tools. Most companies will easily shell out the money for this, but if you’re a hobbyist or open source developer, it’s a lot of money to shell out.

In this regard, I think Microsoft should either make its free Express tools have more pro features such as allowing Visual Studio Extensions and multi-project support, or simply make Visual Studio Professional free, and focus on developing the ecosystem that gets a boost when everyone has better tools to build on your platform. Everyone wins.

Focused recognition

I don’t think it’s inherently wrong for a company to recognize people’s contributions. But it has to be done in a way so that it’s seen as icing and not an entitlement or cronyism.

It’s darn near impossible to conceive of a recognition program that would be seen as universally fair and recognizes something so broad as “community contributions”. A better approach might be to have multiple smaller recognition programs. Focus on removing obstacles that get in the way of people inherently doing the things that’s good for all of us. For example, it benefits Microsoft’s when:

  1. People are helping solve each other problems on the forums.
  2. People are giving talks about their products.
  3. People are building software (open and not) on their platforms.
  4. Probably some others I’m forgetting…

For what it’s worth, I think #1 is already solved by StackOverflow. Just move your forums there and be done with it. After all, nobody gets upset when they answer a question on Twitter and don’t get StackOverflow points.

Recognize!

Will Microsoft change the program? I have no idea. I’m not really all that concerned about it really. In the meanwhile, we can recognize folks who make our lives better. We don’t need to wait for Microsoft to do so. I’ve used a huge swath of open source projects that have made my development smoother. I’ve found many great answers in forums, blog posts, StackOverflow that unblocked me.

Moving forward, I’ll make an extra effort to thank the people responsible for those things. Maybe there’s some projects and folks you should recognize. Go for it! It’ll feel good.

Disclaimer: I was a former Microsoft MVP for about three months before joining Microsoft as an employee. I’m now an employee of GitHub. My opinion here is simply my own opinion and does not necessarily represent the opinion of any employers past, present, and future. Nor does it represent the opinion of my dog, because I don’t have one, nor anyone in my neighborhood.

Tags: , ,


Categories: Blogs

Structuring Unit Tests

Mon, 01/02/2012 - 06:08

In the past, I’ve tried various schemes to structure my unit tests but never fell into a consistent approach. Pretty much the only rule I had (which I broke all the time) was to write a test class for each class I tested. I would then fill that class with a ton of haphazard test methods.

That was until I saw the approach that Drew Miller took with NuGet.org. The way he structured the unit tests struck me as odd at first, but quickly won me over. Drew tells me he can’t take all the credit for this approach. This approach came from when he worked at CodePlex, and builds upon practices he learned from Brad Wilson and Jim Newkirk. That’s the thing I like about Drew, he won’t take credit for other people’s work. Unlike me, of course.

The structure has a test class per class being tested. That’s not so unusual. But what was unusual to me was that he had a nested class for each method being tested.

I’ll provide a simple code example to illustrate this approach and then highlight some of the benefits. The following has two methods for embellishing names with more interesting titles. What it does isn’t really that important for this discussion.

using System;

public class Titleizer
{
    public string Titleize(string name)
    {
        if (String.IsNullOrEmpty(name))
            return "Your name is now Phil the Foolish";
        return name + " the awesome hearted";
    }

    public string Knightify(string name, bool male)
    {
        if (String.IsNullOrEmpty(name))
            return "Your name is now Sir Jester";
        return (male ? "Sir" : "Dame") + " " + name;
    }
}

Under Drew’s system, I’ll have a corresponding top level class, with two embedded classes, one for each method. In each class, I’ll have a series of tests for that method.

Let’s look at a set of potential tests for this class. I wrote xUnit.NET tests for this, but you could apply the same approach with NUnit, mbUnit, or whatever you use.

using Xunit;

public class TitleizerFacts
{
    public class TheTitleizerMethod
    {
        [Fact]
        public void ReturnsDefaultTitleForNullName()
        {
            // Test code
        }

        [Fact]
        public void AppendsTitleToName()
        {
            // Test code
        }
    }

    public class TheKnightifyMethod
    {
        [Fact]
        public void ReturnsDefaultTitleForNullName()
        {
            // Test code
        }

        [Fact]
        public void AppendsSirToMaleNames()
        {
            // Test code
        }

        [Fact]
        public void AppendsDameToFemaleNames()
        {
            // Test code
        }
    }
}

Pretty simple, right? If you want to see a real-world example, look at these tests of the user service within NuGet.org.

So why do this at all? Why not stick with the old way I’ve done in the past?

Well for one thing, it’s a nice way to keep tests organized. All the tests (or facts) for a method are grouped together. For example, if you use the CTRL+M, CTRL+O shortcut to collapse method bodies, you can easily scan your tests and read them like a spec for your code.

unittests-spec

You also get the same effect if you run your tests in a test runner such as the xUnit test runner:

unittests-testrunner

When the test class file is open in Visual Studio, the class drop down provides a quick way to see a list of the methods you have tests for.

unittests-method-list

This makes it easy to then see all the tests for a given method by using the drop down on the right.

unittests-test-list

It’s a minor change to my existing practices, but one that I’ve grown to like a lot and hope to apply in all my projects in the future.

Update: Several folks asked about how to have common setup code for all tests. ZenDeveloper has a simple solution in which the nested child classes simply inherit the outer parent class. Thus they’ll all share the same setup code.

Tags: , ,


Categories: Blogs

Why I Love New Year&rsquo;s Eve

Sun, 01/01/2012 - 00:19

Happy New Year’s Eve everyone! And by the time you read this, it’ll probably already be the new year. To my friends across the international date line, what is 2012 like? The rest of us will be there soon.

New Year’s Eve has always been one of my favorite holidays. It brings a collective time for reflection on the past year and anticipation and hope for the year to come.

And for me, New Year’s Eve has an extra special meaning because exactly ten years ago on New Year’s Eve, I met this woman at Giant Village. A mutual friend suggested that we should meet since we were both attending this event. This woman was there with her brother, and I was there with a buddy.

photo

I wonder what she’s been up to after all these years?

Just kidding!

I know what she’s up to. We met in 2001 and were smitten by the time 2002 arrived and have been together ever since. Ten years later, we’ve added to our funky bunch. We work hard hoping to keep these little munchkins alive. What a difference a decade makes, no?

family-2011

So yeah, New Year’s Eve totally rocks in my book.

Tags:


Categories: Blogs

OSS and .NET Year In Review 2011

Mon, 12/26/2011 - 05:40

T’is the season for “Year in Review” and “Best of” blog posts. It’s a vain practice, to be sure. This is exactly why I’ve done it almost every year! After all, isn’t all blogging pure vanity? Sadly, I did miss a few years when my vanity could not overcome my laziness.

This year I am changing it up a bit to look at some of the highlights, in my opinion, that occurred in 2011 with open source software and the .NET community. I think it’s been a banner year for OSS and .NET/Microsoft, and I think it’s only going to get better in 2012.

NuGet Logo

We released NuGet 1.0 in the beginning of this year and it had a big impact on the amount of sleep I got last year. Insomnia aside, it’s also had a significant impact on the .NET community and been very well received.

One key benefit of NuGet is it provides a central location for people to discover and easily acquire open source libraries. This alone helps many open source libraries gain visibility. According to http://stats.nuget.org/, the NuGet gallery now has over 4,000 unique packages and 3.4 Million package downloads.

Scott Hanselman noted another impact I hadn’t considered in his DevReach 2011 keynote. To understand his observation, I need to provide a bit of background.

Back in April, Microsoft released the ASP.NET MVC 3 Tools update. This added support for pre-installed NuGet packages in the ASP.NET MVC 3 project templates so that projects created from these templates already include dependent libraries installed as NuGet packages rather than as flat files in the project. This allows developers who create a project from a template to upgrade these libraries after the project has been created via NuGet.

NuGet 1.5 adds this support for pre-installed packages to any project template that wants it. In the preview for ASP.NET MVC 4, we included libraries such as ModernizR, jQuery, jQuery UI, jQuery Validation, and Knockout in this manner. We expect other project templates in the future to take advantage of this as well.

The interesting observation Hanselman had in his keynote is that this is an example of Microsoft giving equal billing to these open source libraries as it does to its own. When you create an ASP.NET MVC 4 project, your project includes Microsoft packages alongside 3rd party OSS packages all installed in the same manner.

Additionally, the way NuGet itself was developed is also important. NuGet is an Apache v2 open source project that accepts contributions from the community. Microsoft gave it to the Outercurve Foundation and continues to supply the project with employee contributors.

Orchard Project

Before there was NuGet, there was Orchard. Orchard is an open source CMS system that was started at Microsoft, but also contributed to the OuterCurve Foundation.

What’s really impressive about Orchard is the amount of community involvement they’ve fostered. They’ve set up a governance structure consisting of an elected steering committee so that it’s truly a community run project.

They recently surpassed 1 million module downloads from their online gallery. Modules are extensions to Orchard that are installable directly from within the Orchard admin.

Umbraco

Umbraco is an independent open source CMS that has a huge following and a strong community. They’ve been around for a while, long before 2011. But in 2011, Microsoft hosted the redesigned http://asp.net/ site using Umbraco.

Micro ORMS

For a lack of better term, I think 2011 was the year of the mini-ORMS. While many refere to these libraries as micro-ORMS, they’re not technically ORMs. They’re more simple data access libraries. A non-comprehensive list of the ones that made a big splash are:

If you’re interested in seeing a more comprehensive list of Micro ORMS with source code examples of usage (Nice!), check out this blog post by James Hughes.

Micro Web Frameworks and OWIN

Like pairing a good beer with the right steak, lightweight micro-web frameworks provide a good pairing with Micro-ORMS. It’s interesting that  both of these picked up quite a bit this past year.

Some that caught my attention this year are:

  • Named after Sinatra’s daughter, there’s the Nancy micro web framework.
  • FubuMVC is billed as the project that gets out of your way.
  • OpenRasta is a resource oriented web framework for building REST services.

Again, James Hughes provides a comparative list of micro-web frameworks complete with source code examples.

With the proliferation of web frameworks as well as lightweight web servers such as Kayak and Manos de Mono, the need to decouple the one from the other arose. This is where OWIN stepped into the gap.

OWIN stands for Open Web Interface for .NET. It is a project inspired by Rack, a Ruby Webserver interface, meant to decouple web servers from the web application frameworks that run on them.

This project was started as a completely grass roots project in 2011 but has seen amazing pick-up from the community and I believe will have a big impact in 2012.

mp-mono-logo

Miguel de Icaza wrote a monster blog post about the year that he and the Mono (and Xamarin) folks have had in 2011. His post inspired me to write this less monstrous one. It’s a great post and really inspiring to see how they’ve emerged from the ashes of the great Novell layoff of 2011 to have a great year.

In the following image, you can see me teaching Miguel everything he knows about software development and open source while Scott acts surprised?

What really caught my interest in his post was the note about Microsoft using Mono and Unity3D to build Kinectimals for iOS systems such as the iPad. 2011 seems to be the year of pigs flying for Microsoft.

Xamarin is doing a great job of bringing Mono, and consequently C# and open source to just about every device imaginable!

Open Source Fest at Mix 11

Mix is one of my favorite conferences and I’ve attended every single one. And it has nothing to do with it being in Las Vegas, though that doesn’t hurt one bit.

This year was special due to the efforts of John Papa (who’s name makes me wonder if he ever goes all Biggie Smalls on people and sings “I love it when you call me John Papa”). This year, John put together the Open Source Fest at Mix.

This was an event where around 50 projects had stations in a large open room where they could represent their project and talk to attendees. The atmosphere was electric as folks went from table to table learning about useful software directly from the folks who built it.

This is where projects such as Glimpse got noticed and really took off. Would love to see more of this sort of thing at conferences.

Azure SDKs and GitHub

As I recently wrote on the GitHub blog, Microsoft is actively developing a set of Azure SDKs for multiple platforms (not just .NET) in GitHub. All of these libraries are Apache v2 licensed and actively being developed in GitHub.

screenshot of the azure sdk homepage

It’s great to see Microsoft not only releasing source code under an open source license, but actively developing it in the open and ostensibly accepting contributions from the public. I look forward to seeing more of this in the future.

GitHub_PixelOptimizedLogo

Last but not least, there’s GitHub. Full disclaimer: I’m an employee of GitHub so naturally my opinion is totally biased. But a bias doesn’t necessarily mean an opinion is wrong.

What I love about GitHub is that just about everybody is there. GitHub hosts a huge number of open source projects, including a large number of the important ones you’ve heard of. Quantity alone isn’t a sign of quality, but it can create network effects. When a site has such a large community, hosting a project there makes it easier to attract contributors because there’s such a large pool to draw from.

I’ve seen this benefit .NET open source projects first hand. Since moving some of my projects there, I’ve received more pull requests. Small independent projects such as JabbR have really attracted a passionate community at GitHub with large numbers of external contributions. Most of the credit must go to the efforts of the great project leads who’ve worked hard to foster a great community, but I think they’d agree that hosting on GitHub certainly makes it easier and more enjoyable.

What did I miss?

Did I miss anything significant in your opinion? Let me know in the comments. What do you think will happen in 2012? Does the number 2012 look like a science fiction year to you? Because it does to me. I can’t believe it’s just about here already. Have a great holidays!

UPDATE: Egg on my face. This post was meant to list a few highlights and not be a comprehensive list of all that happened in open source in the .NET space. Even so, in my holiday infused malaise, I was negligent in omitting several highlights. I apologize and updated the post to reflect a few more significant events. Let me know if I missed some obvious ones.

Tags: , , ,


Categories: Blogs

Configure Git in PowerShell So You Don&rsquo;t Have to Enter Your Password All the Damn Time

Mon, 12/19/2011 - 09:09

My last post covered how to improve your Git experience on Windows using PowerShell, Posh-Git, and PsGet. However, a commenter reminded me that a lot of folks don’t know how to get Git for Windows in the first place.

And once you do get Git set up, how do you avoid getting prompted all the time for your credentials when you push changes back to your repository (or pull from a private repository)?

I’ll answer both of those questions in this post.

Install msysgit

The first step is to install Git for Windows (aka msysgit). The full installer for msysgit 1.7.8 is here. For a detailed walkthrough of the setup steps, check out GItHub’s Windows Setup walkthrough. It’s pretty straightforward. That’ll put Git.exe in your path so that Posh-Git will work.

Bam! Done! On to the second question. Make sure you set up your SSH keys before moving to the second section.

Using SSH with Posh-Git

One annoyance with Git on Windows is when pushing changes to a repository (or pulling from a private repository), you have to constantly enter your password if you cloned the repository using HTTPS.

Likewise, if you clone with SSH, you also need to enter your passphrase each time. Fortunately, a little program called ssh-agent can securely save your pass phrase (and consequently your sanity) for the session and supply it when needed.

Update: Mike Chaliy just fixed PsGet so it always grabs the latest version of Posh-Git. If you installed Posh-Git before today using PsGet, you’ll need to update Posh-Git by running the following command:

Install-Module Posh-Git –force

Unfortunately, at the time that I write this, the version of Posh-Git in PsGet does not support starting an SSH Agent. The good news is, the latest version of Posh-Git direct from their GitHub repository does support SSH Agent.

Since the previous step installed git.exe on my machine, all I needed to do to get the latest version of Posh-Git is to clone the repository.

git clone https://github.com/dahlbyk/posh-git.git

This creates a folder named “posh-git” in the directory where you ran the command. I then copied all the files in that folder into the place where PsGet installed posh-git. On my machine, that was:

C:\Users\Haacked\Documents\WindowsPowerShell\Modules\posh-git

When I restarted my PowerShell prompt, it told me it could not start SSH Agent.

powershell-ssh-agent-not-found

It turns out that it was not able to find the “ssh-agent.exe” executable. That file is located in C:\Program Files (x86)\Git\bin. but that folder isn’t automatically added to your PATH by msysgit.

If you don’t want to add this path to your system PATH, you can update your PowerShell profile script so it only applies to your PowerShell session. Here’s the change I made.

$env:path += ";" + (Get-Item "Env:ProgramFiles(x86)").Value + "\Git\bin"

On my machine that script is at:

C:\Users\Haacked\Documents\WindowsPowerShell\Microsoft.Powershell_profile.ps1

The next time I opened my PowerShell prompt, I was greeted with a request for my pass phrase.

powershell-ssh-agent

After typing in my super secret pass phrase, once at the beginning of the session, I was set. I could clone some private repositories and push some changes without having to specify my pass phrase each time. Nice. Secure. Convenient.

The Start-SshAgent Command

The reason that I get the ssh-agent prompt when starting up PowerShell is because when I installed Posh-Git, it updated my profile to load in their example profile:

C:\Users\Haacked\Documents\WindowsPowerShell\Modules\posh-git\profile.example.ps1

That profile script is calling the Start-SshAgent command which is included with Posh-Git. If you don’t like their profile example, you can manually start ssh-agent by calling the Start-SshAgent command.

Tags: , , ,


Categories: Blogs

Better Git with PowerShell

Tue, 12/13/2011 - 04:29

I’m usually not one to resort to puns in my blog titles, but I couldn’t resist. Git it? Git it? Sorry.

Ever since we introduced PowerShell into NuGet, I’ve become a big fan. I think it’s great, yet I’ve heard from so many other developers that they have no time to try it out. That it’s “on their list” and they really want to learn it, but they just don’t have the time.

But here’s the dirty little secret about PowerShell. This might get me banned from the PowerShell junkie secret meet-ups (complete with secret handshake) for leaking it, but here it is anyways. You don’t have to learn PowerShell to get started with it and benefit from it!

Seriously. If you use a command line today, and switch to PowerShell instead, pretty much everything you do day to day still works without changing much of your workflow. There might be the occasional hiccup here and there, but not a whole lot. And over time, as you use it more, you can slowly start accreting PowerShell knowledge and start to really enjoy its power. But on your time schedule.

UPDATE: Before you do any of this, make sure you have Git for Windows (msysgit) installed. Read my post about how to get this set up and configured.

There’s a tiny bit of one time setup you do need to remember to do:

Set-ExecutionPolicy RemoteSigned

Note: Some folks simply use Unrestricted for that instead of RemoteSigned. I tend to play it safe until shit breaks. So with that bit out of the way, let’s talk about the benefits.

Posh-Git

If you do any work with Git on Windows, you owe it to yourself to check out Posh-Git. In fact, there’s also Posh-HG for mercurial users and even Posh-Svn for those so inclined.

Once you have Posh-Git loaded up, your PowerShell window lights up with extra information and features when you are in a directory with a git repository.

posh-git-info

Notice that my PowerShell prompt includes the current branch name as well as information about the current status of my index. I have 2 files added to my index ready to be committed.

More importantly though, Posh-Git adds tab expansions for Git commands as well as your branches! The following animated GIF shows what happens when I hit the tab key multiple times to cycle through my available branches. That alone is just sublime.

ps-tab-expansion

Install Posh-Git using PsGet

You’re ready to dive into Posh-Git now, right? So how do you get it? Well, you could follow all those pesky directions on the GitHub site. But we’re software developers. We don’t follow no stinkin’ list of instructions. It’s time to AWW TOE  MATE!

And this is where a cool utility named PsGet comes along. There are several implementations of “PsGet” around, but the one I cover here is so dirt simple to use I cried the first time I used it.

To use posh-git, I only needed to run the following two commands:

(new-object Net.WebClient).DownloadString("http://psget.net/GetPsGet.ps1") | iex
install-module posh-git

Here’s a screenshot of my PowerShell window running the command. Once you run the commands, you’ll need to close and re-open the PowerShell console for the changes to take effect.ps-installing-posh-gitThat’s

Both of these commands are pulled right from the PsGet homepage. That’s it! Took me no effort to do this, but suddenly using Git just got that much smoother for me.

Many thanks to Keith Dahlby for Posh-Git and Mike Chaliy for PsGet. Now go git it!

Tags: , , ,


Categories: Blogs

Using QUnit with Razor Layouts

Sat, 12/10/2011 - 14:47

Given how central JavaScript is to many modern web applications,  it is important to use unit tests to drive the design and quality of that JavaScript. But I’ve noticed that there are a lot of developers that don’t know where to start.

There are many test frameworks out there, but the one I love is QUnit, the jQuery unit test framework.

qunit-tests-running

Most of my experience with QUnit is writing tests for a client script library such as a jQuery plugin. Here’s an example of one QUnit test file I wrote a while ago (so you know it’s nasty).

You’ll notice that the entire set of tests is in a single static HTML file.

I saw a recent blog post by Jonathan Creamer that uses ASP.NET MVC 3 layouts for QUnit tests. It’s a neat approach that consolidates all the QUnit boilerplate into a single layout page. This allows you to have multiple test files and duplicate that boilerplate.

But there was one thing that nagged me about it. For each new set of tests, you need to add an action method and a corresponding view. ASP.NET MVC does not allow rendering a view without a controller action.

Controller-Less Views

The idea of controller-less views has been one tossed around by folks, but there are all sorts of design issues that come up when you consider it. For example, how do you request such a view directly? If you allow that, what if the view is intended to be rendered by a controller action. Now you have two ways to access that view, one of which is probably incorrect. And so on.

However, there is another lesser known framework (at least, lesser known to ASP.NET MVC developers) from the ASP.NET team that pretty much provides this ability!

ASP.NET Web Pages with Razor Syntax

It’s a product called ASP.NET Web Pages that is designed to appeal to developers who prefer an approach to web development that’s more like PHP or classic ASP.

Aside: I’d like to go on record and say I hated that name from the beginning because it causes so much confusion. Isn’t everything I do in ASP.NET a web page?

A Web Page in ASP.NET Web Pages (see, confusing!) uses Razor syntax inline to render out the response to a request. ASP.NET Web Pages also support layouts. This means we can create an approach very similar to Jonathan’s, but we only need to add one file for each new set of tests. Even better, this approach works for both ASP.NET MVC 3 and ASP.NET Web Pages.

The Code

The code to do this is straightforward. I just created a folder named test which will contain all my unit tests. I added an _PageStart.cshtml file to this directory that sets the layout for each page. Note that this is equivalent to the _ViewStart.cshtml page in ASP.NET MVCs.

@{
    Layout = "_Layout.cshtml";
}

The next step is to write the layout file, _Layout.cshtml. This contains the QUnit boilerplate along with a place holder (the RenderBody call) for the actual tests.

<!DOCTYPE html>

<html>
    <head>
        <title>@Page.Title</title>
        <link rel="stylesheet" href="/content/qunit.css " />
        <script src="/Scripts/jquery-1.7.1.min.js"></script>
        <script src="/scripts/qunit.js"></script>

        @RenderSection("Javascript", false)
        @* Tests are written in the body. *@
        @RenderBody()
    </head>
    <body>
        <h1 id="qunit-header">
          @(Page.Title ?? "QUnit tests")
        </h1>
        <h2 id="qunit-banner">
        </h2>
        <h2 id="qunit-userAgent"></h2>
        <ol id="qunit-tests">
        </ol>
        <p>
            <a href="/tests">Back to tests</a>
        </p>
    </body>
</html>

And now, one or more files that contain the actual test. Here’s an example called footest.cshtml.

@{
  Page.Title = "FooTests";
}
@if (false) {
  // OPTIONAL! QUnit script (here for intellisense)
  <script src="/scripts/qunit.js"> </script>
}
<!-- Script we're testing -->
<script src="/scripts/calculator.js"></script>

<!-- The tests -->
<script>
  $(function () {
    // calculator_tests.js
    module("A group of tests get's a module");
    test("First set of tests", function () {
      var calc = new Calculator();
      ok(calc, "My caluculator is a O.K.");
      equals(calc.add(2, 2), 4, "shit broken");
    });
  });
</script>

You’ll note that I have this funky if (false) block in the code. That’s to workaround a current limitation in Razor so that JavaScript Intellisense for QUnit works in this file. If you don’t care for Intellisense, you don’t need it. I hope that in the future, Razor will pick up the script in the layout and you won’t need this either way.

With this in place, to add a new test with the proper QUnit boilerplate is very easy. Just add a .cshtml file, set the title for the tests, and then add the script you’re testing and the test script into the same file.

The last step is to create an index into all the tests. I wrote the following index.cshtml file that creates a list of links for each set of tests. It simply iterates through every test file and generates a link. One nifty little perk of using ASP.NET Web Pages is you can leave off the extension when you request the file.

@using System.IO;
@{
  Layout = null;

  var files = from path in
  Directory.GetFiles(Server.MapPath("./"), "*.cshtml")
  let fileName = Path.GetFileNameWithoutExtension(path)
  where !fileName.StartsWith("_")
  && !fileName.Equals("index", StringComparison.OrdinalIgnoreCase)
  select fileName;
}

<!DOCTYPE html>
<html>
<head>
    <title></title>
</head>
<body>
    <div>
        <h1>QUnit tests</h1>
        <ul>
        @foreach (var file in files) {
            <li><a href="@file">@file</a></li>
        }
        </ul>
    </div>
</body>
</html>

The output of this page isn’t pretty, but it works. When I navigate to /test I see a list of my test files:

qunit-tests

Here’s the contents of my test folder when I’m done with all this.

solution

Summary

I personally haven’t used this approach yet, but I think it could be a nice approach if you tend to have more than one QUnit test file in your projects and you tend to customize the boilerplate for those tests.

I tend to just use a static HTML file, but so far, most of my QUnit tests are for a single JavaScript library. But this approach might come in handy when I get around to testing the JavaScript in the NuGet gallery.

Tags: , , ,


Categories: Blogs

Hello GitHub!

Wed, 12/07/2011 - 02:32

Hubot stache me.

Well the poll results are in and you guys were very close! I was taken aback at the intensity of the interest in where I would end up. Seriously, I’m honored. But then I thought about it for a moment and figured, there must be a betting pool on this. These folks don’t care that much.

Today is my first day as a GitHub employee! In other words, I am now a GitHubber, a Hubbernaut, a GitHubberati. Ok, I made that last one up.

If you haven’t heard of GitHub, it’s a site that makes it frictionless to collaborate on code. Some would call it a source code hosting site, or a forge, but it goes way beyond that. Their motto is “Social Coding”, and they mean it. They’ve turned shipping software into a fun social activity. It’s great!

Beyond a great product, they’ve built a great company culture. From everything I’ve seen and read, GitHub has figured out how to make a great work environment. They optimize for happiness and I believe that’s resulted in a great product and a lot of success. I’ll talk about that some more another time. For now, let’s talk about…

What will I be doing at GitHub?

According to my offer letter, my title is “Windows Badass”, but the way I see it, I will do whatever I can to help GitHub be even more awesome. It’s going to take some creative thinking because it’s already pretty damn cool, but I’ll figure something out.  My first idea for adding more cowbell was rejected, but I’m just finding my footing. I’ll get the hang of it.

More specifically, I plan to help GitHub appeal to more developers who code for Windows and the .NET platform. For example, take a look at the TIOBE language index.

tiobe-index

Now take a look at this chart from the GitHub languages page.github languages

See something missing? Yes, oh mah gawd! LOLCODE is not there!!!

Ok, besides that. See something else missing? Despite the fact that TIOBE ranks it as the fourth most popular language, C# doesn’t make it into the top ten at GitHub. I’d like to change that!

I’ve always been a big proponent of open source on .NET. Pretty much everything I worked on at Microsoft was or became open source (I did work on a Web Form control that wasn’t open sourced, but we don’t talk about that much).

I will continue to work to grow a healthy open source ecosystem on .NET and Windows. I hope to see more .NET developers contributing to open source and doing it on GitHub.

This might include making the website more friendly to Windows developers, working on a Windows client for GitHub, and continuing to work on NuGet, among other things. One of the appealing aspects of GitHub to me was how much they got NuGet. Perhaps more so than many at Microsoft.

Why Bother?

You might wonder, why bother?

Well, there’s the simple business answer. The more open source developers there are, the more potential customers GitHub has. But we have larger aspirations than that as well.

When trying to build a case for releasing more software as open source at Microsoft, I once asked Miguel de Icaza, what’s in it for Microsoft? Why do it?

His response was something along the lines of bla bla bla bla. But there was one thing that he said that struck me.

A rising tide lifts all boats.

When I first read that, I thought he wrote “tilde” and I was really confused what a rising tilde had to do with anything.

But it makes sense to me now. As I wrote in a recent post talking about software communities,

The interchange of ideas between these disparate technology communities can only result in good things for everyone.

There are millions of .NET developers, but a disproportionately small number of them are involved in open source projects. If we increase that just a tiny bit, that increases the pool of ideas floating around in the larger software community. Ideas backed by code that anybody can look at, incorporate, tweak.

The nice thing here is I think a healthy .NET OSS ecosystem is a good thing for everyone. Good for GitHub. Good for Microsoft. Good for the software industry.

Am I moving?

GitHub is located in an amazing space in San Francisco. When I visited, Hubot pumped in Daft Punk via overhead speakers as people coded away. That alone nearly sealed the deal for me. The fine scotch we sipped as we talked about software didn’t hurt either.

But alas, as much as San Francisco is a great city, my family and I love it here in the Washington, so I will work as a remote employee. Fortunately, GitHub is well suited for remote employees. And this gives me a great excuse to visit SF often!

My little octocats agree, this is a good thing.

octocats

If you’ve been a fan of my blog or Twitter account, I hope you stick around. I’ll still be blogging about ASP.NET MVC, NuGet, etc. But you can expect my blog will also expand to cover new topics.

It’ll be an adventure.

Technorati Tags: ,,


Categories: Blogs