One thing I learned today: May 2011

Tuesday, May 17, 2011

Getting picky with arrays

Today I learned how to choose multiple items from a list. For example, imagine you have a C# program with an array of addresses. If you want a list of only the addresses in a specific zip code, you could iterate over all the addresses and add them to a list one by one. It would look something like this:

List<Address> one_zip = new List<Address>();
foreach (Address addr in addresses)
{
    if (addr.zip == "12345")
    {
        one_zip.Add(addr);
    }
}

You could also replace that with a single line of more readable code:

IEnumerable<Address> one_zip = addresses.Where(addr => addr.zip == "12345");

If the Where syntax above looks familiar, you’ve probably learned SQL or some other database query language. C# 3.0 introduced a framework called LINQ which allows you to do the same kinds of queries in C# as you would do in a database.

Where isn’t truly a method of arrays (it is an extension method that LINQ provides), but it behaves like one for our purposes. It receives one parameter, a function that maps a single item from the array to a boolean value (true or false). This function (think of it as a condition) tells Where which items to keep and which to ignore.

The only daunting part of this syntax is the function, in this case a lambda expression. A lambda expression uses the => to say that it maps the parameter, in this case addr, to some value, in this case true if the zip code is 12345 and false otherwise. Naturally, Where calls this function for each item in the array and keeps only the items where the function returns true.
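If it helps to see the idea spelled out, here is a rough sketch in Python of what a Where-like function does with its condition. The name where and the sample zip codes are made up for illustration. The real Where is also lazy (it produces items only as you iterate over the result), which this simplified version ignores.

```python
def where(items, predicate):
    """Keep only the items for which predicate(item) is true."""
    kept = []
    for item in items:
        if predicate(item):  # the condition decides: keep or ignore
            kept.append(item)
    return kept


# The lambda plays the same role as addr => addr.zip == "12345" in C#.
zips = ["12345", "54321", "12345"]
matches = where(zips, lambda z: z == "12345")
print(matches)  # ['12345', '12345']
```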

In order to use the Where method, you need to include the LINQ namespace by declaring using System.Linq; and ensuring that you have the necessary resources. I don’t know if it’s bundled in the .NET SDK, but I do know it’s included with Visual Studio.

In Python, on the other hand, you don’t need to import any modules to select items from a list. It’s part of the basic functionality. (A structure similar to Address above isn’t part of the basic functionality, but you can define one with a namedtuple from the collections module.) The code looks like this:

[addr for addr in addresses if addr.zip == "12345"]

The square brackets define a list and the first addr defines the items in the list. This is even more flexible than Where in C# in several ways. For example, if you only want a list of the street names:

[addr.street for addr in addresses if addr.zip == "12345"]

This is why addr appears twice in the declaration. The second time it appears, it defines the variable you are iterating with. But you don’t have to save those items. You can save any value you want. For example, a truly odd way of counting the addresses in the zip code would be:

sum([ 1 for addr in addresses if addr.zip == "12345"])

With just a little extra work you can also make your selection based on the indices. If you want every other address in your list:

[addresses[i] for i in range(len(addresses)) if i % 2 == 1]

I don’t know how you could do this in C# without a loop.
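To try the Python snippets above yourself, here is a small self-contained sketch. The Address namedtuple and the four sample addresses are invented for the example; only the comprehensions come from the post. As far as I can tell, the slice addresses[1::2] picks the same items as the index-based comprehension, so I included it for comparison.

```python
from collections import namedtuple

# Hypothetical record type; the field names match the snippets above.
Address = namedtuple("Address", ["street", "zip"])

# Made-up sample data.
addresses = [
    Address("1 Main St", "12345"),
    Address("2 Oak Ave", "54321"),
    Address("3 Elm Rd", "12345"),
    Address("4 Pine Ln", "12345"),
]

# Whole addresses in one zip code.
same_zip = [addr for addr in addresses if addr.zip == "12345"]

# Only the street names.
streets = [addr.street for addr in addresses if addr.zip == "12345"]

# The "truly odd" way of counting.
count = sum([1 for addr in addresses if addr.zip == "12345"])

# Every other address, by index and by slicing.
every_other = [addresses[i] for i in range(len(addresses)) if i % 2 == 1]
every_other_sliced = addresses[1::2]

print(streets)                            # ['1 Main St', '3 Elm Rd', '4 Pine Ln']
print(count)                              # 3
print(every_other == every_other_sliced)  # True
```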

Thanks to Imri for the lessons in Python.

Monday, May 16, 2011

Literal strings in C# and Python

Like any good programmer, I’m lazy. I can’t stand doing anything repetitive and mindless. That’s the computer’s job.

Copying file locations is one of those annoyances. Or it used to be, before I learned about “truly literal” string literals. I first encountered this in C#, but today I learned how to do it in Python as well. They are simply strings that are saved exactly as they are written, meaning that escape sequences are not interpreted.

Console.WriteLine( "Hello \n World." );
// prints as:
// Hello 
//  World.

Console.WriteLine( @"Hello \n World.");
// prints as:
// Hello \n World.

Let’s take a step back for those who don’t know what string literals or escape sequences are. String literals are strings that you write in your code, like in the example above. Text that is surrounded by double quotes is saved as a string object that contains that text. If you print it, as with the WriteLine function above, it prints that text. That’s why it’s called a string literal. The program stores it literally and doesn’t use it to refer to something else, as it would with a variable.

int a = 10;

Console.WriteLine(a);
// prints:
// 10

Console.WriteLine("a");
// prints:
// a

But string literals aren’t truly literal, because of escape sequences. Consider the case of a string that includes five empty lines. That string literal would take up five whole lines of your screen real estate. In order to get around this problem, not to mention characters like BEEP that can’t be typed, string literals look for an escape character, in most languages \ (backslash). This tells the program that there is a special character at this place in the string, and the following letter says which special character. The two most common escape sequences are \n (new line) and \t (tab). The first example demonstrated \n. This is where “truly literal” string literals come into play. If the string literal is preceded by the @ symbol (in C# and probably other .NET languages) the \ stays a \ and \n stays \n instead of becoming a new line. That can be useful because Windows file paths are full of backslashes.

What I learned today is that there is an equivalent in Python, which is even more useful than in C#. C# is a compiled language, so when you want to run a program with a file path as an argument, that path is a command line argument. Command line arguments don’t expand escape sequences, so they are already truly literal. Python, on the other hand, is an interpreted language. Most of the time you run programs from within the interpreter, where the strings you write are not command line arguments, but string literals just like we described above. So if you want to copy a file path from Windows Explorer to use as an argument for your program you have two options:

1. Go through the file path string and change every \ to \\ which is the escape sequence for a regular backslash. This is very annoying and is also what I did until today.
2. Prefix the file path string with the letter r. This has the same effect as @ in C#. All of the backslashes are left alone and everything works perfectly.

In the example below, both of the strings are the same, but you can imagine how much easier one is to compose than the other.

# double backslashes for every backslash in the file path
# A major nuisance relative to copy-paste
print("C:\\Program Files (x86)\\Windows Live Writer")
# file path string is preceded by r
# Very easy
print(r"C:\Program Files (x86)\Windows Live Writer")
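If you want to convince yourself that the two spellings really produce the same string, a quick check like the following should do it. The variable names are mine. It also shows the difference between an interpreted and an uninterpreted \n by comparing lengths.

```python
# The same path written both ways.
escaped = "C:\\Program Files (x86)\\Windows Live Writer"
raw = r"C:\Program Files (x86)\Windows Live Writer"

print(escaped == raw)  # True

# "\n" is one character (a newline); r"\n" is two (a backslash and an n).
print(len("\n"), len(r"\n"))  # 1 2
```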

I also learned today, or will have learned once this is posted, whether the Windows Live plugin called Source Code Formatter does a better job than the one called Insert Code.

Sunday, May 15, 2011

Picasa does not allow embedding videos

There are two videos embedded in yesterday’s blog post. You may not know that even if you read the post, because they look like regular images. That’s what you get if you use the embed code from Picasa Web Albums. If you click on the image, it takes you to the album and plays the video. YouTube is being annoying about linking my Google account, so at least I’ll try embedding a Facebook video tonight. It is the Olympic torch in Vancouver, which I think they lit especially for the CHI conference.

As promised, I’ll tell you one other thing I learned. I write blog posts faster after a beer. I might have to try more to find the optimal BAC for blogging. As always, XKCD has something clever to say.

Saturday, May 14, 2011

Two-year-olds change quickly and gadgets change slowly

I didn’t keep up with my blogging while at the CHI conference this week. To make up for it, each day for the coming week I’ll write about two things I learned.

1. Today I learned just how much a two-year-old can grow up in a week. She looks different, walks differently, talks differently, sings differently. Were my flights going at close to the speed of light? For example, before I left, I had never heard her consistently and correctly use the pronoun I in any language. She typically said mine in English regardless of the language or syntax of the rest of the sentence. Today she seemed totally comfortable with the concept in any language. For example, she announced “I go on high chair.” Or consider this dialogue with her mother, which demonstrates her newfound confidence in how she communicates:

Daughter: “Ya ustala.” (I’m tired)

Mother: “Ya ustala.” (Thinking she was referring to the mother)

Daughter: “No! YA ustala.”

The same can be seen in how she carries herself. Unfortunately, I don’t have a video from a week ago to compare to. I guess not everyone can, or should, record everything.

2. I learned a lot about the history (and present) of gadgets from Bill Buxton at CHI. He is a gifted storyteller and is among other things an anthropologist of computer interaction. He is assembling the Buxton Collection as a glance into the evolution of gadgets, with all of its varied, often surprising and sometimes whimsical branches (see the Phantom Chess set or the dozen different kinds of multifunctional mice).

Buxton prefaces his collection by saying: “Look at the collection and then try and convince me that our slow rate of progress is due to a lack of technology rather than a lack of imagination.” I see his point. Many of the ideas and principles that are thought of as new and innovative in the latest generation of gadgets have actually been around for a long time. As someone who bought a tablet computer years before iPads were a twinkle in Steve Jobs’ eye, I don’t need much convincing.

Although I see Buxton’s point, my take-away lesson is complementary. There’s a big difference between what kinds of interactions we can create with current technology and what kinds of interactions we should create. For the technology to catch on, it’s not enough for it to allow a novel, innovative and useful mode of interaction. It has to be smooth, efficient, reliable, attractive and appropriate for the ecosystem of software and other gadgets around it. So, if Buxton argues that there’s plenty of good technology out there just waiting for ideas of how to utilize it, I argue that there are plenty of good ideas that are already out there in the gadgets of the past just waiting for the technology that will help them shine.

Friday, May 6, 2011

Cleaning up speech recordings with Audacity

In the research project that’s currently taking 95% of my time, we had participants think aloud while wearing headsets. It was a nightmare, simply because we were using someone else’s computers. After dealing with their $#%^! security software, I didn’t have time to make sure each computer was configured well for recording. So after the experiment, I was left with a hard disk full of crappy, barely audible recordings.

Audacity to the rescue. “Audacity® is free, open source software for recording and editing sounds.” It’s an extremely powerful tool, but I’m using it in the dumbest way possible just to clean up my speech recordings. Luckily for me, I stumbled upon a preconfigured option for this.

Audacity uses a mechanism for batch processing called a chain. A chain is a series of predetermined commands that it applies to an audio file one after another. From the File menu you can edit chains or create new ones, and you can apply a chain to files. When I opened File > Edit Chains imagine my wonder at finding a chain called CleanSpeech. It does some normalization and noise reduction and then saves a copy as an mp3 (All the better to upload you with, my dear). Thanks to CleanSpeech, the process of cleaning up my speech recordings boils down to:

1. Choose File > Apply Chain…
2. Choose CleanSpeech chain and click Apply to Files…
3. Choose my files, click Open and leave the computer overnight to do its thing

After some testing, I did modify the CleanSpeech chain a little. I got rid of two steps, SaveHqMaster and TruncateSilence. SaveHqMaster saves an mp3 version before the cleaning process. I don’t need that because I’m keeping the original wav files as backup. TruncateSilence supposedly cuts out silent portions of the recording, but I found that in some cases it made the speech choppy or cut some out altogether. I’d rather leave in the silence. Also, running CleanSpeech will require that you install the LAME mp3 encoder that isn’t bundled with Audacity for licensing reasons (you can find it from their instructions or here). I recommend that you try opening a file in Audacity and running CleanSpeech on it (choose Apply to Current Project instead of Apply to Files…). That should pop up the request to install LAME and once you do that you’ll be all set.

I’m sure there are easier, more efficient, more elegant and all around better ways to clean up a ton of speech recordings and if you know them, please share in the comments. For today, this is what I learned.

Thursday, May 5, 2011

Timing activities

Yesterday I learned, the hard way as usual, that I need to decide beforehand how much time I’m willing to devote to an activity. I have the same difficulty managing my time as some people have managing their spending with a credit card. When I have a million things to do, but deadlines that are weeks away if they exist at all, the cost of wasting time isn’t immediately obvious. It accumulates slowly without drawing your attention until you get close to one of those rare deadlines or even rarer moments of clarity where you stop, look back and can’t understand what you’ve been doing for the last month. Much like a credit card bill.

One of my main defenses against wasting time is the Pomodoro technique. The central concept is that you promise a set amount of uninterrupted work, usually 20-30 minutes, and in return get a guaranteed break at the end. No matter how lazy you are feeling, you can always work for just 20 minutes, right? For Android owners, I recommend Pomodroido, which comes in both Free and Pro versions. I like the aesthetic of the app so much that I’m more motivated to stick to the technique just so I can use it. Unlike a lot of productivity pr0n apps, it is also so simple that it doesn’t become a distraction itself.

So if the Pomodoro technique is for motivation, what does it have to do with timing my activities? Very simply, pomodoros offer a great time unit. For example, I’m only devoting one pomodoro to this blog post. Yesterday I didn’t measure pomodoros, but if I had it would have been enough to make a lasagna.

My pomodoro is calling…

Wednesday, May 4, 2011

Windows Live Writer supports Blogger

Today I learned that I can use Windows Live Writer to post to Blogger. I want to because it may be the easiest way to post code in the blog. Yesterday I tried using the hosted version of SyntaxHighlighter, but it didn’t go as smoothly as I expected. When I tried to add the following code within the post, Blogger rejected it.

<script class="brush: html" type="syntaxhighlighter">

<![CDATA[

<script class="brush: html" type="syntaxhighlighter">

<![CDATA[

<b>Hello world.</b>

]]>

</script>

]]>

</script>

It obviously wasn’t ignoring the script tags that were in the code to be posted. So now I’m trying WLW with the Insert Code plugin. WLW took way too long to install (~30 minutes including restart), but installing plugins is pretty quick. I installed a Picasa plugin while I was at it, so I can share this:

It’s an old picture, but my daughter still loves those baby swings.

Tuesday, May 3, 2011

Learning can take a while

Today I learned not to start learning the thing I want to post about at 9pm. And while I'm not a big believer in Murphy's Law, starting the post off by saying something is easy, before you've actually done it... That's just asking for trouble.

Monday, May 2, 2011

Lesson learned the hard way

Today's lesson:

For any program you are writing, if you don't have a test for it, you may as well not have written it. By test, I mean an easy way to validate that the program is doing what you think it is doing, and especially that it is saving the information that you think it is saving. The test needs to be easy, so you can repeat it as often as possible after making changes.

If you're programming in Java, I recommend writing JUnit tests. I'll soon be looking into whether C# has an equivalent testing framework.
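To make the idea concrete, here is a minimal sketch of such a test, written in Python with its built-in unittest module rather than JUnit. Everything in it is hypothetical: save_record and load_record stand in for whatever your program uses to store its data, and the record contents are invented. The point is only that the test saves something, reads it back and checks that nothing was lost.

```python
import json
import os
import tempfile
import unittest


def save_record(path, record):
    """Hypothetical stand-in for the program's real saving code."""
    with open(path, "w") as f:
        json.dump(record, f)


def load_record(path):
    """Read back what save_record wrote."""
    with open(path) as f:
        return json.load(f)


class SaveRecordTest(unittest.TestCase):
    def test_round_trip(self):
        # Made-up data; the check is that it survives saving and loading.
        record = {"participant": 7, "clicks": [120, 340]}
        with tempfile.TemporaryDirectory() as folder:
            path = os.path.join(folder, "record.json")
            save_record(path, record)
            self.assertEqual(load_record(path), record)


# Run the test explicitly so it is easy to repeat after every change.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SaveRecordTest)
result = unittest.TextTestRunner().run(suite)
```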

I also learned today that InputEventArgs.Timestamp and Environment.TickCount are measured in milliseconds, while DateTime.Ticks and TimeSpan.Ticks are measured in units of 100 nanoseconds. That's a factor of 10,000, approximately the same as the difference in magnitude between these two lessons. Therefore, this still counts as one (point oh-oh-oh-one) thing I learned today ;)
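The arithmetic behind that factor can be sketched in a few lines of Python. The function names are my own and the code only does the unit conversion; it does not touch the .NET types themselves.

```python
# One tick is 100 nanoseconds, and one millisecond is 1,000,000 nanoseconds,
# so there are 1,000,000 / 100 = 10,000 ticks in a millisecond.
TICKS_PER_MILLISECOND = 1_000_000 // 100


def ms_to_ticks(milliseconds):
    """Convert a millisecond count to 100-nanosecond ticks."""
    return milliseconds * TICKS_PER_MILLISECOND


def ticks_to_ms(ticks):
    """Convert 100-nanosecond ticks to milliseconds."""
    return ticks / TICKS_PER_MILLISECOND


print(TICKS_PER_MILLISECOND)   # 10000
print(ms_to_ticks(250))        # 2500000
print(ticks_to_ms(2_500_000))  # 250.0
```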

Sunday, May 1, 2011

Each day, I will post one thing I've learned.

Today's lesson - it takes nine minutes and thirty-nine seconds to create a new blog on Blogger. Twice, because I created it on the wrong account at first. Customization can wait until later.

BTW - the post took another four minutes. See you tomorrow.