Monday, May 16, 2011

Literal strings in C# and Python

 

Like any good programmer, I’m lazy. I can’t stand doing anything repetitive and mindless. That’s the computer’s job.

Copying file locations is one of those annoyances. Or it used to be, before I learned about “truly literal” string literals. I first encountered them in C#, but today I learned how to do the same thing in Python. (C# calls them verbatim string literals; Python calls them raw strings.) They are simply strings that are stored exactly as they are written, meaning that escape sequences are not interpreted.

Console.WriteLine( "Hello \n World." );
// prints as:
// Hello
//  World.

Console.WriteLine( @"Hello \n World.");
// prints as:
// Hello \n World.




Let’s take a step back for those who don’t know what string literals or escape sequences are. String literals are strings that you write directly in your code, like in the example above. Text surrounded by double quotes is stored as a string object containing that text. If you print it, as with the WriteLine function above, it prints that text. That’s why it’s called a string literal: the program stores it literally and doesn’t use it to refer to something else, as it would with a variable name.

int a = 10;

Console.WriteLine(a);
// prints:
// 10

Console.WriteLine("a");
// prints:
// a

But string literals aren’t truly literal, because of escape sequences. Consider a string that includes five empty lines. If you had to type those line breaks into the literal itself, it would take up five whole lines of your screen real estate. To get around this problem, not to mention characters like BEEP that can’t be typed at all, string literals look for an escape character, which in most languages is \ (backslash). It signals that there is a special character at this place in the string, and the following letter says which one. The two most common escape sequences are \n (new line) and \t (tab). The first example demonstrated \n.

This is where “truly literal” string literals come into play. If the string literal is preceded by the @ symbol (in C# and probably other .NET languages), the \ stays a \ and \n stays \n instead of becoming a new line. That is useful because Windows file paths are full of backslashes.
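Here is a minimal sketch of escape sequences in action. It is written in Python rather than C#, but \n, \t and \\ mean the same thing in both languages. The variable names are made up for illustration.

```python
# Each escape sequence is two characters in the source code
# but a single character in the stored string.
newline = "Hello\nWorld"    # \n becomes one newline character
tabbed = "name\tvalue"      # \t becomes one tab character
backslash = "C:\\temp"      # \\ becomes one ordinary backslash

print(newline)              # prints Hello and World on separate lines
print(backslash)            # C:\temp

print(len("\n"))            # 1
print(len("\\"))            # 1

# repr() shows the string the way it would be written as a literal
print(repr(newline))        # 'Hello\nWorld'
```

The lengths are the giveaway: what looks like two characters in the source is stored as one.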


What I learned today is that there is an equivalent in Python, where it is even more useful than in C#. C# is a compiled language, so when you want to run a program with a file path as an argument, that path is usually a command line argument. Command line arguments don’t go through escape sequence processing, so they are already truly literal. Python, on the other hand, is an interpreted language. Much of the time you run code from within the interpreter, where the strings you write are not command line arguments but string literals just like the ones described above. So if you want to copy a file path from Windows Explorer to use as an argument for your program, you have two options:



  1. Go through the file path string and change every \ to \\, which is the escape sequence for a regular backslash. This is very annoying and is also what I did until today.
  2. Prefix the file path string with the letter r. This has much the same effect as @ in C#: all of the backslashes are left alone and the pasted path works as is. The one catch is that a raw string can’t end with a single backslash, so drop the trailing \ from a folder path.
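As a side note on the earlier point about command line arguments, the sketch below starts a second Python process and hands it a path as an argument. The path is hypothetical, picked because \n and \t would be escape sequences inside a normal literal. Passing the arguments as a list means no shell gets a chance to touch them.

```python
import subprocess
import sys

# Hypothetical path, not a real file.
path = r"C:\new\table.txt"

# The child program just prints its first command line argument.
child_code = "import sys; print(sys.argv[1])"

result = subprocess.run(
    [sys.executable, "-c", child_code, path],
    capture_output=True,
    text=True,
    check=True,
)

received = result.stdout.strip()
print(received)  # C:\new\table.txt
```

The child sees the backslashes exactly as they were passed, because escape processing only happens when source code is parsed, not when arguments are received.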

In the example below, both of the strings are the same, but you can imagine how much easier one is to compose than the other.

# double backslashes for every backslash in the file path
# A major nuisance relative to copy-paste
print("C:\\Program Files (x86)\\Windows Live Writer")

# file path string is preceded by r
# Very easy
print(r"C:\Program Files (x86)\Windows Live Writer")
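
To check that the two spellings really produce the same string, and to show what goes wrong without either of them, here is a small sketch. The C:\new\table.txt and C:\temp paths are invented for the demonstration. It also covers one limitation worth knowing about: a raw string cannot end in a single backslash.

```python
escaped = "C:\\Program Files (x86)\\Windows Live Writer"
raw = r"C:\Program Files (x86)\Windows Live Writer"
print(escaped == raw)  # True

# With neither fix, \n and \t are silently turned into a newline and a tab.
broken = "C:\new\table.txt"
print("\n" in broken, "\t" in broken)  # True True

# A raw string ending in a backslash does not parse, because the final \"
# is read as part of the string rather than as its end.
# compile() is used so the failure can be caught instead of stopping the script.
try:
    compile('r"C:\\temp\\"', "<demo>", "eval")
    trailing_ok = True
except SyntaxError:
    trailing_ok = False
print(trailing_ok)  # False

# One workaround: append the trailing backslash as a normal escaped literal.
folder = r"C:\temp" + "\\"
print(folder)  # C:\temp\
```

The broken case is the dangerous one, since it raises no error and simply points at a path that does not exist.
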
I also learned today, or will have learned once this is posted, whether the Windows Live plugin called Source Code Formatter does a better job than the one called Insert Code.
