Archive for the ‘Uncategorized’ Category

Authenticating a login

Monday, June 21st, 2010

There is an interesting discussion on the use of the term “login” as a verb, part of a much wider discussion on colloquial terms often misunderstood to be verbs, that got me thinking about the proper terms for such actions.

“Login” is just one of many offenders (many of which originated on the Internet) that could easily be replaced with verbs that make much more sense.

Readability

Of course, there’s a strong argument that words such as “login” and “checkout” have well-defined, verb-like connotations that readers are familiar with. So it’s often wise to sacrifice absolute grammatical correctness in order to communicate better with your readers. After all, the purpose of a language is to serve as a well-defined but constantly evolving communication protocol.

However, there are ways in which you can ensure grammatical correctness without misleading your readers.

Grammatically Correct Software

In the context of software development, “readers” are the users of your software. Understanding how users perceive actions within your software (and optimising it) is a crucial aspect of development that is often overlooked. Using correct grammar can help a lot by providing users with a language that is familiar and un-obtuse. Users should never find themselves thinking “what does that mean?”.

So we need to be both grammatically correct and colloquial enough that users don’t have to think too hard about what they’re doing. Let’s see if we can apply this to a couple of common “anti-verbs”.

“login”

As the notaverb.com reference for “login” points out, “login”, while not a verb, can be a noun. This is useful as it allows us to use terminology that users are already familiar with in order to prevent them from getting lost.

So if “login” is not a verb, what is? It’s suggested that “log in” is appropriate, but since this is a verb/adverb combination this doesn’t really fit the bill as it implies that you are “logging” in an “in” way. Even if this is grammatically correct, it’s quite vague: what or where is “in”?

I’d propose that, in the context of software, an appropriate verb would be to authenticate. Authentication implies that you are being recognised (or rejected) by an authority – in the case of software, the application itself.

Note: There are often two software processes involved in “logging in”: “authentication”, checking the authenticity of a user; and “authorization”, ensuring a user has authority to do something.

It’s good practice to disambiguate these terms internally (in the software code and audit logs etc.) but wise to make the latter transparent to users; a user should only be concerned with authentication.
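As a rough sketch of what keeping the two terms apart in code might look like, the snippet below uses two separate functions. The function names, the shape of the $accounts array and the permission strings are all hypothetical, chosen only for illustration.

```php
<?php
/**
 * Authentication: is this user who they claim to be?
 *
 * @param array  $accounts Hypothetical account store, keyed by login.
 * @param string $login    The login the user entered.
 * @param string $password The password the user entered.
 * @return boolean true if the login exists and the password matches.
 */
function authenticate(array $accounts, $login, $password)
{
    return isset($accounts[$login])
        && password_verify($password, $accounts[$login]['hash']);
}

/**
 * Authorization: may this (already authenticated) user do this?
 *
 * @param array  $accounts   Hypothetical account store, keyed by login.
 * @param string $login      The login of the authenticated user.
 * @param string $permission The permission being checked, e.g. "post.edit".
 * @return boolean true if the account holds the permission.
 */
function isAuthorized(array $accounts, $login, $permission)
{
    return isset($accounts[$login])
        && in_array($permission, $accounts[$login]['permissions'], true);
}
```

Only the first of these needs to surface in the user interface; the second can stay an internal check that is recorded in the audit log.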

The problem with “authentication” as a verb is simply one of complexity. Big words scare people.

We can use the noun form of “login” to provide users with a grammatical cue that they’re familiar with, but what is a “login” if it’s a noun? Simple, the “username”.

“username” is a word I’m strongly opposed to as it’s (in my opinion) not really a noun; it is two nouns squashed together: a “compound noun” if you prefer. More specifically, it is the “name” of the “user”, although the actual name of the user is often quite different – so how do we disambiguate the two?

The solution is simple: ditch the term “username” in favour of the proper noun “login”.

Using these ideas, here’s how an authentication process may appear to a user in a way that’s both accessible and grammatically more accurate:

Authentication
Please enter your login and password:
Login: __________
Password: __________

It seems to have become part of modern internet culture to define new words, often ones that defy traditional language conventions, in order to describe something new and exciting. This is not only unnecessary, it makes technology obtuse and less accessible – particularly to the older generations.

At the end of the day, technology is meant to innovate, liberate and empower users – not confuse them with unfamiliar words.

Handling Segmentation Faults in userland PHP

Friday, June 18th, 2010

I’ve been doing a lot of multi-process and signal programming in PHP lately, and I found myself wondering which signals it’s possible to intercept and handle in userland PHP. Of my findings, the most interesting was that you can trap SIGSEGV (“segmentation fault”) and handle it yourself.

By default, when the kernel raises a SIGSEGV on a PHP process, PHP intercepts the signal and handles it by simply printing “Segmentation Fault” to stderr (standard error) and the process immediately exits. This is understandable behaviour, but in most applications, especially large ones, it’s useful to know what it was that caused the segfault to be raised so you can address the problem.

So I found myself wondering if it was possible to trap SIGSEGV and gather some debug information about it before exiting, and discovered that it’s much simpler than it sounds:

<?php
/**
* Demonstrates signal handling to trap Segmentation Fault using userland PHP.
*
* Useful for debugging larger applications that occasionally generate
* segfaults.
*
* You are free to modify and redistribute this code without attribution.
*
* @author Nick Telford <nick.telford@gmail.com>
* @copyright Nick Telford (c) 2010
*/

// you need this towards the start of your application in order for signal
// handlers to be executed
declare(ticks = 1);

/**
* Signal handler for segmentation faults.
*
* Since a segfault indicates a memory problem, it's wise to keep this separate
* from other signal handlers and to have it do as little as possible.
*
* Also, since there's a memory problem, this should call exit(139) (128 plus
* the signal number), otherwise you could end up with unpredictable
* behaviour. Only avoid exiting if you *really* know what you're doing.
*
* @param integer $sig The UNIX signal, will be SIGSEGV for a segfault.
*/
function handleSegfault($sig)
{
    // note: unless you're running the CLI SAPI, this will be sent to the
    // browser, fwrite() it to STDERR instead or better yet, log the info
    // somewhere (a file or syslog) for later analysis
    echo "Segmentation fault, printing backtrace:\n";
    debug_print_backtrace();
    exit(139);
}

// attach the segfault handler
pcntl_signal(SIGSEGV, 'handleSegfault');

// tests the segfault handling by generating one
posix_kill(getmypid(), SIGSEGV);

Update: I’ve changed the exit code to 139, as processes terminated by a UNIX signal should use an exit code that is the signal number + 128. The exit code for processes that terminate due to SIGSEGV is 139.

As my notes in the code suggest, you shouldn’t merely echo the information out: by default it is sent to stdout (standard output), which any SAPI handling web requests will send to the browser, something that is strongly inadvisable.

Instead, you’ll most likely either want to output the information to stderr (using fwrite()) or log the data using your application’s logging layer. It’s important that you execute as little code as possible, especially memory operations, as a segmentation fault indicates memory corruption that could have unpredictable effects on your application.
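A minimal sketch of a handler that writes to stderr instead of echoing might look like the following. Both formatBacktrace() and handleSegfaultQuietly() are hypothetical names of my own, and the sketch assumes that writing a short string is still safe at the point the handler runs.

```php
<?php
/**
 * Renders backtrace frames as plain text, one frame per line.
 *
 * @param array $frames Frames in the format returned by debug_backtrace().
 * @return string The rendered backtrace, without a trailing newline.
 */
function formatBacktrace(array $frames)
{
    $lines = array();
    foreach ($frames as $i => $frame) {
        // internal functions have no file/line information
        $file = isset($frame['file']) ? $frame['file'] : '[internal]';
        $line = isset($frame['line']) ? $frame['line'] : 0;
        $function = isset($frame['function']) ? $frame['function'] : '[main]';
        $lines[] = sprintf('#%d %s() at %s:%d', $i, $function, $file, $line);
    }
    return implode("\n", $lines);
}

/**
 * Segfault handler that reports to stderr rather than stdout.
 *
 * @param integer $sig The UNIX signal, will be SIGSEGV for a segfault.
 */
function handleSegfaultQuietly($sig)
{
    $report = "Segmentation fault (signal $sig), backtrace:\n"
        . formatBacktrace(debug_backtrace()) . "\n";

    // the php://stderr stream is used here because the STDERR constant
    // is only defined by the CLI SAPI
    file_put_contents('php://stderr', $report);
    exit(128 + $sig);
}
```

It would be attached in the same way as the handler above, with pcntl_signal(SIGSEGV, 'handleSegfaultQuietly').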

PHP 5.3.0 added an explicit way to handle signals. Previously dispatch was implicit: declare(ticks = n) instructs the interpreter to dispatch pending signals automatically after every n tickable statements. From 5.3.0 you can instead call pcntl_signal_dispatch() yourself. Whilst the explicit approach is more flexible and provides for better performance, it does have the potential to decrease the durability of your application, especially if you’re trapping SIGSEGV, because the application will continue to run until you call pcntl_signal_dispatch(), even though memory corruption has occurred.
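To illustrate the explicit style, here is a small sketch that uses the harmless SIGUSR1 rather than SIGSEGV. It assumes the pcntl and posix extensions are loaded, that no declare(ticks) is in effect and that asynchronous signal handling has not been switched on; recordSignal() and the $received array are hypothetical names.

```php
<?php
// signals seen by the handler so far
$received = array();

/**
 * Records that a signal was dispatched to userland.
 *
 * @param integer $sig The UNIX signal number.
 */
function recordSignal($sig)
{
    global $received;
    $received[] = $sig;
}

$before = null;
$after = null;

// only run the demonstration where the extensions are available
if (function_exists('pcntl_signal') && function_exists('posix_kill')) {
    pcntl_signal(SIGUSR1, 'recordSignal');
    posix_kill(getmypid(), SIGUSR1);

    // the signal is pending, but without ticks the handler has not run
    $before = count($received);

    // explicitly run the handlers for any pending signals
    pcntl_signal_dispatch();
    $after = count($received);
}
```

The gap between posix_kill() and pcntl_signal_dispatch() is exactly the window in which a real application would keep running after a trapped signal.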

Finally, a note about performance in PHP < 5.3.0: using declare(ticks = 1) tells PHP to check for pending signals after every statement it executes. Obviously this adds an overhead (albeit a small one) to your application. For that reason, it's probably wise to have this disabled in production environments.

Documenting Code Correctly

Wednesday, February 18th, 2009

One of the keys to writing good and portable code is clear and concise documentation. Most programmers have a tendency to dislike documentation – thinking of it as an ancillary task, to be written once the bulk of the programming has been done. This attitude yields low-quality documentation, which can cause some serious problems.

All but the smallest of projects have the problem of new programmers having to understand code that others before them have written, a task made harder when you encounter documentation like this:

/**
 * doSomething
 *
 * @param mixed $id
 * @param mixed $name
 * @param mixed $thing
 */
public function doSomething($id, $name, $thing)

If you’re wondering what’s wrong with this documentation then you definitely need to keep reading.

This is PHP code documented with PHPdoc. Of course, other languages/tools will differ, but anything based around JavaDoc will be similar to this.

How to write good documentation

First off, the format for a PHPdoc method doc-block is as follows:

/**
 * <short description>
 *
 * [<long description>]
 *
 * @param <type> <variable> <description>
 * @return <type> <description>
 */

Now, looking at our example above, we can see several glaring problems:

  • “doSomething” is the name of the method, not a short description. You don’t need to write the name of the method anywhere in the doc-block as PHPdoc can read it from the method signature.
  • There is no long description – while this is optional, non-trivial methods should always have a long description so that anyone can understand the method’s use without looking at the implementation code.
  • None of the parameters have any type information – while it’s possible that this method does accept mixed types for all the parameters, it’s unlikely. PHPdoc can’t infer the type from the method signature (unlike JavaDoc) so you need to specify the type here.
  • None of the parameters have a description – Parameter descriptions aren’t optional. If you include an @param tag, you may as well write a sentence that describes the parameter.
  • There is no @return tag – while this method may not return anything, many do. You should always document the return type and conditions of a method. For methods that have a more complex set of conditions on their return value you may want to elaborate in the method’s long description.

As it stands, the example I gave above provides no information about the method that couldn’t be inferred from the method signature. The doc-block might as well have been omitted entirely.

Good documentation verbosely describes a method’s use, its inputs, outputs and their conditions. Reading the documentation for a method should be all a programmer needs in order to understand what it does and how to use it. If you want to check your documentation, ask a friend or put yourself in the shoes of another programmer and read the documentation through without looking at the code.
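For comparison, here is a sketch of a doc-block that addresses each of the problems listed above. The ThingRegistry class, its register() method and the meaning given to each parameter are hypothetical, invented purely so there is something concrete to document.

```php
<?php
class ThingRegistry
{
    /** @var array Registered things, keyed by their numeric ID. */
    private $things = array();

    /**
     * Registers a thing under a unique ID and a display name.
     *
     * IDs are first come, first served: if the ID is already in use the
     * registry is left unchanged and the call reports failure, so callers
     * never overwrite an existing entry by accident.
     *
     * @param integer $id    The unique ID to register the thing under.
     * @param string  $name  A human-readable name, used for display only.
     * @param array   $thing The data describing the thing itself.
     * @return boolean true if the thing was stored, false if the ID was
     *                 already taken.
     */
    public function register($id, $name, array $thing)
    {
        if (isset($this->things[$id])) {
            return false;
        }

        $this->things[$id] = array('name' => $name, 'thing' => $thing);
        return true;
    }
}
```

A programmer can tell from the doc-block alone what each argument is for, what comes back and under which condition the call fails.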

When to write documentation

One of the main causes of bad documentation is that many programmers either fill in some documentation after they have written their code or omit it entirely.

It’s safe to assume that usually, when writing a method/function/procedure, programmers already know what they want it to do and how they want it to work. The best process I’ve found for writing a method is:

  • Think about what it will do and how it will work (or look at the design document)
  • Write the method signature (the declaration) – e.g. public function doSomething($id, $name, $thing)
  • Write the documentation, in full.
  • Write the method body – e.g. in PHP/Java/C# everything between { and }

Using this process you will have full documentation for the new method before you’ve even written it! What’s more, you can refer to your documentation while writing the body code, ensuring you get things right. This is especially effective in a test-driven development scenario where you are designing all your methods to pass a set of unit tests.

Documenting body code

Once you get used to the process, documenting methods/functions/classes becomes almost second nature. But there’s a separate art entirely to documenting body code – the code inside the method/function/procedure.

The trick to it is not to over-comment your code. Code that has every single line preceded by a comment is useless, especially when those comments look like this:

// check the result is not null
if (!is_null($result)) {
    // add result to the return array
    $return[] = $result;
    // increment the count
    $count++;
}

The problem with this is that the comments add no useful information beyond the code itself. In fact, every line of that code is “self-commenting”. It would instead be more useful to comment this block of code as a whole, to describe what it’s doing from a higher level:

// add any results
if (!is_null($result)) {
    $return[] = $result;
    $count++;
}

Of course, without the greater context of the rest of the code, this still doesn’t make much sense, but you can see that it is immediately more legible – there are no unnecessary comments cluttering up the code.

The trick to commenting body code is to divide it up into relevant chunks and comment the blocks that are not immediately obvious. Good candidates are loops and branches (if/while/for etc.) as they naturally group code into a block and usually perform one or two related operations.
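To give the earlier fragment some of its missing context, here is a sketch of a complete function commented at the block level. The collectResults() function and the idea that results come from a list of callbacks are assumptions made for the example.

```php
<?php
/**
 * Collects the non-null results produced by a list of callbacks.
 *
 * @param array $callbacks Callables that each return a value, or null
 *                         when they have nothing to contribute.
 * @return array A two-element array: the list of results that were kept,
 *               followed by how many there are.
 */
function collectResults(array $callbacks)
{
    $return = array();
    $count = 0;

    // run every callback, keeping only those that produced a result
    foreach ($callbacks as $callback) {
        $result = call_user_func($callback);

        if (!is_null($result)) {
            $return[] = $result;
            $count++;
        }
    }

    return array($return, $count);
}
```

One comment on the loop is enough here; the individual statements inside it explain themselves.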

This concludes my crash course on code documentation. Please leave any questions, suggestions and criticism in the comments.

Trust, Twitter and Passwords

Wednesday, February 11th, 2009

After we launched TwitOrFit we noticed a rather surprising backlash within the Twitter community. Many users were unhappy with trusting a web-based service, such as TwitOrFit, with their Twitter credentials – and rightly so – giving out your password is one of the cardinal sins of the online society we live in.

However, many people don’t seem to realise that without both their Twitter username and password, there’s very little external services can do with Twitter. And this is the crux of the problem – Twitter’s API requires you to send the username and password of the user with every API call.

On the surface, this doesn’t seem so terrible; after all, service providers may request the user’s password then throw it away after the API call has been made – but one of the important points to remember is that Twitter’s API is not located on an SSL secured server. So any calls made to Twitter’s API are made with the username and password in the HTTP header sent in plain text.
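To show why this matters, the sketch below builds the kind of header HTTP Basic Auth uses and then recovers the credentials from it, as anyone able to read the unencrypted request could. The username and password are made up for the example.

```php
<?php
// hypothetical credentials, for illustration only
$username = 'alice';
$password = 's3cret';

// what a client sends with each request when using HTTP Basic Auth
$prefix = 'Authorization: Basic ';
$header = $prefix . base64_encode($username . ':' . $password);

// what an eavesdropper does: strip the prefix and reverse the encoding
$encoded = substr($header, strlen($prefix));
list($seenUser, $seenPassword) = explode(':', base64_decode($encoded), 2);

echo $header, "\n";
echo 'Recovered: ', $seenUser, ' / ', $seenPassword, "\n";
```

Base64 is an encoding, not encryption, so without SSL the header offers no protection at all.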

fav.or.it was recently lambasted by a blogger for doing exactly this (I’m unable to locate the URL), and we are soon going to fix the problem by launching an update that will include an SSL secured user area. Still, it amazes me that such a high profile website has been totally overlooked when it comes to this gaping security hole.

The solution to the problem is simple: Twitter, and many other API providers, need to stop requiring the password to be sent with each API call. Instead they should adopt one of the many approaches designed to circumvent this problem, be it OAuth, a challenge/response system or even a full blown token-based session system.

My favourite approach is currently LiveJournal’s challenge/response system. From a developer’s point of view it’s very straightforward to use and very secure. In theory, it can also be used to allow users to revoke the permission of an application.
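The general idea behind a challenge/response exchange can be sketched as follows. This is a generic illustration rather than LiveJournal’s actual protocol: the function names, the use of SHA-256 and the HMAC construction are all my own assumptions.

```php
<?php
/**
 * Client side: answers a challenge without sending the password itself.
 *
 * @param string $challenge The one-time value issued by the server.
 * @param string $password  The user's password, known only to the client.
 * @return string A hex digest that is only valid for this challenge.
 */
function computeResponse($challenge, $password)
{
    return hash_hmac('sha256', $challenge, hash('sha256', $password));
}

/**
 * Server side: checks a response against the stored password hash.
 *
 * @param string $challenge    The challenge the server issued.
 * @param string $passwordHash The stored SHA-256 hash of the password.
 * @param string $response     The response received from the client.
 * @return boolean true if the response proves knowledge of the password.
 */
function verifyResponse($challenge, $passwordHash, $response)
{
    $expected = hash_hmac('sha256', $challenge, $passwordHash);

    // constant-time comparison, to avoid leaking how much matched
    return hash_equals($expected, $response);
}
```

Because each challenge is used once, a captured response cannot simply be replayed later, and the password never crosses the wire. In a real system the server would also need to generate unpredictable challenges and expire them after use.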

Twitter are supposedly working on an OAuth implementation with plans to drop support for Basic Auth six months after it goes live. Unfortunately, sources tell me that they’re making very slow progress, so we can’t hope for anything soon.

Update: Last night, approximately 30 minutes after I wrote this post, Twitter announced the opening of their private beta for their OAuth API. Apparently, my source got his information about 6 months ago… Is anyone aware of a sauce or condiment that goes well with my hat?

Moving to Wordpress

Monday, June 18th, 2007

I’m in the process of moving my blog over to Wordpress now that we’re with a new VPS provider.

I’ve mostly ported the theme and all the old posts, but unfortunately I don’t think I’m going to be able to post the comments.

The theme is pretty slap-dash at the moment. It hasn’t been tested in anything other than Firefox 2 on Linux so it’ll likely be a little broken, especially on ancient installations of IE.

Lies, deceit and Christians

Thursday, November 16th, 2006

The Evangelical Christian Union have this week filed a “Letter before Action” with the Students’ Guild of the University of Exeter.

My position within X-Net (for those who don’t know, the online student media for Exeter University) has forced upon me the pure facts surrounding the issue. This is essentially a good thing as it has meant that our News team has been able to cover the story without bias; however, other news websites have not been so fortunate.

Interestingly, it seems that almost all of the websites covering the story are Christian news sites. That is not particularly unexpected; however, it seems that almost all of the reporting of this story by Christian news sources either manipulates the facts or gets the story almost entirely wrong.

In fact, the most balanced and factually correct coverage I could find on this issue, beyond X-Net of course, was the BBC News coverage.

Not only are news sources getting the facts wrong, but the Exeter ECU themselves don’t seem to know what’s going on. During the recent referendum over the final name, the president of the ECU, James Harding, was interviewed by our News Editor, Kathryn Nott. Some of his initial statements were flat out lies; at one point he stated “It is a myth that the Christian Union are affiliated to the UCCF”. After phoning the UCCF, we confirmed that they most definitely are affiliated, to which we got an apology and an excuse: “I forgot that we were affiliated to the UCCF”.

Yeah, we believe you.

Couple this with the inaccurate reporting of various Christian news sites and I’m slowly coming to the conclusion that large Christian organisations will only promote virtues such as equality and morality when it benefits them. So, just like any other organisation really.

I for one sincerely hope that the Students’ Guild don’t buckle under the legal pressure being mounted on them by the ECU; after all, it should be fairly obvious to the Guild that the ECU don’t have a legal leg to stand on.

Moving to Mephisto

Wednesday, November 15th, 2006

I’ve now moved my blog over to Mephisto, a powerful new Rails blogging platform. Generally, it seems to offer a fair amount more flexibility than Typo and is a lot faster, although I need to use it for a while longer before drawing any real conclusions.

I very much doubt that I’ll get much on here in the coming months; University and, in particular, X-Net are keeping me very busy. Hopefully I’ll be able to write the odd reasonably insightful entry here and there, but frequent updates are a few months off at least.

I’m hoping to come up with a half-decent non-default design at some point in the next few weeks. Mephisto seems to make theming fairly painless, so it’s not something that’s likely to demand too much of my time.

In other news, I’ve discovered that our VPS host, Adiungo, has finally upgraded their virtualization software to allow the 2.6 series of Linux kernels to run. It’s about time.

Unfortunately, I don’t have the time to back up and reinstall everything on the server right now, a task made more complicated by the two people I share it with. Still, we may get around to it eventually; hopefully then we’ll be able to see the back of Fedora Core 2…