Monthly Archives: June 2007

Market research versus usability testing

Yesterday’s post about the ORG report earned me two interesting comments from James Gilmour. They spurred me into more research, in particular the Cragg Ross Dawson report he referred to in the first comment. I decided another post on the subject was in order; normal nonsense will resume in due course.

I’ve found two reports by Cragg Ross Dawson; I believe James referred to the ballot paper design research report, but the later STV ballot paper report also makes interesting reading.

As he said, these might not have been “focus groups” in the sense of a bunch of people sitting round a table munching on biscuits, but neither were they proper usability tests.

Usability testing is about asking people to perform tasks in as realistic a scenario as possible (no prompting, no helping, no detailed instructions in advance) and observing what they do, and whether they succeed or fail. It gives you objective results rather than the subjective feelings of the Cragg Ross Dawson reports.

Cragg Ross Dawson aren’t usability professionals; they’re a market research company. There’s a huge difference.

Some of my problems with their approach:

  • The ‘Topic Guide’ in the first report suggests that test users, after trying out a ballot, were asked questions such as “is it clear to them who and what they were voting for?” and “how clearly does it explain how to use the ballot paper?”. A true usability test observes the test users to answer those questions – watch, don’t ask. People are very bad at explaining this kind of thing, often to the point of self-delusion. They’ll say things were easy when observation showed they had significant problems. When asked why they did something, they’ll invent entirely spurious explanations (not maliciously, but because they were asked and a plausible answer just pops into their head).
  • It appears in this case that every test user tried every design of ballot, and then explained which one they preferred and why. This was a bad idea: from the second ballot onwards, they were more familiar with the process and thus biased. To get a fair view of which ballot design was easiest to use, each user should have tried only one design; the success rates of each design could be compared after the test was complete. (And then the best design could be modified and the test performed again with new test users to verify that the new design was better and not worse.)
  • Look at section C, ‘Outcome’. In a true usability test this section would summarise the success rates for each design of ballot. It doesn’t; it just reports ‘preferences’ for one design over another. It’s full of phrases like ‘regarded as’, ‘felt that’, ‘thought that’. Which design was most successful – helped most people vote for the candidate(s) they wanted to vote for? It doesn’t say!
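
For the curious, here’s a minimal sketch in Python of the kind of test I mean: each test user gets exactly one ballot design, and the designs are compared on success rate afterwards. The function names, design labels and participant IDs are all invented for illustration; nothing here comes from the Cragg Ross Dawson reports.

```python
import random


def assign_designs(participants, designs, seed=0):
    """Shuffle participants, then deal them out round-robin.

    Each participant ends up with exactly one design, so nobody's
    second attempt benefits from practice on the first.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {person: designs[i % len(designs)] for i, person in enumerate(shuffled)}


def success_rates(assignment, outcomes):
    """Return the fraction of successful votes for each design.

    `outcomes` maps participant -> True if an observer judged that the
    ballot was valid AND matched what the participant intended.
    """
    totals = {}
    successes = {}
    for person, design in assignment.items():
        totals[design] = totals.get(design, 0) + 1
        if outcomes[person]:
            successes[design] = successes.get(design, 0) + 1
    return {design: successes.get(design, 0) / totals[design] for design in totals}


# Hypothetical example: twelve test users, three ballot designs.
participants = [f"P{n}" for n in range(1, 13)]
assignment = assign_designs(participants, ["Design A", "Design B", "Design C"])
```

The outcomes are recorded by an observer, not reported by the participant: watch, don’t ask. Once the rates are in, the best design gets tweaked and the whole thing is rerun with fresh test users.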

I did dig out some actual usability data from the reports:

  • First report, section 2.1, “Initial impressions”: “on first sight of the ballot papers most voters looked initially at the list of parties and candidates; on the basis of observation by the moderators, few seemed to start at the top and read the instructions”. And that’s exactly what I would expect to see. It’s been proven time and time again: people don’t read instructions (there are always exceptions, but they are exceptions).
  • Second report, Chapter 3: “despite the view that the designs were straightforward, some respondents made mistakes; 13 out of 100 ballot papers were unintentionally spoiled”. Followed by “it is worth noting that of the 13 respondents who spoilt their initial ballot papers, 9 realised their mistakes and corrected subsequent papers – many admitted they had voted before reading instructions carefully”.

That second point is damning. People said that the designs were straightforward, but the reality was different. That’s why true usability tests are so important. The fact that people corrected subsequent papers just confirms my point above: from the second ballot design onwards, they’re biased. Not to mention that in a real election they don’t get a second chance to vote.

The goal of a ballot paper design is to allow voters to vote for the candidate(s) of their choice, and for that vote to be counted, as efficiently as possible. This is easy to test objectively, and to retest with improved designs, until there is sufficient confidence in the results. This wasn’t done. Market research isn’t usability testing.

In the actual election, we know that voters made marks on the ballot paper that were mostly, but not always, valid. How many people successfully voted for the candidate(s) of their choice? We have no idea.

Filed under Random

How monitor resolution nearly cost the SNP victory in Scotland, and other stories

The recent elections across the UK included a number of e-voting
and e-counting pilots. And for the first time, official observers were
allowed to attend.

The Open Rights Group called for volunteer observers in February and
has now released a report of their observations. You can guess the
overall summary: no confidence in the results.

I’ve skimmed the report; it makes scary reading.

It seems that few places were geared up for observers; in at least one
case an official observer was granted less access than the media. The
Electoral Commission stepped in more than once to guide the election
administrators.

In many places the software vendors appeared more in control than the
returning officer. There were unguarded PCs lying around with open
ports. There was no certification of voting equipment. A hodge-podge
of software was used, including programs with known unpatched
vulnerabilities.

In one e-voting pilot voters received a two-part receipt containing a
‘voting receipt’ – which seems to be a sixteen-character hex number –
and a ‘ballot signature’, which looks like a cryptographic hash. The
purpose of the receipt is to allow the voter to verify that their vote
was counted. But one pilot gave no instructions on how to do that.
Another pilot allowed people to check their receipt by downloading a
69-page PDF file which – I kid you not – appears to have been produced
by opening an XML file (with no stylesheet) in Firefox and printing to
PDF. The voter must search this PDF file for a line containing their
sixteen-character ‘voting receipt’ – something like this:

<ballot_id value="123456789abcdef0" index="123" />

This is, of course, mad.

There appears to be no way to check the ‘ballot signature’ hash, and
no clue as to why that even exists. And the file does not tell you
anything else: the location of the election, for example. It certainly
gives you no confidence that your vote was counted correctly.
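
To show how little the lookup involves, here’s a minimal sketch in Python of the search the voter is expected to do by eye. Only the `ballot_id` element and its `value` and `index` attributes come from the line quoted above; the `<ballots>` wrapper, the second entry and the function name are my assumptions, since I don’t know what the rest of the file looks like.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the published file. Only the shape of the
# <ballot_id> line is taken from the real thing; the rest is assumed.
SAMPLE = """<ballots>
  <ballot_id value="123456789abcdef0" index="123" />
  <ballot_id value="0fedcba987654321" index="124" />
</ballots>"""


def find_receipt(xml_text, receipt):
    """Return the index recorded against a voting receipt, or None if absent."""
    root = ET.fromstring(xml_text)
    wanted = receipt.strip().lower()  # tolerate stray spaces and upper-case hex
    for element in root.iter("ballot_id"):
        if element.get("value", "").lower() == wanted:
            return element.get("index")
    return None
```

Note what a successful lookup tells you: that a line with your receipt number exists in a file. It says nothing about which candidate the vote was counted for.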

Most publicity at the time focused on the problems with the Scottish
Parliamentary elections, in particular the large number of spoiled
ballots (which in 16 of the 73 constituencies was greater than the
majority of the winning candidate). The report is unsurprisingly
harsh here. Voters were given misleading and contradictory
instructions. The layout of the ballot papers didn’t match user
expectations (the regions appeared on the left, the constituencies on
the right – most people thought the constituencies more important, and
assumed they were on the left).

And despite advice by usability professionals, they didn’t perform any
valid usability tests on the ballot paper. Instead they presented a
set of sample ballots to a number of focus groups and asked for
opinions. This isn’t a valid usability test. And in any case, none of
the sample ballots had the constituencies on the left where people
expected them.

This was doomed to failure. As anyone with any usability experience
could tell you from a glance at the ballot, many people saw the large
text saying ‘You have two votes’, ignored the tiny text saying ‘vote
once in this column’ for each of the two columns – constituency and
region – and believed they could vote twice in the same column. And
that’s what many of them did.

A simple fix – two pieces of paper instead of one, with each one
saying ‘vote once’ – would have solved that problem. Still, it’s only
an election, usability doesn’t matter…

The election result in Scotland was close: the SNP emerged with 47
seats, Labour 46. But without a last-minute objection by an SNP
candidate at one count, Labour would have won. The reason? The
resolution of someone’s monitor.

It was the final set of results to declare: the regional seats for the
Highlands and Islands. The SNP were then two seats ahead, with seven
undeclared. One of the SNP candidates had been keeping an eye on the
count, and reckoned the SNP had about 35% of the vote. But when the
returning officer showed the calculated results to the candidates
before the official declaration, it showed Labour with four seats and
the SNP with zero – unlikely if the SNP had anywhere near 35% of the
vote. This would give Labour overall victory in the national election.

As the returning officer headed to the podium, the candidate
officially challenged the result. After some resistance the returning
officer agreed to show the workings (in the Scottish regional
elections it’s not a one-member-one-seat winner-takes-all system).

It emerged that the SNP’s votes hadn’t been included: the large number
of parties contesting the election meant that the SNP had scrolled off
the right of the Excel spreadsheet window (yes, that’s right). The
true result gave Labour three seats and the SNP two, and the SNP
gained control of the Scottish Parliament.
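
Here’s a minimal sketch in Python of how one missing column can flip a result like that. As I understand it, the regional seats are handed out by the d’Hondt method: each seat goes to the party with the highest list vote divided by (seats already won in the region, constituency seats included, plus one). The vote totals and constituency seat counts below are illustrative round numbers chosen to reproduce the pattern described above, not the official figures, and the function name is my own.

```python
def allocate_regional_seats(list_votes, constituency_seats, seats_available):
    """Allocate regional list seats one at a time by the d'Hondt method."""
    won = {party: 0 for party in list_votes}
    for _ in range(seats_available):
        def quotient(party):
            held = constituency_seats.get(party, 0) + won[party]
            return list_votes[party] / (held + 1)
        winner = max(list_votes, key=quotient)  # highest quotient takes the seat
        won[winner] += 1
    return won


# Illustrative assumptions only: NOT the official 2007 figures.
LIST_VOTES = {
    "SNP": 64000,
    "Lib Dem": 37000,
    "Labour": 33000,
    "Conservative": 23000,
    "Green": 9000,
}
CONSTITUENCY_SEATS = {"SNP": 4, "Lib Dem": 4}

# All parties included: Labour 3, SNP 2, Conservative 2.
full = allocate_regional_seats(LIST_VOTES, CONSTITUENCY_SEATS, 7)

# The SNP column "scrolled off": Labour 4, Conservative 2, Green 1.
visible = {party: votes for party, votes in LIST_VOTES.items() if party != "SNP"}
partial = allocate_regional_seats(visible, CONSTITUENCY_SEATS, 7)
```

With these made-up numbers the spreadsheet error hands Labour a fourth seat and the SNP none, which is the shape of the mistake the candidate spotted. A check that the vote columns sum to the total votes cast would have caught it.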

The returning officer was deeply apologetic. I bet.

The Open Rights Group report makes the point that many computer
scientists and related geeks and nerds, despite traditionally being
early adopters, are concerned about voting technologies. It recommends
that further e-voting and e-counting trials are suspended until more
research has been performed (and, unsaid, until politicians get a
clue).

Sadly I suspect that the only way to prevent a headlong rush into
e-voting hell is to engineer a major hack: an election apparently won
by someone who wasn’t even standing, with 110% of the vote.

But would even that work? The politicians would probably prosecute the
messenger and carry on regardless. As usual.

Rimini etc

As you may have seen from Lynda’s blog, she, I, Andy, Chef and Louise spent last week swanning around various Italian towns in the vicinity of Rimini.

The week in quotes:

  • “How many courses shall I have?” – by Chef, at most meals. We ate out every lunchtime and most evenings. The pasta was mostly nice, the pizza was relatively disappointing. As Chef was designated driver the meals were cheaper than they might otherwise have been. I once again disproved the sweeping assertion that I never have dessert by opting for Tiramisu rather than niente dolci on a number of occasions.
  • “You’re the navigator” – which actually meant, “you’re in the passenger seat”. Three in the back of our 4×4 was cosy so we rotated shotgun; Louise performed most of the actual navigational duties. By the end of the week we also had a roadmap, which was nice. Chef drove come un italiano once or twice. We were never lost, though occasionally we didn’t know where we were going. Italian road signs are, mostly, for entertainment purposes only.
  • “Just get in as quickly as you can”. Our pool was unheated. The wrong approach is to take it a millimetre at a time, moaning all the way. I just got wet as quickly as possible, often noisily, but effectively. Once submerged it was fine. More than a few seconds out of the water, though, and re-entry was most invigorating.
  • “More tat shops”. Sometimes I wish the tourist industry could be uninvented. Please exit church via bookshop. Mug with your name on it? These postcards already stamped for Europe! Would you like to buy a pistol or a samurai sword with your baby’s new bib? San Marino’s fancy bits are especially blameworthy.
  • “Cock-a-doodle-doo”. All day long, all across the valley in which our villa lay. Cheerfully, croakily, quickly, slowly, interminably.
  • “Do we need more wine?”. Yes, apparently.

I’ve created a Flickr group for all our photos (mine only at time of writing).

Wherein Avaragado pontificates about that logo

It’s received wisdom for UI design and many other disciplines: listen to your users but ignore what they say. People sometimes misunderstand this and come over all uppity, thinking it means “pretend to listen, but take no notice”. Ironically, by doing so they are themselves listening but ignoring, which I’ve just suggested is a good thing, but here it’s wrong, I tell you, wrong.

What’s the correct interpretation? That people are experts at telling you what they like or (more often) dislike, or wish they could or could not do, but useless at deciding what to do about it. They’re great at problems, but meh with the answers.

Oh, they’ll give you answers, answers by the bucketload. But following Sturgeon’s law, you can chuck at least 90% of them away without thinking. In fact, save the thinking and bin the lot. For a discipline like UI design, the suggested solutions tend to be unworkable, unusable, inappropriate, inadvisable, and other negative nancy words. To quote a certain forthright African, we’re outnumbered by morons. (By “morons” I do of course mean “well-intentioned individuals without sufficient domain expertise to judge the most appropriate solution”.)

But this post isn’t about UI design. It’s about the new London 2012 logo.

Who knew so many people cared about design? Or, at least, cared about £400,000-worth of design. On the one hand, I’m glad. On the other hand, it’s a shame that the logo is disliked so much, since it’s going to be in all our faces for the next five years (unless the organisers cave and change it). I’m not as vehemently negative about it as many people, but for the avoidance of doubt, I don’t like it either.

However, the armies of amateurs now rushing to produce their ten-second replacement logos are proving the “listen but ignore” mantra beyond all doubt. Because their “better” ideas aren’t.

It starts and ends with the client’s brief.

If I might go all horn-rimmed on yo’ asses for a moment, the client decides what message it wants the brand to convey. It sets the tone. It rules in and rules out. The agency sets its finest minds/interns on the project, presents some ideas (often two reasonable proposals plus a joke one that the client can reject out of hand to feel in control), and then they iterate until all parties are so fed up with the entire process that the client decides to go with whatever’s on the table when the deadline bangs on the door.

For London 2012, what was LOCOG’s brief to the agency, Wolff Olins? That’s key to this whole saga. I don’t know, but we can reverse-engineer it from the web site, the launch event, and so on. LOCOG wanted something bold not bland; representing more than just the few short weeks of the Olympic (and Paralympic) games themselves; representing more than just London; to be dynamic, modern, flexible and inspirational. Phrases like “everyone’s 2012”, “a Games for the next generation”, “reaching out and engaging young people”, and so on, abound in the press materials. And here’s something: “It’s not a logo, it’s a brand that will take us forward for the next five years” (Seb Coe). And this, from the BBC News web site: “It is a deliberate change from previous Olympic logos, which often feature an image from the city”.

And here’s something I don’t believe anyone has picked up on yet, from the London 2012 blog:

We have built a brand identity which has over 40,000 elements, which will evolve over the coming months and years in many smart ways … It’s not about the shape. It’s not about the colours. It’s about what we can do with it – there is a lot more to see, and you’ll see it soon.

When you look at the chosen logo, and compare it with the suggested replacements, you have to measure them all against the client’s brief. How well do they rate? It’s not just the aesthetic qualities, or lack thereof. Because if the client asks for something bold and you give them something bland, you haven’t done your job.

Let’s measure the chosen logo against the (apparent) brief.

  • Bold not bland: It’s definitely that. It’s amazingly daring for an Olympic logo; the only ones that come close are the psychedelic Mexico 1968 and Munich 1972 logos. In contrast, all logos since 1972 have been hewn from very similar rock. The London 2012 logo immediately stands out from these.
  • Not just the games: Absolutely. Notice how the logo does not say “sport” at all. The Olympic rings can be substituted with all manner of other logos (or no logo at all). The Paralympics to be held in London just a few weeks after the Olympics uses the same brand – here’s the 2012 Paralympics logo, with the Paralympics symbol in place of the Olympic rings. I can see events across the country piggybacking on this brand – I imagine LOCOG saying, “give us a tenner and a packet of crisps and you can shove your logo in”, that sort of thing.
  • More than just London: Again, absolutely. There’s no cliched London skyline, which makes the logo usable – possibly in altered form – across the country. Don’t forget that many Olympic events will take place outside of London (sailing in Weymouth, football all over the place).
  • Dynamic: Yes, though this is always hand-wavy. You can make anything dynamic by making it jiggle. Here the idea seems to be that the logo can break apart and reform, change colour, and so on. It always seems a good idea originally, but I’m sceptical that this is something we’ll see a lot of in practice; time will tell.
  • Modern: Hmm. Everything is a product of its time; nothing dates quite so fast as the future. To me and people of my advanced years, the logo is retro: 80s style, 80s colours. It has been compared to the Tiswas logo. Here I think they’re gambling, and I have a vision of a grown-up desperately trying to be hip wiv da kidz innit.
  • Flexible: Yes, it’s certainly that. Perhaps too flexible: all you need is the right colour combinations and a couple of jagged edges and you’ve got a cast-iron knock-off. See also ‘dynamic’: there’s so much you could do with this brand that you might end up doing very little to avoid diluting it too much.
  • Inspirational: Nope, sorry. It doesn’t inspire me. Well, it’s inspired me to write this long piece, but I’m doing so on my backside in front of Big Brother. I’m not sure that’s the inspirational effect they’re aiming for.

Measured against what I believe to be the brief, the logo holds up pretty well. Aesthetically, of course, I don’t think it works. But that’s a different matter (although valid).

How about the alternative logos that people have been creating? How do they measure up to the client’s brief? The BBC News web site lets you vote for your favourite 2012 logo from a small selection of reader submissions, plus the LOCOG-approved logo. Here’s what I think about them (you’ll have to look at the page, I’m not reproducing them all here):

  • Reader logo 1: Nice trick: “2012don” reading as “London”. If LOCOG wanted bland, they’d have chosen something like this. It’s not dynamic or flexible, nor is it inspirational. It’s all about London. Apart from the trick it’s just meh.
  • Reader logo 2: It’s the Union Flag in flames, by the look of it. Cheesy, nationalistic, boring. They’d never have picked this.
  • Reader logo 3: Breaks the Olympic rings (I don’t believe the IOC would allow this today, though they have done so in the past) and reminds me of the current Sky One/Two/Three logos. The lack of a nice arc for the ‘l’ of ‘london’ ruins the effect. But again, pretty bland and undynamic, just a bit of a trick (finding letters in the rings). Again, it’s all about London.
  • Reader logo 4: This kind of idea is the safe choice. But LOCOG didn’t want anything sporty. This logo fails miserably on that score. Again, another trick (finding a runner from the digits in 2012).
  • Reader logo 5: It’s a logo for a radio station. Next!
  • Reader logo 6: Clearly infringes Transport for London’s roundel logo, and thus is lawyer fodder. Screams London, but that’s not what LOCOG wants the logo to do (can you imagine that logo adorning Anfield for an Olympic football match?). Apart from that, wow, how bland can you get?

None of those logos fulfils the brief as well as the chosen logo does. (Predictably, Reader logo 1 is the most popular with voters, with Reader logo 3 in second place. People love a nice trick in a logo.)

What happens next? For what it’s worth, I think the (ugh) “brand attributes” are all desirable ones; but I don’t much like the end result, and nor, it seems, does the vast majority of the public.

LOCOG can either stand its ground or cave in. They’ll be nervous since they’re already under fire for budget escalations and everyone’s worried the Olympics are going to turn into Wembley times ten. They’ll hope that the fuss will die down, the brand and the logo will begin to seep into people’s brains, and the much-promised dynamic, flexible etc nature will start to win people over.

But I find it hard to believe that they’ll be able to ignore such a hugely negative reaction – if they listen to the people, they have to act.

Just as long as they ignore what the people are saying.

Strawberry Fair

Summer’s here again, for a couple of days at least, and Strawberry Fair had glorious weather. A scandalous amount of flesh was on show, colouring from eggshell white to sun-dried tomato red over the course of the day.

I popped along for a couple of hours in the company of Lynda, Andy, Chris and Melanie. Beer was drunk and photos were taken. Chris took some too, but they haven’t appeared yet; I believe they include me wearing a dodgy hat.

Highlights include Lynda queue-jumping at the toilets by claiming, falsely, that she was pregnant, and Chris’s DMs being used as a toilet by a passing dog.
