Don’t let humans near the data

One small problem with people is that they use computers. Particularly web applications. They’d never break if people didn’t use them. But there’s one thing worse than people using your web site, and that is people supplying data to put into the web site. I have spent the last 3 weeks entering over 2000 users into a database from numerous sources: lots of (mostly) Excel files with user data that first had to be transformed into an Access table and uploaded, with the data then moved into a number of tables on SQL Server. The job could have been relatively straightforward: just write a few scripts to move the data about and start feeding the data in. One small hitch though. Humans.

The problem is that the humans were told “Here is an Excel file. It is a template. Fill in the details in this template”. So they did. And then added a column or 2. Moved another column. Perhaps entered names in different formats: Bloggs, Jo or maybe Mr. J. Bloggs, or even Joe [new column] BLOGGS [new column]. And don’t get me started on the addresses.
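
For what it's worth, here is a rough sketch of the kind of clean-up script this forces you to write, in Python rather than whatever you happen to be using. It only copes with the three name formats mentioned above, and the function name `normalize_name` and the list of titles are my own inventions for illustration, not anything from the real job.

```python
# Hypothetical helper: squeeze the three name formats above into (first, last).
TITLES = {"mr", "mrs", "ms", "miss", "dr"}  # assumed list, extend as needed


def normalize_name(*cells):
    """Return (first, last) from one or more spreadsheet cells."""
    # Drop empty or missing cells so stray blank columns are ignored.
    parts = [c.strip() for c in cells if c and c.strip()]
    if len(parts) >= 2:
        # Already split across columns: "Joe" | "BLOGGS"
        first, last = parts[0], parts[1]
    elif len(parts) == 1:
        text = parts[0]
        if "," in text:
            # "Bloggs, Jo"
            last, first = [p.strip() for p in text.split(",", 1)]
        else:
            # "Mr. J. Bloggs": drop the title, last word is the surname
            words = [w for w in text.split()
                     if w.rstrip(".").lower() not in TITLES]
            if not words:
                raise ValueError(f"no name found in {text!r}")
            first, last = " ".join(words[:-1]), words[-1]
    else:
        raise ValueError("empty name")
    # Tidy trailing dots on initials and fix the SHOUTING.
    return first.rstrip(".").title(), last.title()


print(normalize_name("Bloggs, Jo"))
print(normalize_name("Mr. J. Bloggs"))
print(normalize_name("Joe", "BLOGGS"))
```

Note that `str.title()` will mangle names like McDonald, which is exactly the sort of thing the humans will send you next.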

Then, once you spend 2 hours modifying your scripts to fit each and every file you have just received, they send you another, and this time it’s partly the same data, but with a few modifications.

So you work at it every day, evenings, weekends, and still they complain about why it took so long (I should say that it was the people who sent the data who complained, not my colleagues/superiors, who were having their own problems with the humans).

So what’s the point of this diatribe against our flawed species? Well, it’s a lesson that you get told in programmer boot camp: your program’s only ever as good as the data that gets put into it. [still not with you – Ed]. My point is that you really have to limit, in every way possible, the data that gets entered. If you ask in a form “Which day of the week?” you’d be mad to give them a text box. The code to validate it, though not huge, would nonetheless be superfluous, since a drop-down select box offers the only seven choices that there are. Coming back to the data issue I started with, a 5-day task became a 21-day task because there was no data validation at all when these people filled in the Excel spreadsheets. Next time (because this will happen again in the future and could involve 3 times as much data) the Excel template will be sent with built-in data validation: some kind of VBA app maybe. It makes you go a bit nutty otherwise (see previous entry, and this rambling nonsense to boot!)
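
If the template can't be locked down and the files arrive anyway, the same "only seven choices" idea can at least be applied on the way in. The following is only a sketch in Python, not the VBA app mentioned above; the names `validate_day` and `find_bad_rows`, and the assumption that each row is a dictionary with a "day" column, are made up for the example.

```python
# The only seven choices that there are.
DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday")


def validate_day(value):
    """Return the canonical day name, or raise ValueError."""
    cleaned = value.strip().title()
    if cleaned not in DAYS:
        raise ValueError(f"not a day of the week: {value!r}")
    return cleaned


def find_bad_rows(rows, column="day"):
    """Return (row_number, value) pairs for rows that fail validation."""
    bad = []
    for number, row in enumerate(rows, start=1):
        value = row.get(column, "")
        try:
            validate_day(value)
        except ValueError:
            bad.append((number, value))
    return bad


# Hypothetical rows, as they might come out of an uploaded spreadsheet.
rows = [{"day": "Tuesday"}, {"day": "Tues"}, {"day": " friday "}, {}]
for number, value in find_bad_rows(rows):
    print(f"row {number}: rejected {value!r}")
```

Rejecting the whole file up front and sending it back with a list of bad rows is the point here: the person who typed "Tues" gets to fix it, rather than you.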


