Should everyone learn programming?

2/16/2015 01:33:00 PM
I'm generally a big fan of the idea of introducing programming into schools as a core subject area, alongside reading, writing, and math. But my main worry is that if everyone knows programming, everyone will try to do it on their own, and do it poorly.

Consider SQL, for example. Currently none of the doctors in my division know SQL, and as a programmer I'm often one of the few people who interact with the database to get data to the doctors. Suppose we want to get a list of all the patients who aren't emergency patients or inpatients. An inexperienced programmer might write:
SELECT DISTINCT name FROM patients WHERE class NOT IN ('emergency','inpatient');
But that's wrong. This says get a list of unique names from the patients table where the class column has a value that is neither "emergency" nor "inpatient." What we really want is this:
SELECT DISTINCT name FROM patients WHERE class NOT IN ('emergency','inpatient') OR class IS NULL;
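Although the predicate "OR class IS NULL" might seem redundant because NULL is certainly not in ('emergency','inpatient'), it is still necessary. Comparing NULL to anything yields neither true nor false but "unknown," and a WHERE clause keeps only the rows where the condition is true, so without the extra predicate SQL will silently skip over rows with null class values.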
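To see the difference concretely, here is a minimal sketch that runs both queries against a throwaway in-memory database. It uses Python's built-in sqlite3 module rather than SQL Server, and the four patient rows and the helper fetch_names are made up for illustration, so treat it as a demonstration of the NULL behavior in SQLite and not as a statement about any particular production database.

```python
import sqlite3

def fetch_names(conn, include_null):
    """Run the query from the post, optionally with the extra IS NULL predicate."""
    query = (
        "SELECT DISTINCT name FROM patients "
        "WHERE class NOT IN ('emergency','inpatient')"
    )
    if include_null:
        query += " OR class IS NULL"
    query += " ORDER BY name"  # sorted only so the output is stable
    return [row[0] for row in conn.execute(query)]

# Hypothetical sample data: one patient has no class recorded (NULL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT, class TEXT)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?)",
    [
        ("Avery", "emergency"),
        ("Blake", "inpatient"),
        ("Casey", "outpatient"),
        ("Drew", None),  # None is stored as SQL NULL
    ],
)

naive = fetch_names(conn, include_null=False)
corrected = fetch_names(conn, include_null=True)

print(naive)      # ['Casey']
print(corrected)  # ['Casey', 'Drew']
```

The first query raises no error and returns a perfectly plausible list. Drew is simply missing from it.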

This arcane bit of programming trivia is not at all intuitive, nor really covered in any of the SQL tutorials or courses. It may not even be consistent across implementations of SQL--this is the behavior of SQL Server, and I really don't know if, say, Oracle or MySQL do the same thing. Yet if you don't know this, you will end up giving factually incorrect information to people.

Every programming language is full of landmines like these--arcane bits of programming trivia that are unintuitive, poorly documented, and hugely important. And much of the time these mistakes don't actually throw errors for us--that would be too kind--but instead return deceptively correct-looking but wrong data. (On the other side of my work, the same thing happens in statistics. The fact that your regression returned valid-seeming coefficients does not make them valid.)

I find these errors because a big part of my job is ensuring data integrity. I don't just find these issues, I hunt for them. I wouldn't expect, or even want, doctors and other researchers who use this data to run these queries themselves.

So, when it comes to teaching everyone some programming, we should be clearer about the objectives. We do want doctors and other important frontline workers to be able to think critically about data needs, database design, and inputting data in ways that minimize the potential for human error, and a general education that includes programming will help with that. But we don't literally want doctors writing SQL statements in between patient visits.
Nick Rowe 2/17/2015 08:12:00 AM
Matthew: totally off-topic, sorry, but I need your brain, because mine isn't up to it:

http://worthwhile.typepad.com/worthwhile_canadian_initi/2015/02/do-economically-illiterate-slobs-use-duality-theory.html