SQL::Functional Cookbook: ANDs and ORs

It seems like it should be easy, but I was always disappointed with how other solutions handle arbitrarily nested ANDs and ORs. Most SQL creation libraries seem to start by adding support for a list of AND statements. At some point, the authors realize they need ORs, so they slap that in. Much later, they realize they need to mix ANDs and ORs, and then all sorts of convolutions get written.

With SQL::Functional‘s approach, nesting ANDs and ORs together is as natural as straight SQL. First, we’ll back up a few steps and demonstrate the ANDs:

Notice that unlike SQL, the AND is a prefix rather than infix. This might take some getting used to, but it does mean you can pass it an arbitrary number of statements:

In the final SQL, all of these will be joined together with AND. The OR statement works the same way:

If we need to mix the two together, we just do that:

Thus, the nesting falls naturally out of the system, just like it should be.

SQL::Functional Cookbook: Simple Select

SQL::Functional now has reasonably complete converge of common SQL statements. There’s always going to be something else to add, but the main thing it needs now is a cookbook of standard cases. Along those lines, I thought I’d start up a series of blog posts that can be compiled into a complete cookbook.

We’ll start with a basic SELECT statement.

One of SQL::Functional‘s strong points is in making easy jobs easy. Its other strong point, making hard jobs possible, will be covered later.

Let's break down what’s actually going on. The SELECT function takes a scalar for all the fields, so we have to pass it as an arrayref. FROM takes a list of tables, which we might do for joins, but we'll just pass it one for now. WHERE takes a list of clauses.

Ideally, we could say something like:

But that would require deep Perl voodoo (e.g. source code filters), so we content ourselves with the match function to build the same thing. It also helps us here by having the matched data passed as a placeholder.

In the end, $sql will contain:

With @sql_params containing the data for all the placeholders. We can run this through DBI like any other statement:

Easy.

New module: SQL::Functional

If you have an application with simple database needs, writing the SQL strings directly in the program works fine. If it’s bigger, an Object-Relational Mapper like DBIx::Class starts to make sense. I find the additional boilerplate of an ORM isn’t worth the effort for simple needs, but it doesn’t take much more for the overhead to be worthwhile.

Then there’s another point beyond that where the ORM doesn’t know how to effectively and efficiently run a request. When you get there, it’s time to write raw SQL again, perhaps giving the result set back to your ORM so you can continue on as if nothing happened.

How do you write that SQL? Probably with strings again, or a helper library like SQL::Abstract.

The problem with raw strings is brought up in Epigrams in Programming:

  • It’s difficult to extract sense from strings, but they’re the only communication coin we can count on.
  • The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information.

The nature of our database problem is that we’ve hit on a very complicated case, and now we’re stuck using the most ad-hoc way of structuring data. Not a good combination!

SQL::Abstract tries to address this by providing a structured way of building a string. We still have to output a string (” . . . they’re the only communication coin we can count on”), but at least we don’t have to deal with it directly. The problem is that, again, we’re getting to the most complicated cases of SQL, and SQL::Abstract abuses syntax tricks for these cases. Consider this case I pulled out of its docs:

Why are we taking references to scalars? And references to arrayrefs, for that matter? Could you copy-and-paste this to a DBA and explain what’s going on?

When you drill down a bit, the reason for these issues is that SQL::Abstract uses an Object Oriented interface to build this string, and OO is just not a good fit for this problem.

On the other hand, the problem can be solved more cleanly by combining functions that look vaguely like SQL:

This is exactly what SQL::Functional does. Version 0.1 is now up on CPAN.

Language Community Litmus Test: Database Placeholders

Search for “[language] database tutorial”. Out of the top 5 results, how many of them show how to use safe practices (preferably placeholders, but safe quoting functions are OK, too) at the earliest available opportunity?

I award a score of 5 points for the first result, 4 for the second, and so on. A perfect score would be 15. If the first INSERT or SELECT statement has simple static data, I give it a pass and find the first statement that’s filling in data from variables.

  • Haskell: 15
  • Python: 12
  • Perl: 11
  • Ruby: 11
  • Node.js: 4
  • PHP: 4

There are some caveats with the judging of this data:

  • Haskell’s interfaces seem to use declarative functions to build the SQL statement rather than concatenating strings. I’m presuming these libraries quote things safely.
  • The third result for Python was to a wiki page that linked to various database-related info. Many of the links had placeholders and quoting mentioned early on, so I gave it points for this one.
  • The last two results on Python were to a Stack Overflow question without many direct examples, and a YouTube vid that has no dynamic statement usage. No points for these.
  • Perl’s second link was to an About.com (??!) tutorial with no placeholders/quote functions shown. I think we in the Perl community might need to fix our SEO on this one.
  • Ruby’s second result was to a general Rails tutorial without much direct SQL access. No points here, but like Perl’s About.com result, this may be more of a case of bad SEO.
  • Tempted not to give points to Ruby’s third result, which took a while to get to a statement with any dynamic data. But when you got there, it did do it safely, so I gave it a pass.
  • Edit: I initially didn’t give Ruby points for the fourth link. It was a short page that had no complex SQL statements. I see now that it’s one part of a larger document, and the subsequent link does use Ruby ActiveRecords. Ruby now gets points for this.
  • Fifth Ruby result was a link to the Ruby/DBI homepage. Clicked on “tutorial” sidebar link, which did show best practices, so I gave it points.
  • This test is perhaps unfair towards Node.js, because most of its top results covered MongoDB rather than an SQL database. While I have mean things to say about NoSQL, critiquing it wasn’t the intent of this test. The second link does use SQL, and does things correctly, so points awarded there.
  • All but the second PHP result concatenates $_POST[‘…’] or $_GET[‘…’] directly into statements in its earliest examples. The second result is trying desperately to show how things should be done, and is the only one to get points. One bad result can be passed off as an SEO issue. All but one is a real problem.