prog: (Default)
prog ([personal profile] prog) wrote2007-11-15 03:52 pm
Entry tags:

Editorial on SQL abstraction

They say that every Perl hacker's journeyman project is an HTML templating system. I'm learning about a variant, running into clients who each developed their own SQL abstraction modules before hiring me. Like every other SQL abstraction module, it really has just one purpose: an attempt to avoid having to embed one programming language (SQL) inside another one (Perl).

To this I say: Bleah. I used to agree with this sentiment, and for years used modules like Class::DBI, which treat tables as classes and rows as instances. Clever, and easy to work with! But now try doing a right join. Ha ha, no, I know: show me how do to a subselect with that. Yeah. Either you punt and shove raw SQL into your code anyway, or you insist on doing it "Perlishly" with loops and checks, in which case a query that should have taken a tenth of a second takes several seconds or more.

In fact, I'll wager that your code is already running at horrible efficiency because you're pounding the crap out of your DB with unneccessary loop-based SELECTs instead of carefully doing case-by-case SQL queries that get exactly the rows you need, each holding exactly the columns you care about, every time you call an information-seeking object method. I'll also bet your INSERTs are ass-slow because you're not using bind values in them. Actually, you're not using bind values anywhere, right? Yeah, see.

My friend, if you're going to work with a little raw SQL, I argue you're already blown it, and may as well just let it all hang out, throwing out your half-useful abstraction layer. There is no sin in openly acknowledging that you're using an SQL database by actually writing SQL. If the fastest way to get some particular information out of the DB is to write a crazy-long and baroque query, then you should do so. Let the database do the work it's optimizied for and stop treating it like a set of config files that you'll need to write your own logic around.

Another way to put it: If you're writing nested loops in your SQL-driving Perl, you're probably doing it wrong.

[identity profile] ahkond.livejournal.com 2007-11-15 09:00 pm (UTC)(link)
Absolutely. I've also found that people who do this kind of thing are afraid to write stored procedures and views that could encapsulate a lot of the logic. For instance, in many cases one can write a stored procedure that will accept a handful of parameters and then return the results of the implied query. Instead, they want to do everything "in one place" even if that means bad code.

[identity profile] prog.livejournal.com 2007-11-15 09:53 pm (UTC)(link)
This is good. I also agree (and it may not be clear in my post) that I do support using some kind of abstraction. But the way I'm currently doing things, that means little more than saying:

my $sth_for_some_specific_thing = get_statement_handle_for_that_specific_thing();
$sth_for_some_specific_thing->execute('bind_value1', 'bind_value2');


And then just packing all the raw-SQL-returning subroutines down in the basement of the source code, in clearly labeled boxes.
Edited 2007-11-15 21:54 (UTC)

[identity profile] dougo.livejournal.com 2007-11-15 10:33 pm (UTC)(link)
There's a Scheme library I've used a little bit that puts the abstraction around the SQL language, not the query. So instead of "SELECT foo FROM bar WHERE etc" you have something like "(select bar 'foo etc)" (or whatever), but the result is not the answer to the query, it's just the query string that you then send to the database (using a separate API). Maybe this is too much work simply to avoid SQL syntax, though.

[identity profile] prog.livejournal.com 2007-11-15 10:40 pm (UTC)(link)
The first abstraction modules I used were like that, too. And again, I'm sure it works fine until you want to do something hairier than a straightforward select/insert/update against a single table (or even multiple tables but with simple straight joins).

[identity profile] dougo.livejournal.com 2007-11-15 11:21 pm (UTC)(link)
Nope, join is trivial: (join <query1&rt; <query2&rt;). It's just not clear what you get out of it other than more parens and fewer UPPERCASE KEYWORDS.

[identity profile] dougo.livejournal.com 2007-11-15 11:22 pm (UTC)(link)
Oh, fooey, I'm a freeloader so I can't edit those rts into gts. Sorry about that.

[identity profile] daerr.livejournal.com 2007-11-22 03:16 am (UTC)(link)
That looks like a UNION not a JOIN. (Where you merge the result sets from two independent queries.) JOINs require logic telling your DBMS how the tables are related.

[identity profile] dougo.livejournal.com 2007-11-22 04:52 am (UTC)(link)
You're right, it would be more like (join [inner|outer|etc] <table1> <table>). The point is it's purely syntactic. It's basically the AST for the SQL grammar. (Is SQL even CF?)

[identity profile] chocorisu.livejournal.com 2007-11-15 11:44 pm (UTC)(link)
I thought the whole point of SQL is that it's an abstraction in the first place. I'm kind of reassured to hear you say that all that fancy-pants SQL abstraction crap is just confusing rather than useful... it's not like SQL is exactly *hard*.
ext_2472: (Default)

[identity profile] radiotelescope.livejournal.com 2007-11-16 10:52 pm (UTC)(link)
SQL isn't hard until someone takes your pile of I-developed-this-against-MySQL code and tries to run it with PostGres.

Then, you want an abstraction layer.

Also, you want an abstraction layer which will prevent you from ever having to think about quote-quoting again.