June 29, 2009
july 6 at vaudeville mews

Filed under: music
No Comments

I've mentioned before how no ORM is perfect, and how the best frameworks give you the option of going around the ORM when you need to. As great as OO and MVC are, neither is a panacea. I don't care how frowned-upon it is to use custom SQL in a CakePHP project; practical stuff-that-works beats design dogma every time. Today (and yesterday) I ran into just such a situation.
I was working on a CakePHP-based web app when the client wanted a particular report added to the site. The report needed to break down counts and sub-totals of revenue, for a given date range, according to attributes that belonged to the line items, the objects being paid for, and the users themselves. The report definitely needed to be based on the line items. The line items belong to (CakePHP's "belongsTo") the payments and the things being paid for (licensing applications, as it turns out), and the payments and license applications both belong to the users.
It was bad enough that I had to make CakePHP's model layer do an aggregate/GROUP BY query to get the counts and totals I needed (preferable to looping through payment items and totaling them in PHP). This is clearly a case where the functionality I needed was more relational than OO, so I was already trying to hammer a square peg into a round hole (something I seem to do a lot of). Where the trouble really started was getting CakePHP to pull in the users along with the line items. CakePHP has no relationship along the lines of Rails's "has_many :through", but it does have "recursive" and "Containable." Unfortunately, these aren't smart enough to pull in the two-relationships-away data by joining the table into the query; instead, CakePHP's crummy not-quite-ORM actually recurses through the models and pulls those in with another query. So the query for line items (joined with their payments and their applications) brings in, let's say, 1200 rows (typical for a month); then, once CakePHP has all the user IDs from all the payments that all those line items belong to, it sticks all their IDs in a list and fires off a big "where id in ([900 or so numbers here])" query. Which, as you can imagine, takes way too long, causing PHP to puke and die because the process doesn't complete within 30 seconds. So the "proper" CakePHP method of effecting this relationship turns out to be too inefficient for this case.
So, I had to do a little custom SQL. An upside to this is that I at least have tighter control over the query (I can narrow down the fields returned with Containable, but like any time you try to do something a little bit advanced with a query in CakePHP, it gets a bit verbose and ends up actually less readable than straight SQL). After a few minutes of fiddling at the MySQL command line, I came up with this:
SELECT payment_items.product_type,
       license_application_types.trade,
       license_application_types.license_type,
       (licensees.personal_state = 'IA') AS in_state,
       sum(payment_items.amount) AS fees,
       count(*) AS count
FROM payment_items
JOIN payments ON payment_items.payment_id = payments.id
LEFT JOIN license_applications ON payment_items.thing_id = license_applications.id
LEFT JOIN license_application_types ON license_application_types.license_application_id = license_applications.id
LEFT JOIN licensees ON license_applications.licensee_id = licensees.id
WHERE payments.approved = 1
  AND payments.created >= ?
  AND payments.created <= ?
GROUP BY product_type, trade, license_type, in_state
ORDER BY product_type, trade, license_type, in_state
This gave me pretty much what I wanted, minus all the sub-totals for different combinations of subsets of the attributes of trade, license type, and in-state/out-of-state. I got back one row for each possible combination of all those attributes though, as well as for every kind of additional fee that showed up on payments but isn't a license application fee itself; and for each row, I got a count of line items and a total of dollars paid for that kind of item.
Now I had to figure out how to come up with all the sub-grand-totals that the report needed to have: for all out-of-state journeyman applications, all master plumber applications, etc. I could have re-run queries like this one, each narrowed down by adding some extra ANDs to the WHERE clause, but I thought I could probably get by with looping through this data and totaling up the columns that meet each set of criteria.
In Ruby, this would be a cinch — I'd just call select() on this data a few times, each time with a different block to pick out the subset I wanted to total up, then use sum() or inject() on that. PHP, however, being the COBOL of the Web, doesn't really have the first-class functions that enable things like select() and inject(). It has some things like array_map and array_filter that take "callback" arguments, but you pass those by giving the name of the function, and if needed, the class or object it's defined on. So I'd have to actually clutter up my class with a bunch of little methods, one describing each subset I need to total. Ugh.
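To make the complaint concrete, here is a minimal sketch of the named-callback style described above. The class name, method names, and sample rows are all made up for illustration; the only real API used is array_filter() with an array(object, 'methodName') callback.

```php
<?php
// Hypothetical report helper: each subset needs its own named method,
// because array_filter() is handed its callback by name.
class ReportSubsets
{
    // True for rows describing master-level license items.
    public function isMaster($row)
    {
        return $row['license_type'] === 'master';
    }

    // True for rows paid by in-state licensees.
    public function isInState($row)
    {
        return (int) $row['in_state'] === 1;
    }

    // Sum the 'fees' column over the rows accepted by the named method.
    public function totalFees(array $rows, $methodName)
    {
        $subset = array_filter($rows, array($this, $methodName));
        $total = 0;
        foreach ($subset as $row) {
            $total += $row['fees'];
        }
        return $total;
    }
}

// Made-up sample rows in flat rows-and-columns form.
$rows = array(
    array('license_type' => 'apprentice', 'in_state' => 1, 'fees' => 400),
    array('license_type' => 'master',     'in_state' => 0, 'fees' => 500),
    array('license_type' => 'master',     'in_state' => 1, 'fees' => 250),
);

$report = new ReportSubsets();
$masterFees  = $report->totalFees($rows, 'isMaster');
$inStateFees = $report->totalFees($rows, 'isInState');
```

Every additional subset means another little method on the class, which is exactly the clutter in question.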
But I figured that, given the usual array-of-associative-arrays "recordset" of data you get back from a SQL database, it would be pretty trivial to write a function that filtered it according to an associative array of conditions: pass it the data set and an associative-array "condition" such as "array('trade' => 'hvac')," and it would return the rows for which every element in the condition matched. This would be great, if only CakePHP returned data in the array-of-associative-arrays recordset format we usually get from SQL databases. But alas, CakePHP feels the need to map your data into a pseudo-ORM tree and return you junk like this:
Array
(
    [0] => Array
        (
            [payment_items] => Array
                (
                    [product_type] => application
                )
            [license_application_types] => Array
                (
                    [trade] =>
                    [license_type] => apprentice
                )
            [0] => Array
                (
                    [in_state] => 1
                    [fees] => 400
                    [count] => 8
                )
        )
    [1] => Array
        (
            [payment_items] => Array
                (
                    [product_type] => application
                )
            [license_application_types] => Array
                (
                    [trade] => rapping
                    [license_type] => master
                )
            [0] => Array
                (
                    [in_state] => 0
                    [fees] => 500
                    [count] => 2
                )
        )
    [.... and so on ...]
)
Which rather complicates matching against a simple associative array of column_name => value. I'd have to either nest another loop or figure out some way to recurse it. Eeeew. I tried a more "procedural" solution, coming up with all the totals by looping through the dataset and adding the values from each row to every running total they applied to, but that ended up being the worst pile of nested-loop confusion I've ever written. The code to initialize all the totals to 0 alone was longer than the data being processed, and I still couldn't get it to work right, owing to the complexity.
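For reference, the condition-matching filter described above is short when the data really is a flat recordset. This is only a sketch under that assumption: filterRows() and sumColumn() are hypothetical names, the sample rows are invented, and values are compared as strings because database drivers commonly hand everything back as strings.

```php
<?php
// Hypothetical helper: keep the rows whose columns match every
// column => value pair in $conditions (compared as strings).
function filterRows(array $rows, array $conditions)
{
    $matches = array();
    foreach ($rows as $row) {
        $keep = true;
        foreach ($conditions as $column => $value) {
            if (!array_key_exists($column, $row)
                || (string) $row[$column] !== (string) $value) {
                $keep = false;
                break;
            }
        }
        if ($keep) {
            $matches[] = $row;
        }
    }
    return $matches;
}

// Hypothetical helper: add up one numeric column across a set of rows.
function sumColumn(array $rows, $column)
{
    $total = 0;
    foreach ($rows as $row) {
        $total += $row[$column];
    }
    return $total;
}

// Invented sample data in flat rows-and-columns form.
$rows = array(
    array('trade' => 'hvac',     'license_type' => 'master',     'in_state' => '1', 'fees' => 300, 'count' => 3),
    array('trade' => 'hvac',     'license_type' => 'journeyman', 'in_state' => '0', 'fees' => 200, 'count' => 2),
    array('trade' => 'plumbing', 'license_type' => 'master',     'in_state' => '0', 'fees' => 450, 'count' => 2),
);

$hvac             = filterRows($rows, array('trade' => 'hvac'));
$hvacFees         = sumColumn($hvac, 'fees');
$outOfStateMaster = filterRows($rows, array('license_type' => 'master', 'in_state' => 0));
$outOfStateFees   = sumColumn($outOfStateMaster, 'fees');
```

Each sub-total on the report then becomes one condition array instead of one more callback method.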
There's got to be a way to get just the rows-and-columns data, right? CakePHP is bound to have a "just act like a vanilla SQL data store just this once" escape hatch somewhere, especially given how thin and anemic its ORM (more of a Table Data Gateway combined with an OAAM (object-associative-array-mapper)) is to begin with — right?
Well, I just spent my morning digging through CakePHP's database-centric naughty bits looking for the method that gets the row-and-column data that then gets passed to a rearrange-into-a-tree method. No such luck. No such separation of concerns between fetching the records and re-structuring them was to be found; the two tasks, with rather different purposes, were being done in the same method. Even a method called fetchRow() returns stuff that is no longer recognizable as a "row." Each Dbo* class for each type of database CakePHP supports re-implements the wannabe-object-graph re-arranging right there in the fetchResult() method, immediately after calling mysql_fetch_row() or whatever the lower-level database call is. There is no option to be found anywhere, no parameter that can be passed nor class variable that can be set, that turns this re-arranging off.
So instead, I had to write a method to undo it. I packaged it up with my previously-described filtering method into a Component called RawReportComponent. The better part of a workday wasted trying to find a work-around for some "magic" that's not all that great to begin with (ooh look, we do you the favor of massaging array data according to what table it's from!). I'd love to share this undoWannabeORM() method with the "community," but since it's code I wrote for work I probably shouldn't. You could probably write it yourself, with three nested loops like I did. The real pity is that anyone should have to.
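A rough sketch of what such an un-nesting step might look like follows. This is not the undoWannabeORM() method mentioned above, just an illustration with a hypothetical name (flattenRows) of the three-nested-loops idea, fed with the first two rows of the dump shown earlier. It assumes column names are unique across tables; if two tables share a column name, the later one overwrites the earlier.

```php
<?php
// Hypothetical stand-in for the un-nesting step: merge each row's
// per-table sub-arrays into one flat column => value array.
function flattenRows(array $nestedRows)
{
    $flatRows = array();
    foreach ($nestedRows as $nestedRow) {               // each result row
        $flat = array();
        foreach ($nestedRow as $columns) {              // each table (or 0) group
            foreach ($columns as $column => $value) {   // each column in the group
                $flat[$column] = $value;                // later duplicates overwrite
            }
        }
        $flatRows[] = $flat;
    }
    return $flatRows;
}

// The first two rows of the nested result shown earlier.
$nested = array(
    array(
        'payment_items'             => array('product_type' => 'application'),
        'license_application_types' => array('trade' => '', 'license_type' => 'apprentice'),
        0                           => array('in_state' => 1, 'fees' => 400, 'count' => 8),
    ),
    array(
        'payment_items'             => array('product_type' => 'application'),
        'license_application_types' => array('trade' => 'rapping', 'license_type' => 'master'),
        0                           => array('in_state' => 0, 'fees' => 500, 'count' => 2),
    ),
);

$flat = flattenRows($nested);
```

After this, each element of $flat is a plain column => value row, which is the shape a simple condition-matching filter expects.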
Filed under: computing, php
2 Comments
I'm seriously rethinking my decision to use Geeklog for the Why Make Clocks site. I liked how most of the admin interface made sense right away: you have static pages and stories, not vague abstractions like "items." But I've had a lot of little annoyances and difficulties with it since. Most of them I've managed to overcome, even if it meant hacking a little bit of code. But last night it came to a head when I decided to test out comments and came across this bit of text above a user-registration form:
Creating a user account will give you all the benefits of Why Make Clocks membership and it will allow you to post comments and submit items as yourself. If you don't have an account, you will only be able to post anonymously. Please note that your email address will never be publicly displayed on this site.
"All the benefits of Why Make Clocks membership"? Uh no, registering via this web form won't get you half-price beer at the gig and you certainly don't get to stand up on stage with us and sing.
"Allow you to … submit items"? Um no, this is our site, we submit the items. I thought this was the registration to add comments, not to submit regular site content. Of course, you can configure user accounts to only be able to submit comments, not stories, but then that just means that this verbiage would become inaccurate and misleading.
This even echoes an earlier problem I had in which, despite this supposed requirement to create a login account to post a comment as anything other than "Anonymous," the Geeklog calendar always displays a link for visitors to add events. I was using the calendar to put up our shows, radio appearances, album release dates, that kind of thing. It doesn't make the least bit of sense to have just any visitor who shows up be able to, or even think they're supposed to be able to, put stuff on our calendar.
So I decided to try to figure out where this bit of text was and change it to something that makes more sense. Turns out it's tucked away in a file in Geeklog's core code along with the same statement translated into a couple dozen other languages. I don't know all these languages, so I can hardly be expected to take the text I want and translate it into them. Anyway, I don't need all these languages. We're a regional band from the Midwest; English will do fine. Once we start touring Europe or Japan, maybe then we'll worry about internationalization. Of course, I can turn this feature off in the configuration…
I mean, I understand some of what this "let Joe Random submit content" feature might be for, but it's not a feature we want and there's no good way to completely turn it off. Even if you configure it that way, there's still links and verbiage on the site that tell people they can submit items, and you have to go grubbing around the theme code and the Geeklog code itself to get rid of these. I do not need to allow, nor want to deal with, publicly-submitted content other than comments, and I'm not running a company where I have a bunch of subordinates submitting content either, so I don't care about being able to review their submissions and approve them. But it's a ball of features I end up having to configure and generally dick-around-with anyway.
There's also things like the swear filter and HTML filter — I wanted to embed a YouTube video in a story and ended up having to dig through the configuration to figure out how to get it to stop stripping out the HTML code. And this is rock and roll, not a buttoned-down corporate site, so censoring cuss words would, if anything, hurt our image, making us look like uptight prigs. Sure, I can turn off or configure these features too…
But that's just it — the dizzyingly large and poorly organized collection of configuration knobs in this thing. It's a tall order to expect me to pick through all of them (especially when I'm not even sure what some of them do) to turn off stuff I don't need.
See, Geeklog suffers from the problem a lot of open-source CMS solutions have which is that it tries to have all these features that are oriented to the Enterprise. This ignores the obvious fact that most web sites, indeed most organizations, start small and then grow. So all these options oriented to larger organizations right from the start, they get in your way. It's more cruft that you have neither the time nor the need to think about and configure. I don't want to have to think about i18n until I actually need it. I don't want to have to think about assigning my users to groups and different fine-grained permissions to those groups when my users are me and the two other guys in the band, or worse yet, when it's just myself. It would make much more sense to be able to turn on, or add through plugins, these kinds of advanced features as they become needed. Let the corporations who can afford to pay people to write up process documents about what configurations to make deal with all this shit. Let the little guy keep it simple. They call it "Geeklog" but I think a better name might have been "Suitlog."
What makes matters worse is that solutions like this seem not to be able to completely decide or commit to whether they truly want to be enterprisey or not — they include just enough enterprisey features to annoy you, get in your way, and give their developers something resume-building to work on, but either totally leave out, or provide rather bad implementations of, features that companies might actually consider crucial — like being able to easily migrate your site to a new server, so that you can set up a site in a testing environment and get the theme and configuration nailed down before moving it to a live public-facing server.
That's right, I tried to be all professional and, rather than kill off the existing Why Make Clocks site and potentially have people run into an in-progress version of the new site, I chose to first try to build the site on my local machine, then put it up on another hosting account where Dan could look it over and try it out, then put it up on the real site once we decided it was ready. Development-Testing-Production. Man, what a nightmare that was. All the internal configuration, including a dozen or so file paths, is kept in the database, so if you back up the database and restore it on some other server, then edit the db-config.php file, you're still only about 5% of the way to having the thing actually work. There's a migrate.php script you can put in, but it only changes a few of the paths and misses a bunch more. After much fumbling over a period of some weeks, I had finally arrived at this process:
So anyway yeah, I'm back to looking at shoehorning a static-page-nav menu into the Old Standby, Wordpress.
There's a larger point about software design to be taken from this, though.
Filed under: computing, music
No Comments
Back in the old hometown yesterday to tend to my house and visit some folks, and happened to spend a little time at the annual My Waterloo Days festival, featuring a concert by Great White. Putting aside the "Waterloo: Where It's Still 1988" aspect of that particular choice, I have to share an interesting observation about the beer tent, located a short distance from the stage. A single $3 beer ticket gets you one of those 16-oz plastic cups of beer they always sell at these things, but get this: the selection of beers included Blue Moon, Boulevard, Fat Tire and Sam Adams. WTF? When did this start happening? Up through just as recently as last year about all you could get at these things was Bud Light and Miller Lite, and now all of a sudden, for three lousy bucks, you can get a cup of Fat Tire, which only a few months ago bars were selling for like $6.50 a bottle. Damn. How the hell did that happen? Recession my ass!
Filed under: Uncategorized
1 Comment
Made a new web site for Why Make Clocks. The old site was hand-rolled HTML painstakingly maintained by Dan using a basic "file manager" web form in the hosting company's admin panel (so his development environment was basically a <textarea> — ouch). He asked me to come up with something better, and while I don't have the design chops to come up with something hugely better from a visual standpoint, I knew I could improve on it from a functional standpoint, and could probably do it without having to write too very much custom code.
One thing I've noticed over the years with respect to band/musician web sites is that they have no trouble getting a really visually striking design that really jibes with their image and sound (musicians and artists tend to hang around the same circles, after all), but all too often what they put up is a beautiful but totally static brochureware thing that they end up not being able to keep up to date, and thus they miss out on a chance to engage with their fans, or potential fans, through it.
Obviously bands have got the idea that they can do more on the web than some static info pages that never update. Far too many bands are basically using their MySpace profile as their website, and it seems like you see more and more of them starting up blogs on things like Blogger or Wordpress these days, because these are the best options available to them if they don't have a techie on board. But there are some pretty serious downsides to them. You've got pretty limited control over the design, and on MySpace, your music and message have to fight with pop-up ads for weight loss products and various web-based scams for people's attention. Your MySpace page isn't really yours in any real sense. This is a state of affairs that I've often thought I could potentially carve out a little career for myself rectifying. In what form though, I'm not sure. Creating Yet Another Social Network Site For Bands seems like it's already been done dozens of times and the results have been consistently underwhelming.
But anyway, back to the project at hand. I built the site on Geeklog. I've had rough luck with content management systems in the past; the ones with the really sweet features are usually so abstract that it's hard for normal people to just jump in and start writing stuff, unless you pay some guy $10 to download a book he wrote that you won't have time to read. Unless your enterprisey employer is forcing you to. Geeklog's admin menu made sense right away, and what's more, it's got Iowa roots. My "design" leaves a bit to be desired yet; in fact, I could use suggestions. If you have any, or would like to offer your services as a web designer who's good with hacking Geeklog themes, feel free to comment here.
Filed under: computing, music
2 Comments
Stop glomming on more and more recipients and CC-recipients in email discussions. Just don't do it. You think you're doing people a favor, "getting everyone involved in the discussion," by throwing every conceivably remotely interested party into the To: or CC: line. You're not. What you're doing, rather, is making the naive assumption that you can just rope six or so otherwise-unacquainted people into an interaction together and they're all going to just hold hands and sing kum-ba-ya. You're also tempting people to hit "Reply All" in the interest of honoring your decision about who needs to be "in the loop." But the likelihood of overlooking specific recipients grows exponentially with the number of recipients, and pretty soon somebody is going to say something, or say it in a particular way, that's meant only for certain people, and somebody's going to get all bent out of shape.
It never ceases to amaze me how otherwise mature, experienced people, myself included, continue to fall into this well-worn trap.
Filed under: business, computing, goings-on, news & politics
2 Comments
When I moved this blog to Dreamhost, it was a decision more about features than about uptime/reliability — after all, Dreamhost has had their share of complaints in that area — and so far it's been Good Enough, and indeed gotten better, in the time I've been here. But the main reasons I moved were support for Rails (which I still have yet to take advantage of), support for PHP5 without .htaccess hacks, SSH shell access (my old host had this too, but point is, I didn't want to give it up), and a better-designed web control panel.
Anyway, as I recently joined the band Why Make Clocks and am a web applications developer by day, the job fell to me to come up with a new web site for the band. Until now, the site has been mainly static HTML that Dan updates by editing the files in the "File Manager" area of the control panel for the hosting account, which I think he's had since like '99, and which he gave me the password to in order to work on the site.
It turns out the host doesn't support SSH access. I personally consider this a crucial feature, but admittedly I haven't done a ton of shopping around for hosts… a little asking around at work and I'm told that apparently this is really quite common. The majority of shared-hosting joints don't provide SSH, preferring their customers to work on their site through a combination of FTP and a usually-poorly-designed web interface. Not only does this suck, it strikes me as less secure. At least, for me and my data…
Does anyone know why most hosting companies do this? I wasn't aware it was so common.
Update/clarification: by shells, I mean to include jailed shells. Just something that lets me manage the files and databases on my little corner of the server, not something that lets me go tromping all over the server willy-nilly, obviously; I understand the potential problems with that.
Filed under: computing, news & politics
No Comments
I've been threatening to kill off this blog at various times for possibly as much as a year now. You all probably think I'm a total tool because of it, or perhaps more likely, for not following through.
I think I'm in need of a re-branding, though. On the one hand, I don't want to have to maintain too many different blogs/sites, on the other, it gets really incoherent if I mix up too many topics or kinds of content. Somebody always ends up complaining about what I choose to write, how I choose to organize it, how much I choose to say about one thing or another. Usually my wife.
Maybe because she's the only one who actually gives a shit.
I'm already pretty determined to start a different site for music projects, since I would desire such a site to have a lot of features that this format doesn't lend itself well to. The kinds of things that need more of a CMS than a blog engine, and making Wordpress pretend to be a CMS strikes me as a square-peg-round-hole problem that's more trouble than I want to take on.
What do you guys think? What do you like to read from me? How do you think it's best disseminated or organized? How do I resolve my multiple facets? How do you, the readers, perceive your relationship to Chuck Hoffman and his Nothing Happens? Do you read because you consider me and/or my words to be provocative? insightful? an artist? a personal acquaintance whose comings and goings you want to keep up on? intelligent? a trainwreck? a freakshow? I want to know.
Filed under: meta
7 Comments
Sure, Adobe hasn't updated the thing in like a year, but I'm sick enough of spyware infestations to have already eradicated the Windows scourge from all my machines, and can't afford to buy yet another computer right now, Mac or otherwise. So it's worth my while to try the "alpha" Linux version of Flex Builder. Even though I already paid full price for a not-so-alpha Flex Builder license. Maybe I'm crazy. Yeah, probably.
First off, install a 3.3 version of Eclipse directly from a .tar file in your home directory. Flex Builder Linux Alpha works only with 3.3 and will not work with 3.4 (some assertion will fail when you open an editor). Go to the "old versions" link on eclipse.org to download it.
Furthermore, the Flex Builder Linux Alpha installer needs to be able to write to Eclipse's directory, but for some reason if you install Eclipse somewhere like /opt where you need to be root to do it, and then try to run the Flex Builder Linux Alpha installer as root to install it there, that also doesn't work. What will happen is you'll start up Eclipse and it will appear as though you'd never tried to install Flex Builder at all. I have no idea why this is.
So it kind of sucks a little that you can't do it in a "site-wide" install, and this would probably not be a problem if Adobe would just package the thing up as a normal Eclipse plugin instead of an installation executable with .bin at the end of the filename. I don't get that. Adobe obviously still doesn't quite understand the kind of modularity philosophy that pervades Linux-land.
So untar Eclipse 3.3 in your home directory, then run the installer as yourself rather than as root. The default install directory for Flex Builder will do fine; point it to the directory where Eclipse is, and that's pretty much that.
Point is, don't try to get all fancy about it, or try to install it over however your distro's package system installs Eclipse, or insist on the latest Eclipse. What I've just described seems to be the only way that actually works.
Filed under: computing, flash/flex/actionscript
2 Comments