Day before yesterday, I finally got all of PGXN moved to a new server. I had been using a small server owned by my company, Kineticode, and hosted by Command Prompt. That was fine for a while, but CMD was needing its rack space back, and what with my new job, I was shutting down Kineticode, too. It was time to move PGXN elsewhere.
For a while, I got a lot of support and assistance towards moving PGXN to a PostgreSQL community server. Dave, Magnus, and Stefan kindly spun up a VM for me, and gave me permission to install Perl modules from CPAN, provided I supply them with a script to report to Nagios when Perl modules were out of date, which of course I did. This was necessary because I built PGXN with some pretty recent versions of CPAN modules that are not yet available in Debian stable. I was looking forward to getting things running and integrating with the community authentication service.
I got the server built, and everything was working reasonably well. Magnus and I were just working out some issues with the proxy server configuration, and I was starting to think about how to migrate the data over. But first, I decided to refactor the Perl module script to use a more efficient implementation. I fired it off and piped its output to the cpan utility to just get everything updated. Unfortunately, unlike my first implementation, which reported only on CPAN-installed modules, this version of the script also reported when Debian-installed modules were out-of-date. And since I have my CPAN build configuration set up to remove previous installations, I upgraded all those modules, replacing them with new versions.
Well, this was a major fuckup on my part. Turns out there’s no simple way to restore Debian-distributed versions of the modules without rebuilding the entire system. Worse, this was exactly the sort of thing the community sysadmins feared. They have to maintain a lot of servers. So they naturally prefer that they all be as similar as possible. The new PGXN server had been mostly similar to what they had before, and Dave and company had been willing to compromise quite a bit to get PGXN going, but I, unfortunately, demonstrated how easy it is to ruin the whole thing.
So we decided that a community server isn’t the right place for PGXN. At least not yet. Perhaps in a year or two the Debian distribution will be updated to have all the prerequisites I need. Better yet, maybe someone create a PGXN debian distribution! (Volunteers welcomed.) Then I won’t have to do anything special and we can try again (without any sudo privileges for me!). But in the meantime, I still needed to move things.
Fortunately, depesz came to the rescue. He has a very nice box hosting his blog, explain.depesz.com, and a few other things, and would I like to set things up there? Depesz used perlbrew to set up a Perl install just for the PGXN system accounts, meaning I could install any Perl modules I needed without interfering with the system Perl. And each account has its own privileges to run the services it needs (Manager, API, Site) without the risk of breaking anything else. A few days after getting access, we had everything set up and ready to go. I pulled the trigger on Monday, and it went of without a hitch.
My thanks to depesz for the server and all the assistance, not to mention his donation! PGXN now has a very nice home where it can mature.
And as for the future, I have some thoughts about that, too.
And yes, now that this migration is finally done, I expect I’ll have more time to blog and work on PGXN going foward. Please leave your thoughts and ideas in the comments. This thing is wide open to any kind of idea, and I would greatly appreciate your feedback.
I’m thinking about how to add support for reserved extensions. These are extensions that one needs to depend on, but aren’t distributed via PGXN. Primarily, this means stuff distributed with the PostgreSQL core, including:
Basically, anything that an extension might want to declare as a dependency, but that isn’t on PGXN itself.
There are a number of ways to do this. Which do you think would be the best approach?
META.json fileI’m leaning towards #2, perhaps having it automatically maintain a list in the database and a metdata file on the mirrors.
But what do you think? Opinions wanted!
I’ve been hard at work on PGXN Manager, the app for users to upload distributions to PGXN. I am of course following my own dictum: “the database is the model.” As a result, I’ve been creating an API for creating, updating, and deleting entities, as well as generating JSON (more on that later).
I’m also using PostgreSQL 9. The motivation to make the jump to 9.0 was to to try to use the JSON data type patch, but I abandoned it when I couldn’t get it to compile. (I might come back to it later, but right now I’m trying to practice YAGNI and JFDI so that I don’t end up owning a yak farm). But there are other reasons to stick with 9.0, like improved hstore support, the DO statement, and named parameters.
Actually, this last one is very nice, as it allows me to use SQL syntax to specify function parameters instead of using an hstore value to hack it. For example, I have this function for update a user record:
CREATE OR REPLACE FUNCTION update_user(
nick LABEL,
name TEXT DEFAULT NULL,
email EMAIL DEFAULT NULL,
uri URI DEFAULT NULL
) RETURNS BOOLEAN LANGUAGE plpgsql SECURITY DEFINER AS $$
DECLARE
_email ALIAS FOR email;
_uri ALIAS FOR uri;
BEGIN
UPDATE users
SET full_name = COALESCE(name, full_name),
email = COALESCE(_email, users.email),
uri = COALESCE(_uri, users.uri),
updated_at = NOW()
WHERE nickname = nick
AND status = 'active';
RETURN FOUND;
END;
$$;
The nice thing about named parameters is that I can call this function like so:
SELECT update_user(
nick := 'theory',
name := 'David E. Wheeler',
email := 'justatheory@pgxn.org',
uri := 'http://www.justatheory.com/'
);
Hell, since the parameters are named, I can specify them in any order. And because there are defaults, I can omit one or more of them:
SELECT update_user(
name := 'David E. Wheeler',
email := 'justatheory@pgxn.org',
nick := 'theory'
);
And it will just work. Nice!
One thing I did discover, though: Now that I’m using named parameters, I don’t want to use Hungarian notation for the parameter names. Back before 9.0, when callers never saw argument names when calling a function, I could call them whatever I wanted. This was especially important when executing queries in PL/pgSQL, because one needs to be aware of parameter (and variable!) names that conflict with SQL identifiers, especially columns. (Another nice feature in 9.0 is that it will throw an exception if you create a PL/pgSQL function with conflicting parameter and identifier names.) But now that the parameter names are more likely to be exposed and used to the function caller, I want them to be meaningful.
I messed with this for a while. I was okay with using “nick” for “nickname” and “pass” for “password”, but was annoyed with the alternatives for “email” and “uri,” especially since they’re used with those names elsewhere (queries against the table, generated JSON). I really wanted to use parameter names that were the same as the columns. I tried just using the dollar variables ($1, $2, etc.), but got errors for them, too: they seem to be compiled into the parameter names.
Then, on a guess, I tried creating aliases for the variables. From the example above, its:
_email ALIAS FOR email;
_uri ALIAS FOR uri;
Then I used _email and _uri in my UPDATE statement. And what do you know, it worked! I found this somewhat humorous, given the history of ALIAS.
It used to be, long ago, that you couldn’t use the parameter names in the body of PL/pgSQL functions, so ALIAS was there to let you alias the dollar variable names to other names. But somewhere around, oh, 8.0 or so, we were blessed with the ability to use the parameter names directly. Suddenly ALIAS seemed superfluous. I’ve hardly ever used it myself. It’s just been sitting there, like my appendix, waiting for another use.
And now there is one. A really good one! I can safely use parameter names that are the same as column names in my PL/pgSQL functions as long as I alias them. And it just works!
Well, almost. It seems that ALIAS means what it says: the parameter names are still around can can be used. So sometimes you might run into an error like
ERROR: column reference "email" is ambiguous
Even though you’re not using the variable. I ran into this in the update function where I was using the column names in the left-hand side of the SET expressions. The solution, fortunately, is simple: table-qualify the column names as appropriate:
UPDATE users
SET full_name = COALESCE(name, full_name),
email = COALESCE(_email, users.email),
uri = COALESCE(_uri, users.uri),
Note the use of users.email instead of just email in the COALESCE() function. Seems like a reasonable workaround in exchange for the ability to have parameter names that match column names. I’m sold!
Now just to consider whether to change the nick and pass parameter names to nickname and password in order to be completely consistent. I guess it’s a good idea.
More next week. I’ve been doing lots of hacking and have much to share, but have another project that will take up my time between now and Monday, so I’ll have to come back to it.
Update 2010-08-07: I turns out that there is a much better way to do this: Just function-name-qualify parameter names where they might conflict with database object names:
UPDATE users
SET full_name = COALESCE(name, full_name),
email = COALESCE(update_users.email, users.email),
uri = COALESCE(update_users.uri, users.uri),
No need for the aliases at all! I had no idea about this feature. Many thanks to Colin ‘t Hart for the comment below about how this is available in Oracle and to Tom Lane for smacking me upside the head with the fine manual (look for the “note” at the bottom) when I asked about it.