The second piece of the PGXN infrastructure, after PGXN Maanager, is the PGXN API Server. I’ve just finished the API documentation, which covers both the lightweight static file API provided by mirrors and the superset provided by the API server. So now seems like a good time to talk about the design of the API server and how it works.
At its core, the PGXN API server is just another mirror. It has an hourly
cron job that rsyncs to the master mirror, updating the mirror. But then it
iterates over the rsync log and transforms some things. Here’s what it does:
README file and any files recognized by Text::Markup and converts them to sanitized HTML with a table of contents. Such files can then be used to display the README on the distribution page and to display individual documentation files.META.json generated by PGXN Manager. For example, as of this writing, the API server’s semver 0.2.1 META.json and the unversioned semver.json are identical. Effectively, this format has all the metadata from the META.json as well as a list of all releases of the distribution from the semver.json. This is useful for displaying all the data on the distribution page by fetching the data in a single API request.META.json file. For example, if you look at the semver 0.0.0 META.json, you’ll see that it includes 0.2.1 in its list of releases, even though 0.2.1 was released after 0.2.0. This allows semver 0.2.0 page on the main site to have a select list of version to choose from, including versions released later, with a single API request.semver.json to the API semver.json.theory.json to the API theory.json and the mirror data types.json to the API data types.json. This allows the user page and tag pages to include the abstract in the list of distributions released by the user or associated with a tag.All of this merging stuff came out of my thinking following the discussion of the PGXN API RFC. The decision to use Lucy instead of PostgreSQL’s full-text search followed rather naturally from this, as I quickly realized that there was no other driving need for a relational database behind the API at all. The only dynamic API is the search API. Everything else is just static files. And given the performance issues of in-database search, as well as the desire to have fewer outside dependencies, made the decision a natural one.
Beyond the syncing, there is a very simple web server providing the HTTP REST interface to the static JSON files and the full-text search. That’s it, really. The API server is really just another mirror on steroids. The nice thing is that it allows an interface, such as WWW::PGXN or the new PGXN client to work with either interface, just failing gracefully when API server APIs are unavailable.
If you want to learn more about the specifics of the REST API, the API documentation has all the details. Really, it’s quite comprehensive!
I actually consider the API to be 1.0-complete at this point, unlike PGXN Manager. The only thing I want to add is JSONP support for static JSON files (right now it’s only for search results) and might tweak a few things here and there, but otherwise I think it’s in pretty good shape.
Longer term, though, it might be worthwhile to add some other features to enhance the value of PGXN overall. Some ideas:
But I think we need to build up some momentum on the foundation that’s in place. Have you submitted your extensions, yet?