The second piece of the PGXN infrastructure, after PGXN Manager, is the PGXN API Server. I’ve just finished the API documentation, which covers both the lightweight static file API provided by mirrors and the superset provided by the API server. So now seems like a good time to talk about the design of the API server and how it works.
At its core, the PGXN API server is just another mirror. It has an hourly cron job that rsyncs from the master mirror to keep its copy up to date. But then it iterates over the rsync log and transforms some things. Here’s what it does:
- Unpacks each distribution in a directory for browsing. Here, for example, is where one can browse the semver 0.2.1 sources.
- Searches for a README file and any files recognized by Text::Markup and converts them to sanitized HTML with a table of contents. Such files can then be used to display the README on the distribution page and to display individual documentation files.
- Merges the distribution metadata file with the latest stable release META.json generated by PGXN Manager. For example, as of this writing, the API server’s semver 0.2.1 META.json and the unversioned semver.json are identical. Effectively, this format has all the metadata from the META.json as well as a list of all releases of the distribution from the semver.json. This is useful for displaying all the data on the distribution page by fetching the data in a single API request.
- Updates all other versions of the META.json file. For example, if you look at the semver 0.2.0 META.json, you’ll see that it includes 0.2.1 in its list of releases, even though 0.2.1 was released after 0.2.0. This allows the semver 0.2.0 page on the main site to have a select list of versions to choose from, including versions released later, with a single API request.
- Adds additional metadata to the extension JSON file for all extensions in the distribution. The added data includes release dates for the list of all distributions providing the extension, as well as an abstract and doc path for the latest stable release. To see the differences, compare the mirror semver.json to the API server’s copy.
- Adds an abstract for each distribution listed in the user’s JSON file and all tag JSON files. Compare, for example, the mirror theory.json to the API theory.json and the mirror data types.json to the API data types.json. This allows the user page and tag pages to include the abstract in the list of distributions released by the user or associated with a tag.
- Adds records to a Lucy-powered full text search index.
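To make the merging steps a bit more concrete, here is a minimal Python sketch of the idea behind attaching the distribution’s release list to a single release’s metadata. It is not the API server’s actual code. The function name `merge_releases`, the `releases`/`stable` layout, and the sample data are all assumptions made up for illustration.

```python
import copy


def merge_releases(meta: dict, dist: dict) -> dict:
    """Return a copy of one release's metadata with the distribution's
    full release list attached, leaving both inputs untouched."""
    merged = copy.deepcopy(meta)
    # Assumption: the distribution file keeps its releases under a
    # "releases" key, grouped by status (e.g. "stable").
    merged["releases"] = copy.deepcopy(dist.get("releases", {}))
    return merged


# Hypothetical, trimmed-down stand-ins for semver.json and a META.json.
dist = {
    "name": "semver",
    "releases": {"stable": [{"version": "0.2.1"}, {"version": "0.2.0"}]},
}
meta_020 = {"name": "semver", "version": "0.2.0"}

merged = merge_releases(meta_020, dist)
versions = [release["version"] for release in merged["releases"]["stable"]]
print(versions)
```

Because the merged document for 0.2.0 carries the whole release list, a page for that version could build its version selector, later releases included, from one request.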
All of this merging stuff came out of my thinking following the discussion of the PGXN API RFC. The decision to use Lucy instead of PostgreSQL’s full-text search followed rather naturally from this, as I quickly realized that there was no other driving need for a relational database behind the API at all. The only dynamic API is the search API. Everything else is just static files. Given the performance issues of in-database search, as well as the desire to have fewer outside dependencies, the decision was a natural one.
Beyond the syncing, there is a very simple web server providing the HTTP REST interface to the static JSON files and the full-text search. That’s it, really. The API server is really just another mirror on steroids. The nice thing is that this allows a client, such as WWW::PGXN or the new PGXN client, to work with either a plain mirror or the API server, just failing gracefully when API server APIs are unavailable.
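As a rough illustration of what “failing gracefully” might look like on the client side, here is a small Python sketch. It assumes the client has already loaded some index of the URL templates a server offers, and that a plain mirror’s index simply lacks a search entry. The `search` key, the `{in}` placeholder, and the paths are hypothetical, not taken from the API documentation.

```python
from typing import Optional
from urllib.parse import quote


def search_url(index: dict, query: str, in_: str = "docs") -> Optional[str]:
    """Build a search URL if the server advertises search, else None."""
    template = index.get("search")  # hypothetical key; absent on a plain mirror
    if template is None:
        # Plain mirror: no search API, so report "unavailable" without raising.
        return None
    return template.replace("{in}", in_) + "?q=" + quote(query)


# Hypothetical index contents for a plain mirror and for the API server.
mirror_index = {"dist": "/dist/{dist}.json"}
api_index = {"dist": "/dist/{dist}.json", "search": "/search/{in}/"}

for name, index in (("mirror", mirror_index), ("api", api_index)):
    url = search_url(index, "semver")
    print(name, url if url is not None else "search not available")
```

The calling code is the same for both kinds of server; only the search feature drops out when pointed at a mirror.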
If you want to learn more about the specifics of the REST API, the API documentation has all the details. Really, it’s quite comprehensive!
I actually consider the API to be 1.0-complete at this point, unlike PGXN Manager. The only thing I want to add is JSONP support for static JSON files (right now it’s only for search results). I might tweak a few things here and there, but otherwise I think it’s in pretty good shape.
Longer term, though, it might be worthwhile to add some other features to enhance the value of PGXN overall. Some ideas:
- Distribution and/or extension ratings (reviews, Like/Dislike, stars, or something).
- Diffs to compare changes between versions.
- A test reporting infrastructure with result matrices (à la CPAN Testers).
But I think we need to build up some momentum on the foundation that’s in place first. Have you submitted your extensions yet?