If a phrase could sum up what we've seen so far from WordPress in 2017, it would be something like:

> Bureaucratic nonsense, shitty design, tone-deaf development.

The Core crew and their work on Plugins has recently turned from tragedy to farce. While these hardworking plebes have put in the hours, the result is, frankly, pathetic. They could hardly do worse if they tried. They've broken search (which was crap to begin with, and is worse now), and their design is optimized for an iPad, nothing bigger or smaller. At every turn, scorn has been heaped on requirements and suggestions proposed by developers, because, well, they are developers! And damned be the developers. (It has a kind of anti-Microsoft ring to it.)
Plugin Repository Redesign Fiasco
Before we start in on WordPress and search, it is useful to understand my own background regarding both. I've used WordPress regularly since Kubrick (v1.5, 2005). My experience with the web, and search, came earlier. From 1999-2001 I took courses in old-school library science, information science, the newly spawned information architecture (a combination of library science and human factors, since rebranded user experience), and the like. I graduated in 2001 with an MS in Information Management and Systems (a unique degree name; our gown color came from the more established MS in Information Science). What brought the program, then named Information Management and Systems and since rebranded the School of Information, to my attention was the book Information Rules by Carl Shapiro and Hal Varian. Hal had become the Dean of the School. I applied, and two years later was part of the third graduating class. (Hal later left and became the Chief Economist at Google.)

The school is the newest (and smallest) school at Berkeley. However, its roots there are ancient: it is essentially a re-organized library school, situated in South Hall, the oldest building on campus, constructed in 1873 and the original home of the first Physics laboratory in the United States. New faculty were hired with dual appointments at several different schools across campus, including the Law School, Computer Science, Engineering, Economics, Public Policy, etc. This provides a necessary interdisciplinary and multidisciplinary orientation.

When I attended, it was as a former Berkeley grad in the Interdisciplinary Studies Field major, with a focus on literature and philosophy, who had spent the five years since graduation as a network engineer in industry. My interests lay not so much with the data networking I had been doing as with the burgeoning startup scene (and its incipient collapse).
I was interested in programming, and developed some skills in that, but it turns out that the various courses available were intrinsically interesting, and they provided a basis of modern education I use today, including:

- Product Design (Mechanical Engineering course)
- Intellectual Property (Law School course)
- Internet Law
- Information Classification
- Usability and Interface Design
- Information Retrieval*
- Library Services*

*These last two were a surprise to me, and I don't actually recall why I took these old-school courses from Michael Buckland. They turned out to be the most relevant, not least because these disciplines drive search on the Internet.
Information Retrieval and the Found Set
Two basic concepts, exploited to great effect by Google, underlie all of this: the found set and relevance. Relevance is always what is relevant to a given searcher with a particular information need. It is a human concept, and can therefore only be approximated by a machine, which ultimately needs human judges to evaluate its effectiveness at that approximation. (The human judges of relevance, it turns out, are retired CIA analysts.)

The basic (human) search task is as follows: given an information need and a set of results (documents), decide which of those documents are relevant to the information need. Before the use of computers, there was a mechanical approach using cards (which held metadata about certain documents). These cards would have holes punched in them at various places in two dimensions. Those holes corresponded to certain categories. Rods could be inserted through those holes across a set of cards. The cards that stayed on the rods were the relevant ones -- the found set -- and those that fell away were called the dropped set.

The initial categorization via hole-punching was replaced by a vector-space model determining the content (category, keyword) relevance of a given document to a given query. Conceptually this is still the same, though the algorithms are much more complex these days. What matters is what can be known through metadata (title, description, age, etc.), document structure, and the content of documents (words, phrases, word counts, other patterns). A search term is then matched against related documents. The result is again the found set, as above with the cards.
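The card-and-rod procedure maps directly onto the vector-space model: represent the query and each document as term vectors, and the found set is everything with nonzero similarity. A minimal sketch (toy term-frequency vectors and invented document strings, not a production ranker):

```python
import math
from collections import Counter

def vectorize(text):
    """Toy term-frequency vector for a document or query."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Invented example documents (plugin-style descriptions)
docs = {
    "d1": "contact form plugin for wordpress",
    "d2": "image gallery and lightbox",
    "d3": "simple contact page with form builder",
}

def found_set(query, documents):
    """Everything with nonzero similarity is the found set; the rest
    is the dropped set (the cards that fell off the rods)."""
    q = vectorize(query)
    scores = {name: cosine(q, vectorize(text))
              for name, text in documents.items()}
    return {name: s for name, s in scores.items() if s > 0}

matches = found_set("contact form", docs)  # d1 and d3 share terms; d2 does not
```

Real systems weight terms (TF-IDF and beyond) rather than counting them raw, but the partition into found set and dropped set is the same.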
Relevance Ranking and Signals of Eminence
Once a found set is known, the question becomes (in an age of an abundance of information, but a deficit of attention) one of ranking. For scientific journals, ranking and impact analysis were driven by Eugene Garfield's work in citation analysis in the 1960s and 1970s (first posed in the 1950s), enabled by increasing statistical analysis done by computers. Citation analysis across articles could attribute the impact of a given journal to what was published there. This meant that future articles in a given journal would have a higher or lower probability of citation, but also that the analysis could clearly indicate which articles themselves were more relevant. Google's PageRank (named for Larry Page) derives directly from this, treating a link as a citation. There is obviously more noise (and more opportunity and incentive for link fraud) than in scientific publishing, but the basic correlation remains. A large amount of the variance in search results ranking is (still) explained by the number of domains linking to a given URL on the Internet, attenuated by the quality or authority of the linking domains.1
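The citation-to-link analogy can be made concrete with the standard PageRank power iteration; the link graph below is invented for illustration:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping page -> outbound links.
    A link acts like a citation: rank flows from citer to cited."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # Dangling page: spread its rank evenly over all pages
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

# Invented link graph: "a" is the most-cited page, so it ends up ranked highest
links = {"a": ["b"], "b": ["a"], "c": ["a"], "d": ["a", "b"]}
ranks = pagerank(links)
```

The quality-attenuation mentioned above falls out naturally: a link from a highly ranked page passes along more rank than one from an obscure page.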
Google's Giant Feedback Machine
The thing about Google is that it not only has inputs, but can determine, based on human clicking behavior, what kind of modification of the initial results should take place. This is an extremely dynamic situation: clicking on results, use of the back button, and subsequent repeated searching can all provide evidence of lesser relevance. In addition, personalization is important and useful.
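One hedged sketch of such a feedback loop: blend a base relevance score with observed click-through rate. The blending weight, the invented numbers, and the CTR-only signal are simplifying assumptions for illustration, not Google's actual (proprietary) machinery:

```python
def rerank_with_clicks(results, clicks, impressions, weight=0.3):
    """Blend a base relevance score with observed click-through rate (CTR).
    Back-button returns ('pogo-sticking') could be logged as a negative
    signal in the same way; plain CTR is used here for brevity."""
    blended = []
    for name, base in results:
        ctr = clicks.get(name, 0) / max(impressions.get(name, 0), 1)
        blended.append((name, (1 - weight) * base + weight * ctr))
    return sorted(blended, key=lambda item: item[1], reverse=True)

# Invented numbers: page-b scored lower initially, but users clearly prefer it
results = [("page-a", 0.9), ("page-b", 0.8)]
clicks = {"page-a": 5, "page-b": 90}
impressions = {"page-a": 100, "page-b": 100}
reranked = rerank_with_clicks(results, clicks, impressions)
```

The point is the loop itself: user behavior feeds back into ranking, which is precisely the kind of community signal the WordPress redesign discards.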
Simplicity of Search vs. Wall of Browse
Google's rapid rise in popularity, when faced with incumbents such as Yahoo! and its hundreds of humans (doing essentially the same task as the CIA analysts, prior to mechanization, computerization, and digitization), was clearly due to mastering relevance and deploying superior search algorithms. In particular, long-tail searches were famously well rewarded, while short-head searches still occupied, well, the short head (large search volume). Google could do well for both kinds of searchers, and those in between. The I'm Feeling Lucky button was meant to show that the number-one result was within easy reach for most, and gradually refined searches were well supported. Additional parameters for constraining searches (to particular file types, within a given domain, by date range) expanded the available tools.
Wither WordPress 2017
Faced with nearly two decades of Google's search effectiveness and public-facing search tools, WordPress began a project to revamp the search interface and search algorithms for WordPress plugins. The first Plugin page revamp was rolled out in 2015. From the comments it became clear that things people relied on were being broken. A 2016 Plugin Search prototype released to the public garnered the same kind of response: lots of things wrong with the new design, breakage of things that worked before, minimal improvement, and a generally poor reception.
WordPress Plugin Search User Interface
WordPress Plugin Search Algorithm
Besides the user interface (adding, removing, and rearranging various bits), there is the basic algorithm. Obviously, per the history above, the goal is to approximate human relevance. Relevance has to do with searcher intent, which is itself approximated by search terms and searching behavior (clicking, back button, searching again). So what does the WordPress team do in terms of ranking? It makes the Last Updated date (more specifically, the tested-with WordPress version) a huge ranking factor. But this is a factor the community has no say over (meaning, it is not fed back from actual behavior), and it is the easiest one to game (change a bit of text, resubmit the plugin, repeat after each WordPress version release).
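To see why such a weighting is destructive, here is a purely hypothetical scoring function (not WordPress's actual code; the weights, the one-year decay, and both plugin records are invented for illustration). When freshness is weighted this heavily, even a perfect exact-name match cannot save a plugin that has not been touched in over a year:

```python
from datetime import date

def plugin_score(plugin, query, today=date(2017, 3, 30),
                 freshness_weight=10.0, exact_weight=1.0):
    """Purely hypothetical scoring (NOT WordPress's actual algorithm).
    Freshness decays to zero over a year; an exact name match adds a
    comparatively tiny bonus."""
    days_stale = (today - plugin["last_updated"]).days
    freshness = max(0.0, 1.0 - days_stale / 365.0)
    exact = 1.0 if plugin["name"].lower() == query.lower() else 0.0
    return freshness_weight * freshness + exact_weight * exact

old_exact = {"name": "Post Tags and Categories for Pages",
             "last_updated": date(2015, 1, 1)}
fresh_other = {"name": "Some Newer Tag Plugin",
               "last_updated": date(2017, 3, 1)}

query = "Post Tags and Categories for Pages"
# The exact-name match loses badly to any recently touched plugin
```

And because the freshness input is under the developer's sole control, the gaming strategy is exactly the one described above: bump the tested-with version, resubmit, repeat.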
Ageism at WordPress
For Google (though indeed, it sometimes gets this wrong), age is a positive ranking factor. WordPress ignores this completely, and makes last updated the only age-related factor, essentially the opposite of Google.
Exact Match at WordPress
> We no longer have the exact match search in place. The new search is more relevant to current events. If you don't maintain it, it will fall out of rank. -- Samuel Wood (Otto)

No exact match. Really. Actually. Honestly.

> This is a huge problem for people looking for an exact match, that is, they know the name of the plugin. I searched for Post Tags and Categories for Pages and it came up on the 5th page of results. I guess I should count my blessings, as there are 163 pages of results for that query. 163 pages! If someone knows the name of the plugin (who cares if it is 2 years out of date, the plugin still works and I use it on multiple sites), but can't get an exact match, just exactly how are they supposed to find what they are looking for?
>
> More relevant to current events shouldn't destroy the relevance of historical events.2 (New Plugin Directory Mostly Live, Make.WordPress.Org, March 2017)

The main response to this search problem has simply been to repeat the demand that a new version, with a new tested-with WordPress version, be added by the developer. But this is a bureaucrat's argument (the plugin developer has not updated the form correctly, therefore your request for their plugin is not legitimate).

> If the last update date is all that matters (stellar reviews, large numbers of active installs, and exact match text matter little), then the preferences and activity of the community are shown no respect. The community does not have control over when a plugin developer can/will update a plugin, but they have control over the other factors (that is, installing, using, rating positively, and searching for a plugin by name). This is not an edge case, as there are many plugins in this situation. They should not be penalized. They should have exact match text respected, along with the other community factors. (New Plugin Directory Mostly Live, Make.WordPress.Org, March 2017)
Tested Up To vs. Works With
There was a useful feature that allowed users to vote on whether the most recent plugin version was working with a specific release of WordPress or not. It was a simple does/does-not-work choice and a WordPress release version drop-down. It was community-driven information. And for some plugins (whether newly updated or not) it provided real information (though of course it could be inaccurate). Still, it was something, and there are many reports that the information helped.

That feature was removed. Now, when users complain about the removal, they are told that people didn't use it (of course they did; you are hearing from them). They are then told that the plugin developer should update the plugin's metadata to show Tested up to. This is purely bureaucratic thinking. But it is a valuable question who benefits from this situation. It can only be developers who want to promote plugins that are otherwise being eclipsed by plugins that are not, or not often, updated (usually the older ones).
Support Forum Use Requirements
If a support forum is ignored by a developer, woe unto them. Regardless of the relevance of support topics, if they are not managed quickly and marked resolved, that will impact search rank. Again, more bureaucratic thinking based on rule enforcement. These kinds of ranking signals can only be considered legitimate by other developers who wish to penalize those who do not follow their (the WordPress Core developers') rules.

First, not everyone knows how the support forum works. In many cases you see developers for whom English is a second language forced to deal with questions in English (on a support forum in English). Second, there are plugins like Contact Form 7, which as of 30 March 2017 had 141 out of 703 issues resolved in the last two months. CF7 is very well known, but the usefulness and use of the support forum for this and other plugins is minimal at best.

The Support Forum for WordPress Plugins has always been a mixed bag. Some plugins simply have no one watching the forum, so there is no information (except perhaps that someone else has the same problem). Sometimes other users help answer, but they have no ability to mark the issue resolved, as only the issue creator and the developer can do that. In many cases, developers state directly that they have a support forum at another location and that questions should go there (which of course some people ignore, posting questions anyway). Support forum use should not be mandated, or used as a signal of relevance. Rather, it should (if kept) be optional, toggled on or off, with an optional URL pointing to an off-site support forum or simply an email address for support.
A Plugin Repository is not General Search
Certainly, search in a plugin repository is not general search; it should instead be a more organized, faceted classification (tags and categories). But this is not the case. This kind of metadata is not organized but ad hoc, determined by developers alone. In the latest version there is now a limit on tags (up to five), but where did that come from? From the algorithm folks, who prefer to have fewer signals to deal with (and no actual user behavior involved).
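Faceted classification is mechanically simple once the metadata is structured; each facet narrows the result set independently. A minimal sketch (plugin names borrowed from well-known plugins, but all categories, tags, and ratings invented for illustration):

```python
def facet_filter(plugins, category=None, tags=None, min_rating=None):
    """Each facet (category, tags, minimum rating) narrows the set
    independently; unset facets are ignored."""
    matches = plugins
    if category is not None:
        matches = [p for p in matches if p["category"] == category]
    if tags:
        matches = [p for p in matches if set(tags) <= set(p["tags"])]
    if min_rating is not None:
        matches = [p for p in matches if p["rating"] >= min_rating]
    return matches

# Invented plugin records with structured metadata
plugins = [
    {"name": "Contact Form 7", "category": "forms",
     "tags": ["contact", "form"], "rating": 4.4},
    {"name": "WPForms", "category": "forms",
     "tags": ["form", "builder"], "rating": 4.9},
    {"name": "Smush", "category": "media",
     "tags": ["images"], "rating": 4.8},
]

hits = facet_filter(plugins, category="forms", tags=["form"], min_rating=4.5)
```

This is exactly the sorting-and-filtering users have asked for: a curated taxonomy plus per-facet counts would do more for findability than another opaque ranking tweak.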
A Lesson from An Electronic Cultural Atlas
To round out the UC Berkeley iSchool reminiscences, I recall vividly a talk by Lewis Lancaster, possibly the Platonic ideal of a gentleman scholar. He was someone people would do anything for, a truly magnanimous and gifted scholar. He saw clearly, decades ago, the need for interdisciplinary research in the humanities, and that technical skills are needed alongside those of the historian and cultural researcher in order to present information in a way that provides original insight. Digitization is but the first step; information retrieval and visualization meant to show things that are otherwise hidden, that is the real feat. To this end Lancaster founded the Electronic Cultural Atlas Initiative.

The problem with WordPress Plugin development is clear to even the most cursory evaluation:

> It seems like backend developers create such a design. Boring minimalism.3 (Plugin Repository Redesign Beta Available, TorqueMag, June 2016)

This is the heart of the failure (or the failure of the heart of it): there is no cross-disciplinary team involved. There are essentially programmers doing work that should be done by experts from other fields. The provisioning of human relevance needs more than PHP and CSS code wranglers. It will continue to fail without the fresh air of authentic collaboration with, and leadership from, those who are not developers first and foremost. Facebook and WordPress interfaces are similarly bad, based on the same kind of DNA: run by founders who are at base PHP coders, driving an engineer-centric culture whose insular thinking continues to produce an unremarkable product (whose minimal progress occasionally goes into reverse).
Where to Begin Again, The Appstore
To begin again would be first to take people's lived experiences as a place to explore and respect, and their suggestions as things to follow. Doing a redesign is all very lofty, but how about first fixing what can be fixed easily? For example, plugin screenshots have no lightbox, something simple to fix. Sorting and filtering search results by certain criteria has been asked for, for years, and ignored yet again. Second, orient toward the kind of search that makes sense for a plugin repository, which is an appstore. Appstore search is difficult, as the atrocities that are the Apple and Google stores demonstrate. However, there are some basic features that make sense: collections, categories, and the app (plugin) display page. These are all huge, low-hanging fruit for WordPress plugins. But of course they are not sexy like search algorithms are to a back-end developer, so the developers play with search algorithms while the user experience languishes and ticks down further. For people to take the plugin store seriously, WordPress needs to take it seriously. In the face of requirements that plugin developers regularly update their plugins (even when it is not needed) and babysit the required support ticket system, the ongoing force-feeding of a badly out-of-date plugin -- Hello Dolly -- is simply a joke.