WordPress is the most popular content management system (CMS), powering up to 29% of all websites on the Internet (source). CMS is a generic term for any system that facilitates the creation and publication of any type of digital content. This category includes platforms for creating static sites, blogs, forums, online stores and everything in between. Other well-known CMSs include Joomla, Drupal, Shopify and Squarespace, all of them with a much lower market share: 60% of all sites using a CMS go with WordPress, the only one with a double-digit share (Joomla takes second place with a 6.4% market share).

Given its importance, WordPress deserved a dedicated impact column in IEEE Software magazine. IEEE Software’s mission is to be the best source of reliable, useful, peer-reviewed information for leading software practitioners. The goal of these impact columns is to develop a better quantitative understanding of software’s impact on different industries. As such, the column aims not just to describe a specific software product but also to provide some insights into how the software is being developed and some metrics that help assess its impact.

With this context in mind, I hope you enjoy my impact column on WordPress – A Content-Management System to Democratize publishing. Keep reading for the unedited (but free) version of the column.

WordPress started in 2003, when Matt Mullenweg and Mike Little forked the b2 blogging platform and created the first version of WordPress. WordPress is an open source project released under the GPLv2 (or later) license. The WordPress Foundation was started in 2010 (inspired by the Free Software Foundation and the Mozilla Foundation) to further support the project’s sustainability and promote it. The foundation owns the WordPress trademark.

From the beginning, the WordPress mission has been to democratize publishing, ensuring that any non-technical person can create their own website while at the same time building a product that can scale all the way up to enterprise clients with complex needs (e.g. eCommerce, multilingual, mobile). The recent addition of the WordPress REST API is a step forward in this direction. Thanks to the API, you can now use WordPress as a headless CMS and build your own web applications on top of it while benefitting from all its core backend functionalities (e.g. for collaboration and for content and user management).
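To make the headless use case more concrete, here is a minimal Python sketch of a client that reads the latest posts through the REST API. It assumes a site that exposes the core `/wp-json/wp/v2/posts` route without authentication; the function names and the example domain are made up for illustration.

```python
import json
from typing import List
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def posts_endpoint(site_url: str, per_page: int = 5) -> str:
    """Build the URL of the core REST route that lists published posts."""
    base = site_url.rstrip("/") + "/"
    query = urlencode({"per_page": per_page})
    return urljoin(base, "wp-json/wp/v2/posts") + "?" + query


def fetch_post_titles(site_url: str, per_page: int = 5) -> List[str]:
    """Return the rendered titles of the latest posts (performs a network request)."""
    with urlopen(posts_endpoint(site_url, per_page), timeout=10) as response:
        posts = json.load(response)
    # Each post object carries its title as {"rendered": "<html title>"}.
    return [post["title"]["rendered"] for post in posts]


# Hypothetical site used only to show the URL that would be requested.
print(posts_endpoint("https://example.com"))
```

A frontend written in any technology could consume the same JSON, which is what makes WordPress usable as a pure content backend.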

The WordPress Codebase

WordPress comprises over 500K lines of code, mainly PHP code that serves HTTP requests by querying a MySQL database. However, there is a quickly growing presence of JavaScript, especially in the components involved in the frontend aspects of WordPress. React has been chosen as the core JS framework for new JavaScript-based developments.

Figure 1 displays some stats on the evolution and growth of the WordPress codebase. Nowadays, PHP still represents over half of the total number of lines of code (LOCs), with JavaScript accounting for an additional 30%. CSS, HTML and XML make up the rest. The compound annual growth rate (CAGR) of the codebase over a 14-year period (from the 16314 LOCs it had in Dec 2003 to the 565917 LOCs in Dec 2017) is 28.8%, which puts WordPress in the upper end (but still within the typical range) of the calculated CAGRs for software projects[1].
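The growth rate quoted above can be reproduced from the two LOC counts with the standard CAGR formula, (end / start)^(1 / years) - 1. The short Python sketch below does just that; the function name is only illustrative.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two measurements `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# LOC counts for Dec 2003 and Dec 2017, as quoted in the text.
growth = cagr(16314, 565917, 14)
print(f"{growth:.1%}")  # 28.8%
```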

 

Figure 1 – Evolution of LOCs in the WordPress codebase, grouped by language (source: openHub.net)

The database schema is rather simple, with only 12 core tables. The Entity-Relationship schema of the tables shows the relationships between them, although the corresponding foreign keys are not declared in the SQL DDL script of the WordPress database.

The WordPress code itself is organized into a few dozen core components based on the functionality they provide (rather than on size or language). Each component is coded using a mix of procedural and object-oriented programming techniques. Typically, new functionalities are developed in a more OO style, and a few class wrappers have been added to better encapsulate related sets of functions.

WordPress development

WordPress uses Subversion as its version control system, with the current development version available at this location. Trac is the bug tracking system, where all important discussions take place, with Slack as a complement for real-time communication. There is also a GitHub mirror, but only as a read-only copy of the SVN repository.

Since its inception, over 30 versions of WordPress have been released, the current one (at the time of writing) being WordPress 4.9.4. WordPress strives never to break backwards compatibility. This is contrary to other CMSs like Drupal, which is prepared to break it at every major release if this simplifies radical code improvements. Note that, for historical reasons, WordPress does not use semantic versioning: the first two digits together identify a major release, and the third one identifies minor releases (mainly security patches and bug fixes). Until recently, releases were fairly regular, with a new major release every four months, give or take. In 2016, it was announced that WordPress would move towards a more feature-driven approach, with each new version focusing on the release of a major functionality. Editing, customization and the REST API were the first three planned. In the end, we have witnessed a mixed model, with different release models happening in parallel.
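As an illustration of this numbering scheme, the following Python sketch splits a version string into its major part (the first two digits) and its minor part (the third digit, if present). The helper names are hypothetical and this is not how WordPress itself compares versions.

```python
from typing import NamedTuple


class WordPressVersion(NamedTuple):
    major: str  # first two digits together, e.g. "4.9"
    minor: int  # third digit, 0 when omitted (e.g. "4.9" -> 0)


def parse_version(version: str) -> WordPressVersion:
    """Split "X.Y" or "X.Y.Z" into its major ("X.Y") and minor (Z) parts."""
    parts = version.split(".")
    if len(parts) not in (2, 3) or not all(p.isdigit() for p in parts):
        raise ValueError(f"unexpected version string: {version!r}")
    minor = int(parts[2]) if len(parts) == 3 else 0
    return WordPressVersion(".".join(parts[:2]), minor)


def is_minor_update(installed: str, candidate: str) -> bool:
    """True when `candidate` is a later minor release of the same major release."""
    current, new = parse_version(installed), parse_version(candidate)
    return current.major == new.major and new.minor > current.minor


print(parse_version("4.9.4"))
print(is_minor_update("4.9.3", "4.9.4"), is_minor_update("4.9.4", "5.0"))
```

Under this reading, going from 4.9 to 5.0 is no bigger a jump than going from 4.8 to 4.9: both are simply consecutive major releases.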

Each release has a lead developer, with the recent novelty that a design lead may now also be nominated and paired with the lead developer. This trend of recognizing the key role of designers in the development process is something we are also seeing in a number of tech companies that are drastically improving their developer-to-designer ratio. Leaders rotate with each release, which involves more people in key positions of the project and therefore helps to increase its bus factor, a measure of the risk of concentrating too much knowledge in a small number of developers.
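There is no single agreed formula for the bus factor. One simplified heuristic, assumed here purely for illustration, is the smallest number of developers who together account for more than half of the commits. The Python sketch below computes it; the contributor names and commit counts are invented.

```python
from typing import Dict


def bus_factor(commits_by_author: Dict[str, int], threshold: float = 0.5) -> int:
    """Smallest number of top committers covering more than `threshold` of all commits."""
    total = sum(commits_by_author.values())
    if total == 0:
        return 0
    covered = 0
    ranked = sorted(commits_by_author.values(), reverse=True)
    for rank, commits in enumerate(ranked, start=1):
        covered += commits
        if covered / total > threshold:
            return rank
    return len(ranked)


# Hypothetical commit counts, not real WordPress data.
concentrated = {"alice": 60, "bob": 25, "carol": 15}
spread_out = {"alice": 30, "bob": 30, "carol": 25, "dave": 15}
print(bus_factor(concentrated), bus_factor(spread_out))
```

The higher the value, the less the project depends on any single person, which is the effect that rotating release leads is meant to achieve.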

The release leaders decide on all technical aspects of the release, but they depend on the WordPress community to move the code forward. This also includes adding unit tests (WordPress uses PHPUnit and QUnit for automatic testing of PHP and JavaScript, respectively). For instance, 443 contributors participated in the latest major release (4.9), of whom 185 were new contributors. Beyond the lead developers, a number of core committers have write access to the SVN repository and can, therefore, commit the patches submitted by those contributors. Sometimes commit access is granted on a temporary basis to work on specific components, but a number of people have permanent commit access and constitute what is known as the WordPress core team. Moving up from external contributor to core team member is mainly based on meritocracy. This meritocracy takes into account not only technical skills but also attitude, professionalism, and respect for the project’s core philosophies. And at the top of the chain, Matt Mullenweg, the WordPress co-founder, supervises everything in his (unofficial) role of BDFL (Benevolent Dictator For Life).

An open bug triage meeting is held every week, but security vulnerabilities are addressed immediately (you can see the list of all declared WordPress security vulnerabilities). If necessary, a new minor security release is prepared and, since by default all WordPress sites update themselves automatically as soon as a new minor release becomes available, it reaches those sites right away. The WordPress security team is made up of approximately 50 experts.

How do people make money with WordPress?

WordPress is a multi-billion market today and everything indicates that it will continue to grow in the future. A key factor in this business growth is the huge user community around it. As an example, WordPress organizes WordCamps, focused events held in a city that favor local speakers and attendees. Last year alone, there were 128 WordCamps and over 4000 meetups, adding up to more than 100K participants.

Beyond offering all kinds of consulting services (installing, configuring, tuning, migrating …) to the WordPress community, many people decide to build and sell plugins that extend the core WordPress functionality or themes that customize the look and feel of WordPress sites. WordPress offers many predefined hooks that plugins can “hook into” to provide their functionality without modifying any core WordPress files. Around 2000 hooks are available, and each hook corresponds to a common WordPress event (saving a post, approving a comment, creating a user). In response to those events, plugins and themes can either perform a specific action or filter the current content to change how it is going to be displayed to the user.
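The action/filter distinction can be sketched with a tiny hook registry. The code below is a Python analogy written for this article, not WordPress code: the method names echo the PHP functions `do_action` and `apply_filters`, the hook names `save_post` and `the_content` are real WordPress hooks, and everything else (the class, the callbacks, the priorities) is assumed for illustration.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List, Tuple


class HookRegistry:
    """Minimal stand-in for a hook system: callbacks run in priority order."""

    def __init__(self) -> None:
        self._callbacks: DefaultDict[str, List[Tuple[int, Callable[..., Any]]]] = defaultdict(list)

    def add(self, hook: str, callback: Callable[..., Any], priority: int = 10) -> None:
        self._callbacks[hook].append((priority, callback))
        # Stable sort: callbacks with equal priority keep registration order.
        self._callbacks[hook].sort(key=lambda item: item[0])

    def do_action(self, hook: str, *args: Any) -> None:
        """Action: notify every callback of an event; return values are ignored."""
        for _, callback in self._callbacks.get(hook, []):
            callback(*args)

    def apply_filters(self, hook: str, value: Any, *args: Any) -> Any:
        """Filter: pass a value through every callback and return the result."""
        for _, callback in self._callbacks.get(hook, []):
            value = callback(value, *args)
        return value


hooks = HookRegistry()
saved_ids: List[int] = []
hooks.add("save_post", saved_ids.append)  # action: react to an event
hooks.add("the_content", lambda text: text + "\n<p>Thanks for reading!</p>")
hooks.add("the_content", str.strip, priority=5)  # lower priority runs first

hooks.do_action("save_post", 42)
content = hooks.apply_filters("the_content", "  Hello world  ")
print(saved_ids, repr(content))
```

The core only needs to fire the hook at the right moment; it never has to know which plugins are listening, which is why plugins can add functionality without touching core files.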

As of today, there are 47K plugins in the official repository, which have been downloaded over 600 million times. Plugins can be completely free, paid (sometimes as part of a subscription service) or follow a freemium model, and the same goes for themes. The quality of the plugins varies widely and, in fact, more often than not, they are the ones to blame when a WordPress site gets hacked. There is a plugin review team and a theme review team that check that each submission adheres to the corresponding guidelines (verifying security, quality and “spammy” aspects). Like all sets of rules, the guidelines try to be precise but are always open to interpretation, and there has been some controversy regarding decisions on the inclusion or exclusion of certain plugins and themes from the official repository. The upcoming Tide initiative will make the process more transparent by helping plugin developers run automated tests on their plugins to check for PHP compatibility errors and warnings prior to submission.

While creators of plugins and themes are mainly independent developers or small agencies, larger companies offer WordPress hosting services with a more predictable and recurring revenue model. While you could install WordPress on any Internet hosting provider, some provide more dedicated support for WordPress sites, offering, for instance, staging sites or integrated cache systems. Automattic (Matt Mullenweg’s company) is one such company, offering hosting services under the domain https://wordpress.com/, not to be confused with wordpress.org, the home of the open source project itself.

Roadmap and challenges ahead

WordPress has come a long way from a humble blogging platform to the flexible CMS it is today. But it will need to continue to evolve if it wants to stay on top. The CMS market is very appealing, with new competitors popping up every year in all areas of the CMS spectrum. Each of them tries to become the best CMS for a specific customer profile or sector, in contrast to WordPress’ aim to be the one-size-fits-all solution with the help of the broad ecosystem of plugins, themes, hosting solutions, etc. mentioned before. A WordPress Growth Council has recently been created as a reaction to this threat. The “Five for the Future” initiative complements this by asking all companies that make a living from WordPress to dedicate 5% of their people to contributing back to the WordPress core, be it development, documentation, security, support forums, theme reviews, training, testing, translation or whatever else helps move the WordPress mission forward.

On the technical side, the next version will ship with Gutenberg, a major architectural shift for WordPress and the longest feature development effort in its history. Gutenberg aims to unify all previous WordPress concepts (menus, widgets, shortcodes) into one elegant concept: the block. According to the WordPress founder, Gutenberg will be the future of WordPress writing, editing and customization for the next ten years. As with any major change, Gutenberg has stirred a lot of controversy in the community, since it will force many plugin and theme authors to rethink and rewrite how their products work, and it will make many of the popular page builders obsolete unless they invest considerable effort in adapting to the new Gutenberg methodology.

While these changes are in line with WordPress’ goal of dominating the CMS market, from large enterprises to individual bloggers with little technical knowledge, there is also the risk that some communities of WordPress users will feel that the project is evolving in a direction that no longer represents their views and will decide to fork it and create a specialized version that better fits their needs.

One way or the other, the research community has a lot to contribute to the future of WordPress. It is somewhat surprising that so few research articles focus on WordPress compared to, for instance, the papers analyzing Linux from every possible perspective. I think the richness and importance of the WordPress codebase and ecosystem pose many interesting challenges for the research community, especially for researchers working on the mining of software repositories. As a long-time WordPress user and researcher (e.g. see some of my ideas on the benefits of a closer relationship between WordPress and the research community, most of them still valid today), I encourage you all to contribute to the growth of WordPress and its community.

[1] M. van Genuchten, L. Hatton. Compound Annual Growth Rate for Software. IEEE Software 29(4), 2012.

 
