For those who want to track Julia’s growth, some of the most popular measures of programming language popularity include PYPL, TIOBE, GitHub, RedMonk and IEEE Spectrum. TechCrunch published a useful discussion of the differences among some of these measures last year, and Zhang Liye also published some tracking data on Julia’s Discourse forum last year. Here’s a very high-level overview of what the different rankings are based on:
TIOBE: the number of hits for the query +"X programming" on 25 of the most popular search engines worldwide, where X is one of potentially several keywords for each language.
So where does Julia rank? As of September 2019, Julia is:
As well as being the lowest, Julia’s ranking on the TIOBE index has been particularly volatile. It jumped 11 places from #50 in July to #39 in August, then rose again to #36 in September 2019. However, we also saw Julia jump from #50 to #37 between February and March of 2018, only to fall back later. We couldn’t help but wonder, “What is going on here?” Since the TIOBE index is the most popular ranking but also the most unpredictable, we decided to do a little digging into its methodology, hoping to better understand the volatility we’ve seen. The specific search query that TIOBE uses for each language is +"X programming".
In other words, to determine the popularity of Java, it looks for the verbatim phrase “Java programming” across different search engines and counts the number of “hits” each engine reports for that search phrase. According to TIOBE:
It is important to note that the TIOBE index is not about the best programming language or the language in which most lines of code have been written.
TIOBE is transparent about the issues with their current ranking, and actively solicits comments for improvement: “If you have any suggestions how to improve the index don’t hesitate to send an e-mail to [email protected].” According to TIOBE, the top 5 most requested changes to the TIOBE index include:
Apart from "X programming", also other queries such as "programming with X" and "X coding" should be tried out.
We at Julia Computing decided to investigate one further variation on this most requested change. Since Julia and some other languages are often referred to as “the X language” rather than “X programming”, we wanted to learn how the rankings would change for Julia and other languages if we counted hits for “X language” as well as “X programming”. We selected the TIOBE top 40 languages and recalculated the rankings using this combined query (+"X language" OR +"X programming").
In the following graph, we have put the TIOBE index rank (based only on the search term “X programming”) on the x-axis, and our revised ranking (including both “X programming” and “X language” as search terms) on the y-axis.
Note that a higher ranking corresponds to a lower number (#1 has the most searches), so we inverted the scale on the graph with the highest rankings (lowest numbers) in the top right corner and the lowest rankings (highest numbers) in the bottom left.
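The recalculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the script we used: the function names (`build_query`, `rank_by_hits`, `rank_shift`) are hypothetical, and the hit counts are invented placeholders rather than measured data. It shows how the combined query string is formed, how hit counts turn into ranks (#1 has the most hits), and how to read off how many places each language moves.

```python
def build_query(language, suffixes=("language", "programming")):
    """Build a combined query such as +"Julia language" OR +"Julia programming"."""
    return " OR ".join(f'+"{language} {suffix}"' for suffix in suffixes)


def rank_by_hits(hits):
    """Map each language to its rank; the language with the most hits is #1.

    Ties are broken alphabetically so the result is deterministic.
    """
    ordered = sorted(hits, key=lambda lang: (-hits[lang], lang))
    return {lang: position for position, lang in enumerate(ordered, start=1)}


def rank_shift(old_ranks, new_ranks):
    """Places gained under the new ranking (positive means the language moved up)."""
    return {lang: old_ranks[lang] - new_ranks[lang] for lang in old_ranks}


# Invented placeholder counts for three made-up languages; not real search data.
programming_only = {"LangA": 900, "LangB": 500, "LangC": 100}
combined = {"LangA": 1000, "LangB": 600, "LangC": 800}

old_ranks = rank_by_hits(programming_only)
new_ranks = rank_by_hits(combined)
shift = rank_shift(old_ranks, new_ranks)
```

In this toy example, the language that is mostly written about as “X language” (`LangC`) gains a place once both phrases are counted, which is the kind of movement the graph is meant to show.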
This result led us to give some thought to the following question: Why is it that some languages (e.g. Julia) are more often referred to as “X language” rather than “X programming”? We can only speculate about the reasons behind this difference. They may be linguistic: the phrase “X programming” is easier to say or sounds more natural for some languages, while “X language” is easier or more concordant for others. For example, “Java programming” is a pretty comfortable phrase, whereas “Java language” is kind of awkward and probably only used when trying to make a distinction between the Java language and one of its implementations. The same holds for C, C++ and many other languages on the list, which supports the use of the search term “Java programming” or “C programming” as a proxy for the popularity of those languages.
On the other hand, since Julia is a person’s name in much of the world, we often find ourselves writing “the Julia language” to clarify what we’re talking about. This may very well reduce the number of hits search engines find for the verbatim phrase “Julia programming”. These results, much like the TIOBE ranking itself, are a bit too noisy and hard to interpret to draw firm conclusions, but they do suggest that TIOBE should probably consider broadening their search terms, since people write about different languages in different ways.
Another concern with the current TIOBE ranking, alluded to above, is its volatility: language rankings swing wildly from month to month. Indeed, we found that the same search engine frequently shows wildly different counts for the same search depending on the day. We noticed, for example, that Baidu’s search counts seem particularly volatile and are higher by an order of magnitude than those from Google or Bing. Even over the few weeks in which we carried out our exercise, we noticed variations on Google that would move a language a few places in the ranking. Naturally, one might consider various statistical ways to address this volatility.
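As one illustration of what such a statistical adjustment might look like, the sketch below takes the median of repeated daily counts from a single engine, and converts each engine's counts into shares so that an engine reporting much larger raw numbers does not dominate the total. This is an assumption on our part about one reasonable approach, not something TIOBE does and not something we applied in this exercise; the function names and all numbers are hypothetical.

```python
from statistics import median


def smoothed_count(daily_counts):
    """Median of repeated daily hit counts from one engine for one query.

    The median is less sensitive than the mean to a single wild reading.
    """
    return median(daily_counts)


def engine_shares(counts_by_language):
    """Convert one engine's hit counts into each language's share of that engine's total."""
    total = sum(counts_by_language.values())
    if total == 0:
        raise ValueError("engine reported no hits for any language")
    return {lang: count / total for lang, count in counts_by_language.items()}


# Invented placeholder readings; not real search data.
stable = smoothed_count([100, 110, 5000])          # the 5000 spike is ignored
shares = engine_shares({"LangA": 30, "LangB": 10})
```

Shares from several engines could then be averaged per language before ranking, so that each engine contributes equally regardless of the scale of its raw counts.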
We’re glad that TIOBE is interested in hearing from the community, and we will be sharing these thoughts with them. In the meantime, if you have any further thoughts on this analysis or other suggested changes to ranking methodology, we would love to hear from you.
Methodology links for each ranking: