Having some fun with SA School Boy First XV Rugby Rankings

If you grew up in South Africa, you have no doubt heard of rugby or played it in some form or another. For some, the memories might hold negative political connotations, while for others it brings back visions of grandeur and victory. Nevertheless, regardless of your political stance, whether you played or not, or whether you were on a winning side or not, rugby is entrenched in our society.

My own first experience was being thrashed by the then “Mighty” Queens College, while their supporters intimidated us with the “Ingonyama” or “Inye Zimbini Zinthatho…” war cry. A harrowing experience if you were on the losing side.

Years later, my “Alma Mater”, Grey College, which is synonymous with school boy rugby, would hand many of those defeats back to Queens. Today, however, my allegiance is completely split (in fact it's schizophrenic!). With a son at Bishops, I love watching their running style of rugby, but when I watch my friends’ and colleagues’ boys play with such passion for their opposing schools, I instantly become a huge supporter of Paul Roos, Paarl Boys and Bosch alike. The talent we have at school boy level in the Western Cape alone leaves me very positive about the future of rugby in SA.

Being a rugby supporter can at times be very difficult, and being a supporter at school boy level, even more emotional than at national level (especially around a fire, when the Castles are flowing!). It was therefore with interest that we read and agreed with Mr Gustaf Pienaar, the then MIC Rugby and First XV Manager at Rondebosch Boys and current Deputy Headmaster, when he wrote about the issue of school boy rugby rankings (Bosch Blitz, 15 May 2013, 8th edition). Mr Pienaar states:

“I have always maintained that rankings are like someone patting you on the back and paying you a compliment: I accept it graciously and move on”.

For the most part we would agree with Mr Pienaar and “Hat Tip” to him for his pragmatic approach. The reality is that many of the rankings are opinion based to a greater or lesser degree and many do not carry much scientific basis. We were therefore delighted when MyComLink (www.mycomlink.co.za) asked us to see whether we could apply some science to the ranking, so as to remove the qualitative bias and make it a bit more quantitatively rigorous.

At NMRQL Research we love having fun with numbers and decided to see if there were existing ranking algorithms here in SA or overseas that could be applied. Ideally we were looking for something that would:

  1. Take into account wins over losses (naturally!)
  2. Not be biased in favour of schools which play more games.
  3. Take into account the strength of the opponent.
  4. Take into account draws between teams.
  5. Not be skewed by home-field or away-field advantage, and
  6. Not be biased by margin of win (WHAT! I hear you say?... more on this later)

So let’s have a look at each of these in no specific order of importance.

Taking into account wins over losses

Of course, points for wins and negative points (or deductions) for losses must be used. Assume we give each team a point for a win and subtract a point for a loss, then imagine a round robin where A beats B, B beats C and C beats A. In such a case each team will end up with the same number of points (i.e. 1-1=0, 1 for the win and -1 for the loss). We need something to break the tie ...
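The tie is easy to reproduce. The snippet below is a minimal sketch of the plus-one, minus-one tally described above; the function name `win_loss_scores` and the (winner, loser) tuple format are our own choices for illustration.

```python
from collections import defaultdict

def win_loss_scores(games):
    """Return {team: wins - losses} for a list of (winner, loser) pairs."""
    scores = defaultdict(int)
    for winner, loser in games:
        scores[winner] += 1   # one point for a win
        scores[loser] -= 1    # minus one point for a loss
    return dict(scores)

# Round robin: A beats B, B beats C, C beats A.
round_robin = [("A", "B"), ("B", "C"), ("C", "A")]
scores = win_loss_scores(round_robin)
print(scores)  # {'A': 0, 'B': 0, 'C': 0}
```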

[Figure: network of the three-school round robin]

The obvious, and partially correct, solution is: more data. The more data we have, the less likely it is that schools will have equal numbers of wins and losses and, as a result, the less likely they are to be ranked together. Consider the case where another school, D, is introduced with the following games: School C beats School D and School D beats School A. The net effect of this is that the tie is broken: School C is the clear winner with two wins and one loss.
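The same tally can be rerun with School D included. This is only a sketch of the simple win-minus-loss count (not the final algorithm), and the variable names are ours.

```python
from collections import Counter

# The original round robin plus the two games involving School D.
games = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "A")]

wins = Counter(winner for winner, _ in games)
losses = Counter(loser for _, loser in games)
teams = sorted(set(wins) | set(losses))

# For each school: (wins, losses, wins - losses).
table = {t: (wins[t], losses[t], wins[t] - losses[t]) for t in teams}
for team, (w, l, s) in table.items():
    print(team, w, l, s)

leader = max(table, key=lambda t: table[t][2])
print("Top school:", leader)  # Top school: C
```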

[Figure: network with School D added]

Perhaps you have already noticed the flaw in this approach. How can School C be the top-ranked school if School B beat it? That's a very good question with a simple answer: this ranking algorithm is biased in favour of schools which played more games!

Removing the games-played bias

In order to overcome this bias we consider indirect games played between the schools. A simple example of an indirect game is that if School A plays School B and wins and School B previously played School C and won, then School A indirectly beats School C. Our simple example with indirect games in red is shown below:

[Figure: network with indirect games shown in red]
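For readers who prefer code to pictures, first-order indirect games can be listed mechanically. The sketch below assumes the network consists of the five games described earlier (A beats B, B beats C, C beats A, C beats D, D beats A); the helper `indirect_games` is a hypothetical name of our own.

```python
# Each game is a (winner, loser) pair.
games = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "A")]

def indirect_games(games):
    """List (winner, via, loser) chains: winner beat `via`, and `via` beat loser."""
    chains = []
    for winner, via in games:
        for middle, loser in games:
            if middle == via:
                chains.append((winner, via, loser))
    return chains

chains = indirect_games(games)
for winner, via, loser in chains:
    print(f"{winner} indirectly beats {loser} (via {via})")
print(len(chains), "indirect games")  # 6 indirect games
```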

In the network above we have introduced six indirect games which have the net effect of ranking School C and School B equally. Great! This aligns somewhat better with our intuition, but if you think about it carefully it raises a few more questions:

  1. Should indirect games count as much as direct games?
  2. Should we take into consideration second-order indirect games?

These are all astute questions. Enter Newman and Park, two college professors from the United States. In their paper, "A network-based ranking system for American college football", Newman and Park developed an algorithm which solves all of these problems automatically. This algorithm uses some fancy mathematics to look at all levels of indirection and weights these games appropriately based on the average number of games played by a school.
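For the mathematically curious, here is the idea as we understand it, written in our own notation rather than quoted from the paper. Let A_ij be the number of times school j has beaten school i, and let alpha be an assumed discount factor applied once for every extra level of indirection. The win score w_i, loss score l_i and total score s_i of school i can then be sketched as:

```latex
\begin{align}
w_i &= \sum_j A_{ji} + \alpha \sum_{j,k} A_{kj} A_{ji}
       + \alpha^2 \sum_{j,k,h} A_{hk} A_{kj} A_{ji} + \cdots
     = \sum_j A_{ji}\,(1 + \alpha w_j), \\
l_i &= \sum_j A_{ij}\,(1 + \alpha l_j), \\
s_i &= w_i - l_i .
\end{align}
```

The first term counts direct wins, the second counts indirect wins one step removed, and so on. The series only converges if alpha is small enough (smaller than the reciprocal of the largest eigenvalue of A), which is why the weighting has to be tied to how many games schools play.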

This is the algorithm we used at NMRQL to produce the school boy rugby teams rankings found further down, but first let's discuss some other considerations you may have regarding ranking algorithms.

Taking into Consideration Opponent Strength

Any given school, let's say A, is awarded a win score (the number of schools it has beaten) and a loss score (the number of schools it has been beaten by). The score of that school is equal to its win score minus its loss score. However, in order to assess how meaningful a victory over School A is, we need to look further than School A's score: we also need to consider the scores of the schools it has beaten or been beaten by. This is an intuitive idea: if you beat a strong school which has beaten mostly weaker schools, it is less meaningful than if you beat a strong school which has only beaten other strong schools. The Park and Newman algorithm takes this into consideration, meaning that beating strong schools who have, in turn, beaten strong schools is the optimal way to climb up the rankings. This doesn't, however, answer the question of ties!
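To make this concrete, here is a minimal sketch of a score in the spirit of the Park and Newman approach. It is not our production code: the function name `network_scores`, the discount `alpha=0.2`, the fixed iteration count and the three example games are all assumptions chosen for illustration.

```python
def network_scores(games, alpha=0.2, iterations=100):
    """Approximate win-minus-loss scores by repeated substitution.

    games: list of (winner, loser) pairs.
    alpha: assumed discount for each extra level of indirection.
    """
    teams = sorted({team for game in games for team in game})
    win = {t: 0.0 for t in teams}
    loss = {t: 0.0 for t in teams}
    for _ in range(iterations):
        new_win = {t: 0.0 for t in teams}
        new_loss = {t: 0.0 for t in teams}
        for winner, loser in games:
            # A win is worth 1 plus a discounted share of the loser's own win score.
            new_win[winner] += 1.0 + alpha * win[loser]
            # A loss costs 1 plus a discounted share of the winner's own loss score.
            new_loss[loser] += 1.0 + alpha * loss[winner]
        win, loss = new_win, new_loss
    return {t: win[t] - loss[t] for t in teams}

# Hypothetical results: A and D each have one win and no losses,
# but A's win was over B, a school that has itself won a game.
games = [("A", "B"), ("B", "C"), ("D", "C")]
scores = network_scores(games)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['A', 'D', 'B', 'C']
```

Schools A and D have identical win/loss records, yet A ranks higher because the school it beat has a win of its own.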

Solving the Draws Problem

When ties or draws occur in a normal qualitative ranking, both teams get either no points or one point. But should these teams really be treated the same? At NMRQL we don't think so, for the same reasons we elaborated on under opponent strength. In our scoring system, both teams in a tied game will get zero for the win and zero for the loss, because they have tied. However, the value of the games that each team's other opponents have won or lost gets carried through to the two teams playing. In other words, if A and B draw, but A is a stronger team by virtue of the strength of the teams (conference) A has played against and beaten, then B will get more points carried over to it due to the draw than A will get. So a draw by a weak team against a strong team adds points to the weak team and potentially reduces points for the strong team.
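One possible way to read this rule in code is shown below. It is a hypothetical sketch only: the function `draw_carry_over`, the discount `alpha` and the example scores are invented for illustration and are not the exact calculation used for the rankings.

```python
def draw_carry_over(score_a, score_b, alpha=0.2):
    """Return the points (for A, for B) picked up from a draw (hypothetical rule)."""
    # No direct win or loss point is awarded for the draw itself; each side
    # only inherits a discounted share of its opponent's existing score.
    return alpha * score_b, alpha * score_a

# Assumed scores before the draw: A is strong (+3.0), B is weak (-1.0).
gain_a, gain_b = draw_carry_over(score_a=3.0, score_b=-1.0)
print("A gains", gain_a)
print("B gains", gain_b)
```

Under this reading the weaker school B picks up points from the draw, while the stronger school A is pulled down slightly.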

Home Field Advantages?

In our analysis, we do not include home-field advantage as a contributor to or deduction from the final rank score. This could be the most contentious part of the methodology, and we are happy to agree to disagree on it.

In our view, home-field advantage probably has to do with crowd support, and we have to wonder to what extent this still adds to or detracts from a game. Some would argue that it is an unfair advantage and should result in a deduction in order to even the odds, like a handicap. We could also argue that nowadays this is offset to some degree by large inter-school derbies, where both schools fill the stadiums with supporters.

On a more practical note, many schools do not have the facilities that some of the larger rugby schools might have, and playing away on a lush, well-groomed field might in fact count in the visiting side's favour, being an improvement in conditions. Lastly, it is also quite difficult to include (scientifically, that is) an abstract or qualitative item such as home-field advantage in a purely quantitative number. While it could be a sensitive issue, we ignore this in our calculations.

The Margin-of-Win Approach

Margin of win is a very emotional issue when discussing a win/loss. The reality is that some schools play many games against poorer sides, which implies that their margin of wins is not necessarily a function of how good they are but rather how poor the conference is that they play in. We therefore ignore margin of win in the score calculation even though it might be interesting as a future analysis for this blog. What matters according to the current algorithm is that you win, that you do not lose, that you win many games and lose few, and that you win against strong sides and don’t lose against weak ones.

The ranking algorithm used to produce the list below is not our own invention and we would like to give credit to Park and Newman from the University of Michigan for the idea. If you would like a more detailed explanation of the algorithm and the mathematics behind it, please read their 2005 paper: "A network-based ranking system for American college football". The following very salient quote, which explains why they developed this algorithm, has been taken from their paper:

“One often hears from sports fans arguments of the form: “Although my team A didn’t play your team C this season, it did beat B who in turn beat C. Therefore, A is better than C and would have won had they played a game.” In fact, the argument is usually articulated with less clarity than this and more beer, but nonetheless we feel that the general line of reasoning has merit. What the fan is saying is that, in addition to a real, physical win (loss) against an opponent, an indirect win (loss) of the type described should also be considered indicative of a team’s strength (weakness). It is on precisely this kind of reasoning that we base our method of ranking.”

The remainder of this article presents the rankings, followed by a small disclaimer regarding the data. If you have any comments on the algorithm or would like to know more about what we do, you are welcome to comment in the comment section below, email us, or contact us on Twitter @NMRQL.

The rankings as at publication date

[Figure: rankings table]

A Small Disclaimer

Just as a short disclaimer, our analysis is only as good as the data we receive. So we would encourage you to add your scores and data to the www.mycomlink.co.za website and ensure that the data is up to date and correct. Please feel free to add any comments to the MyComLink blog if there are glaring mistakes in the data so that we can correct these. Hopefully we have not upset anyone with the rankings; as Gustaf Pienaar said: it’s a pat on the back and nothing more. We hope to make future improvements and to add interactive open-source tools for those who are interested in playing around with these rankings on a date or conference level.
