Is March Madness making you crazy?

The “science” behind determining the NCAA tournament draw

Mar 5, 2017

Brack•e•tol•o•gist

/ˈbrakiˈtäləjəst/ noun

A guy sitting in his pajamas, living in mom’s basement, who programs his personal biases into a computer modeling system that spits out subjective rankings, which are then treated as “fact” by people looking to confirm their own biases.
A guy who pimps a conference with ties to the media company that pays his salary.
A guy who year after year bases his ranking system on useless metrics* that should be eliminated.

Since the BCS was disbanded, no sports institution has inspired as much frustration, rage and discussion of conspiracy theories as the selection and seeding of the NCAA Men’s Basketball Tournament.

If I were Bill Simmons, I would insert a cultural reference here, such as “where Obama considered a Congressional investigation into the BCS, my guess is the current president’s only interest in sports would be having his staff communicate with college athletic director PACs to book more events at a Trump hotel.”

Since I won’t do that, I will use another Simmons device that I really like, the anonymous player/team comparison. Let’s compare all the potential top seeds for the upcoming tournament using the following metrics: win-loss record against Top 10 teams (T10), signature road wins (“at” followed by the team’s ranking), win-loss record against teams ranked 11–25 (T25), win-loss record against teams ranked 26–50 (T50), total losses and the rankings of their opposition.

Team A: 3–2 T10 (at 8, at 9), 0–0 T25, 2–1 T50, 3 losses (4, 9, 38)
Team B: 2–0 T10, 2–0 T25, 1–0 T50, 1 loss (68)
Team C: 0–2 T10, 4–0 T25, 7–0 T50, 2 losses (10, at 10)
Team D: 3–0 T10 (at 7, at 8), 2–1 T25, 4–1 T50, 3 losses (23, 30, 92)

Team E: 1–1 T10, 0–3 T25, 6–3 T50, 7 losses (8, 11, 12, 13, 28, 48, 48)
Team F: 1–1 T10, 4–2 T25, 5–1 T50, 6 losses (8, 13, 16, 37, 91, 92)
Team G: 2–2 T10, 0–0 T25, 2–0 T50, 4 losses (3, 9, 97, 106)
Team H: 1–2 T10, 3–4 T25, 3–1 T50, 7 losses (6, 7, 12, 15, 15, 22, 38)

Team I: 2–2 T10, 2–1 T25, 5–1 T50, 6 losses (2, 2, 23, 30, 59, 100)
Team J: 2–4 T10, 0–0 T25, 7–0 T50, 5 losses (2, 3, 6, 15, 80)
Team K: 1–4 T10, 0–0 T25, 3–0 T50, 4 losses (2, 3, 4, 10)
Team L: 3–0 T10 (at 1), 1–0 T25, 5–3 T50, 7 losses (26, 26, 50, 55, 97, 121, 230)

Team M: 2–1 T10, 4–2 T25, 4–0 T50, 7 losses (4, 13, 22, 70, 79, 91, 123)
Team N: 2–3 T10, 3–0 T25, 5–2 T50, 8 losses (2, 5, 6, 12, 31, 37, 79, 112)
Team O: 0–1 T10, 1–1 T25, 2–1 T50, 4 losses (10, 19, 42, 80)
Team P: 3–2 T10 (at 5), 1–3 T25, 2–2 T50, 9 losses (1, 6, 12, 13, 23, 31, 37, 70, 79)

If you look at the numbers without searching for the team identities, they raise some interesting questions.

Does the team that almost never loses, but plays a less risky schedule take precedence over the team with lots of losses to powerful teams?

Does the team with a ton of decent wins and two losses against an elite team take precedence over a team with outstanding wins but a few horrible losses?

Does the team that has the most impressive wins but a smaller sample size take precedence over a team with a large sample size of solid wins but lots of losses?

How would you weight these metrics?

In my opinion, the key measurements should be:

Signature road wins — home court advantage is huge in college basketball, as evidenced by how many ranked teams get upset on the road by weaker teams. To be considered a top team, beat a top 10 team on the road. (Only four teams in the list above even accomplished this feat.)
Record against elite teams — without a winning record against top 10 or top 20 teams, teams haven’t proved they deserve to be considered elite. It’s not enough to just have some “good losses.” (Only six teams have a winning record against top 10 teams.)
Winning the conference tournament — aside from the NCAAs, these are the most pressure-filled games of the year, played back-to-back-to-back-to-back on a neutral court against teams that have usually played each other twice. There are no surprises in conference tournament basketball. The toughest teams prove they can get through a gauntlet that prepares them for March Madness. [Only three conference tournaments will feature multiple top 10 teams playing each other: PAC-12 (3), Big 12 (2), ACC (2). Only three conference tournaments will feature more than two top 25 teams playing each other: ACC (7), Big 12 (4), PAC-12 (3).]
Overall consistency — if a team can win 90% of its games at the D1 level, or never loses to a team ranked below the top 10, it has to receive a bonus for being mentally tough and well coached. (It’s tough to win conference games on the road, even against weaker teams, because teams know each other and coaches have more time to prepare.) On the other hand, a team that beats the #1 team and then loses to the #150 team shows it has tremendous upside but also risks getting upset in the first round of the tournament. So there has to be a penalty for bad losses. (Only four teams above have a winning percentage over 90%. Only two teams have no losses against teams outside the top 10.)

The first tie breaker would be to consider the win-loss record against the teams ranked 11–25.

The final tie breaker would be to include the win-loss record against weaker teams (26–50). I completely left out the category of win-loss record against teams ranked 51–100 (T100), because there is something really wrong with a system that rewards teams for winning a lot of games against teams they are supposed to beat.
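For readers who want to check the counts quoted in the criteria above, here is a minimal Python sketch that transcribes the sixteen anonymous resumes and tallies three of the measurements. The `Resume` type, its field names, and the three list names are my own labels for illustration, not part of any official ranking system. The 90% winning-percentage check is left out because the list does not give full season records, and the sketch does not try to reproduce the seed order, which also depends on judgment calls and conference tournament results.

```python
from typing import NamedTuple


class Resume(NamedTuple):
    """One line of the anonymous comparison (field names are illustrative)."""
    t10: tuple        # (wins, losses) against teams ranked 1-10
    road_wins: tuple  # ranks of top-10 opponents beaten on the road ("at N")
    t25: tuple        # (wins, losses) against teams ranked 11-25
    t50: tuple        # (wins, losses) against teams ranked 26-50
    losses: tuple     # rank of the opponent in every loss


# Transcribed from the Team A through Team P lines in the article.
RESUMES = {
    "A": Resume((3, 2), (8, 9), (0, 0), (2, 1), (4, 9, 38)),
    "B": Resume((2, 0), (), (2, 0), (1, 0), (68,)),
    "C": Resume((0, 2), (), (4, 0), (7, 0), (10, 10)),
    "D": Resume((3, 0), (7, 8), (2, 1), (4, 1), (23, 30, 92)),
    "E": Resume((1, 1), (), (0, 3), (6, 3), (8, 11, 12, 13, 28, 48, 48)),
    "F": Resume((1, 1), (), (4, 2), (5, 1), (8, 13, 16, 37, 91, 92)),
    "G": Resume((2, 2), (), (0, 0), (2, 0), (3, 9, 97, 106)),
    "H": Resume((1, 2), (), (3, 4), (3, 1), (6, 7, 12, 15, 15, 22, 38)),
    "I": Resume((2, 2), (), (2, 1), (5, 1), (2, 2, 23, 30, 59, 100)),
    "J": Resume((2, 4), (), (0, 0), (7, 0), (2, 3, 6, 15, 80)),
    "K": Resume((1, 4), (), (0, 0), (3, 0), (2, 3, 4, 10)),
    "L": Resume((3, 0), (1,), (1, 0), (5, 3), (26, 26, 50, 55, 97, 121, 230)),
    "M": Resume((2, 1), (), (4, 2), (4, 0), (4, 13, 22, 70, 79, 91, 123)),
    "N": Resume((2, 3), (), (3, 0), (5, 2), (2, 5, 6, 12, 31, 37, 79, 112)),
    "O": Resume((0, 1), (), (1, 1), (2, 1), (10, 19, 42, 80)),
    "P": Resume((3, 2), (5,), (1, 3), (2, 2), (1, 6, 12, 13, 23, 31, 37, 70, 79)),
}

# Teams with at least one road win over a top-10 opponent.
signature_road = [t for t, r in RESUMES.items() if r.road_wins]

# Teams with more wins than losses against the top 10.
winning_t10 = [t for t, r in RESUMES.items() if r.t10[0] > r.t10[1]]

# Teams whose every loss came against a top-10 opponent.
only_top10_losses = [
    t for t, r in RESUMES.items() if all(rank <= 10 for rank in r.losses)
]

print("Signature road wins:", signature_road)
print("Winning T10 record:", winning_t10)
print("No losses outside top 10:", only_top10_losses)
```

Run as written, the three lists come out to A, D, L, P; then A, B, D, L, M, P; then C, K, which matches the "four teams," "six teams" and "two teams" counts quoted above.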

Based on the above, I would award the seeds as follows:

#1 Seeds: D, C, B, A

#2 Seeds: F, L, I, M

#3 Seeds: G, J, N, P

#4 Seeds: K, E, H, O

D was the overall #1 seed because of its dominance in every major category, followed by C, which dominated in everything except its record against, and number of matchups with, the most elite teams. A and B both had strengths and weaknesses in their resumes, so it was basically a coin toss for now. The conference championships will determine their final order.

All the other teams had fatal flaws in their records that dropped them into the lower seeding lines.

Let’s hope that when the smoke clears and the conference tournaments have been completed, the selection committee finally makes sure that the teams are seeded properly.

*Metrics that need to be eliminated:

Non-conference SOS (unless it’s a signature win against an opponent ranked in the top 25 at the end of the season) is completely irrelevant in determining the top seeds in the tournament for three reasons:

Most of the games are played in November and December, so teams with freshmen are going to make tons of mistakes early. By the end of the regular season, these teams have matured and are playing at a completely different level.
Scheduling occurs at least a couple of years in advance, so even if schools are trying to play tough teams, future opponents can be weak when the game is finally played. This is not football, where a cream puff game (a constant, known years in advance) is purposely scheduled the week before a tough game, thus gaining an unfair advantage by allowing the stronger team to rest its starters and give injured players more time to heal. It makes no sense to punish teams for something outside their control.
Playing average teams (RPI 51–150) means nothing to an elite team. If a team beats these opponents by double digits on average, playing five or ten more of these games does not prove it is truly elite.

Obsession over conference SOS. While conference strength of schedule is used to determine if a team with a poor record should get into the tournament, it means absolutely nothing with regard to the top teams, and here’s why:

No conference has top 10 teams from top to bottom. As I wrote above, if a team is in the top 10, winning more games against weaker teams does not prove how good that team really is.
Analysts suffer from past performance bias. In the same way investors are warned “past results are no indication of future performance,” bracketologists should be warned about assigning subjective weights to a conference based on past play. Mark Titus, a huge Big 10 supporter, has written about how bad the conference is this year, and yet it is projected to have up to seven teams in the NCAA tournament.
If we combine the law of averages with the strong home court advantage in college basketball, just about any team can get one or two upsets during the course of a 30-game season. So why should a team get rewarded for proving they can lose almost all of the time against better teams? Here’s a comparison between two teams, X (weak conference) and Y (strong conference):
Team X: 1–5 T10, 1–0 T25, 0–0 T50, 3–2 T100, and 8 losses (4, 4, 9, 9, 16, 53, 73, 133)
Team Y: 2–3 T10, 0–2 T25, 3–3 T50, 5–5 T100, and 14 losses (8, 8, 10, 17, 21, 27, 39, 49, 57, 71, 76, 81, 83, 261)
Currently, bracketologists rank team Y above team X to get into the tournament, using the same computer metrics that rank team X 10 spots higher than team Y. With the larger sample size, team Y performs at roughly the same winning percentage against elite teams and has more top 50 wins, but it also has far more losses against sub-top 50 teams. The only reason Y is given preference is the belief that it is in a strong conference.
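The X versus Y comparison can be worked through with a few lines of Python. This is only a sketch: treating “elite” as the top 25 (T10 plus T25 combined) is my assumption, as are the `summarize` helper and its key names. The records and loss lists are copied from the two lines above.

```python
# Win-loss records by tier, copied from the Team X and Team Y lines.
TEAM_X = {"T10": (1, 5), "T25": (1, 0), "T50": (0, 0), "T100": (3, 2)}
TEAM_Y = {"T10": (2, 3), "T25": (0, 2), "T50": (3, 3), "T100": (5, 5)}

# Rank of the opponent in each loss, copied from the same lines.
X_LOSSES = (4, 4, 9, 9, 16, 53, 73, 133)
Y_LOSSES = (8, 8, 10, 17, 21, 27, 39, 49, 57, 71, 76, 81, 83, 261)


def combined(record, tiers):
    """Add up (wins, losses) across the given tiers."""
    wins = sum(record[t][0] for t in tiers)
    losses = sum(record[t][1] for t in tiers)
    return wins, losses


def summarize(record, loss_ranks):
    """Hypothetical helper: the three numbers the comparison turns on."""
    # Assumption: "elite" means ranked in the top 25.
    elite_wins, elite_losses = combined(record, ("T10", "T25"))
    games = elite_wins + elite_losses
    return {
        "elite_record": (elite_wins, elite_losses),
        "elite_win_pct": elite_wins / games if games else 0.0,
        "top50_wins": combined(record, ("T10", "T25", "T50"))[0],
        "sub_top50_losses": sum(1 for rank in loss_ranks if rank > 50),
    }


for name, record, loss_ranks in (("X", TEAM_X, X_LOSSES), ("Y", TEAM_Y, Y_LOSSES)):
    print(name, summarize(record, loss_ranks))
```

Under that assumption both teams are 2–5 against the top 25, Y has five top 50 wins to X’s two, and Y has six losses to teams outside the top 50 against X’s three.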