Stats break: keep away

Image from YouTube. Licensed under Fair Use.

or: who you gonna believe? me, or your lyin’ eyes?

You all probably know that I’m kind of a stats geek. And, yes, I am; I think that we can learn a lot from breaking down individual and team metrics. I like fooling with numbers. Damn, I really AM a sad sort of stats geek.

This is to show you that, while I am a stats geek and still think it’s worth tracking statistics, it’s also important to look hard at them. Because in some cases there really are lies, damn lies, and statistics.

Specifically, I want to look hard at this statistical breakdown of goalkeeper performance put together by the usually-invaluable Chris Henderson of All-White Kit on his Twitter feed.

Image by Chris Henderson on Twitter

You’re probably familiar with the “expected goals” statistic, xG. If you’re not, there’s a couple of nice quick breakdowns here and here. Henderson has used the xG estimate and compared it to the actual goals-against to assess the keeper’s efficiency.

Unsurprisingly, keepers playing for the better teams in the NWSL, such as Bledsoe for Washington and Naeher for Chicago (and A.D. Franch for Portland), appear in the upper half of the list. They have let in fewer goals than the xG against them in the matches they played suggested were possible, ranging from almost a goal per game for Bledsoe to about a goal every three games for Franch.

(What I find kind of fascinating on this list is that Kailen Sheridan, playing for the woeful Sky Blue, is second behind Bledsoe at almost four goals prevented every five games. Your defense sucks but You da Woman, K.S.)

But.

As I looked over this list one anomaly stuck out at me.

Look at the stats for A.D. Franch versus Britt Eckerstrom.

Both have played three games. Franch gave up 5 goals against an xG of 6.08 for a differential of -0.36 goals/game. Eckerstrom gave up 4 goals against an xG of 1.69, for an extra 0.77 goals conceded per game.
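For anyone who wants to check the arithmetic, here is a minimal Python sketch of the per-game differential using only the figures quoted above. The function name and the sign convention (negative means fewer goals conceded than expected) are my own assumptions for illustration, not necessarily how Henderson or InStat compute it.

```python
def per_game_differential(goals_against: int, xg_against: float, games: int) -> float:
    """Goals conceded minus expected goals against, averaged per game.

    Negative values mean the keeper conceded fewer goals than xG predicted.
    """
    return (goals_against - xg_against) / games


# Figures quoted in the text: three games each.
franch = per_game_differential(goals_against=5, xg_against=6.08, games=3)
eckerstrom = per_game_differential(goals_against=4, xg_against=1.69, games=3)

# Gap between the two keepers, in goals per game.
gap = eckerstrom - franch

print(f"Franch: {franch:+.2f}  Eckerstrom: {eckerstrom:+.2f}  gap: {gap:.2f}")
```

With these inputs the two differentials come out at about -0.36 and +0.77, a gap of about 1.13 goals per game.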

The statistical conclusion is that Franch is more than a goal per game better technically and tactically than Eckerstrom.

To a Thorns fan that seems ridiculous on its face. Eckerstrom has been solid, very solid in the three matches she’s played this season. Perhaps a trifle ahead of A.D. – who had something of a tough April and early May – perhaps a little behind, but certainly not an entire goal to the worse.

And, actually, if you break down the goals against Eckerstrom you’ll see that, yes, your subjective assessment is correct and the statistic is wrong.

The problem isn’t Eckerstrom; it’s a flaw in the xG system.

Here’s a diagram of the locations of the goals Eckerstrom has conceded in her three matches played, with the xG values for each.

Outside of the Hatch goal they’re all very low, ranging from 1.8% (DiBiassi) to 18.2% (Pressley and Brynjarsdottir). That’s because the xG system gives lower probability values to shots taken from corner kicks – which includes Pressley’s goal in Orlando and DiBiassi’s and Brynjarsdottir’s in Maryland.

It also gives lower odds for headed shots, which includes Pressley’s, and the own-goal.

It also gives very low odds to shots taken from the far outer flanks, such as a corner, which includes the DiBiassi goal.

Taken all together, I get a total of 0.7 xG for these four concessions.

Henderson’s calcs – or, rather, InStat’s, I suspect – are nearly a goal higher. I think this is because I undercounted the own-goal, which I believe in the xG system is an automatic 1.0. If so, that would account for most of the gap to Henderson’s cited xG value of 1.69.
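A rough check of that own-goal guess, as a small Python sketch. It assumes the 18.2% listed above is what was originally counted for the own goal, that 0.7 is a rounded total, and that an own goal really is scored as 1.0; none of that is confirmed by InStat, and the variable names are only for illustration.

```python
# Rounded sum of xG for the four goals conceded, as stated in the text.
summed_xg = 0.7

# Assumption: the own goal was first counted at the 18.2% quoted above.
own_goal_as_counted = 0.182

# Assumption from the text: the xG system treats an own goal as an automatic 1.0.
own_goal_automatic = 1.0

# Swap the low own-goal value for the automatic one.
adjusted_xg = summed_xg - own_goal_as_counted + own_goal_automatic

# Henderson's cited xG against Eckerstrom.
cited_xg = 1.69
remaining_gap = cited_xg - adjusted_xg

print(f"adjusted: {adjusted_xg:.2f}  remaining gap to cited value: {remaining_gap:.2f}")
```

Under those assumptions the adjusted total lands around 1.5, which closes most but not all of the distance to 1.69. The remainder may be rounding in the 0.7 or something else in the underlying model.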

So the problem here isn’t with Eckerstrom’s technical skill or tactical nous; it’s that the xG system dramatically underweights the attempts that resulted in goals against her.

Or, to put it another way, Eck has been desperately unlucky.

Yes, the Hatch goal was quality. The other three were a mixture of pure bad luck and defensive field player issues: Menges losing Pressley and giving up the free header in Orlando, nobody keeping Cheyna Matthews out of Eck’s grille on the DiBiassi corner, and Dagny’s freakish own-goal.

So you rock on, Eck. Henderson and I and the damn stats may keep fiddling away, but all the while you’re doing juuuuuust fine.

John Lawes

Soccer-obsessive. Stats geek. Thorns supporter. Former Slide Rule Pass and Stumptown Footy Thorns beat writer. One of those people who's "often mistaken but never wrong"...


6 thoughts on “Stats break: keep away”

  1. I love the stats too and even more the unique voice and funny comments that you include with them. Like you, I like data, but statistics can be manipulated and so they have to be viewed with a jaundiced eye. These keeper stats are a good example. Yes, three of Britt’s goals allowed were very unfortunate.

    1. The thing is that I don’t think Henderson (or InStat) fiddled the data; it’s just that the xG methodology has a weakness, and that’s that it weights certain attacking acts very lightly. The low xG number resulting from the combination of the corners, headers, and wide shots makes Eck LOOK worse than she did. The relative disparity between her and A.D. is an artifact of an xG system bias rather than manipulation.

  2. The problem with xG is that it measures how difficult the *shot* is, not how difficult the *save* is. Take diBiasi’s Olimpico, for instance: It’s tremendously difficult to put the ball in the upper corner of the goal from a corner kick, and the xG value for it is correspondingly low. People just don’t score from the corner much.

    But *if the shot is on target*, it’s a high-percentage shot. From a keeper perspective, a shot sailing into the upper corner, like diBiasi’s did, is very difficult to save. It’s not much of a knock on Eckerstrom that she didn’t get to it.

    What we need is an “expected save” (xS) statistic. Given the placement of a shot in the goalmouth, the ball’s velocity, the distance/angle of the shooter, and maybe other stuff like the spin on the ball, what percentage of the time do keepers save that shot? Then we could look at S-xS, which would be much more revealing of keeper quality than G-xG.

    1. On the DiBiassi goal I’d give credit to Matthews (or, perhaps, discredit to the Thorns backline for not keeping Matthews out of Eckerstrom’s grille). That wasn’t really much of an “olimpico”; it came right down in the center of the goal, for one thing, not in the back corner as those sorts of goals typically do. And Eck had a good play for the ball…other than the big ol’ Washington player standing RIGHT IN HER FACE! (and, yes, I’m looking at you, forwards – that’s your only job and you failed..!). I had a little hissy fit about this in the Washington match report, if you look; got a screenshot and everything.

      So the knock is really on her defenders.

      And the bottom line really is that the sorts of shots that are given low percentages from the xG system are…well, low-percentage. Think of how many times you’ve seen a REAL olimpico? I can’t recall one off the top of my head. I’m sure I’ve seen one, but nothing comes to mind. That’s why long strikes are golazos; they’re damn deadly difficult and when they’re made we remember them.

      So when a keeper saves a shot from inside the six, that’s a high-percentage xG and a low-percentage xS. So if you have a big defensive differential, or “Ddiff” (that is, a much lower number of actual goals than expected goals), you’ve either got an opponent who is trying to overwhelm you with tons of crap shots, or a keeper who’s making terrific saves, or a defense that’s throwing their bodies in front of the ball.

      Look at the Chicago match here last weekend. Chicago’s xG was 2.4, but a ton of it was worthless attempts from outside the 18. The Red Stars had about 6-8 attempts from inside the 6 or right in front of it, but those were either off-target, blocked (thanks, Kling and AMC!) or swallowed up by Eckerstrom. The team Ddiff was -2.4, a huge tribute to the defenders and keepers.

    2. And you do have a point; the xG/G thing doesn’t really differentiate between a keeper who’s standing on her head and one that has a stonewall backline. It measures a bulk quantity – goals – and not the components.

      So in that sense Henderson is, IMO, misusing the stat a bit to rank the keepers. Teams? Sure. But it’s too difficult to separate out the different factors of defending, goalkeeping, and opponent shot quality to put it all on the keeper.

      So you go, Eck!
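The “expected save” (xS) idea raised in the thread above could be sketched roughly like this in Python. Everything here is hypothetical: no such public statistic is cited in the post, the `ShotOnTarget` type and `saves_above_expected` function are invented for illustration, and the save probabilities are made-up numbers, not real match data.

```python
from dataclasses import dataclass


@dataclass
class ShotOnTarget:
    # Hypothetical xS value: how often an average keeper saves a shot like
    # this one (placement, velocity, distance/angle, and so on).
    save_probability: float
    saved: bool


def saves_above_expected(shots: list[ShotOnTarget]) -> float:
    """Return S - xS: actual saves minus the sum of save probabilities."""
    actual_saves = sum(1 for shot in shots if shot.saved)
    expected_saves = sum(shot.save_probability for shot in shots)
    return actual_saves - expected_saves


# Made-up example: two routine or coin-flip saves made, two hard shots conceded.
shots = [
    ShotOnTarget(save_probability=0.9, saved=True),
    ShotOnTarget(save_probability=0.2, saved=False),
    ShotOnTarget(save_probability=0.5, saved=True),
    ShotOnTarget(save_probability=0.1, saved=False),
]

print(f"S - xS = {saves_above_expected(shots):+.2f}")
```

A positive S - xS would mean the keeper saved more than an average keeper would be expected to from those same shots, which is closer to the question the keeper ranking is trying to answer than G - xG is.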
