[ODIs] - Babar Metrics & a Statistical Deep Dive into Pakistani Batsmen of Last 20 Years

ahmedwaqas92 · Nov 13, 2020

Now that emotions have cooled down after the 3rd ODI vs Zimbabwe, I believe this is the best time to reflect (in a realistic & logical manner) on what the current lot offers compared to the batting resources Pakistan have had earlier, i.e. in the last 20-odd years.

The loss in the 3rd ODI of the recently concluded Zimbabwe series saw many mainstream critics have a go at the team, especially our top 3, with Babar being singled out as a significant factor in our LOI pitfalls. This thread is not by any means a defense against that narrative; however, it puts into perspective where we were historically, the resources we have right now as our mainstays, and what we should realistically be looking towards in the future.

When putting forth any narrative, or testing a given hypothesis, the building blocks must almost always be tangible datasets that reflect the argument. To that end I took a deep dive into historical data on all Pakistani batsmen who debuted in ODIs in the last 20 years (from 2000 up until 2020) and have played a minimum of 20 innings. I also put an upper bound on every batsman, capping each individual at the 75-innings mark. This was done to standardize the datasets and give us a representative snapshot of the trend Pakistani batsmen follow once they are a mainstay in the national team.

So what historical data exactly did I capture? Now comes the fun bit.

Right off the bat I disregarded the usual suspects (Career Average, Career SR), since they have severe limitations in capturing the true essence of how a batsman is performing in the modern era. So for every batsman I calculated the following five metrics (for each innings they played):

[table="width: 500, class: grid, align: center"]
[tr]
[td]Variable[/td]
[td]Inference[/td]
[/tr]
[tr]
[td]M-RPI[/td]
[td]I recorded this variable as the Moving RPI (Runs per Innings) for each batsman. The logic behind going with RPI rather than, let's say, averages was to remove the effect of not-outs from the datasets, which skews the results in either direction. The 'Moving' bit, which tracks how RPI changed with each game the batsman played, was done so as to capture bad patches and the sustainability of a batsman's form.[/td]
[/tr]
[tr]
[td]Z-RPI[/td]
[td]This is the Z score of each M-RPI data point in the study. A Z score measures how far a data point sits from the mean of the dataset, in units of standard deviation. A positive Z score means the data point is above the aggregate mean of the population, while a negative one means it is below. In terms of batsmen's RPI, Z scores that mostly sit on the positive (right-hand) side of the distribution highlight consistency for the individual, while those that are mostly negative imply a lack of consistency.[/td]
[/tr]
[tr]
[td]M-SR[/td]
[td]Similar to the Moving RPI above, M-SR represents the Moving Strike Rate. The fundamental concept behind this variable is the same as M-RPI, but instead of Runs per Innings it shows the strike rate for each batsman after every innings they played.[/td]
[/tr]
[tr]
[td]Z-SR[/td]
[td]Z-SR is the Z score of the Moving SR mentioned above. Again, the conceptual framework of this variable is identical to the one explained in Z-RPI. The only difference is that, instead of Runs per Innings, this variable shows the fluctuations in the Strike Rate of a given batsman.[/td]
[/tr]
[tr]
[td]D-Ratio[/td]
[td]I have seen a lot of critics and 'Analysts' throw the term 'Dynamic Batsman' around, here as well as in our mainstream media, however NEVER have I seen anyone try to tie the term mathematically to any given framework. For this particular analysis, I have generated a D-Ratio (Dynamic Ratio), which is the total runs scored by a batsman via boundaries (in each innings) divided by the total runs he scored in rotation (1s/2s/3s). If the ratio comes to more than 1.00, the batsman can be said to be "Dynamic" (for that particular innings), while if the ratio is less than 1.00 but greater than 0.00 the batsman will be termed "Least Dynamic". If in any given innings the batsman does not score via either boundaries or rotation, so that there is a 0 on either side of the equation, then we will label that innings as "Undynamic".[/td]
[/tr]
[/table]
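
For anyone who wants to replicate the idea, here is a minimal sketch of how M-RPI and Z-RPI could be computed. Two assumptions on my part that the post does not spell out: 'Moving' is taken as the cumulative runs per innings after each innings, and the Z score is measured against the mean and population standard deviation of whatever pool of M-RPI points is passed in. The function names and the sample scores are made up.

```python
from statistics import mean, pstdev


def moving_rpi(scores):
    """Cumulative runs per innings after each innings (not-outs play no role)."""
    out, total = [], 0
    for n, runs in enumerate(scores, start=1):
        total += runs
        out.append(total / n)
    return out


def z_scores(values, pool=None):
    """Z score of each value against the mean and SD of `pool`.

    Assumption: if no pool is given, the values themselves are the pool.
    """
    pool = values if pool is None else pool
    mu, sigma = mean(pool), pstdev(pool)
    if sigma == 0:
        # Every point equals the mean, so nothing deviates.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


scores = [10, 50, 30, 70]   # hypothetical innings scores
m_rpi = moving_rpi(scores)  # [10.0, 30.0, 30.0, 40.0]
z_rpi = z_scores(m_rpi)
```

Passing the M-RPI points of all batsmen as `pool` would give Z scores relative to the whole group rather than to the player's own series; which of the two the charts use is not stated.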

With the variables sorted, let's take the Moving RPI & see for ourselves the trend batsmen have followed over a maximum of 75 innings in the last 20 years. To show how every batsman performed under the aforementioned qualification criteria, I animated the results as an RPI race for ease of viewing.

Babar is a class act, but in the first 15-20 odd innings of his career (which started in 2016) he wasn't as impactful as he is TODAY. Off the blocks, Fakhar & Imam have a considerable lead in terms of RPI over the rest of the batsmen (24 in total), however by innings number 30 Babar starts to hit his stride, and by innings 50 he is so far ahead of the curve it's not even funny. This implies two qualities in Babar's game:

(i) The guy has a knack for improving (with each game) & this can be clearly seen across his 75 innings.
(ii) Babar has an increasing M-RPI, which means his consistency is almost unmatched in the way he makes those runs - day in, day out.

To further illustrate this, I did a rank-based race animation (for the same Moving RPI), and the results show how Babar just leaves everyone behind past the 30-innings mark. By the end of the race he is almost 8-10 runs ahead of his nearest competition, which is an unreal gap between a batsman and his immediate peers. Fair play to Imam as well, since his RPI matches that of Babar up to his most recent international innings (though he lacks Babar's SR by quite a lot), but whether he can maintain the level of consistency Babar has shown over 75 innings remains to be seen.
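
A rough sketch of the ranking step behind a race chart like this, assuming each batsman's M-RPI series is already available as a list. The names and numbers are hypothetical, and carrying a batsman's last value forward once his innings run out is my own assumption, not something the post states.

```python
def ranks_at(innings_no, m_rpi_by_player):
    """Rank players (1 = highest M-RPI) after `innings_no` innings.

    Assumption: a player with fewer innings keeps his last available value.
    Every series must contain at least one innings.
    """
    current = {
        name: series[min(innings_no, len(series)) - 1]
        for name, series in m_rpi_by_player.items()
    }
    ordered = sorted(current, key=current.get, reverse=True)
    return {name: pos for pos, name in enumerate(ordered, start=1)}


# Hypothetical M-RPI series for three made-up batsmen.
data = {
    "A": [10.0, 30.0, 45.0],
    "B": [40.0, 35.0, 33.0],
    "C": [20.0, 20.0],
}
early = ranks_at(1, data)  # B leads early on
late = ranks_at(3, data)   # A has overtaken by innings 3
```

Calling `ranks_at` for innings 1 through 75 gives one frame of the animation per innings.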

Link - https://www.dailymotion.com/video/x7xgb11 (You can also refer to the next Post)

Whatever a person's opinion might be on our current lot (everyone's entitled to theirs), no factual data can deny the simple fact that the batsmen Pakistan have introduced in the last 2-odd years are NOT cut from the same cloth as those who debuted before them, going back as far as the year 2000. The top 3 we have right now simply have numbers that Pakistani fans have never seen from their top 3 in LOIs in the entirety of their cricketing history. That said, there is still room for massive improvement, and players like Haider, Khushdil and Abdullah Shafique provide an additional X-factor on top of the likes of Imam / Fakhar, who along with Babar (statistically) make the best 1/2/3 Pakistan have put on the field in LOIs in the last 20 years.

Moving on to Strike Rate and the criticism Babar cops every now and then

There's this philosophy (and I use the term loosely here) that somehow Babar doesn't have the power game of either his predecessors or his competitive peers. I have no idea where this notion came from, because he averages 56@87, but to put things into perspective I again did a short animation of the Moving Strike Rate (M-SR) Babar has displayed over his 75 innings and compared that with the 23 other individuals who have played a minimum of 20 ODI innings in the last 2 decades - here are the results:

Even here Babar remains in the upper quartile of batsmen by SR. This list includes the likes of Sharjeel/Fakhar/Kamran, who historically were claimed to have a better power game than Babar, yet even here Babar pushes himself into the top tier. Considering he also manages to score an average of 56 runs per game, a batsman with an 87 M-SR (#6 in the list of the most explosive batsmen of the last 20 years) makes for some unreal reading when we put the numbers into valid context. These numbers not only make Babar the best batsman we have had in the last 20 years, but also the best LOI player we have produced in our cricketing history (STATISTICALLY there is no one ahead of him right now).
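
For reference, a minimal sketch of a Moving Strike Rate. I am assuming it is the cumulative strike rate (all runs so far divided by all balls faced so far, times 100); the post does not say whether it is that or an average of per-innings strike rates, so treat this as one plausible reading. The function name and the sample figures are hypothetical.

```python
def moving_sr(innings):
    """Cumulative strike rate after each innings.

    `innings` is a list of (runs, balls_faced) tuples.
    """
    out, runs_total, balls_total = [], 0, 0
    for runs, balls in innings:
        runs_total += runs
        balls_total += balls
        # Guard against a career that starts with zero balls faced.
        out.append(100 * runs_total / balls_total if balls_total else 0.0)
    return out


m_sr = moving_sr([(40, 50), (60, 50)])  # [80.0, 100.0]
```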

To hammer home how Babar leads the charts while still maintaining his supreme form, here are his Z-RPI and Z-SR per innings for the same dataset, mapped against the Z-RPI and Z-SR of the remaining 23 batsmen.

View attachment 104473
View attachment 104474

If you follow the dotted lines in both charts above (these represent Babar's data points), we can clearly see how he hardly deviates in either direction, making him a hallmark of consistency & someone we have built our batting strategy around. In both his ability to make runs and his relative SR he is ahead of everyone we have ever produced in our LOI cricketing history, while the 2nd and 3rd best are also part of the setup right now.

This raises the question of whether people who bring up past laurels are simply misinformed, or whether there is a selective narrative being pushed at the national team to ensure that certain players get representation without merit. Whatever the case may be, Pakistan's ODI top 3 has never before had three players with M-RPIs in the mid to late 40s, along with the level of consistency that Babar brings. That said, is there room for improvement? ABSOLUTELY YES. Babar can and should take his game to the next level even now, by developing a stronger base which might help him capitalize on bowlers via brute strength (if needed).

Finally, if we look at the D-Ratio (which is basically how dynamic a batsman is, based on the definition provided above), Babar still has a high probability of scoring well in boundaries as well as via rotation. He scores dynamically roughly once in every 3 games, which is a very good return for someone who anchors the innings. Fakhar is dynamic more often than Babar, but his consistency via Z-RPI and Z-SR is rather lacking. Imam (even though he has a good Z-RPI) lacks quite a bit when it comes to M-SR, so even among the three top-tier bats of the last 20 years Babar still comes out as the best one by a considerable margin. The D-Ratio metrics are as follows. The numbers in the donut chart represent how many innings (out of the total they played) were Dynamic, Least Dynamic or Undynamic.

View attachment 104475
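
To make the D-Ratio labels concrete, here is a small sketch of the classification as defined in the table. The function name and the sample innings are hypothetical. The definition does not say what happens at a ratio of exactly 1.00, so putting it under "Least Dynamic" is my own assumption.

```python
from collections import Counter


def classify_innings(boundary_runs, rotation_runs):
    """Label one innings by its D-Ratio (boundary runs / rotation runs)."""
    # A 0 on either side of the ratio is labelled "Undynamic".
    if boundary_runs == 0 or rotation_runs == 0:
        return "Undynamic"
    ratio = boundary_runs / rotation_runs
    # Assumption: a ratio of exactly 1.00 falls under "Least Dynamic".
    return "Dynamic" if ratio > 1.0 else "Least Dynamic"


# Hypothetical innings as (runs in boundaries, runs in 1s/2s/3s).
innings = [(40, 30), (12, 30), (0, 25), (8, 0)]
tally = Counter(classify_innings(b, r) for b, r in innings)
```

The `tally` counts per label are the kind of numbers a donut chart would show for each batsman.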

P.S. Imran Farhat's and Yasir Hameed's metrics surprised me the most. Historically, these two were the closest competition to Babar/Fakhar/Imam in the last 20 years. I was not expecting that at all. The rest of the batsmen have been rather underwhelming through their careers, especially in LOIs.

ahmedwaqas92 · Nov 13, 2020

Link to Rank Based Video

Thunderbolt14 · Nov 13, 2020

Oh wow.

ahmedwaqas92 · Nov 13, 2020

Just wanted to put all this out on PP. Discourse should be intelligent and fact based.

Thunderbolt14 · Nov 13, 2020

Just taking my time to soak all the analysis in, and understand the data.
In the meantime, hats off to you for the fantastic work. Love it.

Have to ask, is Babar Metrics a play on the term sabermetrics? :afridi

I love it

Thunderbolt14 · Nov 13, 2020

Having just read through everything, I want to say it’s a fantastic analysis.

I’m wondering about Fakhar and Imam - statistically they are clearly head and shoulders above many options we’ve tried in the past. However, I’m curious as to how Fakhar’s recent dip in form has translated to his moving RPI and SR in ODIs, and whether Imam’s have shown any upward trajectory.

Those are the two criticisms levelled against them, and I suppose your metrics could feasibly provide insight into whether it’s worth persevering with Fakhar, for example, for the 2023 World Cup. Or should Imam be replaced with someone more “dynamic”, if we were to apply similar metrics to the List A scores of upcoming (not yet debuted) batsmen?

I know these aren’t easy questions to answer and the data never gives a clear yes or no, only insight for us to make better decisions on. But I’m curious if you can do similar overlays on the two of them the way you have done for Babar, so maybe we can understand their value to the team better.

And in your opinion, personally speaking, what should our ODI lineup look like over the next 3 years?

ahmedwaqas92 · Nov 13, 2020

Overlays on trends can be calculated by taking a frequency distribution of the Z scores, split into positive and negative RPI/SR data points. The higher the percentage of positive data points for a player, the better the consistency he displays on the field for M-RPI and M-SR.

Frequency distribution of the Z scores for Imam and Fakhar (RPI and SR).
Imam's Numbers
View attachment 104477

Fakhar's Numbers
View attachment 104478

If you take the percentage of positive Z-score data points for both batsmen, RPI and SR come to 37.5% & 35% for Imam (across all innings he's played) and 40% & 21% for Fakhar (across all innings he's played). If you then compare that to Babar's output, shown below, the difference can clearly be seen.

Babar's Numbers
View attachment 104479

His share of positive Z scores is 53% for RPI and 47% for SR, implying that in roughly every second game Babar raises his own bar and provides a better output than his aggregate over all previous outings combined. That's how you mark the difference between elite players like Babar and very good batsmen like Imam/Fakhar.
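
The percentages above boil down to counting how many Z scores sit above zero. A minimal sketch, with a hypothetical function name and made-up Z values:

```python
def positive_share(z_values):
    """Percentage of Z scores strictly above zero."""
    if not z_values:
        return 0.0
    positives = sum(1 for z in z_values if z > 0)
    return 100 * positives / len(z_values)


share = positive_share([-1.0, 0.5, 1.2, -0.3])  # 50.0
```

A Z score of exactly 0 is counted as not positive here, which is my own choice.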

unemployedgm · Nov 18, 2020

I would like to speak to you.

ahmedwaqas92 · Nov 18, 2020

Yeah Sure

Thunderbolt14 · Nov 18, 2020

Coming back to Imam and Fakhar’s superiority over previous Pakistani batsmen, is there any reliable method of adjusting their statistics as per variables pertaining to the era - specifically flatter pitches, smaller grounds, rules favoring batsmen, and quality of opposition? I almost get the feeling that if you went back a few years before 2000 (the start of your sample set), Imam’s stats would overshadow even Saeed Anwar’s for example.

ahmedwaqas92 · Nov 18, 2020

Determining anything tangible for such variables is hard, because data collection even as recently as 15 years ago was quite patchy, and almost non-existent (beyond avg & SR) around the 90s/2000s. So either we create a standardized framework with lots of error, since the accuracy of the result will be based on aggregates, or we disregard all that and go with a model that is more precise.

Therefore, the only way one can counter that narrative is by running a comparative cluster analysis on batsmen and their competitive peers. In that regard Anwar could still be ahead in terms of opening numbers (I have to check that), but in terms of actual SR & avg, Imam is streets ahead of any of his predecessors as an opener.

Thunderbolt14 · Nov 18, 2020

I was thinking something similar - adjusting RPI for example versus the average innings totals for that era (even then, this disregards things like home tracks etc).
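
One rough way to express that adjustment: scale a batsman's RPI by the ratio of a chosen baseline innings total to the average innings total of his own era. This is only a sketch of the idea; the function name and every number below are hypothetical placeholders, not real era averages, and it ignores home tracks and opposition quality entirely.

```python
def era_adjusted_rpi(rpi, era_avg_total, baseline_total):
    """Scale RPI so that eras with different scoring levels are comparable.

    rpi            -- the batsman's runs per innings
    era_avg_total  -- average team innings total in the batsman's era
    baseline_total -- the common innings total all eras are scaled to
    """
    if era_avg_total <= 0:
        raise ValueError("era_avg_total must be positive")
    return rpi * baseline_total / era_avg_total


# Hypothetical: an RPI of 45 in a 270-per-innings era, rescaled to a 240 baseline.
adjusted = era_adjusted_rpi(45.0, 270.0, 240.0)
```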

Perhaps easier for you to do, I’d like to see an analysis of our current era batsmen versus their peers from 2010 onwards from all countries, such as Fakhar versus Finch for example.

I’m only asking because I love the work you’ve put in and think it’s very insightful!

Farabi · Nov 18, 2020

Great analysis. Agreed with Thunderbolt here that the result doesn’t pass the smell test.
Sure, Imam’s average is pretty high (53) but his SR is dismally low (80) for 2020. Today’s best top order batsmen are averaging 50+ (think Kohli, Babar, Root, Williamson etc). In the 90s, the same benchmark was a 40 average. Think Saeed, Sachin, Ponting etc.
Saeed Anwar’s 80 SR was on the faster side for the 90s, which along with his average made him an impactful player for PK.
Imam’s 80 SR is a total drag on the team when the world’s best (like Roy, Warner, Finch, Bairstow, QDK, Dhawan) are going at 100, or in other cases possess the game to make up for it later (Rohit).

I’d rather have Imam with an SR of 95 and an average of 40.

ahmedwaqas92 · Nov 18, 2020

Thank you for the appreciation, brother.

However, my aim for the thread in the original post wasn't a comparison of international players from different countries. It was a deep dive into the comparative benchmarks of Pakistani batsmen over the last 2 decades, and to put things into context Z scores were taken to highlight the consistency with which every single one of those ODI batters performed, so as to normalize data across the different eras.

Never, in any part of the aforementioned tabulation, does the study imply Imam to be a leading LOI batter when it comes to impact. That discussion is an entirely different facet of performance metrics and should be discussed under its own thread.

Arsal_AK · Nov 18, 2020

Good post, some really interesting insights. And I agree, analysis should be based on facts and a solid foundation, otherwise most of the time it’s a misguided opinion.
I have a few questions and then a few points as well.

1- You seem to be calculating the mean and SD from the whole sample set (every innings for every player), and then for the Z score you use each innings as a separate value, right?

2- Why cap it at 75 innings? The standardization is built into the Z score (on a side note, this is the first time I have read it referred to as such and had to look it up just to be sure). You can look at percentages in direct comparisons and in the dynamic ratio, as you do in a later post, so that works out as well.

Say a player plays 200 matches and peaks between matches 90 and 180. The cap does that batsman a disservice.

3- Instead of saying how many times Babar is above the mean, it would be even better to see how many times he lands within 1σ, 2σ and so on, wouldn’t it?

It’s one thing to say that a player is above the mean, it’s another to look at how many standard deviations he is above the mean. That would make this analysis much better and shed more light on how good Babar is.

Take your follow-up post comparing Imam, Fakhar and Babar. The more innings you have further above the mean, the better you are.
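
A rough sketch of what points 1 and 3 describe, in Python. The function names and the numbers are made up for illustration and are not taken from the thread's dataset. It also assumes a population SD over the pooled sample, which the original post does not confirm.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def pooled_z_scores(player_values, pool):
    """Z score of each player value against the mean and SD of the whole pool."""
    mu = mean(pool)
    sigma = pstdev(pool)  # population SD: an assumption, the thread does not say which SD was used
    if sigma == 0:
        raise ValueError("pool has zero spread, Z scores are undefined")
    return [(v - mu) / sigma for v in player_values]


def count_sigma_bands(z_values):
    """Count values per band: key k covers k <= z < k + 1 (so 1 means between +1σ and +2σ)."""
    return Counter(math.floor(z) for z in z_values)


# Illustrative numbers only, not real moving-RPI data.
pool = [10, 20, 30, 40, 50]   # every data point from every batsman
player = [30, 45, 60, 15]     # one batsman's data points

bands = count_sigma_bands(pooled_z_scores(player, pool))
for k in sorted(bands):
    print(f"{k:+d}σ to {k + 1:+d}σ: {bands[k]} innings")
```

Counting per band, rather than only above or below the mean, separates a batsman who sits just over the mean from one who regularly lands 2σ clear of it.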

4- Your graphs are difficult for me to read. Can you tell me how you are plotting the data? Why is there so much overlap between the respective graphs? That is to say, what does the width of each case represent and where does it come from?

5- The dynamic definition. There is something it’s not capturing, and that is evident from your pie charts as well: Ahmed Shehzad seems to be more dynamic than Babar.

To test it further, consider the following case of two players, A and B.
A hits 4 boundaries and runs 10 runs. He also plays 20 dot balls. His dynamic ratio is 1.6.
B, on the other hand, has also hit 4 boundaries but takes 20 singles with 10 dot balls. His dynamic ratio is 0.8.

Isn’t B better? What I am getting at is that the definition used here doesn’t seem to be useful. In particular, the dot balls aren’t accounted for, and that is really benefiting A over B.

Furthermore, there is the undynamic case, which we seem to be discarding because the ratio will either be zero or tend to infinity. The reasons are valid, but that does leave a lot of innings out of the data.

Now, in one respect that’s good, because it takes away innings like a duck, one boundary and out, or a couple of singles and out. That’s not something you want skewing your data.
But I would take it a step further and also exclude innings like 7 off 2 balls, which gives a dynamic ratio of 6. That will be included and counted as dynamic going by the definition.

It will also take away, say, 20 off 10 balls with 5 fours, or 20 off 30 with no boundary. Those aren’t so black and white. Taking it to the extreme, it will miss the 10 off 30 innings, which certainly should be in the not dynamic column.
But this is a minor issue and it does some good, so it doesn’t matter that much for now.
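
To make the A versus B case concrete, here is a small Python sketch of the D-ratio as defined in the original post, next to a plain strike rate that does count dot balls. Several things here are assumptions and not from the thread: the `Innings` type and function names, all boundaries being fours, A's 10 run runs being singles (so both players face 34 balls), and a ratio of exactly 1.00 falling under "Least Dynamic", which the original definition leaves open.

```python
from dataclasses import dataclass


@dataclass
class Innings:
    boundary_runs: int   # runs from 4s and 6s
    rotation_runs: int   # runs from 1s, 2s and 3s
    balls: int           # balls faced, dot balls included


def d_ratio(inn):
    """Boundary runs divided by rotation runs; None when either side is 0."""
    if inn.boundary_runs == 0 or inn.rotation_runs == 0:
        return None
    return inn.boundary_runs / inn.rotation_runs


def classify(inn):
    """Label an innings using the thresholds from the original post."""
    ratio = d_ratio(inn)
    if ratio is None:
        return "Undynamic"
    # Assumption: exactly 1.00 is treated as "Least Dynamic".
    return "Dynamic" if ratio > 1.0 else "Least Dynamic"


def strike_rate(inn):
    """Runs per 100 balls; unlike the D-ratio, this penalizes dot balls."""
    if inn.balls == 0:
        return 0.0
    return 100 * (inn.boundary_runs + inn.rotation_runs) / inn.balls


player_a = Innings(boundary_runs=16, rotation_runs=10, balls=34)  # 4 fours, 10 singles, 20 dots
player_b = Innings(boundary_runs=16, rotation_runs=20, balls=34)  # 4 fours, 20 singles, 10 dots

for name, inn in (("A", player_a), ("B", player_b)):
    print(name, classify(inn), d_ratio(inn), round(strike_rate(inn), 1))
```

Under these assumptions A is labelled "Dynamic" and B "Least Dynamic", even though B scores 36 off the same 34 balls to A's 26. Edge cases such as 20 off 10 balls with 5 fours, or 20 off 30 with no boundary, both come out as "Undynamic" because one side of the ratio is 0.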

6- Even with this, it clearly shows just how good Babar has been and how good he is becoming. It also highlights that he can get better.

Once again, a great post. It would have taken a while to do this, so thanks for taking the time out and for posting it here.

A.A.Z · Nov 18, 2020

Fakhar and Imam's stats are padded by bashing Zimbabwe's second-string team back in 2018, when their main players were on strike against the board.

If this data could filter out lower-ranked teams and have a minimum requirement of 75 innings, it would provide a more accurate representation of each player's respective value.
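
A minimal Python sketch of that kind of filter. The record layout, the function name and the placeholder names are hypothetical, and it assumes the minimum-innings requirement is applied after the lower-ranked opponents are dropped, which could just as well be done the other way round.

```python
from collections import defaultdict


def filter_innings(records, excluded_opponents, min_innings):
    """Drop innings against excluded opponents, then keep only players
    who still have at least min_innings innings left."""
    runs_by_player = defaultdict(list)
    for player, opponent, runs in records:
        if opponent not in excluded_opponents:
            runs_by_player[player].append(runs)
    return {p: r for p, r in runs_by_player.items() if len(r) >= min_innings}


# Placeholder records (player, opponent, runs); not real scorecards.
records = [
    ("Player X", "Team Strong", 50),
    ("Player X", "Team Weak", 100),
    ("Player X", "Team Strong", 30),
    ("Player Y", "Team Weak", 80),
    ("Player Y", "Team Strong", 10),
]

kept = filter_innings(records, excluded_opponents={"Team Weak"}, min_innings=2)
print(kept)
```

For the thread's dataset, `min_innings` would be 75 and `excluded_opponents` would be whichever teams are judged lower ranked, which is itself a subjective call.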

There was an article on cricinfo recently that used data analysis to determine the best ODI batsman and bowler. The parameters used were slightly subjective but gave quite an accurate representation of each player's value.

If you could tweak your findings to accommodate some of those parameters, it would be quite insightful. https://www.espncricinfo.com/story/_/id/29304607/greatest-odi-batsman-all

Nonetheless, well done, you've done a great job.

Farabi · Nov 18, 2020

Fair enough

Varun · Nov 19, 2020

Super effort to put these into numbers - well done!
