We just published a new blog post: “WAR for public goods, or why we need more advanced metrics in crypto”.
tl;dr -
Wins Above Replacement is an advanced baseball stat that measures a player’s contribution to their team by comparing them to a hypothetical replacement-level player. The post explores how an empirically-derived metric similar to WAR might be applied to ecosystem grants programs as a way of measuring ROI. It includes some use case ideas (like a WAR oracle) and strawman WAR formulas for protocols and open source software (OSS) projects.
WAR is not just a fun meme, but also a preview of the form factor that the next generation of impact metrics could take.
Context
It’s been a little over a year since our hello world post on the forum and Mirror essay on how data can improve future iterations of RetroPGF (now Retro Funding).
As with most worthy problems, the deeper you go, the more complexity you uncover, but also the more obsessed you get with the problem space. We are not turning the ship around, but we are staring at a pretty big ocean of possibilities!
We’ve tried to document both our learnings and our thought process along the way via our blog. Our GitHub is also a pretty good place to see how things have evolved.
But, broadly speaking, this has been the progression:
- a year ago there were effectively no metrics, just project profiles
- we did a lot of analysis with relatively little data on RF3, creating “impact vectors” and “lists” based on fairly gameable metrics like stars, downloads, transaction counts, etc.
- we worked with badgeholders on the first set of metrics in RF4 and the first metrics-based voting round (and have more in store for RF5-7)
- we’ve done some initial longitudinal analysis comparing RF-funded projects against other cohorts
Here’s the evolution…
Right now, we are focused on the longitudinal metrics stage, but we can start to see the next evolutionary breakthrough: advanced metrics.
Advanced metrics
We are continually ingesting new public datasets and building more metric models, but at the same time, we are also expanding OSO’s coverage to more projects and chains. This opens up the possibility of deriving new metrics that look more holistically at a project’s contributions relative to some baseline.
In baseball, there is a famous advanced metric called “WAR” which allows for player comparisons across teams, leagues, years, and eras. There isn’t a single, standard formula for WAR. Different sources use various calculations, but they all rely on the same public datasets, publish their methodologies, and follow similar principles. Some key points:
- WAR includes a mix of offensive and defensive metrics, making it a more holistic measure than ones that focus on a single aspect.
- It adjusts for different player positions, recognizing that not all positions are equally important defensively.
- The averages and constants are derived empirically by looking at performance over a specific period, such as a single season or multiple seasons (an “era”).
In the context of a grant program like Retro Funding, we want to understand how effective each round is relative to all the other incentive programs happening (or not happening) in the space AND have some way of aggregating the quality of all the projects being supported.
WAR could help identify not only successful projects but also weak points within an ecosystem — whether in developer tooling, user engagement, or protocol innovation — that need more investment. It could be used to determine eligibility for grant rounds or power oracles that allocate ongoing funding streams to high-performing projects.
The post includes a few strawman proposals for what a WAR for OSS projects or WAR for protocols might look like. The challenge lies in identifying metrics that not only capture current activity but also predict future growth and network effects. Realistically, it’s going to take some time to have decent WAR models running in production.
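To make the idea a bit more concrete, here is a minimal sketch of how a WAR-style score could be computed. It is not one of the formulas from the post. It just illustrates the three principles listed above: combine several metrics, adjust for a project's category (the analogue of a player's position), and derive the baseline empirically from the cohort. The sample projects, the metric names (`active_devs`, `dependents`), the equal weighting, and the 20th-percentile "replacement level" are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

# Hypothetical sample data, invented for illustration only.
PROJECTS = [
    {"name": "alpha", "category": "tooling", "active_devs": 12, "dependents": 40},
    {"name": "beta", "category": "tooling", "active_devs": 3, "dependents": 5},
    {"name": "gamma", "category": "tooling", "active_devs": 7, "dependents": 20},
    {"name": "delta", "category": "protocol", "active_devs": 30, "dependents": 2},
    {"name": "epsilon", "category": "protocol", "active_devs": 10, "dependents": 1},
]

# Assumed metric names; a real model would pick these empirically.
METRICS = ("active_devs", "dependents")


def zscores(values):
    """Standardize values against the cohort mean and standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite_scores(projects, metrics=METRICS):
    """Equal-weight average of each project's z-scored metrics."""
    columns = [zscores([p[m] for p in projects]) for m in metrics]
    return {
        p["name"]: mean(col[i] for col in columns)
        for i, p in enumerate(projects)
    }


def replacement_level(scores, projects, category, quantile=0.2):
    """Baseline for a category: the score at a low quantile of its projects."""
    vals = sorted(scores[p["name"]] for p in projects if p["category"] == category)
    return vals[int(quantile * (len(vals) - 1))]


def war(projects, metrics=METRICS, quantile=0.2):
    """Composite score minus the replacement level of the project's category."""
    scores = composite_scores(projects, metrics)
    categories = {p["category"] for p in projects}
    baselines = {
        c: replacement_level(scores, projects, c, quantile) for c in categories
    }
    return {p["name"]: scores[p["name"]] - baselines[p["category"]] for p in projects}


if __name__ == "__main__":
    for name, value in sorted(war(PROJECTS).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {value:+.2f}")
```

Because the baseline is computed per category, a protocol is compared against a replacement-level protocol rather than against a developer tool. Swapping the weights, the quantile, or the metric list yields a different but equally legitimate WAR, which is the same situation baseball is in with its competing implementations.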
Some next steps
The post also includes some ideas for how to get started…
- Start Simple: The first version of WAR was fairly basic. Over time, more sophisticated versions emerged. Similarly, the initial metrics for decentralized ecosystems don’t need to be perfect. Starting with something simple allows for gradual refinement and improvement (see: Gall’s Law).
- Embrace Competing Implementations: As mentioned earlier, there is no single standard calculation for WAR. Even though the data and methodologies are public, there’s vibrant debate about which method is best. In an industry as dynamic as crypto, this diversity of approaches is a strength, encouraging innovation and continuous improvement.
- Engage the Research Community: WAR models are like catnip for data enthusiasts. They offer endless opportunities for simulation, experimentation, and visualization. By providing well-documented, easy-to-use datasets and stoking interest in the results from protocol leaders and governance teams, we can create a garden of catnip for researchers and data nerds.
Check out the full post and let us know if you have ideas for a WAR model or anything else metrics-related!