Retro Funding Season 7 Governance Process

In Season 7, the Collective is experimenting with a “metrics-driven, humans in the loop” approach to Retro Funding. In this model, an Evaluation Algorithm defines how impact is measured and rewarded. Voters express their preferences by selecting the Evaluation Algorithm to be applied, and continuous feedback from humans with domain-specific knowledge informs the refinement of that algorithm. Read more about it here.


Governance Process in Season 7

Season 7 serves as an experimental phase to refine the governance of a more modular approach to Retro Funding. We aim to test whether this approach enhances the accuracy of impact evaluation compared to previous methods, and we hope to generate insights that can inform the design of the broader governance system.

Process Summary: Multiple Evaluation Algorithms are proposed. Badgeholders vote on which Evaluation Algorithm to apply using approval voting. The creator of the selected algorithm is responsible for iterative improvements, informed by expert feedback.

Evaluation Algorithm Vote

Badgeholders select an Evaluation Algorithm which will be continuously refined and applied throughout the Season. The initial vote on the Evaluation Algorithm sets the strategic direction for impact evaluation.

  1. Multiple Evaluation Algorithms with distinct objectives and allocation strategies are proposed. The data and metrics used within each evaluation algorithm, as well as the algorithm’s results, are public, while the weights within the evaluation algorithms remain private and are only revealed at the end of the season to prevent gaming of the algorithm.
  2. Badgeholders select the Evaluation Algorithm via approval voting, initially requiring a quorum of 30% and an approval threshold of 51% (a minimal tally sketch follows this list). Approval thresholds will be subject to iteration over time.
  3. The Evaluation Algorithm Vote for the Onchain Builders and Dev Tooling Missions will take place during the Citizens’ House: Veto period #33 from Feb 27th - March 5th.
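
As referenced in step 2, here is a minimal sketch of how an approval vote with a 30% quorum and 51% approval threshold could be tallied. The ballot format, tie-breaking, and the rule for choosing among multiple passing algorithms are assumptions for illustration, not the Collective’s specified procedure.

```python
# Sketch of an approval-vote tally with quorum and approval threshold.
# Ballot format and the "most approvals wins among passing options" rule
# are assumptions for illustration only.

def tally_approval_vote(ballots, total_badgeholders, options,
                        quorum=0.30, approval_threshold=0.51):
    """ballots: list of sets, each holding the algorithm names one voter approves."""
    if len(ballots) / total_badgeholders < quorum:
        return None  # quorum not met; no algorithm is selected

    approvals = {opt: sum(1 for b in ballots if opt in b) for opt in options}
    passing = {opt: n for opt, n in approvals.items()
               if n / len(ballots) >= approval_threshold}
    if not passing:
        return None
    # Assumed rule: among algorithms clearing the threshold, most approvals wins.
    return max(passing, key=passing.get)

# Hypothetical example: 4 of 10 badgeholders vote (40% turnout, quorum met).
ballots = [{"algo_A"}, {"algo_A", "algo_B"}, {"algo_B"}, {"algo_A"}]
print(tally_approval_vote(ballots, 10, ["algo_A", "algo_B"]))  # -> "algo_A"
```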

In Season 7, algorithm proposal rights are restricted to contributors who previously implemented metrics within Retro Funding 4: Onchain Builders. Metrics proposals during Retro Funding 4 were open to the public, with bounties offered for metric creation. Currently, the only contributor who meets these requirements is OpenSource Observer. In the future we plan to expand proposal rights to a broader set of contributors.

Evaluation Algorithm Feedback

The Evaluation Algorithm evolves to optimize the measurement and rewarding of impact. Rapid iteration loops are essential to refine metrics and counteract potential gaming (see Goodhart’s Law).

  1. Feedback-Driven Iteration: Iteration is informed by the feedback from humans with domain-specific knowledge. In Season 7, this process is informal and open, encouraging broad participation across the Collective. This can include Qualitative Inputs, such as written statements or suggestions, as well as Quantitative Inputs, such as preference rankings from participants. More details about the avenues to provide feedback will be shared soon.
  2. Collaboration: While a single party will be responsible for proposing and implementing iterations of the evaluation algorithm, many people can support the development of these algorithms. Besides providing feedback, you can contribute by:
    1. Proposing metrics: metrics must be sourced from public datasets and calculated using open source code (e.g., SQL and Python). OSO has documentation and many examples of metrics for data scientists to experiment with. A minimal sketch follows this list.
    2. Adding support for required data sources: new data sources can also be added to OSO’s data lake for community analysis. There are a variety of ways that data engineers can connect or replicate their data on OSO.
    3. Analyzing eval algorithm performance: as algorithms are developed, there’s a need to do backtesting and causal analysis to evaluate performance. Data scientists are encouraged to share research that can improve eval algorithms or their own proposals for upgrades.
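
As referenced in item 1 above, here is a minimal sketch of what a proposed metric might look like in Python. The dataset shape and column names (project_id, block_timestamp, gas_used) are assumptions; an actual proposal would build on OSO’s public datasets and follow its conventions.

```python
# Hypothetical metric: total gas used per project over a trailing 30-day
# window, computed from a public transactions dataset. Column names are
# assumed for illustration.
import pandas as pd

def gas_used_30_day_total(transactions: pd.DataFrame, as_of: pd.Timestamp) -> pd.Series:
    """Return total gas used per project over the 30 days ending at `as_of`."""
    window = transactions[
        (transactions["block_timestamp"] > as_of - pd.Timedelta(days=30))
        & (transactions["block_timestamp"] <= as_of)
    ]
    return window.groupby("project_id")["gas_used"].sum()
```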

Evaluation Algorithm Iteration

The creator of the selected evaluation algorithm is responsible for iteration based on expert feedback. Iterations do not require governance approval unless they substantially alter the strategic direction of impact evaluation.

  1. Iteration cycles: Initially, we expect the evaluation algorithm for Retro Funding Missions to be updated on a monthly basis.
  2. Failsafe: If results for the median project vary heavily between measurement periods, the evaluation algorithm creator can pause the application of the evaluation algorithm. This policy aims to protect against gaming of the algorithm or of individual metrics.
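
For illustration, a rough sketch of the failsafe check in point 2, assuming the compared quantity is each project’s reward and that a swing of more than 50% in the median counts as “varying heavily”; both assumptions are mine, not part of the policy.

```python
# Sketch of the failsafe: pause if the median project's result swings too
# much between measurement periods. The 50% threshold is an assumption.
from statistics import median

def should_pause(previous_rewards: dict, current_rewards: dict, max_swing: float = 0.50) -> bool:
    """Each dict maps project_id -> reward (or metric result) for one period."""
    prev_median = median(previous_rewards.values())
    curr_median = median(current_rewards.values())
    if prev_median == 0:
        return curr_median != 0
    return abs(curr_median - prev_median) / prev_median > max_swing
```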

Application Review

In the spirit of governance minimization, Retro Funding aims to eliminate the need for human review of applications in Season 7. Instead, metrics will be used to enforce eligibility.
If unforeseen circumstances arise that require human review, the Foundation will ask the govNERDs to assist.
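
To make “metrics enforce eligibility” concrete, here is a hedged sketch; the specific metrics and thresholds below are hypothetical placeholders, not the actual Season 7 criteria.

```python
# Hypothetical metric-based eligibility check. The metric names and minimum
# values below are placeholders, not the real Season 7 criteria.
ELIGIBILITY_THRESHOLDS = {
    "transactions_last_180_days": 1000,   # hypothetical minimum
    "distinct_active_days": 10,           # hypothetical minimum
}

def is_eligible(project_metrics: dict) -> bool:
    """project_metrics maps metric names to a project's measured values."""
    return all(project_metrics.get(name, 0) >= minimum
               for name, minimum in ELIGIBILITY_THRESHOLDS.items())
```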

Budget allocation

The approved budget for Retro Funding Missions will be distributed evenly across the designated measurement dates. In future iterations, it may be advantageous to dynamically adjust budget allocations based on the impact achieved within each measurement period.
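
As a worked example of the even split (with hypothetical figures):

```python
# Even split of a Mission's approved budget across measurement dates.
# Both figures below are hypothetical.
approved_budget_op = 8_000_000       # hypothetical Mission budget in OP
measurement_dates = 6                # e.g., one measurement per month
per_period = approved_budget_op / measurement_dates
print(f"{per_period:,.0f} OP per measurement date")  # 1,333,333 OP
```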


Retro Funding Season 7: Centralization, Gatekeeping, and a False Promise of Decentralization?

The governance process outlined for Retro Funding Season 7 highlights a significant departure from the principles of decentralization and open participation that the Collective claims to champion. What was once presented as an experimental and inclusive process has devolved into a centralized, gatekept approach that prioritizes a sole-source provider from Season 4 while sidelining other contributors. Let’s break down why this is not just disappointing but fundamentally contradictory to the stated goals of Retro Funding.


1. Sole-Source Provider Under the Guise of Decentralization

The post claims that algorithm proposal rights are restricted to contributors from Retro Funding 4 and conveniently states that the only eligible contributor is OpenSource Observer (OSO). This is both misleading and false. I know for a fact that other contributors have applied and are capable of providing innovative metrics-driven solutions. Yet, these alternatives were not even considered, leaving the Collective captive to a single provider whose codebase we’re now funding and feeding instead of fostering meaningful decentralization.

The rhetoric of “metrics-driven, humans in the loop” governance rings hollow when participation is limited to a single entity. This process reeks of centralization, not the iterative experimentation or open collaboration that Season 7 supposedly aims to test.


2. Feeding Their Codebase Instead of Building a Decentralized System

By restricting contributions to one centralized actor, we’re effectively propping up their codebase under the guise of collective decision-making. This is not participation. This is dependence. Rather than fostering a competitive, decentralized ecosystem, the Collective has chosen to narrow its focus to a single provider, stifling innovation and betraying the principles of open collaboration.

This choice undermines the very essence of decentralization. We are not participating in a decentralized funding model. We’re feeding a centralized system while being told it’s “modular experimentation.” It’s not.


3. A False Narrative of Inclusion

The post claims that metrics proposals were open in Retro Funding 4 and suggests that they will be broadened in the future. But let’s be clear: this is a self-serving narrative. The exclusion of other contributors in Season 7 is not justifiable, especially when they’ve already demonstrated interest and capability. Gatekeeping doesn’t foster decentralization—it erodes trust and limits progress.

Restricting contributions to a single party also contradicts the stated intent to refine impact evaluation through collaboration and feedback. How can we refine an evaluation algorithm when we’re locked into a sole-source provider? How does this process ensure transparency and fairness?


4. Centralization as the Default Mode

The governance minimization approach in Season 7, while efficient in theory, effectively reduces participant influence in the name of automation. Metrics-driven decision-making without true community participation isn’t a step forward—it’s regression. And when the only algorithm in play comes from a centralized actor, we’re left with a system that reinforces existing power structures rather than challenging them.


[Image: result of the OSO link provided above (2024-01-24)]


Call to Action: Decentralization Requires Real Participation

Season 7 could have been an opportunity to explore modular experimentation through real competition and collaboration. Instead, it’s become a rubber stamp for OSO’s codebase. If the Collective is serious about decentralization, it needs to:

  1. Open Proposal Rights Immediately: Allow other contributors to propose evaluation algorithms.
  2. Increase Transparency: Make the decision-making process and selection criteria for algorithms clear.
  3. Avoid Sole-Sourcing: Actively encourage participation from multiple contributors to ensure a competitive, decentralized ecosystem.
  4. Commit to Decentralization: Stop feeding centralized systems while calling them decentralized experiments.

The current process is a betrayal of the Collective’s principles and an insult to those who believe in the promise of decentralized governance. It’s not too late to course-correct. Let’s do better.


@mel.eth addressing some of your concerns and feedback.

First off, I’m not sure whether this statement is related to your work at StableLab or not. Either way, we’re always looking for more people to contribute to data-driven impact measurement, so feel free to reach out if StableLab is interested in this!

On Decentralization: You’re correct that proposal rights for evaluation algorithms are restricted to one party for this season. It’s key that we are thoughtful about how we introduce and test new ideas. If we truly want to iterate and experiment, we can’t always solve everything at the same time; we have to prioritize which components or hypotheses to validate.
In Season 7, Retro Funding is taking an iterative approach with the primary goal to validate that the “Metrics-driven, humans in the loop” approach significantly improves the accuracy of impact evaluation. As the post describes, we’re looking to decentralize proposal rights for evaluation algorithms in the future.

On the codebase: A big motivator for working with OSO is that their code, data, and infrastructure are open. That said, there’s no fundamental reliance here, as data and models will be public and open source, allowing anyone to build on them. Again, we’re aiming to attract more contributors and evolve this into an open, competitive ecosystem!

On Collaboration: You’ll see that there’s a section in the post on how to collaborate: propose metrics, add data, or analyse eval algo performance. I’m always happy to chat with anyone who’s interested in contributing! Reach me via TG - @jonassft.

btw I updated the broken OSO link in the post - see here


Thanks for the context, @Jonas – I’ll follow up on TG. While open source is undoubtedly better than not, I’d like to push back on your characterization of the progression here.

In prior seasons, badgeholders personally allocated funds (high-context with a high risk of self-dealing). Over time, we’ve shifted toward splitting the workload and favoring models. Now, as described above, the focus is shifting entirely to models, ostensibly obfuscating the “who” in favor of the “how.” However, this approach merely shifts the power to whoever defines the methodology, constrained by the limitations of that methodology.

As a badgeholder, I observe the methodology and allocate according to what I know will achieve outcomes aligned with the stated goals. While some badgeholders may lack this perspective, suppressing the ability of high-context participants like myself to allocate directly doesn’t resolve the issue—it shifts the locus of trust without necessarily improving outcomes.

The Core Dilemma:

  • Are we serving high-context badgeholders who are capable (or not) of allocating directly?
    (Do we trust them to allocate effectively on principle?)
  • Or are we serving high-context badgeholders who are capable (or not) of selecting methodologies?
    (If we don’t trust them to allocate directly, can we trust them to strategically select the allocation framework?)

This raises a fundamental issue: The methodology picker now wields more influence than any other party. While my capacity to allocate intelligently has grown, my ability to act on it has been restricted. If badgeholders are to focus on selecting the methodology, this process needs to become the primary workshop. Otherwise, we risk cycling through allocation methodologies without addressing the root concerns.

Revised Ask:

Do we have an outline of allocation methodologies by season? Is there a willingness to have an independent 3rd party test the selected methodology (BlockScience comes to mind)? Such a resource would help illustrate the shifts in governance focus and provide a clearer basis for this discussion.


Hi Mel,
I appreciate your post and, as the data lead for OSO, will say that I share the same goal as you wrt a competitive landscape for evaluation algos. We have no desire or intention to be a sole provider here, and the fact that we are for S7 is mainly a reflection of the huge lift required to get the first versions of these algos up and running.

One common source of confusion that I want to address is that the Foundation requires “models” at various levels:

  1. ETL/Pipeline, i.e., to clean and connect various data sources. For example, we want to query: “How much gas did the onchain builders who import all npm packages owned by project X contribute between Feb 1 and Feb 28?” This requires building models to connect different datasets maintained by different orgs.

  2. Defining metrics. For example, we might create a metric called downstream_gas_30_day_total, which allows you to look up a project and date and get its total Superchain gas contribution over the preceding 30 days, and another called downstream_gas_30_day_net_change, which looks at the absolute difference between the current period and the preceding 30 days. This requires getting feedback from various stakeholders on what is important to measure (and then implementing it).

  3. Combining metrics into evaluation algorithms. Basically, some way of weighting the various project-level metrics towards some overall strategy. For example, a growth-focused strategy might weight a metric like downstream_gas_30_day_net_change more heavily than other metrics. This requires a lot of data science!
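
To make level (3) concrete, here is a rough sketch of a growth-focused evaluation algorithm built on the two example metrics named above. The weights, min-max normalization, and proportional payout rule are my assumptions for illustration, not an actual proposal.

```python
# Sketch of level (3): weighting project-level metrics into an allocation.
# Metric names come from the examples above; weights, normalization, and the
# proportional payout rule are assumptions.
import pandas as pd

# Hypothetical growth-focused weights: net change counts more than the total.
WEIGHTS = pd.Series({
    "downstream_gas_30_day_total": 0.3,
    "downstream_gas_30_day_net_change": 0.7,
})

def score_projects(metrics: pd.DataFrame) -> pd.Series:
    """metrics: rows = projects, columns = metric names (raw values).

    Assumes each metric varies across projects (a constant column would
    produce NaN after min-max normalization).
    """
    normalized = (metrics - metrics.min()) / (metrics.max() - metrics.min())
    return (normalized[WEIGHTS.index] * WEIGHTS).sum(axis=1)

def allocate(metrics: pd.DataFrame, budget: float) -> pd.Series:
    """Split `budget` across projects in proportion to their scores."""
    scores = score_projects(metrics)
    return budget * scores / scores.sum()
```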

My impression from your post is that you are mostly concerned about level (3), perhaps (2), but not so much (1). If so, then I think we are on the same page!

Level (1) is the part that OSO is primarily focused on. This is a data engineering problem much more than a data science problem.

The reality is that there are several really big datasets that need to be connected in order to get to level (2) and derive metrics for the upcoming devtooling and onchain builder rounds. FYI, OSO is not the maintainer of any of these datasets. The maintainers are:

  • Superchain data → OP Labs
  • TVL data → DefiLlama
  • GitHub data → gharchive
  • Software dependency data → deps.dev
  • Project registry → OP Atlas (via EAS)

(You can get more info here.)

Again, the connecting is OSO’s primary focus. We are not aware of any other teams doing the connecting in a public way. That said, as @Jonas mentioned, the pipeline is completely open source – even the infra is fully open. (View any job in our pipeline here; code is permissively licensed if you want to fork it.)

Once all the data is connected, then data scientists can have fun creating different metrics and creating different eval algos on top of whatever metrics / features they want to use (levels 2 and 3). We want to facilitate this as much as possible. You can grab the data however you like. (Of course, as a community-led project, we’d love if you shared back your best models as a PR to our code base, but there’s no requirement to.)

As of today (Jan 27) we are still hard at work with OP Labs, Foundation, Agora, and other teams trying to connect all the necessary datasets. Once the round opens, there will be a lot of testing to ensure that all the necessary event data is being pulled for each project artifact and underlying metrics are being calculated correctly. It would make zero sense to have data scientists competing to build eval algos until the underlying source data is locked. That’s the work happening now and through the end of this first season (July).

Hopefully this addresses your primary concern, but just to recap:

  • Source data to build models; 100% public, we don’t maintain it
  • Pipeline to connect models; 100% public and open source; we maintain this but you can fork it
  • Metrics implementation; anyone can contribute via PR (or share on their own, provided they are public and reproducible from the source data)
  • Eval algos; goal is to open this up and have lots of competition. Also, the algos will be maintained in an OP repo (not an OSO repo).

Finally, I’ll reiterate that we as OSO have no desire to be in the eval algo game long-term AND would love for this to be competitive. Here is an example of a competitive process that we are facilitating for DeepFunding (albeit on a much smaller amount of data).
