We’re on a journey of exploring which voting algorithm is optimal to achieve the goals of RetroPGF.
This is an invite for an open discussion of what voting algorithms should be considered and their advantages and disadvantages.
A brief history of RetroPGF voting algorithms
RetroPGF 1: Quadratic Voting (a project's score is the sum of sqrt(x_i) over all individual allocations x_i)
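For illustration, a minimal sketch of that tally rule; the allocation numbers are made up for the example:

```python
import math

def quadratic_score(allocations):
    """Quadratic-voting tally: sum the square roots of the individual
    OP allocations badgeholders gave to one project."""
    return sum(math.sqrt(a) for a in allocations)

# Four badgeholders allocate 100, 400, 900 and 0 OP to a project:
score = quadratic_score([100, 400, 900, 0])  # 10 + 20 + 30 + 0 = 60.0
```

The square root dampens the influence of any single large allocation, which is the usual motivation for this rule.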
A few things we observed during the round that may help improve future iterations of RPGF.
0 votes may have been ineffective at achieving what badgeholders intended.
Unfavorable projects that made it past the initial review filter still earn rewards as long as they meet quorum.
We saw this unravel first hand as the badgeholders discussed certain projects during round 3, one of which has a history involving bad actors in the Optimism community.
Here are a few suggestions.
0 votes could be displayed separately on the UI of the platform for tracking ballots.
Signaling to low-quality projects in this manner allows badgeholders to see whether others have downvoted a project.
It gives them clarity during the round on whether a project carries any negative signals.
A project can then decide during the round how to pivot its campaign in a social environment. Reflecting the negative votes to the public allows all parties to understand whether a project deserves an allocation of OP.
If a project receives more than (17) 0 votes, it receives zero funding even if quorum is met. This requires consensus among the same number of badgeholders that it takes to pass quorum.
Of course, if the quorum number changes, so does the number of 0 votes needed to veto a project from receiving funds in a round.
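A minimal sketch of the veto rule proposed above, assuming a quorum of 17 and a matching veto threshold (the function and parameter names are illustrative, not part of any actual implementation):

```python
def final_funding(median_award, positive_votes, zero_votes,
                  quorum=17, veto_threshold=17):
    """Return the OP a project receives under the proposed veto rule."""
    if positive_votes < quorum:
        return 0  # never met quorum in the first place
    if zero_votes >= veto_threshold:
        return 0  # vetoed by 0 votes despite meeting quorum
    return median_award

# A project meeting quorum with few 0 votes keeps its award:
award = final_funding(50_000, positive_votes=20, zero_votes=5)   # 50000
# The same project with 17 or more 0 votes is zeroed out:
vetoed = final_funding(50_000, positive_votes=20, zero_votes=17)  # 0
```

Tying `veto_threshold` to `quorum` keeps the two numbers in sync when the quorum changes, as the post suggests.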
With both a public display of 0 votes and a threshold to veto a project from receiving funds, less time needs to be spent at the end of a round researching whether or not to add 0 votes.
This also discourages low-effort projects from creating noise through advertising and marketing during the round, which can take attention away from high-quality projects with significant impact.
It also avoids getting the hopes up of 600-plus projects that they will all receive funding, which causes a rally on social channels to sway votes their way.
Hopefully this is constructive feedback that may help us learn from the mistakes of the past.
Coordination is king! Transparency is key when it comes to determining the allocation of funds but the badge holders casting the votes must remain anonymous to provide safety to those individuals.
Additional Security Council
We also feel that if a small council of 2-3 security badgeholders were elected prior to the round, specifically focused on the DYOR side of things, it would improve the efficiency of voting.
This core group would only be allowed to 0-vote projects determined to be low effort, or ones that made it past both the initial review filter and the appeal process. They would help flag projects early in the round for other badgeholders and check profiles for accuracy.
TL;DR My suggestion is to make the distribution predetermined by a project's rank and avoid forcing voters to come up with a number of OP.
I honestly think we should remove the complexity of picking an OP number from the process. Instead, we'd have a ranking process rather than a "pick a number" process.
If we said out of the gate that the bottom X% get nothing, 1st place gets Y OP, the last project that gets anything gets Z OP and then do a (linear?) distribution between the projects that got OP, we would be a lot better off.
Making the voter come up with an OP number is really hard; it is much easier to make a ranking. There are a million ways to do it: we could give each voter 30M votes and let them distribute them, or we could break projects into categories, have voters rank the projects within categories, and then rank the categories they reviewed against each other (my preference, so people can review just a few categories and not feel obliged to review the entire set), or countless other methods.
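A rough sketch of the rank-based payout just described, under assumed parameters (the cutoff percentage, top award, and bottom award are made-up placeholders for the X/Y/Z in the post):

```python
def rank_distribution(n_projects, cutoff_pct=0.25, top=500_000, bottom=1_000):
    """Bottom cutoff_pct of projects get nothing; 1st place gets `top` OP,
    the last funded project gets `bottom` OP, linear in between."""
    n_funded = round(n_projects * (1 - cutoff_pct))
    awards = []
    for rank in range(n_projects):  # rank 0 = 1st place
        if rank >= n_funded:
            awards.append(0)
        else:
            # linear interpolation from `top` (rank 0) down to `bottom`
            frac = rank / (n_funded - 1) if n_funded > 1 else 0
            awards.append(round(top - frac * (top - bottom)))
    return awards

awards = rank_distribution(8, cutoff_pct=0.25)
# 6 of 8 projects funded, from 500000 OP down to 1000 OP; last two get 0
```

The key property is that voters only produce an ordering; the OP amounts follow deterministically from the preset curve.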
Love this suggestion! Ranked voting could lead to comparable results without the complexity of defining specific numbers. To make it even easier, maybe there is a way to allocate ranks based on pre-defined tiers (1-10) with higher Tiers being ranked higher. This way you only need to place a project in a tier, and it would automatically create your ranked vote.
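One possible way to turn tier placements into a ranked ballot, as sketched above (project names and tier values are illustrative; ties within a tier keep input order):

```python
def tiers_to_ranking(tier_by_project):
    """tier_by_project: {project: tier 1-10, higher is better}.
    Returns the projects ordered best-first; Python's stable sort
    preserves input order for projects sharing a tier."""
    return sorted(tier_by_project, key=lambda p: -tier_by_project[p])

ranking = tiers_to_ranking({"a": 3, "b": 10, "c": 7})  # ["b", "c", "a"]
```

This keeps the voter's task to a single tier choice per project while still producing a complete ranked vote.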
So I did not participate in RPGF3 and am not a Badgeholder, but I am working on designing mechanisms for public goods funding and have read extensively on RPGF3. I wanted to give a few suggestions with a view to scaling Optimism's RPGF mechanism for future rounds.
One of the main complaints I saw among Badgeholders is the sheer number of projects a BH had to deal with. 600+ projects for 146 badgeholders is barely manageable. What happens when the ecosystem grows and you have 5,000 or 50,000 projects to review?
Maybe that’s not a problem for the next round, but it does require rethinking for future iterations.
A few suggestions here:
Evaluation proportional to impact: since it's impractical for every BH to review every project as the ecosystem scales, BHs should be randomly assigned to projects for review based on the projects' expected reward (a project expected to receive 2M OP should get a lot more scrutiny than one expected to receive 2K OP).
Based on this concept, the total amount of time BH reviewers spend on impact evaluation can be more efficiently distributed among the projects based on their expected reward.
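A small sketch of what reward-proportional assignment could look like; the project names, slot count, and seed are all illustrative assumptions:

```python
import random

def assign_reviews(expected_rewards, n_review_slots, seed=None):
    """expected_rewards: {project: expected OP}. Each review slot is
    drawn with probability proportional to the project's expected
    reward, so larger awards attract proportionally more scrutiny."""
    rng = random.Random(seed)
    projects = list(expected_rewards)
    weights = [expected_rewards[p] for p in projects]
    counts = {p: 0 for p in projects}
    for p in rng.choices(projects, weights=weights, k=n_review_slots):
        counts[p] += 1
    return counts

counts = assign_reviews({"big": 2_000_000, "small": 2_000}, 1000, seed=0)
# "big" ends up with roughly 1000x more review slots than "small"
```

In practice you'd likely also want a minimum review count per project so small awards are never left entirely unreviewed.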
Prediction market for expected impact: "expected reward" implies that projects were already somehow sorted by their expected reward (which also implies by their impact).
I recommend creating a permissionless prediction market here that would help sort the projects according to their expected impact. I explain how a version of such a market can work here. The tl;dr is that anyone (not only project contributors) can stake a certain percent of the expected reward of a project, with the expectation of making a profit if the project is indeed rewarded that amount. If the project is rewarded less, the proposer loses funds. If it's rewarded more, the proposer's profit is capped at a certain max rate (so proposers always have an incentive to be accurate in their evaluation).
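The actual market design is in the linked post; as a rough illustration only, here is a toy payoff function under assumed parameters (the 10% cap and proportional loss rule are my placeholders, not the proposed mechanism):

```python
def proposer_payout(stake, predicted_reward, actual_reward, max_rate=0.10):
    """Toy payoff: profit is capped at max_rate when the project is
    rewarded at or above the prediction; under-prediction shortfalls
    eat the stake proportionally."""
    if predicted_reward == 0:
        return stake
    ratio = actual_reward / predicted_reward
    if ratio >= 1:
        # rewarded at least as much as predicted: capped upside
        return stake * (1 + max_rate)
    # rewarded less than predicted: lose stake in proportion
    return stake * ratio

# Predicting 100k OP when only 50k is awarded loses half the stake:
payout = proposer_payout(stake=1_000, predicted_reward=100_000,
                         actual_reward=50_000)  # 500.0
```

The asymmetry (capped upside, proportional downside) is what pushes proposers toward accurate rather than inflated predictions.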
This mechanism prevents spamming the RPGF protocol and gives all users an incentive to look for the most impactful projects and boost their signal. Since everyone in the ecosystem now has an incentive to seek out and boost the most impactful projects, there's also no need for contributors to spend time marketing on Twitter, so they can instead focus on building/creating.
Paying Badgeholders: BHs can be rewarded from the staked proposal (so it’s not exactly a “stake” but rather a payment), which would reduce the protocol’s need to rely on volunteer work as the protocol scales.
Domain-specific & ecosystem-wide evaluation: each project should be assigned both domain-specific expert BHs and ecosystem-wide BHs (the best ratio is a subject for experimentation).
As mentioned before, BHs should be randomly assigned to projects for review. This would significantly reduce the chance of collusion between proposers and reviewers, since a proposer wouldn't know ahead of time who is assigned for the review (to reduce the possibility of collusion further, the names of BHs reviewing a project can remain hidden until after the process is complete).
Growing Badgeholders: Project contributors may become future BHs, since their work clearly aligns with the ecosystem, and they have some expertise in their relevant field. This can help decentralize the evaluation process, which would keep it credible while maintaining alignment between the BH’s reviews and the wide ecosystem’s interests.
Based on my experience receiving 3 grants (Ambassador, Individual, Optimism_CIS), I would like to highlight the following shortcomings of the previous round of grantees and possible solutions:
Real contribution of nominees for Optimism and the ecosystem
All of these shortcomings were noted while participating in RPGF3 and analyzing those who received OP distributions. To do this, I analyzed all recipients in the End User Experience & Adoption category.
Real contribution of nominees for Optimism and ecosystem.
I propose introducing a Karma system for nominees. It would track activity over the entire content-creation period, as well as participant behavior outside the Contributions submitted for review. Grantees who did not create content for OP, or who produced low-quality content (trivially running official announcements through Google Translate and publishing them on their media), have already been noticed.
Here are some examples:
I would prefer not to name the rest, as it would be too subjective. Therefore, I have highlighted only the most obvious examples of poor or absent contributions.
It is necessary to create a working group of badgeholders who are ready to keep records, at least for existing grantees, and to assess not the total contribution from date N to today, but the contribution made between RPGF3 and RPGF4. Yes, there are examples of large projects that need more time for implementation, but they too should report on the changes that took place in the given period.
The badgeholder group should be anonymous to RPGF nominees and defined within the collective.
The first solution solves this problem as well. By dividing into working groups to check “Karma”, many people will understand the real contribution of badgeholders.
Another question: what to do with inactive ones? In this case, either remove them or add more by selecting proactive community members.
This is a question of technical design and of decentralization in voting.
Evaluation of metrics
The most sensitive question. We have already seen examples of grants going to those who do not create value for the ecosystem, but at the same time it is important to notice participants whose work happens inside a project. Again, segmenting the working group responsible for Karma will help here. Some groups will study the contribution of developers, some will research the contribution of media, and some will study internal participants whose metrics cannot be attributed to popularization but whose contribution is also important to the ecosystem.
Yes, I have little familiarity with the internal and technical workings of badgeholders and RPGF, but I have proceeded from external data available to the average user.
I hope the comments and solutions described above will help the team in improving the work.
I am suggesting badgeholders rank projects (1st, 2nd, 3rd, etc) instead of giving an OP amount (20k OP, 100k OP, etc).
If you look at the results, A LOT of projects got ~50k or ~100k OP… basically round numbers, because giving a specific number to each project is hard.
I think it would be better for the distribution if badgeholders were tasked with ranking projects instead of giving them an OP amount. In many ways it requires more critical thinking. They wouldn’t be able to say “These 10 projects all get 100k OP” they would have to determine which of those projects are better.
The underlying concept is that impact is qualitative (like the beauty of a sunset, not a test score), so it's inherently hard (if not impossible) to quantify. Ranking allows badgeholders to sidestep the quantification.
This would likely lower how many projects would get funded and increase funding concentration in the top projects. This does not seem desirable.
I find it is easier to judge a project in isolation than relatively against all other projects.
Judging is hard and time consuming. I don’t think making it easier for the badge holders should be a top goal. The goal is to allocate capital in the way that’s best for the ecosystem. Nothing wrong with making badge holder job easier but not at the expense of the ecosystem.
That being said, if people are voting in round numbers let’s lean into that. Give each badge holder markers representing OP.
The above adds up to 29,250,000, but can be adjusted to hit whatever total OP number is needed.
A few benefits:
Standardizes the rounding (hard to fight human nature)
Keeps the ethos of rewarding impact
Starts to limit how many projects a badge holder could reward (236, in the above)
Nothing prevented badgeholders from deciding to follow such a scheme in RPGF3.
Why take away their freedom to decide that, say, two projects are equally worthy of 5M?
I don’t really see how it would benefit anyone to turn things into a competition where there can only be one winner, or - more generally - where badgeholders must rank projects rather than freely evaluate their relative impact.
Deciding on a distribution in advance takes responsibility away from badgeholders, but at the expense of flexibility in being able to react to the specific applications that come in.
There is no way for anyone to know beforehand what a fair distribution would be. There could be only one really impactful project - or a thousand similarly impactful ones.
The best bet, imo, is to let human badgeholders evaluate that at the time of voting.
Agree on the markers and count.
I would add that maybe a minimum amount should be set for projects to receive.
Looking at stats from RPGF3: if all projects that received below 10k OP had instead received 10k OP as a minimum, top-ranked projects would get a bit less (easily calculated, but less than 5%), and lower-ranked projects would receive a lot more (double or more for the lowest-ranked projects).
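A quick illustration of such a floor, on made-up numbers: awards below the floor are raised to it, and the excess is shaved proportionally off the larger awards so the round's total stays constant.

```python
def apply_floor(awards, floor=10_000):
    """Raise sub-floor awards to `floor`, scaling down the above-floor
    awards so the total OP distributed is unchanged. Assumes at least
    one award is above the floor."""
    total = sum(awards)
    floored = [max(a, floor) for a in awards]
    above = [a for a in floored if a > floor]
    deficit = sum(floored) - total           # OP added by the floor
    scale = 1 - deficit / sum(above)         # shave it off larger awards
    return [round(a * scale) if a > floor else a for a in floored]

awards = apply_floor([500_000, 100_000, 8_000, 2_000])
# small awards rise to 10000 each; the top two shrink by under 2%
```

On these toy numbers the top award loses well under 5%, matching the post's rough estimate.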
No doubt, data about previous rounds will become increasingly important for future rounds. I assume that whether a project was "divisive" in round 3 might be one of the areas of interest for badgeholders deciding on funding in round 4.
Still, I have some doubts about making "divisive project" one of the impact metrics provided for voting. Why?
IMHO there is a fine line between rating projects based on previous RetroPGF success and rating overall success based on results delivered. If we are not 100% sure that the round 3 voting design was optimal, ratings based on previous rounds might trap projects in reinforcing loops of misled voting.
That said I’m 100% pro impact metrics. This includes providing guidance on what metrics matter, and what metrics should not matter in voting.
Thanks again for sharing these findings, great food for thought!
The voting should be based on a rubric. Rubrics do require relatively more work from the voter, but they can be made simple, and the voting UX can help streamline the process.
People should stake their reputation if they want to vote (we can come up with some initial reputation for everyone). If a voter's ratings deviate a lot from the average vote, their reputation should be reduced.
We can also ask voters to explain, in text, their voting criteria for a small number of randomly selected applications. Their inputs can then be analyzed with an LLM to understand how each voter votes.
Voters MUST be randomly assigned ~10% of the full list of projects on which they can vote. This can reduce favoritism.
How the distribution is decided is completely up to the foundation… It would actually make the amount and distribution of rewards to projects deterministic.
We can decide to give first place 500,000 OP and last place 1000 OP and have any preset distribution that we want. It could be Linear, Power Law… any curve we want. We can say the bottom 5% get nothing or the bottom 50% get nothing.
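As one concrete example of such a preset curve, here is a hedged sketch of a power-law distribution over ranks, scaled to a fixed budget (the project count, budget, and exponent are made-up placeholders):

```python
def power_law_awards(n_funded, budget, alpha=1.0):
    """Award for rank r (0 = 1st place) falls off as (r + 1) ** -alpha,
    scaled so the awards sum to (approximately) the budget."""
    raw = [(r + 1) ** -alpha for r in range(n_funded)]
    scale = budget / sum(raw)
    return [round(w * scale) for w in raw]

awards = power_law_awards(n_funded=100, budget=30_000_000)
# first place gets the largest share; awards decline smoothly with rank
```

Swapping `alpha` changes how top-heavy the round is, and a linear curve could be dropped in the same way, so the "any curve we want" point above is literally one function's choice.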
The point is that this method would make the reward outcome predictable. It would lower how many projects get funded if, and only if, that's what we want to do. It could also raise the number of projects that get funded, if that's what we want.
What it for sure does is force the badgeholders to make some tough calls and say which projects they like more, while removing the psychological burden of saying “this work is worth X OP.”
I find it is easier to judge a project in isolation than relatively against all other projects.
I think this is absolutely true, but if you are then giving it an OP number it is a huge wild card… maybe you sit down hungry and give a bunch of projects low scores… then later you come back after lunch and a nice phone call with your partner and you give a bunch of projects high scores.
While judging projects in isolation might be easier, I would argue it's going to give more erratic results.
I’m so excited about the RetroPGF mechanism and have been since the first round. I have seen its impact in Optimism and am confident the mechanism’s impact can scale beyond crypto and into the default world. I deeply want to support its evolution, which is why I am excited to share some insights aimed at refining and enhancing the RetroPGF voting process. We are using these insights to redesign Pairwise.
The following problem/solutions sets are inspired by @OPmichael’s format in the thread and ar…