Retro Funding S7 – Onchain Builders – Eval Algos

Hey everyone, Carl from Open Source Observer here, sharing the evaluation algorithms for Retroactive Public Goods Funding (Retro Funding) Season 7: Onchain Builders. The Onchain Builders Mission seeks to reward protocols and dapps that demonstrate meaningful onchain usage, drive TVL growth, and support interop across the Superchain. See here for more on the mission’s objectives.

This season, Optimism’s Retro Funding is pioneering a “metrics-driven, humans-in-the-loop” approach. Instead of voting on projects, citizens vote on evaluation algorithms. Each algorithm represents a different strategy for rewarding projects. Algorithms will evolve in response to voting, community feedback, and new ideas over the course of the season.

Here’s a tl;dr of the three algorithms:

| Algorithm | Goal | Best For | Emphasis |
| --- | --- | --- | --- |
| Superscale | Reward clear-cut leaders | Large, established projects with high usage | Adoption (favors projects with high recent activity) |
| Acceleratooor | Prioritize fast-growing projects | New and emerging projects gaining traction | Growth (favors projects with the largest net increase in impact) |
| Goldilocks | Achieve a more balanced distribution | Consistently active projects with steady engagement | Retention (favors projects maintaining impact over time) |

Metrics Overview

Project-Level Metrics

Each project’s score is based on four key metrics:

  1. TVL. The average Total Value Locked (in USD) during the measurement period, focusing on ETH, stablecoins, and eventually other qualified assets.
  2. Transaction Counts. The count of unique, successful transaction hashes that result in a state change and involve one or more of a project’s contracts on the Superchain.
  3. Gas Fees. The total L2 gas fees (gas consumed * gas price) for all successful transactions that result in a state change and involve one or more of a project’s contracts on the Superchain.
  4. Monthly Active Users. The count of unique Farcaster IDs linked to addresses that initiated an event producing a state change with one or more of a project’s contracts.
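
For anyone who wants to poke at the data themselves, here is a rough Python sketch of how these four metrics could be aggregated from transaction-level data. The column names are made up for illustration; the actual pipeline lives in the OSS / Retro Funding repos.

```python
import pandas as pd

# Hypothetical schema, for illustration only:
#   txs: one row per successful, state-changing transaction touching a project contract,
#        with columns project_id, tx_hash, l2_fee_eth (gas consumed * gas price), farcaster_id
#   tvl: one row per project per day, with columns project_id, tvl_usd
def project_metrics(txs: pd.DataFrame, tvl: pd.DataFrame) -> pd.DataFrame:
    per_project = txs.groupby("project_id").agg(
        transaction_count=("tx_hash", "nunique"),          # unique successful tx hashes
        gas_fees=("l2_fee_eth", "sum"),                    # total L2 gas fees
        monthly_active_users=("farcaster_id", "nunique"),  # unique FIDs linked to senders
    )
    average_tvl = tvl.groupby("project_id")["tvl_usd"].mean().rename("average_tvl")
    return per_project.join(average_tvl, how="outer").fillna(0)
```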

Time-Based Variants

For each core metric (i.e., transactions, gas fees, TVL, user counts), we present three variants that compare the current period’s value against the previous period’s. Each algorithm weights these variants differently; a minimal code sketch of the variants follows the list.

  1. Adoption. The value for the current period only (e.g., the current month’s transaction count). This captures the current level of adoption.
  2. Growth. The positive difference between the current period and the previous period (e.g., net increase in TVL). Values are clipped at 0 if they decrease (no penalty for a decline, but no bonus, either).
  3. Retention. The minimum between the current period and the previous period (e.g., how much of last month’s TVL is still here). This variant rewards projects that post more consistent metrics over time.
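
In code, the three variants are tiny functions. Assuming `current` and `previous` hold a project’s value for one metric in the two measurement periods:

```python
def adoption(current: float, previous: float) -> float:
    # Current-period value only.
    return current

def growth(current: float, previous: float) -> float:
    # Net increase, clipped at zero: no penalty for a decline, but no bonus either.
    return max(current - previous, 0.0)

def retention(current: float, previous: float) -> float:
    # How much of last period's value is still here.
    return min(current, previous)

# Example: TVL drops from $2.0M to $1.5M.
# adoption(1.5e6, 2.0e6) -> 1,500,000; growth(...) -> 0; retention(...) -> 1,500,000
```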

Proposed Algorithms

We’ve developed an initial set of three candidate algorithms for evaluating impact. Each algorithm highlights different priorities: Superscale focuses on established, high-volume projects; Acceleratooor emphasizes rapid growth; and Goldilocks tries to balance adoption, growth, and retention.

We’ve included details on each algorithm below. We also have a Hex dashboard where you can see data on each algorithm.
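
To give a feel for the mechanics, here is a minimal sketch of how an algorithm turns the metric variants into a project score. The weights below are purely illustrative, not the weights used by Superscale, Acceleratooor, or Goldilocks; see the Hex dashboard and the repo for the real ones.

```python
# Illustrative weights only -- not the actual weights of any proposed algorithm.
EXAMPLE_WEIGHTS = {
    ("tvl", "adoption"): 0.25,
    ("transactions", "adoption"): 0.25,
    ("gas_fees", "growth"): 0.20,
    ("tvl", "retention"): 0.15,
    ("users", "retention"): 0.15,
}

def project_score(normalized: dict[tuple[str, str], float],
                  weights: dict[tuple[str, str], float] = EXAMPLE_WEIGHTS) -> float:
    """Weighted sum over (metric, variant) pairs, each pre-normalized to 0-1."""
    return sum(w * normalized.get(key, 0.0) for key, w in weights.items())
```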

Important: as projects are still signing up for the current measurement period, we simulated the results for each algorithm using past Retro Funding participants and historical data. The “top projects” below are based on applications from RF4 and data from Dec / Jan.

Superscale

This algorithm aims to reward projects with significant current usage and established impact. It places heavier weights on the most recent values of TVL and transaction metrics, and less weight on user numbers. This strategy aims to embody the philosophy of “it’s easier to agree on what was useful than what will be useful”. You should vote for this algorithm if you want to keep things simple and give the projects that are having the most impact right now the recognition they deserve.

Weightings & Sample Results

Top projects (using simulated data):

  1. Aerodrome
  2. Aave
  3. Virtuals Protocol
  4. Moonwell
  5. Zora

Acceleratooor

This algorithm seeks to reward projects experiencing rapid growth over the current measurement period. In particular, it emphasizes growth (i.e., net increases) in TVL and transaction volume. The goal is to spot breakout stars and accelerate them. This is a good algorithm to vote for if you want to create a strong signal for rising projects that the Superchain is the place to be.

Weightings & Sample Results

Top projects (using simulated data):

  1. Aave
  2. Aerodrome
  3. Virtuals Protocol
  4. Compound Finance
  5. Paragraph

Goldilocks

This algorithm seeks to evenly balance various aspects of impact. It places a moderate weight on each metric and prioritizes retention over sheer growth. The goal is to support steady, sustained contributions across the board rather than “spiky” projects that only excel in one area. This is a good algorithm to vote for if you want to support a wide range of projects.

Weightings & Sample Results

Top projects (using simulated data):

  1. Aerodrome
  2. Zora
  3. Aave
  4. Layer3
  5. LI.FI

How to Vote

Now it’s time for Citizens to choose which algorithm to use.

Here is the link to the Snapshot vote for Citizens (Badgeholders) to decide which algorithm(s) best aligns with our collective goals.

One of the proposed evaluation algorithms for the Onchain Builders mission will be chosen using approval voting. This means you can approve one or multiple options. For an algorithm to be selected, it must meet the following conditions:

  • Quorum: At least 30% (44) of badgeholders must participate in the vote.
  • Approval Threshold: An algorithm must receive approval from at least 51% of quorum (22 badgeholders).

The algorithm with the most approvals, provided it meets the thresholds, will be used for the remainder of Season 7.
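
In pseudo-code form, the selection rule looks roughly like this (using the quorum and approval numbers above):

```python
def winning_algorithm(approvals: dict[str, int], voters: int,
                      quorum: int = 44, approval_threshold: int = 22) -> str | None:
    """Return the selected algorithm, or None if the conditions above are not met."""
    if voters < quorum:  # at least 30% of badgeholders must participate
        return None
    winner = max(approvals, key=approvals.get)
    # The winner needs approval from at least 51% of quorum, i.e. 22 badgeholders.
    return winner if approvals[winner] >= approval_threshold else None
```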


To learn more about this mission, please see the Onchain Builders Mission Details. We also have more in-depth documentation on these algorithms in the Retro Funding repo.

If you have ideas on refining these models, let us know in the comments. And if you’d like to contribute to making these algorithms better, you can submit a PR or open an issue here.

We’re excited to see what the community thinks and will keep iterating based on your feedback!


Hi Carl

Thank you for making these things so accessible.

Looking at the hex dashboard, would I be right to think that all three proposed algorithms produce highly similar overall distributions?

As I understand it, of the 133 projects in your simulation, Aerodrome, Aave, Virtuals Protocol, Zora, Layer3 and LI.FI would all be in the top 10 no matter what and receive 30% of the round budget (all of them hitting the round cap of 5%)?

So, really, the choice of the voters will mostly affect the distribution between the projects that are in the top half, but not in the absolute top, is that correct?


Hi Joan

First, let me emphasize that:

  • this is a simulation based on January data and applications from RF4, and
  • there will be 6 measurement periods, so even if the algorithms stayed exactly the same (which they won’t), we should expect significant movement in the top projects from month-to-month

That said, projects that post two consecutive months of strong growth will do well under any of the three algorithms. The projects you mentioned definitely fit that description. And, generally, if a project can keep that streak going through July, it should do very well in the overall mission (eg, get 5% of the total pool).

Looking at the hex dashboard, would I be right to think that all three proposed algorithms produce highly similar overall distributions?

Yes, all three algos do produce a “capped power law” distribution. However, the graph looks a bit compressed on mobile. You can see more of the variance on desktop. As an example, the 30th project gets 50% more funding under one of the simulated algos.

If you plot the Y axis on log scale, you can see the relative differences more clearly – and they get bigger as you go further down the curve.

So, really, the choice of the voters will mostly affect the distribution between the projects that are in the top half, but not in the absolute top, is that correct?

I don’t think so. Consider a project that grows really fast from a low base – it should do well in 2/3 of the algos (Acceleratooor and Superscale) but poorly in Goldilocks. Then consider a project that has high numbers, but no growth in the current month – it should do well in 2/3 of the algos (Superscale and Goldilocks) but poorly in Acceleratooor.


Thank you! That’s very helpful. 🙂


Perhaps this is the wrong place to be bringing this up as a general point, but it’s relevant, so will do it here.

Why is there such emphasis on Farcaster? It has become a gating mechanism to access many parts of governance. Not only is this poor UX, but it undermines the idea that Optimism and the Superchain will maintain credible neutrality. There are no other instances where users or contributors are forced to utilize a specific chain and/or protocol in order to access governance.

Specific to this case, what is the problem that reliance on Farcaster association is solving?


This probably warrants a separate thread, but from a retro funding perspective let me say the following:

  • user numbers have the lowest weight overall across algos
  • we are happy to use any other user-labeling model built off the superchain data or other public datasets
  • this is a great area for data scientists to contribute; the best place to start is to open an issue here and link a notebook or query that demonstrates the idea

Ok, but you still didn’t answer

It’s 10%-25% of the project-level score, and presumably wasn’t added by accident. What is the problem that is solved for by using Farcaster IDs? Maybe it’s the right answer, but it’s not clear what the question is.

What is the problem that is solved for by using Farcaster IDs? Maybe it’s the right answer, but it’s not clear what the question is.

The intent is to have several measures of network activity, including one that captures user numbers. All other things being equal, $1M in TVL deposited by a single address with no history is less impactful than $1M in TVL deposited by a bunch of users with some prior reputation.

Options for estimating users include:

  1. Using addresses
  2. Using something else linked to addresses (eg, FIDs)
  3. Using a filtered address model or combination of models (eg, removing likely Sybils, bots, etc)

There was strong pushback against 1, so we started with 2. We’d love to see more work on 3.
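
To make the three options concrete, here is a rough sketch with hypothetical column names (this is not the production query):

```python
import pandas as pd

# events: one row per state-changing event, with the initiating address and, where
# available, a linked Farcaster ID. Column names are made up for illustration.
def user_counts(events: pd.DataFrame, flagged_addresses: set[str]) -> dict[str, int]:
    return {
        "addresses": events["from_address"].nunique(),                 # option 1
        "farcaster_users": events["farcaster_id"].dropna().nunique(),  # option 2
        "filtered_addresses": events.loc[                              # option 3
            ~events["from_address"].isin(flagged_addresses), "from_address"
        ].nunique(),
    }
```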

Why would that be the case? Is impact not equal to economic activity?

If you’re trying to estimate users, though, relying on a database with fairly low user numbers seems pretty strange. Farcaster captures a very small number of users.

Also, why would a Farcaster account be particularly suited to identifying unique users? They’re easy to create. We created one specifically to get our OP Atlas identifier, and it’s done zero activity beyond account creation. That is separate from another product-specific Farcaster account we have. (Note that Farcaster has been, to some extent, polluted by requiring anyone accessing OP governance to create throwaway accounts.)

Not trying to be difficult. Just don’t understand where you’re coming from yet, and trying to get there from here.


Haha no worries, appreciate the back and forth.

I think there are two distinct questions here. Let me unpack them.

1. Why are we considering user numbers in the first place?

In RF4, many citizens cared about capturing user numbers somehow. So we’re starting this mission with some weight given to a user metric. Perhaps there’s a future iteration that includes an algo with zero weight to users.

2. If there has to be a user metric, why are we using Farcaster IDs?

Yep it’s easy to spin up a lot of Farcaster accounts (though it costs $7 a pop, so not a particularly cheap attack vector). It’s even easier (and costs $0) to spin up a lot of wallet addresses. As I explained earlier, we’re simply starting somewhere. I would love to see more robust, non-Farcaster based models get proposed.

One technical point to clarify is that we are using Farcaster IDs as a feature in a model. Every feature gets normalized to 0-1, so the actual number is not the important thing. It’s the distribution that matters. Our hope is that active Farcaster users are a better proxy for actual users than active addresses.
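
For example, a simple min-max scaling (one common way to do the 0-1 normalization; not necessarily the exact transform we use) looks like this:

```python
import pandas as pd

def normalize_0_1(feature: pd.Series) -> pd.Series:
    """Scale a feature across projects to 0-1 so only the distribution matters."""
    lo, hi = feature.min(), feature.max()
    if hi == lo:
        return pd.Series(0.0, index=feature.index)
    return (feature - lo) / (hi - lo)
```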

We can also compare different user models side-by-side and see if we think they offer improved signal. For example, here’s a quick chart I threw together (using the test data) comparing monthly active addresses vs. Farcaster users. There’s a pretty strong correlation between the two metrics, with a few outliers that I’ve highlighted on either side. This means that we could use either metric and for most projects the result would be the same. But for some projects (eg, pods.media and Dmail) the choice of metric makes a big difference. Ideally this is the type of analysis that other community members do and that leads us to better metrics / weights.
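
If you want to reproduce that kind of comparison, here is a rough sketch (column names are hypothetical):

```python
import pandas as pd

def compare_user_models(df: pd.DataFrame, top_k: int = 5) -> tuple[float, pd.DataFrame]:
    """Correlate two user metrics and surface the projects where they disagree most."""
    corr = df["active_addresses"].corr(df["farcaster_users"])
    rank_gap = (df["active_addresses"].rank() - df["farcaster_users"].rank()).abs()
    outliers = df.assign(rank_gap=rank_gap).nlargest(top_k, "rank_gap")
    return corr, outliers
```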

Finally, the numbers in the Dune chart you showed seem too low. I’m not able to click into it to see the actual query, but I tried to reconstruct it from raw Farcaster data and I get roughly 150K MAUs in January.

Don’t get me wrong: I am not here to defend Farcaster IDs as a good proxy for actual human users. I’m here to say: help us find a better one. Or phase it out entirely.


Guess our key critique here is that we don’t believe the distribution of Farcaster addresses is simply a random selection of actual humans, and that relying on it will skew the results in ways that are not intended or desirable. Farcaster is a specialized social network, and there’s no particular reason to think that activity or presence on Farcaster maps to economically important/active users onchain.

As a thought experiment, replace Farcaster with Facebook. Would you have any reason to think Facebook identities tied to onchain addresses would be meaningful? What about Twitter? Or Truth Social? Or Blue Sky?

We would probably all agree that would not be useful in isolation and to the exclusion of other methods. Just because Farcaster uses an onchain component itself doesn’t suggest that the users there are especially relevant to Optimism activity. For all we know, it could even be that the lowest value users are clustered there, making it a negative signal and not mere noise. We don’t know, and no one has made a case.

This is not directed at you specifically, but it’s probably time for disclosures about conflicts of interest whenever anyone from the Foundation pushes Farcaster as a gating mechanism or otherwise gives it special treatment that would never be given to any other project building on the Superchain. We would, however, happily support it as one option or one measurement metric amongst many.


First off, I’m excited to be part of another RPGF round, big thanks to Jonas and everyone at the foundation for your continued efforts keeping retro funding alive! Also excited to see Optimism continue to take big swings in the experiment that is RPGF. Next, thank you Carl and OSS team for the work on this, and for the extremely well-written writeup.

Here’s how I voted:

  • Onchain builders: Superscale
  • Dev tooling: Arcturus

I made these choices because rewarding projects that have already demonstrated significant usage and impact is exactly what RPGF is all about. That being said, I have concerns:

  1. Since projects are still signing up for the current measurement period, we can’t definitively see how our chosen algorithm will reward projects. While this reduces bias, it makes it difficult for me to understand how I am actually rewarding projects.
  2. I say this every round - I strongly feel that revenue should be deducted from impact. Not deducting revenue puts public goods projects which cannot generate revenue on unequal footing with other projects that can. To be clear, my intention is not to exclude projects that generate revenue, just to reward them less. I would love RPGF to incentivize projects to maximize their potential to do good, as opposed to pursuing revenue opportunities. See here for my full thoughts on this.
  3. Why is there no longer an “open source reward multiplier”? Are all these projects open source? I understand that the multiplier received some backlash for potentially being flawed, but I still strongly believe we should prioritize rewarding open source projects.
  4. I’m not in favor of using Farcaster as a measure of users for Sybil resistance. My Farcaster address is unique to Farcaster, and I believe everyone should use one unique address per service for privacy and security reasons. Encouraging the use of a single address for all activities is risky and could lead to significant losses if not managed carefully. Instead, it might be worthwhile to consider a community-run competition to identify and exclude Sybils, scammers etc.
  5. You have given badgeholders extremely little control this round, which opens the program up to potential gaming. Inputs like stars or forks should not be considered, as they can be easily manipulated. If there is a lack of trust in badgeholders to make subjective judgments, perhaps we should replace them with a panel of experts.

Thanks for reading 🫡


We would, however, happily support it as one option or one measurement metric amongst many.

100% and this is super actionable. TY!


Thanks for the write up!

Just a quick clarification: inputs like stars and forks (in the devtooling round) are only used as an initial “pretrust” seed for EigenTrust. Basically, if two projects had exactly the same usage and engagement from onchain builders, then the one with higher pretrust would rank higher. For more details, see here.

Of course, there will always be ways to try to game the system, but simply accruing lots of stars / forks or publishing useless NPM packages won’t work here.
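
For intuition, here is the textbook EigenTrust recurrence showing where a pretrust seed enters. This is a generic sketch, not the exact Retro Funding implementation (see the repo for that):

```python
import numpy as np

def eigentrust(C: np.ndarray, pretrust: np.ndarray,
               alpha: float = 0.15, iters: int = 50) -> np.ndarray:
    """Generic EigenTrust-style power iteration.

    C[i, j] is project i's normalized trust in project j (rows sum to 1);
    `pretrust` is the seed vector -- in the devtooling round, this is where
    signals like stars and forks would enter before trust propagation.
    """
    p = pretrust / pretrust.sum()
    t = p.copy()
    for _ in range(iters):
        t = (1 - alpha) * C.T @ t + alpha * p
    return t
```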

I would like to echo this too.

While Farcaster has positive aspects (it is built on the OP Stack, and I am in favor of supporting protocols within the OP Stack), it should not be the exclusive onboarding mechanism, nor should it gatekeep participation.

If Sybil attacks and bots are the primary concern, alternative metrics could be implemented. For example:

  • KYC-verified users: Leverage Gnosis Pay cardholders (all KYC-compliant), though their total numbers are limited. Analyze their connected addresses to map relationships and identify patterns. [https://dune.com/queries/4120201]
  • CEX-linked addresses: Use addresses associated with centralized exchanges (Coinbase, Kraken, Binance) as a broader dataset. Refine this pool using criteria like Optimism activity, account age, balance, attestations, or overall engagement across the Superchain. Another data source could be Superseed (part of the Superchain) participants.
  • Non-KYC addresses: Even non-KYC addresses could be filtered by activity history, account age, and asset holdings to ensure legitimacy (a rough sketch of this idea follows below).

Just sharing a couple of thoughts here; I’m sure that if we think about it, there are many other options to consider.
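
For the non-KYC filtering idea above, a rough sketch with made-up thresholds and column names (not an official model) could look like this:

```python
import pandas as pd

# Hypothetical address-level features; thresholds are purely illustrative.
def filter_addresses(addresses: pd.DataFrame,
                     min_age_days: int = 90,
                     min_tx_count: int = 10,
                     min_balance_usd: float = 10.0) -> pd.DataFrame:
    """Keep addresses that look established: old enough, active enough, holding something."""
    return addresses[
        (addresses["age_days"] >= min_age_days)
        & (addresses["tx_count"] >= min_tx_count)
        & (addresses["balance_usd"] >= min_balance_usd)
    ]
```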

Thanks for the feedback @OPUser!
I’ve tried to capture your comments in this issue.
Feel free to comment or add more ideas there.