Part I: Defining The Web3 Identity Stack
Reputation is the core use case of the Web3 Identity Stack.
As the persistence of Web3 Reputation suggests, the Web2 and Web3 design spaces for identity-related use cases differ fundamentally in scope. Reputation, the notion that defines Web3 Identity, is at the forefront of most of these use cases. In Web3, Reputation is persistent, user-defined, composable, and open-source. These properties will transform the broader Identity Stack, just as DeFi has revolutionized financial institutions and the primitives comprising them.
Web3 Reputation and its myriad use cases are at the forefront of the Identity Revolution that public blockchains enable.
Introduction
In this article, we aim to:
Elaborate on the notion of Web3 Identity Stack, and
Develop a framework laying down the desired properties of such a stack concerning Web3 Reputation functions.
Both notions are quite elusive and will take some effort to unpack. We will begin Chapter I by clarifying (and updating) the vocabulary emerging around the notion of the Web3 Identity Stack. Next, we will touch upon the relationships between these concepts and their web2 predecessors. We will then define what persistent reputation is, and how and why it instills new meaning into the notion of online identities, disrupting the very foundation of the web2 data industry.
It is essential to offer the world a flexible and holistic toolbox for working with Web3 Reputation to operationalize the disruptive potential it holds. To do so, we need to deeply consider the mathematical, technological, economic, and user experience-related properties of Web3 Reputation functions.
In the two years since we began working on Galactica.com, we have gone from drafting a concept to a viable prototype solution for on-chain reputation. This article, as well as those to follow, is an attempt to share what we have learned along the way. But it is more than that. At the heart of our motivation for this series is a desire to seek feedback, align definitions, and spark public debate, as we believe that a responsible approach to designing permissionless reputation systems is paramount to the internet’s evolution.
In Part II, we attempt to define what a well-behaved reputation function could be from a mathematical viewpoint. We will touch upon the philosophical foundation and desired product properties of such functions. We will then proceed to the formal derivation of what a general-purpose function would look like.
In Part III, we address the technological implementation of a framework capable of giving practical meaning to all these notions.
We hope that those who spend time delving into the entirety of these articles will find them thought-provoking.
Before proceeding further, we recommend revisiting some of the papers we have published previously, including Protocol Citizenship and The Cypher State for some of the most fundamental definitions relating to reputation and web3 identities and the reciprocal relationship between them; Galactica’s Reputation Framework Design for some of the earlier drafts of what a reputation framework could look like mathematically; Galactica’s Reputation Framework Implementation for a deep dive into the technological implementation of a viable reputation framework; Galactica’s Governance Framework Paper https://galactica.com/research/galactica_network_governance_framework_working_paper.pdf for a deep dive into Galactica.com’s governance framework, of which Reputation is the beating heart; and finally, Galactica’s for an actionable roadmap towards the ultimate vision of governance framework.
Chapter I: Context for Web3 Identity Stack
Let’s start with two quotes that each address a part of what the Web3 Identity Stack can be defined as:
“Web3 is based on the premise that each internet user will have a unique internet identifier, like an email address, that can be natively linked to any piece of software and stored on a blockchain. As part of someone’s “decentralized identity,” a portion of a person’s online activity would then be “on chain,” meaning that it would be public and easily searchable via their individual crypto wallet.
With such a decentralized identity — a readable history unique to each person — one’s crypto wallet would function as a sort of profile, similar to Facebook or LinkedIn. But unlike web2 profiles, decentralized identities are backed by hard evidence: a permanent, timestamped record of a person’s accomplishments, contributions, interests, and activities to date.
If decentralized identity were widely adopted, people would be able to carry their full selves with them as they traverse cyberspace: their affinities and experiences reflected by what they’ve created, contributed to, earned, and owned online, no matter the specific platform. This would bring us closer to how things work in the physical world, where our possessions and reputations are attached to us, rather than to the spaces we occupy; we can take them with us and use them however we please.” Scott Duke Kominers and Jad Esber for a16zcrypto
and
“Reputation systems present an opportunity for platforms to recognize — and thus incentivize — participants’ high-quality contributions, including content creation, moderation, community building, and gameplay. This is crucial to the growth and sustainability of any web3 project. Yet designing reputation systems requires complex considerations around reputation supply, distribution, credibility, and more.” Jad Esber and Scott Duke Kominers for a16zcrypto
These two quotes give us a good idea of both the potential that decentralized identities and web3 footprints enable and the challenges one has to face to arrive at a viable technological solution for them.
The problem space spans from definitions of on-chain reputation and the entities in the conceptual plane surrounding it, to the desired properties of the functional form of reputation functions, such that the output is well behaved across all corner cases and is computationally attainable given the limitations of today’s blockchain systems. In other words, one has to invent a new, narrow vocabulary and a standalone mathematical and economic framework to arrive at a viable on-chain reputation.
What are the relationships between Identity, Identification, Web Footprints, Reputation, and Web Paradigms (Web2 vs Web3)? As defined below, all these concepts lie semantically close to each other, and understanding them goes a long way toward solving some of the internet’s biggest long-standing challenges.
The Definitions
Let’s revisit (and update) the definitions:
Identification: The core primitives that collectively (and individually) identify a real-world person — embodying characteristics such as demographic information and skill sets, consumer preferences, geolocations, and entertainment interests. Identification necessarily originates off-chain and can then be on-ramped to a blockchain protocol. It’s instructive to think of identification as most of the data that fuels Google’s business model.
Persistent Identities: A Persistent Identity consists of all of an individual’s attainable data points, whether originating off-chain, on-chain, or cross-chain, that exist on a blockchain protocol (originated there or on-ramped). A sufficiently generalized definition of a human from the perspective of a blockchain protocol is the data that a real-world human has generated on the protocol itself or has otherwise on-ramped. A human on a blockchain can be referred to as a Persistent Identity. Today’s most popular blockchain protocols are not sufficiently equipped with features to reliably associate all data points generated by a given human with a single Persistent Identity.
Web3 Footprint: A Persistent Identity is defined by its Web3 Footprint, which is another term for all the data points that a human has generated or on-ramped.
Social Graph: The multitude of interactions between any such persistent identity and the rest of the protocol could then be called one’s Social Graph. In other words, a subset of one’s Web3 Footprint scoping all the interactions with other protocol entities is one’s Social Graph. Social Graph does not include the Identification. It is instructive to think about it this way: Identification + Social Graph of a Persistent Identity = Web3 Footprint.
Web3 Reputation: Finally, let us define what we mean by Web3 Reputation. The definition is quite formal and might at first appear counterintuitive. Web3 Reputation is an arbitrary function of the Web3 Footprint of a Persistent Identity, evaluated over an arbitrary subset of the data points comprising it. For example, one’s Web3 Footprint can consist of one’s zkKYC, the set of NFT collections one owns, one’s activity across all dApps comprising the wider ecosystem, and many other things. An example of Web3 Reputation could then be something like:
SQRT(age of account) * [0 if defaulted on a lending protocol in the past, 1 otherwise] * [1 if has_KYC, 0 otherwise].
Of course, it can also be any other function that can be computed given the availability of data and the technological properties of the protocol.
Web3 Identity Stack: This encompasses the suite of technological primitives underpinning the concepts outlined above, enabling practical use of identities and reputations within the Web3 ecosystem.
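To make the definitions above concrete, here is a minimal Python sketch of a Web3 Footprint split into Identification and Social Graph, together with the example reputation function given under the Web3 Reputation definition. The class and field names, the placement of each field, and the choice to measure account age in days are illustrative assumptions of this sketch, not part of any Galactica.com specification.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Identification:
    """Hypothetical: data about the real-world person, on-ramped from off-chain."""
    has_kyc: bool  # e.g. a zkKYC attestation is present


@dataclass(frozen=True)
class SocialGraph:
    """Hypothetical: interactions of the identity with the rest of the protocol."""
    account_age_days: float   # assumption: age is measured in days
    defaulted_on_loan: bool   # past default on a lending protocol


@dataclass(frozen=True)
class Web3Footprint:
    """Identification + Social Graph of a Persistent Identity = Web3 Footprint."""
    identification: Identification
    social_graph: SocialGraph


def example_reputation(fp: Web3Footprint) -> float:
    """SQRT(age of account) * [0 if defaulted, 1 otherwise] * [1 if has_KYC, 0 otherwise]."""
    no_default = 0.0 if fp.social_graph.defaulted_on_loan else 1.0
    kyc = 1.0 if fp.identification.has_kyc else 0.0
    return math.sqrt(fp.social_graph.account_age_days) * no_default * kyc


good_standing = Web3Footprint(Identification(has_kyc=True),
                              SocialGraph(account_age_days=400.0, defaulted_on_loan=False))
defaulter = Web3Footprint(Identification(has_kyc=True),
                          SocialGraph(account_age_days=400.0, defaulted_on_loan=True))

print(example_reputation(good_standing))  # 20.0
print(example_reputation(defaulter))      # 0.0
```

Note that the function only reads a subset of the footprint, which matches the definition: any other computable function over any other subset would be an equally valid Web3 Reputation.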
Before we proceed to define desirable properties of on-chain reputation, let us digress a little and give some valuable historical context that would enable us to frame these esoteric concepts in more conventional web2 terms.
Chapter II: The Web2 Identity Stack
“Shoshana Zuboff in The Age of Surveillance Capitalism, refers to user interactions on platforms like Google as behavioral value surplus. Historically, a firm had limited resources that it had to employ immediately to produce the goods it sold you. Or it paid ridiculous amounts in storage fees. A pencil manufacturer had to ship pencils. Ford’s car factories had to sell cars. They could not endlessly stockpile the timber or rubber for the process.
With the advent of the internet, however, this equation changed. A player like Google or Meta could keep your data for a decade until it could be monetised for their benefit. I could go to Facebook now and download all the cringe texts I may have sent my crush back in 2011 (and so could you).
In the early 2000s, most dot-com projects were what AI websites are today: an abundance of inbound traffic with little or no business model. You could license your search engine to a larger corporation or sell sponsored ads like Yahoo did. […] So Google had to find a different way of selling ads altogether.
Instead of allowing people to bid on and list ads based on their assumptions about what the audience would click on, the data scientists at Google could measure and predict which ad would best suit which person. Instead of a brand’s ad manager working on assumptions, you had data scientists targeting users, allowing the brand to see a clear ROI on each click from Google.”
In other words:
Data is sticky, hence our conviction that Data is the new TVL;
As we all know in retrospect, a consequence of this stickiness is the extraordinarily high margins of Google’s business model, arguably one of the most successful business models ever created; and
At the heart of it are the digital (i.e. web) footprints of real people.
Let’s move on.
“…The perfect storm was in place for the evolution of the web. A business was realizing the possibility of generating and storing a resource (user data) with next-to-no marginal costs and had the pipelines (targeted ads) to monetise it. All that was missing, in the style of most venture-backed companies, was a mechanism to scale it. This is where social graphs came into play.
…[Then came Facebook.]…
More users effectively meant you had a critical mass that could be divided and sold all kinds of goods. Users belonging to similar social graphs could be bucketed and served similar content. This became the basis for the algorithmic feeds in which we currently find love, jobs, giggles, despair & hope.
In Web2 networks, the social graph is the moat. If you allow users to interact with a graph via third-party applications, your chance of capturing user data diminishes. After all, the user then won’t be on a product that you control. If users can simply port their network of friends and family to a different application, they will have no incentive to return to yours, either.
We don’t have a shared protocol for social applications that work at the scale of Meta or Twitter because of how incentives are structured for existing behemoths. A Web2 product with open social graphs opens itself to competition and declining revenue. Both of which may not be a desirable outcome. […]”
Summarizing the extract above, the web2 frontier as we know it today is largely a product of business model-driven ecosystem evolution along two key axes:
Social graphs as in X, Facebook, etc., and
Other individual data footprints or identifications, such as search history, location, other metadata, etc.
Following the logic of the definitions introduced above, these two combined make up the Web2 Footprint, the fuel of the data economy.
Now that we are on the same page regarding social graphs and the broader Web2 Identity Stack, let’s try to unpack the evolution of the concept of reputation.
Quoting a16z:
“In contrast to most web2 profiles, decentralized identity is not ephemeral. This means that an NFT of a diploma in your crypto wallet, for instance, would turn into a permanent academic certification. Likewise, each piece of content you post online would be permanently linked to you (unless you choose to delete it). Moreover, with public histories it would be possible to prove that you were early to a trend or active in a project before it took off — like, say, being into Taylor Swift before she was popular or reading this article while web3 is still in its genesis.
This persistence establishes new incentives for reputation-building: instead of creating temporary profiles like we do in web2, web3 promotes long-term thinking. If people were empowered to build and maintain permanent identities online, we believe that on-chain systems could encourage people to more carefully curate the reputation markers that they carry with them into the future. In this case, curating a permanent library of NFTs is higher stakes than, say, curating a series of social media posts, in that it’s a reputation marker that you carry with you across cyberspace.”
As one might observe, the concept of ‘persistence’ is quite ubiquitous when we try to unpack the Web3 Identity Stack. We would even go as far as to say that the added property of persistence of the web footprint in the Web3 space effectively turns any data points associated with an ‘identity’ into reputation. It is a nice observation, but is there more to it? Can we outline a formal framework for the desired properties of Web3 Reputation systems?
In what follows we will attempt to do just that.
We would like to point out that what follows should be treated as an open-ended discussion: the Galactica.com team welcomes thoughtful feedback and rigorous debate around a topic as important as Web3 Reputation.
Chapter III: General Desired Properties of Web3 Reputation
As outlined above, Web3 Reputation is a function over the data points a Persistent Identity has generated. For example, Alice’s idea of reputation might be someone’s track record of repaying debts to a lending dApp. Alice could then use such a reputation when deciding on the collateral rate to set on a new loan to that person. Bob’s definition of reputation, on the other hand, could be some derivative of Expert Conference POAPs and Education SBTs that together determine the voting power someone should have in his expert DAO. It is therefore fair to say that Reputation Functions are ubiquitous in Web3 systems and are use-case driven.
Now, it’s important to tackle the question of how a system adaptable enough to accommodate such diverse use cases might be structured. The following is a non-exhaustive list of properties that, in our view, Web3 Reputation must possess to be viable for production and adoption. Some of these properties are purely technological, others concern the required properties of system economics, and yet others concern something else altogether.
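The Alice and Bob examples can be sketched as two different functions evaluated over the same footprint. This is only an illustration: the field names, the linear mapping from repayment record to collateral rate, the 150% and 110% bounds, and the POAP/SBT weights are all hypothetical choices made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Hypothetical subset of a Persistent Identity's data points."""
    loans_taken: int
    loans_repaid: int
    conference_poaps: int
    education_sbts: int


def alice_repayment_score(fp: Footprint) -> float:
    """Alice's reputation: share of past loans that were repaid (0.0 if no history)."""
    if fp.loans_taken == 0:
        return 0.0
    return fp.loans_repaid / fp.loans_taken


def alice_collateral_rate(fp: Footprint, base: float = 1.5, floor: float = 1.1) -> float:
    """Assumed linear rule: score 0 maps to `base`, score 1 maps to `floor`."""
    return base - (base - floor) * alice_repayment_score(fp)


def bob_voting_power(fp: Footprint) -> float:
    """Bob's reputation: assumed weights of 1 per conference POAP and 2 per education SBT."""
    return 1.0 * fp.conference_poaps + 2.0 * fp.education_sbts


carol = Footprint(loans_taken=4, loans_repaid=3, conference_poaps=2, education_sbts=1)

print(alice_repayment_score(carol))   # 0.75
print(alice_collateral_rate(carol))   # close to 1.2, up to floating-point rounding
print(bob_voting_power(carol))        # 4.0
```

The same identity receives two unrelated scores, each meaningful only within its own use case, which is the sense in which Reputation Functions are use-case driven.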
Here they are:
Private and verifiable: the inherent properties of public blockchains and the problem space around negative reputation dictate that any holistic stack for Web3 Reputation needs to possess two core properties: (a) a reasonably equipped outside observer cannot infer the reputation of a single user by observing the blockchain, and (b) the reputation calculation is correct, complete, and verifiable. As we shall see, although it may seem trivial at first glance, this property poses very significant technological challenges. In our view, only a combination of ZKP (Zero-Knowledge Proofs) and FHE (Fully Homomorphic Encryption) can reasonably solve this challenge.
Origin-agnostic: Data points comprising one’s Web3 Footprint (and by implication, the reputation) can originate from cross-chain, on-chain, and off-chain sources and can be reliably on-ramped into the on-chain reputation design space. In other words, to create the most holistic view of one’s Reputation, one has to allow for real-world (e.g. Education), web2 (e.g. X or Discord), and web3 (e.g. Ethereum, Solana activity) data points to represent a person on-chain.
Real-time: A real-time feed of reputation data is essential for many critical on-chain use cases. The framework must resist reputation flash attacks, in which an adversary uses transaction bundles, similar to flash loans, to take out undercollateralized loans. Real-time in this case means ‘same block’. The same applies to cross-chain and off-chain data sources, although arguably the best one can hope for there is a lag of one block (t-1).
Time-series and cross-section: Reputational functions need to be sufficiently general-purpose to incorporate two dimensions: arbitrary temporal and cross-sectional logic. In econometrics or statistics, this would mean a ‘panel data’ view of the Persistent Identity activity on a blockchain. The cross-sectional dimension is what we mean elsewhere by Heterogeneous Accounts: every Persistent Identity can have a unique set of data points associated with it at every observable point in time (i.e. every block).
Expressive functional form: In a perfect world, the class of admissible Reputation Functions would include any elementary function, subject to the constraint of computational feasibility. If one examines the functions used by financial institutions to evaluate the quality of a loan portfolio, it becomes clear that they can be immensely complex. The range of use cases for Web3 Reputation is far greater than what TradFi institutions cover, hence attainable Reputation Functions should be as diverse as possible. As in many other cases, the major impediment to realizing the perfect-world scenario is the available computing power and the speed of data dissemination in a global decentralized network. The next property deals with defining ‘computationally feasible’ in this context.
Computational feasibility: Arguably, the only way to determine which reputation functions are computationally feasible is to create a Market for Reputation computing. In other words, similar to how Gas addresses the issue of competition for block space, supply and demand must mediate the inclusion of any new Web3 Reputation Function.
User-defined: For the market for Reputation Functions described under computational feasibility above to be efficient, access to it needs to be democratic: no monopoly and no undue barriers to entry.
Composable: Any user-defined Reputation Function needs to be available for use by any other user of the network. Any component of such Reputation Function needs to be open-source to be examinable by other network participants. This property is essential for forming an efficient demand side of the market for Reputation Functions elaborated upon above.
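Several of the properties listed above (time-series and cross-section, user-defined, composable) can be illustrated with a small Python sketch. It models a Persistent Identity's activity as a panel keyed by block number, registers user-defined functions in an open registry, and builds a new function out of existing ones. The registry, the decorator, the data point names, and the 100-block window are hypothetical; the sketch ignores privacy, verifiability, and the market mechanism entirely.

```python
from typing import Callable, Dict, Mapping

# Panel view: block number -> {data point name -> value}.
# Different blocks may carry different data points (heterogeneous accounts).
Panel = Mapping[int, Mapping[str, float]]
ReputationFn = Callable[[Panel, int], float]

# Hypothetical open registry: anyone may add a function, anyone may reuse one.
registry: Dict[str, ReputationFn] = {}


def register(name: str) -> Callable[[ReputationFn], ReputationFn]:
    """Decorator that publishes a user-defined function under a unique name."""
    def decorator(fn: ReputationFn) -> ReputationFn:
        if name in registry:
            raise ValueError(f"reputation function {name!r} already registered")
        registry[name] = fn
        return fn
    return decorator


@register("repayments_in_window")
def repayments_in_window(panel: Panel, block: int, window: int = 100) -> float:
    """Temporal logic: sum 'repayment' data points over the last `window` blocks."""
    return sum((points.get("repayment", 0.0)
                for b, points in panel.items()
                if block - window < b <= block), 0.0)


@register("has_kyc")
def has_kyc(panel: Panel, block: int) -> float:
    """Cross-sectional logic: latest 'kyc' value recorded at or before `block`."""
    past = [b for b in panel if b <= block and "kyc" in panel[b]]
    return panel[max(past)]["kyc"] if past else 0.0


def compose_product(*names: str) -> ReputationFn:
    """Composability: build a new function as the product of registered ones."""
    def composed(panel: Panel, block: int) -> float:
        result = 1.0
        for name in names:
            result *= registry[name](panel, block)
        return result
    return composed


registry["kyc_gated_repayments"] = compose_product("has_kyc", "repayments_in_window")

panel = {
    10: {"kyc": 1.0},
    50: {"repayment": 1.0},
    120: {"repayment": 1.0, "poap": 1.0},
    300: {"repayment": 1.0},
}

print(registry["kyc_gated_repayments"](panel, 130))  # 2.0
print(registry["kyc_gated_repayments"](panel, 300))  # 1.0
```

Each function is evaluated as of a given block and includes data recorded in that same block, which is one simple reading of the 'same block' requirement. Because composed functions only reference registered names, every component stays open to inspection by other participants.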
The points above serve as our best attempt to lay down a vision for a perfect Reputation dimension of the Web3 Identity Stack. In our view, this article is instrumental in creating a theoretical foundation for those to follow.
In Part II of this series, we will delve deeper into the desired mathematical properties of reputation functions and how they relate to the attainable economic and game theoretical properties of reputation systems they enable.
Part III will concern the technological implementation of a system that would possess all the properties we have discussed. It will be the first semi-formal specification of the Galactica.com Reputation Framework.