At first glance it feels like the most effective way to game this system is to grind user credit through aggregate low-polarization support on fairly neutral, low-impact posts, then strategically 'spend' it on higher-profile polarizing posts. Is that a fair 'red teaming' observation?
Yes, I think this actually could work. Community Notes has a basic reputation system: users need to "Earn In" by rating notes as "Helpful" that are ultimately classified by the algorithm as helpful. Once enough attackers have earned in, they can totally break the algorithm.
Breaking it is not as simple as upvoting a lot of, say, right-wing or left-wing posts, though. The algorithm will simply classify all the attackers as having a very positive or negative polarization factor, and decide that their votes can be explained by this factor.
What would work is upvoting *unhelpful* posts. I have actually simulated this attack using synthetic data, and sure enough, it totally breaks the algorithm. I write about it in this article: https://jonathanwarden.com/improving-bridge-based-ranking/
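To make the mechanics concrete, here is a rough sketch of that attack on synthetic data. This is *not* the Community Notes code or my actual simulation: it fits a simplified bridge-based model (note intercept plus a one-dimensional polarity factor; the real model also has user and global intercepts), and all the numbers below are made up for illustration. Honest raters mix genuine helpfulness with left/right agreement; attackers then rate every truly unhelpful note as helpful.

```python
import numpy as np

rng = np.random.default_rng(0)
N_USERS, N_NOTES, RATINGS_PER_USER = 200, 40, 20

user_pol = rng.choice([-1.0, 1.0], N_USERS)   # left/right lean of honest raters
note_pol = rng.choice([-1.0, 1.0], N_NOTES)
note_help = rng.choice([-1.0, 1.0], N_NOTES)  # ground-truth helpfulness

def honest_ratings():
    R = np.full((N_USERS, N_NOTES), np.nan)   # NaN = user didn't rate the note
    for u in range(N_USERS):
        for n in rng.choice(N_NOTES, RATINGS_PER_USER, replace=False):
            # a rating mixes genuine helpfulness with polarization agreement
            s = 0.7 * note_help[n] + 0.5 * user_pol[u] * note_pol[n] + 0.3 * rng.normal()
            R[u, n] = 1.0 if s > 0 else 0.0
    return R

def fit(R, iters=1500, lr=0.1, lam=0.03):
    """Gradient descent on observed entries of r_hat[u,n] = b_n + f_u * f_n.

    b_n plays the role of the note's 'helpfulness' score; f absorbs votes
    that are explained by shared polarization."""
    mask = ~np.isnan(R)
    Rf = np.nan_to_num(R)
    cnt_u, cnt_n = mask.sum(1), mask.sum(0)
    b_n = np.zeros(R.shape[1])
    f_u = rng.normal(0, 0.1, R.shape[0])
    f_n = rng.normal(0, 0.1, R.shape[1])
    for _ in range(iters):
        err = np.where(mask, Rf - (b_n[None, :] + np.outer(f_u, f_n)), 0.0)
        b_n += lr * (err.sum(0) / cnt_n - lam * b_n)
        f_u, f_n = (f_u + lr * ((err @ f_n) / cnt_u - lam * f_u),
                    f_n + lr * ((err.T @ f_u) / cnt_n - lam * f_n))
    return b_n

unhelp = note_help < 0
R = honest_ratings()
b_before = fit(R)
score_before = b_before[unhelp].mean()        # low: honest raters dislike them
helpful_before = b_before[~unhelp].mean()

# attack: 100 sockpuppets rate every unhelpful note as helpful, nothing else.
# Their votes span both polarities, so the 1-D factor can't explain them away.
attack = np.full((100, N_NOTES), np.nan)
attack[:, unhelp] = 1.0
score_after = fit(np.vstack([R, attack]))[unhelp].mean()
# score_after should come out well above score_before: the attack inflates
# the scores of notes that honest raters found unhelpful
```

The key point the sketch illustrates: a coordinated left-wing or right-wing bloc gets absorbed into `f_u * f_n`, but indiscriminate upvotes on unhelpful notes land in the intercept `b_n`, which is exactly the quantity the ranking treats as helpfulness.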
Oh hey, I came across your Social Protocols groups while doing my regular rounds for Polis-related projects a few months ago, when I found Propolis! Was trying to figure out why your name was familiar-ish :)
There's also a Polis User Group discord: https://link.g0v.network/pug-discord It's pretty low-key lately, but there's a high density of potentially-aligned ppl. I am hoping to restart the weekly open calls for prospective Polis facilitators and self-hosters, in case you're interested in joining.
Thanks for your posts by the way! I am jealous of your output -- I tend to have a few calls/meetings about Polis per week, but am not so great at producing clean artifacts like this :)
The reasoning was: coming up with (and answering) yes/no questions is more effort, and a higher barrier to participation, than just posting anything and having up/downvotes, like in a social network. Requiring every piece of content on a platform to be formalized this way creates an entry barrier: people need to formulate whatever they want to post as a yes/no question. At the same time, it disallows content that doesn't fit the yes/no question model.
Our big insight was: we can drastically simplify the user interaction and allow arbitrary content, but keep the collective intelligence aspect. That's achieved by introducing a concept similar to community notes, but in a recursive way: every reply to a post can become a note. And replies can have more replies, which in turn can act as notes for the reply. Notes are A/B-tested: we check whether a reply, when shown below a post, changes the voting behavior on the post. If it does, it must have added some information that voters were not aware of before, like a good argument.
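A minimal sketch of how such an A/B test might be scored, assuming we only observe upvote counts per arm (the test statistic is a standard two-proportion z-test; the function name and all vote counts below are made up for illustration, not taken from any actual implementation):

```python
from math import sqrt, erf

def two_prop_z(up_a, n_a, up_b, n_b):
    """Two-proportion z-test: did showing the note change the upvote rate?

    Arm A: voters shown the post together with the reply-as-note.
    Arm B: voters shown the post alone.
    """
    p_a, p_b = up_a / n_a, up_b / n_b
    pooled = (up_a + up_b) / (n_a + n_b)            # shared rate under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # two-sided p-value from the normal CDF, Phi(x) = 0.5*(1 + erf(x/sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# hypothetical counts: 80/100 upvotes with the note shown vs 55/100 without
z, p = two_prop_z(up_a=80, n_a=100, up_b=55, n_b=100)
informative = p < 0.05   # the reply shifted votes -> it likely added information
```

In a real system you'd presumably want a sequential or Bayesian variant rather than a fixed-horizon test, since votes arrive over time, but the decision being made is the same: does the reply measurably move votes on the parent post?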
It's great to see this exploration. Those interested might also want to check out https://vitalik.eth.limo/general/2023/08/16/communitynotes.h... and https://knightcolumbia.org/content/the-algorithmic-managemen... (and how aspects of this might be applied to AI governance https://reimagine.aviv.me/p/governance-of-ai-with-ai-through ).