Introduction
Since our strike announcement, a number of new developments have occurred. Philippe, VP of Community, posted data they have regarding GPT content on the platform. Stack Exchange staff reached out to strike organizers and asked us to choose three moderator representatives for the strike. Also, a former DBA for Stack Exchange, Inc. disclosed that data dumps have been disabled. This post strives to speak on these developments from the perspective of strike participants.
Strike representatives
We have been coordinating action on our Discord server, as Stack Exchange, Inc. has instructed us in no uncertain terms that we cannot organize on their platform. Through this server, employees of Stack Exchange, Inc. reached out to us and asked us to designate representatives for the strike. These representatives will later meet with Stack Exchange, Inc. representatives and negotiate on our behalf. At the company's request, the representatives must be moderators, because some of the discussion points will involve information that is currently confidential and covered by the moderator agreement. Additionally, SE may disclose new data that is not cleared for public release and must be protected for privacy reasons.
We are currently voting to determine these representatives. The vote will end on June 11th at 23:59:59 UTC.
The Data Dumps
A former database administrator for the company has disclosed that Stack Exchange, Inc. quietly disabled the data dumps in March 2023, with a note that they should only be re-enabled with approval from senior leadership. Shortly after, the CTO confirmed this to be the case, citing the need to “protect Stack Overflow data from being misused by companies building LLMs” and stating that the dumps have been stopped until “guardrails” are in place.
The Stack Exchange data dumps have been in place since 2009 and have been used to make network data available in an alternative format that allows people to take advantage of the open CC BY-SA license.
Disabling the data dumps in this manner is yet another example of poor communication with the community at the heart of the network. The data dumps were turned off for several months, with no advance warning or communication until a user asked about it.
Perhaps more importantly, the data dumps serve to emphasize the very reason for the existence of the platform: guaranteed, free access to a repository of knowledge. The network was founded to be an alternative to a paywalled platform and to guarantee that information was freely distributed. The data dumps were insurance that, no matter what happened with the company in the future, the information shared on the platform would always be freely accessible to all. Disabling these dumps is a betrayal of the founding philosophy of the network.
The impact so far
Stack Exchange, Inc. has claimed in a statement to the press that 11% of Stack Exchange moderators are participating in this strike. We would like to clarify that while this was technically an accurate statement at the time that it was made, it was a misrepresentation of the actual percentage of moderators actively handling flags who are suspending their activity, and does not put the strike in perspective.
On Monday, June 5th, the notice about the strike was posted to Meta Stack Exchange, and the strike kicked into effect. The open letter, however, had been available to sign beforehand, as organization and coordination required. Some moderators signed the letter before it went “live” on June 5th (although their signatures publicly display as the 5th due to the strike not starting before then). The 11% of moderators cited is the percentage of moderators who had signed the letter before it even went live.
Currently, the vast majority of moderators on Stack Overflow have suspended their activity. The pending flag queue stood at just over 130 flags before Stack Exchange posted the moderator-private version of the AI-generated content policy; it is now approaching 3,000, even though many of the most active flag-raising users have also stopped raising flags.
On multiple other sites (Super User, Software Engineering, Math, Academia, etc.), the majority of site moderators, or all of them, are on strike.
As of the time of writing, 113 out of 538 total Stack Exchange network moderators have signed the open strike letter, a percentage of 21%, and this number continues to grow.
The GPT data analysis
Stack Exchange, Inc. has released some of the data behind their decision to override community consensus and prohibit moderators from handling AI-generated content. This data analysis has several flaws and unverifiable underlying assumptions, examined in detail in the answers to that post. Among the flaws is a focus on GPT detector accuracy, even though moderators do not rely blindly on GPT detectors. We do not believe that this data sufficiently backs up the perceived need to implement such a total prohibition on moderating AI-generated content, nor does it excuse the manner in which Stack Exchange, Inc. went about doing so.
To summarize points raised in just a selection of answers:
- Stack Exchange claims to have a reliable method of detecting GPT posts through draft count. This method has been called into question: it does not appear to account for trivial actions that would cause the detection to fail, and it does not match reality as observed by multiple commentators. Many of the remaining conclusions depend on this method being accurate. That is, if the conclusion that GPT posts have fallen, which is based on this detection method, is inaccurate, many of the conclusions that follow from it are invalidated. (discussed by Mithical, Gilles, Kevin, CodeCaster, etc.)
- The methodology for handling appeals from suspended users is in question; moderators could have been consulted, for example. (Ryan M, Chris)
- Multiple answers question or disprove the validity of the data and the claims drawn from it by Staff as a whole. (starball, kaya3, CodeCaster)
- Staff appear to have identified a problem (declining user activity rates), but answers offer alternative explanations for the data displayed that staff do not appear to have considered. (Bryan Krause, starball)
… and this is just a brief overview of some of the first half of the first page of responses.
Our conditions for ending the strike
In both the open letter and the Meta post we have issued several conditions that must be met for the strike to end. In light of the developments mentioned above, we wanted to reiterate them here:
- A retraction of the prohibition on moderating GPT content.
This is the immediate, first action that Stack Exchange, Inc. must take in order to begin resolving this issue. This is a non-negotiable, fundamental requirement.
- The private policy on GPT content that was issued to moderators must be revealed publicly.
Stack Exchange, Inc. has put moderators in the untenable position of having a private policy dictating how to handle flags and moderate content that differs from the public version of this policy. The private policy must be retracted and revealed publicly so that the public knows what restrictions moderators were placed under.
- The data dumps must be re-enabled, and SEDE and API access guaranteed.
The data dumps of Stack Exchange content serve to further the goals of free knowledge-sharing. Content was posted to the Stack Exchange network to further that goal and with the understanding that it would be freely distributed to anyone seeking knowledge. The data dumps safeguard that collected knowledge and must be continued.
The Stack Exchange API and Data Explorer both serve as major parts of moderation. Userscripts, queries, bots, and other tools are used to find, identify, and improve content across the Stack Exchange network. Access to these resources and the data dumps must be allowed to continue unimpeded.
- Stack Exchange, Inc. must communicate, gather feedback, and act on that feedback before making major policy or software changes to the public platform.
Stack Exchange, Inc. has consistently made harmful changes, both to policies and to the software running the public platform, that run counter to the knowledge-sharing goal of the network. Going forward, Stack Exchange, Inc. must consult with the community to gather feedback in order to safeguard the goals of the platform.
We continue to hope for a speedy resolution to this conflict. We look forward to Stack Exchange, Inc. taking the steps required in order for the network to return to its normal operations, focused on building and maintaining a repository of freely accessible, high-quality information in the form of questions and answers.