The final speakers for today at the Social Media Access Days at the German National Library are Oliver Watteler and Jan Schwalbach, whose interest is in the legal conditions for sharing platform data. Platforms’ developer policies and Terms of Service are in constant flux, so it is important to keep track of how they evolve over time.
Researchers often have a strong interest in sharing the datasets they have collected with others; data sharing aids replicability, speeds up the research process, and enables new work. But researchers are rarely aware of the frameworks the platforms have imposed on such sharing: data scraping for research purposes is usually covered by applicable laws, but use of an API implies the acceptance of the platform’s Terms of Service for the API.
Such Terms of Service are often interlinked with various other policy documents published by the platforms; these frequently change over time, and may or may not be legally applicable under relevant national laws.
To address this, the project drew on the Internet Archive to capture a corpus of Twitter’s API Terms of Service and Developer Policies between 2006 and 2023, and found a number of major changes. Before 2010, there were only the overall Twitter Terms of Service; by 2010, there were distinct API Terms; from 2014, Twitter incorporated a Developer Policy into its Developer Agreement.
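As a rough illustration of how such a corpus might be assembled, here is a minimal sketch that uses the Wayback Machine's public CDX index to list archived snapshots of a policy page. It is not the project's actual pipeline, which the talk did not detail: the function names and the page address passed in are hypothetical, and the sketch only builds the query and parses a response, leaving the download step to whichever HTTP client is preferred.

```python
from urllib.parse import urlencode

# Public index of Wayback Machine captures.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(page_url, year_from=2006, year_to=2023):
    """Build a CDX query listing captures of page_url in a year range.

    page_url is whatever policy page is being tracked; the exact pages
    used in the study are not known here, so it is left as a parameter.
    """
    params = {
        "url": page_url,
        "from": str(year_from),
        "to": str(year_to),
        "output": "json",            # rows as JSON arrays, header row first
        "collapse": "digest",        # skip adjacent captures with identical content
        "filter": "statuscode:200",  # keep successful captures only
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_rows(rows):
    """Turn the CDX JSON rows (header row + records) into dicts.

    Each dict also gets a 'snapshot_url' pointing at the archived copy.
    """
    if not rows:
        return []
    header, *records = rows
    snapshots = []
    for record in records:
        entry = dict(zip(header, record))
        entry["snapshot_url"] = (
            f"https://web.archive.org/web/{entry['timestamp']}/{entry['original']}"
        )
        snapshots.append(entry)
    return snapshots
```

Collapsing on the content digest means a new row appears only when the archived page actually changed between consecutive captures, which is a convenient first approximation of "a new version of the terms"; the timestamps of those rows then give approximate dates for each revision.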
The lengths of these documents grew substantially but not steadily over time; the final iteration of the Developer Terms contained over 12,000 words, and was more than twice as long as these documents had been for much of the overall period. Over time, the restrictiveness of these terms – as measured by the number and type of data objects that could be shared, the mandated technology for sharing, the time limits on sharing, the permitted usage purposes, and the mechanisms for asking Twitter for sharing permission – also changed significantly.
Early terms were very permissive; by 2010, sharing was not allowed without prior written approval; from 2014, sharing of tweet IDs without further details was allowed; later policies introduced timeframe or volume limitations.
Twitter is only one example here: it illustrates the constantly fluctuating restrictions being applied to research uses of platform data, and the unpredictability of such changes to the terms. This affects the ability of such research to address calls for open data and adherence to the FAIR principles: legal constraints imposed on researchers need to be considered in advance.











