The next speakers in this session at the Social Media Access Days at the German National Library are Gabriel Viehhauser and Carl Friedrich Haak, whose interest is in making use of donated social media data. The concrete context here is that the Austrian author Clemens J. Setz, who has at times posted some of his short-form work on Twitter, has donated his archive of tweets to a library in Vienna, which was unsure about what to do with this gift.
Such work is diverse in its formats; further, Setz is an author, but also an interlocutor, curator, recipient, object of mentions, and many other things. His posts can be prompted by (and respond to, retweet, or quote tweet) other accounts’ posts, including spam and bot posts; they may include multimedia content and other extraneous materials; they exist within an unbounded space centred on Setz as the author.
To deal with this dataset, the standards developed by the Text Encoding Initiative for the metadata annotation of printed texts may be relevant but (in their focus on print) insufficient. Also relevant is the Web archiving standard WARC, which is focussed on the hypertextual interlinkages between texts but does not model these connections as communicative acts. Otherwise, screen recordings of traversals between texts could also be used to capture the original flavour of such interactions on Twitter, but these are difficult to regenerate now and do not lend themselves to subsequent analysis.
Combining these ideas, the project therefore drew on a graph data model based on CIDOC CRM, which is widely used in the cultural heritage field; this facilitates the interlinkage of the individual tweets in the dataset via various shared attributes, and makes this graph dataset queryable using tools like SPARQL for the purposes of further analysis.
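To make the idea of a queryable tweet graph a little more concrete, here is a minimal sketch in plain Python. It is not the project's implementation: the tweet identifiers, the account names, and the predicate labels (`authoredBy`, `repliesTo`, `quotes`) are hypothetical placeholders, and a real model would map them to CIDOC CRM classes and properties and query them with SPARQL rather than with the toy `match` helper used here.

```python
from typing import Iterable, Iterator, List, Optional, Set, Tuple

# A triple is (subject, predicate, object), as in an RDF-style graph.
Triple = Tuple[str, str, str]

# Hypothetical toy data: all identifiers and predicate names are
# illustrative assumptions, not taken from the donated archive.
TRIPLES: Set[Triple] = {
    ("tweet:1", "authoredBy", "account:setz"),
    ("tweet:2", "authoredBy", "account:other"),
    ("tweet:3", "authoredBy", "account:setz"),
    ("tweet:3", "repliesTo", "tweet:2"),
    ("tweet:4", "authoredBy", "account:setz"),
    ("tweet:4", "quotes", "tweet:1"),
}


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


def replies_by(triples: Iterable[Triple], account: str) -> List[Tuple[str, str]]:
    """Return sorted (reply, target) pairs for replies written by `account`."""
    triples = list(triples)  # allow two passes over the data
    authored = {s for s, _, _ in match(triples, p="authoredBy", o=account)}
    return sorted(
        (s, o) for s, _, o in match(triples, p="repliesTo") if s in authored
    )


if __name__ == "__main__":
    for reply, target in replies_by(TRIPLES, "account:setz"):
        print(f"{reply} replies to {target}")
```

The point of the sketch is that replies, quotes, and authorship become explicit, typed edges between tweets and accounts, so that questions such as "which of the author's tweets respond to other accounts" turn into simple pattern matches over the graph.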
It would also be desirable to present the dataset in the form of an interactive and potentially annotated platform that also allows for filtering and further analysis and visualisation by its users. Ideally, this takes a modular design that is applicable beyond the specific dataset being examined here.











