#edDDI: Digital Day of Ideas 2015

2016 update: #DigScholEd was liveblogged by Nicola Osborne. Keynotes from literary historian Ted Underwood on Predicting the past, a distant reading type approach to digital libraries, Lorna Hughes on Content, co-curation and innovation: digital humanities and cultural heritage collaboration, and Karen Gregory on Conceptualizing digital sociology.

Bumped/rewritten post – see below for brief mentions of #edDDI in 2014 and 2013 and other #digitalhss doings.

From the #digitalhss stable came Digital Day of Ideas 2015 (#EdDDI | TAGSExplorer – see graph) on 26 May, livetweeted, blogged and Storified by Lorna Campbell (@LornaMCampbell), with recordings of the talks to come.

Speakers and outputs:

Other #edDDIs:

#digitalhss in four keys: medicine, law, bibliography and crime, workshop on 12 November 2013, liveblogged by Nicola Osborne:

  • Digital articulations in medicine (Alison Crockford) – ah, the Surgeons’ Hall…seeks to illuminate the relationship between literature and medicine in Edinburgh through the development of a digital reader,  joining together not only the literary and medical spheres but also the rapidly expanding field of the digital and the medical humanities; interesting points on the nature of digihum and public engagement issues, see Dissecting Edinburgh for more
  • Rethinking property: copyright law and digital humanities research (Zhu Chen Wei) – the entrenched idea of copyright as an exclusive property regime is ill suited for understanding digihum research activities; how might copyright law respond to the challenges posed by digital humanities research, in particular the legality of mass digitisation of scholarly materials and the possible copyright exemption for text and data mining
  • Building and rebuilding a digital catalogue for modern Chinese Buddhism (Gregory Scott) – the Digital Catalogue of Chinese Buddhism is a collection of data on over 2300 published items with a web based, online interface for searching and filtering its content; can the methods and implications of working with a large number of itemised records, bibliographic or otherwise, be applied to other projects?; channelling Borges’ library of Babel 
  • Digitally mapping crime in Edinburgh, 1900-1939 (Louise Settle) – specifically an historical geography of prostitution in Edinburgh; used Edinburgh Map Builder, developed as part of the Visualising Urban Geographies project, which allows you to use National Library of Scotland maps, Google Maps and your own data; viz helps you spot trends and patterns you may not have noticed before;  for locations elsewhere in UK Digimap includes both contemporary and historical maps; Historypin uses historical photography to create maps, (EH4, plus come in #kierkegaard); see also the Edinburgh Atlas

See also the workshop on data mining on 19 November 2013.

#smwbigsocialdata: getting social at CBS

On 27 February the boffins at Copenhagen Business School (aka the Computational Social Science Laboratory in the Department of IT Management) opened their doors for Social Media Week with Big social data analytics: modelling, visualization and prediction. This was the second time CSSL has participated in #smwcph, with their 2014 workshop (preso) looking at social media analytics. See also my post on text analysis in Denmark.

Wifi access was not offered, resulting in only 19 tweets, but as many of these were photos of the slides I’m not really complaining. Also no hands-on this year, all in all a bit of a lacklustre form of public engagement.

Ravi Vatrapu kicked off the workshop with a couple of definitions:

  • What is social? – involves the other; associations rather than relations, sets rather than networks
  • What is media? – time and place shifting of meanings and actions

The CSSL conceptual model:


  • social graph analytics – the structure of the relationships emerging from social media use; focusing on identifying the actors involved, the activities they undertake, the actions they perform and the artefacts they create and interact with
  • social text analytics – the substantive nature of the interactions; focusing on the topics discussed and how they are discussed

It’s a different philosophy from social network analysis, using fuzzy set logic instead of graph theory, associations instead of relations and sets instead of social networks.

Abid Hussain then presented the SODATO tool, which offers keyword, sentiment and actor attribute analysis on Twitter and Facebook (public posts only, uses Facebook Graph API). Data from (for example) a company’s wall can be presented in dashboard style, eg post distribution by month.

Next, Raghava Rao Mukkamala explored social set analytics for #Marius and other social media crises. Predictions (emotions, stock market prices, box office revenues, iPhone sales) can be made based on Twitter data.

Benjamin Flesch’s Social Set Visualizer (SoSeVi) is a tool for qualitative analysis. He has built a timeline of factory accidents and a corpus of Facebook walls for 11 companies, resulting in a social set analysis dashboard of 180 million+ data points around the time of the garment factory accidents in Bangladesh.

The dashboard shows an actor’s engagement before, during and after the crisis (time), which can also be analysed over space (how many walls did they post on). Tags are also listed, allowing text analysis to be undertaken.
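The set-based view of engagement over time and space can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how SoSeVi itself is built: the actor names, wall names and the `posts` records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records of (actor, wall, period); not real SoSeVi data.
posts = [
    ("ann", "wall_a", "before"),
    ("bo", "wall_a", "before"),
    ("bo", "wall_b", "during"),
    ("cy", "wall_a", "before"),
    ("cy", "wall_a", "during"),
    ("cy", "wall_c", "after"),
    ("dee", "wall_b", "during"),
    ("eli", "wall_b", "during"),
    ("eli", "wall_b", "after"),
]

# Build one set of actors per period (time) and one set of walls per actor (space).
actors_by_period = defaultdict(set)
walls_by_actor = defaultdict(set)
for actor, wall, period in posts:
    actors_by_period[period].add(actor)
    walls_by_actor[actor].add(wall)

before = actors_by_period["before"]
during = actors_by_period["during"]
after = actors_by_period["after"]

# Plain set operations answer the "who was engaged when" questions.
engaged_throughout = before & during & after   # active in all three periods
arrived_with_crisis = during - before          # first seen during the crisis
left_afterwards = during - after               # went quiet once it was over

# Engagement over space: how many walls did each actor post on?
wall_counts = {actor: len(walls) for actor, walls in walls_by_actor.items()}
```

No graph is needed here: membership and overlap of sets carry the analysis, which is the contrast with network-based approaches drawn earlier in the workshop.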

Niels Buus Lassen and Rene Madsen then outlined some of their work with predictive modelling using Twitter. You have to buy into #some activity being a proxy for real world attention, ie Twitter as a mirror of what’s going on out in the market – a sampling issue like any other. Using a dashboard driven by SODATO they classify tweets using ensemble classifiers, for example to predict iPhone sales from 500 million plus tweets containing the keyword “iphone” (see CBS news story | article in Science Nordic).

They also used a very cool formula I nearly understood.

Last up, Chris Zimmerman gave an overview of CSSL’s new Facebook Feelings project, a counterpart to all those Twitter happiness studies. A classification of 143 different emotions on Facebook, based on mood mining from 12 million public posts, yikes. “Feeling excited” was the most popular feeling by far. Analysis can be done and correlations made on any number of aspects of the data, with an active | passive axis in addition to the positive | negative axis used in sentiment analysis. Analysis by place runs into the usual issue – only 5% of data has locality data.

Overview slides currently available from the URL below…

Telling stories with maps: literary geographies

Telling stories with maps: the geoweb, qualitative GIS and narrative mapping (programme) was a seminar (report | another one) held on 30 April as part of Hestia2, a project centred round spatial reading and visualising Herodotus’ Histories (see posts). Sessions in the morning covered narrative mapping while the afternoon focused on literary analysis and networks.

Sessions of particular note:

During the lunch break participants tried out the MapLocal app (Android only), which allows users to take photos and record audio commentaries which are geolocated and uploaded to a shared map. Echoes of the Gdn’s Google Street View Sleuth?

Time to revisit Kierkegaard in maps, although other personally related themes might prove more doable.

A recurrent theme [was] the conceptual and technical challenges associated with efforts to shift the focus away from traditional ‘Cartesian’ cartographic methods – with their focus on surfaces, images and topographies – onto the topological and networked representations contained in narrative depictions of space.

What is lost in translation from narrative to map or map to narrative form?

Great livetweeting from @muziejus:

A further event on 6 June explored digital pedagogy.

Some linkage:

Some notes:

  • literary cartography
    • an approach using a symbolic language
    • spatial elements of texts are translated into cartographic symbols
    • allows new ways in exploring and analysing the geography of literature
    • tools of interpretation – show something which hasn’t been seen before
    • not just supporting the text
  • the space of fiction – categories
    • settings – where the action takes place (house, village)
    • zones of action – several settings combined (city, region)
    • projected spaces
      • characters are not present but are thinking of, remembering, longing for or imagining a specific place
    • markers – places which are mentioned; indicate the geographical range and horizon of a fictional space
    • paths/routes – along which characters move; connections between waypoints (settings, projected spaces)
  • database support
    • data model
      • general text information, including bibliography and assigned model region
      • about the author
      • the temporal structure of the story line
      • spatial objects
    • maps created automatically from database
  • what elements of the literary space can be mapped
    • the city in literature
    • interactions/tensions between centre and periphery
    • travelling
    • crossing borders
    • imaginary places
    • literary tourism
  • what elements are unmappable
  • different representations for epochs, genres?
  • spatial models
    • maps in literature, eg Treasure Island
    • imaginary settings
    • mapping of a single text
    • mapping of groups of texts
      • where and when do cities appear on the literary map of Europe?
      • how international is the space?
    • placing literature on a map
      • simplistic
      • no theoretical foundation
    • issues and uncertainties
      • the artistic freedom of the author
      • semantic and linguistic variation in describing places and spaces
      • vague geographical concepts
      • reading variations by different readers
      • visualisations need to make some things clearer than they actually are
      • texts do not always provide distinct or correct information
      • different interpreters can provide different viewpoints – subjective
      • mark data as direct/indirect reference
      • detail may not be provided of a journey, but a straight line gives the wrong impression
  • maps as intermediate results, sources of inspiration, generators of ideas for future research
    • makes aspects visible which were invisible before
    • creates knowledge about places, their historical layers, meanings, functions and symbolic values
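The categories and data model in these notes could be sketched as a small set of Python dataclasses. This is a hedged illustration only: the class names, field names and the example novel are hypothetical, and the real database behind the maps will differ.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    """The five categories of fictional space listed in the notes."""
    SETTING = "setting"
    ZONE_OF_ACTION = "zone of action"
    PROJECTED_SPACE = "projected space"
    MARKER = "marker"
    ROUTE = "path/route"


@dataclass
class SpatialObject:
    name: str
    category: Category
    # Marks whether the text names the place directly or only implies it.
    direct_reference: bool = True
    # For routes: names of the waypoints they connect (may be incomplete).
    waypoints: list = field(default_factory=list)


@dataclass
class LiteraryText:
    title: str
    author: str
    objects: list = field(default_factory=list)

    def by_category(self):
        """Count spatial objects per category."""
        return Counter(obj.category for obj in self.objects)

    def uncertain(self):
        """Names of objects that rest on indirect references."""
        return [obj.name for obj in self.objects if not obj.direct_reference]


# Hypothetical example text, invented purely for illustration.
novel = LiteraryText(
    title="Example novel (hypothetical)",
    author="A. N. Author",
    objects=[
        SpatialObject("harbour inn", Category.SETTING),
        SpatialObject("the old town", Category.ZONE_OF_ACTION),
        SpatialObject("childhood village", Category.PROJECTED_SPACE,
                      direct_reference=False),
        SpatialObject("Paris", Category.MARKER),
        SpatialObject("coast road", Category.ROUTE, direct_reference=False,
                      waypoints=["harbour inn", "childhood village"]),
    ],
)
```

Keeping the direct/indirect flag on every object means a map generated from the data can style uncertain places differently, rather than drawing everything with the same false precision.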

Danish Twitter census 2014

(Post copied from Danegeld blog, 4 Feb 2015.)

The latest Danish Twitter census was launched on 8 May. Couldn’t attend, but here’s the gen, from the 60 tweets at #twittercensus, 129 at #tcdk. See also my report on last year’s census, on page 7 of my 2013 Social Media Week diary (PDF).

  • number of Danish users of Twitter has doubled since the last census, to 222,505 (SE: 641,746, NO: 406,250, FI: 153,746)
  • still few ‘active’ – 39,963 Danes (18%) tweet at least once a month (SE: 38%, NO: 29%, FI: 34%)
  • very active (tweet at least once a day) – 6245 (3%; SE: 13%, NO: 8%, FI: 6%); falling
  • 64% of all Danish twitter users have only posted between 1 and 9 tweets
  • an average Danish twitter user has 64 followers (208 globally) and follows 89; low figures may be due to the rapid growth
  • techsome segment isn’t growing, but teen segment is
  • individuals are driving Twitter use in Denmark, not brands
  • only 30% tweet location
  • @bavnhoej (translated from Danish): “I wonder whether one day there will be qualitative analysis of what is being said on Twitter. A wildly exciting tool #tcdk, but the focus is once again on quantity” – see @tagsterdk and @anders_boje
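As a quick sanity check on the activity shares in the census figures, the percentages can be recomputed from the counts given. A minimal sketch; the variable and function names are my own.

```python
# Counts as quoted in the census notes.
danish_users = 222_505
monthly_active = 39_963   # tweet at least once a month
daily_active = 6_245      # tweet at least once a day


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


monthly_share = share(monthly_active, danish_users)
daily_share = share(daily_active, danish_users)
print(f"monthly: {monthly_share:.1f}%  daily: {daily_share:.1f}%")
```

Rounded to whole numbers these agree with the 18% and 3% reported for Denmark.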

Project #marius and infostorms

(Post copied from Danegeld blog, 4 Feb 2015.)

Update, 28 Feb 2015: I gave #SMWZOOSHITSTORM a wide berth as it would just make me cross, although the CBS team commented at another Social Media Week CPH event that the story keeps on giving. The event did yield up:

Updates: 2 April 2014: story in Berlingske on the research, plus perspectives of the day from Denmark and RoW. The CPH Post, who had their own Marius fool, reported that the Jobindex spoof was pulled at around noon due to complaints, but it still seems to be there…9 April: the Zoo’s comms guy tells his story…a peer reviewed article on the saga, Marius, the giraffe: a comparative informatics case study of linguistic features of the social media discourse, was presented at the ACM’s CABS 14 conference (abstract)

A team at Copenhagen Business School has taken a look at the use of social media around Copenhagen Zoo’s recent giraffe story:

See also Tableau visualisations and the timeline of events.

Research questions:

  • how did the conversation amplitude evolve?
  • where did negative sentiment originate and how did it evolve/spread?
  • who were the main actors – for some #sna see slides 20-23; Twitter bios showed a lot of vegans, activists etc (slide 19), well organised on #some
  • what types of posts and events instigated the issue online?
  • how did CPH Zoo handle the event on social channels and how did the social media storm affect their presence? – posted both in English and Danish on its Facebook and very successful in terms of check-ins, likes etc, but commentary very negative, mainly English (slides 24-26)
  • how did other organisations deal with the crisis?

Over 80% of the data came from Twitter. Highest buzz rate: 332 posts/minute, with a second short lived spike at 20K tweets/hr re the second Marius. 50% of tweets were retweets – a reflection of sentiment?

Twitter offered a more direct reflection of events, in terms of volume and sentiment, and also demonstrated a more drastic reaction to network prestige factors from activists and celebs. Discourse on Facebook was different –  a more closed environment, with feelings expressed to family and friends and maybe the Zoo.

95% of the global conversation was in English, with Danish detected in only 2,220 posts. Differences in the Danish subset are particularly interesting (slide 11) – Twitter and Facebook only share 50% of the conversation – does mainstream media play a larger role in Danish society? Fewer RTs – #some used more to express oneself than to share information? But sentiment is also more neutral (slide 17), with more negative sentiment on Facebook (apart from that viral photo in support of the Zoo; ?Twitter penetration in Denmark lower, large subset of politicians, media etc).

Radian6 used for analysis, but came up short – pretty hopeless for the Danish data subset, and its automatic sentiment coding was “either super safe or super crap” (slide 16), neutral heavy, often failing to detect negative sentiment. 50 corporate communications students at CBS hand coded some data with rather different results. Much discussion over what is positive or negative in this case. Now starting to analyse YouTube comments.
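One way to put a number on how far automatic and hand coding diverge is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The talk did not say this measure was used, so treat it as a swapped-in technique: the sketch below uses made-up labels (`auto` and `hand` are hypothetical) purely to show the calculation.

```python
from collections import Counter

# Hypothetical sentiment codes for the same ten posts.
auto = ["neu", "neu", "neu", "neu", "neu", "neu", "neu", "neu", "neg", "neg"]
hand = ["neg", "neg", "neg", "neg", "neu", "neu", "neu", "pos", "neg", "neg"]


def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each coder's label proportions, summed.
    counts_a, counts_b = Counter(a), Counter(b)
    expected = sum(
        (counts_a[label] / n) * (counts_b[label] / n)
        for label in set(a) | set(b)
    )
    if expected == 1:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


kappa = cohens_kappa(auto, hand)
```

A neutral-heavy automatic coder, as described for the tool here, tends to score low on this kind of measure even when raw agreement looks passable, because so much of that agreement is on the dominant neutral label.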

Was #marius an infostorm? Infostorms, a new book from two researchers in Denmark (one chairman of the Danish Nudging Network), explores whether #some “amplifies irrational social behaviour and can manipulate minds and markets” (see press release).

Denmark’s utilitarian approach towards animals is out of step with the English speaking world in particular. Some rather less robustly scientific articles have been sighted lately, and this is a topic it will be interesting to track in the future. Here’s my collection of notable #marius stories for the record:

An image in support of the Zoo’s Director went viral on my Facebook timeline at least, and an ill advised tweet from actor Pilou Asbæk, one of the hosts for Eurovision 2014 in Copenhagen, went viral on Facebook (traces of both now deleted), but it is to be hoped that organisations representing Denmark are sensitive to the issues:

Copenhagen’s visitor card

Mapping a community: a SNA case study

(Post copied from Danegeld blog, 4 Feb 2015.)

Update, July 2015: see Hazel Hall on DREaM Again (again), investigating the long term impact of the project. Splendid! May 2016: not much on #sna lately, apart from a snippet on R4’s Digital Human: Are you more likely to find what you’ve lost using online social networks? Are we as connected as we think we are? Or does it make more sense to step out of the digital world and search with the help of physical social networks? A larger network of weaker/looser ties is more effective in finding something lost – these ties have information you don’t have. Other factors also come into play, eg how navigable is the network? The same processes go on IRL, with the Lost and Found Office now also online.

Over the last couple of years I followed the work of the DREaM project, aimed at building a community of LIS researchers in the UK. Effective event amplification provided me with an introduction to social network analysis (SNA; nearly two years ago now!) and a host of other research methods.

The DREaM project SNA’d themselves, specifically a cadre of 33 individuals who attended all the f2f events and created the network ‘core’. In the first workshop the participants provided data on (1) individuals’ awareness of the research expertise and knowledge of other participants, and (2) social/interactional links across the network, data which was collected again at the final workshop. The hypothesis was that analysis of the two sets of data would reveal changes in levels of integration among the DREaM cadre and network density among the group as a whole over the series of workshops – ie that integration and network density would increase.

Initial findings were presented at the final DREaM event and a paper was finally published in the Journal of Documentation in October – see Hazel Hall’s post for full details and to download the manuscript. The paper offers a potential model for nurturing and assessing network and community (of practice) development, specifically a developing, or emergent, network based on spontaneously formed ties, which could also be applied to NSMNSS, the legal education community, Danish literary translators, walking types, etc. As well as a useful overview of the development of SNA from the 1930s it provides a model for moving forward from the presentation of network diagrams, discussing features of network articulation and measurement, relational ties and network roles.

Methodology and findings:

  • data were input manually into Ucinet v.6 and visualised network diagrams (sociograms) were produced using Netdraw; measures of density and degree centrality were calculated using Ucinet
  • the sociograms highlighted the centrality of position of certain participants, prompting speculation as to their identity and the reasons behind this centralisation as well as discussion on the meaning behind some of the more isolated positions occupied by some of the outliers
  • the findings from the first round of data collection demonstrated that the participant networks were not very highly connected, and heavily centralised around a small number of actors from one role
  • analysis of data collected in the course of the final workshop reveals a demonstrable increase in network density, indicating a much more closely linked and robust network; more evenly linked, with less dependence on two or three very densely networked actors, when analysed by role several categories had moved to a more central position, one category had formed a clique and one category seemed particularly adept at network building, with most members moving towards the centre of the network
  • not all the key players were those one might have expected to play such roles; a small number of relatively novice researchers proved to be particularly strong networkers and were central to the network structure (this was not explored further due to ethical concerns)
  • greater change in the density of the network with regard to expertise awareness than for interaction, suggesting that even if participants had not had one-to-one interaction with another participant they were still more likely to know of their area of research expertise – ie who knows what, typical of a work related rather than ‘social’ network
  • note of caution: in an information sharing network, for example, an actor with a high degree of betweenness centrality may be playing the role of either broker or a bottleneck – for most network patterns multiple interpretations are possible, and it is therefore appropriate to follow up such analysis with qualitative research that seeks to explore likely explanations (data from other sources included a ‘before and after’ audit of skills and feedback on face to face events)
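The note of caution about betweenness can be made concrete with a small sketch. The paper used Ucinet; the code below instead computes unnormalised betweenness in plain Python using Brandes' algorithm for unweighted graphs, on a hypothetical network of two tight groups joined by a single actor, X.

```python
from collections import deque

# Hypothetical network: two triangles (A, B, C and D, E, F) bridged by X.
network = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "X"},
    "X": {"C", "D"},
    "D": {"X", "E", "F"},
    "E": {"D", "F"},
    "F": {"D", "E"},
}


def betweenness(net):
    """Unnormalised betweenness for an undirected, unweighted network."""
    scores = {v: 0.0 for v in net}
    for source in net:
        # Breadth-first search, counting shortest paths from the source.
        order = []
        preds = {v: [] for v in net}
        paths = {v: 0 for v in net}
        dist = {v: -1 for v in net}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in net[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest actors, accumulating dependencies.
        delta = {v: 0.0 for v in net}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each pair is counted from both ends in an undirected network.
    return {v: score / 2 for v, score in scores.items()}


scores = betweenness(network)
most_between = max(scores, key=scores.get)
```

X comes out on top because every path between the two groups runs through it, but the score alone cannot say whether X is brokering information or holding it up. That is the gap the qualitative follow-up is meant to fill.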

Discussion:

  • the results suggest that network density and integration can be increased by structured and informal social and work based interaction; a model of combining workshops with social events and the use of social media reduces the isolation often experienced by the researcher, in particular the solitary, novice or practitioner researcher
  • increased network density and integration reduces the dependence of the network on a couple of actors, making the sustainability of the network more likely and increasing network capital – more likely that participants will be able to leverage potential benefits
  • potential drawbacks – a higher density of network structure and the formation of cliques may pose a barrier to incomers and increased homogenisation – homophily; it is critical to ensure that barriers to entry to the network remain low with a network of loose ties; individuals should be encouraged to play an active role in boundary spanning, ensuring innovation, opportunity and diversity of viewpoint
  • the challenge is to maintain the existing links and further develop the network so that it evolves into a self sustaining and continuously developing supportive community

Specific interventions used to increase and strengthen network ties over the course of the project included pre-event social meetups, a Twitter list, curation over the full event lifecycle, a Spruz community, participant led sessions, event reporters.

The role of event amplification in particular is interesting, an issue which keeps popping up and perhaps has potential in proving its ROI. Effective event coverage can in fact change the nature of an event, ensuring that participants can make the most of f2f interaction and are better able to reflect after the event. Alan Cann touches on this issue too in his recent post on the way forward for #solo13 – the conference as aggregator, building an online community of mutual support. The same goes for MOOCs, but the role of aggregation and curation is often overlooked.

Some #sna bits n bobs picked up from the paper:

Commonly measured network features:

  • size – at the actor level: the number of linkages an actor has; at network level: the total number of linkages in the network
  • reachability – the accessibility of points of the network based on a notion of path, ie the connected sequence of linkages by which it is possible to move from one point to another in the network; a point is reachable when there is a path between points
  • density – the degree to which actors are linked to one another; parts of a path are dense if each of its points is reachable from every other
  • centrality – the degree to which an individual actor is near others in the network and the extent to which the person lies on the shortest path between others and thus has potential for control over their communication
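These measures are easy to try out on a toy example. The following is a minimal sketch in plain Python rather than Ucinet; the five-actor network is hypothetical, and density and degree centrality use the usual textbook definitions for an undirected network (ties present divided by ties possible, and an actor's ties divided by the number of other actors).

```python
from collections import deque

# Hypothetical five-actor network, stored as undirected adjacency sets.
network = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
    "E": set(),  # an isolate with no linkages
}


def actor_size(net, actor):
    """Size at actor level: the number of linkages the actor has."""
    return len(net[actor])


def network_size(net):
    """Size at network level: total linkages (each tie is stored twice)."""
    return sum(len(neighbours) for neighbours in net.values()) // 2


def reachable(net, start, goal):
    """True if a path of linkages connects start to goal."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            return True
        for neighbour in net[current] - seen:
            seen.add(neighbour)
            queue.append(neighbour)
    return False


def density(net):
    """Ties present as a share of all ties possible between the actors."""
    n = len(net)
    possible = n * (n - 1) / 2
    return network_size(net) / possible if possible else 0.0


def degree_centrality(net, actor):
    """The actor's ties as a share of the other actors in the network."""
    n = len(net)
    return actor_size(net, actor) / (n - 1) if n > 1 else 0.0
```

Running the same functions on data from a first and a final workshop would give the kind of before and after comparison of density described in the findings.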

Examples of relational ties:

  • evaluation of one person by another – friendship, liking, respect
  • transfer of material resources – business transaction, lending, borrowing
  • association/affiliation – jointly attending the same social event, belonging to the same club
  • behavioural interaction – talking together, sending messages
  • movement between places or statuses – migration, social or physical mobility
  • physical connection – co-location at work
  • formal relations – authority
  • biological relations – kinship, descent
  • communication relations – sharing of publications, discussion of ideas

Example of network diagrams from Martin Hawksey:


#SRAconf: social media in social research

The Social Research Association’s conference on 24 June explored the value of socme to social researchers. The SRA is a membership body; I have to admit to being a bit vague about what a social researcher is, but never mind. Twitter: @TheSRAOrg.

Sessions:

Storify from social network reporter @commutiny (and report to follow), plus one from Eoghan O’Neill bringing up some useful points:

  • a ‘perception of privacy’ – platform specific? are users on Twitter more aware of their content being public than Facebook users? to what extent do people change their content and tone from platform to platform?
  • researching ‘issues’ – which issues are people  bothered enough about to talk about online; things that are controversial, fun, funny, cool, sexy, rapidly progressing, modern, topical or just generally interesting
  • difference between online and offline personas
  • even ‘elite’ users of twitter only use hashtags 60% of the time; using hashtags for research may miss crucial info
  • types of user – apprehensive passives, confident cavaliers, controlling cautionaries, savvy opinionators…

A report from a research consultancy has also popped up.

@Flygirltwo tweeted a Bluenod SNA of #SRAconf tweets. I’d forgotten about Bluenod. Quite fun, but not sure it tells you that much really, particularly as it only looks at the last (?) 300 tweets. Comparing #SRAconf with hot topic #letr, the latter is much more dispersed, as you might perhaps expect from a topic as opposed to an event:

#letr visualised by Bluenod