#edDDI: Digital Day of Ideas 2015

2016 update: #DigScholEd was liveblogged by Nicola Osborne. Keynotes from literary historian Ted Underwood on Predicting the past, a distant reading type approach to digital libraries, Lorna Hughes on Content, co-curation and innovation: digital humanities and cultural heritage collaboration, and Karen Gregory on Conceptualizing digital sociology.

Bumped/rewritten post – see below for brief mentions of #edDDI in 2014 and 2013 and other #digitalhss doings.

From the #digitalhss stable came Digital Day of Ideas 2015 (#EdDDI | TAGSExplorer – see graph) on 26 May, livetweeted, blogged and Storified by Lorna Campbell (@LornaMCampbell), with recordings of the talks to come.

Speakers and outputs:

Other #edDDIs:

#digitalhss in four keys: medicine, law, bibliography and crime, workshop on 12 November 2013, liveblogged by Nicola Osborne:

  • Digital articulations in medicine (Alison Crockford) – ah, the Surgeons’ Hall…seeks to illuminate the relationship between literature and medicine in Edinburgh through the development of a digital reader,  joining together not only the literary and medical spheres but also the rapidly expanding field of the digital and the medical humanities; interesting points on the nature of digihum and public engagement issues, see Dissecting Edinburgh for more
  • Rethinking property: copyright law and digital humanities research (Zhu Chen Wei) – the entrenched idea of copyright as an exclusive property regime is ill suited for understanding digihum research activities; how might copyright law respond to the challenges posed by digital humanities research, in particular the legality of mass digitisation of scholarly materials and the possible copyright exemption for text and data mining
  • Building and rebuilding a digital catalogue for modern Chinese Buddhism (Gregory Scott) – the Digital Catalogue of Chinese Buddhism is a collection of data on over 2300 published items with a web based, online interface for searching and filtering its content; can the methods and implications of working with a large number of itemised records, bibliographic or otherwise, be applied to other projects?; channelling Borges’ library of Babel 
  • Digitally mapping crime in Edinburgh, 1900-1939 (Louise Settle) – specifically an historical geography of prostitution in Edinburgh; used Edinburgh Map Builder, developed as part of the Visualising Urban Geographies project, which allows you to use National Library of Scotland maps, Google Maps and your own data; viz helps you spot trends and patterns you may not have noticed before;  for locations elsewhere in UK Digimap includes both contemporary and historical maps; Historypin uses historical photography to create maps, (EH4, plus come in #kierkegaard); see also the Edinburgh Atlas

See also the workshop on data mining on 19 November 2013.

#smwbigsocialdata: getting social at CBS

On 27 February the boffins at Copenhagen Business School (aka the Computational Social Science Laboratory in the Department of IT Management) opened their doors for Social Media Week with Big social data analytics: modelling, visualization and prediction. This was the second time CSSL has participated in #smwcph, with their 2014 workshop (preso) looking at social media analytics. See also my post on text analysis in Denmark.

Wifi access was not offered, resulting in only 19 tweets, but as many of these were photos of the slides I’m not really complaining. Also no hands-on this year, all in all a bit of a lacklustre form of public engagement.

Ravi Vatrapu kicked off the workshop with a couple of definitions:

  • What is social? – involves the other; associations rather than relations, sets rather than networks
  • What is media? – time and place shifting of meanings and actions

The CSSL conceptual model:


  • social graph analytics – the structure of the relationships emerging from social media use; focusing on identifying the actors involved, the activities they undertake, the actions they perform and the artefacts they create and interact with
  • social text analytics – the substantive nature of the interactions; focusing on the topics discussed and how they are discussed

It’s a different philosophy from social network analysis, using fuzzy set logic instead of graph theory, associations instead of relations and sets instead of social networks.

Abid Hussain then presented the SODATO tool, which offers keyword, sentiment and actor attribute analysis on Twitter and Facebook (public posts only, uses Facebook Graph API). Data from (for example) a company’s wall can be presented in dashboard style, eg post distribution by month.

Next, Raghava Rao Mukkamala explored social set analytics for #Marius and other social media crises. Predictions (emotions, stock market prices, box office revenues, iphone sales) can be made based on Twitter data.

Benjamin Flesch’s Social Set Visualizer (SoSeVi) is a tool for qualitative analysis. He has built a timeline of factory accidents and a corpus of Facebook walls for 11 companies, resulting in a social set analysis dashboard of 180 million+ data points around the time of the garment factory accidents in Bangladesh.

The dashboard shows an actor’s engagement before, during and after the crisis (time), which can also be analysed over space (how many walls did they post on). Tags are also listed, allowing text analysis to be undertaken.
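To make the sets-rather-than-networks idea a bit more concrete, here is a minimal sketch in Python. It is not SoSeVi's actual implementation: the actors, walls, dates and crisis window are all invented for illustration, and the function names are hypothetical.

```python
from datetime import date

# Hypothetical posts as (actor, wall, date). All values are invented.
posts = [
    ("ann", "BrandA", date(2013, 4, 1)),
    ("ann", "BrandB", date(2013, 4, 25)),
    ("bob", "BrandA", date(2013, 4, 26)),
    ("cat", "BrandA", date(2013, 6, 10)),
    ("ann", "BrandA", date(2013, 6, 12)),
]

# Assumed crisis window, chosen arbitrarily for the example.
CRISIS_START = date(2013, 4, 24)
CRISIS_END = date(2013, 5, 24)


def period(day):
    """Label a date as before, during or after the assumed crisis window."""
    if day < CRISIS_START:
        return "before"
    if day <= CRISIS_END:
        return "during"
    return "after"


def actor_sets(posts):
    """Time dimension: the set of actors active in each period."""
    sets = {"before": set(), "during": set(), "after": set()}
    for actor, _wall, day in posts:
        sets[period(day)].add(actor)
    return sets


def walls_per_actor(posts):
    """Space dimension: how many distinct walls each actor posted on."""
    walls = {}
    for actor, wall, _day in posts:
        walls.setdefault(actor, set()).add(wall)
    return {actor: len(found) for actor, found in walls.items()}


sets = actor_sets(posts)
# Plain set operations stand in for the graph metrics used in SNA.
engaged_throughout = sets["before"] & sets["during"] & sets["after"]
new_during_crisis = sets["during"] - sets["before"]
wall_counts = walls_per_actor(posts)
```

Intersections and differences of the period sets answer questions like "who was there all along?" and "who only turned up for the crisis?" without building a graph at all.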

Niels Buus Lassen and Rene Madsen then outlined some of their work with predictive modelling using Twitter. You have to buy into #some activity being a proxy for real world attention, ie Twitter as a mirror of what’s going on out in the market – a sampling issue like any other. Using a dashboard driven by SODATO they classify tweets using ensemble classifiers, for example to predict iPhone sales from 500 million plus tweets containing the keyword “iphone” (see CBS news story | article in Science Nordic).

They also used a very cool formula I nearly understood.

Last up, Chris Zimmerman gave an overview of CSSL’s new Facebook Feelings project, a counterpart to all those Twitter happiness studies. A classification of 143 different emotions on Facebook, based on mood mining from 12 million public posts, yikes. “Feeling excited” was the most popular feeling by far. Analysis can be done and correlations made on any number of aspects of the data, with an active | passive axis in addition to the positive | negative axis used in sentiment analysis. Analysis by place runs into the usual issue – only 5% of data has locality data.

Overview slides currently available from the URL below…

Engage 2014: a conversation on #some

Update: the AHRC/BBC R3 New Generation Thinkers 2017 have just been revealed – while it feels like a very Oxbridge endeavour, easy to mock, it does bring up some items of interest…public engagement doesn’t seem to be a thing (yet) in Denmark, but there is DR’s Rosenkjærprisen (Wikipedia), awarded annually for the best dissemination of a tricky topic.

…Three sessions at Engage 2014 focused on research communication and dissemination:

  • A Conversation with the public
  • Social media communities: challenges, lessons and opportunities for engagement with science
  • Attributes of digital engagers: academic identity and role in engaged research online

First up, A Conversation with the public. The Conversation (@ConversationUK) is a current affairs website offering “academic rigour and journalistic flair”, targeting a niche in the market with “in-depth, research informed, academic-led insight”.

Articles are written by experts, ie the individual researchers, with editors for eg Scotland. It’s fast moving, with eg a rolling response to the Chancellor’s Autumn Statement. Owned and governed by a trust, with a range of funders.

I’ve just signed off the newsletter (it’s been in ‘too much’ corner for a while, although the editor’s note is well done) and taken a proper look:

  • RSS feeds at faculty level
  • topics in abundance which you can follow via RSS or as a reader, but not controlled, limiting usefulness
  • bunging in Denmark brings up nine results, inc the usual suspects but rather deeper than the mainstream press; duly RSSd
  • republishing is invited – this is probably key
  • is there much conversation? you need to be registered as a reader to comment; on the blog (no RSS) there’s a weekly off topic space for general discussion, that must keep someone busy…have to wonder how much it is the same people talking to themselves

The Danish equivalent, Videnskab.dk (Facebook | Twitter) looks very similar, with faculty or higher RSS and topics, RSS hidden for those, but should work if you bung them in a reader. They also offer courses on research communication and formidling (~dissemination). Trying out the newsletter for now. Again, looks like overload, and wonder about usage levels.

No English spotted, but turns out there’s also something called ScienceNordic.com (Facebook | Twitter), set up in partnership with a similar service in Norway and covering a pretty broad definition of science including the ‘human sciences’. Duly newsletter’d for now, with an RSS feed for the society & culture section. If you prefer to engage aurally, there’s also Pod Academy.

Next up, Social media communities: challenges, lessons and opportunities for engagement with science with Oliver Marsh (@SidewaysScience). Social media offers a range of opportunities for public engagement, but what role should researchers play in these emerging spaces, and what skills and support might they need to engage effectively? Are gatekeepers still needed?

Finally, Attributes of digital engagers: academic identity and role in engaged research online – the potential for digital forms of communication to support and create opportunities for engaged research, with Trevor Collins (Open) and Ann Grand (@ann2_g). The session involved a Visitors and Residents mapping to explore how people engage in online places – see blog post for more.

Project #marius and infostorms

(Post copied from Danegeld blog, 4 Feb 2015.)

Update, 28 Feb 2015: I gave #SMWZOOSHITSTORM a wide berth as it would just make me cross, although the CBS team commented at another Social Media Week CPH event that the story keeps on giving. The event did yield up:

Updates: 2 April 2014: story in Berlingske on the research, plus perspectives of the day from Denmark and RoW. The CPH Post, who had their own Marius fool, reported that the Jobindex spoof was pulled at around noon due to complaints, but it still seems to be there…9 April: the Zoo’s comms guy tells his story…a peer reviewed article on the saga, Marius, the giraffe: a comparative informatics case study of linguistic features of the social media discourse, was presented at the ACM’s CABS 14 conference (abstract)

A team at Copenhagen Business School has taken a look at the use of social media around Copenhagen Zoo’s recent giraffe story:

See also Tableau visualisations and the timeline of events.

Research questions:

  • how did the conversation amplitude evolve?
  • where did negative sentiment originate and how did it evolve/spread?
  • who were the main actors – for some #sna see slides 20-23; Twitter bios showed a lot of vegans, activists etc (slide 19), well organised on #some
  • what types of posts and events instigated the issue online?
  • how did CPH Zoo handle the event on social channels and how did the social media storm affect their presence? – posted both in English and Danish on its Facebook and very successful in terms of check-ins, likes etc, but commentary very negative, mainly English (slide 24-26)
  • how did other organisations deal with the crisis?

Over 80% of the data came from Twitter. Highest buzz rate: 332 posts/minute, with a second short lived spike at 20K tweets/hr re the second Marius. 50% of tweets were retweets – a reflection of sentiment?
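For anyone wanting to work out a buzz rate and retweet share on their own archive, a minimal sketch follows. It assumes tweets arrive as (timestamp, text) pairs and that a retweet is any text starting with "RT @", which is a rough heuristic rather than the method the CBS team used; the sample tweets are invented.

```python
from collections import Counter
from datetime import datetime

# Invented sample data as (timestamp, text).
tweets = [
    ("2014-02-09 12:00:05", "RT @someone: Marius the giraffe..."),
    ("2014-02-09 12:00:40", "Can't believe the news about #marius"),
    ("2014-02-09 12:00:59", "RT @another: #marius"),
    ("2014-02-09 12:01:10", "What is going on at the zoo? #marius"),
]


def peak_per_minute(tweets):
    """Return (minute, count) for the busiest minute in the data."""
    counts = Counter(
        datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(second=0)
        for stamp, _text in tweets
    )
    return max(counts.items(), key=lambda item: item[1])


def retweet_share(tweets):
    """Fraction of tweets that look like retweets (text starts with 'RT @')."""
    if not tweets:
        return 0.0
    retweets = sum(1 for _stamp, text in tweets if text.startswith("RT @"))
    return retweets / len(tweets)


busiest_minute, busiest_count = peak_per_minute(tweets)
share = retweet_share(tweets)
```

Bucketing by the hour instead (replace minute as well as second with zero) gives the tweets/hr flavour of the same figure.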

Twitter offered a more direct reflection of events, in terms of volume and sentiment, and also demonstrated a more drastic reaction to network prestige factors from activists and celebs. Discourse on Facebook was different –  a more closed environment, with feelings expressed to family and friends and maybe the Zoo.

95% of the global conversation was in English, with Danish detected in only 2,220 posts. Differences in the Danish subset are particularly interesting (slide 11) – Twitter and Facebook only share 50% of the conversation – does mainstream media play a larger role in Danish society? Fewer RTs – #some used more to express oneself than to share information? But sentiment is also more neutral (slide 17), with more negative sentiment on Facebook (apart from that viral photo in support of the Zoo; ?Twitter penetration in Denmark lower, large subset of politicians, media etc).

Radian6 used for analysis, but came up short – pretty hopeless for the Danish data subset, and its automatic sentiment coding was “either super safe or super crap” (slide 16), neutral heavy, often failing to detect negative sentiment. 50 corporate communications students at CBS hand coded some data with rather different results. Much discussion over what is positive or negative in this case. Now starting to analyse YouTube comments.
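One simple way to see how far automatic and hand coding diverge is to line the two sets of labels up and count. The sketch below is my own illustration, not the CBS procedure: the labels are invented, and plain percentage agreement is just the most basic measure available.

```python
from collections import Counter

# Invented labels for the same six posts.
auto_coded = ["neutral", "neutral", "positive", "neutral", "negative", "neutral"]
hand_coded = ["negative", "neutral", "positive", "negative", "negative", "negative"]


def agreement(first, second):
    """Share of items given the same label by both codings."""
    if len(first) != len(second):
        raise ValueError("both codings must cover the same items")
    if not first:
        return 0.0
    return sum(a == b for a, b in zip(first, second)) / len(first)


def confusion(first, second):
    """Count each (first label, second label) pair."""
    return Counter(zip(first, second))


rate = agreement(auto_coded, hand_coded)
pairs = confusion(auto_coded, hand_coded)
# The ('neutral', 'negative') cell shows the pattern described above:
# the tool playing safe with neutral where a human reads negative.
missed_negative = pairs[("neutral", "negative")]
```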

Was #marius an infostorm? Infostorms, a new book from two researchers in Denmark (one chairman of the Danish Nudging Network), explores whether #some “amplifies irrational social behaviour and can manipulate minds and markets” (see press release).

Denmark’s utilitarian approach towards animals is out of step with the English speaking world in particular. Some rather less robustly scientific articles have been sighted lately, and this is a topic it will be interesting to track in the future. Here’s my collection of notable #marius stories for the record:

An image in support of the Zoo’s Director went viral on my Facebook timeline at least, and an ill advised tweet from actor Pilou Asbæk, one of the hosts for Eurovision 2014 in Copenhagen, went viral on Facebook (traces of both now deleted), but it is to be hoped that organisations representing Denmark are sensitive to the issues:

Copenhagen’s visitor card

Mapping #some

Update, Feb 2015: Tourists v locals: city heat maps showing geolocated tweets; tourists in CPH can be found in the city centre and at the airport, duh…but interesting concept! Here’s more…

Eric Fisher (Flickr | Twitter):

Cue #SoMe klaxon! Week 4 of #mapmooc looked at social media as spatial data, how social media can be used with maps, advantages and pitfalls…and just how easy it actually is to plot it on a map.

On Twitter few tweets are geotagged.  We’re up to a grand total of three in the #mapmooc TAGS archive – two by me plus:

But not:

See the difference in @asudell’s stream:

Drew

#vandymaps are also having issues:

Seems that tweets made with the web client only get geolocation information (coordinates) in TAGS if they are tagged individually, but not if the user has merely added location in Settings, which TAGS doesn’t collect (how about the vanilla Twitter API?). OTOH mobile apps, with inbuilt GPS, _do_ offer geocoordinates simply when location is turned on. At least I think that’s right – thanks to @derekbruff and @asudell for sorting this out!

(Update: @derekbruff has set up a #vandymaps archive, and is investigating geotagging tweets. Checking the #mapsmooc archive reveals that two of my own tweets, where I added location via the Twitter Web client, are the only ones with data in the geo_coordinates field. I’ve extracted the data from the user_lang field and will take a closer look PDQ.)
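Checking an archive for geotagged tweets can be scripted rather than eyeballed. A minimal sketch, assuming the TAGS sheet has been exported as CSV: geo_coordinates and user_lang are the fields mentioned above, while the from_user and text columns, the "loc: lat,long" format and the sample rows are my assumptions.

```python
import csv
import io
from collections import Counter

# Invented stand-in for a CSV export of a TAGS archive.
SAMPLE = """from_user,text,geo_coordinates,user_lang
alice,Week 4 maps! #mapmooc,"loc: 55.67,12.56",en
bob,Plotting tweets #mapmooc,,en
carol,Kort er sjove #mapmooc,,da
"""


def summarise(csv_text):
    """Return (total rows, geotagged rows, Counter of user_lang values)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    geotagged = [
        row for row in rows if (row.get("geo_coordinates") or "").strip()
    ]
    languages = Counter(row.get("user_lang", "") for row in rows)
    return len(rows), len(geotagged), languages


total, with_geo, languages = summarise(SAMPLE)
```

For a real export, read the file into a string first and pass that to summarise.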

But even a small set of tweets can offer potentially interesting results – see What’s happening in our vicinity from Field Office (an arts project currently going on in CPH) – a snapshot of geotagged tweets using the Streamd.in app, plus the Esri Public Information Map, in the week’s mapping assignment. This shows the real time effects of extreme weather events and other natural disasters, including geotagged social content from Twitter, Flickr, and YouTube. As noted in the forums however this is a rather blunt instrument with a poor signal to noise ratio.

Tweetmap Alpha is a further tool to filter geotagged tweets. As we know geotagging and privacy kinda go together. GeoSocial Footprint looks at the location information you divulge on Twitter in the light of potential privacy concerns. A footprint is made up of GPS enabled tweets, social check-ins, natural language location searching (geocoding) and profile harvesting. It states that “14 million tweets per day contain embedded GPS coordinates and up to 35% of all tweets containing additional location information”, which seems rather higher than in my experience.

Geolocating tweets the hard way

Back in lesson 1, it was noted that locations relevant to a particular tweet could include:

  • the locations mentioned in the message itself
  • the user’s location when they created the message
  • the user’s home location
  • the locations implied by the message

What are you plotting when you plot location? Where people live, where they work, where there is free wifi?

And from a thread, the following methods can be used to determine the spatial origin of tweets:

  • geolocation (geotags?)
  • Geo-IP and user designations (haven’t a clue)
  • the location from the user’s profile

So, there’s more to it than geotagging via GPS. See for example Tweak The Tweet, which uses “a hashtag-based syntax to help direct Twitter communications for more efficient data extraction”.
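To show why a hashtag-based syntax makes extraction easier, here is a minimal sketch of parsing one. The tag names in the example tweet (#need, #loc, #contact) are illustrative assumptions, not a statement of Tweak The Tweet's actual vocabulary, and the parser is mine rather than the project's.

```python
import re

# A hashtag followed by whatever free text comes before the next hashtag.
TAG_PATTERN = re.compile(r"#(\w+)\s*([^#]*)")


def parse_tagged_tweet(text):
    """Map each hashtag (lowercased) to the text that follows it."""
    fields = {}
    for tag, value in TAG_PATTERN.findall(text):
        fields[tag.lower()] = value.strip()
    return fields


# Invented example tweet.
example = "#flood #need drinking water #loc Main Street bridge #contact @helper"
parsed = parse_tagged_tweet(example)
location = parsed.get("loc")
```

The location here comes from what the sender typed after the tag, not from GPS, which is the whole point.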

A bunch of maps were presented on the forums, including a lone Facebook example (Mapping the world’s friendships), leading to extensive discussions on sentiment analysis and how it might/not work. Happy days!

For starters, at least three university projects use Twitter to understand [emotions] in the USA, including…

Other projects which may/not be connected to the above: Emography | Tweetfeel | Twittermood | We feel fine | Mappiness (UK). Enough already! Update, June 2014: Five Labs “analyzes your Facebook posts to predict the personalities of you and your friends”.

More clues on sophisticated methods IRT geolocation no doubt to be found in:

I could also do with:

A nice story to finish, in the warm up to week 5.  #mapmoocer Tony Targonski created a map of Seattle on an earlier Coursera MOOC: “Larger circles mean more social activity. Greener colour represents more “positive” than expected; redder is less “positive” than expected. In this case “positive” refers to valence (a commonly used measure of sentiment), and “expected” is the predicted valence score based on the walkability measure of the block (overall more walkable places correlate with more positive sentiment).”
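The "more positive than expected" idea boils down to a residual: fit a line predicting valence from walkability, then colour each block by whether it sits above or below the line. A minimal sketch with invented numbers follows; Tony's actual model and data are not described in enough detail to reproduce, so the straight-line fit is my assumption.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented blocks as name: (walkability score, observed valence).
blocks = {
    "A": (20, 0.35),
    "B": (50, 0.50),
    "C": (80, 0.40),
    "D": (90, 0.80),
}

walkability = [score for score, _valence in blocks.values()]
valence = [observed for _score, observed in blocks.values()]
slope, intercept = fit_line(walkability, valence)

# Residual = observed valence minus the valence expected from walkability.
residuals = {
    name: observed - (slope * score + intercept)
    for name, (score, observed) in blocks.items()
}
colours = {
    name: "green" if residual > 0 else "red"
    for name, residual in residuals.items()
}
```

So a very walkable block can still come out red if its tweets are gloomier than its walkability would suggest.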

Which is an interesting point IRT Happy Denmark. They’re not happy, they just bike a lot (like I didn’t know).

#mapmooc statistics week 4 (7-13 August):

  • 656 (558; 374; 206) tweets, 202 (181, 117, 82) RTs, 264 (212, 112, 61) links (all +/- due to time zone differences)
  • top tweeters: @MapRevolution, @DougOfNashville, @PublicUniverse
  • n=246 (230, 152, 129); 157 (155, 102, 74) have tweeted only once
  • 61 (54, 40, 30) threads, 9 (9, 11, 12)%
  • top conversationalists: @MapRevolution, @derekbruff, @annindk
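Weekly figures like these can be pulled from a tweet archive with a few counters. A minimal sketch, assuming each tweet is a dict with from_user and text keys (my naming), that retweets start with "RT @" and that a link is anything containing "http"; the sample tweets are invented.

```python
from collections import Counter

# Invented sample archive.
tweets = [
    {"from_user": "ann", "text": "Week 4 is about #SoMe #mapmooc"},
    {"from_user": "bob", "text": "RT @ann: Week 4 is about #SoMe #mapmooc"},
    {"from_user": "ann", "text": "Nice map http://example.com #mapmooc"},
    {"from_user": "cat", "text": "@ann agreed! #mapmooc"},
]


def weekly_stats(tweets):
    """Count tweets, retweets, links and tweeters for one week's archive."""
    per_user = Counter(tweet["from_user"] for tweet in tweets)
    return {
        "tweets": len(tweets),
        "retweets": sum(t["text"].startswith("RT @") for t in tweets),
        "links": sum("http" in t["text"] for t in tweets),
        "tweeters": len(per_user),
        "tweeted_once": sum(1 for n in per_user.values() if n == 1),
        "top_tweeters": [user for user, _n in per_user.most_common(3)],
    }


stats = weekly_stats(tweets)
```

Threads and conversationalists need reply data as well, so they are left out of this sketch.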

Postscript: among its rather nice web apps Esri offers a social media app (hopefully a bit more stable than the gallery app) plus stuff on making a social media map in minutes – come in! See the Horn of Africa Drought Crisis Map for an example.

As a quick test I took a look at Denmark’s most popular hashtag, #dkpol. Danes aren’t big tweeters, but they are big mobile users and #dkpol people are a pretty vociferous bunch, but the results were rather underwhelming. Putting #SoMe on a map seems to be less about creating a meaningful map and more about simply harvesting the data – see We are on Albert Drive for an example of what can be done. To be revisited.

#SRAconf: social media in social research

The Social Research Association’s conference on 24 June explored the value of socme to social researchers. The SRA is a membership body, have to admit to being a bit vague about what a social researcher is, but never mind. Twitter: @TheSRAOrg.

Sessions:

Storify from social network reporter @commutiny (and report to follow), plus one from Eoghan O’Neill bringing up some useful points:

  • a ‘perception of privacy’ – platform specific? are users on Twitter more aware of their content being public than Facebook users? to what extent do people change their content and tone from platform to platform?
  • researching ‘issues’ – which issues are people  bothered enough about to talk about online; things that are controversial, fun, funny, cool, sexy, rapidly progressing, modern, topical or just generally interesting
  • difference between online and offline personas
  • even ‘elite’ users of twitter only use hashtags 60% of the time; using hashtags for research may miss crucial info
  • types of user – apprehensive passives, confident cavaliers, controlling cautionaries, savvy opinionators…

A report from a research consultancy has also popped up.

@Flygirltwo tweeted a Bluenod SNA of #SRAconf tweets. I’d forgotten about Bluenod. Quite fun, but not sure it tells you that much really, particularly as it only looks at the last (?) 300 tweets. Comparing #SRAconf with hot topic #letr, the latter is much more dispersed, as you might perhaps expect from a topic as opposed to an event:

#letr visualised by Bluenod

#nsmnss: the story of a network

Updates: Dec 2013: tweetchat on defining #some: Storify | Huma Bird analysis…August 2013: see paper (26 pages, PDF) on developing the network; the section on the community of practice looks particularly interesting

On 23 April the NSMNSS network held a digital debate, the last I think of a series of events before funding runs out in May. I’ve written four posts about #nsmnss, and following the blog and Twitter stream has played a key role in my learning about research methods in relation to social media over the last year – thanks to the team!

The ‘one year on’ presentation gives some insights into the success of the network and its activities. In terms of statistics, there are now 451 fully signed up members (35% non-UK) with 77 in the Methodspace group, and @nsmnss has 1000+ followers (with 900+ tweets).

I particularly liked the way the network played around with the full spectrum of f2f and virtual events (two conferences, four knowledge exchange seminars with around 25 participants each, three online seminars, seven Twitter chats), for example holding tweetchats prior to f2f events. Plus the videos shown at the digital debate were from the previous week’s conference. This hybrid/flipped events model could work well in other fora.

It is hoped to sustain the network after funding runs out – this presumably has the biggest impact on f2f events, but in the era of social media it should be feasible to carry on some activities. A poll is calling for volunteers to get involved in projects, take responsibility for organising Twitter chats, develop resources or deliver training. A test of the strength of the network!

A range of platforms was used – perhaps too many (home page vs blog vs Methodspace anyone?). One way of streamlining activities would be to slim these down and perhaps change the ratio of curation to content – another task which could be done by volunteers, assuming the Twitter account is to carry on.

Finally, the dog food question: is any social network analysis or other research planned as part of the network evaluation?