Place names were coded in XML and converted to a GIS, allowing mentions to be compared. Other features mapped included emotional response (on a scale of 1-10) and physical characteristics such as altitude. Photos from Flickr were also incorporated. The end result permitted close reading of the text alongside a map of the area described. Next up is a corpus of Lake District writing for the period up to 1900: over a million words from 80 texts.
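As a rough illustration of the XML-to-GIS step, here is a minimal sketch in Python. The `<place>` tag, its `name` attribute, the sample sentence and the approximate coordinates are all assumptions made up for the example; the project's actual markup and gazetteer are not described in the talk notes.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical markup: the <place> tag and "name" attribute are assumptions,
# not the project's real schema.
SAMPLE = """<text>
  <p>We left <place name="Keswick">Keswick</place> and walked to
  <place name="Grasmere">Grasmere</place>, returning to
  <place name="Keswick">Keswick</place> by evening.</p>
</text>"""

# Illustrative, approximate (lat, lon) pairs; not taken from the project.
COORDS = {"Keswick": (54.60, -3.13), "Grasmere": (54.46, -3.02)}


def place_mentions(xml_string):
    """Count how often each tagged place name is mentioned."""
    root = ET.fromstring(xml_string)
    return Counter(el.get("name") for el in root.iter("place"))


def to_features(counts, coords):
    """Turn mention counts into GeoJSON-style point features for a GIS."""
    features = []
    for name, n in sorted(counts.items()):
        if name not in coords:
            continue  # skip places we cannot locate
        lat, lon = coords[name]
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name, "mentions": n},
        })
    return features


counts = place_mentions(SAMPLE)
features = to_features(counts, COORDS)
```

Features of this shape can be wrapped in a `FeatureCollection` and loaded into most GIS tools, with the `mentions` property driving symbol size or colour.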
Next, geographical text analysis:
Claire Grover (of Trading Consequences) has worked on georeferencing, ie identifying all the place names automatically (?), pulling them out of the text and linking them up to a gazetteer to give them a point location on a map. The work georeferenced 17,667 instances of places mentioned in the Registrar General’s Reports, 1851-1911 (2 million words; Histpop). Recall was 81%, precision 82%, and 75% were correct with locality. Mapping the instances and smoothing gave a pretty good reflection of major population centres in England, though with Bedford as an outlier cluster.
Analysing ‘London’ turned up high z-scores for terms relating to water supply/quality, whereas the Liverpool/Manchester cluster was more descriptive of diseases, with no discourse on water supply. Exploring causes of death and mapping collocations with place names led to the following conclusions:
Geographical text analysis can help to understand the geographies within a corpus. At the moment we only have recall and precision statistics of about 80%, but this will get better, and even if it doesn’t, you still have most of the place names within a text. Bringing together statistical summaries from corpus linguistics and micro/close readings helps you understand what’s going on within a text, and decide which parts you perhaps need to close read and which parts you can ignore.
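To make the recall and precision figures a bit more concrete, here is a minimal sketch of gazetteer lookup and scoring in Python. It is a toy string-matching approach, not the pipeline used in the work described above; the gazetteer entries, approximate coordinates, sample sentence and hand-marked "gold" answers are all invented for illustration.

```python
import re

# Toy gazetteer with approximate (lat, lon) values; purely illustrative.
GAZETTEER = {
    "London": (51.51, -0.13),
    "Liverpool": (53.41, -2.98),
    "Lancaster": (54.05, -2.80),
}


def georeference(text, gazetteer):
    """Return (name, character offset) for every gazetteer name found in text."""
    found = []
    for name in gazetteer:
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            found.append((name, m.start()))
    return sorted(found, key=lambda hit: hit[1])


def precision_recall(predicted, gold):
    """Precision = correct hits / all hits; recall = correct hits / all true places."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


TEXT = "Deaths rose in Liverpool and in London; Dr Lancaster reported from Manchester."

# Hand-marked places in TEXT: "Lancaster" here is a surname, not a place,
# and "Manchester" is a real place missing from the toy gazetteer.
GOLD = [(n, TEXT.index(n)) for n in ("Liverpool", "London", "Manchester")]

predicted = georeference(TEXT, GAZETTEER)
precision, recall = precision_recall(predicted, GOLD)
```

In this toy case the surname counts against precision and the missing gazetteer entry counts against recall, so both come out at two thirds.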
More on georeferencing place names, from Putting big data in its ‘place’…the power and value of amalgamating and querying content by ‘place’ have long been recognised through the use of place name gazetteers. However, these have limitations, as they tend to record only modern place names and lack spatial resolution. A number of initiatives aimed at extending the scope of modern gazetteers include:
- Historical gazetteer of England’s place names
- Welsh historical gazetteer project
- Pelagios (antiquity and the Middle Ages)
Some spatial humanities links:
- Spatial Humanities (Lancaster) | resources
- Locating London’s past
- A text analytic approach to rural and urban legal histories (Adam Wyner; Aberdeen)
- Trading Consequences | white paper | access the data | lexical resources | Text mining 19th century place names
- Toward spatial humanities: historical GIS and spatial history – eds Ian Gregory and Alistair Geddes
- A vision of Britain through time | GBHGIS | Historical GIS Research Network
- Visualisation techniques and GIS to analyse letter correspondence | Mapping Norman Nicholson’s network – see Stanford’s Mapping the Republic of Letters
See also my post on Telling stories with maps: literary geographies.