Applying internet-based research methods to transform a paper archive of cross-cultural data into a digitally coded open-access research database: Details, ideas and discussion.
PRESENTER: Dr. Kimberly A. Jameson (Institute for Mathematical Behavioral Sciences at the University of California, Irvine) http://aris.ss.uci.edu/~kjameson/kjameson.html
ABSTRACT: One of the most widely cited datasets in Psychology is the “World Color Survey” (known as the WCS; see http://www.icsi.berkeley.edu/wcs/data.html). The WCS database stems from the classic 1969 theory of cross-cultural color categorization published by Berlin and Kay (Berlin, Brent and Paul Kay, 1969, Basic Color Terms: Their Universality and Evolution. Berkeley and Los Angeles: University of California Press). The WCS contains data from 110 linguistic societies from across the globe. The raw WCS data were painstakingly hand-transcribed and digitized over a number of years by just a few researchers, and gradually became publicly available beginning in 2005. By about 2009 the WCS database was complete, and over the last 5 years the WCS has seen a substantial rise in impact, being widely referenced across a range of psychological, linguistic, and cognitive science disciplines. Happily, an equally valuable extension of the WCS database exists in the MesoAmerican Color Survey (or MCS). The MCS was conducted in the early 1980s by Dr. Robert E. MacLaury, a student and colleague of Berlin and Kay’s. The MCS data collection methods were similar to those of the WCS; however, the MCS covers a different set of languages, as it includes 116 surveyed societies from MesoAmerica. Thus the MCS database substantially extends the scope and wealth of information provided by the widely cited WCS. Unfortunately, the MCS data are currently not in digital format. So although R. E. MacLaury published a theory and analyses of the MCS data in his 1997 book Color and Cognition in Mesoamerica, the thousands of pages of raw MCS survey data have never been publicly available, and up to now have been locked away in handwritten paper format. Here I present a recently funded project to transcribe the MCS data into an open-access digital database for general use by students and research scientists.
This new in-progress project aims to use the power of the web, employing internet-based research methods to address the challenges inherent in the labor-intensive transcription and coding of the MCS data. The end goal, for which suggestions and comments are warmly welcomed, is to take the MCS archive from handwritten pages to a usable, digitally encoded format as quickly and efficiently as possible.