Crowdsourced Collections: A Democratic Approach


I came across a project of Berry College and Bloomsburg University this week called the Martha Berry Digital Archive.  The site is built on Omeka and a plugin called “Crowd-Ed” that the developers created for it.  The plugin allows the description of the entire collection to be crowdsourced.  In other words, the archive digitizes the documents, and users annotate, tag, and fill in the metadata for each item.

Crowdsourcing has been gaining popularity recently and has become a way for archives to digitize large amounts of material fairly quickly and at minimal cost.  The National Archives and Records Administration began doing this in January 2012 and has allowed users to transcribe, tag, and edit documents online.  Later in 2012, the technology used by the National Archives was released as open source.  The overwhelming popularity of crowdsourcing with the public has led many other archives to begin doing the same thing.  Ancestry.com used it to transcribe the 1940 census (although they then charged for access to the census, which is not cool).

So what are the advantages and disadvantages for archives that want to crowdsource their collections?

Advantages:

  1. It offsets costs.  Archives still have to pay to digitize their collections, but many are doing this anyway.  Crowdsourcing the transcription and tagging of the digitized copies might help to decrease the overall cost of digitization a little bit.

  2. It engages the public.  Allowing the public to transcribe, edit, and tag a collection of documents gets them involved and interested.  Even if someone only works on transcribing the documents for an hour, they’ve explored the website, probably learned something, and are, in the best-case scenario, more interested in continuing to explore something related to the topic.  For a museum like the Smithsonian, which now uses the software developed by the National Archives, it probably encourages people to come out to the museums.  Meredith Stewart, an archivist at the National Archives, told a Forbes reporter that when a university archive “had folks get involved in transcribing, they started to get more donations, people really got engaged, their website traffic went up.” [1. “Crowdsourcing Technology Offers Organizations New Ways To Engage Public In History.” Forbes. Accessed September 22, 2013.]

  3. It creates a community of users who are interested, engaged, and invested in helping your archive or museum.  Many smaller museums and archives have to battle severe budget issues and are always looking for donors.  Crowdsourcing creates a community of users who are, through participation, invested in the success of the archive and the preservation of its collections.  Even if all the museum or archive gets is increased traffic to its exhibitions…that’s a win.

Disadvantages:

  1. The content is only as accurate and as detailed as your users.  The downside to crowdsourcing is that it can get disorganized very fast.  Users are invited to tag documents, but if one person tags something with “African American” and another with “African American(s)”, you end up with two separate tags for documents that should really be grouped together.  (There might be some fancy solution to this that I am just not familiar with.)  It would be very easy for the structure of the archive to get out of hand very quickly.  I guess one way to combat this would be to have an editor check every document before the changes made by the user are approved, but that takes a lot of staff time.

  2. The technology can be expensive and difficult to implement.  Sure, Drupal and Omeka are free, but getting a team of programmers to create a crowdsourcing platform for you isn’t.  Building a website that will hold possibly hundreds of thousands of documents takes some serious development and the right infrastructure.  For many archives and museums this would probably require a grant or a very generous donation.  The Martha Berry Digital Archive has made its “Crowd-Ed” plugin open source, but it still lacks documentation and would take a good amount of time and knowledge to get running.
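
On the duplicate-tag problem from the first disadvantage: one common way to reduce it is to normalize each tag to a canonical key before storing it, so that small variations in case, spacing, or an optional “(s)” collapse into one group.  Here is a minimal Python sketch of that idea.  The function names `normalize_tag` and `group_tags` and the normalization rules are my own assumptions for illustration; they are not part of Omeka or Crowd-Ed.

```python
import re
from collections import defaultdict


def normalize_tag(tag: str) -> str:
    """Reduce a user-entered tag to a canonical key (hypothetical rules)."""
    key = tag.strip().lower()
    key = re.sub(r"\(s\)", "", key)         # drop the optional-plural marker "(s)"
    key = re.sub(r"[^\w\s-]", "", key)      # drop any remaining punctuation
    key = re.sub(r"\s+", " ", key).strip()  # collapse runs of whitespace
    return key


def group_tags(tags):
    """Map each canonical key to the set of raw variants users entered."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize_tag(tag)].add(tag)
    return dict(groups)


# Example: three spellings of one tag, plus an unrelated tag.
raw_tags = ["African American", "African American(s)", "african  american", "Letters"]
groups = group_tags(raw_tags)

for key, variants in sorted(groups.items()):
    print(key, "<-", sorted(variants))
```

This only catches surface differences.  True plurals (“African Americans”) or synonyms would still need something more, such as a stemmer, a controlled vocabulary, or an editor merging tags by hand.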

Personally, I think the advantages of crowdsourcing far outnumber the disadvantages, and I think it is an excellent way to get the public involved with history.  The internet has the potential to make history more democratic, more open, and more accessible.  Crowdsourcing is a gigantic step toward this.

Here are some other cool examples of crowdsourcing: