Decoding the Civil War: Phase 2, Two Work Flows, Your Choice
After a year of hard work by our volunteers on Decoding the Civil War, Phase 1, we are nearly ready to launch Phase 2, the marking of metadata within specific telegrams. This task has two work flows. The first work flow, Code Words, involves marking the arbitraries, or code words, in those messages that are in code. These coded telegrams will then be fed into Phase 3, the final decoding of the telegrams. Having the arbitraries already marked should make the process of decoding much faster, possibly with the aid of computer algorithms.
The second work flow, Metadata, is a little more ambitious and complex, as we are asking our volunteers to work with individual telegrams, identifying specific metadata such as the sender, recipient, date sent, time received, etc. We are asking for up to 10 fields of metadata; most telegrams have only a few, and rarely do they have all 10. What we wish to provide is simple metadata that will enable researchers to find all the telegrams to, say, Secretary of War Edwin M. Stanton, no matter whether they are in Ledger 2, or 6, or 22.
But didn't Phase 1 already enable full-text searching? Yes it did, and it is wonderful, but the transcriptions are accurate to the text as written in the ledger. Keeping with Stanton, if you type “Stanton” into the search box, you get those pages whose transcribed text contains “Stanton.” But what if the telegram begins or ends with “EMS” or “Stantin” or “the Secretary of War”? The full-text search would miss those pages. Furthermore, such a search looks at the whole message and returns results for any mention of “Stanton,” including other people named Stanton or places named Stanton. What if you want to look for Stanton only as the recipient? A search limited to the “recipient” metadata field would make that possible and give you the correct results.
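The difference between the two kinds of search can be sketched in a few lines of Python. This is only an illustration: the sample telegrams, the field name `recipient`, and both function names are invented here and do not reflect the project's actual data or search software.

```python
# Hypothetical sample records; the text and field values are invented for illustration.
telegrams = [
    {"id": 1, "recipient": "Stanton", "text": "To Stanton: troops moving at dawn"},
    {"id": 2, "recipient": "EMS", "text": "EMS, supplies arrived safely"},
    {"id": 3, "recipient": "Grant", "text": "Grant: Stanton approves the plan"},
]


def full_text_search(records, term):
    """Return ids of records whose transcribed text contains the term."""
    term = term.lower()
    return [r["id"] for r in records if term in r["text"].lower()]


def field_search(records, field, value):
    """Return ids of records whose metadata field exactly matches the value."""
    value = value.lower()
    return [r["id"] for r in records if r.get(field, "").lower() == value]


# Full text finds any mention of Stanton, including telegram 3, which is addressed to Grant.
full_text_hits = full_text_search(telegrams, "Stanton")

# The field search returns only telegrams whose recipient was written as "Stanton".
recipient_hits = field_search(telegrams, "recipient", "Stanton")

print(full_text_hits)   # [1, 3]
print(recipient_hits)   # [1]
```

Note that neither search finds telegram 2, which is addressed to “EMS.” A field search is only as good as the terms in the field, which is why variant forms of a name still need to be reconciled.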
To aid in that search we will take the metadata tagged by the volunteers in Phase 2 and standardize the terms. So, continuing with Stanton, if a telegram names “EMS” and volunteers tag that name as a sender or recipient, we will be able to take the consensus term and edit it to the standardized form of “Stanton, Edwin M. (Edwin McMasters), 1814-1869.” Once all the telegrams are tagged and the fields edited, if you do a specific search for “Recipient” as “Stanton, Edwin M. (Edwin McMasters), 1814-1869,” you will get only those telegrams to Stanton, not from or about him, and you will get them whether they were sent to him as “EMS,” “Stanton,” or “Stantin.”
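As a rough sketch of that standardization step, the Python below picks a consensus term from several volunteers' tags and then maps it to a standardized heading. The majority-vote rule, the lookup table, the telegram numbers, and the function names are all assumptions made for illustration; the project's actual consensus and editing process may work quite differently.

```python
from collections import Counter

STANTON = "Stanton, Edwin M. (Edwin McMasters), 1814-1869"

# Hypothetical lookup table from variant spellings to the standardized heading.
AUTHORITY = {
    "ems": STANTON,
    "stanton": STANTON,
    "stantin": STANTON,
}


def consensus(tags):
    """Return the most common tag after trimming and lowercasing (assumed rule)."""
    cleaned = [t.strip().lower() for t in tags if t.strip()]
    if not cleaned:
        return None
    return Counter(cleaned).most_common(1)[0][0]


def standardize(term):
    """Map a consensus term to its standardized form; unknown terms pass through."""
    return AUTHORITY.get(term, term)


# Hypothetical recipient tags from three volunteers for each of three telegrams.
volunteer_tags = {
    101: ["EMS", "EMS", "E M S"],
    102: ["Stantin", "Stantin", "Stanton"],
    103: ["Grant", "Grant", "Grant"],
}

recipients = {tid: standardize(consensus(tags)) for tid, tags in volunteer_tags.items()}

# A recipient search on the standardized heading now finds both variant spellings.
to_stanton = [tid for tid, name in recipients.items() if name == STANTON]
print(to_stanton)  # [101, 102]
```

The lookup table is the part that requires human editing: someone has to decide that “EMS” and “Stantin” both refer to the same person before the search can treat them as one.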
The tagging of individual telegrams in the Phase 2 Metadata work flow will eventually enable specific searches across the almost 16,000 telegrams. It will enable users to look for individuals or places or dates in specific fields. And the tagging of code words (arbitraries) in the Code Words work flow will help round out this project with the final decoding of encoded telegrams. An incredibly useful archive has been made available in Phase 1 of Decoding the Civil War. Help us leverage and categorize that hard-earned knowledge in Phase 2 to aid in the discovery of the American Civil War.
The Beta Test site for Phase 2 is here. The original site can be seen here.