Archival Data Writeup: Log of the Felix Discovery Vessel
The dataset under examination is a logbook kept in 1850-1851 during a voyage of the Felix Discovery Vessel. Each entry contains the following labeled elements:
- H: Each hour of the ship’s 24-hour day. The first 12 hours are the PM hours of the preceding calendar day, and the second 12 hours are the AM hours of the calendar day itself.
- Courses: The compass course steered by the helmsman during that hour, expressed in terms of the 32 “points” of the Mariner’s Compass.
- K: The ship’s headway, or speed through the water, in knots, as measured by casting the log-line every hour.
- F: The depth of the water in “Fathoms” as measured by casting the lead-line.
- Winds: The direction from which the wind was blowing. Like the courses, the winds shown in the log were expressed relative to compass north.
- Lee Way: The drift off the line of the keel, due to the effect of the wind upon the hull and sails.
- General notes (not labeled): The space left for additional information. The handwriting is largely illegible, but the notes appear to be tied to specific dates and times, and the space is split between AM and PM. Traditional logbooks had a “Remarks” column, which always began at 1 PM on the new ship’s day with a comment on the strength of the wind.
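As a rough illustration, the columns listed above could be represented as one record per hour once they are transcribed. The sketch below is an assumption about how a digitized entry might look, not the format of the actual dataset: the class `LogEntry`, its field names, the abbreviations in `POINTS`, the helper `point_to_degrees`, and the sample values are all hypothetical. The only fixed fact it relies on is that 32 compass points divide the circle into steps of 11.25 degrees.

```python
from dataclasses import dataclass
from typing import Optional

# The 32 points of the mariner's compass, clockwise from north.
# These abbreviations are one common convention; the logbook may use others.
POINTS = [
    "N", "NbE", "NNE", "NEbN", "NE", "NEbE", "ENE", "EbN",
    "E", "EbS", "ESE", "SEbE", "SE", "SEbS", "SSE", "SbE",
    "S", "SbW", "SSW", "SWbS", "SW", "SWbW", "WSW", "WbS",
    "W", "WbN", "WNW", "NWbW", "NW", "NWbN", "NNW", "NbW",
]


def point_to_degrees(point: str) -> float:
    """Convert a compass point to degrees clockwise from north (360 / 32 = 11.25 per point)."""
    return POINTS.index(point) * 11.25


@dataclass
class LogEntry:
    """One hourly row of the log; blank cells are stored as None."""
    hour: int                        # H: hour of the ship's day, 1-24
    course: str                      # Courses: compass point steered
    knots: Optional[float] = None    # K: speed through the water
    fathoms: Optional[float] = None  # F: sounded depth
    wind: Optional[str] = None       # Winds: direction the wind blew from
    lee_way: Optional[str] = None    # Lee Way: drift off the line of the keel
    remarks: str = ""                # unlabeled general notes


# Invented sample values, for illustration only.
entry = LogEntry(hour=3, course="NEbE", knots=5.0, wind="SSW")
print(point_to_degrees(entry.course))  # 56.25
```

Storing the course as the written compass point, and converting to degrees only when needed, keeps the digital record close to what the navigator actually wrote.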
Maintaining this information was vital for keeping track of the vessel’s location at sea. Because there was no GPS technology at the time, the distance traveled, the distance remaining to the destination, and the current position had to be calculated carefully in order to stay on course without exhausting supplies or getting lost. The ship’s log ran for the 24 hours of the ship’s day, from noon on the preceding calendar day to noon on the day of the entry.
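To make the ship’s day and the position bookkeeping more concrete, here is a minimal sketch under stated assumptions. Following the description of the H column, it treats hour 1 as the hour ending at 1 PM on the calendar day before the entry’s date. The function names `civil_time` and `day_run`, the sample date, and the flat-sea simplification (no lee way, no currents, courses already converted to degrees) are assumptions for illustration, not the crew’s documented method.

```python
import math
from datetime import date, datetime, time, timedelta


def civil_time(ship_date: date, h: int) -> datetime:
    """Calendar time at which hour `h` (1-24) of the ship's day ends.

    Assumes H = 1 is the hour ending at 1 PM on the preceding calendar day.
    """
    if not 1 <= h <= 24:
        raise ValueError("H must be between 1 and 24")
    start = datetime.combine(ship_date, time(12)) - timedelta(days=1)
    return start + timedelta(hours=h)


def day_run(legs):
    """Sum hourly (course_degrees, knots) pairs into (northing, easting) in nautical miles.

    One hour at k knots covers k nautical miles. Lee way and currents are ignored.
    """
    northing = easting = 0.0
    for course_deg, knots in legs:
        angle = math.radians(course_deg)
        northing += knots * math.cos(angle)
        easting += knots * math.sin(angle)
    return northing, easting


# Invented date, for illustration only.
print(civil_time(date(1850, 7, 10), 1))   # 1850-07-09 13:00:00
print(civil_time(date(1850, 7, 10), 24))  # 1850-07-10 12:00:00
```

Summing each hour’s course and speed in this way shows why a single misread value in the K or Courses column would carry through to the whole day’s position.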
A major theme raised by this dataset is honoring the exhaustive efforts of a vessel’s crew during the centuries in which navigation had to be done manually, through record keeping and computation without computers. We may ask ourselves whether it is ethical to digitize and share the logbook, to which I respond with a firm “absolutely”. The devotion shown by the crew deserves recognition even more than a century and a half later, especially for a task as intimidating as open-sea sailing. Modern technology is often underappreciated, as many elders like to remind us, and although present-day sailing still involves extensive record keeping and laborious tasks, mistakes no longer carry the same severity. The pressure on the captain and the navigation crew was therefore likely overwhelming, to say the least. The dataset as a whole tells the story of a year-long journey at sea, recounting crucial events, most noticeably the crew’s struggles with unpredictable weather.
Unfortunately, vital elements are likely lost during digitization. Although our dataset appears to be relatively complete, we can consider other collections of the same nature. Over time, logbooks became more technical, and in that transition they lost the personal touch that an older one carries. Let us compare two examples. Example 1 is a logbook from 1853 for the USS Fenimore. The author kept a handwritten, journal-style record, much as one would keep a diary. Example 2, on the other hand, is a typed logbook from 1944 for the USS Montgomery, recorded in a more organized format but with more technical vocabulary, making it difficult to understand for those unfamiliar with the jargon. Its entries are also significantly shorter than those in both logbooks from the 1800s. During the 20th century, vessels became much more complex and sophisticated, as did the technology that aided sailing. It is likely that keeping track of the ship’s position and path by hand became less demanding. Consequently, as readers, we feel less connected to the captain, but at least we are finally able to read the notes, even if we can’t quite understand them!
Let us consider what our logbook implies about which paper datasets are salvaged. The volume of information grows exponentially, and with so much to record and use for educational purposes, scholars must be selective about what they dedicate hours to saving. Before we analyze the standard for what is salvageable, we must recognize that paper datasets may be personally or professionally stored. Information that is private to someone, especially someone who is not a public figure, is likely to be personally stored; an example would be a ticket receipt. Such an item was probably preserved by family members rather than researchers because it carries some personal significance, and digitizing it immortalizes the value the receipt holds for those to whom it belongs. A dataset with historical value and context, on the other hand, is likely to be professionally preserved for scholarly or informational purposes. An example would be an Enumeration of Confederate Soldiers, Sailors, and Widows of a certain county. This dataset is important not only for the families of those who lived through this historical event, but also for those who study it later. Records like these give great insight into the circumstances and consequences of major events, and so give us a more precise picture of what occurred in the past. The collection of slavery advertisements found in the Grimkés’ and Weld’s American Slavery As It Is perfectly captures the weight of careful recordkeeping and how it can push an evolving society in a better direction. The Grimké sisters selected real advertisements seeking runaway slaves, along with other newspaper excerpts that essentially confessed to the cruelties Black people suffered during this time, thus creating one of the most revolutionary abolitionist works in history.
Hence, we may finally consider which paper datasets are typically worthy of being saved. They have at least some historical significance, contain organized and properly formatted information, and serve a purpose: either to enlighten others or to share earlier knowledge that inspires the improvement of current conditions.
- Digitization Plan
A long record like our year-long logbook would beyond a doubt take considerable time to digitize. The first step in digitizing a document is determining what condition it is in and whether it is worth devoting long hours to saving. Luckily, our logbook appears to have been kept in fairly good shape through its lengthy lifetime. Since the logbook covers a year’s worth of entries, the next step would be to scan it one page at a time, adjusting the resolution and lighting along the way. During this time, one would have to be careful not to damage any pages and to check that none are missing, torn, or stuck together. If we really wanted to be fancy with it, we’d clean up any stains, fading, or other abnormalities in our scans. Next, we’d convert the scanned images into digital data. This would probably require text-recognition software and a user with very good eyesight who can make out any characters the computer is doubtful about. We must ensure that whatever program is used to transcribe the information does not mistake one value for another. If we really care about the precision of the document, we’d have to transcribe it manually or review it in its entirety after the software of choice has identified all it can. This would definitely be required for the general notes, which would have to be examined by someone proficient in reading traditional cursive. Then, before manipulating the data any further, we’d save it to cloud storage so that we never have to process the whole book all over again. Finally, once it is on our computer, we may organize it and structure our data with metadata and a graphical interface. Metadata would allow us to explore the document more efficiently, enabling us to search by date, subject, or whatever sections we divide our logbook into. For this particular logbook, I believe that organizing by date and by column title would be the most useful.
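The metadata idea can be sketched as a small index with one record per scanned page. Everything here is hypothetical and meant only to show the shape of the approach: the `PageRecord` class, its fields, the file names, and the sample dates are assumptions, and a real project would likely follow an established archival metadata standard instead.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PageRecord:
    """Metadata for one scanned page (all field names are assumptions)."""
    page: int
    ship_date: date
    scan_file: str
    columns: tuple
    reviewed: bool = False  # True once a person has checked the transcription


def index_by_date(records):
    """Group page records by the ship's date so pages can be looked up by day."""
    index = {}
    for rec in records:
        index.setdefault(rec.ship_date, []).append(rec)
    return index


def needs_review(records):
    """Page numbers whose transcription has not yet been checked by a person."""
    return [rec.page for rec in records if not rec.reviewed]


COLUMNS = ("H", "Courses", "K", "F", "Winds", "Lee Way", "Remarks")

# Invented sample records, for illustration only.
records = [
    PageRecord(1, date(1850, 7, 10), "page_0001.tif", COLUMNS, reviewed=True),
    PageRecord(2, date(1850, 7, 11), "page_0002.tif", COLUMNS),
]

index = index_by_date(records)
print([rec.page for rec in index[date(1850, 7, 11)]])  # [2]
print(needs_review(records))                            # [2]
```

Keeping a `reviewed` flag on each page is one simple way to track which transcriptions, especially the cursive general notes, still need a human reader.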
An interface would allow us to interact with the document, although not much visualization would be required since the logbook is rather repetitive. If we estimate a total of 400 pages, I’d guess that the process would require about 60-80 hours, with 40 of those hours dedicated solely to scanning the document and the rest to editing and refining.
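The estimate above can be checked with a little arithmetic. The figures (400 pages, 40 scanning hours, 60-80 total hours) come straight from the guess in the text; the helper name `minutes_per_page` is an illustrative assumption.

```python
PAGES = 400             # estimated page count
SCAN_HOURS = 40         # hours set aside for scanning
TOTAL_HOURS = (60, 80)  # low and high estimates for the whole process


def minutes_per_page(hours, pages):
    """Average minutes available per page for a given number of hours."""
    return hours * 60 / pages


scan_rate = minutes_per_page(SCAN_HOURS, PAGES)
edit_hours = tuple(total - SCAN_HOURS for total in TOTAL_HOURS)
edit_rates = tuple(minutes_per_page(h, PAGES) for h in edit_hours)

print(scan_rate)   # 6.0
print(edit_hours)  # (20, 40)
print(edit_rates)  # (3.0, 6.0)
```

In other words, the guess allows about 6 minutes to scan each page and another 3 to 6 minutes per page for editing and refining.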
The description of each element is taken from Peter Reavely’s online article “Navigation and Logbooks in the Age of Sail”. Reavely has spent a lifetime in the aviation industry and is currently a historical researcher for the Ocean Technology Foundation and the Underwater Archaeology Branch of the U.S. Naval History and Heritage Command. His full credentials can be found at the top of that page.