While digitizing historical documents ensures that their legacy will live on, the digitization process comes with certain downsides, such as losing crucial information that can only be seen when viewing the original physical document. There are varying levels of digitization, and each successive level discards more information. When a document is scanned so that its images can be viewed online, all the data and information is still there, but being unable to touch the paper, feel the crinkled corners, or view the indentation of the ink causes the reader to lose sight of the time period in which the document was created and the amount of effort that went into creating it. Furthermore, when a historical data set is transcribed into a computerized spreadsheet, many of the small details are overlooked entirely and much of the document’s historical importance is lost. This concept can be seen clearly in the digitization process for James Madison’s meteorological journal.

In 1784, James Madison embarked on a four-year project to document weather patterns on his plantation, Montpelier. His meteorological journal is organized around twelve primary categories[1] that cover some of the most important aspects of nature on the plantation, including weather patterns, animal behavior, plant activity, and several other things that Madison felt were important to track. This was the first of two volumes, and it recorded weather information from 1784 to 1788. Though most of the data set was written by James Madison himself, his travels meant that certain parts were documented by his wife and father, Dolley Madison and James Madison Senior. By looking at the handwriting on each page, we can see where the writing style changes and thus track Madison’s travel schedule. Madison typically kept a straightforward data collection process, using the same repeated terms to label the weather or wind each day, but he used the last three categories of the data set (falling of leaves, bird life, and miscellanea) in a more casual, note-taking manner. In these three sections, his method was qualitative rather than quantitative: he wrote brief sentences about what he saw, such as “peach tree begin to blossom” or “wild geese flying Northward”. Overall, James Madison’s meteorological journal gives a detailed account of the weather conditions and environmental occurrences on his plantation.

In addition to the data points that Madison recorded over the four years, his meteorological journal is an extremely interesting historical document because of its intricate details: the way he formed his script letters, the slip of the pen that left a mark on the page, the specific data entries he skipped, and all the other small things that don’t show up on a spreadsheet. These aspects are what make documents such as Madison’s meteorological journal worth preserving, yet when the journal is transcribed into a spreadsheet, we lose what makes it actually historic. What the lost details have in common is that they humanize Madison. One of the most notable features that a spreadsheet transcription cannot capture, in Madison’s journal or in any other document written by a famous person, is the writer’s handwriting. Handwriting is the first layer that personalizes a document. People fly from all across the world to visit the National Archives Museum in Washington, D.C. to see John Hancock’s signature on the Declaration of Independence. Similarly, being able to see Madison’s own handwriting brings a whole new level of meaning to the historical document. Penmanship has many different facets, including the style of the lettering, the bolding of specific letters, and the color of the ink. In figure three, where Madison is documenting the weather during the second half of March 1789, we can see in the eighth field from the left that he chose to bold the PM in the field title, “Weather at 4 O’Clock PM”.

Another interesting aspect of Madison’s meteorological journal is how rarely a day passed on which he didn’t enter the data points. There were, however, a few exceptions. Most notably, on September 17, 1787, the day the Constitution was signed, Madison didn’t record an entry at night. This missing entry offers a different perspective on the life of our fourth president. Not only are we able to see Madison’s passion for weather patterns and plant life, but we are also given a glimpse into his daily life and are left to wonder what he was doing on the night of one of the most important days in American history. Whether he was further reviewing the document or celebrating with his friends, the gap allows the reader to theorize about how Madison was spending his time. Additionally, because the journal was a collection process led by Madison individually, it reveals aspects of his life, priorities, and general thought process.

As with Madison’s meteorological journal, what most likely fascinates people about the Lewis and Clark journals two centuries later is not the direction of the wind but the minute details on the physical paper, which give the reader insight into the day-to-day lives of two of the most renowned explorers. One example appears in Lewis and Clark’s journal as shown in figure six: in the top right corner, the number fifteen is written upside down. Typically, this detail would be left out of any spreadsheet digitization, as there is no way to enter a number upside down and the number does not fall under any specific category, so a researcher would have no slot for it in the spreadsheet. Leaving it out means not only that anyone reading the spreadsheet version won’t know that the stray number fifteen represents the fifteenth straight month that Lewis and Clark had been traveling on their expedition, but also that the reader cannot speculate about what compelled them to write the number in the opposite direction from everything else on the page. Was it the way that they put the papers in their bag? Did they make a mistake during the first month and want to stay consistent? When the journal is converted into a spreadsheet, these details are lost.

We can further understand this concept of losing what makes a historical document truly special by looking at field notes written by Carl Linnaeus. Linnaeus, a renowned Swedish botanist of the eighteenth century, was constantly learning and writing about plant species. As shown in figure four, Linnaeus not only includes detailed written accounts of what he was researching at the time, but also uses his talent for drawing to add another descriptive element to his findings. It is this casual yet detailed drawing that gives the reader a better connection with Linnaeus. This structure doesn’t fit into a spreadsheet format, and even if it were converted into a simple Word document, it would retain the core information presented by Linnaeus but lose the details that were hand drawn by one of the most influential eighteenth-century scientists.

A typical digitization plan for Madison’s meteorological journal would mean taking all the categories and data points and simply creating a digital spreadsheet with the same information. For this journal, that would amount to approximately 10,422 records. I came to this estimate because the document starts on April 1st, 1784 and ends on January 2nd, 1789, which means there are 1,737 days between the start date and the end date. While there are twelve main categories for each day, only six of them are primarily used. Therefore, 1,737 multiplied by 6 gives 10,422. Once the outline of the spreadsheet is set up, I estimate that each page would take about 30 minutes on average to digitize. This digitization was actually carried out in 2020 by Molly Nebiolo, a fifth-year Ph.D. candidate at Northeastern University. Nebiolo spent time researching the journal and converting Madison’s data points into a spreadsheet. As shown in figure five, while Nebiolo was successful in her mission to digitize the entire document, we can see that it has lost all of the trademark features that made it a historical document worth preserving. It is not the weather content that makes the journal special, but rather the fact that James Madison took time out of each day for four years to write it.
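
The arithmetic behind this estimate can be reproduced with a short calculation. The sketch below takes the start date, end date, and six-fields-per-day figure from the paragraph above; the function and variable names are only illustrative, and the result rests on the simplifying assumption that every day has exactly six entries. Computing the span with a date library rather than by hand avoids miscounting leap days.

```python
from datetime import date

# Figures taken from the paragraph above.
START = date(1784, 4, 1)      # first entry in the journal
END = date(1789, 1, 2)        # last entry in the journal
FIELDS_USED_PER_DAY = 6       # of the twelve categories, six are used regularly


def days_between(start: date, end: date) -> int:
    """Number of days from start to end, not counting the end date itself."""
    return (end - start).days


def estimate_records(start: date, end: date, fields_per_day: int) -> int:
    """Rough record count: one value per regularly used field per day."""
    return days_between(start, end) * fields_per_day


if __name__ == "__main__":
    print("days:", days_between(START, END))
    print("estimated records:", estimate_records(START, END, FIELDS_USED_PER_DAY))
```

Because the journal has skipped entries and occasional notes in the other six categories, the real count would differ somewhat from this figure in either direction.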

I have come to the conclusion that the current digitization process of merely entering the data into a spreadsheet is inadequate and that a new software program needs to be invented. As outlined in figure one, the main feature of this new program would be the ability to hover over any section of the spreadsheet and view the corresponding part of the original document. This feature would allow the standard digitization process to occur without losing the intricate details of the original writing that make the document unique. The new interface would also include an explanation section for parts of the data that are unclear. Using the number fifteen from the Lewis and Clark journal as an example, as seen in figure two, a reader would now be able to understand the purpose of that number and why it sits outside the typical data fields simply by tapping on the information button. Together, these two features solve the problem of digitization stripping away what is truly special about historic documents written by famous people, and they allow the documents to be simultaneously digitized and preserved in their original form.
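
One way such a program might store its data is sketched below. This is only a rough illustration of the two features described above, not a design for an existing tool: the class names, field names, image file names, pixel coordinates, and the sample wind value are all invented placeholders. The idea is that every transcribed value keeps a pointer to the rectangle of the scanned page it came from (for the hover view) and an optional note (for the information button).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PageRegion:
    """Rectangle on a scanned page image, in pixels (hypothetical layout)."""
    image_file: str
    x: int
    y: int
    width: int
    height: int


@dataclass
class Cell:
    """One transcribed value, tied back to where it sits on the original page."""
    field: Optional[str]          # column name, or None for marks outside any column
    value: str                    # the transcribed text
    region: PageRegion            # the part of the scan a hover view would display
    note: Optional[str] = None    # explanation shown behind the information button


def hover(cell: Cell) -> str:
    """Describe the scan crop a viewer would show when hovering over a cell."""
    r = cell.region
    return f"{r.image_file} [{r.x},{r.y} {r.width}x{r.height}]"


def info(cell: Cell) -> str:
    """Return the explanation for a cell, or a default when none was written."""
    return cell.note or "No explanation recorded."


# Placeholder data: file names, coordinates, and the wind value are invented.
wind = Cell("wind (sunrise)", "N.W.", PageRegion("madison_p012.jpg", 310, 540, 90, 28))
stray = Cell(
    None,
    "15",
    PageRegion("lewis_clark_p001.jpg", 1180, 40, 60, 45),
    note="Written upside down in the top right corner, outside the data fields.",
)

if __name__ == "__main__":
    print(hover(wind))
    print(info(stray))
```

Allowing the field to be empty is what gives a mark like the upside-down fifteen a place in the data set even though it belongs to no column.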

Figure One:

Figure Two:

Figure Three:

Figure Four:

Figure Five:

Figure Six:

[1] Date, therm (sunrise), barom (sunrise), wind (sunrise), weather (sunrise), therm (P.M.), barom (P.M.), wind (P.M.), weather (P.M.), schooling or falling of leaves of trees, flowers, and other remarkable plants, appearance or disappearance of bird life, and miscellanea