Category Archives: Tools

Network Analysis with Palladio

This post will explore Palladio, a free online digital tool that creates malleable visualizations from structured, tabular data.

Initially, the process of using Palladio is similar to kepler.gl: one submits data (in this case I used .csv files) to the program, which reads it and plots one’s points to one’s specifications. Palladio is not a full-fledged mapping tool; rather, it uses the data to create connections and visualizations of patterns, similar to Voyant, a text-mining tool. In my exploration of the program, I used data drawn from interviews with formerly-enslaved people in Alabama, creating mainly “network maps” and “network graphs.” Although there are other functions one can explore through this service, those two are the focus of this post.
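To make the workflow concrete, here is a minimal sketch of the kind of tabular file I mean, written with Python’s standard csv module. The column names and rows are hypothetical placeholders, not the actual interview data; Palladio simply reads the header row as its available dimensions.

```python
import csv

# Hypothetical rows standing in for the kind of spreadsheet uploaded to Palladio.
rows = [
    {"interviewee": "Person A", "sex": "Female", "county": "Dallas", "topic": "Family"},
    {"interviewee": "Person B", "sex": "Male", "county": "Mobile", "topic": "Work"},
    {"interviewee": "Person C", "sex": "Female", "county": "Dallas", "topic": "Religion"},
]

with open("palladio_sample.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
    writer.writeheader()  # the header row becomes the dimensions Palladio offers
    writer.writerows(rows)
```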

Network maps plot data either as simple points or through “point-to-point” plotting, which draws connecting lines through the data, indicating a path or some other relation between two or more points. This is useful for identifying patterns within geographical data, but the full or specified metadata is not displayed, as that is not Palladio’s primary function. It is more about the overarching connections present in the data, which the next tool I used exemplifies further.
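As I understand it, point-to-point plotting pairs two coordinate columns per row, one for each end of the line. Below is a rough sketch of what such a table might look like; the column names and coordinates are invented for illustration.

```python
import csv

# Hypothetical "point-to-point" table: each row links a birthplace coordinate to
# an interview-location coordinate, which the map can render as a connecting line.
edges = [
    {"birthplace_coords": "32.41,-86.91", "interview_coords": "30.69,-88.04"},
    {"birthplace_coords": "33.52,-86.80", "interview_coords": "32.41,-86.91"},
]

with open("point_to_point.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["birthplace_coords", "interview_coords"])
    writer.writeheader()
    writer.writerows(edges)
```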

There are other customization features in the “map” section, which is split into “tiles” and “shapes.” Tiles control the geographic backdrop used to represent a space: terrain, streets, satellite imagery, infrastructure, and even custom tiles that the user can insert! Shapes let the user add markers whose meaning the user determines. The system is open-ended enough to do almost anything with, whether that is for emphasis or to serve as a key/legend.

Moving on to the “tables” section, network graphing takes the submitted data and identifies themes based on the parameters one sets. For instance, when exploring the interviews of formerly-enslaved people, one can find connections between the interviewees’ sex and the topics they discussed, age groupings, the region in which the interviews took place, and other combinations one wishes to explore further (mostly between one pair of data columns within the overall document). This essentially functions as an interconnected web, linking data entries so that identifiable patterns can emerge and be interpreted by the user. It is all accessible, easy to adjust, and quite flexible in what it can do! Some screenshots are attached at the bottom of this post to give you a sense of what I am describing. Due to the volume of data, these screenshots represent the kinds of visualizations Palladio can produce rather than the content itself.
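Conceptually, this graph view boils down to counting how often values in one column co-occur with values in another. The sketch below shows that tallying with placeholder rows and invented column names, since I am not reproducing the actual interview data here.

```python
from collections import Counter

# Placeholder records in the shape of two paired columns.
records = [
    {"labor_type": "Field", "topic": "Work"},
    {"labor_type": "Field", "topic": "Family"},
    {"labor_type": "House", "topic": "Family"},
    {"labor_type": "House", "topic": "Religion"},
    {"labor_type": "Field", "topic": "Work"},
]

# Each distinct (labor_type, topic) pairing would become an edge in the graph,
# weighted by how many interviews share it.
edge_weights = Counter((r["labor_type"], r["topic"]) for r in records)

for (source, target), weight in edge_weights.most_common():
    print(f"{source} -- {target}: {weight}")
```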

There are also other visualization methods, such as “galleries” and other formats for tables. I did not use these functions, so I cannot describe them in-depth, but it is worth mentioning that there are a variety of ways to represent user data.

Overall, Palladio enabled me to generate context and spot patterns in the data that I may otherwise have missed! The best part of the experience was using a network graph to compare the type of enslavement against the topics brought up in the corresponding interviews. Although overwhelming at first, organizing the data by dragging circles to desired locations within the space (another great feature) enabled me to identify patterns of language, priorities, and the hierarchy of each condition represented in the data. It was something I was not entirely expecting, mostly because I did not know what to expect out of this program. It was surprising, and something that would have been possible through manual review, just not at this scale. This kind of technology is interesting, and something that I want to use more often.

Comparing Digital Tools (Voyant, kepler.gl, and Palladio)

Over the last few weeks, I have been exploring a variety of tools that are utilized in the Digital Humanities to read, interpret, map, and visualize data.  Among these tools are “Voyant,” “kepler.gl,” and “Palladio.” This post will explore what each program does, my experiences with all three, and how I will approach utilizing these tools moving forward. As I am gaining a better understanding of digital tools and the Digital Humanities as a field, it has been really interesting to observe the capabilities of these programs!

Starting with Voyant: this is a tool that takes one’s submitted corpus and analyzes it through different generated visualizations, based on preset parameters and filters that one can modify to one’s needs. Without modifying any of the windows, this includes five main features: the “cirrus” (essentially a word cloud), a “reader” that displays the chosen text and the words/phrases one explores in other sections, a “trends” panel showing a relative frequency graph of the most-used words by default (which can be modified to analyze specific texts and/or words), a “summary” tool that identifies distinctive qualities of the submitted texts and general trends throughout the corpus by document, and a “context” tool that finds instances of a selected word/phrase along with the words that precede and follow it in the sentence. Voyant shares some capabilities with Palladio, which will be discussed shortly.

Kepler.gl is a mapping tool that takes a set of data (traditionally .csv files generated through programs like Microsoft Excel) and maps it based on coordinates and other relevant metadata. There are a lot of different visualizations one can make with programs like kepler.gl, including heat maps, data clusters, point maps, timelines, and more! The capabilities of kepler.gl, at least in terms of what I explored, provide a lot of variety for visual storytelling. It is best reserved for regional analysis, as plotting points/trends on a global scale can get quite complex, especially when showing broader connections between points and the conclusions drawn from them. This is not to say that it is impossible, just not the scale that mapping projects typically deal with.

Lastly, Palladio is a tool commonly used to make visualizations of data and connect two or more parameters to each other. Palladio does have mapping capabilities like kepler.gl’s, which is quite useful (although limited in comparison)! However, its main appeal lies in its variety of visualization techniques. I used “Network Maps” and “Network Graphs,” so I will speak to the functionality of both to represent the tool as a whole. After uploading files (in this case a few .csv files), I was able to create visualizations from two parameters that I set. I did this for several pairings involving interviews with formerly-enslaved people in Alabama. With the information contained in the .csv files, I was able to find connections between the type of work enslaved people were forced into and the topics they discussed in their interviews, among other trends present in the files. Although the visualizations can get a bit messy when analyzing many different data points and themes, there are still interesting conclusions one can draw, or at the very least explore, with visual aids.

I believe that all of these tools can pair well together in some form. Voyant is great for text-based analysis and as a starting point for spotting broader trends in large corpora. Voyant and Palladio, for instance, have the potential to pair well together! Although situational, running source material through Voyant first could pre-emptively identify themes and make parsing the data quite a bit easier once it is converted and transferred to Palladio!

I believe that kepler.gl and Palladio have the most potential of these three, however. While kepler.gl is best used for mapping, I believe that Palladio better identifies patterns within the data itself. If nothing else, kepler.gl can map the .csv files, while Palladio provides basic mapping functionality alongside visual connections that Voyant could not create effectively with this type of format. Voyant could potentially analyze the source material itself, however, and that is where I believe its strength lies when comparing these three tools.

Overall, I believe that all three of these tools have great potential and distinct purposes, and they should be used together! Although different projects and research will call for different needs, I believe that, given the time and knowledge, these tools will be essential to a growing digital catalogue of source material and to digitization itself. I would assume and hope that these methods will only improve in accessibility and availability over time, so it would not hurt to familiarize oneself with the world of digital tools!

My experiences with these tools make me wonder what else is out there. My professors at George Mason University have encouraged me to familiarize myself with the Digital Humanities, so I decided to take two courses related to the topic this semester. Through these courses, I am using a variety of different tools and am in the process of completing projects with online exhibition tools and other mapping programs. Both traditional means of research and ideas that are newer to me, centered on digital representations for the public, have been at the front of my mind all year. I have found ways to uncover new angles, narrow my research parameters, and create accessible projects on topics that I care deeply about. Digital tools, while intimidating at first, have allowed me to see the work that goes into research aimed at the public, which I am greatly interested in pursuing in some form. Public-facing work today is not just museums and websites, but so much more.

Mapping with Kepler.gl

In this post, I will be discussing digital mapping as a whole through kepler.gl, a free online mapping tool that serves a lot of different analytical functions! Based on my experiences with this program, I could analyze data through coordinates, proximity trends, clusters of mapped data, timelines, heat maps, color coding based on one of the parameters in the uploaded file(s) (in this case a .csv file created in Microsoft Excel), and more! It is a versatile tool for visualizing data through its filters and mapping options, and for showing connections within the data that the user can then interpret through other means!

At the bottom of this post is an example of a map created from data surrounding interviews with formerly-enslaved people in the state of Alabama in the 1930s. It specifically plots each person’s name, age, sex, where they were interviewed, and their place of birth. Additional metadata from the source material could also be enabled, meaning the map can carry much more information than what is shown. Visualizations built from this type of source material can be useful for analyzing the interviewer and interviewee alike. For the interviewee, it can visualize demographic data, encouraging questions about the formerly-enslaved population of Alabama at the time: their ages and life expectancy, where they came from, and who ended up where at the time of the interviews (and perhaps “why” in some cases). For the interviewer, it can document where the interviews took place, who conducted them (if that information is included), and the timeline over which they were conducted. Through timelines, another feature mentioned previously, one can track the day, month, and year in which interviews took place, while isolating different points if that data is entered into the submitted files.
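For a sense of the input format, here is a minimal sketch of the kind of spreadsheet kepler.gl can ingest, assuming one latitude and one longitude column per row. Every name, value, and coordinate below is a hypothetical placeholder rather than the actual interview data.

```python
import csv

# Hypothetical rows in the shape described above: kepler.gl can detect columns
# named "latitude"/"longitude" and plot one point per row, with the remaining
# columns available for tooltips, filters, timelines, and color coding.
rows = [
    {"name": "Person A", "age": 84, "sex": "Female",
     "interview_place": "Mobile, AL", "birthplace": "Georgia",
     "latitude": 30.6954, "longitude": -88.0399},
    {"name": "Person B", "age": 91, "sex": "Male",
     "interview_place": "Selma, AL", "birthplace": "Virginia",
     "latitude": 32.4074, "longitude": -87.0211},
]

with open("kepler_sample.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
```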

I first used kepler.gl during my undergraduate education, so I had some prior experience before revisiting this tool (admittedly, it has been a while since I last used tools of this nature in any capacity). Upon revisiting it, I remembered the potential of not only the tool but the practice itself. It is a multi-step process that requires thorough data covering multiple aspects of whatever one is researching and eventually plotting on a map. I was vaguely aware of this process, but seeing it unfold again and being able to edit the information myself was a valuable experience. If nothing else, it makes me appreciate a process that the public can take for granted when using these types of services, even in everyday life. Geospatial research is a crucial tool in the digital humanities, and one of my favorite methods of visualizing data!

Text Analysis with Voyant

Content Warning: presence of the N-word in visuals (with the hard-R), topics surrounding chattel slavery in the United States

Voyant is a digital text-mining tool that allows the user to analyze a submitted corpus through different visual tools. These include (but are not limited to) word clouds, relative frequency graphs for chosen words/phrases across one or all of the documents in one’s corpus, a context tool for analyzing the words that precede and follow a chosen word/phrase, a text reader that provides the full texts being analyzed, and a summary tool that spots patterns and distinctive qualities of each document in one’s corpus. Each tool has different parameters one can set to fit one’s analytical needs, creating a malleable tool with a lot of different options for exploration!
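Under the hood, a word cloud like the cirrus view is essentially counting term frequencies after removing common stopwords. Here is a bare-bones sketch of that idea in Python, using placeholder sentences and an invented stopword list rather than Voyant’s actual defaults.

```python
import re
from collections import Counter

# Placeholder documents standing in for a corpus.
documents = [
    "A corpus is a collection of documents, and a document is a collection of words.",
    "Counting the words across the corpus reveals which words recur most often.",
]

# Invented stopword list; real tools ship much longer, language-specific lists.
stopwords = {"a", "is", "of", "and", "the", "which", "most", "across"}

words = []
for doc in documents:
    words.extend(w for w in re.findall(r"[a-z']+", doc.lower()) if w not in stopwords)

# The most frequent terms are what a word cloud would render largest.
for term, count in Counter(words).most_common(5):
    print(term, count)
```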

Text mining is most useful when the pieces of one’s source material relate to each other in some way, but the sheer length of a corpus remains an obstacle to extensive, effective analysis. Some projects work with millions of pages of text, and digital tools like Voyant were created to parse that raw data and turn it into a user-friendly experience.

Voyant’s utility lies in its ability to find core themes within documents or throughout the corpus. Below is an example of a word cloud generated through the “cirrus” tool, using a corpus of interviews with formerly-enslaved people in the state of Georgia:

This was useful in determining the topics, themes, and framing of the interviews. The word cloud shows themes of racial disparity, age, a sense of place through the home, the plantation and the hierarchy inherent to slavery, and some instances of African-American Vernacular English (AAVE). It sums up different aspects of the interviews and, while lacking their full context, can serve as a representative introduction to the content of specific documents or corpora.

The “context” tool was another feature that allowed for a much better understanding of this corpus. In the Maryland interviews, the “summary” tool flagged the distinctive word “rezin” seven different times. When I investigated further, it turned out to refer to a local freeman, “Uncle ‘Rezin’ Williams.” This not only brought an individual’s story to life, but also gave me an opportunity to explore something specific to the state of Maryland through these interviews. The context tool, filtered to show instances of “rezin,” is pictured below:
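While the screenshot shows Voyant’s real output, here is a rough sketch of what a keyword-in-context view computes: a window of words on either side of a target term. The sample sentence is a placeholder, not the Maryland transcript itself.

```python
# Placeholder text standing in for an interview transcript.
text = "Uncle Rezin Williams was mentioned early on, and the name Rezin appears again several pages later."
target = "rezin"
window = 3  # number of words to show on each side

tokens = text.split()
for i, token in enumerate(tokens):
    if token.lower().strip(".,") == target:
        left = " ".join(tokens[max(0, i - window):i])
        right = " ".join(tokens[i + 1:i + 1 + window])
        print(f"{left} [{token}] {right}")
```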

Prior to exploring digital tools, I was not familiar with text mining or how it worked. Voyant (and, I imagine, other programs in the same vein) enabled me to explore texts in a whole new way. As more of a visual learner, I appreciate when projects or texts have an interactive element that deepens one’s understanding of the topic being explored. Visual analysis driven by text is a way to connect with the broader public while conveying one’s points succinctly and through more accessible means. I am grateful for tools like this, and I hope they only get used more in the years to come!

Why Metadata Matters

Metadata, to me, can be defined simply as “the details of data points.” By this, I mean that metadata serves as an organizational tool while also providing context for an object or text. If one were to manually fill in the metadata for an image of one’s own common frying pan, for instance, one would record its dimensions, identify its raw materials, and note when (and potentially where) the image was taken, the file format, copyright information, and so on. If the object or text is not one’s own, one would also need to add where the item was found and analyzed, whether that is an archive, a collection, or another such repository.
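To make that concrete, here is a hypothetical metadata record for that frying pan image, loosely modeled on Dublin Core-style fields. Every value below is invented for illustration.

```python
# Hypothetical metadata record for the frying pan image described above.
frying_pan_record = {
    "title": "Cast-iron frying pan, personal kitchen",
    "creator": "Photograph by the author",
    "date": "2024-03-14",             # when the image was taken
    "coverage": "Fairfax, Virginia",  # where the image was taken
    "medium": "Cast iron, wooden handle",
    "extent": "25 cm diameter",
    "format": "image/jpeg",
    "rights": "CC BY-NC 4.0",
    "source": "Author's personal collection",
}

for field, value in frying_pan_record.items():
    print(f"{field}: {value}")
```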

Digital tools assist immensely in keeping this information together in an ethical and efficient way that provides proper context and credit. However, the effectiveness of these tools depends on how carefully the user compiles their primary and secondary sources. Omeka and Tropy, for instance, provide premade and customizable templates to fit the needs of the sources one is adding to an online exhibit or archive, respectively.

In trying to understand the importance of proper, manually-generated metadata, we can start with the reliability of records versus human memory when one must rely on these kinds of online tools. Research requires a multitude of sources to make a convincing and holistic argument or narrative. Considering how arduous conducting research is in the first place, let alone turning it into a coherent piece, these programs exist for a reason. The field of history, and the Digital Humanities in general, depend on ethical citation. They are fields that build off centuries of analysis and research to improve our understanding of the world. The manual creation of metadata, in my eyes, is a two-step process: the creation and the observation. If one values one’s peers, it is vital to understand where one’s objects originate, which will help others build off of one’s findings.