This is a comparative analysis assignment, which means you will be comparing two things. For this assignment, you will be exploring how two related texts (or sets of texts) of your choice compare with one another based on their patterns of frequent word use and clusters of words. Explore this using Voyant Tools and AntConc.
You will have a choice of which texts you wish to compare. You may wish to try several different combinations and observe what you can of their word distributions using the corpus analysis tools we have been practicing with. When you feel you have found some meaningful and interesting patterns that seem worth comparing between the texts, write a reflection post on one of the websites you have created for this class. (You may work with either site for this assignment.) Write up a page that presents your comparison and provides images (screen captures) and links to share the source texts you used and the data you gathered. Use the images to illustrate your essay, in which you point out interesting patterns that compare or contrast these documents in your distant reading of them through the corpus analysis tools.
For this assignment, you have many options of texts you could choose. If you have an idea about a pair of texts to try comparing that is not on my suggestions list, ask me (Dr. B) about it. As long as you can save the documents as electronic files in plain text, and cut out any unnecessary material (like footnotes, headings, styling, etc.), you can work with them using our corpus analysis tools. You may need to clean texts that you pull from internet sources to remove their headers, long sections of footnotes, and anything else that is not part of the main text you want to analyze.
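If you would rather script the cleanup than do it by hand in a text editor, here is a minimal Python sketch of one way to go about it. It is only an illustration: the function name `clean_text`, the marker strings, and the assumption that footnote references look like `[12]` are all hypothetical, so adjust them to match whatever your source file actually contains.

```python
import re
from typing import Optional


def clean_text(raw: str,
               start_marker: Optional[str] = None,
               end_marker: Optional[str] = None) -> str:
    """Keep only the main text between two optional markers and tidy it up.

    start_marker and end_marker are hypothetical strings that you pick
    yourself after looking at where the main text begins and ends.
    """
    text = raw
    # Drop everything before the start marker (site header, license, etc.)
    if start_marker and start_marker in text:
        text = text.split(start_marker, 1)[1]
    # Drop everything after the end marker (footer, notes section, etc.)
    if end_marker and end_marker in text:
        text = text.split(end_marker, 1)[0]
    # Assumption: footnote references are bracketed numbers such as [1] or [23]
    text = re.sub(r"\[\d+\]", "", text)
    # Collapse runs of spaces and tabs into a single space
    text = re.sub(r"[ \t]+", " ", text)
    # Trim stray spaces around line breaks, then limit blank lines to one
    text = re.sub(r" *\n *", "\n", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


sample = "SITE HEADER\n*** START ***\nIt was a dark[1]  night.\n\n\n\nThe end.\n*** END ***\nSITE FOOTER"
print(clean_text(sample, "*** START ***", "*** END ***"))
```

To use it on a real file, read the file into a string, pass it through `clean_text`, and write the result to a new file whose name ends in `.txt`. Always open the cleaned file and skim it before loading it into Voyant Tools or AntConc.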
These are small samples. Notice the number of words and ngram tokens created when you use the corpus tools.
These are larger files and you should find a greater variety of word frequency patterns here. Choose any two or three to compare with each other. You could choose to compare two texts written in nearly the same time period, or about similar topics, or choose to contrast texts that seem completely different. It is up to you to experiment. Right-click to download the linked text files to your computer to begin working with these:
Make sure each file name has .txt at the end so that AntConc can read the file. If a file is in Rich Text Format (.rtf), look for an option to convert it to plain text so you can save it with a .txt file extension.
If you are not sure you are seeing anything worth comparing in the documents you selected, try changing it up: change the ngram minimum and maximum values. A minimum value of two may not show the most interesting patterns, so try starting it at 3. You can always choose a different document from the collection and continue experimenting.
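To see what the ngram minimum and maximum settings are doing, it can help to count clusters yourself on a tiny example. The Python sketch below is a simplified illustration, not AntConc's or Voyant's actual method: the function names `tokenize` and `ngram_counts` are made up for this example, and the tokenizer is a crude one that lowercases everything and keeps only letters and apostrophes.

```python
import re
from collections import Counter


def tokenize(text):
    """Crude tokenizer (an assumption): lowercase words of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(text, n_min=3, n_max=4):
    """Count every word cluster whose length is between n_min and n_max words."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of n words across the text, one word at a time
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


sample = "the cat sat on the mat and the cat sat on the rug"

# Compare a minimum of 2 with a minimum of 3 on the same text
for n_min in (2, 3):
    counts = ngram_counts(sample, n_min=n_min, n_max=4)
    print("minimum", n_min, "->", counts.most_common(3))
```

With a minimum of two, short pairs such as "the cat" tend to crowd the top of the list. Raising the minimum to three pushes longer and often more distinctive phrases into view, which is the same effect you should notice when you change the settings in the corpus tools.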
As you work on the corpus analysis, take notes on things that surprise or interest you. Can you see a strong pattern that makes one writer obviously different from another? Is it a pattern you would have guessed when you started, or something surprising?
Spend some time reviewing your data, and write up a reflection post including images and screen captures from your analysis. Your post should present one pair of texts, or one trio of texts that you studied with this assignment. Present your findings: how do these texts compare and/or contrast with each other in what you could see of the distinct words and phrases they most frequently use?
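If you want a quick cross-check on what the corpus tools show you, the comparison of frequent words can be sketched in a few lines of Python. This is only a rough illustration under simple assumptions: the function names `word_freqs` and `compare` are hypothetical, the tokenizer just lowercases and keeps letters and apostrophes, and "distinctive" here means nothing more than the difference in relative frequency between the two texts.

```python
import re
from collections import Counter


def word_freqs(text):
    """Return each word's share of all the words in the text (relative frequency)."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(words).items()}


def compare(text_a, text_b, top=5):
    """Return the words most characteristic of text_a and of text_b.

    A positive difference means the word is relatively more common in
    text_a; a negative difference means it is more common in text_b.
    """
    freqs_a, freqs_b = word_freqs(text_a), word_freqs(text_b)
    all_words = set(freqs_a) | set(freqs_b)
    diffs = {w: freqs_a.get(w, 0.0) - freqs_b.get(w, 0.0) for w in all_words}
    ranked = sorted(diffs.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:top], ranked[-top:][::-1]


# Two tiny stand-in "texts" so the example runs on its own
text_a = "the sea the ship the sea and the storm"
text_b = "the moor the house the moor and the wind"

more_in_a, more_in_b = compare(text_a, text_b, top=2)
print("More typical of text A:", more_in_a)
print("More typical of text B:", more_in_b)
```

Using relative frequency rather than raw counts matters when your texts differ a lot in length, since a longer text will have higher raw counts for almost every word.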
Prepare your post as a webpage to present on one of the websites you developed in the previous assignment (your choice: either GitHub Pages, or your WordPress or personal PSU site). Include your screenshot images on the page.
When this portion of this assignment is complete, post links to it on Canvas at the appropriate assignment link.