One of my interests is corpus linguistics and creating corpora, and I want to get better at analyzing my corpora in more depth.
As a project to help me learn the R programming language, I made a corpus analysis tool that extracts the first five and last five characters of each word in a corpus, counts how often each sequence occurs, and writes the results to CSV files.
You’ll need to download R if you don’t have it.
The code I wrote is here.
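
To give a rough idea of the approach, here is a minimal sketch in base R. It is not the tool itself, just an illustration under a few assumptions of mine: the function name `count_affixes`, the sample sentence, the output file names, lowercasing the text, and splitting words on anything that is not a letter or apostrophe are all hypothetical choices. Words shorter than five characters are kept whole.

```r
# Hypothetical helper (not the original tool): count the first n and last n
# characters of every word in a piece of text.
count_affixes <- function(text, n = 5) {
  # Assumption: lowercase everything and split on non-letter characters
  words <- unlist(strsplit(tolower(text), "[^[:alpha:]']+"))
  words <- words[nchar(words) > 0]

  # First n and last n characters; words shorter than n are kept whole
  len <- nchar(words)
  starts <- substr(words, 1, n)
  ends <- substr(words, pmax(len - n + 1, 1), len)

  # Count occurrences and sort from most to least frequent
  tabulate_sorted <- function(x, label) {
    counts <- sort(table(x), decreasing = TRUE)
    df <- data.frame(as.character(names(counts)), as.integer(counts),
                     stringsAsFactors = FALSE)
    names(df) <- c(label, "count")
    df
  }

  list(starts = tabulate_sorted(starts, "start"),
       ends = tabulate_sorted(ends, "end"))
}

# Made-up sample text standing in for a real corpus
sample_text <- "The reader is reading; the writer is writing."
result <- count_affixes(sample_text)

# Write one CSV per table (file names are placeholders)
write.csv(result$starts, file.path(tempdir(), "word_starts.csv"), row.names = FALSE)
write.csv(result$ends, file.path(tempdir(), "word_ends.csv"), row.names = FALSE)
```

For a real corpus you would read the text from a file first, for example with `readLines()`, and pass the result to the function as a single string via `paste(..., collapse = " ")`.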