Assignment 8

School

University of Missouri, Columbia

Course

8740

Subject

Computer Science

Date

Jan 9, 2024

Pages

5

Uploaded by sharukh95

1. This R code uses the igraph library to create and visualize a network graph of the most frequent bigrams in H.G. Wells' novels. The findings and a summary of the code follow.

Findings: The code extracts bigrams from the text data, pairing each word with the word that follows it. It removes any bigram that contains a stop word (a common word such as "the" or "and"), since these are rarely informative for analysis. The frequency of each remaining bigram is counted and sorted in descending order. The code then selects the top N bigrams by frequency; in this example, N is set to 10.

Summary: The code reads in a dataset of H.G. Wells' novels (assumed to be in the tidy_hgwells data frame), identifies bigrams (pairs of consecutive words), filters out stop words, and counts the frequency of each bigram. It then selects the top 10 bigrams by frequency and creates a network graph in which each word is a node and each bigram is a directed edge from its first word to its second, showing the order of occurrence.
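The steps in item 1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the assignment's actual code: toy_words is a hypothetical stand-in for tidy_hgwells (assumed here to hold one word per row in reading order), pairing words with dplyr's lead() is only one possible way to form bigrams, and the variable names are made up for the example.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for tidy_hgwells: one word per row, in reading order.
toy_words <- tibble(text = "the time machine and the heat ray and the time machine") %>%
  unnest_tokens(word, text)

top_n_bigrams <- 10  # N = 10, as stated in the text

bigram_counts <- toy_words %>%
  # Pair each word with the word that follows it (assumed approach).
  transmute(word1 = word, word2 = lead(word)) %>%
  # Drop the last word (no successor) and any pair containing a stop word.
  filter(!is.na(word2),
         !word1 %in% stop_words$word,
         !word2 %in% stop_words$word) %>%
  # Count each bigram and sort by descending frequency.
  count(word1, word2, sort = TRUE) %>%
  # Keep at most the top N rows.
  slice_head(n = top_n_bigrams)

# igraph treats the first two columns as edge endpoints, so each word becomes
# a node and each bigram a directed edge; n is carried as an edge attribute.
bigram_graph <- igraph::graph_from_data_frame(bigram_counts, directed = TRUE)

# On this toy text, "time machine" is counted twice and "heat ray" once.
print(bigram_counts)
```

Pairing with lead() over a whole data frame also pairs words across sentence and book boundaries; with several books in one frame, grouping by book first would avoid the cross-book pairs.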
2. The provided R code defines two functions, count_bigrams and visualize_bigrams, to analyze and visualize bigrams (pairs of consecutive words) in a text dataset using the tidytext, igraph, and ggraph libraries. The findings and a summary of the code follow.
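For orientation, two functions of this kind might look roughly like the sketch below. It is an illustration under assumptions rather than the assignment's code: the input is assumed to have a column named text, and the seed value, arrow size, colors, and the toy data frame are arbitrary choices made for the example.

```r
library(dplyr)
library(tidyr)
library(tidytext)
library(ggplot2)
library(ggraph)

# Tokenize a `text` column into bigrams, drop pairs with stop words, count them.
count_bigrams <- function(dataset) {
  dataset %>%
    unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
    filter(!is.na(bigram)) %>%                      # rows too short to form a bigram
    separate(bigram, c("word1", "word2"), sep = " ") %>%
    filter(!word1 %in% stop_words$word,
           !word2 %in% stop_words$word) %>%
    count(word1, word2, sort = TRUE)
}

# Build a ggraph plot object from a bigram count table (word1, word2, n).
visualize_bigrams <- function(bigrams) {
  set.seed(2016)  # arbitrary seed so the force-directed layout is repeatable
  a <- grid::arrow(type = "closed", length = grid::unit(0.15, "inches"))

  bigrams %>%
    igraph::graph_from_data_frame() %>%             # words become nodes, bigrams edges
    ggraph(layout = "fr") +                         # Fruchterman-Reingold layout
    geom_edge_link(aes(edge_alpha = n), show.legend = FALSE, arrow = a) +
    geom_node_point(color = "lightblue", size = 5) +
    geom_node_text(aes(label = name), vjust = 1, hjust = 1) +
    theme_void()
}

# Hypothetical toy input; a real run would pass the novels' text instead.
toy <- tibble(text = c("the time machine and the heat ray",
                       "the time machine again"))
toy_counts <- count_bigrams(toy)
p <- visualize_bigrams(toy_counts)  # call print(p) to draw the graph
```

Keeping the counting and plotting steps in separate functions means the count table can be filtered (for example, to bigrams above some frequency) before it is passed to visualize_bigrams.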
Findings:
1. count_bigrams function: The count_bigrams function takes a dataset as input. It tokenizes the text into bigrams using the unnest_tokens function, specifying a tokenization method that creates pairs of two consecutive words (bigrams). The bigrams are then separated into individual words, creating columns for word1 and word2.
2. visualize_bigrams function: The visualize_bigrams function takes a dataset of bigrams as input. It sets a random seed for reproducibility and defines an arrow style for the edges in the network graph. The function then creates a network graph from the input dataset using graph_from_data_frame.

Summary: This code provides a set of reusable functions to analyze and visualize bigrams in text data. The count_bigrams function tokenizes the text, filters out stop words, and counts the frequency of each bigram. The visualize_bigrams function creates a network graph of the bigrams, applying a force-directed layout to represent their relationships visually.

3. 1. Loading the King James Version (KJV):