Nathan Cumbo
September 27, 2023
DAT 375
Module 3 Assignment
For this analysis, the most appropriate data analysis technique is quantitative analysis.
The purpose of this analysis is to use the provided scraped data set to find the average
number of messages by a single username, the average number of reshares by a single username,
and the time frame that has the highest number of original messages. I felt a quantitative
analysis was the best fit for this problem because I am looking for numerical measures of the
data as it currently exists, with no need to make future predictions at this time.
The scripts to find the average number of messages and the average number of reshares by a
single username were quite similar to one another. The primary difference is which
column/category is being chosen from (in this case, messages vs. reshares).
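As a rough illustration of how two such scripts can share nearly all of their logic, here is a minimal Python sketch. It is not the script used for this assignment. The column names `username` and `reshares`, the helper `average_per_username`, and the sample rows are all hypothetical, and the sketch assumes each row is one message carrying a numeric reshare count.

```python
from collections import defaultdict


def average_per_username(rows, amount):
    """Return the mean per-username total of amount(row) over all usernames."""
    totals = defaultdict(int)
    for row in rows:
        # 'username' is an assumed column name, not confirmed by the data set.
        totals[row["username"]] += amount(row)
    if not totals:
        return 0.0
    return sum(totals.values()) / len(totals)


# Hypothetical sample rows; these are NOT values from the scraped data set.
sample_rows = [
    {"username": "user_a", "reshares": 2},
    {"username": "user_a", "reshares": 0},
    {"username": "user_b", "reshares": 1},
    {"username": "user_c", "reshares": 3},
]

# Messages: every row counts as one message for its username.
avg_messages = average_per_username(sample_rows, lambda row: 1)

# Reshares: the only change is which value is taken from each row.
avg_reshares = average_per_username(sample_rows, lambda row: row["reshares"])
```

Passing the per-row value as a function keeps the grouping and averaging identical for both questions, so only the chosen column differs between the two calls.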
To find the time frame with the highest number of original messages, I wrote a script to
select the column ‘created_at’, which holds the timestamp of each message's creation,
accurate to the second. The results of this script, shown below in figure 1.1, indicate that
6 separate messages were created at exactly 13:42:34 on 09/20/2017, more than at any other
single timestamp.
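The counting step can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the script used for this assignment. It assumes each row exposes `created_at` as a string already formatted to the second, it does not separate original messages from reshares, and the helper name `busiest_timestamp` and the sample rows are hypothetical.

```python
from collections import Counter


def busiest_timestamp(rows):
    """Return (timestamp, count) for the most frequent 'created_at' value."""
    counts = Counter(row["created_at"] for row in rows)
    if not counts:
        return None
    # most_common(1) returns a one-item list holding the top (value, count) pair.
    return counts.most_common(1)[0]


# Hypothetical sample rows; these are NOT values from the scraped data set.
sample_rows = [
    {"created_at": "2020-01-01 00:00:01"},
    {"created_at": "2020-01-01 00:00:02"},
    {"created_at": "2020-01-01 00:00:02"},
    {"created_at": "2020-01-01 00:00:03"},
]

top_timestamp, top_count = busiest_timestamp(sample_rows)
```

Because the timestamps are compared as exact strings, two messages only fall into the same group when they match down to the second. A wider time frame, such as an hour, would require truncating each timestamp before counting.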