Session 14 ChatGPT

School

Syracuse University

Course

651

Subject

Computer Science

Date

Feb 20, 2024

Business Analytics: Session 14: ChatGPT
Copyright © Don Harter 2023

ChatGPT: Optimizing Language Models for Dialogue

ChatGPT (Chat Generative Pre-trained Transformer) is a chatbot launched by OpenAI in November 2022. Its dialogue format makes it possible for ChatGPT to answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. It is built on top of OpenAI's GPT-3 family of large language models and is fine-tuned with both supervised and reinforcement learning techniques.

According to OpenAI, the model is trained using Reinforcement Learning from Human Feedback (RLHF). An initial model was trained using supervised fine-tuning: human AI trainers provided conversations in which they played both sides, the user and an AI assistant. The trainers had access to model-written suggestions to help them compose their responses.

To create a reward model for reinforcement learning, OpenAI collected comparison data consisting of two or more model responses ranked by quality. To collect this data, the development team took conversations that AI trainers had with the chatbot, randomly selected a model-written message, sampled several alternative completions, and had AI trainers rank them. These rankings were used to create reward models, against which the model was further fine-tuned over several iterations of Proximal Policy Optimization (PPO). PPO algorithms are a cost-effective alternative to trust region policy optimization algorithms: they avoid many of the computationally expensive operations and run faster.
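To make the PPO step a little more concrete, here is a minimal sketch of the clipped surrogate objective commonly associated with PPO. It is not taken from the handout or from OpenAI's implementation; the function name ppo_clip_objective, the example numbers, and the clipping range epsilon = 0.2 are illustrative assumptions.

```r
# Clipped surrogate objective (sketch, illustrative only).
# ratio:     probability of each action under the new policy divided by
#            its probability under the old policy
# advantage: estimated advantage of each action
# epsilon:   assumed clipping range (0.2 is a common choice, not from the handout)
ppo_clip_objective <- function(ratio, advantage, epsilon = 0.2) {
  # Keep the ratio inside [1 - epsilon, 1 + epsilon]
  clipped_ratio <- pmin(pmax(ratio, 1 - epsilon), 1 + epsilon)
  # Take the more pessimistic of the unclipped and clipped terms, then average
  mean(pmin(ratio * advantage, clipped_ratio * advantage))
}

# Hypothetical example: one action made more likely, one made less likely
example_value <- ppo_clip_objective(ratio = c(1.5, 0.5), advantage = c(1, -1))
print(example_value)
```

The clipping removes the incentive to move the policy far from the old one in a single update, which is the inexpensive stand-in for the trust region constraint mentioned above.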
Access ChatGPT

1. Go to the OpenAI ChatGPT website: https://openai.com/blog/chatgpt/
2. Click Try ChatGPT in the lower left corner
3. Click Sign up, then enter your SU email address and click Continue
4. Create a password (at least 12 characters) and click Continue
5. Verify your email
6. In your email account, click Verify email address
7. Enter information in Tell us about you (name and birthdate)
8. Enter cell number for verification
9. Enter verification number from your cell phone
10. Click through preview messages
Iris dataset

Reference: https://archive.ics.uci.edu/ml/datasets/iris

The Iris dataset from the Machine Learning Repository at the University of California, Irvine is one of the best-known datasets for pattern recognition. The dataset contains measurements on 3 classes of the iris flower, with 50 instances of each class. The fields include:

1. sepal length in cm
2. sepal width in cm
3. petal length in cm
4. petal width in cm
5. class:
   -- Setosa
   -- Versicolor
   -- Virginica

The website https://www.datacamp.com/community/tutorials/machine-learning-in-r shows the differences across the three classes and identifies the sepal versus the petal.
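The description above can be checked against the copy of the dataset that ships with R. This is a small sketch, not part of the original handout; the variable name class_counts is an assumption chosen for illustration.

```r
# Load the copy of the iris data bundled with base R
data(iris)

# Number of rows and columns
print(dim(iris))

# Column names and types (four numeric measurements plus the Species factor)
str(iris)

# Number of instances in each class
class_counts <- table(iris$Species)
print(class_counts)
```

Note that the column names in R (Sepal.Length, Species, and so on) differ from the field labels used on the UC Irvine page.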
R and RStudio Visualization

You can use R or RStudio for this example. Techniques listed here can be found at: https://gist.github.com/ChristianEllis/4f43e0b81551aa7147681fa9dd13df7f

Let's now use RStudio to visualize the data.

1. Open RStudio
2. The left side of RStudio is the Console where we will enter commands
3. The top right of RStudio will list datasets and history of commands
4. In the following instructions, RStudio commands are shown on their own indented lines
5. Load the data by entering the following command in the console

    data(iris)

6. Click on iris on the right to see the data (note the labels are different from UC Irvine)
   a. Sepal.Length
   b. Sepal.Width
   c. Petal.Length
   d. Petal.Width
   e. Species
7. Load the plotting routine

    install.packages("ggplot2", dependencies = TRUE)
    library(ggplot2)

8. The data has 150 rows.
9. Extract the first 100 rows for two species

    irisSubset2D <- iris[1:100, c("Sepal.Length", "Sepal.Width", "Species")]
10. Plot the data with ggplot (copy the entire command)

    ggplot(irisSubset2D, aes(x = Sepal.Length, y = Sepal.Width)) +
      geom_point(aes(colour = Species, shape = Species), size = 3) +
      xlab("sepal length") +
      ylab("sepal width") +
      ggtitle("Species vs Sepal Length and Sepal Width") +
      theme(plot.title = element_text(hjust = 0.5))
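Step 9 relies on the rows of iris being ordered by species, so that rows 1 to 100 hold exactly two of them. A quick sketch of how to confirm that before plotting (the variable name species_counts is an assumption, not from the handout):

```r
data(iris)

# Same subset as in step 9
irisSubset2D <- iris[1:100, c("Sepal.Length", "Sepal.Width", "Species")]

# Count rows per species; the third species should have a count of zero
species_counts <- table(irisSubset2D$Species)
print(species_counts)
```

The Species column still lists all three levels even though one has no rows, which is why the table shows a zero rather than omitting it.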
Graphing in R using ChatGPT

1. Type in the "Send a message" box to start a conversation with ChatGPT
2. For a data analysis example, copy the following to the Send a message box

    Generate R Code for a ggplot2 graphic that allows viewers to get an overview of the iris dataset

3. ChatGPT Answer:
4. It is possible for ChatGPT to generate different answers in different conversations
5. ChatGPT assumed you have installed the ggplot2 package. Install the package on your device if you do not have it yet.

    install.packages("ggplot2")

6. Copy the commands into R by clicking on the "Copy code" option in ChatGPT, then pasting into R:

    library(ggplot2)
    data(iris)
    ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
      geom_point() +
      ggtitle("Overview of the Iris Dataset") +
      xlab("Sepal Length") +
      ylab("Sepal Width") +
      theme_classic()
7. Here is the graph ChatGPT generated for us.
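A scatter plot of two columns is only one view of the data. As a numeric cross-check on what the plot suggests, the per-species means of all four measurements can be tabulated with base R. This sketch is an addition, not ChatGPT's answer; the name species_means is an assumption.

```r
data(iris)

# Mean of every measurement column, grouped by Species
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(species_means)
```

Comparing the Sepal.Length and Sepal.Width columns of this table with the clusters in the plot is a simple way to sanity-check code that ChatGPT produced.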
Graphing in R using ChatGPT for reduced data

1. Type in the "Send a message" box to start a conversation with ChatGPT
2. For a data analysis example, copy the following to the Send a message box

    Generate R Code for a ggplot2 graphic that allows viewers to get an overview of the iris dataset for species setosa or versicolor

3. ChatGPT Answer:
4. This generates the code:

    # Load required library
    library(ggplot2)

    # Load iris dataset
    data(iris)

    # Filter the dataset for species setosa or versicolor
    filtered_data <- iris[iris$Species %in% c("setosa", "versicolor"), ]

    # Create the ggplot graphic
    ggplot(data = filtered_data, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
      geom_point() +
      facet_wrap(~ Species) +
      labs(x = "Sepal Length", y = "Sepal Width", color = "Species") +
      theme_minimal()

5. Running the code generates:
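ChatGPT filtered by species name, whereas the earlier manual example took rows 1 to 100. The sketch below checks that the two approaches select the same rows in this dataset and shows that the unused third level stays attached to the Species factor until it is dropped. The names by_name, by_index, same_rows, and trimmed are assumptions introduced for illustration.

```r
data(iris)

# Filter by species name (ChatGPT's approach)
by_name <- iris[iris$Species %in% c("setosa", "versicolor"), ]

# Filter by row position (the manual approach used earlier)
by_index <- iris[1:100, ]

# Both approaches should give the same data frame for this dataset
same_rows <- isTRUE(all.equal(by_name, by_index))
print(same_rows)

# The factor still carries all three levels after filtering
print(nlevels(by_name$Species))

# droplevels() removes levels that no longer occur in the data
trimmed <- droplevels(by_name)
print(nlevels(trimmed$Species))
```

Filtering by name is the more robust of the two, since it does not depend on the row order of the data.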
ChatGPT for neural networks

1. In this example, we will create a neural network for the iris data.
2. Type in the "Send a message" box to start a conversation with ChatGPT
3. Copy the following command to the Send a message box

    Generate R code to run a neural network on iris data for Species with Petal.Width and Petal.Length and plot the graph

4. ChatGPT Answer:
5. ChatGPT assumed you have installed the neuralnet package. Install the package on your device if you do not have it yet.

    install.packages("neuralnet")
    library(neuralnet)

6. Copy the commands into RStudio by clicking on "Copy code":

    # load the iris dataset
    data(iris)

    # extract the Petal.Width, Petal.Length, and Species columns
    x <- iris[, c("Petal.Length", "Petal.Width")]
    y <- iris[, "Species"]

    # convert the species column to a numeric factor
    y <- as.numeric(factor(y))

    # scale the input variables
    x <- scale(x)

    # create a neural network with one hidden layer of 5 neurons
    library(neuralnet)
    net <- neuralnet(y ~ Petal.Length + Petal.Width, data = iris, hidden = 5)

    # plot the neural network
    plot(net)

7. The plot will display:
8. What did ChatGPT do correctly? What did it do incorrectly?
9. Let's run the neural network again, but send ChatGPT the following message:

    Generate R code to run a neural network on iris data for Species as factor with Petal.Width and Petal.Length and plot the graph

10. ChatGPT Answer:
11. Copy the commands into RStudio by clicking on "Copy code":

    # load the iris dataset
    data(iris)

    # extract the Petal.Width, Petal.Length, and Species columns
    x <- iris[, c("Petal.Length", "Petal.Width")]
    y <- iris[, "Species"]

    # convert the Species column to a factor
    y <- as.factor(y)

    # scale the input variables
    x <- scale(x)

    # create a neural network with one hidden layer of 5 neurons
    library(neuralnet)
    net <- neuralnet(y ~ Petal.Length + Petal.Width, data = iris, hidden = 5)

    # plot the neural network
    plot(net)

12. The plot will display:
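The two prompts differ only in the words "as factor", and the generated code differs only in how y is encoded. The sketch below uses base R alone to show what each encoding looks like, plus a one-hot encoding for comparison. The one-hot version is not in the handout and is included as an assumption about a common alternative; the names numeric_codes, still_factor, and one_hot are illustrative.

```r
data(iris)
y <- iris[, "Species"]

# First answer: species become the numbers 1, 2, 3, which implies an
# ordering and spacing between classes that the flowers do not have
numeric_codes <- as.numeric(factor(y))
print(unique(numeric_codes))

# Second answer: y is already a factor, so as.factor() leaves it unchanged
still_factor <- as.factor(y)
print(identical(still_factor, y))

# Alternative (not from the handout): one 0/1 indicator column per class
one_hot <- model.matrix(~ Species - 1, data = iris)
print(head(one_hot, 3))
```

Looking at the encoded target this way may help when answering step 8.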
ChatGPT without specifying X-variables

1. In this example, we will not specify the Petal.Width and Petal.Length as x-variables
2. Enter the following message in Send a message

    Generate R code to run a neural network on iris data for Species as factor

3. ChatGPT generates a different result including accuracy measures
4. Click on Copy code

    # load the iris dataset
    data(iris)

    # extract the Petal.Width, Petal.Length, and Species columns
    x <- iris[, c("Petal.Length", "Petal.Width")]
    y <- iris[, "Species"]

    # convert the Species column to a factor
    y <- as.factor(y)

    # scale the input variables
    x <- scale(x)

    # create a training and testing dataset
    set.seed(123) # for reproducibility
    train <- sample(nrow(iris), 0.7 * nrow(iris)) # 70% of the data for training
    x_train <- x[train, ]
    y_train <- y[train]
    x_test <- x[-train, ]
    y_test <- y[-train]

    # create a neural network with one hidden layer of 5 neurons
    library(neuralnet)
    net <- neuralnet(y_train ~ Petal.Length + Petal.Width, data = data.frame(x_train, y_train), hidden = 5)

    # make predictions on the testing dataset
    predictions <- predict(net, x_test)

    # convert the predictions to the corresponding species
    predicted_species <- levels(y)[apply(predictions, 1, which.max)]

    # calculate the accuracy of the predictions
    accuracy <- sum(predicted_species == y_test) / length(y_test)
    print(paste0("Accuracy: ", round(accuracy, 3) * 100, "%"))

5. What did it do differently?
   a. Only used 2 x-variables
   b. Generated accuracy measure
   c. Did not generate plot
6. Run the code in R

    [1] "Accuracy: 97.8%"
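A single accuracy figure hides which species were confused with which. The sketch below shows how a confusion matrix could be built on the same 70/30 split using table(). To stay runnable without the neuralnet package, it uses a hypothetical stand-in classifier, classify_by_petal, with hand-picked thresholds (2.5 and 1.75) that are assumptions and not part of the handout; with the real model, predicted_species from step 4 would take its place.

```r
data(iris)

# Same style of 70/30 split as in the generated code
set.seed(123)
train <- sample(nrow(iris), round(0.7 * nrow(iris)))
test_data <- iris[-train, ]

# Hypothetical stand-in classifier with hand-picked thresholds
# (an assumption for illustration, not the neural network)
classify_by_petal <- function(petal_length, petal_width) {
  label <- ifelse(petal_length < 2.5, "setosa",
                  ifelse(petal_width < 1.75, "versicolor", "virginica"))
  factor(label, levels = levels(iris$Species))
}

predicted_species <- classify_by_petal(test_data$Petal.Length,
                                       test_data$Petal.Width)

# Rows are predictions, columns are true labels
confusion <- table(predicted = predicted_species, actual = test_data$Species)
print(confusion)

# Accuracy is the share of cases on the diagonal
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)
```

Off-diagonal cells show exactly which test flowers were misclassified, which a single percentage cannot.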
ChatGPT with "New chat"

1. ChatGPT uses your chat history to tailor answers to additional requests
2. For this example, click on New chat
3. This assumes no history
4. In this example, we will not specify the Petal.Width and Petal.Length as x-variables
5. Enter the following message in Send a message

    Generate R code to run a neural network on iris data for Species as factor

6. ChatGPT generates a different result including all four x-variables and accuracy measures
7. Click on Copy code

    library(neuralnet)
    data(iris)

    # set the seed for reproducibility
    set.seed(123)

    # create a training and testing dataset
    trainIndex <- sample(1:nrow(iris), 0.7 * nrow(iris))
    trainData <- iris[trainIndex, ]
    testData <- iris[-trainIndex, ]

    # create a neural network model with one hidden layer
    model <- neuralnet(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, trainData, hidden = 5)

    # make predictions on the test dataset
    predictions <- predict(model, testData)

    # convert the predicted values to class labels
    predictedClasses <- apply(predictions, 1, which.max)

    # compare the predicted classes with the actual classes in the test dataset
    accuracy <- mean(predictedClasses == testData$Species)
    print(paste("Accuracy:", accuracy))

8. Run the code in R
9. What happened? Algorithm did not converge.
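Separately from the convergence problem, the code in step 7 compares the integer column positions returned by which.max directly against the Species factor, whereas the earlier answer first mapped positions to labels with levels(). The sketch below uses small hypothetical vectors (actual and predicted_index are made up for illustration) to show why that mapping matters.

```r
# Hypothetical true labels and hypothetical which.max() results
actual <- factor(c("setosa", "versicolor", "virginica", "setosa"))
predicted_index <- c(1, 2, 3, 1)

# Comparing numbers with a factor compares against the label text,
# so a number such as 1 never equals "setosa"
naive_accuracy <- mean(predicted_index == actual)
print(naive_accuracy)

# Map each position to its label first, then compare
predicted_label <- levels(actual)[predicted_index]
mapped_accuracy <- mean(predicted_label == actual)
print(mapped_accuracy)
```

Even if the network had converged, the accuracy line in step 7 would therefore deserve a second look before its result is trusted.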
Support Vector Machines

Background

Support vector machines are another technique for classification. In this example we will use an SVM to classify the iris data.

ChatGPT for SVM

1. For this example, click on New chat
2. In this example, we will not specify the Petal.Width and Petal.Length as x-variables
3. Enter the following message in Send a message

    Generate R code to run a SVM on iris data for Species as factor

4. ChatGPT responds with:
5. Click on Copy code

    # Load required library
    library(e1071)

    # Load iris dataset
    data(iris)

    # Convert the "Species" column to a factor
    iris$Species <- as.factor(iris$Species)

    # Split the data into training and testing sets
    set.seed(123)
    train_indices <- sample(1:nrow(iris), 0.7 * nrow(iris))
    train_data <- iris[train_indices, ]
    test_data <- iris[-train_indices, ]

    # Train the SVM model
    svm_model <- svm(Species ~ ., data = train_data)

    # Make predictions on the test data using the trained model
    svm_predictions <- predict(svm_model, test_data[, -5])

    # Compare the predicted classes with the actual classes
    accuracy <- mean(svm_predictions == test_data$Species)
    cat("Accuracy:", accuracy, "\n")

6. Generating the result:

    Accuracy: 0.9777778
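The accuracy above comes from one random 70/30 split, so it depends on the seed. One way to get a less split-dependent estimate is k-fold cross-validation, which the svm function in e1071 supports through its cross argument. The sketch below is an addition to the handout; it assumes the e1071 package is installed and skips the fit otherwise, and the name cv_model and the choice of 10 folds are assumptions.

```r
data(iris)

if (requireNamespace("e1071", quietly = TRUE)) {
  set.seed(123)

  # Fit on all rows and ask for 10-fold cross-validation
  cv_model <- e1071::svm(Species ~ ., data = iris, cross = 10)

  # Accuracy of each fold and the overall figure, both in percent
  cat("Per-fold accuracy (%):", round(cv_model$accuracies, 1), "\n")
  cat("Overall CV accuracy (%):", round(cv_model$tot.accuracy, 1), "\n")
} else {
  message("Package e1071 is not installed; skipping the SVM example.")
}
```

The spread of the per-fold values gives a rough sense of how much a single-split figure could vary.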
ChatGPT for logistic regression – Titanic data

1. In this example, we will create a logistic regression for the titanic data.
2. First, import the titanic dataset from the spreadsheet titanic.xlsx; you can use R or RStudio.
3. Type in the "Send a message" box to start a conversation with ChatGPT
4. Copy the following command to the Send a message box

    Generate R code to run a logistic regression on titanic data for survived as a function of age and gender

5. ChatGPT Answer:
6. ChatGPT continues with the code required:
7. ChatGPT then describes the analysis steps:
8. Click on the Copy code option in the upper right corner:

    # Install and load required packages
    if (!require("titanic")) install.packages("titanic", dependencies=TRUE)
    library(titanic)

    # Load the Titanic dataset
    data("titanic_train")
    titanic_data <- titanic_train

    # Check the structure of the dataset
    str(titanic_data)

    # Keep only the relevant variables
    titanic_data <- titanic_data[, c("Survived", "Age", "Sex")]

    # Drop missing values
    titanic_data <- na.omit(titanic_data)

    # Convert "Sex" variable to a factor
    titanic_data$Sex <- as.factor(titanic_data$Sex)

    # Perform logistic regression
    logistic_model <- glm(Survived ~ Age + Sex, data = titanic_data, family = "binomial")

    # Display summary of the logistic regression model
    summary(logistic_model)
9. Paste the code into R or RStudio and run.
10. Result
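The summary of a logistic regression reports coefficients on the log-odds scale, which can be hard to read directly. The sketch below, which is an addition and not part of ChatGPT's answer, refits the same model and converts the coefficients to odds ratios, then predicts survival probabilities for two hypothetical 30-year-old passengers. It assumes the titanic package is installed and skips the example otherwise; odds_ratios, new_passengers, and predicted_prob are illustrative names.

```r
if (requireNamespace("titanic", quietly = TRUE)) {
  # Load and prepare the data as in the generated code
  data("titanic_train", package = "titanic")
  titanic_data <- na.omit(titanic_train[, c("Survived", "Age", "Sex")])
  titanic_data$Sex <- as.factor(titanic_data$Sex)

  logistic_model <- glm(Survived ~ Age + Sex, data = titanic_data,
                        family = "binomial")

  # Exponentiated coefficients are odds ratios: values below 1 lower
  # the odds of survival, values above 1 raise them
  odds_ratios <- exp(coef(logistic_model))
  print(round(odds_ratios, 3))

  # Hypothetical passengers, made up for illustration
  new_passengers <- data.frame(
    Age = c(30, 30),
    Sex = factor(c("female", "male"), levels = levels(titanic_data$Sex))
  )

  # type = "response" returns probabilities rather than log-odds
  predicted_prob <- predict(logistic_model, newdata = new_passengers,
                            type = "response")
  print(round(predicted_prob, 3))
} else {
  message("Package titanic is not installed; skipping the example.")
}
```

Note that this uses the titanic_train data from the package, as ChatGPT's code does, rather than the titanic.xlsx spreadsheet mentioned in step 2, so results may differ from a model fit on the spreadsheet.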