How I use ChatGPT

Author

Joseph Noonan

Published

September 30, 2023

Introduction

Since ChatGPT’s release in December 2022, the way I think about and solve problems has radically changed. Over the last nine months, there has been a flood of posts, tweets, and examples of how to use this amazing tool. I don’t have much unique to add, but I wanted to put something out there about what has worked for me.

People often both over- and underestimate the power of using large language models (LLMs) as part of a work process. The first instinct is to have the AI ‘do your work for you’. You write a question like, “Using World Bank data, make me a chart of GDP growth from 1960 to 2020”. This doesn’t work. The question is too vague and the specifications too broad. ChatGPT will respond, most likely by telling you the question is too difficult for it to answer. LLMs may produce content that has the appearance of finished work but is often incorrect, misses small details, or doesn’t really follow the prompt. After the initial amazement, some people will point to these errors to show the fallibility of the technology. The result is a sort of textual uncanny valley, where the output takes the form of authoritative text in tone, style, and structure, but can be filled with small mistakes.

The general approach

I find it best to view LLMs like a coworker who is extremely good at specific small tasks, but much worse at thinking about the big picture, while also tending to be confidently incorrect. This means the tool is most useful when you (1) know what you need it to do and (2) have the skills to identify errors and make small changes to complete the task. You then work iteratively with ChatGPT to chip away at the problem and get to a usable solution. It often works best to break big tasks down into small, specific ones.

The rest of this post consists of examples of problems I have faced and how I have used LLMs to help solve them. Many (but not all) of these problems are related to writing code in R. This post is also aimed at social scientists rather than R programmers. While I consider myself extremely competent in using R for a social scientist, I am not a computer programmer; my focus is getting a specific task done, even if it could be done better or more efficiently.

Example 1: Downloading multiple csv files from a website

I wanted to download all of the Swedish roll call votes from 1993 to 2022. Luckily, the Swedish parliament provides these votes in a clean, easily accessible format on its website. However, it splits the votes up by parliamentary session, providing 29 separate csv files.

I knew how to solve this. You can write some code using the ‘rvest’ package to find the .csv links and then use download.file() to download them. But writing the actual code takes time! So I just asked ChatGPT to do it, feeding it the HTML of the website.

From this prompt, ChatGPT suggested the code below. Note that my message to ChatGPT was truncated: I included a few hundred lines of HTML and, importantly, the sections where the .csv files were referenced.

library(rvest)
library(tidyverse)

# Step 1: Read the HTML content
html_content <- read_html("<paste your html here>")  # Replace with your actual HTML content as a string

# Step 2: Extract CSV URLs
csv_links <- html_content %>%
  html_nodes(xpath = "//a[contains(@href, 'csv.zip')]") %>%
  html_attr("href")

# Step 3: Build complete URLs (adding the missing part of the URL)
complete_urls <- paste0("http:", csv_links)

# Step 4: Download CSV files
for(url in complete_urls) {
  download.file(url, destfile = basename(url))
}

# At this point, the CSV files will be downloaded in the working directory

This code basically works right out of the box. The only adjustment you need to make is to give read_html() the actual URL instead of pasting in the HTML text. In addition to working immediately, ChatGPT can help you understand what the code is doing. I wasn’t familiar with the xpath argument, so I asked ChatGPT what it did.

This gives me an overview of what is happening in the code. However, I would say that this is the “danger zone” for mistakes and inaccuracies. Since this is new material for me, I cannot assess whether the explanation is accurate. If I planned on using this argument more, I would experiment with it, both to learn how it works and to see whether ChatGPT’s explanations are correct.
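A quick way to run that kind of experiment is to try the XPath on a toy snippet where you already know the right answer. Below is a minimal sketch; the HTML and link names are made up for illustration:

```r
library(rvest)

# A tiny, made-up snippet with one matching and one non-matching link
toy <- read_html('
  <a href="/files/votering-1993.csv.zip">Votes 1993/94</a>
  <a href="/files/readme.txt">Readme</a>
')

# The XPath should pick out only the link whose href contains "csv.zip"
toy %>%
  html_nodes(xpath = "//a[contains(@href, 'csv.zip')]") %>%
  html_attr("href")
#> "/files/votering-1993.csv.zip"
```

If the second link showed up too, you would know either the XPath or the explanation of it was wrong.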

Example 2: Implementing parallel processing

As part of my job, I had to process a large number of PDFs, converting each page to a single image file. When you are processing a ton of data, the time each operation takes starts to matter a lot. For instance, if you are turning 40,000 pages from a PDF into single image files, and processing each page takes one second, the entire operation will take 11.1 hours. Optimization isn’t something you really learn about in a bog-standard political science education, and I don’t know good principles for doing it. I knew my first step was to write a function that does this using functions from ‘magick’, an image-processing package. You can see the function below.1

# Define a function called `page_splitter`.
page_splitter <- function(pdf, page, image_dir){
  
  # Use `image_read_pdf` to read a specific page from the pdf file with a resolution of 600 DPI.
  # `pdf` is the file path to the PDF, and `page` specifies the page number to read.
  image_read_pdf(pdf, pages = page, density = 600) %>% 
  
    # Use `image_deskew` to correct the skewness of the read image by rotating it.
    image_deskew() %>%
    
    # Use `image_write` to write the processed image to the specified directory `image_dir`.
    # The file path is created by concatenating the current working directory (`getwd()`) with `image_dir`.
    # The output image is written with a resolution of 600 DPI.
    image_write(., paste0(getwd(), "/", image_dir), density = 600)
}

I then put this through a standard for loop. But it was much too slow. So I asked ChatGPT how to implement parallel processing. I had heard of the concept, but had zero idea how to implement it. The overall idea of parallel processing is that you use more than one processing core at the same time. In the ideal case this leads to a linear decrease in time: if processing one image takes one second on one core, each additional core divides the total time required. This is obviously a simplification (I am a political scientist, not a computational scientist).
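The back-of-the-envelope arithmetic is easy to sketch out; the core count below is just an assumption for illustration:

```r
pages <- 40000
sec_per_page <- 1

# One core: 40,000 seconds is about 11.1 hours
pages * sec_per_page / 3600
#> 11.11111

# With 8 cores and a perfectly linear speedup (the simplification above)
cores <- 8
pages * sec_per_page / cores / 3600
#> 1.388889
```

In practice the speedup is less than linear because of the overhead of setting up workers and moving data, but even so the savings are substantial.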

In response, ChatGPT provided me with the code below, along with an explanation of how it works.

# Load necessary libraries
library(foreach)
library(doParallel)
library(magick)

# Register the parallel backend
no_cores <- detectCores() # detect the number of cores
cl <- makeCluster(no_cores)
registerDoParallel(cl)

# Define the page_splitter function
page_splitter <- function(pdf, page, image_dir) {
  image_read_pdf(pdf, pages = page, density = 600) %>% 
    image_deskew() %>%
    image_write(., paste0(getwd(), "/", image_dir), density = 600)
}

# Use foreach to apply the page_splitter function to each row in df in parallel
results <- foreach(i = 1:nrow(df), .packages = 'magick') %dopar% {
  page_splitter(df$pdf[i], df$page[i], df$image_dir[i])
}

# Stop the cluster
stopCluster(cl)

In this code:

  • detectCores() is used to find out the number of cores on your machine.

  • makeCluster(no_cores) creates a cluster with the same number of workers as the number of cores.

  • registerDoParallel(cl) registers the cluster for use with foreach.

  • foreach is used to apply the page_splitter function to each row of df in parallel. %dopar% is used for parallel execution.

  • stopCluster(cl) is used to stop the cluster after the parallel computation is done.

This code generally looks good, but there are a couple of oddities. First, the code defines page_splitter after registering the parallel backend. I have no idea if this makes any real difference, but it doesn’t feel intuitive. More importantly, and what I think would break this code, is that it assigns the results of the parallelized loop to an object called results. This doesn’t make sense, because the purpose of the code is to save PDF pages as images, not to collect return values. Luckily, ChatGPT, unlike a lot of people, is quick to admit its mistakes when you point them out.2
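Putting both fixes together, a cleaned-up version of the loop might look like this — a sketch, reusing the page_splitter function and the df data frame from above:

```r
library(foreach)
library(doParallel)
library(magick)

# Define page_splitter first, before touching the parallel backend
page_splitter <- function(pdf, page, image_dir) {
  image_read_pdf(pdf, pages = page, density = 600) %>%
    image_deskew() %>%
    image_write(paste0(getwd(), "/", image_dir), density = 600)
}

# Register the parallel backend
cl <- makeCluster(detectCores())
registerDoParallel(cl)

# No assignment: each iteration writes an image file as a side effect.
# invisible() suppresses the list of return values that foreach produces.
invisible(
  foreach(i = 1:nrow(df), .packages = "magick") %dopar% {
    page_splitter(df$pdf[i], df$page[i], df$image_dir[i])
  }
)

stopCluster(cl)
```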

This example again highlights the two principles of using ChatGPT in your workflow. You need to know what to ask it, and you need to be good enough at what you are doing to find and correct the inevitable errors. The requests have to be fairly specific. You can ask more general questions, like how to optimize code, which can lead you down the path of asking how to do parallel processing, but in my experience ChatGPT thrives on very specific instructions.

Example 3: Commenting on code

ChatGPT isn’t just good at writing new code, it is also good at commenting on existing code. Let’s say you have some uncommented code and you need to share it with others. Say, for example, you are writing a blog post on how to use ChatGPT in your workflow, and you want your code to be well commented so that people can understand it.

Here, I just asked ChatGPT if it could add inline comments to the code below. It is important that you tell ChatGPT to add inline comments; otherwise you might end up with general comments about the code as a whole.


ocr_tool <- function(image_dir, text_dir, data_dir){

  # Apply OCR using tesseract with the swe language and the valid_chrs option
  tesseract::ocr(image_dir, engine = tesseract("swe", options = list(tessedit_char_whitelist = valid_chrs))) %>%
    # Write the output to the text_dir column
    writeLines(., paste0(text_dir))

  tesseract::ocr_data(image_dir, engine = tesseract("swe", options = list(tessedit_char_whitelist = valid_chrs))) %>%
    write_csv(., paste0(data_dir))

  print('DONE')
}

ChatGPT has added comments to every step of the operation. Some of this might be excessive for day-to-day use. Do we really need an explanation of what print("DONE") does? Probably not. But for showing code to people who are new to R, these types of comments can be super useful. They can also be very useful for explaining code to people who need to know what the code does but don’t actually know R.

# Define a function called `ocr_tool`.
ocr_tool <- function(image_dir, text_dir, data_dir) {

  # Apply OCR on the image files located in `image_dir` using the Tesseract OCR engine.
  # Set the language for OCR to Swedish ('swe') and apply the whitelist character options (`valid_chrs`).
  tesseract::ocr(image_dir, engine = tesseract("swe", options = list(tessedit_char_whitelist = valid_chrs))) %>%
  
    # Write the OCR output (extracted text) to the location specified by `text_dir`.
    writeLines(., paste0(text_dir))

  # Again, apply OCR on the image files located in `image_dir` to get additional data (such as bounding boxes).
  # Use the same Tesseract engine settings as before.
  tesseract::ocr_data(image_dir, engine = tesseract("swe", options = list(tessedit_char_whitelist = valid_chrs))) %>%
  
    # Write the additional OCR data as a CSV file to the location specified by `data_dir`.
    write_csv(., paste0(data_dir))

  # Print 'DONE' to the console to indicate that the function has completed its execution.
  print('DONE')
}

Example 4: Writing regex

Regular expressions (regex) are sequences of characters used for pattern matching within strings, allowing you to search, extract, and manipulate text based on specific patterns.3 See the image below for an example. I have always disliked writing regex; it is hard for me to parse and write. But it is super important! Without regex, it is impossible to work with textual data.

Luckily, ChatGPT is really good at writing regex. I am not sure if this is because ChatGPT is an LLM that “understands” text, or just because it is trained on hundreds of thousands of StackOverflow posts from people desperately asking for help with whatever task has forced them to use regex. I am 99% sure it is the latter.

Let’s take a real-world example. Below is an OCRed version of a book documenting the 1970 session of the Swedish parliament. It is a list of MPs: their seat numbers, names, number of years in parliament, and party. I want to structure the section starting after “Stockholms stad:” (Stockholm City). This is pretty structured text, with each new case even being semicolon-separated!

Riksdagens ledamöter 1970

Första kammaren

Talman: E. C:son Boheman
Förste vice talman: A. W. Strand
Andre vice talman: N. I. Johansson

Stockholms stad: 1. A. W. Strand (35) s;
2. N. G. Helén (15) fp; 3. B. G. Bohman
(12) m; 4. U. G. Lindström (25) s; 5.
P. 0. R. Dahlén (12) fp; 6. K. B. M.
Johansson (14) s; 7. J. A. Hernelius (8) m;
8. A. Myrdal (8) s; 9. J. Sörenson (6) fp;

The first thing I want to do is use that semicolon to split the text into individual cases.

You can easily do this with str_split from stringr.4

# Load the stringr package
library(stringr)

# Text
text <- "Stockholms stad: 1. A. W. Strand (35) s; 2. N. G. Helén (15) fp; 3. B. G. Bohman (12) m; 4. U. G. Lindström (25) s; 5. P. 0. R. Dahlén (12) fp; 6. K. B. M. Johansson (14) s; 7. J. A. Hernelius (8) m; 8. A. Myrdal (8) s; 9. J. Sörenson (6) fp;"

# Split the text by semicolon
split_text <- str_split(text, "; ")

# The result is a list with one element, converting it to a vector
split_text <- unlist(split_text)

# Viewing the split text
print(split_text)
[1] "Stockholms stad: 1. A. W. Strand (35) s"
[2] "2. N. G. Helén (15) fp"                 
[3] "3. B. G. Bohman (12) m"                 
[4] "4. U. G. Lindström (25) s"              
[5] "5. P. 0. R. Dahlén (12) fp"             
[6] "6. K. B. M. Johansson (14) s"           
[7] "7. J. A. Hernelius (8) m"               
[8] "8. A. Myrdal (8) s"                     
[9] "9. J. Sörenson (6) fp;"                 

Now we have the cases in a nice, clean format, with one element of the vector corresponding to one MP, even though the first case has some extra junk in the text. With this in hand, we can ask ChatGPT to help us out.

Here, I am formulating my question in a specific way: I am describing, in words, the conditions for the regex I need, not simply saying “extract the seats, parties, and names”. ChatGPT responded with the following code, which I have cleaned up because it produced some small errors.

# Load the stringr package
library(stringr)

# Extracting elements using regex
seat_num <- str_extract(split_text, "\\b\\d+\\.")
name <- str_extract(split_text, "(?<=\\d\\. ).*?(?= \\()")
party <- word(split_text, -1)
time_in_parliament <- str_extract(split_text, "(?<=\\().*?(?=\\))") %>%
  str_replace_all("[^\\d]", "") # removing non-numeric characters

# Removing unnecessary characters
seat_num <- str_remove_all(seat_num, "\\.")
name <- str_trim(name)
party <- str_remove_all(party, "[;]")
party <- str_trim(party)

# Create a dataframe
df <- data.frame(seat_num, name, party, time_in_parliament)

knitr::kable(df) 
seat_num name party time_in_parliament
1 A. W. Strand s 35
2 N. G. Helén fp 15
3 B. G. Bohman m 12
4 U. G. Lindström s 25
5 P. 0. R. Dahlén fp 12
6 K. B. M. Johansson s 14
7 J. A. Hernelius m 8
8 A. Myrdal s 8
9 J. Sörenson fp 6
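The lookaround pattern used for the names is the piece I would be least likely to write by hand. Testing it on a single element makes it easier to trust — a quick sketch:

```r
library(stringr)

x <- "4. U. G. Lindström (25) s"

# (?<=\\d\\. )  start matching right after a digit, period, and space
# .*?           capture the name lazily
# (?= \\()      stop right before the " (" that opens the year count
str_extract(x, "(?<=\\d\\. ).*?(?= \\()")
#> "U. G. Lindström"
```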

Even with some errors and a perhaps slightly idiosyncratic approach, ChatGPT provided me with a quick way to get this job done. The parameters I set for ChatGPT are steps I would have had to work out regardless.5 You need to know what you need to do before you can start writing code. For me, this abductive style of writing quick commands to ChatGPT, correcting, then scaling up has been extremely effective.

Example 5: Making graphs using ggplot2

I have used ggplot2 since 2016, and a good chunk of some of my previous jobs was making graphs. Despite this, I don’t really have a personal ggplot2 theme. I don’t like many of the standard themes, but I have never sat down and made one for myself. I also still have trouble remembering how to manipulate very specific parts of a graph and rely on a lot of Googling when making a new one. This is where ChatGPT comes in. ChatGPT is really good with ggplot2.

library(ggplot2)

# Create a custom theme
my_theme <- function() {
  theme_minimal(base_size = 15) %+replace%
    theme(
      plot.background = element_rect(fill = "beige", color = "beige"),
      panel.background = element_rect(fill = "beige"),
      panel.grid = element_blank(),
      axis.ticks = element_blank(),
      axis.text = element_text(color = "black"),
      legend.background = element_rect(fill = "beige")
    )
}

# Generate some example data
data <- data.frame(
  category = factor(c("A", "B", "C", "D")),
  value = c(3, 5, 2, 8)
)

# Create a bar plot using the custom theme
ggplot(data, aes(x = category, y = value)) +
  geom_bar(stat = "identity") +
  my_theme()

This is exactly what I wanted. This gives me a canvas to start working with. I then iteratively added and changed features of the theme simply by copying and pasting it into the chat and asking ChatGPT to make modifications.

The speed and accuracy here are really remarkable. For me, it is easier to quickly paste in the code and write “remove y axis title” than to type axis.title.y = element_blank(). Neither of these is extremely time-consuming, but it is just faster to have ChatGPT write out all the code for you. You can even have it make aesthetic choices, like choosing colors.
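For instance, the “remove y axis title” request comes back as a one-line addition to the plot from above — a sketch, reusing the data and my_theme() defined earlier:

```r
library(ggplot2)

ggplot(data, aes(x = category, y = value)) +
  geom_bar(stat = "identity") +
  my_theme() +
  theme(axis.title.y = element_blank())  # drop the y-axis title
```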

As you can also see, once the responses reach a certain level of quality, my queries start to become lazier and more direct. When you work with ChatGPT for a long time, you get a sense of when specification is required and when you can lazily send a request that is hardly a complete sentence.

Using this method of iteratively tweaking the aesthetics, I applied the theme to a dataviz idea I had: looking at the electoral winning streaks of parties. Below is the final result. Here, you can see how the initial theme delivered by ChatGPT became the building blocks for the final graph. ChatGPT didn’t help me come up with the idea for the graph. It didn’t even really help design the graph. It simply responded to my queries and turned my ideas into usable ggplot2 code.

Example 6: Adjusting the formality in emails

I have worked in a number of international settings. One of the biggest areas of difficulty in cross-cultural communication is formality in emails. In Sweden, email culture is extremely informal: emails are short and concise. This is obviously not the case everywhere. Sometimes I will use ChatGPT as a tone modifier, letting it take drafts of emails and increase the politeness.

Even without the full message, you can see that this is a bit much. I think if I got an email asking for my “esteemed feedback”, I’d assume it was sarcastic. Luckily, you can ask ChatGPT to adjust.

There we go. This strikes a balance of formal and polite without sounding like a merchant addressing a nobleman.

Changing the formality of emails isn’t too hard (in your native language), but it still requires a bit of work and thinking. When I am feeling uninspired about reworking emails, I find that asking ChatGPT for help can break down some of the “writer’s block” I get when working within the constraints of high formality and politeness. I also think it would be a bit weird to do this for all emails. It is a strange future if we all lose our personal voice in written communication. But then again, formal rules and practices of politeness already do this for these types of emails anyway.

Example 7: Boilerplate language and templates

Many written texts follow a given format and set of conventions. This stuff isn’t difficult to write, but it can still take time, especially if you aren’t used to writing these types of notes. In the example below, I asked for a template with common phrases for writing an acknowledgment note in an academic paper. I also instructed ChatGPT to keep the note short – the replies can ramble a lot if you don’t condition them by explicitly stating the length or scope of your request.

This is a good starting point. I would probably remove some of the overly expressive adjectives, and ChatGPT is laying it on a bit thick in its praise of the funders. But still, this gives us a good sense of the scope of an acknowledgment note.

Another example is writing an abstract. There are better guides than ChatGPT, but again, quick and easy, and all in one place.

Conclusion

I’ve highlighted the main ways I have used ChatGPT in my workflow. The vast majority of the help I have gotten from this software is support for technical tasks in R, which is reflected in the examples. These generally follow the same structure: (1) know what you need it to do, (2) have the skills to identify errors and make small changes to complete the task, and (3) go back and forth with ChatGPT to come up with a workable solution.

I have always been skeptical of overambitious claims of technological advancement. There is a tremendous amount of hype around AI, and much of it echoes other overpromoted trends like NFTs. I understand the skepticism around this new technology and the lofty claims of how much it will “change things”. But from my experience, this is the first time I have really felt that my way of working and problem solving was dramatically changed by a technological advancement.

Footnotes

  1. The in-line comments were generated by ChatGPT; see Example 3: Commenting on code for more information.↩︎

  2. I actually made a mistake in my correction. I should have written saving “images” not “pdfs” to the hard drive. ChatGPT still understood what I meant and was kind enough to not point out my error.↩︎

  3. I had ChatGPT write this description.↩︎

  4. I know this function well now, but I am pretty sure I learned about it through conversations with ChatGPT.↩︎

  5. ChatGPT helped me reword this sentence.↩︎