Bender’s Composition

Every once in a while, Bender proclaims that he is ‘40 percent’ something. Here is everything he has claimed to be made of:

Benders Composition
im 40% zinc
my bodys 40% titanium im finally richer than those snooty atm machines
im 40% dolomite oh its hot its very hot
oh no im 40% lucky the scrap metal im made from included a truckload of horseshoes from the luckiest racehorses in mexico who had just been sent to a glue factory
im 40% scrap metal
im 40% wire
im 40% empty also what the hell is a free will slot

Code

This is based on a previous post, Futurama: Benders Top 10 Words.

library(httr)
library(XML)
library(kableExtra)

base_site <- "http://theinfosphere.org"

## Get links
# XPath provided by SelectorGadget; /@href was appended because only the link target is needed
linkXPath <- "//*[contains(concat( ' ', @class, ' ' ), concat( ' ', 'oLeft', ' ' ))]//a/@href"
# Path to transcript listings
ts_path <- "Episode_Transcript_Listing"
# Fetch the listing page
pageL <- GET(base_site, path=ts_path)
# Parse the response body as HTML
h <- htmlParse(content(pageL, as="text", encoding="UTF-8"), asText=TRUE)
# Select the href attribute of every transcript link
hLinks <- getNodeSet(h, linkXPath)
# Convert to character
hLinks <- as.character(hLinks)

# This function cleans up the paragraphs of one episode
getBenDialog <- function(diag) {
  # Keep only the paragraphs spoken by Bender
  diag <- diag[grepl("Bender: ", diag)]
  # From the end of "Bender: " to the character before the first \n
  found <- regexpr("(?<=Bender: ).*?(?=\\n)", diag, perl=TRUE)
  # regmatches() returns only the matched strings, so assign them by position
  # to keep every line aligned; paragraphs without a match become NA
  matched <- found != -1
  diag[matched] <- regmatches(diag, found)
  diag[!matched] <- NA
  # Remove anything between square brackets [], regex "\\[[^\\]]*\\]"
  diag <- gsub("\\[[^\\]]*\\]", "", diag, perl=TRUE)
  diag <- gsub("[^[:alnum:][:space:]%]", "", diag) # Remove punctuation, but keep %
  diag <- gsub("\\s+", " ", diag) # Collapse runs of whitespace
  diag <- gsub("^\\s+|\\s+$", "", diag) # Remove leading/trailing whitespace
  diag <- tolower(diag)
  return(diag)
}

# Output
benDialog <- NULL
# Loop over each episode and get data
for (k in seq_along(hLinks)) {
  # Get episode
  pageT <- GET(paste0(base_site, hLinks[k]))
  h <- htmlParse(content(pageT, as="text", encoding="UTF-8"), asText=TRUE)
  # XPath: every paragraph of the transcript
  diagXPath <- "//p"
  diag <- xpathSApply(h, diagXPath, xmlValue)
  # Process episode
  benDiag <- getBenDialog(diag)
  # Add this episode's lines to the vector
  benDialog <- c(benDialog, benDiag)
  # Be nice
  Sys.sleep(1)
}
benDialog <- benDialog[!is.na(benDialog)]
benDialog <- unique(benDialog)

knitr::kable(data.frame(benderComposition=benDialog[grepl("40%",benDialog)])) %>% kable_styling(bootstrap_options = c("striped"))
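
To try the cleaning steps without fetching any transcripts, here is a minimal sketch that applies the same sequence of steps as getBenDialog to a few made-up paragraphs. The helper name clean_bender_lines and the sample text are assumptions for illustration only; they are not part of the original script, and the sample lines are not real transcript data.

```r
# Hypothetical helper mirroring the cleaning steps of getBenDialog (base R only)
clean_bender_lines <- function(paragraphs) {
  # Keep only the paragraphs spoken by Bender
  paragraphs <- paragraphs[grepl("Bender: ", paragraphs, fixed = TRUE)]
  # Capture the text after "Bender: " up to the first newline
  found <- regexpr("(?<=Bender: ).*?(?=\\n)", paragraphs, perl = TRUE)
  hit <- found != -1
  lines <- rep(NA_character_, length(paragraphs))
  lines[hit] <- regmatches(paragraphs, found)
  # Drop stage directions in square brackets
  lines <- gsub("\\[[^\\]]*\\]", "", lines, perl = TRUE)
  # Drop punctuation but keep the percent sign
  lines <- gsub("[^[:alnum:][:space:]%]", "", lines)
  # Collapse whitespace, trim the ends, lower-case
  lines <- gsub("\\s+", " ", lines)
  lines <- trimws(lines)
  tolower(lines)
}

# Made-up sample paragraphs (an assumption, not real transcript text)
sample_paragraphs <- c(
  "Fry: What are you made of?\n",
  "Bender: [Proudly.] I'm 40% zinc!\n",
  "Bender: This paragraph has no trailing newline",
  "Leela: Enough, you two.\n"
)

cleaned <- clean_bender_lines(sample_paragraphs)
cleaned <- unique(cleaned[!is.na(cleaned)])
# Keep only the "40%" claims, as in the final table
composition <- cleaned[grepl("40%", cleaned, fixed = TRUE)]
print(composition)
```

The paragraph without a trailing newline yields no match and is dropped as NA, which is the same rule the scraper applies before building the table.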