Introduction

In Canada, there really only seems to be two coffee chains, Tim Hortons and Starbucks. Each chain has fiercly loyal costumers and almost impossible to drive anywhere without seeing one or both dotting the streets and malls. This led to the question of where and how many of each are there in Canada.

Data

In order to locate Tim Hortons and Starbucks, Open Street Map (OSM) data was used. Specifically the Canada extract from GeoFabrik last modified on 2015-12-31T20:14:02Z.

Data Extraction

As in previous posts 1, 2, RPostgreSQL package will be used to connect to the database.

library(RPostgreSQL)

# Connect to database
m <- dbDriver("PostgreSQL")
con <- dbConnect(m, dbname="gis")

In order to get the list, the planet_osm_point table should be queried for partial matches to Starbucks and Tim Hortons. The names are set to uppercase for both to partial match search and values in table so letter case is not an issue. As well, the amenity should exist so that it is a cafe, restaraunt, or exists.

q <- "SELECT p.name AS name, p.amenity AS amenity, ST_X(ST_Transform(way, 4326)), ST_Y(ST_Transform(way,4326))
FROM planet_osm_point p 
WHERE (UPPER(p.name) LIKE UPPER('%Starbuck%') OR UPPER(p.name) LIKE UPPER('%Tim H%'))
AND p.amenity IS NOT NULL;"

cLoc <- dbGetQuery(con, q)

# Show data
kable(rbind(head(cLoc), tail(cLoc)), align='c')

	name	amenity	st_x	st_y
1	Tim Hortons	fast_food	-114.6626	51.79412
2	Starbucks	cafe	-114.4748	51.18928
3	Tim Hortons	cafe	-114.4725	51.18484
4	Tim Hortons	restaurant	-114.4636	49.62061
5	Starbucks	cafe	-114.2468	51.12408
6	Tim Hortons	restaurant	-114.2400	51.10743
1938	Tim Horton’s	cafe	-115.5760	51.18011
1939	Starbucks	cafe	-115.5710	51.17814
1940	Starbucks	cafe	-115.5553	51.14796
1941	Tim Horton’s	cafe	-115.3621	51.10262
1942	Starbucks	cafe	-115.3576	51.09280
1943	Starbucks	cafe	-115.0585	49.51497

The last part of the data to extract is Canadian boarders of each province.

q2 <- "SELECT (dp).path[1] AS region, (dp).path AS index, ST_Y((dp).geom) AS lat, ST_X((dp).geom) AS lon
FROM (SELECT ST_DumpPoints(ST_Transform(ST_Collect(way), 4326)) AS dp
FROM planet_osm_line where boundary='administrative' and admin_level = '4') AS foo
ORDER BY index;"

cb <- dbGetQuery(con, q2)

Data Clean Up

The data should be checked to make sure that the Tim Hortons and Starbucks are what we want.

unique(cLoc$amenity)

## [1] "fast_food"  "cafe"       "restaurant" "toilets"

Well, toilets should be checked to see if it is a Tim Hortons or not.

cLoc[cLoc$amenity == "toilets",]

##                                                                             name
## 860 150 King W Sun Life Financial PATH toilet behind Tim Hortons under escalator
##     amenity      st_x     st_y
## 860 toilets -79.38467 43.64788

This will need to be removed, seems very specific instructions for finding the toilet location as well.

cLoc <- cLoc[cLoc$amenity != "toilets",]

The last thing to check is the names:

unique(cLoc$name)

##  [1] "Tim Hortons"                                          
##  [2] "Starbucks"                                            
##  [3] "Starbuck's"                                           
##  [4] "Starbucks Coffee"                                     
##  [5] "Tim Hortons Express"                                  
##  [6] "Tim Horton's"                                         
##  [7] "Tim Horton"                                           
##  [8] "Tim Hortons in Walmart"                               
##  [9] "Tim Horton's Komoka"                                  
## [10] "Tim Hortons Drive thru only"                          
## [11] "Tim Hortons @ Mini Mac"                               
## [12] "Tim Hortons @ MUSC"                                   
## [13] "Tim Hortons/Wendy's"                                  
## [14] "Wendy's/Tim Hortons"                                  
## [15] "On The Run/Tim Hortons"                               
## [16] "On the Go Featuring: Tim Hortons Coffee & Baked Goods"
## [17] "Wendy's / Tim Hortons"                                
## [18] "Tim Hortons "                                         
## [19] "Tim Horton`s"                                         
## [20] "Tim Horton's Drive Thru"                              
## [21] "Café Starbucks"                                      
## [22] "Starbucks coffee"                                     
## [23] "Café Starbucks Coffee"                               
## [24] "Café Starbucks Coffee Peel"                          
## [25] "tim Hortons"                                          
## [26] "TIM HORTONS"                                          
## [27] "Starbuck"                                             
## [28] "starbucks"                                            
## [29] "Tim Horton's Drive-thru"                              
## [30] "Wendy's / Tim Horton's"                               
## [31] "Starbucks "

This looks good.

The last part is to identify each Tim Hortons and Starbucks to make it easier to plot and count. With 0 for Starbucks and 1 for Tim Hortons.

cLoc$identity <- NA
cLoc$identity[grepl(toupper("Star"), toupper(cLoc$name))] <- 0 # Set for Starbucks
cLoc$identity[grepl(toupper("Tim Hort"), toupper(cLoc$name))] <- 1 # Set for Tim Hortons

Plot

regions <- cb
plot(x="", xlim=c(-140,-52), ylim=c(43,63), xlab="Longitude", ylab="Latitude", main="Tim Horton and Starbucks Locations from OSM Data")
uRegions <- unique(regions$region)
# Probably can be done without the loop
for (k in 1:length(uRegions)) {
  theRegion <- uRegions[k]
  lines(regions$lon[regions$region == theRegion], regions$lat[regions$region == theRegion])
}

# Add points
points(cLoc$st_x[cLoc$identity == 1], cLoc$st_y[cLoc$identity == 1], lwd=0.5, col=rgb(red=0, green=0, blue=255, alpha=255, max=255), pch=20)
points(cLoc$st_x[cLoc$identity == 0], cLoc$st_y[cLoc$identity == 0], lwd=0.5, col=rgb(red=255, green=0, blue=0, alpha=55, max=255) , pch=20)
legend("bottomleft", legend=c("Starbucks", "Tim Hortons"), col=c(2,4), pch = c(20,20))

Summary

The number of Starbucks location from OSM Data in Canada is: 677

The number of Tim Hortons location from OSM Data in Canada is: 1265

According to OSM data, there is nearly twice as many Tim Hortons locations in Canada. Where as Starbucks tends to be located around city centers, Tim Hortons seems to be dotted around Canada.

Issues

Since OSM data is crowd sourced, it may be that Tim Hortons locations are recorded more more often than Starbucks. Again, with crowd sourced data, Tim Hortons or Starbucks may not be properly represented in smaller cities and towns. Another issue is setting amenity to be not null, we may have missed some that were not tagged.