Dallas’s Favorite Local Coffee Shops

Hello again. It’s me, Derek… from the website derekbeaton.com. I’m here to sell you some Encyclopedia Britannica.

As a PhD student I acquire sustenance from the two major food groups: beer and coffee. I also like to pretend I’m an expert in those things, but, really, I’m probably just a snob about them.

But I’m not the only snob about these things. Dallas, clearly, is really into its beer. But a little less obvious is that Dallas appears to love its coffee much the same way — craft and/or local. In fact, the coffee scene in Dallas right now parallels the beer scene about 3-4 years ago: lots and lots of bars and restaurants catering to the craft beer enthusiasts (snobs), with a handful of breweries. We have countless places to get some great local or Texas-based coffee, and we have quite a few roasters: Cultivar, Noble Coyote, Oak Cliff Coffee Roasters, Full City Rooster, Novel, White Rock (and all the ones throughout the rest of the metroplex) with more on the way.

WHICH MEANS WE HAVE THE CHANCE TO GET NERDY and statistically determine the best coffee places in Dallas. Like before, I aggregated the ratings from both Facebook and Yelp for all the local/independent coffee shops in Dallas. These primarily exist within the area bounded by 635, the Tollway, Loop 12, and Bishop Arts. Let’s take a look:

[Figure: map of local coffee shops in Dallas]
So, let’s get to some facts first. The map above only includes 24 coffee shops within Dallas. Four of those shops just don’t have enough ratings to be analyzed (Serj, Black Forest, Urban Blend, Weekend). Another shop, Coffee House Cafe, was excluded because people often say “Oh, that’s actually in Dallas?” And yes, it technically is. Dallas has an odd shape and city limits that make little sense. Coffee House Cafe is practically in the ‘burbs, and I’ll talk about those (and other) coffee shops some other time[1]. A few other shops have closed recently (e.g., Laguna’s). Some were not included because they’re more into something else than their coffee. And finally, I excluded roasters that don’t have a separate shop (e.g., Noble Coyote, Full City Rooster).

It wasn’t an easy list to cultivate, in part because we’re a lucky crowd with some great coffee options throughout town.

Now let’s get to one other fact about coffee places in Dallas: they are almost all entirely distinct from one another. Very few of these coffee shops have a lot in common with one another except that they generally use local or at least Texas roasted beans and/or are not part of a conglomerate. So let’s take a look at some of these coffee shop categories:

  • Some of the coffee shops are also bars (e.g., State St/Alcove, Ascension, Mudsmith)
  • Some serve as venues for art, music, and/or worship (e.g., Union, Mokah)
  • Some are super nerdy (in the good way) about their coffee techniques (e.g., Method, Cultivar)
  • Some also focus quite a bit on food (e.g., Oddfellows, Legal Grounds)
  • Some are actually simple coffee shops (e.g., Murray St., Café Silva)
  • Some are located in, or are, bookstores (e.g., Black Forest, Serj)

And then there’s The Wild Detectives, which is almost all of the above. Plus a place for dogs in sidecars.

In sum — there’s a coffee shop for nearly any personality or occasion in this town. Like I said — it looks like the beer scene from about four years ago. Enough background… it’s stats time.

[Figure: bar chart of star ratings for each coffee shop]
It’s pretty obvious that 1 star and 2 star ratings are rarely if ever used. Which is awesome because we can just collapse 1, 2, and 3 star ratings into just the “3 star” category. We can pretend that 3 is “low”, 4 is “middle”, and 5 is “high”:

[Figure: bar chart with ratings collapsed into 3, 4, and 5 stars]
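To make the collapsing step concrete, here is a minimal Python sketch (the original analysis was done in R, so this is not the code behind the charts). The counts are made up for illustration, and the function name `collapse_low_ratings` is hypothetical.

```python
# Hypothetical star counts for one shop, keyed by star level (made-up numbers).
counts = {1: 2, 2: 3, 3: 10, 4: 40, 5: 95}

def collapse_low_ratings(star_counts):
    """Fold 1, 2, and 3 star counts into a single "3 star" (low) bucket."""
    low = sum(star_counts.get(star, 0) for star in (1, 2, 3))
    return {3: low, 4: star_counts.get(4, 0), 5: star_counts.get(5, 0)}

collapsed = collapse_low_ratings(counts)
print(collapsed)  # {3: 15, 4: 40, 5: 95}
```

No ratings are thrown away here: the total number of ratings is the same before and after collapsing.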

Now there’s something a little unfair here… some coffee shops have a ridiculous number of ratings (e.g., Oddfellows) and some have far fewer (e.g., Houndstooth). So let’s make these bars relative, that is, the total number of 5, 4, or [3, 2, 1] stars divided by the total number of ratings:

[Figure: bar chart of relative (proportional) ratings for each coffee shop]
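The relative bars are just each bucket divided by the shop's total. A small Python sketch, assuming made-up counts for two hypothetical shops (the names and numbers are not from the real data):

```python
# Hypothetical collapsed counts per shop: (3-or-lower, 4, 5 stars). Made-up numbers.
shops = {
    "Busy Shop": (120, 300, 580),   # 1000 ratings in total
    "Quiet Shop": (3, 9, 18),       # 30 ratings in total
}

def relative_ratings(counts):
    """Divide each star bucket by the shop's total number of ratings."""
    total = sum(counts)
    return tuple(c / total for c in counts)

for name, counts in shops.items():
    low, mid, high = relative_ratings(counts)
    print(f"{name}: low={low:.2f} middle={mid:.2f} high={high:.2f}")
```

Even though one shop has over thirty times as many ratings as the other, the two sets of proportions land on the same 0-to-1 scale and can be compared directly.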
That’s just much easier to interpret, too! And it even looks kind of like the right answer. But it’s not and you’re a fool for believing it!

Ratings systems like this tend to be a bit flawed. For example, the movie “50 Shades of Grey” has 4.1 out of 10 stars on IMDB.com. Does that mean it’s generally receiving middle responses from most people?

No, no it absolutely isn’t. It’s easy to get an “average” rating when the underlying distribution makes absolutely no sense.
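Here is a tiny Python illustration of why an average can mislead. Both lists are invented (they are not real IMDB data) and were picked only so that they share the same mean of 4.1:

```python
from statistics import mean, pstdev

# Two made-up sets of 10 ratings on a 1-10 scale (NOT real IMDB data).
middling = [4, 4, 4, 4, 4, 4, 4, 4, 4, 5]        # everybody shrugs
polarized = [1, 1, 1, 1, 1, 1, 5, 10, 10, 10]    # mostly haters, a few lovers

for label, ratings in (("middling", middling), ("polarized", polarized)):
    # Same mean, very different spread.
    print(f"{label}: mean={mean(ratings):.1f} spread={pstdev(ratings):.2f}")
```

The averages are identical, but only the first list describes an audience that actually feels "middle" about the movie.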

In order to understand how people really perceive Dallas’s coffee shops we need to get fancy with our stats. So let’s turn to one of my favorite statistical methods: Correspondence Analysis (CA). CA is a technique that takes a large table made up of a bunch of variables (ratings) and turns them into new variables that better represent what’s happening[2]. CA produces new variables called “components”, which are the horizontal and vertical axes (lines) in the following pictures. The other really nice thing about CA is that it handles the data correctly when the number of items differs. Here, the number of ratings per coffee shop is quite different. Well, CA makes it so things are fair between all these coffee shops, kind of like the relative percentages above.
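For the curious, plain CA can be sketched in a few lines. The analysis in this post was run in R with the ExPosition package; what follows is a NumPy sketch of the textbook SVD formulation, not that package's code. The count table is made up, and the function name `correspondence_analysis` is hypothetical.

```python
import numpy as np

def correspondence_analysis(table):
    """Plain CA of a shops-by-ratings count table via the SVD (a sketch)."""
    X = np.asarray(table, dtype=float)
    P = X / X.sum()                      # correspondence matrix, sums to 1
    r = P.sum(axis=1)                    # row masses: each shop's share of all ratings
    c = P.sum(axis=0)                    # column masses: each star bucket's share
    # Standardized residuals: observed minus expected under independence.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    keep = sv > 1e-12                    # drop the trivial zero component
    U, sv, Vt = U[:, keep], sv[keep], Vt[keep]
    row_scores = (U / np.sqrt(r)[:, None]) * sv       # shop positions
    col_scores = (Vt.T / np.sqrt(c)[:, None]) * sv    # rating positions
    return row_scores, col_scores, sv ** 2, r

# Made-up counts: rows are shops, columns are [3-or-lower, 4, 5] stars.
counts = [[120, 300, 580], [3, 9, 18], [40, 20, 90], [10, 60, 50]]
rows, cols, inertia, masses = correspondence_analysis(counts)
print(rows.round(3))      # one (x, y) position per shop
print(inertia.round(4))   # variance explained by each component
```

With three rating categories there are at most two components, which is why everything fits on a flat picture. The row masses are what keep things fair: a shop with 30 ratings and a shop with 1000 are both placed according to their profile of ratings, not their raw counts.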

[Figure: the Correspondence Analysis simplex (triangle)]
Correspondence Analysis is a nifty technique that finds a boundary for us. The boundary, called a simplex, is defined by the variables (in this case, the ratings). All of the coffee shops have to live inside this simplex, which is the triangle in the prior and next few images. Let’s color the simplex by regions. This will help us understand how these coffee shops really rank:

[Figure: the simplex colored by region]
Now, let’s pause for a moment. All of these coffee shops, judging by their average ratings, would get at least a B or B+. None of these shops are bad at all — they’re all good or great (see the bar charts above). But, with CA we’re going to see which coffee shops are more likely to receive 5 star ratings than others, which coffee shops are more likely to receive 4 star ratings than others, and which coffee shops are more likely to receive 3 star ratings than others.

Note that repeated sentence: “which coffee shops are more likely to receive [some number] star ratings than others” — that means this is a relative interpretation. A shop that is close to a 3 doesn’t mean it gets more 3s overall — just that, proportionally, it receives more 3s than other shops.
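A quick numerical illustration of that relative reading, in Python with invented counts for two hypothetical shops:

```python
# Made-up collapsed counts per shop: (3-or-lower, 4, 5 stars).
shops = {
    "Shop A": (10, 30, 160),
    "Shop B": (30, 50, 120),
}

def profile(counts):
    """Proportion of a shop's ratings that fall in each bucket."""
    total = sum(counts)
    return [c / total for c in counts]

# Average profile: pool every rating across all shops.
totals = [sum(column) for column in zip(*shops.values())]
average_profile = profile(totals)

for name, counts in shops.items():
    # A ratio above 1 means "proportionally more than the average shop".
    ratios = [p / avg for p, avg in zip(profile(counts), average_profile)]
    print(name, [round(x, 2) for x in ratios])
```

In this toy example Shop B gets 1.5 times the average share of low ratings, so it would sit towards the "3" corner. But 60% of its ratings are still 5 stars. Being close to a 3 is a statement about the comparison with other shops, not about what rating the shop usually gets.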

So, the above two images show us that (more likely to receive) 5 star shops are on the left side, (more likely to receive) 4 star shops to the upper right, and (more likely to receive) 3 star shops to the lower right. Let’s see how the shops are configured:

[Figure: coffee shops (purple dots) plotted inside the simplex]
The purple dots are coffee shops. Let’s zoom in and now look at them, labeled with their average rating:

[Figure: zoomed-in view of the coffee shops, labeled with their average ratings]
Well look at that. Sometimes 4.67 is equal to 4.7 and 4.14 is better than 4.17. See! I told you — sometimes ratings are stupid. In fact, as I’ve pointed out before, these types of systems are usually despised. It’s pretty well documented, especially here in DFW (except by “Elite Yelpers”). Not only that, but one of the coffee shops here is anti-Yelp.

Anyways. These ratings systems can still be informative. But they aren’t very informative when you just average the stars from a very broad and unrefined rating system.

In the picture above, we have 3 zones to describe our coffee shops: (1) The Red Zone is coffee shops that have relatively more 3 (and 2, and 1) star and 4 star ratings than other places, (2) The Orange Zone is the “50 Shades of Grey” zone — these coffee shops get their average rating from a bimodal distribution: People that love (5 star) the places and people that definitely don’t (3, 2 and 1 star ratings), and (3) The Purple Zone: these shops generally receive more 5 star and 4 star ratings, proportionally, than other shops.

Another small note: any coffee shops at the middle, where the horizontal and vertical lines cross, are essentially the average coffee shops.

So, which shops are in all these weird 50 Shades of Grey zones and whatnot…?

[Figure: coffee shops in the red, orange, and purple zones]
The red zone shows us the coffee shops that are, essentially, “B” or “B+” students. The orange zone: Davis St. and Method. So these two places have lovers and haters. But, in the case of Method, maybe that’s just because of their anti-Yelp leanings. They are essentially the A- students.

Now that purple zone is where we want to dive in. There appear to be two groups: the A students (closer to the origin) and the A+ students (the ones furthest to the left).

At this point you’re thinking “SHUT UP DEREK I’VE BEEN READING THIS FOR FAR TOO LONG TRYING TO FIND OUT WHICH COFFEE SHOP TO GO TO AND IT HAS DELAYED MY COFFEE CONSUMPTION AND THUS I AM IRRITATED AS IS EVIDENT THROUGH THE USE OF CAPITAL LETTERS RUN ON SENTENCES AND LACK OF PUNCTUATION.”

Well you’ll just have to wait, because I have something important to show you. And I’m going to show you through the power of an animated .gif. The .gif below shows us each coffee shop individually (purple dot), along with its Facebook ratings (blue dot) and an arrow pointing towards its Yelp ratings (red dot):

[Animated .gif: each coffee shop with its Facebook (blue) and Yelp (red) ratings]
Remember — the arrow points from Facebook to Yelp. What we can generally see is that, again, Yelpers are generally more negative than Facebookers when it comes to ratings, except in two cases: Mokah (#23 in .gif) and Café Silva (#24 in .gif).

Both Mokah and Café Silva have overall positive ratings (they’re A to A+ students here). But they’re the only two where the ratings are better on Yelp than Facebook — completely counter to every other shop. And I even made sure to grab the hidden ratings from Yelp.

So how can we rank these coffee shops and give them a new rating? Well, that’s where a classic statistical technique comes in: linear regression.

All of the coffee shops will now get a new rating. This new rating is computed with the original overall rating from above as the dependent variable and the positions of the coffee shops from Correspondence Analysis[3] as predictors[4].
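In code, that step looks roughly like the Python sketch below. The component scores and average ratings are invented for illustration, and the fit is assumed to be an ordinary, unweighted least-squares regression, which the post does not spell out.

```python
import numpy as np

# Made-up example: 5 shops, their two CA component scores, and average stars.
scores = np.array([
    [-0.30,  0.05],
    [-0.10, -0.08],
    [ 0.00,  0.02],
    [ 0.15,  0.10],
    [ 0.25, -0.09],
])
avg_rating = np.array([4.8, 4.5, 4.4, 4.2, 4.0])

# Design matrix: an intercept column plus the two component scores.
X = np.column_stack([np.ones(len(scores)), scores])
coef, *_ = np.linalg.lstsq(X, avg_rating, rcond=None)

# The fitted values play the role of the "new ratings".
new_rating = X @ coef
for old, new in zip(avg_rating, new_rating):
    print(f"average {old:.2f} -> new {new:.2f}")
```

The new rating is whatever the shop's position in the CA picture predicts, so two shops with the same average but different mixes of 3, 4, and 5 star ratings can end up with different new ratings.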

So, let’s get down to the important question: what are Dallas’s top 5 coffee shops, and what are their new ratings?

  1. Stupid Good — 4.75
  2. The Wild Detectives — 4.68
  3. Cultivar — 4.65
  4. Café Silva — 4.63
  5. Flying Horse — 4.61

So, where are they?

[Figure: map of the top 5 coffee shops]

Now back to all that distinctness between shops: you really couldn’t ask for a more diverse set of coffee shops to be the top 5. All have a unique personality, relatively unique locations, and a wide array of coffee beans (including 3 local roasters: Oak Cliff at Stupid Good and The Wild Detectives, Noble Coyote at Café Silva, and Cultivar at Cultivar).

Given how far apart these places are, now we can answer a bonus question: Which neighborhood has the best coffee? That is, if you had to be trapped in a particular neighborhood in Dallas, and the primary condition is that you just need to be surrounded by great coffee shops, where should that be?

[Figure: heatmap of coffee shop ratings across Dallas]
Downtown. That’s right… Downtown. That sea of green surrounding Downtown (and parts of Uptown) means that’s the best place for you to be trapped.

Pretty much one of the most boring neighborhoods, where everything is closed tightly by 5pm, is actually the best neighborhood for coffee. Go figure.

And the final re-rankings of coffee shops in Dallas:

Coffee Shop        Rating
Stupid Good        4.8
Wild Detectives    4.7
Cultivar           4.7
Cafe Silva         4.6
Flying Horse       4.6
Union              4.6
Method             4.5
Sip Stir           4.5
Davis St.          4.5
Oak Lawn           4.4
Alcove             4.3
Opening Bell       4.3
Mokah              4.3
Mudsmith           4.3
White Rock         4.3
Crooked Tree       4.3
Ascension          4.3
Houndstooth        4.2
Espumoso           4.1
Murray St.         4.0
Drip               4.0
Oddfellows         4.0
Legal Grounds      3.9
Lil’ White Rock    3.9
Lil’ White Rock 3.9

All analyses performed in R. Correspondence Analysis was performed with the ExPosition package – a package created by particularly attractive and smart people. Maps were created in R with the RgoogleMaps and MASS packages. Some code was borrowed and adapted from Everyday Analytics and Stackoverflow.

Code and data available, for the nerds who are so inclined.

Footnotes:
[1] There are some great shops outside of Dallas: Avoca and Brewed in Ft. Worth, a few Buon Giorno locations, Generator in Garland, Pearl Cup in Richardson… the list goes on.

[2] For the stats nerds, technically both the coffee shops and the ratings are variables. The observations (people making ratings) are kind of hiding. Each person simply helps increase the number of responses within a particular cell of this table. CA is analogous to a principal components analysis but for data more suited for χ² analyses.

[3] These are called “Component Scores” or “Factor Scores”.

[4] For the stats nerds: one lovely property here is that the components (axes, lines) are orthogonal, which makes for an easy regression! Furthermore, this is a components-based analysis where the components are used as predictors in a simple regression… You may be more familiar with this under a different name (with a different technique): Principal Components Regression.