DFW’s Favorite Breweries

Oh hello again.

Now that science has determined Dallas’s favorite year round/flagship DFW-made beer, science must answer the following question: what is DFW’s favorite brewery?

You might be thinking: “Derek. That’s a stupid question that doesn’t require science. Given the abundance of overpriced Miller Lite at Jerry’s Dome of Eminent Domain, the answer is MillerCoors (which is produced by smashing frosty-cold bullet trains into mountains).” While that might be technically correct (based on sales), it’s just gross and you should feel gross for having such gross thoughts.

Unlike before — I’m not going to tell you upfront which breweries are the best. You’re going to have to get nerdy with me (or just scroll to the end). So let’s continue the tradition1!

So how can we determine which of DFW’s breweries are the best? Well, you might be thinking, “don’t we have the (much reviled) Yelp average ratings?” or “I gave it 5 stars on Facebook so it’s clearly the best.” Yeah, sure. If you go to Facebook, you can see how many people rate Lakewood Brewing Company with 5, 4, 3, 2, or 1 star. You can do the same with Yelp, but you need to make sure to go find the hidden ratings, too. So, for this venture into stats and beer nerdery, I aggregated all the ratings from Yelp and from Facebook for all the DFW area craft breweries2. This gives me a count of, for example, how many 5 star ratings a brewery has (per platform: Facebook or Yelp).

Before we go on, let’s get something quite obvious out of the way. The 5 star all-purpose rating system is… flawed. In fact, these types of systems are usually despised. It’s pretty well documented, especially here in DFW, that ratings systems need to be more elaborate — rating different aspects of something, instead of an all-purpose feel-goodery star system (as if it were kindergarten and you didn’t knock the blocks down today — 5 stars for not being a clumsy 4 year old).

So the average rating might be quite unfair for these breweries. Are people giving stars because they are architecture nerds and love the actual building? Was it the tour? General opinion on all the beers? Who knows. What we do know is that the 5 star all-purpose feel-goodery system is flawed. And some businesses are very anti-Yelp because of this all-purpose feel-goodery star system.

Sometimes, when averaged together, the stars tell you just enough. But when it comes to these breweries, as we’ll see, the average tells you very little. However, when we take a closer look — the distribution of stars speaks volumes. Let’s begin with just looking at the frequency of ratings for all the DFW breweries. We’ll also sort them (top to bottom) by the total number of ratings per brewery, with “average stars” on the right:
[Figure: rating counts per brewery, sorted by total number of ratings, with average stars at right]

Here, we can see that Rahr & Sons and Deep Ellum Brewing have the most overall ratings in DFW. So, let’s sort this by average rating (average of Facebook & Yelp):

[Figure: rating counts per brewery, sorted by average rating]

From the looks of both of these pictures, it really seems as though 3, 2, and 1 star ratings are rarely, if ever, used. This suggests that, for the most part, when people rate these breweries 5 means “Great”, 4 means “Good” and anything else means “Relatively Unsatisfactory”. So from here on out, I’m going to combine 3, 2, and 1 star ratings into a single category of “Not Good”.
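If you want to play along at home, here’s a minimal sketch in R of that recoding, with a hypothetical breakdown of the {3, 2, 1} counts; the last line previews the proportional view coming up in a moment:

```r
# Hypothetical per-star counts for two breweries (the 5 and 4 star values
# match the table shown later; the 3/2/1 breakdown is made up for illustration)
ratings <- data.frame(
  stars.5 = c(289, 2690),
  stars.4 = c(40,  726),
  stars.3 = c(10,  150),  # hypothetical split of the "Not Good" counts
  stars.2 = c(10,   80),
  stars.1 = c(8,    47),
  row.names = c("903", "Rahr & Sons")
)

# Combine 3, 2, and 1 star ratings into a single "Not Good" category
collapsed <- cbind(
  stars.5  = ratings$stars.5,
  stars.4  = ratings$stars.4,
  not.good = ratings$stars.3 + ratings$stars.2 + ratings$stars.1
)
rownames(collapsed) <- rownames(ratings)

# The proportional view: each brewery's ratings as a percentage of its total
round(prop.table(collapsed, margin = 1) * 100, 1)
```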

[Figure: rating counts per brewery, with 3, 2, and 1 stars combined into “Not Good”]

But that still feels weird, so let’s look at things proportionally: that is, the percentage of ratings for each brewery:

[Figure: percentage of each rating type per brewery]

From the looks of this, you’d probably think Peticolas is DFW’s favorite brewery. And then I would kindly interject and say “Your thought lacks science and is thus far incorrect!”.

When we look at these ratings, we’ve probably noticed right away that all the average ratings fall between 4.45 and 4.8. In fact, 7 different breweries have averages between 4.63 and 4.67. So if we go just by average ratings on a (fictitious) 5 point all-purpose feel-goodery kindergarten star scale — we’d conclude “they’re all pretty good so let’s go party.”

So, how can we figure out which brewery really is the best? And how can we do that when the number of overall ratings are so different between breweries? By now you’re thinking the answer to that is “Science, duh”. So let’s science.

The data here look something like this:

 

Brewery        5 Stars   4 Stars   3, 2, or 1 Stars
903            289       40        28
Rahr & Sons    2690      726       277

 

where each row is a brewery and the ratings columns hold the total number of ratings of each type from both Facebook and Yelp3. One of the best ways to analyze this type of data is with Correspondence Analysis (CA). If you’re not into stats, avert your eyes for a moment…

For the stats nerds: CA is a technique that takes a large table made up of counts, and finds the best overall representations of those counts. Like PCA, CA produces components. These components explain the maximum possible variance in descending order. But these components are derived under χ² assumptions. However, CA—unlike other techniques—takes into account the total number of ratings (which is different for each brewery). That means we can more fairly analyze the ratings, even when the overall number of ratings is very different for each brewery. In this application of CA, we’re going to use the asymmetric version — where the columns are privileged. The privilege here is that we want the columns to define a maximum possible boundary of where the breweries can go. This boundary is called a simplex.
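Here’s a minimal sketch of that asymmetric CA with the ExPosition package (credited at the end of this post), using just the two example rows from the table above; the real analysis uses all the breweries:

```r
library(ExPosition)

# The two example rows from the table above; the full analysis uses
# every DFW brewery with enough ratings
star.table <- matrix(
  c(289,  40,  28,
    2690, 726, 277),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("903", "Rahr & Sons"),
                  c("5 stars", "4 stars", "3, 2, or 1 stars"))
)

# symmetric = FALSE requests the asymmetric variant, where the column
# (rating) profiles are privileged and define the simplex
ca.res <- epCA(star.table, symmetric = FALSE, graphs = FALSE)

ca.res$ExPosition.Data$fi  # brewery (row) factor scores
ca.res$ExPosition.Data$fj  # rating (column) factor scores
ca.res$ExPosition.Data$t   # variance explained per component (%)
```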

Back to beer business. So, with some statistical magic, let’s start to find out which breweries can lay claim to being the best. First, let’s look at the ratings:

[Figure: CA map of the star ratings]

The configuration of ratings here defines a boundary that can be broken into regions:

[Figures: the ratings-defined boundary, broken into purple, orange, and red regions]

Those regions reflect 3 different traits of how a brewery receives its “average” rating. The purple region is due to breweries that pretty much get 5s and 4s. The orange region is due to breweries that get 5s and {3, 2, 1} ratings. And finally, the red region is due to breweries that are more associated with 4s and {3, 2, 1} ratings than the other breweries. So let’s put the breweries in:

[Figure: breweries (purple dots) plotted within the ratings boundary]

All those purple dots are the breweries. Note how close they are to “5 stars”. Let’s pause a moment. We can already assume that the average ratings-type system is flawed — people love to love their favorite things. Because the 5s are being used a little too much, we can’t figure out which breweries are really the best just by average. We need to use the other ratings to find this out. Let’s pretty that last picture up a bit.

[Figures: breweries plotted with their logos within the ratings boundary]

A little better. Now we can see the breweries’ logos and where they fall in these boundaries. If you’re here for beer… avert your eyes again.

For the R nerds: I searched high and low for a way to plot raster graphics onto a plot device. I found no obvious and simple way to do this (but plenty of advice on how to put a plot device on a raster image — painfully unhelpful). My current solution (pictured above and below) exists somewhere between “Neat trick” and “Disgusting hack”. See the attached code in the footnotes. 
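In lieu of the full code, here’s the gist of the hack: a sketch that assumes ca.res comes from a CA of the full brewery table (the two-row example above only yields one component) and uses a hypothetical logo file:

```r
library(png)  # for readPNG(); install.packages("png") if needed

fi <- ca.res$ExPosition.Data$fi  # brewery factor scores from the CA above
plot(fi[, 1], fi[, 2], type = "n",
     xlab = "Component 1", ylab = "Component 2")

logo <- readPNG("logos/rahr-and-sons.png")  # hypothetical file path
x <- fi["Rahr & Sons", 1]
y <- fi["Rahr & Sons", 2]
w <- 0.02                         # half-width, tuned to the plot's scale
h <- w * nrow(logo) / ncol(logo)  # half-height, preserving the aspect ratio

# rasterImage() paints the bitmap into the existing plot region,
# anchored by its corners in user (plot) coordinates
rasterImage(logo, x - w, y - h, x + w, y + h)
```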

Back to beer business. Let’s zoom in on this area, which has all the breweries:

[Figure: zoomed-in view of the region containing all the breweries]

And bring back our magical boundaries:

[Figure: zoomed-in view with the region boundaries overlaid]

Oh man, we are about to get scienced. Remember: all these breweries have a ridiculous amount of 5 star ratings. What’s important for figuring out which breweries are the best is the not-5 stars and how all the stars are distributed. Instead of asking “which breweries get loved on the most?”, we’re really asking: “which breweries get hated on the least?”. Also remember that the red area means that these breweries get their average ratings from a higher number of 4 and {3, 2, 1} ratings than the other breweries. While beloved, Deep Ellum, Firewheel, Cobra, and Community get hated on the most. But 903, Cedar Creek, and Grapevine live in the “love-hate” zone — they have their lovers giving them 5s and their haters giving them {3, 2, 1} ratings. Here in the orange “love-hate” zone there is no middle ground: these breweries are less likely to get a 4 star rating than the other breweries. That purple zone, though… that’s what we care about.

So now we know that the purple zone is, generally, the “zone of favored breweries” in DFW. But exactly which breweries are the best?… We’re so close to the big reveal. So close. Before the big reveal, let’s look at the breweries, but marked with their average ratings:

[Figure: breweries marked with their average ratings]

Now that’s fancy. Science just told us that not all 4.6whatevers are created equal! 903 and Grapevine’s “4.64” is because they have lots of 5s, but those 5s get dragged down by the {3, 2, 1}s, whereas Martin House’s 4.64 has its 5s dragged down by 4s! Making Martin House the best damn 4.64 in DFW! Likewise, Cedar Creek’s and Rahr & Sons’ 4.63s are different: Rahr’s 4.63 is the best damn 4.63 in DFW!

Now that we can see a lot more of what’s going on — let’s take a look at just those top ratings: Peticolas (4.80), Revolver (4.79), Rabbit Hole (4.76), and Franconia (4.72). With Correspondence Analysis (CA) — we can think of the dots for the star ratings (5, 4, {3, 2, 1}) as pulling the breweries towards their “star position” (in CA the terminology is “inertia” because we can think of this as a gravitational pull)4. So which star ratings are pulling which breweries towards them?

While Peticolas and Rabbit Hole are being pulled by 5 star ratings — they’re also getting pulled back towards the {3, 2, 1}s. While there’s no doubt that these are some of DFW’s favorite breweries — they are, according to (my analysis of) Facebook and Yelp (ratings), neither #1 nor #2. Rabbit Hole is #4 and Peticolas is #3.

And then there were two. To find out the #2 and #1 breweries in DFW, we need to get extra nerdy: Facebook ratings vs. Yelp ratings.

[Figures: Facebook rating points (left) and Yelp rating points (right) relative to the star ratings]

First off — most of the ratings in this analysis come from Facebook; there are disproportionately more of them there than on Yelp. However, there is something quite insightful in how these ratings relate to the overall analysis:

Facebook ratings are generally very positive and include even more 5 star ratings. Note how, in the figure on the left, the blue Facebook dots are being pulled towards the 5 star ratings. Then look at the figure on the right, and notice how far away all the Yelp ratings are. This would suggest an anecdote most of us are probably well aware of: Yelpers are mean-spirited jerks (or, rather, just tend to rate things more negatively).
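I won’t swear this is exactly how the figures above were built (projecting the platforms as supplementary points is another route), but a simple way to approximate them is to give each brewery one row per platform and run the same asymmetric CA. A sketch, with a hypothetical platform split:

```r
# Hypothetical Facebook/Yelp split of Rahr & Sons' counts (they sum to the
# totals used earlier); with all breweries split this way, each platform
# profile gets its own point in the CA plane
platform.table <- matrix(
  c(2500, 600, 200,   # Rahr & Sons via Facebook (hypothetical)
     190, 126,  77),  # Rahr & Sons via Yelp (hypothetical)
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Rahr & Sons (Facebook)", "Rahr & Sons (Yelp)"),
                  c("5 stars", "4 stars", "3, 2, or 1 stars"))
)

platform.ca <- epCA(platform.table, symmetric = FALSE, graphs = FALSE)
platform.ca$ExPosition.Data$fi  # one point per brewery-platform pair
```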

This is actually really important to note: Facebook ratings are overly positive while Yelp ratings are overly negative. Now, there’s a bit of additional unfairness here… Franconia has no (business) Facebook page. That means it has no ratings from Facebook to help it out. Let’s look at one more picture: how Franconia and Revolver stack up on Yelp (with respect to their aggregated results):

[Figure: Franconia’s and Revolver’s Yelp points relative to their aggregated results]

From Yelp’s perspective, Franconia is closer to the 5 stars than Revolver. Revolver is getting pulled closer to the 4 star ratings. And given that we now know that Yelp ratings are generally more negative than Facebook’s, we have but one conclusion:

Revolver is #2, and Franconia is DFW’s #1 brewery (based on two of the ubiquitous 5 star rating systems available).

But it’s quite important to remember: we have no idea why people are rating these breweries as they do5, simply that—when it comes down to ratings—Franconia gets lots of 5s and 4s, and very, very, very few {3, 2, 1} star ratings.

All analyses performed in R. Correspondence Analysis was performed with the ExPosition package – a package created by particularly attractive and smart people. Code and data for the nerds who are so inclined.

Footnotes

1 I don’t think 2 blog posts count as “tradition” yet.
2 Some breweries don’t have any ratings, and some have just a few, so they’ve been unfortunately excluded.
3 Some breweries only have ratings on Facebook and some only on Yelp.
4 I just rewatched Guardians of the Galaxy and Star Wars (in Machete Order) and am really emphasizing “star systems” and “star positions”. Space operas are the best.
5 For the stats nerds: there is actually another problem hiding here. Not all ratings are necessarily independent. In fact, it’s not unlikely that the same person provides a rating on both Facebook and Yelp. So, yes, there are some statistical assumptions that have been violated. But this is what happens sometimes — just do the best you can.

Dallas’s Favorite Beers

Hello Dallas.

Now that we’re hot off of 2014’s North Texas Beer Week… Have you ever wondered what Dallas’s favorite local craft beer is? You’re probably thinking “Yeah, it’s clearly Lone Star because it’s the ‘National Beer of Texas’”, or “duh – it’s the one in my hand right now, bro!”.

While valid guesses, they are clearly not correct (and you should feel bad about those guesses). The correct answer is: Lakewood Brewing Company’s “Temptress” – a milk stout. Now Dallas – you’re probably now thinking “Well, Lone Star and the beer in my hand are clearly the second and third favorite local craft beers.”

Well… this is the point where I ask you to stop thinking such terrible thoughts – those answers are also not correct (and you should continue to feel bad). The correct answers are: Peticolas Brewing Company’s “Velvet Hammer” — an imperial red — and Community Beer Company’s “Mosaic IPA” — an American-style IPA.

How do I know that Temptress, Velvet Hammer, and Mosaic IPA—in that order—are Dallas’s three favorite beers? These beers are on tap, or (for Temptress and Mosaic IPA) on shelves all across town. But just being available doesn’t make a beer Dallas’s favorite – or else those truly wretched thoughts you were having about Lone Star would have been true.

Well, as a beer nerd and a stats nerd, I decided I just had to know: of all the local craft beers that are now produced and available throughout DFW – which are Dallas’s favorites? Let’s get nerdy.

I created a relatively simple survey on Google Docs. This survey listed 35 beers produced in the (broader) DFW area. For a beer to get on the list it had to meet the following criteria:

  1. The brewery itself must have been in operation for at least 1 year
  2. The beer itself must have been available for at least the past six months
  3. It has to be a year-round beer (no seasonals, specials, or one-offs)

That qualified 35 beers from the following breweries1: Franconia, Peticolas, Revolver, Martin House, Four Corners, Lakewood, Rahr, Deep Ellum, Community, and Cedar Creek.

When I had my list, I randomized the order in which these beers were listed and sent the survey out. Here’s a quick breakdown of some demographics:

  • 202 respondents. One was excluded2.
  • Gender: 36 Females, 160 Males, 1 Meat Popsicle, 1 Unicorn, 1 Manatar, and 3 non-responses.
  • 33 People professionally work with beer (brewer, bartender, waitstaff, etc…).
  • 58 People consider themselves homebrewers.

The survey asked people to respond to each beer with one of the following 6 options3:

  • It is one of my favorite beers.
  • I like this beer.
  • This beer is OK.
  • I don’t like this beer.
  • I’ve never had this beer.
  • I have no opinion.

At this point, we can just count how many people, out of 201, had the answers above for each of the beers in the survey. So let’s get down to it:

[Figure: proportion of each response per beer, sorted by “It is one of my favorite beers”]

What we’re looking at here are the beers (on the rows, listed vertically) and the proportion (out of 201) of each response. I reordered the beers so that they’re listed from the most to the fewest “It is one of my favorite beers” responses.
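For the R nerds, a sketch of that tally, assuming a hypothetical survey data frame with one row per respondent and one column per beer:

```r
response.levels <- c("It is one of my favorite beers.", "I like this beer.",
                     "This beer is OK.", "I don't like this beer.",
                     "I've never had this beer.", "I have no opinion.")

# Count each response type per beer (rows: beers, columns: responses)
counts <- t(sapply(survey, function(beer) {
  table(factor(beer, levels = response.levels))
}))

props <- counts / nrow(survey)        # proportions out of 201 respondents
props <- props[order(-props[, 1]), ]  # most "favorite" responses first
```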

There are some clear favorites: Temptress, Velvet Hammer, Revolver’s Blood & Honey, and Mosaic IPA all have a lot of “Favorite” responses. You might be thinking, “Yo, Derek, you didn’t say a thing about Blood & Honey before—that’s my go-to crushable—so maybe you’re lying about Lone Star too?”. If I were inclined to respond to such accusations, I’d say that 1) I’m building suspense (or boring you to tears) and 2) I’ve grown really tired of you talking about Lone Star – but I’m above that so I won’t say it.

As a stats nerd, though, this picture feels a bit… rudimentary. There are better ways to figure out and visualize Dallas’s favorite beer. So let’s turn to one of my favorite statistical methods: Correspondence Analysis (CA). CA is a technique that takes a large table made up of a bunch of variables (here: the responses) and turns them into new variables that better represent what’s happening4.

The data from above looks something like this:

Beer                        A FAVORITE   LIKE   OK   DO NOT LIKE   NEVER HAD   NO OPINION
Lakewood Temptress          107          62     7    4             19          2
Four Corners’ Block Party   15           83     29   6             66          2

So what will CA do for us with a table of data like this? It tells us which beers are most similar to one another – based on all the different categories. It can also tell us if any of the categories are similar to one another, too. Most importantly, it tells us which beers are more related to particular responses than the other beers are. Let’s take a look at what a CA would produce:

[Figures: CA map of the beers (left) and of the responses (right)]

CA produces for us these new variables—these variables are called “components”—denoted by the axes (horizontal and vertical lines) in these pictures. There are 3 other axes besides these – but those aren’t very important. Just these first two explain 87% of the entire data.
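For the stats nerds: that 87% can be read directly off the CA results. A sketch with epCA() from ExPosition, assuming the counts table from the earlier sketch:

```r
library(ExPosition)

beer.ca <- epCA(counts, graphs = FALSE)  # CA on the beers-by-responses table

explained <- beer.ca$ExPosition.Data$t   # % of variance per component
sum(explained[1:2])                      # the first two components: ~87%
```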

With what we know about CA we can say some of the following:

  • Temptress, Velvet Hammer, Blood & Honey, and Mosaic are more associated with “A FAVORITE” than other beers (both figures)
  • The responses of “OK” and “DO NOT LIKE” are essentially the same – which probably means people are being nice when they say “OK” or they’re being mean when they say “DO NOT LIKE”.
  • The lower left of the left figure shows Cedar Creek Scruffy’s, Cedar Creek Elliot’s Phoned Home, and Martin House XPA – which means they are nearly identical based on their responses; the responses being that most people haven’t had these beers. Sad times.

Let’s go a bit further. We know enough about this data to, perhaps, make it easier to understand. Let’s combine “OK” with “DO NOT LIKE” – because they are basically one and the same here. We’ll also combine “NO OPINION” with “NEVER HAD” – so that we can group together the responses that are basically non-responses. Let’s do another CA and this time color each beer by the response it is most similar to.
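Here’s a sketch of that combining step (again assuming the counts table from the earlier sketch, with columns in the order listed above); the last line colors each beer by its largest response share, a rough proxy for the CA-based assignment:

```r
# Collapse the six response categories into four, as described above
combined <- cbind(
  "A FAVORITE"           = counts[, 1],
  "LIKE"                 = counts[, 2],
  "OK/DO NOT LIKE"       = counts[, 3] + counts[, 4],
  "NEVER HAD/NO OPINION" = counts[, 5] + counts[, 6]
)

combined.ca <- epCA(combined, graphs = FALSE)

# Assign each beer the category holding the largest share of its responses
closest <- colnames(combined)[max.col(prop.table(combined, margin = 1))]
```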

[Figure: CA map of the combined responses]

With the combined responses – we can see the general configuration is essentially the same. Except this time we can explain 92.5% of the data instead of just 87% (take that, Lone Star!). It’s also a little clearer that, from right to left, there’s a gradient of liking (or ever having had) a beer. Now let’s take a look at the beers, colored by which response they are most similar to:

[Figure: beers colored by the response category they are most similar to]

Now we have a much clearer idea of which beers people have never really had (in gray), which ones are not particularly cared for (in red), which ones are liked (in yellow), and which are Dallas’s favorites (in green).

The favorites are still Temptress, Velvet Hammer, Blood & Honey, and Mosaic. So why did I exclude poor ol’ Blood & Honey from the top 3? Let’s take a look at the responses in these 4 categories like we did initially. Beers are sorted by those with the most “A FAVORITE” responses:

[Figure: combined responses per beer, sorted by most “A FAVORITE” responses]

Let’s also look at beers sorted by fewest responses of “OK/DO NOT LIKE”:

[Figure: combined responses per beer, sorted by fewest “OK/DO NOT LIKE” responses]

Now we have a somewhat different perspective – one that we can also get directly out of the CA results. Some beers are very related to “A FAVORITE” while, at the same time, rarely ever getting a “DO NOT LIKE”. Unfortunately for Blood & Honey – the responses of “A FAVORITE”, “LIKE”, and “DO NOT LIKE” are about equally likely.

But for Temptress, Velvet Hammer, and Mosaic IPA – very few people would say they “DO NOT LIKE” these beers. Thus, these three beers—in that order—are Dallas’s favorite beers. And that’s just science.

So what’s next? In about a year I’ll try to re-do this survey. That’s because by then approximately 30,786 breweries are, apparently, going to be open in Dallas (thanks, urban sprawl!), and many of the breweries that are currently open—but didn’t qualify this time—will qualify in a year.

All analyses performed in R. Correspondence Analysis was performed with the ExPosition package – a package created by particularly attractive and smart people. Data available here5. Code to recreate these analyses here6.

I’m tired just writing this and I’m sure you’re tired just reading it. So let’s go get some Lone Stars.

Footnotes
1 I realized only after I sent out the survey that I had made 2 glaring errors: I mistakenly excluded Firewheel and Armadillo Ale Works. Whoops – sorry!

2 They responded with “I’ve never had this beer” to all beers.

3 For the stats nerds: these are survey options not usually seen. Oftentimes when you get a survey, you’re asked to respond with a 1, 2, 3, 4, or 5 (or some similar numeric scale). Well, what if people have no opinion? What if they don’t want to answer the question? They need a way to opt out. Also, categories aren’t numbers, you dummy! For your (statistical) health!

4 For the stats nerds: technically, both the beers and the responses are variables. The observations (people) are kind of hiding. Each person simply helps increase the number of responses within a particular cell of this table. CA is analogous to a principal components analysis, but for data more suited to χ² analyses.

5 Some responses are decimals. This is because some people left their responses blank (instead of choosing the very comprehensive categories I outlined – jerks). When a response was blank, I just replaced it with the average response.

6 It’s in a text file; change the extension to .R to use it more easily with R.