
Issues with the "incidence rate" in the "neighborhoods" analysis #6

@ghost


@ProQuestionAsker @iblind

Inspired by your work, I'm in the process of analyzing Yelp data for Berlin's neighborhoods, but I'm a little confused by the "incidence rate" you use. It's obviously the same one that Katie Hempenius used. Do you happen to know where it comes from, or whether she came up with it herself? The only incidence rate I know of comes from epidemiology and includes a time component, which Katie's formula definitely does not. I've checked some business resources, and they all use the standard epidemiological definition.
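For reference, the epidemiological definition I have in mind is usually written along these lines. The symbols are my own shorthand, not taken from any particular source:

$$
\text{incidence rate} = \frac{\text{number of new cases during the observation period}}{\text{total person-time at risk during that period}}
$$

The denominator carries units of time (for example person-years), which is the component I can't find in Katie's formula.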

Also, do you do any kind of within-category, between-district comparison in your analysis, i.e. do you rank the categories within each district and the districts within each category? As far as I can tell, you don't. I ask because the second half of the formula, the part that normalizes the data, only affects the within-category, between-district rankings. It does not affect the within-district, between-category rankings, since it is simply a constant at the within-district level. If you don't compare between districts, then the normalization step is superfluous.
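To illustrate the point about the normalization, here is a minimal sketch with made-up numbers. It does not reproduce Katie's actual formula: the district names, the counts, and the choice of normalizer (one over the district total) are all assumptions for illustration. The only property that matters is that the normalizer is a single positive constant per district.

```python
# Hypothetical business counts per (district, category); not real Yelp data.
counts = {
    "Mitte": {"cafe": 40, "bar": 30, "bakery": 10},
    "Pankow": {"cafe": 12, "bar": 15, "bakery": 9},
}


def rank(scores):
    """Return the keys ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Assumed normalizer: one positive constant per district (1 / district total).
normalizer = {d: 1 / sum(cats.values()) for d, cats in counts.items()}
normalized = {
    d: {c: n * normalizer[d] for c, n in cats.items()}
    for d, cats in counts.items()
}

# Within-district, between-category rankings: identical before and after,
# because every score in a district is multiplied by the same constant.
within_raw = {d: rank(cats) for d, cats in counts.items()}
within_norm = {d: rank(cats) for d, cats in normalized.items()}


def rank_districts(table, category):
    """Rank districts for one category (within-category, between-district)."""
    return rank({d: cats[category] for d, cats in table.items()})


# Within-category, between-district rankings: these can change.
between_raw = rank_districts(counts, "bakery")
between_norm = rank_districts(normalized, "bakery")

print(within_raw == within_norm)  # True
print(between_raw, between_norm)  # ['Mitte', 'Pankow'] ['Pankow', 'Mitte']
```

In this toy data the category order inside each district is untouched by the normalization, while the bakery ranking across districts flips, which is why the step only matters if districts are actually compared with each other.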

Hopefully you can tell me whether I've missed something. I plan to post my analysis on my blog in the next week or so. It's likely that very few will read it, but I do plan to take David Robinson up on his offer, so if I'm lucky some people will. I respect your work and don't want it to seem like I've launched a surprise attack, hence this "issue".

I have a few other objections to Katie's "incidence rate" beyond the above details. You are welcome to read and comment on them here. You'll want the "Deciding on a Metric" section that starts at line 359. It's still very much a rough draft, so some things will change, but my arguments in this section are fleshed out enough to understand what I'm aiming for.
