Sharing NYC Police Precinct Data

No sense keeping good data to yourself.

The map below went with these excellent WNYC stories about low-level marijuana arrests in New York City. After building it, I ended up with some data files that could be useful to others crunching population data by NYPD precinct. So we're sharing them here.

The trick to doing this analysis was to determine the populations of each precinct. But the US Census Bureau doesn't break down numbers that way. So I took the smallest Census unit -- the block -- and determined which blocks were in which precincts.
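
One way to picture that block-to-precinct matching is a point-in-polygon test. The sketch below is only an illustration, not the PostGIS workflow used for the project: it assumes each block has been reduced to a single representative point and each precinct to a simple polygon ring without holes. The precinct names, block names and coordinates are made up.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) falls inside the polygon ring."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does this edge cross the horizontal line through the point?
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def assign_precinct(point, precincts):
    """Return the first precinct whose ring contains the point, else None."""
    x, y = point
    for precinct_id, ring in precincts.items():
        if point_in_polygon(x, y, ring):
            return precinct_id
    return None


# Hypothetical toy geometry: two square "precincts" side by side.
precincts = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
# Hypothetical representative points, one per block.
block_points = {"block_1": (1, 1), "block_2": (3, 1), "block_3": (5, 5)}

assignments = {
    block_id: assign_precinct(point, precincts)
    for block_id, point in block_points.items()
}
print(assignments)
```

A block whose point lands in no precinct comes back as None, which is how the watery blocks mentioned below would show up.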

(I worked with PostgreSQL, PostGIS and QGIS, along with the generosity and insights of Jeff Larson and Al Shaw at ProPublica, and Jonathan Soma at Balance Coop.)

Data For You

Each of the following files is kept on Google Fusion Tables. You can use them there, or download them to your computer using File -> Export.

• precinct_block_key.csv is the Rosetta Stone for this project. It has two columns: each block's identifier, which the Census calls "GEOID10," and the precinct in which that block sits. Note that some blocks aren't in any precinct, usually because they're actually in the water. 

• NYC_Blocks_2010CensusData_Plus_Precincts contains base-level 2010 Census data for each block, married to the precinct for that block. A nice Fusion Tables trick is to pick View -> Aggregate, check "Sum" for the columns you want and then, lower down, choose to aggregate by precinct. Then you get totals for each precinct. For descriptions of the population columns, get this rather large PDF from the Census Bureau and jump to page 6-21. (Updated: Or, go to the page online with DocumentCloud.)

• NYC_Police_Precinct_Shapes_4326 is the official police precinct map converted into a Google Map-friendly projection. I've used the fantastic tool shpescape.com to upload my transformed shape file to Fusion Tables, where it's easy to play with.
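
If you would rather total things up outside Fusion Tables, the same sum-by-precinct idea can be sketched in a few lines of Python. This is a rough sketch under assumptions: the only column name taken from the files above is "GEOID10", while "precinct" and "total_pop" are placeholders for whatever the exported columns are actually called, and the rows are invented.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-ins for two exported CSV files. The block IDs,
# precinct numbers and populations are made up for illustration.
KEY_CSV = """GEOID10,precinct
360000000000001,1
360000000000002,1
360000000000003,2
360000000000004,
"""

CENSUS_CSV = """GEOID10,total_pop
360000000000001,120
360000000000002,80
360000000000003,45
360000000000004,0
"""


def load_block_to_precinct(key_file):
    """Map each block ID to its precinct, skipping blocks with no precinct."""
    mapping = {}
    for row in csv.DictReader(key_file):
        precinct = row["precinct"].strip()
        if precinct:
            mapping[row["GEOID10"]] = precinct
    return mapping


def population_by_precinct(census_file, block_to_precinct):
    """Sum the assumed 'total_pop' column over the blocks in each precinct."""
    totals = defaultdict(int)
    for row in csv.DictReader(census_file):
        precinct = block_to_precinct.get(row["GEOID10"])
        if precinct is not None:
            totals[precinct] += int(row["total_pop"])
    return dict(totals)


block_to_precinct = load_block_to_precinct(io.StringIO(KEY_CSV))
totals = population_by_precinct(io.StringIO(CENSUS_CSV), block_to_precinct)
print(totals)
```

To run it against real exports, swap the io.StringIO objects for opened files and adjust the column names to match.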

Caveats

I've done my best to be accurate in computing the intersection of blocks and precincts, even generating precinct maps and inspecting them visually. But errors may exist.

In fact, they do exist. While Census blocks generally fall nicely within precinct outlines, they don't always. In particular, three blocks significantly straddle two precincts. If you're doing very precise analysis, you'll want to account for them:

• Block 360470071002003: An area near the north end of the Gowanus Canal in Brooklyn. About half is in Precinct 76 and half in Precinct 78. Total people: 51.

• Block 360050096002000: Mainly industrial. Half in Precinct 76, half in Precinct 78. Total people: 5.

• Block 360610265003001: This block consists of five similar-sized apartment buildings near the GW Bridge. The northern three buildings are in Precinct 34, and the southern two are in Precinct 33. Looks like roughly a 60/40 split of the 687 living there.
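
As a rough sketch of how you might account for a straddler, you can split its population between the two precincts by an estimated share. The shares below are just the eyeballed "about half" and "roughly 60/40" figures from the list above, not measured values, and split_population is a hypothetical helper name.

```python
def split_population(total, shares):
    """Divide a block's population among precincts by fractional share."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return {precinct: total * share for precinct, share in shares.items()}


# Two of the straddling blocks, with the rough splits described above.
straddlers = {
    "360470071002003": (51, {"76": 0.5, "78": 0.5}),
    "360610265003001": (687, {"34": 0.6, "33": 0.4}),
}

allocations = {
    block_id: split_population(total, shares)
    for block_id, (total, shares) in straddlers.items()
}

for block_id, by_precinct in allocations.items():
    print(block_id, by_precinct)
```

The results are fractional people, so round them only after adding them into the precinct totals.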

If you find this information useful, drop me a note or post a comment below. We'd love to know about it.

5 responses

SR_spatial said
John, great work as always. When you aggregated the block data to precincts, did you compute intersections or use a centroid approach? I find that the centroid approach is much cleaner, easier, and -- except for the "straddlers" as you point out above -- more consistent in terms of allocating small-area population to larger spatial units. (I've been meaning to post something about this, your blog gives me more impetus to do so -- we've been dealing with this extensively at our shop at the CUNY Graduate Center as well.)

Cheers,
Steve

John Keefe said
Yes! I tried both methods and experienced the same thing. I actually used the "internal point" latitude and longitude provided by the Census Bureau, which should be even better than a simple centroid. Since the precincts seem to have been drawn around the blocks for the most part, it worked really nicely -- and the "straddlers" were easy to spot.

Josh Livni said
Hey John,

Cool stuff - it's great to see you share the data and methodology for this stuff.

You could make the (possibly incorrect, especially in rural areas) assumption that people are evenly spaced within the census blocks, and then assign the appropriate numbers to each arbitrary polygon by using the block shapes rather than centroids. The general methodology for this would be to:
- Calculate the percentage intersection of each census block to each polygon
- Multiply the total block population by that percentage
- Sum these partial populations of each block to get the total polygon population [or other attribute]

Assuming your data is unprojected (e.g. epsg:4326) and you want more easily readable areas to look at, then in PostGIS land your query might look something like:
SELECT b.population,
-- areas come out in squared degrees, so only their ratio is meaningful
ST_Area(b.the_geom) * 100000000 AS area,
ST_Area(ST_Intersection(b.the_geom, p.the_geom)) *
100000000 AS intersected_area
FROM precincts p, blocks b
WHERE ST_Intersects(b.the_geom, p.the_geom)
--and p.gid = 1234 -- [for some specific precinct]
GROUP BY p.gid, b.population, intersected_area, area;

Note that all we care about is relative area, not actual area (area calculations don't really make sense in unprojected reference systems; I just multiplied by a big number to make the result more easily readable).
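
The three steps above (percentage intersection, multiply, sum) can also be sketched outside the database. This toy version is only an illustration under a loud assumption: blocks and the target polygon are axis-aligned rectangles given as (xmin, ymin, xmax, ymax), which keeps the intersection math trivial. Real block and precinct shapes would need a proper geometry library or PostGIS. All names and numbers are made up.

```python
def rect_area(rect):
    """Area of an (xmin, ymin, xmax, ymax) rectangle; 0 if it is empty."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def rect_intersection(a, b):
    """Overlap of two rectangles (may be empty, which rect_area treats as 0)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def area_weighted_population(blocks, polygon):
    """Sum each block's population scaled by the share of its area in polygon.

    Assumes people are spread evenly within each block.
    """
    total = 0.0
    for rect, population in blocks:
        block_area = rect_area(rect)
        if block_area == 0:
            continue  # degenerate block, nothing to allocate
        share = rect_area(rect_intersection(rect, polygon)) / block_area
        total += population * share
    return total


# Hypothetical target polygon and blocks: (rectangle, population).
precinct = (0, 0, 10, 10)
blocks = [
    ((0, 0, 4, 4), 100),     # entirely inside the precinct
    ((8, 0, 12, 4), 50),     # half inside, half outside
    ((20, 20, 24, 24), 30),  # entirely outside
]

estimate = area_weighted_population(blocks, precinct)
print(estimate)
```

Because each block's overlap is divided by that block's own area, the units cancel out, which is why only relative area matters here.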

Cheers,

-Josh

John Keefe said
Josh, thanks so much for this! I was actually heading down this path when I switched back to the centroid method -- partly because I wasn't sure how to pull off this area calculation, and also because there were only three blocks that really came up as an issue -- and I could get a sense of them visually in Google Maps.

One REALLY BIG thing to consider is that you can have problems when much of the area in question falls over water -- which is common for Census *tracts* on shorelines. For example, in Manhattan, there are tracts that have slivers of densely populated areas along the water, and with the bulk of the tract in the Hudson River! That's because the tracts extend to the state line, which is in the water.

A way around that is to consider the tract's AREAWATR and AREALAND metadata values, provided by the Census Bureau, in the calculation.

Better yet, use blocks, which end at the shorelines. (Other, watery blocks -- with 0 AREALAND -- fill out the tracts to the state line.)

denise cante said
Hi John. Thanks for sharing this tool with the public. I have a question: in doing this research, did you ever figure out how the NYPD precinct lines were actually drawn? I work for a non-profit that serves troubled youth, and we have several locations across the city. I'm trying to compare crime stats in each of the communities we serve, so it would be helpful for me to know how our precinct lines were created (by population, by census, by geography). I've made several calls to the NYPD, and the answer has been "they've been like that forever." Just wondering if you might have come across the answer in the course of your work. Best, Denise