New online class: Mapmaking for journalists

The video above is the trailer for my new online class: Mapmaking for Journalists.

It starts Monday, July 11, 2022 and runs for four weeks. It's all online, and you can go at your own pace. I'll be in the class forums and even host a couple of (optional) live chats.

Along the way, you'll learn how to make maps using Datawrapper, Mapbox, and Mapshaper.

No prior coding experience is needed, and you get to keep all of the code I share with you.

It's run by the Knight Center for Journalism in the Americas at the University of Texas at Austin. The cost is $95, and you can register at journalismcourses.org.

Hope you'll join me!


Sharing NYC Police Precinct Data

Note: This post was originally published April 29, 2011, and updated in June 2020. In February 2022, I updated it again using 2020 Census data. 

Anyone doing population analysis by NYC police precinct might find this post helpful, especially if you're interested in race and/or ethnicity analysis by precinct.

Back in 2011, I wanted to compare the racial and ethnic breakdown of low-level marijuana arrests, which are reported by police precinct, with that of the general population. The population data, of course, is available from the US Census, but it isn't broken down by police precinct, and precincts don't follow any major census boundaries like census tracts. Instead, they generally follow streets and shorelines. Fortunately, census blocks (which, in New York, are often just city blocks) also follow streets and shorelines.

So I used US Census block maps and precinct maps from the city to figure out which blocks are in which precincts. Since population data is available at the block level, that data can then be aggregated into precincts.
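For anyone who wants to repeat the aggregation step, here is a minimal Python sketch. It assumes the block-to-precinct match is already done, and that each row carries a block ID, a precinct number, and one or more count columns. The column name "precinct", the two census-style count columns, and the sample rows are all assumptions made up for illustration, not the actual contents of my files.

```python
import csv
import io
from collections import defaultdict


def aggregate_by_precinct(rows, group_col="precinct", skip_cols=("GEOID20",)):
    """Sum every block-level count column within each precinct."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        precinct = row[group_col]
        for col, value in row.items():
            # Skip the grouping column and the block ID; everything else is a count.
            if col == group_col or col in skip_cols:
                continue
            totals[precinct][col] += int(value)
    return {precinct: dict(cols) for precinct, cols in totals.items()}


# Made-up sample rows standing in for a block-level file (hypothetical values).
SAMPLE = """GEOID20,precinct,P1_001N,P1_003N
360610001001000,1,120,80
360610001001001,1,30,10
360470002001000,60,200,90
"""

precinct_totals = aggregate_by_precinct(csv.DictReader(io.StringIO(SAMPLE)))
print(precinct_totals)
```

To run it on a real file, swap the sample for `csv.DictReader(open(path, newline=""))` and adjust the column names to match.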

In this, the third version of this post, I've updated the counts now that the 2020 population data is available.

The 2020 data

• nyc_precinct_2020pop.csv is the 2020 Census population, race, and ethnicity (Hispanic/non-Hispanic) data by NYPD police precinct. The column headers from the US Census are a little cryptic, but you can translate them using the P1 table metadata file and the P2 table metadata file.

• nyc_block_precinct_2020pop.csv lists every populated block in NYC, identified by its ID (called "GEOID20"), matched to the police precinct it sits within, along with the block's race/ethnicity information. Use the same metadata tables to translate the column headers. Also be sure to read about the caveats below.

• nyc_precincts.geojson depicts the geographic boundaries of the NYPD precincts I used for the files above, as they existed in February 2022. As of this post, the information on the NYC Open Data portal indicates it was last updated on Nov 24, 2021.

Caveats for the 2020 data

The biggest caveat is that the US Census has introduced data fuzziness, or "noise," to make it difficult to identify individuals based on census data. This fuzziness is more pronounced at smaller geographies — the smallest being census blocks, which I've used for these calculations. Hansi Lo Wang did a great primer on these data protections for NPR, and the US Census Bureau has put out a lot of material on how it uses "differential privacy."

Covid cases, animated

I've been both awed and terrified by the transmissibility of Omicron and the speed at which it's spread. As the case curves hockey-sticked upward and the maps all turned red, I thought it'd be interesting to visualize the spread of this new twist on the coronavirus.

So the other day, while doing some work mapping Covid-19 data by US counties, I realized it wouldn't take much to generate a map for each day of the pandemic ... and make those maps into a movie.

It almost seems like cheating to use a work project as one of my "Make Every Week" projects, but I'm lucky to have a job where creative tinkering is celebrated. When I shared a tinker-made movie of six months of case data with colleagues Kaeti Hinck and Sean O'Key, they thought it would make a good data feature for CNN.

While truly a horrible topic — nearly every county is now reporting more than 100 cases per 100,000 people — the process of turning that case data into a movie was a worthy project. I learned a lot, and I did it almost entirely from the command line (the text-only interface that is my Mac's "Terminal" program).

Make Every Week returns

The last two years were rough. And as 2021 ended, and a new coronavirus surge began, the outlook wasn't exactly sunny.

In an effort to stay centered and battle the blues, I've turned again to my soothing practice: making things.

Back in 2015, I tried to make something every week for a year. I only averaged something every 1.7 weeks, but it was still successful fun.

So I'm doing it again for 2022.

Might be a gadget, might be a toy, might be a map, might be bread. I'll try to learn something new every time, and will share each thing here.

But make. Every week.

A 3D-printed flexi-dog

To kick things off I literally dusted off my 3D printer, which I had set aside when we got a pandemic puppy, and tried to remember how to use it.

Seemed appropriate to print a dog, so I found this Flexi Dog on Thingiverse. I downloaded the shape's .stl file, navigated PrusaSlicer to turn the object into printable slices ...

... and then used OctoPrint to actually send the dog to the printer.

It didn't work right away; the triangle at the end of the tail — in the foreground of the next picture — kept coming off the base plate during printing, leading to tangled messes. In the end, I warmed the plate an extra 5° Celsius, and that made it stick.

Several false starts and two hours of continuous printing later, I had my first "make" of 2022.

Taking time to build a triangle-grid clock

I like what's possible with triangles.

Playing with rectangular blinky grids is super fun, and I've made a weather monitor and a pulse-oximeter with those.

But there's something additionally awesome about the pattern possibilities with triangle pixels.

So when I saw a Hackaday post about building a clock display with LED triangles, I was hooked.

The short story is that I made it! It now lights up my living room with dazzling animations and a funky time display.

The longer story involves perseverance made possible by my coronavirus lockdown.

Is it Monday? My Pi has the answer

Keeping track of the days has been harder lately, it seems.

So I was excited to see a nifty blog post by Dave Gershgorn, where he described how he built a slick dashboard by attaching a screen to a Raspberry Pi computer. In fact, the Pi actually attaches to the back of the screen, out of sight.

I happen to be the kind of nerd who has a couple of Raspberry Pis around (in my case, some older Pi 3 model B's), so I ordered the recommended screen and followed Dave's great directions along with this ETA Prime video. If you're similarly inspired, just follow those guides.

If you're new to setting up a Pi, you might not realize that it doesn't come with an operating system. You need to install one on a microSD card, and slide it into the Pi. I like to download the latest, recommended system from the Pi site, unzip it, and use balenaEtcher to flash the SD card.

One of the build steps that was unclear from the video was exactly how to attach the power lines to the Pi. For my Pi, the pins were these three:

Another tricky step was folding the ribbon cable so it fit nicely. Here's how I did it:

Then it was just a treat to see the tiny Pi desktop appear before my eyes:

I launched the Terminal application with the little cursor icon in the upper left corner, and in order to run the installation commands I increased the Terminal text size using Ctrl-Shift-+.

Once I got everything running, I installed MagicMirror, added a monthly calendar module, and played with the configuration settings to suit my needs. (I also toyed with the JavaScript and the CSS because I couldn't help myself, but you certainly don't have to.)

Works like a charm.

Drawing arcs of circles

Maybe you've seen them: Rainbows of circles representing members of the U.S. Senate, House of Representatives, or all of Congress.

I wanted to make such a visualization to show the number of members of Congress who've tested positive for coronavirus and the positive tests among each party.

Despite the serious nature of the topic, it was a fun puzzle to solve.

The steps I took were:

  1. Figure out how many circles fit in each ring
  2. Calculate the positions for every circle
  3. Sort the positions to suit my needs
  4. Marry the positions to data

It turns out that once I established a) the number of circles in each ring and b) the size of those circles, I could figure out the rest with code.
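Here is a rough Python sketch of steps 2 through 4, not the code behind the published graphic. It takes the number of circles per ring and the circle size as given. The function names, the padding factor, and the rule for choosing each ring's radius (neighboring centers sit about one padded diameter apart along a half-circle) are my assumptions for illustration.

```python
import math


def arc_positions(ring_counts, circle_radius, padding=1.1):
    """Return x/y centers for circles laid out on concentric half-circles."""
    step = 2 * circle_radius * padding  # distance between neighboring centers
    positions = []
    for ring, count in enumerate(ring_counts):
        # Assumption: the half-circle's length, pi * R, holds (count - 1) steps.
        ring_radius = step * (count - 1) / math.pi if count > 1 else 0.0
        for i in range(count):
            # Sweep from the left end (angle pi) to the right end (angle 0).
            angle = math.pi * (1 - i / (count - 1)) if count > 1 else math.pi / 2
            positions.append({
                "ring": ring,
                "angle": angle,
                "x": ring_radius * math.cos(angle),
                "y": ring_radius * math.sin(angle),
            })
    # Sort left to right, inner ring first, so each group fills a wedge.
    positions.sort(key=lambda p: (-p["angle"], p["ring"]))
    return positions


def assign(positions, groups):
    """Marry sorted positions to data: groups is a list of (label, count)."""
    labels = [label for label, n in groups for _ in range(n)]
    if len(labels) != len(positions):
        raise ValueError("group counts must add up to the number of circles")
    return [dict(pos, group=label) for pos, label in zip(positions, labels)]


# Hypothetical example: two rings and two made-up groups.
circles = assign(arc_positions([5, 9], circle_radius=1.0), [("A", 6), ("B", 8)])
print(len(circles), circles[0]["group"], circles[-1]["group"])
```

With this radius rule, ring counts that grow by about four per ring keep neighboring rings from overlapping. Sorting by angle before assigning groups is what makes each group read as a slice of the rainbow rather than a band around one ring.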

You can play with the final results, or take a look at the code yourself. But here's the explanation:

Printing a pumpkin

There's something exciting about holding an object you previously only imagined — whether it's a freshly baked loaf, a tomato off a garden vine, or a printed plastic pumpkin.

I've had that feeling a lot lately, with a pandemic purchase of a 3D printer.

Rolling an object in your fingers that was previously just a digital file on the internet is ridiculously fun. It's even more rewarding if the thing conjured was something you — or your kid — dreamed up.

That's what happened with this 3D pumpkin. My daughter drew it late one night for an animation class assignment using the program she was learning, Cinema 4D.

And then we made it real.

Modeling the 2020 vote with Observable

I've been interested in how voter turnout might affect the 2020 US election, and I've wanted to play with Observable notebooks.

So I blended the two projects, and you can play with my live Observable notebook that does those calculations.

The result is an admittedly super-simplistic model of how things might turn out. But you can increase the percentage of Republican and Democratic voters nationwide and see what happens!

Notably, even if Democrats were able to boost turnout more than Republicans — say 107% vs 106% — Trump still wins.

As written, it doesn't consider nuances such as regional differences in voter turnout, swing voters, or faithless electors. (It does, however, account for the unique ways Maine and Nebraska divide their electoral votes.) But I learned a lot in the process ... and there's more to come.
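To give a feel for the basic idea, here is a stripped-down Python sketch of a turnout model. It is not the notebook's code: the function name, the field names, and the three toy states are invented for illustration, and it treats every state as winner-take-all, so it skips the Maine and Nebraska handling.

```python
def project(states, dem_turnout=1.0, rep_turnout=1.0):
    """Scale each party's prior votes by a nationwide turnout factor,
    then award each state's electoral votes to whoever leads."""
    totals = {"D": 0, "R": 0}
    for state in states:
        dem = state["dem_votes"] * dem_turnout
        rep = state["rep_votes"] * rep_turnout
        # Assumption: an exact tie goes to "R"; a real model needs a better rule.
        winner = "D" if dem > rep else "R"
        totals[winner] += state["electoral_votes"]
    return totals


# Invented toy states, not real election data.
TOY_STATES = [
    {"name": "State A", "dem_votes": 100, "rep_votes": 101, "electoral_votes": 10},
    {"name": "State B", "dem_votes": 300, "rep_votes": 200, "electoral_votes": 5},
    {"name": "State C", "dem_votes": 90, "rep_votes": 100, "electoral_votes": 8},
]

baseline = project(TOY_STATES)
small_edge = project(TOY_STATES, dem_turnout=1.07, rep_turnout=1.06)
print(baseline, small_edge)
```

In this toy setup, a one-point turnout edge leaves the electoral totals unchanged, because the extra votes don't flip any state.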

All my calculations are visible in the Observable notebook itself, and the initial data prep is documented in a GitHub repository. For good measure, I put all the raw data in my Datasette library.

Minneapolis race and ethnicity data by neighborhood, served with Datasette

Minneapolis police report stops and other incidents by neighborhood, so I decided to calculate the racial makeup of those neighborhoods to make some comparisons — along the lines of what I've already done for New York, Chicago, and Washington, DC.

This time, though, I'm using Datasette.

I've seen creator Simon Willison tweet about Datasette, and with some extra time on my hands I took a look. It's so impressive!

With Datasette, one can publish data online easily, efficiently (even free!) and in a way that allows others to explore the data themselves using SQL and feed data visualizations and apps. At scale.

How is this not in every newsroom?

(Simon, by the way, has offered to help any newsroom interested in using Datasette — an offer I hope to take him up on someday.)

Minneapolis neighborhoods

Once again, I've married US Census blocks with other municipal zones, this time the official neighborhood map of Minneapolis.

That data is now online, served up with Datasette.

And with some nifty SQL queries, bookmarked as simple links, I can list the race and ethnic makeup of every neighborhood by raw number.

Or by percentage.
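Datasette serves SQLite databases, so a percentage query can be tried locally with Python's built-in sqlite3 module. This is only a sketch: the table name, the column names, and the two sample neighborhoods are placeholders I made up, not the schema of the published data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE neighborhood_pop (
           neighborhood TEXT, total INTEGER,
           white INTEGER, black INTEGER, hispanic INTEGER)"""
)
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO neighborhood_pop VALUES (?, ?, ?, ?, ?)",
    [
        ("Example North", 200, 100, 50, 30),
        ("Example South", 400, 100, 200, 60),
    ],
)

# Multiplying by 100.0 forces floating-point division in SQLite.
PERCENT_QUERY = """
SELECT neighborhood,
       ROUND(100.0 * white / total, 1)    AS pct_white,
       ROUND(100.0 * black / total, 1)    AS pct_black,
       ROUND(100.0 * hispanic / total, 1) AS pct_hispanic
FROM neighborhood_pop
WHERE total > 0
ORDER BY neighborhood
"""

rows = conn.execute(PERCENT_QUERY).fetchall()
for row in rows:
    print(row)
```

The `WHERE total > 0` clause keeps unpopulated areas from producing NULL percentages.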