Race and ethnicity data by Washington DC police zones

If you've got arrest or incident data from the Metropolitan Police in Washington DC, and that data is broken out by police district or public service area, you may want to compare it with the racial and ethnic makeup of the people living in those zones.

If so, this post is for you.

The US Census doesn't break out populations by police districts. But in DC and other large cities, census blocks serve as atomic units that usually do fall within police precinct boundaries. So by knowing which blocks are within which districts, you can calculate the populations. Unfortunately, block-level data is only available from the decennial count, so the latest data is from 2010.
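Here's a minimal sketch of that roll-up in Python, not the code behind this post. The block IDs, district labels, group names, and counts are all made up for illustration, and the block-to-district lookup is assumed to come from a separate spatial join of block and district boundaries.

```python
from collections import defaultdict

# Hypothetical lookup: census block ID -> police district
# (in practice this would come from a spatial join of the two boundary files)
block_to_district = {
    "block-001": "1D",
    "block-002": "1D",
    "block-003": "2D",
}

# Hypothetical block-level counts; group names are placeholders
block_population = {
    "block-001": {"total": 120, "black": 60, "white": 40, "hispanic": 20},
    "block-002": {"total": 80, "black": 30, "white": 40, "hispanic": 10},
    "block-003": {"total": 50, "black": 10, "white": 30, "hispanic": 10},
}


def population_by_district(block_to_district, block_population):
    """Sum block-level counts into one set of totals per police district."""
    totals = defaultdict(lambda: defaultdict(int))
    for block_id, district in block_to_district.items():
        counts = block_population.get(block_id)
        if counts is None:
            # Skip blocks that have no census record
            continue
        for group, n in counts.items():
            totals[district][group] += n
    # Convert nested defaultdicts to plain dicts for easier printing
    return {district: dict(groups) for district, groups in totals.items()}


district_totals = population_by_district(block_to_district, block_population)
print(district_totals)
```

With totals per district in hand, dividing each group's count by the district total gives the shares to compare against arrest or incident data.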

This is my third spin at such data; I've also done New York City and Chicago.

Chicago race and ethnicity data by police district

If you're trying to match Chicago police district data with the racial and ethnic makeup of those police districts, this post is for you.

The boundaries for police districts and precincts don't usually line up nicely with US census boundaries like census tracts or block groups. That makes it tough to compare incident and arrest data reported by precinct with the population of those precincts. 

But in bigger cities, census blocks are small enough to serve as atomic units that usually do fall within police precinct boundaries. So by knowing which blocks are within which districts, you can calculate the populations. Block-level data is only available from the decennial census count, so the latest data is from 2010. But it still should serve as a good measure — and a reason to fill out your 2020 census form online!

After doing these calculations for New York City, I put together Chicago's by request!

Lockdown loaves

It's become a coronavirus cliché, but for this week's #MakeEveryWeek I made sourdough bread. 

The twist: I made one loaf in the oven and one in a slow cooker.

It all started with sourdough starter, specifically this guide from Quartz colleague Tim McDonnell. This was a great project for my teens, incorporating chemistry, biology, and excellent smells.

Next was this incredibly fun and detailed sourdough recipe from Kitchn, which makes two loaves and relies on two oven-safe pots. Alas, our family has but one.

We do have a slow cooker, though. Could I make one of the loaves in that? The answer is yes!

Building a pulse oximeter

At-home pulse oximeters, those fingertip devices doctors use to measure the oxygen saturation in your blood, have been selling out everywhere thanks to the Covid-19 pandemic.

But as my Quartz colleague Amrita Khalid points out in this great article, most people don't need 'em. If your oxygen level is worryingly low, you'll know; you don't need a machine to tell you. Folks with some existing conditions, however, can use a pulse oximeter to help a remote doctor monitor their vitals or to adjust supplementary oxygen devices.

When Khalid mentioned she was working the story, it reminded me of the DIY "pulse ox" sensor Sparkfun sells. It, like other pulse oximeters, shines light into the skin and makes measurements based on how that light is absorbed. I've built heartbeat-driven projects before and had been exploring new ways to monitor pulse rates. So I got one.

Sparkfun warns in red letters that "this device is not intended to diagnose or treat any conditions," and I offer the same caution if you're tempted to build one. The process wasn't hard at all. I got it running quickly ... and then added an LED display for fun and flourish.

Here's how I made it, and the code, too.

Work-from-home "on air" light

I'm incredibly lucky to be both healthy and able to work from home during this coronavirus crisis. That means I spend large chunks of my day on video calls.

As a courtesy to my family, all of whom are also working and schooling from home, I've tried to warn them when they risk being broadcast to my colleagues. 

Now I have a fun "on air" light to help! And I've put the code online so you can make one, too.

DIY aquarium lights

Buy a new aquarium, and you often get hood lights that are ... meh. They're good enough, but not great.

There are plenty of high-quality replacement lights out there, but none of them had the nice, low profile of the plastic covers that came with this tank. So I decided to spruce up the existing illumination with some DIY lights, and even make them programmable with an Arduino.

That was more than a year ago. Now in coronavirus isolation, I finally made it happen.

Here's how.

Amazon Aurora MySQL + Python

OK, so this isn't the sexiest topic, but if you're completely stuck the way I was several times today, maybe you'll be happy you found this post.

Today I needed to spin up a database I want available to students at the Newmark Graduate School of Journalism and also colleagues at Quartz. I also want to connect to the database from my home and the school using Python.

Since we use Amazon's web services, and I wanted to show the students SQL, I decided to give the AWS Aurora system a whirl — specifically the MySQL-compatible version.

As with many things AWS, it was a bit of a slog to get set up ... and I've decided to jot it all down while it's fresh so I can remember how the heck I did it (and show my students).

After a few tries, here's how I finally got set up:

Machine learning in my pajamas

Tonight I gave a presentation at Newsgeist about how I did machine learning in my pajamas — in my pajamas.

I promised the gathered crowd I'd post how they, too, can make their own bike-detector, so here it goes:

  1. Follow the instructions here.
  2. When you get to the part about picking a notebook, use this one: notebooks/ee-searching-videos-with-fastai.ipynb

Then follow the steps to work through the code! Have fun!


AI classes for journalists

(Promo video for the Knight Center course.)

If you're a journalist, you've probably done a story or two about AI. But did you know you can use machine learning, too?

I'll show you! 

While the classes below have passed, the videos and accompanying code for the Knight Center course are now available free online.

Work at your own pace and enjoy. It could help with your next investigation, and the experience will help you report about machine learning, too.


Past classes:


September 13, 2019 • 11 am • InterContinental New Orleans • Treme / 2nd Floor

Hands-on Introduction: Machine Learning for Journalists at ONA

If you're going to ONA, get a practical, hands-on introduction to using machine learning to help pore through documents, images, and data records. This 90-minute training session by members of the Quartz AI Studio will give you the chance to use third-party tools and learn how to make custom machine-learning models. We'll walk you through pre-written code you can take home to your newsroom.


October 26 & 27, 2019 • Newmark Graduate School of Journalism • New York City

This will be a small-group, guided bootcamp where we'll spend the weekend working through practical machine-learning solutions for journalists. You'll learn to recognize cases when machine learning might help solve reporting problems, to use existing and custom-made tools to tackle real-world issues, and to identify and avoid bias and error in your work. Students will get personalized instruction and hands-on experience for using these methods on any beat.


November 18 to December 15 • Knight Center for Journalism in the Americas • Online • $95

4-Week Online Course: Hands-on Machine Learning Solutions for Journalists

In this online video course, you will first learn how to use some off-the-shelf systems to get fast answers to basic questions: What’s in all of these images? What are these documents about? Then we’ll move to building custom machine learning models to help with a particular project, such as sorting documents into particular piles. Our work will be done with pre-written code, so you always start with a working base. You’ll then learn more by modifying it.

Updated 21 April 2020

Detecting feature importance in fast.ai neural networks

I'm working on a new neural network that tries to predict an outcome – true or false – based on 65 different variables in a table.

The tabular model I made with fast.ai is somewhat accurate at making those predictions (it's a small data set of just 5,000 rows). But to me even more interesting is determining which of the 65 features matter most. 

I knew calculating this "feature importance" was possible with random forests, but could I do it with neural nets?

It turns out I can. The trick is, essentially, to try the model without each feature. The degree to which the model gets worse with that feature missing indicates its importance – or lack of importance.

This blog post describes how to run this test, and this adaptation worked perfectly in my fast.ai notebook. Here's the code in a Gist:

Unfortunately, because my project uses internal Quartz analytics, I can't share the data or the charts I'm playing with. But with the code above, I can now "see into" the neural network and get cool insights about what's going on.
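For readers who want to see the general idea without the data, here's a toy sketch in plain Python. It is not the Gist's code and doesn't use fast.ai. It assumes one common way of trying the model "without" a feature: shuffle that feature's column so it carries no signal, then measure how much accuracy drops. The data, the `toy_model` function, and the feature names `a` and `b` are all invented for illustration.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    correct = sum(model(row) == label for row, label in zip(rows, labels))
    return correct / len(rows)


def permutation_importance(model, rows, labels, features, seed=0):
    """Accuracy drop when each feature's column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importance = {}
    for name in features:
        # Shuffle one column, leaving every other column untouched
        shuffled = [row[name] for row in rows]
        rng.shuffle(shuffled)
        scrambled = [{**row, name: value} for row, value in zip(rows, shuffled)]
        importance[name] = baseline - accuracy(model, scrambled, labels)
    return importance


# Made-up data: the label depends only on feature "a"; "b" is pure noise
data_rng = random.Random(42)
rows = [{"a": data_rng.random(), "b": data_rng.random()} for _ in range(200)]
labels = [row["a"] > 0.5 for row in rows]


def toy_model(row):
    # Stand-in for a trained model; it only looks at feature "a"
    return row["a"] > 0.5


importance = permutation_importance(toy_model, rows, labels, ["a", "b"])
for name, drop in sorted(importance.items(), key=lambda item: -item[1]):
    print(f"{name}: accuracy drop {drop:.3f}")
```

A big drop means the model leaned heavily on that feature; a drop near zero means it barely mattered. With a real model you'd swap `toy_model` for your own predict function and score against a held-out validation set.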