Welcome! I am a data scientist, writer, and policy analyst currently working as the Director of Data and Analytics at the Guinn Center for Policy Priorities in Las Vegas, Nevada. On this site, I share work related to my interests, which include R programming, statistics, data science, machine learning, Emacs, and the history of science.
Background
For the past several months, I have been working on an analysis of the effects of urban heat on vulnerable populations, particularly during a public health crisis. For some background, I currently live in Las Vegas, where summer heat can exceed 110 degrees Fahrenheit. Last summer included a 45-day streak of temperatures over 100 degrees.
Urban heat is not distributed evenly within cities. Features such as parks or ponds can lead to cooler temperatures in some areas, while areas without foliage, or with dark surfaces such as roads and buildings, tend to be warmer.
Introduction
Below is the output of an IPython notebook covering the process of transforming images for deep learning applications in the fastai library. In particular, it shows how to use the many transformations in the Albumentations library within a fastai DataBlock.
A Brief Note
I’ve made a number of these small guides, but haven’t posted them here. I may do so in the future. In general, I want to be better about putting materials I generate here on my site in some format.
Note 2: Now that the 2020 Census has concluded, the embedded Shiny app has been shut down. I’ve left the code used to embed the app on the site, and I’ve linked to a blog post with screenshots of the app itself. But the app itself is no longer linked and no longer embedded here.
Note: This post was updated on 2020-07-10 with a new version of the Shiny app.
Contents
Introduction
Base R
    Figure in Base R with Default Options
    High-Resolution Figure with Incorrect Figure Component Dimensions in Base R
    High-Resolution Figure with Correct Font Sizes in Base R
High-resolution figures in ggplot
    Figure in ggplot2 with default options
    Figure in ggplot2 with higher resolution
    Formatted, high-resolution ggplot2 figure
Introduction
In this post, I go over the process for exporting high-resolution graphics of the desired size with consistent layouts and font sizes.
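As a rough illustration of why figure size, resolution, and font size have to be chosen together, here is a minimal sketch of the arithmetic involved. It is written in Python rather than the R used in the post, the function names are hypothetical, and the 6 x 4 inch figure and 12-point font are assumed values for illustration only.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 of an inch


def pixel_dimensions(width_in, height_in, dpi):
    """Return (width, height) in pixels for a figure of the given physical size."""
    return round(width_in * dpi), round(height_in * dpi)


def font_height_px(point_size, dpi):
    """Return the height in pixels of a font of the given point size at this resolution."""
    return point_size * dpi / POINTS_PER_INCH


# Compare an assumed 6 x 4 inch figure at screen resolution and at print resolution.
for dpi in (72, 300):
    width_px, height_px = pixel_dimensions(6, 4, dpi)
    print(dpi, width_px, height_px, font_height_px(12, dpi))
```

The physical size stays fixed while the pixel counts grow with the resolution, so text specified in points keeps the same proportion relative to the figure only when the export is sized in physical units rather than in raw pixels.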
In this post, we attempt to embed an interactive plot generated using plot_ly on this web page. The plot was generated in R. We are following this guide to embed the interactive figure.
Some context: the figure below depicts the change in enrollment in health insurance through the HealthCare.gov marketplace (the federal marketplace established through the Affordable Care Act) from coverage year 2019 to coverage year 2020. A static version of this figure was made using ggplot.
Introduction and Perspective
In this post, I am briefly reviewing the draft of Andrew Ng’s Machine Learning Yearning, which can currently be obtained for free here. This is a work in progress, so my comments will be brief and not particularly critical. I obtained the copy I read on May 11; the page numbers and content may have changed since then.
I read the book because I have not had the chance to do too much machine learning work in the recent past.
I do a lot of work in Emacs (mostly in org mode): I write posts for this site; I work through exercises in textbooks; I document personal projects; I take notes; I keep a journal; the list goes on. Most of the writing I do ends up in Emacs at one point or another. There are plenty of commands that I use all the time and no longer need to look up.
I added a new section to this website! It is a work in progress (currently a long wall of text), but it will serve as a portfolio of some of my data science and statistics projects. I will be updating it in the coming days and weeks with more code snippets and visualizations.
You can keep an eye on it here. The link can also be found at the top of this page.
MailChimp Export API
I’m in the process of setting up regular exports of data from MailChimp to help Pocket Lobby keep track of user engagement. We want to integrate the MailChimp data with a number of other data sources, so getting the data out of MailChimp and into a format that can be linked with our other data sources is important. Ideally, we can automate as much of this as possible (rather than manually downloading the CSV files and adding them to a database).
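The "adding them to a database" step might look something like the following minimal sketch. It does not call MailChimp at all: the sample CSV text, the column names (email, opens, clicks), the table name, and the load_export helper are all hypothetical stand-ins, and SQLite is assumed only because it ships with Python.

```python
import csv
import io
import sqlite3

# Hypothetical export contents; a real export would be read from a downloaded file.
SAMPLE_EXPORT = """email,opens,clicks
a@example.com,5,2
b@example.com,3,0
"""


def load_export(conn, csv_text):
    """Insert rows from CSV text into an 'engagement' table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS engagement ("
        "email TEXT PRIMARY KEY, opens INTEGER, clicks INTEGER)"
    )
    rows = [
        (r["email"], int(r["opens"]), int(r["clicks"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    # INSERT OR REPLACE keeps a repeated export from duplicating a subscriber.
    conn.executemany("INSERT OR REPLACE INTO engagement VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_export(conn, SAMPLE_EXPORT)
total_opens = conn.execute("SELECT SUM(opens) FROM engagement").fetchone()[0]
print(loaded, total_opens)
```

Keying the table on the email address is one possible design choice: it lets the same load run on a schedule without creating duplicate rows, and it gives the other data sources a column to join on.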
I’ve been doing some pro bono data and statistics work for a nonprofit focused on keeping constituents informed of how well their elected representatives are representing them between elections. The goal is to allow users to vote along with their representatives and to receive information on how aligned their views are with the votes of their representatives.
We are trying to maintain and analyze data from a number of different sources: about the users (e.