diff --git a/docs/_freeze/posts/campaign-finance/index/execute-results/html.json b/docs/_freeze/posts/campaign-finance/index/execute-results/html.json index 4da314f908f0..4051c81c5a8e 100644 --- a/docs/_freeze/posts/campaign-finance/index/execute-results/html.json +++ b/docs/_freeze/posts/campaign-finance/index/execute-results/html.json @@ -1,14 +1,15 @@ { - "hash": "2631514785c59e4e1d3b37b9c07ea232", + "hash": "989ed0f2ebddb8e202db6a33bc1bf790", "result": { - "markdown": "---\ntitle: \"Exploring campaign finance data\"\nauthor: \"Nick Crews\"\ndate: \"2023-03-24\"\ncategories:\n - blog\n - data engineering\n - case study\n - duckdb\n - performance\n---\n\nHi! My name is [Nick Crews](https://www.linkedin.com/in/nicholas-b-crews/),\nand I'm a data engineer that looks at public campaign finance data.\n\nIn this post, I'll walk through how I use Ibis to explore public campaign contribution\ndata from the Federal Election Commission (FEC). We'll do some loading,\ncleaning, featurizing, and visualization. 
There will be filtering, sorting, grouping,\nand aggregation.\n\n## Downloading The Data\n\n::: {#02d63441 .cell execution_count=1}\n``` {.python .cell-code}\nfrom pathlib import Path\nfrom zipfile import ZipFile\nfrom urllib.request import urlretrieve\n\n# Download and unzip the 2018 individual contributions data\nurl = \"https://cg-519a459a-0ea3-42c2-b7bc-fa1143481f74.s3-us-gov-west-1.amazonaws.com/bulk-downloads/2018/indiv18.zip\"\nzip_path = Path(\"indiv18.zip\")\ncsv_path = Path(\"indiv18.csv\")\n\nif not zip_path.exists():\n urlretrieve(url, zip_path)\n\nif not csv_path.exists():\n with ZipFile(zip_path) as zip_file, csv_path.open(\"w\") as csv_file:\n for line in zip_file.open(\"itcont.txt\"):\n csv_file.write(line.decode())\n```\n:::\n\n\n## Loading the data\n\nNow that we have our raw data in a .csv format, let's load it into Ibis,\nusing the duckdb backend.\n\nNote that a 4.3 GB .csv would be near the limit of what pandas could\nhandle on my laptop with 16GB of RAM. In pandas, typically every time\nyou perform a transformation on the data, a copy of the data is made.\nI could only do a few transformations before I ran out of memory.\n\nWith Ibis, this problem is solved in two different ways.\n\nFirst, because they are designed to work with very large datasets,\nmany (all?) SQL backends support out of core operations.\nThe data lives on disk, and are only loaded in a streaming fashion\nwhen needed, and then written back to disk as the operation is performed.\n\nSecond, unless you explicitly ask for it, Ibis makes use of lazy\nevaluation. This means that when you ask for a result, the\nresult is not persisted in memory. Only the original source\ndata is persisted. 
Everything else is derived from this on the fly.\n\n::: {#83a871f2 .cell execution_count=2}\n``` {.python .cell-code}\nimport ibis\nfrom ibis import _\n\nibis.options.interactive = True\n\n# The raw .csv file doesn't have column names, so we will add them in the next step.\nraw = ibis.read_csv(csv_path)\nraw\n```\n\n::: {.cell-output .cell-output-display execution_count=2}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓\n┃ C00401224  A       M6      P       201804059101866001  24T     IND     STOUFFER, LEIGH    AMSTELVEEN    ZZ      1187RC     MYSELF             SELF EMPLOYED            05172017  10     C00458000  SA11AI_81445687  1217152  column18  EARMARKED FOR PROGRESSIVE CHANGE CAMPAIGN COMMITTEE (C00458000)  4050820181544765358 ┃\n┡━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━┩\n│ stringstringstringstringint64stringstringstringstringstringstringstringstringstringint64stringstringint64stringstringint64               │\n├───────────┼────────┼────────┼────────┼────────────────────┼────────┼────────┼───────────────────┼──────────────┼────────┼───────────┼───────────────────┼─────────────────────────┼──────────┼───────┼───────────┼─────────────────┼─────────┼──────────┼─────────────────────────────────────────────────────────────────┼─────────────────────┤\n│ C00401224A     M6    P     20180405910186774824T   IND   STRAWS, JOYCE    OCOEE       FL    34761    SILVERSEA CRUISESRESERVATIONS SUPERVISOR0518201710C00000935SA11AI_815923361217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544770597 │\n│ C00401224A     M6    P     20180405910186774824T   IND   STRAWS, JOYCE    OCOEE       FL    34761    SILVERSEA CRUISESRESERVATIONS SUPERVISOR0519201715C00000935SA11AI_816275621217152NULLEARMARKED FOR DCCC (C00000935)                      
           4050820181544770598 │\n│ C00401224A     M6    P     20180405910186594224T   IND   STOTT, JIM       CAPE NEDDICKME    039020760NONE             NONE                   0513201735C00000935SA11AI_810479211217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544765179 │\n│ C00401224A     M6    P     20180405910186594224T   IND   STOTT, JIM       CAPE NEDDICKME    039020760NONE             NONE                   0515201735C00000935SA11AI_812092091217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544765180 │\n│ C00401224A     M6    P     20180405910186594224T   IND   STOTT, JIM       CAPE NEDDICKME    039020760NONE             NONE                   051920175C00000935SA11AI_816052231217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544765181 │\n│ C00401224A     M6    P     20180405910186594324T   IND   STOTT, JIM       CAPE NEDDICKME    039020760NONE             NONE                   0524201715C00000935SA11AI_822000221217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544765182 │\n│ C00401224A     M6    P     20180405910186594324T   IND   STOTT, JIM       CAPE NEDDICKME    03902    NOT EMPLOYED     NOT EMPLOYED           05292017100C00213512SA11AI_825898341217152NULLEARMARKED FOR NANCY PELOSI FOR CONGRESS (C00213512)            4050820181544765184 │\n│ C00401224A     M6    P     20180405910186594424T   IND   STOTT, JIM       CAPE NEDDICKME    039020760NONE             NONE                   0530201735C00000935SA11AI_826437271217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544765185 │\n│ C00401224A     M6    P     20180405910186705024T   IND   STRANGE, WINIFREDANNA MSRIA  FL    34216    NOT EMPLOYED     NOT EMPLOYED           0516201725C00000935SA11AI_813259181217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544768505 │\n│ C00401224A     M6    P     
20180405910186705124T   IND   STRANGE, WINIFREDANNA MSRIA  FL    34216    NOT EMPLOYED     NOT EMPLOYED           0523201725C00000935SA11AI_819911891217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544768506 │\n│  │\n└───────────┴────────┴────────┴────────┴────────────────────┴────────┴────────┴───────────────────┴──────────────┴────────┴───────────┴───────────────────┴─────────────────────────┴──────────┴───────┴───────────┴─────────────────┴─────────┴──────────┴─────────────────────────────────────────────────────────────────┴─────────────────────┘\n
\n```\n:::\n:::\n\n\n::: {#d2a81789 .cell execution_count=3}\n``` {.python .cell-code}\n# For a more comprehesive description of the columns and their meaning, see\n# https://www.fec.gov/campaign-finance-data/contributions-individuals-file-description/\ncolumns = {\n \"CMTE_ID\": \"keep\", # Committee ID\n \"AMNDT_IND\": \"drop\", # Amendment indicator. A = amendment, N = new, T = termination\n \"RPT_TP\": \"drop\", # Report type (monthly, quarterly, etc)\n \"TRANSACTION_PGI\": \"keep\", # Primary/general indicator\n \"IMAGE_NUM\": \"drop\", # Image number\n \"TRANSACTION_TP\": \"drop\", # Transaction type\n \"ENTITY_TP\": \"keep\", # Entity type\n \"NAME\": \"drop\", # Contributor name\n \"CITY\": \"keep\", # Contributor city\n \"STATE\": \"keep\", # Contributor state\n \"ZIP_CODE\": \"drop\", # Contributor zip code\n \"EMPLOYER\": \"drop\", # Contributor employer\n \"OCCUPATION\": \"drop\", # Contributor occupation\n \"TRANSACTION_DT\": \"keep\", # Transaction date\n \"TRANSACTION_AMT\": \"keep\", # Transaction amount\n # Other ID. For individual contributions will be null. For contributions from\n # other FEC committees, will be the committee ID of the other committee.\n \"OTHER_ID\": \"drop\",\n \"TRAN_ID\": \"drop\", # Transaction ID\n \"FILE_NUM\": \"drop\", # File number, unique number assigned to each report filed with the FEC\n \"MEMO_CD\": \"drop\", # Memo code\n \"MEMO_TEXT\": \"drop\", # Memo text\n \"SUB_ID\": \"drop\", # Submission ID. Unique number assigned to each transaction.\n}\n\nrenaming = {old: new for old, new in zip(raw.columns, columns.keys())}\nto_keep = [k for k, v in columns.items() if v == \"keep\"]\nkept = raw.relabel(renaming)[to_keep]\nkept\n```\n\n::: {.cell-output .cell-output-display execution_count=3}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓\n┃ CMTE_ID    TRANSACTION_PGI  ENTITY_TP  CITY          STATE   TRANSACTION_DT  TRANSACTION_AMT ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩\n│ stringstringstringstringstringstringint64           │\n├───────────┼─────────────────┼───────────┼──────────────┼────────┼────────────────┼─────────────────┤\n│ C00401224P              IND      OCOEE       FL    05182017      10 │\n│ C00401224P              IND      OCOEE       FL    05192017      15 │\n│ C00401224P              IND      CAPE NEDDICKME    05132017      35 │\n│ C00401224P              IND      CAPE NEDDICKME    05152017      35 │\n│ C00401224P              IND      CAPE NEDDICKME    05192017      5 │\n│ C00401224P              IND      CAPE NEDDICKME    05242017      15 │\n│ C00401224P              IND      CAPE NEDDICKME    05292017      100 │\n│ C00401224P              IND      CAPE NEDDICKME    05302017      35 │\n│ C00401224P              IND      ANNA MSRIA  FL    05162017      25 │\n│ C00401224P              IND      ANNA MSRIA  FL    05232017      25 │\n│  │\n└───────────┴─────────────────┴───────────┴──────────────┴────────┴────────────────┴─────────────────┘\n
\n```\n:::\n:::\n\n\n::: {#1e6d16fe .cell execution_count=4}\n``` {.python .cell-code}\n# 21 million rows\nkept.count()\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n\n::: {.cell-output .cell-output-display execution_count=4}\n\n::: {.ansi-escaped-output}\n```{=html}\n
21730730
\n```\n:::\n\n:::\n:::\n\n\nHuh, what's up with those timings? Previewing the head only took a fraction of a second,\nbut finding the number of rows took 10 seconds.\n\nThat's because duckdb is scanning the .csv file on the fly every time we access it.\nSo we only have to read the first few lines to get that preview,\nbut we have to read the whole file to get the number of rows.\n\nNote that this isn't a feature of Ibis, but a feature of Duckdb. This what I think is\none of the strengths of Ibis: Ibis itself doesn't have to implement any of the\noptimimizations or features of the backends. Those backends can focus on what they do\nbest, and Ibis can get those things for free.\n\nSo, let's tell duckdb to actually read in the file to its native format so later accesses\nwill be faster. This will be a ~20 seconds that we'll only have to pay once.\n\n::: {#185a2d89 .cell execution_count=5}\n``` {.python .cell-code}\nkept = kept.cache()\nkept\n```\n\n::: {.cell-output .cell-output-display execution_count=5}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓\n┃ CMTE_ID    TRANSACTION_PGI  ENTITY_TP  CITY          STATE   TRANSACTION_DT  TRANSACTION_AMT ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩\n│ stringstringstringstringstringstringint64           │\n├───────────┼─────────────────┼───────────┼──────────────┼────────┼────────────────┼─────────────────┤\n│ C00401224P              IND      OCOEE       FL    05182017      10 │\n│ C00401224P              IND      OCOEE       FL    05192017      15 │\n│ C00401224P              IND      CAPE NEDDICKME    05132017      35 │\n│ C00401224P              IND      CAPE NEDDICKME    05152017      35 │\n│ C00401224P              IND      CAPE NEDDICKME    05192017      5 │\n│ C00401224P              IND      CAPE NEDDICKME    05242017      15 │\n│ C00401224P              IND      CAPE NEDDICKME    05292017      100 │\n│ C00401224P              IND      CAPE NEDDICKME    05302017      35 │\n│ C00401224P              IND      ANNA MSRIA  FL    05162017      25 │\n│ C00401224P              IND      ANNA MSRIA  FL    05232017      25 │\n│  │\n└───────────┴─────────────────┴───────────┴──────────────┴────────┴────────────────┴─────────────────┘\n
\n```\n:::\n:::\n\n\nLook, now accessing it only takes a fraction of a second!\n\n::: {#9253e73f .cell execution_count=6}\n``` {.python .cell-code}\nkept.count()\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n\n::: {.cell-output .cell-output-display execution_count=6}\n\n::: {.ansi-escaped-output}\n```{=html}\n
21730730
\n```\n:::\n\n:::\n:::\n\n\n### Committees Data\n\nThe contributions only list an opaque `CMTE_ID` column. We want to know which actual\ncommittee this is. Let's load the committees table so we can lookup from\ncommittee ID to committee name.\n\n::: {#30076e2c .cell execution_count=7}\n``` {.python .cell-code}\ndef read_committees():\n committees_url = \"https://cg-519a459a-0ea3-42c2-b7bc-fa1143481f74.s3-us-gov-west-1.amazonaws.com/bulk-downloads/2018/committee_summary_2018.csv\"\n # This just creates a view, it doesn't actually fetch the data yet\n tmp = ibis.read_csv(committees_url)\n tmp = tmp[\"CMTE_ID\", \"CMTE_NM\"]\n # The raw table contains multiple rows for each committee id, so lets pick\n # an arbitrary row for each committee id as the representative name.\n deduped = tmp.group_by(\"CMTE_ID\").agg(CMTE_NM=_.CMTE_NM.arbitrary())\n return deduped\n\n\ncomms = read_committees().cache()\ncomms\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ CMTE_ID    CMTE_NM                                                        ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ stringstring                                                         │\n├───────────┼────────────────────────────────────────────────────────────────┤\n│ C00659441JASON ORTITAY FOR CONGRESS                                     │\n│ C00661249SERVICE AFTER SERVICE                                          │\n│ C00457754U.S. TRAVEL ASSOCIATION PAC                                    │\n│ C00577635ISAKSON VICTORY COMMITTEE                                      │\n│ C00297911TEXAS FORESTRY ASSOCIATION FORESTRY POLITICAL ACTION COMMITTEE │\n│ C00551382VOTECLIMATE.US PAC                                             │\n│ C00414318LOEBSACK FOR CONGRESS                                          │\n│ C00610709AUSTIN INNOVATION 2016                                         │\n│ C00131607FLORIDA CITRUS MUTUAL POLITCAL ACTION COMMITTEE                │\n│ C00136531NATIONAL DEMOCRATIC POLICY COMMITTEE                           │\n│                                                               │\n└───────────┴────────────────────────────────────────────────────────────────┘\n
\n```\n:::\n:::\n\n\nNow add the committee name to the contributions table:\n\n::: {#0a9f3b35 .cell execution_count=8}\n``` {.python .cell-code}\ntogether = kept.left_join(comms, \"CMTE_ID\").drop(\"CMTE_ID\", \"CMTE_ID_right\")\ntogether\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ TRANSACTION_PGI  ENTITY_TP  CITY              STATE   TRANSACTION_DT  TRANSACTION_AMT  CMTE_NM                                         ┃\n┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ stringstringstringstringstringint64string                                          │\n├─────────────────┼───────────┼──────────────────┼────────┼────────────────┼─────────────────┼─────────────────────────────────────────────────┤\n│ P              IND      COHASSET        MA    01312017      230UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      KEY LARGO       FL    01042017      5000UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      LOOKOUT MOUNTAINGA    01312017      230UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      NORTH YARMOUTH  ME    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      ALPHARETTA      GA    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      FALMOUTH        ME    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      FALMOUTH        ME    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      HOLLIS CENTER   ME    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      FALMOUTH        ME    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      ALEXANDRIA      VA    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│                                                
│\n└─────────────────┴───────────┴──────────────────┴────────┴────────────────┴─────────────────┴─────────────────────────────────────────────────┘\n
\n```\n:::\n:::\n\n\n## Cleaning\n\nFirst, let's drop any contributions that don't have a committee name. There are only 6 of them.\n\n::: {#14ae871f .cell execution_count=9}\n``` {.python .cell-code}\n# We can do this fearlessly, no .copy() needed, because\n# everything in Ibis is immutable. If we did this in pandas,\n# we might start modifying the original DataFrame accidentally!\ncleaned = together\n\nhas_name = cleaned.CMTE_NM.notnull()\ncleaned = cleaned[has_name]\nhas_name.value_counts()\n```\n\n::: {.cell-output .cell-output-display execution_count=9}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ NotNull(CMTE_NM)  NotNull(CMTE_NM)_count ┃\n┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ booleanint64                  │\n├──────────────────┼────────────────────────┤\n│ True             │               21730724 │\n│ False            │                      6 │\n└──────────────────┴────────────────────────┘\n
\n```\n:::\n:::\n\n\nLet's look at the `ENTITY_TP` column. This represents the type of entity that\nmade the contribution:\n\n::: {#72577ed8 .cell execution_count=10}\n``` {.python .cell-code}\ntogether.ENTITY_TP.value_counts()\n```\n\n::: {.cell-output .cell-output-display execution_count=10}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓\n┃ ENTITY_TP  ENTITY_TP_count ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩\n│ stringint64           │\n├───────────┼─────────────────┤\n│ IND      21687992 │\n│ CCM      698 │\n│ CAN      13659 │\n│ ORG      18555 │\n│ PTY      49 │\n│ COM      867 │\n│ PAC      3621 │\n│ NULL5289 │\n└───────────┴─────────────────┘\n
\n```\n:::\n:::\n\n\nWe only care about contributions from individuals.\n\nOnce we filter on this column, the contents of it are irrelevant, so let's drop it.\n\n::: {#f29924a2 .cell execution_count=11}\n``` {.python .cell-code}\ncleaned = together[_.ENTITY_TP == \"IND\"].drop(\"ENTITY_TP\")\n```\n:::\n\n\nIt looks like the `TRANSACTION_DT` column was a raw string like \"MMDDYYYY\",\nso let's convert that to a proper date type.\n\n::: {#15443483 .cell execution_count=12}\n``` {.python .cell-code}\nfrom ibis.expr.types import StringValue, DateValue\n\n\ndef mmddyyyy_to_date(val: StringValue) -> DateValue:\n return val.cast(str).lpad(8, \"0\").to_timestamp(\"%m%d%Y\").date()\n\n\ncleaned = cleaned.mutate(date=mmddyyyy_to_date(_.TRANSACTION_DT)).drop(\"TRANSACTION_DT\")\ncleaned\n```\n\n::: {.cell-output .cell-output-display execution_count=12}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓\n┃ TRANSACTION_PGI  CITY              STATE   TRANSACTION_AMT  CMTE_NM                                          date       ┃\n┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩\n│ stringstringstringint64stringdate       │\n├─────────────────┼──────────────────┼────────┼─────────────────┼─────────────────────────────────────────────────┼────────────┤\n│ P              COHASSET        MA    230UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              KEY LARGO       FL    5000UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-04 │\n│ P              LOOKOUT MOUNTAINGA    230UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              NORTH YARMOUTH  ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              ALPHARETTA      GA    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              FALMOUTH        ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              FALMOUTH        ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              HOLLIS CENTER   ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              FALMOUTH        ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              ALEXANDRIA      VA    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│           │\n└─────────────────┴──────────────────┴────────┴─────────────────┴─────────────────────────────────────────────────┴────────────┘\n
\n```\n:::\n:::\n\n\nThe `TRANSACTION_PGI` column represents the type (primary, general, etc) of election,\nand the year. But it seems to be not very consistent:\n\n::: {#fa016097 .cell execution_count=13}\n``` {.python .cell-code}\ncleaned.TRANSACTION_PGI.topk(10)\n```\n\n::: {.cell-output .cell-output-display execution_count=13}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ TRANSACTION_PGI  Count(TRANSACTION_PGI) ┃\n┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ stringint64                  │\n├─────────────────┼────────────────────────┤\n│ P              17013596 │\n│ G2018          2095123 │\n│ P2018          1677183 │\n│ P2020          208501 │\n│ O2018          161874 │\n│ S2017          124336 │\n│ G2017          98401 │\n│ P2022          91136 │\n│ P2017          61153 │\n│ R2017          54281 │\n└─────────────────┴────────────────────────┘\n
\n```\n:::\n:::\n\n\n::: {#35c8a393 .cell execution_count=14}\n``` {.python .cell-code}\ndef get_election_type(pgi: StringValue) -> StringValue:\n \"\"\"Use the first letter of the TRANSACTION_PGI column to determine the election type\n\n If the first letter is not one of the known election stage, then return null.\n \"\"\"\n election_types = {\n \"P\": \"primary\",\n \"G\": \"general\",\n \"O\": \"other\",\n \"C\": \"convention\",\n \"R\": \"runoff\",\n \"S\": \"special\",\n \"E\": \"recount\",\n }\n first_letter = pgi[0]\n return first_letter.substitute(election_types, else_=ibis.NA)\n\n\ncleaned = cleaned.mutate(election_type=get_election_type(_.TRANSACTION_PGI)).drop(\n \"TRANSACTION_PGI\"\n)\ncleaned\n```\n\n::: {.cell-output .cell-output-display execution_count=14}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n┃ CITY              STATE   TRANSACTION_AMT  CMTE_NM                                          date        election_type ┃\n┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n│ stringstringint64stringdatestring        │\n├──────────────────┼────────┼─────────────────┼─────────────────────────────────────────────────┼────────────┼───────────────┤\n│ COHASSET        MA    230UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary       │\n│ KEY LARGO       FL    5000UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-04primary       │\n│ LOOKOUT MOUNTAINGA    230UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary       │\n│ NORTH YARMOUTH  ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary       │\n│ ALPHARETTA      GA    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary       │\n│ FALMOUTH        ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary       │\n│ FALMOUTH        ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary       │\n│ HOLLIS CENTER   ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary       │\n│ FALMOUTH        ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary       │\n│ ALEXANDRIA      VA    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary       │\n│              │\n└──────────────────┴────────┴─────────────────┴─────────────────────────────────────────────────┴────────────┴───────────────┘\n
\n```\n:::\n:::\n\n\nThat worked well! There are 0 nulls in the resulting column, so we always were\nable to determine the election type.\n\n::: {#e7038c36 .cell execution_count=15}\n``` {.python .cell-code}\ncleaned.election_type.topk(10)\n```\n\n::: {.cell-output .cell-output-display execution_count=15}\n```{=html}\n
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓\n┃ election_type  Count(election_type) ┃\n┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩\n│ stringint64                │\n├───────────────┼──────────────────────┤\n│ primary      19061953 │\n│ general      2216685 │\n│ other        161965 │\n│ special      149572 │\n│ runoff       69637 │\n│ convention   22453 │\n│ recount      5063 │\n│ NULL0 │\n└───────────────┴──────────────────────┘\n
\n```\n:::\n:::\n\n\nAbout 1/20 of transactions are negative. These could represent refunds, or they\ncould be data entry errors. Let's drop them to keep it simple.\n\n::: {#ab64b9b2 .cell execution_count=16}\n``` {.python .cell-code}\nabove_zero = cleaned.TRANSACTION_AMT > 0\ncleaned = cleaned[above_zero]\nabove_zero.value_counts()\n```\n\n::: {.cell-output .cell-output-display execution_count=16}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ Greater(TRANSACTION_AMT, 0)  Greater(TRANSACTION_AMT, 0)_count ┃\n┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ booleanint64                             │\n├─────────────────────────────┼───────────────────────────────────┤\n│ True                        │                          20669809 │\n│ False                       │                           1018183 │\n└─────────────────────────────┴───────────────────────────────────┘\n
\n```\n:::\n:::\n\n\n## Adding Features\n\nNow that the data is cleaned up to a usable format, let's add some features.\n\nFirst, it's useful to categorize donations by size, placing them into buckets\nof small, medium, large, etc.\n\n::: {#db1e9cbe .cell execution_count=17}\n``` {.python .cell-code}\nedges = [\n 10,\n 50,\n 100,\n 500,\n 1000,\n 5000,\n]\nlabels = [\n \"<10\",\n \"10-50\",\n \"50-100\",\n \"100-500\",\n \"500-1000\",\n \"1000-5000\",\n \"5000+\",\n]\n\n\ndef bucketize(vals, edges, str_labels):\n # Uses Ibis's .bucket() method to create a categorical column\n int_labels = vals.bucket(edges, include_under=True, include_over=True)\n # Map the integer labels to the string labels\n int_to_str = {str(i): s for i, s in enumerate(str_labels)}\n return int_labels.cast(str).substitute(int_to_str)\n\n\nfeatured = cleaned.mutate(amount_bucket=bucketize(_.TRANSACTION_AMT, edges, labels))\nfeatured\n```\n\n::: {.cell-output .cell-output-display execution_count=17}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n┃ CITY              STATE   TRANSACTION_AMT  CMTE_NM                                          date        election_type  amount_bucket ┃\n┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n│ stringstringint64stringdatestringstring        │\n├──────────────────┼────────┼─────────────────┼─────────────────────────────────────────────────┼────────────┼───────────────┼───────────────┤\n│ COHASSET        MA    230UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary      100-500       │\n│ KEY LARGO       FL    5000UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-04primary      1000-5000     │\n│ LOOKOUT MOUNTAINGA    230UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary      100-500       │\n│ NORTH YARMOUTH  ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary      100-500       │\n│ ALPHARETTA      GA    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary      100-500       │\n│ FALMOUTH        ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary      100-500       │\n│ FALMOUTH        ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary      100-500       │\n│ HOLLIS CENTER   ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary      100-500       │\n│ FALMOUTH        ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary      100-500       │\n│ ALEXANDRIA      VA    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31primary      100-500       │\n│              │\n└──────────────────┴────────┴─────────────────┴─────────────────────────────────────────────────┴────────────┴───────────────┴───────────────┘\n
\n```\n:::\n:::\n\n\n## Analysis\n\n### By donation size\n\nOne thing we can look at is the donation breakdown by size:\n- Are most donations small or large?\n- Where do politicians/committees get most of their money from? Large or small donations?\n\nWe also will compare performance of Ibis vs pandas during this groupby.\n\n::: {#2c306d0f .cell execution_count=18}\n``` {.python .cell-code}\ndef summary_by(table, by):\n return table.group_by(by).agg(\n n_donations=_.count(),\n total_amount=_.TRANSACTION_AMT.sum(),\n mean_amount=_.TRANSACTION_AMT.mean(),\n median_amount=_.TRANSACTION_AMT.approx_median(),\n )\n\n\ndef summary_by_pandas(df, by):\n return df.groupby(by, as_index=False).agg(\n n_donations=(\"election_type\", \"count\"),\n total_amount=(\"TRANSACTION_AMT\", \"sum\"),\n mean_amount=(\"TRANSACTION_AMT\", \"mean\"),\n median_amount=(\"TRANSACTION_AMT\", \"median\"),\n )\n\n\n# persist the input data so the following timings of the group_by are accurate.\nsubset = featured[\"election_type\", \"amount_bucket\", \"TRANSACTION_AMT\"]\nsubset = subset.cache()\npandas_subset = subset.execute()\n```\n:::\n\n\nLet's take a look at what we are actually computing:\n\n::: {#a621ca5f .cell execution_count=19}\n``` {.python .cell-code}\nby_type_and_bucket = summary_by(subset, [\"election_type\", \"amount_bucket\"])\nby_type_and_bucket\n```\n\n::: {.cell-output .cell-output-display execution_count=19}\n```{=html}\n
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n┃ election_type  amount_bucket  n_donations  total_amount  mean_amount   median_amount ┃\n┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n│ stringstringint64int64float64int64         │\n├───────────────┼───────────────┼─────────────┼──────────────┼──────────────┼───────────────┤\n│ primary      50-100       266393315542654058.34476350 │\n│ primary      10-50        811540318766625123.12469925 │\n│ primary      100-500      3636287637353634175.275943150 │\n│ primary      <10          2423728100807214.1591805 │\n│ primary      500-1000     634677334630687527.245649500 │\n│ primary      1000-5000    68475512313948741798.2999381008 │\n│ primary      5000+        44085155837111635349.23706510000 │\n│ general      100-500      700821123174568175.757530150 │\n│ general      50-100       3043631618431253.17437450 │\n│ general      10-50        6607871441158821.80973325 │\n│  │\n└───────────────┴───────────────┴─────────────┴──────────────┴──────────────┴───────────────┘\n
\n```\n:::\n:::\n\n\nOK, now let's do our timings.\n\nOne interesting thing to pay attention to here is the execution time for the following\ngroupby. Before, we could get away with lazy execution: because we only wanted to preview\nthe first few rows, we only had to compute the first few rows, so all our previews were\nvery fast.\n\nBut now, as soon as we do a groupby, we have to actually go through the whole dataset\nin order to compute the aggregate per group. So this is going to be slower. BUT,\nduckdb is still quite fast. It only takes milliseconds to groupby-agg all 20 million rows!\n\n::: {#fc3694c3 .cell execution_count=20}\n``` {.python .cell-code}\n%timeit summary_by(subset, [\"election_type\", \"amount_bucket\"]).execute() # .execute() so we actually fetch the data\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n679 ms ± 11.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n```\n:::\n:::\n\n\nNow let's try the same thing in pandas:\n\n::: {#ab990661 .cell execution_count=21}\n``` {.python .cell-code}\n%timeit summary_by_pandas(pandas_subset, [\"election_type\", \"amount_bucket\"])\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n3.59 s ± 31.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n```\n:::\n:::\n\n\nIt takes about 4 seconds, which is about 10 times slower than duckdb.\nAt this scale, it again doesn't matter,\nbut you could imagine with a dataset much larger than this, it would matter.\n\nLet's also think about memory usage:\n\n::: {#03834f0b .cell execution_count=22}\n``` {.python .cell-code}\npandas_subset.memory_usage(deep=True).sum() / 1e9 # GB\n```\n\n::: {.cell-output .cell-output-display execution_count=22}\n```\n2.782586663\n```\n:::\n:::\n\n\nThe source dataframe is couple gigabytes, so probably during the groupby,\nthe peak memory usage is going to be a bit higher than this. 
You could use a profiler\nsuch as [FIL](https://github.com/pythonspeed/filprofiler) if you wanted an exact number,\nI was too lazy to use that here.\n\nAgain, this works on my laptop at this dataset size, but much larger than this and I'd\nstart having problems. Duckdb on the other hand is designed around working out of core\nso it should scale to datasets into the hundreds of gigabytes, much larger than your\ncomputer's RAM.\n\n### Back to analysis\n\nOK, let's plot the result of that groupby.\n\nSurprise! (Or maybe not...) Most donations are small. But most of the money comes\nfrom donations larger than $1000.\n\nWell if that's the case, why do politicians spend so much time soliciting small\ndonations? One explanation is that they can use the number of donations\nas a marketing pitch, to show how popular they are, and thus how viable of a\ncandidate they are.\n\nThis also might explain whose interests are being served by our politicians.\n\n::: {#cf2c035e .cell execution_count=23}\n``` {.python .cell-code}\nimport altair as alt\n\n# Do some bookkeeping so the buckets are displayed smallest to largest on the charts\nbucket_col = alt.Column(\"amount_bucket:N\", sort=labels)\n\nn_by_bucket = (\n alt.Chart(by_type_and_bucket.execute())\n .mark_bar()\n .encode(\n x=bucket_col,\n y=\"n_donations:Q\",\n color=\"election_type:N\",\n )\n)\ntotal_by_bucket = (\n alt.Chart(by_type_and_bucket.execute())\n .mark_bar()\n .encode(\n x=bucket_col,\n y=\"total_amount:Q\",\n color=\"election_type:N\",\n )\n)\nn_by_bucket | total_by_bucket\n```\n\n::: {.cell-output .cell-output-display execution_count=23}\n```{=html}\n\n\n
\n\n```\n:::\n:::\n\n\n### By election stage\n\nLet's look at how donations break down by election stage. Do people donate\ndifferently for primary elections vs general elections?\n\nLet's ignore everything but primary and general elections, since they are the\nmost common, and arguably the most important.\n\n::: {#92651642 .cell execution_count=24}\n``` {.python .cell-code}\ngb2 = by_type_and_bucket[_.election_type.isin((\"primary\", \"general\"))]\nn_donations_per_election_type = _.n_donations.sum().over(group_by=\"election_type\")\nfrac = _.n_donations / n_donations_per_election_type\ngb2 = gb2.mutate(frac_n_donations_per_election_type=frac)\ngb2\n```\n\n::: {.cell-output .cell-output-display execution_count=24}\n```{=html}\n
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ election_type  amount_bucket  n_donations  total_amount  mean_amount   median_amount  frac_n_donations_per_election_type ┃\n┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ stringstringint64int64float64int64float64                            │\n├───────────────┼───────────────┼─────────────┼──────────────┼──────────────┼───────────────┼────────────────────────────────────┤\n│ primary      10-50        811540318766625123.124699250.445831 │\n│ primary      <10          2423728100807214.15918050.133151 │\n│ primary      100-500      3636287637353634175.2759431500.199765 │\n│ primary      50-100       266393315542654058.344763500.146347 │\n│ primary      500-1000     634677334630687527.2456495000.034867 │\n│ primary      1000-5000    68475512313948741798.29993810080.037618 │\n│ primary      5000+        44085155837111635349.237065100000.002422 │\n│ general      50-100       3043631618431253.174374500.138017 │\n│ general      100-500      700821123174568175.7575301500.317796 │\n│ general      500-1000     17418291015697522.5321625000.078985 │\n│  │\n└───────────────┴───────────────┴─────────────┴──────────────┴──────────────┴───────────────┴────────────────────────────────────┘\n
\n```\n:::\n:::\n\n\nIt looks like primary elections get a larger proportion of small donations.\n\n::: {#fd42d9bf .cell execution_count=25}\n``` {.python .cell-code}\nalt.Chart(gb2.execute()).mark_bar().encode(\n x=\"election_type:O\",\n y=\"frac_n_donations_per_election_type:Q\",\n color=bucket_col,\n)\n```\n\n::: {.cell-output .cell-output-display execution_count=25}\n```{=html}\n\n\n
\n\n```\n:::\n:::\n\n\n### By recipient\n\nLet's look at the top players. Who gets the most donations?\n\nFar and away it is ActBlue, which acts as a conduit for donations to Democratic\ninterests.\n\nBeto O'Rourke is the top individual politician, hats off to him!\n\n::: {#e844f42e .cell execution_count=26}\n``` {.python .cell-code}\nby_recip = summary_by(featured, \"CMTE_NM\")\nby_recip\n```\n\n::: {.cell-output .cell-output-display execution_count=26}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n┃ CMTE_NM                                                                           n_donations  total_amount  mean_amount  median_amount ┃\n┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n│ stringint64int64float64int64         │\n├──────────────────────────────────────────────────────────────────────────────────┼─────────────┼──────────────┼─────────────┼───────────────┤\n│ EXELON CORPORATION POLITICAL ACTION COMMITTEE (EXELON PAC)                      132501939503146.377585118 │\n│ ARCHER DANIELS MIDLAND COMPANY-ADM PAC                                          446027580761.84013525 │\n│ PFIZER INC. PAC                                                                 46900194868941.54987220 │\n│ SUEZ WATER INC. FEDERAL PAC                                                     10816873156.231481120 │\n│ INTERNATIONAL WAREHOUSE LOGISTICS ASSOCIATION PAC                               901322001468.8888891000 │\n│ BAKERY, CONFECTIONERY, TOBACCO WORKERS AND GRAIN MILLERS INTERNATIONAL UNION PAC3871909149.33074930 │\n│ UNION PACIFIC CORP. FUND FOR EFFECTIVE GOVERNMENT                               161182436963151.195123114 │\n│ NATIONAL ASSOCIATION OF REALTORS POLITICAL ACTION COMMITTEE                     242775492063226.224945154 │\n│ AMERICAN FINANCIAL SERVICES ASSOCIATION PAC                                     690685839993.96956565 │\n│ WEYERHAEUSER COMPANY POLITICAL ACTION COMMITTEE                                 551234324462.27213430 │\n│  │\n└──────────────────────────────────────────────────────────────────────────────────┴─────────────┴──────────────┴─────────────┴───────────────┘\n
\n```\n:::\n:::\n\n\n::: {#a0c1efd8 .cell execution_count=27}\n``` {.python .cell-code}\ntop_recip = by_recip.order_by(ibis.desc(\"n_donations\")).head(10)\nalt.Chart(top_recip.execute()).mark_bar().encode(\n x=alt.X(\"CMTE_NM:O\", sort=\"-y\"),\n y=\"n_donations:Q\",\n)\n```\n\n::: {.cell-output .cell-output-display execution_count=27}\n```{=html}\n\n\n
\n\n```\n:::\n:::\n\n\n### By Location\n\nWhere are the largest donations coming from?\n\n::: {#3348eca1 .cell execution_count=28}\n``` {.python .cell-code}\nf2 = featured.mutate(loc=_.CITY + \", \" + _.STATE).drop(\"CITY\", \"STATE\")\nby_loc = summary_by(f2, \"loc\")\n# Drop the places with a small number of donations so we're\n# resistant to outliers for the mean\nby_loc = by_loc[_.n_donations > 1000]\nby_loc\n```\n\n::: {.cell-output .cell-output-display execution_count=28}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n┃ loc               n_donations  total_amount  mean_amount  median_amount ┃\n┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n│ stringint64int64float64int64         │\n├──────────────────┼─────────────┼──────────────┼─────────────┼───────────────┤\n│ DALLAS, TX      15403866558403432.09080258 │\n│ PHILADELPHIA, PA22293836054977161.72647662 │\n│ MALIBU, CA      116994934763421.81066850 │\n│ SANTEE, CA      245420127482.01874526 │\n│ WINNETKA, IL    85895621809654.535918172 │\n│ OREM, UT        2110837475396.90758350 │\n│ MESA, AZ        22128185663683.90437520 │\n│ WAYZATA, MN     64883326275512.681104117 │\n│ MINNETONKA, MN  57091187881208.07164150 │\n│ OJAI, CA        4496926422206.05471525 │\n│  │\n└──────────────────┴─────────────┴──────────────┴─────────────┴───────────────┘\n
\n```\n:::\n:::\n\n\n::: {#95c93760 .cell execution_count=29}\n``` {.python .cell-code}\ndef top_by(col):\n top = by_loc.order_by(ibis.desc(col)).head(10)\n return (\n alt.Chart(top.execute())\n .mark_bar()\n .encode(\n x=alt.X('loc:O', sort=\"-y\"),\n y=col,\n )\n )\n\n\ntop_by(\"n_donations\") | top_by(\"total_amount\") | top_by(\"mean_amount\") | top_by(\n \"median_amount\"\n)\n```\n\n::: {.cell-output .cell-output-display execution_count=29}\n```{=html}\n\n\n
\n\n```\n:::\n:::\n\n\n### By month\n\nWhen do the donations come in?\n\n::: {#6d0776d2 .cell execution_count=30}\n``` {.python .cell-code}\nby_month = summary_by(featured, _.date.month().name(\"month_int\"))\n# Sorta hacky, .substritute doesn't work to change dtypes (yet?)\n# so we cast to string and then do our mapping\nmonth_map = {\n \"1\": \"Jan\",\n \"2\": \"Feb\",\n \"3\": \"Mar\",\n \"4\": \"Apr\",\n \"5\": \"May\",\n \"6\": \"Jun\",\n \"7\": \"Jul\",\n \"8\": \"Aug\",\n \"9\": \"Sep\",\n \"10\": \"Oct\",\n \"11\": \"Nov\",\n \"12\": \"Dec\",\n}\nby_month = by_month.mutate(month_str=_.month_int.cast(str).substitute(month_map))\nby_month\n```\n\n::: {.cell-output .cell-output-display execution_count=30}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓\n┃ month_int  n_donations  total_amount  mean_amount  median_amount  month_str ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩\n│ int32int64int64float64int64string    │\n├───────────┼─────────────┼──────────────┼─────────────┼───────────────┼───────────┤\n│      NULL1514250297165.32166499NULL      │\n│         1348979174837854500.998209122Jan       │\n│         2581646255997655440.126219100Feb       │\n│         31042577430906797413.30932681Mar       │\n│         41088244299252692274.98676050Apr       │\n│         51374247387317192281.83957648May       │\n│         61667285465305247279.07961044Jun       │\n│         71607053320528605199.45117235Jul       │\n│         82023466473544182234.02626135Aug       │\n│         92583847697888624270.09672938Sep       │\n│                  │\n└───────────┴─────────────┴──────────────┴─────────────┴───────────────┴───────────┘\n
\n```\n:::\n:::\n\n\n::: {#a2b27c61 .cell execution_count=31}\n``` {.python .cell-code}\nmonths_in_order = list(month_map.values())\nalt.Chart(by_month.execute()).mark_bar().encode(\n x=alt.X(\"month_str:O\", sort=months_in_order),\n y=\"n_donations:Q\",\n)\n```\n\n::: {.cell-output .cell-output-display execution_count=31}\n```{=html}\n\n\n
\n\n```\n:::\n:::\n\n\n## Conclusion\n\nThanks for following along! I hope you've learned something about Ibis, and\nmaybe even about campaign finance.\n\nIbis is a great tool for exploring data. I now find myself reaching for it\nwhen in the past I would have reached for pandas.\n\nSome of the highlights for me:\n\n- Fast, lazy execution, a great display format, and good type hinting/editor support for a great REPL experience.\n- Very well thought-out API and semantics (e.g. `isinstance(val, NumericValue)`?? That's beautiful!)\n- Fast and fairly complete string support, since I work with a lot of text data.\n- Extremely responsive maintainers. Sometimes I've submitted multiple feature requests and bug reports in a single day, and a PR has been merged by the next day.\n- Escape hatch to SQL. I didn't have to use that here, but if something isn't supported, you can always fall back to SQL.\n\nCheck out [The Ibis Website](https://ibis-project.org/) for more information.\n\n", + "engine": "jupyter", + "markdown": "---\ntitle: \"Exploring campaign finance data\"\nauthor: \"Nick Crews\"\ndate: \"2023-03-24\"\ncategories:\n - blog\n - data engineering\n - case study\n - duckdb\n - performance\n---\n\nHi! My name is [Nick Crews](https://www.linkedin.com/in/nicholas-b-crews/),\nand I'm a data engineer that looks at public campaign finance data.\n\nIn this post, I'll walk through how I use Ibis to explore public campaign contribution\ndata from the Federal Election Commission (FEC). We'll do some loading,\ncleaning, featurizing, and visualization. 
There will be filtering, sorting, grouping,\nand aggregation.\n\n## Downloading The Data\n\n::: {#e29f35c8 .cell execution_count=1}\n``` {.python .cell-code}\nfrom pathlib import Path\nfrom zipfile import ZipFile\nfrom urllib.request import urlretrieve\n\n# Download and unzip the 2018 individual contributions data\nurl = \"https://cg-519a459a-0ea3-42c2-b7bc-fa1143481f74.s3-us-gov-west-1.amazonaws.com/bulk-downloads/2018/indiv18.zip\"\nzip_path = Path(\"indiv18.zip\")\ncsv_path = Path(\"indiv18.csv\")\n\nif not zip_path.exists():\n urlretrieve(url, zip_path)\n\nif not csv_path.exists():\n with ZipFile(zip_path) as zip_file, csv_path.open(\"w\") as csv_file:\n for line in zip_file.open(\"itcont.txt\"):\n csv_file.write(line.decode())\n```\n:::\n\n\n## Loading the data\n\nNow that we have our raw data in a .csv format, let's load it into Ibis,\nusing the duckdb backend.\n\nNote that a 4.3 GB .csv would be near the limit of what pandas could\nhandle on my laptop with 16GB of RAM. In pandas, typically every time\nyou perform a transformation on the data, a copy of the data is made.\nI could only do a few transformations before I ran out of memory.\n\nWith Ibis, this problem is solved in two different ways.\n\nFirst, because they are designed to work with very large datasets,\nmany (all?) SQL backends support out of core operations.\nThe data lives on disk, and are only loaded in a streaming fashion\nwhen needed, and then written back to disk as the operation is performed.\n\nSecond, unless you explicitly ask for it, Ibis makes use of lazy\nevaluation. This means that when you ask for a result, the\nresult is not persisted in memory. Only the original source\ndata is persisted. 
Everything else is derived from this on the fly.\n\n::: {#0a6991f4 .cell execution_count=2}\n``` {.python .cell-code}\nimport ibis\nfrom ibis import _\n\nibis.options.interactive = True\n\n# The raw .csv file doesn't have column names, so we will add them in the next step.\nraw = ibis.read_csv(csv_path)\nraw\n```\n\n::: {.cell-output .cell-output-display execution_count=16}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓\n┃ C00401224  A       M6      P       201804059101866001  24T     IND     STOUFFER, LEIGH    AMSTELVEEN    ZZ      1187RC     MYSELF             SELF EMPLOYED            05172017  10     C00458000  SA11AI_81445687  1217152  column18  EARMARKED FOR PROGRESSIVE CHANGE CAMPAIGN COMMITTEE (C00458000)  4050820181544765358 ┃\n┡━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━┩\n│ stringstringstringstringint64stringstringstringstringstringstringstringstringstringint64stringstringint64stringstringint64               │\n├───────────┼────────┼────────┼────────┼────────────────────┼────────┼────────┼───────────────────┼──────────────┼────────┼───────────┼───────────────────┼─────────────────────────┼──────────┼───────┼───────────┼─────────────────┼─────────┼──────────┼─────────────────────────────────────────────────────────────────┼─────────────────────┤\n│ C00401224A     M6    P     20180405910186774824T   IND   STRAWS, JOYCE    OCOEE       FL    34761    SILVERSEA CRUISESRESERVATIONS SUPERVISOR0518201710C00000935SA11AI_815923361217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544770597 │\n│ C00401224A     M6    P     20180405910186774824T   IND   STRAWS, JOYCE    OCOEE       FL    34761    SILVERSEA CRUISESRESERVATIONS SUPERVISOR0519201715C00000935SA11AI_816275621217152NULLEARMARKED FOR DCCC (C00000935)                      
           4050820181544770598 │\n│ C00401224A     M6    P     20180405910186594224T   IND   STOTT, JIM       CAPE NEDDICKME    039020760NONE             NONE                   0513201735C00000935SA11AI_810479211217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544765179 │\n│ C00401224A     M6    P     20180405910186594224T   IND   STOTT, JIM       CAPE NEDDICKME    039020760NONE             NONE                   0515201735C00000935SA11AI_812092091217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544765180 │\n│ C00401224A     M6    P     20180405910186594224T   IND   STOTT, JIM       CAPE NEDDICKME    039020760NONE             NONE                   051920175C00000935SA11AI_816052231217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544765181 │\n│ C00401224A     M6    P     20180405910186594324T   IND   STOTT, JIM       CAPE NEDDICKME    039020760NONE             NONE                   0524201715C00000935SA11AI_822000221217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544765182 │\n│ C00401224A     M6    P     20180405910186594324T   IND   STOTT, JIM       CAPE NEDDICKME    03902    NOT EMPLOYED     NOT EMPLOYED           05292017100C00213512SA11AI_825898341217152NULLEARMARKED FOR NANCY PELOSI FOR CONGRESS (C00213512)            4050820181544765184 │\n│ C00401224A     M6    P     20180405910186594424T   IND   STOTT, JIM       CAPE NEDDICKME    039020760NONE             NONE                   0530201735C00000935SA11AI_826437271217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544765185 │\n│ C00401224A     M6    P     20180405910186705024T   IND   STRANGE, WINIFREDANNA MSRIA  FL    34216    NOT EMPLOYED     NOT EMPLOYED           0516201725C00000935SA11AI_813259181217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544768505 │\n│ C00401224A     M6    P     
20180405910186705124T   IND   STRANGE, WINIFREDANNA MSRIA  FL    34216    NOT EMPLOYED     NOT EMPLOYED           0523201725C00000935SA11AI_819911891217152NULLEARMARKED FOR DCCC (C00000935)                                 4050820181544768506 │\n│  │\n└───────────┴────────┴────────┴────────┴────────────────────┴────────┴────────┴───────────────────┴──────────────┴────────┴───────────┴───────────────────┴─────────────────────────┴──────────┴───────┴───────────┴─────────────────┴─────────┴──────────┴─────────────────────────────────────────────────────────────────┴─────────────────────┘\n
\n```\n:::\n:::\n\n\n::: {#ebb6e702 .cell execution_count=3}\n``` {.python .cell-code}\n# For a more comprehesive description of the columns and their meaning, see\n# https://www.fec.gov/campaign-finance-data/contributions-individuals-file-description/\ncolumns = {\n \"CMTE_ID\": \"keep\", # Committee ID\n \"AMNDT_IND\": \"drop\", # Amendment indicator. A = amendment, N = new, T = termination\n \"RPT_TP\": \"drop\", # Report type (monthly, quarterly, etc)\n \"TRANSACTION_PGI\": \"keep\", # Primary/general indicator\n \"IMAGE_NUM\": \"drop\", # Image number\n \"TRANSACTION_TP\": \"drop\", # Transaction type\n \"ENTITY_TP\": \"keep\", # Entity type\n \"NAME\": \"drop\", # Contributor name\n \"CITY\": \"keep\", # Contributor city\n \"STATE\": \"keep\", # Contributor state\n \"ZIP_CODE\": \"drop\", # Contributor zip code\n \"EMPLOYER\": \"drop\", # Contributor employer\n \"OCCUPATION\": \"drop\", # Contributor occupation\n \"TRANSACTION_DT\": \"keep\", # Transaction date\n \"TRANSACTION_AMT\": \"keep\", # Transaction amount\n # Other ID. For individual contributions will be null. For contributions from\n # other FEC committees, will be the committee ID of the other committee.\n \"OTHER_ID\": \"drop\",\n \"TRAN_ID\": \"drop\", # Transaction ID\n \"FILE_NUM\": \"drop\", # File number, unique number assigned to each report filed with the FEC\n \"MEMO_CD\": \"drop\", # Memo code\n \"MEMO_TEXT\": \"drop\", # Memo text\n \"SUB_ID\": \"drop\", # Submission ID. Unique number assigned to each transaction.\n}\n\nrenaming = {old: new for old, new in zip(raw.columns, columns.keys())}\nto_keep = [k for k, v in columns.items() if v == \"keep\"]\nkept = raw.relabel(renaming)[to_keep]\nkept\n```\n\n::: {.cell-output .cell-output-display execution_count=17}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓\n┃ CMTE_ID    TRANSACTION_PGI  ENTITY_TP  CITY          STATE   TRANSACTION_DT  TRANSACTION_AMT ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩\n│ stringstringstringstringstringstringint64           │\n├───────────┼─────────────────┼───────────┼──────────────┼────────┼────────────────┼─────────────────┤\n│ C00401224P              IND      OCOEE       FL    05182017      10 │\n│ C00401224P              IND      OCOEE       FL    05192017      15 │\n│ C00401224P              IND      CAPE NEDDICKME    05132017      35 │\n│ C00401224P              IND      CAPE NEDDICKME    05152017      35 │\n│ C00401224P              IND      CAPE NEDDICKME    05192017      5 │\n│ C00401224P              IND      CAPE NEDDICKME    05242017      15 │\n│ C00401224P              IND      CAPE NEDDICKME    05292017      100 │\n│ C00401224P              IND      CAPE NEDDICKME    05302017      35 │\n│ C00401224P              IND      ANNA MSRIA  FL    05162017      25 │\n│ C00401224P              IND      ANNA MSRIA  FL    05232017      25 │\n│  │\n└───────────┴─────────────────┴───────────┴──────────────┴────────┴────────────────┴─────────────────┘\n
\n```\n:::\n:::\n\n\n::: {#3f4ad522 .cell execution_count=4}\n``` {.python .cell-code}\n# 21 million rows\nkept.count()\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n\n::: {.cell-output .cell-output-display execution_count=18}\n\n::: {.ansi-escaped-output}\n```{=html}\n
┌──────────┐\n│ 21730730 │\n└──────────┘
\n```\n:::\n\n:::\n:::\n\n\nHuh, what's up with those timings? Previewing the head only took a fraction of a second,\nbut finding the number of rows took 10 seconds.\n\nThat's because duckdb is scanning the .csv file on the fly every time we access it.\nSo we only have to read the first few lines to get that preview,\nbut we have to read the whole file to get the number of rows.\n\nNote that this isn't a feature of Ibis, but a feature of Duckdb. This what I think is\none of the strengths of Ibis: Ibis itself doesn't have to implement any of the\noptimimizations or features of the backends. Those backends can focus on what they do\nbest, and Ibis can get those things for free.\n\nSo, let's tell duckdb to actually read in the file to its native format so later accesses\nwill be faster. This will be a ~20 seconds that we'll only have to pay once.\n\n::: {#c45e7319 .cell execution_count=5}\n``` {.python .cell-code}\nkept = kept.cache()\nkept\n```\n\n::: {.cell-output .cell-output-display execution_count=19}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓\n┃ CMTE_ID    TRANSACTION_PGI  ENTITY_TP  CITY          STATE   TRANSACTION_DT  TRANSACTION_AMT ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩\n│ stringstringstringstringstringstringint64           │\n├───────────┼─────────────────┼───────────┼──────────────┼────────┼────────────────┼─────────────────┤\n│ C00401224P              IND      OCOEE       FL    05182017      10 │\n│ C00401224P              IND      OCOEE       FL    05192017      15 │\n│ C00401224P              IND      CAPE NEDDICKME    05132017      35 │\n│ C00401224P              IND      CAPE NEDDICKME    05152017      35 │\n│ C00401224P              IND      CAPE NEDDICKME    05192017      5 │\n│ C00401224P              IND      CAPE NEDDICKME    05242017      15 │\n│ C00401224P              IND      CAPE NEDDICKME    05292017      100 │\n│ C00401224P              IND      CAPE NEDDICKME    05302017      35 │\n│ C00401224P              IND      ANNA MSRIA  FL    05162017      25 │\n│ C00401224P              IND      ANNA MSRIA  FL    05232017      25 │\n│  │\n└───────────┴─────────────────┴───────────┴──────────────┴────────┴────────────────┴─────────────────┘\n
\n```\n:::\n:::\n\n\nLook, now accessing it only takes a fraction of a second!\n\n::: {#881326dd .cell execution_count=6}\n``` {.python .cell-code}\nkept.count()\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n\n::: {.cell-output .cell-output-display execution_count=20}\n\n::: {.ansi-escaped-output}\n```{=html}\n
┌──────────┐\n│ 21730730 │\n└──────────┘
\n```\n:::\n\n:::\n:::\n\n\n### Committees Data\n\nThe contributions only list an opaque `CMTE_ID` column. We want to know which actual\ncommittee this is. Let's load the committees table so we can lookup from\ncommittee ID to committee name.\n\n::: {#ae8760f6 .cell execution_count=7}\n``` {.python .cell-code}\ndef read_committees():\n committees_url = \"https://cg-519a459a-0ea3-42c2-b7bc-fa1143481f74.s3-us-gov-west-1.amazonaws.com/bulk-downloads/2018/committee_summary_2018.csv\"\n # This just creates a view, it doesn't actually fetch the data yet\n tmp = ibis.read_csv(committees_url)\n tmp = tmp[\"CMTE_ID\", \"CMTE_NM\"]\n # The raw table contains multiple rows for each committee id, so lets pick\n # an arbitrary row for each committee id as the representative name.\n deduped = tmp.group_by(\"CMTE_ID\").agg(CMTE_NM=_.CMTE_NM.arbitrary())\n return deduped\n\n\ncomms = read_committees().cache()\ncomms\n```\n\n::: {.cell-output .cell-output-display execution_count=21}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ CMTE_ID    CMTE_NM                                                        ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ stringstring                                                         │\n├───────────┼────────────────────────────────────────────────────────────────┤\n│ C00659441JASON ORTITAY FOR CONGRESS                                     │\n│ C00297911TEXAS FORESTRY ASSOCIATION FORESTRY POLITICAL ACTION COMMITTEE │\n│ C00340745WADDELL & REED FINANCIAL, INC. POLITICAL ACTION COMMITTEE      │\n│ C00679217CANTWELL-WARREN VICTORY FUND                                   │\n│ C00101204NATIONAL FISHERIES INSTITUTE (FISHPAC)                         │\n│ C00010520MEREDITH CORPORATION EMPLOYEES FUND FOR BETTER GOVERNMENT      │\n│ C00532788LAFAYETTE COUNTY DEMOCRATIC PARTY                              │\n│ C00128561TOLL BROS. INC. PAC                                            │\n│ C00510958WENDYROGERS.ORG                                                │\n│ C00665604COMMITTEE TO ELECT BILL EBBEN                                  │\n│                                                               │\n└───────────┴────────────────────────────────────────────────────────────────┘\n
\n```\n:::\n:::\n\n\nNow add the committee name to the contributions table:\n\n::: {#8fe204d4 .cell execution_count=8}\n``` {.python .cell-code}\ntogether = kept.left_join(comms, \"CMTE_ID\").drop(\"CMTE_ID\", \"CMTE_ID_right\")\ntogether\n```\n\n::: {.cell-output .cell-output-display execution_count=22}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ TRANSACTION_PGI  ENTITY_TP  CITY              STATE   TRANSACTION_DT  TRANSACTION_AMT  CMTE_NM                                         ┃\n┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ stringstringstringstringstringint64string                                          │\n├─────────────────┼───────────┼──────────────────┼────────┼────────────────┼─────────────────┼─────────────────────────────────────────────────┤\n│ P              IND      COHASSET        MA    01312017      230UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      KEY LARGO       FL    01042017      5000UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      LOOKOUT MOUNTAINGA    01312017      230UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      NORTH YARMOUTH  ME    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      ALPHARETTA      GA    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      FALMOUTH        ME    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      FALMOUTH        ME    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      HOLLIS CENTER   ME    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      FALMOUTH        ME    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│ P              IND      ALEXANDRIA      VA    01312017      384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC) │\n│                                                
│\n└─────────────────┴───────────┴──────────────────┴────────┴────────────────┴─────────────────┴─────────────────────────────────────────────────┘\n
\n```\n:::\n:::\n\n\n## Cleaning\n\nFirst, let's drop any contributions that don't have a committee name. There are only 6 of them.\n\n::: {#215670b2 .cell execution_count=9}\n``` {.python .cell-code}\n# We can do this fearlessly, no .copy() needed, because\n# everything in Ibis is immutable. If we did this in pandas,\n# we might start modifying the original DataFrame accidentally!\ncleaned = together\n\nhas_name = cleaned.CMTE_NM.notnull()\ncleaned = cleaned[has_name]\nhas_name.value_counts()\n```\n\n::: {.cell-output .cell-output-display execution_count=23}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ NotNull(CMTE_NM)  NotNull(CMTE_NM)_count ┃\n┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ booleanint64                  │\n├──────────────────┼────────────────────────┤\n│ True             │               21730724 │\n│ False            │                      6 │\n└──────────────────┴────────────────────────┘\n
\n```\n:::\n:::\n\n\nLet's look at the `ENTITY_TP` column. This represents the type of entity that\nmade the contribution:\n\n::: {#8e39507b .cell execution_count=10}\n``` {.python .cell-code}\ntogether.ENTITY_TP.value_counts()\n```\n\n::: {.cell-output .cell-output-display execution_count=24}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓\n┃ ENTITY_TP  ENTITY_TP_count ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩\n│ stringint64           │\n├───────────┼─────────────────┤\n│ NULL5289 │\n│ CAN      13659 │\n│ COM      867 │\n│ IND      21687992 │\n│ ORG      18555 │\n│ PAC      3621 │\n│ PTY      49 │\n│ CCM      698 │\n└───────────┴─────────────────┘\n
\n```\n:::\n:::\n\n\nWe only care about contributions from individuals.\n\nOnce we filter on this column, the contents of it are irrelevant, so let's drop it.\n\n::: {#e1453e27 .cell execution_count=11}\n``` {.python .cell-code}\ncleaned = together[_.ENTITY_TP == \"IND\"].drop(\"ENTITY_TP\")\n```\n:::\n\n\nIt looks like the `TRANSACTION_DT` column was a raw string like \"MMDDYYYY\",\nso let's convert that to a proper date type.\n\n::: {#bf3dadc7 .cell execution_count=12}\n``` {.python .cell-code}\nfrom ibis.expr.types import StringValue, DateValue\n\n\ndef mmddyyyy_to_date(val: StringValue) -> DateValue:\n return val.cast(str).lpad(8, \"0\").to_timestamp(\"%m%d%Y\").date()\n\n\ncleaned = cleaned.mutate(date=mmddyyyy_to_date(_.TRANSACTION_DT)).drop(\"TRANSACTION_DT\")\ncleaned\n```\n\n::: {.cell-output .cell-output-display execution_count=26}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓\n┃ TRANSACTION_PGI  CITY              STATE   TRANSACTION_AMT  CMTE_NM                                          date       ┃\n┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩\n│ stringstringstringint64stringdate       │\n├─────────────────┼──────────────────┼────────┼─────────────────┼─────────────────────────────────────────────────┼────────────┤\n│ P              COHASSET        MA    230UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              KEY LARGO       FL    5000UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-04 │\n│ P              LOOKOUT MOUNTAINGA    230UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              NORTH YARMOUTH  ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              ALPHARETTA      GA    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              FALMOUTH        ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              FALMOUTH        ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              HOLLIS CENTER   ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              FALMOUTH        ME    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│ P              ALEXANDRIA      VA    384UNUM GROUP POLITICAL ACTION COMMITTEE (UNUMPAC)2017-01-31 │\n│           │\n└─────────────────┴──────────────────┴────────┴─────────────────┴─────────────────────────────────────────────────┴────────────┘\n
\n```\n:::\n:::\n\n\nThe `TRANSACTION_PGI` column represents the type (primary, general, etc) of election,\nand the year. But it seems to be not very consistent:\n\n::: {#6cb98e2b .cell execution_count=13}\n``` {.python .cell-code}\ncleaned.TRANSACTION_PGI.topk(10)\n```\n\n::: {.cell-output .cell-output-display execution_count=27}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓\n┃ TRANSACTION_PGI  CountStar() ┃\n┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩\n│ stringint64       │\n├─────────────────┼─────────────┤\n│ P              17013596 │\n│ G2018          2095123 │\n│ P2018          1677183 │\n│ P2020          208501 │\n│ O2018          161874 │\n│ S2017          124336 │\n│ G2017          98401 │\n│ P2022          91136 │\n│ P2017          61153 │\n│ R2017          54281 │\n└─────────────────┴─────────────┘\n
\n```\n:::\n:::\n\n\n::: {#463caa6b .cell execution_count=14}\n``` {.python .cell-code}\ndef get_election_type(pgi: StringValue) -> StringValue:\n \"\"\"Use the first letter of the TRANSACTION_PGI column to determine the election type\n\n If the first letter is not one of the known election stage, then return null.\n \"\"\"\n election_types = {\n \"P\": \"primary\",\n \"G\": \"general\",\n \"O\": \"other\",\n \"C\": \"convention\",\n \"R\": \"runoff\",\n \"S\": \"special\",\n \"E\": \"recount\",\n }\n first_letter = pgi[0]\n return first_letter.substitute(election_types, else_=ibis.null())\n\n\ncleaned = cleaned.mutate(election_type=get_election_type(_.TRANSACTION_PGI)).drop(\n \"TRANSACTION_PGI\"\n)\ncleaned\n```\n\n::: {.cell-output .cell-output-display execution_count=28}\n```{=html}\n
┏━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n┃ CITY        STATE   TRANSACTION_AMT  CMTE_NM                    date        election_type ┃\n┡━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n│ stringstringint64stringdatestring        │\n├────────────┼────────┼─────────────────┼───────────────────────────┼────────────┼───────────────┤\n│ ATLANTA   GA    15NANCY PELOSI FOR CONGRESS2017-06-20primary       │\n│ AUSTIN    TX    15NANCY PELOSI FOR CONGRESS2017-06-04primary       │\n│ WASHINGTONDC    25NANCY PELOSI FOR CONGRESS2017-06-23primary       │\n│ HONOLULU  HI    10NANCY PELOSI FOR CONGRESS2017-04-20primary       │\n│ MAMARONECKNY    110NANCY PELOSI FOR CONGRESS2017-06-02primary       │\n│ REHOBOTH  MA    10NANCY PELOSI FOR CONGRESS2017-06-01primary       │\n│ BERKELEY  CA    25NANCY PELOSI FOR CONGRESS2017-06-05primary       │\n│ BEAUMONT  TX    25NANCY PELOSI FOR CONGRESS2017-04-12primary       │\n│ CONCORD   MA    200NANCY PELOSI FOR CONGRESS2017-05-05primary       │\n│ OXNARD    CA    15NANCY PELOSI FOR CONGRESS2017-03-31primary       │\n│              │\n└────────────┴────────┴─────────────────┴───────────────────────────┴────────────┴───────────────┘\n
\n```\n:::\n:::\n\n\nThat worked well! There are only 664 nulls in the resulting column (out of roughly 21.7 million rows), so we almost always were\nable to determine the election type.\n\n::: {#ead49c9e .cell execution_count=15}\n``` {.python .cell-code}\ncleaned.election_type.topk(10)\n```\n\n::: {.cell-output .cell-output-display execution_count=29}\n```{=html}\n
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓\n┃ election_type  CountStar() ┃\n┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩\n│ stringint64       │\n├───────────────┼─────────────┤\n│ primary      19061953 │\n│ general      2216685 │\n│ other        161965 │\n│ special      149572 │\n│ runoff       69637 │\n│ convention   22453 │\n│ recount      5063 │\n│ NULL664 │\n└───────────────┴─────────────┘\n
\n```\n:::\n:::\n\n\nAbout 1/20 of transactions are negative. These could represent refunds, or they\ncould be data entry errors. Let's drop them to keep it simple.\n\n::: {#ee56a3f3 .cell execution_count=16}\n``` {.python .cell-code}\nabove_zero = cleaned.TRANSACTION_AMT > 0\ncleaned = cleaned[above_zero]\nabove_zero.value_counts()\n```\n\n::: {.cell-output .cell-output-display execution_count=30}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ Greater(TRANSACTION_AMT, 0)  Greater(TRANSACTION_AMT, 0)_count ┃\n┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ booleanint64                             │\n├─────────────────────────────┼───────────────────────────────────┤\n│ True                        │                          20669809 │\n│ False                       │                           1018183 │\n└─────────────────────────────┴───────────────────────────────────┘\n
\n```\n:::\n:::\n\n\n## Adding Features\n\nNow that the data is cleaned up to a usable format, let's add some features.\n\nFirst, it's useful to categorize donations by size, placing them into buckets\nof small, medium, large, etc.\n\n::: {#0ccc57df .cell execution_count=17}\n``` {.python .cell-code}\nedges = [\n 10,\n 50,\n 100,\n 500,\n 1000,\n 5000,\n]\nlabels = [\n \"<10\",\n \"10-50\",\n \"50-100\",\n \"100-500\",\n \"500-1000\",\n \"1000-5000\",\n \"5000+\",\n]\n\n\ndef bucketize(vals, edges, str_labels):\n # Uses Ibis's .bucket() method to create a categorical column\n int_labels = vals.bucket(edges, include_under=True, include_over=True)\n # Map the integer labels to the string labels\n int_to_str = {str(i): s for i, s in enumerate(str_labels)}\n return int_labels.cast(str).substitute(int_to_str)\n\n\nfeatured = cleaned.mutate(amount_bucket=bucketize(_.TRANSACTION_AMT, edges, labels))\nfeatured\n```\n\n::: {.cell-output .cell-output-display execution_count=31}\n```{=html}\n
┏━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n┃ CITY          STATE   TRANSACTION_AMT  CMTE_NM                date        election_type  amount_bucket ┃\n┡━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n│ stringstringint64stringdatestringstring        │\n├──────────────┼────────┼─────────────────┼───────────────────────┼────────────┼───────────────┼───────────────┤\n│ REMINGTON   IN    50AMERICA'S LIBERTY PAC2017-05-30primary      50-100        │\n│ REMINGTON   IN    50AMERICA'S LIBERTY PAC2017-06-05primary      50-100        │\n│ VANCOUVER   WA    100AMERICA'S LIBERTY PAC2017-06-07primary      100-500       │\n│ SOLANA BEACHCA    500AMERICA'S LIBERTY PAC2017-06-26primary      500-1000      │\n│ HILLSDALE   MI    250AMERICA'S LIBERTY PAC2017-05-15primary      100-500       │\n│ MIDDLEBURY  VT    500NBT PAC FEDERAL FUND 2017-06-05primary      500-1000      │\n│ WILLISTON   VT    500NBT PAC FEDERAL FUND 2017-05-30primary      500-1000      │\n│ GLENMONT    NY    350NBT PAC FEDERAL FUND 2017-06-01primary      100-500       │\n│ NORWICH     NY    250NBT PAC FEDERAL FUND 2017-05-31primary      100-500       │\n│ CLIFTON PARKNY    250NBT PAC FEDERAL FUND 2017-06-26primary      100-500       │\n│              │\n└──────────────┴────────┴─────────────────┴───────────────────────┴────────────┴───────────────┴───────────────┘\n
\n```\n:::\n:::\n\n\n## Analysis\n\n### By donation size\n\nOne thing we can look at is the donation breakdown by size:\n- Are most donations small or large?\n- Where do politicians/committees get most of their money from? Large or small donations?\n\nWe also will compare performance of Ibis vs pandas during this groupby.\n\n::: {#6c9dae32 .cell execution_count=18}\n``` {.python .cell-code}\ndef summary_by(table, by):\n return table.group_by(by).agg(\n n_donations=_.count(),\n total_amount=_.TRANSACTION_AMT.sum(),\n mean_amount=_.TRANSACTION_AMT.mean(),\n median_amount=_.TRANSACTION_AMT.approx_median(),\n )\n\n\ndef summary_by_pandas(df, by):\n return df.groupby(by, as_index=False).agg(\n n_donations=(\"election_type\", \"count\"),\n total_amount=(\"TRANSACTION_AMT\", \"sum\"),\n mean_amount=(\"TRANSACTION_AMT\", \"mean\"),\n median_amount=(\"TRANSACTION_AMT\", \"median\"),\n )\n\n\n# persist the input data so the following timings of the group_by are accurate.\nsubset = featured[\"election_type\", \"amount_bucket\", \"TRANSACTION_AMT\"]\nsubset = subset.cache()\npandas_subset = subset.execute()\n```\n:::\n\n\nLet's take a look at what we are actually computing:\n\n::: {#1b310e3e .cell execution_count=19}\n``` {.python .cell-code}\nby_type_and_bucket = summary_by(subset, [\"election_type\", \"amount_bucket\"])\nby_type_and_bucket\n```\n\n::: {.cell-output .cell-output-display execution_count=33}\n```{=html}\n
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n┃ election_type  amount_bucket  n_donations  total_amount  mean_amount   median_amount ┃\n┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n│ stringstringint64int64float64int64         │\n├───────────────┼───────────────┼─────────────┼──────────────┼──────────────┼───────────────┤\n│ primary      500-1000     634677334630687527.245649500 │\n│ general      5000+        31254449637314238.8393607537 │\n│ special      500-1000     78114003293512.519908500 │\n│ runoff       100-500      181933088289169.751498100 │\n│ convention   500-1000     1824945321518.268092500 │\n│ general      <10          1158735367424.6321585 │\n│ general      50-100       3043631618431253.17437450 │\n│ general      1000-5000    2461014600252421869.2538511978 │\n│ general      10-50        6607871441158821.80973325 │\n│ other        500-1000     11962535525.504202500 │\n│  │\n└───────────────┴───────────────┴─────────────┴──────────────┴──────────────┴───────────────┘\n
\n```\n:::\n:::\n\n\nOK, now let's do our timings.\n\nOne interesting thing to pay attention to here is the execution time for the following\ngroupby. Before, we could get away with lazy execution: because we only wanted to preview\nthe first few rows, we only had to compute the first few rows, so all our previews were\nvery fast.\n\nBut now, as soon as we do a groupby, we have to actually go through the whole dataset\nin order to compute the aggregate per group. So this is going to be slower. BUT,\nduckdb is still quite fast. It only takes milliseconds to groupby-agg all 20 million rows!\n\n::: {#32424707 .cell execution_count=20}\n``` {.python .cell-code}\n%timeit summary_by(subset, [\"election_type\", \"amount_bucket\"]).execute() # .execute() so we actually fetch the data\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n161 ms ± 4.75 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n```\n:::\n:::\n\n\nNow let's try the same thing in pandas:\n\n::: {#cc653b7f .cell execution_count=21}\n``` {.python .cell-code}\n%timeit summary_by_pandas(pandas_subset, [\"election_type\", \"amount_bucket\"])\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n2.19 s ± 6.54 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n```\n:::\n:::\n\n\nIt takes a bit over 2 seconds, which is more than 10 times slower than duckdb.\nAt this scale, it again doesn't matter,\nbut you could imagine with a dataset much larger than this, it would matter.\n\nLet's also think about memory usage:\n\n::: {#c967896c .cell execution_count=22}\n``` {.python .cell-code}\npandas_subset.memory_usage(deep=True).sum() / 1e9 # GB\n```\n\n::: {.cell-output .cell-output-display execution_count=36}\n```\n2.782586667\n```\n:::\n:::\n\n\nThe source dataframe is a couple gigabytes, so probably during the groupby,\nthe peak memory usage is going to be a bit higher than this. 
You could use a profiler\nsuch as [FIL](https://github.com/pythonspeed/filprofiler) if you wanted an exact number,\nI was too lazy to use that here.\n\nAgain, this works on my laptop at this dataset size, but much larger than this and I'd\nstart having problems. Duckdb on the other hand is designed around working out of core\nso it should scale to datasets into the hundreds of gigabytes, much larger than your\ncomputer's RAM.\n\n### Back to analysis\n\nOK, let's plot the result of that groupby.\n\nSurprise! (Or maybe not...) Most donations are small. But most of the money comes\nfrom donations larger than $1000.\n\nWell if that's the case, why do politicians spend so much time soliciting small\ndonations? One explanation is that they can use the number of donations\nas a marketing pitch, to show how popular they are, and thus how viable of a\ncandidate they are.\n\nThis also might explain whose interests are being served by our politicians.\n\n::: {#6808107a .cell execution_count=23}\n``` {.python .cell-code}\nimport altair as alt\n\n# Do some bookkeeping so the buckets are displayed smallest to largest on the charts\nbucket_col = alt.Column(\"amount_bucket:N\", sort=labels)\n\nn_by_bucket = (\n alt.Chart(by_type_and_bucket.execute())\n .mark_bar()\n .encode(\n x=bucket_col,\n y=\"n_donations:Q\",\n color=\"election_type:N\",\n )\n)\ntotal_by_bucket = (\n alt.Chart(by_type_and_bucket.execute())\n .mark_bar()\n .encode(\n x=bucket_col,\n y=\"total_amount:Q\",\n color=\"election_type:N\",\n )\n)\nn_by_bucket | total_by_bucket\n```\n\n::: {.cell-output .cell-output-display execution_count=37}\n```{=html}\n\n\n
\n\n```\n:::\n:::\n\n\n### By election stage\n\nLet's look at how donations break down by election stage. Do people donate\ndifferently for primary elections vs general elections?\n\nLet's ignore everything but primary and general elections, since they are the\nmost common, and arguably the most important.\n\n::: {#8a758b63 .cell execution_count=24}\n``` {.python .cell-code}\ngb2 = by_type_and_bucket[_.election_type.isin((\"primary\", \"general\"))]\nn_donations_per_election_type = _.n_donations.sum().over(group_by=\"election_type\")\nfrac = _.n_donations / n_donations_per_election_type\ngb2 = gb2.mutate(frac_n_donations_per_election_type=frac)\ngb2\n```\n\n::: {.cell-output .cell-output-display execution_count=38}\n```{=html}\n
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ election_type  amount_bucket  n_donations  total_amount  mean_amount   median_amount  frac_n_donations_per_election_type ┃\n┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ stringstringint64int64float64int64float64                            │\n├───────────────┼───────────────┼─────────────┼──────────────┼──────────────┼───────────────┼────────────────────────────────────┤\n│ general      <10          1158735367424.63215850.052544 │\n│ general      50-100       3043631618431253.174374500.138017 │\n│ general      1000-5000    2461014600252421869.25385119610.111598 │\n│ general      10-50        6607871441158821.809733250.299642 │\n│ general      100-500      700821123174568175.7575301500.317796 │\n│ general      500-1000     17418291015697522.5321625000.078985 │\n│ general      5000+        31254449637314238.83936076010.001417 │\n│ primary      5000+        44085155837111635349.237065100000.002422 │\n│ primary      100-500      3636287637353634175.2759431500.199765 │\n│ primary      500-1000     634677334630687527.2456495000.034867 │\n│  │\n└───────────────┴───────────────┴─────────────┴──────────────┴──────────────┴───────────────┴────────────────────────────────────┘\n
\n```\n:::\n:::\n\n\nIt looks like primary elections get a larger proportion of small donations.\n\n::: {#30710ce2 .cell execution_count=25}\n``` {.python .cell-code}\nalt.Chart(gb2.execute()).mark_bar().encode(\n x=\"election_type:O\",\n y=\"frac_n_donations_per_election_type:Q\",\n color=bucket_col,\n)\n```\n\n::: {.cell-output .cell-output-display execution_count=39}\n```{=html}\n\n\n
\n\n```\n:::\n:::\n\n\n### By recipient\n\nLet's look at the top players. Who gets the most donations?\n\nFar and away it is ActBlue, which acts as a conduit for donations to Democratic\ninterests.\n\nBeto O'Rourke is the top individual politician, hats off to him!\n\n::: {#97c0a2c8 .cell execution_count=26}\n``` {.python .cell-code}\nby_recip = summary_by(featured, \"CMTE_NM\")\nby_recip\n```\n\n::: {.cell-output .cell-output-display execution_count=40}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n┃ CMTE_NM                                                           n_donations  total_amount  mean_amount  median_amount ┃\n┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n│ stringint64int64float64int64         │\n├──────────────────────────────────────────────────────────────────┼─────────────┼──────────────┼─────────────┼───────────────┤\n│ INDIANA DENTAL PAC                                              11162236560.684685410 │\n│ BEAM SUNTORY INC POLITICAL ACTION COMMITTEE                     40764806159.22850165 │\n│ AMEDISYS, INC. POLITICAL ACTION COMMITTEE                       13225000189.39393975 │\n│ PIEDMONT TRIAD ANESTHESIA P A FEDERAL PAC                       13290375684.659091600 │\n│ AHOLD DELHAIZE USA, INC POLITICAL ACTION COMMITTEE              36948062130.249322100 │\n│ DIMITRI FOR CONGRESS                                            8734719399.068966250 │\n│ RELX INC. POLITICAL ACTION COMMITTEE                            549130690855.89291634 │\n│ MAKING INVESTMENTS MAJORITY INSURED PAC                         14306002185.7142861000 │\n│ AMERICAN ACADEMY OF OTOLARYNGOLOGY-HEAD AND NECK SURGERY ENT PAC765285756373.537255365 │\n│ MIMI WALTERS VICTORY FUND                                       84025148242993.8380952506 │\n│  │\n└──────────────────────────────────────────────────────────────────┴─────────────┴──────────────┴─────────────┴───────────────┘\n
\n```\n:::\n:::\n\n\n::: {#56418e6e .cell execution_count=27}\n``` {.python .cell-code}\ntop_recip = by_recip.order_by(ibis.desc(\"n_donations\")).head(10)\nalt.Chart(top_recip.execute()).mark_bar().encode(\n x=alt.X(\"CMTE_NM:O\", sort=\"-y\"),\n y=\"n_donations:Q\",\n)\n```\n\n::: {.cell-output .cell-output-display execution_count=41}\n```{=html}\n\n\n
\n\n```\n:::\n:::\n\n\n### By Location\n\nWhere are the largest donations coming from?\n\n::: {#55b19fc3 .cell execution_count=28}\n``` {.python .cell-code}\nf2 = featured.mutate(loc=_.CITY + \", \" + _.STATE).drop(\"CITY\", \"STATE\")\nby_loc = summary_by(f2, \"loc\")\n# Drop the places with a small number of donations so we're\n# resistant to outliers for the mean\nby_loc = by_loc[_.n_donations > 1000]\nby_loc\n```\n\n::: {.cell-output .cell-output-display execution_count=42}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n┃ loc              n_donations  total_amount  mean_amount  median_amount ┃\n┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n│ stringint64int64float64int64         │\n├─────────────────┼─────────────┼──────────────┼─────────────┼───────────────┤\n│ NAZARETH, PA   146013871095.00684938 │\n│ FULSHEAR, TX   1504346778230.57047950 │\n│ GLOUCESTER, MA 4956563331113.66646525 │\n│ NORMAN, OK     6195945333152.59612635 │\n│ OAK PARK, IL   120173413138284.02579739 │\n│ AUSTIN, TX     18986533315922175.47163538 │\n│ MIAMI BEACH, FL1282510598453826.390097100 │\n│ SAN ANTONIO, TX14052918925978134.67667235 │\n│ HAMBURG, NY    232217025473.3221368 │\n│ PITTSBURGH, PA 7420814358578193.49097142 │\n│  │\n└─────────────────┴─────────────┴──────────────┴─────────────┴───────────────┘\n
\n```\n:::\n:::\n\n\n::: {#cc1697c5 .cell execution_count=29}\n``` {.python .cell-code}\ndef top_by(col):\n top = by_loc.order_by(ibis.desc(col)).head(10)\n return (\n alt.Chart(top.execute())\n .mark_bar()\n .encode(\n x=alt.X('loc:O', sort=\"-y\"),\n y=col,\n )\n )\n\n\ntop_by(\"n_donations\") | top_by(\"total_amount\") | top_by(\"mean_amount\") | top_by(\n \"median_amount\"\n)\n```\n\n::: {.cell-output .cell-output-display execution_count=43}\n```{=html}\n\n\n
\n\n```\n:::\n:::\n\n\n### By month\n\nWhen do the donations come in?\n\n::: {#0d055d90 .cell execution_count=30}\n``` {.python .cell-code}\nby_month = summary_by(featured, _.date.month().name(\"month_int\"))\n# Sorta hacky, .substitute doesn't work to change dtypes (yet?)\n# so we cast to string and then do our mapping\nmonth_map = {\n \"1\": \"Jan\",\n \"2\": \"Feb\",\n \"3\": \"Mar\",\n \"4\": \"Apr\",\n \"5\": \"May\",\n \"6\": \"Jun\",\n \"7\": \"Jul\",\n \"8\": \"Aug\",\n \"9\": \"Sep\",\n \"10\": \"Oct\",\n \"11\": \"Nov\",\n \"12\": \"Dec\",\n}\nby_month = by_month.mutate(month_str=_.month_int.cast(str).substitute(month_map))\nby_month\n```\n\n::: {.cell-output .cell-output-display execution_count=44}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓\n┃ month_int  n_donations  total_amount  mean_amount  median_amount  month_str ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩\n│ int32int64int64float64int64string    │\n├───────────┼─────────────┼──────────────┼─────────────┼───────────────┼───────────┤\n│      NULL1514250297165.321664100NULL      │\n│         1348979174837854500.998209124Jan       │\n│         2581646255997655440.126219100Feb       │\n│         31042577430906797413.30932681Mar       │\n│         41088244299252692274.98676050Apr       │\n│         51374247387317192281.83957648May       │\n│         61667285465305247279.07961044Jun       │\n│         71607053320528605199.45117235Jul       │\n│         82023466473544182234.02626135Aug       │\n│         92583847697888624270.09672938Sep       │\n│                  │\n└───────────┴─────────────┴──────────────┴─────────────┴───────────────┴───────────┘\n
\n```\n:::\n:::\n\n\n::: {#7002ddb8 .cell execution_count=31}\n``` {.python .cell-code}\nmonths_in_order = list(month_map.values())\nalt.Chart(by_month.execute()).mark_bar().encode(\n x=alt.X(\"month_str:O\", sort=months_in_order),\n y=\"n_donations:Q\",\n)\n```\n\n::: {.cell-output .cell-output-display execution_count=45}\n```{=html}\n\n\n
\n\n```\n:::\n:::\n\n\n## Conclusion\n\nThanks for following along! I hope you've learned something about Ibis, and\nmaybe even about campaign finance.\n\nIbis is a great tool for exploring data. I now find myself reaching for it\nwhen in the past I would have reached for pandas.\n\nSome of the highlights for me:\n\n- Fast, lazy execution, a great display format, and good type hinting/editor support for a great REPL experience.\n- Very well thought-out API and semantics (e.g. `isinstance(val, NumericValue)`?? That's beautiful!)\n- Fast and fairly complete string support, since I work with a lot of text data.\n- Extremely responsive maintainers. Sometimes I've submitted multiple feature requests and bug reports in a single day, and a PR has been merged by the next day.\n- Escape hatch to SQL. I didn't have to use that here, but if something isn't supported, you can always fall back to SQL.\n\nCheck out [The Ibis Website](https://ibis-project.org/) for more information.\n\n", "supporting": [ - "index_files/figure-html" + "index_files" ], "filters": [], "includes": { "include-in-header": [ - "\n\n\n" + "\n\n\n" ] } } diff --git a/docs/_quarto.yml b/docs/_quarto.yml index b3c11e4494ff..d2d3ec3c1af8 100644 --- a/docs/_quarto.yml +++ b/docs/_quarto.yml @@ -298,10 +298,6 @@ quartodoc: - name: param dynamic: true signature_name: full - - name: NA - # Ideally exposed under `ibis` but that doesn't seem to work?? - package: ibis.expr.api - signature_name: full - name: "null" dynamic: true signature_name: full diff --git a/docs/how-to/timeseries/sessionize.qmd b/docs/how-to/timeseries/sessionize.qmd index c949076eec86..490e556a3b78 100644 --- a/docs/how-to/timeseries/sessionize.qmd +++ b/docs/how-to/timeseries/sessionize.qmd @@ -59,7 +59,7 @@ sessionized = ( data # Create a session id for each character by using a cumulative sum # over the `new_session` column. 
- .mutate(new_session=is_new_session.fillna(True)) + .mutate(new_session=is_new_session.fill_null(True)) # Create a session id for each character by using a cumulative sum # over the `new_session` column. .mutate(session_id=c.new_session.sum().over(entity_window)) diff --git a/docs/posts/campaign-finance/index.qmd b/docs/posts/campaign-finance/index.qmd index 3d8d9fc19330..a2a0a287e388 100644 --- a/docs/posts/campaign-finance/index.qmd +++ b/docs/posts/campaign-finance/index.qmd @@ -245,7 +245,7 @@ def get_election_type(pgi: StringValue) -> StringValue: "E": "recount", } first_letter = pgi[0] - return first_letter.substitute(election_types, else_=ibis.NA) + return first_letter.substitute(election_types, else_=ibis.null()) cleaned = cleaned.mutate(election_type=get_election_type(_.TRANSACTION_PGI)).drop( diff --git a/docs/posts/ibis-analytics/index.qmd b/docs/posts/ibis-analytics/index.qmd index a684fab26641..1b1efced1650 100644 --- a/docs/posts/ibis-analytics/index.qmd +++ b/docs/posts/ibis-analytics/index.qmd @@ -1220,7 +1220,7 @@ def transform_downloads(extract_downloads): ) .order_by(ibis._.timestamp.desc()) ) - downloads = downloads.mutate(ibis._["python"].fillna("").name("python_full")) + downloads = downloads.mutate(ibis._["python"].fill_null("").name("python_full")) downloads = downloads.mutate( f.clean_version(downloads["python_full"], patch=False).name("python") ) diff --git a/docs/tutorials/ibis-for-pandas-users.qmd b/docs/tutorials/ibis-for-pandas-users.qmd index a680876c2df8..b640d524addd 100644 --- a/docs/tutorials/ibis-for-pandas-users.qmd +++ b/docs/tutorials/ibis-for-pandas-users.qmd @@ -507,7 +507,7 @@ represented by `NaN`. This can be confusing when working with numeric data, since `NaN` is also a valid floating point value (along with `+/-inf`). In Ibis, we try to be more precise: All data types are nullable, and we use -`ibis.NA` to represent `NULL` values, and all datatypes have a `.isnull()` method. 
+`ibis.null()` to represent `NULL` values, and all datatypes have a `.isnull()` method.
 For floating point values, we use different values for `NaN` and `+/-inf`, and
 there are the additional methods `.isnan()` and `.isinf()`.
@@ -532,17 +532,17 @@
 the column name for the value to apply to.
 
 
 ```{python}
-no_null_peng = penguins.fillna(dict(bill_depth_mm=0, bill_length_mm=0))
+no_null_peng = penguins.fill_null(dict(bill_depth_mm=0, bill_length_mm=0))
 ```
 
 ### Replacing `NULL`s
 
-Both pandas and Ibis have `fillna` methods which allow you to specify a replacement value
+The Ibis equivalent of pandas `fillna` is `fill_null`; this method allows you to specify a replacement value
 for `NULL` values.
 
 ```{python}
-bill_length_no_nulls = penguins.bill_length_mm.fillna(0)
+bill_length_no_nulls = penguins.bill_length_mm.fill_null(0)
 ```
 
 ## Type casts
diff --git a/docs/tutorials/ibis-for-sql-users.qmd b/docs/tutorials/ibis-for-sql-users.qmd
index 577f7b015111..1f348ca13b70 100644
--- a/docs/tutorials/ibis-for-sql-users.qmd
+++ b/docs/tutorials/ibis-for-sql-users.qmd
@@ -522,10 +522,10 @@ ibis.to_sql(expr)
 
 ### Using `NULL` in expressions
 
-To use `NULL` in an expression, either use the special `ibis.NA` value:
+To use `NULL` in an expression, use `ibis.null()`:
 
 ```{python}
-pos_two = (t.two > 0).ifelse(t.two, ibis.NA)
+pos_two = (t.two > 0).ifelse(t.two, ibis.null())
 expr = t.mutate(two_positive=pos_two)
 ibis.to_sql(expr)
 ```
diff --git a/ibis/__init__.py b/ibis/__init__.py
index 2ec14e182330..e7927bd7fd2d 100644
--- a/ibis/__init__.py
+++ b/ibis/__init__.py
@@ -4,6 +4,9 @@
 
 __version__ = "9.0.0"
 
+import warnings
+from typing import Any
+
 from ibis import examples, util
 from ibis.backends import BaseBackend
 from ibis.common.exceptions import IbisError
@@ -36,7 +39,7 @@ def __dir__() -> list[str]:
     return sorted(out)
 
 
-def __getattr__(name: str) -> BaseBackend:
+def load_backend(name: str) -> BaseBackend:
     """Load backends in a lazy way with `ibis.<backend_name>`.
This also registers the backend options. @@ -125,3 +128,18 @@ def connect(*args, **kwargs): setattr(proxy, name, getattr(backend, name)) return proxy + + +def __getattr__(name: str) -> Any: + if name == "NA": + warnings.warn( + "Accessing 'ibis.NA' is deprecated as of v9.1 and will be removed in a future version. " + "Use 'ibis.null()' instead.", + DeprecationWarning, + stacklevel=2, + ) + import ibis + + return ibis.null() + else: + return load_backend(name) diff --git a/ibis/backends/clickhouse/tests/test_functions.py b/ibis/backends/clickhouse/tests/test_functions.py index 04b53f4d840d..dfe9d5e0f01e 100644 --- a/ibis/backends/clickhouse/tests/test_functions.py +++ b/ibis/backends/clickhouse/tests/test_functions.py @@ -116,8 +116,8 @@ def test_isnull_notnull(con, expr, expected): ("expr", "expected"), [ (ibis.coalesce(5, None, 4), 5), - (ibis.coalesce(ibis.NA, 4, ibis.NA), 4), - (ibis.coalesce(ibis.NA, ibis.NA, 3.14), 3.14), + (ibis.coalesce(ibis.null(), 4, ibis.null()), 4), + (ibis.coalesce(ibis.null(), ibis.null(), 3.14), 3.14), ], ) def test_coalesce(con, expr, expected): @@ -127,13 +127,13 @@ def test_coalesce(con, expr, expected): @pytest.mark.parametrize( ("expr", "expected"), [ - (ibis.NA.fillna(5), 5), - (L(5).fillna(10), 5), + (ibis.null().fill_null(5), 5), + (L(5).fill_null(10), 5), (L(5).nullif(5), None), (L(10).nullif(5), 10), ], ) -def test_fillna_nullif(con, expr, expected): +def test_fill_null_nullif(con, expr, expected): result = con.execute(expr) if expected is None: assert pd.isnull(result) @@ -150,7 +150,7 @@ def test_fillna_nullif(con, expr, expected): (L(datetime(2015, 9, 1, hour=14, minute=48, second=5)), "DateTime"), (L(date(2015, 9, 1)), "Date"), param( - ibis.NA, + ibis.null(), "Null", marks=pytest.mark.xfail( raises=AssertionError, @@ -418,7 +418,7 @@ def test_numeric_builtins_work(alltypes, df): def test_null_column(alltypes): t = alltypes nrows = t.count().execute() - expr = t.mutate(na_column=ibis.NA).na_column + expr = 
t.mutate(na_column=ibis.null()).na_column result = expr.execute() expected = pd.Series([None] * nrows, name="na_column") tm.assert_series_equal(result, expected) diff --git a/ibis/backends/clickhouse/tests/test_select.py b/ibis/backends/clickhouse/tests/test_select.py index 9b9e69f3d52c..3087b15bbdeb 100644 --- a/ibis/backends/clickhouse/tests/test_select.py +++ b/ibis/backends/clickhouse/tests/test_select.py @@ -362,7 +362,7 @@ def test_count_name(assert_sql): t = ibis.table(dict(a="string", b="bool"), name="t") expr = t.group_by(t.a).agg( - A=t.count(where=~t.b).fillna(0), B=t.count(where=t.b).fillna(0) + A=t.count(where=~t.b).fill_null(0), B=t.count(where=t.b).fill_null(0) ) assert_sql(expr) diff --git a/ibis/backends/dask/tests/test_window.py b/ibis/backends/dask/tests/test_window.py index ef2249dd099d..2a6d17c67e13 100644 --- a/ibis/backends/dask/tests/test_window.py +++ b/ibis/backends/dask/tests/test_window.py @@ -20,7 +20,7 @@ def sort_kind(): return "mergesort" -default = pytest.mark.parametrize("default", [ibis.NA, ibis.literal("a")]) +default = pytest.mark.parametrize("default", [ibis.null(), ibis.literal("a")]) row_offset = pytest.mark.parametrize("row_offset", list(map(ibis.literal, [-1, 1, 0]))) range_offset = pytest.mark.parametrize( "range_offset", @@ -48,7 +48,7 @@ def test_lead(con, t, df, row_offset, default, row_window): expr = t.dup_strings.lead(row_offset, default=default).over(row_window) result = expr.execute() expected = df.dup_strings.shift(con.execute(-row_offset)).compute() - if default is not ibis.NA: + if default is not ibis.null(): expected = expected.fillna(con.execute(default)) tm.assert_series_equal(result, expected, check_names=False) @@ -59,7 +59,7 @@ def test_lag(con, t, df, row_offset, default, row_window): expr = t.dup_strings.lag(row_offset, default=default).over(row_window) result = expr.execute() expected = df.dup_strings.shift(con.execute(row_offset)).compute() - if default is not ibis.NA: + if default is not ibis.null(): 
expected = expected.fillna(con.execute(default)) tm.assert_series_equal(result, expected, check_names=False) @@ -78,7 +78,7 @@ def test_lead_delta(con, t, pandas_df, range_offset, default, range_window): .reindex(pandas_df.plain_datetimes_naive) .reset_index(drop=True) ) - if default is not ibis.NA: + if default is not ibis.null(): expected = expected.fillna(con.execute(default)) tm.assert_series_equal(result, expected, check_names=False) @@ -98,7 +98,7 @@ def test_lag_delta(t, con, pandas_df, range_offset, default, range_window): .reindex(pandas_df.plain_datetimes_naive) .reset_index(drop=True) ) - if default is not ibis.NA: + if default is not ibis.null(): expected = expected.fillna(con.execute(default)) tm.assert_series_equal(result, expected, check_names=False) diff --git a/ibis/backends/impala/tests/snapshots/test_case_exprs/test_decimal_fillna_cast_arg/fillna_l_extendedprice/out.sql b/ibis/backends/impala/tests/snapshots/test_case_exprs/test_decimal_fill_null_cast_arg/fill_null_l_extendedprice/out.sql similarity index 100% rename from ibis/backends/impala/tests/snapshots/test_case_exprs/test_decimal_fillna_cast_arg/fillna_l_extendedprice/out.sql rename to ibis/backends/impala/tests/snapshots/test_case_exprs/test_decimal_fill_null_cast_arg/fill_null_l_extendedprice/out.sql diff --git a/ibis/backends/impala/tests/snapshots/test_case_exprs/test_decimal_fillna_cast_arg/fillna_l_extendedprice_double/out.sql b/ibis/backends/impala/tests/snapshots/test_case_exprs/test_decimal_fill_null_cast_arg/fill_null_l_extendedprice_double/out.sql similarity index 100% rename from ibis/backends/impala/tests/snapshots/test_case_exprs/test_decimal_fillna_cast_arg/fillna_l_extendedprice_double/out.sql rename to ibis/backends/impala/tests/snapshots/test_case_exprs/test_decimal_fill_null_cast_arg/fill_null_l_extendedprice_double/out.sql diff --git a/ibis/backends/impala/tests/snapshots/test_case_exprs/test_decimal_fillna_cast_arg/fillna_l_quantity/out.sql 
b/ibis/backends/impala/tests/snapshots/test_case_exprs/test_decimal_fill_null_cast_arg/fill_null_l_quantity/out.sql similarity index 100% rename from ibis/backends/impala/tests/snapshots/test_case_exprs/test_decimal_fillna_cast_arg/fillna_l_quantity/out.sql rename to ibis/backends/impala/tests/snapshots/test_case_exprs/test_decimal_fill_null_cast_arg/fill_null_l_quantity/out.sql diff --git a/ibis/backends/impala/tests/test_case_exprs.py b/ibis/backends/impala/tests/test_case_exprs.py index e23a9436c6fb..a195928b1221 100644 --- a/ibis/backends/impala/tests/test_case_exprs.py +++ b/ibis/backends/impala/tests/test_case_exprs.py @@ -76,16 +76,17 @@ def test_nullif_ifnull(tpch_lineitem, expr_fn, snapshot): @pytest.mark.parametrize( "expr_fn", [ - pytest.param(lambda t: t.l_quantity.fillna(0), id="fillna_l_quantity"), + pytest.param(lambda t: t.l_quantity.fill_null(0), id="fill_null_l_quantity"), pytest.param( - lambda t: t.l_extendedprice.fillna(0), id="fillna_l_extendedprice" + lambda t: t.l_extendedprice.fill_null(0), id="fill_null_l_extendedprice" ), pytest.param( - lambda t: t.l_extendedprice.fillna(0.0), id="fillna_l_extendedprice_double" + lambda t: t.l_extendedprice.fill_null(0.0), + id="fill_null_l_extendedprice_double", ), ], ) -def test_decimal_fillna_cast_arg(tpch_lineitem, expr_fn, snapshot): +def test_decimal_fill_null_cast_arg(tpch_lineitem, expr_fn, snapshot): expr = expr_fn(tpch_lineitem) result = translate(expr) snapshot.assert_match(result, "out.sql") @@ -99,6 +100,6 @@ def test_identical_to(mockcon, snapshot): def test_identical_to_special_case(snapshot): - expr = ibis.NA.cast("int64").identical_to(ibis.NA.cast("int64")).name("tmp") + expr = ibis.null().cast("int64").identical_to(ibis.null().cast("int64")).name("tmp") result = ibis.to_sql(expr, dialect="impala") snapshot.assert_match(result, "out.sql") diff --git a/ibis/backends/impala/tests/test_exprs.py b/ibis/backends/impala/tests/test_exprs.py index 8fdc4e1b4358..45a0ac96d76e 100644 --- 
a/ibis/backends/impala/tests/test_exprs.py +++ b/ibis/backends/impala/tests/test_exprs.py @@ -52,9 +52,9 @@ def test_builtins(con, alltypes): i4 % 10, 20 % i1, d % 5, - i1.fillna(0), - i4.fillna(0), - i8.fillna(0), + i1.fill_null(0), + i4.fill_null(0), + i8.fill_null(0), i4.to_timestamp("s"), i4.to_timestamp("ms"), i4.to_timestamp("us"), @@ -65,7 +65,7 @@ def test_builtins(con, alltypes): d.ceil(), d.exp(), d.isnull(), - d.fillna(0), + d.fill_null(0), d.floor(), d.log(), d.ln(), @@ -164,7 +164,7 @@ def _check_impala_output_types_match(con, table): (5 / L(50).nullif(0), 0.1), (5 / L(50).nullif(L(50000)), 0.1), (5 / L(50000).nullif(0), 0.0001), - (L(50000).fillna(0), 50000), + (L(50000).fill_null(0), 50000), ], ) def test_int_builtins(con, expr, expected): @@ -257,13 +257,13 @@ def approx_equal(a, b, eps): [ pytest.param(lambda dc: dc, "5.245", id="id"), pytest.param(lambda dc: dc % 5, "0.245", id="mod"), - pytest.param(lambda dc: dc.fillna(0), "5.245", id="fillna"), + pytest.param(lambda dc: dc.fill_null(0), "5.245", id="fill_null"), pytest.param(lambda dc: dc.exp(), "189.6158", id="exp"), pytest.param(lambda dc: dc.log(), "1.65728", id="log"), pytest.param(lambda dc: dc.log2(), "2.39094", id="log2"), pytest.param(lambda dc: dc.log10(), "0.71975", id="log10"), pytest.param(lambda dc: dc.sqrt(), "2.29019", id="sqrt"), - pytest.param(lambda dc: dc.fillna(0), "5.245", id="zero_ifnull"), + pytest.param(lambda dc: dc.fill_null(0), "5.245", id="zero_ifnull"), pytest.param(lambda dc: -dc, "-5.245", id="neg"), ], ) @@ -384,8 +384,8 @@ def test_decimal_timestamp_builtins(con): dc * 2, dc**2, dc.cast("double"), - api.ifelse(table.l_discount > 0, dc * table.l_discount, api.NA), - dc.fillna(0), + api.ifelse(table.l_discount > 0, dc * table.l_discount, api.null()), + dc.fill_null(0), ts < (ibis.now() + ibis.interval(months=3)), ts < (ibis.timestamp("2005-01-01") + ibis.interval(months=3)), # hashing @@ -632,10 +632,10 @@ def test_unions_with_ctes(con, alltypes): 
@pytest.mark.parametrize( ("left", "right", "expected"), [ - (ibis.NA.cast("int64"), ibis.NA.cast("int64"), True), + (ibis.null().cast("int64"), ibis.null().cast("int64"), True), (L(1), L(1), True), - (ibis.NA.cast("int64"), L(1), False), - (L(1), ibis.NA.cast("int64"), False), + (ibis.null().cast("int64"), L(1), False), + (L(1), ibis.null().cast("int64"), False), (L(0), L(1), False), (L(1), L(0), False), ], diff --git a/ibis/backends/impala/tests/test_unary_builtins.py b/ibis/backends/impala/tests/test_unary_builtins.py index b0b605d06060..5b1855eb4509 100644 --- a/ibis/backends/impala/tests/test_unary_builtins.py +++ b/ibis/backends/impala/tests/test_unary_builtins.py @@ -29,7 +29,7 @@ def table(mockcon): param(lambda x: x.log2(), id="log2"), param(lambda x: x.log10(), id="log10"), param(lambda x: x.nullif(0), id="nullif_zero"), - param(lambda x: x.fillna(0), id="zero_ifnull"), + param(lambda x: x.fill_null(0), id="zero_ifnull"), ], ) @pytest.mark.parametrize("cname", ["double_col", "int_col"]) diff --git a/ibis/backends/pandas/executor.py b/ibis/backends/pandas/executor.py index e0b3e19940f9..7e1886b408ed 100644 --- a/ibis/backends/pandas/executor.py +++ b/ibis/backends/pandas/executor.py @@ -740,7 +740,7 @@ def visit(cls, op: ops.Distinct, parent): return parent.drop_duplicates() @classmethod - def visit(cls, op: ops.DropNa, parent, how, subset): + def visit(cls, op: ops.DropNull, parent, how, subset): if op.subset is not None: subset = [col.name for col in op.subset] else: @@ -748,7 +748,7 @@ def visit(cls, op: ops.DropNa, parent, how, subset): return parent.dropna(how=how, subset=subset) @classmethod - def visit(cls, op: ops.FillNa, parent, replacements): + def visit(cls, op: ops.FillNull, parent, replacements): return parent.fillna(replacements) @classmethod diff --git a/ibis/backends/pandas/tests/test_join.py b/ibis/backends/pandas/tests/test_join.py index 926cd5ce6129..c4f730e84ea0 100644 --- a/ibis/backends/pandas/tests/test_join.py +++ 
b/ibis/backends/pandas/tests/test_join.py @@ -502,10 +502,10 @@ def test_mutate_after_join(): .isnull() .ifelse(joined["q_Order_Priority"], joined["p_Order_Priority"]) ), - p_count=joined["p_count"].fillna(0), - q_count=joined["q_count"].fillna(0), - p_density=joined.p_density.fillna(1e-10), - q_density=joined.q_density.fillna(1e-10), + p_count=joined["p_count"].fill_null(0), + q_count=joined["q_count"].fill_null(0), + p_density=joined.p_density.fill_null(1e-10), + q_density=joined.q_density.fill_null(1e-10), features=ibis.literal("Order_Priority"), ) diff --git a/ibis/backends/pandas/tests/test_window.py b/ibis/backends/pandas/tests/test_window.py index 791f29133abb..d588120b8fd4 100644 --- a/ibis/backends/pandas/tests/test_window.py +++ b/ibis/backends/pandas/tests/test_window.py @@ -20,7 +20,7 @@ def sort_kind(): return "mergesort" -default = pytest.mark.parametrize("default", [ibis.NA, ibis.literal("a")]) +default = pytest.mark.parametrize("default", [ibis.null(), ibis.literal("a")]) row_offset = pytest.mark.parametrize("row_offset", list(map(ibis.literal, [-1, 1, 0]))) range_offset = pytest.mark.parametrize( "range_offset", @@ -49,7 +49,7 @@ def test_lead(t, df, row_offset, default, row_window): expr = t.dup_strings.lead(row_offset, default=default).over(row_window) result = expr.execute() expected = df.dup_strings.shift(con.execute(-row_offset)) - if default is not ibis.NA: + if default is not ibis.null(): expected = expected.fillna(con.execute(default)) tm.assert_series_equal(result, expected.rename("tmp")) @@ -61,7 +61,7 @@ def test_lag(t, df, row_offset, default, row_window): expr = t.dup_strings.lag(row_offset, default=default).over(row_window) result = expr.execute() expected = df.dup_strings.shift(con.execute(row_offset)) - if default is not ibis.NA: + if default is not ibis.null(): expected = expected.fillna(con.execute(default)) tm.assert_series_equal(result, expected.rename("tmp")) @@ -80,7 +80,7 @@ def test_lead_delta(t, df, range_offset, default, 
range_window): .reindex(df.plain_datetimes_naive) .reset_index(drop=True) ) - if default is not ibis.NA: + if default is not ibis.null(): expected = expected.fillna(con.execute(default)) tm.assert_series_equal(result, expected.rename("tmp")) @@ -100,7 +100,7 @@ def test_lag_delta(t, df, range_offset, default, range_window): .reindex(df.plain_datetimes_naive) .reset_index(drop=True) ) - if default is not ibis.NA: + if default is not ibis.null(): expected = expected.fillna(con.execute(default)) tm.assert_series_equal(result, expected.rename("tmp")) diff --git a/ibis/backends/polars/compiler.py b/ibis/backends/polars/compiler.py index 50350c5a420b..0bb19b6d8c45 100644 --- a/ibis/backends/polars/compiler.py +++ b/ibis/backends/polars/compiler.py @@ -367,8 +367,8 @@ def asof_join(op, **kw): return joined -@translate.register(ops.DropNa) -def dropna(op, **kw): +@translate.register(ops.DropNull) +def drop_null(op, **kw): lf = translate(op.parent, **kw) if op.subset is None: @@ -385,8 +385,8 @@ def dropna(op, **kw): return lf.drop_nulls(subset) -@translate.register(ops.FillNa) -def fillna(op, **kw): +@translate.register(ops.FillNull) +def fill_null(op, **kw): table = translate(op.parent, **kw) columns = [] diff --git a/ibis/backends/postgres/tests/test_functions.py b/ibis/backends/postgres/tests/test_functions.py index 386a7d792025..93467491a3d4 100644 --- a/ibis/backends/postgres/tests/test_functions.py +++ b/ibis/backends/postgres/tests/test_functions.py @@ -150,7 +150,7 @@ def test_strftime(con, pattern): [ param(L("foo_bar"), "text", id="text"), param(L(5), "integer", id="integer"), - param(ibis.NA, "null", id="null"), + param(ibis.null(), "null", id="null"), # TODO(phillipc): should this really be double? 
param(L(1.2345), "numeric", id="numeric"), param( @@ -335,13 +335,13 @@ def test_regexp_extract(con, expr, expected): @pytest.mark.parametrize( ("expr", "expected"), [ - param(ibis.NA.fillna(5), 5, id="filled"), - param(L(5).fillna(10), 5, id="not_filled"), + param(ibis.null().fill_null(5), 5, id="filled"), + param(L(5).fill_null(10), 5, id="not_filled"), param(L(5).nullif(5), None, id="nullif_null"), param(L(10).nullif(5), 10, id="nullif_not_null"), ], ) -def test_fillna_nullif(con, expr, expected): +def test_fill_null_nullif(con, expr, expected): assert con.execute(expr) == expected @@ -349,8 +349,8 @@ def test_fillna_nullif(con, expr, expected): ("expr", "expected"), [ param(ibis.coalesce(5, None, 4), 5, id="first"), - param(ibis.coalesce(ibis.NA, 4, ibis.NA), 4, id="second"), - param(ibis.coalesce(ibis.NA, ibis.NA, 3.14), 3.14, id="third"), + param(ibis.coalesce(ibis.null(), 4, ibis.null()), 4, id="second"), + param(ibis.coalesce(ibis.null(), ibis.null(), 3.14), 3.14, id="third"), ], ) def test_coalesce(con, expr, expected): @@ -360,12 +360,12 @@ def test_coalesce(con, expr, expected): @pytest.mark.parametrize( ("expr", "expected"), [ - param(ibis.coalesce(ibis.NA, ibis.NA), None, id="all_null"), + param(ibis.coalesce(ibis.null(), ibis.null()), None, id="all_null"), param( ibis.coalesce( - ibis.NA.cast("int8"), - ibis.NA.cast("int8"), - ibis.NA.cast("int8"), + ibis.null().cast("int8"), + ibis.null().cast("int8"), + ibis.null().cast("int8"), ), None, id="all_nulls_with_all_cast", @@ -377,12 +377,12 @@ def test_coalesce_all_na(con, expr, expected): def test_coalesce_all_na_double(con): - expr = ibis.coalesce(ibis.NA, ibis.NA, ibis.NA.cast("double")) + expr = ibis.coalesce(ibis.null(), ibis.null(), ibis.null().cast("double")) assert np.isnan(con.execute(expr)) def test_numeric_builtins_work(alltypes, df): - expr = alltypes.double_col.fillna(0) + expr = alltypes.double_col.fill_null(0) result = expr.execute() expected = df.double_col.fillna(0) expected.name = 
"Coalesce()" @@ -670,7 +670,9 @@ def test_interactive_repr_shows_error(alltypes): def test_subquery(alltypes, df): t = alltypes - expr = t.mutate(d=t.double_col.fillna(0)).limit(1000).group_by("string_col").size() + expr = ( + t.mutate(d=t.double_col.fill_null(0)).limit(1000).group_by("string_col").size() + ) result = expr.execute().sort_values("string_col").reset_index(drop=True) expected = ( df.assign(d=df.double_col.fillna(0)) @@ -813,14 +815,14 @@ def test_first_last_value(alltypes, df, func, expected_index): def test_null_column(alltypes): t = alltypes nrows = t.count().execute() - expr = t.mutate(na_column=ibis.NA).na_column + expr = t.mutate(na_column=ibis.null()).na_column result = expr.execute() tm.assert_series_equal(result, pd.Series([None] * nrows, name="na_column")) def test_null_column_union(alltypes, df): t = alltypes - s = alltypes[["double_col"]].mutate(string_col=ibis.NA.cast("string")) + s = alltypes[["double_col"]].mutate(string_col=ibis.null().cast("string")) expr = t[["double_col", "string_col"]].union(s) result = expr.execute() nrows = t.count().execute() diff --git a/ibis/backends/risingwave/tests/test_functions.py b/ibis/backends/risingwave/tests/test_functions.py index 86861c2d2844..89c012e7f026 100644 --- a/ibis/backends/risingwave/tests/test_functions.py +++ b/ibis/backends/risingwave/tests/test_functions.py @@ -166,13 +166,13 @@ def test_regexp(con, expr, expected): @pytest.mark.parametrize( ("expr", "expected"), [ - param(ibis.NA.fillna(5), 5, id="filled"), - param(L(5).fillna(10), 5, id="not_filled"), + param(ibis.null().fill_null(5), 5, id="filled"), + param(L(5).fill_null(10), 5, id="not_filled"), param(L(5).nullif(5), None, id="nullif_null"), param(L(10).nullif(5), 10, id="nullif_not_null"), ], ) -def test_fillna_nullif(con, expr, expected): +def test_fill_null_nullif(con, expr, expected): assert con.execute(expr) == expected @@ -180,8 +180,8 @@ def test_fillna_nullif(con, expr, expected): ("expr", "expected"), [ 
param(ibis.coalesce(5, None, 4), 5, id="first"), - param(ibis.coalesce(ibis.NA, 4, ibis.NA), 4, id="second"), - param(ibis.coalesce(ibis.NA, ibis.NA, 3.14), 3.14, id="third"), + param(ibis.coalesce(ibis.null(), 4, ibis.null()), 4, id="second"), + param(ibis.coalesce(ibis.null(), ibis.null(), 3.14), 3.14, id="third"), ], ) def test_coalesce(con, expr, expected): @@ -191,12 +191,12 @@ def test_coalesce(con, expr, expected): @pytest.mark.parametrize( ("expr", "expected"), [ - param(ibis.coalesce(ibis.NA, ibis.NA), None, id="all_null"), + param(ibis.coalesce(ibis.null(), ibis.null()), None, id="all_null"), param( ibis.coalesce( - ibis.NA.cast("int8"), - ibis.NA.cast("int8"), - ibis.NA.cast("int8"), + ibis.null().cast("int8"), + ibis.null().cast("int8"), + ibis.null().cast("int8"), ), None, id="all_nulls_with_all_cast", @@ -208,12 +208,12 @@ def test_coalesce_all_na(con, expr, expected): def test_coalesce_all_na_double(con): - expr = ibis.coalesce(ibis.NA, ibis.NA, ibis.NA.cast("double")) + expr = ibis.coalesce(ibis.null(), ibis.null(), ibis.null().cast("double")) assert np.isnan(con.execute(expr)) def test_numeric_builtins_work(alltypes, df): - expr = alltypes.double_col.fillna(0) + expr = alltypes.double_col.fill_null(0) result = expr.execute() expected = df.double_col.fillna(0) expected.name = "Coalesce()" @@ -461,7 +461,9 @@ def test_not_exists(alltypes, df): def test_subquery(alltypes, df): t = alltypes - expr = t.mutate(d=t.double_col.fillna(0)).limit(1000).group_by("string_col").size() + expr = ( + t.mutate(d=t.double_col.fill_null(0)).limit(1000).group_by("string_col").size() + ) result = expr.execute().sort_values("string_col").reset_index(drop=True) expected = ( df.assign(d=df.double_col.fillna(0)) @@ -593,7 +595,7 @@ def test_first_last_value(alltypes, df, func, expected_index): def test_null_column(alltypes): t = alltypes nrows = t.count().execute() - expr = t.mutate(na_column=ibis.NA).na_column + expr = t.mutate(na_column=ibis.null()).na_column result = 
expr.execute() tm.assert_series_equal(result, pd.Series([None] * nrows, name="na_column")) diff --git a/ibis/backends/sql/rewrites.py b/ibis/backends/sql/rewrites.py index b8898744bbdf..19307cd3a129 100644 --- a/ibis/backends/sql/rewrites.py +++ b/ibis/backends/sql/rewrites.py @@ -111,9 +111,9 @@ def sort_to_select(_, **kwargs): return Select(_.parent, selections=_.values, sort_keys=_.keys) -@replace(p.FillNa) -def fillna_to_select(_, **kwargs): - """Rewrite FillNa to a Select node.""" +@replace(p.FillNull) +def fill_null_to_select(_, **kwargs): + """Rewrite FillNull to a Select node.""" if isinstance(_.replacements, Mapping): mapping = _.replacements else: @@ -136,9 +136,9 @@ def fillna_to_select(_, **kwargs): return Select(_.parent, selections=selections) -@replace(p.DropNa) -def dropna_to_select(_, **kwargs): - """Rewrite DropNa to a Select node.""" +@replace(p.DropNull) +def drop_null_to_select(_, **kwargs): + """Rewrite DropNull to a Select node.""" if _.subset is None: columns = [ops.Field(_.parent, name) for name in _.parent.schema.names] else: @@ -290,8 +290,8 @@ def sqlize( | project_to_select | filter_to_select | sort_to_select - | fillna_to_select - | dropna_to_select + | fill_null_to_select + | drop_null_to_select | first_to_firstvalue, context=context, ) diff --git a/ibis/backends/sqlite/tests/test_client.py b/ibis/backends/sqlite/tests/test_client.py index cafea0abe3cd..d7d5def383b0 100644 --- a/ibis/backends/sqlite/tests/test_client.py +++ b/ibis/backends/sqlite/tests/test_client.py @@ -47,7 +47,7 @@ def test_builtin_agg_udf(con): def total(x) -> float: """Totally total.""" - expr = total(con.tables.functional_alltypes.limit(2).select(n=ibis.NA).n) + expr = total(con.tables.functional_alltypes.limit(2).select(n=ibis.null()).n) result = con.execute(expr) assert result == 0.0 diff --git a/ibis/backends/tests/sql/test_sql.py b/ibis/backends/tests/sql/test_sql.py index 1f9e95542e12..6f1d116374e9 100644 --- a/ibis/backends/tests/sql/test_sql.py +++ 
b/ibis/backends/tests/sql/test_sql.py @@ -121,7 +121,7 @@ def test_coalesce(functional_alltypes, snapshot): d = functional_alltypes.double_col f = functional_alltypes.float_col - expr = ibis.coalesce((d > 30).ifelse(d, ibis.NA), ibis.NA, f).name("tmp") + expr = ibis.coalesce((d > 30).ifelse(d, ibis.null()), ibis.null(), f).name("tmp") snapshot.assert_match(to_sql(expr.name("tmp")), "out.sql") diff --git a/ibis/backends/tests/test_aggregation.py b/ibis/backends/tests/test_aggregation.py index d5cb74c7f663..4a48bdb2e77d 100644 --- a/ibis/backends/tests/test_aggregation.py +++ b/ibis/backends/tests/test_aggregation.py @@ -1464,7 +1464,10 @@ def test_grouped_case(backend, con): case_expr = ibis.case().when(table.value < 25, table.value).else_(ibis.null()).end() expr = ( - table.group_by(k="key").aggregate(mx=case_expr.max()).dropna("k").order_by("k") + table.group_by(k="key") + .aggregate(mx=case_expr.max()) + .drop_null("k") + .order_by("k") ) result = con.execute(expr) expected = pd.DataFrame({"k": [1, 2], "mx": [10, 20]}) diff --git a/ibis/backends/tests/test_generic.py b/ibis/backends/tests/test_generic.py index c9ea6f8e6ed2..cf7c1f1e3b69 100644 --- a/ibis/backends/tests/test_generic.py +++ b/ibis/backends/tests/test_generic.py @@ -118,13 +118,13 @@ def test_boolean_literal(con, backend): @pytest.mark.parametrize( ("expr", "expected"), [ - param(ibis.NA.fillna(5), 5, id="na_fillna"), - param(ibis.literal(5).fillna(10), 5, id="non_na_fillna"), + param(ibis.null().fill_null(5), 5, id="na_fill_null"), + param(ibis.literal(5).fill_null(10), 5, id="non_na_fill_null"), param(ibis.literal(5).nullif(5), None, id="nullif_null"), param(ibis.literal(10).nullif(5), 10, id="nullif_not_null"), ], ) -def test_scalar_fillna_nullif(con, expr, expected): +def test_scalar_fill_null_nullif(con, expr, expected): if expected is None: # The exact kind of null value used differs per backend (and version). # Example 1: Pandas returns np.nan while BigQuery returns None. 
@@ -159,7 +159,10 @@ def test_scalar_fillna_nullif(con, expr, expected): id="nan_col", ), param( - "none_col", ibis.NA.cast("float64"), methodcaller("isnull"), id="none_col" + "none_col", + ibis.null().cast("float64"), + methodcaller("isnull"), + id="none_col", ), ], ) @@ -211,11 +214,11 @@ def test_isna(backend, alltypes, col, value, filt): ), ], ) -def test_column_fillna(backend, alltypes, value): +def test_column_fill_null(backend, alltypes, value): table = alltypes.mutate(missing=ibis.literal(value).cast("float64")) pd_table = table.execute() - res = table.mutate(missing=table.missing.fillna(0.0)).execute() + res = table.mutate(missing=table.missing.fill_null(0.0)).execute() sol = pd_table.assign(missing=pd_table.missing.fillna(0.0)) backend.assert_frame_equal(res.reset_index(drop=True), sol.reset_index(drop=True)) @@ -224,8 +227,8 @@ def test_column_fillna(backend, alltypes, value): ("expr", "expected"), [ param(ibis.coalesce(5, None, 4), 5, id="generic"), - param(ibis.coalesce(ibis.NA, 4, ibis.NA), 4, id="null_start_end"), - param(ibis.coalesce(ibis.NA, ibis.NA, 3.14), 3.14, id="non_null_last"), + param(ibis.coalesce(ibis.null(), 4, ibis.null()), 4, id="null_start_end"), + param(ibis.coalesce(ibis.null(), ibis.null(), 3.14), 3.14, id="non_null_last"), ], ) def test_coalesce(con, expr, expected): @@ -441,21 +444,21 @@ def test_select_filter_mutate(backend, alltypes, df): backend.assert_series_equal(result.float_col, expected.float_col) -def test_table_fillna_invalid(alltypes): +def test_table_fill_null_invalid(alltypes): with pytest.raises( com.IbisTypeError, match=r"Column 'invalid_col' is not found in table" ): - alltypes.fillna({"invalid_col": 0.0}) + alltypes.fill_null({"invalid_col": 0.0}) with pytest.raises( - com.IbisTypeError, match="Cannot fillna on column 'string_col' of type.*" + com.IbisTypeError, match="Cannot fill_null on column 'string_col' of type.*" ): - alltypes[["int_col", "string_col"]].fillna(0) + alltypes[["int_col", 
"string_col"]].fill_null(0) with pytest.raises( - com.IbisTypeError, match="Cannot fillna on column 'int_col' of type.*" + com.IbisTypeError, match="Cannot fill_null on column 'int_col' of type.*" ): - alltypes.fillna({"int_col": "oops"}) + alltypes.fill_null({"int_col": "oops"}) @pytest.mark.parametrize( @@ -467,7 +470,7 @@ def test_table_fillna_invalid(alltypes): param({}, id="empty"), ], ) -def test_table_fillna_mapping(backend, alltypes, replacements): +def test_table_fill_null_mapping(backend, alltypes, replacements): table = alltypes.mutate( int_col=alltypes.int_col.nullif(1), double_col=alltypes.double_col.nullif(3.0), @@ -475,13 +478,13 @@ def test_table_fillna_mapping(backend, alltypes, replacements): ).select("id", "int_col", "double_col", "string_col") pd_table = table.execute() - result = table.fillna(replacements).execute().reset_index(drop=True) + result = table.fill_null(replacements).execute().reset_index(drop=True) expected = pd_table.fillna(replacements).reset_index(drop=True) backend.assert_frame_equal(result, expected, check_dtype=False) -def test_table_fillna_scalar(backend, alltypes): +def test_table_fill_null_scalar(backend, alltypes): table = alltypes.mutate( int_col=alltypes.int_col.nullif(1), double_col=alltypes.double_col.nullif(3.0), @@ -489,11 +492,11 @@ def test_table_fillna_scalar(backend, alltypes): ).select("id", "int_col", "double_col", "string_col") pd_table = table.execute() - res = table[["int_col", "double_col"]].fillna(0).execute().reset_index(drop=True) + res = table[["int_col", "double_col"]].fill_null(0).execute().reset_index(drop=True) sol = pd_table[["int_col", "double_col"]].fillna(0).reset_index(drop=True) backend.assert_frame_equal(res, sol, check_dtype=False) - res = table[["string_col"]].fillna("missing").execute().reset_index(drop=True) + res = table[["string_col"]].fill_null("missing").execute().reset_index(drop=True) sol = pd_table[["string_col"]].fillna("missing").reset_index(drop=True) 
backend.assert_frame_equal(res, sol, check_dtype=False) @@ -509,14 +512,14 @@ def test_mutate_rename(alltypes): assert list(result.columns) == ["bool_col", "string_col", "dupe_col"] -def test_dropna_invalid(alltypes): +def test_drop_null_invalid(alltypes): with pytest.raises( com.IbisTypeError, match=r"Column 'invalid_col' is not found in table" ): - alltypes.dropna(subset=["invalid_col"]) + alltypes.drop_null(subset=["invalid_col"]) with pytest.raises(ValidationError): - alltypes.dropna(how="invalid") + alltypes.drop_null(how="invalid") @pytest.mark.parametrize("how", ["any", "all"]) @@ -534,18 +537,18 @@ def test_dropna_invalid(alltypes): param(["col_1", "col_3"], id="one-and-three"), ], ) -def test_dropna_table(backend, alltypes, how, subset): +def test_drop_null_table(backend, alltypes, how, subset): is_two = alltypes.int_col == 2 is_four = alltypes.int_col == 4 table = alltypes.mutate( - col_1=is_two.ifelse(ibis.NA, alltypes.float_col), - col_2=is_four.ifelse(ibis.NA, alltypes.float_col), - col_3=(is_two | is_four).ifelse(ibis.NA, alltypes.float_col), + col_1=is_two.ifelse(ibis.null(), alltypes.float_col), + col_2=is_four.ifelse(ibis.null(), alltypes.float_col), + col_3=(is_two | is_four).ifelse(ibis.null(), alltypes.float_col), ).select("col_1", "col_2", "col_3") table_pandas = table.execute() - result = table.dropna(subset, how).execute().reset_index(drop=True) + result = table.drop_null(subset, how).execute().reset_index(drop=True) expected = table_pandas.dropna(how=how, subset=subset).reset_index(drop=True) backend.assert_frame_equal(result, expected) @@ -931,12 +934,12 @@ def test_logical_negation_column(backend, alltypes, df, op): [("int64", 0, 1), ("float64", 0.0, 1.0)], ) def test_zero_ifnull_literals(con, dtype, zero, expected): - assert con.execute(ibis.NA.cast(dtype).fillna(0)) == zero - assert con.execute(ibis.literal(expected, type=dtype).fillna(0)) == expected + assert con.execute(ibis.null().cast(dtype).fill_null(0)) == zero + assert 
con.execute(ibis.literal(expected, type=dtype).fill_null(0)) == expected def test_zero_ifnull_column(backend, alltypes, df): - expr = alltypes.int_col.nullif(1).fillna(0).name("tmp") + expr = alltypes.int_col.nullif(1).fill_null(0).name("tmp") result = expr.execute().astype("int32") expected = df.int_col.replace(1, 0).rename("tmp").astype("int32") backend.assert_series_equal(result, expected) diff --git a/ibis/backends/tests/test_map.py b/ibis/backends/tests/test_map.py index 491efc46f281..e637fb19a8e0 100644 --- a/ibis/backends/tests/test_map.py +++ b/ibis/backends/tests/test_map.py @@ -606,7 +606,7 @@ def test_map_get_with_incompatible_value_different_kind(con): @mark_notimpl_risingwave_hstore @mark_notyet_postgres -@pytest.mark.parametrize("null_value", [None, ibis.NA]) +@pytest.mark.parametrize("null_value", [None, ibis.null()]) def test_map_get_with_null_on_not_nullable(con, null_value): map_type = dt.Map(dt.string, dt.Int16(nullable=False)) value = ibis.literal({"A": 1000, "B": 2000}).cast(map_type) @@ -615,7 +615,7 @@ def test_map_get_with_null_on_not_nullable(con, null_value): assert pd.isna(result) -@pytest.mark.parametrize("null_value", [None, ibis.NA]) +@pytest.mark.parametrize("null_value", [None, ibis.null()]) @pytest.mark.notyet( ["flink"], raises=Py4JJavaError, reason="Flink cannot handle typeless nulls" ) diff --git a/ibis/backends/tests/test_string.py b/ibis/backends/tests/test_string.py index 7761253ebbeb..423b116e440d 100644 --- a/ibis/backends/tests/test_string.py +++ b/ibis/backends/tests/test_string.py @@ -923,7 +923,7 @@ def test_levenshtein(con, right): "expr", [ param(ibis.case().when(True, "%").end(), id="case"), - param(ibis.ifelse(True, "%", ibis.NA), id="ifelse"), + param(ibis.ifelse(True, "%", ibis.null()), id="ifelse"), ], ) def test_no_conditional_percent_escape(con, expr): diff --git a/ibis/backends/tests/test_struct.py b/ibis/backends/tests/test_struct.py index c791318f15d6..6a7429a6c2ff 100644 --- 
a/ibis/backends/tests/test_struct.py +++ b/ibis/backends/tests/test_struct.py @@ -73,7 +73,7 @@ def test_all_fields(struct, struct_df): _SIMPLE_DICT, type="struct", ) -_NULL_STRUCT_LITERAL = ibis.NA.cast("struct") +_NULL_STRUCT_LITERAL = ibis.null().cast("struct") @pytest.mark.notimpl(["postgres", "risingwave"]) diff --git a/ibis/backends/tests/test_window.py b/ibis/backends/tests/test_window.py index 88ad8e55ab15..b0684a278a5d 100644 --- a/ibis/backends/tests/test_window.py +++ b/ibis/backends/tests/test_window.py @@ -637,7 +637,7 @@ def test_simple_ungrouped_unbound_following_window( @pytest.mark.xfail_version(datafusion=["datafusion==35"]) def test_simple_ungrouped_window_with_scalar_order_by(alltypes): t = alltypes[alltypes.double_col < 50].order_by("id") - w = ibis.window(rows=(0, None), order_by=ibis.NA) + w = ibis.window(rows=(0, None), order_by=ibis.null()) expr = t.double_col.sum().over(w).name("double_col") # hard to reproduce this in pandas, so just test that it actually executes expr.execute() diff --git a/ibis/expr/api.py b/ibis/expr/api.py index 0a6455755d18..e99f4d8facb2 100644 --- a/ibis/expr/api.py +++ b/ibis/expr/api.py @@ -57,7 +57,6 @@ "Column", "Deferred", "Expr", - "NA", "Scalar", "Schema", "Table", @@ -197,35 +196,6 @@ pi = ops.Pi().to_expr() -NA = null() -"""The NULL scalar. - -This is an untyped NULL. If you want a typed NULL, use eg `ibis.null(str)`. - -Examples --------- ->>> import ibis ->>> ibis.options.interactive = True ->>> ibis.NA.isnull() -┌──────┐ -│ True │ -└──────┘ - -datatype-specific methods aren't available on `NA`: - ->>> ibis.NA.upper() # quartodoc: +EXPECTED_FAILURE -Traceback (most recent call last): - ... -AttributeError: 'NullScalar' object has no attribute 'upper' - -Instead, use the typed `ibis.null`: - ->>> ibis.null(str).upper().isnull() -┌──────┐ -│ True │ -└──────┘ -""" - deferred = _ """Deferred expression object. 
@@ -2428,7 +2398,7 @@ def coalesce(*args: Any) -> ir.Value: See Also -------- [`Value.coalesce()`](#ibis.expr.types.generic.Value.coalesce) - [`Value.fillna()`](#ibis.expr.types.generic.Value.fillna) + [`Value.fill_null()`](#ibis.expr.types.generic.Value.fill_null) Examples -------- diff --git a/ibis/expr/format.py b/ibis/expr/format.py index 00b93ae5f2f6..25d52fc721c9 100644 --- a/ibis/expr/format.py +++ b/ibis/expr/format.py @@ -264,9 +264,9 @@ def _sql_query_result(op, query, **kwargs): return top + render_fields({"query": query, "schema": schema}, 1) -@fmt.register(ops.FillNa) -@fmt.register(ops.DropNa) -def _fill_na(op, parent, **kwargs): +@fmt.register(ops.FillNull) +@fmt.register(ops.DropNull) +def _fill_null(op, parent, **kwargs): name = f"{op.__class__.__name__}[{parent}]\n" return name + render_fields(kwargs, 1) diff --git a/ibis/expr/operations/relations.py b/ibis/expr/operations/relations.py index 23f8578b6197..44735e596192 100644 --- a/ibis/expr/operations/relations.py +++ b/ibis/expr/operations/relations.py @@ -440,14 +440,14 @@ def schema(self): @public -class FillNa(Simple): +class FillNull(Simple): """Fill null values in the table.""" replacements: typing.Union[Value[dt.Numeric | dt.String], FrozenDict[str, Any]] @public -class DropNa(Simple): +class DropNull(Simple): """Drop null values in the table.""" how: typing.Literal["any", "all"] diff --git a/ibis/expr/tests/snapshots/test_format/test_fillna/fillna_dict_repr.txt b/ibis/expr/tests/snapshots/test_format/test_fill_null/fill_null_dict_repr.txt similarity index 83% rename from ibis/expr/tests/snapshots/test_format/test_fillna/fillna_dict_repr.txt rename to ibis/expr/tests/snapshots/test_format/test_fill_null/fill_null_dict_repr.txt index 960ac1160204..3ddc4fa7edaa 100644 --- a/ibis/expr/tests/snapshots/test_format/test_fillna/fillna_dict_repr.txt +++ b/ibis/expr/tests/snapshots/test_format/test_fill_null/fill_null_dict_repr.txt @@ -2,6 +2,6 @@ r0 := UnboundTable: t a int64 b string -FillNa[r0] 
+FillNull[r0] replacements: a: 3 \ No newline at end of file diff --git a/ibis/expr/tests/snapshots/test_format/test_fillna/fillna_int_repr.txt b/ibis/expr/tests/snapshots/test_format/test_fill_null/fill_null_int_repr.txt similarity index 87% rename from ibis/expr/tests/snapshots/test_format/test_fillna/fillna_int_repr.txt rename to ibis/expr/tests/snapshots/test_format/test_fill_null/fill_null_int_repr.txt index 7ffb48f8a9f9..b138e3a5fe4b 100644 --- a/ibis/expr/tests/snapshots/test_format/test_fillna/fillna_int_repr.txt +++ b/ibis/expr/tests/snapshots/test_format/test_fill_null/fill_null_int_repr.txt @@ -5,6 +5,6 @@ r0 := UnboundTable: t r1 := Project[r0] a: r0.a -FillNa[r1] +FillNull[r1] replacements: 3 \ No newline at end of file diff --git a/ibis/expr/tests/snapshots/test_format/test_fillna/fillna_str_repr.txt b/ibis/expr/tests/snapshots/test_format/test_fill_null/fill_null_str_repr.txt similarity index 88% rename from ibis/expr/tests/snapshots/test_format/test_fillna/fillna_str_repr.txt rename to ibis/expr/tests/snapshots/test_format/test_fill_null/fill_null_str_repr.txt index e23131448904..3ee225e52931 100644 --- a/ibis/expr/tests/snapshots/test_format/test_fillna/fillna_str_repr.txt +++ b/ibis/expr/tests/snapshots/test_format/test_fill_null/fill_null_str_repr.txt @@ -5,6 +5,6 @@ r0 := UnboundTable: t r1 := Project[r0] b: r0.b -FillNa[r1] +FillNull[r1] replacements: 'foo' \ No newline at end of file diff --git a/ibis/expr/tests/test_format.py b/ibis/expr/tests/test_format.py index 4c24f09c7bc6..ee8fed0a6f69 100644 --- a/ibis/expr/tests/test_format.py +++ b/ibis/expr/tests/test_format.py @@ -293,17 +293,17 @@ def test_window_group_by(snapshot): snapshot.assert_match(result, "repr.txt") -def test_fillna(snapshot): +def test_fill_null(snapshot): t = ibis.table(dict(a="int64", b="string"), name="t") - expr = t.fillna({"a": 3}) - snapshot.assert_match(repr(expr), "fillna_dict_repr.txt") + expr = t.fill_null({"a": 3}) + snapshot.assert_match(repr(expr), 
"fill_null_dict_repr.txt") - expr = t[["a"]].fillna(3) - snapshot.assert_match(repr(expr), "fillna_int_repr.txt") + expr = t[["a"]].fill_null(3) + snapshot.assert_match(repr(expr), "fill_null_int_repr.txt") - expr = t[["b"]].fillna("foo") - snapshot.assert_match(repr(expr), "fillna_str_repr.txt") + expr = t[["b"]].fill_null("foo") + snapshot.assert_match(repr(expr), "fill_null_str_repr.txt") def test_asof_join(snapshot): diff --git a/ibis/expr/types/generic.py b/ibis/expr/types/generic.py index 34603095b388..67e91bea3b7f 100644 --- a/ibis/expr/types/generic.py +++ b/ibis/expr/types/generic.py @@ -290,7 +290,7 @@ def coalesce(self, *args: Value) -> Value: See Also -------- [`ibis.coalesce()`](./expression-generic.qmd#ibis.coalesce) - [`Value.fillna()`](./expression-generic.qmd#ibis.expr.types.generic.Value.fillna) + [`Value.fill_null()`](./expression-generic.qmd#ibis.expr.types.generic.Value.fill_null) Examples -------- @@ -358,13 +358,13 @@ def typeof(self) -> ir.StringValue: """ return ops.TypeOf(self).to_expr() - def fillna(self, fill_value: Scalar) -> Value: + def fill_null(self, fill_value: Scalar) -> Value: """Replace any null values with the indicated fill value. 
Parameters ---------- fill_value - Value with which to replace `NA` values in `self` + Value with which to replace `NULL` values in `self` See Also -------- @@ -388,7 +388,7 @@ def fillna(self, fill_value: Scalar) -> Value: │ NULL │ │ female │ └────────┘ - >>> t.sex.fillna("unrecorded").name("sex") + >>> t.sex.fill_null("unrecorded").name("sex") ┏━━━━━━━━━━━━┓ ┃ sex ┃ ┡━━━━━━━━━━━━┩ @@ -404,10 +404,15 @@ def fillna(self, fill_value: Scalar) -> Value: Returns ------- Value - `self` filled with `fill_value` where it is `NA` + `self` filled with `fill_value` where it is `NULL` """ return ops.Coalesce((self, fill_value)).to_expr() + @deprecated(as_of="9.1", instead="use fill_null instead") + def fillna(self, fill_value: Scalar) -> Value: + """Deprecated - use `fill_null` instead.""" + return self.fill_null(fill_value) + def nullif(self, null_if_expr: Value) -> Value: """Set values to null if they equal the values `null_if_expr`. diff --git a/ibis/expr/types/joins.py b/ibis/expr/types/joins.py index 4f890439d67c..65aa3c911b5f 100644 --- a/ibis/expr/types/joins.py +++ b/ibis/expr/types/joins.py @@ -404,7 +404,7 @@ def select(self, *args, **kwargs): drop = finished(Table.drop) dropna = finished(Table.dropna) execute = finished(Table.execute) - fillna = finished(Table.fillna) + fill_null = finished(Table.fill_null) filter = finished(Table.filter) group_by = finished(Table.group_by) intersect = finished(Table.intersect) diff --git a/ibis/expr/types/relations.py b/ibis/expr/types/relations.py index 42ed95a7b7f2..a46b5725ccd0 100644 --- a/ibis/expr/types/relations.py +++ b/ibis/expr/types/relations.py @@ -2490,7 +2490,7 @@ def filter( │ Adelie │ Torgersen │ 42.0 │ 20.2 │ 190 │ … │ │ … │ … │ … │ … │ … │ … │ └─────────┴───────────┴────────────────┴───────────────┴───────────────────┴───┘ - >>> t.filter([t.species == "Adelie", t.body_mass_g > 3500]).sex.value_counts().dropna( + >>> t.filter([t.species == "Adelie", t.body_mass_g > 3500]).sex.value_counts().drop_null( ... 
"sex" ... ).order_by("sex") ┏━━━━━━━━┳━━━━━━━━━━━┓ @@ -2596,7 +2596,7 @@ def count(self, where: ir.BooleanValue | None = None) -> ir.IntegerScalar: (where,) = bind(self, where) return ops.CountStar(self, where=where).to_expr() - def dropna( + def drop_null( self, subset: Sequence[str] | str | None = None, how: Literal["any", "all"] = "any", @@ -2645,27 +2645,27 @@ def dropna( ┌─────┐ │ 344 │ └─────┘ - >>> t.dropna(["bill_length_mm", "body_mass_g"]).count() + >>> t.drop_null(["bill_length_mm", "body_mass_g"]).count() ┌─────┐ │ 342 │ └─────┘ - >>> t.dropna(how="all").count() # no rows where all columns are null + >>> t.drop_null(how="all").count() # no rows where all columns are null ┌─────┐ │ 344 │ └─────┘ """ if subset is not None: subset = self.bind(subset) - return ops.DropNa(self, how, subset).to_expr() + return ops.DropNull(self, how, subset).to_expr() - def fillna( + def fill_null( self, replacements: ir.Scalar | Mapping[str, ir.Scalar], ) -> Table: """Fill null values in a table expression. ::: {.callout-note} - ## There is potential lack of type stability with the `fillna` API + ## There is potential lack of type stability with the `fill_null` API For example, different library versions may impact whether a given backend promotes integer replacement values to floats. @@ -2678,6 +2678,11 @@ def fillna( keys are column names that map to their replacement value. If passed as a scalar all columns are filled with that value. 
+ Returns + ------- + Table + Table expression + Examples -------- >>> import ibis @@ -2701,7 +2706,7 @@ def fillna( │ NULL │ │ … │ └────────┘ - >>> t.fillna({"sex": "unrecorded"}).sex + >>> t.fill_null({"sex": "unrecorded"}).sex ┏━━━━━━━━━━━━┓ ┃ sex ┃ ┡━━━━━━━━━━━━┩ @@ -2719,11 +2724,6 @@ def fillna( │ unrecorded │ │ … │ └────────────┘ - - Returns - ------- - Table - Table expression """ schema = self.schema() @@ -2740,7 +2740,59 @@ def fillna( val_type = val.type() if isinstance(val, Expr) else dt.infer(val) if not val_type.castable(col_type): raise com.IbisTypeError( - f"Cannot fillna on column {col!r} of type {col_type} with a " + f"Cannot fill_null on column {col!r} of type {col_type} with a " + f"value of type {val_type}" + ) + else: + val_type = ( + replacements.type() + if isinstance(replacements, Expr) + else dt.infer(replacements) + ) + for col, col_type in schema.items(): + if col_type.nullable and not val_type.castable(col_type): + raise com.IbisTypeError( + f"Cannot fill_null on column {col!r} of type {col_type} with a " + f"value of type {val_type} - pass in an explicit mapping " + f"of fill values to `fill_null` instead." + ) + return ops.FillNull(self, replacements).to_expr() + + @deprecated(as_of="9.1", instead="use drop_null instead") + def dropna( + self, + subset: Sequence[str] | str | None = None, + how: Literal["any", "all"] = "any", + ) -> Table: + """Deprecated - use `drop_null` instead.""" + + if subset is not None: + subset = self.bind(subset) + return self.drop_null(subset, how) + + @deprecated(as_of="9.1", instead="use fill_null instead") + def fillna( + self, + replacements: ir.Scalar | Mapping[str, ir.Scalar], + ) -> Table: + """Deprecated - use `fill_null` instead.""" + + schema = self.schema() + + if isinstance(replacements, Mapping): + for col, val in replacements.items(): + if col not in schema: + columns_formatted = ", ".join(map(repr, schema.names)) + raise com.IbisTypeError( + f"Column {col!r} is not found in table. 
" + f"Existing columns: {columns_formatted}." + ) from None + + col_type = schema[col] + val_type = val.type() if isinstance(val, Expr) else dt.infer(val) + if not val_type.castable(col_type): + raise com.IbisTypeError( + f"Cannot fill_null on column {col!r} of type {col_type} with a " f"value of type {val_type}" ) else: @@ -2756,7 +2808,7 @@ def fillna( f"value of type {val_type} - pass in an explicit mapping " f"of fill values to `fillna` instead." ) - return ops.FillNa(self, replacements).to_expr() + return self.fill_null(replacements) def unpack(self, *columns: str) -> Table: """Project the struct fields of each of `columns` into `self`. @@ -3699,7 +3751,7 @@ def pivot_longer( ... names_transform=int, ... values_to="rank", ... values_transform=_.cast("int"), - ... ).dropna("rank") + ... ).drop_null("rank") ┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━┓ ┃ artist ┃ track ┃ date_entered ┃ week ┃ rank ┃ ┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━┩ diff --git a/ibis/tests/benchmarks/benchfuncs.py b/ibis/tests/benchmarks/benchfuncs.py index 4e7ebef08d68..f9c78a8464fc 100644 --- a/ibis/tests/benchmarks/benchfuncs.py +++ b/ibis/tests/benchmarks/benchfuncs.py @@ -39,8 +39,8 @@ def is_nan_like(col: ir.Value) -> ir.BooleanValue: if not col.type().is_string(): return col.isnull() result = col.isnull() - result |= col.lower().isin(NAN_LIKE).fillna(False) - result |= ~contains_alphanum(col).fillna(False) + result |= col.lower().isin(NAN_LIKE).fill_null(False) + result |= ~contains_alphanum(col).fill_null(False) return result @@ -131,7 +131,7 @@ def norm_whitespace(s: ir.StringValue) -> ir.StringValue: def to_ascii(s: ir.StringValue) -> ir.StringValue: """Remove any non-ascii characters.""" - # return norm_whitespace(s.fillna("").apply(unidecode).astype(s.dtype)) + # return norm_whitespace(s.fill_null("").apply(unidecode).astype(s.dtype)) # We don't have access to the unidecode function, so just strip out # non-ascii characters s = 
s.cast("string") @@ -144,7 +144,7 @@ def num_tokens(s: ir.StringValue) -> ir.IntegerValue: s = s.re_replace(r"\s+", " ") s = s.strip() s = s.nullif("") - return s.split(" ").length().fillna(0) + return s.split(" ").length().fill_null(0) NAME_COLUMNS = [ @@ -241,8 +241,8 @@ def filter_tokens(t, col): def choose_longer(s1: ir.StringColumn, s2: ir.StringColumn) -> ir.StringColumn: - l1 = s1.length().fillna(0) - l2 = s2.length().fillna(0) + l1 = s1.length().fill_null(0) + l2 = s2.length().fill_null(0) return (l1 > l2).ifelse(s1, s2) @@ -277,12 +277,14 @@ def parse_middle( b = first.re_extract(pattern, 2).nullif("") # Deal with "Kay Ellen", "E" should yield "Kay", "Ellen" - middle_is_middle = (starts_with(middle, b) | starts_with(b, middle)).fillna(False) + middle_is_middle = (starts_with(middle, b) | starts_with(b, middle)).fill_null( + False + ) result_first = middle_is_middle.ifelse(a, first) result_middle = middle_is_middle.ifelse(choose_longer(b, middle), middle) - al = a.length().fillna(0) - bl = b.length().fillna(0) + al = a.length().fill_null(0) + bl = b.length().fill_null(0) short_long = (al == 1) & (bl > 1) # A Jones long_short = (al > 1) & (bl == 1) # Alice J idx &= short_long | long_short @@ -290,7 +292,7 @@ def parse_middle( # Many rows are of the form first_name="H Daniel", last_name="Hull" # where the first token of the first name is actually the # first letter of the last name. Catch this. 
- first_is_last = starts_with(last, a).fillna(False) + first_is_last = starts_with(last, a).fill_null(False) fil = idx & first_is_last result_first = fil.ifelse(b, result_first) result_middle = fil.ifelse(ibis.null(), result_middle) @@ -302,8 +304,8 @@ def parse_middle( # Correct for when the last name is "A Jones" a = last.re_extract(pattern, 1).nullif("") b = last.re_extract(pattern, 2).nullif("") - al = a.length().fillna(0) - bl = b.length().fillna(0) + al = a.length().fill_null(0) + bl = b.length().fill_null(0) idx = (al == 1) & (bl > 1) # A Jones idx &= middle.isnull() result_middle = idx.ifelse(a, result_middle) @@ -325,7 +327,7 @@ def fix_nickname_is_middle(t: ir.Table) -> ir.Table: Watch out for when the nickname is probably not related to the middle name, Such as with 'Carolyn "Care" c smith' (Care is short for Carolyn, not the middle) """ - todo = starts_with(t["nickname"], t["middle_name"]).fillna(False) + todo = starts_with(t["nickname"], t["middle_name"]).fill_null(False) # Get rid of the 'Carolyn "Care" c smith' case todo &= ~starts_with(t["first_name"], t["middle_name"]) return t.mutate(middle_name=todo.ifelse(t.nickname, t.middle_name)) @@ -357,7 +359,7 @@ def fix_last_comma_first(t: ir.Table) -> ir.Table: a = norm_whitespace(a) b = norm_whitespace(b) one_each = (num_tokens(a) == 1) & (num_tokens(b) == 1) - first_empty = t.first_name.strip().fillna("") == "" + first_empty = t.first_name.strip().fill_null("") == "" todo = first_empty & one_each return t.mutate( first_name=todo.ifelse(b, t.first_name), diff --git a/ibis/tests/expr/test_decimal.py b/ibis/tests/expr/test_decimal.py index 5aac8c276542..34be555f5da6 100644 --- a/ibis/tests/expr/test_decimal.py +++ b/ibis/tests/expr/test_decimal.py @@ -106,11 +106,11 @@ def test_ifelse(lineitem): assert isinstance(expr, ir.DecimalScalar) -def test_fillna(lineitem): - expr = lineitem.l_extendedprice.fillna(0) +def test_fill_null(lineitem): + expr = lineitem.l_extendedprice.fill_null(0) assert 
isinstance(expr, ir.DecimalColumn) - expr = lineitem.l_extendedprice.fillna(lineitem.l_quantity) + expr = lineitem.l_extendedprice.fill_null(lineitem.l_quantity) assert isinstance(expr, ir.DecimalColumn) diff --git a/ibis/tests/expr/test_sql_builtins.py b/ibis/tests/expr/test_sql_builtins.py index 0d773e85ac88..4c4b785f703a 100644 --- a/ibis/tests/expr/test_sql_builtins.py +++ b/ibis/tests/expr/test_sql_builtins.py @@ -83,9 +83,9 @@ def test_group_concat(functional_alltypes): def test_zero_ifnull(functional_alltypes): - dresult = functional_alltypes.double_col.fillna(0) + dresult = functional_alltypes.double_col.fill_null(0) - iresult = functional_alltypes.int_col.fillna(0) + iresult = functional_alltypes.int_col.fill_null(0) assert type(dresult.op()) == ops.Coalesce assert type(dresult) == ir.FloatingColumn @@ -94,17 +94,17 @@ def test_zero_ifnull(functional_alltypes): assert type(iresult) == type(iresult) -def test_fillna(functional_alltypes): - result = functional_alltypes.double_col.fillna(5) +def test_fill_null(functional_alltypes): + result = functional_alltypes.double_col.fill_null(5) assert isinstance(result, ir.FloatingColumn) assert isinstance(result.op(), ops.Coalesce) - result = functional_alltypes.bool_col.fillna(True) + result = functional_alltypes.bool_col.fill_null(True) assert isinstance(result, ir.BooleanColumn) # Highest precedence type - result = functional_alltypes.int_col.fillna(functional_alltypes.bigint_col) + result = functional_alltypes.int_col.fill_null(functional_alltypes.bigint_col) assert isinstance(result, ir.IntegerColumn) diff --git a/ibis/tests/expr/test_table.py b/ibis/tests/expr/test_table.py index 29b2e01637b1..4dab6aee362a 100644 --- a/ibis/tests/expr/test_table.py +++ b/ibis/tests/expr/test_table.py @@ -569,7 +569,7 @@ def test_order_by_asc_deferred_sort_key(table): @pytest.mark.parametrize( ("key", "expected"), [ - param(ibis.NA, ibis.NA.op(), id="na"), + param(ibis.null(), ibis.null().op(), id="na"), param(rand, rand.op(), 
id="random"), param(1.0, ibis.literal(1.0).op(), id="float"), param(ibis.literal("a"), ibis.literal("a").op(), id="string"), @@ -1731,8 +1731,8 @@ def test_unbound_table_using_class_definition(): def test_mutate_chain(): one = ibis.table([("a", "string"), ("b", "string")], name="t") - two = one.mutate(b=lambda t: t.b.fillna("Short Term")) - three = two.mutate(a=lambda t: t.a.fillna("Short Term")) + two = one.mutate(b=lambda t: t.b.fill_null("Short Term")) + three = two.mutate(a=lambda t: t.a.fill_null("Short Term")) values = three.op().values assert isinstance(values["a"], ops.Coalesce) @@ -1743,8 +1743,8 @@ def test_mutate_chain(): assert three_opt == ops.Project( parent=one, values={ - "a": one.a.fillna("Short Term"), - "b": one.b.fillna("Short Term"), + "a": one.a.fill_null("Short Term"), + "b": one.b.fill_null("Short Term"), }, ) @@ -2191,3 +2191,17 @@ def utter_failure(x): with pytest.raises(ValueError, match="¡moo!"): t.bind(foo=utter_failure) + + +# TODO: remove when dropna is fully deprecated +def test_table_dropna_depr_warn(): + t = ibis.memtable([{"a": 1, "b": None}, {"a": 2, "b": "baz"}]) + with pytest.warns(FutureWarning, match="v9.1"): + t.dropna() + + +# TODO: remove when fillna is fully deprecated +def test_table_fillna_depr_warn(): + t = ibis.memtable([{"a": 1, "b": None}, {"a": 2, "b": "baz"}]) + with pytest.warns(FutureWarning, match="v9.1"): + t.fillna({"b": "missing"}) diff --git a/ibis/tests/expr/test_timestamp.py b/ibis/tests/expr/test_timestamp.py index 454355458a62..45b8972d5dc4 100644 --- a/ibis/tests/expr/test_timestamp.py +++ b/ibis/tests/expr/test_timestamp.py @@ -106,7 +106,7 @@ def test_greater_comparison_pandas_timestamp(alltypes): def test_timestamp_precedence(): ts = ibis.literal(datetime.now()) - highest_type = rlz.highest_precedence_dtype([ibis.NA.op(), ts.op()]) + highest_type = rlz.highest_precedence_dtype([ibis.null().op(), ts.op()]) assert highest_type == dt.timestamp diff --git a/ibis/tests/expr/test_value_exprs.py 
b/ibis/tests/expr/test_value_exprs.py index ec8e3a224750..e7b57376052a 100644 --- a/ibis/tests/expr/test_value_exprs.py +++ b/ibis/tests/expr/test_value_exprs.py @@ -352,7 +352,7 @@ def test_notnull(table): @pytest.mark.parametrize( "value", - [None, ibis.NA, ibis.literal(None, type="int32")], + [None, ibis.null(), ibis.literal(None, type="int32")], ids=["none", "NA", "typed-null"], ) def test_null_eq_and_ne(table, value): @@ -648,7 +648,7 @@ def test_or_(table): def test_null_column(): t = ibis.table([("a", "string")], name="t") - s = t.mutate(b=ibis.NA) + s = t.mutate(b=ibis.null()) assert s.b.type() == dt.null assert isinstance(s.b, ir.NullColumn) @@ -657,8 +657,8 @@ def test_null_column_union(): s = ibis.table([("a", "string"), ("b", "double")]) t = ibis.table([("a", "string")]) with pytest.raises(ibis.common.exceptions.RelationError): - s.union(t.mutate(b=ibis.NA)) # needs a type - assert s.union(t.mutate(b=ibis.NA.cast("double"))).schema() == s.schema() + s.union(t.mutate(b=ibis.null())) # needs a type + assert s.union(t.mutate(b=ibis.null().cast("double"))).schema() == s.schema() def test_string_compare_numeric_array(table): @@ -843,12 +843,12 @@ def test_substitute_dict(): ) assert_equal(result, expected) - result = table.foo.substitute(subs, else_=ibis.NA) + result = table.foo.substitute(subs, else_=ibis.null()) expected = ( ibis.case() .when(table.foo == "a", "one") .when(table.foo == "b", table.bar) - .else_(ibis.NA) + .else_(ibis.null()) .end() ) assert_equal(result, expected) @@ -925,8 +925,8 @@ def test_generic_value_api_no_arithmetic(value, operation): @pytest.mark.parametrize( ("value", "expected"), [(5, dt.int8), (5.4, dt.double), ("abc", dt.string)] ) -def test_fillna_null(value, expected): - assert ibis.NA.fillna(value).type().equals(expected) +def test_fill_null_null(value, expected): + assert ibis.null().fill_null(value).type().equals(expected) @pytest.mark.parametrize( @@ -1229,7 +1229,7 @@ def 
test_map_get_with_incompatible_value_different_kind(): assert value.get("C", 3.0).type() == dt.float64 -@pytest.mark.parametrize("null_value", [None, ibis.NA]) +@pytest.mark.parametrize("null_value", [None, ibis.null()]) def test_map_get_with_null_on_not_nullable(null_value): map_type = dt.Map(dt.string, dt.Int16(nullable=False)) value = ibis.literal({"A": 1000, "B": 2000}).cast(map_type) @@ -1238,14 +1238,14 @@ def test_map_get_with_null_on_not_nullable(null_value): assert expr.type() == dt.Int16(nullable=True) -@pytest.mark.parametrize("null_value", [None, ibis.NA]) +@pytest.mark.parametrize("null_value", [None, ibis.null()]) def test_map_get_with_null_on_nullable(null_value): value = ibis.literal({"A": 1000, "B": None}) result = value.get("C", null_value) assert result.type().nullable -@pytest.mark.parametrize("null_value", [None, ibis.NA]) +@pytest.mark.parametrize("null_value", [None, ibis.null()]) def test_map_get_with_null_on_null_type_with_null(null_value): value = ibis.literal({"A": None, "B": None}) result = value.get("C", null_value) @@ -1378,13 +1378,13 @@ def test_repr_list_of_lists_in_table(): @pytest.mark.parametrize( ("expr", "expected_type"), [ - (ibis.coalesce(ibis.NA, 1), dt.int8), - (ibis.coalesce(1, ibis.NA), dt.int8), - (ibis.coalesce(ibis.NA, 1000), dt.int16), - (ibis.coalesce(ibis.NA), dt.null), - (ibis.coalesce(ibis.NA, ibis.NA), dt.null), + (ibis.coalesce(ibis.null(), 1), dt.int8), + (ibis.coalesce(1, ibis.null()), dt.int8), + (ibis.coalesce(ibis.null(), 1000), dt.int16), + (ibis.coalesce(ibis.null()), dt.null), + (ibis.coalesce(ibis.null(), ibis.null()), dt.null), ( - ibis.coalesce(ibis.NA, ibis.NA.cast("array")), + ibis.coalesce(ibis.null(), ibis.null().cast("array")), dt.Array(dt.string), ), ], @@ -1508,14 +1508,14 @@ def test_deferred_r_ops(op_name, expected_left, expected_right): @pytest.mark.parametrize( ("expr_fn", "expected_type"), [ - (lambda t: ibis.ifelse(t.a == 1, t.b, ibis.NA), dt.string), + (lambda t: ibis.ifelse(t.a == 1, 
t.b, ibis.null()), dt.string), (lambda t: ibis.ifelse(t.a == 1, t.b, t.a.cast("string")), dt.string), ( lambda t: ibis.ifelse(t.a == 1, t.b, t.a.cast("!string")), dt.string.copy(nullable=False), ), - (lambda _: ibis.ifelse(True, ibis.NA, ibis.NA), dt.null), - (lambda _: ibis.ifelse(False, ibis.NA, ibis.NA), dt.null), + (lambda _: ibis.ifelse(True, ibis.null(), ibis.null()), dt.null), + (lambda _: ibis.ifelse(False, ibis.null(), ibis.null()), dt.null), ], ) def test_non_null_with_null_precedence(expr_fn, expected_type): @@ -1728,3 +1728,10 @@ def test_in_subquery_shape(): expr = ibis.literal(2).isin(t.a) assert expr.op().shape.is_scalar() + + +# TODO: remove when fillna is fully deprecated +def test_value_fillna_depr_warn(): + t = ibis.memtable([{"a": 1, "b": None}, {"a": 2, "b": "baz"}]) + with pytest.warns(FutureWarning, match="v9.1"): + t.b.fillna("missing") diff --git a/ibis/tests/expr/test_window_frames.py b/ibis/tests/expr/test_window_frames.py index 2e88f2c2cac0..5560a3608501 100644 --- a/ibis/tests/expr/test_window_frames.py +++ b/ibis/tests/expr/test_window_frames.py @@ -234,7 +234,7 @@ def test_window_api_supports_value_expressions(t): def test_window_api_supports_scalar_order_by(t): - window = ibis.window(order_by=ibis.NA) + window = ibis.window(order_by=ibis.null()) expr = t.a.sum().over(window).op() expected = ops.WindowFunction( t.a.sum(), @@ -242,7 +242,7 @@ def test_window_api_supports_scalar_order_by(t): start=None, end=None, group_by=(), - order_by=(ibis.NA.op(),), + order_by=(ibis.null().op(),), ) assert expr == expected diff --git a/ibis/tests/test_api.py b/ibis/tests/test_api.py index fc672a4af8ed..b3c68a0ce3de 100644 --- a/ibis/tests/test_api.py +++ b/ibis/tests/test_api.py @@ -69,3 +69,8 @@ def test_no_import(module): assert "{module}" not in sys.modules """ subprocess.run([sys.executable, "-c", script], check=True) + + +def test_ibis_na_deprecation_warning(): + with pytest.warns(DeprecationWarning, match="'ibis.NA' is deprecated as of v9.1"): 
+ ibis.NA # noqa: B018