
BC Assessment XREF Update

Dylan Barkowsky edited this page Oct 31, 2024 · 5 revisions

Problem Statement

When PIMS retrieves data from the BC Assessment API for a point on the map, the API returns all possible folios from properties at that location.

For locations with only a single property, this is not an issue. For locations with multiple overlapping parcels, however, the BC Assessment data contains no attribute that allows PIMS to match folios to parcels.

Solution

The BC Data Catalogue contains a cross-reference (XREF) table that matches PIDs (attached to parcels) to Jurisdiction Codes and Roll Numbers (attached to folios). Because the catalogue does not offer unrestricted access via an API, PIMS must store a copy of this table in order to match parcels and folios.

The table name is jur_roll_pid_xref. When a request for a parcel in PIMS is made, the XREF table is queried by the PID to find the matching Jurisdiction Code and Roll Number. The frontend then uses this information to render the correct data from BC Assessment.

Risk

Because parcels and folios in BC change constantly, this dataset goes out of date quickly. A handful of records may be updated each day, so the local XREF table drifts further and further out of date.

The chance that this affects an end user is low. Only a small subset of Administrator users has access to the BC Assessment data, and the chance of needing the data for one specific parcel is less than 1 in 2.1 million, based on the number of possible parcels.

Update Procedure

This data should be periodically updated according to a schedule agreed upon by the Product Owner and the IMB development team.

Follow these steps to update the XREF data:

Getting New Data

New sets of data are downloaded from the Jurisdiction, Roll Number, PID Cross Reference Table.

You must be logged in with your IDIR to view this table.

  1. Select Access/Download from the right side.
  2. Set the download options to receive the CSV file, enter your email, and submit. (screenshot)
  3. After a short amount of time, a link to the CSV file will appear in your inbox. Follow it to download the file.

Cleaning the Data

Unfortunately, the data from the BC Data Catalogue is not clean: it contains empty rows and many duplicate values, which make the upload process more difficult.

To clean the CSV file, run this command:

## Command to run:
awk -F, -v OFS=',' '{print $2,$3,$4}' PM_JRL_PID.csv | awk '!a[$0]++' | awk '!/^,,$/' > upload_unique.csv 

## Here we explain what each part does:
# Print only the three columns we're interested in from this file
awk -F, -v OFS=',' '{print $2,$3,$4}' PM_JRL_PID.csv
# Remove duplicates
awk '!a[$0]++'
# Remove lines with only ,,
awk '!/^,,$/'
# Write to this new file
> upload_unique.csv
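The pipeline can be sanity-checked on a small stand-in file before running it against the real export. This is only a sketch — the sample rows and header names below are made up, but the file has the same kinds of defects the pipeline targets (duplicate rows and empty fields):

```shell
# Build a small sample with the same shape of problems as the export:
# duplicate rows and a row whose columns 2-4 are empty.
cat > sample.csv <<'EOF'
id,jur,roll,pid
1,302,123456,987654321
2,302,123456,987654321
3,,,
4,308,654321,123456789
EOF

# Same pipeline as above: keep columns 2-4, drop duplicates, drop ",," lines.
awk -F, -v OFS=',' '{print $2,$3,$4}' sample.csv | awk '!a[$0]++' | awk '!/^,,$/' > sample_unique.csv

cat sample_unique.csv
```

The output keeps the header plus the two distinct records: the duplicate row is collapsed to one, and the empty row (which reduces to `,,` after the column cut) is dropped.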

Connecting to the Database

The Crunchy Data instance that manages the PIMS database is housed in OpenShift. To access this database, log in to the OpenShift Portal.

If you do not have the OpenShift CLI installed, do so now.

From OpenShift, copy the CLI login command by clicking on your profile and selecting the Copy Login Command option.

(screenshot)

The page will ask you to log in again. Afterwards, click Display Token and copy the command under Log in with this token. It will look something like this: oc login --token .....

Use that command in the terminal to log in to OpenShift CLI.

Make sure you are connected to the correct project. Change projects with the command:

oc project <projectname>

Once connected, port-forward a connection to the pgbouncer service.

oc port-forward service/pims-crunchy-pgbouncer 22226:5432

In this example, we are forwarding a local port (22226) to the port of the database (5432).
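If you want to verify the forwarded connection from the command line before configuring DBeaver, the standard PostgreSQL client can be pointed at the local port. This is a sketch only — the username and database placeholders below are not from this project's configuration; the real values come from Vault:

```shell
# Placeholders: substitute the username and database for your environment.
# A successful connection returns a single row containing 1.
psql -h localhost -p 22226 -U <username> -d <database> -c 'SELECT 1;'
```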

To connect to the database, use DBeaver. We will also use this tool for the data upload. Download it now if you do not already have it installed.

Add a connection in DBeaver. Specific drivers for your device may be needed to connect with PostgreSQL. Configuration settings are minimal, but you will need to obtain the username and password for your project environment (DEV, TEST, or PROD) from Vault.

(screenshot)

At this point, you should be able to query the database from DBeaver. If so, you are ready for the final step.

Uploading Data

  1. Right-click on the jur_roll_pid_xref table and select Import Data. (screenshot)
  2. Choose to upload data from CSV. (screenshot)
  3. Ensure that the connection is to the intended database. Click Configure to access more details. (screenshot)
  4. Match the columns from the CSV to the columns in the database. Click on the Target Column to select the correct target; the option at the end should change to existing, since we do not want DBeaver to create additional columns. The Target Type is also important: match the types shown in the screenshot. (screenshot)
  5. Use the settings shown in the screenshot for the import. The table must be truncated each time for a clean refresh of the data; bulk load is much faster. (screenshot)
  6. Click Proceed to start the upload. At the time of writing, it takes about 1 minute for the process to complete.
  7. Verify that the records were uploaded correctly by querying the table.

-- Some possible queries to check.
SELECT COUNT(*) FROM public.jur_roll_pid_xref;
-- The count should be the same as the count of the CSV minus one line for the headers.
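The expected count can be computed directly from the cleaned CSV before comparing it against the table. A minimal sketch, using a two-record stand-in file (with made-up header names) in place of the real cleaned export so as not to touch the actual upload file:

```shell
# Stand-in for the cleaned CSV; the real export has far more rows.
printf 'jur,roll,pid\n302,123456,987654321\n308,654321,123456789\n' > sample_upload.csv

# Expected record count = lines in the CSV minus one for the header.
expected=$(( $(wc -l < sample_upload.csv) - 1 ))
echo "$expected"   # 2
```

If `SELECT COUNT(*)` on the table matches this number for the real file, the upload completed cleanly.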