1746 Add SR CSV download for Address and Count #1752

Merged
merged 3 commits into from
Jun 28, 2024

Member

@kdow kdow commented Jun 8, 2024

Fixes #1746

Notes:

  • I had to change the table from requests to requests_2024 for the queries to work. Should this be left as requests or requests_2024 for now?

  • This implementation will only generate the SR count CSV if all of the following conditions are met: exactly one SR type is selected, an NC is selected, and the status is Open. Should any of those requirements be changed?

  • I've verified that the CSV is generated in the specified conditions, along with NeighborhoodData.csv that was previously implemented

    • Up to date with main branch
    • Branch name follows guidelines
    • All PR Status checks are successful
    • Peer reviewed and approved
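The three export conditions listed in the notes above can be expressed as a single predicate. This is only a minimal sketch: the function name and the filter field names (`requestTypes`, `councilId`, `requestStatus`) are hypothetical and may not match the app's actual Redux state.

```javascript
// Hypothetical filter shape; the real Redux state in the app may differ.
function canExportSrCount({ requestTypes, councilId, requestStatus }) {
  // Condition 1: exactly one SR type is selected.
  const exactlyOneType = Array.isArray(requestTypes) && requestTypes.length === 1;
  // Condition 2: a neighborhood council (NC) is selected.
  const hasNc = councilId !== null && councilId !== undefined;
  // Condition 3: the status filter is Open.
  const isOpen = requestStatus === 'Open';
  return exactlyOneType && hasNc && isOpen;
}

console.log(canExportSrCount({ requestTypes: ['Graffiti'], councilId: 52, requestStatus: 'Open' }));
```

Keeping the check in one place makes it easier to relax a requirement later if the answer to the question above is yes.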

Any questions? See the getting started guide

@Skydodle Skydodle self-requested a review June 8, 2024 22:48
@Skydodle
Member

Skydodle commented Jun 8, 2024

Hi @kdow, welcome and thanks for your hard work!

I'm addressing the question regarding the table name querying:

Background
Up until the last couple of months, our app could only display the current year's data, which is why older files like ExportButton.jsx use a static table name like requests: we only needed 2024 data at the time. We recently restructured our data setup and added the ability to display multiple years, so table names are now generated dynamically by year, i.e. `requests_${year}`: the requests_2024 table contains 2024 data, the requests_2023 table contains 2023 data, etc.

Dynamic Queries
We want users to be able to export data for the date range they selected on the UI calendar, so it's not ideal to hardcode the query to the requests_2024 table. Instead, we should make the table name dynamic, as `requests_${year}`, based on the user's currently selected start and end dates. We can extract the start and end years from the Redux props filters.startDate and filters.endDate, which are also dynamic and update whenever the user changes the date range selection on the UI calendar.

Year extraction implementations can be found in components/Map/index.js using the Moment.js library specifically in the getAllRequests function.
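As a rough illustration of the dynamic table naming described above, the sketch below derives the `requests_${year}` table names from a start and end date. It assumes the dates arrive as 'YYYY-MM-DD' strings and uses plain string slicing so it runs without dependencies; the app itself uses Moment.js for this, and the helper names here are hypothetical.

```javascript
// Assumption: dates are 'YYYY-MM-DD' strings, so the year is the first four characters.
function yearOf(dateString) {
  const year = Number(dateString.slice(0, 4));
  if (!Number.isInteger(year)) {
    throw new Error(`Unrecognized date: ${dateString}`);
  }
  return year;
}

// Returns one table name per calendar year touched by the selected range.
function tableNamesForRange(startDate, endDate) {
  const startYear = yearOf(startDate);
  const endYear = yearOf(endDate);
  if (endYear < startYear) {
    throw new Error('endDate must not be earlier than startDate');
  }
  const names = [];
  for (let year = startYear; year <= endYear; year += 1) {
    names.push(`requests_${year}`);
  }
  return names;
}

console.log(tableNamesForRange('2023-12-15', '2024-01-10'));
// → [ 'requests_2023', 'requests_2024' ]
```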

Consider Cross-Year Dates
Since our data tables are segregated by year, the query logic needs to consider when the user selects different years in startDate and endDate. This has also been implemented in getAllRequests in components/Map/index.js since we encountered this issue as a bug previously.
Here is the text explanation on the logic: #1711 (comment)

Extra Info

  • If you want to dig into how data flows from source into DuckDB, check out components/db/DbProvider.jsx
  • For how DuckDB creates the data tables, check out createRequestsTable in components/Map/index.js

I hope this explanation helps. Please let me know if you have any questions, I'd be happy to help!

@kdow
Member Author

kdow commented Jun 9, 2024

That does help, thanks for the thorough explanation! I'll make that change in a revision today or tomorrow.

Member

I know you said to disregard, but I think this can be resolved if you pull latest from main, switch to your branch, and merge main into your branch. We aren't being picky about merge commits -- although let us know if you believe there is a concern with this approach

Member Author

I was able to merge another, newer commit from main without it being added, but I wasn't able to push my changes without pulling from my branch first, and then it got added in. I can try fixing it again in another commit, or close this PR and open a new one.

Member

Yeah, so sorry about all that. I was using GitHub to create a new template and didn't quite see the harm in pushing directly to main. I'll just stick to the usual process to avoid this in the future.

Member

I think one last option would be for you to run git rebase main and resolve the merge conflicts by accepting the current changes (aka choosing your changes each time). Let me confirm this by running it locally to see if it prunes out that extra commit.

Member

Ok I've attempted it on a branch that derives from your branch....

  1. On branch main, run git pull
  2. Switch to your branch, 1746-sr-csv-download-address-count
  3. Run git rebase main
  4. There will be a first merge conflict
     • address it by accepting all instances of your changes
     • save the file
     • run git add . to keep those changes
     • run git rebase --continue
  5. There will be a second merge conflict
     • repeat the steps above
  6. You might get a warning about "The previous cherry-pick is now empty..."
     • you can run git rebase --skip to conclude the rebase
  7. Run git push
  8. You'll see that you only have a few commits, since we've pruned my commit, plus a couple of your repeat commits/merges

Here's my end result on a demo PR: #1766

components/main/Desktop/ExportButton.jsx
AND RequestType IN (${formattedRequestTypes})
GROUP BY Address`;
}
return `SELECT * FROM requests_${startYear}
Member

@ryanfchase ryanfchase Jun 11, 2024

I would either convert this logic block to an explicit if-else, so readers can see that startYear === endYear and grouped are the deciding factors in which logical pathway is taken, OR I would add comments indicating what the state of the logic should be...
i.e. here on line 66: //here, our data comes from the same year and we wish to obtain all rows that match the filters
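To illustrate the explicit if-else shape suggested here, the sketch below branches on the two deciding factors, startYear === endYear and grouped. It is not the PR's actual code: the function name, the `where` parameter, the `NumberOfRequests` alias, and the UNION ALL form used for the cross-year case are all assumptions made for the example.

```javascript
// Hypothetical query builder; `where` is a pre-built filter clause.
function buildExportQuery({ startYear, endYear, grouped, where }) {
  const sameYear = startYear === endYear;
  // Cross-year source: stack both yearly tables (assumes they share the same columns).
  const bothYears = `(SELECT * FROM requests_${startYear} UNION ALL SELECT * FROM requests_${endYear}) AS combined`;

  if (sameYear && grouped) {
    // Same year, one row per address with a request count.
    return `SELECT Address, COUNT(*) AS NumberOfRequests FROM requests_${startYear} WHERE ${where} GROUP BY Address`;
  } else if (sameYear && !grouped) {
    // Same year, all rows that match the filters.
    return `SELECT * FROM requests_${startYear} WHERE ${where}`;
  } else if (!sameYear && grouped) {
    // Two years, one row per address with a request count.
    return `SELECT Address, COUNT(*) AS NumberOfRequests FROM ${bothYears} WHERE ${where} GROUP BY Address`;
  } else {
    // Two years, all rows that match the filters.
    return `SELECT * FROM ${bothYears} WHERE ${where}`;
  }
}

console.log(buildExportQuery({ startYear: 2024, endYear: 2024, grouped: false, where: "Status = 'Open'" }));
```

Spelling out all four branches is slightly repetitive, but each comment then documents exactly one state of the logic, which is what the review asks for.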

Member

While we're considering comments, here's my other request for adding a note: #1746 (comment)

@Skydodle
Member

Hi @kdow, just wanna let you know that I didn't forget about reviewing this PR. I'm just waiting to see what you think about Ryan's requested changes. Thanks.

Member

@Skydodle Skydodle left a comment

@kdow Thanks for your great work on parameterizing the query table name.

Pulled down the PR and tested locally. I uncommented the ExportButton in FilterMenu.js and downloaded the csv.

The columns that are supposed to contain formatted dates, such as CreatedDate, UpdatedDate, ServiceDate, and ClosedDate, appear as timestamps in milliseconds. See screenshot below:

Screenshot of downloaded csv

Screenshot 2024-06-19 at 1 18 40 PM

In this example, the CreatedDate value in the first row is 1718226384000, which is the timestamp (milliseconds since the Unix epoch) for 2024-06-12 21:06:24 UTC.

We need to convert the timestamps into human-readable date format before exporting to CSV.
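As a dependency-free illustration of the conversion, the sketch below turns an epoch-millisecond value into a 'YYYY-MM-DD HH:mm:ss' string in UTC. The helper name is hypothetical. Note that Moment's `format()` renders local time by default, so a Moment-based version would produce a different clock time unless it is switched to UTC.

```javascript
// Hypothetical helper: formats epoch milliseconds as 'YYYY-MM-DD HH:mm:ss' in UTC.
function formatEpochMs(ms) {
  if (ms === null || ms === undefined) return null;
  // Number() also accepts BigInt values, in case the database returns them.
  const date = new Date(Number(ms));
  if (Number.isNaN(date.getTime())) return null;
  // toISOString() always renders UTC, e.g. '2024-06-12T21:06:24.000Z'.
  return date.toISOString().slice(0, 19).replace('T', ' ');
}

console.log(formatEpochMs(1718226384000)); // → 2024-06-12 21:06:24
```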

const neighborhoodDataQuery = generateQuery();
const neighborhoodDataToExport = await conn.query(neighborhoodDataQuery);
const neighborhoodResults = ddbh.getTableData(neighborhoodDataToExport);

Member

@Skydodle Skydodle Jun 19, 2024

I would do the timestamp-to-date conversion here before passing neighborhoodResults to Papa.unparse.

Something like

const formattedResults = neighborhoodResults.map(row => ({
  ...row,
  // Overwrite each millisecond timestamp with a readable date string.
  CreatedDate: row.CreatedDate ? moment(row.CreatedDate).format('YYYY-MM-DD HH:mm:ss') : null,
  UpdatedDate: row.UpdatedDate ? moment(row.UpdatedDate).format('YYYY-MM-DD HH:mm:ss') : null,
  ServiceDate: row.ServiceDate ? moment(row.ServiceDate).format('YYYY-MM-DD HH:mm:ss') : null,
}));

Member

Need to include ClosedDate column in there

const neighborhoodDataToExport = await conn.query(neighborhoodDataQuery);
const neighborhoodResults = ddbh.getTableData(neighborhoodDataToExport);

if (!isEmpty(neighborhoodResults)) {
Member

then pass formattedResults here instead of neighborhoodResults, same for Papa.unparse() below.
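One way to keep the list of date columns in a single place (so a column such as ClosedDate is not missed) is to drive the conversion from an array of column names. This is a sketch only: `DATE_COLUMNS`, `formatDateColumns`, and the default ISO formatter are assumptions for illustration, and the PR's own code uses Moment for formatting.

```javascript
// Column names taken from the review comments; the helper itself is hypothetical.
const DATE_COLUMNS = ['CreatedDate', 'UpdatedDate', 'ServiceDate', 'ClosedDate'];

// Returns new row objects with the listed columns formatted; input rows are not mutated.
function formatDateColumns(
  rows,
  columns = DATE_COLUMNS,
  format = (ms) => new Date(Number(ms)).toISOString(),
) {
  return rows.map((row) => {
    const formatted = { ...row };
    columns.forEach((column) => {
      // Skip columns the row does not have (e.g. a grouped Address/count export).
      if (!(column in row)) return;
      const value = row[column];
      formatted[column] = value === null || value === undefined ? null : format(value);
    });
    return formatted;
  });
}

const sampleRows = [{ Address: '123 MAIN ST', CreatedDate: 1718226384000, ClosedDate: null }];
console.log(formatDateColumns(sampleRows));
```

The formatted array would then be what gets checked for emptiness and passed on to the CSV step, as suggested above.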

@kdow kdow force-pushed the 1746-sr-csv-download-address-count branch from 329c588 to f88cdf2 Compare June 19, 2024 22:31
@kdow kdow requested review from Skydodle and ryanfchase June 19, 2024 22:34
Member

@ryanfchase ryanfchase left a comment

the PR is looking clean, and the final commit on formatting the date output in the csv looks good too. Approved.

@Skydodle
Member

Skydodle commented Jun 20, 2024

Hi @kdow, thank you for cleaning up the timestamp transformation, the date columns are now in correct date format.

However, I encountered some other issues while testing different date ranges, with the number of rows/SRs in the CSV inconsistent with the number of dots/SRs visually showing on the UI map. See screenshots below.

Take "Downtown LA" date range 2/1 to 2/15 for example, when I clicked the Export button, the error alert "No 311 data available within the selected filters..." shows up, indicating the filtered results or formattedResults are empty. I reverted back to feeding the if statement with neighborhoodResults and it's the same error, indicating neighborhoodResults is also empty.

This was tested with several different districts and different date ranges.

Screenshots of inconsistent result examples

Sun Valley Area NC 02/01/2024 to 02/15/2024

  • Map visual shows more than 10 requests, CSV shows only 1 request
Screenshot 2024-06-20 at 12 08 54 AM

Screenshot 2024-06-20 at 12 09 20 AM

Downtown Los Angeles 05/01/2024 to 05/31/2024

  • Map shows more than 20 requests, CSV shows only 7 requests
Screenshot 2024-06-20 at 12 14 13 AM

Screenshot 2024-06-20 at 12 14 19 AM

Downtown Los Angeles 02/01/2024 to 02/15/2024

  • Map shows multiple requests, export button throws error for no requests
Screenshot 2024-06-20 at 12 15 38 AM

Given the complexity of the issue, I didn't have enough time during this PR review to fully debug the problem and was not able to pinpoint where it is. I'll add a comment later today with suggested investigation steps that may help you.

Also looping in @ryanfchase @aqandrew @bphan002 to get fresh insights that could maybe help us identify or pinpoint the problem.

P.S. Sorry if y'all were getting multiple email notifications from me editing this comment, I'm very bad at writing and usually require multiple edits

@Skydodle Skydodle requested review from aqandrew and bphan002 June 20, 2024 13:47
@bphan002
Member

Availability for PR Review Friday 6/20/24

I looked at this briefly, but will need more time to look into it more on Friday.

@ryanfchase @Skydodle
On a side note (not sure if it is within the scope of this issue): if a user selects a date range that spans both 2023 and 2024, only the 2023 data is exported.

@bphan002
Member

bphan002 commented Jun 22, 2024

@Skydodle @kdow

I also got that alert message for some of the export requests even though there should be data. I think this is the issue, and here is my test case.

For NC United NEIGHBORHOODS NC
date range 06/14/2024 - 06/21/2024
Request Types Single Streetlight

I noticed the value in the formatted requestType is 'Single Streetlight', but the actual RequestType should be 'Single Streetlight Issue'

My guess is that there may be several requestTypes that are not mapping correctly to the actual names of the request. Since the names do not match exactly, the SQL query is filtering them out and causing the alert message to appear.

Other examples such as 'Metal Appliances' should be 'Metal/Household Appliances' instead. I'm sure there are more that are not mapping to the correct names. One solution I can think of is to create a hashmap for the ones that are not mapping correctly.
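A minimal sketch of the hashmap idea: map the UI's request type labels to the exact names stored in the data, and fall back to the label itself when no override exists. Only the two pairs mentioned above come from this thread; the constant and function names are hypothetical, and the full list of mismatched names would need to be checked against the data.

```javascript
// Only these two pairs are known from the discussion; others would need to be verified.
const REQUEST_TYPE_NAME_OVERRIDES = new Map([
  ['Single Streetlight', 'Single Streetlight Issue'],
  ['Metal Appliances', 'Metal/Household Appliances'],
]);

// Returns the name used in the dataset, or the label unchanged if there is no override.
function toDatasetRequestType(displayName) {
  return REQUEST_TYPE_NAME_OVERRIDES.get(displayName) ?? displayName;
}

// Builds the comma-separated, single-quoted list for a SQL IN (...) clause.
function formatRequestTypesForSql(displayNames) {
  return displayNames
    .map(toDatasetRequestType)
    .map((name) => `'${name.replace(/'/g, "''")}'`) // escape embedded single quotes
    .join(', ');
}

console.log(formatRequestTypesForSql(['Single Streetlight', 'Metal Appliances']));
// → 'Single Streetlight Issue', 'Metal/Household Appliances'
```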

Here are some side findings.
In the functions getAllRequests and groupRequestsByAddress, we need to change AND CreatedDate < '${endDate}' to use <= instead.
I think this may also contribute to the rows and pins not matching.
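The effect of the `<` versus `<=` bound can be seen in a small in-memory sketch. It assumes day-granular 'YYYY-MM-DD' strings (which compare correctly as plain strings) and a hypothetical `filterByCreatedDate` helper; it does not reproduce the app's SQL.

```javascript
// Assumption: CreatedDate values are 'YYYY-MM-DD' strings with no time component.
function filterByCreatedDate(rows, startDate, endDate, { inclusiveEnd = true } = {}) {
  return rows.filter(({ CreatedDate }) => {
    const afterStart = CreatedDate >= startDate;
    // An exclusive bound (<) drops every request created on the end date itself.
    const beforeEnd = inclusiveEnd ? CreatedDate <= endDate : CreatedDate < endDate;
    return afterStart && beforeEnd;
  });
}

const exampleRows = [
  { CreatedDate: '2024-06-14' },
  { CreatedDate: '2024-06-21' },
  { CreatedDate: '2024-06-22' },
];
console.log(filterByCreatedDate(exampleRows, '2024-06-14', '2024-06-21').length); // → 2
console.log(filterByCreatedDate(exampleRows, '2024-06-14', '2024-06-21', { inclusiveEnd: false }).length); // → 1
```

If the column carries a time of day, comparing with `<=` against a bare date may still miss requests created later on the last day, depending on how the column is typed, so that case is worth testing separately.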

Even though there should be 5 rows, I got 9. I noticed that the same request had been created several times within a span of a couple of minutes, so the data is overlapping. I'm not sure if we want this filtered out in the future.

Member

@bphan002 bphan002 left a comment

Map Request Type to the exact string to see if that solves the issue

@kdow
Member Author

kdow commented Jun 24, 2024

Thanks all for helping to find and figure out some of these issues. I've updated the requestType names and createdBy end dates that were affecting the results.

I also found another issue after fixing those. Throughout the codebase we check for the SR status as 'Open' or 'Closed', but I've found there's also a status of 'Pending'. A lot of the results that were missing from the export but appeared in the browser had the 'Pending' status.
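A quick way to surface unexpected status values like this is to tally the distinct statuses in a result set. The sketch below is only a diagnostic aid; the `Status` column name and the `countByStatus` helper are assumptions, not code from the PR.

```javascript
// Assumption: each row exposes its status under a `Status` key.
function countByStatus(rows) {
  return rows.reduce((counts, { Status }) => {
    const key = Status ?? '(missing)';
    counts[key] = (counts[key] || 0) + 1;
    return counts;
  }, {});
}

console.log(countByStatus([{ Status: 'Open' }, { Status: 'Pending' }, { Status: 'Open' }]));
// → { Open: 2, Pending: 1 }
```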

I'll submit an updated revision tonight or tomorrow with the fixes.

@ryanfchase

This comment was marked as resolved.

kdow added 3 commits June 24, 2024 17:38
Revise csv export to account for multiple years
Covert timestamps to human-readable date in csv export
@kdow kdow force-pushed the 1746-sr-csv-download-address-count branch from f88cdf2 to c6396b7 Compare June 25, 2024 00:41
@kdow kdow requested a review from bphan002 June 25, 2024 00:43
@Skydodle
Member

Will complete review before Thurs 6/27 EOD

Member

@bphan002 bphan002 left a comment

The date range now includes the last date, and the CSV now populates if the range involves multiple years. Great job on these modifications.

Using the same test case

Boundaries United NEIGHBORHOODS NC
Date Range 06/14/2024 - 06/21/2024
Request Types Single Streetlight

The address 2336 S 4TH AVE, 90018 pin does not appear, but was included in the csv. I'm not sure what is going on, but everything else looks great. I think another issue is needed in order to look into this. @ryanfchase @Skydodle

Approving on the condition that another issue needs to be created for the pins not matching with the csv. (Unless the issue is related to the code and I couldn't pinpoint it)

@kdow
Member Author

kdow commented Jun 26, 2024

I think the pin for 2334 S 4TH AVE, 90018 might be overlapping the pin for 2336 S 4TH AVE, 90018 and that's why we can't see it on the map but it's on the export. If that is the case, it would be ideal to have it viewable on the map in some way.

Member

@Skydodle Skydodle left a comment

I checked on Google Maps: 2334 S 4TH AVE and 2336 S 4TH AVE are actually in the same building. The request for 2336 was created before the one for 2334. So it seems that in our pin rendering system, if there are different requests on the same building, only the pin for the latest request is rendered and the older ones get replaced. Or the pins could be overlapping, as Kelly said; maybe it doesn't know how to display a second pin at a different spot on the same house.

2243 W 20TH ST also has two requests, and only the latest one is rendering a pin.

I think this bug should be addressed in another ticket covering our pin rendering logic, not as part of this issue. Everything else looks good. Will go ahead and approve. Thank you so much for your hard work @kdow !

@kdow kdow merged commit 2fc9d4f into main Jun 28, 2024
@kdow kdow deleted the 1746-sr-csv-download-address-count branch June 28, 2024 19:23
Development

Successfully merging this pull request may close these issues.

SR CSV download functionality needs an Address and Count export
4 participants