Redistribute cases logic #12190

Merged
merged 20 commits into from
Nov 14, 2019
Contributor

@pkarman pkarman commented Sep 23, 2019

connects #11713

Description

Sometimes it's appropriate to re-distribute a legacy case via automatic case distribution. In that circumstance, we must update the existing distributed_case case_id value to preserve the uniqueness constraint. This PR attempts to automate that re-distribution.
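A minimal sketch of the renaming idea, assuming the existing row's case_id is given a suffix so that the original value becomes free for a new distribution. The `-attempt<N>` numbering is an assumption based on the manually fixed rows discussed later in this thread, and the helper name `next_attempt_case_id` is hypothetical, not taken from this PR's code.

```ruby
# Hypothetical helper: pick the next unused "<case_id>-attempt<N>" value.
# existing_ids stands in for the case_id values already stored in the table.
def next_attempt_case_id(case_id, existing_ids)
  attempt = 1
  attempt += 1 while existing_ids.include?("#{case_id}-attempt#{attempt}")
  "#{case_id}-attempt#{attempt}"
end

existing = ["123456", "123456-attempt1"]
renamed = next_attempt_case_id("123456", existing)
puts renamed
```

In a real table the rename and the new insert would presumably happen in one transaction, so that the uniqueness constraint is never violated in between.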

@pkarman pkarman self-assigned this Sep 23, 2019
@codeclimate

codeclimate bot commented Sep 23, 2019

Code Climate has analyzed commit 631352b and detected 0 issues on this pull request.

View more on Code Climate.

@yoomlam
Contributor

yoomlam commented Nov 10, 2019

Based on advice from Peter (https://dsva.slack.com/archives/CJL810329/p1573135068290400?thread_ts=1573133363.290200&cid=CJL810329), I examined all the ACD redistributed cases (those with -attempt in their case_id) to look for patterns that we could use to programmatically identify and fix such cases in the future, so that this problem no longer reaches Bat Team.
My results follow:

# Find cases that were allowed to be redistributed and fixed by Bat Team
fixed_distribcases=DistributedCase.where("case_id like ?", "%-attempt%")
fixed_distribcases.count
=> 114

# Extract the vacols_ids to look up the appeals and find patterns
vacols_ids=fixed_distribcases.map(&:case_id).map{ |id_with_attempt| id_with_attempt.split("-")[0] }

# Sample one of those legacy appeals
appeal = LegacyAppeal.find_by(vacols_id: vacols_ids[10])
puts appeal.structure_render(:id, :status, :created_at)
LegacyAppeal XXXXX10 [id, status, created_at]
└── RootTask 139658, completed, 2019-03-12 13:47:50 UTC
appeal.tasks.count
=> 1
fixed_distribcases[10].ready_at
=> Tue, 26 Feb 2019 00:00:00 UTC +00:00

I interpret this to mean that the appeal initially had no tasks (not even a RootTask!), was fixed by Bat Team and distributed on Feb 26, and had its RootTask added on Mar 12.
FYI: According to a data dictionary (dated 04182019v2), ready_at is "The time the case first became eligible to be distributed".

Here is an appeal with more tasks to examine:

appeal = LegacyAppeal.find_by(vacols_id: vacols_ids[3])
puts appeal.structure_render(:id, :status, :created_at)
LegacyAppeal XXXXX03 [id, status, created_at]
└── RootTask 56260, on_hold, 2019-02-15 16:08:59 UTC
    ├── HearingTask 56261, cancelled, 2019-02-15 16:08:59 UTC
    │   ├── ScheduleHearingTask 56262, completed, 2019-02-15 16:08:59 UTC
    │   └── AssignHearingDispositionTask 139668, cancelled, 2019-03-12 13:53:57 UTC
    ├── HearingTask 163765, completed, 2019-04-03 13:35:04 UTC
    │   └── AssignHearingDispositionTask 163766, completed, 2019-04-03 13:35:06 UTC
    ├── HearingTask 230925, cancelled, 2019-06-10 17:00:34 UTC
    │   ├── ScheduleHearingTask 230926, completed, 2019-06-10 17:00:34 UTC
    │   └── AssignHearingDispositionTask 234593, cancelled, 2019-06-12 20:14:28 UTC
    ├── HearingTask 283969, cancelled, 2019-07-19 19:39:47 UTC
    │   └── AssignHearingDispositionTask 283970, cancelled, 2019-07-19 19:39:48 UTC
    ├── HearingTask 289254, cancelled, 2019-07-24 05:04:40 UTC
    │   ├── ScheduleHearingTask 289255, completed, 2019-07-24 05:04:40 UTC
    │   └── AssignHearingDispositionTask 348558, cancelled, 2019-08-23 15:48:26 UTC
    └── HearingTask 348573, cancelled, 2019-08-23 15:53:39 UTC
        └── ScheduleHearingTask 348574, cancelled, 2019-08-23 15:53:39 UTC
fixed_distribcases[3].ready_at
=> Wed, 03 Apr 2019 09:35:08 UTC +00:00

This appeal had 4 tasks before it was fixed: RootTask, HearingTask, ScheduleHearingTask, and AssignHearingDispositionTask.

I'm only examining tasks created before the appeal was fixed, since those are the only tasks available to the code when the problem is presented to Bat Team. The only date consistently set on DistributedCases is ready_at, so I'm using that as an approximation of the time it was fixed by Bat Team.

# Create a list of fixed cases where each list element is a tuple <ready_at, vacols_id>
fixed_cases=fixed_distribcases.map(&:ready_at).zip(vacols_ids)

# Among the fixed cases, count the tasks that existed before the fix
tasks_before_fix=fixed_cases.map { |fixed_date, v_id| 
  appeal=LegacyAppeal.find_by(vacols_id: v_id)
  appeal.tasks.select{|t| t.created_at < fixed_date}.count
}
=> [1, 0, 0, 4, 0, 0, 0, 8, 0, 4, 0, 0, 4, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 4, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 3, 7, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 5, 6, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 1, 0, 0, 0, 0, 1, 7, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 4, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0]

There are only 29 of 114 appeals with tasks before the fix.
Pattern 1: Most appeals don't have any tasks before the fix.

Let's ignore RootTask and TrackVeteranTask because those were not considered relevant in this PR's code:

relev_tasks_before_fix=fixed_cases.map { |fixed_date, v_id| 
  appeal=LegacyAppeal.find_by(vacols_id: v_id)
  tasks_before_fix=appeal.tasks.reject { |task| task.type == "TrackVeteranTask" || task.type == "RootTask" }
    .select{|t| t.created_at < fixed_date}
}
relev_tasks_before_fix.map(&:count)
=> [0, 0, 0, 3, 0, 0, 0, 6, 0, 3, 0, 0, 3, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 5, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 3, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
relev_tasks_before_fix.reject(&:empty?).map(&:count)
=> [3, 6, 3, 3, 3, 3, 2, 5, 2, 2, 3, 5, 5, 5, 3, 3]
relev_tasks_before_fix.reject(&:empty?).map(&:count).count
=> 16
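The filtering steps above can be sketched outside the Rails console with plain Ruby objects. This is only an illustration: `TaskRow` and `relevant_tasks_before` are hypothetical names, and the timestamps are copied from the XXXXX03 tree shown earlier.

```ruby
# Stand-in for a task record; only the fields used in the analysis.
TaskRow = Struct.new(:type, :created_at)

# Task types the analysis treats as not relevant.
IGNORED_TYPES = %w[RootTask TrackVeteranTask].freeze

# Tasks created strictly before the fix date, ignoring the types above.
def relevant_tasks_before(tasks, fixed_date)
  tasks.reject { |t| IGNORED_TYPES.include?(t.type) }
       .select { |t| t.created_at < fixed_date }
end

fixed_date = Time.utc(2019, 4, 3, 9, 35, 8)
tasks = [
  TaskRow.new("RootTask", Time.utc(2019, 2, 15, 16, 8, 59)),
  TaskRow.new("HearingTask", Time.utc(2019, 2, 15, 16, 8, 59)),
  TaskRow.new("ScheduleHearingTask", Time.utc(2019, 2, 15, 16, 8, 59)),
  TaskRow.new("AssignHearingDispositionTask", Time.utc(2019, 3, 12, 13, 53, 57)),
  TaskRow.new("HearingTask", Time.utc(2019, 4, 3, 13, 35, 4))
]
relevant = relevant_tasks_before(tasks, fixed_date)
puts relevant.map(&:type).inspect
```

The second HearingTask is excluded because it was created after the fix date, which matches the count of 3 relevant tasks reported for that appeal.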

So there are 16 appeals with relevant tasks, from which to try to find a pattern. Let's look at the tasks for those 16 appeals:

# get relevant tasks before appeal was fixed, skipping appeals with no relevant tasks
# also sort the tasks by creation date
get_tasks_before_fix = -> (fixed_date, v_id) { 
  appeal=LegacyAppeal.find_by(vacols_id: v_id)
  tasks_before_fix=appeal.tasks.reject { |task| task.type == "TrackVeteranTask" || task.type == "RootTask" }
    .select{|t| t.created_at < fixed_date}
  next if tasks_before_fix.empty?
  [fixed_date, v_id, tasks_before_fix.sort_by{|t| t.created_at}]
}
tasks_before_fix=fixed_cases.map(&get_tasks_before_fix).compact
tasks_before_fix.count
=> 16

tasks_before_fix.map{ |fixed_date, v_id, rel_tasks|
  rel_tasks.map(&:status).reject{|s| s=="cancelled" || s=="completed"}
}
=> [[], [], [], [], [], [], [], [], [], [], [], [], [], [], [], []]

Pattern 2: For appeals with some relevant tasks before the fix, all those tasks were closed. A task is considered closed if it is completed or cancelled -- see def self.closed_statuses.
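As a rough sketch of the Pattern 2 check, assuming only what is stated above (closed means completed or cancelled). The constant and method names here are hypothetical and do not reproduce the application's own closed_statuses definition.

```ruby
# Assumed from the text above: a task is closed when completed or cancelled.
CLOSED_STATUS_NAMES = %w[completed cancelled].freeze

# True when every status in the list is a closed status.
# Note that an empty list also returns true (vacuously).
def all_closed?(statuses)
  statuses.all? { |s| CLOSED_STATUS_NAMES.include?(s) }
end

puts all_closed?(%w[completed cancelled cancelled])
puts all_closed?(%w[completed on_hold])
```

Because an empty list passes the check, a caller that wants to distinguish "no relevant tasks" from "all relevant tasks closed" would need to test for emptiness separately.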

# for each appeal, show the tasks sorted by creation date
pp tasks_before_fix.map{ |tup| tup[2].map(&:type) }
[["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask", "NoShowHearingTask", "ScheduleHearingTask", "HearingTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask"],
 ["ScheduleHearingTask", "HearingTask", "AssignHearingDispositionTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask"],
 ["ScheduleHearingTask", "HearingTask"],
 ["HearingTask",  "ScheduleHearingTask", "AssignHearingDispositionTask", "HearingTask",  "ScheduleHearingTask"],
 ["HearingTask", "ScheduleHearingTask"],
 ["HearingTask", "ScheduleHearingTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask", "HearingTask", "AssignHearingDispositionTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask", "HearingTask", "AssignHearingDispositionTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask", "HearingTask", "AssignHearingDispositionTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask"]]

Note that a few elements have ScheduleHearingTask before HearingTask -- that's because they have the exact same creation date, so I'll ignore that anomaly:

v_id=tasks_before_fix[4][1]
puts LegacyAppeal.find_by(vacols_id: v_id).structure_render(:created_at, :status, :updated_at)
LegacyAppeal YYYYY04 [created_at, status, updated_at]
└── RootTask 2019-04-15 15:00:28 UTC, on_hold, 2019-04-15 15:00:28 UTC
    └── HearingTask 2019-04-15 15:00:28 UTC, completed, 2019-05-17 05:06:04 UTC
        ├── ScheduleHearingTask 2019-04-15 15:00:28 UTC, completed, 2019-04-15 17:08:15 UTC
        └── AssignHearingDispositionTask 2019-04-15 17:08:15 UTC, completed, 2019-06-25 21:07:31 UTC
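One way to make the ordering deterministic when timestamps tie is to add a secondary sort key. The sketch below assumes that a parent task receives a lower id than its child, which is consistent with the XXXXX03 tree shown earlier but is not guaranteed; the ids and the `TaskStamp` struct are hypothetical.

```ruby
# Stand-in for a task record with just the fields needed for ordering.
TaskStamp = Struct.new(:id, :type, :created_at)

# Sort by creation time, breaking ties by id (assumed to increase with creation order).
def creation_order(tasks)
  tasks.sort_by { |t| [t.created_at, t.id] }
end

same_time = Time.utc(2019, 4, 15, 15, 0, 28)
stamps = [
  TaskStamp.new(2, "ScheduleHearingTask", same_time),
  TaskStamp.new(1, "HearingTask", same_time),
  TaskStamp.new(3, "AssignHearingDispositionTask", Time.utc(2019, 4, 15, 17, 8, 15))
]
puts creation_order(stamps).map(&:type).inspect
```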

Something to note that I'm not sure how to deal with:

v_id=tasks_before_fix[12][1]
puts LegacyAppeal.find_by(vacols_id: v_id).structure_render(:created_at, :status, :updated_at)
LegacyAppeal YYYYY12 [created_at, status, updated_at]
├── RootTask 2019-02-15 15:47:52 UTC, on_hold, 2019-02-15 15:47:52 UTC
│   ├── HearingTask 2019-02-15 15:47:52 UTC, cancelled, 2019-02-15 15:47:52 UTC
│   │   ├── ScheduleHearingTask 2019-02-15 15:47:52 UTC, completed, 2019-04-08 14:13:53 UTC
│   │   └── AssignHearingDispositionTask 2019-04-08 14:13:53 UTC, cancelled, 2019-06-25 21:07:30 UTC
│   ├── TrackVeteranTask 2019-04-08 15:00:16 UTC, in_progress, 2019-07-05 19:19:16 UTC
│   ├── HearingTask 2019-04-08 19:18:30 UTC, completed, 2019-05-23 05:02:04 UTC
│   │   └── AssignHearingDispositionTask 2019-04-08 19:18:31 UTC, completed, 2019-06-25 21:06:58 UTC
│   ├── HearingTask 2019-05-16 11:53:05 UTC, cancelled, 2019-06-21 20:47:54 UTC
│   │   └── ScheduleHearingTask 2019-05-16 11:53:05 UTC, cancelled, 2019-08-01 13:07:15 UTC
│   └── HearingTask 2019-05-16 11:53:06 UTC, completed, 2019-08-01 13:06:34 UTC
│       └── ScheduleHearingTask 2019-05-16 11:53:06 UTC, cancelled, 2019-08-01 13:06:34 UTC
└── ScheduleHearingColocatedTask 2019-05-16 11:09:53 UTC, completed, 2019-07-19 14:49:49 UTC
    └── ScheduleHearingColocatedTask 2019-05-16 11:09:55 UTC, completed, 2019-07-19 14:49:49 UTC
tasks_before_fix[12][0]
=> Mon, 13 May 2019 00:00:00 UTC +00:00

The above has 5 relevant tasks created before it was fixed. Both AssignHearingDispositionTasks (and the second HearingTask) were updated after the fix, so their statuses at the time of the fix may not yet have been completed or cancelled. I don't know how to retrieve a task's status as of a given date. So if those tasks did not have a closed status before the fix, Pattern 2 may not be valid.
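Short of recovering historical statuses, the tasks whose status at fix time is uncertain can at least be flagged: those created before the fix but updated after it. This is a hedged sketch with hypothetical names (`TaskSnapshot`, `status_uncertain_at`), using timestamps copied from the YYYYY12 tree; an update after the fix does not prove the status changed, only that it might have.

```ruby
# Stand-in for a task record with the fields shown in the tree above.
TaskSnapshot = Struct.new(:type, :status, :created_at, :updated_at)

# Tasks that existed at fix time but were modified afterwards,
# so their current status may differ from their status at fix time.
def status_uncertain_at(tasks, fixed_date)
  tasks.select { |t| t.created_at < fixed_date && t.updated_at > fixed_date }
end

fix = Time.utc(2019, 5, 13)
snapshots = [
  TaskSnapshot.new("HearingTask", "cancelled", Time.utc(2019, 2, 15, 15, 47, 52), Time.utc(2019, 2, 15, 15, 47, 52)),
  TaskSnapshot.new("ScheduleHearingTask", "completed", Time.utc(2019, 2, 15, 15, 47, 52), Time.utc(2019, 4, 8, 14, 13, 53)),
  TaskSnapshot.new("AssignHearingDispositionTask", "cancelled", Time.utc(2019, 4, 8, 14, 13, 53), Time.utc(2019, 6, 25, 21, 7, 30)),
  TaskSnapshot.new("HearingTask", "completed", Time.utc(2019, 4, 8, 19, 18, 30), Time.utc(2019, 5, 23, 5, 2, 4)),
  TaskSnapshot.new("AssignHearingDispositionTask", "completed", Time.utc(2019, 4, 8, 19, 18, 31), Time.utc(2019, 6, 25, 21, 6, 58))
]
uncertain = status_uncertain_at(snapshots, fix)
puts uncertain.map(&:type).inspect
```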

Moving on ...
From the list, we have 5 unique sequences of task creation:

 ["HearingTask", "ScheduleHearingTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask", "NoShowHearingTask", "ScheduleHearingTask", "HearingTask"],
 ["HearingTask",  "ScheduleHearingTask", "AssignHearingDispositionTask", "HearingTask",  "ScheduleHearingTask"],
 ["HearingTask", "ScheduleHearingTask", "AssignHearingDispositionTask", "HearingTask", "AssignHearingDispositionTask"],

From the above, the only pattern common to all sequences is a "HearingTask" followed by a "ScheduleHearingTask", which is not distinctive enough on its own. Triggering on that would result in a lot of false positives.

At this point, I would conclude that checking for tasks that are not open (as the current PR code already does) is sufficient, pending further feedback on my interpretation of these results.

@pkarman
Contributor Author

pkarman commented Nov 11, 2019

I agree with "Pattern 1"

Pattern 2 is trickier. Specifically, "cancelled" is significantly different than "completed". If all the tasks of a certain type (e.g. ScheduleHearingTask) were cancelled, then that step in the case flow was never finished and it seems legit to re-distribute.

@yoomlam
Contributor

yoomlam commented Nov 11, 2019

> I agree with "Pattern 1"
>
> Pattern 2 is trickier. Specifically, "cancelled" is significantly different than "completed". If all the tasks of a certain type (e.g. ScheduleHearingTask) were cancelled, then that step in the case flow was never finished and it seems legit to re-distribute.

How does the following sound instead of Pattern 2?
Pattern 3: For appeals with some relevant tasks before the fix, if all ScheduleHearingTask tasks are cancelled, then it is ok_to_redistribute.

Could we generalize it to "if all HearingTask tasks are cancelled, then it is ok_to_redistribute"?

Is there anything else I should examine?

@pkarman
Contributor Author

pkarman commented Nov 11, 2019

> Could we generalize it to "if all HearingTask tasks are cancelled, then it is ok_to_redistribute"?

yes that sounds legit to start with.
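The rule agreed on above could be sketched roughly as follows. This is not the PR's implementation: `TaskInfo`, `SKIPPED_TYPES`, and the shape of `ok_to_redistribute?` are assumptions, and returning false when there are relevant tasks but no HearingTask is a conservative choice made for this sketch only.

```ruby
# Stand-in for a task record with just type and status.
TaskInfo = Struct.new(:type, :status)

# Task types treated as not relevant in the discussion above.
SKIPPED_TYPES = %w[RootTask TrackVeteranTask].freeze

def ok_to_redistribute?(tasks)
  relevant = tasks.reject { |t| SKIPPED_TYPES.include?(t.type) }
  # Pattern 1: no relevant tasks at all.
  return true if relevant.empty?

  # Generalized Pattern 3: every HearingTask was cancelled.
  hearing_tasks = relevant.select { |t| t.type == "HearingTask" }
  hearing_tasks.any? && hearing_tasks.all? { |t| t.status == "cancelled" }
end

puts ok_to_redistribute?([TaskInfo.new("RootTask", "completed")])
puts ok_to_redistribute?([TaskInfo.new("HearingTask", "cancelled"),
                          TaskInfo.new("ScheduleHearingTask", "cancelled")])
puts ok_to_redistribute?([TaskInfo.new("HearingTask", "cancelled"),
                          TaskInfo.new("HearingTask", "completed")])
```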

@va-bot
Collaborator

va-bot commented Nov 13, 2019

1 Warning
⚠️ This PR adds one or more new specs. If the specs use the DB, see if you can rewrite them so they don’t use the DB, such as by using build_stubbed. If they must use the DB, please remember to add the appropriate require statements and either the :postgres or :all_dbs tags, as documented in our Wiki: https://github.com/department-of-veterans-affairs/caseflow/wiki/Testing-Best-Practices#tests-that-write-to-the-db

Generated by 🚫 Danger

Contributor Author

@pkarman pkarman left a comment

logic looks right.

suggested some more idiomatic Ruby, otherwise lgtm.

I can't officially give an Approval on this PR since I created it, but I give my +1 to this and you can approve and merge it.

app/services/redistributed_case.rb Outdated Show resolved Hide resolved
spec/models/redistributed_case_spec.rb Outdated Show resolved Hide resolved
spec/models/redistributed_case_spec.rb Outdated Show resolved Hide resolved
@yoomlam
Contributor

yoomlam commented Nov 13, 2019

If deeper analysis is desired, try examining changes to the location code in VACOLS mentioned in https://dsva.slack.com/archives/C3EAF3Q15/p1573667863033800:

vacols_ids[0..3].each do |id|
  puts "vacols_id=#{id}"
  pp VACOLS::Priorloc.where(lockey: id).order(:locdout).pluck(:locdout, :locstto, :locstrcv)
end

Contributor

@kevmo kevmo left a comment

Good work finishing this up, @yoomlam! And obviously, also great work to @pkarman getting it going.

@yoomlam yoomlam added the Ready-to-Merge This PR is ready to be merged and will be picked up by va-bot to automatically merge to master label Nov 14, 2019
@va-bot va-bot merged commit 390ffbf into master Nov 14, 2019
@va-bot va-bot deleted the pek-auto-redistribute branch November 14, 2019 18:40

# send to Sentry but do not raise exception.
def alert_existing_distributed_case_not_unique
  error = CannotRedistribute.new("DistributedCase #{case_id} already exists")
Contributor

Is it alright if I revert this error message to the previous "Distributed case already exists" without the case_id? The error no longer gets automatically grouped in Sentry because the error messages are now unique to the specific case.

Contributor Author

That was an attempt to make it easier to see in the Sentry log, but if the case_id is present in some other way, sure, fine to revert.

Contributor

Yes, it's present in the Additional Data section, usually as vacols_id
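The trade-off discussed in this thread can be illustrated with a small sketch: keep the error message constant so that reports group together, and carry the case-specific identifier in a separate data hash. `build_alert` and the `vacols_id` key are hypothetical here, and no real error-reporting client is called.

```ruby
class CannotRedistribute < StandardError; end

# Hypothetical helper: constant message for grouping, identifier kept as extra data.
def build_alert(case_id)
  error = CannotRedistribute.new("Distributed case already exists")
  [error, { vacols_id: case_id }]
end

first_error, first_extra = build_alert("111")
second_error, second_extra = build_alert("222")
puts first_error.message == second_error.message
puts first_extra.inspect
puts second_extra.inspect
```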

Labels
Ready-to-Merge This PR is ready to be merged and will be picked up by va-bot to automatically merge to master
5 participants