Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Annual Metric: Caseflow Reader adoption #12556

Closed
alisan16 opened this issue Nov 1, 2019 · 10 comments
Closed

Annual Metric: Caseflow Reader adoption #12556

alisan16 opened this issue Nov 1, 2019 · 10 comments
Assignees
Labels
Eng: Data Mark when data has been backfilled, or the issue has any data integrity concerns Priority: Medium Blocking issue w/workaround, or "second in" priority for new work. Stakeholder: BVA Functionality associated with the Board of Veterans' Appeals workflows/feature requests Team: Delta 🔺 Type: Metrics or Reporting

Comments

@alisan16
Copy link
Contributor

alisan16 commented Nov 1, 2019

This metric is required by OIT (Metric ID: 1909040940).

Metric description: Adoption of Caseflow Reader for appeal decisions (percentage)

Metric definition: (Date of decision minus date of appeal establishment/number of appeals processed)

@D-L-Ware
Copy link

So the confusion comes from the fact that this metric is proposed to be changed from "Average time (days) to process an appeal" to "Adoption of Caseflow Reader for appeal decisions (percentage))"

The non-sequitur like description of the metric comes from the fact that the description has not been changed yet.

@D-L-Ware
Copy link

D-L-Ware commented Nov 19, 2019

Looks like we probably have a little room to define how to measure Reader adoption. Here is the change justification in case it helps "An outcome-oriented metric that is more directly connected to the value that Caseflow offers is in usage of Caseflow Reader by attorneys and Veterans Law Judges. Reader is designed for these users to help them more efficiently complete time-consuming tasks related to evidence review and annotation as they draft decisions on appeals. Using Reader is not mandatory, and attorneys and VLJs can access the same information through VBMS if they prefer. Adoption of Caseflow Reader thus measures the extent to which Caseflow is delivering a better tool that empowers VA employees to improve the timeliness of the appeals process."

The justification suggest to me we measure of appeals with documents in Caseflow Reader/VBMS how many people viewed the docs in Reader vs. VBMS. I was unaware we could even measure that.

@D-L-Ware
Copy link

D-L-Ware commented Nov 19, 2019

It looks like this metrics number has been changed to 1909040940 with the number reference above referring to the metric before it was changed.

The goal is 98% adoption.

https://dvagov.sharepoint.com/:x:/r/sites/epmoepmd/benefits/_layouts/15/guestaccess.aspx?e=sIhwv4&share=EUgfXGIOUjNDgl67Ni70fVcBr5qddsswVWjhnbI3kC9DCw

Link to the spreadsheet with metrics. It is my understanding that only the first tab is the ones we need to report on.

@alisan16 alisan16 changed the title Metric: Caseflow Reader adoption Annual Metric: Caseflow Reader adoption Dec 9, 2019
@jimruggiero jimruggiero added Priority: Medium Blocking issue w/workaround, or "second in" priority for new work. Stakeholder: BVA Functionality associated with the Board of Veterans' Appeals workflows/feature requests Eng: Data Mark when data has been backfilled, or the issue has any data integrity concerns labels Dec 28, 2019
@yoomlam
Copy link
Contributor

yoomlam commented Sep 21, 2020

appeals with documents in Caseflow Reader/VBMS

I believe all appeals have documents in Reader/VBMS.

how many people viewed the docs in Reader vs. VBMS

We cannot get stats for VBMS usage.

Some additional ideas:

  • count of unique users (and their role: VLJ vs attorney) using Reader per month
    • "using Reader" = opened at least 3(?) docs in Reader
    • compare against total count of VLJs and attorneys (that review documents?). Use this for "98% adoption" goal if it means 'user adoption'.
  • histogram showing percentage of docs per appeal opened in Reader (as the x-axis) per month
    • count of appeals where Reader was used for 80%(?) of docs on appeal
    • Use this for "98% adoption" goal if "98%" refers to number of documents opened in Reader.

More detailed stats:

@alisan16
Copy link
Contributor Author

Great options! I pulled up the previous Caseflow Product Impact Statement from 2018 and check out page 26 - this may have been where the 98% came from, it's defined as "percentage of decisions where Reader was used to view case files". I can't seem to find the corresponding sql in the looker archives though.

@yoomlam
Copy link
Contributor

yoomlam commented Sep 21, 2020

OIT has asked us to provide this for the end of the fiscal year (9/30).
Planning to implement as a quick turnaround:

count of unique users (and their role: VLJ vs attorney) using Reader per month
compare against total count of VLJs and attorneys

@yoomlam
Copy link
Contributor

yoomlam commented Sep 22, 2020

TLDR: Go to the last section and run those commands.

First, Reader is used by different sets of user types:

  • Judges (aka VLJs) and attorneys
  • Intake users
  • ...

According to the metrics intent in Dustin's comment above, we are only concerned with VLJs and attorneys Reader adoption.

Next, how many users are a VLJ or attorney

A JudgeTeam is lead by a judge (aka "team admin") and whose members are attorneys working for that judge.

JudgeTeam.count
=> 124
JudgeTeam.active.group_by{|t| t.users.count}.transform_values(&:count)
=> {10=>39, 8=>10, 7=>2, 1=>15, 9=>34, 12=>4, 11=>14, 13=>1, 2=>3, 6=>1, 3=>1}
us=JudgeTeam.all.map(&:users).flatten
us.count
=> 1035
uus=us.uniq
us.uniq.count
=> 994

## Judges and attorney roles overlap:
uus.group_by(&:judge?).transform_values(&:count)
=> {true=>307, false=>687}
uus.group_by(&:attorney?).transform_values(&:count)
=> {false=>113, true=>881}
uus.group_by{|u| u.judge? && u.attorney?}.transform_values(&:count)
=> {false=>799, true=>195}
# It would be challenging to determine which role a user was when they viewed a document.

## How many of these users have the "Reader" role?
uus.group_by{|u| u.roles.include?("Reader")}.transform_values(&:count)
=> {true=>992, false=>2}
uus.reject{|u| u.roles.include?("Reader")}
=> [#<User:0x000000000983d3e0
  id: 2922,
  roles: ["Mail Intake"],
#<User:0x00000000094e56c0
  id: 3779,
  roles: ["Establish Claim", "Mail Intake"],  
uus.reject{|u| u.roles.include?("Reader")}.map{|u| [u.judge?, u.attorney?]}
=> [[false, true], [false, true]]

2 attorneys do not have the "Reader" role. Should they?

Now, query DocumentView for each group of users for the month

Since DocumentView is a huge table, some queries time out, so we'll have to query by week.
Since we're only counting unique users and we don't need to tally all document views across the entire month, we can reduce the number of queries in subsequent weeks by not querying users who have met the condition "used Reader this month".

month=Time.new(2020,9)
time_span=month..month.next_month
=> 2020-09-01 00:00:00 -0500..2020-10-01 00:00:00 -0500

## split_into_weeks
week=time_span.begin..time_span.begin.end_of_week
time_periods=[week]
while week.begin.next_week.end_of_week < time_span.end do
  week=week.begin.next_week..week.begin.next_week.end_of_week
  time_periods << week
end
time_periods << (week.begin.next_week..(time_span.end-1))

## query in small groups to avoid timing out
# unique users to query
uus=JudgeTeam.all.map(&:users).flatten.uniq
ud_hash={}
time_periods.each{|week|
  uus.in_groups_of(1) {|usrs|
    begin
      h=DocumentView.where(user: usrs, created_at: week).group(:user_id).count
      pp h
      ud_hash.merge!(h){ |key, v1, v2| v1+v2 }
    rescue
      pp usrs.pluck(:id)
    end  
  }

  # for subsequent queries, remove users who have met the condition
  pp ud_hash.count
  nr=uus.map(&:id) - ud_hash.select{|k,v| v>5}.keys;
  pp nr.count
  uus=nr.map{|i| User.find(i)}
  pp uus.count
}
ud_hash

Interim results

So far for September:

  • Less than 51 of the 994 (5%) JudgeTeam users have not used Reader

Make sure we've accounted for all users using Reader:

dv_users=[]
time_periods.each{|week| 
  dv_users += DocumentView.select(:user_id).where(created_at: week).distinct.to_a
}
dv_users = dv_users.uniq

users_queried = ... # uus for all user groups
unaccounted=dv_users - users_queried.map(&:id)

Running into query time out errors!

Taking a different approach using record id

Much faster! Ignore prior command; use this one.

month=Time.new(2020,9)
time_span=month..month.next_month
first_hours=time_span.begin..time_span.begin+1.hour
last_hours=time_span.end-1.hour..time_span.end

# Find the first record for the month
DocumentView.where(created_at: day).order(:id).first
# Times out
dv_first=DocumentView.where(id: 70000000..71500000).where(created_at: first_hours).order(:id).first
=> #<DocumentView:0x000000000ea6b428
 id: 71089325,
 document_id: 24049861,
 user_id: 965,
 first_viewed_at: Tue, 01 Sep 2020 04:00:03 UTC +00:00,
 created_at: Tue, 01 Sep 2020 04:00:03 UTC +00:00,
 
# Find the last record for the month
dv_last=DocumentView.where(id: dv_first.id..DocumentView.last.id).where(created_at: last_hours).order(:id).last
dv_last=DocumentView.last # since the end of the month hasn't arrived

counts=[]
tally={}
DocumentView.in_batches(of: 100_000, start: dv_first.id, finish: dv_last.id).each_with_index do |relation, batch_index|
  puts "Processing relation batch ##{batch_index}"
  result = relation.group(:user_id).count;
  counts << result
  tally.merge!(result){ |key,ov,tv| ov+tv }
  pp tally
end;
tally.count
=> 1104

uus=JudgeTeam.all.map(&:users).flatten.uniq
uus.reject{|u| tally[u.id]}.count
=> 38
remaining_uus=uus.reject{|u| tally[u.id]&.>5};
remaining_uus.count
=> 48

# Percent of the 994 JudgeTeam users who have **not** used Reader
remaining_uus.count.to_f/uus.count
=> 0.0482897384305835

# Other users that use Reader outside of JudgeTeams
others=tally.keys - uus.map(&:id)
others.count
=> 148
ous=others.map{|i| User.find(i)}
ous.select{|u| (u.roles & ["Mail Intake"]).empty?}.count
=> 69
ous.select{|u| (u.roles & ["Mail Intake"]).empty?}.group_by(&:roles).transform_values(&:count)
=> {["Reader"]=>28,
 ["Reader", "Hearing Prep", "Reader", "Admin Intake"]=>4,
 ["Reader", "Admin Intake", "Build HearSched", "Edit HearSched"]=>1,
 ["System Admin"]=>7,
 ["Reader", "Hearing Prep", "Reader"]=>4,
 ["Download eFolder", "Reader", "Edit HearSched"]=>2,
 ["Reader", "Build HearSched", "Edit HearSched"]=>4,
 ["Download eFolder", "Reader", "Admin Intake"]=>2,
 ["Download eFolder", "Reader"]=>5,
 ["Reader", "Edit HearSched"]=>5,
 ["Reader", "Admin Intake"]=>1,
 ["Download eFolder", "System Admin"]=>1,
 ["Download eFolder", "Reader", "Hearing Prep", "Reader"]=>1,
 ["Manage Claims Establishme", "Manage Claim Establishment", "Reader"]=>2,
 ["User", "Reader"]=>1,
 ["System Admin", "Reader"]=>1}

@yoomlam
Copy link
Contributor

yoomlam commented Sep 22, 2020

@alisan16 Results for the month so far. I'll rerun the query on 9/30 end-of-day and update the last row.

Notes:

  • According to the metrics intent from Dustin's comment above, we are only concerned with Reader adoption for VLJs and attorneys. Caseflow models a JudgeTeam, who is lead by a judge (aka "team admin") and whose members are attorneys working for that judge.
  • The set of JudgeTeam users changes over time; however, the percentages below are calculated based on the current set of 994 JudgeTeam users.
  • A user is considered to have "used Reader" if that user opened at least 5 different documents during the month.

Columns:

  • remaining users count: count of users who did not use Reader during the month
  • percent adoption: 1 - (remaining users count/994)
month remaining users count percent adoption
2019-10 146 0.85
2019-11 146 0.85
2019-12 146 0.85
2020-01 151 0.85
2020-02 160 0.84
2020-03 140 0.86
2020-04 117 0.88
2020-05 114 0.89
2020-06 81 0.92
2020-07 74 0.93
2020-08 44 0.96
2020-09 48 0.95

13 users did not use Reader across these months.

(Detailed) Reader adoption monthly stats

# US fiscal year
span=Time.new(2019,10)..Time.new(2020,10)-1

## find id of first DocumentView record for each month
def find_first_record_for(init_id, first_hours)
  curr_id=init_id
  dv_first=nil
  until dv_first do
    return nil if DocumentView.find(curr_id).first_viewed_at > first_hours.end # need to increase range of first_hours
    dv_first=DocumentView.where(id: curr_id..(curr_id+1_000_000)) .where(first_viewed_at: first_hours).order(:id).first
    curr_id=curr_id+1_000_000
    pp curr_id, first_hours
  end
  dv_first
end

month=span.begin.beginning_of_month
curr_id=40_000_000
monthly_ids={}
while month < span.end do
  first_hours=month..month+6.hour
  dv_first=find_first_record_for(curr_id, first_hours)
  monthly_ids[month] = dv_first.id
  pp monthly_ids
  curr_id = dv_first.id
  month=month.next_month
end
monthly_ids
{2019-10-01 00:00:00 -0400=>48034519,
 2019-11-01 00:00:00 -0400=>50061717,
 2019-12-01 00:00:00 -0500=>52019658,
 2020-01-01 00:00:00 -0500=>53970029,
 2020-02-01 00:00:00 -0500=>56218777,
 2020-03-01 00:00:00 -0500=>58418293,
 2020-04-01 00:00:00 -0400=>60862816,
 2020-05-01 00:00:00 -0400=>63086836,
 2020-06-01 00:00:00 -0400=>65083340,
 2020-07-01 00:00:00 -0400=>67220638,
 2020-08-01 00:00:00 -0400=>69209253,
 2020-09-01 00:00:00 -0400=>71089325}

# Double-check first and last DocumentView dates (in UTC) for each month
month=span.begin.beginning_of_month
dv_first_last=[]
dv_first_last_ids=[]
while month < span.end do
  dv_first=DocumentView.find(monthly_ids[month])
  dv_last=monthly_ids[month.next_month] ? DocumentView.find(monthly_ids[month.next_month]-1) : DocumentView.last
  dv_first_last << [dv_first.created_at, dv_last.created_at]
  dv_first_last_ids << [dv_first.id, dv_last.id]
  month=month.next_month
end
dv_first_last
=> [[Tue, 01 Oct 2019 04:00:00 UTC +00:00, Fri, 01 Nov 2019 03:59:48 UTC +00:00],
 [Fri, 01 Nov 2019 04:00:10 UTC +00:00, Sun, 01 Dec 2019 04:58:29 UTC +00:00],
 [Sun, 01 Dec 2019 05:01:59 UTC +00:00, Wed, 01 Jan 2020 04:47:26 UTC +00:00],
 [Wed, 01 Jan 2020 05:35:37 UTC +00:00, Sat, 01 Feb 2020 04:59:18 UTC +00:00],
 [Sat, 01 Feb 2020 05:00:04 UTC +00:00, Sun, 01 Mar 2020 02:29:28 UTC +00:00],
 [Sun, 01 Mar 2020 07:52:01 UTC +00:00, Wed, 01 Apr 2020 03:59:40 UTC +00:00],
 [Wed, 01 Apr 2020 04:00:14 UTC +00:00, Fri, 01 May 2020 03:59:58 UTC +00:00],
 [Fri, 01 May 2020 04:00:05 UTC +00:00, Mon, 01 Jun 2020 03:55:47 UTC +00:00],
 [Mon, 01 Jun 2020 04:00:39 UTC +00:00, Wed, 01 Jul 2020 03:59:57 UTC +00:00],
 [Wed, 01 Jul 2020 04:00:01 UTC +00:00, Sat, 01 Aug 2020 03:59:56 UTC +00:00],
 [Sat, 01 Aug 2020 04:00:10 UTC +00:00, Tue, 01 Sep 2020 03:59:53 UTC +00:00],
 [Tue, 01 Sep 2020 04:00:03 UTC +00:00, Tue, 22 Sep 2020 21:05:28 UTC +00:00]]
dv_first_last_ids
=> [[48034519, 50061716],
 [50061717, 52019657],
 [52019658, 53970028],
 [53970029, 56218776],
 [56218777, 58418292],
 [58418293, 60862815],
 [60862816, 63086835],
 [63086836, 65083339],
 [65083340, 67220637],
 [67220638, 69209252],
 [69209253, 71089324],
 [71089325, 72682542]]

# Get tallies for each month
## reduce verbosity
ActiveRecord::Base.logger.level = :warn 
## May take a while
tallies=dv_first_last_ids.map{|dv_first_id, dv_last_id|
  counts=[]
  tally={}
  DocumentView.in_batches(of: 100_000, start: dv_first_id, finish: dv_last_id).each_with_index do |relation, batch_index|
    puts "Processing relation batch ##{batch_index}"
    result = relation.group(:user_id).count;
    counts << result
    tally.merge!(result){ |key,ov,tv| ov+tv }
    pp tally
  end;
  [DocumentView.find(dv_first_id).created_at.beginning_of_month, tally]
};

# Compute metric
uniq_users=JudgeTeam.all.map(&:users).flatten.uniq;
tallies.map{|month, tally|
  remaining=uniq_users.reject{|u| tally[u.id]&.>5};
  # calculate percent of the JudgeTeam users who have used Reader
  [month.strftime("%Y-%m"), remaining.count, "%0.2f" % (1.0 - remaining.count.to_f/uniq_users.count)]
}
=> [
 ["2019-10", 146, "0.85"],
 ["2019-11", 146, "0.85"],
 ["2019-12", 146, "0.85"],
 ["2020-01", 151, "0.85"],
 ["2020-02", 160, "0.84"],
 ["2020-03", 140, "0.86"],
 ["2020-04", 117, "0.88"],
 ["2020-05", 114, "0.89"],
 ["2020-06", 81, "0.92"],
 ["2020-07", 74, "0.93"],
 ["2020-08", 44, "0.96"],
 ["2020-09", 48, "0.95"]]
# Note the increasing trend of Reader adoption over time. 
# Caveat: the set of JudgeTeam users change over time; percentages are calculated based on the current JudgeTeam users.

# How many users did not use Reader at all across these months?
remaining_users=uniq_users;
tallies.each{|month, tally| remaining_users-=uniq_users.select{|u| tally[u.id]&.>5} };
remaining_users.count
=> 13
remaining_users.pluck(:id, :css_id)
=> [[1098, "BVACMURRAY"],
 [1475, "VACORASOON"],
 [16912, "VACOVRTISE"],
 [16908, "VACOSLOMKC"],
 [1706, "VACOVEMULR"],
 [2291, "VACOWERDAP1"],
 [2922, "VBAPHOTHOMPM"],
 [1995, "VACOSHELTA"],
 [3779, "VBASAJVALENN"],
 [869, "VACOGRIFFJ"],
 [2023, "VACOBOGNAS"],
 [2275, "VACOMCLEOK1"],
 [1530, "BVARSCHARNB"]]

remaining_users=uniq_users;
tallies.each{|month, tally| remaining_users-=uniq_users.select{|u| tally[u.id]&.>1} }.count
=> 12

@yoomlam
Copy link
Contributor

yoomlam commented Sep 30, 2020

@alisan16 Reader adoption for FY20:
Using percent adoption = 1 - (number of users who did not use Reader/unique users in JudgeTeams),
1 - (11 / 992) = 98.9%

  • A user is considered to have "used Reader" if that user opened at least 5 different documents during the month.

An alternative to the Rails commands below is Metabase dashboard 14.

Updated detailed stats

dv_first_last_ids
=> [[48034519, 50061716],
 [50061717, 52019657],
 [52019658, 53970028],
 [53970029, 56218776],
 [56218777, 58418292],
 [58418293, 60862815],
 [60862816, 63086835],
 [63086836, 65083339],
 [65083340, 67220637],
 [67220638, 69209252],
 [69209253, 71089324],
 [71089325, 73204919]]

uniq_users=JudgeTeam.all.map(&:users).flatten.uniq;
uniq_users.count
=> 992 # This number was 994 last week.

tallies.map{|month, tally|
  remaining=uniq_users.reject{|u| tally[u.id]&.>5};
  # calculate percent of the JudgeTeam users who have used Reader
  [month.strftime("%Y-%m"), remaining.count, "%0.2f" % (1.0 - remaining.count.to_f/uniq_users.count)]
}
=> [
  ["2019-10", 146, "0.85"],
 ["2019-11", 145, "0.85"],
 ["2019-12", 145, "0.85"],
 ["2020-01", 150, "0.85"],
 ["2020-02", 159, "0.84"],
 ["2020-03", 139, "0.86"],
 ["2020-04", 117, "0.88"],
 ["2020-05", 114, "0.89"],
 ["2020-06", 81, "0.92"],
 ["2020-07", 75, "0.92"],
 ["2020-08", 44, "0.96"],
 ["2020-09", 42, "0.96"]]

remaining_users=uniq_users;
tallies.each{|month, tally| remaining_users-=uniq_users.select{|u| tally[u.id]&.>5} };
remaining_users.count
=> 11

remaining_users=uniq_users;
tallies.each{|month, tally| remaining_users-=uniq_users.select{|u| tally[u.id]&.>1} }.count
remaining_users.count
=> 10

remaining_users.pluck(:id, :css_id)
=> [[1098, "BVACMURRAY"],
 [1475, "VACORASOON"],
 [1706, "VACOVEMULR"],
 [2291, "VACOWERDAP1"],
 [2922, "VBAPHOTHOMPM"],
 [1995, "VACOSHELTA"],
 [3779, "VBASAJVALENN"],
 [869, "VACOGRIFFJ"],
 [2023, "VACOBOGNAS"],
 [1530, "BVARSCHARNB"]]

@alisan16
Copy link
Contributor Author

alisan16 commented Oct 6, 2020

Delivered to the Board on 10/6/2020. Next due date for this metric is 10/1/2021.

@alisan16 alisan16 closed this as completed Oct 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Eng: Data Mark when data has been backfilled, or the issue has any data integrity concerns Priority: Medium Blocking issue w/workaround, or "second in" priority for new work. Stakeholder: BVA Functionality associated with the Board of Veterans' Appeals workflows/feature requests Team: Delta 🔺 Type: Metrics or Reporting
Projects
None yet
Development

No branches or pull requests

4 participants