Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Regression test required] [$500] HIGH: Add Avatar > About > Troubleshooting section #34082

Closed
quinthar opened this issue Jan 8, 2024 · 138 comments
Assignees
Labels
Awaiting Payment Auto-added when associated PR is deployed to production Daily KSv2 Internal Requires API changes or must be handled by Expensify staff NewFeature Something to build that is a new item. Task

Comments

@quinthar
Copy link
Contributor

quinthar commented Jan 8, 2024

Strategy:
A billion users means a billion problems. To diagnose and solve them at scale, we capture client-side logs and upload centrally.

Problem:
Some problems can't be diagnosed cleanly after the fact from the logs, for a few reasons:

  • External contributors don't have access to production logs.
  • Internal engineers do have production access, but they're often cumbersome to access.
  • The logs are often ambiguous, such as if you have multiple devices connected and tabs open at the same time -- reverse engineering what happened for a particular client is confusing and time consuming.
  • Observing logs in realtime is cumbersome or sometimes impossible, which slows down and complicates testing.
  • Even if you have all the access you need, when talking with a user sometimes it's difficult to find the logs specific to them.

Solution:
Let's expand our ability to remotely diagnose production devices with better dev tools built into the app itself. Specifically:

  • Add a Troubleshooting section to the Avatar > About page, just above Report a bug
    • This will always be available, both on staging and production builds
  • Move the following Test preferences settings into this new menu:
    • Use staging server
    • Force offline
    • Simulate failing network requests
    • Authentication status: Invalidate
    • Device credentials: Destroy
  • Add Reset and refresh, which wipes Onyx clean except for the minimum needed to call OpenApp (so you don't need to reauthenticate)
  • Add View console which opens a simple console:
    • This would first open up and initialize with all locally stored logs, so you can scroll back and see what happened
    • It would also append new logs in realtime
    • There would also be a Save button that just copies all available logs into a text file that is saved on the device
    • Add a Share button that brings up a Search selector and just posts the logs into the selected room (and then leaves the user navigated there, so they can talk about them). I'm imagining this would be useful for chatting with a real world user and asking them to do something and send you the logs.
    • At the bottom is a command line that works similar to the Chrome console, where you can execute arbitrary JavaScript, with the results output to the console

This is blocking #33040.

Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~01c2a08b8b8ae90e21
  • Upwork Job ID: 1744202378047897600
  • Last Price Increase: 2024-01-15
  • Automatic offers:
    • fedirjh | Reviewer | 28115102
@quinthar quinthar added External Added to denote the issue can be worked on by a contributor Daily KSv2 NewFeature Something to build that is a new item. Task labels Jan 8, 2024
@melvin-bot melvin-bot bot changed the title HIGH: Add Avatar > About > Troubleshooting section [$500] HIGH: Add Avatar > About > Troubleshooting section Jan 8, 2024
Copy link

melvin-bot bot commented Jan 8, 2024

Job added to Upwork: https://www.upwork.com/jobs/~01c2a08b8b8ae90e21

Copy link

melvin-bot bot commented Jan 8, 2024

@melvin-bot melvin-bot bot added Weekly KSv2 Help Wanted Apply this label when an issue is open to proposals by contributors and removed Daily KSv2 labels Jan 8, 2024
Copy link

melvin-bot bot commented Jan 8, 2024

Triggered auto assignment to Contributor-plus team member for initial proposal review - @fedirjh (External)

@kmwamasali
Copy link

kmwamasali commented Jan 8, 2024

Proposal

Please re-state the problem that we are trying to solve in this issue.

External contributors don't have access to production logs.
Internal engineers do have production access, but they're often cumbersome to access.
The logs are often ambiguous, such as if you have multiple devices connected and tabs open at the same time -- reverse engineering what happened for a particular client is confusing and time consuming.
Observing logs in realtime is cumbersome or sometimes impossible, which slows down and complicates testing.
Even if you have all the access you need, when talking with a user sometimes it's difficult to find the logs specific to them.

What is the root cause of that problem?

What changes do you think we should make in order to solve the problem?

Expand app ability to remotely diagnose production devices with better dev tools built into the app itself. Specifically:

Add a Troubleshooting section to the Avatar > About page, just above Report a bug
This will always be available, both on staging and production builds
Move the following Test preferences settings into this new menu:
Use staging server
Force offline
Simulate failing network requests
Authentication status: Invalidate
Device credentials: Destroy
Add Reset and refresh, which wipes Onyx clean except for the minimum needed to call OpenApp (so you don't need to reauthenticate)
Add View console which opens a simple console:
This would first open up and initialize with all locally stored logs, so you can scroll back and see what happened
It would also append new logs in realtime
There would also be a Save button that just copies all available logs into a text file that is saved on the device
Add a Share button that brings up a Search selector and just posts the logs into the selected room (and then leaves the user navigated there, so they can talk about them). I'm imagining this would be useful for chatting with a real world user and asking them to do something and send you the logs.
At the bottom is a command line that works similar to the Chrome console, where you can execute arbitrary JavaScript, with the results output to the console

@shubham1206agra
Copy link
Contributor

Can I directly work on this without a proposal?

@Tony-MK
Copy link
Contributor

Tony-MK commented Jan 8, 2024

Proposal

Please re-state the problem that we are trying to solve in this issue.

Some problems can't be diagnosed cleanly after the fact from the logs.

What is the root cause of that problem?

The settings page is not optimal for diagnosing.

What changes do you think we should make in order to solve the problem?

Troubleshoot

  1. In the AboutPage, create a menu item titled Troubleshoot above the ReportBug menu item on this line
    Sample Component
 { 
     translationKey: 'initialSettingsPage.aboutPage.troubleshoot', 
     icon: Expensicons.Troubleshoot, 
     action: waitForNavigate(Report.navigateToTroubleshoot), 
 },

Test preferences

  1. Add a new menu item called Test preferences to the settings page.
  2. Create a menu item on the settings menu for the new page.
  3. Move the necessary menu items (use staging server, force offline, etc..) to the new page.

Reset and refresh

  1. Add another item in the Test preferences page called Reset and refresh.
  2. Set its action to a created function to refresh the page after setting every onyx collection, except for data required to call the OpenApp endpoint, to null.

View console

  1. Add a new menu item called View console to the Test preferences page.
  2. Bind on console.log, console.info, console.info, etc... depending on the type of messages we want to collect, then render before/after calling console.stdlog.apply.
  3. Create a component to get the logs from the console and render them.
  4. Create an input component for the js script and execute the input script using the inbuilt eval function. I recommend a feature to access the history of input scripts and replay if desired.
  5. Add a share button on the header component to take the required logs and send them to the appropriate channels.

@quinthar
Copy link
Contributor Author

quinthar commented Jan 8, 2024 via email

@samsonmobisa
Copy link

This is a thoughtful and comprehensive proposal that tackles the challenge that's encountered in diagnosing and solving issues at scale. I appreciate the clarity in the suggested changes, especially the addition of the Troubleshooting section and the integration of better dev tools within the app.

The proposed enhancements, such as the Console View, Reset and Refresh option, and the ability to Share Logs, would significantly improve the ability to remotely diagnose production devices. I particularly like the idea of having a command line for executing JavaScript, resembling the Chrome console.

I'm eager to contribute to the implementation of these features. Before diving in, is there any specific technical considerations or preferences regarding the implementation? Additionally, if there are any specific testing scenarios or edge cases to keep in mind, please provide guidance.

Looking forward to collaborating on this improvement

Copy link

melvin-bot bot commented Jan 8, 2024

📣 @samsonmobisa! 📣
Hey, it seems we don’t have your contributor details yet! You'll only have to do this once, and this is how we'll hire you on Upwork.
Please follow these steps:

  1. Make sure you've read and understood the contributing guidelines.
  2. Get the email address used to login to your Expensify account. If you don't already have an Expensify account, create one here. If you have multiple accounts (e.g. one for testing), please use your main account email.
  3. Get the link to your Upwork profile. It's necessary because we only pay via Upwork. You can access it by logging in, and then clicking on your name. It'll look like this. If you don't already have an account, sign up for one here.
  4. Copy the format below and paste it in a comment on this issue. Replace the placeholder text with your actual details.
    Screen Shot 2022-11-16 at 4 42 54 PM
    Format:
Contributor details
Your Expensify account email: <REPLACE EMAIL HERE>
Upwork Profile Link: <REPLACE LINK HERE>

@Tony-MK
Copy link
Contributor

Tony-MK commented Jan 8, 2024

I have some questions about the View console expected result.

  1. Should the View console page be full screen because long lines of code will be displayed and the console input component be wide enough to type in scripts?

  2. Should we have a separate output log component to differentiate the output from the user-performed scripts? Another option is to color or tag it differently from the rest of the application's console log.

  3. Are we going to preserve logs after the window refreshes?

@samsonmobisa
Copy link

Contributor details
Your Expensify account email: [email protected]
Upwork Profile Link: https://www.upwork.com/freelancers/~018e78ebb343f52495

Copy link

melvin-bot bot commented Jan 8, 2024

✅ Contributor details stored successfully. Thank you for contributing to Expensify!

@samsonmobisa
Copy link

Have already submitted my details

@Piyush-Kumar-Ghosh
Copy link

Piyush-Kumar-Ghosh commented Jan 8, 2024

Problem Overview:
The current challenge involves diagnosing issues at scale, where existing logs may not provide a clear understanding of problems. External contributors lack access to production logs, and even for internal engineers with access, retrieving logs can be cumbersome and time-consuming. The logs' ambiguity, coupled with difficulties in observing real-time logs, further complicates issue resolution.

Proposed Changes:

  1. Troubleshooting Section:

    • Add a Troubleshooting section to the Avatar > About page, above Report a bug.
    • Always available on staging and production builds.
  2. Test Preferences Settings:

    • Move Test preferences settings to the Troubleshooting menu.
    • Settings include Use staging server, Force offline, Simulate failing network requests, Invalidate Authentication status, and Destroy Device credentials.
  3. Reset and Refresh:

    • Introduce a Reset and refresh option in Troubleshooting.
    • Wipes Onyx clean, retaining only essentials for OpenApp calls, avoiding ReAuthentication.
  4. View Console:

    • Add a View console option in Troubleshooting.
    • Opens a console with locally stored logs, allowing real-time updates.
    • Includes a Save button for exporting logs to a text file on the device.
  5. Share Logs:

    • Include a Share button in Troubleshooting.
    • Facilitates posting logs into a selected room for collaboration.
    • Enhances communication with users for issue resolution.
  6. Command Line:

    • Integrate a command line at the bottom of the console.
    • Allows execution of arbitrary JavaScript, akin to the Chrome console.

Implementation Details:

  1. Troubleshooting sections should have a clear UI design, ensuring easy navigation.
  2. Test preferences settings must seamlessly transition to the Troubleshooting menu.
  3. Reset and refresh should be implemented efficiently to maintain essential data.
  4. View consoles should offer a user-friendly interface, supporting both existing and real-time logs.
  5. Share logs feature should provide options for selecting rooms and streamline the collaboration process.
  6. Command line functionality should resemble the Chrome console, with proper error handling.

Expected Results:

  1. Improved accessibility for external contributors and internal engineers.
  2. Enhanced clarity in log interpretation and diagnosis.
  3. Streamlined troubleshooting process with efficient tools.
  4. Real-time log observation for quicker issue resolution.
  5. Seamless collaboration through log sharing.

I would appreciate your feedback and insights on this proposal. Looking forward to discussing this further.

Copy link

melvin-bot bot commented Jan 8, 2024

📣 @Piyush-Kumar-Ghosh! 📣
Hey, it seems we don’t have your contributor details yet! You'll only have to do this once, and this is how we'll hire you on Upwork.
Please follow these steps:

  1. Make sure you've read and understood the contributing guidelines.
  2. Get the email address used to login to your Expensify account. If you don't already have an Expensify account, create one here. If you have multiple accounts (e.g. one for testing), please use your main account email.
  3. Get the link to your Upwork profile. It's necessary because we only pay via Upwork. You can access it by logging in, and then clicking on your name. It'll look like this. If you don't already have an account, sign up for one here.
  4. Copy the format below and paste it in a comment on this issue. Replace the placeholder text with your actual details.
    Screen Shot 2022-11-16 at 4 42 54 PM
    Format:
Contributor details
Your Expensify account email: <REPLACE EMAIL HERE>
Upwork Profile Link: <REPLACE LINK HERE>

@LLPeckham
Copy link
Contributor

Ahh yes. How quickly I forget. Thanks 😂

@shawnborton
Copy link
Contributor

Hmm good question. I think technically Troubleshoot should only get the full card treatment if we make it its own standalone section in the LHN and pull it out of About. I don't think that's a bad idea... we might want to put it just below About in the Settings LHN? cc @muttmuure @Expensify/design for thoughts on that. But I think this one could quite easily be converted to that wider card style that other sections are using.

@dannymcclain
Copy link
Contributor

I think technically Troubleshoot should only get the full card treatment if we make it its own standalone section in the LHN and pull it out of About. I don't think that's a bad idea... we might want to put it just below About in the Settings LHN?

I honestly don't feel super strongly either way. I agree that it could easily become it's own standalone page (and I like your suggestion of putting it underneath the About nav item).

Copy link

melvin-bot bot commented Feb 15, 2024

⚠️ Looks like this issue was linked to a Deploy Blocker here

If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.

If a regression has occurred and you are the assigned CM follow the instructions here.

If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.

@melvin-bot melvin-bot bot added Reviewing Has a PR in review Weekly KSv2 Awaiting Payment Auto-added when associated PR is deployed to production and removed Daily KSv2 Weekly KSv2 labels Feb 16, 2024
@melvin-bot melvin-bot bot changed the title [$500] HIGH: Add Avatar > About > Troubleshooting section [HOLD for payment 2024-02-26] [$500] HIGH: Add Avatar > About > Troubleshooting section Feb 19, 2024
@melvin-bot melvin-bot bot removed the Reviewing Has a PR in review label Feb 19, 2024
Copy link

melvin-bot bot commented Feb 19, 2024

Reviewing label has been removed, please complete the "BugZero Checklist".

Copy link

melvin-bot bot commented Feb 19, 2024

The solution for this issue has been 🚀 deployed to production 🚀 in version 1.4.42-5 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2024-02-26. 🎊

For reference, here are some details about the assignees on this issue:

Copy link

melvin-bot bot commented Feb 19, 2024

BugZero Checklist: The PR adding this new feature has been merged! The following checklist (instructions) will need to be completed before the issue can be closed:

  • [@fedirjh] Please propose regression test steps to ensure the new feature will work correctly on production in further releases.
  • [@greg-schroeder] Link the GH issue for creating/updating the regression test once above steps have been agreed upon.

@marcochavezf
Copy link
Contributor

Hi @TMisiukiewicz! We got an alert in the backend that we're receiving a massive message log from NewDot, and it seems it's coming from this PR that was deployed recently. This is an example of the message:

LogAPI - Massive log message given ~~ 0: '[info] [Onyx] merge() called for key: logs properties: 1708058686372,1708058686377,1708058696327,1708058730961,1708058732236,1708058732237,1708058734681,1708058734923,1708058734924,1708058734925,1708058734926,1708058734934,1708058734983,1708058734984,1708058734985,1708058734986,1708058735023,1708058735064,1708058735083,1708058735101,1708058735102,1708058735106,1708058735138,1708058735153,1708058735167,1708058735302,1708058735331,1708058735446,1708058735453,1708058735459,1708058735465,1708058735572,1708058735586,1708058735588,1708058735771,1708058735787,1708058735859,1708058735877,1708058735886,1708058735887,1708058735992,1708058735993,1708058736070,1708058736119,1708058736120,1708058737117,1708058737118,1708058737128,1708058737139,1708058741789,1708058741809,1708058741810,1708058741811,1708058741812,1708058741966,1708058741967,1708058741976,1708058742114,1708058748941,1708058749061,1708058749062,1708058749070,1708058749209,1708058749218,1708058749219,1708058749233,1708058749235,1708058749272,1708058749573,170'

We are already trimming the message in the backend using this logic:

if (strlen($log['message']) > 1024 * 1024) {
    self::alert('Massive log message given', substr($log['message'], 0, 1024));
} 

And I was wondering if we do the same in the frontend also to avoid sending large amounts of data for the logs, thoughts?

@TMisiukiewicz
Copy link
Contributor

Hi @marcochavezf! That's weird because there is a condition to not attach logs with pattern [Onyx] merge() called for key: logs because otherwise it would create an infinite loop and end up with tons of huge log messages (since new log is attached to logs, which triggers logging a message that merge was called for key logs... and so on). Is this happening frequently? My only idea for now is someone did some changes in the codebase locally and removed this condition

@marcochavezf
Copy link
Contributor

Oh interesting, I happened several times yesterday, and according to the logs it wasn't coming from a local environment. I will report again if the alert pops up again, thanks!

@melvin-bot melvin-bot bot added Daily KSv2 and removed Weekly KSv2 labels Feb 26, 2024
@greg-schroeder
Copy link
Contributor

Reviewing

@greg-schroeder
Copy link
Contributor

Paid!

@fedirjh can you create a regression test for this new feature? Thanks!

@greg-schroeder greg-schroeder changed the title [HOLD for payment 2024-02-26] [$500] HIGH: Add Avatar > About > Troubleshooting section [Regression test required] [$500] HIGH: Add Avatar > About > Troubleshooting section Feb 26, 2024
@fedirjh
Copy link
Contributor

fedirjh commented Feb 26, 2024

Regression Test Proposal

Test 1 :

1. Go to Profile -> About
2. Verify "Troubleshoot" is available in the menu
3. Go to Troubleshoot
4. Verify "Test preferences" section is visible
5. Press "Reset and refresh"
6. Confirm the modal
7. Go back to home screen
8. Verify list of chats is visible
9. If you are on the home screen and the list is not visible yet, wait a moment until it's available (it means data is still loading)

Test 2 :

1. Go to Profile -> About -> Troubleshoot
2. Verify "Client side logging" is disabled
3. Enable "Client side logging"
4. Verify "View debug console" option appears
5. Press "View debug console"
6. Verify Debug console screen is open and logs are displayed in a console in real-time
7. Type 5+5 in the input below the console and press "Execute"
8. Verify 10 was displayed in a console
9. Press "Save log"
10. Verify file is downloaded to a device
11. Press "Share log"
12. Verify you are navigated to the search page
13. Press on any report
14. Verify you are moved to the report and logs are sent as an attachment
  • Do we agree 👍 or 👎

@techievivek
Copy link
Contributor

Nice, regression tests look good to me. 👍 CC @greg-schroeder

@greg-schroeder
Copy link
Contributor

LGTM! Filing!

@github-project-automation github-project-automation bot moved this from HIGH to CRITICAL in [#whatsnext] #vip-vsb Feb 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Awaiting Payment Auto-added when associated PR is deployed to production Daily KSv2 Internal Requires API changes or must be handled by Expensify staff NewFeature Something to build that is a new item. Task
Projects
No open projects
Status: CRITICAL
Development

No branches or pull requests