Fix Graphs & Configure Infrastructure #1023
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@vconzola Don't focus on the graph data -- this PR is not for that. See the breadcrumbs, name, dropdown options for the time range... that kind of stuff. If those are good, we can consider the UX on track. I'll look to put more effort into the x-axis and the stacked line chart in a future PR.
A couple small things...
Not yet -- but they will be -- I have a TODO in the code to divide them. But without real data for these things, it's hard to test.
Understood, I'll look to update.
d5c8ad7 to 36ba734
I've been through the code, nothing really caught my eye as being obviously wrong - if that's worth anything 😆
It has been very useful for me to see what areas of the codebase you're changing for metrics / what components you're using however.
So far I haven't been able to get your KFDef working in OSD, I'll have another crack at that tomorrow
Noticed a couple refactor typos to correct 😓 Whoops.
timeframe: TimeframeTitle,
lastUpdateTime: number,
setLastUpdateTime: (time: number) => void,
): {
data: Record<ModelServingMetricType, ContextResourceData<PrometheusQueryRangeResultValue>>;
data: Record<RuntimeMetricType, ContextResourceData<PrometheusQueryRangeResultValue>>;
Typo -- needs to have both types; sad TypeScript didn't catch this during my refactoring.
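As a rough illustration of "both types", the sketch below keys a `Record` on the union of two metric enums, so the compiler reports an error if any metric of either kind is left out of the object. The enum members, `InferenceMetricType`, `ResourceData`, and `DataPoint` are hypothetical stand-ins and are not taken from this PR; the real return type may be shaped differently.

```typescript
// Hypothetical metric enums; member names and values are illustrative only.
enum RuntimeMetricType {
  AVG_RESPONSE_TIME = 'runtime_avg-response-time',
  REQUEST_COUNT = 'runtime_request-count',
}

enum InferenceMetricType {
  REQUEST_COUNT_SUCCESS = 'inference_request-count-success',
  REQUEST_COUNT_FAILED = 'inference_request-count-failed',
}

// Simplified stand-ins for ContextResourceData<PrometheusQueryRangeResultValue>.
type DataPoint = [number, string];
type ResourceData<T> = { data: T[]; loaded: boolean };

// Keying the Record on the union requires an entry for every member of both enums.
type AllMetricData = Record<RuntimeMetricType | InferenceMetricType, ResourceData<DataPoint>>;

const emptyResource = (): ResourceData<DataPoint> => ({ data: [], loaded: false });

// Removing any one of these four entries is a compile-time error.
const metricData: AllMetricData = {
  [RuntimeMetricType.AVG_RESPONSE_TIME]: emptyResource(),
  [RuntimeMetricType.REQUEST_COUNT]: emptyResource(),
  [InferenceMetricType.REQUEST_COUNT_SUCCESS]: emptyResource(),
  [InferenceMetricType.REQUEST_COUNT_FAILED]: emptyResource(),
};

console.log(Object.keys(metricData).length);
```

Whether the provider should return one combined record or a separate record per page type is a design choice this sketch does not settle.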
import RuntimeGraphs from '~/pages/modelServing/screens/metrics/RuntimeGraphs';
import { MetricType } from '~/pages/modelServing/screens/types';

const ProjectInferenceMetricsWrapper: React.FC = () => {
Typo
return (
<ModelServingMetricsProvider queries={queries} type={MetricType.RUNTIME}>
<MetricsPage
title={`ovm metrics`}
This is a fixed label, but we currently support custom runtimes, so it might be weird to have a different runtime and still display ovms. Also, I think it is ovms (OpenVINO Model Server).
I understand it's fixed, this will need to be adjusted when we get the runtime proper changes...
cc @vconzola please address the comment about the name -- I was given this "ovm" concept from you. I can definitely expand it -- I just didn't know what OVM was 😛
@andrewballantyne As I mentioned in my comment above, "ovms" is the model server type - OpenVINO Model Server, which comes from the Type column of the model server table. Currently that's all we support. But as we add support for Watson Core Serving and other runtimes, this value will change.
But for the model server metrics the breadcrumb should be "Data science projects > Andrew's test > ovms metrics", where "ovms" is the model server type
@vconzola you explicitly said "ovm"... do we want to expand it to the full name? That's fine, just need clarification. Do we say "OpenVINO Model Serving" anywhere? 🤔
And yes, I am aware it is fixed text... but unless we have all of this text floating around in objects (I don't think we do) it's hardcoded until we support custom runtimes
"ovms" is what shows up in the model server table Type column. I'm not sure where that text comes from because the user currently doesn't select a runtime. It must come from the backend someplace. I want what's in the chart title to match what's in the table so the user can make a 1-1 connection. Just FYI, once we support multiple servers (which should be next sprint, I think) the "Type" is going to be replaced by a "Name", so all this will change. I'll show what I mean in the UX meeting tomorrow.
The ServingRuntime object has a name attribute; the default one has ovms, for example. I think we can dynamically fetch that name, since it can change depending on the custom runtime installed.
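A minimal sketch of that suggestion, assuming the name lives at `metadata.name` as on other Kubernetes resources. `ServingRuntimeLike`, `getRuntimeMetricsLabel`, and the fallback text are hypothetical names for illustration, not existing dashboard code.

```typescript
// Assumed minimal shape of a ServingRuntime resource; only the field used here.
type ServingRuntimeLike = {
  metadata: { name: string };
};

// Hypothetical fallback wording for when no runtime (or an empty name) is available.
const DEFAULT_RUNTIME_LABEL = 'Model server';

// Builds the title / breadcrumb label from the runtime name instead of a hardcoded string.
const getRuntimeMetricsLabel = (runtime?: ServingRuntimeLike): string => {
  const name = runtime?.metadata.name.trim();
  return `${name || DEFAULT_RUNTIME_LABEL} metrics`;
};

console.log(getRuntimeMetricsLabel({ metadata: { name: 'ovms' } })); // "ovms metrics"
console.log(getRuntimeMetricsLabel()); // "Model server metrics"
```

The same helper could feed both the page title and the breadcrumb label, so the two stay in sync with the table.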
link: `/projects/${currentProject.metadata.name}`,
},
{
label: `ovm metrics`,
Same with this label.
Will adjust the PR tomorrow -- might even get some queries as well 🎉
Going to merge this into the feature branch -- we can swing back to the comments pending and the right queries. I've made notes on the ticket so the conversations are not lost. Adding labels manually so @alexcreasy can work off this refactor work.
[APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
* Rework Inference Metrics & Add Runtime Metrics (invalid queries)
* Add stacked line chart functionality
* Re-enable Metrics
* Fix Graphs & Configure Infrastructure (#1023)
* Rework Inference Metrics & Add Runtime Metrics (invalid queries)
* Add stacked line chart functionality
* Trustyai demo phase0 (#1093)
* Explainability: Fairness and Bias Metrics (Phase 0) (#1001) (#1006) (#1007) (#1008)
- Initial feature set for TrustyAI related UI functionality
- Adds tab based navigation to modelServing screen
- Adds a bias metrics tab with charts for visualising SPD and DIR metrics
- Enhances prometheus query features for accessing TrustyAI data
- Enhancements to MetricsChart component making it more configurable
* Update key of request name to match trusty backend
* Remove unnecessary div and inline style from tooltip
* Remove 15 minutes refresh option
* Prefer optional prop to type union with undefined
* Move function definitions inline
* Prefer narrowing over type conversion
* Inline tab change handler
* Remove toolbar option from ApplicationsPage
* Inline domain calculator functions
* Move defaultDomainCalculator to utils
* Return null instead of undefined
* Use threshold label instead of index for key
* Add enum for tab keys
* Remove magic numbers from domain calculations
* Make ResponsePredicate mandatory and add predicate to useQueryRangeResourceData
* TrustyAI Client (#1318)
* Add support for insecure http requests in development mode
* Adds low level API client for TrustyAI service
* Adds TrustyAI high level API and contexts
* Get scheme of TrustyAI route from k8s data
* Add model bias configuration table (#1290)
* Add model bias configuration table
* rebase and remove mock data
* Update Trusty AI client to handle API changes (#1336) (#1337)
* Add bias metrics configuration modal (#1343)
* Add configuration modal
* address comments
* get rid of some TODOs and refine the route
* Multi-metric display on model bias screen (#1273) (#1349)
* Enhancements to model bias screen
* Display of multiple bias charts simultaneously
* Multi-select component, allowing free text, or select-from-list selection of charts to display
* Ability to collapse / expand individual charts
* User selectable refresh rates of chart data
* Chart selection and open / closed status is persisted to session cache for life of user's browser session
* Display user defined threshold values on charts (#1163)
* Clean up of bias chart logic
* Displays thresholds chosen by user, or defaults if none.
* Improves domain and threshold calculation based on user values or defaults
* Fix metrics submission issue and handle errors (#1378)
* Fix metrics submission issue and handle errors
* fix lint issue
* use error handler on GET functions
* Default and restrict threshold, add tooltips, default duplicate name and set feature flag (#1390)
* Default and restrict threshold, add tooltips, default duplicate name and set feature flag
* fix lint
* add tooltips and dropdown descriptions
* clear data when closing configuration modal
* really solve deleting issue, make empty table view a common component and apply it everywhere
* address comments
* Minor enhancements to bias chart (#1386) (#1399)
* Adds refresh interval options that match openshift observability dashboard
* Show first chart from list, if none selected when user first navigates to bias tab
* Use search icon instead of plus for nothing selected empty state
* Fix error with calculation of 30 days constant
* Deleted charts are removed from session storage
* Fixes issue with bias charts auto-refreshing with stale data (#1403) (#1404)
* Refactor prometheus queries to remove duplication
* Fix graph not refreshing issue
* Add code review suggestions
* Add performance metrics feature flag and refactor runtime server route (#1413)
* Add performance metrics feature flag, refactor runtime server route and solve layout issues
* revert some style changes
* Model serving metrics renaming (#1421)
* Adds support for TrustyAI Operator (#1443)
* Adds support for TrustyAI Operator (#1276)
* Changes from feedback
---------
Co-authored-by: Andrew Ballantyne
Co-authored-by: Alex Creasy
Work towards: #1022
Description
Added the Runtime Metrics page & reworked the Inference Metrics page. Added a stacked line chart.
----Queries do not represent data right now----
Global - Inference Metrics
Project - Metrics kebab item added to Runtime:
Project - Runtime Metrics
Project - Inference Metrics
Scale options
Stacked Line Chart (random data):
Things that still need to be done (post this PR):
Improve the x-axis to show a more desirable, static number of items (disable the auto x-axis from Victory)
How Has This Been Tested?
KFDef
Dashboard steps
Test Impact
At this time, no tests are being done. Will need to look at what we can really test here -- it's all dependent on graph values -- and testing the graphs themselves is more like testing Victory. Not sure we have a Storybook test. Might have unit tests; I'll look to see if any utilities could benefit from them.
Request review criteria: