[ML] Incorrect feature importance visualization for classification #146122

Closed
valeriy42 opened this issue Nov 23, 2022 · 1 comment · Fixed by #150816
Labels: bug, Feature:Data Frame Analytics, :ml, Team:ML, v8.7.0

Comments

@valeriy42
Contributor

Kibana version:

Current main branch (8.7)

Elasticsearch version:

Current main branch (8.7)

Describe the bug:
When I change the class name in the drop-down menu, the feature importance graph is incorrectly recomputed:

Default class:
image

Changed class:
image

This cannot be right: the DestCountry feature cannot push the probability to a value over 1.0.
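To illustrate why a probability above 1.0 cannot be a valid result, here is a minimal sketch of a decision path. It assumes the per-feature importance values are additive in log-odds space and are mapped to a probability with the logistic function; the issue does not confirm that the chart is computed this way. The baseline and contribution values are made up for illustration, and `decision_path` is a hypothetical helper, not a Kibana function.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function; the result is always within [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decision_path(baseline_log_odds, contributions):
    """Return the cumulative class probability after each feature is added.

    contributions is a list of (feature_name, log_odds_delta) pairs.
    Assumption: contributions add up in log-odds space.
    """
    path = []
    running = baseline_log_odds
    for name, delta in contributions:
        running += delta
        path.append((name, sigmoid(running)))
    return path


# Hypothetical values, not taken from the job in this issue.
example_contributions = [
    ("FlightDelay", 1.2),
    ("DestCountry", 3.5),
    ("hour_of_day", -0.4),
]
example_path = decision_path(-1.0, example_contributions)
for feature, probability in example_path:
    print(f"{feature}: {probability:.4f}")
```

Under this assumption, however large a single contribution is, every point on the path stays between 0 and 1, so a bar that ends above 1.0 points at a problem in how the chart is recomputed rather than in the model output.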

Steps to reproduce:

I used the kibana_sample_data_flights dataset with the following data frame analytics job configuration:

{
  "id": "fimps-test-classification",
  "create_time": 1669196535717,
  "version": "8.7.0",
  "authorization": {
    "roles": [
      "superuser"
    ]
  },
  "description": "",
  "source": {
    "index": [
      "kibana_sample_data_flights"
    ],
    "query": {
      "match_all": {}
    },
    "runtime_mappings": {
      "hour_of_day": {
        "type": "long",
        "script": {
          "source": "emit(doc['timestamp'].value.getHour());"
        }
      }
    }
  },
  "dest": {
    "index": "fimps-test-classification",
    "results_field": "ml"
  },
  "analysis": {
    "classification": {
      "dependent_variable": "Cancelled",
      "num_top_feature_importance_values": 5,
      "class_assignment_objective": "maximize_minimum_recall",
      "num_top_classes": -1,
      "prediction_field_name": "Cancelled_prediction",
      "training_percent": 40,
      "randomize_seed": -6164897408045031000,
      "early_stopping_enabled": true
    }
  },
  "analyzed_fields": {
    "includes": [
      "AvgTicketPrice",
      "Cancelled",
      "Carrier",
      "Dest",
      "DestAirportID",
      "DestCityName",
      "DestCountry",
      "DestRegion",
      "DestWeather",
      "DistanceKilometers",
      "DistanceMiles",
      "FlightDelay",
      "FlightDelayMin",
      "FlightDelayType",
      "FlightNum",
      "FlightTimeHour",
      "FlightTimeMin",
      "Origin",
      "OriginAirportID",
      "OriginCityName",
      "OriginCountry",
      "OriginRegion",
      "OriginWeather",
      "dayOfWeek",
      "hour_of_day"
    ],
    "excludes": []
  },
  "model_memory_limit": "31mb",
  "allow_lazy_start": false,
  "max_num_threads": 8
}
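
For anyone who wants to recreate the job from the configuration above, the following is a rough sketch of how a create request could be prepared. It assumes the fields `id`, `create_time`, `version` and `authorization` are generated by the server and need to be removed before sending the body to `PUT _ml/data_frame/analytics/<id>`; the exact list is an assumption. The base URL `http://localhost:9200` and both helper names are hypothetical. The request is only built, not sent.

```python
import json
import urllib.request

# Assumption: these fields are server-generated and not accepted on create.
GENERATED_FIELDS = ("id", "create_time", "version", "authorization")


def to_put_body(config: dict) -> dict:
    """Return a copy of the job config without the assumed generated fields."""
    return {k: v for k, v in config.items() if k not in GENERATED_FIELDS}


def build_put_request(
    config: dict, base_url: str = "http://localhost:9200"
) -> urllib.request.Request:
    """Build (but do not send) the create request for a data frame analytics job."""
    job_id = config["id"]
    body = json.dumps(to_put_body(config)).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_ml/data_frame/analytics/{job_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

Once created, the job would be started with `POST _ml/data_frame/analytics/<id>/_start`, and the results explored in Kibana after it finishes.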

@valeriy42 added the bug, :ml, and Team:ML labels on Nov 23, 2022

@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)
