[Transform] add support for script in group_by #53167

Merged: 3 commits, Mar 9, 2020


@hendrikmuhs hendrikmuhs commented Mar 5, 2020

Add the possibility to base the group_by on the output of a script.

closes #43152

Example use case:

POST _transform/_preview
{
  "source": {
    "index": [
      "kibana_sample_data_logs"
    ]
  },
  "pivot": {
    "group_by": {
      "agent": {
        "terms": {
          "script": {
            "source": """String agent = doc['agent.keyword'].value; 
            if (agent.contains("MSIE")) {
              return "internet explorer";
            } else if (agent.contains("AppleWebKit")) {
              return "safari";
            } else if (agent.contains("Firefox")) {
              return "firefox";
            } else {
              return agent;
            }""",
            "lang": "painless"
          }
        }
      }
    },
    "aggregations": {
      "200": {
        "filter": {
          "term": {
            "response": "200"
          }
        }
      },
      "404": {
        "filter": {
          "term": {
            "response": "404"
          }
        }
      },
      "503": {
        "filter": {
          "term": {
            "response": "503"
          }
        }
      }
    }
  },
  "dest": {
    "index": "pivot_logs"
  }
}
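To sanity-check the bucket labels before running the preview, the branching in the Painless script above can be mirrored locally. The following is only a Python stand-in for that logic, not what Elasticsearch executes; the function name `classify_agent` and the sample user-agent strings are made up for illustration.

```python
def classify_agent(agent: str) -> str:
    """Mirror the if/else chain of the Painless script (illustrative stand-in)."""
    if "MSIE" in agent:
        return "internet explorer"
    elif "AppleWebKit" in agent:
        return "safari"
    elif "Firefox" in agent:
        return "firefox"
    # Anything unmatched keeps its original value, as in the script's else branch.
    return agent


# Hypothetical user-agent strings, not taken from the sample data set.
samples = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko)",
    "Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110421 Firefox/6.0a1",
    "curl/7.0",
]

for agent in samples:
    print(classify_agent(agent))
```

Note that the checks run in order, so a string containing both "MSIE" and "AppleWebKit" would land in the "internet explorer" bucket.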

Output:

{
  "preview" : [
    {
      "agent" : "firefox",
      "200" : 4931,
      "404" : 259,
      "503" : 172
    },
    {
      "agent" : "internet explorer",
      "200" : 3674,
      "404" : 210,
      "503" : 126
    },
    {
      "agent" : "safari",
      "200" : 4227,
      "404" : 332,
      "503" : 143
    }
  ],
  "mappings" : {
    "properties" : {
      "200" : {
        "type" : "long"
      },
      "agent" : {
        "type" : "keyword"
      },
      "404" : {
        "type" : "long"
      },
      "503" : {
        "type" : "long"
      }
    }
  }
}

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml/Transform)

Member

@benwtrent benwtrent left a comment

I am guessing #53135 was discovered while developing this.

I think we should hold off on merging this until the composite agg issue is fixed, yes?

@hendrikmuhs
Author

> I am guessing #53135 was discovered while developing this.
>
> I think we should hold off on merging this until the composite agg issue is fixed, yes?

I do not think #53135 is a blocker for this PR, because it only affects a special case, namely using _value in scripts, which is a shortcut for doc['field_i_am_interested_in'].value. An alternative is to use the context object. Both doc and ctx seem to be more common than _value.

It would of course be great to fix #53135 before 7.7.0, but I see no justification for treating it as a blocker.
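For readers unfamiliar with the two script styles contrasted above, here is a rough sketch of how the two group_by bodies might differ. `terms_group_by` is a hypothetical helper, and the `field` plus `script` shape for the `_value` form is an assumption based on how value scripts are usually written, not something this PR confirms for transforms.

```python
import json


def terms_group_by(script_source, field=None):
    """Build a terms group_by entry driven by a Painless script (hypothetical helper)."""
    terms = {"script": {"source": script_source, "lang": "painless"}}
    if field is not None:
        # A _value-style script needs a field to take its value from (assumed shape).
        terms["field"] = field
    return {"terms": terms}


# Form used in this PR's example: read the field explicitly through doc.
via_doc = terms_group_by("return doc['agent.keyword'].value;")

# Shortcut form; per the comment above, this is the special case #53135 affects.
via_value = terms_group_by("return _value;", field="agent.keyword")

print(json.dumps({"agent": via_doc}, indent=2))
print(json.dumps({"agent": via_value}, indent=2))
```

Only the first form is exercised by the example request in this PR.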

@benwtrent
Member

@hendrikmuhs I misunderstood the issue then. I did not know it was only for the _value case.

@hendrikmuhs hendrikmuhs merged commit e4f45db into elastic:master Mar 9, 2020
@hendrikmuhs hendrikmuhs deleted the transform-group-script branch March 9, 2020 08:59
hendrikmuhs pushed a commit to hendrikmuhs/elasticsearch that referenced this pull request Mar 10, 2020
add the possibility to base the group_by on the output of a script.

closes elastic#43152
hendrikmuhs pushed a commit that referenced this pull request Mar 10, 2020
add the possibility to base the group_by on the output of a script.

closes #43152
backport #53167
Development

Successfully merging this pull request may close these issues.

[Transform] Add data frame scripted fields for group_by fields