[ECS] Multiple users in an event proposal #809

Closed
webmat opened this issue Mar 27, 2020 · 17 comments
Labels
epic Feature: ECS ready Issues we'd like to address in the future. Team: ECS

@webmat
Contributor

webmat commented Mar 27, 2020

After reviewing proposals and discussions, new and old, private and public (#117, #234, #589, #678, elastic/beats#9963, elastic/beats#10111, elastic/beats#10192), here's a high-level proposal. I'd like to get feedback on this.

When user.* is populated at the root of an event, it's meant to capture the user performing an action, or the only user relevant to an event, when that's the situation.

The goal of this proposal is to solve the following use cases:

  1. User A creates or deletes User B (no prior or post state)
  2. User A modifies User B (there's a prior and a post state)
  3. User A assumes the identity of User B (or fails to)

One thing to note is that with real-time privilege escalation (e.g. sudo, macOS modal) or auditing frameworks like auditd, it's useful to be able to identify all 3 of these additional users concurrently, in a single event.

Here are locations I propose we could nest the user schema, to represent various use cases:

  • user.effective.*
  • user.target.*
  • user.new.*

Note that in each of these cases, we are nesting the user schema under a new name, similar to what we do with process.parent. To give a concrete example, the effective user name would be at user.effective.name (not at user.effective.user.name).
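
A minimal sketch of this nesting rule, assuming events are handled as plain nested dicts. The helper name `flatten` is hypothetical and not part of ECS; it only shows which dotted field names the nesting produces.

```python
def flatten(fields, prefix=""):
    """Flatten a nested dict into dotted field names (hypothetical helper)."""
    flat = {}
    for key, value in fields.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so nested objects contribute to the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# "alice" escalates privileges to "root", as in the proposal's example.
event = {"user": {"name": "alice", "effective": {"name": "root"}}}
flat_event = flatten(event)
print(flat_event)
```

The effective user's name lands at `user.effective.name`; no `user.effective.user.name` key is produced.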

user.effective.*

Used to identify the final identity, when logging in remotely as another user, or escalating privileges.

Simplified privilege escalation example, where "alice" escalates privileges to "root":

{
  "user": {
    "name": "alice",
    "effective": {
      "name": "root"
    }
  }
}

user.target.*

Used to identify the user being modified, or targeted by an action.

Simplified user management example, where "root" modifies user "bob":

{
  "user": {
    "name": "root",
    "target": {
      "name": "bob"
    }
  }
}

user.new.*

Used to identify what was changed about the user. Ideally this would only capture the difference, not repeat all values.

Simplified user management example, when an existing user is modified, for example changing user name. Here user "bob" is renamed "foo" by "root":

{
  "user": {
    "name": "root",
    "target": {
      "name": "bob"
    },
    "new": {
      "name": "foo"
    }
  }
}

Note: The creation of a new user would only use user.target, not user.new. user.new should only be used when an existing user is modified.

Putting it all together

If it's possible to determine who was the original user prior to privilege escalation, a full event could look like this:

{
  "user": {
    "name": "alice",
    "effective": {
      "name": "root"
    },
    "target": {
      "name": "bob"
    }
  }
}

The action being observed should be captured in event.action.

If the event is user management, the category fields should be event.category: [ "iam" ] and event.type: [ "user" ] with a secondary entry in event.type, whichever is more appropriate: "creation", "deletion" or "change".

If the event is authentication and privilege escalation, use event.category: [ "authentication" ], event.type: [ "start" ], and event.outcome set to either "success" or "failure".
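
The categorization rules above can be sketched as two small builders. This is an illustration only: the function names and the `event.action` values ("user-renamed", "sudo") are hypothetical, and only the category, type and outcome values come from the text above.

```python
USER_CHANGE_TYPES = {"creation", "deletion", "change"}
OUTCOMES = {"success", "failure"}


def user_management_event(action, change_type):
    """Build the event.* fields for a user management event."""
    if change_type not in USER_CHANGE_TYPES:
        raise ValueError(f"unexpected change type: {change_type!r}")
    return {
        "event": {
            "action": action,
            "category": ["iam"],
            # "user" plus the secondary entry describing what happened.
            "type": ["user", change_type],
        }
    }


def authentication_event(action, outcome):
    """Build the event.* fields for authentication / privilege escalation."""
    if outcome not in OUTCOMES:
        raise ValueError(f"unexpected outcome: {outcome!r}")
    return {
        "event": {
            "action": action,
            "category": ["authentication"],
            "type": ["start"],
            "outcome": outcome,
        }
    }


rename = user_management_event("user-renamed", "change")
escalation = authentication_event("sudo", "failure")
```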

Discussion points

Nowhere in the above plan do we use the current nestings of the user field set: client.user, destination.user, host.user, server.user, source.user.

I'm wondering if people are using host.user or if we should deprecate it for removal in ECS 2.0.

I'd also like to know if the 4 nestings under the network-related field sets are currently being used. I realize that folks may have used for example source.user and destination.user to model remote logons or the "acting" vs "target" users, with ECS specifying nothing yet on how these interactions should be modelled until now. I'd like to get your input if the plan above makes sense.

Are there other uses that are actually network-related for source.user and destination.user?

@webmat
Contributor Author

webmat commented Mar 27, 2020

Ping @MikePaquette @dainperkins @dcode @andrewstucki @leehinman @rw-access @marshallmain @janniten @willemdh @neu5ron @vbohata

Let's have a virtual party! #socialdistancing :-)

@willemdh
Contributor

willemdh commented Mar 28, 2020

@webmat Really good and clear proposal!

We are not using host.user.

panw.panos logs contain source.user.name and destination.user.name. There is no need for source|destination.user.new|effective|target there, afaik.

@neu5ron

neu5ron commented Mar 28, 2020

> Ping @MikePaquette @dainperkins @dcode @andrewstucki @leehinman @rw-access @marshallmain @janniten @willemdh @neu5ron @vbohata
>
> Let's have a virtual party! #socialdistancing :-)

i'm in :)

@janniten
Contributor

janniten commented Mar 28, 2020

Hi all!
I agree with @willemdh. This approach is very clear and represents all the cases we've been discussing very well. I love it!
We are not using host.user either.
Regarding the network-related fields, for the Fortinet logs we are mapping, we only use source.user and destination.user. Same case as panw.panos.

@webmat
Contributor Author

webmat commented Apr 1, 2020

@janniten @leehinman Do you have insights you could share about the semantics of using source.user and destination.user? Would you say this maps to "3. User A assumes the identity of User B (or fails to)"?

I may not have made this explicit enough in the user.effective description (I've now updated the issue body). But that scenario is meant to represent:

  • local privilege escalation like su or sudo
  • remote logon: "bob" logs on to machine B as "alice"

So I'm wondering if we should deprecate those, or clarify differences in semantics.

@janniten I'd like to see Fortinet events samples, if you can share any.

Looking at the Beats PANOS logs, it looks like it's only populating the source.user.* (and client.user.*) side of the equation, never the destination.user.*. So looking at this, I wonder if these are simply the "doer" in the event and hence should be at the top level, instead of nested there. WDYT @leehinman?

@willemdh
Contributor

willemdh commented Apr 1, 2020

@webmat Just checked our panw logs and about 1/4 of the logs containing source.user.name have a destination.user.name.

This depends on what the user-id agents saw as the last logon to the destination IP.
So in fact this information is not always correct, as the event could be about traffic to a system which has multiple users, or might have nothing to do with any users logged in at the destination IP.

@janniten
Contributor

janniten commented Apr 2, 2020

Hi @webmat, I think the case "3. User A assumes the identity of User B (or fails to)" is best represented using user.name and user.effective rather than source.user and destination.user.

Attached are some Fortinet log samples:
samples.txt

@leehinman
Contributor

@webmat I like the proposal. Couple questions.

Would it be correct to say that user.effective would be used when we know that the event represents some kind of attempted transition across an identity boundary? Examples of identity boundary transitions could be:

  • changing identity on host (sudo, su, Run as)
  • Running a setuid/setgid program
  • Login (ssh, Kerberos cross realm auth, etc) where we know existing identity and identity you are hoping to become
  • Application dropping privileges (Sendmail becoming user to deliver mail)
  • Basic Auth in HTTP Request (if we know your current user info)

This wouldn't cover situations where we are establishing your identity for the first time or we don't have initial identity information.

  • First login to workstation
  • Firewall observing Basic Auth in HTTP Request

Also it looks like we would only support one identity boundary transition per event? So you wouldn't capture the following as a single event:

User Alice on machine A runs:
ssh [email protected] "sudo useradd malory"

I think one identity transition per event is good, I'm just making sure I'm reading it right.

I can't think of an example where we couldn't replace source.user with user and destination.user with user.effective

@rw-access
Contributor

rw-access commented Apr 8, 2020

I just created the issue #810 for capturing information about Windows logon sessions and tokens and ran into this problem there, too. You can have multiple tokens in a single event, so whatever we do here should also be done there.

I realized yet another case for the multiple X in an event. We currently have process and process.parent, but do we also need to add process.target? That would refer to cross-process events. For example, mimikatz.exe opening a handle and reading memory from lsass.exe.

I don't think we need to pollute this issue with that specific scenario, but I figured it would be good to bring attention to the fact that "multiple users in an event" is just one example of "multiple X in an event", and the decision we make here should be easy enough to apply in multiple places.

For that reason, I'm wondering if it could make sense to have a target fieldset where we can map user, process, token, etc. underneath that. I'm not sure which is more clear to end users, but I'm not attached to either way: user.target, process.target, token.target vs target.user, target.process, target.token, etc.

Edit: There are also some cases where you could have three different X's in an event. For example, the acting process, the logging process and the target process. s/process/user happens as well.

@janniten
Contributor

janniten commented Apr 8, 2020

> I just created an issue for capturing information about Windows logon sessions and tokens and ran into this issue there, too. You can have multiple tokens in a single event, so whatever we do here should also be done there.

@rw-access, Can you provide more information about this issue? Thank you

@rw-access
Contributor

Edited my comment above to add a link to #810, which contains the extra context. I think many of the concerns are still separate from this issue, and I don't want to hijack this thread, so any token- or session-specific discussion would be perfect on #810

@webmat
Contributor Author

webmat commented Apr 8, 2020

@leehinman Yes, I agree with your examples. Whenever we can identify the prior identity (user at top level) and the ultimate identity, we should capture the former at user, and the latter at user.effective.

Note that I don't think this should place an undue burden on event sources to keep track of users over successive events. I think it's fine to have to trace users across events. So there are 2 situations I see here:

  • when using an audit framework like auditd, if you can always determine who's the "real" user behind a privileged command, or the real user prior to switching user, then by all means let's always fill {"user": { "name"... "effective": {"name" ...}}}
  • if the event source sees one transition (e.g. sudo su to root), then sees "root" doing stuff without the audit/original user, I think it's fine to capture only the transition event with {"user": { "name"... "effective": {"name" ...}}} and the following ones with {"user": { "name": "root"... }}
    • I think people are used to tracing user activity across events like that, when there's no audit framework. @dcode @neu5ron thoughts?

Hopefully the semantics work for setuid/setgid as well. I think they do, but please let me know if you see potential edge cases.

Here's how I see the 3 levels example you're giving with ssh [email protected] "sudo useradd malory":

  • we have "alice" running the command "ssh [email protected] ..." on host a, and therefore emitted by this host.
    • {"user": { "name": "alice"... }, "destination": { "user": {"name": "bob" ...}}}
  • Then on host b we have user "bob" running the command sudo useradd malory. So this one is emitted by host b.
    • {"user": { "name": "alice"..., "effective": {"name": "bob" ...}, "target": {"name": "mallory" ...}}

WDYT?

I agree with @willemdh: there's no intention of nesting all of that new user mega-structure (user.target, user.effective...) in all places where user is currently reused. This bit is tricky, though, because I do want to drag along some of the nesting (user.group) but not all (the new ones proposed here)

@webmat webmat self-assigned this Apr 8, 2020
@webmat
Contributor Author

webmat commented Apr 8, 2020

By the way, earlier in this thread I was asking about uses for source.user and destination.user. I'm still dubious about source.user, as I think user.* at the top level should always be the "doer".

But in my example in the previous comment, responding to Lee, I did not use user.effective to capture the user at the destination; I used destination.user. I think that makes more sense.

In other words, when crossing a host boundary, it's user + destination.user (whether or not it's an identity change / privilege escalation).

When changing identity on the same host (priv escalation or demotion like setuid), then it's user + user.effective.
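
A minimal sketch of that placement rule, assuming the event source knows whether the identity change crosses a host boundary. The function name `place_users` and its parameters are hypothetical.

```python
def place_users(actor, other, crosses_host_boundary):
    """Place the second user according to the host-boundary rule above."""
    event = {"user": {"name": actor}}
    if crosses_host_boundary:
        # Remote logon: the user at the far end goes under destination.user.
        event["destination"] = {"user": {"name": other}}
    else:
        # Same-host identity change (su, sudo, setuid): use user.effective.
        event["user"]["effective"] = {"name": other}
    return event


remote = place_users("alice", "bob", crosses_host_boundary=True)
local = place_users("alice", "root", crosses_host_boundary=False)
```

Here `remote` carries "bob" at `destination.user.name`, while `local` carries "root" at `user.effective.name`.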

Do we agree with that approach?

Thanks for all the feedback so far <3

@webmat
Contributor Author

webmat commented Apr 16, 2020

Update to the proposal:

user.new.* is meant to capture changes to a modified user (e.g. name change). Note that this nesting is not meant to capture the details of the creation of a new user (the target user in a creation or deletion event should be user.target.*).

So instead of user.new.* to capture the modified details of a user, I think I'd go for user.changes.*, to reduce confusion.

I also considered other names like user.modified.* or user.changed.*, but I think the "changes" wording better captures the intent of having only the changed attributes.

So to recap, if we go forward with this, what was the section "user.new.*" will no longer need a clarifying note at the end, and the example for a user modification event would look like this instead (user root renames bob to foo):

{
  "user": {
    "name": "root",
    "target": {
      "name": "bob"
    },
    "changes": {
      "name": "foo"
    }
  }
}
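
To illustrate "only the changed attributes", here is a minimal sketch that derives user.changes from the state of the user before and after the modification. The helper names are hypothetical, and it assumes the event source has both states available as flat dicts; full_name is used only as an example of an attribute that did not change.

```python
def changed_attributes(before, after):
    """Return only the attributes whose value differs after the change."""
    return {
        key: value
        for key, value in after.items()
        if key not in before or before[key] != value
    }


def user_change_event(actor, before, after):
    """Build the user.* fields for a modification event."""
    return {
        "user": {
            "name": actor,
            # The target is identified by its state prior to the change.
            "target": {"name": before["name"]},
            "changes": changed_attributes(before, after),
        }
    }


before = {"name": "bob", "full_name": "Bob Smith"}
after = {"name": "foo", "full_name": "Bob Smith"}
rename_event = user_change_event("root", before, after)
print(rename_event)
```

Because full_name is unchanged, it is not repeated under user.changes.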

@ebeahan
Member

ebeahan commented Aug 3, 2020

> By the way, earlier in this thread I was asking about uses for source.user and destination.user. I'm still dubious about source.user, as I think user.* at the top level should always be the "doer".

Some NGFWs can integrate with a user directory and will build an association between a host/IP and user identity (e.g. PAN User-ID, Check Point Identity Awareness). In these cases the source.user.* fields could still make sense, with an intermediate observer generating the events.

@P1llus
Member

P1llus commented Aug 4, 2020

@leehinman asked me to add a few examples here as well. These are for the Zoom module, whose events often involve multiple users (e.g. user X created user Y, or user X modified group account Y for user Z).

Some examples follow, where operator is the user performing the action and there are multiple references to another user, group, etc. under different names such as id and account_name:

{
  "event": "account.updated",
  "payload": {
    "account_id": "abKKcd_IGRCq63yEy673lCA",
    "operator": "[email protected]",
    "operator_id": "iKoRgfbaTazDX6r2Q_eQsQL",
    "object": {
      "id": "eFs_EGRCq6ByEyA73qCA",
      "account_name": "Michael Harris",
      "account_alias": "MH"
    },
    "old_object": {
      "id": "eFs_EGRCq6ByEyA73qCA",
      "account_name": "Mike Harris",
      "account_alias": ""
    },
    "time_stamp": 1562000584527
  }
}
{
  "event": "chat_channel.member_invited",
  "payload": {
    "account_id": "vbbvnvAdsfe",
    "operator": "[email protected]",
    "operator_id": "z8dfgdfguQrdfgdf",
    "object": {
      "name": "Delivering Happiness",
      "id": "6dfgdfgdg444447b0egga",
      "type": 1,
      "date_time": "2020-02-10T21:39:50Z",
      "timestamp": 1581370790388,
      "members": [
        {
          "id": "s0hhFOCYw",
          "display_name": "Matt Y"
        }
      ]
    }
  }
}

The last example is a user approving a meeting registration. It references the owner of the meeting and includes the name of the registrant whose registration was approved:

{
  "event": "meeting.registration_approved",
  "payload": {
    "account_id": "lAAAAAAAAAAAAA",
    "operator": "[email protected]",
    "operator_id": "Lobbbbbbbbbb_qQsQ",
    "object": {
      "uuid": "dj12vck6sdTn6yy7qdy3dQg==",
      "id": 150000008,
      "host_id": "uLobbbbbbbbbb_qQsQ",
      "topic": "A test meeting",
      "type": 2,
      "start_time": "2019-07-11T20:00:00Z",
      "duration": 60,
      "timezone": "America/Los_Angeles",
      "registrant": {
        "id": "U0BBBBBBBBBBfrUz1Q",
        "first_name": "Cool",
        "last_name": "Person",
        "email": "[email protected]"
      }
    }
  }
}
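
As a rough illustration of how such an event could fit the proposal, here is a sketch that maps the operator to user.* and the registrant to user.target.*. This mapping is an assumption made for illustration, not how any existing module maps these events. The function name is hypothetical, and the sample values are placeholders.

```python
def map_registration_approved(zoom_event):
    """Map a meeting.registration_approved payload to the proposed fields."""
    payload = zoom_event["payload"]
    registrant = payload["object"]["registrant"]
    return {
        "event": {"action": zoom_event["event"]},
        "user": {
            # The operator is the user performing the action.
            "email": payload["operator"],
            "id": payload["operator_id"],
            # The registrant is the user targeted by the action.
            "target": {
                "id": registrant["id"],
                "email": registrant["email"],
                "full_name": f'{registrant["first_name"]} {registrant["last_name"]}',
            },
        },
    }


# Placeholder values only; they do not come from a real event.
sample = {
    "event": "meeting.registration_approved",
    "payload": {
        "operator": "operator@example.com",
        "operator_id": "op-123",
        "object": {
            "registrant": {
                "id": "reg-456",
                "first_name": "Cool",
                "last_name": "Person",
                "email": "registrant@example.com",
            }
        },
    },
}
mapped = map_registration_approved(sample)
```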

@ebeahan ebeahan added the ready Issues we'd like to address in the future. label Aug 4, 2020
@jamiehynds jamiehynds changed the title Multiple users in an event proposal [ECS] Multiple users in an event proposal Sep 3, 2020
@webmat
Contributor Author

webmat commented Nov 3, 2020

Closing in favor of #1066 (see also RFC 0007 stage 3 PR #1017)
