
Top level: "client" and "server" #63

Closed
ave19 opened this issue Aug 1, 2018 · 31 comments · Fixed by #236

ave19 commented Aug 1, 2018

Hey! We do a lot of network flow work. We have a sort of issue using "source" and "destination" because flow data comes in both directions and we get records for each. The data for a single session might look like:

source.ip    source.port    destination.ip    destination.port
1.2.3.4      54321          6.7.8.9           443
6.7.8.9      443            1.2.3.4           54321

So that's a problem for us. The concepts of source and destination really only apply on a packet scale anyway. We'd like to normalize both of the records into:

client.ip client.port server.ip server.port
1.2.3.4 54321 6.7.8.9 443

This would also sort through things like DNS requests and other services that open a port.

Thoughts about that?
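The normalization described above can be sketched roughly as follows. This is a hypothetical snippet (the field names mirror the example records), and the lower-port tie-break it uses is only a heuristic, as later comments point out:

```python
# Hedged sketch of the proposed normalization: collapse the two directional
# flow records of one session into a single client/server record. The
# lower-port tie-break is only a heuristic and can guess wrong, as the
# rest of the thread discusses.

def normalize_flow(record):
    """Map a source/destination flow record to client/server form."""
    src = (record["source.ip"], record["source.port"])
    dst = (record["destination.ip"], record["destination.port"])
    # Heuristic: the endpoint with the lower port is assumed to be the server.
    server, client = (dst, src) if dst[1] < src[1] else (src, dst)
    return {
        "client.ip": client[0], "client.port": client[1],
        "server.ip": server[0], "server.port": server[1],
    }

# Both directional records of the session normalize to the same record:
a = normalize_flow({"source.ip": "1.2.3.4", "source.port": 54321,
                    "destination.ip": "6.7.8.9", "destination.port": 443})
b = normalize_flow({"source.ip": "6.7.8.9", "source.port": 443,
                    "destination.ip": "1.2.3.4", "destination.port": 54321})
assert a == b == {"client.ip": "1.2.3.4", "client.port": 54321,
                  "server.ip": "6.7.8.9", "server.port": 443}
```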


webmat commented Aug 1, 2018

Indeed, we've been pondering that already, after @robcowart's comment in this thread.

I wonder with this strategy, which side is being called "server", in cases where the infrastructure under management is calling to the outside (e.g. calling an external API, triggering webhooks to arbitrary customer endpoints).

In common parlance, my node generating the event would be the "client", and the remote (which I may or may not manage) would be the "server".

Parallel to this, it may be worth mentioning that for other reasons, we're starting to discuss doing classification of the IPs (local, private, public, multicast), which may help figure out which side is which.


webmat commented Aug 1, 2018

Another note about this, more applicable for security, but also when monitoring network gear.

Who's the client and who's the server, if the flow event comes from an agent that's sitting in between? :-)

@robcowart

Basically a server provides a service (a port or group of ports). Clients connect to those services. A server will only respond to a client. It doesn't initiate conversations. Conversely a client only listens for responses. It doesn't listen for arbitrary connection requests.

The determination of client and server can be quite tricky if you don't have a record of the initial packet transmitted (such as the SYN packet sent to initiate the TCP handshake). 20 years ago you could be >90% accurate simply by assuming the lower port value is the server and the larger value is the client. However, with so many applications now listening on higher ports (e.g. ES 9200, LS 9600, Kafka 9092, etc.) you will get at best about 65% accuracy with this method. Many log sources are a bit more authoritative in this regard than flow records.

Basically there isn't a single method that works. A combination of data source specific methods that arrive at a consensus is usually necessary. With the solution we provide to our paying customers, we find that we are about 95% accurate out-of-the-box. With some tuning (it can be customized) 98-99% is possible.

@webmat can you provide a more specific example of what you are referring to regarding an "agent"?
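One way to picture a consensus method like the one described above is a weighted vote over several weak signals. The signals, weights, and port list below are invented for illustration and are not the actual product logic:

```python
# Illustrative sketch of a consensus approach: several weak signals each
# vote on which endpoint looks like the server, and the highest score wins.
# The signal weights and the port list are invented for illustration.

WELL_KNOWN_PORTS = {22, 25, 53, 80, 123, 443, 9092, 9200, 9600}

def guess_server(flow):
    """Return 'src' or 'dst': which end of the flow looks like the server."""
    votes = {"src": 0.0, "dst": 0.0}

    # Signal 1 (strongest): whoever received the initial SYN is the server.
    syn = flow.get("syn_direction")  # "src_to_dst", "dst_to_src", or absent
    if syn == "src_to_dst":
        votes["dst"] += 3.0
    elif syn == "dst_to_src":
        votes["src"] += 3.0

    # Signal 2: a well-known service port suggests the server side.
    if flow["dst_port"] in WELL_KNOWN_PORTS:
        votes["dst"] += 2.0
    if flow["src_port"] in WELL_KNOWN_PORTS:
        votes["src"] += 2.0

    # Signal 3 (weakest): assume the lower port is the server
    # (~65% accurate these days, per the discussion above).
    if flow["dst_port"] < flow["src_port"]:
        votes["dst"] += 1.0
    else:
        votes["src"] += 1.0

    return max(votes, key=votes.get)
```

When the initial SYN was captured, that signal dominates; without it, the port heuristics decide.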

@robcowart

I will also add that local/private/public isn't much help when determining client/server. However reserved multicast and broadcast IP and MAC addresses will always be associated with the server end of the conversation. This is one input for the "consensus" method we use.

@robcowart

The last point I will make is that it is not an either/or situation. While client/server is the preferred perspective for most use-cases, src/dst is needed for some types of threat detection.

Consider a few security related analytics scenarios...

  • A port scan will be from client to server.
  • However an amplification attack will look at sources to destinations, where the source port would be from a well known UDP service (e.g. 53 for DNS).

So depending on what we are looking for our analytics configuration will sometimes use src/dst and sometimes use client/server.


ave19 commented Aug 1, 2018

@webmat

in cases where the infrastructure under management is calling to the outside (e.g. calling an external API, triggering webhooks to arbitrary customer endpoints).

The host running the API service is also generating logs. From that service's perspective, it's the server (running a service) and your caller is the client.

In the events coming from your server, it's the server and the things that connect to it are clients.


ave19 commented Aug 1, 2018

@webmat

Who's the client and who's the server, if the flow event comes from an agent that's sitting in between? :-)

Whoever got the first SYN is the server. Generally, the lower port.

[edit: our pcap drop rates are really low, but not zero, so we might miss that SYN. See @robcowart's comment. Also, with UDP you don't even get that. For a UDP service, unless you do protocol inspection, you can't really know whether the packet you saw was the request or the answer.]

The agent in the middle may not be able to tell, though. When we map source and destination to client and server, we don't delete the source and destination bits, those are the ones we're sure about!


ave19 commented Aug 1, 2018

@robcowart

The determination of client and server can be quite tricky

You are so right!


ave19 commented Aug 1, 2018

Honestly, I don't expect a flow's interpretation of client and server to be 100% accurate for all the reasons @robcowart points out.

Most of the time we're going to use those tags, we're applying them to logs coming from things like web servers. From inside a web server's event feed, source and destination don't really apply. And if you're one of ten servers running on a host, you might have a different server.ip from the others, each of those different from the host.ip you run on, and from the agent.ip or device.ip where your logs get sent, or what have you.

It just makes a little space.


ave19 commented Aug 1, 2018

And for cyber reasons, having all of your servers (on whatever boxes) call all of their clients "client" allows us to more easily track one or more IPs that might be up to something, by correlating logs from all the services running on all the hosts.


robcowart commented Aug 1, 2018

I agree with you @ave19, for some data sources the client and server are clear. We still set source and destination fields, but will also set something like "[metadata][isServer]" => "destination". When the event hits the client/server determination logic, this flag will cause the more complicated logic to be bypassed, and the client/server fields to be set with a simple assignment.
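The short-circuit described above could be sketched like this (in Python rather than Logstash, with hypothetical field names following the comment):

```python
# Sketch (Python rather than Logstash) of the short-circuit described above:
# if the parser already flagged which side is the server, skip the expensive
# consensus logic and simply assign. Field names are hypothetical.

def assign_client_server(event, determine_server=None):
    """Set event['client'] / event['server'] from source/destination."""
    is_server = event.get("metadata", {}).get("isServer")
    if is_server not in ("source", "destination"):
        # No flag: fall back to the (expensive) determination logic.
        is_server = determine_server(event)
    is_client = "source" if is_server == "destination" else "destination"
    event["server"] = dict(event[is_server])
    event["client"] = dict(event[is_client])
    return event

event = {
    "source": {"ip": "1.2.3.4", "port": 54321},
    "destination": {"ip": "6.7.8.9", "port": 443},
    "metadata": {"isServer": "destination"},  # set by the parser
}
out = assign_client_server(event)
assert out["server"] == {"ip": "6.7.8.9", "port": 443}
assert out["client"] == {"ip": "1.2.3.4", "port": 54321}
```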


ave19 commented Aug 1, 2018

@robcowart Interesting... we have lots of different kinds of feeds, so lots of different parsing logic. Most of the time, we can go straight into the server.ip form.


willemdh commented Aug 1, 2018

Although I can definitely understand your points @ave19, for me source and destination are clearer and less confusing/ambiguous than client and server. When 2 applications are exchanging data through an ESB, I'd really prefer to be able to use source and destination objects in the ESB logs. But then again, I'm a system engineer, not a network engineer.

@ave19
Copy link
Author

ave19 commented Aug 1, 2018

@willemdh I hear you, but think about UDP. Or think about DNS in particular. One system sends a request to another, and the other answers it. Do you swap source and destination on both sides of that? UDP is stateless, so you'd almost have to. That means the DNS server is in half the events on both sides. How do you pie chart where your requests are coming from in that scenario? Filter it in post? It is Elasticsearch I guess. It's not that doing source and destination is wrong in this scenario, it's that it's less easy to work with the data collected.


willemdh commented Aug 1, 2018

@ave19 Ok, I can definitely use the client/server approach for F5 / Palo Alto use cases.
In case I would need 'non-connection' related source/destination info,
I can always create my own (private) source and destination objects.

So, looking at #51, this is where ECS would go then:

| Field | Description | Type |
| --- | --- | --- |
| connection.server.host.ip | IP address of the server. Can be one or multiple IPv4 or IPv6 addresses. | ip |
| connection.server.host.name | Hostname of the server. | keyword |
| connection.server.host.port | Port of the server. | long |
| connection.server.host.mac | MAC address of the server. | keyword |
| connection.server.host.domain | Server domain. | keyword |
| connection.server.host.subdomain | Server subdomain. | keyword |
| connection.client.host.ip | IP address of the client. Can be one or multiple IPv4 or IPv6 addresses. | ip |
| connection.client.host.name | Hostname of the client. | keyword |
| connection.client.host.port | Port of the client. | long |
| connection.client.host.mac | MAC address of the client. | keyword |
| connection.client.host.domain | Client domain. | keyword |
| connection.client.host.subdomain | Client subdomain. | keyword |
| connection.direction | Direction of the network traffic. Recommended values are: inbound, outbound, unknown. | keyword |
| connection.forwarded_ip | Host IP address when the client IP address is the proxy. | ip |
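For illustration, a single event using the proposed fields might then look like this (all values made up):

```json
{
  "connection": {
    "client": {
      "host": {"ip": "1.2.3.4", "port": 54321, "name": "laptop01"}
    },
    "server": {
      "host": {"ip": "6.7.8.9", "port": 443, "name": "web01"}
    },
    "direction": "outbound"
  }
}
```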

Shouldn't we move network.session_id to the connection object too then? See #37


ruflin commented Aug 2, 2018

Thanks for all the discussion above. My takeaway so far is that server and client are not necessarily replacing source and destination; both can exist at the same time and complement each other.

What if we have all 4? I personally like adding server and client, as for examples like web server logs it feels more intuitive to use client and server.


ave19 commented Aug 2, 2018

Heh, um, at the risk of scuttling my own topic: I was poking this today and decided that service might be better than server since I can pack more than one service into a single box. But that means that I can use host instead of server, and if I put things like the (possibly virtual) ip info into service.ip and service.port, I might be able to make that part work with existing top level fields. A service could be in a docker container on a host and so forth. I think calling it service makes it clear it's not necessarily a box. Thoughts about that part?

To be clear, this is mostly about logs coming from that running instance of the service (ie apache). That service will report that a client connected to it, so I think I still want client as a top level. Things like service.state (with a value like running) still apply.

The logs from that service will leave artifacts that allow me to collect information about the host and agent along the way. I think this is enough to let me trace that log event back to the origin.


webmat commented Aug 3, 2018

@robcowart what I meant by "agent" was simply a monitoring agent like Packetbeat. Perhaps a misnomer, because in some cases the event source will be a device itself being poked from the outside. But I just meant whatever was collecting the traffic event data.

Given the current consensus of how tricky it can be to reliably determine who's the server & who's the client, I think we don't have a choice but to keep source & destination. Then in cases where we can reliably determine server/client, we can add the appropriate fields.

Or were you actually removing src/dst whenever you managed to reliably determine srv/cli?

@robcowart

As I mention above both can be valuable, depending on what you are trying to determine.

I mentioned in another issue, that I would prefer to have src/dst and then a flag field like isServer which would be either src or dst. This would avoid A LOT of duplicate data. Unfortunately Kibana doesn't work like that. You could possibly get away with scripted fields for some things, but not all viz types support scripted fields (e.g. Timelion and TSVB).

Until there is more flexibility I will continue to tell myself "disks are cheap" and will value functionality and great user-experience over a few extra HDDs.


ave19 commented Aug 4, 2018

I agree, both are valuable. I also agree disks are cheap!

In a network flow monitoring situation, the only thing you can really reliably know is source and destination. However, for cyber, that means doing extra work to figure out which end of that connection is the machine you're trying to defend. (Maybe it's the low port, etc.) After you do that work, you can map source and destination into server and client (or service) as appropriate. But, we would never give up the fields we know so we will end up with all four fields. (If your use case doesn't care about who the server was, then sorting them out is optional.)

When will we get per document field aliasing? 😄 That would be the best scenario.

In a service monitoring situation, if the service has an open port and responds to queries, it's a straight client and server (or service) case, and since those terms are more descriptive (for our cyber mission anyway) we'd use those over source or destination. If the service is a pusher and makes logs, it's still the service but it might be appropriate to call the other end the server. See, I like service now. 😄 I can pin a lot of things to it!


dcode commented Aug 5, 2018

I like where this is going, but to throw a wrench in it, I'm a huge user of Bro data (and also Suricata). I like the connection top-level object concept, but Bro tracks "client" and "server" a little differently, as does Suricata. Bro calls whoever initiates the TCP/IP connection the "originator" and the other system in the conversation the "responder". Going a layer deeper, Bro will analyze the protocol, and for something like HTTP it will record the "originator" and "responder" of that protocol. In most common protocols, the originator is the same at the TCP/IP layer and the HTTP layer. In several protocols, like SMTP or FTP, that's not guaranteed. In those protocols, it's completely possible that the "responder" of the TCP/IP connection initiates the protocol as the "originator".

All that said, I think it makes sense to manage "connection" data at only the TCP/IP layer (or equivalent transport protocol). If there's protocol-specific information that confirms the direction of the application protocol, that can be recorded in a protocol-specific subobject (e.g. network.smtp.originator, network.dns.responder).

Note, that I'm not trying to get into a religious war against client/server and originator/responder. I think for the purposes of ECS, it's equivalent.

Also of note, Suricata uses src and dst for IP addresses, but tracks byte counts as bytes_toserver and bytes_toclient using similar semantics to Bro. That is, the initial assessment is based on who sent the first packet (regardless of TCP, UDP, ICMP, etc.) and is confirmed if a more specific protocol analyzer is used.

All that said, I'm in favor of (where the semantics of packets/bytes mean that endpoint sent it):

  • connection.client.ip: 1.2.3.4
  • connection.client.port: 12367
  • connection.client.packets: 180
  • connection.client.bytes: 1234
  • connection.server.ip: 6.7.8.9
  • connection.server.port: 965
  • connection.server.packets: 150
  • connection.server.bytes: 1234
  • connection.protocol: tcp
  • connection.service: smtp
  • network.smtp.client.ip: 6.7.8.9

Under any case, if I'm receiving packets via a tap or span port, I have no idea which direction that's going (inbound vs outbound).

EDIT: Added example data

@robcowart

@dcode I use client/server determination with Suricata data here...
https://github.com/koiossian/synesis_lite_suricata

I would appreciate hearing your feedback on how it is handled, and whether you see any issues.


webmat commented Aug 6, 2018

This is not feedback, this is just a more precise pointer ;-) Client vs Server code starts at line 601 here

See also various places between lines 209 to 468 to see the traffic locality determination.


webmat commented Aug 6, 2018

One thing I like about it is that it's entirely based on information taken from the event itself (including some fast translate-based enrichment).

It doesn't depend on doing an Elasticsearch search per event.


webmat commented Aug 6, 2018

@ave19 To answer your question on aliasing, here's the progress so far. The concept of alias is available in recent builds, but still incomplete (in my opinion) for what we're trying to achieve.

So if you use a recent build of Elasticsearch, you can search in Kibana -- and even leverage the new auto-complete -- based on your "original" field just as much as your alias.

What's still missing is the ability to display based on the alias' name. Your visualizations and API results will only contain the original field. I haven't checked yet if the response includes a mapping of the aliases, so clients could handle this however they want. I suspect the alias mapping is not returned yet either.


ave19 commented Aug 7, 2018 via email


ruflin commented Aug 8, 2018

This discussion triggered a more general question on my end about what our "standard" is for reusing / composing objects. I opened a separate issue for it, to avoid mixing it with this discussion: #71


vbohata commented Aug 10, 2018

+1 for having server, client, source and destination. I can imagine some application logs may require all of them (DHCP, for example). Also, web application logs contain client and server (source and destination would be quite odd here).

@webmat webmat added the discuss label Aug 17, 2018
@webmat webmat mentioned this issue Sep 18, 2018
@ruflin ruflin mentioned this issue Oct 31, 2018

dcode commented Nov 8, 2018

So, pre 1.0-beta, I implemented as many ECS and ECS-friendly items as I could for RockNSM. In lieu of a firm decision, I went with network.source, network.destination, network.client, and network.server. Since the prevailing votes are for source/destination as top-level fields, I have to re-cast my vote to top-level client/server fields, because as @ave19 noted, semantics matter.

Now, understandably, having IPs in different fields makes it more difficult to build dashboards and such. In my final logstash enrichment for generic ECS data, I added an additional field called network.community_id, which is a deterministic hash of a 5-tuple. This is inspired by some work in the Zeek and Suricata communities. This enables us to keep direction context for the logs that support it, keep the client/server context for the logs that support it, and the ability to pivot across both types of logs.

I'm not proposing we make community_id core ECS, but it addresses the problem while retaining most of both worlds. In the meantime, I'll be renaming my fields to use the top-level names source, destination, client, and server.
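The core idea of such a direction-independent flow hash can be sketched as follows. This is a simplified illustration of hashing an order-normalized 5-tuple, not the actual Community ID v1 encoding:

```python
# Simplified sketch of a direction-independent flow hash in the spirit of
# the Community ID: normalize endpoint order, then hash the 5-tuple.
# This is NOT the real Community ID v1 wire format, just the core idea.

import hashlib

def flow_hash(saddr, sport, daddr, dport, proto, seed=0):
    """Deterministic hash that is identical for both directions of a flow."""
    # Order the endpoints canonically so both directions hash identically.
    a, b = sorted([(saddr, sport), (daddr, dport)])
    material = f"{seed}:{proto}:{a[0]}:{a[1]}:{b[0]}:{b[1]}".encode()
    return hashlib.sha1(material).hexdigest()

# Both directions of the same connection yield the same hash, so logs with
# client/server context and logs with src/dst context can be pivoted on it:
assert flow_hash("1.2.3.4", 54321, "6.7.8.9", 443, "tcp") == \
       flow_hash("6.7.8.9", 443, "1.2.3.4", 54321, "tcp")
```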


webmat commented Nov 8, 2018

I love the idea of supporting community_id in ECS eventually, thanks for bringing this up.


webmat commented Dec 4, 2018

@dcode We're introducing network.community.id. Check out #208.
