VLANs #318

Closed
neu5ron opened this issue Feb 6, 2019 · 24 comments

Comments

@neu5ron

neu5ron commented Feb 6, 2019

Proposing something similar to what @dcode did with related.ip:
any VLANs should be copied (via copy_to in Elasticsearch, or custom via Logstash) to:
related.vlan

@webmat
Contributor

webmat commented Feb 6, 2019

Thanks for the feedback!

Note that related. is intentionally left pretty sparse at this moment, because this pattern of accumulating related values should be left to user discretion. In other words, feel free to add related.vlan in your events :-)

We will eventually list more fields there, that are known to be broadly applicable. And your suggestion will be considered.

@neu5ron
Author

neu5ron commented Feb 6, 2019

Thanks for the quick reply. I am also curious whether any VLAN fields will be added, for example the VLAN that is part of a PaloAlto interface name, the Bro conn.log VLAN, or VLANs shown in RADIUS logs, etc.
thanks!

@webmat
Contributor

webmat commented Feb 7, 2019

I'm adding it to the list of things to look into. Do you have suggestions on how you'd like the fields to look? What information do you want to see in your events, about your vlans?

@neu5ron
Author

neu5ron commented Feb 8, 2019

Thanks for asking, and I am open to any suggestions as well... I never had a perfect answer for this, which is why I used an "any" (aka "related") VLAN field...

because sometimes it was either unknown which one was a true "src" or "dst" VLAN -- or situations of MPLS or other layer 2/3 hacks.

Some examples:

  • Had an IPS where, even after meticulously going through 500,000+ logs, it was impossible (not exaggerating, had multiple people verify if I was crazy or not) to truly determine if a VLAN was src/dst, so it had its own field such as "vendorname.vlan". This was probably due to it being an "invisible" inline IPS (i.e. it was not actually altering packets in terms of routing, just making a best guess of bad/good and therefore block/allow regardless of "direction").
  • PaloAlto src/dst interfaces have the VLANs at the end, so it was easy to use src/dst there.
  • Bro logs with VLANs, where because of encapsulation such as MPLS the VLAN was not necessarily indicative of a true src or dst.

Therefore I could recommend source.vlan and destination.vlan -- but most importantly I would leave that up to the greater community -- and just recommend to have a related.vlan for reasons mentioned above.

Would love to hear what you think. Thanks!
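
As an illustration of the source.vlan / destination.vlan / related.vlan idea above, here is a minimal Python sketch of the "custom" route (doing the collection in the pipeline rather than with copy_to). The function name `collect_related_vlans` and the sample event are hypothetical, and the flat `source.vlan` layout simply follows the field names suggested in this comment, not any final ECS definition.

```python
def collect_related_vlans(event: dict) -> dict:
    """Return a copy of the event with related.vlan filled in.

    Gathers the values of source.vlan and destination.vlan (when present)
    into a de-duplicated, sorted list under related.vlan.
    """
    vlans = set()
    for side in ("source", "destination"):
        vlan = event.get(side, {}).get("vlan")
        if vlan is not None:
            vlans.add(vlan)

    enriched = dict(event)  # shallow copy; the caller's top-level dict is untouched
    if vlans:
        related = dict(event.get("related", {}))
        related["vlan"] = sorted(vlans)
        enriched["related"] = related
    return enriched


# Hypothetical event with a clear source and destination VLAN.
event = {
    "source": {"ip": "10.0.10.5", "vlan": 10},
    "destination": {"ip": "10.0.20.7", "vlan": 20},
}
enriched = collect_related_vlans(event)
print(enriched["related"]["vlan"])  # [10, 20]
```

Events that carry no VLAN on either side are returned without a `related.vlan` key, so the field only ever repeats values that appear elsewhere in the event.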

@webmat
Contributor

webmat commented Feb 14, 2019

This is great information, thanks for providing this!

I think the 3 fields you suggest make sense as is.

I wonder if we should have another place to list vlans when their side/role is indeterminate. Perhaps network.vlan. Not crazy about this idea necessarily, just musing :-) The reason is that ideally related.vlan should be a place to collect information appearing multiple times in an event. But in the case of indeterminate vlans, they would only appear at related.vlan, which is not quite the intended usage. Not the end of the world, though.

@praseodym
Contributor

Unless the log source is some device routing between VLANs, a VLAN will be something that doesn't belong just to one source/destination (layer 2 vs layer 3). In that case network.vlan definitely makes sense.

I am opposed to using related for anything but collecting information that appears elsewhere in the document. In fact, I currently never expect related to be included in the _source document because it can be easily populated with copy_to (#67).
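
For readers unfamiliar with copy_to, the following is a rough sketch of what such a mapping could look like, written as a Python dict that serializes to the JSON body of an index mapping. The field names and the `keyword` type are assumptions for illustration only; nothing in this thread fixes them.

```python
import json

# Hypothetical index mapping: field names and the keyword type are assumptions.
mapping = {
    "mappings": {
        "properties": {
            "source": {
                "properties": {
                    # copy_to takes the full path of the target field
                    "vlan": {"type": "keyword", "copy_to": "related.vlan"}
                }
            },
            "destination": {
                "properties": {
                    "vlan": {"type": "keyword", "copy_to": "related.vlan"}
                }
            },
            "related": {
                "properties": {
                    "vlan": {"type": "keyword"}
                }
            },
        }
    }
}

# Print the JSON that would be sent when creating the index.
print(json.dumps(mapping, indent=2))
```

Values copied this way are searchable under `related.vlan` but are not written into `_source`, which is the behavior the comment above relies on.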

@webmat
Contributor

webmat commented Feb 18, 2019

copy_to is only one way to populate related.*, however. Some may prefer to duplicate the information in their event, for various reasons.

But I think your point reinforces the need for related.* to be used strictly for information seen elsewhere in the document nonetheless.

@neu5ron
Author

neu5ron commented Feb 18, 2019

Thanks for the feedback folks.
I understand the copy_to portion and/or using merge in Logstash or some other way.
I guess the question I am trying to ask is: let's say you have 2 distinct VLAN fields, would they both be merged into network.vlan?

@neu5ron neu5ron closed this as completed Feb 18, 2019
@webmat
Contributor

webmat commented Feb 19, 2019

Well, if the event source does not actually indicate which vlan is associated with either side of the transaction, we may indeed have to define an equally ambiguous field.

However as a first step, I think we would support cases where the mapping is clear. These are very straightforward to implement.

In the meantime, the ambiguous situations should be tracked in a custom field, and we can continue to dig into how to best address this. Especially with a first foray into supporting these, I would avoid supporting such edge cases until the situation is better understood.

@webmat
Contributor

webmat commented Feb 19, 2019

Not sure if the closing of the issue was accidental. If so, thanks for being mindful of our growing backlog ;-)

However since the fields haven't been added yet (and I agree we should add them), I'd rather keep the issue open.

There probably won't be movement on this until after the release of the Elastic Stack 7.0 (and ECS 1.0.0 GA), but I'd like to work on this afterwards.

@webmat webmat reopened this Feb 19, 2019
@dainperkins
Contributor

Maybe break this up into the discrete possibilities (and no rush - I know you're all going nuts with the release)

1) Ignore MPLS for this discussion (& VRF/VRF-Lite) - it's not a VLAN. If something is reporting MPLS tags as VLANs, it should be routed to wherever we will represent MPLS labels (and we should, for N/APM). We will probably also need to consider MPLS tags, virtual router specifics, etc. - presumably all under network, but that's literally another issue.

2) Layer 3: Device is a gateway (E.g. L3 + Firewall, router, etc) and will typically report inbound / outbound VLANs (typically if using VLANS each VLAN will have a logical interface name and .1q id - the name may not match the VLAN name on the switch as name is only locally significant)
Or L2 device bridging between two vlans (but not routing... just changing the .1q tag -- think inline Network Admission Control, or dirty vlan/clean vlan)
source/destination.vlan.id
source/destination.vlan.name (might be interface name)
source/destination.interface.id
source/destination.interface.name (might be the same as vlan name)

3) Base Layer 2: device is inline, pulling data off of the wire on one side and putting it back down on the other on the same VLAN for any given connection (may support multiple vlans on one device so needs to be specific to connections)
network.vlan.id
network.vlan.name
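
The split between cases 2 and 3 can be sketched as a small decision function. This is only one reading of the proposal: the function name `vlan_fields` is hypothetical, and treating "same VLAN on both sides" or "only one VLAN reported" as the Layer 2 case is an assumption, not something the thread settles.

```python
from typing import Optional


def vlan_fields(ingress_id: Optional[int], egress_id: Optional[int]) -> dict:
    """Choose where VLAN IDs go, following the Layer 3 / Layer 2 split above.

    - Different ingress and egress VLANs (gateway, or an L2 device
      re-tagging between two VLANs): source.vlan.id / destination.vlan.id.
    - Same VLAN on both sides, or only one VLAN reported (assumed here to be
      the inline L2 case): network.vlan.id.
    """
    if ingress_id is not None and egress_id is not None and ingress_id != egress_id:
        return {
            "source": {"vlan": {"id": ingress_id}},
            "destination": {"vlan": {"id": egress_id}},
        }
    vlan_id = ingress_id if ingress_id is not None else egress_id
    if vlan_id is None:
        return {}  # the device reported no VLAN at all
    return {"network": {"vlan": {"id": vlan_id}}}


print(vlan_fields(10, 20))  # routed or re-tagged between VLAN 10 and VLAN 20
print(vlan_fields(30, 30))  # inline device, same VLAN on both sides
```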

@neu5ron
Author

neu5ron commented Mar 24, 2019

what about making VLAN a nested field?
I like, network.vlan.id , source.vlan.id , destination.vlan.id that you all mentioned!

@neu5ron
Author

neu5ron commented Mar 24, 2019

Also, I imagine we will be using the short ES data type for this, correct?
The VLAN ID is a 12-bit field in IEEE 802.1Q, so 4095 is the highest possible value (and 4094 the highest usable ID).
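
A quick arithmetic check of the data type question, as a sketch: the VLAN ID in an 802.1Q tag is 12 bits wide, with 0 and 4095 reserved, and Elasticsearch's `short` is a signed 16-bit integer. The constant and function names below are made up for the example.

```python
VLAN_ID_BITS = 12                        # width of the VID field in an 802.1Q tag
MAX_VID_VALUE = (1 << VLAN_ID_BITS) - 1  # 4095, largest value 12 bits can hold
ES_SHORT_MAX = 2**15 - 1                 # 32767, upper bound of a signed 16-bit short


def is_usable_vlan_id(vid: int) -> bool:
    """True for VIDs 1..4094; 0 and 4095 are reserved by 802.1Q."""
    return 1 <= vid < MAX_VID_VALUE


print(MAX_VID_VALUE)                  # 4095
print(MAX_VID_VALUE <= ES_SHORT_MAX)  # True
print(is_usable_vlan_id(4094), is_usable_vlan_id(4095), is_usable_vlan_id(0))  # True False False
```

So any VLAN ID fits comfortably in a `short`; whether a numeric type is the right choice for an identifier is a separate design question.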

@dainperkins
Contributor

dainperkins commented Mar 25, 2019 via email

@dainperkins
Contributor

There also needs to potentially be an allowance for nested VLANs for QinQ (typically provider specific, but actually ran into it the other day)

@MikePaquette
Contributor

@dainperkins Just wondering if we could come up with a convention for representing QinQ in a single *.vlan.id field and related.vlan.id rather than doing more nesting, without losing ability to search/filter?

@dainperkins
Contributor

@MikePaquette when I put my network hat on I think I would want it represented more concretely than related.vlan (specifically thinking of how network people are likely to look at data, and the possibility in a provider network of having more than 2 tags).

Maybe something like:
*.vlan.type (mark 802.1ad for Q-in-Q, blank or 802.1q for single tag)
*.vlan.name / .id -> 802.1q native tagging / inner vlan
*.vlan.tag1.name / .id -> 802.1ad second VLAN tag
*.vlan.tag2.name / .id -> 802.1ad third VLAN tag
...
*.vlan.tag[x].name / .id -> outer VLAN tag

With no logistical limit on the number of tags (tho I would imagine 3-4 is likely the highest that would be used, and I think anything over 3 starts taking bits away from the data portion of the frame)
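
To make the stacking concrete, here is a small sketch of how nested tags sit in an Ethernet frame. It assumes the standard tag protocol identifiers (0x8100 for 802.1Q, 0x88A8 for 802.1ad) and ignores vendor-specific values; the function name and the test frame are invented for the example.

```python
import struct

TPID_8021Q = 0x8100   # customer tag (802.1Q)
TPID_8021AD = 0x88A8  # service tag (802.1ad, Q-in-Q)


def parse_vlan_tags(frame: bytes) -> list:
    """Return the VLAN IDs in an Ethernet frame, outermost tag first."""
    vids = []
    offset = 12  # skip destination and source MAC addresses (6 bytes each)
    while offset + 4 <= len(frame):
        tpid, tci = struct.unpack_from("!HH", frame, offset)
        if tpid not in (TPID_8021Q, TPID_8021AD):
            break  # reached the real EtherType, no more tags
        vids.append(tci & 0x0FFF)  # low 12 bits of the TCI are the VLAN ID
        offset += 4                # each tag adds 4 bytes to the frame
    return vids


macs = bytes(12)  # placeholder MAC addresses (all zeros)
double_tagged = (
    macs
    + struct.pack("!HH", TPID_8021AD, 100)  # outer tag, VID 100
    + struct.pack("!HH", TPID_8021Q, 200)   # inner tag, VID 200
    + struct.pack("!H", 0x0800)             # EtherType: IPv4
)
print(parse_vlan_tags(double_tagged))  # [100, 200]
```

Each tag costs 4 bytes, which is why deep stacks eat into the room left for payload unless the MTU is raised.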

@neu5ron
Author

neu5ron commented Mar 29, 2019

great stuff.
I was just looking at this last night and talking to some network folks too. Even with the ability to have more than 2 VLANs in a single packet (and many more with jumbo frames and whatnot), "most" cases would just be using 802.1q, and for Q-in-Q/802.1ad "most" cases would be 2 tags.
If there is a VLAN sandwich of 3 or more, then we could use "vlan.additional.id" as an array for the 3-or-more case and/or for when there are 2 tags but you are unable to determine (via the log source) which is inner or outer.
Example:

vlan.id
vlan.name
vlan.outer.id
vlan.outer.name
vlan.inner.id
vlan.inner.name
vlan.additional.id
vlan.additional.name
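
One possible reading of this layout, sketched in Python. The function name `vlan_stack_to_fields` is hypothetical, and the handling of 3 or more tags (outermost to `vlan.outer`, innermost to `vlan.inner`, everything in between to `vlan.additional`) is an assumption; the comment above leaves that detail open.

```python
def vlan_stack_to_fields(vids: list, order_known: bool = True) -> dict:
    """Map VLAN IDs (outermost first) onto the field layout listed above."""
    if not vids:
        return {}
    if len(vids) == 1:
        return {"vlan": {"id": vids[0]}}
    if not order_known:
        # The log source does not say which tag is inner or outer.
        return {"vlan": {"additional": {"id": list(vids)}}}
    fields = {"vlan": {"outer": {"id": vids[0]}, "inner": {"id": vids[-1]}}}
    if len(vids) > 2:
        # Assumption: tags between the outermost and innermost go here.
        fields["vlan"]["additional"] = {"id": list(vids[1:-1])}
    return fields


print(vlan_stack_to_fields([100]))                         # single 802.1q tag
print(vlan_stack_to_fields([100, 200]))                    # Q-in-Q, order known
print(vlan_stack_to_fields([100, 150, 200]))               # "VLAN sandwich"
print(vlan_stack_to_fields([100, 200], order_known=False)) # order indeterminate
```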

Love the vlan.type

I am afraid using a sequential tag numbering, will be a bit difficult on most logstash users (maybe even myself ;) -- but more importantly I think it provides a sort of "ordering" issue from a data analyst perspective - as an analyst I could see looking at "vlan.tag2.id" and thinking "ok that may be the 2nd

I think there is no real 100% "correct" way to go with this, especially taking into consideration all the passive devices, and I'm sure there are network vendors out there who don't differentiate an inner/outer (or more) VLAN in their logging output.
So this is where related.vlan_id becomes even more useful - I am thinking of switching to related.vlan_id because somebody later on could implement (custom or in ECS) related.vlan_name,
and at the same time I don't want to make the related field infinitely nestable either.

I am open to more discussion... this is tough for sure...

@dainperkins
Contributor

I like the inner/outer + "related" (especially if the source doesn't report in one order or another) particularly if we can use an array in the vlan.related.id/name fields (I think you are right that deciding order programmatically between multiple vendors logs may be exceedingly difficult - I am reminded of the days of Cabletron starting vlan numbering at 0, and everyone else at 1)

I do wonder if it would be better to have vlan.nested.[id/name arrays] or something other than related (vlan.qinq, or vlan.8021ad) in case there is another use case for related.vlan[id/name] that could get confusing with mixed index types?

@willemdh
Contributor

Hello,

This issue seems very silent for some time. I was working on ECS'ing some sflow data and came to the conclusion there is no room for source.vlan.id and destination.vlan.id yet?

This seems like a logical expansion to at least start with:

vlan.id
vlan.name

and enable nesting under source and destination?

Grtz

Willem

@dainperkins
Contributor

dainperkins commented Nov 29, 2019

I feel like we've conflated 2 specific use cases - @willemdh has a simpler idea for source/destination where there is a distinct single vlan for each (e.g. firewall / netflow packet moving), as opposed to e.g. packet analysis where we might be dealing with multiple q-in-q issues. I propose adding [source|destination].vlan.[id/name] in the near term for dealing with routing decisions, and we'll work a little more on the top level in regards to q-in-q issues?

I'll put in a PR for the basics for source/dest with specific guidelines on when to use them, and then we can decide whether on-the-wire VLAN details go under vlan. or network. I don't think they would be relevant to the source/dest, as anything processing the packet is liable to be making (and reporting on) a decision at only one .1q level at a time.

@dainperkins
Contributor

@webmat @MikePaquette Thinking about VLANS in 2 ways: source / destination for e.g. netflow & firewalls vs the idea of e.g. inventory in terms of polling SNMP vlan information from a config, or recording any host/observer sub-interface info

I'm probably just wrapping myself around an unnecessary axle but I see the following being useful for source / destination (and probably client/server/host) where the incoming information is likely to be just vlan id / vlan name

(source/destination).vlan.id
(source/destination).vlan.name

and the previous packet level analysis adds the vlan.inner, vlan.outer, vlan.related...

But for the top-level vlan fields (coming from Cisco docs): can we mark just vlan.id & vlan.name as reusable under source/destination and interface? Or should I just use the docs for each field to specify how the items are to be used?

  • vlan.id
  • vlan.said (100000 + vlan id)
  • vlan.name
  • vlan.description
  • vlan.state (active/suspended)
  • vlan.mtu
  • vlan.translational_bridge [0-1005, 0-1005] (e.g. QinQ egress/ingress translations)
  • vlan.mode (uni-eni, private, rspan)

(skipping private specifics & rspan specifics for now)

@dainperkins
Contributor

Submitted PR 688
#688

@webmat
Contributor

webmat commented Dec 3, 2020

Went looking for something random, and found this oldie. With the vlan fields in ECS for a while, can this be closed?

My understanding of discussions with @dcode and @neu5ron led me to believe that outer and inner should be plenty, as there's rarely more than 2 nested vlans, making the specific initial request of related.vlan moot. Is my understanding correct?

I'm closing now, but feel free to reopen if I'm mistaken :-)

@webmat webmat closed this as completed Dec 3, 2020