Add field reuse links to relevant field sets #569

Closed
wants to merge 9 commits into from

Conversation

benskelker
Contributor

Just added the field reuse link to the as (Autonomous System) fields. If we're good to go, I'll add it to all relevant field sets in this PR.

@benskelker benskelker requested a review from webmat September 25, 2019 09:31
@webmat
Contributor

webmat commented Sep 25, 2019

@benskelker I didn't realize NOTE: was the code word to generate this nice looking interstitial. Good stuff! Thanks for getting this started.

The build is breaking because this whole file is actually generated by the master/scripts/generators/asciidoc_fields.py file. The templates are below line 142. However the logic to decide when to show this will have to be in the code above that.

Do you want to try your hand at playing in the Python code? If that's not in the cards, I can take it from here. If you'd like to try this, we can have a quick discussion and I can show you around the code.

WDYT?

@webmat
Contributor

webmat commented Sep 27, 2019

As discussed yesterday, here's the code part that you'd need to integrate in the script file to only get the note to appear when there's field reuse going on.

Add this function above the definition of function render_fieldset_reuses_text:

def render_fieldset_reuse_link(fieldset):
    '''Render a link to field reuse section, only when appropriate'''
    if ('nestings' in fieldset or 'reusable' in fieldset):
        return 'NOTE: ...'
    else:
        return ''

You'll add the result of calling this function in the render_fieldset function, here it is in full, with the additional line to set the new param fieldset_reuse_links:

def render_fieldset(fieldset, ecs_nested):
    text = field_details_table_header().format(
        fieldset_title=fieldset['title'],
        fieldset_name=fieldset['name'],
        fieldset_description=render_asciidoc_paragraphs(fieldset['description']),
        fieldset_reuse_links=render_fieldset_reuse_link(fieldset)
    )

And finally, in the template function field_details_table_header, you'll want to output this new param below {fieldset_description}:

{fieldset_description}

{fieldset_reuse_links}

With this in place, we can now iterate on how this looks by tweaking what's in return 'NOTE: ...' :-)
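To make the three pieces above easier to try in isolation, here is a minimal, self-contained sketch of how they fit together. It is not the real generator script: render_asciidoc_paragraphs and field_details_table_header are simplified stand-ins, the template text and the [[ecs-...]] anchor are assumptions, and render_fieldset_header is a hypothetical name for just the first part of render_fieldset.

```python
def render_asciidoc_paragraphs(text):
    # Stand-in for the real helper of the same name; here it only trims whitespace.
    return text.strip()


def render_fieldset_reuse_link(fieldset):
    '''Render a link to the field reuse section, only when appropriate'''
    if 'nestings' in fieldset or 'reusable' in fieldset:
        # Placeholder wording; the final text of the note is still under discussion.
        return 'NOTE: <<{}-field-reuse, See field reuse information.>>'.format(fieldset['name'])
    return ''


def field_details_table_header():
    # Heavily simplified stand-in for the real template (assumed layout).
    return '''
[[ecs-{fieldset_name}]]
=== {fieldset_title} Fields

{fieldset_description}

{fieldset_reuse_links}
'''


def render_fieldset_header(fieldset):
    # Hypothetical helper: only the header part of render_fieldset.
    return field_details_table_header().format(
        fieldset_title=fieldset['title'],
        fieldset_name=fieldset['name'],
        fieldset_description=render_asciidoc_paragraphs(fieldset['description']),
        fieldset_reuse_links=render_fieldset_reuse_link(fieldset),
    )


if __name__ == '__main__':
    user = {'name': 'user', 'title': 'User', 'description': 'User fields.',
            'nestings': ['user.group']}
    print(render_fieldset_header(user))
```

A field set with neither key simply renders an empty string in place of {fieldset_reuse_links}, so the note only shows up when there is reuse.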

In the future we can collaborate more smoothly on this if you give me push access to your fork. This way I could directly push some code changes to your PRs, and you could focus on shaping the text & so on. We can discuss this when you're back.

@webmat
Contributor

webmat commented Sep 27, 2019

So previous comment was about the code needed for this. Now let's also have some actual documentation discussion :-)

Only feedback there I think would be to turn this into more of a full sentence, with only part of it being the actual link.

With this in mind, there are a few situations to consider:

  1. The fieldset in question is being reused somewhere else (e.g. group)
  2. The fieldset has other fieldset(s) that can be nested under it (e.g. client)
  3. The fieldset is both: it's being reused elsewhere, and some other field set can be nested under it (e.g. user)
  4. Finally, no reuse

Perhaps we can come up with a generic sentence for situations 1, 2 and 3: <<as-field-reuse, See field reuse information.>>

Or perhaps we can adjust the wording depending on whether it's 1, 2 or 3? What do you think?

Also, I think it's fine to have nothing when there's no reuse, but this is open for discussion as well.
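As a rough sketch of how the four situations could be told apart in the script, assuming (as in the earlier snippet) that a 'reusable' key marks a field set that is reused elsewhere and a 'nestings' key marks one that has other field sets nested under it. The ReuseSituation enum and classify_reuse are hypothetical names, not part of the existing code.

```python
from enum import Enum


class ReuseSituation(Enum):
    REUSED_ELSEWHERE = 1      # situation 1, e.g. group
    HAS_NESTED_FIELDSETS = 2  # situation 2, e.g. client
    BOTH = 3                  # situation 3, e.g. user
    NO_REUSE = 4              # situation 4


def classify_reuse(fieldset):
    '''Map a field set definition to one of the four situations.'''
    reused_elsewhere = 'reusable' in fieldset  # assumed meaning of the key
    has_nested = 'nestings' in fieldset        # assumed meaning of the key
    if reused_elsewhere and has_nested:
        return ReuseSituation.BOTH
    if reused_elsewhere:
        return ReuseSituation.REUSED_ELSEWHERE
    if has_nested:
        return ReuseSituation.HAS_NESTED_FIELDSETS
    return ReuseSituation.NO_REUSE
```

Whether the wording is generic or per situation, the decision can then hang off this single value.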

Once we're done with the NOTE at the top, we could work on adjusting the "field reuse" section itself, as well. But I'd keep that for a subsequent PR :-)

@MikePaquette
Contributor

4. Finally, no reuse
Also, I think it's fine to have nothing when there's no reuse, but this is open for discussion as well.

@webmat @benskelker Since ECS is a specification, explicit documentation saying something like "This field set is not nested." will be very helpful for readers who might not read all field set definitions to even realize that nesting is a thing.

Can we add such a sentence for the fourth case?

@webmat
Contributor

webmat commented Sep 30, 2019

@MikePaquette Yes, this makes sense. Let's err on the side of being more explicit 👍

@benskelker
Contributor Author

benskelker commented Oct 2, 2019

@webmat @MikePaquette
Mat - I know you didn't like the initial suggestion, but can you reconsider updating the page structure:

h1: Fieldset name
Description

h3: Field Reuse
Any of the following, as appropriate:

  • None
  • Can be nested under: (e.g. group, user)
  • Can be a parent of: (e.g. client, user)
  • Must be nested under: (e.g. geo)

h2: Field Details

This way, all the information required for a fieldset is presented before describing each field.

@webmat
Contributor

webmat commented Oct 2, 2019

@benskelker Could you mock it up and show the result? Just a screenshot of the rendered asciidoc is enough. In other words, no need to update the code for this; just tweak one page manually, render with asciidoctor & paste here as a comment.

@benskelker
Contributor Author

Screenshot with field reuse

Screenshot 2019-10-02 at 19 02 49

@benskelker
Contributor Author

Screenshot without field reuse
Screenshot 2019-10-02 at 18 59 48

@benskelker
Contributor Author

The doc builds fine locally on my machine

@webmat
Contributor

webmat commented Oct 2, 2019

@benskelker Sorry I meant a screenshot of the page in the browser :-)

@webmat
Contributor

webmat commented Oct 2, 2019

Wait, I think I misunderstood what you were asking for in this comment #569 (comment).

I thought you were suggesting we try your initial idea of adding all of the field reuse section at the top.

But what you're showing me in these code screenshots is the approach I'm ok with. If that's what you want to go for, I'm all for this, no need to show me a POC :-)

@MikePaquette
Contributor

@webmat @benskelker I too thought you were proposing putting all the re-use info up top after the fieldset definition.

I am OK with either approach.

@benskelker
Contributor Author

We're good to go - just updated the text a bit in the last commit.

@webmat
Contributor

webmat commented Oct 4, 2019

The build is failing because of the Python code formatter.

Running make fmt on your workstation will reformat it, and you'll be left with a few changes to commit.

@webmat
Contributor

webmat commented Oct 4, 2019

I'm not sure about the big interstitial saying "Not reused" being there 100% of the time, though. Especially not for the "Base" fields. I feel like people will tune it out, if there's always this note that 75% of the time says "not reused".

I think having the "Field Set" section always present at the end, stating that these fields are not reused elsewhere and that nothing is nested under them, would make more sense. I would tackle that in a separate PR, though. I'd rather keep this PR focused only on the notice at the top of the page.

I also think the wording when there is reuse could be simplified. Out of context, I'm not sure how the current sentence speaks to people. "field set" is a bit inside baseball, even if we use it in the docs.

Although when trying to simplify it and use more common words, I quickly get to a point of needing distinct sentences for each situation.

Situation | Sentence example
None | (nothing at the top)
These fields nested elsewhere | The "Autonomous System" fields can be nested inside other ECS field sets. See details in the <<client-field-reuse, Field Reuse>> section below.
Other fields nested here | Other ECS fields can be nested inside the "Client" fields. See details in the <<client-field-reuse, Field Reuse>> section below.
Both - nested elsewhere & other fields nested here | The "User" fields can be nested inside other ECS field sets. Other ECS fields can be nested inside the "User" fields. See details in the <<user-field-reuse, Field Reuse>> section below.
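The "both" sentence is just the two single-case sentences followed by the link, so the distinct wordings could be assembled from parts rather than written out four times. A minimal sketch, assuming 'reusable' means the field set is nested elsewhere and 'nestings' means other field sets are nested here; render_reuse_sentence is a hypothetical name and the {name}-field-reuse anchor pattern is an assumption.

```python
def render_reuse_sentence(fieldset):
    '''Build the notice shown at the top of a field set page, or '' when there is no reuse.'''
    title = fieldset['title']
    sentences = []
    if 'reusable' in fieldset:
        # These fields are nested elsewhere.
        sentences.append('The "{}" fields can be nested inside other ECS field sets.'.format(title))
    if 'nestings' in fieldset:
        # Other field sets are nested here.
        sentences.append('Other ECS fields can be nested inside the "{}" fields.'.format(title))
    if not sentences:
        return ''
    sentences.append(
        'See details in the <<{}-field-reuse, Field Reuse>> section below.'.format(fieldset['name']))
    return ' '.join(sentences)
```

For example, calling it with a user field set that has both keys yields the two sentences back to back, followed by the link to user-field-reuse.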

@webmat
Contributor

webmat commented Oct 4, 2019

If you agree with this kind of approach (and we can improve the wording together), I will adjust the code. Let me know :-)

@benskelker
Contributor Author

This commit goes much further than the original PR.
I removed the link when fields are not reused and updated the reuse sections. I don't think it makes sense to have detailed reuse links. We just want to shout out to users when fields can/must be reused.

I had a stab at updating the code - hope that's ok. I ran make fmt but I think I still have some missing dependencies as I got an error message.

@benskelker benskelker force-pushed the add_field_reuse_links branch from 3e76626 to 754a903 Compare October 6, 2019 10:52
@benskelker
Contributor Author

You can preview the changes.

@MikePaquette
Contributor

This is looking good @benskelker. I still think we need some mention that a fieldset is not expected to be nested under other field sets. For example based on this latest commit, if I look at the host fieldset definition here, it is silent about whether host.* can be nested under any other fields. (This exact question has come up several times, where folks assumed that they could nest host.* under, say source.* or destination.*)

It would be clearer and more helpful if the note at the bottom was enhanced to say:
image

@benskelker
Contributor Author

@MikePaquette
I'll wait for @webmat to comment as well before I make more changes.

@webmat
Contributor

webmat commented Oct 7, 2019

Yeah I like the short note at the top, as you propose it. I think the wording works well no matter what kind of nesting is happening.

I also like that you adjust the wording "can" & "must", depending on whether the field set is expected at the root or nested, or exclusively nested. Good stuff 👍

Let's continue working on the wording a bit, however.

I think "These fields are not reused." is too short, and doesn't give enough context, for someone who isn't sure what this "field reuse" stuff is about. I think we need to flesh it out more, maybe something like "The {field set} fields are not expected to be nested under another field set."

I like that you're strengthening the top_level note by using "should not" instead of "are not expected". However I would not use the expression "root fields", I think it's going to be confusing: it almost sounds like user.* fields (like name, email) are no longer expected under user., but directly at the root. I think the previous wording "top level" was a bit better, as it doesn't have the connotation of no longer being nested at all.

The build is failing because the generated asciidoc files are not fully committed. If you run make locally, you need to commit all of the changes to the generated files, otherwise the build will fail.

@webmat
Contributor

webmat commented Oct 7, 2019

Also, what @MikePaquette points out is true. There happen to be fields nested under host, so we have a section talking about nested fields under host. But the page should still say that "host" should not be nested elsewhere.

In other words, the "Field Reuse" section should always be present (already done). But perhaps it should always have two subsections as well:

Field reuse

Fields Nested in {fieldset_name}

Nesting {fieldset_name} Fields in Other Fields


Agent example (no nesting)

Field reuse

Fields Nested in Agent

No other ECS fields are nested in agent.

Nesting Agent Fields in Other Fields

The agent fields are not nested elsewhere in ECS.


Host example (nest here only)

Field reuse

Fields Nested in Host

The host fields can be a parent of:

Child fields | Description
host.geo.* | ...
... | ...

Nesting Host Fields in Other Fields

The host fields are not nested elsewhere in ECS.


Group example (nest elsewhere only, top level or nested)

Field reuse

Fields Nested in Group

No other ECS fields are nested in group.

Nesting Group Fields in Other Fields

The group fields can be nested under:

Parent fields | Description
user.group.* | ...

NOTE: The group fields can be used nested or at the root of the events.


"Autonomous System" example (nest elsewhere only, not at top level)

Field reuse

Fields Nested in Autonomous System

No other ECS fields are nested in as.

Nesting Autonomous System Fields in Other Fields

The as fields can be nested under:

Parent fields | Description
client.as.* | ...
... | ...

NOTE: The as fields should only be nested as described above, they are not expected at the root of the events.


"User" example (nesting here, and nesting elsewhere)

Field reuse

Fields Nested in User

The user fields can be a parent of:

Child fields | Description
user.group.* | ...

Nesting User Fields in Other Fields

The user fields can be nested under:

Parent fields | Description
client.user.* | ...
... | ...

NOTE: The user fields can be used nested or at the root of the events.
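The five examples above follow one pattern, which could be sketched roughly as below. This is only an illustration under stated assumptions: 'nestings' is taken to be a list of nested names such as 'host.geo', 'reusable' a dict with an 'expected' list of parent field sets and a 'top_level' flag, and render_reuse_section is a hypothetical name. It emits plain text lines rather than real asciidoc headings or tables.

```python
def render_reuse_section(fieldset):
    '''Render a "Field reuse" section that always has both subsections.'''
    name = fieldset['name']
    title = fieldset['title']

    lines = ['Field reuse', '', 'Fields Nested in {}'.format(title), '']
    nestings = fieldset.get('nestings', [])  # assumed: e.g. ['host.geo', 'host.os']
    if nestings:
        lines.append('The {} fields can be a parent of:'.format(name))
        lines.extend('  {}.*'.format(nested) for nested in sorted(nestings))
    else:
        lines.append('No other ECS fields are nested in {}.'.format(name))

    lines += ['', 'Nesting {} Fields in Other Fields'.format(title), '']
    reusable = fieldset.get('reusable')  # assumed: {'top_level': bool, 'expected': [...]}
    if reusable:
        lines.append('The {} fields can be nested under:'.format(name))
        lines.extend('  {}.{}.*'.format(parent, name) for parent in sorted(reusable['expected']))
        lines.append('')
        if reusable.get('top_level', True):
            lines.append('NOTE: The {} fields can be used nested or at the root of the events.'.format(name))
        else:
            lines.append('NOTE: The {} fields should only be nested as described above, '
                         'they are not expected at the root of the events.'.format(name))
    else:
        lines.append('The {} fields are not nested elsewhere in ECS.'.format(name))
    return '\n'.join(lines)
```

Because both subsections are always rendered, a field set like host gets an explicit "not nested elsewhere" statement even though it has fields nested under it.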

@benskelker
Contributor Author

I used notes for clarifying when field sets cannot be nested and updated the text when there is no nesting at all.
I think it's clear (although I didn't add subheadings to the Reuse sections).

@benskelker benskelker force-pushed the add_field_reuse_links branch from 6c96175 to 3fed4fc Compare October 8, 2019 11:16
@benskelker
Contributor Author

Preview

@benskelker
Contributor Author

@webmat @MikePaquette
Hi - did you get a chance to take a look? Is it ok or would you rather have reuse subsections?

@MikePaquette
Contributor

@benskelker I really like this approach. It provides the valuable nesting information for every field set, and is a great improvement in our docs. I am fine to go ahead with this as it is.

If we want to spend further time polishing this, I have three observations:

  1. We are mixing metaphors in these sections between "nesting" and "child/parent"
  2. Our table column headers say "Child Fields", but the column entries are not strictly children, rather they are "parent+children".
  3. We are using "can" and "cannot" language rather than RFC 2119 language, which I happen to like better :-)

Suggestions:

  • Can we eliminate the "child/parent" metaphor and use only "nesting/nested"?
  • Can we replace "can" with "may" and "cannot" with "must not" ?
  • Bonus: Can we customize the description in the table to have context of the "parent" field

Example (italics indicate suggested changes):
Field Reuse

Other field sets may be nested under the host field as follows:

Nested usage | Description
host.geo.* | Fields describing a location of a host.
host.os.* | OS fields contain information about the operating system of a host.
host.user.* | Fields to describe the user of a host relevant to the event.

The host fields must not be nested under other field sets.

@benskelker
Contributor Author

@MikePaquette

  1. We are mixing metaphors in these sections between "nesting" and "child/parent"

Nesting does not explicitly describe the hierarchy, which leads to awkward sentences. Are you ok with 'fields can be children of'?
I'd prefer to keep the mixed usage, it keeps the English simple and we write for an international audience.

  2. Our table column headers say "Child Fields", but the column entries are not strictly children, rather they are "parent+children".

Yea, I thought about that but I think it's pretty self-explanatory (might be wrong).

  3. We are using "can" and "cannot" language rather than RFC 2119 language, which I happen to like better :-)

If we must, I may can change it :).

I'd love to customise the descriptions depending on the context, but let's leave that for a different PR.

@webmat any thoughts?

@webmat
Contributor

webmat commented Oct 28, 2019

I'd love to customise the descriptions depending on the context, but let's leave that for a different PR.

100% agree that contextual descriptions would be great and I also agree with keeping that for later :-) In order to do this, we'll need to 1) modify the YAML structure to allow for these additional contextual descriptions 2) fill it in for all field reuses where it's needed 3) adjust the scripts to leverage them.

Actually I like the idea so much that it's one of the first ideas I added to the ECS docs brainstorm document, a few weeks ago (it's #4).

On wording:

  • I agree with using the RFC wording, I need to make that a habit myself :-) Let's change the reuse sentences to use may/must.
  • Should we say "nest inside" instead of "nest under"?
  • I think the headings "parent fields" & "child fields" are fine for now (unless we decide to stop talking about parent/child)
  • nesting vs parent/child: I agree with Ben using "nesting" in all sentences may be awkward. I say we can keep this wording improvement for later, when inspiration strikes, because I'd also love it if we could use only one metaphor for this.

Field sets that have nothing nested inside them but are nested elsewhere (e.g. group) are not stating that "this field set has no other fields nested inside it". Note that this is only true for fields that are nested elsewhere. Fields that have zero nesting whatsoever are fine on this front, because they have the catch-all sentence.

Fields that have zero nesting (e.g. network) could still use a small wording adjustment, because the sentence mixes both metaphors:

These fields are never nested under or a parent of other field sets.

I wonder if we couldn't instead use only one metaphor for this sentence. That, or consider again my initial proposal to always have two sub-sections inside "Field Reuse". In this case, both sections contain a simple sentence:

Fields Nested in Agent

No other ECS fields are nested in agent.

Nesting Agent Fields in Other Fields

The agent fields are not nested elsewhere in ECS.

@benskelker
Contributor Author

benskelker commented Oct 29, 2019

@MikePaquette @webmat
Thanks for the comments.

I'll make one last plea not to use RFC wording :). May has connotations of chance and choice, whereas 'can' implies capability and purpose. I also think the recommendation is not relevant for our use case. We are not providing optionality, we are providing reusability. If users have to map geo fields, they must be nested. From the memo:

MAY This word, or the adjective "OPTIONAL", mean that an item is
truly optional. One vendor may choose to include the item because a
particular marketplace requires it or because the vendor feels that
it enhances the product while another vendor may omit the same item.
An implementation which does not include a particular option MUST be
prepared to interoperate with another implementation which does
include the option, though perhaps with reduced functionality. In the
same vein an implementation which does include a particular option
MUST be prepared to interoperate with another implementation which
does not include the option (except, of course, for the feature the
option provides.)

Let me know.

Let's try and agree on the nesting vs child/parent terminology. From your comments, I suggest we only use child/parent and never nested/nesting. Please let me know if you're okay with this.

Fields that have zero nesting (e.g. network) could still use a small wording adjustment, because the sentence mixes both metaphors:

Yes:
These fields are never a parent or child of other field sets.

Lastly:

Field sets that have nothing nested inside them but are nested elsewhere (e.g. group) are not stating that "this field set has no other fields nested inside it".

In these cases, let's add a bullet to the note:

NOTE: The field set name:

  • May also be used directly as top-level fields.
  • Must not be a parent of other fields.
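A minimal sketch of how such a bulleted note could be generated, assuming 'reusable' is a dict with a 'top_level' flag and 'nestings' lists the field sets nested here. render_reuse_note is a hypothetical name, and the "The {name} fields:" wording is a placeholder. The sketch uses an asciidoc block admonition on the assumption that a bulleted list would not stay inside a single-line NOTE: paragraph.

```python
def render_reuse_note(fieldset):
    '''Render a bulleted note for field sets that are nested elsewhere, or '' otherwise.'''
    reusable = fieldset.get('reusable')  # assumed: {'top_level': bool, 'expected': [...]}
    if not reusable:
        return ''
    bullets = []
    if reusable.get('top_level', True):
        bullets.append('May also be used directly as top-level fields.')
    if not fieldset.get('nestings'):
        # Nested elsewhere, but nothing is nested inside this field set.
        bullets.append('Must not be a parent of other fields.')
    if not bullets:
        return ''
    lines = ['[NOTE]', '====', 'The {} fields:'.format(fieldset['name']), '']
    lines += ['* ' + bullet for bullet in bullets]
    lines.append('====')
    return '\n'.join(lines)
```

Field sets with no reuse at all return an empty string here, since they are already covered by the catch-all sentence.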

@MikePaquette
Contributor

@benskelker Thanks!

  • Using the rfc wording. 👍
  • Using only parent/child 👍
  • Keeping one section rather than subsections 👍

@webmat
Contributor

webmat commented Oct 29, 2019

I'll make one last plea not to use RFC wording

@benskelker Ah sorry, I hadn't picked up on the fact that you were asking not to use the RFC wording. From Mike's comment above, it seems like he hasn't either 😂 I get what you're saying about "may"; however, I think "must not" is much clearer than "shouldn't". I don't think we need to make this a big issue.

On nesting vs parent/child: I'm wondering if we should table this discussion and move it to the brainstorming doc for now. The reasoning is that now we're looking at one small section of the docs; but if we want to eliminate one of the two metaphors, we should look around the docs and think about this holistically. I don't want to block this PR -- meant to point people to the "reuse" section -- with this holistic review. And I wouldn't want to eliminate one metaphor now by focusing only on this section, and upon further review decide to bring it back because it made more sense elsewhere. The secondary reason is that I don't think solely using "parent/child" is clear enough. I think we introduced the Parent/Child metaphor for the column names, where this made sense to describe the position of the fields, in the context of talking about nesting. But seeing a sentence stating "this field set is not a parent of other field sets" doesn't speak to me as much. If someone new to ECS asked me "what does it mean to say a field set is a parent of another?" I'm not sure I could explain it without talking about nesting. It seems to me like nesting is the fundamental concept here.

Are we ok to table the nesting vs parent/child discussion? If so, I'll add this to the brainstorming doc.

We should still adjust the sentence that uses two metaphors, however. Options:

  • These fields are never nested under or a parent of other field sets.
  • These fields are never a parent or child of other field sets.
  • This field set is not nested in other field sets, and has no other field sets nested within it.

My favorite one is the last, as it's most in line with the wording in the ECS documentation right now.

I'm ok with not having subsections, though.

@benskelker
Contributor Author

@webmat and @MikePaquette
Made changes and updated the wording, hopefully for the better. If you want changes, let's just have a quick zoom to finalise the PR.

Preview

@ebeahan
Member

ebeahan commented May 27, 2021

This PR has been stale for a while, and I believe a lot of the discussed functionality has since been introduced through other contributions.

@ebeahan ebeahan closed this May 27, 2021
4 participants