Proposal to improve the layout of the field reference documentation for Beats #9519

Closed
dedemorton opened this issue Dec 13, 2018 · 12 comments · Fixed by #12242

@dedemorton
Contributor

dedemorton commented Dec 13, 2018

Currently the documentation about exported fields is not easy to read and scan due to problems with the layout. This proposal recommends improvements to the layout based on what I think should be done.

Problems with the existing layout:

  • We use the same heading level for all headings, which makes the topic harder to scan and obscures the relationship between sections.
  • Definition list formatting makes the content longer (by about 40%) and therefore harder to scan.
  • Redundant sections are not useful and make it look like there is a mistake. The logic in our generator creates these sections because of the filesets/metricsets used to collect the data, but the sections don't really add value for the user.

Problems with the writing style/content:

  • Inconsistent style (some full sentences, some fragments, erratic capitalization)
  • Many descriptions lack depth and completeness. Some descriptions are missing. Users can't always tell how the fields get populated (do they enable a module? an input? define a processor?)

Proposal to address these issues:

  • Use a different heading level for sub-sections. Avoid introducing additional heading levels beyond the title and sub-headings.
  • Format the field reference info using tables instead of definition lists. Each table should have no more than 3 cols. When additional information is available (such as format, alias to, and required), include this information in the description. Why? Tables with more than 3 columns may result in formatting issues (text that overruns in the navigation) and are hard to read on smaller devices. Complex tables are also more difficult for people using screen readers.
  • Edit the existing content for consistency and define style guidelines for new contributors to encourage consistency.
  • Make sure the descriptions in each section indicate which modules/inputs/processors/etc populate the fields with data. This is useful for new users who aren't familiar with the available modules. If possible, there should be an active link back to the docs for the module that collects the data.
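
To make the three-column idea concrete, here is a minimal sketch of how a generator might emit an AsciiDoc table and fold the extra attributes into the description column. It is only an illustration: the function names, the sample field, the `path` key used for "alias to", and the `keyword` fallback type are assumptions, not taken from the existing script.

```python
def describe(field):
    """Fold optional attributes into the description so the table stays at 3 columns."""
    # Collapse whitespace and escape pipes, which would otherwise start a new cell.
    text = " ".join(field.get("description", "").split()).replace("|", "\\|")
    extras = []
    if "format" in field:
        extras.append("Format: {}.".format(field["format"]))
    if "path" in field:  # assumed key holding the alias target
        extras.append("Alias to: {}.".format(field["path"]))
    if field.get("required"):
        extras.append("Required.")
    return " ".join([text] + extras).strip()


def render_table(fields):
    """Render a list of field dicts as a three-column AsciiDoc table."""
    lines = ['[options="header"]', "|===", "| Field | Type | Description"]
    for field in fields:
        lines.append("| `{}` | {} | {}".format(
            field["name"],
            field.get("type", "keyword"),  # assumed fallback type
            describe(field),
        ))
    lines.append("|===")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical field, for illustration only.
    print(render_table([{
        "name": "example.status.hostname",
        "type": "keyword",
        "description": "Host name of the monitored server.",
        "required": True,
    }]))
```

Keeping the extras as short sentences at the end of the description means the raw AsciiDoc stays readable in GitHub as well.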

A prototype of the layout is available here: https://field-reference-test.firebaseapp.com/index.html Commented out because I accidentally blew this away.

The asciidoc I used is here: https://github.com/dedemorton/beats-docs-review/pull/2/files

Note that links are missing in this small test case. I've put everything into a single file to make it easy to build. The table layout I've chosen is relatively simple and easy to read as raw text in GitHub.

Some other stuff I played around with:

  • Added descriptions to the overview page. I think this is better than a bulleted list that simply repeats the navigation.
  • Removed repetition of "fields" in sub-headings. Does this work?
  • Capitalized all headings.

Layout problems caused by our CSS or asciidoc (not easy to fix):

  • Heading size and fonts do not play well together.
  • Not enough cell padding between columns.
  • No way to set the width of columns so that all tables on a page have the same column widths (this makes it easier to scan content quickly). Our docbook transform in our toolchain swallows column attributes.

Questions and open issues:

  • Can we extract the descriptions defined in the Elastic Common Schema? Users should not have to look in more than one place to figure out the content and data type of a field.
  • Can the script be modified to strip out redundant sections (sections that do not contain field descriptions)? Do these sections provide any real purpose, or are they just an artifact of how the current script logic works?
  • Who will create the script?
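
On the second question, stripping redundant sections could look roughly like the sketch below. It assumes each section is a dict with an optional `fields` list and that nested groups carry their own `fields` list; the function names and sample data are hypothetical and not part of the current generator.

```python
def has_documented_fields(section):
    """Return True if the section, or any group nested in it, describes a field."""
    for field in section.get("fields") or []:
        if field.get("type") == "group" or "fields" in field:
            # A group only counts if something inside it is documented.
            if has_documented_fields(field):
                return True
        elif field.get("description"):
            return True
    return False


def prune_sections(sections):
    """Keep only the sections that contain at least one field description."""
    return [s for s in sections if has_documented_fields(s)]


if __name__ == "__main__":
    # Hypothetical sections, for illustration only.
    sections = [
        {"name": "status", "fields": [{"name": "uptime", "description": "Server uptime."}]},
        {"name": "empty", "fields": [{"name": "wrapper", "type": "group", "fields": []}]},
    ]
    print([s["name"] for s in prune_sections(sections)])
```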
@dedemorton added the docs and discuss (Issue needs further discussion) labels on Dec 13, 2018
@dedemorton
Contributor Author

Adding @bmorelli25 because he has a dependency on this layout too.

@ruflin
Contributor

ruflin commented Dec 13, 2018

I think this is a very big improvement over what we have today. For your questions:

  • Can we extract ECS descriptions: Yes. But there is also a twist here. Sometimes we also want to overwrite this definition with a more local / accurate one as ECS is generic.
  • Empty sections: I'm pretty sure we can adjust the script accordingly. Historically these sections were used to describe groups of fields, but in most cases this has become obsolete. I think we also need to remove some of these descriptions.
  • Happy to help with the script improvements.
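
The "local definition wins, ECS is the fallback" idea from the first point might be sketched like this. The dictionary shapes, the function name, and the sample entries are assumptions made for illustration; the real field definitions may be structured differently.

```python
def resolve_description(field_name, local_fields, ecs_fields):
    """Prefer a local description when one exists, else fall back to the ECS one."""
    local = local_fields.get(field_name, {}).get("description")
    if local:
        return local
    return ecs_fields.get(field_name, {}).get("description", "")


if __name__ == "__main__":
    # Hypothetical definitions, for illustration only.
    ecs = {"source.ip": {"description": "Generic description from ECS."}}
    local = {"source.ip": {"description": "More specific description for this module."}}
    print(resolve_description("source.ip", local, ecs))
    print(resolve_description("source.ip", {}, ecs))
```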

@bmorelli25
Member

I think this is a huge improvement DeDe! Not sure that I have much to add (other than a 👍), but I'm a big fan of:

  • Descriptions on the overview page
  • The table format is much easier to scan. I agree, there are some formatting issues (fixed width would be awesome), but the benefits outweigh these
  • Getting rid of fields repetition

@karenzone
Contributor

karenzone commented Feb 13, 2019

DEFINITELY a big improvement. wrt:

"Each table should have no more than 3 cols. When additional information is available (such as format, alias to, and required), include this information in the description."

The ECS format currently uses 5 tables. No serious text overruns yet, but it's a matter of time.
I also like the idea we discussed about having a shortdesc field in schemas that we can use in overview tables, etc.

@webmat Ideas and implications for ECS here.

@webmat
Contributor

webmat commented Feb 15, 2019

Thanks for reminding me of this, @karenzone. Momentous timing, as I'm starting on the push of ECS docs to the main website. I'll consider the work and thoughts from this issue 👍

@dedemorton
Contributor Author

@ruflin I'd like to revisit this discussion about the field reference info now that ECS 1.0 has been released. I know you also have big changes in store for modules. I want to make sure we move forward in tandem on improvements here.

@dedemorton
Contributor Author

As part of this effort, we need to make sure the generated doc tells users how to get the fields that are shown (module, processor, etc) and make titles unique.

@ruflin
Contributor

ruflin commented May 3, 2019

@dedemorton I think a first step is to improve the layout and potentially context information we have around the fields. I wonder if what ECS uses as layout could also work for Beats?

For the bigger refactoring to indicate where fields come from etc.: +1. It will probably mean quite a few fields will show up in multiple places, which is fine. From my point of view, an example of this is the event we show for HAProxy (and others): https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-metricset-haproxy-info.html#_fields_42 The event shown contains quite a few ECS fields, which is great, but how would a user who is only looking at this event, and does not know about ECS, know where those fields come from?

Thinking long term about modules and fields definitions ending up in Kibana, the user should see the field definition in the index pattern or when hovering over a specific field. So no need to go to the docs for this. But that is really long term.

@dedemorton
Contributor Author

@ruflin

I think a first step is to improve the layout and potentially context information we have around the fields. I wonder if what ECS uses as layout could also work for Beats?

Agreed. I think Karen and Matt based the ECS layout on the prototype I created, so we should be able to use a similar layout.

the user should see the field definition in the index pattern or when hovering over a specific field

Totally agree! But we might also want to keep the field descriptions in the published docs where the content is searchable (as we get closer to this goal, we'll look at google analytics to see if this makes sense).

For now, I'll see if I can tweak the script to spit out the format that we'd like to see. I'll pull in our resident doc tools guy (Nik) if I can't figure it out easily myself.

@ruflin
Contributor

ruflin commented May 6, 2019

@dedemorton Great, let me know if I can help on the scripting side for the layout.

@dedemorton
Contributor Author

dedemorton commented May 8, 2019

@ruflin and/or @nik9000 I'm finding it quite difficult to get the script to spit out tables because of inconsistencies in how the yaml files across the module docs are structured. Usually my bulldog-like tenacity wins out, but not in this case. :-P

One way around this problem is to remove the sections completely and have one long list (or possibly table) that contains all the options.

See the example I created here: https://metricbeatexample.firebaseapp.com/exported-fields-apache.html.

To generate this, we just need to remove the code that adds the other sections and have something like:

def document_fields(output, section, sections, path):
    if "anchor" in section and "description" in section:
        output.write("[[exported-fields-{}]]\n".format(section["anchor"]))
        output.write("== {} fields\n\n".format(section["name"]))
        output.write("{}\n\n".format(section["description"]))

    if "fields" not in section or not section["fields"]:
        return

...

How do we think users are using this content? If they want to find out what a field contains, then a simple page without sections is easier, IMO. If they want to know how the field gets generated, then we need sections for each metricset, but we also need to add missing content to a lot of modules because most of the section intros don't add much value.
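
For the "one long list" option, the nested groups would have to be flattened into full field names. A rough sketch of that step follows, assuming groups are dicts with a nested `fields` list; the function name and sample data are hypothetical, and this is not the code behind the example page.

```python
def flatten_fields(fields, prefix=""):
    """Flatten nested groups into (full_name, type, description) tuples."""
    flat = []
    for field in fields or []:
        name = prefix + field["name"]
        if field.get("type") == "group" or "fields" in field:
            # Descend into the group and extend the dotted prefix.
            flat.extend(flatten_fields(field.get("fields"), name + "."))
        else:
            flat.append((name, field.get("type", ""), field.get("description", "")))
    return flat


if __name__ == "__main__":
    # Hypothetical module definition, for illustration only.
    module = [{
        "name": "example",
        "type": "group",
        "fields": [{
            "name": "status",
            "type": "group",
            "fields": [{"name": "uptime", "type": "long", "description": "Server uptime."}],
        }],
    }]
    for row in flatten_fields(module):
        print(row)
```

The resulting tuples could feed either a definition list or a single table per module.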

If we want to preserve the sub-sections, I'll need some help working through the logic.

Full disclosure: I've only tested this on Metricbeat at this point.

@ruflin
Contributor

ruflin commented May 9, 2019

I don't think we need the sections. I know we sometimes have descriptions inside them, but they often just repeat the obvious. I'd rather have a longer description at the top level about the different sections, or more detail in each field, instead.
