ECS tooling rewrite brought about by the need to reuse field sets as a different name #864

Merged
merged 113 commits into from
Jun 11, 2020
Changes from 97 commits
Commits
6568fe2
Function to turn single word reuse locations into the dictionary nota…
Apr 16, 2020
e0befee
Validate that keys 'as' and 'at' are present in dictionary notation
Apr 16, 2020
53556a0
Actually resolve the shorthands when reading schemas
Apr 16, 2020
88313e7
Unrelated: removed dead code
Apr 16, 2020
05d2847
Precompute the full path of a nesting, considering nesting as another…
Apr 17, 2020
ee51313
First pass at self-nesting
Apr 17, 2020
108dd35
Missed a generated file
Apr 20, 2020
ab82e6e
Code formatting
Apr 20, 2020
a3b5fde
Changelog
Apr 20, 2020
2918cdd
Fix asciidoc link when nesting as another name
Apr 20, 2020
d0c0271
Change the wording of using a field set at the root of events (instea…
Apr 20, 2020
c90885f
Remove commented code
Apr 20, 2020
df9843d
WIP
Apr 22, 2020
503f90b
Start massive refactor of schema_reader...
Apr 29, 2020
abee212
Make the 'reusable' field set properties explicit
Apr 29, 2020
e11fe38
Move a bunch of code to schema_processor.py
Apr 29, 2020
4b2039c
Give schemas/README.md some much needed attention
Apr 30, 2020
819575c
new schema loader WIP
May 6, 2020
0af1aa3
About to turn the schema loader into a shotgun...
May 7, 2020
4725167
The schema loader now merges EVERYTHING, taking special care of arrays
May 7, 2020
d16c361
Import the changes from PR #821
May 7, 2020
5b27a8b
Move nesting structure comment to top of file
May 7, 2020
50a49a3
Adjust tests for safe loading of schemas for new script name
May 7, 2020
a1c20d6
Start future schema cleaner w/ basic visitor function to navigate the…
May 8, 2020
94ce10b
Raise clear error on missing schema name
May 8, 2020
5344341
Add 'mock' testing dependency
May 8, 2020
3fe758b
Make 'title' a schema_detail as well
May 8, 2020
e28c737
Keep improving schema/readme
May 8, 2020
9bb6722
Create a helper for list subtraction...
May 8, 2020
6202462
schema level cleanups looking pretty good
May 8, 2020
ff17836
Extract the check for mandatory schema attributes to its own function
May 11, 2020
0ab1b1c
Raise on multiline short descriptions
May 12, 2020
7f7511d
Fill in field name for intermediary fields as well
May 12, 2020
c693b62
Refine docs around setting description and short in the YAMLs
May 12, 2020
ee6819b
Start the field cleaning code, reuse the short description check
May 12, 2020
2610d14
First checks for fields
May 12, 2020
c607ab0
Implement the simple defaults for fields
May 12, 2020
1ec38d1
better multi-fields datatype support than before
May 12, 2020
a6140a1
Precalculate the 'path' to a field as we build the deeply nested stru…
May 13, 2020
24bb145
Implement flat/dashed names for all fields. Close to feature parity w…
May 13, 2020
6a0bc77
Add intermediary=True to fields implicitly created to contain others.
May 15, 2020
9d19424
Correct my engrish. intermediary => intermediate
May 15, 2020
fcab216
Precalculate the leaf field name
May 15, 2020
ff718ca
In loader.py it's too early to determine namespacing under fs name
May 15, 2020
f3ab162
Comment out all the 'path' stuff. Must be calculated later
May 15, 2020
14850a5
Tentative final state for cleaner.py, much simplified
May 15, 2020
b7e1adb
Comments at the top
May 15, 2020
2d41e9b
Move validation & normalizing of field reuse attributes to new script
May 18, 2020
685d7a0
Remove 'path' stuff from cleaner tests. Will happen in finalizer
May 18, 2020
490a822
Add another test for the already normalized self reuse
May 25, 2020
caee0ee
Why does this throw my brain for a loop
May 25, 2020
45e5384
Both foreign reuse and self-reuse work now :-D
May 26, 2020
88090fd
Move visitor to its own module
May 26, 2020
a8a451d
First reasonable stab at finalizing the fields.
May 26, 2020
6051d52
New way to capture what's 'nested_here', make process.parent a reuse …
May 27, 2020
3bbc26f
Start capturing original_fieldset for reused fields as well, streamli…
May 27, 2020
c095081
Clean the allowed values as well
May 27, 2020
fe30e02
Fix problem with original_fieldset and write test for calculate_final…
May 27, 2020
0263d56
Function docstrings in finalizer.py
May 27, 2020
fccaa6c
Remove inaccurate comment. finalizer.py works on the structure in place
May 27, 2020
903b0bd
Reasonable implementation of filtering subsets
May 28, 2020
ca82b70
First pass at moving the flattening to the intermediate files generator
May 28, 2020
ce65c69
Calculate flat_name for multi_fields
May 28, 2020
0f91bc4
Move helper func is_intermediate(field) to the helpers
Jun 1, 2020
1326f43
WIP new generator using all the new schema scripts
Jun 1, 2020
8e836ca
WIP on intermediate_files.py generator for ECS YAML files
Jun 1, 2020
d9acd1c
Initial stab at fixing an issue around chained nestings: original_fie…
Jun 1, 2020
388204c
Remove old debugging code, replace with comment as a hint
Jun 1, 2020
fa19654
Fix bug where explicit legit object fields were incorrectly flagged a…
Jun 1, 2020
42ea8f7
Don't publish the deeply nested ecs.yml for now
Jun 2, 2020
f3f44ab
Remove old versions of the code
Jun 2, 2020
f32e56f
Introduce new optional attribute about reuse, for order of performing…
Jun 2, 2020
e13a05e
Perform reuse in prescribed order. Temp copy by reference no longer n…
Jun 2, 2020
d9b6b49
Rewrite intermediate files generator using visitor pattern
Jun 3, 2020
c17c46d
Comment out a bunch of things for initial PR review...
Jun 3, 2020
342cc98
Fix issue introduced in recent changes around reuse order. Add test.
Jun 3, 2020
d0d89c4
All generated files except asciidocs, as of latest changes
Jun 3, 2020
d716c3a
Revert back to passing normal nested fields to asciidoc generator.
Jun 3, 2020
ffd71f5
introduce the new attribute 'reused_here' immediately, after all...
Jun 3, 2020
eb83c25
Append short description to 'reused_here'
Jun 3, 2020
321cc5f
Use the new 'reused_here' attribute to populate the field reuse section
Jun 3, 2020
4ca8fe1
Fix code listing allowed_values to use nested representation
Jun 3, 2020
81e626c
Fix last remaining place that assumed deeply nested structure
Jun 3, 2020
caa0eb5
asciidoc as of now. one major problem left
Jun 3, 2020
f506a73
Fix subtle bug where self-nestings would incorrectly mark parent fiel…
Jun 3, 2020
86848b2
Remove subset tests that have been replaced...
Jun 4, 2020
8b3d52d
Use new scripts to load ECS fields in test_ecs_spec.py
Jun 4, 2020
dc71e03
Code formatting
Jun 4, 2020
b33f8d3
Make test output verbose
Jun 4, 2020
e5140b9
Reset process.yml with the duplicated fields for initial review
Jun 4, 2020
1496d83
Add nesting of code_sig to process.parent again, for initial review
Jun 4, 2020
73ad62a
Also re-nest hash at process.parent
Jun 4, 2020
face9d4
Undo a wording improvement to simplify rewrite PR review
Jun 4, 2020
fac6109
Temporarily re-add object_type for initial review
Jun 4, 2020
0fac563
Merge branch 'master' into nest-as-rampage
Jun 4, 2020
b812cf7
Remove old PR ID from relevant changelog entries. This will be a clea…
Jun 4, 2020
14c95e4
Code comment formatting
Jun 4, 2020
bf16551
Update an outdated docstring
Jun 8, 2020
96c06bb
Add failing test to sanity check, to catch mangled tracing fields
Jun 8, 2020
78a3d6d
Improve an unrelated test
Jun 8, 2020
1e63726
Remove commented out code
Jun 9, 2020
530691a
Finish the comment for subset_filter.py
Jun 9, 2020
0fa6fe1
Add __init__.py in unit tests so they're 'discovered' properly 🤦🏻‍♂️
Jun 9, 2020
035449e
Fix load_yaml_file bug
Jun 9, 2020
0353f75
Pin the version of mock
Jun 9, 2020
91e4660
Fix merge_fields bug
Jun 9, 2020
98745ca
Fix issue with tracing fields getting mangled
Jun 9, 2020
866f1b9
Adjust sanity check for new nesting under complete field name
Jun 9, 2020
34595a3
Changelog
Jun 9, 2020
97613cf
Space. The final frontier.
Jun 9, 2020
c841c53
Raise if either reused schema or destination schema have root=true
Jun 9, 2020
d6ec0ac
Fix spacing
Jun 9, 2020
91e7f16
Marshall was right, these two functions were the same. Deleted one of…
Jun 9, 2020
4 changes: 4 additions & 0 deletions CHANGELOG.next.md
@@ -43,6 +43,8 @@ Thanks, you're awesome :-) -->
had `reusable.top_level:false`. This PR affects `ecs_flat.yml`, the csv file
and the sample Elasticsearch templates. #495, #813
* Removed the `order` attribute from the `ecs_nested.yml` and `ecs_flat.yml` files. #811
+ * In `ecs_nested.yml`, the array of strings that used to be in `reusable.expected`
+ has been replaced by an array of objects with 3 keys: 'as', 'at' and 'full'. <!-- TODO -->

#### Bugfixes

@@ -63,6 +65,8 @@ Thanks, you're awesome :-) -->
* Allow shorthand notation for including all subfields in subsets. #805
* Add `ref` option to generator allowing schemas to be built for a specific ECS version. #851
* Add `template-settings` and `mapping-settings` options to allow override of defaults in generated ES templates. #856
+ * Add ability to nest field sets as another name. <!-- TODO -->
+ * Add ability to nest field sets within themselves (e.g. `process.parent.*`). <!-- TODO -->
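
For readers of these changelog entries, here is a minimal sketch of how a reuse location could be normalized into the object form with the keys 'as', 'at' and 'full'. The function name `normalize_reuse_entry` and the exact error messages are hypothetical and are not taken from the PR's scripts. The assumption is that a single-word shorthand means "nest at that location under the field set's own name", and that 'full' is 'at' and 'as' joined with a dot.

```python
def normalize_reuse_entry(fieldset_name, entry):
    """Return a reuse location in dictionary notation (hypothetical helper).

    `entry` is either a single-word shorthand such as 'destination',
    or a dict that must carry both 'as' and 'at'.
    """
    if isinstance(entry, str):
        # Shorthand: nest at `entry`, keeping the field set's own name.
        normalized = {'at': entry, 'as': fieldset_name}
    elif isinstance(entry, dict):
        missing = [key for key in ('as', 'at') if key not in entry]
        if missing:
            raise ValueError(
                "Reuse entry for '{}' is missing key(s): {}".format(
                    fieldset_name, ', '.join(missing)))
        # Copy so the caller's dict is left untouched.
        normalized = dict(entry)
    else:
        raise ValueError(
            "Unsupported reuse entry for '{}': {!r}".format(fieldset_name, entry))

    # Assumed meaning of 'full': the complete destination path.
    normalized['full'] = normalized['at'] + '.' + normalized['as']
    return normalized
```

Under these assumptions, `normalize_reuse_entry('user', 'destination')` gives a 'full' of `destination.user`, and the self-nesting case `normalize_reuse_entry('process', {'at': 'process', 'as': 'parent'})` gives `process.parent`.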

#### Deprecated

2 changes: 1 addition & 1 deletion Makefile
@@ -102,7 +102,7 @@ setup: ve
# Run the ECS tests
.PHONY: test
test: ve
- $(PYTHON) -m unittest discover --start-directory scripts/tests
+ $(PYTHON) -m unittest discover -v --start-directory scripts/tests

# Create a virtualenv to run Python.
.PHONY: ve
47 changes: 1 addition & 46 deletions docs/field-details.asciidoc
@@ -453,12 +453,6 @@ example: `co.uk`
// ===============================================================


- | <<ecs-group,client.user.group.*>>
- | User's group relevant to the event.
-
- // ===============================================================
-
-
|=====

[[ecs-cloud]]
@@ -1012,12 +1006,6 @@ example: `co.uk`
// ===============================================================


- | <<ecs-group,destination.user.group.*>>
- | User's group relevant to the event.
-
- // ===============================================================
-
-
|=====

[[ecs-dll]]
@@ -2755,12 +2743,6 @@ example: `1325`
// ===============================================================


- | <<ecs-group,host.user.group.*>>
- | User's group relevant to the event.
-
- // ===============================================================
-
-
|=====

[[ecs-http]]
@@ -5269,12 +5251,6 @@ example: `co.uk`
// ===============================================================


- | <<ecs-group,server.user.group.*>>
- | User's group relevant to the event.
-
- // ===============================================================
-
-
|=====

[[ecs-service]]
@@ -5610,12 +5586,6 @@ example: `co.uk`
// ===============================================================


- | <<ecs-group,source.user.group.*>>
- | User's group relevant to the event.
-
- // ===============================================================
-
-
|=====

[[ecs-threat]]
@@ -6193,21 +6163,6 @@ Distributed tracing makes it possible to analyze performance throughout a micros

// ===============================================================

- | trace.id
- | Unique identifier of the trace.
-
- A trace groups multiple events like transactions that belong together. For example, a user request handled by multiple inter-connected services.
-
- type: keyword
-
-
-
- example: `4bf92f3577b34da6a3ce929d0e0e4736`
-
- | extended
-
- // ===============================================================
-
| transaction.id
| Unique identifier of the transaction.

@@ -6728,7 +6683,7 @@ example: `outside`

==== Field Reuse

- The `vlan` fields are expected to be nested at: `network.vlan`, `network.inner.vlan`, `observer.egress.vlan`, `observer.ingress.vlan`.
+ The `vlan` fields are expected to be nested at: `network.inner.vlan`, `network.vlan`, `observer.egress.vlan`, `observer.ingress.vlan`.

Note also that the `vlan` fields are not expected to be used directly at the top level.
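
The reordering in this hunk is consistent with the nesting locations being sorted lexicographically before the sentence is rendered. That is an inference from the diff, not something the PR states, and the helper name `render_reuse_sentence` below is hypothetical. A minimal sketch under that assumption:

```python
def render_reuse_sentence(fieldset_name, locations):
    """Build the 'expected to be nested at' sentence (hypothetical helper).

    Assumes the locations are sorted as plain strings before rendering.
    """
    ordered = sorted(locations)
    formatted = ', '.join('`{}`'.format(location) for location in ordered)
    return 'The `{}` fields are expected to be nested at: {}.'.format(
        fieldset_name, formatted)
```

With plain string sorting, `network.inner.vlan` comes before `network.vlan` because `i` sorts before `v`, which matches the new line in the diff.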

11 changes: 1 addition & 10 deletions generated/beats/fields.ecs.yml
@@ -5049,16 +5049,7 @@
- to queries made through multiple back-end services.
type: group
fields:
- - name: trace.id
- level: extended
- type: keyword
- ignore_above: 1024
- description: 'Unique identifier of the trace.
-
- A trace groups multiple events like transactions that belong together. For
- example, a user request handled by multiple inter-connected services.'
- example: 4bf92f3577b34da6a3ce929d0e0e4736
- - name: transaction.id
+ - name: id
level: extended
type: keyword
ignore_above: 1024
1 change: 1 addition & 0 deletions generated/ecs/.gitignore
@@ -0,0 +1 @@
+ ecs.yml