[Investigation] Smart Limits for Detection Rules #4150
Comments
Update Oct 24: @xcrzx was able to test loading rules with a limit of about 20k rules. We may be able to remove the limit altogether starting in 8.17. We still need to wait on the serverless testing to confirm. Long term, we all agree it would be better to have a distribution mechanism that uses git (where Kibana pulls directly from the detection-rules repo branches). |
Hey @Mikaayenson, I finished testing the package limits on Serverless last week, and the results are in this PR. Long story short, there were no issues installing a 15k-rule package on a Serverless project. Memory consumption was stable and well within the 1GB limit, averaging around 40-45%, which is a great result. I couldn’t test a 20k-rule package due to some other test environment limitations, but I don’t anticipate any significant differences. We can set the total assets-per-package limit pretty high, like the tested 15k. We’ll probably never reach it, but it’s safer to keep the limit in place in case we get close before migrating from Fleet to a Git-based rule distribution system. Do you know how many saved objects would be in the package if we include current rules with all their historical versions? |
Hey @xcrzx, today we have roughly ~1250 rules in each rule package for versions such as 8.12 through 8.15 (ballpark numbers that can vary in practice), refer. Before we limited historical rules to the latest 2 versions, we had roughly ~3700 rules in a package for older releases, refer. Without the latest rules added to the EPR historical data, it's difficult to arrive at the actual number of rules that would be in the packages once we remove the version limit. Given that the tested limit was roughly 15k, I don't see the rule assets reaching this number immediately, even if we build today without limiting historical versions (though the count may grow over time as we add more versions to the package). |
Thanks for the info, @shashank-elastic! The stream-based package installation has been merged, so we can plan the limit increase. It’s set for release in Kibana Let’s also discuss what we are planning to do if a package reaches the 15k limit. I’ve been considering an algorithm with these steps:
This would allow us to discard only the oldest rule versions, the ones users are least likely to need, providing an optimal rule upgrade experience. However, I recall there were challenges with retrieving the creation date of rule versions, which could complicate sorting from newest to oldest. Is that still the case? I’d also welcome any thoughts on this approach or alternative implementation ideas. Let me know if you see a different way forward. |
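The steps of the proposed algorithm are not shown in this thread, so the following is only a rough sketch of the stated goal (discard the oldest rule versions first), not @xcrzx's actual design. It assumes each asset is a hypothetical `(rule_id, version)` pair and uses the rule's version number as a stand-in for age, since creation dates may be hard to retrieve. The function name `trim_to_limit` is made up for illustration.

```python
from collections import defaultdict


def trim_to_limit(assets, limit):
    """Return the (rule_id, version) pairs to keep under an asset limit.

    Assumption: a higher version number means a newer rule version.
    The newest version of every rule is always kept, even if that
    alone exceeds the limit.
    """
    by_rule = defaultdict(list)
    for rule_id, version in assets:
        by_rule[rule_id].append(version)

    # age_rank 0 is the newest version of a rule, 1 the one before it, etc.
    ranked = []
    for rule_id, versions in by_rule.items():
        for age_rank, version in enumerate(sorted(versions, reverse=True)):
            ranked.append((age_rank, rule_id, version))

    # Sort so the newest versions across all rules come first, then cut.
    ranked.sort()
    keep_count = max(limit, len(by_rule))
    return [(rule_id, version) for _, rule_id, version in ranked[:keep_count]]
```

Ranking by position within each rule, rather than by raw version number, avoids penalizing rules that simply have low version numbers; every rule loses its oldest versions at roughly the same pace.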
@Mikaayenson @approksiu @xcrzx /cc @shashank-elastic Short recap of our last zoom call and the timing:
|
@shashank-elastic we may be able to do something like this to download all the rules.
import json
from pathlib import Path
from semver import Version
from detection_rules.integrations import get_integration_manifests, SecurityDetectionEngine

# Constants
MIN_VERSION = "8.14.1"
MAX_VERSION = "8.17.0"
OUTPUT_FOLDER = "all_sde_folders"

# Fetch and sort manifests (newest first)
manifests = get_integration_manifests("security_detection_engine", False, MAX_VERSION)
sorted_manifests = sorted(manifests, key=lambda manifest: Version.parse(manifest["version"]), reverse=True)

# Filter versions greater than or equal to the minimum version
min_version = Version.parse(MIN_VERSION)
filtered_versions = [
    Version.parse(manifest["version"])
    for manifest in sorted_manifests
    if Version.parse(manifest["version"]) >= min_version
]

# Print filtered versions
print("Filtered Versions:", filtered_versions)

# Ensure output folder exists
output_path = Path(OUTPUT_FOLDER)
output_path.mkdir(parents=True, exist_ok=True)

# Load assets and save them to disk, one JSON file per asset
sde = SecurityDetectionEngine()
for version in filtered_versions:
    assets = sde.load_integration_assets(version)
    for asset, data in assets.items():
        asset_path = output_path / f"{asset}.json"
        with open(asset_path, "w", encoding="utf-8") as f:
            f.write(json.dumps(data))
|
@shashank-elastic Here are all the rules and some sample output.
54902e45-3467-49a4-8abc-529f2c8cfb80_211.json ad959eeb-2b7b-4722-ba08-a45f6622f005_4.json ff6cf8b9-b76c-4cc1-ac1b-4935164d1029.json
54a81f68-5f2a-421e-8eed-f888278bb712.json adb961e0-cb74-42a0-af9e-29fc41f88f5f.json ff6cf8b9-b76c-4cc1-ac1b-4935164d1029_1.json
54a81f68-5f2a-421e-8eed-f888278bb712_108.json adb961e0-cb74-42a0-af9e-29fc41f88f5f_105.json ff6cf8b9-b76c-4cc1-ac1b-4935164d1029_201.json
54a81f68-5f2a-421e-8eed-f888278bb712_109.json adb961e0-cb74-42a0-af9e-29fc41f88f5f_106.json ff9b571e-61d6-4f6c-9561-eb4cca3bafe1.json
54a81f68-5f2a-421e-8eed-f888278bb712_2.json adb961e0-cb74-42a0-af9e-29fc41f88f5f_107.json ff9b571e-61d6-4f6c-9561-eb4cca3bafe1_103.json
54a81f68-5f2a-421e-8eed-f888278bb712_210.json adb961e0-cb74-42a0-af9e-29fc41f88f5f_108.json ff9b571e-61d6-4f6c-9561-eb4cca3bafe1_104.json
54a81f68-5f2a-421e-8eed-f888278bb712_3.json adb961e0-cb74-42a0-af9e-29fc41f88f5f_109.json ff9bc8b9-f03b-4283-be58-ee0a16f5a11b.json
54a81f68-5f2a-421e-8eed-f888278bb712_4.json adb961e0-cb74-42a0-af9e-29fc41f88f5f_110.json ff9bc8b9-f03b-4283-be58-ee0a16f5a11b_1.json
54a81f68-5f2a-421e-8eed-f888278bb712_5.json adbfa3ee-777e-4747-b6b0-7bd645f30880.json ff9bc8b9-f03b-4283-be58-ee0a16f5a11b_2.json
54a81f68-5f2a-421e-8eed-f888278bb712_6.json adbfa3ee-777e-4747-b6b0-7bd645f30880_1.json ff9bc8b9-f03b-4283-be58-ee0a16f5a11b_3.json
54a81f68-5f2a-421e-8eed-f888278bb712_7.json adbfa3ee-777e-4747-b6b0-7bd645f30880_2.json ff9bc8b9-f03b-4283-be58-ee0a16f5a11b_4.json
54a81f68-5f2a-421e-8eed-f888278bb712_8.json adbfa3ee-777e-4747-b6b0-7bd645f30880_3.json ff9bc8b9-f03b-4283-be58-ee0a16f5a11b_5.json
54c3d186-0461-4dc3-9b33-2dc5c7473936.json adbfa3ee-777e-4747-b6b0-7bd645f30880_4.json
54c3d186-0461-4dc3-9b33-2dc5c7473936_103.json adbfa3ee-777e-4747-b6b0-7bd645f30880_5.json
(detection-rules-build) ➜ all_sde_folders git:(main) ✗ ls -la |wc -l
8320
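To answer the earlier question about how many saved objects the package would hold, the listing can be summarized per rule. This is a minimal sketch that assumes the filename pattern seen in the sample output (`<uuid>.json` or `<uuid>_<version>.json`); the helper names are illustrative and not part of `detection_rules`. Note that `ls -la | wc -l` also counts the `total` line and the `.` and `..` entries, so the actual file count is slightly below 8320.

```python
from collections import defaultdict
from pathlib import Path
from typing import Optional, Tuple


def split_asset_name(filename: str) -> Tuple[str, Optional[int]]:
    """Split '<uuid>_<version>.json' into (uuid, version).

    The version is None when the filename has no '_<version>' suffix.
    """
    stem = Path(filename).stem
    rule_id, _, version = stem.partition("_")
    return rule_id, int(version) if version else None


def versions_per_rule(filenames):
    """Group the versions found in the listing by rule UUID."""
    grouped = defaultdict(set)
    for name in filenames:
        rule_id, version = split_asset_name(name)
        grouped[rule_id].add(version)
    return grouped


if __name__ == "__main__":
    # Assumes the folder produced by the download script above.
    names = [p.name for p in Path("all_sde_folders").glob("*.json")]
    grouped = versions_per_rule(names)
    total = sum(len(versions) for versions in grouped.values())
    print(f"{len(grouped)} rules, {total} assets")
```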
You can probably use this small script that will check the version lock file and filter out rules that aren't in the current lock.
import shutil
from pathlib import Path
from detection_rules.version_lock import load_versions, DeprecatedRulesFile

# Paths
source_folder = Path("all_sde_folders")
destination_folder = Path("filtered_all_assets")

# Ensure the destination folder exists
destination_folder.mkdir(parents=True, exist_ok=True)

version_lock = load_versions()
deprecated_version_lock = DeprecatedRulesFile.load_from_file().to_dict()

# Get all filenames from the source folder
files = list(source_folder.glob("*.json"))

# Extract UUIDs from filenames and copy matching files
for file in files:
    # Extract UUID from the filename (before the first underscore or `.json`)
    uuid = file.stem.split("_")[0]
    # Only keep rules that exist in the version lock file
    if uuid in version_lock:
        if uuid in deprecated_version_lock:
            print(f"skipped deprecated: {uuid}")
        else:
            # Copy the file to the destination folder
            shutil.copy(file, destination_folder / file.name)

print(f"Filtered files have been copied to {destination_folder}.")
Then we can add these as a beta package to EPR.
@banderror @xcrzx Just an FYI, so that we do NOT break our release workflow, the package will be |
The filtered assets have been created: filtered_all_assets.zip. We will now create a beta package PR for 8.16.2-beta.1. We will not create an 8.17 version beta package, as this would really mess up our regular release cadence. Once the package is created, we will post a notification on this issue. |
The test package created with historical rules is available in the EPR - https://epr.elastic.co/package/security_detection_engine/8.16.2-beta.1/ with the Kibana version constraint of cc @Mikaayenson |
Repository Feature
Core Repo - (rule management, validation, testing, lib, cicd, etc.)
Problem Description
With #3842 we limited the number of versions released per rule to 2. We've been asked by D&R and our PM @approksiu to investigate the LOE to implement smart limits (elastic/kibana#187645). There are hurdles to implementing this feature for both publishing and ingesting rules. This issue tracker will help us understand any technical limitations on our end.
This issue is purely an investigation to provide more insight into which limitations and options we can control from this side.
Desired Solution
Details on options / hurdles to implement smart limits. Examples:
Considered Alternatives
We may not need to implement complicated smart limits at all. Based on some initial testing, @xcrzx may have a solution that supports publishing rule versions up to a limit that we theoretically would not reach (considering the qualitative balance between the number of rules and maintaining and publishing a set of high-quality rules).
This would be another option if they are successful.
Additional Context