-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework toposort algorithm as an ancestor traversal #3646
base: master
Are you sure you want to change the base?
Conversation
* Feat: Add spec and parser for 'crictl_logs' Signed-off-by: shlao <[email protected]> * Updated the 'CrictlLogs' docstring Signed-off-by: shlao <[email protected]> * Used the Plurals Signed-off-by: shlao <[email protected]> * Updated the docstring Signed-off-by: shlao <[email protected]> * Use the singular form Signed-off-by: shlao <[email protected]> * Updated the docstring Signed-off-by: shlao <[email protected]>
Signed-off-by: Mark Huth <[email protected]>
* fix: fix the regression bug of soscleaner IP obsfuscating Signed-off-by: Xiangce Liu <[email protected]> * Update the ip obfuscating sort Signed-off-by: Xiangce Liu <[email protected]> * Add tests Signed-off-by: Xiangce Liu <[email protected]> * fix flake8 Signed-off-by: Xiangce Liu <[email protected]>
* feat: New parser for systemd_perms Signed-off-by: Xinting Li <[email protected]> * fix: fix ETC_SYSTEMD in test_systemd_perms.py Signed-off-by: Xinting Li <[email protected]> * fix: Fix file name and class name Signed-off-by: Xinting Li <[email protected]> * fix: add doc entry Signed-off-by: Xinting Li <[email protected]> * fix: update insights_archive Signed-off-by: Xinting Li <[email protected]> * fix: fix docs-test error Signed-off-by: Xinting Li <[email protected]> * fix: fix length of '-' Signed-off-by: Xinting Li <[email protected]> * fix: Updating test_etc_systemd() in test_ls_systemd.py Signed-off-by: Xinting Li <[email protected]> Co-authored-by: Xinting Li <[email protected]>
…3352) * feat: New spec and parser to get capsules and repos which have "background" download policy * Also reorder the classes in satellite_postgresql_query by alphabetically. Signed-off-by: Huanhuan Li <[email protected]> * Combiner the two querys about katello_root_repositories together. * Also rename the spec to query the smart_proxies table Signed-off-by: Huanhuan Li <[email protected]> * Use "deprecated" method for the deprecated class Signed-off-by: Huanhuan Li <[email protected]> * Remove "-SatelliteKatelloEmptyURLRepositories" from title * It is deprecated now Signed-off-by: Huanhuan Li <[email protected]>
* The code had an issue where it was looping over the dictionary updates that was being updated in another loop. So it was checking the beginning pkgs over and over each loop for updates which was un-necessary and caused the slowness. So I simplified the code down to the least amount of looping and variables. Signed-off-by: Ryan Blakley <[email protected]>
Signed-off-by: Rahul Srivastava <[email protected]>
* Currently core only collects /etc/sysctl.conf but none of the conf files in the /usr/lib/sysctl.d/ and /etc/sysctl.d/ directories. So I added a spec for them and added a spec for the /usr/lib/sysctl.d/ directory as well. * Updated the parsers to use a base class to avoid duplicate code. * Added a new combiner for all of the conf files in the /usr/lib/sysctl.d/ and /etc/sysctl.d/ directories along with the /etc/sysctl.conf file. Signed-off-by: Ryan Blakley <[email protected]>
Signed-off-by: Huanhuan Li <[email protected]>
* feat: New parsers for IBM proc files - /proc/powerpc/lparcfg - /proc/device-tree/openprom/ibm,fw-vernum_encoded Signed-off-by: Xiangce Liu <[email protected]> * Fix tests Signed-off-by: Xiangce Liu <[email protected]> * adjust order of specs Signed-off-by: Xiangce Liu <[email protected]>
* New parser for /usr/bin/hexdump -C /dev/cpu_dma_latency command Signed-off-by: Akshay Ghodake <[email protected]> * Updated command from hexdump to od Signed-off-by: Akshay Ghodake <[email protected]>
* fix: Keep the results once one of them is good * When some spec is collecting several files, we should keep the results once one of them is good, and only set results to empty when all of them fails with errors Signed-off-by: Huanhuan Li <[email protected]> * Simplify the returned data of 'marshal' function Signed-off-by: Huanhuan Li <[email protected]> * Remove useless variable Signed-off-by: Huanhuan Li <[email protected]>
…3364) Signed-off-by: ahitacat <[email protected]>
* Based on the comments in bz 1759080, I switched the crontab specs to read the crontab files instead of running the command since it does user id look ups every time the command is ran. But by switching to simple_file the spec would be skipped if the file doesn't exist. Signed-off-by: Ryan Blakley <[email protected]>
* Enhance parser Grub2Config Signed-off-by: jiazhang <[email protected]> * Enhance combiner grub_conf Signed-off-by: jiazhang <[email protected]>
* Using the xml element tree getiterator function has been deprecated for a while it seems. To get rid of the warning I updated to using iter wrapped in a list which is all the getiterator function did. Signed-off-by: Ryan Blakley <[email protected]>
…#3373) * fix: Enhance "PCSStatus" to make it compatible with new output format Signed-off-by: Huanhuan Li <[email protected]> * Simplify the data structure for PCSStatus Signed-off-by: Huanhuan Li <[email protected]> * Remove the prefix "*" when do comparation Signed-off-by: Huanhuan Li <[email protected]>
* Any file under /proc/net/bonding/ is a valid bond interface. The bond interface names don't have to start with bond, so change the glob to collect all file under the directory. * Updated the bond_dynamic_lb spec to glob all interface names, since the /sys/class/net/*/bonding/ directory only exist for bond devices. Signed-off-by: Ryan Blakley <[email protected]>
* Enhance combiner grub_conf_blscfg Signed-off-by: jiazhang <[email protected]> * Update grub_conf Signed-off-by: jiazhang <[email protected]> * Fix flake8 issue Signed-off-by: jiazhang <[email protected]> * Update test Signed-off-by: jiazhang <[email protected]> * Remove RHEL check Signed-off-by: jiazhang <[email protected]>
* SOS Report no longer collects ps auxwww and ps auxwwwm, it removed the auxwww since auxwwwm basically shows the same thing with threads added. * Adding this spec caused a ValueError in the ps combiner due to the thread entries containing a - in the pid column. So I updated the combiner to check to make sure the pid column entry is a digit before continuing. * Fix some grammar and flake8 issue. Signed-off-by: Ryan Blakley <[email protected]>
Account for cases where IP addresses can appear at the end of a line in the space-preserving netstat obfuscation function _sub_ip_netstat. Resolves: RHBZ#2029997 Signed-off-by: Link Dupont <[email protected]>
Due to application workload or misbehavior the semaphores count might increase significantly. Additionally, the current rules expects semids from selected owners namely 'root', 'apache', and 'oracle'. Closes-Bug: #2071058 Signed-off-by: Sachin Patil <[email protected]>
Signed-off-by: Xinting Li <[email protected]> Co-authored-by: Xinting Li <[email protected]>
This patch replaces all the string occurrences of “cloud.redhat.com” with “console.redhat.com”. Resolved: rhbz#2006386 Signed-off-by: Gael Chamoulaud (Strider) <[email protected]>
* Exclude some Specs from obfuscation Add a list of Specs that are excluded from the IP address obfuscation. This is to prevent invalid obfuscation of four-part package versions that are indistinguishable from IPv4 addresses. The excluded Specs are known to never contain IP addesses in their output. Signed-off-by: Štěpán Tomsa <[email protected]> Fix cyclic imports Test _blacklisted_files Signed-off-by: Štěpán Tomsa <[email protected]> Clean up the blacklisted_files test Signed-off-by: Štěpán Tomsa <[email protected]> Test IP obfuscation blacklist Signed-off-by: Štěpán Tomsa <[email protected]> Test mocked clean_report Signed-off-by: Štěpán Tomsa <[email protected]> Fix test for Python 2.6 Signed-off-by: Štěpán Tomsa <[email protected]> Rename obfuscation blacklist to exempted obfuscations Signed-off-by: Štěpán Tomsa <[email protected]> Add function comment Signed-off-by: Štěpán Tomsa <[email protected]> Add function comment Signed-off-by: Štěpán Tomsa <[email protected]> Rename exempt to excluded Signed-off-by: Štěpán Tomsa <[email protected]> Fix formatting Signed-off-by: Štěpán Tomsa <[email protected]> * Move excluded Specs to SOSCleaner The module architecture causes problems when the constants module is imported from the SOSCleaner. Also, the SOSCleaner is intended to be a standalone module. Signed-off-by: Štěpán Tomsa <[email protected]> * Add more Specs to the exclude list Provided by Michael Mraka <[email protected]>. Signed-off-by: Štěpán Tomsa <[email protected]> * Make excluded Specs a list For semantics. Signed-off-by: Štěpán Tomsa <[email protected]> * Fix excluded Specs test It makes no sense to replicate the Specs list in the test. Kept the test, only made it simpler. Now it checks that the list is not empty and that its items are valid Spec names. Signed-off-by: Štěpán Tomsa <[email protected]> * fixup! Fix excluded Specs test Signed-off-by: Štěpán Tomsa <[email protected]>
In old version of the client, the machine-id could be a non-hyphen UUID. Upload still works but insights-client-results.service fails with a `Error: failed to download advisor report.` error message. This patch fixes this issue by actually returning a UUID4 object with hyphens. Moreover, it also adds a unit test to check non-hyphen UUID from old client. Resolves: rhbz#1998560 Signed-off-by: Gael Chamoulaud (Strider) <[email protected]>
* fix: Update the spec "du_dirs" to filterable Signed-off-by: Huanhuan Li <[email protected]> * Move the "du_dirs_list" datasource to the datasource directory Signed-off-by: Huanhuan Li <[email protected]> * Rename to "dir_list" * Also raise SkipComponent if there are no filters Signed-off-by: Huanhuan Li <[email protected]>
* Some unused files have syntax error, as a result, it won't be hit even there is a hit for the used files. Signed-off-by: Huanhuan Li <[email protected]>
* Parser was using the wrong spec and should be using yum_updates Signed-off-by: Bob Fahr <[email protected]>
* fix: Do not log Parsers' Traceback during collection Signed-off-by: Xiangce Liu <[email protected]> * fix doc error by renaming Signed-off-by: Xiangce Liu <[email protected]> * resolve rebase conflict * revert blank line Signed-off-by: Xiangce Liu <[email protected]>
…arser (#3645) * feat: New datasource "sys_fs_cgroup_uniq_memory_swappiness" and its parser Signed-off-by: Qin Ping <[email protected]> * Updated the datasource and its parser per the review comments. Signed-off-by: Qin Ping <[email protected]> * Fix the typo in docstring. Signed-off-by: Qin Ping <[email protected]>
* feat: new RHEL component to identify "Red Hat Enterprise Linux" host Signed-off-by: Xiangce Liu <[email protected]> * add it to pro-loaded Signed-off-by: Xiangce Liu <[email protected]> * add more parsers and refine logic * seems it's not need to be preloaded Signed-off-by: Xiangce Liu <[email protected]> * tiny update * move it to combiners Signed-off-by: Xiangce Liu <[email protected]> * Update as per feedback * update comments Signed-off-by: Xiangce Liu <[email protected]> * fix py26 * to combiner * Remove SubscriptionManagerID from the candidates Signed-off-by: Xiangce Liu <[email protected]> * update comment * Refine the logic Signed-off-by: Xiangce Liu <[email protected]> * Update logic per discussion and rename to OSRelease * one more case for coverage Signed-off-by: Xiangce Liu <[email protected]> * fix doc error * removed docstring line Signed-off-by: Xiangce Liu <[email protected]> * update per feedback and resolve rebase conflicts Signed-off-by: Xiangce Liu <[email protected]> * make points a local variable instead of a member * installed_packages as well Signed-off-by: Xiangce Liu <[email protected]> * expanded the tuples in rpms list * more comments for pkgs Signed-off-by: Xiangce Liu <[email protected]>
… firstly (#3634)" (#3652) This reverts commit ed29640. Signed-off-by: Xiangce Liu <[email protected]>
@PaulWay - The same as #3604, this PR changes a very basic level method, it also needs to be tested and compared to check the performance improvement. However, in another PR #3649, @csams added and exposed a new |
) Signed-off-by: Xiangce Liu <[email protected]>
…3658) Signed-off-by: Qin Ping <[email protected]>
* fix: check the content first in class InstalledRpms Signed-off-by: Chenlizhong <[email protected]> * fix: remove the get_active_lines() Signed-off-by: Chenlizhong <[email protected]> * fix: update the importing to full path and raise the exception when content is empty Signed-off-by: Chenlizhong <[email protected]> * fix: remove empty line as some testing files are with empty line Signed-off-by: Chenlizhong <[email protected]> * fix: skip empty line in the loop Signed-off-by: Chenlizhong <[email protected]>
…y into combiner rhel_for_edge (#3657) * Add rpm_ostree check into combiner rhel_for_edge Signed-off-by: jiazhang <[email protected]> * Fix typo Signed-off-by: jiazhang <[email protected]>
* Moved the remaining exceptions that were defined in various other modules to the new separate exceptions module. That way all exceptions are defined in their own module. * Updated all imports to reference the new path. Signed-off-by: Ryan Blakley <[email protected]>
* feat: add JbossRuntimeVersions parser Signed-off-by: Chen lizhong <[email protected]> * fix: implement iterator and get for JbossRuntimeVersions class Signed-off-by: Chenlizhong <[email protected]> * fix: fix the error in python2 Signed-off-by: Chenlizhong <[email protected]> * fix: use Context class directly instead of tests/context_wrap Signed-off-by: Chenlizhong <[email protected]> * fix: use nametuple and private function to parse the line of version Signed-off-by: Chenlizhong <[email protected]> * resort the importings Signed-off-by: Xiangce Liu <[email protected]>
* fix: add '-d 0' to yum_repolist spec Signed-off-by: Xiangce Liu <[email protected]> * change to '-d 2' Signed-off-by: Xiangce Liu <[email protected]>
…s) (#3665) Signed-off-by: Mark Huth <[email protected]> Signed-off-by: Mark Huth <[email protected]>
* Print log and traceback of unexpected error Signed-off-by: Mark Huth <[email protected]>
Signed-off-by: Mark Huth <[email protected]>
Signed-off-by: XiaoXue Wang <[email protected]>
* Added the --show-skips cli arg to trigger SkipComponents to be captured in the broker for troubleshooting purposes. Signed-off-by: Ryan Blakley <[email protected]>
* Mark SkipException deprecated to help avoid confusion as the dirty parser difference isn't really needed. * Replaced all internal references with SkipComponent. Signed-off-by: Ryan Blakley <[email protected]>
* feat: add CloudInstance and OSRelease to canonical_facts Signed-off-by: Xiangce Liu <[email protected]> * insert value only when match, and add tests * enable AWSInstanceIdDoc only * fix: provider_type should be cloud_instance.provider * make the new items must exist: None by default * fix the tests * remove is_rhel Signed-off-by: Xiangce Liu <[email protected]>
The default environment for "simple_command" includes LC_ALL=C, which means ASCII (and not unicode); in case the language of the system is set to a non-English locale that uses unicode characters (e.g. non-Latin1 languages), then subscription-manager will fail to output that with encoding issues (e.g. [1]). As a simple solution, enforce an unicode English locale for subscription-manager, which should avoid those encoding issues. [1] https://bugzilla.redhat.com/show_bug.cgi?id=2074745 Signed-off-by: Pino Toscano <[email protected]>
…ules (#3670) * Enhance datasource kernel_module_filters to check the loaded modules * Update the datasource test accordingly * Add lsmod into insights.collect.default_manifest module, raise a SkipComponent when no modules are loaded. * Update the preload parser name Signed-off-by: jiazhang <[email protected]>
* Trying to make the spec name more specific, I'm working adding tags to the sos project it was requested for a more specific spec/tag name. Signed-off-by: Ryan Blakley <[email protected]>
* Capture the name of blacklisted specs, and dump them to a file in the archive. * Added parser and spec to parse the blacklisted specs and use in rules. Signed-off-by: Ryan Blakley <[email protected]>
Can one of the admins verify this patch? |
65dec4a
to
d3542c2
Compare
The 'toposort' function was identified as using significant amounts of processor time in real world archive processing. Looking at the code, it does a number of things that make it slow:
While these operations provide a logical way of processing the data, they are wasteful of both space and processing time.
The new algorithm is a kind of inverse of that method - rather than working on items with a set of dependencies, it works on a dependency with a set of 'ancestors' that depend on that item. It can be seen as a 'data flow' model rather than a 'process flow' model: the original algorithm modelled objects (combiners, parsers, specs) calling other objects, and the new algorithm models the data flowing from the specs to the parsers to the combiners and so on.
The new algorithm starts by creating a dictionary of 'ancestors', whose keys are the objects in the value sets of the original data, and whose values are sets of the keys of the original data. It also stores the count of dependencies on each original data key (object). Before iterating, it also stores the set of 'terminators' - keys in the original data that are not ancestors of any object; these are the objects that nothing depends on. Rather than adding these to the dictionary of ancestors (and thus bulking out the ancestor dictionary), these are stored as a separate set for later.
We then iterate in much the same way as the original algorithm. We find the set of ancestors that (now) have no dependents (which is now a set difference rather than a dictionary traversal), yield that set, then decrement the dependency counts for each of the dependents of those 'free' ancestors and delete them from the dictionary of ancestors. If we found no 'free' ancestors, then we've either traversed our way to the 'terminators' set of items that nothing else depends on, or we've found circular dependencies (as in the original algorithm).
Thus we've reduced the set differences to integer subtractions, the dictionary traversal to a key set difference, and eliminated the original data copy and the set union and bulking.
In testing, this is between 1.4 to 1.5 times faster than the original algorithm over small to medium sized dependency graphs (over 100,000 iterations). In other words, this reduces the time taken by 45% to 50%.
We also update the 'toposort_flatten' function to act as a generator. This doesn't improve the test speed noticeably, but it allows for a slightly faster start to processing the first item and may allow better concurrency in multi-threaded or asynchronous implementations.
Signed-off-by: Paul Wayper [email protected]
All Pull Requests:
Check all that apply: