Merge pull request #18 from ansible/ttakamiy/AAP-16412/modify-no-exclude-option

Change the behavior of --no-exclude option
TamiTakamiya authored Oct 2, 2023
2 parents 1e120a5 + 1e1da53 commit 4a9b28b
Showing 6 changed files with 153 additions and 53 deletions.
2 changes: 2 additions & 0 deletions .config/dictionary.txt
@@ -1,4 +1,6 @@
ansiblelint
+autofix
+autofixed
clamav
clamscan
commandline
19 changes: 9 additions & 10 deletions README.md
@@ -2,14 +2,12 @@

## Overview

-`ansible-content-parser` used for analyze Ansible files, such
-as playbooks, task files, etc. in a given directory.
-
-It runs `ansible-lint` internally against a given
-source directory and
-updates Ansible files (the `--fix` option of `ansible-lint`)
-and generates the `lint-result.json` file, which summarizes
-files found in the directory and lint errors.
+`ansible-content-parser` analyzes Ansible files in a given source
+(a local directory, an archive file, or a git URL)
+by running `ansible-lint` internally,
+updates Ansible files using the [Autofix feature of `ansible-lint`](https://ansible.readthedocs.io/projects/lint/autofix/),
+and generates the `ftdata.json` file, which is the training dataset
+for developing custom AI models used with Ansible Lightspeed.
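
For orientation, here is a minimal sketch of the pipeline the new description refers to, assuming only that `ansible-lint` is installed; the paths are illustrative and `subprocess` stands in for the parser's internal invocation:

```python
# Minimal sketch of the described pipeline (illustrative paths, not the
# actual implementation): run ansible-lint with autofix, then read SARIF.
import json
import subprocess
from pathlib import Path

source = Path("my-ansible-project")      # hypothetical source checkout
sarif_file = Path("metadata/sarif.json")
sarif_file.parent.mkdir(parents=True, exist_ok=True)

# ansible-lint exits non-zero when it finds violations, so check=False.
subprocess.run(
    ["ansible-lint", "--fix", "--sarif-file", str(sarif_file), str(source)],
    check=False,
)

results = json.loads(sarif_file.read_text())["runs"][0]["results"]
print(f"{len(results)} findings recorded in {sarif_file}")
```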

## Build

@@ -54,8 +52,9 @@ options:
                        effective rule transforms (the 'write_list') by passing the keyword 'all' (=default) or 'none',
                        or a comma-separated list of rule ids or rule tags.
  --skip-ansible-lint   Skip the execution of ansible-lint.
-  --no-exclude          Do not rerun ansible-lint with excluding files that caused syntax check errors. If one or more
-                        syntax check errors were found, execution fails without generating the training dataset.
+  --no-exclude          Do not let ansible-content-parser generate the training dataset by excluding files that caused
+                        lint errors. With this option specified, a single lint error terminates the execution without
+                        generating the training dataset.
  -v, --verbose         Explain what is being done
  --source-license SOURCE_LICENSE
                        Specify the license that will be included in the training dataset.
105 changes: 69 additions & 36 deletions src/ansible_content_parser/__main__.py
@@ -12,12 +12,12 @@
import shutil
import sys
import tarfile
-import typing
import zipfile

from collections.abc import Generator
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path
+from typing import Any

import giturlparse  # pylint: disable=import-error

@@ -51,16 +51,16 @@ def pushd(new_dir: str) -> Generator[None, None, None]:
def execute_ansiblelint(
    argv: list[str],
    work_dir: str,
-) -> dict[str, typing.Any]:
+) -> tuple[dict[str, list[Any]], int]:
    """Execute ansible-lint."""
    with pushd(work_dir):
        # Clear root logger handlers as ansible-lint adds one without checking existing ones.
        logging.getLogger().handlers.clear()

-        result, mark_as_success = ansiblelint_main(argv)
+        result, mark_as_success, return_code = ansiblelint_main(argv)
        return {
            "files": [LintableDict(lintable) for lintable in result.files],
-        }
+        }, return_code
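
Callers of `execute_ansiblelint` now receive both the lintable summary and ansible-lint's return code. A hedged caller-side sketch (the argv and work_dir values are illustrative):

```python
# Hypothetical caller of execute_ansiblelint; paths are illustrative.
serializable_result, return_code = execute_ansiblelint(
    ["ansible-lint", "--sarif-file", "metadata/sarif.json"],
    "/tmp/checkout",
)
if return_code != 0:
    # Non-zero means ansible-lint reported violations (or failed outright).
    print(f"lint failed; {len(serializable_result['files'])} lintables were examined")
```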


def parse_args(argv: list[str]) -> argparse.Namespace:
@@ -97,8 +97,10 @@ def parse_args(argv: list[str]) -> argparse.Namespace:
    parser.add_argument(
        "--no-exclude",
        action="store_true",
-        help="Do not rerun ansible-lint with excluding files that caused syntax check errors. If one or more syntax "
-        "check errors were found, execution fails without generating the training dataset.",
+        help="Do not let ansible-content-parser generate the training dataset by "
+        "excluding files that caused lint errors. With this option specified, "
+        "a single lint error terminates the execution without generating the "
+        "training dataset.",
    )
    parser.add_argument(
        "-v",
@@ -290,7 +292,7 @@ def main() -> None:
    metadata_path = out_path / "metadata"

    sarif_file = str(metadata_path / "sarif.json")
-    argv = ["__DUMMY__", "--sarif-file", sarif_file]
+    argv = ["ansible-lint", "--sarif-file", sarif_file]
    update_argv(argv, args)

    try:
@@ -335,8 +337,16 @@ def execute_lint_step(
) -> None:
    """Execute ansible-lint and create metadata files."""
    exclude_paths: list[str] = []
-    if not args.skip_ansible_lint:
-        serializable_result = execute_ansiblelint(
+
+    lint_result = ""
+    lint_result2 = ""
+    sarif_file2 = ""
+    return_code = RC.SUCCESS
+
+    if args.skip_ansible_lint:
+        sarif_file = ""
+    else:
+        serializable_result, return_code = execute_ansiblelint(
            argv,
            str(repository_path),
        )
@@ -347,54 +357,77 @@
        ) as f:
            f.write(json.dumps(serializable_result))

-    parse_sarif_json(exclude_paths, sarif_file)
-
-    # If syntax-errors occurred on some files, kick off the second run excluding those files
-    if len(exclude_paths) > 0 and not args.no_exclude:
-        lint_result2 = str(metadata_path / "lint-result-2.json")
-        sarif_file2 = str(metadata_path / "sarif-2.json")
-        argv = ["__DUMMY__", "--sarif-file", sarif_file2]
-        argv.append("--exclude")
-        argv.extend(exclude_paths)
-        update_argv(argv, args)
-        _logger.info(",".join(argv))
-        serializable_result_2 = execute_ansiblelint(
-            argv,
-            str(repository_path),
-        )
-        serializable_result_2["excluded"] = exclude_paths
-
-        with Path(lint_result2).open(mode="w", encoding="utf-8") as f:
-            f.write(json.dumps(serializable_result_2))
-    else:
-        lint_result2 = ""
-        sarif_file2 = ""
+    if return_code == RC.SUCCESS or not args.no_exclude:
+        exclude_paths = parse_sarif_json(exclude_paths, sarif_file, True)
+
+        # If syntax-errors occurred on some files, kick off the second run excluding those files
+        if len(exclude_paths) > 0:
+            lint_result2 = str(metadata_path / "lint-result-2.json")
+            sarif_file2 = str(metadata_path / "sarif-2.json")
+            argv = ["ansible-lint", "--sarif-file", sarif_file2]
+            argv.append("--exclude")
+            argv.extend(exclude_paths)
+            update_argv(argv, args)
+            _logger.info(",".join(argv))
+            serializable_result_2, return_code = execute_ansiblelint(
+                argv,
+                str(repository_path),
+            )
+            serializable_result_2["excluded"] = exclude_paths
+            exclude_paths = parse_sarif_json(exclude_paths, sarif_file2, False)
+
+            _rename_excluded_files(exclude_paths, repository_path)
+
+            with Path(lint_result2).open(mode="w", encoding="utf-8") as f:
+                f.write(json.dumps(serializable_result_2))
+    else:
+        exclude_paths = parse_sarif_json(exclude_paths, sarif_file, False)

    generate_report(
-        "" if args.skip_ansible_lint else lint_result2 if lint_result2 else lint_result,
+        lint_result,
+        lint_result2,
        sarif_file,
        sarif_file2,
        args,
+        exclude_paths,
    )

-    if len(exclude_paths) > 0 and args.no_exclude:
-        msg = "One or more syntax-check errors were found by ansible-lint"
+    if return_code != RC.SUCCESS and args.no_exclude:
+        msg = "One or more lint errors were found by ansible-lint"
        raise RuntimeError(msg)
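
The net behavioral change, sketched as plain control flow; this is a simplification of the code above, using a hypothetical helper that is not part of the module:

```python
# Simplified decision logic introduced by this commit (illustrative only).
def decide(no_exclude: bool, lint_rc: int, offending_files: list[str]) -> str:
    if lint_rc != 0 and no_exclude:
        # --no-exclude now aborts on any lint error, not only syntax-check errors.
        raise RuntimeError("One or more lint errors were found by ansible-lint")
    if offending_files:
        # Default behavior: exclude offending files, rerun lint, still emit a dataset.
        return f"rerun excluding {offending_files}, then generate the dataset"
    return "generate the dataset"
```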


-def parse_sarif_json(exclude_paths: list[str], sarif_file: str) -> None:
+def _rename_excluded_files(exclude_paths: list[str], repository_path: Path) -> None:
+    with pushd(str(repository_path)):
+        for p in exclude_paths:
+            path = Path(p)
+            # Do not attempt to rename directories (e.g. role names)
+            if path.is_file():
+                Path(p).rename(p + ".__EXCLUDED__")
+
+
+def parse_sarif_json(
+    exclude_paths: list[str],
+    sarif_file: str,
+    syntax_check_errors_only: bool,
+) -> list[str]:
    """Analyze SARIF.json to see if syntax-check errors occurred or not on the first run."""
    with Path(sarif_file).open("rb") as f:
        o = json.load(f)
    for run in o["runs"]:
        for result in run["results"]:
-            if result["ruleId"].startswith("syntax-check"):
+            if (
+                result["ruleId"].startswith("syntax-check")
+                or not syntax_check_errors_only
+                and ("level" not in result or result["level"] == "error")
+            ):
                exclude_paths.extend(
                    [
                        location["physicalLocation"]["artifactLocation"]["uri"]
                        for location in result["locations"]
                    ],
                )
+    return sorted(set(exclude_paths))
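
For reference, a hand-made SARIF fragment of the shape this function walks (sample data, not real ansible-lint output):

```python
# Hand-made sample limited to the fields parse_sarif_json actually reads.
sample = {
    "runs": [
        {
            "results": [
                {
                    "ruleId": "syntax-check[specific]",
                    "level": "error",
                    "locations": [
                        {
                            "physicalLocation": {
                                "artifactLocation": {"uri": "roles/bad/tasks/main.yml"},
                            },
                        },
                    ],
                },
            ],
        },
    ],
}
# With syntax_check_errors_only=True, only syntax-check results are collected;
# with False, any result whose "level" is "error" (or absent) is collected too.
```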


def update_argv(argv: list[str], args: argparse.Namespace) -> None:
4 changes: 2 additions & 2 deletions src/ansible_content_parser/lint.py
@@ -98,6 +98,6 @@ def ansiblelint_main(argv: list[str] | None = None) -> LintResult:
",".join(options.mock_filters),
)

app.report_outcome(result, mark_as_success=mark_as_success)
return_code = app.report_outcome(result, mark_as_success=mark_as_success)

return result, mark_as_success
return result, mark_as_success, return_code
25 changes: 20 additions & 5 deletions src/ansible_content_parser/report.py
@@ -20,7 +20,7 @@
_label_count = "Count"
_label_file_type = "File Type"
_label_file_path = "File Path"
-_label_file_state = "Updated"
+_label_file_state = "Excluded/Autofixed"
_label_module_name = "Module Name"
_label_total = "TOTAL"

@@ -74,18 +74,25 @@ def filetype_summary(result: dict[str, list[LintableDict]]) -> str:
    return summary


-def get_file_list_summary(files: list[LintableDict]) -> str:
+def get_file_list_summary(files: list[LintableDict], excluded_paths: list[str]) -> str:
    """Get summary string from the lintable list."""
    entries = []
    max_filename_len = len(_label_file_path)
    max_kind_len = len(_label_file_type)
    max_state_len = len(_label_file_state)
    kinds = {f["filename"]: f["kind"] for f in files}
    updated = {f["filename"]: f["updated"] for f in files}
+    excluded = {f["filename"]: (f["filename"] in excluded_paths) for f in files}
    for filename in sorted(kinds):
        kind = kinds[filename]
        if kind != "":  # Skip files that were not identified by ansible-lint
-            state = "updated" if updated[filename] else ""
+            state = (
+                "excluded"
+                if excluded[filename]
+                else "autofixed"
+                if updated[filename]
+                else ""
+            )
            entries.append([filename, kind, state])
            if len(filename) > max_filename_len:
                max_filename_len = len(filename)
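
Note that the state column gives exclusion precedence over autofix: a file that was autofixed and later excluded reports "excluded". A small sketch of that precedence with hypothetical data:

```python
# Hypothetical data; "excluded" wins over "autofixed" for the same file.
updated = {"roles/a/tasks/main.yml": True, "site.yml": False}
excluded_paths = ["roles/a/tasks/main.yml"]

for filename, was_updated in updated.items():
    state = (
        "excluded"
        if filename in excluded_paths
        else "autofixed" if was_updated else ""
    )
    print(f"{filename:28} {state}")
```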
@@ -216,9 +223,11 @@ def get_excluded_files(excluded: list[str]) -> str:

def generate_report(
    json_file: str,
+    json_file2: str,
    sarif_file: str,
    sarif_file2: str,
    args: argparse.Namespace,
+    excluded_paths: list[str],
) -> None:
    """Generate report."""
    report = f"""
@@ -238,6 +247,11 @@
    with Path(json_file).open(encoding="utf-8") as f:
        result = json.load(f)
    files = result["files"]
+
+    last_json_file = json_file2 if json_file2 else json_file
+    if last_json_file:
+        with Path(last_json_file).open(encoding="utf-8") as f:
+            result = json.load(f)
    excluded = result.get("excluded", [])

    report += f"""
@@ -249,7 +263,7 @@
[ List of Ansible files identified ]
-{get_file_list_summary(files)}
+{get_file_list_summary(files, excluded_paths)}
[ Issues found by ansible-lint ]
@@ -267,7 +281,8 @@
{get_sarif_summary(metadata_path, sarif_file2)}
"""
    else:
-        report += f"""
+        if sarif_file:
+            report += f"""
{get_sarif_summary(metadata_path, sarif_file)}
"""
    with (out_path / _report_txt).open(mode="w") as f:
51 changes: 51 additions & 0 deletions tests/test_main.py
@@ -210,6 +210,8 @@ def test_cli_with_local_directory(self) -> None:
                testargs = [
                    "ansible-content-parser",
                    "-v",
+                    "--profile",
+                    "min",
                    source.name + "/",  # intentionally add "/" to the end
                    output.name,
                ]
@@ -220,9 +222,58 @@

                assert context.exception.code == 0, "The exit code should be 0"

+                found_file_counts_section = False
+                with (Path(output.name) / "report.txt").open("r") as f:
+                    for line in f:
+                        if "[ File counts per type ]" in line:
+                            found_file_counts_section = True
+                        if line == "Module Name Count\n":
+                            assert found_file_counts_section is True
+                            line = f.readline()
+                            assert line == "---------------------\n"
+                            line = f.readline()
+                            assert line == "service 2\n"
+                            line = f.readline()
+                            assert line == "yum 2\n"
+                            line = f.readline()
+                            assert line == "firewalld 1\n"
+                            line = f.readline()
+                            assert line == "meta 1\n"
+                            line = f.readline()
+                            assert line == "---------------------\n"
+                            line = f.readline()
+                            assert line == "TOTAL 6\n"
+                            line = f.readline()
+                            assert line == "---------------------\n"
+
+    def test_cli_with_local_directory_with_no_ansible_lint(self) -> None:
+        """Run the CLI with a local directory, skipping ansible-lint."""
+        with temp_dir() as source:
+            self._create_repo(source)
+            self._add_second_playbook(source)
+            self._add_third_playbook(source)
+            with temp_dir() as output:
+                testargs = [
+                    "ansible-content-parser",
+                    "-v",
+                    "--skip-ansible-lint",
+                    source.name,
+                    output.name,
+                ]
+                with patch.object(sys, "argv", testargs), self.assertRaises(
+                    SystemExit,
+                ) as context:
+                    main()
+
+                assert context.exception.code == 0, "The exit code should be 0"
+
+                found_file_counts_section = False
+                with (Path(output.name) / "report.txt").open("r") as f:
+                    for line in f:
+                        if "[ File counts per type ]" in line:
+                            found_file_counts_section = True
+                        if line == "Module Name Count\n":
+                            assert found_file_counts_section is False
+                            line = f.readline()
+                            assert line == "---------------------\n"
+                            line = f.readline()

1 comment on commit 4a9b28b

@github-actions


ClamAV Virus Definition DB Files:
----
total 227188
-rw-r--r--  1 root root 170479789 Oct  2 06:07 main.cvd
-rw-r--r--  1 root root        69 Oct  2 06:07 freshclam.dat
-rw-r--r--  1 root root  61843294 Oct  2 06:07 daily.cvd
-rw-r--r--  1 root root    291965 Oct  2 06:07 bytecode.cvd
drwxr-xr-x 14 root root      4096 Oct  2 13:42 ..
drwxr-xr-x  2 root root      4096 Oct  2 13:42 .
----
File: /var/lib/clamav/bytecode.cvd
Build time: 22 Feb 2023 16:33 -0500
Version: 334
Signatures: 91
Functionality level: 90
Builder: anvilleg
MD5: 0464067a252b1e937012ad34e811065f
Digital signature: urVBCbhJcz8v6i1E6HedDwa8TxBHnJknqg7SE+6JWBtovATpw8MWwS+kvGAi//x5u0LIFwhPvUsgEBBeFiZE0QTTWazOhJ/LfKJK+nODqha6cTvaQdKl2rSbEOv6grv7UONV8eKi383Wv07wfSNYp+lPNpt0QmejKb1TMHAYTA
Verification OK.
----
File: /var/lib/clamav/daily.cvd
Build time: 01 Oct 2023 03:38 -0400
Version: 27048
Signatures: 2041893
Functionality level: 90
Builder: raynman
MD5: 236351170e6df671495403cd518d5f4b
Digital signature: xPPQ76mi/CVQRonhozxMHN/NUH/Pajo0j+KOkXH2tifeN1hOCWpw8ZQ0Yg4UKkAJY8fDnIE/B9xSm9z+1BmqkINwvW0rq890Om/IhorIbu2J18oHOMj8C2YK45hrRvGgbio3mIcE9e2U0N0/HRlOfUSZB6f5DbTDEe4+dsrGudc
Verification OK.
----
File: /var/lib/clamav/main.cvd
Build time: 16 Sep 2021 08:32 -0400
Version: 62
Signatures: 6647427
Functionality level: 90
Builder: sigmgr
MD5: 137eccce31aacb21b5a98bb8c21cefd6
Digital signature: twaJBls8V5q64R7QY10AatEtPNuPWoVoxTaNO1jpBg7s5jIMMXpitgG1000YLp6rb0TWkEKjRqxneGTxuxWaWm7XBjsgwX2BRWh/y4fhs7uyImdKRLzQ5y8e2EkSChegF/i8clqfn+1qetq9j4gbktJ3JZpOXPoHlyr2Dv9S/Bg
Verification OK.
----
Scanning Results:
ClamAV 1.0.2/27048/Sun Oct  1 07:38:34 2023

----------- SCAN SUMMARY -----------
Known viruses: 8673778
Engine version: 1.0.2
Scanned directories: 7162
Scanned files: 53774
Infected files: 0
Data scanned: 2286.80 MB
Data read: 1350.52 MB (ratio 1.69:1)
Time: 512.174 sec (8 m 32 s)
Start Date: 2023:10:02 13:43:03
End Date:   2023:10:02 13:51:35
