Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exclude specific modules from a contract? #244

Open
apmorton opened this issue Nov 7, 2024 · 6 comments
Open

Exclude specific modules from a contract? #244

apmorton opened this issue Nov 7, 2024 · 6 comments

Comments

@apmorton
Copy link

apmorton commented Nov 7, 2024

We currently have a configuration similar to below:

[tool.importlinter]
root_packages = [
  'package',
]
include_external_packages = true

[[tool.importlinter.contracts]]
name = "only allow forbidden in subpackage"
type = "forbidden"
source_modules = [
  'package',
]

forbidden_modules = [
  'forbidden',
]

ignore_imports = [
  'package.subpackage.** -> forbidden',
]

This works, but has an unintended (for us) side-effect: transitive imports won't violate the contract

# package/thing.py
import package.subpackage.constants # should be allowed
import package.subpackage.impl # should not be allowed, but is

# package/subpackage/constants.py
XYZ = 1

# package/subpackage/impl.py
import forbidden

Is something like the following feasible/desirable?

[[tool.importlinter.contracts]]
name = "only allow forbidden in subpackage"
type = "forbidden"
source_modules = [
  'package',
]

ignore_modules = [
  'package.subpackage.**',
]

forbidden_modules = [
  'forbidden',
]

Where ignore_modules aren't themselves directly checked, but transitive imports through them would be.

@seddonym
Copy link
Owner

seddonym commented Nov 11, 2024

Thanks for the issue!

If I'm understanding correctly, you want to lint for the following: modules in package (except package.subpackage) must not import forbidden, even indirectly. Is that correct?

You could achieve already by listing of the modules in package that are not package.subpackage explicitly, e.g.

[[tool.importlinter.contracts]]
name = "only allow forbidden in subpackage"
type = "forbidden"
source_modules = [
  'package.thing',
  'package.something_else'
]
forbidden_modules = [
  'forbidden',
]

Providing you remembered to list all of the other modules in source_modules (with the exception of subpackage) then I think that would work. Admittedly it could end up being a bit verbose, so your ignore_modules could help with that.

I'm reluctant to add this to Import Linter, at least not yet. if you're really keen to avoid the verbosity you could make a custom contract type that extends the forbidden contract type but automatically includes all the modules that aren't packages.subpackage.

Side note: is it worth reconsidering this architectural rule? Sometimes you might want to add an import of forbidden to a module in subpackage, but there might happen to be a dependency from another module in package that means you can't add it. It might be worth exploring a layered architecture instead - the benefit of that is it's explicit what you can and can't import from depending on what subpackage you're in. Just my 2 cents.

Hope that's helpful.

@apmorton
Copy link
Author

Is that correct?

Yep - that's exactly it.

Providing you remembered to list all of the other modules in source_modules

Yeah, we considered this, but the problem is package is (uncontrolled) chaos - the list of subpackages is a moving target and lengthy.

Side note: is it worth reconsidering this architectural rule?

This kind of rule is not our desired end-state for sure. We're attempting to use rules like this to constrain the problem and prevent regression in an evolving codebase.

Sometimes you might want to add an import of forbidden to a module in subpackage, but there might happen to be a dependency from another module in package that means you can't add it.

That is actually the exact thing I want to prevent!

forbidden is some external library that we don't want to ever be imported outside of subpackage - the original rule I showed tried to accomplish that but failed due to indirect imports.

Admittedly it could end up being a bit verbose, so your ignore_modules could help with that.

You mention you're reluctant to adding it - is that just a time commitment/priority thing, or are there other concerns? If I take a stab at implementing ignore_modules and send a PR would you consider merging it?

@seddonym
Copy link
Owner

You mention you're reluctant to adding it - is that just a time commitment/priority thing, or are there other concerns?

Partly - it's also about not allowing Import Linter to get unnecessarily complex, so I would want to think about it a bit more about introducing this new idea. Possibly there is a better alternative.

For example, a different approach would be to have an exhaustive field on the contract - that's something that is used on the layers Import Linter contract already. That would protect against adding of new children to package, though it wouldn't help with the verbosity angle.

I'm not 100% against it, just want to move slowly. In the meantime it would be a lot less work for you to define a custom contract type - let me know if you need any guidance.

@apmorton
Copy link
Author

I've extended the ForbiddenContract in the following way, which seems to work - does this sound correct to you?

--- a/importlinter/contracts/forbidden.py   2024-10-29 09:01:04.000000000 -0500
+++ b/importlinter/contracts/forbidden.py       2024-11-12 10:51:57.109423009 -0600
@@ -36,6 +36,7 @@
 
     source_modules = fields.ListField(subfield=fields.ModuleExpressionField())
     forbidden_modules = fields.ListField(subfield=fields.ModuleExpressionField())
+    ignore_modules = fields.ListField(subfield=fields.ModuleExpressionField())
     ignore_imports = fields.SetField(subfield=fields.ImportExpressionField(), required=False)
     allow_indirect_imports = fields.BooleanField(required=False, default=False)
     unmatched_ignore_imports_alerting = fields.EnumField(AlertLevel, default=AlertLevel.ERROR)
@@ -62,6 +63,10 @@
                 self.forbidden_modules,  # type: ignore
             )
         )
+        ignore_modules = module_expressions_to_modules(
+            graph,
+            self.ignore_modules,  # type: ignore
+        )
 
         self._check_all_modules_exist_in_graph(source_modules, graph)
         self._check_external_forbidden_modules(forbidden_modules)
@@ -92,6 +97,7 @@
                         chains = graph.find_shortest_chains(
                             importer=source_module.name, imported=forbidden_module.name
                         )
+                    chains = [c for c in chains if Module(c[0]) not in ignore_modules]
                     if chains:
                         is_kept = False
                         for chain in sorted(chains):

Effectively ignoring any import chains which start with an ignored module.

@apmorton
Copy link
Author

Hmm, nevermind - this doesn't seem to do the job correctly in all cases.

I guess I need to do something more like contract_utils.remove_ignored_imports and remove the ignored modules, but retain the imports...

@apmorton
Copy link
Author

Here is the final implementation for anyone else interested - seems to work correctly.

Major differences to the normal forbidden contract are:

        for ignored_module in ignore_modules:
            imports = graph.find_modules_directly_imported_by(str(ignored_module))
            importers = graph.find_modules_that_directly_import(str(ignored_module))
            for importer in importers:
                for imp in imports:
                    graph.add_import(
                        importer=importer,
                        imported=imp,
                    )
            graph.remove_module(str(ignored_module))

and

chain = original_graph.find_shortest_chain(chain[0], chain[-1])
import copy

from importlinter.contracts.forbidden import (
    AlertLevel,
    ContractCheck,
    ForbiddenContract,
    ImportGraph,
    contract_utils,
    fields,
    module_expressions_to_modules,
    output,
    settings,
)


class CustomForbiddenContract(ForbiddenContract):
    type_name = "forbidden"

    source_modules = fields.ListField(subfield=fields.ModuleExpressionField())
    forbidden_modules = fields.ListField(subfield=fields.ModuleExpressionField())
    ignore_imports = fields.SetField(subfield=fields.ImportExpressionField(), required=False)
    allow_indirect_imports = fields.BooleanField(required=False, default=False)
    unmatched_ignore_imports_alerting = fields.EnumField(AlertLevel, default=AlertLevel.ERROR)

    ignore_modules = fields.ListField(subfield=fields.ModuleExpressionField())

    def check(self, graph: ImportGraph, verbose: bool) -> ContractCheck:
        original_graph = copy.deepcopy(graph)
        ignore_modules = module_expressions_to_modules(
            graph,
            self.ignore_modules,  # type: ignore
        )
        for ignored_module in ignore_modules:
            imports = graph.find_modules_directly_imported_by(str(ignored_module))
            importers = graph.find_modules_that_directly_import(str(ignored_module))
            for importer in importers:
                for imp in imports:
                    graph.add_import(
                        importer=importer,
                        imported=imp,
                    )
            graph.remove_module(str(ignored_module))

        is_kept = True
        invalid_chains = []

        warnings = contract_utils.remove_ignored_imports(
            graph=graph,
            ignore_imports=self.ignore_imports,  # type: ignore
            unmatched_alerting=self.unmatched_ignore_imports_alerting,  # type: ignore
        )

        source_modules = list(
            module_expressions_to_modules(
                graph,
                self.source_modules,  # type: ignore
            )
        )
        forbidden_modules = list(
            module_expressions_to_modules(
                graph,
                self.forbidden_modules,  # type: ignore
            )
        )

        self._check_all_modules_exist_in_graph(source_modules, graph)
        self._check_external_forbidden_modules(forbidden_modules)

        # We only need to check for illegal imports for forbidden modules that are in the graph.
        forbidden_modules_in_graph = [m for m in forbidden_modules if m.name in graph.modules]

        def sort_key(module):
            return module.name

        for source_module in sorted(source_modules, key=sort_key):
            for forbidden_module in sorted(forbidden_modules_in_graph, key=sort_key):
                output.verbose_print(
                    verbose,
                    "Searching for import chains from " f"{source_module} to {forbidden_module}...",
                )
                with settings.TIMER as timer:
                    subpackage_chain_data = {
                        "upstream_module": forbidden_module.name,
                        "downstream_module": source_module.name,
                        "chains": [],
                    }

                    if str(self.allow_indirect_imports).lower() == "true":
                        chains = self._get_direct_chains(source_module, forbidden_module, graph)
                    else:
                        chains = graph.find_shortest_chains(
                            importer=source_module.name, imported=forbidden_module.name
                        )
                    if chains:
                        is_kept = False
                        for chain in sorted(chains):
                            chain = original_graph.find_shortest_chain(chain[0], chain[-1])
                            chain_data = []
                            for importer, imported in [
                                (chain[i], chain[i + 1]) for i in range(len(chain) - 1)
                            ]:
                                import_details = original_graph.get_import_details(
                                    importer=importer, imported=imported
                                )
                                line_numbers = tuple(j["line_number"] for j in import_details)
                                chain_data.append(
                                    {
                                        "importer": importer,
                                        "imported": imported,
                                        "line_numbers": line_numbers,
                                    }
                                )
                            subpackage_chain_data["chains"].append(chain_data)  # type: ignore
                if subpackage_chain_data["chains"]:
                    invalid_chains.append(subpackage_chain_data)
                if verbose:
                    chain_count = len(subpackage_chain_data["chains"])
                    pluralized = "s" if chain_count != 1 else ""
                    output.print(
                        f"Found {chain_count} illegal chain{pluralized} "
                        f"in {timer.duration_in_s}s.",
                    )

        # Sorting by upstream and downstream module ensures that the output is deterministic
        # and that the same upstream and downstream modules are always adjacent in the output.
        def chain_sort_key(chain_data):
            return (chain_data["upstream_module"], chain_data["downstream_module"])

        return ContractCheck(
            kept=is_kept,
            warnings=warnings,
            metadata={"invalid_chains": sorted(invalid_chains, key=chain_sort_key)},
        )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants