Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More explicit configuration of scanner-specific options #3518

Conversation

oheger-bosch
Copy link
Member

This PR changes the scanner-specific configuration in ort.conf. The map of map of properties is replaced by a map to a new ScannerOptions class, which holds options common to all scanners and a map with scanner-specific properties.

This is a preparation for changes in the configuration related to the compatibility of scan results (see #3223). In this area, we would like to extend the configuration, and this is not possible with a plain map of options.

Note that there is a potential for merge conflicts with #3510.

* this class, it can be configured that scan results are reused even if their parameters do not exactly match the
* current scanner settings.
*/
data class ScannerCompatibilityConfiguration(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A general remark about the class docs and name: IMO it's a bit misleading to refer to the (currently) only place where this class is used, i.e. as part of the scanner configuration, for compatibility checks. I'd prefer to look at this class a bit more in isolation, and at the plain properties it stores. To me, this is something like a ToolVersionMatcher; it's not even scanner-specific! We could also make use of it in the Analyzer when checking CLI tools it calls. I believe the docs and class name should reflect that, although one could probably come up with a better class name than the one I mentioned.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right in the current state of the implementation. However, the class is going to be extended with additional properties that make it specific to the use case described in the docs: namely a set of patterns determining the scanner options to ignore when checking the compatibility of scan results.

It may be possible to refactor some functionality out if a generic tool version check is needed at multiple places. This could be done in a later step. (It sounds that this would affect a larger number of classes.) Also note that in its current state, the class is a plain data class and does not contain any matching logic. Therefore, a name ending on Matcher would be misleading.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

namely a set of patterns determining the scanner options to ignore when checking the compatibility of scan results.

These "scanner options" are just command line options, correct? In that case I'd still argue that even with that addition, this very same class could be used e.g. to configure the compatibility of command line tools used by the analyzer, or?

Therefore, a name ending on Matcher would be misleading.

Agreed, whatever the name would be it should probably not include "Matcher".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These "scanner options" are just command line options, correct? In that case I'd still argue that even with that addition, this very same class could be used e.g. to configure the compatibility of command line tools used by the analyzer, or?

Well, if the use case is that generic. So far I have seen the compatibility checks as a scanner-specific issue. What is a bit tricky here is that the class actually does not define the compatibility of a specific tool, but of a result produced by a specific tool that was invoked with specific parameters. Is the situation similar in the analyzer use case? It seems to me that here only the compatibility of the tool itself is relevant; if a tool is considered compatible, its results (the set of dependencies in this case) are always accepted.

Comment on lines +44 to +52
val minVersion: String? = null,

/**
* The maximum scanner version of a scan result to be considered compatible. A result loaded from a storage is
* accepted only if its scanner version is less than to this version. (This bound of the version range is
* excluding.) Defaults to the next minor version of [minVersion] if unspecified; so patch level updates do not
* cause the compatibility check to fail.
*/
val maxVersion: String? = null
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use an IntRange here, or would limiting us to Ints be too specific?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plain numbers would indeed not be sufficient. The version strings need to be interpreted and compared as semantic versions. They could also contain alphanumeric parts, e.g. 2.5-beta3.

* scanner. That way, this scanner can be configured in a special way. The options consist of a part that is common to
* all scanners, and a generic map of properties to be evaluated by specific scanner implementations.
*/
data class ScannerOptions(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, this class is not really scanner-specific, and could be moved as e.g. ToolOptions to its own file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer the case when the compatibility object becomes scanner-specific.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what's the rationale for putting ScannerCompatibilityConfiguration into its own file, but not ScannerOptions?

minVersion = "3.1.0"
maxVersion = "3.3"
}
properties: {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Blank line before starting a new block.

Comment on lines 51 to 52
key1 = "value1"
key2 = "value2"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we already use "ScanCode" as the example (which is good), we should IMO be more concrete here and show a user would specify ScanCode CLI options here.

),
configFile = configFile
)

config.scanner?.storages shouldNotBeNull {
val postgresStorage = this["postgresStorage"]
val postgresStorage = this["postgres"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why that change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test now operates on reference.conf, so that it has more options available. And in this file, the Postgres storage is named only postgres.

postgresStorage.shouldBeInstanceOf<PostgresStorageConfiguration>()
postgresStorage.username shouldBe "username"
postgresStorage.schema shouldBe "argsSchema"
postgresStorage.password shouldBe "envPassword"
}

val scanCodeOptions = config.scanner?.options?.get("ScanCode")
scanCodeOptions.shouldNotBeNull()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please prefer to use the infix notation shouldNot beNull().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, although the variant with the infix notation has the drawback that it does not do a smart cast.

Copy link
Member

@sschuberth sschuberth Jan 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point, in that case you can use the alternate syntax

scanCodeOptions shouldNotBeNull {
    ...
}

which I personally like as it nicely visualizes in a block which code relies on the smart cast.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this looks nice.

@oheger-bosch oheger-bosch force-pushed the oheger-bosch/scanner/options branch 2 times, most recently from 1b7f147 to 8a9b4f6 Compare January 26, 2021 07:25
The options related to the compatibility of scan results have been
specified so far in the generic "options" map in ScannerConfiguration.
Extract them into a dedicated class to make this configuration more
explicit.

In future, we want to extend these options to make compatibility
checks on scan results more flexible.

Signed-off-by: Oliver Heger <[email protected]>
Rather than a map with properties, represent scanner-specific options
by a ScannerOptions object. This class allows storing the options
common to all scanners in an explicit form, but still supports a
generic map with scanner-specific properties.

As in future more complex and structured configurations for scanners
are going to be added, a plain map with options is no longer
sufficient.

Signed-off-by: Oliver Heger <[email protected]>
@oheger-bosch oheger-bosch force-pushed the oheger-bosch/scanner/options branch from 8a9b4f6 to 152c61c Compare January 26, 2021 07:37
@oheger-bosch oheger-bosch force-pushed the oheger-bosch/scanner/options branch from 152c61c to 30fbfed Compare January 26, 2021 09:37
Extend the 'options' section to demonstrate the structure of
configuration options that are now supported.

Enhance the test for OrtConfiguration to check that the new structure
is processed correctly and that properties can be overridden via
command line options.

Signed-off-by: Oliver Heger <[email protected]>
@oheger-bosch oheger-bosch force-pushed the oheger-bosch/scanner/options branch from 30fbfed to 73e72d5 Compare January 26, 2021 11:43
* this class, it can be configured that scan results are reused even if their parameters do not exactly match the
* current scanner settings.
*/
data class ScannerCompatibilityConfiguration(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

namely a set of patterns determining the scanner options to ignore when checking the compatibility of scan results.

These "scanner options" are just command line options, correct? In that case I'd still argue that even with that addition, this very same class could be used e.g. to configure the compatibility of command line tools used by the analyzer, or?

Therefore, a name ending on Matcher would be misleading.

Agreed, whatever the name would be it should probably not include "Matcher".

* scanner. That way, this scanner can be configured in a special way. The options consist of a part that is common to
* all scanners, and a generic map of properties to be evaluated by specific scanner implementations.
*/
data class ScannerOptions(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what's the rationale for putting ScannerCompatibilityConfiguration into its own file, but not ScannerOptions?

compatibility?.minVersion shouldBe "3.1.0"
compatibility?.maxVersion shouldBe "4.0"
properties?.get("commandLine") shouldBe "--copyright --license --info --timeout 300"
properties?.get("debugCommandLine") shouldBe "--consolidate"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We could have nested shouldNotBeNull {} blocks for compatibility and properties.

@sschuberth
Copy link
Member

sschuberth commented Jan 26, 2021

BTW, based on this comment from @pombredanne I wonder whether we should design the ScannerOptions class after the options map that ScanCode adds to its results file:

"options": {
"input": "/home/getting-started/mime-types",
"--copyright": true,
"--ignore": [
"*.ort.yml",
"HERE_NOTICE",
"META-INF/DEPENDENCIES"
],
"--info": true,
"--json-pp": "/home/getting-started/out/scanner/native-scan-results/mime-types_ScanCode.json",
"--license": true,
"--processes": "3",
"--strip-root": true,
"--timeout": "300.0"
},

@oheger-bosch
Copy link
Member Author

This is now obsolete. #3632 has been created following a different approach.

@oheger-bosch oheger-bosch deleted the oheger-bosch/scanner/options branch February 12, 2021 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants