
Config merge #982

Merged
54 commits merged into master from config-merge on Jul 26, 2024
Conversation

@godrei (Contributor) commented Jul 5, 2024

Checklist

Version

Requires a MINOR version update

Context

This PR adds support for modular config files:

  1. A new CLI command (merge) is introduced, which takes a modular bitrise.yml file (one with config module includes), merges the included modules into the main config file, and outputs the merged, final bitrise.yml plus the config tree used for the merge (see the example invocation after this list).
$ bitrise merge -h

NAME:
   bitrise merge - Resolves includes in a modular bitrise.yml and merges included config modules into a single bitrise.yml file.

USAGE:
   bitrise merge [command options] args[0]: By default, the command looks for a bitrise.yml in the current directory, custom path can be specified as an argument.

OPTIONS:
   --output value, -o value  Output directory for the merged config file (bitrise.yml) and related config file tree (config_tree.json).

  2. Wires modular config merging into all CLI commands that work with the bitrise.yml file (run, trigger, trigger_check, validate, and workflows).
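For reference, an invocation based on the help output above might look like this (the paths are illustrative):

$ bitrise merge ./bitrise.yml -o ./merged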

}
}

func (m *Merger) MergeConfig(mainConfigPth string) (string, *models.ConfigFileTreeModel, error) {
Contributor
This Merger struct could maybe be a bit more high-level. It does not necessarily need to know whether a config comes from a local file or a remote one. What it needs to do is read and understand the include section of the YAMLs, ask an entity to return the bytes of the next YAML file, build a tree out of these, and then merge it.

The other parts could be abstracted away a bit, so you only learn about them when you start digging deeper.

Contributor Author

Created a new ConfigReader from the original FileReader and moved the local vs repo file read logic there.
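A minimal sketch of that split, for illustration only; the ReadConfig method name and its signature are assumptions, not the PR's actual API:

// The Merger only walks include references and asks a reader for the bytes
// of each config module; where those bytes come from (local file or remote
// repository) stays hidden behind the interface.
type ConfigReader interface {
	ReadConfig(ref ConfigReference) ([]byte, error)
}

type Merger struct {
	configReader ConfigReader
}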

Comment on lines +45 to +59
includeCommit := r.Commit
isCommitValid := true
if includeCommit != "" {
	isCommitValid = false

	if len(includeCommit) > 5 && len(includeCommit) < 9 {
		isCommitValid = true
	} else if len(includeCommit) == 40 {
		isCommitValid = true
	}
}
if !isCommitValid {
	return fmt.Errorf("invalid commit hash in reference (%s): %s", key, includeCommit)
}

Contributor

Do we need to validate commit hashes, or can we leave this job to the entity that tries to clone the repo?

Contributor Author

This is the usual config model validation, to fail the merge process fast if the reference is in an invalid format.

	opts.ReferenceName = plumbing.NewBranchReferenceName(branch)
}

repo, cloneErr := git.Clone(memory.NewStorage(), memfs.New(), &opts)
Contributor

Would it make sense to introduce a local cache for the cloned repos? That way we would not need to perform the same clone multiple times within the same merge command.

Contributor Author

We agreed that a local file system cache should be used for a single config merge process. The cache might be a folder in the tmp dir that gets cleaned up once the given bitrise.yml is resolved, so a new merge of the same config always starts with a fresh cache.
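A rough sketch of such a per-merge cache, building on the PR's RepoCache interface and ConfigReference type; the tmpDirRepoCache name and the Cleanup method are hypothetical:

// Hypothetical per-merge cache: checkouts live under one temporary directory
// and the whole directory is removed when the merge finishes.
type tmpDirRepoCache struct {
	root string            // e.g. created with os.MkdirTemp
	dirs map[string]string // ConfigReference.RepoKey() -> checkout dir
}

func (c *tmpDirRepoCache) GetRepo(ref ConfigReference) string {
	return c.dirs[ref.RepoKey()]
}

func (c *tmpDirRepoCache) SetRepo(dir string, ref ConfigReference) {
	c.dirs[ref.RepoKey()] = dir
}

// Cleanup removes every cached checkout once the given bitrise.yml is resolved.
func (c *tmpDirRepoCache) Cleanup() error {
	return os.RemoveAll(c.root)
}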

Comment on lines 186 to 196
sameRepo := false
if repoInfo != nil {
	var err error
	if sameRepo, err = isSameRepoReference(reference, *repoInfo); err != nil {
		m.logger.Warnf("Failed to check if the reference is from the same repository: %s", err)
	}
}

if sameRepo {
	return m.readLocalConfigModule(reference, dir)
}
Contributor

We discussed that this same-repo logic could be removed: if the user enters a repository URL, we simply clone it. Most probably, if the user wants to refer to a file on the same branch/tag/commit, they will just enter a path value.

Comment on lines 173 to 178
return &models.ConfigFileTreeModel{
	Path:     key,
	Contents: string(configContent),
	Includes: includedConfigTrees,
	Depth:    depth,
}, nil
Contributor

Would it help if we flattened this tree before we actually merge it? If we flatten it into an array, the merge is simply iterating over the items one by one. At the moment it is a recursive ConfigFileTreeModel.Merge function.
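A minimal sketch of the flattening idea; the flatten helper and its depth-first, includes-first ordering are assumptions, only the Includes field comes from the PR:

// flatten returns the tree's nodes in depth-first order, includes before the
// including file, so a later merge can simply fold the slice left to right.
func flatten(tree ConfigFileTreeModel) []ConfigFileTreeModel {
	var nodes []ConfigFileTreeModel
	for _, include := range tree.Includes {
		nodes = append(nodes, flatten(include)...)
	}
	return append(nodes, tree)
}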

Contributor Author

I'm not sure this would make the code simpler, because we need to traverse the tree first anyway.

cli/run_util.go Outdated
Comment on lines 936 to 946
logger := logV2.NewLogger()
repoCache := configmerge.NewRepoCache()
configReader, err := configmerge.NewConfigReader(repoCache, logger)
if err != nil {
	return models.BitriseDataModel{}, []string{}, fmt.Errorf("failed to create config module reader: %w", err)
}
merger := configmerge.NewMerger(configReader, logger)
mergedConfigContent, _, err := merger.MergeConfig(bitriseConfigPath)
if err != nil {
	return models.BitriseDataModel{}, []string{}, fmt.Errorf("failed to merge Bitrise config (%s): %w", bitriseConfigPath, err)
}
Contributor

This part is exactly the same as what can be found in the merge.go file. I would extract it into a common method, so it is more visible that the same merge logic runs in both places.

Contributor Author

Good idea, especially because I forgot to update the logger package used here.
Fixed in 50e4b64
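A sketch of the suggested extraction, built from the snippet above; the helper name mergeModularConfig is hypothetical, and the NewConfigReader call mirrors the pre-refactor code:

// mergeModularConfig wires up the config reader and merger the same way for
// every caller and returns the merged bitrise.yml contents and config tree.
func mergeModularConfig(configPath string) (string, *models.ConfigFileTreeModel, error) {
	logger := logV2.NewLogger()
	repoCache := configmerge.NewRepoCache()
	configReader, err := configmerge.NewConfigReader(repoCache, logger)
	if err != nil {
		return "", nil, fmt.Errorf("failed to create config module reader: %w", err)
	}

	merger := configmerge.NewMerger(configReader, logger)
	return merger.MergeConfig(configPath)
}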

cli/run_util.go Outdated
if err != nil {
return models.BitriseDataModel{}, warnings, fmt.Errorf("Config (path:%s) is not valid: %s", bitriseConfigPath, err)
log.Warnf("Failed to check if the config is modular: %s", err)
Contributor

How probable is it that an error here is recoverable? I looked at the IsModularConfig function and it can throw errors if it has an I/O problem (cannot open or read the file) or when it cannot parse the YAML. Both of these are fairly fatal errors. That said, there is no harm in not failing early here, because the final config reading will fail as well if there is an I/O problem, and there we bubble up the error.
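A rough sketch of the warn-and-continue handling being discussed, assuming IsModularConfig returns a bool plus an error:

isModular, err := configmerge.IsModularConfig(bitriseConfigPath)
if err != nil {
	// Both failure modes (I/O error, unparsable YAML) surface again when the
	// final config is read, so warning and moving on is enough at this point.
	log.Warnf("Failed to check if the config is modular: %s", err)
}
if isModular {
	// run the config merge; otherwise fall back to reading the file directly
}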

Contributor Author

Agree, updated in 9f58090

Comment on lines 3 to 19
type repoCache struct {
	cache map[string]string
}

func NewRepoCache() RepoCache {
	return repoCache{
		cache: map[string]string{},
	}
}

func (c repoCache) GetRepo(ref ConfigReference) string {
	return c.cache[ref.RepoKey()]
}

func (c repoCache) SetRepo(dir string, ref ConfigReference) {
	c.cache[ref.RepoKey()] = dir
}
Contributor

Is it worth having this struct just to store a map? It seems too simple, and it could move into the config reader.

If this entity also managed the local directory, folder creation, and cleanup, then I would say it is definitely a keeper. But in this state I think we can get rid of it.

Contributor Author

Agree, updated in 8490578

Contents string `json:"contents,omitempty" yaml:"contents,omitempty"`
Includes []ConfigFileTreeModel `json:"includes,omitempty" yaml:"includes,omitempty"`

Depth int `json:"-" yaml:"-"`
Contributor

Do we need this Depth field? It seems like we save the depth in it, but it is not used in a meaningful way. And it is ignored in both the JSON and YAML serialisation.

Contributor Author

Good catch, thanks, this is not needed, removed in 5d2cc06

@godrei merged commit 0026e43 into master on Jul 26, 2024
5 checks passed
@godrei deleted the config-merge branch on July 26, 2024 at 13:32