Refactor eval functions #2201

Merged
merged 1 commit into from
Feb 6, 2024

Conversation

mr0re1
Collaborator

@mr0re1 mr0re1 commented Feb 3, 2024

Make Blueprint.Eval the only place where blueprint context is created and avoid recursive context creation.

Make `Blueprint.Eval` the only place where blueprint context is created.
@mr0re1 mr0re1 added the release-chore To not include into release notes label Feb 3, 2024
@mr0re1 mr0re1 requested a review from harshthakkar01 February 3, 2024 21:20
@mr0re1 mr0re1 merged commit 9b9fc82 into GoogleCloudPlatform:develop Feb 6, 2024
8 of 33 checks passed
@mr0re1 mr0re1 deleted the eval_this branch February 6, 2024 19:45
mr0re1 added a commit that referenced this pull request Feb 14, 2024
* Bump github.com/hashicorp/terraform-exec from 0.19.0 to 0.20.0

Bumps [github.com/hashicorp/terraform-exec](https://github.com/hashicorp/terraform-exec) from 0.19.0 to 0.20.0.
- [Release notes](https://github.com/hashicorp/terraform-exec/releases)
- [Changelog](https://github.com/hashicorp/terraform-exec/blob/main/CHANGELOG.md)
- [Commits](hashicorp/terraform-exec@v0.19.0...v0.20.0)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/terraform-exec
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Make `subnetwork_self_link` required, don't pass `subnetwork_project` around (#2067)

* Slurm6. Automagically set `nodeset.name` from module id. (#2068)

Slurm6. Automagically set `nodeset.name` from module id.

* VPC. Replace `s/_/-/` in `deployment_name` to avoid deploy-time error (#2083)

* Add `hpc-slurm6.yaml` to `examples/README` (#2084)

* Slurm6. QuickFix broken TPU nodeset usage (#2086)

* Slurm6. Reference TPU example in `examples/README` (#2087)

* Use `cty.Type` instead of `string` to represent type of vars. (#2088)

NOTE: after this change, `list(any)` is used instead of `list`.

* fix: header was over-indented

* Check if supplied value matches module variable type (#2089)

* Add spelling hints for global vars and outputs (#2082)

* Point ref errors to a location within nested object (#2081)

* Refactor `Blueprint.WalkModules` (#2094)

* Add safe-version to avoid useless `return nil`;
* Supply `ModulePath` to the "dangerous" version;
* Use `WalkModules` instead of nested for-loops in a few cases.

* Bump golang.org/x/sys from 0.15.0 to 0.16.0

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.15.0 to 0.16.0.
- [Commits](golang/sys@v0.15.0...v0.16.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Bump google.golang.org/api from 0.154.0 to 0.155.0

Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.154.0 to 0.155.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
- [Commits](googleapis/google-api-go-client@v0.154.0...v0.155.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update spack openfoam example to Slurm V6

* Add test that checks that modules don't output forbidden names (#2091)

* Remove hpc-slurm-legacy example and references

* Remove pre existing fs example and references

* Remove slurm-two-partitions-workstation example and references

* Remove use-resources example and references

* Remove lustre-new-vpc example and references

* Remove test-gcs-fuse example and references

* Remove hpc-cluster-service-acct example and references

* Remove hpc-cluste-slurm-with-startup example and references

* Remove hpc-cluster-project example and references

* Remove hpc-cluster-high-io-remote-state example and references

* Update test outputs example and remove slurm partition and controller

* Slurm6. Support `additional_networks`,`reservation_name` & `access_config` (#2062)

* Add support for `additional_networks` & `reservation_name`;
* Nodeset. Pass `access_config`, do not use `enable_public_ips`

* Add zone finding for cpu partitions in hpc-enterprise-slurm test

* Rename GKE subnet with build id to avoid conflicts

* Fix slurm v6 links in example README

* Add startup script option to install stackdriver agent

* Update tests to focus on stackdriver while still testing ops agent

* Unify usage and rendering of `HintError` (#2095)

* Rename slurm tpu test to be consistent with blueprint name

* Add example script to uninstall Ops Agent and install Stackdriver Agent

* Silence make error message for old versions of git

Older versions of git do not have a '--show-current' flag on the git
branch command. This command allows fallback to the ancient approach to
determining the active branch and also redirects stderr to /dev/null. If
neither command succeeds, then ghpc --version reports detached HEAD for
the branch.

* yamllint. Don't show warnings (#2122)

Motivation: warnings don't cause lint to fail (only errors do),
but they are printed alongside the errors (many lines), which makes
it hard to see the actual error message.

* Move `examples/hpc-slurm` to V6 (#2097)

pick f88a30f Unify usage and rendering of `HintError`

* Move `examples/hpc-slurm` to V6;
* Updated `examples/README`;
* Remove `slurm-v5-hpc-centos7` test.

* Add `has_to_be_used` behaviour to some of modules (#2092)

* Update README.md

* Reduce default maximum number of HTCondor execute points

Especially for initial deployments, a maximum of 100 could result in
significant spend beyond what was anticipated. Reducing to 5 addresses
this while still allowing the user to deliberately scale up.

* Hint spelling for inputs (#2124)

* Simplify rendering of errors with Position but without Path (#2096)

* Remove `internalPath`;
* Add `PosError` wrapper, render it specifically.

* Bump jinja2 from 3.1.2 to 3.1.3 in /community/front-end/ofe

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.2 to 3.1.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](pallets/jinja@3.1.2...3.1.3)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Address #2120: fail on bad state and succeed on reinstall

* Fix rendering of "cobra" errors (#2130)

* Move `OFE venv` PR validation into separate trigger. (#2128)

**Motivation**:
* Reduce time it takes to run `PR-validation`;
* Reduce noise in output of `PR-validation`.

* Make `cleanup_compute_nodes` `depends_on` on network (#2126)

* Update spack wrf example and references to use Slurm V6

* Update spack openfoam example to use /opt/apps directory

* Improve readability of "required setting is missing" error (#2133)

* GKE controller node pool extra features

* Improve MIG replacement policies for HTCondor Central Managers

Set the MIG replacement policy to PROACTIVE by default for Central
Managers. This ensures that configuration changes are propagated by
a terraform apply which updates the HTCondor configuration. This is safe
for Central Managers because they recover state dynamically through
periodic API calls to the rest of the cluster. Document the alternative
of OPPORTUNISTIC updates and how to manually trigger a MIG replacement.

* Improve MIG replacement policies for HTCondor Access Points

Allow configuration of the MIG replacement policy for Access
Points. Document the behavior of OPPORTUNISTIC updates and how
to manually trigger a MIG replacement or set to the alternative
of PROACTIVE replacements.

* Improve MIG replacement policies for HTCondor Execute Points

Continue using the default of OPPORTUNISTIC replacement of Execute Point
VMs so that they are (typically) replaced when a job becomes idle.
Strongly recommend this setting in the documentation but discuss the
alternative of PROACTIVE or manually issuing updates via gcloud.

* Fix HTCondor Windows URI for latest 23.0 LTS release

* Address feedback from #2140 for README formatting

* Fix broken link in HTCondor MIG documentation

* Remove intel-select blueprints and references

* Add support for string interpolation (#2076)

* Add support for string interpolation

* Support proper escaping

* Address comments

* UX. Enable output colorization by default (#2145)

* Update DAOS blueprints to use google-cloud-daos v0.5.0, slurm v6

[DAOSGCP-182](https://daosio.atlassian.net/browse/DAOSGCP-182)

- Bump version of DAOS modules to v0.5.0 which install DAOS v2.4

- Modify community/examples/intel/hpc-slurm-daos.yaml to use
  Slurm v6 modules

- Add temporary fix to community/examples/intel/hpc-slurm-daos.yaml
  to work around issue with missing lustre-client 8.8 repo

- Update community/examples/intel/README.md to account for changes
  in DAOS v2.4

Signed-off-by: Mark Olson <[email protected]>

* Added validation and error message to login_startup_scripts_timeout because it is broken

* Update spack gromacs example tutorial and reference to use Slurm V6

* Slurm6. Advance to 6.3.1 (#2146)

* Add commands to verify monitoring agents are active

* Copies python binaries instead of symlink for more isolated venv

* Increase dynamic node count to a more reasonable default value

* Bump google.golang.org/api from 0.155.0 to 0.157.0

Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.155.0 to 0.157.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
- [Commits](googleapis/google-api-go-client@v0.155.0...v0.157.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update README.md with fixes from review.

Signed-off-by: Mark Olson <[email protected]>

* Add example of building Slurm on top of Rocky 8

Tom provided an example blueprint that demonstrated this methodology.

Co-authored-by: Tom Downes <[email protected]>

* Update hpc slurm gromacs example and references to use Slurm V6

* Clarify that zone-finding isn't available for TPUs (#2156)

* Fixed PyMarkdown issue in community/examples/intel/README.md

Signed-off-by: Mark Olson <[email protected]>

* Fixed typos in community/examples/intel/README.md

Signed-off-by: Mark Olson <[email protected]>

* Bring `$(...)` functionality on par with `((...))` (#2053)

* Use token-replacement instead of string-replacement for expression updates;
* Translate any BP-expression to TF-expressions by transforming used traversals;
* Remove notion of `((...))` from documentation.

* Address feedback from #2150

* Start legacy monitoring agent after installing

* Address feedback: be explicit about Ansible install

* Add documentation for Slurm building example

* Adding test for building slurm image

* Create variable to pass the Packer group name

* Fix: old ansible was not compatible with selinux package, pin to latest

* Fix false-positive `test_deployment_variable_not_used` (#2164)

Preserve original state of `Vars`

* Bump test coverage for `pkg/modulewriter` (#2163)

* Add `--force` flag to `ghpc create` (#2162)

* Improve error logging for expressions parsing (#2078)

* Show snippet with pointer to a column;

```sh
Error: :0,21-22: Invalid character; This character is not used within the language., and 1 other diagnostic(s)
34:         content: |

Error: Invalid character; This character is not used within the language.
  echo "Hello $(vars.project_id from $(vars.region)"
                                     ^
33:         content: |
```

* Prevent line-breaks within expressions.
This constraint existed before, but was accidentally relaxed by a recent PR.

* Remove quantum circuit simulator example

* Update hpc-slurm-legacy-sharedvpc example and references to use Slurm V6

* Bump `cmd` test coverage (#2165)

* Update Slurm image 6.1 -> 6.3 (#2169)

* Add login node in the spack openfoam tutorial example

* Update Toolkit docs to point to GCP Slurm fork

* Fix: added new variables to ml-slurm integration test

* Update slurm references

* Update CloudSQL blueprint to v6

* Ensure Windows VMs start HTCondor only after successful secret download

- this enables Managed Instance Group health checks to mark the node
  unhealthy for deletion

* Updated legacy-sharedvpc reference naming to sharedvpc

* Bump github.com/zclconf/go-cty from 1.14.1 to 1.14.2

Bumps [github.com/zclconf/go-cty](https://github.com/zclconf/go-cty) from 1.14.1 to 1.14.2.
- [Release notes](https://github.com/zclconf/go-cty/releases)
- [Changelog](https://github.com/zclconf/go-cty/blob/main/CHANGELOG.md)
- [Commits](zclconf/go-cty@v1.14.1...v1.14.2)

---
updated-dependencies:
- dependency-name: github.com/zclconf/go-cty
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Bump google.golang.org/api from 0.157.0 to 0.159.0

Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.157.0 to 0.159.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
- [Commits](googleapis/google-api-go-client@v0.157.0...v0.159.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Modified validation message to be more clear

* Remove project_id from image building example

* Reduce size of image builder to be compatible with initial projects

* Remove Slurm V4 modules and add note to use and reference V4 modules and
examples

* Patch Slurm integration test

Retry initial node count until sinfo command is successful

* Update Chrome Remote Desktop to Debian 12 by default

* Make TPU non-preemptible in blueprint and add retries in JAX
verification integration test

* Update startup-script module to latest release

* Bump `pkg/modulereader` test coverage 80% -> 87% (#2161)

* Fix test broken by remove module. (#2186)

* Update TPU v6 blueprint to use new VPC module

* Improve test coverage of `pkg/modulewriter` (#2188)

* Updating spack and ramble buckets to use 6 digits of hex

* Improve output of `tools/enforce_coverage.pl` (#2191)

* Output package that failed;
* Set thresholds `pkg/logging: 0; pkg/inspect: 60`

* Remove v4 reference from network storage document

* Add function `Dict.Keys` to differentiate places that don't care about values. (#2194)

* Add shorthand `Reference.AsValue()` (#2195)

* Deprecate Dell Omnia module and example blueprint

* Change mode of maintenance.py so that it can be executed as description suggests

* Change batch-job-base template from json to YAML

Batch supports YAML job configurations now so we can use YAML
everywhere instead of JSON, which will hopefully make some of
the conditional syntax in the templates easier to manage.

* Move topological ordering of vars into separate function. (#2190)

**Motivation:** To be reused in other places

* Bump google.golang.org/api from 0.159.0 to 0.161.0

Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.159.0 to 0.161.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
- [Commits](googleapis/google-api-go-client@v0.159.0...v0.161.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update Slurm-GCP release to 5.10.2

* Add expression variables to the test configs (#2197)

* HTCondor: expire ClassAds more rapidly

Decrease the default CLASSAD_LIFETIME from 15 minutes to 3 minutes. In
an on-premises system, allowing for long intervals between ClassAd
updates can be good to allow more machines to reboot. In the cloud, the
absence of ClassAd updates more likely indicates the intentional (or
automated) deletion of a VM. So it should be removed from the HTCondor
pool.

* HTCondor: ensure Windows nodes are detected as unhealthy

Ensure that the script for Windows exits with an error before starting
HTCondor when it cannot download the condor_config file.

* Align formatting choices with recent commits

* HTCondor autoscaler

Adopt a more conservative approach that the autoscaler should treat
nodes in any state that reflects automated MIG modification as an "idle"
node for the purposes of autoscaling. This helps prevent autoscaling
runaway when VMs are unable to enter the healthy state (which reflects
as "NONE" for currentAction in the MIG).

* Fix tests: look for yaml file, use image with yaml compat gcloud

* Use multiline yaml block scalar for Batch runnable

* Address feedback from #2204

* Bump cryptography from 41.0.6 to 42.0.0 in /community/front-end/ofe

Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.6 to 42.0.0.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](pyca/cryptography@41.0.6...42.0.0)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Take "first deploy only" dependency on `slurm_files` (#2181)

* Refactor `eval` functions (#2201)

Make `Blueprint.Eval` the only place where blueprint context is created.

* Remove `setGlobalLabels` as it's not needed (#2193)

The `combineLabels` will do it.

* Show hint message if unsupported function is used (#2211)

Currently, ghpc evaluates a few contexts:
* `vars`;
* `module.settings` in packer groups;
* `validators.input`.

The evaluation context supports only two functions: `merge` and `flatten`.

The remaining expressions (`module.settings` in TF groups) are not evaluated by ghpc
and can therefore use any valid HCL syntax.

* Update pre-commit hooks

* Restrict GitHub actions to operate on upstream

- the dependency license and PR label actions only need to run on the
  GoogleCloudPlatform copy of the HPC Toolkit

* Remove pre-commit from Cloud Build PR validation

* Create GitHub Action to run pre-commit

- pre-commit verification will run on every Pull Request
- if the user opts in with the label "pre-commit-autofix" the user can
  request that pre-commit add a commit that fixes formatting, where it
  is capable of automatically fixing formatting

* Bump django from 4.2.7 to 4.2.10 in /community/front-end/ofe (#2213)

Bumps [django](https://github.com/django/django) from 4.2.7 to 4.2.10.
- [Commits](django/django@4.2.7...4.2.10)

---
updated-dependencies:
- dependency-name: django
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Don't run `destroy_resource_policies` before `destroy_nodes` is done (#2217)

```sh
module.slurm_controller.module.cleanup_compute_nodes[0].null_resource.destroy_nodes_on_destroy[0]: Destruction complete after 2m52s
module.slurm_controller.module.cleanup_resource_policies[0].null_resource.destroy_resource_policies_on_destroy[0]: Destroying... [id=8962702476058374765]
```

* Rename HTC Slurm configuration templates with explicit purpose

* Add Slurm configuration template for long Prolog/Epilog scripts

* Adopt empty string as default value for maintenance_interval

The default value of null cannot be set as a deployment variable; this
will allow the value to be set at the top of a blueprint.

* Remove enable_devel from slurm-gcp v5 examples

* Add login node to spack gromacs tutorial example

* Version bump to 1.28.0 (#2232)

---------

Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Mark Olson <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nick Stroud <[email protected]>
Co-authored-by: Harsh Thakkar <[email protected]>
Co-authored-by: Tom Downes <[email protected]>
Co-authored-by: Tom Downes <[email protected]>
Co-authored-by: Stas Bekman <[email protected]>
Co-authored-by: Eimantas Kazakevicius <[email protected]>
Co-authored-by: Mark Olson <[email protected]>
Co-authored-by: Carson Dunbar <[email protected]>
Co-authored-by: Carlos Boneti <[email protected]>
Co-authored-by: Rohit Ramu <[email protected]>
Co-authored-by: Alyssa <[email protected]>
Co-authored-by: alyssa-sm <[email protected]>
Co-authored-by: cdunbar13 <[email protected]>
Co-authored-by: Aaron Golden <[email protected]>
Labels
release-chore To not include into release notes
2 participants