Error: "failed to switch to Consul server: target sub-connection is not ready (state=TRANSIENT_FAILURE)" for single DC across multiple K8s #1903
Comments
I would appreciate any help on this. We are currently stuck on this at the POC / initial setup stage. If the issue takes a long time to solve, there is a high chance that management will skip Consul and move to an alternative. Kindly help.
OK, to update here: I was able to figure out the issue. If you follow the current documentation https://developer.hashicorp.com/consul/docs/k8s/deployment-configurations/single-dc-multi-k8s exactly as written, it will only create the Consul UI service on NodePort, but not the rest of the services (including gRPC).
So the final cluster1-values.yaml will look like:
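The original snippet did not survive in this thread. As a minimal sketch only, assuming the consul-k8s Helm chart's `server.exposeService` settings, the relevant part of cluster1-values.yaml might look something like the following. The port numbers are arbitrary placeholders within the default Kubernetes NodePort range, not values taken from the issue, and the rest of the file (TLS, ACLs, gossip encryption) is omitted.

```yaml
# Sketch of the server-exposure part of cluster1-values.yaml.
# All nodePort numbers are hypothetical; pick any free ports in your
# cluster's NodePort range (30000-32767 by default).
global:
  datacenter: dc1
server:
  exposeService:
    enabled: true
    type: NodePort        # LoadBalancer is the other option mentioned in this thread
    nodePort:
      http: 30010
      https: 30011
      serf: 30012
      rpc: 30013
      grpc: 30014         # the port the second cluster must reach instead of 8502
ui:
  service:
    type: NodePort
```

The point of the fragment is only that the gRPC port gets a reachable NodePort; check the chart's values reference for the exact keys in your chart version.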
Then, you need to make sure you submit this
So the final cluster2-values.yaml will look like:
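The original snippet is missing here as well. A hedged sketch of the matching part of cluster2-values.yaml, assuming the chart's `externalServers` block and assuming the first cluster exposes HTTPS and gRPC on the hypothetical NodePorts 30011 and 30014, could be:

```yaml
# Sketch of the external-server part of cluster2-values.yaml.
# The host address, ports, and API server URL are placeholders, not values from the issue.
global:
  enabled: false          # no Consul servers are run in the second cluster
  datacenter: dc1
externalServers:
  enabled: true
  hosts: ["10.0.0.4"]     # hypothetical node IP of a cluster1 Kubernetes node
  httpsPort: 30011        # must match the https nodePort exposed by cluster1
  grpcPort: 30014         # must match the grpc nodePort exposed by cluster1
  tlsServerName: server.dc1.consul
  k8sAuthMethodHost: https://kubernetes.example.com:443   # cluster2 API server, placeholder
connectInject:
  enabled: true
```

If `grpcPort` is left at its default of 8502 while the first cluster only exposes gRPC on a NodePort, the second cluster has nothing to connect to, which is consistent with the TRANSIENT_FAILURE error in the title.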
After redeploying with the required values on both clusters, everything started working, and Consul on cluster2 could join the Consul server running on cluster1.
This is based on the issue opened here: hashicorp/consul-k8s#1903

If you follow the documentation https://developer.hashicorp.com/consul/docs/k8s/deployment-configurations/single-dc-multi-k8s exactly as it is, the first cluster will only create the Consul UI service on NodePort but not the rest of the services (including the one for gRPC). By default, the Helm chart creates them as headless services by setting clusterIP to None.

This prevents the second cluster from discovering the Consul server on the first cluster over gRPC, because it simply cannot connect through the default gRPC port 8502, and it ends up with the error shown in the issue hashicorp/consul-k8s#1903.

As a solution, the gRPC service should be exposed using NodePort (or LoadBalancer). I added the required changes to both cluster1-values.yaml and cluster2-values.yaml, along with a description of those changes so that regular users can understand them.

Kindly review; I hope this PR will be accepted.
Submitted a PR for the changes in the documentation.
…to connect (#16430)
* First cluster grpc service should be NodePort (commit message repeats the PR description above)
* Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx (three review-suggestion commits, each co-authored by trujillo-adam)
Closing this issue as the PR is accepted now.
…16927) * ISSUE_TEMPLATE: Update issue template to include ask for HCL config files for bugs (#16307) * Update bug_report.md * Fix hostname alignment checks for HTTPRoutes (#16300) * Fix hostname alignment checks for HTTPRoutes * Fix panicky xDS test flakes (#16305) * Add defensive guard to make some tests less flaky and panic less * Do the actual fix * Add stricter validation and some normalization code for API Gateway ConfigEntries (#16304) * Add stricter validation and some normalization code for API Gateway ConfigEntries * ISSUE TEMPLATE: update issue templates to include comments instead of inline text for instructions (#16313) * Update bug_report.md * Update feature_request.md * Update ui_issues.md * Update pull_request_template.md * [OSS] security: update go to 1.20.1 (#16263) * security: update go to 1.20.1 * Protobuf Refactoring for Multi-Module Cleanliness (#16302) Protobuf Refactoring for Multi-Module Cleanliness This commit includes the following: Moves all packages that were within proto/ to proto/private Rewrites imports to account for the packages being moved Adds in buf.work.yaml to enable buf workspaces Names the proto-public buf module so that we can override the Go package imports within proto/buf.yaml Bumps the buf version dependency to 1.14.0 (I was trying out the version to see if it would get around an issue - it didn't but it also doesn't break things and it seemed best to keep up with the toolchain changes) Why: In the future we will need to consume other protobuf dependencies such as the Google HTTP annotations for openapi generation or grpc-gateway usage. There were some recent changes to have our own ratelimiting annotations. The two combined were not working when I was trying to use them together (attempting to rebase another branch) Buf workspaces should be the solution to the problem Buf workspaces means that each module will have generated Go code that embeds proto file names relative to the proto dir and not the top level repo root. 
This resulted in proto file name conflicts in the Go global protobuf type registry. The solution to that was to add in a private/ directory into the path within the proto/ directory. That then required rewriting all the imports. Is this safe? AFAICT yes The gRPC wire protocol doesn't seem to care about the proto file names (although the Go grpc code does tack on the proto file name as Metadata in the ServiceDesc) Other than imports, there were no changes to any generated code as a result of this. * new docs for consul and consul-k8s troubleshoot command (#16284) * new docs for consul and consul-k8s troubleshoot command * add changelog * add troubleshoot command * address comments, and update cli output to match * revert changes to troubleshoot upstreams, changes will happen in separate pr * Update .changelog/16284.txt Co-authored-by: Nitya Dhanushkodi <[email protected]> * address comments * update trouble proxy output * add missing s, add required fields in usage --------- Co-authored-by: Nitya Dhanushkodi <[email protected]> * Normalize all API Gateway references (#16316) * Fix HTTPRoute and TCPRoute expectation for enterprise metadata (#16322) * ISSUE_TEMPLATE: formatting for comments (#16325) * Update all templates. * fix: revert go mod compat for sdk,api to 1.19 (#16323) * fix: add tls config to unix socket when https is used (#16301) * fix: add tls config to unix socket when https is used * unit test and changelog * fix flakieness (#16338) * chore: document and unit test sdk/testutil/retry (#16049) * [API Gateway] Validate listener name is not empty (#16340) * [API Gateway] Validate listener name is not empty * Update docstrings and test * Fix issue with peer services incorrectly appearing as connect-enabled. (#16339) Prior to this commit, all peer services were transmitted as connect-enabled as long as a one or more mesh-gateways were healthy. With this change, there is now a difference between typical services and connect services transmitted via peering. 
A service will be reported as "connect-enabled" as long as any of these conditions are met: 1. a connect-proxy sidecar is registered for the service name. 2. a connect-native instance of the service is registered. 3. a service resolver / splitter / router is registered for the service name. 4. a terminating gateway has registered the service. * [API Gateway] Turn down controller log levels (#16348) * [API Gateway] Fix targeting service splitters in HTTPRoutes (#16350) * [API Gateway] Fix targeting service splitters in HTTPRoutes * Fix test description * [API Gateway] Various fixes for Config Entry fields (#16347) * [API Gateway] Various fixes for Config Entry fields * simplify logic per PR review * upgrade test: splitter and resolver config entry in peered cluster (#16356) * Upgrade Alpine image to 3.17 (#16358) * Update existing docs from Consul API Gateway -> API Gateway for Kubernetes (#16360) * Update existing docs from Consul API Gateway -> API Gateway for Kubernetes * Update page header to reflect page title change * Update nav title to match new page title * initial code (#16296) * Add changelog entry for API Gateway (Beta) (#16369) * Placeholder commit for changelog entry * Add changelog entry announcing support for API Gateway on VMs * Adjust casing * [API Gateway] Fix infinite loop in controller and binding non-accepted routes and gateways (#16377) * Rate limiter/add ip prefix (#16342) * add support for prefixes in the config tree * fix to use default config when the prefix have no config * Documentation update: Adding K8S clusters to external Consul servers (#16285) * Remove Consul Client installation option With Consul-K8S 1.0 and introduction of Consul-Dataplane, K8S has the option to run without running Consul Client agents. 
* remove note referring to the same documentation * Added instructions on the use of httpsPort when servers are not running TLS enabled * Modified titile and description * Add docs for usage endpoint and command (#16258) * Add docs for usage endpoint and command * NET-2285: Assert total number of expected instances by Consul (#16371) * set BRANCH_NAME to release-1.15.x (#16374) * Docs/rate limiting 1.15 (#16345) * Added rate limit section to agent overview, updated headings per style guide * added GTRL section and overview * added usage docs for rate limiting 1.15 * added file for initializing rate limits * added steps for initializing rate limits * updated descriptions for rate_limits in agent conf * updated rate limiter-related metrics * tweaks to agent index * Apply suggestions from code review Co-authored-by: Dhia Ayachi <[email protected]> Co-authored-by: Krastin Krastev <[email protected]> * Apply suggestions from code review Co-authored-by: Krastin Krastev <[email protected]> * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Jeff Boruszak <[email protected]> --------- Co-authored-by: Dhia Ayachi <[email protected]> Co-authored-by: Krastin Krastev <[email protected]> Co-authored-by: Jeff Boruszak <[email protected]> * [UI] CC-4031: change from Action, a and button to hds::Button (#16251) * Correct WAL metrics registrations (#16388) * chore: remove stable-website (#16386) * Refactor the disco chain -> xds logic (#16392) * Add envoy extension docs (#16376) * Add envoy extension docs * Update message about envoy extensions with proxy defaults * fix tab error * Update website/content/docs/connect/proxies/envoy-extensions/usage/lua.mdx * fix operator prerender issue * Apply suggestions from code review Co-authored-by: trujillo-adam <[email protected]> * update envoyextension warning in proxy defaults so its inline * Update website/content/docs/connect/proxies/envoy-extensions/index.mdx --------- Co-authored-by: trujillo-adam 
<[email protected]> * upgrade test: peering with resolver and failover (#16391) * Troubleshoot service to service comms (#16385) * Troubleshoot service to service comms * adjustments * breaking fix * api-docs breaking fix * Links added to CLI pages * Update website/content/docs/troubleshoot/troubleshoot-services.mdx Co-authored-by: Eric Haberkorn <[email protected]> * Update website/content/docs/troubleshoot/troubleshoot-services.mdx Co-authored-by: Tu Nguyen <[email protected]> * Update website/content/docs/troubleshoot/troubleshoot-services.mdx Co-authored-by: Tu Nguyen <[email protected]> * nav re-ordering * Edits recommended in code review --------- Co-authored-by: Eric Haberkorn <[email protected]> Co-authored-by: Tu Nguyen <[email protected]> * Docs/cluster peering 1.15 updates (#16291) * initial commit * initial commit * Overview updates * Overview page improvements * More Overview improvements * improvements * Small fixes/updates * Updates * Overview updates * Nav data * More nav updates * Fix * updates * Updates + tip test * Directory test * refining * Create restructure w/ k8s * Single usage page * Technical Specification * k8s pages * typo * L7 traffic management * Manage connections * k8s page fix * Create page tab corrections * link to k8s * intentions * corrections * Add-on intention descriptions * adjustments * Missing </CodeTabs> * Diagram improvements * Final diagram update * Apply suggestions from code review Co-authored-by: trujillo-adam <[email protected]> Co-authored-by: Tu Nguyen <[email protected]> Co-authored-by: David Yu <[email protected]> * diagram name fix * Fixes * Updates to index.mdx * Tech specs page corrections * Tech specs page rename * update link to tech specs * K8s - new pages + tech specs * k8s - manage peering connections * k8s L7 traffic management * Separated establish connection pages * Directory fixes * Usage clean up * k8s docs edits * Updated nav data * CodeBlock Component fix * filename * CodeBlockConfig removal * 
Redirects * Update k8s filenames * Reshuffle k8s tech specs for clarity, fmt yaml files * Update general cluster peering docs, reorder CLI > API > UI, cross link to kubernetes * Fix config rendering in k8s usage docs, cross link to general usage from k8s docs * fix legacy link * update k8s docs * fix nested list rendering * redirect fix * page error --------- Co-authored-by: trujillo-adam <[email protected]> Co-authored-by: Tu Nguyen <[email protected]> Co-authored-by: David Yu <[email protected]> Co-authored-by: Tu Nguyen <[email protected]> * Fix rendering error on new operator usage docs (#16393) * add missing field to oss struct (#16401) * fix(docs): correct rate limit metrics (#16400) * Fix various flaky tests (#16396) * Native API Gateway Docs (#16365) * Create empty files * Copy over content for overview * Copy over content for usage * Copy over content for api-gateway config * Copy over content for http-route config * Copy over content for tcp-route config * Copy over content for inline-certificate config * Add docs to the sidebar * Clean up overview. 
Start cleaning up usage * Add BETA badge to API Gateways portion of nav * Fix header * Fix up usage * Fix up API Gateway config * Update paths to be consistent w/ other gateway docs * Fix up http-route * Fix up inline-certificate * rename path * Fix up tcp-route * Add CodeTabs * Add headers to config pages * Fix configuration model for http route and inline certificate * Add version callout to API gateway overview page * Fix values for inline certificate * Fix values for api gateway configuration * Fix values for TCP Route config * Fix values for HTTP Route config * Adds link from k8s gateway to vm gateway page * Remove versioning warning * Serve overview page at ../api-gateway, consistent w/ mesh-gateway * Remove weight field from tcp-route docs * Linking to usage instead of overview from k8s api-gateway to vm api-gateway * Fix issues in usage page * Fix links in usage * Capitalize Kubernetes * Apply suggestions from code review Co-authored-by: trujillo-adam <[email protected]> * remove optional callout * Apply suggestions from code review Co-authored-by: trujillo-adam <[email protected]> * Apply suggestions from code review Co-authored-by: trujillo-adam <[email protected]> * Apply suggestions from code review * Update website/content/docs/connect/gateways/api-gateway/configuration/api-gateway.mdx * Fix formatting of Hostnames * Update website/content/docs/api-gateway/index.mdx * Update website/content/docs/connect/gateways/api-gateway/configuration/http-route.mdx Co-authored-by: Andrew Stucki <[email protected]> * Add cross-linking of config entries * Fix rendering error on new operator usage docs * Update website/content/docs/connect/gateways/api-gateway/configuration/http-route.mdx Co-authored-by: trujillo-adam <[email protected]> * Update website/content/docs/connect/gateways/api-gateway/configuration/http-route.mdx Co-authored-by: trujillo-adam <[email protected]> * Apply suggestions from code review * Apply suggestions from code review * Add BETA badges to 
config entry links * http route updates * Add Enterprise keys * Use map instead of list for meta field, use consistent formatting * Convert spaces to tabs * Add all Enterprise info to TCP Route * Use pascal case for JSON api-gateway example * Add enterprise to HCL api-gw cfg * Use pascal case for missed JSON config fields * Add enterprise to JSON api-gw cfg * Add enterprise to api-gw values * adds enterprise to http route * Update website/content/docs/connect/gateways/api-gateway/index.mdx Co-authored-by: danielehc <[email protected]> * Add enterprise to api-gw spec * Add missing namespace, partition + meta to specification * fixes for http route * Fix ordering of API Gatetway cfg spec items * whitespace * Add linking of values to tcp * Apply suggestions from code review Co-authored-by: Jeff Boruszak <[email protected]> * Fix comma in wrong place * Apply suggestions from code review Co-authored-by: Jeff Boruszak <[email protected]> * Move Certificates down * Apply suggestions from code review Co-authored-by: Jeff Boruszak <[email protected]> * Tabs to spaces in httproute * Use configuration entry instead of config entry * Fix indentations on api-gateway and tcp-route * Add whitespace between code block and prose * Apply suggestions from code review Co-authored-by: trujillo-adam <[email protected]> * adds <> to http route --------- Co-authored-by: Nathan Coleman <[email protected]> Co-authored-by: Melisa Griffin <[email protected]> Co-authored-by: Tu Nguyen <[email protected]> Co-authored-by: trujillo-adam <[email protected]> Co-authored-by: Tu Nguyen <[email protected]> Co-authored-by: Melisa Griffin <[email protected]> Co-authored-by: Andrew Stucki <[email protected]> Co-authored-by: danielehc <[email protected]> Co-authored-by: Jeff Boruszak <[email protected]> * NET-2286: Add tests to verify traffic redirects between services (#16390) * Try DRYing up createCluster in integration tests (#16199) * add back staging bits (#16411) * Fix a couple inconsistencies in 
`operator usage instances` command (#16260) * NO_JIRA: refactor validate function in traffic mgt tests (#16422) * Basic gobased API gateway spinup test (#16278) * wip, proof of concept, gateway service being registered, don't know how to hit it * checkpoint * Fix up API Gateway go tests (#16297) * checkpoint, getting InvalidDiscoveryChain route protocol does not match targeted service protocol * checkpoint * httproute hittable * tests working, one header test failing * differentiate services by status code, minor cleanup * working tests * updated GetPort interface * fix getport --------- Co-authored-by: Andrew Stucki <[email protected]> * Fix attempt for test fail panics in xDS (#16319) * Fix attempt for test fail panics in xDS * switch to a mutex pointer * update changelog (#16426) * update changelog * fix changelog formatting * feat: update alerts to Hds::Alert component (CC-4035) (#16412) * fix: ui tests run is fixed (applying class attribute twice to the hbs element caused the issue (#16428) * Refactor and move wal docs (#16387) * Add WAL documentation. 
Also fix some minor metrics registration details * Add tests to verify metrics are registered correctly * refactor and move wal docs * Updates to the WAL overview page * updates to enable WAL usage topic * updates to the monitoring WAL backend topic * updates for revert WAL topic * a few tweaks to overview and udpated metadescriptions * Apply suggestions from code review Co-authored-by: Paul Banks <[email protected]> * make revert docs consistent with enable * Apply suggestions from code review Co-authored-by: Paul Banks <[email protected]> * address feedback * address final feedback * Apply suggestions from code review Co-authored-by: Jeff Boruszak <[email protected]> --------- Co-authored-by: Paul Banks <[email protected]> Co-authored-by: trujillo-adam <[email protected]> Co-authored-by: trujillo-adam <[email protected]> Co-authored-by: Jeff Boruszak <[email protected]> * UI: Update Consul UI colors to use HDS colors (#16111) * update red color variables to hds * change background red to be one step lighter * map oranges * map greens * map blues * map greys * delete themes, colours: lemon, magenta, strawberry, and vault color aliases * add unmapped rainbow colours * replace white and transparent vars, remove unused semantic vars and frame placeholders * small tweaks to improve contrast, change node health status x/check colours for non-voters to match design doc, replace semantic colour action w hds colour * add unmapped grays, remove dark theme, manually set nav bar to use dark colours * map consul pink colour * map yellows * add unmapped oranges, delete light theme * remove readme, base variables, clean up dangling colours * Start working on the nav disclosure menus * Update main-nav-horizontal dropdowns * Format template * Update box-shadow tokens * Replace --tone- usage with tokens * Update nav disabled state and panel border colour * Replace rgb usage on tile * Fix permissions modal overlay * More fixes * Replace orange-500 with amber-200 * Update badge 
colors * Update vertical sidebar colors * Remove top border on consul peer list ul --------- Co-authored-by: wenincode <[email protected]> * Add missing link (#16437) * docs: remove extra whitespace in frontmatter (#16436) * Delete Vagrantfile (#16442) * upgrade test: consolidate resolver test cases (#16443) * UI: Fix rendering issue in search and lists (#16444) * Upgrade ember-cli-string-helpers * add extra lock change * Update docs for consul-k8s 1.1.0 (#16447) * Update ingress-gateways.mdx (#16330) * Update ingress-gateways.mdx Added an example of running the HELM install for the ingress gateways using values.yaml * Apply suggestions from code review * Update ingress-gateways.mdx Adds closing back ticks on example command. The suggesting UI strips them out. --------- Co-authored-by: trujillo-adam <[email protected]> * grpc: fix data race in balancer registration (#16229) Registering gRPC balancers is thread-unsafe because they are stored in a global map variable that is accessed without holding a lock. Therefore, it's expected that balancers are registered _once_ at the beginning of your program (e.g. in a package `init` function) and certainly not after you've started dialing connections, etc. > NOTE: this function must only be called during initialization time > (i.e. in an init() function), and is not thread-safe. While this is fine for us in production, it's challenging for tests that spin up multiple agents in-memory. We currently register a balancer per- agent which holds agent-specific state that cannot safely be shared. This commit introduces our own registry that _is_ thread-safe, and implements the Builder interface such that we can call gRPC's `Register` method once, on start-up. It uses the same pattern as our resolver registry where we use the dial target's host (aka "authority"), which is unique per-agent, to determine which builder to use. 
* cli: ensure acl token read -self works (#16445) Fixes a regression in #16044 The consul acl token read -self cli command should not require an -accessor-id because typically the persona invoking this would not already know the accessor id of their own token. * docs: Add backwards compatibility for Consul 1.14.x and consul-dataplane in the Envoy compat matrix (#16462) * Update envoy.mdx * gateways: add e2e test for API Gateway HTTPRoute ParentRef change (#16408) * test(gateways): add API Gateway HTTPRoute ParentRef change test * test(gateways): add checkRouteError helper * test(gateways): remove EOF check in CI this seems to sometimes be 'connection reset by peer' instead * Update test/integration/consul-container/test/gateways/http_route_test.go * Gateway Test HTTPPathRewrite (#16418) * add http url path rewrite * add Mike's test back in * update kind to use api.APIGateway * cli: remove stray whitespace when loading the consul version from the VERSION file (#16467) Fixes a regression from #15631 in the output of `consul version` from: Consul v1.16.0-dev +ent Revision 56b86acbe5+CHANGES to Consul v1.16.0-dev+ent Revision 56b86acbe5+CHANGES * Docs/services refactor docs day 122022 (#16103) * converted main services page to services overview page * set up services usage dirs * added Define Services usage page * converted health checks everything page to Define Health Checks usage page * added Register Services and Nodes usage page * converted Query with DNS to Discover Services and Nodes Overview page * added Configure DNS Behavior usage page * added Enable Static DNS Lookups usage page * added the Enable Dynamic Queries DNS Queries usage page * added the Configuration dir and overview page - may not need the overview, tho * fixed the nav from previous commit * added the Services Configuration Reference page * added Health Checks Configuration Reference page * updated service defaults configuraiton entry to new configuration ref format * fixed some bad links found 
by checker * more bad links found by checker * another bad link found by checker * converted main services page to services overview page * set up services usage dirs * added Define Services usage page * converted health checks everything page to Define Health Checks usage page * added Register Services and Nodes usage page * converted Query with DNS to Discover Services and Nodes Overview page * added Configure DNS Behavior usage page * added Enable Static DNS Lookups usage page * added the Enable Dynamic Queries DNS Queries usage page * added the Configuration dir and overview page - may not need the overview, tho * fixed the nav from previous commit * added the Services Configuration Reference page * added Health Checks Configuration Reference page * updated service defaults configuraiton entry to new configuration ref format * fixed some bad links found by checker * more bad links found by checker * another bad link found by checker * fixed cross-links between new topics * updated links to the new services pages * fixed bad links in scale file * tweaks to titles and phrasing * fixed typo in checks.mdx * started updating the conf ref to latest template * update SD conf ref to match latest CT standard * Apply suggestions from code review Co-authored-by: Eddie Rowe <[email protected]> * remove previous version of the checks page * fixed cross-links * Apply suggestions from code review Co-authored-by: Eddie Rowe <[email protected]> --------- Co-authored-by: Eddie Rowe <[email protected]> * docs: clarify license expiration upgrade behavior (#16464) * add provider ca auth-method support for azure Does the required dance with the local HTTP endpoint to get the required data for the jwt based auth setup in Azure. Keeps support for 'legacy' mode where all login data is passed on via the auth methods parameters. Refactored check for hardcoded /login fields. 
* Changed titles for services pages to sentence style cap (#16477) * Changed titles for services pages to sentence style cap * missed a meta title * docs: Consul 1.15.0 and Consul K8s 1.0 release notes (#16481) * add new release notes --------- Co-authored-by: Tu Nguyen <[email protected]> * fix (cli): return error msg if acl policy not found (#16485) * fix: return error msg if acl policy not found * changelog * add test * update services nav titles (#16484) * Improve ux to help users avoid overwriting fields of ACL tokens, roles and policies (#16288) * Deprecate merge-policies and add options add-policy-name/add-policy-id to improve CLI token update command * deprecate merge-roles fields * Fix potential flakey tests and update ux to remove 'completely' + typo fixes * NET-2292: port ingress-gateway test case "http" from BATS addendum (#16490) * docs: Update release notes with Envoy compat issue (#16494) * Update v1_15_x.mdx --------- Co-authored-by: Tu Nguyen <[email protected]> * Suppress AlreadyRegisteredError to fix test retries (#16501) * Suppress AlreadyRegisteredError to fix test retries * Remove duplicate sink * Speed up test by registering services concurrently (#16509) * add provider ca support for jwt file base auth Adds support for a jwt token in a file. Simply reads the file and sends the read in jwt along to the vault login. It also supports a legacy mode with the jwt string being passed directly. In which case the path is made optional. * docs(architecture): remove merge conflict leftovers (#16507) * add provider ca auth support for kubernetes Adds support for Kubernetes jwt/token file based auth. Only needs to read the file and save the contents as the jwt/token. 
* Merge pull request #4538 from hashicorp/NET-2396 (#16516) NET-2396: refactor test to reduce duplication * Merge pull request #4584 from hashicorp/refactor_cluster_config (#16517) NET-2841: PART 1 - refactor NewPeeringCluster to support custom config * Add ServiceResolver RequestTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable (#16495) * Leverage ServiceResolver ConnectTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable * Regenerate golden files * Add RequestTimeout field * Add changelog entry * Fix issue where terminating gateway service resolvers weren't properly cleaned up (#16498) * Fix issue where terminating gateway service resolvers weren't properly cleaned up * Add integration test for cleaning up resolvers * Add changelog entry * Use state test and drop integration test * Add support for failover policies (#16505) * modified unsupported envoy version error (#16518) - When an envoy version is out of a supported range, we now return the envoy version being used as `major.minor.x` to indicate that it is the minor version at most that is incompatible - When an envoy version is in the list of unsupported envoy versions we return back the envoy version in the error message as `major.minor.patch` as now the exact version matters. * Remove private prefix from proto-gen-rpc-glue e2e test (#16433) * Fix resolution of service resolvers with subsets for external upstreams (#16499) * Fix resolution of service resolvers with subsets for external upstreams * Add tests * Add changelog entry * Update view filter logic * fixed broken links associated with cluster peering updates (#16523) * fixed broken links associated with cluster peering updates * additional links to fix * typos * fixed redirect file * add provider ca support for approle auth-method Adds support for the approle auth-method. 
Only handles using the approle role/secret to auth and it doesn't support the agent's extra management configuration options (wrap and delete after read) as they are not required as part of the auth (ie. they are vault agent things). * update connect/ca's vault AuthMethod conf section (#16346) Updated Params field to re-frame as supporting arguments specific to the supported vault-agent auth-auth methods with links to each methods "#configuration" section. Included a call out limits on parameters supported. * proxycfg: ensure that an irrecoverable error in proxycfg closes the xds session and triggers a replacement proxycfg watcher (#16497) Receiving an "acl not found" error from an RPC in the agent cache and the streaming/event components will cause any request loops to cease under the assumption that they will never work again if the token was destroyed. This prevents log spam (#14144, #9738). Unfortunately due to things like: - authz requests going to stale servers that may not have witnessed the token creation yet - authz requests in a secondary datacenter happening before the tokens get replicated to that datacenter - authz requests from a primary TO a secondary datacenter happening before the tokens get replicated to that datacenter The caller will get an "acl not found" *before* the token exists, rather than just after. The machinery added above in the linked PRs will kick in and prevent the request loop from looping around again once the tokens actually exist. For `consul-dataplane` usages, where xDS is served by the Consul servers rather than the clients ultimately this is not a problem because in that scenario the `agent/proxycfg` machinery is on-demand and launched by a new xDS stream needing data for a specific service in the catalog. If the watching goroutines are terminated it ripples down and terminates the xDS stream, which CDP will eventually re-establish and restart everything. 
For Consul client usages, the `agent/proxycfg` machinery is launched ahead of time at service registration time (called "local" in some of the proxycfg machinery) so when the xDS stream comes in the data is already ready to go. If the watching goroutines terminate it should terminate the xDS stream, but there's no mechanism to re-spawn the watching goroutines. If the xDS stream reconnects it will see no `ConfigSnapshot` and will not get one again until the client agent is restarted, or the service is re-registered with something changed in it. This PR fixes a few things in the machinery: - there was an inadvertent deadlock in fetching snapshots from the proxycfg machinery by xDS, such that when the watching goroutine terminated the snapshots would never be fetched. This caused some of the xDS machinery to get indefinitely paused and not finish the teardown properly. - Every 30s we now attempt to re-insert all locally registered services into the proxycfg machinery. - When services are re-inserted into the proxycfg machinery we special-case "dead" ones such that we unilaterally replace them rather than doing that conditionally. 
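The re-insertion behavior described in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual `agent/proxycfg` code: the `Manager` and `watcher` types and the `Register`, `MarkDead`, `SyncOnce`, and `Run` names are hypothetical. It only shows the idea that a periodic sync leaves live watchers alone and replaces dead ones unconditionally.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// watcher is a hypothetical stand-in for the per-service watch state.
type watcher struct {
	generation int
	dead       bool
}

// Manager is a hypothetical stand-in for the component tracking watchers.
type Manager struct {
	mu       sync.Mutex
	watchers map[string]*watcher
}

func NewManager() *Manager {
	return &Manager{watchers: make(map[string]*watcher)}
}

// Register starts a watcher for serviceID. A live watcher is left untouched;
// a missing or dead one is (re)created. It reports whether anything changed.
func (m *Manager) Register(serviceID string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	w, ok := m.watchers[serviceID]
	if ok && !w.dead {
		return false
	}
	gen := 1
	if ok {
		gen = w.generation + 1
	}
	m.watchers[serviceID] = &watcher{generation: gen}
	return true
}

// MarkDead simulates a watching goroutine terminating on a fatal error.
func (m *Manager) MarkDead(serviceID string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if w, ok := m.watchers[serviceID]; ok {
		w.dead = true
	}
}

// SyncOnce re-inserts every locally registered service and returns how many
// watchers were started or replaced.
func (m *Manager) SyncOnce(local []string) int {
	changed := 0
	for _, id := range local {
		if m.Register(id) {
			changed++
		}
	}
	return changed
}

// Run calls SyncOnce on every tick until ctx is cancelled. The commit message
// mentions a 30s period; the caller chooses the interval here.
func (m *Manager) Run(ctx context.Context, interval time.Duration, local func() []string) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			m.SyncOnce(local())
		}
	}
}

func main() {
	m := NewManager()
	local := []string{"web", "api"}
	fmt.Println("started:", m.SyncOnce(local))
	m.MarkDead("web")
	fmt.Println("replaced:", m.SyncOnce(local))
	fmt.Println("unchanged:", m.SyncOnce(local))
}
```

In this sketch `main` calls `SyncOnce` directly so it finishes immediately; a long-running agent would instead start `Run` in a goroutine with a 30-second interval.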
* NET-2903 Normalize weight for http routes (#16512) * NET-2903 Normalize weight for http routes * Update website/content/docs/connect/gateways/api-gateway/configuration/http-route.mdx Co-authored-by: trujillo-adam <[email protected]> * Add some basic UI improvements for api-gateway services (#16508) * Add some basic ui improvements for api-gateway services * Add changelog entry * Use ternary for null check * Update gateway doc links * rename changelog entry for new PR * Fix test * fixes empty link in DNS usage page (#16534) * NET-2904 Fixes API Gateway Route Service Weight Division Error * Improve ux around ACL token to help users avoid overwriting node/service identities (#16506) * Deprecate merge-node-identities and merge-service-identities flags * added tests for node identities changes * added changelog file and docs * Follow-up fixes to consul connect envoy command (#16530) * Merge pull request #4573 from hashicorp/NET-2841 (#16544) * Merge pull request #4573 from hashicorp/NET-2841 NET-2841: PART 2 refactor upgrade tests to include version 1.15 * update upgrade versions * upgrade test: discovery chain across partition (#16543) * Update the consul-k8s cli docs for the new `proxy log` subcommand (#16458) * Update the consul-k8s cli docs for the new `proxy log` subcommand * Updated consul-k8s docs from PR feedback * Added proxy log command to release notes * Delete test-link-rewrites.yml (#16546) * feat: update notification to use hds toast component (#16519) * Fix flakey tests related to ACL token updates (#16545) * Fix flakey tests related to ACL token updates * update all acl token update tests * extra create_token function to its own thing * support vault auth config for alicloud ca provider Add support for using existing vault auto-auth configurations as the provider configuration when using Vault's CA provider with AliCloud. AliCloud requires 2 extra fields to enable it to use STS (it's preferred auth setup). 
Our vault-plugin-auth-alicloud package contained a method to help generate them as they require you to make an http call to a faked endpoint proxy to get them (url and headers base64 encoded). * Update docs to reflect functionality (#16549) * Update docs to reflect functionality * make consistent with other client runtimes * upgrade test: use retry with ModifyIndex and remove ent test file (#16553) * add agent locality and replicate it across peer streams (#16522) * docs: Document config entry permissions (#16556) * Broken link fixes (#16566) * NET-2954: Improve integration tests CI execution time (#16565) * NET-2954: Improve integration tests CI execution time * fix ci * remove comments and modify config file * fix bug that can lead to peering service deletes impacting the state of local services (#16570) * Update changelog with patch releases (#16576) * Bump submodules from latest 1.15.1 patch release (#16578) * Update changelog with Consul patch releases 1.13.7, 1.14.5, 1.15.1 * Bump submodules from latest patch release * Forgot one * website: adds content-check command and README update (#16579) * added a backport-checker GitHub action (#16567) * added a backport-checker GitHub action * Update .github/workflows/backport-checker.yml * auto-updated agent/uiserver/dist/ from commit 63204b518 (#16587) Co-authored-by: hc-github-team-consul-core <[email protected]> * GRPC stub for the ResourceService (#16528) * UI: Fix htmlsafe errors throughout the app (#16574) * Upgrade ember-intl * Add changelog * Add yarn lock * Add namespace file with build tag for OSS gateway tests (#16590) * Add namespace file with build tag for OSS tests * Remove TODO comment * JIRA pr check: Filter out OSS/ENT merges (#16593) * jira pr check filter out dependabot and oss/ent merges * allow setting locality on services and nodes (#16581) * Add Peer Locality to Discovery Chains (#16588) Add peer locality to discovery chains * fixes for unsupported partitions field in CRD metadata block 
(#16604) * fixes for unsupported partitions field in CRD metadata block * Apply suggestions from code review Co-authored-by: Luke Kysow <[email protected]> --------- Co-authored-by: Luke Kysow <[email protected]> * Create a weekly 404 checker for all Consul docs content (#16603) * Consul WAN Fed with Vault Secrets Backend document updates (#16597) * Consul WAN Fed with Vault Secrets Backend document updates * Corrected dc1-consul.yaml and dc2-consul.yaml file highlights * Update website/content/docs/k8s/deployment-configurations/vault/wan-federation.mdx Co-authored-by: trujillo-adam <[email protected]> * Update website/content/docs/k8s/deployment-configurations/vault/wan-federation.mdx Co-authored-by: trujillo-adam <[email protected]> --------- Co-authored-by: trujillo-adam <[email protected]> * Allow HCP metrics collection for Envoy proxies Co-authored-by: Ashvitha Sridharan <[email protected]> Co-authored-by: Freddy <[email protected]> Add a new envoy flag: "envoy_hcp_metrics_bind_socket_dir", a directory where a unix socket will be created with the name `<namespace>_<proxy_id>.sock` to forward Envoy metrics. If set, this will configure: - In bootstrap configuration a local stats_sink and static cluster. These will forward metrics to a loopback listener sent over xDS. - A dynamic listener listening at the socket path that the previously defined static cluster is sending metrics to. - A dynamic cluster that will forward traffic received at this listener to the hcp-metrics-collector service. Reasons for having a static cluster pointing at a dynamic listener: - We want to secure the metrics stream using TLS, but the stats sink can only be defined in bootstrap config. With dynamic listeners/clusters we can use the proxy's leaf certificate issued by the Connect CA, which isn't available at bootstrap time. - We want to intelligently route to the HCP collector. Configuring its addreess at bootstrap time limits our flexibility routing-wise. More on this below. 
Reasons for defining the collector as an upstream in `proxycfg`: - The HCP collector will be deployed as a mesh service. - Certificate management is taken care of, as mentioned above. - Service discovery and routing logic is automatically taken care of, meaning that no code changes are required in the xds package. - Custom routing rules can be added for the collector using discovery chain config entries. Initially the collector is expected to be deployed to each admin partition, but in the future could be deployed centrally in the default partition. These config entries could even be managed by HCP itself. * Add copywrite setup file (#16602) * Add sameness-group configuration entry. (#16608) This commit adds a sameness-group config entry to the API and structs packages. It includes some validation logic and a new memdb index that tracks the default sameness-group for each partition. Sameness groups will simplify the effort of managing failovers / intentions / exports for peers and partitions. Note that this change purely to introduce the configuration entry and does not include the full functionality of sameness-groups. * Preserve CARoots when updating Vault CA configuration (#16592) If a CA config update did not cause a root change, the codepath would return early and skip some steps which preserve its intermediate certificates and signing key ID. This commit re-orders some code and prevents updates from generating new intermediate certificates. * Add UI copyright headers files (#16614) * Add copyright headers to UI files * Ensure copywrite file ignores external libs * Docs discovery typo (#16628) * docs(discovery): typo * docs(discovery): EOF and trim lines --------- Co-authored-by: trujillo-adam <[email protected]> * Fix issue with trust bundle read ACL check. (#16630) This commit fixes an issue where trust bundles could not be read by services in a non-default namespace, unless they had excessive ACL permissions given to them. 
Prior to this change, `service:write` was required in the default namespace in order to read the trust bundle. Now, `service:write` to a service in any namespace is sufficient. * Basic resource type registry (#16622) * Backport ENT-4704 (#16612) * feat: update typography to consume hds styles (#16577) * Add known issues to Raft WAL docs. (#16600) * Add known issues to Raft WAL docs. * Refactor update based on review feedback * Tune 404 checker to exclude false-positives and use intended file path (#16636) * Update e2e tests for namespaces (#16627) * Refactored "NewGatewayService" to handle namespaces, fixed TestHTTPRouteFlattening test * Fixed existing http_route tests for namespacing * Squash aclEnterpriseMeta for ResourceRefs and HTTPServices, accept namespace for creating connect services and regular services * Use require instead of assert after creating namespaces in http_route_tests * Refactor NewConnectService and NewGatewayService functions to use cfg objects to reduce number of method args * Rename field on SidecarConfig in tests from `SidecarServiceName` to `Name` to avoid stutter * net 2731 ip config entry OSS version (#16642) * ip config entry * name changing * move to ent * ent version * renaming * change format * renaming * refactor * add default values * fix confusing spiffe ids in golden tests (#16643) * First cluster grpc service should be NodePort for the second cluster to connect (#16430) * First cluster grpc service should be NodePort This is based on the issue opened here https://github.com/hashicorp/consul-k8s/issues/1903 If you follow the documentation https://developer.hashicorp.com/consul/docs/k8s/deployment-configurations/single-dc-multi-k8s exactly as it is, the first cluster will only create the consul UI service on NodePort but not the rest of the services (including for grpc). By default, from the helm chart, they are created as headless services by setting clusterIP None. 
This prevents the second cluster from discovering the Consul server on the first cluster over gRPC, because it cannot connect through the default gRPC port 8502, and it ends in the error shown in the issue (https://github.com/hashicorp/consul-k8s/issues/1903). As a solution, the gRPC service should be exposed using NodePort (or LoadBalancer). I added the required changes to both cluster1-values.yaml and cluster2-values.yaml, along with a description of those changes so that users can understand them. Kindly review; I hope this PR will be accepted. * Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx Co-authored-by: trujillo-adam <[email protected]> * Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx Co-authored-by: trujillo-adam <[email protected]> * Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx Co-authored-by: trujillo-adam <[email protected]> --------- Co-authored-by: trujillo-adam <[email protected]> * Add in query options for catalog service existing in a specific (#16652) namespace when creating service for tests * fix: add AccessorID property to PUT token request (#16660) * add sameness group support to service resolver failover and redirects (#16664) * Fix incorrect links on Envoy extensions documentation (#16666) * [API Gateway] Fix invalid cluster causing gateway programming delay (#16661) * Add test for http routes * Add fix * Fix tests * Add changelog entry * Refactor and fix flaky tests * Bump tomhjp/gh-action-jira-search from 0.2.1 to 0.2.2 (#16667) Bumps [tomhjp/gh-action-jira-search](https://github.com/tomhjp/gh-action-jira-search) from 0.2.1 to 0.2.2. 
- [Release notes](https://github.com/tomhjp/gh-action-jira-search/releases) - [Commits](https://github.com/tomhjp/gh-action-jira-search/compare/v0.2.1...v0.2.2) --- updated-dependencies: - dependency-name: tomhjp/gh-action-jira-search dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * build(deps): bump atlassian/gajira-transition from 2.0.1 to 3.0.1 (#15921) Bumps [atlassian/gajira-transition](https://github.com/atlassian/gajira-transition) from 2.0.1 to 3.0.1. - [Release notes](https://github.com/atlassian/gajira-transition/releases) - [Commits](https://github.com/atlassian/gajira-transition/compare/v2.0.1...v3.0.1) --- updated-dependencies: - dependency-name: atlassian/gajira-transition dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: David Yu <[email protected]> * Snapshot restore tests (#16647) * add snapshot restore test * add logstore as test parameter * Use the correct image version * make sure we read the logs from a followers to test the follower snapshot install path. * update to raf-wal v0.3.0 * add changelog. * updating changelog for bug description and removed integration test. 
* setting up test container builder to only set logStore for 1.15 and higher --------- Co-authored-by: Paul Banks <[email protected]> Co-authored-by: John Murret <[email protected]> * add sameness groups to discovery chains (#16671) * feat: add category annotation to RPC and gRPC methods (#16646) * Update GH actions to create Jira issue automatically (#16656) * Adds check to verify that the API Gateway is being created with at least one listener * Fix route subscription when using namespaces (#16677) * Fix route subscription when using namespaces * Update changelog * Fix changelog entry to reference that the bug was enterprise only * peering: peering partition failover fixes (#16673) add local source partition for peered upstreams * fix jira sync actions, remove custom fields (#16686) * Docs/update jira sync pr issue (#16688) * fix jira sync actions, remove custom fields * remove more additional fields, debug * Docs: Jira sync Update issuetype to bug (#16689) * update issuetype to bug * fix conditional for pr edu * build(deps): bump tomhjp/gh-action-jira-create from 0.2.0 to 0.2.1 (#16685) Bumps [tomhjp/gh-action-jira-create](https://github.com/tomhjp/gh-action-jira-create) from 0.2.0 to 0.2.1. - [Release notes](https://github.com/tomhjp/gh-action-jira-create/releases) - [Commits](https://github.com/tomhjp/gh-action-jira-create/compare/v0.2.0...v0.2.1) --- updated-dependencies: - dependency-name: tomhjp/gh-action-jira-create dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: David Yu <[email protected]> * build(deps): bump tomhjp/gh-action-jira-comment from 0.1.0 to 0.2.0 (#16684) Bumps [tomhjp/gh-action-jira-comment](https://github.com/tomhjp/gh-action-jira-comment) from 0.1.0 to 0.2.0. 
- [Release notes](https://github.com/tomhjp/gh-action-jira-comment/releases) - [Commits](https://github.com/tomhjp/gh-action-jira-comment/compare/v0.1.0...v0.2.0) --- updated-dependencies: - dependency-name: tomhjp/gh-action-jira-comment dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: David Yu <[email protected]> * NET-2397: Add readme.md to upgrade test subdirectory (#16610) * NET-2397: Add readme.md to upgrade test subdirectory * remove test code * fix link and update steps of adding new test cases (#16654) * fix link and update steps of adding new test cases * Apply suggestions from code review Co-authored-by: Nick Irvine <[email protected]> --------- Co-authored-by: Nick Irvine <[email protected]> --------- Co-authored-by: cskh <[email protected]> Co-authored-by: Nick Irvine <[email protected]> * chore: replace hardcoded node name with a constant (#16692) * Fix broken links from api docs (#16695) * Update WAL Known issues (#16676) * UI: update Ember to 3.28.6 (#16616) --------- Co-authored-by: wenincode <[email protected]> * Regen helm docs (#16701) * Remove unused are hosts set check (#16691) * Remove unused are hosts set check * Remove all traces of unused 'AreHostsSet' parameter * Remove unused Hosts attribute * Remove commented out use of snap.APIGateway.Hosts * [NET-3029] Migrate build-distros to GHA (#16669) * migrate build distros to GHA Signed-off-by: Dan Bond <[email protected]> * build-arm Signed-off-by: Dan Bond <[email protected]> * don't use matrix Signed-off-by: Dan Bond <[email protected]> * check-go-mod Signed-off-by: Dan Bond <[email protected]> * add notify slack script Signed-off-by: Dan Bond <[email protected]> * notify slack if failure Signed-off-by: Dan Bond <[email protected]> * rm notify slack script Signed-off-by: Dan Bond <[email protected]> * fix 
check-go-mod job Signed-off-by: Dan Bond <[email protected]> --------- Signed-off-by: Dan Bond <[email protected]> * Update envoy extension docs, service-defaults, add multi-config example for lua (#16710) * fix build workflow (#16719) Signed-off-by: Dan Bond <[email protected]> * Helm docs without developer.hashicorp.com prefix (#16711) This was causing linter errors * add extra resiliency to snapshot restore test (#16712) * fix: gracefully fail on invalid port number (#16721) * Copyright headers for config files git + circleci (#16703) * Copyright headers for config files git + circleci * Release folder copyright headers * fix bug where pqs that failover to a cluster peer dont un-fail over (#16729) * add enterprise xds tests (#16738) * delete config when nil (#16690) * delete config when nil * fix mock interface implementation * fix handler test to use the right assertion * extract DeleteConfig as a separate API. * fix mock limiter implementation to satisfy the new interface * fix failing tests * add test comments * Changelog for audit logging fix. (#16700) * Changelog for audit logging fix. * Use GH issues type for edu board (#16750) * fix: remove unused tenancy category from rate limit spec (#16740) * Remove version bump from CRT workflow (#16728) This bumps the version to reflect the next patch release; however, we use a specific branch for each patch release and so never wind up cutting a release directly from the `release/1.15.x` (for example) where this is intended to work. * tests instantiating clients w/o shutting down (#16755) noticed via their port still in use messages. * RELENG-471: Remove obsolete load-test workflow (#16737) * Remove obsolete load-test workflow * remove load-tests from circleci config. 
--------- Co-authored-by: John Murret <[email protected]> * add failover policy to ProxyConfigEntry in api (#16759) * add failover policy to ProxyConfigEntry in api * update docs * Fix broken links in Consul docs (#16640) * Fix broken links in Consul docs * more broken link fixes * more 404 fixes * 404 fixes * broken link fix --------- Co-authored-by: Tu Nguyen <[email protected]> * Change partition for peers in discovery chain targets (#16769) This commit swaps the partition field to the local partition for discovery chains targeting peers. Prior to this change, peer upstreams would always use a value of default regardless of which partition they exist in. This caused several issues in xds / proxycfg because of id mismatches. Some prior fixes were made to deal with one-off id mismatches that this PR also cleans up, since they are no longer needed. * Docs/intentions refactor docs day 2022 (#16758) * converted intentions conf entry to ref CT format * set up intentions nav * add page for intentions usage * final intentions usage page * final intentions overview page * fixed old relative links * updated diagram for overview * updated links to intentions content * fixed typo in updated links * rename intentions overview page file to index * rollback link updates to intentions overview * fixed nav * Updated custom HTML in API and CLI pages to MD * applied suggestions from review to index page * moved conf examples from usage to conf ref * missed custom HTML section * applied additional feedback * Apply suggestions from code review Co-authored-by: Tu Nguyen <[email protected]> * updated headings in usage page * renamed files and udpated nav * updated links to new file names * added redirects and final tweaks * typo --------- Co-authored-by: Tu Nguyen <[email protected]> * Add storage backend interface and in-memory implementation (#16538) Introduces `storage.Backend`, which will serve as the interface between the Resource Service and the underlying storage system (Raft 
today, but in the future, who knows!). The primary design goal of this interface is to keep its surface area small, and push as much functionality as possible into the layers above, so that new implementations can be added with little effort, and easily proven to be correct. To that end, we also provide a suite of "conformance" tests that can be run against a backend implementation to check it behaves correctly. In this commit, we introduce an initial in-memory storage backend, which is suitable for tests and when running Consul in development mode. This backend is a thin wrapper around the `Store` type, which implements a resource database using go-memdb and our internal pub/sub system. `Store` will also be used to handle reads in our Raft backend, and in the future, used as a local cache for external storage systems. * Fix bug in changelog checker where bash variable is not quoted (#16681) * Read(...) endpoint for the resource service (#16655) * Fix Edu Jira automation (#16778) * Fix struct tags for TCPService enterprise meta (#16781) * Fix struct tags for TCPService enterprise meta * Add changelog * Expand route flattening test for multiple namespaces (#16745) * Exand route flattening test for multiple namespaces * Add helper for checking http route config entry exists without checking for bound status * Fix port and hostname check for http route flattening test * WatchList(..) endpoint for the resource service (#16726) * Allocate virtual ip for resolver/router/splitter config entries (#16760) * add ip rate limiter controller OSS parts (#16790) * Resource service List(..) 
endpoint (#16753) * changes to support new PQ enterprise fields (#16793) * add scripts for testing locally consul-ui-toolkit (#16794) * Update normalization of route refs (#16789) * Use merge of enterprise meta's rather than new custom method * Add merge logic for tcp routes * Add changelog * Normalize certificate refs on gateways * Fix infinite call loop * Explicitly call enterprise meta * copyright headers for agent folder (#16704) * copyright headers for agent folder * Ignore test data files * fix proto files and remove headers in agent/uiserver folder * ignore deep-copy files * Copyright headers for command folder (#16705) * copyright headers for agent folder * Ignore test data files * fix proto files and remove headers in agent/uiserver folder * ignore deep-copy files * copyright headers for agent folder * Copyright headers for command folder * fix merge conflicts * Add copyright headers for acl, api and bench folders (#16706) * copyright headers for agent folder * Ignore test data files * fix proto files and remove headers in agent/uiserver folder * ignore deep-copy files * copyright headers for agent folder * fix merge conflicts * copyright headers for agent folder * Ignore test data files * fix proto files * ignore agent/uiserver folder for now * copyright headers for agent folder * Add copyright headers for acl, api and bench folders * Github Actions Migration - move go-tests workflows to GHA (#16761) * go-tests workflow * add test splitting to go-tests * fix re-reun fails report path * fix re-reun fails report path another place * fixing tests for32bit and race * use script file to generate runners * fixing run path * add checkout * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * passing runs-on * setting up runs-on as a parameter to check-go-mod * making on 
pull_request * Update .github/scripts/rerun_fails_report.sh Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * make runs-on required * removing go-version param that is not used. * removing go-version param that is not used. * Modify build-distros to use medium runners (#16773) * go-tests workflow * add test splitting to go-tests * fix re-reun fails report path * fix re-reun fails report path another place * fixing tests for32bit and race * use script file to generate runners * fixing run path * add checkout * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * passing runs-on * setting up runs-on as a parameter to check-go-mod * trying mediums * adding in script * fixing runs-on to be parameter * fixing merge conflict * changing to on push * removing whitespace * go-tests workflow * add test splitting to go-tests * fix re-reun fails report path * fix re-reun fails report path another place * fixing tests for32bit and race * use script file to generate runners * fixing run path * add checkout * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * passing runs-on * setting up runs-on as a parameter to check-go-mod * changing back to on pull_request --------- Co-authored-by: Dan Bond <[email protected]> * Github Actions Migration - move verify-ci workflows to GHA (#16777) * add verify-ci workflow * adding comment and changing to on pull request. 
* changing to pull_requests * changing to pull_request * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * [NET-3029] Migrate frontend to GHA (#16731) * changing set up to a small * using consuls own custom runner pool. --------- Co-authored-by: Dan Bond <[email protected]> * Copyright headers for missing files/folders (#16708) * copyright headers for agent folder * fix: export ReadWriteRatesConfig struct as it needs to referenced from consul-k8s (#16766) * docs: Updates to support HCP Consul cluster peering release (#16774) * New HCP Consul documentation section + links * Establish cluster peering usage cross-link * unrelated fix to backport to v1.15 * nav correction + fixes * Tech specs fixes * specifications for headers * Tech specs fixes + alignments * sprawl edits * Tip -> note * port ENT ingress gateway upgrade tests [NET-2294] [NET-2296] (#16804) * [COMPLIANCE] Add Copyright and License Headers (#16807) * [COMPLIANCE] Add Copyright and License Headers * fix headers for generated files * ignore dist folder --------- Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com> Co-authored-by: Ronald Ekambi <[email protected]> Co-authored-by: Ronald <[email protected]> * add order by locality failover to Consul enterprise (#16791) * ci: changes resulting from running on consul-enterprise (#16816) * changes resulting from running on consul-enterprise * removing comment line * port ENT upgrade tests flattening (#16824) * docs: raise awareness of GH-16779 (#16823) * updating command to reflect the additional package exclusions in CircleCI (#16829) * storage: fix resource leak in Watch (#16817) * Remove UI brand-loader copyright headers as they do not render appropriately (#16835) * Add sameness-group to exported-services config entries (#16836) This PR adds the sameness-group field to exported-service config entries, which allows for services to be exported to multiple destination partitions 
/ peers easily. * Add default resolvers to disco chains based on the default sameness group (#16837) * [NET-3029] Migrate dev-* jobs to GHA (#16792) * ci: add build-artifacts workflow Signed-off-by: Dan Bond <[email protected]> * makefile for gha dev-docker Signed-off-by: Dan Bond <[email protected]> * use docker actions instead of make Signed-off-by: Dan Bond <[email protected]> * Add context Signed-off-by: Dan Bond <[email protected]> * testing push Signed-off-by: Dan Bond <[email protected]> * set short sha Signed-off-by: Dan Bond <[email protected]> * upload to s3 Signed-off-by: Dan Bond <[email protected]> * rm s3 upload Signed-off-by: Dan Bond <[email protected]> * use runner setup job Signed-off-by: Dan Bond <[email protected]> * on push Signed-off-by: Dan Bond <[email protected]> * testing Signed-off-by: Dan Bond <[email protected]> * on pr Signed-off-by: Dan Bond <[email protected]> * revert testing Signed-off-by: Dan Bond <[email protected]> * OSS/ENT logic Signed-off-by: Dan Bond <[email protected]> * add comments Signed-off-by: Dan Bond <[email protected]> * Update .github/workflows/build-artifacts.yml Co-authored-by: John Murret <[email protected]> --------- Signed-off-by: Dan Bond <[email protected]> Co-authored-by: John Murret <[email protected]> * add region field (#16825) * add region field * fix syntax error in test file * go fmt * go fmt * remove test * Connect CA Primary Provider refactor (#16749) * Rename Intermediate cert references to LeafSigningCert Within the Consul CA subsystem, the term "Intermediate" is confusing because the meaning changes depending on provider and datacenter (primary vs secondary). For example, when using the Consul CA the "ActiveIntermediate" may return the root certificate in a primary datacenter. At a high level, we are interested in knowing which CA is responsible for signing leaf certs, regardless of its position in a certificate chain. This rename makes the intent clearer. 
* Move provider state check earlier * Remove calls to GenerateLeafSigningCert GenerateLeafSigningCert (formerly known as GenerateIntermediate) is vestigial in non-Vault providers, as it simply returns the root certificate in primary datacenters. By folding Vault's intermediate cert logic into `GenerateRoot` we can encapsulate the intermediate cert handling within `newCARoot`. * Move GenerateLeafSigningCert out of PrimaryProvider Now that the Vault Provider calls GenerateLeafSigningCert w…
* Add missing link (#16437) * docs: remove extra whitespace in frontmatter (#16436) * Delete Vagrantfile (#16442) * upgrade test: consolidate resolver test cases (#16443) * UI: Fix rendering issue in search and lists (#16444) * Upgrade ember-cli-string-helpers * add extra lock change * Update docs for consul-k8s 1.1.0 (#16447) * Update ingress-gateways.mdx (#16330) * Update ingress-gateways.mdx Added an example of running the HELM install for the ingress gateways using values.yaml * Apply suggestions from code review * Update ingress-gateways.mdx Adds closing back ticks on example command. The suggesting UI strips them out. --------- Co-authored-by: trujillo-adam <[email protected]> * grpc: fix data race in balancer registration (#16229) Registering gRPC balancers is thread-unsafe because they are stored in a global map variable that is accessed without holding a lock. Therefore, it's expected that balancers are registered _once_ at the beginning of your program (e.g. in a package `init` function) and certainly not after you've started dialing connections, etc. > NOTE: this function must only be called during initialization time > (i.e. in an init() function), and is not thread-safe. While this is fine for us in production, it's challenging for tests that spin up multiple agents in-memory. We currently register a balancer per- agent which holds agent-specific state that cannot safely be shared. This commit introduces our own registry that _is_ thread-safe, and implements the Builder interface such that we can call gRPC's `Register` method once, on start-up. It uses the same pattern as our resolver registry where we use the dial target's host (aka "authority"), which is unique per-agent, to determine which builder to use. 
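The thread-safe registry idea from the commit message above can be sketched roughly as follows. This is a minimal illustration, not Consul's actual code: the `Builder` interface, `Registry` type, and `agentBuilder` are hypothetical stand-ins that avoid importing gRPC. In the real change, the registry itself is described as implementing gRPC's Builder interface so that gRPC's `Register` is called only once at start-up.

```go
package main

import (
	"fmt"
	"sync"
)

// Builder is a hypothetical stand-in for a per-agent balancer builder.
type Builder interface {
	Name() string
}

// Registry maps a dial target's authority (unique per agent) to the Builder
// for that agent. All map access goes through the lock, so agents can be
// registered concurrently, for example from parallel tests.
type Registry struct {
	mu       sync.RWMutex
	builders map[string]Builder
}

func NewRegistry() *Registry {
	return &Registry{builders: make(map[string]Builder)}
}

// Register associates a Builder with an authority.
func (r *Registry) Register(authority string, b Builder) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.builders[authority] = b
}

// Deregister removes the Builder for an authority, e.g. on agent shutdown.
func (r *Registry) Deregister(authority string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.builders, authority)
}

// Lookup returns the Builder registered for an authority, if any.
func (r *Registry) Lookup(authority string) (Builder, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	b, ok := r.builders[authority]
	return b, ok
}

// agentBuilder is a hypothetical Builder holding agent-specific state.
type agentBuilder struct{ agent string }

func (a agentBuilder) Name() string { return "balancer-for-" + a.agent }

func main() {
	reg := NewRegistry()

	// Simulate several in-memory agents registering at the same time.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			authority := fmt.Sprintf("agent-%d", i)
			reg.Register(authority, agentBuilder{agent: authority})
		}(i)
	}
	wg.Wait()

	if b, ok := reg.Lookup("agent-2"); ok {
		fmt.Println(b.Name())
	}
}
```

Keying by authority keeps each agent's state separate while the process-wide registration with gRPC happens only once, which is the property the commit message relies on.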
* cli: ensure acl token read -self works (#16445) Fixes a regression in #16044 The consul acl token read -self cli command should not require an -accessor-id because typically the persona invoking this would not already know the accessor id of their own token. * docs: Add backwards compatibility for Consul 1.14.x and consul-dataplane in the Envoy compat matrix (#16462) * Update envoy.mdx * gateways: add e2e test for API Gateway HTTPRoute ParentRef change (#16408) * test(gateways): add API Gateway HTTPRoute ParentRef change test * test(gateways): add checkRouteError helper * test(gateways): remove EOF check in CI this seems to sometimes be 'connection reset by peer' instead * Update test/integration/consul-container/test/gateways/http_route_test.go * Gateway Test HTTPPathRewrite (#16418) * add http url path rewrite * add Mike's test back in * update kind to use api.APIGateway * cli: remove stray whitespace when loading the consul version from the VERSION file (#16467) Fixes a regression from #15631 in the output of `consul version` from: Consul v1.16.0-dev +ent Revision 56b86acbe5+CHANGES to Consul v1.16.0-dev+ent Revision 56b86acbe5+CHANGES * Docs/services refactor docs day 122022 (#16103) * converted main services page to services overview page * set up services usage dirs * added Define Services usage page * converted health checks everything page to Define Health Checks usage page * added Register Services and Nodes usage page * converted Query with DNS to Discover Services and Nodes Overview page * added Configure DNS Behavior usage page * added Enable Static DNS Lookups usage page * added the Enable Dynamic Queries DNS Queries usage page * added the Configuration dir and overview page - may not need the overview, tho * fixed the nav from previous commit * added the Services Configuration Reference page * added Health Checks Configuration Reference page * updated service defaults configuraiton entry to new configuration ref format * fixed some bad links found 
by checker * more bad links found by checker * another bad link found by checker * converted main services page to services overview page * set up services usage dirs * added Define Services usage page * converted health checks everything page to Define Health Checks usage page * added Register Services and Nodes usage page * converted Query with DNS to Discover Services and Nodes Overview page * added Configure DNS Behavior usage page * added Enable Static DNS Lookups usage page * added the Enable Dynamic Queries DNS Queries usage page * added the Configuration dir and overview page - may not need the overview, tho * fixed the nav from previous commit * added the Services Configuration Reference page * added Health Checks Configuration Reference page * updated service defaults configuraiton entry to new configuration ref format * fixed some bad links found by checker * more bad links found by checker * another bad link found by checker * fixed cross-links between new topics * updated links to the new services pages * fixed bad links in scale file * tweaks to titles and phrasing * fixed typo in checks.mdx * started updating the conf ref to latest template * update SD conf ref to match latest CT standard * Apply suggestions from code review Co-authored-by: Eddie Rowe <[email protected]> * remove previous version of the checks page * fixed cross-links * Apply suggestions from code review Co-authored-by: Eddie Rowe <[email protected]> --------- Co-authored-by: Eddie Rowe <[email protected]> * docs: clarify license expiration upgrade behavior (#16464) * add provider ca auth-method support for azure Does the required dance with the local HTTP endpoint to get the required data for the jwt based auth setup in Azure. Keeps support for 'legacy' mode where all login data is passed on via the auth methods parameters. Refactored check for hardcoded /login fields. 
* Changed titles for services pages to sentence style cap (#16477) * Changed titles for services pages to sentence style cap * missed a meta title * docs: Consul 1.15.0 and Consul K8s 1.0 release notes (#16481) * add new release notes --------- Co-authored-by: Tu Nguyen <[email protected]> * fix (cli): return error msg if acl policy not found (#16485) * fix: return error msg if acl policy not found * changelog * add test * update services nav titles (#16484) * Improve ux to help users avoid overwriting fields of ACL tokens, roles and policies (#16288) * Deprecate merge-policies and add options add-policy-name/add-policy-id to improve CLI token update command * deprecate merge-roles fields * Fix potential flakey tests and update ux to remove 'completely' + typo fixes * NET-2292: port ingress-gateway test case "http" from BATS addendum (#16490) * docs: Update release notes with Envoy compat issue (#16494) * Update v1_15_x.mdx --------- Co-authored-by: Tu Nguyen <[email protected]> * Suppress AlreadyRegisteredError to fix test retries (#16501) * Suppress AlreadyRegisteredError to fix test retries * Remove duplicate sink * Speed up test by registering services concurrently (#16509) * add provider ca support for jwt file base auth Adds support for a jwt token in a file. Simply reads the file and sends the read in jwt along to the vault login. It also supports a legacy mode with the jwt string being passed directly. In which case the path is made optional. * docs(architecture): remove merge conflict leftovers (#16507) * add provider ca auth support for kubernetes Adds support for Kubernetes jwt/token file based auth. Only needs to read the file and save the contents as the jwt/token. 
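The file-based JWT handling described in the two auth commits above might look roughly like the sketch below. The function name `jwtLoginParams`, its parameters, and the returned map keys are assumptions made for illustration, and trimming surrounding whitespace is also an assumption; the commit messages only say that the file is read and its contents are sent as the jwt, with a legacy mode where the jwt string is passed directly and the path becomes optional.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// jwtLoginParams builds hypothetical login parameters for a JWT-style
// auth method. If legacyJWT is set it is used as-is (legacy mode) and
// path is optional; otherwise the token is read from the file at path.
func jwtLoginParams(role, path, legacyJWT string) (map[string]string, error) {
	if legacyJWT != "" {
		return map[string]string{"role": role, "jwt": legacyJWT}, nil
	}
	if path == "" {
		return nil, errors.New("either a jwt or a path to a jwt file is required")
	}
	raw, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("reading jwt file: %w", err)
	}
	// Assumption: strip the trailing newline that token files often carry.
	token := strings.TrimSpace(string(raw))
	return map[string]string{"role": role, "jwt": token}, nil
}

func main() {
	// Write a throwaway token file so the example is self-contained.
	f, err := os.CreateTemp("", "jwt-*")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	if _, err := f.WriteString("header.payload.signature\n"); err != nil {
		panic(err)
	}
	f.Close()

	params, err := jwtLoginParams("demo-role", f.Name(), "")
	if err != nil {
		panic(err)
	}
	fmt.Println(params["jwt"])
}
```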
* Merge pull request #4538 from hashicorp/NET-2396 (#16516) NET-2396: refactor test to reduce duplication * Merge pull request #4584 from hashicorp/refactor_cluster_config (#16517) NET-2841: PART 1 - refactor NewPeeringCluster to support custom config * Add ServiceResolver RequestTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable (#16495) * Leverage ServiceResolver ConnectTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable * Regenerate golden files * Add RequestTimeout field * Add changelog entry * Fix issue where terminating gateway service resolvers weren't properly cleaned up (#16498) * Fix issue where terminating gateway service resolvers weren't properly cleaned up * Add integration test for cleaning up resolvers * Add changelog entry * Use state test and drop integration test * Add support for failover policies (#16505) * modified unsupported envoy version error (#16518) - When an envoy version is out of a supported range, we now return the envoy version being used as `major.minor.x` to indicate that it is the minor version at most that is incompatible - When an envoy version is in the list of unsupported envoy versions we return back the envoy version in the error message as `major.minor.patch` as now the exact version matters. * Remove private prefix from proto-gen-rpc-glue e2e test (#16433) * Fix resolution of service resolvers with subsets for external upstreams (#16499) * Fix resolution of service resolvers with subsets for external upstreams * Add tests * Add changelog entry * Update view filter logic * fixed broken links associated with cluster peering updates (#16523) * fixed broken links associated with cluster peering updates * additional links to fix * typos * fixed redirect file * add provider ca support for approle auth-method Adds support for the approle auth-method. 
Only handles using the approle role/secret to auth and it doesn't support the agent's extra management configuration options (wrap and delete after read) as they are not required as part of the auth (i.e. they are vault agent things). * update connect/ca's vault AuthMethod conf section (#16346) Updated Params field to re-frame as supporting arguments specific to the supported vault-agent auto-auth methods with links to each method's "#configuration" section. Included a callout on the limits of the supported parameters. * proxycfg: ensure that an irrecoverable error in proxycfg closes the xds session and triggers a replacement proxycfg watcher (#16497) Receiving an "acl not found" error from an RPC in the agent cache and the streaming/event components will cause any request loops to cease under the assumption that they will never work again if the token was destroyed. This prevents log spam (#14144, #9738). Unfortunately, due to things like: - authz requests going to stale servers that may not have witnessed the token creation yet - authz requests in a secondary datacenter happening before the tokens get replicated to that datacenter - authz requests from a primary TO a secondary datacenter happening before the tokens get replicated to that datacenter the caller can get an "acl not found" *before* the token exists, rather than just after. The machinery added in the linked PRs will then kick in and prevent the request loop from looping around again once the tokens actually exist. For `consul-dataplane` usages, where xDS is served by the Consul servers rather than the clients, this is ultimately not a problem because in that scenario the `agent/proxycfg` machinery is on-demand and launched by a new xDS stream needing data for a specific service in the catalog. If the watching goroutines are terminated it ripples down and terminates the xDS stream, which CDP will eventually re-establish and restart everything. 
For Consul client usages, the `agent/proxycfg` machinery is launched ahead of time at service registration time (called "local" in some of the proxycfg machinery) so when the xDS stream comes in the data is already ready to go. If the watching goroutines terminate it should terminate the xDS stream, but there's no mechanism to re-spawn the watching goroutines. If the xDS stream reconnects it will see no `ConfigSnapshot` and will not get one again until the client agent is restarted, or the service is re-registered with something changed in it. This PR fixes a few things in the machinery: - There was an inadvertent deadlock in fetching the snapshot from the proxycfg machinery by xDS, such that when the watching goroutine terminated the snapshots would never be fetched. This caused some of the xDS machinery to get indefinitely paused and not finish the teardown properly. - Every 30s we now attempt to re-insert all locally registered services into the proxycfg machinery. - When services are re-inserted into the proxycfg machinery we special case "dead" ones such that we unilaterally replace them rather than doing so conditionally. 
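The re-insert behaviour described in the proxycfg commit above can be sketched as follows. This is only an illustration of the idea, not the code from #16497: `Manager`, `watch`, `Register`, `MarkDead`, and `Resync` are hypothetical names, and the rule that a live watch is restarted only when its config changed is an assumption standing in for the "conditional" replacement the commit message mentions.

```go
package main

import (
	"fmt"
	"sync"
)

// watch is a hypothetical record of a running watcher for one service.
type watch struct {
	config string
	dead   bool // set when the watching goroutine has terminated
}

// Manager is a hypothetical stand-in for the proxycfg bookkeeping.
type Manager struct {
	mu      sync.Mutex
	watches map[string]*watch
}

// NewManager returns a Manager with no watches.
func NewManager() *Manager {
	return &Manager{watches: make(map[string]*watch)}
}

// Register inserts a service and reports whether a watch was (re)started.
// A live watch is replaced only if its config changed (assumption);
// a dead watch is always replaced.
func (m *Manager) Register(id, config string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	if w, ok := m.watches[id]; ok && !w.dead && w.config == config {
		return false
	}
	m.watches[id] = &watch{config: config}
	return true
}

// MarkDead records that the watcher for a service has terminated.
func (m *Manager) MarkDead(id string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if w, ok := m.watches[id]; ok {
		w.dead = true
	}
}

// Resync re-inserts every locally registered service and returns how
// many watches were (re)started.
func (m *Manager) Resync(local map[string]string) int {
	restarted := 0
	for id, cfg := range local {
		if m.Register(id, cfg) {
			restarted++
		}
	}
	return restarted
}

func main() {
	local := map[string]string{"web": "v1", "api": "v1"}
	m := NewManager()

	fmt.Println(m.Resync(local)) // 2: both services are new

	m.MarkDead("web")
	fmt.Println(m.Resync(local)) // 1: only the dead watch is replaced
}
```

In a long-running agent, `Resync` would presumably be driven by a `time.Ticker` firing every 30 seconds, matching the interval given in the commit message.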
* NET-2903 Normalize weight for http routes (#16512) * NET-2903 Normalize weight for http routes * Update website/content/docs/connect/gateways/api-gateway/configuration/http-route.mdx Co-authored-by: trujillo-adam <[email protected]> * Add some basic UI improvements for api-gateway services (#16508) * Add some basic ui improvements for api-gateway services * Add changelog entry * Use ternary for null check * Update gateway doc links * rename changelog entry for new PR * Fix test * fixes empty link in DNS usage page (#16534) * NET-2904 Fixes API Gateway Route Service Weight Division Error * Improve ux around ACL token to help users avoid overwriting node/service identities (#16506) * Deprecate merge-node-identities and merge-service-identities flags * added tests for node identities changes * added changelog file and docs * Follow-up fixes to consul connect envoy command (#16530) * Merge pull request #4573 from hashicorp/NET-2841 (#16544) * Merge pull request #4573 from hashicorp/NET-2841 NET-2841: PART 2 refactor upgrade tests to include version 1.15 * update upgrade versions * upgrade test: discovery chain across partition (#16543) * Update the consul-k8s cli docs for the new `proxy log` subcommand (#16458) * Update the consul-k8s cli docs for the new `proxy log` subcommand * Updated consul-k8s docs from PR feedback * Added proxy log command to release notes * Delete test-link-rewrites.yml (#16546) * feat: update notification to use hds toast component (#16519) * Fix flakey tests related to ACL token updates (#16545) * Fix flakey tests related to ACL token updates * update all acl token update tests * extra create_token function to its own thing * support vault auth config for alicloud ca provider Add support for using existing vault auto-auth configurations as the provider configuration when using Vault's CA provider with AliCloud. AliCloud requires 2 extra fields to enable it to use STS (it's preferred auth setup). 
Our vault-plugin-auth-alicloud package contained a method to help generate them as they require you to make an http call to a faked endpoint proxy to get them (url and headers base64 encoded). * Update docs to reflect functionality (#16549) * Update docs to reflect functionality * make consistent with other client runtimes * upgrade test: use retry with ModifyIndex and remove ent test file (#16553) * add agent locality and replicate it across peer streams (#16522) * docs: Document config entry permissions (#16556) * Broken link fixes (#16566) * NET-2954: Improve integration tests CI execution time (#16565) * NET-2954: Improve integration tests CI execution time * fix ci * remove comments and modify config file * fix bug that can lead to peering service deletes impacting the state of local services (#16570) * Update changelog with patch releases (#16576) * Bump submodules from latest 1.15.1 patch release (#16578) * Update changelog with Consul patch releases 1.13.7, 1.14.5, 1.15.1 * Bump submodules from latest patch release * Forgot one * website: adds content-check command and README update (#16579) * added a backport-checker GitHub action (#16567) * added a backport-checker GitHub action * Update .github/workflows/backport-checker.yml * auto-updated agent/uiserver/dist/ from commit 63204b518 (#16587) Co-authored-by: hc-github-team-consul-core <[email protected]> * GRPC stub for the ResourceService (#16528) * UI: Fix htmlsafe errors throughout the app (#16574) * Upgrade ember-intl * Add changelog * Add yarn lock * Add namespace file with build tag for OSS gateway tests (#16590) * Add namespace file with build tag for OSS tests * Remove TODO comment * JIRA pr check: Filter out OSS/ENT merges (#16593) * jira pr check filter out dependabot and oss/ent merges * allow setting locality on services and nodes (#16581) * Add Peer Locality to Discovery Chains (#16588) Add peer locality to discovery chains * fixes for unsupported partitions field in CRD metadata block 
(#16604) * fixes for unsupported partitions field in CRD metadata block * Apply suggestions from code review Co-authored-by: Luke Kysow <[email protected]> --------- Co-authored-by: Luke Kysow <[email protected]> * Create a weekly 404 checker for all Consul docs content (#16603) * Consul WAN Fed with Vault Secrets Backend document updates (#16597) * Consul WAN Fed with Vault Secrets Backend document updates * Corrected dc1-consul.yaml and dc2-consul.yaml file highlights * Update website/content/docs/k8s/deployment-configurations/vault/wan-federation.mdx Co-authored-by: trujillo-adam <[email protected]> * Update website/content/docs/k8s/deployment-configurations/vault/wan-federation.mdx Co-authored-by: trujillo-adam <[email protected]> --------- Co-authored-by: trujillo-adam <[email protected]> * Allow HCP metrics collection for Envoy proxies Co-authored-by: Ashvitha Sridharan <[email protected]> Co-authored-by: Freddy <[email protected]> Add a new envoy flag: "envoy_hcp_metrics_bind_socket_dir", a directory where a unix socket will be created with the name `<namespace>_<proxy_id>.sock` to forward Envoy metrics. If set, this will configure: - In bootstrap configuration a local stats_sink and static cluster. These will forward metrics to a loopback listener sent over xDS. - A dynamic listener listening at the socket path that the previously defined static cluster is sending metrics to. - A dynamic cluster that will forward traffic received at this listener to the hcp-metrics-collector service. Reasons for having a static cluster pointing at a dynamic listener: - We want to secure the metrics stream using TLS, but the stats sink can only be defined in bootstrap config. With dynamic listeners/clusters we can use the proxy's leaf certificate issued by the Connect CA, which isn't available at bootstrap time. - We want to intelligently route to the HCP collector. Configuring its address at bootstrap time limits our flexibility routing-wise. More on this below. 
Reasons for defining the collector as an upstream in `proxycfg`: - The HCP collector will be deployed as a mesh service. - Certificate management is taken care of, as mentioned above. - Service discovery and routing logic is automatically taken care of, meaning that no code changes are required in the xds package. - Custom routing rules can be added for the collector using discovery chain config entries. Initially the collector is expected to be deployed to each admin partition, but in the future could be deployed centrally in the default partition. These config entries could even be managed by HCP itself. * Add copywrite setup file (#16602) * Add sameness-group configuration entry. (#16608) This commit adds a sameness-group config entry to the API and structs packages. It includes some validation logic and a new memdb index that tracks the default sameness-group for each partition. Sameness groups will simplify the effort of managing failovers / intentions / exports for peers and partitions. Note that this change purely introduces the configuration entry and does not include the full functionality of sameness-groups. * Preserve CARoots when updating Vault CA configuration (#16592) If a CA config update did not cause a root change, the codepath would return early and skip some steps which preserve its intermediate certificates and signing key ID. This commit re-orders some code and prevents updates from generating new intermediate certificates. * Add UI copyright headers files (#16614) * Add copyright headers to UI files * Ensure copywrite file ignores external libs * Docs discovery typo (#16628) * docs(discovery): typo * docs(discovery): EOF and trim lines --------- Co-authored-by: trujillo-adam <[email protected]> * Fix issue with trust bundle read ACL check. (#16630) This commit fixes an issue where trust bundles could not be read by services in a non-default namespace, unless they had excessive ACL permissions given to them. 
Prior to this change, `service:write` was required in the default namespace in order to read the trust bundle. Now, `service:write` to a service in any namespace is sufficient. * Basic resource type registry (#16622) * Backport ENT-4704 (#16612) * feat: update typography to consume hds styles (#16577) * Add known issues to Raft WAL docs. (#16600) * Add known issues to Raft WAL docs. * Refactor update based on review feedback * Tune 404 checker to exclude false-positives and use intended file path (#16636) * Update e2e tests for namespaces (#16627) * Refactored "NewGatewayService" to handle namespaces, fixed TestHTTPRouteFlattening test * Fixed existing http_route tests for namespacing * Squash aclEnterpriseMeta for ResourceRefs and HTTPServices, accept namespace for creating connect services and regular services * Use require instead of assert after creating namespaces in http_route_tests * Refactor NewConnectService and NewGatewayService functions to use cfg objects to reduce number of method args * Rename field on SidecarConfig in tests from `SidecarServiceName` to `Name` to avoid stutter * net 2731 ip config entry OSS version (#16642) * ip config entry * name changing * move to ent * ent version * renaming * change format * renaming * refactor * add default values * fix confusing spiffe ids in golden tests (#16643) * First cluster grpc service should be NodePort for the second cluster to connect (#16430) * First cluster grpc service should be NodePort This is based on the issue opened here https://github.com/hashicorp/consul-k8s/issues/1903 If you follow the documentation https://developer.hashicorp.com/consul/docs/k8s/deployment-configurations/single-dc-multi-k8s exactly as it is, the first cluster will only create the consul UI service on NodePort but not the rest of the services (including for grpc). By default, from the helm chart, they are created as headless services by setting clusterIP None. 
This prevents the second cluster from discovering the Consul server on the first cluster over gRPC, because it simply cannot connect through the default gRPC port 8502, and it ends up in the error shown in the issue https://github.com/hashicorp/consul-k8s/issues/1903. As a solution, the gRPC service should be exposed using NodePort (or LoadBalancer). I added the required changes to both cluster1-values.yaml and cluster2-values.yaml, along with a description of those changes so that regular users can understand them. Kindly review; I hope this PR will be accepted. * Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx Co-authored-by: trujillo-adam <[email protected]> * Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx Co-authored-by: trujillo-adam <[email protected]> * Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx Co-authored-by: trujillo-adam <[email protected]> --------- Co-authored-by: trujillo-adam <[email protected]> * Add in query options for catalog service existing in a specific namespace when creating service for tests (#16652) * fix: add AccessorID property to PUT token request (#16660) * add sameness group support to service resolver failover and redirects (#16664) * Fix incorrect links on Envoy extensions documentation (#16666) * [API Gateway] Fix invalid cluster causing gateway programming delay (#16661) * Add test for http routes * Add fix * Fix tests * Add changelog entry * Refactor and fix flaky tests * Bump tomhjp/gh-action-jira-search from 0.2.1 to 0.2.2 (#16667) Bumps [tomhjp/gh-action-jira-search](https://github.com/tomhjp/gh-action-jira-search) from 0.2.1 to 0.2.2. 
- [Release notes](https://github.com/tomhjp/gh-action-jira-search/releases) - [Commits](https://github.com/tomhjp/gh-action-jira-search/compare/v0.2.1...v0.2.2) --- updated-dependencies: - dependency-name: tomhjp/gh-action-jira-search dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * build(deps): bump atlassian/gajira-transition from 2.0.1 to 3.0.1 (#15921) Bumps [atlassian/gajira-transition](https://github.com/atlassian/gajira-transition) from 2.0.1 to 3.0.1. - [Release notes](https://github.com/atlassian/gajira-transition/releases) - [Commits](https://github.com/atlassian/gajira-transition/compare/v2.0.1...v3.0.1) --- updated-dependencies: - dependency-name: atlassian/gajira-transition dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: David Yu <[email protected]> * Snapshot restore tests (#16647) * add snapshot restore test * add logstore as test parameter * Use the correct image version * make sure we read the logs from a followers to test the follower snapshot install path. * update to raf-wal v0.3.0 * add changelog. * updating changelog for bug description and removed integration test. 
* setting up test container builder to only set logStore for 1.15 and higher --------- Co-authored-by: Paul Banks <[email protected]> Co-authored-by: John Murret <[email protected]> * add sameness groups to discovery chains (#16671) * feat: add category annotation to RPC and gRPC methods (#16646) * Update GH actions to create Jira issue automatically (#16656) * Adds check to verify that the API Gateway is being created with at least one listener * Fix route subscription when using namespaces (#16677) * Fix route subscription when using namespaces * Update changelog * Fix changelog entry to reference that the bug was enterprise only * peering: peering partition failover fixes (#16673) add local source partition for peered upstreams * fix jira sync actions, remove custom fields (#16686) * Docs/update jira sync pr issue (#16688) * fix jira sync actions, remove custom fields * remove more additional fields, debug * Docs: Jira sync Update issuetype to bug (#16689) * update issuetype to bug * fix conditional for pr edu * build(deps): bump tomhjp/gh-action-jira-create from 0.2.0 to 0.2.1 (#16685) Bumps [tomhjp/gh-action-jira-create](https://github.com/tomhjp/gh-action-jira-create) from 0.2.0 to 0.2.1. - [Release notes](https://github.com/tomhjp/gh-action-jira-create/releases) - [Commits](https://github.com/tomhjp/gh-action-jira-create/compare/v0.2.0...v0.2.1) --- updated-dependencies: - dependency-name: tomhjp/gh-action-jira-create dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: David Yu <[email protected]> * build(deps): bump tomhjp/gh-action-jira-comment from 0.1.0 to 0.2.0 (#16684) Bumps [tomhjp/gh-action-jira-comment](https://github.com/tomhjp/gh-action-jira-comment) from 0.1.0 to 0.2.0. 
- [Release notes](https://github.com/tomhjp/gh-action-jira-comment/releases) - [Commits](https://github.com/tomhjp/gh-action-jira-comment/compare/v0.1.0...v0.2.0) --- updated-dependencies: - dependency-name: tomhjp/gh-action-jira-comment dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: David Yu <[email protected]> * NET-2397: Add readme.md to upgrade test subdirectory (#16610) * NET-2397: Add readme.md to upgrade test subdirectory * remove test code * fix link and update steps of adding new test cases (#16654) * fix link and update steps of adding new test cases * Apply suggestions from code review Co-authored-by: Nick Irvine <[email protected]> --------- Co-authored-by: Nick Irvine <[email protected]> --------- Co-authored-by: cskh <[email protected]> Co-authored-by: Nick Irvine <[email protected]> * chore: replace hardcoded node name with a constant (#16692) * Fix broken links from api docs (#16695) * Update WAL Known issues (#16676) * UI: update Ember to 3.28.6 (#16616) --------- Co-authored-by: wenincode <[email protected]> * Regen helm docs (#16701) * Remove unused are hosts set check (#16691) * Remove unused are hosts set check * Remove all traces of unused 'AreHostsSet' parameter * Remove unused Hosts attribute * Remove commented out use of snap.APIGateway.Hosts * [NET-3029] Migrate build-distros to GHA (#16669) * migrate build distros to GHA Signed-off-by: Dan Bond <[email protected]> * build-arm Signed-off-by: Dan Bond <[email protected]> * don't use matrix Signed-off-by: Dan Bond <[email protected]> * check-go-mod Signed-off-by: Dan Bond <[email protected]> * add notify slack script Signed-off-by: Dan Bond <[email protected]> * notify slack if failure Signed-off-by: Dan Bond <[email protected]> * rm notify slack script Signed-off-by: Dan Bond <[email protected]> * fix 
check-go-mod job Signed-off-by: Dan Bond <[email protected]> --------- Signed-off-by: Dan Bond <[email protected]> * Update envoy extension docs, service-defaults, add multi-config example for lua (#16710) * fix build workflow (#16719) Signed-off-by: Dan Bond <[email protected]> * Helm docs without developer.hashicorp.com prefix (#16711) This was causing linter errors * add extra resiliency to snapshot restore test (#16712) * fix: gracefully fail on invalid port number (#16721) * Copyright headers for config files git + circleci (#16703) * Copyright headers for config files git + circleci * Release folder copyright headers * fix bug where pqs that failover to a cluster peer dont un-fail over (#16729) * add enterprise xds tests (#16738) * delete config when nil (#16690) * delete config when nil * fix mock interface implementation * fix handler test to use the right assertion * extract DeleteConfig as a separate API. * fix mock limiter implementation to satisfy the new interface * fix failing tests * add test comments * Changelog for audit logging fix. (#16700) * Changelog for audit logging fix. * Use GH issues type for edu board (#16750) * fix: remove unused tenancy category from rate limit spec (#16740) * Remove version bump from CRT workflow (#16728) This bumps the version to reflect the next patch release; however, we use a specific branch for each patch release and so never wind up cutting a release directly from the `release/1.15.x` (for example) where this is intended to work. * tests instantiating clients w/o shutting down (#16755) noticed via their port still in use messages. * RELENG-471: Remove obsolete load-test workflow (#16737) * Remove obsolete load-test workflow * remove load-tests from circleci config. 
--------- Co-authored-by: John Murret <[email protected]> * add failover policy to ProxyConfigEntry in api (#16759) * add failover policy to ProxyConfigEntry in api * update docs * Fix broken links in Consul docs (#16640) * Fix broken links in Consul docs * more broken link fixes * more 404 fixes * 404 fixes * broken link fix --------- Co-authored-by: Tu Nguyen <[email protected]> * Change partition for peers in discovery chain targets (#16769) This commit swaps the partition field to the local partition for discovery chains targeting peers. Prior to this change, peer upstreams would always use a value of default regardless of which partition they exist in. This caused several issues in xds / proxycfg because of id mismatches. Some prior fixes were made to deal with one-off id mismatches that this PR also cleans up, since they are no longer needed. * Docs/intentions refactor docs day 2022 (#16758) * converted intentions conf entry to ref CT format * set up intentions nav * add page for intentions usage * final intentions usage page * final intentions overview page * fixed old relative links * updated diagram for overview * updated links to intentions content * fixed typo in updated links * rename intentions overview page file to index * rollback link updates to intentions overview * fixed nav * Updated custom HTML in API and CLI pages to MD * applied suggestions from review to index page * moved conf examples from usage to conf ref * missed custom HTML section * applied additional feedback * Apply suggestions from code review Co-authored-by: Tu Nguyen <[email protected]> * updated headings in usage page * renamed files and udpated nav * updated links to new file names * added redirects and final tweaks * typo --------- Co-authored-by: Tu Nguyen <[email protected]> * Add storage backend interface and in-memory implementation (#16538) Introduces `storage.Backend`, which will serve as the interface between the Resource Service and the underlying storage system (Raft 
today, but in the future, who knows!). The primary design goal of this interface is to keep its surface area small, and push as much functionality as possible into the layers above, so that new implementations can be added with little effort, and easily proven to be correct. To that end, we also provide a suite of "conformance" tests that can be run against a backend implementation to check it behaves correctly. In this commit, we introduce an initial in-memory storage backend, which is suitable for tests and when running Consul in development mode. This backend is a thin wrapper around the `Store` type, which implements a resource database using go-memdb and our internal pub/sub system. `Store` will also be used to handle reads in our Raft backend, and in the future, used as a local cache for external storage systems. * Fix bug in changelog checker where bash variable is not quoted (#16681) * Read(...) endpoint for the resource service (#16655) * Fix Edu Jira automation (#16778) * Fix struct tags for TCPService enterprise meta (#16781) * Fix struct tags for TCPService enterprise meta * Add changelog * Expand route flattening test for multiple namespaces (#16745) * Expand route flattening test for multiple namespaces * Add helper for checking http route config entry exists without checking for bound status * Fix port and hostname check for http route flattening test * WatchList(..) endpoint for the resource service (#16726) * Allocate virtual ip for resolver/router/splitter config entries (#16760) * add ip rate limiter controller OSS parts (#16790) * Resource service List(..) 
endpoint (#16753) * changes to support new PQ enterprise fields (#16793) * add scripts for testing locally consul-ui-toolkit (#16794) * Update normalization of route refs (#16789) * Use merge of enterprise meta's rather than new custom method * Add merge logic for tcp routes * Add changelog * Normalize certificate refs on gateways * Fix infinite call loop * Explicitly call enterprise meta * copyright headers for agent folder (#16704) * copyright headers for agent folder * Ignore test data files * fix proto files and remove headers in agent/uiserver folder * ignore deep-copy files * Copyright headers for command folder (#16705) * copyright headers for agent folder * Ignore test data files * fix proto files and remove headers in agent/uiserver folder * ignore deep-copy files * copyright headers for agent folder * Copyright headers for command folder * fix merge conflicts * Add copyright headers for acl, api and bench folders (#16706) * copyright headers for agent folder * Ignore test data files * fix proto files and remove headers in agent/uiserver folder * ignore deep-copy files * copyright headers for agent folder * fix merge conflicts * copyright headers for agent folder * Ignore test data files * fix proto files * ignore agent/uiserver folder for now * copyright headers for agent folder * Add copyright headers for acl, api and bench folders * Github Actions Migration - move go-tests workflows to GHA (#16761) * go-tests workflow * add test splitting to go-tests * fix re-reun fails report path * fix re-reun fails report path another place * fixing tests for32bit and race * use script file to generate runners * fixing run path * add checkout * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * passing runs-on * setting up runs-on as a parameter to check-go-mod * making on 
pull_request * Update .github/scripts/rerun_fails_report.sh Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * make runs-on required * removing go-version param that is not used. * removing go-version param that is not used. * Modify build-distros to use medium runners (#16773) * go-tests workflow * add test splitting to go-tests * fix re-reun fails report path * fix re-reun fails report path another place * fixing tests for32bit and race * use script file to generate runners * fixing run path * add checkout * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * passing runs-on * setting up runs-on as a parameter to check-go-mod * trying mediums * adding in script * fixing runs-on to be parameter * fixing merge conflict * changing to on push * removing whitespace * go-tests workflow * add test splitting to go-tests * fix re-reun fails report path * fix re-reun fails report path another place * fixing tests for32bit and race * use script file to generate runners * fixing run path * add checkout * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * passing runs-on * setting up runs-on as a parameter to check-go-mod * changing back to on pull_request --------- Co-authored-by: Dan Bond <[email protected]> * Github Actions Migration - move verify-ci workflows to GHA (#16777) * add verify-ci workflow * adding comment and changing to on pull request. 
* changing to pull_requests * changing to pull_request * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * [NET-3029] Migrate frontend to GHA (#16731) * changing set up to a small * using consuls own custom runner pool. --------- Co-authored-by: Dan Bond <[email protected]> * Copyright headers for missing files/folders (#16708) * copyright headers for agent folder * fix: export ReadWriteRatesConfig struct as it needs to referenced from consul-k8s (#16766) * docs: Updates to support HCP Consul cluster peering release (#16774) * New HCP Consul documentation section + links * Establish cluster peering usage cross-link * unrelated fix to backport to v1.15 * nav correction + fixes * Tech specs fixes * specifications for headers * Tech specs fixes + alignments * sprawl edits * Tip -> note * port ENT ingress gateway upgrade tests [NET-2294] [NET-2296] (#16804) * [COMPLIANCE] Add Copyright and License Headers (#16807) * [COMPLIANCE] Add Copyright and License Headers * fix headers for generated files * ignore dist folder --------- Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com> Co-authored-by: Ronald Ekambi <[email protected]> Co-authored-by: Ronald <[email protected]> * add order by locality failover to Consul enterprise (#16791) * ci: changes resulting from running on consul-enterprise (#16816) * changes resulting from running on consul-enterprise * removing comment line * port ENT upgrade tests flattening (#16824) * docs: raise awareness of GH-16779 (#16823) * updating command to reflect the additional package exclusions in CircleCI (#16829) * storage: fix resource leak in Watch (#16817) * Remove UI brand-loader copyright headers as they do not render appropriately (#16835) * Add sameness-group to exported-services config entries (#16836) This PR adds the sameness-group field to exported-service config entries, which allows for services to be exported to multiple destination partitions 
/ peers easily. * Add default resolvers to disco chains based on the default sameness group (#16837) * [NET-3029] Migrate dev-* jobs to GHA (#16792) * ci: add build-artifacts workflow Signed-off-by: Dan Bond <[email protected]> * makefile for gha dev-docker Signed-off-by: Dan Bond <[email protected]> * use docker actions instead of make Signed-off-by: Dan Bond <[email protected]> * Add context Signed-off-by: Dan Bond <[email protected]> * testing push Signed-off-by: Dan Bond <[email protected]> * set short sha Signed-off-by: Dan Bond <[email protected]> * upload to s3 Signed-off-by: Dan Bond <[email protected]> * rm s3 upload Signed-off-by: Dan Bond <[email protected]> * use runner setup job Signed-off-by: Dan Bond <[email protected]> * on push Signed-off-by: Dan Bond <[email protected]> * testing Signed-off-by: Dan Bond <[email protected]> * on pr Signed-off-by: Dan Bond <[email protected]> * revert testing Signed-off-by: Dan Bond <[email protected]> * OSS/ENT logic Signed-off-by: Dan Bond <[email protected]> * add comments Signed-off-by: Dan Bond <[email protected]> * Update .github/workflows/build-artifacts.yml Co-authored-by: John Murret <[email protected]> --------- Signed-off-by: Dan Bond <[email protected]> Co-authored-by: John Murret <[email protected]> * add region field (#16825) * add region field * fix syntax error in test file * go fmt * go fmt * remove test * Connect CA Primary Provider refactor (#16749) * Rename Intermediate cert references to LeafSigningCert Within the Consul CA subsystem, the term "Intermediate" is confusing because the meaning changes depending on provider and datacenter (primary vs secondary). For example, when using the Consul CA the "ActiveIntermediate" may return the root certificate in a primary datacenter. At a high level, we are interested in knowing which CA is responsible for signing leaf certs, regardless of its position in a certificate chain. This rename makes the intent clearer. 
* Move provider state check earlier * Remove calls to GenerateLeafSigningCert GenerateLeafSigningCert (formerly known as GenerateIntermediate) is vestigial in non-Vault providers, as it simply returns the root certificate in primary datacenters. By folding Vault's intermediate cert logic into `GenerateRoot` we can encapsulate the intermediate cert handling within `newCARoot`. * Move GenerateLeafSigningCert out of PrimaryProvidder Now that the Vault Provider calls GenerateLeafSigningCert within GenerateRoot, we can remove the method from all other providers that never used it in a meaningful way. * Add test for IntermediatePEM * Rename GenerateRoot to GenerateCAChain "Root" was being overloaded in the Consul CA context, as different providers and configs resulted in a single root certificate or a chain originating from an external trusted CA. Since the Vault provider also generates intermediates, it seems more accurate to call this a CAChain. * Update changelog with patch releases (#16856) * Update changelog with patch releases * Backport missed 1.0.4 patch release to changelog * Fix typo on cli-flags.mdx (#16843) Change "segements" to segments * Allow dialer to re-establish terminated peering (#16776) Currently, if an acceptor peer deletes a peering the dialer's peering will eventually get to a "terminated" state. If the two clusters need to be re-peered the acceptor will re-generate the token but the dialer will encounter this error on the call to establish: "failed to get addresses to dial peer: failed to refresh peer server addresses, will continue to use initial addresses: there is no active peering for "<<<ID>>>"" This is because in `exchangeSecret().GetDialAddresses()` we will get an error if fetching addresses for an inactive peering. The peering shows up as inactive at this point because of the existing terminated state. Rather than checking whether a peering is active we can instead check whether it was deleted. 
This way users do not need to delete terminated peerings in the dialing cluster before re-establishing them. * CA mesh CA expiration to it's own section This is part of an effort to raise awareness that you need to monitor your mesh CA if coming from an external source as you'll need to manage the rotation. * Fix broken doc in consul-k8s upgrade (#16852) Signed-off-by: dttung2905 <[email protected]> Co-authored-by: David Yu <[email protected]> * docs: add envoy to the proxycfg diagram (#16834) * docs: add envoy to the proxycfg diagram * ci: increase deep-copy and lint-enum jobs to use large runner as they hang in ENT (#16866) * docs: add envoy to the proxycfg diagram (#16834) * docs: add envoy to the proxycfg diagram * increase dee-copy job to use large runner. disable lint-enums on ENT * set lint-enums to a large * remove redunant installation of deep-copy --------- Co-authored-by: cskh <[email protected]> * Raft storage backend (#16619) * ad arm64 testing (#16876) * Omit false positives from 404 checker (#16881) * Remove false positives from 404 checker * fix remaining 404s * ci: fixes missing deps in frontend gha workflows (#16872) Signed-off-by: Dan Bond <[email protected]> * always test oss and conditionally test enterprise (#16827) * temporarily disable macos-arm64 tests job in go-tests (#16898) * Resource `Write` endpoint (#16786) * Resource `Delete` endpoint (#16756) * Wasm Envoy HTTP extension (#16877) * Fix API GW broken link (#16885) * Fix API GW broken link * Update website/content/docs/api-gateway/upgrades.mdx Co-authored-by: Tu Nguyen <[email protected]> --------- Co-authored-by: Tu Nguyen <[email protected]> * ci: Add success jobs. make go-test-enterprise conditional. build-distros and go-tests trigger on push to main and release branches (#16905) * Add go-tests-success job and make go-test-enterprise conditional * fixing lint-32bit reference * fixing reference to -go-test-troubleshoot * add all jobs that fan out. 
* fixing success job to need set up * add echo to success job * adding success jobs to build-artifacts, build-distros, and frontend. * changing the name of the job in verify ci to be consistent with other workflows * enable go-tests, build-distros, and verify-ci to run on merge to main and release branches because they currently do not with just the pull_request trigger * increase ENT runner size for xl to match OSS. have guild-distros use xl to match CircleCI (#16920) * log warning about certificate expiring sooner and with more details The old setting of 24 hours was not enough time to deal with an expiring certificates. This change ups it to 28 days OR 40% of the full cert duration, whichever is shorter. It also adds details to the log message to indicate which certificate it is logging about and a suggested action. * highlight the agent.tls cert metric with CA ones Include server agent certificate with list of cert metrics that need monitoring. * docs: improve upgrade path guidance (#16925) * Test: add noCleanup to TestServer stop (#16919) * docs: fix typo in LocalRequestTimeoutMs (#16917) * ci: add GOTAGS to build-distros (#16934) * APIGW: Routes with duplicate parents should be invalid (#16926) * ensure route parents are unique when creating an http route * Ensure tcp route parents are unique * Added unit tests * ci: remove verify-ci from circleci (#16860) * ci: remove go-tests workflow from CircleCI (#16855) * remove go-tests workflow from CircleCI * add yaml anchor back * ci: build-artifacts - fix platform missing in manifest error (#16940) * ci: build-artifacts - fix platform missing in manifest error * remove platform key * Check acls on resource `Read`, `List`, and `WatchList` (#16842) * Resource validation hook for `Write` endpoint (#16950) * Remove deprecated service-defaults upstream behavior. (#16957) Prior to this change, peer services would be targeted by service-default overrides as long as the new `peer` field was not found in the config entry. 
This commit removes that deprecated backwards-compatibility behavior. Now it is necessary to specify the `peer` field in order for upstream overrides to apply to a peer upstream. * Fix the indentation of the copyAnnotations example (#16873) * Update docs for service-defaults overrides. (#16960) Update docs for service-defaults overrides. Co-authored-by: trujillo-adam <[email protected]> * resource: `WriteStatus` endpoint (#16886) * Remove global.name requirement for APs (#16964) This is not a requirement when using APs because each AP has its own auth method so it's okay if the names overlap. * ci: remove build-distros from CircleCI (#16941) * feat: add reporting config with reload (#16890) * Added backport labels to PR template checklist (#16966) * ci: split frontend ember jobs (#16973) Signed-off-by: Dan Bond <[email protected]> * Memdb Txn Commit race condition fix (#16871) * Add a test to reproduce the race condition * Fix race condition by publishing the event after the commit and adding a lock to prevent out of order events. * split publish to generate the list of events before committing the transaction. 
* add changelog * remove extra func * Apply suggestions from code review Co-authored-by: Dan Upton <[email protected]> * add comment to explain test --------- Co-authored-by: Dan Upton <[email protected]> * add sameness to exported services structs in the api package (#16984) * circleci: remove frontend jobs (#16906) * circleci: remove fronted jobs Signed-off-by: Dan Bond <[email protected]> * remove frontend-cache Signed-off-by: Dan Bond <[email protected]> --------- Signed-off-by: Dan Bond <[email protected]> * Enforce ACLs on resource `Write` and `Delete` endpoints (#16956) * Update list of Envoy versions (#16889) * Update list of Envoy versions * Update docs + CI + tests * Add changelog entry * Add newly-released Envoy versions 1.23.8 and 1.24.6 * Add newly-released Envoy version 1.22.11 * Add mutate hook to `Write` endpoint (#16958) * upgrade test: config nodeName, nodeid, and inherited persistent data for consul container (#16931) * move enterprise test cases out of open source (#16985) * Fix delete when uid not provided (#16996) * Enforce Owner rules in `Write` endpoint (#16983) * add IP rate limiting config update (#16997) * add IP rate limiting config update * fix review comments * * added Sameness Group to proto files (#16998) - added Sameness Group to config entries - added Sameness Group to subscriptions * generated proto files * added Sameness Group events to the state store - added test cases * Refactored health RPC Client - moved code that is common to rpcclient under rpcclient common.go. This will help set us up to support future RPC clients * Refactored proxycfg glue views - Moved views to rpcclient config entry. 
This will allow us to reuse this code for a config entry client * added config entry RPC Client - Copied most of the testing code from rpcclient/health * hooked up new rpcclient in agent * fixed documentation and comments for clarity * added missing error message content to troubleshooting (#17005) * Add PrioritizeByLocality to config entries. (#17007) This commit adds the PrioritizeByLocality field to both proxy-config and service-resolver config entries for locality-aware routing. The field is currently intended for enterprise only, and will be used to enable prioritization of service-mesh connections to services based on geographical region / zone. * fixed bad link (#17009) * added an intro statement for the SI conf entry confiration model (#17017) * added an intro statement for the SI conf entry confiration model * caught a few more typos * Tenancy wildcard validaton for `Write`, `Read`, and `Delete` endpoints (#17004) * docs: update docs related to GH-16779 (#17020) * server: wire up in-process Resource Service (#16978) * add ability to start container tests in debug mode and attach a debugger (#16887) * add ability to start container tests in debug mode and attach a debugger to consul while running it. * add a debug message with the debug port * use pod to get the right port * fix image used in basic test * add more data to identify which container to debug. 
* fix comment Co-authored-by: Evan Culver <[email protected]> * rename debugUri to debugURI --------- Co-authored-by: Evan Culver <[email protected]> * feat: set up reporting agent (#16991) * api: enable query options on agent force-leave endpoint (#15987) * Bump the golang.org/x/net to 0.7.0 to address CVE-2022-41723 (#16754) * Bump the golang.org/x/net to 0.7.0 to address CVE-2022-41723 https://nvd.nist.gov/vuln/detail/CVE-2022-41723 * Add changelog entry --------- Co-authored-by: Nathan Coleman <[email protected]> * Don't send updates twice (#16999) * add test-integrations workflow * add test-integrations success job * update vault integration testing versions (#16949) * change parallelism to 4 forgotestsum. use env.CONSUL_VERSION so we can see the version. * use env for repeated values * match test to circleci * fix envvar * fix envvar 2 * fix envvar 3 * fix envvar 4 * fix envvar 5 * make upgrade and compatibility tests match circleci * backport of commit 107b85cb019dbee204d4286647e53d0e5c60f1bd * backport of commit f0ce0f92505449c4ccf8bf408c2160abace8221f * ci: add test-integrations (#16915) * add test-integrations workflow * add test-integrations success job * update vault integration testing versions (#16949) * change parallelism to 4 forgotestsum. use env.CONSUL_VERSION so we can see the version. 
* use env for repeated values * match test to circleci * fix envvar * fix envvar 2 * fix envvar 3 * fix envvar 4 * fix envvar 5 * make upgrade and compatibility tests match circleci * run go env to check environment * debug docker Signed-off-by: Dan Bond <[email protected]> * debug docker Signed-off-by: Dan Bond <[email protected]> * revert debug docker Signed-off-by: Dan Bond <[email protected]> * going back to command that worked 5 days ago for compatibility tests * Update Envoy versions to reflect changes in #16889 * cd to test dir * try running ubuntu latest * update PR with latest changes that work in enterprise * yaml still sucks * test GH fix (localhost resolution) * change for testing * test splitting and ipv6 lookup for compatibility and upgrade tests * fix indention * consul as image name * remove the on push * add gotestsum back in * removing the use of the gotestsum download action * yaml sucks today just like yesterday * fixing nomad tests * worked out the kinks on enterprise --------- Signed-off-by: Dan Bond <[email protected]> Co-authored-by: John Eikenberry <[email protected]> Co-authored-by: Dan Bond <[email protected]> Co-authored-by: Nathan Coleman <[email protected]> Co-authored-by: Sarah <[email protected]> --------- Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: Dan Bond <[email protected]> Signed-off-by: dttung2905 <[email protected]> Co-authored-by: Tu Nguyen <[email protected]> Co-authored-by: Bryce Kalow <[email protected]> Co-authored-by: David Yu <[email protected]> Co-authored-by: cskh <[email protected]> Co-authored-by: Tyler Wendlandt <[email protected]> Co-authored-by: Curt Bushko <[email protected]> Co-authored-by: amitchahalgits <[email protected]> Co-authored-by: trujillo-adam <[email protected]> Co-authored-by: Dan Upton <[email protected]> Co-authored-by: R.B. 
Boyer <[email protected]> Co-authored-by: Mike Morris <[email protected]> Co-authored-by: sarahalsmiller <[email protected]> Co-authored-by: Eddie Rowe <[email protected]> Co-authored-by: skpratt <[email protected]> Co-authored-by: John Eikenberry <[email protected]> Co-authored-by: Ronald <[email protected]> Co-authored-by: Nick Irvine <[email protected]> Co-authored-by: Chris S. Kim <[email protected]> Co-authored-by: Michael Hofer <[email protected]> Co-authored-by: Anita Akaeze <[email protected]> Co-authored-by: Andrew Stucki <[email protected]> Co-authored-by: Eric Haberkorn <[email protected]> Co-authored-by: Michael Wilkerson <[email protected]> Co-authored-by: Matt Keeler <[email protected]> Co-authored-by: Melisa Griffin <[email protected]> Co-authored-by: John Maguire <[email protected]> Co-authored-by: Ashlee M Boyer <[email protected]> Co-authored-by: Valeriia Ruban <[email protected]> Co-authored-by: Paul Glass <[email protected]> Co-authored-by: Semir Patel <[email protected]> Co-authored-by: Luke Kysow <[email protected]> Co-authored-by: natemollica-dev <[email protected]> Co-authored-by: Ashvitha <[email protected]> Co-authored-by: Derek Menteer <[email protected]> Co-authored-by: Bastien Dronneau <[email protected]> Co-authored-by: Freddy <[email protected]> Co-authored-by: Paul Banks <[email protected]> Co-authored-by: wangxinyi7 <[email protected]> Co-authored-by: Vipin John Wilson <[email protected]> Co-authored-by: Rosemary Wang <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Dhia Ayachi <[email protected]> Co-authored-by: John Murret <[email protected]> Co-authored-by: Poonam Jadhav <[email protected]> Co-authored-by: Nitya Dhanushkodi <[email protected]> Co-authored-by: Dan Bond <[email protected]> Co-authored-by: Nathan Coleman <[email protected]> Co-authored-by: brian shore <[email protected]> Co-authored-by: malizz <[email protected]> Co-authored-by: Kyle Havlovitz 
<[email protected]> Co-authored-by: Jeff Boruszak <[email protected]> Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com> Co-authored-by: Ronald Ekambi <[email protected]> Co-authored-by: Jared Kirschner <[email protected]> Co-authored-by: Michael Zalimeni <[email protected]> Co-authored-by: Hariram Sankaran <[email protected]> Co-authored-by: Dao Thanh Tung <[email protected]> Co-authored-by: Chris Thain <[email protected]> Co-authored-by: Andrea Scarpino <[email protected]> Co-authored-by: Thomas Eckert <[email protected]> Co-authored-by: Evan Culver <[email protected]> Co-authored-by: Andrei Komarov <[email protected]> Co-authored-by: Kevin Wang <[email protected]> Co-authored-by: Sarah <[email protected]>
…ner count from hanging into release/1.15.x (#17085) * cli: remove stray whitespace when loading the consul version from the VERSION file (#16467) Fixes a regression from #15631 in the output of `consul version` from: Consul v1.16.0-dev +ent Revision 56b86acbe5+CHANGES to Consul v1.16.0-dev+ent Revision 56b86acbe5+CHANGES * Docs/services refactor docs day 122022 (#16103) * converted main services page to services overview page * set up services usage dirs * added Define Services usage page * converted health checks everything page to Define Health Checks usage page * added Register Services and Nodes usage page * converted Query with DNS to Discover Services and Nodes Overview page * added Configure DNS Behavior usage page * added Enable Static DNS Lookups usage page * added the Enable Dynamic Queries DNS Queries usage page * added the Configuration dir and overview page - may not need the overview, tho * fixed the nav from previous commit * added the Services Configuration Reference page * added Health Checks Configuration Reference page * updated service defaults configuraiton entry to new configuration ref format * fixed some bad links found by checker * more bad links found by checker * another bad link found by checker * converted main services page to services overview page * set up services usage dirs * added Define Services usage page * converted health checks everything page to Define Health Checks usage page * added Register Services and Nodes usage page * converted Query with DNS to Discover Services and Nodes Overview page * added Configure DNS Behavior usage page * added Enable Static DNS Lookups usage page * added the Enable Dynamic Queries DNS Queries usage page * added the Configuration dir and overview page - may not need the overview, tho * fixed the nav from previous commit * added the Services Configuration Reference page * added Health Checks Configuration Reference page * updated service defaults configuraiton entry to new configuration ref 
format * fixed some bad links found by checker * more bad links found by checker * another bad link found by checker * fixed cross-links between new topics * updated links to the new services pages * fixed bad links in scale file * tweaks to titles and phrasing * fixed typo in checks.mdx * started updating the conf ref to latest template * update SD conf ref to match latest CT standard * Apply suggestions from code review Co-authored-by: Eddie Rowe <[email protected]> * remove previous version of the checks page * fixed cross-links * Apply suggestions from code review Co-authored-by: Eddie Rowe <[email protected]> --------- Co-authored-by: Eddie Rowe <[email protected]> * docs: clarify license expiration upgrade behavior (#16464) * add provider ca auth-method support for azure Does the required dance with the local HTTP endpoint to get the required data for the jwt based auth setup in Azure. Keeps support for 'legacy' mode where all login data is passed on via the auth methods parameters. Refactored check for hardcoded /login fields. 
* Changed titles for services pages to sentence style cap (#16477) * Changed titles for services pages to sentence style cap * missed a meta title * docs: Consul 1.15.0 and Consul K8s 1.0 release notes (#16481) * add new release notes --------- Co-authored-by: Tu Nguyen <[email protected]> * fix (cli): return error msg if acl policy not found (#16485) * fix: return error msg if acl policy not found * changelog * add test * update services nav titles (#16484) * Improve ux to help users avoid overwriting fields of ACL tokens, roles and policies (#16288) * Deprecate merge-policies and add options add-policy-name/add-policy-id to improve CLI token update command * deprecate merge-roles fields * Fix potential flakey tests and update ux to remove 'completely' + typo fixes * NET-2292: port ingress-gateway test case "http" from BATS addendum (#16490) * docs: Update release notes with Envoy compat issue (#16494) * Update v1_15_x.mdx --------- Co-authored-by: Tu Nguyen <[email protected]> * Suppress AlreadyRegisteredError to fix test retries (#16501) * Suppress AlreadyRegisteredError to fix test retries * Remove duplicate sink * Speed up test by registering services concurrently (#16509) * add provider ca support for jwt file base auth Adds support for a jwt token in a file. Simply reads the file and sends the read in jwt along to the vault login. It also supports a legacy mode with the jwt string being passed directly. In which case the path is made optional. * docs(architecture): remove merge conflict leftovers (#16507) * add provider ca auth support for kubernetes Adds support for Kubernetes jwt/token file based auth. Only needs to read the file and save the contents as the jwt/token. 
* Merge pull request #4538 from hashicorp/NET-2396 (#16516) NET-2396: refactor test to reduce duplication * Merge pull request #4584 from hashicorp/refactor_cluster_config (#16517) NET-2841: PART 1 - refactor NewPeeringCluster to support custom config * Add ServiceResolver RequestTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable (#16495) * Leverage ServiceResolver ConnectTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable * Regenerate golden files * Add RequestTimeout field * Add changelog entry * Fix issue where terminating gateway service resolvers weren't properly cleaned up (#16498) * Fix issue where terminating gateway service resolvers weren't properly cleaned up * Add integration test for cleaning up resolvers * Add changelog entry * Use state test and drop integration test * Add support for failover policies (#16505) * modified unsupported envoy version error (#16518) - When an envoy version is out of a supported range, we now return the envoy version being used as `major.minor.x` to indicate that it is the minor version at most that is incompatible - When an envoy version is in the list of unsupported envoy versions we return back the envoy version in the error message as `major.minor.patch` as now the exact version matters. * Remove private prefix from proto-gen-rpc-glue e2e test (#16433) * Fix resolution of service resolvers with subsets for external upstreams (#16499) * Fix resolution of service resolvers with subsets for external upstreams * Add tests * Add changelog entry * Update view filter logic * fixed broken links associated with cluster peering updates (#16523) * fixed broken links associated with cluster peering updates * additional links to fix * typos * fixed redirect file * add provider ca support for approle auth-method Adds support for the approle auth-method. 
Only handles using the approle role/secret to auth; it doesn't support the agent's extra management configuration options (wrap and delete after read) as they are not required as part of the auth (i.e., they are Vault agent concerns). * update connect/ca's vault AuthMethod conf section (#16346) Updated the Params field to re-frame it as supporting arguments specific to the supported vault-agent auto-auth methods, with links to each method's "#configuration" section. Included a callout on the limits of the supported parameters. * proxycfg: ensure that an irrecoverable error in proxycfg closes the xds session and triggers a replacement proxycfg watcher (#16497) Receiving an "acl not found" error from an RPC in the agent cache and the streaming/event components will cause any request loops to cease under the assumption that they will never work again if the token was destroyed. This prevents log spam (#14144, #9738). Unfortunately, due to things like: - authz requests going to stale servers that may not have witnessed the token creation yet - authz requests in a secondary datacenter happening before the tokens get replicated to that datacenter - authz requests from a primary TO a secondary datacenter happening before the tokens get replicated to that datacenter the caller can get an "acl not found" *before* the token exists, rather than just after. The machinery added in the linked PRs will then kick in and prevent the request loop from looping around again once the tokens actually exist. For `consul-dataplane` usages, where xDS is served by the Consul servers rather than the clients, this is ultimately not a problem because in that scenario the `agent/proxycfg` machinery is on-demand and launched by a new xDS stream needing data for a specific service in the catalog. If the watching goroutines are terminated, it ripples down and terminates the xDS stream, which CDP will eventually re-establish, restarting everything. 
For Consul client usages, the `agent/proxycfg` machinery is launched ahead of time, at service registration (called "local" in some of the proxycfg machinery), so when the xDS stream comes in the data is already ready to go. If the watching goroutines terminate, the xDS stream should terminate too, but there is no mechanism to re-spawn the watching goroutines. If the xDS stream reconnects it will see no `ConfigSnapshot` and will not get one again until the client agent is restarted, or the service is re-registered with something changed in it. This PR fixes a few things in the machinery: - There was an inadvertent deadlock when xDS fetched a snapshot from the proxycfg machinery, such that when the watching goroutine terminated the snapshots would never be fetched. This caused some of the xDS machinery to get indefinitely paused and not finish the teardown properly. - Every 30s we now attempt to re-insert all locally registered services into the proxycfg machinery. - When services are re-inserted into the proxycfg machinery we special-case "dead" ones such that we unilaterally replace them rather than doing that conditionally. 
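The re-insertion behaviour described in the commit message above can be sketched roughly as follows. This is a minimal illustration and not the actual `agent/proxycfg` code: the `service`, `watcher`, and `manager` types and the `register`, `markDead`, and `resync` functions are hypothetical names invented for this example. It only shows the rule that a live, unchanged entry is left alone while a "dead" entry is always replaced.

```go
package main

import (
	"fmt"
	"sync"
)

// service is a hypothetical stand-in for a locally registered proxy service.
type service struct {
	ID     string
	Config string
}

// watcher is a hypothetical stand-in for the state behind one service's
// watching goroutines.
type watcher struct {
	svc  service
	dead bool // set once the watcher stopped after an irrecoverable error
}

// manager is a hypothetical registry of watchers keyed by service ID.
type manager struct {
	mu       sync.Mutex
	watchers map[string]*watcher
}

func newManager() *manager {
	return &manager{watchers: make(map[string]*watcher)}
}

// register inserts svc and reports whether a new watcher was started.
// A live watcher is replaced only if its config changed; a dead watcher
// is replaced unconditionally.
func (m *manager) register(svc service) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	if w, ok := m.watchers[svc.ID]; ok && !w.dead && w.svc.Config == svc.Config {
		return false
	}
	m.watchers[svc.ID] = &watcher{svc: svc}
	return true
}

// markDead simulates a watcher terminating, for example after an
// "acl not found" error.
func (m *manager) markDead(id string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if w, ok := m.watchers[id]; ok {
		w.dead = true
	}
}

// resync re-inserts every locally registered service and returns how many
// watchers were started or replaced.
func (m *manager) resync(local []service) int {
	n := 0
	for _, s := range local {
		if m.register(s) {
			n++
		}
	}
	return n
}

func main() {
	m := newManager()
	local := []service{{ID: "web-sidecar", Config: "v1"}}

	fmt.Println(m.resync(local)) // 1: first pass starts the watcher
	fmt.Println(m.resync(local)) // 0: live and unchanged, nothing to do
	m.markDead("web-sidecar")
	fmt.Println(m.resync(local)) // 1: the dead watcher is replaced
}
```

In a long-running agent, something like a `time.Ticker` with a 30 second period would presumably drive `resync`; the ticker is left out here so the sketch runs to completion immediately.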
* NET-2903 Normalize weight for http routes (#16512) * NET-2903 Normalize weight for http routes * Update website/content/docs/connect/gateways/api-gateway/configuration/http-route.mdx Co-authored-by: trujillo-adam <[email protected]> * Add some basic UI improvements for api-gateway services (#16508) * Add some basic ui improvements for api-gateway services * Add changelog entry * Use ternary for null check * Update gateway doc links * rename changelog entry for new PR * Fix test * fixes empty link in DNS usage page (#16534) * NET-2904 Fixes API Gateway Route Service Weight Division Error * Improve ux around ACL token to help users avoid overwriting node/service identities (#16506) * Deprecate merge-node-identities and merge-service-identities flags * added tests for node identities changes * added changelog file and docs * Follow-up fixes to consul connect envoy command (#16530) * Merge pull request #4573 from hashicorp/NET-2841 (#16544) * Merge pull request #4573 from hashicorp/NET-2841 NET-2841: PART 2 refactor upgrade tests to include version 1.15 * update upgrade versions * upgrade test: discovery chain across partition (#16543) * Update the consul-k8s cli docs for the new `proxy log` subcommand (#16458) * Update the consul-k8s cli docs for the new `proxy log` subcommand * Updated consul-k8s docs from PR feedback * Added proxy log command to release notes * Delete test-link-rewrites.yml (#16546) * feat: update notification to use hds toast component (#16519) * Fix flakey tests related to ACL token updates (#16545) * Fix flakey tests related to ACL token updates * update all acl token update tests * extra create_token function to its own thing * support vault auth config for alicloud ca provider Add support for using existing vault auto-auth configurations as the provider configuration when using Vault's CA provider with AliCloud. AliCloud requires 2 extra fields to enable it to use STS (it's preferred auth setup). 
Our vault-plugin-auth-alicloud package contained a method to help generate them as they require you to make an http call to a faked endpoint proxy to get them (url and headers base64 encoded). * Update docs to reflect functionality (#16549) * Update docs to reflect functionality * make consistent with other client runtimes * upgrade test: use retry with ModifyIndex and remove ent test file (#16553) * add agent locality and replicate it across peer streams (#16522) * docs: Document config entry permissions (#16556) * Broken link fixes (#16566) * NET-2954: Improve integration tests CI execution time (#16565) * NET-2954: Improve integration tests CI execution time * fix ci * remove comments and modify config file * fix bug that can lead to peering service deletes impacting the state of local services (#16570) * Update changelog with patch releases (#16576) * Bump submodules from latest 1.15.1 patch release (#16578) * Update changelog with Consul patch releases 1.13.7, 1.14.5, 1.15.1 * Bump submodules from latest patch release * Forgot one * website: adds content-check command and README update (#16579) * added a backport-checker GitHub action (#16567) * added a backport-checker GitHub action * Update .github/workflows/backport-checker.yml * auto-updated agent/uiserver/dist/ from commit 63204b518 (#16587) Co-authored-by: hc-github-team-consul-core <[email protected]> * GRPC stub for the ResourceService (#16528) * UI: Fix htmlsafe errors throughout the app (#16574) * Upgrade ember-intl * Add changelog * Add yarn lock * Add namespace file with build tag for OSS gateway tests (#16590) * Add namespace file with build tag for OSS tests * Remove TODO comment * JIRA pr check: Filter out OSS/ENT merges (#16593) * jira pr check filter out dependabot and oss/ent merges * allow setting locality on services and nodes (#16581) * Add Peer Locality to Discovery Chains (#16588) Add peer locality to discovery chains * fixes for unsupported partitions field in CRD metadata block 
(#16604) * fixes for unsupported partitions field in CRD metadata block * Apply suggestions from code review Co-authored-by: Luke Kysow <[email protected]> --------- Co-authored-by: Luke Kysow <[email protected]> * Create a weekly 404 checker for all Consul docs content (#16603) * Consul WAN Fed with Vault Secrets Backend document updates (#16597) * Consul WAN Fed with Vault Secrets Backend document updates * Corrected dc1-consul.yaml and dc2-consul.yaml file highlights * Update website/content/docs/k8s/deployment-configurations/vault/wan-federation.mdx Co-authored-by: trujillo-adam <[email protected]> * Update website/content/docs/k8s/deployment-configurations/vault/wan-federation.mdx Co-authored-by: trujillo-adam <[email protected]> --------- Co-authored-by: trujillo-adam <[email protected]> * Allow HCP metrics collection for Envoy proxies Co-authored-by: Ashvitha Sridharan <[email protected]> Co-authored-by: Freddy <[email protected]> Add a new envoy flag: "envoy_hcp_metrics_bind_socket_dir", a directory where a unix socket will be created with the name `<namespace>_<proxy_id>.sock` to forward Envoy metrics. If set, this will configure: - In bootstrap configuration a local stats_sink and static cluster. These will forward metrics to a loopback listener sent over xDS. - A dynamic listener listening at the socket path that the previously defined static cluster is sending metrics to. - A dynamic cluster that will forward traffic received at this listener to the hcp-metrics-collector service. Reasons for having a static cluster pointing at a dynamic listener: - We want to secure the metrics stream using TLS, but the stats sink can only be defined in bootstrap config. With dynamic listeners/clusters we can use the proxy's leaf certificate issued by the Connect CA, which isn't available at bootstrap time. - We want to intelligently route to the HCP collector. Configuring its addreess at bootstrap time limits our flexibility routing-wise. More on this below. 
Reasons for defining the collector as an upstream in `proxycfg`: - The HCP collector will be deployed as a mesh service. - Certificate management is taken care of, as mentioned above. - Service discovery and routing logic is automatically taken care of, meaning that no code changes are required in the xds package. - Custom routing rules can be added for the collector using discovery chain config entries. Initially the collector is expected to be deployed to each admin partition, but in the future could be deployed centrally in the default partition. These config entries could even be managed by HCP itself. * Add copywrite setup file (#16602) * Add sameness-group configuration entry. (#16608) This commit adds a sameness-group config entry to the API and structs packages. It includes some validation logic and a new memdb index that tracks the default sameness-group for each partition. Sameness groups will simplify the effort of managing failovers / intentions / exports for peers and partitions. Note that this change purely to introduce the configuration entry and does not include the full functionality of sameness-groups. * Preserve CARoots when updating Vault CA configuration (#16592) If a CA config update did not cause a root change, the codepath would return early and skip some steps which preserve its intermediate certificates and signing key ID. This commit re-orders some code and prevents updates from generating new intermediate certificates. * Add UI copyright headers files (#16614) * Add copyright headers to UI files * Ensure copywrite file ignores external libs * Docs discovery typo (#16628) * docs(discovery): typo * docs(discovery): EOF and trim lines --------- Co-authored-by: trujillo-adam <[email protected]> * Fix issue with trust bundle read ACL check. (#16630) This commit fixes an issue where trust bundles could not be read by services in a non-default namespace, unless they had excessive ACL permissions given to them. 
Prior to this change, `service:write` was required in the default namespace in order to read the trust bundle. Now, `service:write` to a service in any namespace is sufficient. * Basic resource type registry (#16622) * Backport ENT-4704 (#16612) * feat: update typography to consume hds styles (#16577) * Add known issues to Raft WAL docs. (#16600) * Add known issues to Raft WAL docs. * Refactor update based on review feedback * Tune 404 checker to exclude false-positives and use intended file path (#16636) * Update e2e tests for namespaces (#16627) * Refactored "NewGatewayService" to handle namespaces, fixed TestHTTPRouteFlattening test * Fixed existing http_route tests for namespacing * Squash aclEnterpriseMeta for ResourceRefs and HTTPServices, accept namespace for creating connect services and regular services * Use require instead of assert after creating namespaces in http_route_tests * Refactor NewConnectService and NewGatewayService functions to use cfg objects to reduce number of method args * Rename field on SidecarConfig in tests from `SidecarServiceName` to `Name` to avoid stutter * net 2731 ip config entry OSS version (#16642) * ip config entry * name changing * move to ent * ent version * renaming * change format * renaming * refactor * add default values * fix confusing spiffe ids in golden tests (#16643) * First cluster grpc service should be NodePort for the second cluster to connect (#16430) * First cluster grpc service should be NodePort This is based on the issue opened here https://github.com/hashicorp/consul-k8s/issues/1903 If you follow the documentation https://developer.hashicorp.com/consul/docs/k8s/deployment-configurations/single-dc-multi-k8s exactly as it is, the first cluster will only create the consul UI service on NodePort but not the rest of the services (including for grpc). By default, from the helm chart, they are created as headless services by setting clusterIP None. 
This prevents the second cluster from discovering the Consul server on the first cluster over gRPC, because it cannot reach that server through the default gRPC port 8502, and it ends up in the error shown in the issue https://github.com/hashicorp/consul-k8s/issues/1903 As a solution, the grpc service should be exposed using NodePort (or LoadBalancer). I added the required changes to both cluster1-values.yaml and cluster2-values.yaml, along with a description of those changes so that regular users can understand them. Kindly review and I hope this PR will be accepted. * Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx Co-authored-by: trujillo-adam <[email protected]> * Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx Co-authored-by: trujillo-adam <[email protected]> * Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx Co-authored-by: trujillo-adam <[email protected]> --------- Co-authored-by: trujillo-adam <[email protected]> * Add in query options for catalog service existing in a specific (#16652) namespace when creating service for tests * fix: add AccessorID property to PUT token request (#16660) * add sameness group support to service resolver failover and redirects (#16664) * Fix incorrect links on Envoy extensions documentation (#16666) * [API Gateway] Fix invalid cluster causing gateway programming delay (#16661) * Add test for http routes * Add fix * Fix tests * Add changelog entry * Refactor and fix flaky tests * Bump tomhjp/gh-action-jira-search from 0.2.1 to 0.2.2 (#16667) Bumps [tomhjp/gh-action-jira-search](https://github.com/tomhjp/gh-action-jira-search) from 0.2.1 to 0.2.2. 
- [Release notes](https://github.com/tomhjp/gh-action-jira-search/releases) - [Commits](https://github.com/tomhjp/gh-action-jira-search/compare/v0.2.1...v0.2.2) --- updated-dependencies: - dependency-name: tomhjp/gh-action-jira-search dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * build(deps): bump atlassian/gajira-transition from 2.0.1 to 3.0.1 (#15921) Bumps [atlassian/gajira-transition](https://github.com/atlassian/gajira-transition) from 2.0.1 to 3.0.1. - [Release notes](https://github.com/atlassian/gajira-transition/releases) - [Commits](https://github.com/atlassian/gajira-transition/compare/v2.0.1...v3.0.1) --- updated-dependencies: - dependency-name: atlassian/gajira-transition dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: David Yu <[email protected]> * Snapshot restore tests (#16647) * add snapshot restore test * add logstore as test parameter * Use the correct image version * make sure we read the logs from a followers to test the follower snapshot install path. * update to raf-wal v0.3.0 * add changelog. * updating changelog for bug description and removed integration test. 
* setting up test container builder to only set logStore for 1.15 and higher --------- Co-authored-by: Paul Banks <[email protected]> Co-authored-by: John Murret <[email protected]> * add sameness groups to discovery chains (#16671) * feat: add category annotation to RPC and gRPC methods (#16646) * Update GH actions to create Jira issue automatically (#16656) * Adds check to verify that the API Gateway is being created with at least one listener * Fix route subscription when using namespaces (#16677) * Fix route subscription when using namespaces * Update changelog * Fix changelog entry to reference that the bug was enterprise only * peering: peering partition failover fixes (#16673) add local source partition for peered upstreams * fix jira sync actions, remove custom fields (#16686) * Docs/update jira sync pr issue (#16688) * fix jira sync actions, remove custom fields * remove more additional fields, debug * Docs: Jira sync Update issuetype to bug (#16689) * update issuetype to bug * fix conditional for pr edu * build(deps): bump tomhjp/gh-action-jira-create from 0.2.0 to 0.2.1 (#16685) Bumps [tomhjp/gh-action-jira-create](https://github.com/tomhjp/gh-action-jira-create) from 0.2.0 to 0.2.1. - [Release notes](https://github.com/tomhjp/gh-action-jira-create/releases) - [Commits](https://github.com/tomhjp/gh-action-jira-create/compare/v0.2.0...v0.2.1) --- updated-dependencies: - dependency-name: tomhjp/gh-action-jira-create dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: David Yu <[email protected]> * build(deps): bump tomhjp/gh-action-jira-comment from 0.1.0 to 0.2.0 (#16684) Bumps [tomhjp/gh-action-jira-comment](https://github.com/tomhjp/gh-action-jira-comment) from 0.1.0 to 0.2.0. 
- [Release notes](https://github.com/tomhjp/gh-action-jira-comment/releases) - [Commits](https://github.com/tomhjp/gh-action-jira-comment/compare/v0.1.0...v0.2.0) --- updated-dependencies: - dependency-name: tomhjp/gh-action-jira-comment dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: David Yu <[email protected]> * NET-2397: Add readme.md to upgrade test subdirectory (#16610) * NET-2397: Add readme.md to upgrade test subdirectory * remove test code * fix link and update steps of adding new test cases (#16654) * fix link and update steps of adding new test cases * Apply suggestions from code review Co-authored-by: Nick Irvine <[email protected]> --------- Co-authored-by: Nick Irvine <[email protected]> --------- Co-authored-by: cskh <[email protected]> Co-authored-by: Nick Irvine <[email protected]> * chore: replace hardcoded node name with a constant (#16692) * Fix broken links from api docs (#16695) * Update WAL Known issues (#16676) * UI: update Ember to 3.28.6 (#16616) --------- Co-authored-by: wenincode <[email protected]> * Regen helm docs (#16701) * Remove unused are hosts set check (#16691) * Remove unused are hosts set check * Remove all traces of unused 'AreHostsSet' parameter * Remove unused Hosts attribute * Remove commented out use of snap.APIGateway.Hosts * [NET-3029] Migrate build-distros to GHA (#16669) * migrate build distros to GHA Signed-off-by: Dan Bond <[email protected]> * build-arm Signed-off-by: Dan Bond <[email protected]> * don't use matrix Signed-off-by: Dan Bond <[email protected]> * check-go-mod Signed-off-by: Dan Bond <[email protected]> * add notify slack script Signed-off-by: Dan Bond <[email protected]> * notify slack if failure Signed-off-by: Dan Bond <[email protected]> * rm notify slack script Signed-off-by: Dan Bond <[email protected]> * fix 
check-go-mod job Signed-off-by: Dan Bond <[email protected]> --------- Signed-off-by: Dan Bond <[email protected]> * Update envoy extension docs, service-defaults, add multi-config example for lua (#16710) * fix build workflow (#16719) Signed-off-by: Dan Bond <[email protected]> * Helm docs without developer.hashicorp.com prefix (#16711) This was causing linter errors * add extra resiliency to snapshot restore test (#16712) * fix: gracefully fail on invalid port number (#16721) * Copyright headers for config files git + circleci (#16703) * Copyright headers for config files git + circleci * Release folder copyright headers * fix bug where pqs that failover to a cluster peer dont un-fail over (#16729) * add enterprise xds tests (#16738) * delete config when nil (#16690) * delete config when nil * fix mock interface implementation * fix handler test to use the right assertion * extract DeleteConfig as a separate API. * fix mock limiter implementation to satisfy the new interface * fix failing tests * add test comments * Changelog for audit logging fix. (#16700) * Changelog for audit logging fix. * Use GH issues type for edu board (#16750) * fix: remove unused tenancy category from rate limit spec (#16740) * Remove version bump from CRT workflow (#16728) This bumps the version to reflect the next patch release; however, we use a specific branch for each patch release and so never wind up cutting a release directly from the `release/1.15.x` (for example) where this is intended to work. * tests instantiating clients w/o shutting down (#16755) noticed via their port still in use messages. * RELENG-471: Remove obsolete load-test workflow (#16737) * Remove obsolete load-test workflow * remove load-tests from circleci config. 
--------- Co-authored-by: John Murret <[email protected]> * add failover policy to ProxyConfigEntry in api (#16759) * add failover policy to ProxyConfigEntry in api * update docs * Fix broken links in Consul docs (#16640) * Fix broken links in Consul docs * more broken link fixes * more 404 fixes * 404 fixes * broken link fix --------- Co-authored-by: Tu Nguyen <[email protected]> * Change partition for peers in discovery chain targets (#16769) This commit swaps the partition field to the local partition for discovery chains targeting peers. Prior to this change, peer upstreams would always use a value of default regardless of which partition they exist in. This caused several issues in xds / proxycfg because of id mismatches. Some prior fixes were made to deal with one-off id mismatches that this PR also cleans up, since they are no longer needed. * Docs/intentions refactor docs day 2022 (#16758) * converted intentions conf entry to ref CT format * set up intentions nav * add page for intentions usage * final intentions usage page * final intentions overview page * fixed old relative links * updated diagram for overview * updated links to intentions content * fixed typo in updated links * rename intentions overview page file to index * rollback link updates to intentions overview * fixed nav * Updated custom HTML in API and CLI pages to MD * applied suggestions from review to index page * moved conf examples from usage to conf ref * missed custom HTML section * applied additional feedback * Apply suggestions from code review Co-authored-by: Tu Nguyen <[email protected]> * updated headings in usage page * renamed files and udpated nav * updated links to new file names * added redirects and final tweaks * typo --------- Co-authored-by: Tu Nguyen <[email protected]> * Add storage backend interface and in-memory implementation (#16538) Introduces `storage.Backend`, which will serve as the interface between the Resource Service and the underlying storage system (Raft 
today, but in the future, who knows!). The primary design goal of this interface is to keep its surface area small, and push as much functionality as possible into the layers above, so that new implementations can be added with little effort, and easily proven to be correct. To that end, we also provide a suite of "conformance" tests that can be run against a backend implementation to check it behaves correctly. In this commit, we introduce an initial in-memory storage backend, which is suitable for tests and when running Consul in development mode. This backend is a thin wrapper around the `Store` type, which implements a resource database using go-memdb and our internal pub/sub system. `Store` will also be used to handle reads in our Raft backend, and in the future, used as a local cache for external storage systems. * Fix bug in changelog checker where bash variable is not quoted (#16681) * Read(...) endpoint for the resource service (#16655) * Fix Edu Jira automation (#16778) * Fix struct tags for TCPService enterprise meta (#16781) * Fix struct tags for TCPService enterprise meta * Add changelog * Expand route flattening test for multiple namespaces (#16745) * Exand route flattening test for multiple namespaces * Add helper for checking http route config entry exists without checking for bound status * Fix port and hostname check for http route flattening test * WatchList(..) endpoint for the resource service (#16726) * Allocate virtual ip for resolver/router/splitter config entries (#16760) * add ip rate limiter controller OSS parts (#16790) * Resource service List(..) 
endpoint (#16753) * changes to support new PQ enterprise fields (#16793) * add scripts for testing locally consul-ui-toolkit (#16794) * Update normalization of route refs (#16789) * Use merge of enterprise meta's rather than new custom method * Add merge logic for tcp routes * Add changelog * Normalize certificate refs on gateways * Fix infinite call loop * Explicitly call enterprise meta * copyright headers for agent folder (#16704) * copyright headers for agent folder * Ignore test data files * fix proto files and remove headers in agent/uiserver folder * ignore deep-copy files * Copyright headers for command folder (#16705) * copyright headers for agent folder * Ignore test data files * fix proto files and remove headers in agent/uiserver folder * ignore deep-copy files * copyright headers for agent folder * Copyright headers for command folder * fix merge conflicts * Add copyright headers for acl, api and bench folders (#16706) * copyright headers for agent folder * Ignore test data files * fix proto files and remove headers in agent/uiserver folder * ignore deep-copy files * copyright headers for agent folder * fix merge conflicts * copyright headers for agent folder * Ignore test data files * fix proto files * ignore agent/uiserver folder for now * copyright headers for agent folder * Add copyright headers for acl, api and bench folders * Github Actions Migration - move go-tests workflows to GHA (#16761) * go-tests workflow * add test splitting to go-tests * fix re-reun fails report path * fix re-reun fails report path another place * fixing tests for32bit and race * use script file to generate runners * fixing run path * add checkout * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * passing runs-on * setting up runs-on as a parameter to check-go-mod * making on 
pull_request * Update .github/scripts/rerun_fails_report.sh Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * make runs-on required * removing go-version param that is not used. * removing go-version param that is not used. * Modify build-distros to use medium runners (#16773) * go-tests workflow * add test splitting to go-tests * fix re-reun fails report path * fix re-reun fails report path another place * fixing tests for32bit and race * use script file to generate runners * fixing run path * add checkout * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * passing runs-on * setting up runs-on as a parameter to check-go-mod * trying mediums * adding in script * fixing runs-on to be parameter * fixing merge conflict * changing to on push * removing whitespace * go-tests workflow * add test splitting to go-tests * fix re-reun fails report path * fix re-reun fails report path another place * fixing tests for32bit and race * use script file to generate runners * fixing run path * add checkout * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * passing runs-on * setting up runs-on as a parameter to check-go-mod * changing back to on pull_request --------- Co-authored-by: Dan Bond <[email protected]> * Github Actions Migration - move verify-ci workflows to GHA (#16777) * add verify-ci workflow * adding comment and changing to on pull request. 
* changing to pull_requests * changing to pull_request * Apply suggestions from code review Co-authored-by: Dan Bond <[email protected]> * [NET-3029] Migrate frontend to GHA (#16731) * changing set up to a small * using consuls own custom runner pool. --------- Co-authored-by: Dan Bond <[email protected]> * Copyright headers for missing files/folders (#16708) * copyright headers for agent folder * fix: export ReadWriteRatesConfig struct as it needs to referenced from consul-k8s (#16766) * docs: Updates to support HCP Consul cluster peering release (#16774) * New HCP Consul documentation section + links * Establish cluster peering usage cross-link * unrelated fix to backport to v1.15 * nav correction + fixes * Tech specs fixes * specifications for headers * Tech specs fixes + alignments * sprawl edits * Tip -> note * port ENT ingress gateway upgrade tests [NET-2294] [NET-2296] (#16804) * [COMPLIANCE] Add Copyright and License Headers (#16807) * [COMPLIANCE] Add Copyright and License Headers * fix headers for generated files * ignore dist folder --------- Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com> Co-authored-by: Ronald Ekambi <[email protected]> Co-authored-by: Ronald <[email protected]> * add order by locality failover to Consul enterprise (#16791) * ci: changes resulting from running on consul-enterprise (#16816) * changes resulting from running on consul-enterprise * removing comment line * port ENT upgrade tests flattening (#16824) * docs: raise awareness of GH-16779 (#16823) * updating command to reflect the additional package exclusions in CircleCI (#16829) * storage: fix resource leak in Watch (#16817) * Remove UI brand-loader copyright headers as they do not render appropriately (#16835) * Add sameness-group to exported-services config entries (#16836) This PR adds the sameness-group field to exported-service config entries, which allows for services to be exported to multiple destination partitions 
Community Note
Overview of the Issue
I have two production Kubernetes clusters created with kops. They run in two separate VPCs on AWS, and the VPCs are peered. I wanted to install Consul on both clusters as a single Consul datacenter, and I followed the documentation to do so. Everything works on the first cluster, which acts as the server. On the second cluster (the client), the deployment fails: the consul-server-acl-init and consul-connect-injector pods always end up in a crash loop.
Reproduction Steps
First, you need two working Kubernetes clusters.
Prepared the Helm release names as environment variables for both the server and client install:
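The exact commands did not survive in this report. As a minimal sketch, the release names could be exported like this; the variable names and values are assumptions modeled on the linked guide, and any unique names would work:

```shell
# Assumed release names: one for the server cluster, one for the client cluster.
export HELM_RELEASE_SERVER=server
export HELM_RELEASE_CLIENT=client
```

Helm prefixes the Kubernetes resources it creates with the release name, so distinct names keep the server-side and client-side objects easy to tell apart.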
On server cluster:
Helm chart and its custom values used on the Server K8 cluster:
To deploy, first generate the Gossip encryption key and save it as a Kubernetes secret.
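A sketch of this step, wrapped in a helper function so it can be reviewed before it is run. The function name is hypothetical, and the secret name `consul-gossip-encryption-key` with the field `key` is an assumption; whatever is used has to match the gossip encryption settings in the Helm values:

```shell
# Hypothetical helper: generate a gossip key and store it as a Kubernetes secret.
# Requires the consul and kubectl binaries, with kubectl pointed at the server cluster.
create_gossip_secret() {
  # `consul keygen` prints a new random encryption key to stdout.
  kubectl create secret generic consul-gossip-encryption-key \
    --from-literal=key="$(consul keygen)"
}
```

Calling `create_gossip_secret` once against the server cluster should be enough, since the client cluster in this setup does not run its own servers.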
Installed on the first cluster:
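The install command is roughly of the following shape. This is a sketch, not the exact command I ran: the function name is hypothetical, and it assumes the `hashicorp` Helm repository has already been added and that the values file is named `cluster1-values.yaml`:

```shell
# Hypothetical helper: install the hashicorp/consul chart.
#   $1 = Helm release name, $2 = path to the values file
install_consul() {
  release="$1"
  values_file="$2"
  helm install "${release}" --values "${values_file}" hashicorp/consul
}
```

For example, `install_consul server cluster1-values.yaml` installs the server side; the same helper can be reused later for the client cluster with its own release name and values file.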
Extracted CA certificate and ACL bootstrap token generated during installation on the server k8 cluster:
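A sketch of the extraction, assuming the chart's default naming where both secrets are prefixed with the release name (`<release>-consul-ca-cert` and `<release>-consul-bootstrap-acl-token`). The function name and the output file name are hypothetical:

```shell
# Hypothetical helper: export the CA cert and ACL bootstrap token secrets to a file.
#   $1 = server Helm release name, $2 = output file (optional)
extract_credentials() {
  release="$1"
  out_file="${2:-cluster1-credentials.yaml}"
  kubectl get secret \
    "${release}-consul-ca-cert" \
    "${release}-consul-bootstrap-acl-token" \
    --output yaml > "${out_file}"
}
```

The resulting file contains secret material, so it should be handled accordingly and deleted once it has been applied to the second cluster.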
On client k8 cluster:
Applied the credentials extracted from the first cluster to the second cluster:
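A sketch of the apply step. The function name, the kubectl context name, and the default file name `cluster1-credentials.yaml` are hypothetical placeholders for whatever was used when exporting the secrets:

```shell
# Hypothetical helper: apply the exported secrets to the client cluster.
#   $1 = kubectl context of the client cluster, $2 = credentials file (optional)
apply_credentials() {
  context="$1"
  creds_file="${2:-cluster1-credentials.yaml}"
  kubectl --context "${context}" apply --filename "${creds_file}"
}
```

Passing the context explicitly avoids applying the secrets to the server cluster by accident when both clusters live in the same kubeconfig.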
In the client cluster's Helm values, 172.26.1.58 is one of the node IPs of the server k8 cluster and 31608 is the NodePort exposed on that cluster.
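Both values can be looked up on the server cluster. A sketch, assuming the NodePort in question belongs to the Consul UI service and that the service follows the chart's default naming (`<release>-consul-ui`); the function names are hypothetical:

```shell
# Hypothetical helpers, run with kubectl pointed at the server cluster.

# List nodes with their internal and external IPs.
server_node_ips() {
  kubectl get nodes --output wide
}

# Print the NodePort of the first port on the UI service.
#   $1 = server Helm release name
ui_node_port() {
  release="$1"
  kubectl get service "${release}-consul-ui" \
    --output jsonpath='{.spec.ports[0].nodePort}'
}
```

Any node IP that is routable from the client cluster should work, since a NodePort service listens on every node.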
Then, proceeded with the installation of the second cluster.
At this point:
Status of server cluster:
Status of client cluster:
The consul-connect-injector and consul-server-acl pods are stuck in a crash loop.
Logs
Logs from consul-connect-injector and consul-server-acl pods are:
Expected behavior
The pods should not crash loop, and the client should be able to join the server cluster over gRPC port 8502.
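Whether the gRPC port is reachable at all can be checked with a plain TCP probe from a machine or pod on the client side. This is a diagnostic sketch, not part of the Consul tooling: it assumes bash and the coreutils `timeout` command are available, and the function name is hypothetical:

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
#   $1 = host, $2 = port
port_open() {
  host="$1"
  port="$2"
  # bash's /dev/tcp pseudo-device opens a TCP connection; timeout bounds the wait.
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}
```

For example, `port_open 172.26.1.58 8502 && echo reachable || echo unreachable` probes the gRPC port on the server node used above. A successful probe only shows that something accepts TCP connections on that port, not that the gRPC handshake or TLS setup will succeed.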
Environment details
If not already included, please provide the following:
consul-k8s version: 1.14.4, using Helm chart 1.0.4
values.yaml used to deploy the helm chart: already provided above
Additionally, please provide details regarding the Kubernetes Infrastructure, as shown below:
Kubernetes version:
Server cluster: v1.23.9
Client cluster: v1.23.16
Cloud Provider: K8s created using Kops on AWS
Networking CNI plugin in use: Calico
Additional Context
The two Kubernetes clusters are in two separate VPCs, but the VPCs are peered, the clusters can communicate with each other, and the full CIDR ranges are whitelisted on all ports.
The doc I followed is https://developer.hashicorp.com/consul/docs/k8s/deployment-configurations/single-dc-multi-k8s