Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move to use 127.0.0.1 for network listeners by default in some cases #3639

Closed
wants to merge 4 commits into from

Conversation

atoulme
Copy link
Contributor

@atoulme atoulme commented Sep 14, 2023

Description:
Move to use 127.0.0.1 for the network listen interface of the collector by default on Windows and Linux installers, as well as Chocolatey

@atoulme atoulme requested review from a team as code owners September 14, 2023 18:00
@rmfitzpatrick
Copy link
Contributor

I don't believe we've advertised this as an upcoming change and wonder about a smoother transition using feature flags and the internal settings mechanism.

The failing TestWindowsIISInstrumentation test shows that the health check extension is arguably a bad candidate for this change since its use is catered for external clients.

CHANGELOG.md Outdated Show resolved Hide resolved
@atoulme
Copy link
Contributor Author

atoulme commented Sep 14, 2023

We're just changing some of the default values of the installers and not the software itself, so we can probably not do this work with feature flags. We could make a warning log appear advertising this upcoming change as a separate PR.

I guess I will need to check the best way to address the health check failure. For now I think the best approach is to configure the health check to bind to 0.0.0.0 to keep the test working.

@jeffreyc-splunk
Copy link
Contributor

I guess I will need to check the best way to address the health check failure. For now I think the best approach is to configure the health check to bind to 0.0.0.0 to keep the test working.

You can try updating the installation command for the -network_interface "0.0.0.0" option.

@atoulme
Copy link
Contributor Author

atoulme commented Sep 14, 2023

Yes, I pushed this as result. Thanks for the help :)

@jeffreyc-splunk
Copy link
Contributor

If you rebase, the installer tests will probably fail since it is expecting "0.0.0.0" by default , and the custom test uses "127.0.0.1". We can probably just swap those in the tests.

README.md Show resolved Hide resolved
README.md Show resolved Hide resolved
@atoulme
Copy link
Contributor Author

atoulme commented Sep 14, 2023

Re docs - sure. @aurbiztondo-splunk as heads up

@atoulme atoulme force-pushed the localhost_listen branch 4 times, most recently from f437cde to 74c922e Compare September 15, 2023 03:05
@rmfitzpatrick
Copy link
Contributor

rmfitzpatrick commented Sep 18, 2023

We're just changing some of the default values of the installers and not the software itself, so we can probably not do this work with feature flags. We could make a warning log appear advertising this upcoming change as a separate PR.

I'm suggesting another approach based on the content of these changes. Having to manage the ip in multiple places adds some complexity to expected behavior and considering its use is only in the default agent and gateway configs that the settings/main evaluate it could happen there. imo 127.0.0.1* would only be applicable for agent installations and defaulting* to it in gateway mode is potentially entirely breaking.

edit: We don't necessarily have to tie into the feature flags, but building on the existing (config) settings functionality makes more sense to me than addressing this in every installer/deployment method.

@hughesjj
Copy link
Contributor

Well, I feel this discussion in -contrib is relevant. I definitely prefer to use localhost, but, have we discussed this amongst our customers before changing the default? Sorry if you've internally shared it, if so I've missed it.

If this was greenfield, yes, totally agreed localhost would be ideal, but I'm unaware of the full scope impact to our customers by changing this default

Some telemetry around configuration would be ideal, assuming we could audit all receivers to ensure configopaque is in use where appropriate...

(yay more work! Eh, that's what we get paid for 😁)

@atoulme
Copy link
Contributor Author

atoulme commented Oct 2, 2023

From our discussions. the approach to revise this approach advocates to be mindful of adopting different modes of operation for the collector, which approach the problem holistically:

  • Agent
  • Gateway
  • Custom
    We currently don't have a good way to identify how the collector was started. We run with a custom mode always. We also don't identify anywhere in our installation if we picked agent or gateway during install, except for the config file name we pick.

I like this approach as an evolution of how we shape and distribute the collector to make it easier on everyone to know how things work, with an authoritative approach where we know for a fact that running as gateway is matching an exact configuration and reduces pain levels for customers who can live with defaults.

Right now, however, we're not there, and we have to contend with good defaults for our "custom" mode which is set initially by our installer.

The proposition then boils down right now to heuristics, which may help but also burden ourselves with complexity or edge cases. The proposition is that if we identify the collector is running with an exact match to the gateway configuration file location, our settings defaults then should change to be set to 0.0.0.0.

This proposition contradicts this PR as we set the environment variable value during installation to a fixed value, which we set to 127.0.0.1 for specific installation methods that reflect most likely an agent deployment on a host.

We can approach this in 2 ways:
We can expose the default value during installation to be decided based on prior prompts and choose 127.0.0.1 or 0.0.0.0 as defaults based on whether the user wants a gateway or agent mode.
We can decide to leave this environment variable unset during installation and apply a default mindfully based on file location.

Both approaches have their merits. I would personally like to pick the one that doesn't bind us in a particular mode of operation.
If we set the environment variable explicitly, we can offer a clear contract to our users. However, we are then beholden to this approach moving forward.
If we set the default based on file location, we are placing ourselves in a more flimsy place, as the file location maybe not always reflect that we are working with a gateway mode, but we reduce our commitment.

I propose we do the following:

  • Add a log at startup with the value picked in settings.go so we don't have a hard time debugging this later with customers.
  • Add code towards moving to 0.0.0.0 when the gateway location is used.
  • During installation, do not set the default env var value if none is provided.

@rmfitzpatrick
Copy link
Contributor

Add a log at startup with the value picked in settings.go so we don't have a hard time debugging this later with customers.
Add code towards moving to 0.0.0.0 when the gateway location is used.
During installation, do not set the default env var value if none is provided.

👍

We currently don't have a good way to identify how the collector was started.

Though not necessarily "good" we do know the config and config content but I'm not sure what deductions or end state we'd want to have. Having the desired config resolve to agent_config.yaml or gateway_config.yaml seems an adequate mechanism to determine what value to set the env var to. I would advise that we always set a value if not set directly for continuity and to avoid requiring a (arbitrary?) constant name for the env var to be usable.

@atoulme
Copy link
Contributor Author

atoulme commented Oct 2, 2023

I would advise that we always set a value if not set directly for continuity and to avoid requiring a (arbitrary?) constant name for the env var to be usable.

We can still exercise the env var and set it with an empty value so it's reserved and documented in the customer configuration.

@rmfitzpatrick rmfitzpatrick mentioned this pull request Oct 5, 2023
@atoulme atoulme closed this Oct 11, 2023
@atoulme atoulme deleted the localhost_listen branch November 8, 2023 21:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants