Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a config endpoint on a new api server #21025

Merged
merged 33 commits into from
Dec 5, 2023
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
42b8aa8
refactor to prepare new server
pgimalac Nov 20, 2023
02344fd
feat: implement config api server
pgimalac Nov 21, 2023
1e315df
refactor: move each server to its own file
pgimalac Nov 21, 2023
66fe2eb
merge with main
pgimalac Nov 21, 2023
bcf2bcf
fix revive duplicate import lint
pgimalac Nov 22, 2023
17b1c34
refactor: rename variables to ipc and cmd
pgimalac Nov 24, 2023
cbfadde
rename api server files to cmd and ipc
pgimalac Nov 24, 2023
a1125dd
limit ipc config endpoint to api_key for now
pgimalac Nov 24, 2023
7f69bc9
feat: define proper config payload type configPayload
pgimalac Nov 24, 2023
59cd5c5
merge with main
pgimalac Nov 27, 2023
e457dc9
renaming configs to agent_ipc_host/port
pgimalac Nov 28, 2023
44113ca
directly return error as string, or marshalled value
pgimalac Nov 28, 2023
69606a1
disable new endpoint by default
pgimalac Nov 28, 2023
4b62f52
rename extraHosts additionalHostIdentities
pgimalac Nov 29, 2023
99993c7
add some docs and logs, clean up stop server
pgimalac Nov 29, 2023
d1c14cd
refactor: improve naming
pgimalac Nov 29, 2023
c21c66d
move config endpoint to separate function
pgimalac Nov 29, 2023
79ba640
add logs to config endpoint on failure and success
pgimalac Nov 29, 2023
23f8c5b
only add the host without port to the certificate
pgimalac Nov 29, 2023
3a04b99
refactor: move config endpoint to separate directory, add tests
pgimalac Dec 1, 2023
6d1e442
test logic around enabling ipc server
pgimalac Dec 1, 2023
9ecb46c
add expvar metrics in config endpoint
pgimalac Dec 1, 2023
fa3bfa2
rename endpoint server and path
pgimalac Dec 1, 2023
ed704ae
merge with main
pgimalac Dec 1, 2023
6071290
refactor: rewrite test cases as structs
pgimalac Dec 2, 2023
27b9332
early return getIPCServerAddressPort
pgimalac Dec 2, 2023
b7ea9f8
Update pkg/api/security/security.go
pgimalac Dec 4, 2023
a6abb74
clean up variable name and types
pgimalac Dec 4, 2023
70fd44c
simplify test naming
pgimalac Dec 4, 2023
36685a1
refactor: remove global state, clean test naming
pgimalac Dec 4, 2023
72d395d
chore: log in case of answer write error
pgimalac Dec 4, 2023
d6763cf
pass config as parameter to test helper
pgimalac Dec 4, 2023
fde8155
feat: test expvars
pgimalac Dec 4, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 1 addition & 5 deletions cmd/agent/api/listener.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,10 +22,6 @@ func getIPCAddressPort() (string, error) {
}

// getListener returns a listening connection
func getListener() (net.Listener, error) {
address, err := getIPCAddressPort()
if err != nil {
return nil, err
}
func getListener(address string) (net.Listener, error) {
return net.Listen("tcp", address)
}
35 changes: 10 additions & 25 deletions cmd/agent/api/security.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,12 +17,6 @@ import (
"github.com/DataDog/datadog-agent/pkg/api/util"
)

var (
tlsKeyPair *tls.Certificate
tlsCertPool *x509.CertPool
tlsAddr string
pgimalac marked this conversation as resolved.
Show resolved Hide resolved
)

// validateToken - validates token for legacy API
func validateToken(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
Expand All @@ -47,13 +41,9 @@ func parseToken(token string) (interface{}, error) {
return struct{}{}, nil
}

func buildSelfSignedKeyPair() ([]byte, []byte) {

func buildSelfSignedKeyPair(extraHosts ...string) ([]byte, []byte) {
pgimalac marked this conversation as resolved.
Show resolved Hide resolved
hosts := []string{"127.0.0.1", "localhost", "::1"}
ipcAddr, err := getIPCAddressPort()
if err == nil {
hosts = append(hosts, ipcAddr)
}
hosts = append(hosts, extraHosts...)
_, rootCertPEM, rootKey, err := security.GenerateRootCert(hosts, 2048)
if err != nil {
return nil, nil
Expand All @@ -68,26 +58,21 @@ func buildSelfSignedKeyPair() ([]byte, []byte) {
return rootCertPEM, rootKeyPEM
}

func initializeTLS() error {
cert, key := buildSelfSignedKeyPair()
func initializeTLS(extraHosts ...string) (*tls.Certificate, *x509.CertPool, error) {
pgimalac marked this conversation as resolved.
Show resolved Hide resolved
cert, key := buildSelfSignedKeyPair(extraHosts...)
if cert == nil {
return errors.New("unable to generate certificate")
return nil, nil, errors.New("unable to generate certificate")
}
pair, err := tls.X509KeyPair(cert, key)
if err != nil {
return fmt.Errorf("unable to generate TLS key pair: %v", err)
return nil, nil, fmt.Errorf("unable to generate TLS key pair: %v", err)
}
tlsKeyPair = &pair
tlsCertPool = x509.NewCertPool()

tlsCertPool := x509.NewCertPool()
ok := tlsCertPool.AppendCertsFromPEM(cert)
if !ok {
return fmt.Errorf("unable to add new certificate to pool")
}

tlsAddr, err = getIPCAddressPort()
if err != nil {
return fmt.Errorf("unable to get IPC address and port: %v", err)
return nil, nil, fmt.Errorf("unable to add new certificate to pool")
}

return nil
return &pair, tlsCertPool, nil
pgimalac marked this conversation as resolved.
Show resolved Hide resolved
}
167 changes: 53 additions & 114 deletions cmd/agent/api/server.go
Original file line number Diff line number Diff line change
Expand Up @@ -11,28 +11,19 @@ sending commands and receiving infos.
package api

import (
"context"
"crypto/tls"
"fmt"
stdLog "log"
"net"
"net/http"
"time"
"strconv"

"github.com/cihub/seelog"
gorilla "github.com/gorilla/mux"
grpc_auth "github.com/grpc-ecosystem/go-grpc-middleware/auth"
"github.com/grpc-ecosystem/grpc-gateway/runtime"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials"

"github.com/DataDog/datadog-agent/cmd/agent/api/internal/agent"
"github.com/DataDog/datadog-agent/cmd/agent/api/internal/check"

"github.com/DataDog/datadog-agent/comp/aggregator/demultiplexer"
"github.com/DataDog/datadog-agent/comp/core/flare"
"github.com/DataDog/datadog-agent/comp/core/secrets"
"github.com/DataDog/datadog-agent/comp/core/workloadmeta"
workloadmetaServer "github.com/DataDog/datadog-agent/comp/core/workloadmeta/server"
"github.com/DataDog/datadog-agent/comp/dogstatsd/replay"
dogstatsdServer "github.com/DataDog/datadog-agent/comp/dogstatsd/server"
dogstatsddebug "github.com/DataDog/datadog-agent/comp/dogstatsd/serverDebug"
Expand All @@ -44,14 +35,22 @@ import (
"github.com/DataDog/datadog-agent/pkg/api/util"
"github.com/DataDog/datadog-agent/pkg/config"
remoteconfig "github.com/DataDog/datadog-agent/pkg/config/remote/service"
pb "github.com/DataDog/datadog-agent/pkg/proto/pbgo/core"
"github.com/DataDog/datadog-agent/pkg/tagger"
taggerserver "github.com/DataDog/datadog-agent/pkg/tagger/server"
grpcutil "github.com/DataDog/datadog-agent/pkg/util/grpc"
"github.com/DataDog/datadog-agent/pkg/util/log"
"github.com/DataDog/datadog-agent/pkg/util/optional"
)

var listener net.Listener
func startServer(listener net.Listener, srv *http.Server, name string) {
// Use a stack depth of 4 on top of the default one to get a relevant filename in the stdlib
logWriter, _ := config.NewLogWriter(5, seelog.ErrorLvl)

srv.ErrorLog = stdLog.New(logWriter, fmt.Sprintf("Error from the agent http %s: ", name), 0) // log errors to seelog
pgimalac marked this conversation as resolved.
Show resolved Hide resolved

tlsListener := tls.NewListener(listener, srv.TLSConfig)

go srv.Serve(tlsListener) //nolint:errcheck

log.Infof("%s started on %s", name, listener.Addr().String())
pgimalac marked this conversation as resolved.
Show resolved Hide resolved
}

// StartServer creates the router and starts the HTTP server
func StartServer(
pgimalac marked this conversation as resolved.
Show resolved Hide resolved
Expand All @@ -69,124 +68,64 @@ func StartServer(
invHost inventoryhost.Component,
secretResolver secrets.Component,
) error {
err := initializeTLS()
apiAddr, err := getIPCAddressPort()
if err != nil {
return fmt.Errorf("unable to initialize TLS: %v", err)
return fmt.Errorf("unable to get IPC address and port: %v", err)
}

// get the transport we're going to use under HTTP
listener, err = getListener()
if err != nil {
// we use the listener to handle commands for the Agent, there's
// no way we can recover from this error
return fmt.Errorf("Unable to create the api server: %v", err)
}
extraHosts := []string{apiAddr}

err = util.CreateAndSetAuthToken()
if err != nil {
return err
}
ipcHost := config.Datadog.GetString("agent_ipc_host")
apiConfigPort := config.Datadog.GetInt("agent_ipc_port")
apiConfigEnabled := apiConfigPort > 0
apiConfigHostPort := net.JoinHostPort(ipcHost, strconv.Itoa(apiConfigPort))
pgimalac marked this conversation as resolved.
Show resolved Hide resolved

// gRPC server
authInterceptor := grpcutil.AuthInterceptor(parseToken)
opts := []grpc.ServerOption{
grpc.Creds(credentials.NewClientTLSFromCert(tlsCertPool, tlsAddr)),
grpc.StreamInterceptor(grpc_auth.StreamServerInterceptor(authInterceptor)),
grpc.UnaryInterceptor(grpc_auth.UnaryServerInterceptor(authInterceptor)),
if apiConfigEnabled {
extraHosts = append(extraHosts, apiConfigHostPort)
pgimalac marked this conversation as resolved.
Show resolved Hide resolved
}

s := grpc.NewServer(opts...)
pb.RegisterAgentServer(s, &server{})
pb.RegisterAgentSecureServer(s, &serverSecure{
configService: configService,
taggerServer: taggerserver.NewServer(tagger.GetDefaultTagger()),
// TODO(components): decide if workloadmetaServer should be componentized itself
workloadmetaServer: workloadmetaServer.NewServer(wmeta),
dogstatsdServer: dogstatsdServer,
capture: capture,
})

dcreds := credentials.NewTLS(&tls.Config{
ServerName: tlsAddr,
RootCAs: tlsCertPool,
})
dopts := []grpc.DialOption{grpc.WithTransportCredentials(dcreds)}

// starting grpc gateway
ctx := context.Background()
gwmux := runtime.NewServeMux()
err = pb.RegisterAgentHandlerFromEndpoint(
ctx, gwmux, tlsAddr, dopts)
tlsKeyPair, tlsCertPool, err := initializeTLS(extraHosts...)
if err != nil {
return fmt.Errorf("error registering agent handler from endpoint %s: %v", tlsAddr, err)
return fmt.Errorf("unable to initialize TLS: %v", err)
}

err = pb.RegisterAgentSecureHandlerFromEndpoint(
ctx, gwmux, tlsAddr, dopts)
if err != nil {
return fmt.Errorf("error registering agent secure handler from endpoint %s: %v", tlsAddr, err)
tlsConfig := &tls.Config{
Certificates: []tls.Certificate{*tlsKeyPair},
NextProtos: []string{"h2"},
MinVersion: tls.VersionTLS12,
}

// Setup multiplexer
// create the REST HTTP router
agentMux := gorilla.NewRouter()
checkMux := gorilla.NewRouter()
// Validate token for every request
agentMux.Use(validateToken)
checkMux.Use(validateToken)

mux := http.NewServeMux()
mux.Handle(
"/agent/",
http.StripPrefix("/agent",
agent.SetupHandlers(
agentMux,
flare,
dogstatsdServer,
serverDebug,
wmeta,
logsAgent,
senderManager,
hostMetadata,
invAgent,
demux,
invHost,
secretResolver,
)))
mux.Handle("/check/", http.StripPrefix("/check", check.SetupHandlers(checkMux)))
mux.Handle("/", gwmux)

// Use a stack depth of 4 on top of the default one to get a relevant filename in the stdlib
logWriter, _ := config.NewLogWriter(5, seelog.ErrorLvl)

srv := grpcutil.NewMuxedGRPCServer(
tlsAddr,
&tls.Config{
Certificates: []tls.Certificate{*tlsKeyPair},
NextProtos: []string{"h2"},
MinVersion: tls.VersionTLS12,
},
s,
grpcutil.TimeoutHandlerFunc(mux, time.Duration(config.Datadog.GetInt64("server_timeout"))*time.Second),
)
if err := util.CreateAndSetAuthToken(); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any tests around the token validation on either the old server code or the new?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I know of, I'll open a separate PR to fix that (I would like to merge this one as it's blocking other PRs)

return err
}

srv.ErrorLog = stdLog.New(logWriter, "Error from the agent http API server: ", 0) // log errors to seelog
if err := startCMDServer(
apiAddr, tlsConfig, tlsCertPool,
configService, flare, dogstatsdServer,
capture, serverDebug, wmeta, logsAgent,
senderManager, hostMetadata, invAgent,
demux, invHost, secretResolver,
Comment on lines +109 to +113
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may not be a convention in Go, but I see having too many parameters/arguments in a function as a code smell.

Maybe a suggestion for the CMD server could be its struct that we initialize and later call startServer or something. I also understand it might not be the right suggestion for this particular use case. At least we can start a conversation 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely agree, IMO the API should at some point become a generic components others could register endpoints into, but this would be part of an eventual refactor of the API component

Copy link
Contributor

@Kaderinho Kaderinho Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, all these arguments will be removed in the near future as we'll embed them in the API Component directly (they will be injected to the API component through Fx)

See comment here:

// * StartServer args will be moved into the Component struct directly

); err != nil {
return err
}

tlsListener := tls.NewListener(listener, srv.TLSConfig)
if apiConfigEnabled {
if err := startIPCServer(apiConfigHostPort, tlsConfig); err != nil {
StopServer()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to call StopServer when startIPCServer fail? Ideally when startIPCServer returns an error, startIPCServer should release any resources allocated if any.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the fact that we are using global variables cmdListener and ipcConfigListener makes it more difficult to grok what is going on. Have you consider moving away from global variables and having server struct that holds cmdListener and ipcConfigListener information? That way, we could have functions like StopCMDServer and StopIPCServer , the same for startCMDServer and startIPCServer defined as part of the server struct

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it was previously using var listener net.Listener but it might be a good opportunity to refactor such pattern 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ogaca-dd I need to call StopServer if startIPCServer fails because the other server was started beforehand, and needs to be stopped. I added some comments to make that clearer.

@GustavoCaso This would be implemented in the comp PR, in this one I try to avoid changing things I don't need to

Copy link
Member

@GustavoCaso GustavoCaso Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comp PR

You mean the API component PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the componentization of the API will likely remove these global variables

return err
}
}

go srv.Serve(tlsListener) //nolint:errcheck
return nil
}

// StopServer closes the connection and the server
// stops listening to new commands.
func StopServer() {
pgimalac marked this conversation as resolved.
Show resolved Hide resolved
if listener != nil {
listener.Close()
if cmdListener != nil {
cmdListener.Close()
}
if ipcConfigListener != nil {
ipcConfigListener.Close()
}
}

// ServerAddress retruns the server address.
func ServerAddress() *net.TCPAddr {
return listener.Addr().(*net.TCPAddr)
}
Loading