Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

detect network conflict as name is not guaranteed to be unique #10612

Merged
merged 1 commit into from
May 31, 2023

Conversation

ndeloof
Copy link
Contributor

@ndeloof ndeloof commented May 24, 2023

What I did
this partially revisit #9585 to better detect misuse of network and name conflicts

  • report error to user when ambiguous name is used with multiple networks found
  • report error when network doesn't have the expected labels set
  • manage creation conflict
  • once target network has been identified, mutate the project to use ID as name, so that other API calls don't become ambiguous

also created dedicated func resolveExternalNetwork to keep code reasonably readable (while this makes diff harder to read)

@ndeloof ndeloof force-pushed the network_conflict branch 5 times, most recently from 7f7f0df to 470983e Compare May 24, 2023 12:54
@codecov
Copy link

codecov bot commented May 24, 2023

Codecov Report

Patch coverage: 53.57% and project coverage change: -0.45 ⚠️

Comparison is base (42cd961) 59.57% compared to head (f740b1a) 59.12%.

Additional details and impacted files
@@            Coverage Diff             @@
##               v2   #10612      +/-   ##
==========================================
- Coverage   59.57%   59.12%   -0.45%     
==========================================
  Files         107      107              
  Lines        9437     9466      +29     
==========================================
- Hits         5622     5597      -25     
- Misses       3239     3286      +47     
- Partials      576      583       +7     
Impacted Files Coverage Δ
pkg/utils/slices.go 80.00% <42.85%> (-20.00%) ⬇️
pkg/compose/create.go 58.73% <49.55%> (-2.37%) ⬇️
pkg/compose/convergence.go 68.50% <66.66%> (-0.80%) ⬇️
pkg/compose/down.go 63.49% <88.88%> (-1.22%) ⬇️
pkg/compose/filters.go 83.33% <100.00%> (+2.08%) ⬆️

... and 4 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@ndeloof ndeloof force-pushed the network_conflict branch from 470983e to 9141723 Compare May 24, 2023 15:32
@ndeloof ndeloof requested review from a team, nicksieger, StefanScherer, ulyssessouza, glours, milas and laurazard and removed request for a team May 24, 2023 15:32
return created, err
}
for k, s := range networkingConfig.EndpointsConfig {
s.Links = links
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using NetworkConnect with network name this wasn't required. Sounds like some engine weirdness, anyway this is consistent we define NetworkConfig the same way we later connect to network

Copy link
Contributor

@milas milas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nice improvement!

Sorry to leave so many comments, let me know if anything doesn't make sense. I'm also not really adamant about anything here, just trying to be extra cautious!

Comment on lines 1110 to 1142
// we could have set NetworkList with a projectFilter and networkFilter but not doing so allows to catch this
// scenario were a network with same name exists but doesn't have label, and use of `CheckDuplicate: true`
// prevents to create another one.
if len(networks) > 0 {
return fmt.Errorf("a network with name %q already exists but was not created by compose"+
"Set `external: true` to use a network you created", n.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently (i.e. ≤ v2.18.2), Compose will happily use any existing network that fits the name, even if it's not labeled.

I think perhaps we should make this a warning for now and change it into an error in a later release after some time.

Comment on lines 1166 to 1171
created, err := s.apiClient().NetworkCreate(ctx, n.Name, createOpts)
if err != nil {
if errdefs.IsConflict(err) {
// Maybe another execution of `docker compose up|run` created same network
// let's retry
return s.ensureNetwork(ctx, n)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems better to let the error propagate here and fail the command. Since there's nothing like a project lock (which causes problems as your comment points out 🙈), if state is changing underneath us, we should just fail IMO.

Alternatively, I think we should at least not have ensureNetwork call itself recursively -- 409 Conflict is unfortunately pretty generic (even in Moby land), so I'm sure there's some possibility of getting this into an infinite loop. I think returning the error and having something like ensureNetworkWithRetries(name, retries) that calls ensureNetwork(name) would be safer/easier to reason about

w.Event(progress.ErrorEvent(networkEventName))
return errors.Wrapf(err, "failed to create network %s", n.Name)
}
n.Name = created.ID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this all kind of a big workaround for awkwardness in the engine API in the first place, but I'm a bit nervous about this subtle mutation of the Project.

Would it make sense to have ID string yaml:"-"on the network object and assign to there? Of course, that'd require a "hack" incompose-go` 😬

Another thought would be to carry more internal state independent of the Project, where we could have something like map[string]string networkNameIDs.

(I'm also not very enthusiastic about either of those options and willing to accept the current overwriting of .Name is a reasonable compromise, but wanted to at least raise it for discussion.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the NetworkConnect we rely on expect an id parameter to be ["Network ID or name"](see https://docs.docker.com/engine/api/v1.42/#tag/Network/operation/NetworkConnect) - there's actually many places where engine accepts both, so the GetContainer and FindNetwork having idOrName parameter and managing ambiguous references.
I guess this design decision makes the CLI easier to implement, as there's no need to first resolve a name then invoke operation by ID, but this obviously leaks into compose model as Name - attribute should be NameOfID as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created compose-spec/compose-go#411 so we can store an ID without having to mutate Name, as this is indeed error-prone in the code.

Copy link
Contributor Author

@ndeloof ndeloof May 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discovered moby overrides NetworkConfig set on ContainerCreate and ALWAYS use network.Name as key, which prevent ContainerStart to succeed when name is ambiguous. So using ID as Name won't help 😓

pkg/compose/create.go Show resolved Hide resolved
})
}
return ops
}

func (s *composeService) removeNetwork(ctx context.Context, name string, w progress.Writer) error {
func (s *composeService) removeNetwork(ctx context.Context, networkName string, projectName string, name string, w progress.Writer) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (s *composeService) removeNetwork(ctx context.Context, networkName string, projectName string, name string, w progress.Writer) error {
func (s *composeService) removeNetwork(ctx context.Context, networkName string, projectName string, networkID string, w progress.Writer) error {

Filters: filters.NewArgs(
projectFilter(projectName),
networkFilter(networkName),
filters.Arg("name", name)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
filters.Arg("name", name)),
filters.Arg("name", networkName)),

(fwiw this is partially why I'm afraid of overwriting n.Name with the ID...makes it reaaallly easy to be passing the wrong things around)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh actually, shouldn't we just get rid of this filter entirely if we're using the com.compose.network label?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Engine Name vs ID reference makes this ambiguous, + compose also hase this network.name concept which is only the key used in yaml mapping - set here as networkName which is incorrect variable name - and is not the actual network name (<project>_key) 😰

I'll revisit this

Comment on lines 174 to 175
projectFilter(projectName),
networkFilter(networkName),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RE: My other comment about making "Compose-managed network exists but is NOT labeled by Compose" a warning

Currently, if you end up in that state, you can't get out easily by doing down && up since Compose will also no longer remove the network w/o labels:

❯ ~/dev/compose/bin/build/docker-compose down && ~/dev/compose/bin/build/docker-compose up --wait
[+] Building 0.0s (0/0)
a network with name "pr10612_custom" already exists but was not created by composeSet `external: true` to use a network you created

I wonder if we should do the filtering client side and emit warnings when we find resources like this? We could potentially include them in --remove-orphans (though it's really more like --remove-uninvited-house-guests in this case 😂)

Comment on lines 182 to 185
// NetworkList API doesn't return the exact name match, so we can retrieve more than one network with a request
networks = utils.Filter(networks, func(net moby.NetworkResource) bool {
return net.Name == name
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary - NetworkList with a name filter works reliably, it's NetworkInspect that does prefix matching

Copy link
Contributor

@glours glours left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added just few non blocking questions

pkg/compose/create.go Outdated Show resolved Hide resolved
pkg/compose/create.go Outdated Show resolved Hide resolved
@ndeloof ndeloof force-pushed the network_conflict branch from c60f564 to f740b1a Compare May 31, 2023 14:03
@jauderho
Copy link

jauderho commented Jun 23, 2023

I'm not 100% sure if this is the patch that broke things but things were working on version 2.18.1

So I have a compose.yml file


services:
  dnscrypt-proxy:
    container_name: dnscrypt-proxy
    #build: .
    image: ghcr.io/jauderho/dnscrypt-proxy:2.1.4@sha256:1baca84db9c59a0f0d783f84b9f52a7ef4d08b8fddf9a0357cf319453285e1df
    ports:
      - "5400:5400/tcp"
      - "5400:5400/udp"
    # Volumes store your data between container upgrades
    volumes:
      - ./config/:/config/
    #dns:
    #  - 127.0.0.1
    #  - 192.168.1.10
    #cap_add:
      #- NET_ADMIN
    #  - NET_BIND_SERVICE
    restart: unless-stopped
    labels:
      - "com.centurylinklabs.watchtower.enable=true"

networks:
  default:
    name: dns

The dns network was created using docker network create dns. Prior to 2.19.0, this worked with no issue but upon switching to 2.19.0. This is what I get now.

[+] Stopping 1/0
 ✔ Container dnscrypt-proxy  Stopped                                                                                                                                                         0.0s 
WARN[0000] a network with name dns exists but was not created by compose.
Set `external: true` to use an existing network 
network dns was found but has incorrect label com.docker.compose.network set to ""

This corresponds to line# 1110 of pkg/compose/create.go. Not quite sure how to fix this.

@ndeloof
Copy link
Contributor Author

ndeloof commented Jun 23, 2023

isn't warning explicit enough?
To have compose use a network you created on your own, you should use external: true

networks:
  default:
    name: dns
    external: true

@jauderho
Copy link

Thanks. This fixed it for me.

@HannesJo0139
Copy link

isn't warning explicit enough? To have compose use a network you created on your own, you should use external: true

networks:
  default:
    name: dns
    external: true

To be honest, it's not. The message tells me the label was not expected. I don't know what label it was expecting nor why it was. I also don't know why an unexpected label would be caused by a missing external: true.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants