Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser in docker 1.13 breaks backwards compatibility for comments in middle of RUN statement #29005

Closed
yosifkit opened this issue Dec 1, 2016 · 20 comments · Fixed by #29064
Closed
Labels
area/builder platform/windows priority/P0 Urgent: Security, critical bugs, blocking issues. drop everything until this issue is addressed. version/1.12
Milestone

Comments

@yosifkit
Copy link
Contributor

yosifkit commented Dec 1, 2016

Description
Previously (Docker 1.6+) one could document long RUN lines with comments interspersed. #24725 broke this.

FROM alpine
RUN echo \
# comment
hi

Steps to reproduce the issue:

  1. build the above Dockerfile on 1.13-rc2

Describe the results you received:
Error response from daemon: Unknown instruction: HI

Describe the results you expected:
Docker build the above Dockerfile on 1.6 through 1.12 (thanks @tianon for testing all of them)
Successfully built xxx

Additional information you deem important (e.g. issue happens only occasionally):
Quick list of some official-images broken by this change:

Output of docker version:
(working version)

Client:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.7.3
 Git commit:   6b644ec
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.7.3
 Git commit:   6b644ec
 Built:        
 OS/Arch:      linux/amd64

Output of docker info:
(working version)

Containers: 6
 Running: 4
 Paused: 0
 Stopped: 2
Images: 5715
Server Version: 1.12.3
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: null overlay bridge host
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.8.8-gentoo
Operating System: Gentoo/Linux
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 30.87 GiB
Name: isengard
ID: UOCM:3F65:5FZC:6H5L:W3HY:34G4:A5XZ:SUOV:S2D4:XQTO:4KGA:6XSE
Docker Root Dir: /mnt/spare/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8
@tianon
Copy link
Member

tianon commented Dec 1, 2016

Just for reference, #28587 was another issue discussing this, but I think at least the comment half of this change is a really serious regression (and is going to be a pretty severe breaking change for existing Dockerfiles in the wild), since there is no good workaround for adding comments in the middle of a really long RUN line (whereas an intentional empty line can easily get a \ added to workaround that change).

@tianon
Copy link
Member

tianon commented Dec 1, 2016

The last-known version where comments were not supported inline being 1.5 is amusing given https://blog.docker.com/2015/10/docker-hub-deprecation-1-5/ (1.5 and earlier no longer supported by Docker Hub 😄)

@tianon
Copy link
Member

tianon commented Dec 1, 2016

Any way we could do some querying on the public automated builds to see at least in the public aspect what the impact of this breaking change is?

@thaJeztah thaJeztah added this to the 1.13.0 milestone Dec 1, 2016
@thaJeztah thaJeztah added the priority/P0 Urgent: Security, critical bugs, blocking issues. drop everything until this issue is addressed. label Dec 1, 2016
yongtang added a commit to yongtang/docker that referenced this issue Dec 1, 2016
This fix is to address moby#29005 where continuation after comments
are not allowed after moby#24725.

This fix made the change so that moby#29005 could be addressed.

An integration test has been added. Other related tests have
been udpated to conform to the behavior changes.

This fix fixes moby#29005, but please feel free for any additional
suggestions.

Signed-off-by: Yong Tang <[email protected]>
@davidfischer-ch
Copy link

davidfischer-ch commented Dec 1, 2016

It also affects me. So there is no deprecation policy? Great.

@vdemeester
Copy link
Member

@davidfischer-ch yes, it has been overlooked 😓 😞. We are gonna work on it for the 1.13.0 GA so it does not break people Dockerfile… (@yongtang already proposed a changed 👼)

@thaJeztah
Copy link
Member

There's a couple of issues at hand here;

I think the original design for RUN was that RUN is "agnostic" to its
content, because it doesn't know what shell is used to execute the RUN
statement in (i.e., everything after RUN should be treated as an opaque blob).

While this should probably be "true", the implementation does not match that;
before the content is sent to the shell, the Dockerfile parser / builder (in
no particular order here);

  1. Removes empty lines inside the RUN
  2. Skips "comment only" lines (including comments not starting at position 0)
  3. Strips line-continuation markers (\)

Because of these steps, a RUN statement does not have the same behavior
as a regular shell (see this gist)

Basically, comment-only lines are handled by Docker, comments preceeded by
other characters are passed on to the shell.

Note that skipping comments not starting at position 0 is basically a bug,
because comments in Dockerfiles should only be seen as comment if they start
at position 0.

PR #24725 was created to fix cases where,
because of a combination of 1. and 3., a trailing \ resulted in the
Dockerfile continuing to see furter instructions as part of the RUN statement;

FROM busybox
RUN echo "foo" \
#&& echo "bar";



# Do something else
RUN echo "something else"

Resulting in;

Step 1 : FROM busybox
 ---> 47bcc53f74dc
Step 2 : RUN echo "foo" RUN echo "something else"
 ---> Running in 991b76486899
foo RUN echo something else
 ---> d488a4c21b22

I think we should fix that case, but we can consider it a "legacy" burden
we have to carry. Which leaves us with some options;

A. Revert all changes

Include both empty and comment lines as part of the RUN statement. This
would be a valid Dockerfile;

FROM busybox
RUN echo "foo"; \


# This is a comment

    # So is this

    echo "and this is part of the same RUN"


RUN echo "this is the next RUN"

B. Skip comment-only lines, but break on empty lines

This is what's being implemented in #29006. Users can use a \ on an empty line to add whitespace, but an empty line
stops the RUN.

With this change, this would still be valid;

FROM busybox
RUN echo "foo"; \
\
# This is a comment
\
    # So is this
    \
    echo "and this is part of the same RUN"


RUN echo "this is the next RUN"

C. Skip comment-only lines, warn / deprecate continuing on empty lines

Like B., but instead of breaking, print a warning ("skipping empty line"?),
Or a deprecation warning ("WARN: skipping empty line. Empty lines inside a RUN
instruction is deprecated, and will be removed in Docker 1.16")

@duglin
Copy link
Contributor

duglin commented Dec 1, 2016

High-order question (IMO): If we were starting from scratch would we want it to act like the shell does? Meaning, stop when it hits a blank line AND when it hits a comment?

In a perfect world I'd say "yes". Which would then lead towards us deprecating the new behavior and then removing it in 3 releases.

However, there's a catch. Not only do we have a lot of Dockerfiles in the wild with this "odd" syntax (which by itself doesn't necessarily mean we should allow it to continue), BUT given there's no good way to prevent multiple layers from being created when you have multiple RUNs, which means people are kind of forced into using RUNs that span a lot of lines via continuation chars, there's no good way to allow people to be good citizens and add comments/docs to their long RUN cmds.

Now, if we had a "squash just certain layers" type of operation in Dockerfiles then people wouldn't need to create these huge RUN cmds. They could have multiple RUNs, with comments between them, and then squash them down afterwards. Normally, in the past, we tried to align Dockerfile processing with shell processing, but this point (to me) is why we may not be able to align with the shell in this case.

Net: I think we should add back in support for comments in multi-line cmds but don't deprecate it until (or 'if') we add support for selective squashing.

@yongtang
Copy link
Member

yongtang commented Dec 1, 2016

@thaJeztah @duglin
I want to bring in another angle to look at the issue, which is related to the definition of comment line, and escaping \ after comment #. See (#24725 (comment)) and (#24725 (comment))

From my understanding, the issue we are facing now is probably not about when to continue for RUN, but about when to stop for RUN.

For example, let's assume there is another command pair RUN-SHELL and RUN-END for Dockerfile:

RUN-SHELL
echo "foo"; \
\
# This is a comment
\

    # So is this
    \
    echo "and this is part of the same RUN"
RUN-END
RUN echo "this is the next RUN"

Then the above will be much cleaner and a lot easier to parse, because we could essentially cut any blob in between RUN-SHELL and RUN-END, and pass the whole blob to a shell. The commenting, escaping, etc. will be handled by shell, not by Dockerfile parser.

We could even add an additional flag to specify which shell, e.g.,

RUN-SHELL --shell bash
echo "foo"; \
\
# This is a comment
\

    # So is this
    \
    echo "and this is part of the same RUN"
RUN-END
RUN echo "this is the next RUN"

I think another alternative solution is to revert back the old behavior, while at the same time introduce a new command pair RUN-SHELL/RUN-END (or RUN-BEGIN/RUN-END, or RUNBLOCK/RUNEND, etc.)

At the same time, we encourage users to use RUN for one-liner and RUN BLOCK/END for multiple line shells.

I don't know if introducing another (pair) command for Dockerfile is the best solution. But to be honest, RUN with multiple lines and so many varieties of breaking/continuing/commenting, is really hard to maintain. And solving different issues user facing from different angles is really difficult as well.

@thaJeztah
Copy link
Member

@yongtang @duglin Yes, I think having "heredoc" (or similar) notation would be the solution for that #1554, something like RUN_BEGIN / RUN_END.

For 1.13 we should decide though if we go with A, B, or C (or another option if I missed one)

@duglin
Copy link
Contributor

duglin commented Dec 1, 2016

Yea, a heredoc is another option. For now I think the current PR is the right approach - we can then decide on the rest later.

@thaJeztah
Copy link
Member

Any way we could do some querying on the public automated builds to see at least in the public aspect what the impact of this breaking change is?

@tianon I know it was done at some point, i.e. every Dockerfile on GitHub was built/tested for a change that was made in the builder. I'm not sure how it was done though; @crosbymichael may know

@tianon
Copy link
Member

tianon commented Dec 1, 2016

My own vote would be C, given that blank lines are trivially fixed by adding a \, but I'd be happy with B as well. I think the discussion of adding better multi-line syntax is orthogonal at this point given how deeply invested we are into the current syntax still (but definitely interesting as a separate discussion point).

@kinghuang
Copy link

My preference is A for Docker 1.13.

B breaks all Dockerfiles (30+) in my team, and I suspect may impact many others. I believe such a significant breaking change should be communicated three releases in advance, as stated in Breaking changes and incompatibilities.

As for C, while blank lines are trivially fixed by adding a \, that makes the RUN instruction visually cluttered, in my opinion. Blanks lines play a role in the readability of long RUN instructions. I think changing that warrants much more discussion (separately).

@tianon
Copy link
Member

tianon commented Dec 1, 2016

Oh sorry, I misspoke -- my preference is for C, but I'd be fine with A. I think B is a bit too aggressive on the breakage scale for 1.13 alone.

@thaJeztah
Copy link
Member

My preference goes to C, but with an extra note; let's see if we can find a better solution before 1.16 (such as the discussed heredoc/code-block).

We can extend the deprecation period if needed if no alternative is present yet.

@cpuguy83
Copy link
Member

cpuguy83 commented Dec 2, 2016

For 1.13 I'd go for A.
This is something that requires more testing and I don't think we can afford to throw together a fix last minute.
We're already (nearly) 3 release candidates in, I don't think it's time for making changes to the parser other than to bring us back to where we were for 1.12 unless this was something that was broken in 1.12 (which I don't think it was).

@vdemeester
Copy link
Member

So, let's go for A for 1.13 (see #29064) and work on C afterwars.

@NicoTexas
Copy link

Just ran into this new feature. I have to say, solution C for 1.13 seems more reasonable.

@yongtang
Copy link
Member

yongtang commented Dec 6, 2016

@NicoTexas Added a PR #29161 for solution C. The PR generates a warning for empty line continuation, for 1.14 (maybe).

@glensc
Copy link
Contributor

glensc commented Dec 10, 2016

@duglin #29005 (comment) the other reason for RUN spanning multiple lines is because it would be annoying to prefix every shell command with RUN keyword. i like the heredoc idea for that #29005 (comment)

adamliter added a commit to adamliter/psiturk-docker that referenced this issue Oct 15, 2017
OctavioBR added a commit to NeowayLabs/docker-ogre that referenced this issue Oct 20, 2017
According moby issue, it will be forbidden to inset comment in between multiline commands
moby/moby#29005
zervnet added a commit to nils-wisiol/django-rest-boilerplate that referenced this issue Nov 1, 2017
nils-wisiol pushed a commit to nils-wisiol/django-rest-boilerplate that referenced this issue Dec 13, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/builder platform/windows priority/P0 Urgent: Security, critical bugs, blocking issues. drop everything until this issue is addressed. version/1.12
Projects
None yet