[dbnode] Call syscall.Setrlimit to set num files open hard limit with setcap for DB docker image #1666

Merged: 6 commits, May 28, 2019
Showing changes from 1 commit

13 changes: 13 additions & 0 deletions docker/m3dbnode/Dockerfile
@@ -26,5 +26,18 @@ COPY --from=builder /go/src/github.com/m3db/m3/bin/m3dbnode /bin/
COPY --from=builder /go/src/github.com/m3db/m3/src/dbnode/config/m3dbnode-local-etcd.yml /etc/m3dbnode/m3dbnode.yml
COPY --from=builder /go/src/github.com/m3db/m3/scripts/m3dbnode_bootstrapped.sh /bin/

# Use setcap and run as specific user
RUN apk add libcap && \

Contributor:
Had not seen this before. Did some reading. Kind of bizarre that the capabilities get set on the file/binary level

Contributor:
What does the +ep do? Could you also add a comment on this line explaining it?

Collaborator Author:
Not opposed to adding a comment. The +e is for "effective" and the +p is for "permitted".
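
To confirm from inside the container that the file capabilities discussed above actually end up in the process's effective set, one option is to read the `CapEff` bitmask from `/proc/self/status`. The following is a minimal sketch, not code from this PR: `parseCapEff` and `hasCap` are hypothetical helper names, and the bit numbers (14 for CAP_IPC_LOCK, 24 for CAP_SYS_RESOURCE) are taken from `linux/capability.h`. It assumes a Linux host.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Capability bit numbers as defined in linux/capability.h.
const (
	capIPCLock     = 14 // CAP_IPC_LOCK
	capSysResource = 24 // CAP_SYS_RESOURCE
)

// parseCapEff extracts the effective capability bitmask from the text of
// /proc/<pid>/status. It is a hypothetical helper, not part of m3.
func parseCapEff(status string) (uint64, error) {
	scanner := bufio.NewScanner(strings.NewReader(status))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) == 2 && fields[0] == "CapEff:" {
			// The mask is printed as hexadecimal without a 0x prefix.
			return strconv.ParseUint(fields[1], 16, 64)
		}
	}
	if err := scanner.Err(); err != nil {
		return 0, err
	}
	return 0, fmt.Errorf("CapEff line not found")
}

// hasCap reports whether capability bit n is set in mask.
func hasCap(mask uint64, n uint) bool {
	return mask&(1<<n) != 0
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read /proc/self/status:", err)
		return
	}
	mask, err := parseCapEff(string(data))
	if err != nil {
		fmt.Println("cannot parse capabilities:", err)
		return
	}
	fmt.Println("CAP_IPC_LOCK effective:", hasCap(mask, capIPCLock))
	fmt.Println("CAP_SYS_RESOURCE effective:", hasCap(mask, capSysResource))
}
```

Run as the unprivileged image user, both lines should report `true` only if the binary carries both capabilities with the effective and permitted flags.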

    mkdir -p /home/m3dbnode-user && \
    addgroup -S m3dbnode-group && \
    adduser -u 1000 -S -h /home/m3dbnode-user -G m3dbnode-group m3dbnode-user && \
    chown m3dbnode-user:m3dbnode-group /bin/m3dbnode && \
    # +e = effective, +p = permitted. Set both capabilities in one call: a second setcap call would replace the first.
    setcap cap_ipc_lock,cap_sys_resource=+ep /bin/m3dbnode && \
    mkdir -p /var/lib && \
    chown -R m3dbnode-user:m3dbnode-group /var/lib

USER m3dbnode-user

ENTRYPOINT [ "/bin/m3dbnode" ]
CMD [ "-f", "/etc/m3dbnode/m3dbnode.yml" ]
79 changes: 78 additions & 1 deletion src/dbnode/server/limits.go
@@ -21,10 +21,15 @@
package server

import (
	"bufio"
	"fmt"
	"os/exec"
	"strconv"
	"strings"
	"syscall"

	xerror "github.com/m3db/m3/src/x/errors"
	xos "github.com/m3db/m3/src/x/os"
)

const (
@@ -78,3 +83,75 @@ func validateProcessLimits() error {

	return multiErr.FinalError()
}

func raiseRlimitToNROpen() error {

Collaborator Author:
Thought: this maybe should be in a _linux.go file.

	cmd := exec.Command("sysctl", "-a")

Collaborator:
Should just be able to check /proc/sys: https://www.kernel.org/doc/Documentation/sysctl/fs.txt

In kube:

/ # hostname
m3db-cluster-rep0-0
/ # cat /proc/sys/fs/nr_open
3000000

Collaborator Author:
Ah cool, yup I'll change it to this instead.
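
The `/proc/sys` approach suggested above could look roughly like the following. This is a sketch of the idea rather than the code that was eventually merged: `readNROpen` and `parseNROpen` are hypothetical names, and the only assumption about the kernel interface is that `/proc/sys/fs/nr_open` holds a single decimal number, as in the `cat` output above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// nrOpenPath is the procfs file that exposes the fs.nr_open sysctl.
const nrOpenPath = "/proc/sys/fs/nr_open"

// parseNROpen converts the file contents (a decimal number followed by a
// newline) into a uint64. Hypothetical helper, not part of m3.
func parseNROpen(contents string) (uint64, error) {
	return strconv.ParseUint(strings.TrimSpace(contents), 10, 64)
}

// readNROpen reads and parses the nr_open value from the given path.
// Hypothetical helper, not part of m3.
func readNROpen(path string) (uint64, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, fmt.Errorf("unable to read %s: %v", path, err)
	}
	n, err := parseNROpen(string(data))
	if err != nil {
		return 0, fmt.Errorf("unable to parse %s: %v", path, err)
	}
	return n, nil
}

func main() {
	n, err := readNROpen(nrOpenPath)
	if err != nil {
		// Expected on systems without procfs, for example non-Linux hosts.
		fmt.Println(err)
		return
	}
	fmt.Println("fs.nr_open =", n)
}
```

Reading the file directly avoids spawning `sysctl -a` and scanning its whole output, and it removes the dependency on the `sysctl` binary being present in the image.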

	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return fmt.Errorf(
			"unable to raise nofile limits: sysctl_stdout_err=%v", err)
	}

	defer stdout.Close()

	if err := cmd.Start(); err != nil {
		return fmt.Errorf(
			"unable to raise nofile limits: sysctl_start_err=%v", err)
	}

	var (
		scanner = bufio.NewScanner(stdout)
		limit   uint64
	)
	for scanner.Scan() {
		line := scanner.Text()
		if !strings.Contains(line, "nr_open") {
			continue
		}
		equalsIdx := strings.LastIndex(line, "=")
		if equalsIdx < 0 {
			// No error value exists at this point, so report the unparseable line.
			return fmt.Errorf(
				"unable to raise nofile limits: sysctl_parse_stdout_err=no '=' in %q", line)
		}
		value := strings.TrimSpace(line[equalsIdx+1:])
		n, err := strconv.Atoi(value)
		if err != nil {
			return fmt.Errorf(
				"unable to raise nofile limits: sysctl_eval_stdout_err=%v", err)
		}

		limit = uint64(n)
		break
	}

	if err := scanner.Err(); err != nil {
		return fmt.Errorf(
			"unable to raise nofile limits: sysctl_read_stdout_err=%v", err)
	}

	if err := cmd.Wait(); err != nil {
		return fmt.Errorf(
			"unable to raise nofile limits: sysctl_exec_err=%v", err)
	}

	var limits syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &limits); err != nil {
		return fmt.Errorf(
			"unable to raise nofile limits: rlimit_get_err=%v", err)
	}

	if limits.Max >= limit && limits.Cur >= limit {
		// Limit already set correctly
		return nil
	}

	limits.Max = limit
	limits.Cur = limit

	if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &limits); err != nil {
		return fmt.Errorf(
			"unable to raise nofile limits: rlimit_set_err=%v", err)
	}

	return nil
}
6 changes: 6 additions & 0 deletions src/dbnode/server/server.go
@@ -150,6 +150,12 @@ func Run(runOpts RunOptions) {
	}
	defer logger.Sync()

	// Raise the soft and hard open file (fd) limits to the kernel's fs.nr_open value
	if err := raiseRlimitToNROpen(); err != nil {
		logger.Warn("unable to raise rlimit", zap.Error(err))
	}

	// Parse file and directory modes
	newFileMode, err := cfg.Filesystem.ParseNewFileMode()
	if err != nil {
		logger.Fatal("could not parse new file mode", zap.Error(err))