Access points support in EFS #2337

Closed
andrei-xdlab opened this issue Dec 28, 2020 · 11 comments

Comments

@andrei-xdlab

Hello,

I would like to replace slow EBS storage with an EFS file system for a pcluster cluster with hundreds of nodes. Do you support EFS access points in 2.10.1? I have an existing EFS file system and would like to create multiple access points (/home, /shared, /app, etc.) and mount them on the master and compute nodes. Do you plan to add native support for EFS POSIX users/permissions?

Thanks

@demartinofra
Contributor

Hi Andrei,

Unfortunately, at the moment ParallelCluster does not offer native support for EFS Access Points. I'm going to mark this as a feature request so that we take it into account when extending EFS functionality in ParallelCluster.
Until then, you could use a custom post-install script to mount an existing EFS file system that has access points configured.

Francesco
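
The post-install workaround suggested above could look roughly like the following sketch. It assumes amazon-efs-utils is already installed on the AMI, and it reuses the fstab entry format quoted later in this thread. The function names, the optional fstab path argument, and the `fs-xxxx` / `fsap-xxxx` IDs are hypothetical placeholders, not part of ParallelCluster.

```shell
#!/bin/bash
# Hypothetical helpers for a post-install script; IDs and paths are placeholders.

# Print an /etc/fstab entry that mounts one EFS access point through the
# amazon-efs-utils mount helper (file system type "efs").
efs_fstab_line() {
  local fs_id="$1" ap_id="$2" mount_dir="$3"
  printf '%s %s efs _netdev,tls,accesspoint=%s 0 0\n' "$fs_id" "$mount_dir" "$ap_id"
}

# Create the mount point and append the entry only if no "efs" entry for
# that mount point exists yet, so the script can safely run more than once.
add_efs_mount() {
  local fs_id="$1" ap_id="$2" mount_dir="$3" fstab="${4:-/etc/fstab}"
  mkdir -p "$mount_dir"
  if ! awk -v d="$mount_dir" '$2 == d && $3 == "efs" { found = 1 } END { exit !found }' \
      "$fstab" 2>/dev/null; then
    efs_fstab_line "$fs_id" "$ap_id" "$mount_dir" >> "$fstab"
  fi
}
```

A post-install script could then call `add_efs_mount fs-xxxx fsap-xxxx /app` followed by `mount /app` on every node. The example uses /app rather than /home or /shared, since the rest of this thread shows that those two directories are managed by ParallelCluster 2.x itself.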

@andrei-xdlab
Author

andrei-xdlab commented Dec 28, 2020

Hello Francesco,

Thank you for opening the feature request. Two capabilities would be useful:

  • ability to include existing access point IDs (fsap-xxx) in the template to mount multiple access points.
  • ability to create EFS filesystem, access points and directory permissions/ownership

[efs efscustom]
efs_fs_id = fs-xxxx
fsap_id1 = fsap-xxxx
shared_dir = /home
fsap_id2 = fsap-xxxx
shared_dir = /app

@andrei-xdlab
Author

andrei-xdlab commented Dec 29, 2020

I created a custom AMI (2.10.1) that includes the /home and /shared EFS access points in /etc/fstab. My pcluster create is failing because of an nfs_export failure (see cfn-init.log below).

/etc/fstab
fs-xxxx /home efs _netdev,tls,accesspoint=fsap-0xxxx 0 0
fs-xxxx /shared efs _netdev,tls,accesspoint=fsap-0xxx 0 0

How do I disable the creation of the EBS volume and the NFS exports (home and shared) on the master node? I will have EFS access points mounted on the master and all compute nodes.

On the master instance I see that /home is mounted from the access point, but /shared is still created from the EBS volume:

/dev/xvdb 20G 45M 19G 1% /shared
127.0.0.1:/ 8.0E 0 8.0E 0% /home

/var/log/cfn-init.log

Error executing action `create` on resource 'nfs_export[/home]'
================================================================================

Mixlib::ShellOut::ShellCommandFailed
------------------------------------
execute[exportfs] (/etc/chef/local-mode-cache/cache/cookbooks/nfs/providers/export.rb line 43) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of exportfs -ar ----
STDOUT: 
STDERR: exportfs: /etc/exports [1]: Neither 'subtree_check' or 'no_subtree_check' specified for export "172.31.0.0/16:/shared".
  Assuming default behaviour ('no_subtree_check').
  NOTE: this default has changed since nfs-utils version 1.0.x

exportfs: /etc/exports [2]: Neither 'subtree_check' or 'no_subtree_check' specified for export "172.31.0.0/16:/home".
  Assuming default behaviour ('no_subtree_check').
  NOTE: this default has changed since nfs-utils version 1.0.x

exportfs: /home requires fsid= for NFS export
---- End output of exportfs -ar ----
Ran exportfs -ar returned 1

Cookbook Trace:
---------------
/etc/chef/local-mode-cache/cache/cookbooks/nfs/providers/export.rb:73:in `block in class_from_file'

Resource Declaration:
---------------------
# In /etc/chef/local-mode-cache/cache/cookbooks/aws-parallelcluster/recipes/head_node_base_config.rb

130: nfs_export "/home" do
131:   network node['cfncluster']['ec2-metadata']['vpc-ipv4-cidr-blocks']
132:   writeable true
133:   options ['no_root_squash']
134: end
135: 
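
One plausible reading of the log above: `df` shows /home mounted from `127.0.0.1:/`, which is how an EFS mount with `tls` appears (an NFS client mount), and exportfs generally asks for an explicit `fsid=` when a directory is not backed by a local device or a file system with a UUID. The sketch below is a hypothetical diagnostic, not part of ParallelCluster, that reports which directories are NFS-backed before the cookbook tries to export them. It assumes `findmnt` from util-linux is available.

```shell
# Print the file system type backing a path (findmnt ships with util-linux).
fstype_of() {
  findmnt -n -o FSTYPE --target "$1"
}

# Succeed when the type is an NFS client mount, which is what an EFS
# mount looks like to the kernel.
is_nfs_type() {
  case "$1" in
    nfs|nfs4) return 0 ;;
    *) return 1 ;;
  esac
}

# Report which of the given directories are NFS-backed.
report_nfs_backed() {
  local dir
  for dir in "$@"; do
    if is_nfs_type "$(fstype_of "$dir")"; then
      echo "$dir is NFS-backed; exportfs would likely need an explicit fsid= to re-export it"
    fi
  done
}
```

Running `report_nfs_backed /home /shared` on the master node would be expected to flag /home in the situation described above, while /shared (still on /dev/xvdb) would not be reported.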

@demartinofra
Contributor

demartinofra commented Jun 9, 2021

Sorry for the very late reply.

At the moment it is not possible to disable the shared EBS volume or to mount an EFS file system on top of the /home directory, because this would break ParallelCluster. Such options will be available as part of the next major version (3.0.0) of the product.

If you want to mount EFS on top of /home, then after mounting the file system you will have to copy over all the files that were already present in /home, by doing something like:

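# The bind mount MUST be created after EFS is mounted on /home; the order is important!
# It exposes the original /home contents that are now hidden under the EFS mount.
mkdir -p /tmp/mount
mount -o bind / /tmp/mount
# Use /bin/cp and not cp, because cp is an alias for "cp -i"
/bin/cp -rpfT /tmp/mount/home /home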

@elgalu

elgalu commented Jun 9, 2021

Hi, will 3.0.0 have the option to use LustreFS instead of EFS to mount the home?

@demartinofra
Contributor

It is very likely that it will.

@OleguerCanal

Hi, not sure if this is the right place to ask, but I have the following situation:

I started a cluster using pcluster and I have my data on an EFS volume, which I have mounted on the EBS volume. However, when I submit jobs with sbatch, the scripts don't seem to be able to access the EFS volume. Is there something else I need to do, @demartinofra?

Thank you a lot

@OleguerCanal

I ended up starting a new pcluster cluster with the EFS file system already mounted via the config file.

@francisreyes-tfs

I would be interested in support for EFS access points natively.

@hanwen-pcluste
Contributor

I have created an internal feature request, and we will consider this soon.

@gmarciani
Contributor

This feature has been included in ParallelCluster 3.11.0.
See documentation: https://docs.aws.amazon.com/parallelcluster/latest/ug/SharedStorage-v3.html#yaml-SharedStorage-EfsSettings-AccessPointId
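
For readers landing here, the following is a minimal sketch of how such an entry might be generated for a ParallelCluster 3.11.0+ cluster configuration. The `AccessPointId` key comes from the documentation link above; the other keys follow the `SharedStorage` / `EfsSettings` section of the same page. The function name, the `app` storage name, the /app mount directory, and the `fs-xxxx` / `fsap-xxxx` IDs are hypothetical placeholders. `EncryptionInTransit: true` is an assumption, included because access point mounts in this thread use `tls`; check the linked documentation for the exact requirements.

```shell
# Print one SharedStorage list entry for an EFS access point.
# Arguments: storage name, mount directory, file system ID, access point ID.
efs_shared_storage_entry() {
  local name="$1" mount_dir="$2" fs_id="$3" ap_id="$4"
  cat <<EOF
  - Name: ${name}
    MountDir: ${mount_dir}
    StorageType: Efs
    EfsSettings:
      FileSystemId: ${fs_id}
      AccessPointId: ${ap_id}
      EncryptionInTransit: true
EOF
}

# Emit a SharedStorage section with a single placeholder entry.
echo "SharedStorage:"
efs_shared_storage_entry app /app fs-xxxx fsap-xxxx
```

The printed YAML is meant to be pasted into (or merged with) the cluster configuration file passed to `pcluster create-cluster`.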
