recheck from db if service is reported as down. #45
base: stable/xena-m3
Conversation
Only shrink share is expected to decrease max_files. This is flawed in multiple ways:
- setting max_files_multiplier to < 1 may not work as intended in this case
- NetApp does some rounding, so not every max inode number can be set

Change-Id: I94215b212ceccdba151e64cb38db9f26f7fbc1d2
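A minimal sketch of the guard described above, not the actual driver code: max_files is only lowered on an explicit shrink, and small differences are ignored because NetApp rounds the stored value. The option and helper names (max_files_multiplier, get_max_files, set_max_files) are hypothetical.

```python
def update_max_files(client, volume_name, new_size_gb, multiplier, shrinking=False):
    """Scale the volume's max_files with its size, but never lower it
    unless the caller is explicitly shrinking the share."""
    desired = int(new_size_gb * 1024 * multiplier)   # rough inode budget
    current = client.get_max_files(volume_name)

    # NetApp rounds the value it actually stores, so treat small
    # differences as "already applied" instead of re-setting them.
    if abs(desired - current) < 1000:
        return current

    if desired < current and not shrinking:
        # Only a shrink operation is expected to decrease max_files.
        return current

    client.set_max_files(volume_name, desired)
    return desired
```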
Change-Id: I80b030c39b5a328ad212674880ffcd4b4725aff6
we look at the host (NetApp cluster) instead of pool (NetApp node) Change-Id: I2d4b51aa78e9aa7fda800b99a6d9a6e6a42c55f2
It can happen that the share has been deleted in the meantime. Change-Id: Id01618a490f6bf4e55304471d4dc3388582e8e2b
The errors we have seen are temporary. Once they were in error state, they got stuck there, whereas a simple re-apply would have helped. So we go this way: re-apply and log the event. Change-Id: Id0a3d50bf0f82cb6d079988514ecff282c90de41
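A hedged sketch of the re-apply-and-log idea above; apply_fn, resource_id and the attempt count are illustrative, not names from this patch.

```python
import logging

LOG = logging.getLogger(__name__)

def apply_with_retry(apply_fn, resource_id, attempts=3):
    """Re-apply an operation that tends to fail transiently instead of
    parking the resource in an error state."""
    last_exc = None
    for attempt in range(1, attempts + 1):
        try:
            return apply_fn(resource_id)
        except Exception as exc:  # the errors observed were temporary
            last_exc = exc
            LOG.warning("Apply failed for %s (attempt %d/%d): %s",
                        resource_id, attempt, attempts, exc)
    # Surface the last error only after all re-apply attempts failed.
    raise last_exc
```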
was added back then in 4b07b64 for easier error handling, but in the meantime we catch reexport aka ensure errors more centrally Change-Id: Id2a4547400bd630168268250e9ef8ee05cd93ae2
Follows 01b9d7c. We need to use the backend cluster client to check for the existence of a vserver in a different backend. Change-Id: I8eb35e714a9a5c02f40b50fc226b27e05ed9c2f7
Applies to access rules which are older than 7 days, i.e. in the first week it will be re-tried on every ensure run. Log known errors at a lower level. Change-Id: I359bf99e092c35f0b2785a74ec7a9a5afbd181d5
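A hedged sketch of the age gating and lower-level logging described above; the field and marker names (rule_updated_at, KNOWN_ERRORS) are assumptions, not the driver's API, and timestamps are assumed to be timezone-aware.

```python
import logging
from datetime import datetime, timedelta, timezone

LOG = logging.getLogger(__name__)
RETRY_WINDOW = timedelta(days=7)
KNOWN_ERRORS = ("object-missing", "duplicate-entry")  # illustrative markers

def should_retry(rule_updated_at):
    """Retry on every ensure run only while the rule is younger than 7 days."""
    age = datetime.now(timezone.utc) - rule_updated_at
    return age < RETRY_WINDOW

def log_apply_error(rule_id, error_text):
    """Known, expected errors are logged at a lower level than surprises."""
    if any(marker in error_text for marker in KNOWN_ERRORS):
        LOG.debug("Known error while applying rule %s: %s", rule_id, error_text)
    else:
        LOG.error("Error while applying rule %s: %s", rule_id, error_text)
```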
according to https://urllib3.readthedocs.io/en/latest/reference/urllib3.util.html#module-urllib3.util.retry to fix "Temporary failure in name resolution". Retry is visible in log like:
```
WARNING urllib3.connectionpool [req-d92bd8d6-f05f-404c-8ded-087d29c9bf9f] Retrying (Retry(total=4, connect=4, read=2, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fcde1cfb3a0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /servlets/netapp.servlets.admin.XMLrequest_filer
WARNING urllib3.connectionpool [req-d92bd8d6-f05f-404c-8ded-087d29c9bf9f] Retrying (Retry(total=3, connect=3, read=2, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fcddbf223a0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /servlets/netapp.servlets.admin.XMLrequest_filer
WARNING urllib3.connectionpool [req-d92bd8d6-f05f-404c-8ded-087d29c9bf9f] Retrying (Retry(total=2, connect=2, read=2, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fcde0443340>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /servlets/netapp.servlets.admin.XMLrequest_filer
WARNING urllib3.connectionpool [req-d92bd8d6-f05f-404c-8ded-087d29c9bf9f] Retrying (Retry(total=1, connect=1, read=2, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fcde0144a00>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /servlets/netapp.servlets.admin.XMLrequest_filer
```
Change-Id: Ic9ff8208f10df9dbed09717d6b218f6293d2338a
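For reference, a hedged sketch of the kind of urllib3 retry configuration the commit message points to. The values and the mounting on a requests session are illustrative, not necessarily what the change itself does.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

retry = Retry(
    total=5,           # overall cap on retries
    connect=5,         # retries for connection errors, incl. DNS failures
    read=2,            # retries for requests that got no answer back
    backoff_factor=1,  # exponential backoff between attempts
)

session = requests.Session()
# All HTTPS requests made through this session now retry transient
# connection failures such as "Temporary failure in name resolution".
session.mount("https://", HTTPAdapter(max_retries=retry))
```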
Change-Id: Ibc21b6c72d76a3a804f67e66e7604b3d0be4373f Closes-Bug: #1971710
The filter_properties operates with pools. A set intersection would limit affinity to a certain pool, but we want to allow a set of pools within the same backend cluster. E.g. 'same_host': ['HostA@BackendA#Pool1', 'HostA@BackendA#Pool2'] is fine in filter_properties. Change-Id: If96192f25f03e517a78fefa65db91e02d0ba20e9
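A minimal standalone sketch of the comparison described above: affinity is checked at the backend level (host@backend) rather than the pool level. The helper below is illustrative; Manila ships its own host-extraction utility.

```python
def backend_of(pool_spec: str) -> str:
    """Reduce 'Host@Backend#Pool' to 'Host@Backend'."""
    return pool_spec.split('#', 1)[0]

def same_backend(candidate_pool: str, same_host_list: list[str]) -> bool:
    """A candidate pool satisfies 'same_host' if it lives on the same
    backend cluster as any of the listed pools."""
    return backend_of(candidate_pool) in {backend_of(p) for p in same_host_list}

# e.g. 'HostA@BackendA#Pool3' matches
# ['HostA@BackendA#Pool1', 'HostA@BackendA#Pool2'], because the backend is the same.
```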
is required to get it into volume comment Change-Id: I657c19a74eb561f441dc6749210bfa906687a002
…ities configure dns before everything else for security services, this is a prereq configure certs and use signed sessions for active directory fail unit test if certs would expire in less than 60 days Change-Id: Id50894f9dda06741d05949e41817ba340f17dd2c
When trying to compare two values that are non-numeric using the driver filter, the filter function will give an error. This is not desirable as it might be interesting to support comparatives with non-numeric values provided by the filter objects (share, host, etc). For example, the following formula failed before the fix: filter_function = '(share.project_id == "bb212f09317a4f4a8952ef3f729c2551")' Copied from cinder https://opendev.org/openstack/cinder/commit/87a7e80a2cbc4c8abcf4394242a02fcc5140e44b Change-Id: Icbfabb3bc0f608ebdd0784337db0921cc7763c53
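A hedged sketch of the behaviour the copied fix enables: when operands of a comparison are not numbers, compare them as values of a common type instead of raising. This mirrors the idea of the cinder change, not its exact code.

```python
def _to_comparable(value):
    """Use a float when possible, otherwise fall back to a string."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return str(value)

def filter_equals(left, right):
    left, right = _to_comparable(left), _to_comparable(right)
    if type(left) is not type(right):
        # Mixed types: compare as strings rather than erroring out.
        left, right = str(left), str(right)
    return left == right

# With this, a formula like
#   share.project_id == "bb212f09317a4f4a8952ef3f729c2551"
# evaluates in the driver filter instead of failing.
```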
Fall back to setting up CIFS without LDAPS and session signing after 3 failed attempts. ensure: don't re-apply CIFS security settings, to allow manual override. Change-Id: I3a5341e2bd5c6343cff6fc50f05d855dc4f09312
The 'reserved_share_extend_percentage' backend config option allows Manila to consider a different reservation percentage for the share extend operation. With this option, under the existing limit of 'reserved_share_percentage', we do not want the user to create a new share once the limit is hit, but we do allow the user to extend an existing share. DocImpact Closes-Bug: #1961087 Change-Id: I000a7f530569ff80495b1df62a91981dc5865023 (cherry picked from commit 6431b86)
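A minimal sketch of how a separate reservation for extends can behave, using simplified capacity math; the option names follow the commit message, the rest is illustrative.

```python
def has_capacity(free_gb, total_gb, requested_gb,
                 reserved_share_percentage,
                 reserved_share_extend_percentage,
                 is_extend=False):
    """Create uses the stricter reservation; extend may use a lower one,
    so an existing share can still grow after the create limit is reached."""
    reserved = (reserved_share_extend_percentage if is_extend
                else reserved_share_percentage)
    usable_gb = free_gb - total_gb * reserved / 100.0
    return requested_gb <= usable_gb
```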
If the customer is configuring "Servers" as AD Server in the Security Service then the domain controller discovery mode should be changed to "none" and only these servers should be used.
NFS v4.0 was previously handled differently in NetApp driver. We want to explicitly disable v4.0 if it is not set in 'netapp_enabled_share_protocols'. Also enable v4.1 options like read/write delegation, pnfs, acls.
During the share network create API, if a failure occurs the quota is not rolled back and is usable again only after the quota reservations have timed out (waiting conf.reservation_expire seconds). Closes-bug: #1975483 Change-Id: I3de8f5bfa6ac4580da9b1012caa25657a6df71ec (cherry picked from commit 8c854a1)
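A hedged sketch of the rollback-on-failure pattern such a fix introduces; 'quotas' and 'db' stand in for the quota engine and database API and are used here only illustratively.

```python
def create_share_network(context, quotas, db, values):
    """Reserve quota, create the network, and release the reservation
    immediately on failure instead of waiting for it to expire."""
    reservations = quotas.reserve(context, share_networks=1)
    try:
        network = db.share_network_create(context, values)
    except Exception:
        # Without this rollback the reservation stays held until
        # conf.reservation_expire passes.
        quotas.rollback(context, reservations)
        raise
    quotas.commit(context, reservations)
    return network
```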
- For all projects, enable logical space reporting.
- For neo, disable dedupe and compression.
- For share replica, share from snapshot, and share modify (extend/shrink), retain the behaviour of the parent share.
In regions with high load & payload it becomes necessary to check once against the latest 'created_at' or 'updated_at' from the db, if the earlier fetched value reports the service as down.
This fix is based on observation of the bug; I was not able to reproduce the issue on devstack. Though, since we do reproduce it in some regions, we can try this change there.
After checking all call invocations of service_is_up(), all calls are made with a service fetched from the db, and the default value of 60 seconds is good enough to rule out that we need to fetch db values again. The issue seems more like the service taking more than 60 seconds to come up. The current workaround we have in production is 10 minutes (600 seconds). This leaves as the only solution a call to service_is_up() where the caller passes the threshold value (if None is passed, take the conf default).
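A hedged sketch of the two ideas discussed in this thread: a caller-supplied threshold and a db recheck before declaring the service down. Names such as db.service_get follow the usual OpenStack pattern but are illustrative here, not this patch's code; timestamps are assumed timezone-aware.

```python
from datetime import datetime, timedelta, timezone

def service_is_up(service, threshold_seconds=60):
    """A service is 'up' if its last heartbeat is within the threshold.
    The caller may pass a larger threshold (e.g. 600s) instead of the default."""
    last_heartbeat = service['updated_at'] or service['created_at']
    elapsed = datetime.now(timezone.utc) - last_heartbeat
    return elapsed <= timedelta(seconds=threshold_seconds)

def service_is_up_with_recheck(db, context, service, threshold_seconds=60):
    """Re-read the latest heartbeat from the db before declaring the
    service down, in case the in-memory record is stale."""
    if service_is_up(service, threshold_seconds):
        return True
    fresh = db.service_get(context, service['id'])
    return service_is_up(fresh, threshold_seconds)
```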
please rebase