-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue with JSON decode on vgs combined stream ouput #247
Comments
Quick and dirty implementation I did to mitigate this temporarily. diff --git a/pkg/lvm/lvm_util.go b/pkg/lvm/lvm_util.go
index 0bb4877..e9b52ed 100644
--- a/pkg/lvm/lvm_util.go
+++ b/pkg/lvm/lvm_util.go
@@ -17,6 +17,7 @@ limitations under the License.
package lvm
import (
+ "bytes"
"encoding/json"
"fmt"
"os"
@@ -663,11 +664,16 @@ func ListLVMVolumeGroup(reloadCache bool) ([]apis.VolumeGroup, error) {
"--units", "b",
}
cmd := exec.Command(VGList, args...)
- output, err := cmd.CombinedOutput()
+ var execOut bytes.Buffer
+ // var execErr bytes.Buffer
+ cmd.Stdout = &execOut
+ // cmd.Stderr = &execErr // do something with this useful information
+ err := cmd.Run()
if err != nil {
klog.Errorf("lvm: list volume group cmd %v: %v", args, err)
return nil, err
}
+ output := execOut.Bytes()
return decodeVgsJSON(output)
} |
@kro-cat Thank you for reporting the issue. Can it be reproduced easily? If yes, please help us with the steps. Since you have found the fix for it also. Would you like to raise a PR for this? |
@abhilashshetty04 Of course! I'll have a PR up in a bit. I'm convinced this is a niche issue. My current hypothesis is that it's due to the fact that the host machine has a different lvm system_id than the lvm-driver container. When lvm detects a foreign system id on a volume group, it will print a warning on STDERR; this is the kind of message that led me to this issue. Since the system id can be set when using I haven't tested that yet, so I can't be certain that it's entirely correct. |
By "in a bit" I guess I really meant to say "when I'm confident it actually solves the problem I'm having instead of just hiding it" |
@kro-cat , Thanks for replying. We pre-provision vg (volume group) anyway right? So what do you mean by lvm-driver detects vg not managed by itself? |
@abhilashshetty04 I'll try my best to explain why that might cause a problem. In my particular use case, the host uses lvm logical volumes for its own filesystem (/var, /home, stuff like that). These volume groups cannot be managed by lvm-localpv, so I don't include them in its configuration. Additionally, the host's lvm.conf has been configured beyond defaults; notably, the hosts each have a unique systemid configured. These are the kinds of volumes whose volume groups create this issue. When the lvm-driver pod executes |
I think I should follow-up on my previous comment. The issue here is that lvm-driver crashes unexpectedly when warning messages are mixed with JSON output from |
) * fix(lvm): change calling operation to avoid garbage in json object Drop the STDERR stream contents from `output` to avoid JSON mangling. The `vgs` command may print non-critical warnings to STDERR. Warnings may not necessarily result in a failure return code, which allows the program to continue with marshalling the JSON-formatted output. Combining this stream with STDIN will cause the next step at `decodeVgsJSON()` to fail due to garbage mixed in the JSON. Fixes #247 Signed-off-by: Kara <[email protected]> * fix(lvm): add PR changelog - suppress STDERR when calling vgs Add PR changelog Signed-off-by: Kara <[email protected]> * refactor(lvm): refactor exec to separate function, add log verbosity Move exec code into separate function which returns STDOUT and STDERR streams separately. This is to facilitate future occurences of that same pattern within lvm_util.go . Additionally, add log messages to assist with debugging lvm and json. Signed-off-by: Kara <[email protected]> * feat(ci): integrate test to simulate foreign lvm systemid This test aims to determine whether lvm-driver is capable of operating with a volume group whose systemid is foreign to the kubernetes host system. The test ensures the program in pkg/lvm/lvm_utils.go is capable of operating in a foreign lvm environment. This is important when the kubernetes host system has a different lvm configuration than the lvm-driver container. Differences in configuration may cause unforseen consequences when the host machine provisions a volume group which is to be consumed by the lvm-driver container. Additionally, this feature includes teardown code in the form of the cleanup functions which aims to aid with cleaning up ci resources in a local development environment. To make the cleanup actions compatible with the original operation of the ci-test.sh script, the feature is only enabled through the use of environment variables. While this commit fixes issue #249, it's intent was as an integration test for issue #247. Fixes #249 Signed-off-by: Kara <[email protected]> * feat(ci): clean up volume groups when applicable To avoid volume group caching, explicitly remove volume groups (and their pvs) before removing the disk and loopback device. Signed-off-by: Kara <[email protected]> * fix(ci): use -n instead of ! -z, change last statement to return true Fix linter errors: use -n instead of ! -z in test statements. Last statement in file is used as the return code of the file, so it must return true; change `[ ! -z ... ] && ...` into `[ -z ... ] || ...` to achieve this. Signed-off-by: kro <[email protected]> --------- Signed-off-by: Kara <[email protected]> Signed-off-by: kro <[email protected]>
@kro-cat @Abhinandan-Purkait I have the same issue with lvs and pvs. I think all commands should use RunCommandSplit instead of CombindOutput. |
@Mjolkhare I think I'll set up a PR this weekend for that since I never did that refactor in pull #250, unless of course someone beats me to it. |
@kro-cat Thanks for being on top of this. |
The LVM system may sometimes produce non-critical warnings which are written to STDERR without formatting. Combining STDERR with STDIN may cause failures when the output is being formatted or otherwise interpreted. Following pull openebs#250, it's been requested that all lvm commands be refactored to use a split output in order to resolve this issue under non-tested scenarios. See issue openebs#247. The definition for RunCommandSplit has been moved above all uses of the function, and any Command using CombinedOutput and an lvm command (s.a.: lvs, vgs, lvcreate, &c.) has been refactored to instead use RunCommandSplit to obtain the command's output. If anything is written to STDERR by the lvm commands, RunCommandSplit prints the message to the log as a warning. Signed-off-by: kro <[email protected]>
…#261) * refactor(ci): rename env variables and update vgcreate command Env vars LVM_SYSTEMID and LVM_CONFIG are used to describe a potentially foreign lvm system on the kubernetes host. The lvm system is only meant to be foreign to the lvm-localpv containers. Due to the way the ci tests are written, the kubernetes host and the lvm-localpv conatiners must have identical lvm configurations or ci tests might fail. The variables LVM_SYSTEMID and LVM_CONFIG have been renamed to FOREIGN_LVM_SYSTEMID and FOREIGN_LVM_CONFIG, respectively. This is helpful when determining their roles at a glance. A secondary change was made to the foreign pv creation: the lvm option `--config` has been used in place of `--systemid` to hopefully minimize unintended side effects of using the same lvm config (with the exception of the system id) during volume group creation. Signed-off-by: kro <[email protected]> * refactor(lvm): update all lvm commands to use split output The LVM system may sometimes produce non-critical warnings which are written to STDERR without formatting. Combining STDERR with STDIN may cause failures when the output is being formatted or otherwise interpreted. Following pull #250, it's been requested that all lvm commands be refactored to use a split output in order to resolve this issue under non-tested scenarios. See issue #247. The definition for RunCommandSplit has been moved above all uses of the function, and any Command using CombinedOutput and an lvm command (s.a.: lvs, vgs, lvcreate, &c.) has been refactored to instead use RunCommandSplit to obtain the command's output. If anything is written to STDERR by the lvm commands, RunCommandSplit prints the message to the log as a warning. Signed-off-by: kro <[email protected]> --------- Signed-off-by: kro <[email protected]>
I recently ran into a bug which is causing the lvm driver to crash unexpectedly on my systems.
I've traced it to pkg/lvm/lvm_util.go line 666, in ListLVMVolumeGroup()
STDERR and STDOUT are combined via
cmd.CombinedOutput()
. Thevgs
program may produce non-critical warning messages on STDERR which don't affect the operation of this driver. Warning messages also don't affect the return code of thevgs
program. Any warning messages printed to STDERR will fail the JSON decode step indecodeVgsJSON()
.The text was updated successfully, but these errors were encountered: