ncdump crashes after a 2D NC_STRING variable is written to this file #2002

Closed
krisfed opened this issue May 18, 2021 · 3 comments · Fixed by #2004


krisfed commented May 18, 2021

Hi all!

So far I am seeing this issue with only one specific file (nctest_netcdf4_orig.nc in attached.zip, mostly full of dummy data). ncdump works fine on the original file, but after I write a 2D NC_STRING variable to it, ncdump crashes with "Segmentation fault: 11".

I isolated the reproduction steps for writing the NC_STRING variable (and thus creating the problematic file) in string_ncdump_crush.cpp in attached.zip.

Here are the output and details of what I am observing:

  1. ncdump runs fine on the original file:
$ ncdump nctest_netcdf4_orig.nc
netcdf nctest_netcdf4_orig {
dimensions:
	time = UNLIMITED ; // (50 currently)
	y = 50 ;
	x = 50 ;
	r = 1 ;
	c = 2 ;
	charcol = 6 ;
variables:
	double pi ;
		pi:description = "circumference by diameter" ;
	short temperature(time) ;
		temperature:_FillValue = 200s ;
		temperature:scale_factor = 1.8 ;
		temperature:add_offset = 32. ;
		temperature:units = "degrees_fahrenheight" ;
		temperature:value = "1 to 50 column vector" ;
	double time(time) ;
		time:_FillValue = 200. ;
		time:length = "1 to 55" ;
	double peaks(x, y) ;
		peaks:_FillValue = 9.96920996838687e+36 ;
		peaks:description = "varData=peaks(50);" ;
	double double(c, r) ;
		double:double = 2.2250738585072e-308, 1.79769313486232e+308 ;
	float single(c, r) ;
		single:single = 1.175494e-38f, 3.402823e+38f ;
	uint64 uint64(c, r) ;
		uint64:uint64 = 0ULL, 18446744073709551615ULL ;
	int64 int64(c, r) ;
		int64:int64 = -9223372036854775808LL, 9223372036854775807LL ;
	int int32(c, r) ;
		int32:int32 = -2147483648, 2147483647 ;
	uint uint32(c, r) ;
		uint32:uint32 = 0U, 4294967295U ;
	short int16(c, r) ;
		int16:int16 = -32768s, 32767s ;
	ushort uint16(c, r) ;
		uint16:uint16 = 0US, 65535US ;
	byte int8(c, r) ;
		int8:int8 = -128b, 127b ;
	ubyte uint8(c, r) ;
		uint8:uint8 = 0UB, 255UB ;
	char char(charcol) ;
		char:char = "hohoho" ;

// global attributes:
		:creation_date = "6/6/66" ;
data:

 pi = 3.14159265358979 ;

 temperature = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 
    19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 
    37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50 ;

 time = _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _ ;

 peaks =
  6.67128029671744e-05, 0.000102910100082449, 0.0001536335025436, 
 < ... many values ... >
    6.99711871894863e-05, 4.10297274582676e-05 ;

 double =
  2.2250738585072e-308,
  1.79769313486232e+308 ;

 single =
  1.175494e-38,
  3.402823e+38 ;

 uint64 =
  0,
  18446744073709551615 ;

 int64 =
  -9223372036854775808,
  9223372036854775807 ;

 int32 =
  -2147483648,
  2147483647 ;

 uint32 =
  0,
  _ ;

 int16 =
  -32768,
  32767 ;

 uint16 =
  0,
  _ ;

 int8 =
  -128,
  127 ;

 uint8 =
  0,
  255 ;

 char = "hohoho" ;

group: grid1 {
  dimensions:
  	row = 10 ;
  	col = 10 ;
  variables:
  	double magic10(col, row) ;
  		magic10:_FillValue = 9.96920996838687e+36 ;
  		magic10:location = "in group /grid1" ;

  // group attributes:
  		:absolutepath = "/grid1" ;
  data:

   magic10 =
  92, 98, 4, 85, 86, 17, 23, 79, 10, 11,
  99, 80, 81, 87, 93, 24, 5, 6, 12, 18,
  1, 7, 88, 19, 25, 76, 82, 13, 94, 100,
  8, 14, 20, 21, 2, 83, 89, 95, 96, 77,
  15, 16, 22, 3, 9, 90, 91, 97, 78, 84,
  67, 73, 54, 60, 61, 42, 48, 29, 35, 36,
  74, 55, 56, 62, 68, 49, 30, 31, 37, 43,
  51, 57, 63, 69, 75, 26, 32, 38, 44, 50,
  58, 64, 70, 71, 52, 33, 39, 45, 46, 27,
  40, 41, 47, 28, 34, 65, 66, 72, 53, 59 ;

  group: subGrid1 {
    dimensions:
    	row = 5 ;
    	col = 5 ;
    	sampleTime = UNLIMITED ; // (3 currently)
    variables:
    	double magic5(col, row) ;
    		magic5:_FillValue = 9.96920996838687e+36 ;
    		magic5:location = "in group /grid1/subGrid1" ;
    	double sensorValue(col, sampleTime) ;
    		sensorValue:_FillValue = 9.96920996838687e+36 ;

    // group attributes:
    		:absolutepath = "/grid1/subGrid1" ;
    data:

     magic5 =
  17, 23, 4, 10, 11,
  24, 5, 6, 12, 18,
  1, 7, 13, 19, 25,
  8, 14, 20, 21, 2,
  15, 16, 22, 3, 9 ;

     sensorValue =
  {1, 11, 21},
  {2, 12, 22},
  {3, 13, 23},
  {4, 14, 24},
  {5, 15, 25} ;
    } // group subGrid1
  } // group grid1
}
  2. Create a duplicate of the original file and run the repro steps (string_ncdump_crush.cpp), which create a small 2x2 NC_STRING variable named "str" in the duplicate file. This all happens without any error (string_ncdump_crush.cpp checks the error code after each operation):
$ cp nctest_netcdf4_orig.nc nctest_netcdf4.nc
$ ./a.out
  3. Now ncdump crashes (but the new variable seems to be read and displayed correctly):
$ ncdump nctest_netcdf4.nc 
netcdf nctest_netcdf4 {
dimensions:
	time = UNLIMITED ; // (50 currently)
	y = 50 ;
	x = 50 ;
	r = 1 ;
	c = 2 ;
	charcol = 6 ;
	m = 2 ;
	n = 2 ;
variables:
	double pi ;
		pi:description = "circumference by diameter" ;
	short temperature(time) ;
		temperature:_FillValue = 200s ;
		temperature:scale_factor = 1.8 ;
		temperature:add_offset = 32. ;
		temperature:units = "degrees_fahrenheight" ;
		temperature:value = "1 to 50 column vector" ;
	double time(time) ;
		time:_FillValue = 200. ;
		time:length = "1 to 55" ;
	double peaks(x, y) ;
		peaks:_FillValue = 9.96920996838687e+36 ;
		peaks:description = "varData=peaks(50);" ;
	double double(c, r) ;
		double:double = 2.2250738585072e-308, 1.79769313486232e+308 ;
	float single(c, r) ;
		single:single = 1.175494e-38f, 3.402823e+38f ;
	uint64 uint64(c, r) ;
		uint64:uint64 = 0ULL, 18446744073709551615ULL ;
	int64 int64(c, r) ;
		int64:int64 = -9223372036854775808LL, 9223372036854775807LL ;
	int int32(c, r) ;
		int32:int32 = -2147483648, 2147483647 ;
	uint uint32(c, r) ;
		uint32:uint32 = 0U, 4294967295U ;
	short int16(c, r) ;
		int16:int16 = -32768s, 32767s ;
	ushort uint16(c, r) ;
		uint16:uint16 = 0US, 65535US ;
	byte int8(c, r) ;
		int8:int8 = -128b, 127b ;
	ubyte uint8(c, r) ;
		uint8:uint8 = 0UB, 255UB ;
	char char(charcol) ;
		char:char = "hohoho" ;
	string str(n, m) ;

// global attributes:
		:creation_date = "6/6/66" ;
data:

 pi = 3.14159265358979 ;

 temperature = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 
    19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 
    37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50 ;

 time = _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _ ;

 peaks =
  6.67128029671744e-05, 0.000102910100082449, 0.0001536335025436, 
    < ... many values ... >
    6.99711871894863e-05, 4.10297274582676e-05 ;

 double =
  2.2250738585072e-308,
  1.79769313486232e+308 ;

 single =
  1.175494e-38,
  3.402823e+38 ;

 uint64 =
  0,
  18446744073709551615 ;

 int64 =
  -9223372036854775808,
  9223372036854775807 ;

 int32 =
  -2147483648,
  2147483647 ;

 uint32 =
  0,
  _ ;

 int16 =
  -32768,
  32767 ;

 uint16 =
  0,
  _ ;

 int8 =
  -128,
  127 ;

 uint8 =
  0,
  255 ;

 char = "hohoho" ;

 str =
  "Monday", "Wednesday",
  "Tuesday", "Thursday" ;

group: grid1 {
  dimensions:
  	row = 10 ;
  	col = 10 ;
  variables:
  	double magic10(col, row) ;
  		magic10:_FillValue = 9.96920996838687e+36 ;
  		magic10:location = "in group /grid1" ;

  // group attributes:
  		:absolutepath = "/grid1" ;
  data:

   magic10 =
  92, 98, 4, 85, 86, 17, 23, 79, 10, 11,
  99, 80, 81, 87, 93, 24, 5, 6, 12, 18,
  1, 7, 88, 19, 25, 76, 82, 13, 94, 100,
  8, 14, 20, 21, 2, 83, 89, 95, 96, 77,
  15, 16, 22, 3, 9, 90, 91, 97, 78, 84,
  67, 73, 54, 60, 61, 42, 48, 29, 35, 36,
  74, 55, 56, 62, 68, 49, 30, 31, 37, 43,
  51, 57, 63, 69, 75, 26, 32, 38, 44, 50,
  58, 64, 70, 71, 52, 33, 39, 45, 46, 27,
  40, 41, 47, 28, 34, 65, 66, 72, 53, 59 ;

  group: subGrid1 {
    dimensions:
    	row = 5 ;
    	col = 5 ;
    	sampleTime = UNLIMITED ; // (3 currently)
    variables:
    	double magic5(col, row) ;
    		magic5:_FillValue = 9.96920996838687e+36 ;
    		magic5:location = "in group /grid1/subGrid1" ;
    	double sensorValue(col, sampleTime) ;
    		sensorValue:_FillValue = 9.96920996838687e+36 ;

    // group attributes:
    		:absolutepath = "/grid1/subGrid1" ;
    data:

     magic5 =
  17, 23, 4, 10, 11,
  24, 5, 6, 12, 18,
  1, 7, 13, 19, 25,
  8, 14, 20, 21, 2,
  15, 16, 22, 3, 9 ;

     sensorValue =
  {1, 11, 21},
  {2, 12, 22},
  {3, 13, 23},
  {4, 14, 24},
  {5, 15, 25} ;
    } // group subGrid1
  } // group grid1
}
Segmentation fault: 11

The above is with netcdf-c 4.7.4 on a macOS 11.2.3 machine. @DennisHeimbigner has mentioned an ncdump crash on another issue I reported (#1985), and I tried building ncdump with the new dumplib.c he attached, and I still saw the crash.

I have also tried running ncdump on the same problematic file on a Linux machine. There the issue presents a bit differently:

% ./ncdump ../nctest_netcdf4.nc 
netcdf nctest_netcdf4 {
dimensions:
	time = UNLIMITED ; // (50 currently)
	y = 50 ;
	x = 50 ;
	r = 1 ;
	c = 2 ;
	charcol = 6 ;
	m = 2 ;
	n = 2 ;
variables:
	double pi ;
		pi:description = "circumference by diameter" ;
	short temperature(time) ;
		temperature:_FillValue = 200s ;
		temperature:scale_factor = 1.8 ;
		temperature:add_offset = 32. ;
		temperature:units = "degrees_fahrenheight" ;
		temperature:value = "1 to 50 column vector" ;
	double time(time) ;
		time:_FillValue = 200. ;
		time:length = "1 to 55" ;
	double peaks(x, y) ;
		peaks:_FillValue = 9.96920996838687e+36 ;
		peaks:description = "varData=peaks(50);" ;
	double double(c, r) ;
		double:double = 2.2250738585072e-308, 1.79769313486232e+308 ;
	float single(c, r) ;
		single:single = 1.175494e-38f, 3.402823e+38f ;
	uint64 uint64(c, r) ;
		uint64:uint64 = 0ULL, 18446744073709551615ULL ;
	int64 int64(c, r) ;
		int64:int64 = -9223372036854775808LL, 9223372036854775807LL ;
	int int32(c, r) ;
		int32:int32 = -2147483648, 2147483647 ;
	uint uint32(c, r) ;
		uint32:uint32 = 0U, 4294967295U ;
	short int16(c, r) ;
		int16:int16 = -32768s, 32767s ;
	ushort uint16(c, r) ;
		uint16:uint16 = 0US, 65535US ;
	byte int8(c, r) ;
		int8:int8 = -128b, 127b ;
	ubyte uint8(c, r) ;
		uint8:uint8 = 0UB, 255UB ;
	char char(charcol) ;
		char:char = "hohoho" ;
	string str(n, m) ;

// global attributes:
		:creation_date = "6/6/66" ;
data:

 pi = 3.14159265358979 ;

 temperature = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 
    19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 
    37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50 ;

 time = _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _ ;

 peaks =
malloc(): invalid size (unsorted)
Abort

Do you think the issue is with ncdump, or with how the NC_STRING variable is written, or something is wonky with this one file itself?

I am most of all concerned about - and would like to rule out - any latent issues (that are somehow only made apparent by ncdump) in the writing of NC_STRING data...

Would appreciate your thoughts!

Collaborator

DennisHeimbigner commented May 18, 2021

So it turns out that for netcdf-4 files, the maximum dimension id is not necessarily equal to (total number of dimensions - 1); that is, there can be holes in the assignment of dimension ids. This error occurs because ncdump was making the false assumption that those two values are equal.
The fix is to search the dataset to find the maximum dimension id and use that to allocate a vector of all dimensions indexed by dimension id.
I should have a pull request to fix this shortly.
Surprised this bug did not show up earlier.
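
For readers following along, here is a rough sketch of the idea described above. It is not the actual dumplib.c change: the `Dim` struct, `max_dim_id`, and `build_dim_table` are hypothetical names made up for illustration, and the ids used in the usage note are invented rather than read from the attached file. It only shows why a table indexed by dimension id has to be sized from the maximum id rather than from the dimension count.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a dimension as a dump tool might record it. */
typedef struct {
    int id;            /* dimension id assigned by the library */
    const char *name;  /* dimension name */
    size_t len;        /* dimension length */
} Dim;

/* Return the largest id among dims[0..ndims-1], or -1 if there are none. */
static int max_dim_id(const Dim *dims, size_t ndims)
{
    int maxid = -1;
    for (size_t i = 0; i < ndims; i++) {
        if (dims[i].id > maxid)
            maxid = dims[i].id;
    }
    return maxid;
}

/*
 * Build a lookup table indexed by dimension id.
 * The table has (maxid + 1) slots, not ndims slots, so ids with holes
 * between them cannot index past the end. Unused slots stay NULL.
 * Returns NULL if there are no dimensions or allocation fails.
 * The caller frees the returned table.
 */
static const Dim **build_dim_table(const Dim *dims, size_t ndims,
                                   size_t *table_len)
{
    int maxid = max_dim_id(dims, ndims);
    *table_len = (size_t)(maxid + 1);
    if (*table_len == 0)
        return NULL;

    const Dim **table = calloc(*table_len, sizeof *table);
    if (table == NULL)
        return NULL;

    for (size_t i = 0; i < ndims; i++) {
        assert(dims[i].id >= 0);       /* ids are assumed non-negative */
        table[dims[i].id] = &dims[i];  /* always in bounds: id <= maxid */
    }
    return table;
}
```

With five dimensions whose (made-up) ids are 0, 1, 2, 4 and 6, `build_dim_table` allocates seven slots and leaves slots 3 and 5 NULL. A table sized from the count alone would have only five slots, so storing at index 6 would write past the end of the allocation, which is consistent with the heap corruption and segfault symptoms reported above.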

DennisHeimbigner added a commit to DennisHeimbigner/netcdf-c that referenced this issue May 18, 2021
re: issue Unidata#2002

It turns out that ncdump has an error where it assumes that the set of all dimension ids has no holes, that is, that (maxid+1) = ndims. This is incorrect for a variety of reasons for netcdf-4.

So instead of counting the total number of dimensions in a dataset, it is necessary to look for the maximum dimension id and use that when allocating a table of all dimensions.
@DennisHeimbigner
Collaborator

Fixed by #2004

Author

krisfed commented May 19, 2021

This is great! Thank you, Dennis! I no longer see the crash after applying your changes to dumplib.c
