Skip to content

Commit

Permalink
gdb: building inferior strings from within GDB
Browse files Browse the repository at this point in the history
History Of This Patch
=====================

This commit aims to address PR gdb/21699.  There have now been a
couple of attempts to fix this issue.  Simon originally posted two
patches back in 2021:

  https://sourceware.org/pipermail/gdb-patches/2021-July/180894.html
  https://sourceware.org/pipermail/gdb-patches/2021-July/180896.html

Before Pedro then posted a version of his own:

  https://sourceware.org/pipermail/gdb-patches/2021-July/180970.html

After this the conversation halted.  Then in 2023 I (Andrew) also took
a look at this bug and posted two versions:

  https://sourceware.org/pipermail/gdb-patches/2023-April/198570.html
  https://sourceware.org/pipermail/gdb-patches/2023-April/198680.html

The approach taken in my first patch was pretty similar to what Simon
originally posted back in 2021.  My second attempt was only a slight
variation on the first.

Pedro then pointed out his older patch, and so we arrive at this
patch.  The GDB changes here are mostly Pedro's work, but updated by
me (Andrew), any mistakes are mine.

The tests here are a combinations of everyone's work, and the commit
message is new, but copies bits from everyone's earlier work.

Problem Description
===================

Bug PR gdb/21699 makes the observation that using $_as_string with
GDB's printf can cause GDB to print unexpected data from the
inferior.  The reproducer is pretty simple:

  #include <stddef.h>
  static char arena[100];

  /* Override malloc() so value_coerce_to_target() gets a known
     pointer, and we know we"ll see an error if $_as_string() gives
     a string that isn't null terminated. */
  void
  *malloc (size_t size)
  {
      memset (arena, 'x', sizeof (arena));
      if (size > sizeof (arena))
          return NULL;
      return arena;
  }

  int
  main ()
  {
    return 0;
  }

And then in a GDB session:

  $ gdb -q test
  Reading symbols from /tmp/test...
  (gdb) start
  Temporary breakpoint 1 at 0x4004c8: file test.c, line 17.
  Starting program: /tmp/test

  Temporary breakpoint 1, main () at test.c:17
  17        return 0;
  (gdb) printf "%s\n", $_as_string("hello")
  "hello"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  (gdb) quit

The problem above is caused by how value_cstring is used within
py-value.c, but once we understand the issue then it turns out that
value_cstring is used in an unexpected way in many places within GDB.

Within py-value.c we have a null-terminated C-style string.  We then
pass a pointer to this string, along with the length of this
string (so not including the null-character) to value_cstring.

In value_cstring GDB allocates an array value of the given character
type, and copies in requested number of characters.  However
value_cstring does not add a null-character of its own.  This means
that the value created by calling value_cstring is only
null-terminated if the null-character is included in the passed in
length.  In py-value.c this is not the case, and indeed, in most uses
of value_cstring, this is not the case.

When GDB tries to print one of these strings the value contents are
pushed to the inferior, and then read back as a C-style string, that
is, GDB reads inferior memory until it finds a null-terminator.  For
the py-value.c case, no null-terminator is pushed into the inferior,
so GDB will continue reading inferior memory until a null-terminator
is found, with unpredictable results.

Patch Description
=================

The first thing this patch does is better define what the arguments
for the two function value_cstring and value_string should represent.
The comments in the header file are updated to describe whether the
length argument should, or should not, include a null-character.
Also, the data argument is changed to type gdb_byte.  The functions as
they currently exist will handle wide-characters, in which case more
than one 'char' would be needed for each character.  As such using
gdb_byte seems to make more sense.

To avoid adding casts throughout GDB, I've also added an overload that
still takes a 'char *', but asserts that the character type being used
is of size '1'.

The value_cstring function is now responsible for adding a null
character at the end of the string value it creates.

However, once we start looking at how value_cstring is used, we
realise there's another, related, problem.  Not every language's
strings are null terminated.  Fortran and Ada strings, for example,
are just an array of characters, GDB already has the function
value_string which can be used to create such values.

Consider this example using current GDB:

  (gdb) set language ada
  (gdb) p $_gdb_setting("arch")
  $1 = (97, 117, 116, 111)
  (gdb) ptype $
  type = array (1 .. 4) of char
  (gdb) p $_gdb_maint_setting("test-settings string")
  $2 = (0)
  (gdb) ptype $
  type = array (1 .. 1) of char

This shows two problems, first, the $_gdb_setting and
$_gdb_maint_setting functions are calling value_cstring using the
builtin_char character, rather than a language appropriate type.  In
the first call, the 'arch' case, the value_cstring call doesn't
include the null character, so the returned array only contains the
expected characters.  But, in the $_gdb_maint_setting example we do
end up including the null-character, even though this is not expected
for Ada strings.

This commit adds a new language method language_defn::value_string,
this function takes a pointer and length and creates a language
appropriate value that represents the string.  For C, C++, etc this
will be a null-terminated string (by calling value_cstring), and for
Fortran and Ada this can be a bounded array of characters with no null
terminator.  Additionally, this new language_defn::value_string
function is responsible for selecting a language appropriate character
type.

After this commit the only calls to value_cstring are from the C
expression evaluator and from the default language_defn::value_string.

And the only calls to value_string are from Fortan, Ada, and ObjectC
related code.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=21699

Co-Authored-By: Simon Marchi <[email protected]>
Co-Authored-By: Andrew Burgess <[email protected]>
Co-Authored-By: Pedro Alves <[email protected]>
Approved-By: Simon Marchi <[email protected]>
  • Loading branch information
3 people committed Jun 5, 2023
1 parent f4afd6c commit baab375
Show file tree
Hide file tree
Showing 18 changed files with 544 additions and 48 deletions.
13 changes: 13 additions & 0 deletions gdb/ada-lang.c
Original file line number Diff line number Diff line change
Expand Up @@ -13524,6 +13524,19 @@ class ada_language : public language_defn
return symbol->is_artificial ();
}

/* See language.h. */
struct value *value_string (struct gdbarch *gdbarch,
const char *ptr, ssize_t len) const override
{
struct type *type = language_string_char_type (this, gdbarch);
value *val = ::value_string (ptr, len, type);
/* VAL will be a TYPE_CODE_STRING, but Ada only knows how to print
strings that are arrays of characters, so fix the type now. */
gdb_assert (val->type ()->code () == TYPE_CODE_STRING);
val->type ()->set_code (TYPE_CODE_ARRAY);
return val;
}

/* See language.h. */
void language_arch_info (struct gdbarch *gdbarch,
struct language_arch_info *lai) const override
Expand Down
14 changes: 6 additions & 8 deletions gdb/c-lang.c
Original file line number Diff line number Diff line change
Expand Up @@ -652,16 +652,11 @@ c_string_operation::evaluate (struct type *expect_type,
}
else
{
int i;

/* Write the terminating character. */
for (i = 0; i < type->length (); ++i)
obstack_1grow (&output, 0);
int element_size = type->length ();

if (satisfy_expected)
{
LONGEST low_bound, high_bound;
int element_size = type->length ();

if (!get_discrete_bounds (expect_type->index_type (),
&low_bound, &high_bound))
Expand All @@ -676,10 +671,13 @@ c_string_operation::evaluate (struct type *expect_type,
result = value::allocate (expect_type);
memcpy (result->contents_raw ().data (), obstack_base (&output),
obstack_object_size (&output));
/* Write the terminating character. */
memset (result->contents_raw ().data () + obstack_object_size (&output),
0, element_size);
}
else
result = value_cstring ((const char *) obstack_base (&output),
obstack_object_size (&output),
result = value_cstring ((const gdb_byte *) obstack_base (&output),
obstack_object_size (&output) / element_size,
type);
}
return result;
Expand Down
18 changes: 4 additions & 14 deletions gdb/cli/cli-cmds.c
Original file line number Diff line number Diff line change
Expand Up @@ -2315,12 +2315,7 @@ value_from_setting (const setting &var, struct gdbarch *gdbarch)
len = st.length ();
}

if (len > 0)
return value_cstring (value, len,
builtin_type (gdbarch)->builtin_char);
else
return value_cstring ("", 1,
builtin_type (gdbarch)->builtin_char);
return current_language->value_string (gdbarch, value, len);
}
default:
gdb_assert_not_reached ("bad var_type");
Expand Down Expand Up @@ -2372,8 +2367,8 @@ str_value_from_setting (const setting &var, struct gdbarch *gdbarch)
{
std::string cmd_val = get_setshow_command_value_string (var);

return value_cstring (cmd_val.c_str (), cmd_val.size (),
builtin_type (gdbarch)->builtin_char);
return current_language->value_string (gdbarch, cmd_val.c_str (),
cmd_val.size ());
}

case var_string:
Expand All @@ -2400,12 +2395,7 @@ str_value_from_setting (const setting &var, struct gdbarch *gdbarch)
len = st.length ();
}

if (len > 0)
return value_cstring (value, len,
builtin_type (gdbarch)->builtin_char);
else
return value_cstring ("", 1,
builtin_type (gdbarch)->builtin_char);
return current_language->value_string (gdbarch, value, len);
}
default:
gdb_assert_not_reached ("bad var_type");
Expand Down
10 changes: 10 additions & 0 deletions gdb/f-lang.c
Original file line number Diff line number Diff line change
Expand Up @@ -101,6 +101,16 @@ f_language::get_encoding (struct type *type)
return encoding;
}

/* See language.h. */

struct value *
f_language::value_string (struct gdbarch *gdbarch,
const char *ptr, ssize_t len) const
{
struct type *type = language_string_char_type (this, gdbarch);
return ::value_string (ptr, len, type);
}

/* A helper function for the "bound" intrinsics that checks that TYPE
is an array. LBOUND_P is true for lower bound; this is used for
the error message, if any. */
Expand Down
5 changes: 5 additions & 0 deletions gdb/f-lang.h
Original file line number Diff line number Diff line change
Expand Up @@ -203,6 +203,11 @@ class f_language : public language_defn

/* See language.h. */

struct value *value_string (struct gdbarch *gdbarch,
const char *ptr, ssize_t len) const override;

/* See language.h. */

const char *struct_too_deep_ellipsis () const override
{ return "(...)"; }

Expand Down
4 changes: 1 addition & 3 deletions gdb/guile/scm-math.c
Original file line number Diff line number Diff line change
Expand Up @@ -803,9 +803,7 @@ vlscm_convert_typed_value_from_scheme (const char *func_name,
0 /*non-strict*/,
&except_scm);
if (s != NULL)
value = value_cstring (s.get (), len,
language_string_char_type (language,
gdbarch));
value = language->value_string (gdbarch, s.get (), len);
else
value = NULL;
}
Expand Down
10 changes: 10 additions & 0 deletions gdb/language.c
Original file line number Diff line number Diff line change
Expand Up @@ -874,6 +874,16 @@ language_string_char_type (const struct language_defn *la,

/* See language.h. */

struct value *
language_defn::value_string (struct gdbarch *gdbarch,
const char *ptr, ssize_t len) const
{
struct type *type = language_string_char_type (this, gdbarch);
return value_cstring (ptr, len, type);
}

/* See language.h. */

struct type *
language_bool_type (const struct language_defn *la,
struct gdbarch *gdbarch)
Expand Down
6 changes: 6 additions & 0 deletions gdb/language.h
Original file line number Diff line number Diff line change
Expand Up @@ -602,6 +602,12 @@ struct language_defn
virtual char string_lower_bound () const
{ return c_style_arrays_p () ? 0 : 1; }

/* Return the LEN characters long string at PTR as a value suitable for
this language. GDBARCH is used to infer the character type. The
default implementation returns a null-terminated C string. */
virtual struct value *value_string (struct gdbarch *gdbarch,
const char *ptr, ssize_t len) const;

/* Returns true if the symbols names should be stored in GDB's data
structures for minimal/partial/full symbols using their linkage (aka
mangled) form; false if the symbol names should be demangled first.
Expand Down
8 changes: 3 additions & 5 deletions gdb/python/py-value.c
Original file line number Diff line number Diff line change
Expand Up @@ -54,9 +54,6 @@
#define builtin_type_pybool \
language_bool_type (current_language, gdbpy_enter::get_gdbarch ())

#define builtin_type_pychar \
language_string_char_type (current_language, gdbpy_enter::get_gdbarch ())

struct value_object {
PyObject_HEAD
struct value_object *next;
Expand Down Expand Up @@ -1881,8 +1878,9 @@ convert_value_from_python (PyObject *obj)
gdb::unique_xmalloc_ptr<char> s
= python_string_to_target_string (obj);
if (s != NULL)
value = value_cstring (s.get (), strlen (s.get ()),
builtin_type_pychar);
value
= current_language->value_string (gdbpy_enter::get_gdbarch (),
s.get (), strlen (s.get ()));
}
else if (PyObject_TypeCheck (obj, &value_object_type))
value = ((value_object *) obj)->value->copy ();
Expand Down
32 changes: 32 additions & 0 deletions gdb/testsuite/gdb.base/internal-string-values.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
/* This testcase is part of GDB, the GNU debugger.
Copyright 2021-2023 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */

static void
trace_me (void)
{}

static void
end (void)
{}

int
main (void)
{
trace_me ();
end ();
return 0;
}
Loading

0 comments on commit baab375

Please sign in to comment.