Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to get position info or parser context with custom from_json() that may throw exceptions? #1508

Closed
traceon opened this issue Mar 11, 2019 · 19 comments · Fixed by #2562
Closed
Assignees
Labels
release item: ⚡ improvement solution: proposed fix a fix for the issue has been proposed and waits for confirmation state: please discuss please discuss the issue or vote for your favorite option

Comments

@traceon
Copy link

traceon commented Mar 11, 2019

Sorry if the question is lame or was answered before (couldn't find anything similar in the issues history).

Is it possible to get a position info or, better, a typical json::exception with a context where the "parsing" failed, for any exceptions thrown inside my custom from_json() for third party classes that may throw errors on their from_string() builder functions?

@traceon traceon changed the title How to get position info or generic error messages with custom from_json()? How to get position info or parser context with custom from_json() that may throw exceptions? Mar 11, 2019
@nlohmann
Copy link
Owner

This is unfortunately not possible, because it is a compile-time error. This compile-time error is a static assertion that is triggered if a type is subject conversion from or to JSON, but no from_json/to_json function can be found - either predefined in the library (e.g., for standard types like int or std::vector) or inside the namespace of the respective type. See https://github.com/nlohmann/json#arbitrary-types-conversions for more information.

@traceon
Copy link
Author

traceon commented Mar 11, 2019

Oh, I probably haven't expressed it clearly: I have implemented my from_json() for the required type, and it works, as long as I provide the correct string. What I want to achieve, is to be able to output regular json errors with line and column where it failed to parse, assuming that failure is manifested as my custom from_json() throwing some exception.

#include <boost/lexical_cast.hpp>
#include <boost/thread.hpp>

#include <nlohmann/json.hpp>

#include <stdexcept>
#include <string>

class ThreadCount {
public:
    explicit ThreadCount(std::size_t num = 0) : num_(num) {}

    // This will throw on non-numbers or misspelled 'hardware' or 'physical'.
    static ThreadCount from_string(const std::string& value) {
        ThreadCount tc;

        if (value == "hardware") {
            tc.num_ = boost::thread::hardware_concurrency();
        }
        else if (value == "physical") {
            tc.num_ = boost::thread::physical_concurrency();
        }
        else {
            tc.num_ = boost::lexical_cast<decltype(tc.num_)>(value);
        }

        return tc;
    }

private:
    std::size_t num_;
};

inline void from_json(const nlohmann::json& json, ThreadCount& value) {
    if (json.is_string()) {
        value = ThreadCount::from_string(json.get<std::string>());
    }
    else if (json.is_number_unsigned()) {
        value = ThreadCount(json.get<std::size_t>());
    }
    else {
        throw std::runtime_error("value must be either string or unsigned number");
    }
}

int main(int argc, const char* argv[]) {
    int exitCode = EXIT_SUCCESS;

    try {
//      auto tc_json = "{ \"value\": \"hardware\" }"_json; // OK
//      auto tc_json = "{ \"value\": \"10\" }"_json;       // OK
//      auto tc_json = "{ \"value\": -6 }"_json;           // std::exception: value must be either string or unsigned number
//      auto tc_json = "{ \"value\": \"fisikal\" }"_json;  // boost::bad_lexical_cast: bad lexical cast: source type value could not be interpreted as target
        auto tc_json = "{ \"value\": xyz }"_json;          // nlohmann::json::exception: [json.exception.parse_error.101] parse error at line 1, column 12: syntax error while parsing value - invalid literal; last read: '"value": x'

        auto tc = tc_json.at("value").get<ThreadCount>();
    }
    catch (const nlohmann::json::exception& ex) {
        std::fprintf(stderr, "nlohmann::json::exception: %s\n", ex.what());
        exitCode = EXIT_FAILURE;
    }
    catch (const boost::bad_lexical_cast& ex) {
        std::fprintf(stderr, "boost::bad_lexical_cast: %s\n", ex.what());
        exitCode = EXIT_FAILURE;
    }
    catch (const std::exception& ex) {
        std::fprintf(stderr, "std::exception: %s\n", ex.what());
        exitCode = EXIT_FAILURE;
    }
    catch (...) {
        std::fprintf(stderr, "unknown exception\n");
        exitCode = EXIT_FAILURE;
    }

    return exitCode;
}

Here, in place of std::exception and boost::bad_lexical_cast I want to get error strings similar to nlohmann::json::exception case.

@nlohmann
Copy link
Owner

So you basically want to throw your own parse error exceptions?

@traceon
Copy link
Author

traceon commented Mar 11, 2019

Kind of. Or be able to wrap original error with extra parser info. The ultimate goal is to let the user know where exactly in the json he made a mistake, that caused problems during deserialization of a custom structure.

@nlohmann
Copy link
Owner

The class for this is:

class parse_error : public exception
{
  public:
    /*!
    @brief create a parse error exception
    @param[in] id_       the id of the exception
    @param[in] position  the position where the error occurred (or with
                         chars_read_total=0 if the position cannot be
                         determined)
    @param[in] what_arg  the explanatory string
    @return parse_error object
    */
    static parse_error create(int id_, const position_t& pos, const std::string& what_arg)
    {
        std::string w = exception::name("parse_error", id_) + "parse error" +
                        position_string(pos) + ": " + what_arg;
        return parse_error(id_, pos.chars_read_total, w.c_str());
    }

    static parse_error create(int id_, std::size_t byte_, const std::string& what_arg)
    {
        std::string w = exception::name("parse_error", id_) + "parse error" +
                        (byte_ != 0 ? (" at byte " + std::to_string(byte_)) : "") +
                        ": " + what_arg;
        return parse_error(id_, byte_, w.c_str());
    }

    /*!
    @brief byte index of the parse error

    The byte index of the last read character in the input file.

    @note For an input with n bytes, 1 is the index of the first character and
          n+1 is the index of the terminating null byte or the end of file.
          This also holds true when reading a byte vector (CBOR or MessagePack).
    */
    const std::size_t byte;

  private:
    parse_error(int id_, std::size_t byte_, const char* what_arg)
        : exception(id_, what_arg), byte(byte_) {}

    static std::string position_string(const position_t& pos)
    {
        return " at line " + std::to_string(pos.lines_read + 1) +
               ", column " + std::to_string(pos.chars_read_current_line);
    }
};

so you could create and throw a parse_error yourself.

@traceon
Copy link
Author

traceon commented Mar 11, 2019 via email

@nlohmann
Copy link
Owner

The positions are only known at parse time. The from_json method is called with a valid JSON value (which does not necessarily need to come from a parser), and only your logic decide to reject values. Unfortunately, you have no way to know the source location of the value you reject.

If you really need the source location, you may need to implement your own SAX parser that not only creates the JSON values, but also performs additional type checks. Inside the SAX parser, the source locations are known. See https://nlohmann.github.io/json/classnlohmann_1_1basic__json_a8a3dd150c2d1f0df3502d937de0871db.html#a8a3dd150c2d1f0df3502d937de0871db for some documentation.

@traceon
Copy link
Author

traceon commented Mar 11, 2019

Yes, I suspected it. Will try implementing custom SAX parser then.

However, a different kind of position may still be available, like an absolute json pointer for the json object/value passed to from_json(). So I wonder, will it be technically possible to do, like to extend the from_json() lookup mechanism to also look for possible from_json()'s that accept positions/pointers?

@nlohmann
Copy link
Owner

I guess no, since there is a lot of ADL magic running in the background. Maybe @theodelrieu has an idea.

@gregmarr
Copy link
Contributor

ISTR a discussion a while back about the json object having a parent pointer for cases like this, so that you could walk back up the tree to the root to form a path. I don't think it went anywhere.

@theodelrieu
Copy link
Contributor

It could technically be done, however we have to figure out which overloads would have priority over the other. Or if there is no priority, providing both overloads would result in ambiguous call.

I'm still not sure it is worth the trouble though, there might be a way not involving touching from_json mechanism.

@traceon
Copy link
Author

traceon commented Mar 12, 2019

we have to figure out which overloads would have priority over the other

Most informative first? E.g., parser context > json path > no context/position

@stale
Copy link

stale bot commented Apr 11, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Apr 11, 2019
@stale stale bot closed this as completed Apr 18, 2019
@bart9h
Copy link

bart9h commented Aug 28, 2020

Will stale bot reopen the issue if I comment here?

I also miss a way to get line information out of a nlohmann::detail::exception object.

@nlohmann
Copy link
Owner

I really do not see a way to achieve this without adding a lot of debug information you always pay for even if no error occurs.

@nlohmann nlohmann reopened this Aug 28, 2020
@stale stale bot removed the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Aug 28, 2020
@bart9h
Copy link

bart9h commented Aug 28, 2020

I see. You don't want to introduce a performance (and code complexity) penalty for this.

What about you add a separate function just to check for errors?
So the normal parsing won't be affected, and in case of error we can call this error checking method that tells line/column of error (and maybe some more info).

@nlohmann
Copy link
Owner

I wish it would be that easy... What I could think of is the following:

  • Add a pointer to basic_json to link the parent of the value. Thereby, the null in [null] could "know" that it is part of an array. For this, all constructors need to be adjusted to set that value.
  • If an error occurs at some value, we can go up the parent chain and generate some diagnostic information like a JSON pointer to the child value. Then you could identify a value with /foo/bar/1, so the user can understand which value has the wrong type.
  • Maybe (!), we could also add the parse location to values.

All this should be hidden behind some preprocessor macro such that you will never pay for the overhead (1 pointer, 2 integers) if you don't want to.

@nlohmann nlohmann added the state: please discuss please discuss the issue or vote for your favorite option label Aug 29, 2020
@stale
Copy link

stale bot commented Oct 4, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Oct 4, 2020
@stale stale bot closed this as completed Oct 12, 2020
@nlohmann nlohmann mentioned this issue Jan 1, 2021
5 tasks
@nlohmann
Copy link
Owner

nlohmann commented Jan 1, 2021

FYI: I am trying to fix this issue in #2562. Any help would be greatly appreciated!

@nlohmann nlohmann added the solution: proposed fix a fix for the issue has been proposed and waits for confirmation label Jan 23, 2021
@nlohmann nlohmann added this to the Release 3.9.2 milestone Jan 23, 2021
@nlohmann nlohmann reopened this Jan 23, 2021
@stale stale bot removed the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Jan 23, 2021
@nlohmann nlohmann self-assigned this Jan 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release item: ⚡ improvement solution: proposed fix a fix for the issue has been proposed and waits for confirmation state: please discuss please discuss the issue or vote for your favorite option
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants