Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Common: Remove dependency on cpp-optparse (Only for FEXLoader) #4070

Closed

Conversation

Sonicadvance1
Copy link
Member

Doesn't remove it from the full project since three tools still use it.

FEXLoader argument loader is fairly special since we have a generator tied to it. This generator lets us have some niceties where everything ends up being a string literal without any sort of dynamic requirements.

cpp-optparse has a systemic issue where it makes deep copies of /everything/ all the time. This means it has huge runtime costs and huge stack usage (4992 bytes). The new implementation uses 608 bytes of stack space (plus 640 if it actually parses a strenum).

When profiling cpp-optparse with FEXLoader's argument it takes ~2,039,307 ns to parse the arguments! As a direct comparison this new implementation takes ~56,885 ns. Both sides not getting passed /any/ arguments, really shows how spicy this library is.

@Sonicadvance1 Sonicadvance1 force-pushed the remove_cpp_optparse_fexloader branch from 4063b9d to 8ef0bc1 Compare September 15, 2024 06:05
@Sonicadvance1 Sonicadvance1 force-pushed the remove_cpp_optparse_fexloader branch 5 times, most recently from 88bdf03 to fe696c0 Compare September 16, 2024 00:54
@Sonicadvance1 Sonicadvance1 force-pushed the remove_cpp_optparse_fexloader branch from fe696c0 to b6c9a5b Compare September 18, 2024 17:54
Copy link
Member

@neobrain neobrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some comments here and some others in 4072.

Source/Common/ArgumentLoader.cpp Show resolved Hide resolved
}

if (ProgramArguments.empty() && OptionDetails == nullptr) {
// In the special case that we hit a parse error and we haven't parsed any arguments, pass everything.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the very least, FEXLoader --non-existing-option some_binary should properly fail and notify the user about it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This already happens.

$ ./Bin/FEXLoader --non-existing-option `which ls`
Error: Unsupported argument: --non-existing-option

This edge case specifically only happens if the argument getting parsed was not prefixed with - or --, so passing in a command or something. Like the comment said, something like FEXLoader /usr/bin/ls would hit this case.

Other cases with invalid argument passing is above in the IsLong or IsShort checks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stumbled upon something during testing but can't remember what exactly, might have been a silly mistake on my end though. It's true that this specific example works fine after all.

Comment on lines 65 to 70
fextl::vector<fextl::string> GetGuestArgs() {
return std::move(RemainingArgs);
}
fextl::vector<const char*> GetParsedArgs() {
return std::move(ProgramArguments);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need a way to pull all state underneath this class, the way to do it is to enforce move semantics:

auto [Guest, Parsed] = std::move(arg_parser).GetArgs();

This is achieved by merging the two functions and making it callable with rvalues, only:

std::pair<vector, vector> GetArgs() && {
    return make_pair(std::move(Args1), std::move(Args2));
}

Equivalently, the parsed arguments don't need to be state in ArgParser but could instead be returned from Parse() directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved.

Source/Common/ArgumentLoader.cpp Outdated Show resolved Hide resolved
Source/Common/ArgumentLoader.cpp Outdated Show resolved Hide resolved
Arguments[1] = "FEX";
Arguments[2] = nullptr;

execvp("man", (char* const*)Arguments);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it not print a built-in help for FEXLoader like it does for other tools in 4072?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decided to redirect to the man page because the options are complex enough with enough formatting issues that it would be a pain. It's becoming a common thing in Linux space that passing -h or --help in to Linux applications just load the man page anyway.

Source/Common/ArgumentLoader.cpp Outdated Show resolved Hide resolved
@Sonicadvance1 Sonicadvance1 force-pushed the remove_cpp_optparse_fexloader branch 3 times, most recently from 9d0596d to 3c3e7e9 Compare September 24, 2024 10:30
const auto Split = Arg.find_first_of('=');
bool NeedsSplitArg {};
if (Split == Arg.npos) {
ArgFirst = Arg;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line has no effect.

const bool IsShort = Arg.find("--", 0, 2) == Arg.npos && Arg.find("-", 0, 1) == 0;
const bool IsLong = Arg.find("--", 0, 2) == 0;

std::string_view ArgFirst {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable doesn't need to be declared before proper initialization, since the two branches don't converge in a path that still uses the variable.

Comment on lines +153 to +154
for (size_t i = 1; i < ArgFirst.size(); ++i) {
const auto ArgChar = ArgFirst.substr(i, 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to use Arg directly here?

}

std::pair<fextl::vector<fextl::string>, fextl::vector<const char*>> ArgParser::Parse(int argc, char** argv) {
fextl::vector<fextl::string> RemainingArgs {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Declaration of this can be moved all the way down.

#define BEFORE_PARSE
private:
FEX::ArgLoader::ArgLoader* Loader;
std::string_view _Version {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Starting variable names with an underscore is weird style, and this particular name seems common enough that we might actually hit a collision from reserved names at some point.

_exit(1);
}

class ArgParser final {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs some documentation to explain why it exists and what's the difference from the confusingly similarly named ArgLoader.

execvp("man", (char* const*)Arguments);
}

void ExitWithError(std::string_view Error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Global functions used only locally should be static.

@@ -90,6 +90,8 @@ foreach(ASM_SRC ${ASM_SOURCES})
list(APPEND ARGS_LIST "--smcchecks=full")
endif()

list(APPEND ARGS_LIST "--")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed now when it wasn't before?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a bug in our arguments getting passed to unittests for years (#4071) and it was undiscovered. Due to a quirk with cpp-optparse where any unsupported arguments were gathered and passed to the guest application. This didn't necessarily break anything in our unittests, just made them run silently, and for all the harness things the additional argument was ignored anyway.

The commit these changes are from should probably be broken out to a freestanding commit before this PR is merged anyway.

Copy link
Member Author

@Sonicadvance1 Sonicadvance1 Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and since I realize I didn't actually answer the question. We're being more strict than cpp-optparse splitting out FEXLoader arguments from guest arguments because of that mentioned problem. So the unittests need the split pattern now, where cpp-optparse didn't need this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're being more strict than cpp-optparse splitting out FEXLoader arguments from guest arguments because of that mentioned problem. So the unittests need the split pattern now, where cpp-optparse didn't need this.

Sounds reasonable 👍

Doesn't remove it from the full project since three tools still use it.

FEXLoader argument loader is fairly special since we have a generator
tied to it. This generator lets us have some niceties where everything
ends up being a string literal without any sort of dynamic requirements.

cpp-optparse has a systemic issue where it makes deep copies of
/everything/ all the time. This means it has huge runtime costs and huge
stack usage (4992 bytes). The new implementation uses 608 bytes of stack
space (plus 640 if it actually parses a strenum).

When profiling cpp-optparse with FEXLoader's argument it takes ~2,039,307 ns to parse the arguments!
As a direct comparison this new implementation takes ~56,885 ns.
Both sides not getting passed /any/ arguments, really shows how spicy
this library is.
Boolean short arguments can be combined. I don't really use this feature
but I guess other people do.
@Sonicadvance1 Sonicadvance1 force-pushed the remove_cpp_optparse_fexloader branch from 3c3e7e9 to c6c06bd Compare September 30, 2024 18:39
@Sonicadvance1
Copy link
Member Author

I don't have the energy to push this over the finish line. With a bit of effort the FEXLoader implementation and "everything else" implementation could be merged in to a single class with some inheritance or something.

The modified cpp-optparse still works and the 35x speed-up while nice, is only like a 5% removal of execution time in short running applications (If it even happens, FEXInterpreter early exits this impl).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants