Feature/suggestions #2447

Merged
merged 19 commits into from
Aug 25, 2018

@Tezd Tezd commented May 30, 2018

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

This PR adds autocompletion to the client for basic queries, types, engines, formats, and functions (except combinators).

Please bear with me on this PR, since I am not very familiar with the C++ language and its toolchain. My apologies in advance.
P.S. Can I use the Client's sendQuery method to gather data about databases and tables?

  • Rewrite completion hash
  • Load functions, dbs, tables, columns using system tables with limit
  • Use threading to load completion data in order to not block client
  • Rewrite code to modern standard
  • Fix thread safety, fix memory leaks
  • Remove TST, use parser instead
  • Allow completion disabling
  • Implement max_block_size for system.tables and system.columns
  • Ensure compatibility with old servers (old layout of system database)

return h;
}

int init_hash_table(HashTable *ht, size_t size)
Member

What source was this code copy-pasted from?

Author


namespace Completion
{
static uint hashpjw(const char *arKey, uint nKeyLength)
Member

Bad hash function.

@alexey-milovidov
Member

P.S Can I use Clients sendQuery method for gathering data about dbs, tables?

This is perfectly OK, but care should be taken when there is a huge number of tables.
We have servers with about a million small StripeLog tables for chunks.
You can define a limit on the maximum number of fetched tables, and probably do the loading in a separate thread.
The command-line client should never get stuck.

For example, this is a typical issue with the MySQL CLI: when you have a huge number of tables, it starts up slowly.
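
The advice above (cap the number of fetched names, load them off the main thread, and never let the prompt hang) could be sketched roughly as follows. This is only an illustration and not the code from this PR: the class name `SuggestionLoader` and the `fetch` callback are hypothetical stand-ins for whatever actually queries the server.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: loads completion words on a background thread so the
// interactive prompt is never blocked by a slow metadata query.
class SuggestionLoader
{
public:
    // `fetch` stands in for the real server query; it receives the maximum
    // number of names it should return.
    using Fetch = std::function<std::vector<std::string>(std::size_t)>;

    SuggestionLoader(Fetch fetch, std::size_t limit)
        : worker([this, fetch = std::move(fetch), limit]
          {
              try
              {
                  auto loaded = fetch(limit);
                  if (loaded.size() > limit)
                      loaded.resize(limit); // enforce the cap even if fetch ignores it
                  words = std::move(loaded);
                  ready.store(true, std::memory_order_release);
              }
              catch (...)
              {
                  // Completion is optional: on failure the client simply
                  // runs without suggestions.
              }
          })
    {
    }

    // Join instead of detach, so the thread never outlives the object.
    ~SuggestionLoader() { wait(); }

    // Blocks until loading has finished (successfully or not).
    void wait()
    {
        if (worker.joinable())
            worker.join();
    }

    bool isReady() const { return ready.load(std::memory_order_acquire); }

    // Only meaningful once isReady() has returned true.
    const std::vector<std::string> & get() const { return words; }

private:
    std::vector<std::string> words;
    std::atomic<bool> ready{false};
    std::thread worker; // declared last: the members above exist before the thread starts
};
```

A prompt loop would poll `isReady()` when Tab is pressed and offer no completions until it returns true, so startup time does not depend on how many tables the server has.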


#include <sys/types.h>

//All of functionality for hash was taken from mysql-server project from completion_hash.cpp file
Member

Missing license. The license is most likely to be incompatible.

Author

Yeah, it's only one-way compatible. I will rewrite the completion hash.

{(char *)"Expression"},
{(char *)"Set"},
//FUNCTIONS
{(char *)"plus"},
Member

This will be difficult to maintain unless we use FunctionFactory directly (or query the system.functions table for looser coupling).

std::vector<Block> blocks;

//preload all functions
sendQuery("SELECT name FROM system.functions", blocks);
Contributor

Aggregate functions can also have suffixes.

Author

Won't that be too much repetition and pollute the suggestions? For that reason I didn't want to include combinators, but I can add them regardless.

Contributor

Maybe you're right.

@@ -171,6 +242,9 @@ class Client : public Poco::Util::Application
/// External tables info.
std::list<ExternalTable> external_tables;

/// Suggestion limit for how many databases and tables to fetch
int suggestion_limit = 100;
Contributor

@filimonov filimonov Jun 15, 2018

Why so low by default? For databases that can be reasonable, but for tables it is a bit too low.

Author

@filimonov What limit would you suggest? Twice the amount? I'm not really sure :(

Contributor

I would say that having dozens of tables is quite common. ClickHouse has 24 system tables in the default distribution. At the moment I have 93 tables on my local ClickHouse. And it can be very irritating when autocompletion works for "almost" every table.

Limit? Maybe 256? Even if you have 64 columns per table and 32 chars per column name, that gives 512 KB of memory, which is not a big number nowadays. Also, 256 is a more 'round' number :)
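
The estimate in the comment above can be checked at compile time. The constant and function names below are illustrative only, and the figure counts raw name bytes, ignoring any per-string container overhead.

```cpp
#include <cstddef>

// Figures quoted in the review comment: a worst-case estimate, not a measurement.
constexpr std::size_t max_tables = 256;
constexpr std::size_t columns_per_table = 64;
constexpr std::size_t bytes_per_column_name = 32;

// Raw bytes needed to hold every column name at the proposed limit.
constexpr std::size_t estimatedColumnNameBytes()
{
    return max_tables * columns_per_table * bytes_per_column_name;
}

// 256 * 64 * 32 = 524288 bytes = 512 KiB
static_assert(estimatedColumnNameBytes() == 512 * 1024, "estimate should be 512 KiB");
```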

Author

@filimonov Bumped the limit to 256. One can always raise or lower it by passing --suggestion_limit= to the CLI client on startup, e.g. ./clickhouse client --suggestion_limit=512
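
One way such a limit might be applied when building the metadata query is sketched below. The function `buildSuggestionQuery` is hypothetical, and the sketch assumes that a `LIMIT` written after one branch of a `UNION ALL` applies to that branch only; the actual code in the PR may differ.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds one UNION ALL query over several system tables.
// Tables in `unlimited_tables` are small and fetched whole; tables in
// `limited_tables` can be huge, so each of their SELECTs gets a LIMIT.
std::string buildSuggestionQuery(
    const std::vector<std::string> & unlimited_tables,
    const std::vector<std::string> & limited_tables,
    std::size_t suggestion_limit)
{
    std::string query;

    auto append = [&query](const std::string & table, const std::string & suffix)
    {
        if (!query.empty())
            query += " UNION ALL ";
        query += "SELECT name FROM " + table + suffix;
    };

    for (const auto & table : unlimited_tables)
        append(table, "");

    // Assumption: LIMIT binds to the individual SELECT it follows.
    const std::string limit_clause = " LIMIT " + std::to_string(suggestion_limit);
    for (const auto & table : limited_tables)
        append(table, limit_clause);

    return query;
}
```

For instance, passing `system.functions` as an unlimited table and `system.tables` as a limited one with a limit of 256 yields a single query whose second branch ends in `LIMIT 256`.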

delete implodedName;
}
}
delete query;
Contributor

Author

@filimonov Added dictionary names and attribute names to completion. I can add the whole dictGetT() template as well, but I think there will be problems with complex keys.


QUERYPART queryParts[] = {
// CREATE DATABASE, TABLE, VIEW
{(const char *)"CREATE"},
Contributor

Author

@filimonov Compared. Added the missing query keywords.

{(const char *)"Tuple"},
{(const char *)"Nested"},
{(const char *)"Expression"},
{(const char *)"Set"},
Contributor

Nullable

Author

@filimonov added Nullable as well)

@alexey-milovidov
Member

TODO: Completion shouldn't trigger when you paste text containing tabs into the terminal.

@Tezd
Author

Tezd commented Jun 19, 2018

@alexey-milovidov It doesn't trigger completion on paste, but it does strip tabs from the pasted text (at least that is the behavior in my environment). If it behaves differently for you, I will investigate. Also, I'm not sure why the builds fail from time to time.

@Tezd
Author

Tezd commented Jul 10, 2018

@alexey-milovidov Are any other changes needed?

@alexey-milovidov
Member

Yes, it requires multiple changes. I will address them later.

@alexey-milovidov
Member

Now we have system tables for all required entities except language keywords.

@Tezd
Author

Tezd commented Aug 2, 2018

@alexey-milovidov Updated the PR to use information from the new system tables.

@Tezd
Author

Tezd commented Aug 15, 2018

@alexey-milovidov Sorry to bother you, but I just want to know whether any more changes are needed.

@alexey-milovidov
Member

alexey-milovidov commented Aug 21, 2018

It's in my queue.

We have to rewrite the code in modern style
(remove malloc, free, new, delete, sprintf, strdup, strlen, std::thread::detach).
Fix memory leaks. Fix thread safety errors.
Remove the ternary search tree, as it is not needed.
Extract keyword completion from the parser (similar to the "expected one of ..." report)
instead of listing keywords explicitly.
Allow completion to be disabled completely, just in case.

Implement max_block_size for the system.tables and system.columns tables, because we have servers with about one million StripeLog tables with about 500 columns each, and when you do SELECT FROM system.columns, even with a limit, it executes slowly and consumes an excessive amount of memory.

Check what happens when you copy-paste a query with tab characters.

Check and ensure compatibility of the new client with old servers (completion may not work, but the client should work as before).

This can be implemented in about three days.
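
As a rough illustration of why a ternary search tree may be unnecessary here, prefix completion can be done over a sorted `std::vector` with a binary search and no manual memory management. The class `SortedWords` below is a hypothetical sketch under that assumption, not the implementation that was eventually merged.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container: keeps completion words sorted and unique, so all
// words sharing a prefix form one contiguous range.
class SortedWords
{
public:
    explicit SortedWords(std::vector<std::string> input)
        : words(std::move(input))
    {
        std::sort(words.begin(), words.end());
        words.erase(std::unique(words.begin(), words.end()), words.end());
    }

    // Returns every stored word that starts with `prefix` (case-sensitive).
    std::vector<std::string> complete(const std::string & prefix) const
    {
        // First word that is not lexicographically less than the prefix.
        auto first = std::lower_bound(words.begin(), words.end(), prefix);

        // Walk forward while words still begin with the prefix.
        auto last = first;
        while (last != words.end() && last->compare(0, prefix.size(), prefix) == 0)
            ++last;

        return std::vector<std::string>(first, last);
    }

private:
    std::vector<std::string> words;
};
```

All storage is owned by standard containers, so there is nothing to free by hand, which also fits the request above to drop malloc, free, new and delete.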

sendQuery(
"SELECT name FROM system.functions"
" UNION ALL "
"SELECT name FROM system.table_engines"
Contributor

Will the client be able to work with older server versions that don't have those tables?

@alexey-milovidov
Member

The diff in this pull request is outdated. See master branch.

alexey-milovidov added a commit that referenced this pull request Aug 29, 2018
proller pushed a commit to proller/ClickHouse that referenced this pull request Sep 3, 2018