get_impala_queries does not return all records #68

Open
funes79 opened this issue Apr 20, 2018 · 5 comments

Comments

funes79 commented Apr 20, 2018

When I run get_impala_queries from Python, it returns just 2 records, even though I use the same date range and filter as in the CM UI.
When I apply the filter in the CM UI, the first two records appear immediately, and the rest of the queries follow after about a second.
Is it possible to get the remaining results somehow?
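
A short result like this can sometimes be diagnosed from the response itself. The sketch below assumes the object returned by `get_impala_queries` exposes a `warnings` list alongside `queries`; that is worth verifying against the installed `cm_api` version. `summarize_response` is a hypothetical helper name, not part of the library.

```python
def summarize_response(response):
    """Return (query_count, warnings) for a get_impala_queries response.

    Assumption: the response has a `queries` list and may have a
    `warnings` list. Both are read defensively with getattr, so a
    missing or None attribute is treated as empty.
    """
    queries = getattr(response, 'queries', None) or []
    warnings = getattr(response, 'warnings', None) or []
    return len(queries), list(warnings)
```

If the returned warnings list is non-empty, the server may have stopped scanning before it covered the whole time range, which would be one possible explanation for a short result.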

Liuzhj commented Apr 21, 2018

Hi @funes79,

Do you have some code to show?

funes79 commented Apr 21, 2018

Running the query for expensive queries in the last 7 days returns just 2 records:

from datetime import datetime, timedelta
from cm_api.api_client import ApiResource

# cm_host is the Cloudera Manager hostname, defined elsewhere
api = ApiResource(cm_host, username="reader", password="cmreader", version=18)
c = api.get_all_clusters()[0]

# Find the Impala service in the first cluster
for s in c.get_all_services():
    if s.type == 'IMPALA':
        impala = s

now = datetime.utcnow()
daysback = 7
start = now - timedelta(days=daysback)
end = now
print('> Scanning last %s days, from %s till %s ' % (daysback, start, end))
filterStr = 'memory_aggregate_peak >= 60GB'
# A single request covering the whole 7-day range
queries = impala.get_impala_queries(start_time=start, end_time=end, filter_str=filterStr, limit=1000, offset=0)
for query in queries.queries:
    print('> queyrid = ' + query.queryId)

Result:

Scanning last 7 days, from 2018-04-14 19:14:30.128989 till 2018-04-21 19:14:30.128989
queyrid = 4b4732a957d53eff:36aed14500000000
queyrid = 284a368120d0f55f:10950c0f00000000

But looping over the same period and calling get_impala_queries for shorter (one-day) intervals returns more results:

# One request per day-sized window, walking back from now over the same daysback period
for i in xrange(1, daysback + 1):
    start = now - timedelta(days=i)
    end = now - timedelta(days=i-1)
    print('> Scanning %s days ago, from %s till %s ' % (daysback, start, end))
    filterStr = 'memory_aggregate_peak >= 60GB'
    queries = impala.get_impala_queries(start_time=start, end_time=end, filter_str=filterStr, limit=1000, offset=0)
    for query in queries.queries:
        print('> queryid = ' + query.queryId)

Result:

Scanning 7 days ago, from 2018-04-20 19:14:30.128989 till 2018-04-21 19:14:30.128989
Scanning 7 days ago, from 2018-04-19 19:14:30.128989 till 2018-04-20 19:14:30.128989
queryid = 4b4732a957d53eff:36aed14500000000
queryid = 284a368120d0f55f:10950c0f00000000
Scanning 7 days ago, from 2018-04-18 19:14:30.128989 till 2018-04-19 19:14:30.128989
queryid = f4a9f149c0fc14d:520bc9ec00000000
queryid = 6a4b66e3cb8f1778:32ed8d7d00000000
queryid = bb488109a374db59:4eb2f8f700000000
queryid = b4963e9b6f9d1ea:9f55733800000000
.... and much more
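
The day-by-day workaround above can be wrapped in a small reusable helper. This is only a sketch under stated assumptions: `split_windows` and `collect_queries` are hypothetical names, and `fetch` is any callable that takes a window start and end and returns a list of objects with a `queryId` attribute. It does not explain why the single 7-day request comes back short; it only packages the workaround.

```python
from datetime import datetime, timedelta


def split_windows(start, end, step):
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


def collect_queries(fetch, start, end, step=timedelta(days=1)):
    """Call fetch(window_start, window_end) once per window and merge the results.

    `fetch` is a hypothetical callable supplied by the caller. Results are
    de-duplicated by queryId, in case a query is reported in two adjacent
    windows.
    """
    seen = set()
    merged = []
    for window_start, window_end in split_windows(start, end, step):
        for query in fetch(window_start, window_end):
            if query.queryId not in seen:
                seen.add(query.queryId)
                merged.append(query)
    return merged
```

With the `impala` service object and `filterStr` from the snippets above, `fetch` could be `lambda s, e: impala.get_impala_queries(start_time=s, end_time=e, filter_str=filterStr, limit=1000, offset=0).queries`.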

Liuzhj commented Apr 22, 2018

Hi @funes79,
OK, I get it. I will try it tomorrow.

Liuzhj commented Apr 23, 2018

Hi @funes79,

I tried to execute it, and it runs normally.
This is my code:

import ...

def impala_query(cluster):
    # Query the last 7 days, using local time
    end = datetime.now()
    start = end - timedelta(days=7)
    print start, end
    for s in cluster.get_all_services():
        if s.type == 'IMPALA':
            impala = s
            q = impala.get_impala_queries(start_time=start, end_time=end, filter_str='database=xxx')
            for i in q.queries:
                print i.queryId

if __name__ == '__main__':
    try:
        cm_host = 'xx'
        api = ApiResource(cm_host, username='reader', password='cmreader', version=6)
        clusterName = api.get_all_clusters()[0]
        impala_query(clusterName)
    except Exception as e:
        # Minimal handler so the try block is syntactically complete
        print e

image

funes79 commented Apr 23, 2018 via email
