
Example for downloading full RDS log doesn't actually work #2268

Closed
bitglue opened this issue Oct 30, 2016 · 26 comments
Labels: closing-soon (This issue will automatically close in 4 days unless further comments are made.), guidance (Question that needs advice or information.), rds

Comments

bitglue commented Oct 30, 2016

Following the instructions from #1617 does not download the entire log file as documented.

$ aws --version
aws-cli/1.11.10 Python/2.7.11 Darwin/15.5.0 botocore/1.4.67
$ aws --output text rds describe-db-log-files --db-instance-identifier mydatabase
[...]
DESCRIBEDBLOGFILES  1477763976000   error/postgresql.log.2016-10-29-17  1908814
[...]
$ aws rds download-db-log-file-portion --db-instance-identifier mydatabase --log-file-name error/postgresql.log.2016-10-29-17 --starting-token 0 --output text > full.txt
$ ls -l full.txt
-rw-r--r--  1 philfrost  staff  1212017 Oct 30 07:59 full.txt

Note that the log file is 1908814 bytes, but the downloaded copy is only 1212017 bytes.

It's unclear whether it's even possible to download a full log file from a simple shell script, since the pagination tokens do not seem to be available with --output text. I'm guessing one would need to parse JSON or XML to get them.
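For illustration, the kind of loop that would be needed looks roughly like this (a sketch using boto3's low-level download_db_log_file_portion call and its Marker/AdditionalDataPending response fields, reusing the instance and log file names from above; whether it actually returns the complete file is exactly what's in question here):

# Sketch: fetch an RDS log by following Marker/AdditionalDataPending manually.
import boto3

rds = boto3.client("rds")

def download_log(instance_id, log_file_name, out_path):
    marker = "0"  # "0" means start at the beginning of the file
    with open(out_path, "w") as out:
        while True:
            resp = rds.download_db_log_file_portion(
                DBInstanceIdentifier=instance_id,
                LogFileName=log_file_name,
                Marker=marker,
            )
            out.write(resp.get("LogFileData") or "")
            if not resp.get("AdditionalDataPending"):
                break
            marker = resp["Marker"]

download_log("mydatabase", "error/postgresql.log.2016-10-29-17", "full.txt")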

JordonPhillips (Member) commented:

Could you post the --debug log? I'd like to see what we're sending to the service, so we can tell whether the issue is on our side or a bug on the service side.

JordonPhillips added the question, closing-soon, and rds labels on Nov 8, 2016
JordonPhillips (Member) commented:

@anbotero reported a similar problem here. Could one of you please paste the portion of the --debug logs that shows what we're sending to the service? You'll be looking for an entry that contains Making request for OperationModel.

anbotero commented Nov 8, 2016

@JordonPhillips hey there.

I get several of those Making request entries; here are the first one, three from the middle, and the last one:

2016-11-08 13:39:54,887 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=DownloadDBLogFilePortion) (verify_ssl=True) with params: {'body': {'Action': u'DownloadDBLogFilePortion', u'Marker': u'0', 'Version': u'2014-10-31', u'LogFileName': u'slowquery/engine.very-slow-queries.log.0', u'DBInstanceIdentifier': u'mydb'}, 'url': u'https://rds.amazonaws.com/', 'headers': {'User-Agent': 'aws-cli/1.11.13 Python/2.7.10 Darwin/15.6.0 botocore/1.4.70'}, 'context': {'client_region': 'us-east-1', 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x10cd793d0>}, 'query_string': '', 'url_path': '/', 'method': u'POST'}
2016-11-08 13:40:02,560 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=DownloadDBLogFilePortion) (verify_ssl=True) with params: {'body': {'Action': u'DownloadDBLogFilePortion', u'Marker': '19:1048697', 'Version': u'2014-10-31', u'LogFileName': u'slowquery/engine.very-slow-queries.log.0', u'DBInstanceIdentifier': u'mydb'}, 'url': u'https://rds.amazonaws.com/', 'headers': {'User-Agent': 'aws-cli/1.11.13 Python/2.7.10 Darwin/15.6.0 botocore/1.4.70'}, 'context': {'client_region': 'us-east-1', 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x10cd793d0>}, 'query_string': '', 'url_path': '/', 'method': u'POST'}
2016-11-08 13:40:05,260 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=DownloadDBLogFilePortion) (verify_ssl=True) with params: {'body': {'Action': u'DownloadDBLogFilePortion', u'Marker': '19:2097350', 'Version': u'2014-10-31', u'LogFileName': u'slowquery/engine.very-slow-queries.log.0', u'DBInstanceIdentifier': u'mydb'}, 'url': u'https://rds.amazonaws.com/', 'headers': {'User-Agent': 'aws-cli/1.11.13 Python/2.7.10 Darwin/15.6.0 botocore/1.4.70'}, 'context': {'client_region': 'us-east-1', 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x10cd793d0>}, 'query_string': '', 'url_path': '/', 'method': u'POST'}
2016-11-08 13:40:08,087 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=DownloadDBLogFilePortion) (verify_ssl=True) with params: {'body': {'Action': u'DownloadDBLogFilePortion', u'Marker': '19:3145944', 'Version': u'2014-10-31', u'LogFileName': u'slowquery/engine.very-slow-queries.log.0', u'DBInstanceIdentifier': u'mydb'}, 'url': u'https://rds.amazonaws.com/', 'headers': {'User-Agent': 'aws-cli/1.11.13 Python/2.7.10 Darwin/15.6.0 botocore/1.4.70'}, 'context': {'client_region': 'us-east-1', 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x10cd793d0>}, 'query_string': '', 'url_path': '/', 'method': u'POST'}
2016-11-08 13:40:55,224 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=DownloadDBLogFilePortion) (verify_ssl=True) with params: {'body': {'Action': u'DownloadDBLogFilePortion', u'Marker': '19:16779173', 'Version': u'2014-10-31', u'LogFileName': u'slowquery/engine.very-slow-queries.log.0', u'DBInstanceIdentifier': u'mydb'}, 'url': u'https://rds.amazonaws.com/', 'headers': {'User-Agent': 'aws-cli/1.11.13 Python/2.7.10 Darwin/15.6.0 botocore/1.4.70'}, 'context': {'client_region': 'us-east-1', 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x10cd793d0>}, 'query_string': '', 'url_path': '/', 'method': u'POST'}

Before each of those except the first, I get this:

2016-11-08 13:40:05,250 - MainThread - botocore.hooks - DEBUG - Event needs-retry.rds.DownloadDBLogFilePortion: calling handler <botocore.retryhandler.RetryHandler object at 0x10cb20f10>
2016-11-08 13:40:05,250 - MainThread - botocore.retryhandler - DEBUG - No retry needed.

With that, over three tries, I'm getting a 17 MB file, while the web console says the file is 2.3 GB.

Let me know if you need anything else.

JordonPhillips (Member) commented:

@anbotero It looks like what we're sending to the service is correct, so whatever the service is returning must be strange. Your debug log should also contain sections that have Response headers: and Response body:. Could you post those as well? I'm willing to bet the service is indicating that there is nothing left.

anbotero commented Nov 8, 2016

@JordonPhillips indeed, it indicates something like that. This is the last of those responses:

2016-11-08 13:40:55,665 - MainThread - botocore.vendored.requests.packages.urllib3.connectionpool - DEBUG - "POST / HTTP/1.1" 200 1051028
2016-11-08 13:40:58,273 - MainThread - botocore.parsers - DEBUG - Response headers: {'x-amzn-requestid': 'a2ebcd66-b1d4-22a7-89fd-a253ac5e0b14', 'vary': 'Accept-Encoding', 'content-length': '1051028', 'content-type': 'text/xml', 'date': 'Tue, 08 Nov 2016 18:40:55 GMT'}
2016-11-08 13:40:58,274 - MainThread - botocore.parsers - DEBUG - Response body:
<DownloadDBLogFilePortionResponse xmlns="http://rds.amazonaws.com/doc/2014-10-31/">
  <DownloadDBLogFilePortionResult>
    <AdditionalDataPending>false</AdditionalDataPending>

Earlier iterations have <AdditionalDataPending>true</AdditionalDataPending>.

JordonPhillips (Member) commented:

@anbotero With that in mind, and since it also seems to happen in the console, I would recommend raising this issue on the service forums. I'll let them know as well.

pslavov commented Nov 12, 2016

Hi,
I can confirm this bug. I tested with different versions of the AWS CLI and Python, and on different servers, but the result is always that the log file is cut short, regardless of its size.
I tested with:
aws-cli/1.10.48 Python/2.7.12 from an Amazon EC2 instance, and
aws-cli/1.11.13 Python/3.5.2 from outside Amazon.
No matter what I do, aws rds download-db-log-file-portion does not work.

jlintz commented Nov 14, 2016

I'm having issues with this as well. I'm unable to download more than 1.3-1.5 GB of a log, and then I get one of the following errors:

A client error (InvalidParameterValue) occurred when calling the DownloadDBLogFilePortion operation: This file contains binary data and should be downloaded instead of viewed.

or

A client error (Throttling) occurred when calling the DownloadDBLogFilePortion operation: Rate exceeded

Using the following version on an EC2 instance

aws-cli/1.10.1 Python/3.5.2 Linux/4.4.0-43-generic botocore/1.3.23

and on my laptop

aws-cli/1.10.56 Python/2.7.11 Darwin/16.1.0 botocore/1.4.46

dialt0ne commented Nov 14, 2016

I am also seeing this error; see the output below. Additionally, I think it's because the messages are being truncated: there is one truncation for each log file portion except the last one.

$ aws --output text rds describe-db-log-files --db-instance-identifier $DBINSTANCE | grep 2016-11-11-18
DESCRIBEDBLOGFILES  1478890800000   error/postgresql.log.2016-11-11-18  206701928
$ aws rds download-db-log-file-portion --db-instance-identifier $DBINSTANCE --log-file-name error/postgresql.log.2016-11-11-18 --starting-token 0 --max-items 99999999999 --output=text --debug 1>stdout1 2>stderr1
$ stat -f '%z' stdout1
206241418
$ grep -c "Your log message was truncated" stdout1
196
$ echo '206701928 / ( 1024 * 1024 )' | bc
197
$ aws --version
aws-cli/1.10.56 Python/2.7.10 Darwin/14.5.0 botocore/1.4.46

fmmatthewzeemann commented:

This is a horrible bit of code (I did not have much time to spend on it), but it does let me get the entire log file (I hope)...

#!/bin/bash
# Fetch an RDS log file in pieces by re-running download-db-log-file-portion
# with the last Marker found in the --debug output as the next --starting-token.
# Usage: ./example-get.sh <log-file-name>  (change the instance identifier below)

COUNTER=1
LASTFOUNDTOKEN=0
PREVIOUSTOKEN=0

FILE=$1

rm -f "${FILE}"

while [ ${COUNTER} -lt 100 ]; do
    echo "Let's try and get ${FILE}.${COUNTER}"
    echo "The starting-token will be set to ${LASTFOUNDTOKEN}"
    PREVIOUSTOKEN=${LASTFOUNDTOKEN}

    # Fetch the next portion; the debug output (stderr) contains the <Marker> elements.
    aws rds download-db-log-file-portion --db-instance-identifier mtsos-prd-db-pg01 --log-file-name "error/${FILE}" --starting-token "${LASTFOUNDTOKEN}" --debug --output text 2>>"${FILE}.${COUNTER}.debug" >> "${FILE}.${COUNTER}"

    # Pull the last marker out of the debug log.
    LASTFOUNDTOKEN=$(sed -n 's|.*<Marker>\(.*\)</Marker>.*|\1|p' "${FILE}.${COUNTER}.debug" | tail -1)

    echo "LASTFOUNDTOKEN is ${LASTFOUNDTOKEN}"
    echo "PREVIOUSTOKEN is ${PREVIOUSTOKEN}"

    if [ "${PREVIOUSTOKEN}" = "${LASTFOUNDTOKEN}" ]; then
        # The marker did not advance, so this portion contains nothing new.
        echo "No more new markers, exiting"
        rm -f "${FILE}.${COUNTER}.debug"
        rm -f "${FILE}.${COUNTER}"
        exit 0
    else
        echo "Marker is ${LASTFOUNDTOKEN}, more to come ..."
        echo " "
        rm -f "${FILE}.${COUNTER}.debug"
    fi

    # Append this portion to the assembled log and move on.
    cat "${FILE}.${COUNTER}" >> "${FILE}"
    rm -f "${FILE}.${COUNTER}"

    COUNTER=$((COUNTER + 1))
done

So I pass in the log file name:
./example-get.sh postgresql.log.2017-02-09-15

It loops until it stops finding new markers:

.
.
.
The starting-token will be set to 1:1652232848
LASTFOUNDTOKEN is 1:1693873977
PREVIOUSTOKEN is 1:1652232848
Marker is 1:1693873977 more to come ... 
Lets try and get postgresql.log.2017-02-09-15.27
The starting-token will be set to 1:1693873977
LASTFOUNDTOKEN is 1:1693873977
PREVIOUSTOKEN is 1:1693873977
No more new markers, exiting

I end up with a file called postgresql.log.2017-02-09-15 that contains the entire log (I hope).

As I mentioned, this was thrown together quickly, so feel free to improve it.

cscetbon commented Mar 7, 2017

Same issue here. Using the last Marker as the --starting-token value lets me grab the rest of the log.
Thank you for the code @fmmatthewzeemann!

jlintz commented May 22, 2017

@stealthycoin any reason why this was closed? Was this fixed?

stephanlindauer commented Jun 1, 2017

@jlintz I'm asking myself the same question. I'm seeing the same (or a similar) problem with aws-cli/1.11.95.
/cc @JordonPhillips

aws rds download-db-log-file-portion \
    --db-instance-identifier XXXXXXXXXX \
    --region XXXXXXXX \
    --log-file-name error/XXXXXXXXXXX \
    --starting-token=0 \
    --profile XXXXXXXX \
    --output text >> test.txt

only gives me ~300 MB of a ~950 MB log file.

yuriipolishchuk commented:

Same here. I can't download even a 100 MB log.

aws --version
aws-cli/1.11.165 Python/2.7.6 Linux/3.13.0-100-generic botocore/1.7.23

cscetbon commented Oct 5, 2017

I don't know why this issue was closed. I said that the proposed shell script is a workaround, not that the code has been fixed!

ealimm commented Oct 27, 2017

Has anyone found a fix yet or tried contacting AWS?
I have the same problem with the web console; it fails all the time.

yuriipolishchuk commented:

@chefone I use the script from @fmmatthewzeemann as a workaround.

cscetbon commented:

Yeah, but it should be fixed at the API level...

stephanlindauer commented:

I solved this by using the Go SDK instead of the AWS CLI.
Other SDKs would probably also do the trick.

jerryhebert commented Nov 10, 2017

The workaround didn't work for me (I guess the output is different in my version of the AWS CLI), but I'm just chiming in to say that I'm seeing this issue with 1.11.183.

$ aws --version
aws-cli/1.11.183 Python/2.7.6 Linux/3.13.0-128-generic botocore/1.7.41
$ aws rds download-db-log-file-portion --region us-west-2  --db-instance-identifier $DB --output text --log-file-name error/postgresql.log.2017-11-10-21  --starting-token 0 > 21
$ grep truncated 21
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]

marksher commented:

I checked with AWS support about this, and they said that the implementation doesn't work in the CLI or Boto3. They gave me some code that calls the REST API directly, and I tweaked it into a module. I apologize if this is off-topic for this forum, but I thought it would help a lot of people on this thread.

Attached is the starter code the tech gave me. It seems to work so far.

sample1.txt
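For context, a minimal sketch of that kind of REST approach (a reconstruction based on the /v13/downloadCompleteLogFile endpoint described in the RDS documentation, not the attached sample): it signs a GET request with SigV4 using botocore and streams the file to disk with requests. The instance, region, log file, and output names are placeholders.

# Sketch: download a complete RDS log via the REST downloadCompleteLogFile endpoint.
import boto3
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

def download_complete_log(instance_id, log_file_name, region, out_path):
    url = (f"https://rds.{region}.amazonaws.com"
           f"/v13/downloadCompleteLogFile/{instance_id}/{log_file_name}")
    creds = boto3.Session().get_credentials().get_frozen_credentials()
    req = AWSRequest(method="GET", url=url)
    SigV4Auth(creds, "rds", region).add_auth(req)  # sign for the RDS service
    # Stream the response straight to disk so large logs don't sit in memory.
    with requests.get(url, headers=dict(req.headers), stream=True, timeout=300) as r:
        r.raise_for_status()
        with open(out_path, "wb") as out:
            for chunk in r.iter_content(chunk_size=1 << 20):
                out.write(chunk)

download_complete_log("mydatabase", "error/postgresql.log.2017-11-10-21",
                      "us-east-1", "full.log")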

igrayson commented Mar 18, 2018

Another alternative to try, which worked for me: the deprecated rds-download-db-logfile command, which RDS references in their REST (!) documentation.

$ cd ~/Downloads/RDSCli-1.19.004

$ AWS_RDS_HOME=$(pwd) ./bin/rds-download-db-logfile YOUR-INSTANCE --I YOUR_ACCESS_KEY --S 'YOUR_SECRET_KEY' --region YOUR-REGION --log-file-name error/postgresql.log.2018-03-16-20 > postgresql.log.2018-03-16-20

joer14 commented May 17, 2018

@marksher thank you very much for that code. I cleaned it up just a little bit and turned it into something I can run in a slightly more automated fashion:

https://gist.github.com/joer14/4e5fc38a832b9d96ea5c3d5cb8cf1fe9

diehlaws added the guidance label and removed the question label on Jan 4, 2019
RachadAbiChahine commented:

Never mind which options you use; the docs say it "Downloads all or a portion of the specified log file, up to 1 MB in size."

rams3sh commented Mar 30, 2021

I modified @andrewmackett's version of @joer14's gist to support retries in case instance-metadata-based credentials expire during a long log fetch, along with some CLI argument handling. It also supports profiles.

https://gist.github.com/rams3sh/15ac9487f2b6860988dc5fb967e754aa

ccampo133 commented:

If anybody is still running into this problem, I created a small Python script that you can use to download RDS log files and save them in S3: https://github.com/ccampo133/rds-logs-to-s3
