
Timeout when downloading data #140

Open
bennigeir opened this issue Jan 5, 2023 · 2 comments

@bennigeir

How to reproduce the behaviour

I'm trying to download annotated data from Doccano. I run the following code:

from doccano_client import DoccanoClient

client = DoccanoClient('http://some.url')
client.login(username='username', password='password')

user = client.get_profile()

client.download(4, 'JSONL')

and after some time I get the following error:

TimeoutError: Timeout waiting for task b9030b6d-1959-4de1-bd29-32038422974d

Is there any way for me to increase the timeout, download only part of the annotated data, or otherwise get around this timeout error?

Your Environment

  • Operating System: Ubuntu 18.04
  • Python Version: 3.9.7
  • Package Version: 1.2.6
@david-engelmann
Contributor

Are you able to download other projects without issue? Was the get_profile call successful? Sometimes a TimeoutError means that the client is not getting any response, as if the Doccano server did not exist.
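
For anyone who wants to rule out the "no response at all" case before calling download, here is a minimal sketch of a reachability check. It uses only the Python standard library and is not part of doccano-client; the helper name `is_reachable` and the 5-second default are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def is_reachable(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if any HTTP response comes back from base_url.

    Hypothetical helper, not a doccano-client function.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status (e.g. 401 or 404).
        return True
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, socket timeout, or malformed URL.
        return False
```

If this returns False for the URL passed to DoccanoClient, the problem is likely network or configuration rather than the export task itself.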

@Verster77

I get the same error on all projects, and with uploads as well. I also logged the issue here: doccano/doccano#2297

import logging
from doccano_client import DoccanoClient
import boto3
import os

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

doccano_url = 'xxxx'
username = 'xxxx'
password = 'xxxx'

logger.info("Initializing the Doccano client...")

client = DoccanoClient(doccano_url)
client.login(username=username, password=password)
logger.info("Logged in to Doccano successfully.")
project_id = 1

save_directory = "/dbfs/tmp"

logger.info(f"Checking download options for project ID: {project_id}...")
download_options = client.list_download_options(project_id)
if not download_options:
    logger.error(f"No download options available for project ID: {project_id}.")
    exit()

logger.info(f"Available download options: {', '.join([option.name for option in download_options])}")
logger.info(f"Attempting to download dataset from project ID: {project_id} in 'JSONL' format...")

print("start download")
downloaded_file_path = client.download(project_id, format='JSONL', only_approved=True, dir_name=save_directory)
logger.info(f"Dataset downloaded successfully to {downloaded_file_path}.")

TimeoutError: Timeout waiting for task d70dca76-6064-4067-9380-0a8ba2517fdd
Command took 1.01 hours

The logs from the docker_nginx_1 container show only the following:
[27/Oct/2023:15:58:02 +0000] "GET /v1/tasks/status/f0178324-dab6-4062-80a2-db4d4d014232 HTTP/1.1" 200 42 "URL//xxxxx" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36" "-"
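
The log line above suggests the task status endpoint is being polled and answering with HTTP 200. As a rough illustration of that pattern, here is a generic poll-until-ready loop with a configurable timeout. It is a sketch, not the doccano-client implementation: `wait_for_task`, its parameters, and the `ready` field in the status payload are all assumptions not confirmed in this thread.

```python
import time
from typing import Callable, Dict


def wait_for_task(
    fetch_status: Callable[[], Dict],
    timeout: float = 3600.0,
    interval: float = 1.0,
) -> Dict:
    """Poll fetch_status() until it reports ready, or raise TimeoutError.

    Hypothetical helper; fetch_status is any callable returning the
    task status as a dict.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status.get("ready"):  # "ready" is an assumed field name
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task not ready after {timeout} seconds")
        time.sleep(interval)
```

Here `fetch_status` could wrap a GET on the status URL seen in the log. If the status keeps coming back as not ready for the whole hour, raising the timeout would probably only delay the error, and it may be worth checking whether the backend's task worker is actually processing the job.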
