Unable to build docker with tf annotation #154

Closed
savan77 opened this issue Oct 25, 2018 · 11 comments
Labels
question Further information is requested

Comments

@savan77
Contributor

savan77 commented Oct 25, 2018

Hi,

I am trying to build this project with TF annotation/GPU enabled. The build succeeded and I verified that the container is up and running, but I am unable to access it through a browser. Please see the attached screenshots.

However, when I built it without TF annotation, it worked fine. Any thoughts?

Thanks
screenshot from 2018-10-25 10-53-12
screenshot from 2018-10-25 11-13-29

@nmanovic
Contributor

@savan77 ,

Could you please attach the logs from the cvat container (docker logs cvat) and the command you used to build and run CVAT with TF annotation?

@nmanovic nmanovic added the question Further information is requested label Oct 25, 2018
@savan77
Contributor Author

savan77 commented Oct 25, 2018

Thanks @nmanovic .
After spending too much time on GPU, I decided to give it a try on CPU. Initially it worked, but now it's kind of random: sometimes it works and sometimes it does not. I am unable to figure out what's wrong. Please find the log below. I built with docker-compose build after making the required modifications to use TF Annotation (i.e., adding url_patterns, tf_annotations, and 'yes' in the Dockerfile).

2018-10-25 09:57:16,499 INFO RPC interface 'supervisor' initialized
2018-10-25 09:57:16,500 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2018-10-25 09:57:16,500 INFO supervisord started with pid 1
2018-10-25 09:57:17,503 INFO spawned: 'rqworker_low' with pid 10
2018-10-25 09:57:17,507 INFO spawned: 'runserver' with pid 11
2018-10-25 09:57:17,511 INFO spawned: 'rqworker_default_0' with pid 12
2018-10-25 09:57:17,515 INFO spawned: 'rqworker_default_1' with pid 13
2018-10-25 09:57:17,534 DEBG 'rqworker_low' stderr output:
wait-for-it.sh: waiting for cvat_redis:6379 without a timeout

2018-10-25 09:57:17,539 DEBG 'runserver' stderr output:
wait-for-it.sh: waiting for cvat_db:5432 without a timeout

2018-10-25 09:57:17,541 DEBG 'rqworker_low' stderr output:
wait-for-it.sh: cvat_redis:6379 is available after 0 seconds

2018-10-25 09:57:17,542 DEBG 'rqworker_default_0' stderr output:
wait-for-it.sh: waiting for cvat_redis:6379 without a timeout

2018-10-25 09:57:17,543 DEBG 'rqworker_low' stderr output:
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell

2018-10-25 09:57:17,543 DEBG 'rqworker_default_1' stderr output:
wait-for-it.sh: waiting for cvat_redis:6379 without a timeout

2018-10-25 09:57:17,546 DEBG 'runserver' stderr output:
wait-for-it.sh: cvat_db:5432 is available after 0 seconds

2018-10-25 09:57:17,548 DEBG 'runserver' stderr output:
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell

2018-10-25 09:57:17,548 DEBG 'rqworker_default_0' stderr output:
wait-for-it.sh: cvat_redis:6379 is available after 0 seconds

2018-10-25 09:57:17,548 DEBG 'rqworker_default_1' stderr output:
wait-for-it.sh: cvat_redis:6379 is available after 0 seconds

2018-10-25 09:57:17,549 DEBG 'rqworker_default_0' stderr output:
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell

2018-10-25 09:57:17,549 DEBG 'rqworker_default_1' stderr output:
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell

2018-10-25 09:57:18,550 INFO success: rqworker_low entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-10-25 09:57:18,551 INFO success: runserver entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-10-25 09:57:18,551 INFO success: rqworker_default_0 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-10-25 09:57:18,551 INFO success: rqworker_default_1 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-10-25 09:57:18,873 DEBG 'rqworker_default_1' stderr output:
09:57:18 Registering birth of worker bece3e763906.13

2018-10-25 09:57:18,874 DEBG 'rqworker_default_1' stderr output:
09:57:18 RQ worker 'rq:worker:bece3e763906.13' started, version 0.10.0

2018-10-25 09:57:18,874 DEBG 'rqworker_default_1' stderr output:
09:57:18 *** Listening on default...

2018-10-25 09:57:18,875 DEBG 'rqworker_default_1' stderr output:
09:57:18 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.

2018-10-25 09:57:18,877 DEBG 'rqworker_default_1' stderr output:
09:57:18 Cleaning registries for queue: default

2018-10-25 09:57:18,878 DEBG 'rqworker_default_1' stderr output:
09:57:18 *** Listening on default...

2018-10-25 09:57:18,878 DEBG 'rqworker_default_0' stderr output:
09:57:18 Registering birth of worker bece3e763906.12

2018-10-25 09:57:18,878 DEBG 'rqworker_default_1' stderr output:
09:57:18 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.

2018-10-25 09:57:18,880 DEBG 'rqworker_default_0' stderr output:
09:57:18 RQ worker 'rq:worker:bece3e763906.12' started, version 0.10.0

2018-10-25 09:57:18,880 DEBG 'rqworker_default_0' stderr output:
09:57:18 *** Listening on default...
09:57:18 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.

2018-10-25 09:57:18,881 DEBG 'rqworker_default_0' stderr output:
09:57:18 Cleaning registries for queue: default

2018-10-25 09:57:18,882 DEBG 'rqworker_default_0' stderr output:
09:57:18 *** Listening on default...

2018-10-25 09:57:18,882 DEBG 'rqworker_default_0' stderr output:
09:57:18 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.

2018-10-25 09:57:18,906 DEBG 'rqworker_low' stderr output:
09:57:18 Registering birth of worker bece3e763906.10

2018-10-25 09:57:18,907 DEBG 'rqworker_low' stderr output:
09:57:18 RQ worker 'rq:worker:bece3e763906.10' started, version 0.10.0

2018-10-25 09:57:18,908 DEBG 'rqworker_low' stderr output:
09:57:18 *** Listening on low...

2018-10-25 09:57:18,908 DEBG 'rqworker_low' stderr output:
09:57:18 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.

2018-10-25 09:57:18,908 DEBG 'rqworker_low' stderr output:
09:57:18 Cleaning registries for queue: low

2018-10-25 09:57:18,910 DEBG 'rqworker_low' stderr output:
09:57:18 *** Listening on low...
09:57:18 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.

2018-10-25 09:57:19,050 DEBG 'runserver' stdout output:
Operations to perform:
Apply all migrations: admin, auth, contenttypes, engine, sessions
Running migrations:
No migrations to apply.
Your models have changes that are not yet reflected in a migration, and so won't be applied.
Run 'manage.py makemigrations' to make new migrations, and then re-run 'manage.py migrate' to apply them.

2018-10-25 09:57:20,384 DEBG 'runserver' stdout output:
Successfully ran command.
Server URL : http://localhost:8080/
Server Root : /tmp/mod_wsgi-localhost:8080:1000
Server Conf : /tmp/mod_wsgi-localhost:8080:1000/httpd.conf
Error Log File : /dev/stderr (INFO)
Request Capacity : 5 (1 process * 5 threads)
Request Timeout : 60 (seconds)
Startup Timeout : 15 (seconds)
Queue Backlog : 100 (connections)
Queue Timeout : 45 (seconds)
Server Capacity : 20 (event/worker), 20 (prefork)
Server Backlog : 500 (connections)
Locale Setting : C.UTF-8

2018-10-25 09:57:20,420 DEBG 'runserver' stdout output:
httpd (pid 10) already running

2018-10-25 09:57:20,420 DEBG fd 10 closed, stopped monitoring <POutputDispatcher at 139690480675008 for <Subprocess at 139690480933272 with name runserver in state RUNNING> (stdout)>
2018-10-25 09:57:20,420 DEBG fd 15 closed, stopped monitoring <POutputDispatcher at 139690480675224 for <Subprocess at 139690480933272 with name runserver in state RUNNING> (stderr)>
2018-10-25 09:57:20,420 INFO exited: runserver (exit status 0; expected)
2018-10-25 09:57:20,420 DEBG received SIGCLD indicating a child quit

@savan77
Contributor Author

savan77 commented Oct 25, 2018

When I wrote the above comment, the container was up and running, but I was not able to access it through the browser. Then I shut down the containers (docker-compose down), closed the terminal, and tried again by running docker-compose up -d. It works now. I am new to Docker and don't know whether such behaviour is expected or not.

Also, what is the best way to make the app reflect changes I made to the code? Rebuilding the Docker image every time I change the code will take a lot of time. I think there might be some workaround. Thanks

@bsekachev
Member

@savan77
I would recommend that you read this manual for development.
Do you use CVAT from the master branch? If so, how did you run TensorFlow on CPU? Have you made any changes to the source code?

@savan77
Contributor Author

savan77 commented Oct 25, 2018

@bsekachev
No, I did not make any changes. The script itself installed tensorflow; I only removed tensorflow-gpu from the requirements file.

@nmanovic
Contributor

@savan77

2018-10-25 09:57:20,420 DEBG 'runserver' stdout output: httpd (pid 10) already running

Usually this log line means that the cvat container was killed or stopped. You should only use the docker-compose up -d and down commands. Don't stop a container and start it again, as that can lead to various problems.

When I wrote the above comment, the container was up and running, but I was not able to access it through the browser. Then I shut down the containers (docker-compose down), closed the terminal, and tried again by running docker-compose up -d. It works now.

As you can see, docker-compose down solved your problem, as I described above. It removed Apache's temporary data, and Apache started again without any problems. A similar problem is described here.
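
One plausible reading of the `httpd (pid 10) already running` line: in the log above, supervisord spawned `rqworker_low` with pid 10, so a PID file left over from the previous run of the container would point at a live but unrelated process. The sketch below shows how a naive PID-file check can be fooled that way. It is an illustration under that assumption, not Apache's or mod_wsgi's actual code; `is_already_running` and the `httpd.pid` file name are hypothetical.

```python
import os
import tempfile


def is_already_running(pid_file: str) -> bool:
    """Naive check: trust the PID file if *any* process has that pid."""
    try:
        with open(pid_file) as fh:
            pid = int(fh.read().strip())
    except (FileNotFoundError, ValueError):
        return False  # no PID file (or garbage in it): nothing is running
    try:
        os.kill(pid, 0)  # signal 0 only tests whether the pid exists
    except ProcessLookupError:
        return False  # stale file, the pid is gone
    except PermissionError:
        return True  # the pid exists but belongs to another user
    return True


# Simulate a PID file that survived a container stop/start.
workdir = tempfile.mkdtemp()
pid_file = os.path.join(workdir, "httpd.pid")
with open(pid_file, "w") as fh:
    # Stand-in for "pid 10": a pid that is alive but is not the web server.
    fh.write(str(os.getpid()))

stale_hit = is_already_running(pid_file)      # fooled by the unrelated process
os.remove(pid_file)                           # roughly what recreating the container achieves
after_cleanup = is_already_running(pid_file)  # no file left, so the server may start
print(stale_hit, after_cleanup)
```

Under this reading, recreating the container with docker-compose down followed by docker-compose up -d works because the leftover file is discarded together with the old container.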

@savan77
Contributor Author

savan77 commented Oct 25, 2018

Hi, thanks for the information. I would also like to know how I can use other models, given that they are compatible with the default frozen graph's structure. I read that I have to add an environment variable with the path to the frozen graph, but I guess that won't work with Docker the way it normally does. What special changes would I have to make here? You can close this. Thanks

@nmanovic
Contributor

Hi @savan77 ,

I'm not sure that I understand your question. Could you please elaborate a little bit?

@savan77
Contributor Author

savan77 commented Oct 25, 2018

I would like to use other object detection models in this tool. How do I do that? I understand that it reads the model path from an environment variable. But I don't think I can set that variable the usual way, because the app runs inside Docker. In fact, I tried setting the environment variable to the absolute path of my frozen graph, but it didn't work.

The documentation states that 'This variable must be available from cvat runtime environment'. Can you elaborate on this?

@nmanovic
Contributor

@savan77 ,

You can always write a docker-compose.override.yml file and add something like the code below:

version: "2.3"

services:
  cvat:
    environment:
      TF_ANNOTATION_MODEL_PATH: ...
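
For the variable to be "available from cvat runtime environment", the path it holds has to resolve inside the container, not on the host; a host path is only visible there if it is mounted as a volume or copied into the image. Below is a minimal sketch of the kind of check that could be run inside the container to confirm both the variable and the file are visible. `TF_ANNOTATION_MODEL_PATH` comes from this thread; `resolve_model_path` and the example paths are hypothetical and are not CVAT's actual code.

```python
import os
import tempfile


def resolve_model_path(env=os.environ) -> str:
    """Return the frozen-graph path from the environment, or raise with a hint."""
    path = env.get("TF_ANNOTATION_MODEL_PATH")
    if not path:
        raise RuntimeError("TF_ANNOTATION_MODEL_PATH is not set in this process")
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"{path} does not exist here; inside a container this must be "
            "a container path (for example, one provided by a volume mount)"
        )
    return path


# Demo: a temporary file stands in for a frozen graph (hypothetical).
with tempfile.NamedTemporaryFile(suffix=".pb", delete=False) as fh:
    fake_graph = fh.name

found = resolve_model_path({"TF_ANNOTATION_MODEL_PATH": fake_graph})

# A path that exists only on the host looks like a missing file from inside.
try:
    resolve_model_path({"TF_ANNOTATION_MODEL_PATH": "/host/only/path.pb"})
    host_path_ok = True
except FileNotFoundError:
    host_path_ok = False

os.remove(fake_graph)
print(found == fake_graph, host_path_ok)
```

Setting the variable in a host shell has no effect on the app, since the process inside the container only sees what docker-compose passes to it.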

@nmanovic
Contributor

@savan77 ,

I'm going to close the issue. Don't hesitate to reopen it if you still have more questions.

TOsmanov pushed a commit to TOsmanov/cvat that referenced this issue Aug 23, 2021
…#154)

- Fixed image saving in VggFace2 and Widerface. Formats should not convert extensions, unless requested
- Refactored image saving in formats