There are no resumes present in the specified folder. Yet the defaults are there. #272

Open
joshuacox opened this issue Jun 12, 2024 · 17 comments

Comments

@joshuacox

joshuacox commented Jun 12, 2024

There are no resumes present in the specified folder. Yet the defaults are there.

tree Data/
Data/
├── JobDescription
│   ├── job_desc_front_end_engineer.pdf
│   ├── job_desc_full_stack_engineer.pdf
│   ├── job_desc_java_developer.pdf
│   └── job_desc_product_manager.pdf
└── Resumes
    ├── alfred_pennyworth_pm.pdf
    ├── barry_allen_fe.pdf
    ├── bruce_wayne_fullstack.pdf
    ├── harvey_dent_mle.pdf
    ├── john_doe.pdf
    └── josh_cox.pdf

I did try adding my resume in there (josh_cox.pdf), and creating the Processed directory in there too (as mentioned in another issue) with no change.

To Reproduce
Steps to reproduce the behavior:

  1. Fresh clone
  2. docker compose up
  3. watch build take place
  4. See error
 => ERROR [resume-matcher 8/8] RUN python run_first.py  2.4s
------
 > [resume-matcher 8/8] RUN python run_first.py:
2.178 2024-06-12 16:55:06,825 (run_first.py:34) - INFO: Started to read from Data/Resumes
2.178 2024-06-12 16:55:06,825 (run_first.py:44) - ERROR: There are no resumes present in the specified folder.
2.178 2024-06-12 16:55:06,825 (run_first.py:45) - ERROR: Exiting from the program.
2.178 2024-06-12 16:55:06,825 (run_first.py:46) - ERROR: Please add resumes in the Data/Resumes folder and try again.
------
failed to solve: process "/bin/sh -c python run_first.py" did not complete successfully: exit code: 1

Expected behavior

The app should build and serve.

Screenshots
image

Desktop (please complete the following information):

  • OS: NixOS
  • Browser: none (app did not start)
  • Version: 24.05
docker info
Client:
 Version:    24.0.9
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.0
    Path:     /nix/store/jidnm42865p7pisj8i7nils91ianj19f-docker-plugins/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  2.27.0
    Path:     /nix/store/jidnm42865p7pisj8i7nils91ianj19f-docker-plugins/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 10
  Running: 0
  Paused: 0
  Stopped: 10
 Images: 14
 Server Version: 24.0.9
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: journald
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 nvidia runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: v1.7.16
 runc version: 
 init version: 
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.6.32
 Operating System: NixOS 24.05 (Uakari)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.38GiB
 Name: spectre360
 ID: 21a3559c-927d-493e-a4d1-74843f52fbad
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: true
@shredinjohn

I am facing the same issue as well

@SubramanyamChalla24
Contributor

Hi @joshuacox, can you run python run_first.py and verify whether the same error pops up? I'm able to run it with a fresh clone.

@amritajain13

@SubramanyamChalla24 I am running docker compose up -d, and once python run_first.py starts executing it gives me the same error. Can you help?

@SubramanyamChalla24
Contributor

@amritajain13, did you try running python run_first.py?

@joshuacox
Author

joshuacox commented Jun 15, 2024

python run_first.py 
Traceback (most recent call last):
  File "/unreal/gpu/Resume-Matcher/run_first.py", line 5, in <module>
    from scripts import JobDescriptionProcessor, ResumeProcessor
  File "/unreal/gpu/Resume-Matcher/scripts/__init__.py", line 2, in <module>
    from .JobDescriptionProcessor import JobDescriptionProcessor
  File "/unreal/gpu/Resume-Matcher/scripts/JobDescriptionProcessor.py", line 5, in <module>
    from .parsers import ParseJobDesc, ParseResume
  File "/unreal/gpu/Resume-Matcher/scripts/parsers/__init__.py", line 1, in <module>
    from .ParseJobDescToJson import ParseJobDesc
  File "/unreal/gpu/Resume-Matcher/scripts/parsers/ParseJobDescToJson.py", line 5, in <module>
    from scripts.Extractor import DataExtractor
  File "/unreal/gpu/Resume-Matcher/scripts/Extractor.py", line 6, in <module>
    from .utils import TextCleaner
  File "/unreal/gpu/Resume-Matcher/scripts/utils/__init__.py", line 3, in <module>
    from .Utils import TextCleaner
  File "/unreal/gpu/Resume-Matcher/scripts/utils/Utils.py", line 7, in <module>
    nlp = spacy.load("en_core_web_md")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/8msv6rh44z033csqkg2r3fa2j21m92px-python3-3.11.9-env/lib/python3.11/site-packages/spacy/__init__.py", line 51, in load
    return util.load_model(
           ^^^^^^^^^^^^^^^^
  File "/nix/store/8msv6rh44z033csqkg2r3fa2j21m92px-python3-3.11.9-env/lib/python3.11/site-packages/spacy/util.py", line 472, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'en_core_web_md'. It doesn't seem to be a Python package or a valid path to a data directory.

I am on NixOS, which is one of the reasons I am trying to run it inside Docker to try it out.

There are also no files named 'en_cor*' in my fresh clone.

find . -iname 'en_core*'

@SubramanyamChalla24
Contributor

@joshuacox can you install the spaCy English model with the command python -m spacy download en_core_web_sm and retry?

@joshuacox
Author

bash➜ python
Python 3.11.9 (main, Apr  2 2024, 08:25:04) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import spacy
>>> nlp = spacy.load("en_core_web_sm")
>>> import en_core_web_sm
>>> doc = nlp("This is a sentence.")
>>> print([(w.text, w.pos_) for w in doc])
[('This', 'PRON'), ('is', 'AUX'), ('a', 'DET'), ('sentence', 'NOUN'), ('.', 'PUNCT')]

But that is on the host, not inside the container, which apparently has not downloaded the model?
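
The traceback above names en_core_web_md, while the interactive check loaded en_core_web_sm, so it may be worth confirming which of the two is actually importable in a given environment (host or container). A minimal sketch, assuming the models are installed as regular pip packages; the helper name spacy_model_installed is hypothetical and uses only the standard library:

```python
import importlib.util


def spacy_model_installed(model_name: str) -> bool:
    """Return True if a top-level package with this name is importable.

    Assumption: spaCy models installed via `python -m spacy download`
    show up as ordinary Python packages (e.g. en_core_web_md).
    """
    return importlib.util.find_spec(model_name) is not None


if __name__ == "__main__":
    # Model names taken from the thread: the traceback asks for the "md"
    # model, the manual check used the "sm" model.
    for name in ("en_core_web_sm", "en_core_web_md"):
        status = "installed" if spacy_model_installed(name) else "missing"
        print(f"{name}: {status}")
```

Running the same script on the host and inside the container would show whether the two environments differ.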

@SubramanyamChalla24
Contributor

It's included in the requirements.txt for the Docker container. It does seem to be an issue; can you share more details? Or, if you can join Discord, we can discuss it there.

@dieideeistgut

dieideeistgut commented Jun 17, 2024

Had the same issue when switching to docker compose. Manually adding the Processed folder as a mounted volume fixed this for me. Kind of.

- ./Data/Processed:/data/Resume-Matcher/Data/Processed

@joshuacox
Author

@dieideeistgut yes, this docker compose file is completely broken without that volume. @SubramanyamChalla24, does it work for you?

@joshuacox joshuacox mentioned this issue Jun 27, 2024
11 tasks
@joshuacox
Author

There are some proposed changes, but unfortunately this is still broken. I left an ls -alh Data/Resumes; sleep 5 statement in the Dockerfile to show that the Resumes folder is indeed in there and populated with resumes.
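
The ls check above covers Data/Resumes only. A small diagnostic along these lines could show whether each directory the script touches is missing or merely empty. This is a sketch, not part of the repository: the helper names are hypothetical, and the list of directories is an assumption pieced together from paths mentioned in this thread.

```python
from pathlib import Path

# Assumed layout, based on paths mentioned in this thread; adjust as needed.
EXPECTED_DIRS = [
    "Data/Resumes",
    "Data/JobDescription",
    "Data/Processed/Resumes",
    "Data/Processed/JobDescription",
]


def describe_dir(path: Path) -> str:
    """Summarise a path as missing, not a directory, or a file count."""
    if not path.exists():
        return "missing"
    if not path.is_dir():
        return "not a directory"
    count = sum(1 for entry in path.iterdir() if entry.is_file())
    return f"{count} file(s)"


def report(root: Path) -> dict[str, str]:
    """Map each expected directory (relative to root) to its state."""
    return {name: describe_dir(root / name) for name in EXPECTED_DIRS}


if __name__ == "__main__":
    for name, state in report(Path.cwd()).items():
        print(f"{name}: {state}")
```

Run from the project root, this separates "no resumes in the folder" from "a folder the script expects does not exist".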

P4jMepR added a commit to P4jMepR/pull-Resume-Matcher that referenced this issue Jun 30, 2024
As in title:
- Static pathing -> Relative pathing (in run_first.py)
- previous .gitignore structure messed up directory structure (srbhr#272)
P4jMepR added a commit to P4jMepR/pull-Resume-Matcher that referenced this issue Jun 30, 2024
Git won't acknowledge directories without any files within them.
Now if file structure is not complete it will create one. (solved srbhr#272)
@P4jMepR

P4jMepR commented Jun 30, 2024

@joshuacox @shredinjohn @amritajain13 @dieideeistgut
Issue is:
Repo is missing 3 directories.
Upon creating them everything works flawlessly. My pull request currently awaits approval.
I've also fixed the relative pathing (which might mitigate possible Docker issues) and added a traceback to the error we were all getting, which should give us a hint about what's going on.

image
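
For readers unfamiliar with the relative-pathing point: one common approach is to build every data path from the script's own location rather than from the current working directory, so the script behaves the same regardless of where it is launched. The sketch below only illustrates that idea and is not the code from the pull request; project_paths and the dictionary keys are hypothetical names, and the directory layout is assumed from this thread.

```python
from pathlib import Path


def project_paths(script_file: str) -> dict[str, Path]:
    """Build data paths anchored at the directory containing script_file."""
    root = Path(script_file).resolve().parent
    data = root / "Data"
    return {
        "resumes": data / "Resumes",
        "job_descriptions": data / "JobDescription",
        "processed_resumes": data / "Processed" / "Resumes",
        "processed_job_descriptions": data / "Processed" / "JobDescription",
    }


# Inside a script such as run_first.py one would call:
#     paths = project_paths(__file__)
```

With paths built this way, changing the working directory before launching the script no longer changes which folders it reads from.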

@nanafy

nanafy commented Jul 23, 2024

It's not hard to fix. Add the missing directories, as pointed out by P4jMepR. Inside Processed:

  • Data
  • Resumes
  • JobDescription

Then run python run_first.py and then start the UI

@joshuacox
Author

@nanafy I have specifically mentioned that this is inside of docker compose

@P4jMepR

P4jMepR commented Jul 31, 2024

> I have specifically mentioned that this is inside of docker compose

@joshuacox clone my pull and try to compose it again.
It should work just fine.

@joshuacox
Author

@P4jMepR I get the same results using your PR

0.163 /usr/local/lib/python3.11/site-packages/pypdf/_crypt_providers/_cryptography.py:32: CryptographyDeprecationWarning: ARC4 has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.ARC4 and will be removed from this module in 48.0.0.
0.163   from cryptography.hazmat.primitives.ciphers.algorithms import AES, ARC4
1.916 2024-07-31 11:24:11,846 (run_first.py:44) - INFO: Started to read from Data/Resumes
1.916 2024-07-31 11:24:11,846 (run_first.py:54) - ERROR: There are no resumes present in the specified folder.
1.916 2024-07-31 11:24:11,846 (run_first.py:55) - ERROR: Exiting from the program.
1.916 2024-07-31 11:24:11,846 (run_first.py:56) - ERROR: Please add resumes in the Data/Resumes folder and try again.
1.916 2024-07-31 11:24:11,847 (run_first.py:57) - ERROR: Traceback (most recent call last):
1.916   File "/data/Resume-Matcher/run_first.py", line 48, in <module>
1.916     remove_old_files(PROCESSED_RESUMES_PATH)
1.916   File "/data/Resume-Matcher/run_first.py", line 32, in remove_old_files
1.916     for filename in os.listdir(files_path):
1.916                     ^^^^^^^^^^^^^^^^^^^^^^
1.916 FileNotFoundError: [Errno 2] No such file or directory: '/data/Resume-Matcher/Data/Processed/Resumes'
1.916 
------
failed to solve: process "/bin/sh -c python run_first.py" did not complete successfully: exit code: 1
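
The traceback shows that the failure comes from os.listdir on Data/Processed/Resumes, not from an empty Data/Resumes folder as the log message suggests. One way to make that step tolerant of a missing directory is sketched below. remove_old_files is the name from the traceback, but this body is an assumption about what it does (delete leftover files), not the repository's actual code.

```python
import os


def remove_old_files(files_path: str) -> int:
    """Delete regular files in files_path and return how many were removed.

    Creates the directory first, so a fresh clone (where git has not
    tracked the empty output folder) does not raise FileNotFoundError.
    """
    os.makedirs(files_path, exist_ok=True)
    removed = 0
    for filename in os.listdir(files_path):
        file_path = os.path.join(files_path, filename)
        # Only remove plain files; leave any subdirectories untouched.
        if os.path.isfile(file_path):
            os.remove(file_path)
            removed += 1
    return removed
```

Because this runs during the image build (RUN python run_first.py), the directory has to exist or be created at build time for that step to pass.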

@drnkknt

drnkknt commented Aug 24, 2024

Same here, still getting the error with the latest repo.

image

KamalMahanna added a commit to KamalMahanna/Resume-Matcher that referenced this issue Aug 30, 2024
Hemanth-Thaluru added a commit to Hemanth-Thaluru/Resume-Matcher that referenced this issue Sep 20, 2024
srbhr pushed a commit that referenced this issue Oct 22, 2024