Apple M1 Support #64

Closed
agentmorris opened this issue May 20, 2023 · 15 comments

Comments

@agentmorris

Hi All - very excited about this release!

I know this is probably deep down in the dependencies, but I wanted to raise it given that M1 chips are becoming more common and that Mac instructions are given in the README.

The following error occurred when running run_detector.py:

Intel MKL FATAL ERROR: This system does not meet the minimum requirements for use of the Intel(R) Math Kernel Library.
The processor must support the Intel(R) Supplemental Streaming SIMD Extensions 3 (Intel(R) SSSE3) instructions.
The processor must support the Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions.
The processor must support the Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.

Issue cloned from Microsoft/CameraTraps, original issue posted by sim-kelly on Jun 23, 2022.
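As a quick diagnostic, platform.machine() reports whether Python itself is running natively on arm64 or as an x86_64 build under Rosetta; Rosetta does not translate AVX instructions, which is typically how an Intel-only MKL ends up failing on an M1. A minimal check:

import platform

# 'arm64' means a native Apple Silicon build; 'x86_64' means an Intel build
# running under Rosetta, which will pull in Intel-only packages like MKL
print(platform.machine())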

@agentmorris commented May 20, 2023

Hmmm... that's not what we want. I don't have an M1 Mac to test on, but this thread looks promising:

https://stackoverflow.com/questions/71306262/conda-env-broken-after-installing-pytorch-on-m1-intel-mkl-fatal-error

It seems like maybe running:

conda install nomkl

...prior to any other package installations (but from within the target conda environment) will fix this. Assuming you're installing from environment-detector-mac.yml, can you try adding "nomkl" somewhere near the top of that file?
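A sketch of what the top of the dependencies section might look like (the entries shown here are placeholders, not the file's actual contents):

name: cameratraps-detector
channels:
  - conda-forge
dependencies:
  - nomkl       # should resolve before anything that would otherwise pull in MKL
  - python=3.8  # placeholder; keep whatever the file already specifies
  # ...remaining dependencies unchanged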

If that works, that's a good solution and we'll update the instructions.

If that doesn't work, that doesn't necessarily mean that "nomkl" isn't a good approach; I don't think in-order installation is guaranteed from a conda environment file, so we may need to do some more experimenting. But that's a good first debugging step.

Mind giving that a try?

Thanks!


(Comment originally posted by agentmorris)

@agentmorris

Progressed past the initially mentioned issue:

  1. by adding nomkl to the yml, as mentioned by @agentmorris

and

  2. by making the following changes to the instructions:
CONDA_SUBDIR=osx-arm64 conda env create --file environment-detector-mac.yml    
conda activate cameratraps-detector-m1    
conda env config vars set CONDA_SUBDIR=osx-arm64
conda activate 
conda activate cameratraps-detector-m1
pip3 install -U --pre torch torchvision --extra-index-url https://download.pytorch.org/whl/nightly/cpu
export PYTHONPATH="$PYTHONPATH:$HOME/Documents/GitHub/cameratraps:$HOME/Documents/GitHub/ai4eutils:$HOME/Documents/GitHub/yolov5"

Output now looks like this:

Running detector on 1 images...
PyTorch reports 0 available CUDA devices
GPU available: False
Using PyTorch version 1.13.0.dev20220704
Fusing layers... 
Model summary: 574 layers, 139990096 parameters, 0 gradients
Loaded model in 3.42 seconds
Fusing layers... 
Model summary: 574 layers, 139990096 parameters, 0 gradients
Loaded model in 1.35 seconds
  0%|                                                                                                                                                       | 0/1 [00:00<?, ?it/s]
PTDetector: image test_images/test_images/caltech_camera_traps_5a0e37cc-23d2-11e8-a6a3-ec086b02610b.jpg failed during inference: 'Upsample' object has no attribute 'recompute_scale_factor'
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.40s/it]
On average, for each image,
- loading took 0.07 seconds, std dev is not available
- inference took 1.31 seconds, std dev is not available

This error is mentioned in ultralytics/yolov5#6948. @agentmorris, is this why a specific commit of yolov5 is mentioned?
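For reference, the workaround discussed in that YOLOv5 issue is to null out the missing attribute on any Upsample modules after the model is loaded; a minimal sketch, assuming model is the already-loaded detector:

import torch.nn as nn

# checkpoints pickled under older PyTorch lack 'recompute_scale_factor', but
# newer Upsample.forward() implementations reference it; setting it to None
# restores the old behavior
for m in model.modules():
    if isinstance(m, nn.Upsample) and not hasattr(m, 'recompute_scale_factor'):
        m.recompute_scale_factor = None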

I have tried each permutation of the following with the same result:

  • Nightly/stable versions of torch/torchvision
  • Latest and the specified commit of yolov5

Any ideas, or is there something I am overlooking?


(Comment originally posted by sim-kelly)

@agentmorris

This issue is why we install a specific version of PyTorch, actually. At the time we released MDv5, even the newest commits to YOLOv5 had this issue when running against the newest version of PyTorch. Our use of a specific YOLOv5 commit is just future-proofing. I don't know that anything would stop working if you used the most recent YOLOv5 commit, but just to minimize the number of variables, I recommend sticking with the recommended commit.

So I think what's happening here is that when you pip installed PyTorch (after installing nomkl), you installed the latest version. Can you install pytorch 1.10.1 and torchvision 0.11.2 and see what happens?
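A hedged example of the pinned install (exact wheel availability for these versions on Apple Silicon isn't guaranteed):

pip3 install torch==1.10.1 torchvision==0.11.2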


(Comment originally posted by agentmorris)

@agentmorris

Closing due to inactivity, but let us know if you're able to make nomkl work by installing the recommended PyTorch version. Thanks!


(Comment originally posted by agentmorris)

@agentmorris

For Apple M1 support you will need the following:

  • macOS version 12.3 or higher (upgrading also fixes TensorFlow-metal object detection bugs)
  • PyTorch nightly build v1.13 or higher
  • Changes in PR #308 applied
  • Modified/Updated version of MDv5

Creating an updated version of MDv5 is pretty easy and will make the model available to newer versions of PyTorch on all platforms by removing the upsample problem.

Set up a new virtual environment with the latest versions of PyTorch and YOLOv5

Follow the YOLOv5 "Training Custom Data" instructions for organizing directories, but only include one image and one label. Then "fine-tune" the current md_v5a.0.0.pt model with all layers frozen:

My dataset.yaml:

train: /home/pete/Desktop/MD/images/train
val: /home/pete/Desktop/MD/images/train
nc: 3
names: ['animal', 'person', 'vehicle']

python train.py --weights md_v5a.0.0.pt --data dataset.yaml --freeze 33 --epochs 1 --batch 1 --img 1280

The resulting model will be usable on all platforms with the latest versions of PyTorch. NOTE: training cannot be done on the M1.
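A quick way to sanity-check the re-exported weights is to confirm they unpickle cleanly under a newer PyTorch; a minimal sketch (run it from inside the YOLOv5 repo so models.yolo is importable; the path is an example, not a fixed location):

import torch

# the re-exported checkpoint should now load under current PyTorch versions
ckpt = torch.load('runs/train/exp/weights/best.pt', map_location='cpu')
model = ckpt['model'].float().eval()
print(sum(p.numel() for p in model.parameters()), 'parameters')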


(Comment originally posted by persts)

@agentmorris

We are frustratingly close to not having to deal with several of these issues: the Upsample issue appears to have been resolved when using the latest YOLOv5 and the latest stable build of PyTorch (1.12); i.e. you can use "vanilla MD" with the latest PyTorch version. We're not updating our recommended environment yet, mostly because we don't want to rock the boat and what we have is working, but for those who have a specific reason to use the latest stable PyTorch build or the latest YOLOv5, I've confirmed that this works.

However, the latest nightly build of PyTorch (1.13) still has the Upsample issue, and 1.13 is what's required for M1 support. Grrr. So to use M1 support, you'll still need a custom MegaDetector and the nightly PT build.

But I just merged Peter's PR to add M1 inference support, as well as a new environment-detector-m1.yml file that is identical to environment-detector-mac.yml, except that "nomkl" has been added. For now, this isn't "officially" supported, but this issue will serve as documentation for adventurous folks who want to try this.

Get the latest CameraTraps repo

These instructions assume you have a recent version of our repo... i.e., go into your CameraTraps folder (c:\git\CameraTraps if you copied and pasted our "standard" instructions), and run:

git fetch && git pull

Get the latest YOLOv5 repo

We can't say that "latest" will always be correct, but as of now (September 2022), "latest" works here, whereas the YOLOv5 commit (c23a441c9df7ca9b1f275e8c8719c949269160d1) that we recommend for "standard" MD use does not. So, if you've already checked out the old commit, head into your YOLOv5 folder and run:

git checkout master

If and when M1 inference becomes "officially supported", we'll pin a new commit that we've tested thoroughly. For now, just go with latest.

Re-build MD for the latest YOLOv5 (or download unofficial re-builds)

Peter's instructions were excellent; I trained one epoch on one image with all layers frozen. My dataset.yml looked like this:

train: /home/user/train
val: /home/user/train
nc: 3
names: ['animal', 'person', 'vehicle']

And the only files in /home/user/train were one image from Snapshot Serengeti and the corresponding .txt file. It looks like you cannot train on only one negative image; there must be at least one bounding box. FYI the image I used is here and the bbox file is here.

I ran:

python train.py --weights ~/train/md_v5a.0.0.pt --data ~/train/dataset.yml --freeze 33 --epochs 1 --batch 1 --img 1280 --name md_v5a

...and ditto for MDv5b.

I compared the output to "stock" MDv5a, and I really want them to be exactly the same. Boxes appear to be the same to around two decimal places in both location and confidence, which is good, but I would feel better if they were exactly the same. I also tried opening up data/hyp/*.yml, where YOLOv5 stores its hyperparameters, and setting all learning rates to zero. This resulted in... slightly different numbers, but off by about the same (very small) amount.

This is the main reason this will remain unofficial for now, but for folks who are dying to get around this Upsample issue, here are my re-built MDv5a and MDv5b files. YMMV.
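For anyone who wants to repeat this comparison, here is a sketch that diffs confidences between two batch-output files (the filenames are hypothetical; it assumes the standard MegaDetector batch-output JSON format):

import json

# compare detection confidences between stock and re-built model outputs
with open('stock_md_v5a.json') as f:
    stock = json.load(f)
with open('rebuilt_md_v5a.json') as f:
    rebuilt = json.load(f)

for im_s, im_r in zip(stock['images'], rebuilt['images']):
    for d_s, d_r in zip(im_s['detections'], im_r['detections']):
        print(im_s['file'], abs(d_s['conf'] - d_r['conf']))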

Set up the M1 inference environment

Only very slight variations to Sam's instructions earlier on this issue:

# Will not install PyTorch, includes "nomkl" package
CONDA_SUBDIR=osx-arm64 conda env create --file environment-detector-m1.yml    

conda activate cameratraps-detector 

# Needs PyTorch version >= 1.13, this will get you 1.13.0 as of 2022.09.07 
pip3 install -U --pre torch torchvision --extra-index-url https://download.pytorch.org/whl/nightly/cpu

# Full disclosure: I did this, but didn't actually test whether this is necessary
conda env config vars set CONDA_SUBDIR=osx-arm64
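Once the environment is up, a one-liner confirms that the MPS backend is actually visible to PyTorch:

import torch

# both should be True on Apple Silicon with macOS >= 12.3 and PyTorch >= 1.13
print(torch.backends.mps.is_built(), torch.backends.mps.is_available())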

(Comment originally posted by agentmorris)

@agentmorris

@agentmorris, I tried using your suggested solution for an Apple M1:

python detection/run_detector.py "/Users/g/Downloads/md_v5a.0.0_rebuild_pt-1.12_zerolr.pt" --image_file "/Users/g/Documents/Images_for_models/Classified_Yamal_2022/Good_Animal_NoBait->>Good_Animal_Bait/cam7_2022-03-25_03-35-00.JPG" --threshold 0.1

and get this error:

Running detector on 1 images...
PyTorch reports 0 available CUDA devices
PyTorch reports Metal Performance Shaders are available
GPU available: True
Using PyTorch version 1.13.0.dev20220920
Traceback (most recent call last):
  File "detection/run_detector.py", line 529, in <module>
    main()
  File "detection/run_detector.py", line 518, in main
    load_and_run_detector(model_file=args.detector_file,
  File "detection/run_detector.py", line 286, in load_and_run_detector
    detector = load_detector(model_file)
  File "detection/run_detector.py", line 263, in load_detector
    detector = PTDetector(model_file, force_cpu)
  File "/Users/gerardocelis/git/cameratraps/detection/pytorch_detector.py", line 41, in __init__
    self.model = PTDetector._load_model(model_path, self.device)
  File "/Users/gerardocelis/git/cameratraps/detection/pytorch_detector.py", line 48, in _load_model
    checkpoint = torch.load(model_pt_path, map_location=device)
  File "/Users/gerardocelis/opt/miniconda3/envs/cameratraps-detector/lib/python3.8/site-packages/torch/serialization.py", line 763, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/Users/gerardocelis/opt/miniconda3/envs/cameratraps-detector/lib/python3.8/site-packages/torch/serialization.py", line 1100, in _load
    result = unpickler.load()
  File "/Users/gerardocelis/opt/miniconda3/envs/cameratraps-detector/lib/python3.8/site-packages/torch/serialization.py", line 1093, in find_class
    return super().find_class(mod_name, name)
AttributeError: Can't get attribute 'DetectionModel' on <module 'models.yolo' from '/Users/gerardocelis/git/yolov5/models/yolo.py'>

Any ideas how to proceed?


(Comment originally posted by gerlis22)

@agentmorris

Are you using the specific YOLOv5 commit (c23a441c9df7ca9b1f275e8c8719c949269160d1) that we recommend for a "standard" MegaDetector setup? If so, I'm about 65% sure this is the issue: for the accelerated M1 setup, you'll need a newer version of YOLOv5. I.e., in your YOLOv5 repo folder, run:

git checkout master
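The underlying cause is that the re-built checkpoints pickle a reference to models.yolo.DetectionModel, which only exists in newer YOLOv5 commits; a quick check, run from inside your YOLOv5 repo folder:

# older commits only define models.yolo.Model, which is why unpickling
# raises AttributeError on 'DetectionModel'
import importlib

yolo = importlib.import_module('models.yolo')
print(hasattr(yolo, 'DetectionModel'))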

My bad, I should have clarified this above as well.

If this works, can you confirm here, and I'll update the instructions?


(Comment originally posted by agentmorris)

@agentmorris

Yes, indeed I was using YOLOv5 commit c23a441c9df7ca9b1f275e8c8719c949269160d1. I updated to commit 489920ab30b217fed14d3ddd31c23e9afc5be238 and it works now. Thanks!


(Comment originally posted by gerlis22)

@agentmorris

Excellent, I added a section to my post earlier on this issue.


(Comment originally posted by agentmorris)

@agentmorris

@agentmorris I ran run_detector.py and run_detector_batch.py on the same image and get very different confidence values. For example, run_detector.py shows an animal with a confidence of 0.95, whereas run_detector_batch.py shows 0.2. Could this be an issue with run_detector_batch.py?


(Comment originally posted by gerlis22)

@agentmorris

I connected with @gerlis22 offline; it turns out that the discrepancy was between MDv5a and MDv5b (which is fine), not between run_detector.py and run_detector_batch.py (which would be a catastrophe). So, no cause for alarm here, but thanks to @gerlis22 for checking on this; always better to ask!


(Comment originally posted by agentmorris)

@agentmorris commented May 20, 2023

Thanks for the great work.

Just in case anyone else gets stuck where I was for the past few hours:

Since torch 1.13 is released now, I changed the environment-detector-m1.yml file to install it:

diff --git a/environment-detector-m1.yml b/environment-detector-m1.yml
index 13979af..ad7050e 100644
--- a/environment-detector-m1.yml
+++ b/environment-detector-m1.yml
@@ -27,8 +27,8 @@ dependencies:
   - pandas
   - seaborn>=0.11.0
   - PyYAML>=5.3.1
-  # - pytorch::pytorch=1.10.1
-  # - pytorch::torchvision=0.11.2
+  - pytorch::pytorch=1.13.1
+  - pytorch::torchvision=0.14.1
   # - conda-forge::cudatoolkit=11.3
   # - conda-forge::cudnn=8.1

Then to install:

CONDA_SUBDIR=osx-arm64 conda env create --file environment-detector-m1.yml    

conda activate cameratraps-detector

python detection/run_detector.py ~/Downloads/md_v5a.0.0.pt --image_file ~/Desktop/video/withbird.png --threshold 0.1

And it results in:

Running detector on 1 images...
PyTorch reports 0 available CUDA devices
PyTorch reports Metal Performance Shaders are available
GPU available: True
Using PyTorch version 1.13.1
Traceback (most recent call last):
  File "detection/run_detector.py", line 529, in <module>
    main()
  File "detection/run_detector.py", line 518, in main
    load_and_run_detector(model_file=args.detector_file,
  File "detection/run_detector.py", line 286, in load_and_run_detector
    detector = load_detector(model_file)
  File "detection/run_detector.py", line 263, in load_detector
    detector = PTDetector(model_file, force_cpu)
  File "/Volumes/Work/megadetector/cameratraps/detection/pytorch_detector.py", line 48, in __init__
    self.model = PTDetector._load_model(model_path, self.device)
  File "/Volumes/Work/megadetector/cameratraps/detection/pytorch_detector.py", line 57, in _load_model
    checkpoint = torch.load(model_pt_path, map_location=device)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cameratraps-detector/lib/python3.8/site-packages/torch/serialization.py", line 789, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cameratraps-detector/lib/python3.8/site-packages/torch/serialization.py", line 1131, in _load
    result = unpickler.load()
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cameratraps-detector/lib/python3.8/site-packages/torch/_utils.py", line 153, in _rebuild_tensor_v2
    tensor = _rebuild_tensor(storage, storage_offset, size, stride)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cameratraps-detector/lib/python3.8/site-packages/torch/_utils.py", line 146, in _rebuild_tensor
    t = torch.tensor([], dtype=storage.dtype, device=storage.untyped().device)
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.

In this case, the fix is (RTFM...) to download @agentmorris's patched models above. I got confused because it gave me this error rather than the "advertised" Upsample error.

python detection/run_detector.py ~/Downloads/md_v5a.0.0_rebuild_pt-1.12_zerolr.pt --image_file ~/Desktop/video/withbird.png --threshold 0.1

(Comment originally posted by reinhrst)

@agentmorris

See issue 72, where Peter suggests this change:

index 7778d1b..7774ac0 100644
--- a/detection/pytorch_detector.py
+++ b/detection/pytorch_detector.py
@@ -45,8 +45,8 @@ class PTDetector:
 
     @staticmethod
     def _load_model(model_pt_path, device):
-        checkpoint = torch.load(model_pt_path, map_location=device)
-        model = checkpoint['model'].float().fuse().eval()  # FP32 model
+        checkpoint = torch.load(model_pt_path)
+        model = checkpoint['model'].float().fuse().eval().to(device)  # FP32 model
         return model
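The reason this works: the checkpoint can contain float64 tensors, which the MPS backend can't represent, so deserializing directly onto the device via map_location fails. Loading on the CPU first, casting to FP32 with .float(), and only then moving the model with .to(device) avoids ever materializing a float64 tensor on MPS.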

@agentmorris

Finally closing this issue and adopting Peter's recommended change!
