Move main.predict_file to predict.predict_file and uses trainer.predict() for predict_file(). Speeds up main.evaluate! #550

bw4sz · 2023-11-09T19:35:52Z

Migrate the code in main.predict_file to predict.predict_file, leaving just a wrapper in main. main.py is too long and shouldn't have any complex logic in it.
Use the trainer logic (trainer.predict()) instead of manually moving batches to the GPU. This gives a significant speedup in evaluate and should close "main.evaluate is too slow" #538.
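
For readers who want the shape of the refactor without opening the diff, here is a minimal sketch of the pattern described above: the logic lives in a module-level function, and the model class keeps only a thin wrapper that passes itself and its trainer along. Everything here is a hypothetical stand-in (`StubTrainer`, `Model`, the `batches` argument), not the DeepForest or PyTorch Lightning API. The only behaviour assumed of the trainer is that `predict()` returns one output per batch.

```python
# Hypothetical stand-ins for illustration only; not the DeepForest code.


class StubTrainer:
    """Assumed trainer behaviour: run predict_step on every batch and
    return a list with one entry per batch."""

    def predict(self, model, dataloader):
        return [model.predict_step(batch, idx) for idx, batch in enumerate(dataloader)]


def predict_file(model, trainer, batches):
    """Module-level function holding the logic (the role of predict.predict_file)."""
    batched_results = trainer.predict(model, batches)
    # Flatten the per-batch outputs into one list of per-image results.
    return [result for batch in batched_results for result in batch]


class Model:
    """Plays the role of the class in main.py: it owns the trainer and
    exposes only a thin wrapper method."""

    def __init__(self):
        self.trainer = StubTrainer()

    def predict_step(self, batch, batch_idx):
        # A real model would run inference here; this just labels each image.
        return [f"boxes for {image}" for image in batch]

    def predict_file(self, batches):
        # Pass-through: the method supplies self and self.trainer, nothing more.
        return predict_file(model=self, trainer=self.trainer, batches=batches)


m = Model()
results = m.predict_file([["0.png", "1.png"], ["2.png"]])
print(results)
```

The wrapper exists because the module-level function needs the model instance and its trainer, which a method has on hand as `self`.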

Before: around 37 seconds.

After

Tested on HPC with 1 GPU.

(base) [b.weinstein@login12 ~]$ cat tunnel.sh
#!/bin/bash
#SBATCH --job-name=tunnel   # Job name
#SBATCH --mail-type=END               # Mail events
#SBATCH [email protected]  # Where to send mail
#SBATCH --account=ewhite
#SBATCH --nodes=1                 # Number of MPI ran
#SBATCH --cpus-per-task=1
#SBATCH --mem=70GB
#SBATCH --time=12:00:00       #Time limit hrs:min:sec
#SBATCH --output=/home/b.weinstein/logs/tunnel.out   # Standard output and error log
#SBATCH --error=/home/b.weinstein/logs/tunnel.err
#SBATCH --partition=gpu
#SBATCH --gpus=1

(base) [b.weinstein@login12 tests]$ cat profile_predict_file.py
#Profile the dataset class on gpu
from deepforest import main
from deepforest import get_data
import os
import pandas as pd
import numpy as np
import cProfile, pstats
import tempfile
from PIL import Image
import cv2

def run(m, csv_file, root_dir):
    predictions = m.predict_file(csv_file=csv_file, root_dir=root_dir)

if __name__ == "__main__":
    profiler = cProfile.Profile()
    profiler.enable()
    m = main.deepforest()
    m.use_release()
    m.config["workers"] = 0
    m.config["batch_size"] = 24

    csv_file = get_data("OSBS_029.csv")
    image_path = get_data("OSBS_029.png")
    tmpdir = tempfile.gettempdir()
    df = pd.read_csv(csv_file)

    big_frame = []
    for x in range(100):
        img = Image.open("{}/{}".format(os.path.dirname(csv_file), df.image_path.unique()[0]))
        cv2.imwrite("{}/{}.png".format(tmpdir, x), np.array(img))
        new_df = df.copy()
        new_df.image_path = "{}.png".format(x)
        big_frame.append(new_df)

    big_frame = pd.concat(big_frame)
    big_frame.to_csv("{}/annotations.csv".format(tmpdir))


    run(m, csv_file = "{}/annotations.csv".format(tmpdir), root_dir = tmpdir)
    profiler.disable()
    stats = pstats.Stats(profiler).sort_stats('cumtime')
    stats.print_stats()
    stats.dump_stats('predict_file.prof')
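
The script above ends by dumping raw stats to `predict_file.prof`. As a sketch of how such a dump can be inspected afterwards, the snippet below profiles a stand-in workload, writes it to a temporary file, then reloads it with `pstats` and prints the slowest entries by cumulative time. The workload (`busy_work`) and the file name `example_predict_file.prof` are placeholders, not part of this PR.

```python
import cProfile
import os
import pstats
import tempfile


def busy_work(n):
    """Stand-in workload so the example has something to profile."""
    return sum(i * i for i in range(n))


def profile_to_file(path):
    """Profile the stand-in workload and dump the raw stats to `path`."""
    profiler = cProfile.Profile()
    profiler.enable()
    busy_work(100_000)
    profiler.disable()
    profiler.dump_stats(path)


def top_functions(path, limit=10):
    """Load a dumped profile and print the `limit` slowest entries by cumulative time."""
    stats = pstats.Stats(path)
    stats.strip_dirs().sort_stats("cumtime").print_stats(limit)
    return stats


# Placeholder path; swap in the real dump (e.g. predict_file.prof) to inspect it.
prof_path = os.path.join(tempfile.gettempdir(), "example_predict_file.prof")
profile_to_file(prof_path)
loaded = top_functions(prof_path, limit=5)
```

Sorting by `cumtime` matches the ordering used in the profiling script, so the two outputs can be compared directly.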

ethanwhite

This all looks good. My one question is whether we could just import predict_file() directly from predict.py instead of having what is basically a pass-through function. I'm guessing there's something special about the Lightning module class that means we need to put a copy there, so I'm going to go ahead and merge.

bw4sz added 2 commits November 9, 2023 07:56

move to GPU to profile results

f71eef4

style

fd93aab

bw4sz requested a review from henrykironde November 9, 2023 19:35

change the order of args

6050220

ethanwhite approved these changes Nov 10, 2023

ethanwhite merged commit a028d71 into main Nov 10, 2023
5 checks passed

ethanwhite deleted the predict_file_dataloader branch November 10, 2023 15:09
