Describe the bug
When initializing and dropping the Model repeatedly:
Memory usage continuously increases as GGUF models aren't properly cleaned up
Channel is erroneously closed after the first iteration
Steps to Reproduce
Create a service that initializes and drops the model multiple times
Run the following code:
use anyhow::Result;
use mistralrs::{GgufModelBuilder, PagedAttentionMetaBuilder, TextMessageRole, TextMessages};
use std::time::Duration;
use tokio::time::sleep;

struct ChatService {
    model: Option<mistralrs::Model>,
}

impl ChatService {
    async fn new() -> Result<Self> {
        Ok(Self { model: None })
    }

    // Load the GGUF model; dropping the ChatService should release it.
    async fn initialize_model(&mut self) -> Result<()> {
        self.model = Some(
            GgufModelBuilder::new(
                "gguf_models/mistral_v0.1/",
                vec!["mistral-7b-instruct-v0.1.Q4_K_M.gguf"],
            )
            .with_chat_template("chat_templates/mistral.json")
            .with_paged_attn(|| PagedAttentionMetaBuilder::default().build())?
            .build()
            .await?,
        );
        Ok(())
    }

    async fn chat(&self, prompt: &str) -> Result<String> {
        let messages = TextMessages::new().add_message(TextMessageRole::User, prompt);
        let response = self
            .model
            .as_ref()
            .unwrap()
            .send_chat_request(messages)
            .await?;
        Ok(response.choices[0]
            .message
            .content
            .clone()
            .unwrap_or_default())
    }
}

#[tokio::main]
async fn main() -> Result<()> {
    for i in 0..3 {
        println!("Iteration {}", i);
        let mut service = ChatService::new().await?;
        service.initialize_model().await?;
        let response = service.chat("Write a short greeting").await?;
        println!("Response: {}", response);
        // Model is dropped here, but the GGUF weights remain in memory
        drop(service);
        // Wait to make memory usage observable
        sleep(Duration::from_secs(5)).await;
    }
    Ok(())
}
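To make the growth visible without an external profiler, one option is to print the process's resident set size (RSS) once per iteration. Below is a minimal sketch, assuming a Unix-like system where the `ps` tool is available; this helper is not part of the repro or of mistral.rs:

use std::process::Command;

// Hypothetical helper: print this process's resident set size (RSS)
// by shelling out to the Unix `ps` tool.
fn print_rss(label: &str) {
    let pid = std::process::id().to_string();
    let out = Command::new("ps")
        .args(["-o", "rss=", "-p", pid.as_str()])
        .output()
        .expect("failed to spawn ps");
    println!("{label}: RSS = {} kB", String::from_utf8_lossy(&out.stdout).trim());
}

Calling print_rss(&format!("after iteration {i}")) at the end of each loop body should show the resident set climbing on every pass if the leak is present.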
Cargo.toml is here:
[package]
name = "memory_bug_mistral"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
tokio = { version = "1", features = ["full"] }
anyhow = "1.0"
mistralrs = { git = "https://github.com/EricLBuehler/mistral.rs.git", branch = "master", features = [
    "metal",
] }
regex = "1.10.6"
Observed Behavior
Memory usage increases with each iteration, even after the explicit drop
After the first iteration, the following error is received:
Error: Channel was erroneously closed!
Expected Behavior
Memory should be properly freed when the model is dropped
The channel should remain functional for subsequent iterations
@EricLBuehler
I'm wondering if you have any plans to address this memory management issue in the library?
While I could work around it using a web server or child processes for now, I'd like to understand your timeline for implementing a native solution. This would help me decide whether to proceed with a temporary workaround or wait for an official fix. Could you share your thoughts on this?
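For anyone hitting the same issue in the meantime, here is a minimal sketch of the child-process workaround mentioned above. The chat_worker binary name and the one-prompt-per-invocation convention are assumptions for illustration, not part of mistral.rs; the worker itself would contain the model-loading and chat code from the repro. Because each model lifecycle runs in its own process, the OS reclaims all memory, including the leaked GGUF weights, when the child exits:

use std::process::Command;

fn main() -> std::io::Result<()> {
    for i in 0..3 {
        println!("Iteration {i}");
        // Spawn a fresh worker per iteration; it loads the model,
        // answers one prompt, prints the response, and exits.
        let status = Command::new("./target/release/chat_worker")
            .arg("Write a short greeting")
            .status()?;
        assert!(status.success(), "worker failed on iteration {i}");
    }
    Ok(())
}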