
Repeat generation of aggregation keys #1479

Closed
Stebalien opened this issue Jul 8, 2021 · 14 comments · Fixed by #1481

@Stebalien
Member

Description

Currently, if multiple threads try to verify aggregate proofs with new parameters at the same time, we'll try to compute the keys multiple times (last one wins).

Ideally, we'd compute the keys once.

Acceptance criteria

I can try to verify a thousand aggregate proofs at once without waiting a day or two ((100 s × 1000) / 3600 ≈ 28 hours).

Risks + pitfalls

Extra complexity in the caches could introduce bugs. We also probably don't want to rely on nightly, which means either rolling our own "once" or using an external crate.

Where to begin

Use something like https://doc.rust-lang.org/std/lazy/struct.SyncOnceCell.html (or https://crates.io/crates/conquer-once), or even just a mutex, to define per-key locked "cells", one for every entry in our caches.
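
For illustration, a minimal sketch of the per-key approach (not the project's code; the helper name and map layout are invented), assuming the `once_cell` crate: the map lock is held only long enough to fetch the cell, and the expensive generator runs under that key's cell alone, at most once.

use std::collections::HashMap;
use std::sync::{Arc, Mutex};

use once_cell::sync::OnceCell;

// Hypothetical helper: returns the cached value for `key`, generating it at
// most once even when many threads ask for the same key concurrently.
pub fn get_or_generate<G, F>(
    cells: &Mutex<HashMap<String, Arc<OnceCell<Arc<G>>>>>,
    key: &str,
    generate: F,
) -> Arc<G>
where
    F: FnOnce() -> G,
{
    // Hold the map lock only long enough to fetch (or create) this key's cell.
    let cell = cells
        .lock()
        .expect("poisoned cache")
        .entry(key.to_string())
        .or_default()
        .clone();
    // Concurrent callers for the same key block on the cell, not on the whole
    // map; `generate` runs exactly once per key.
    cell.get_or_init(|| Arc::new(generate())).clone()
}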

@Stebalien
Member Author

pub fn cache_lookup<F, G>(
    cache_ref: &Mutex<Cache<G>>,
    identifier: String,
    generator: F,
) -> Result<Arc<G>>
where
    F: FnOnce() -> Result<G>,
    G: Send + Sync,
{
    info!("trying parameters memory cache for: {}", &identifier);
    {
        let cache = cache_ref.lock().expect("poisoned cache");
        if let Some(entry) = cache.get(&identifier) {
            info!("found params in memory cache for {}", &identifier);
            return Ok(entry.clone());
        }
    }
    info!("no params in memory cache for {}", &identifier);
    // The lock was released above, so threads that all miss will each run the
    // expensive generator concurrently...
    let new_entry = Arc::new(generator()?);
    let res = new_entry.clone();
    {
        // ...and only the insertion is guarded: the last writer wins.
        let mut cache = cache_ref.lock().expect("poisoned cache");
        cache.insert(identifier, new_entry);
    }
    Ok(res)
}

@cryptonemo
Collaborator

cc: @nikkolasg This refers to the specialize call run per pow2 proof aggregation count

@cryptonemo
Collaborator

Hrm, @Stebalien not sure that's the case, let me look, but that access is locked AFAIK

@cryptonemo
Collaborator

Ah, I think I see -- the cache insertion is guarded, but not the generation

@Stebalien
Member Author

Yep. And we probably don't want to hold a global lock here (maybe not terrible?).

@cryptonemo
Collaborator

They aren't global locks, they are specific to the type (see lines above that method)

@cryptonemo
Collaborator

@nikkolasg I still haven't gotten an answer on whether there's any way to re-use the calculations across specialize calls (since we always calculate out to the max we support). If so, we should chat about ways to do that before doing anything here, since it could more generally speed things up.

@cryptonemo self-assigned this Jul 8, 2021
@cryptonemo
Collaborator

Noting also that the reason we've never run into this as an issue is that we have real parameter files that are never generated (so the generation path hasn't been exercised since mainnet params became available). While we don't generate fake SRS/IPP data either, we do take the time to run a compute-intensive specialize call, which exposes this kind of cache-usage breakage. Unfortunately this was overlooked, since specialize performance wasn't expected to be a significant issue.

@Stebalien
Member Author

> They aren't global locks, they are specific to the type (see lines above that method)

Global as in global for the entire cache (versus a lock for the specific key).

@cryptonemo
Collaborator

I ran a similar bench, but of course I warmed it up by running once before spawning threads. Is this an option in the short term if the first load is a significant issue (i.e. verify once before parallelizing)? There are two things we should discuss on our side: 1) is it possible to lock just once no matter how many proofs are being aggregated, since the long-running part of the computation is always the same, or 2) do we leave it as is and come up with some hacky method of locking once per number of proofs being aggregated?
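
A tiny sketch of that short-term workaround, for concreteness (everything here is hypothetical; `verify_aggregate` stands in for the real verifier, whose first call per size pays the key-generation cost and warms the cache):

use std::thread;

// Hypothetical stand-in for the real aggregate-proof verifier.
fn verify_aggregate(batch: usize) -> bool {
    let _ = batch;
    true
}

fn verify_many(batches: Vec<usize>) -> bool {
    match batches.split_first() {
        None => true,
        Some((first, rest)) => {
            // Warm up: verify one batch serially so the keys get cached...
            if !verify_aggregate(*first) {
                return false;
            }
            // ...then fan out; the remaining verifications hit the cache.
            let handles: Vec<_> = rest
                .iter()
                .map(|&b| thread::spawn(move || verify_aggregate(b)))
                .collect();
            handles.into_iter().all(|h| h.join().unwrap())
        }
    }
}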

@cryptonemo
Collaborator

If the answer to 1 is yes (cc: @nikkolasg), it could potentially be solved completely in bellman (where that data is re-used across calls); if it's 2, it could potentially be solved either in proofs or in lotus. For 1, it could be too memory intensive, or it adds another hacky cache file on disk, which is problematic since, like params, it can be corrupted and so needs verification, etc. For 2, it effectively serializes first-time parallel verifies for each pow2 number of proofs, but keeps computation potentially high on slower machines for some time (i.e. the original problem we had with specialize, where it could take 100s several times over). I personally prefer 1, with a memory-only cache, but I don't know how much that would affect things (i.e. how big that would be).

@Stebalien
Member Author

> Is it possible to lock just once no matter how many proofs are being aggregated, since the long-running part of the computation is always the same

This should be fine. Under normal circumstances, lotus will only verify one aggregate at a time.

However, we should compute these ahead-of-time, at compile time, or maybe even when initializing the repo (storing them on disk). We'd expect to compute each one at least once anyways. At the moment, a miner could get caught off guard having to compute multiple of these things on-demand and fail to produce a block in time.

Basically, we save nothing by computing them on-demand.
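
For instance, a rough, self-contained sketch of that idea (all names hypothetical; `specialize` stands in for the ~100s per-size key computation):

use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-ins: `SrsKey` for a specialized verifier key, and
// `specialize` for the expensive per-size computation.
struct SrsKey(usize);
fn specialize(proof_count: usize) -> SrsKey {
    SrsKey(proof_count)
}

// Warm every supported power-of-two size once, at startup or repo init, so a
// miner never pays the specialize cost on the block-production path. The
// bounds here are made up; the real ones depend on what the network supports.
fn warm_key_cache(cache: &Mutex<HashMap<usize, Arc<SrsKey>>>) {
    for log2 in 3..=13 {
        let count = 1usize << log2;
        let key = Arc::new(specialize(count));
        cache.lock().expect("poisoned cache").insert(count, key);
    }
}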

@dignifiedquire
Contributor

dignifiedquire commented Jul 9, 2021

I wonder if we might want to expose a function to the node, allowing it to trigger generation for specific sizes/all sizes. This would allow better control from that side.

For the cache itself I would recommend OnceCell, which I usually use in cases like this, so that the cache becomes something like

use lazy_static::lazy_static;
use once_cell::sync::OnceCell;
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Debug, Default)]
pub struct Cache<V> {
    data: HashMap<String, OnceCell<Arc<V>>>,
}

impl<V> Cache<V> {
    pub fn with_defaults(keys: &[&str]) -> Self {
        let mut data = HashMap::new();
        for key in keys {
            data.insert(key.to_string(), OnceCell::new());
        }

        Self { data }
    }

    /// Returns `None` for non existent entries, `Some(v)` for existing ones, where `v` is either
    /// the result of running `f` or already existing one.
    pub fn get_or_init<F>(&self, key: &str, f: F) -> Option<Arc<V>>
    where
        F: FnOnce() -> V,
    {
        self.data
            .get(key)
            .map(|cell| cell.get_or_init(|| Arc::new(f())))
            .cloned()
    }
}

lazy_static! {
    // can only cache "a" and "b", but avoids having a global mutex on the cache.
    static ref MY_CACHE: Cache<u32> = Cache::with_defaults(&["a", "b"]);
}

pub fn cache_lookup<V, F>(cache_ref: &Cache<V>, identifier: &str, generator: F) -> Arc<V>
where
    F: FnOnce() -> V,
    V: Send + Sync,
{
    if let Some(v) = cache_ref.get_or_init(identifier, generator) {
        return v;
    }
    panic!("unknown identifier {}", identifier);
}
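
A quick usage sketch of the above (values invented): because every cell is created up front in `with_defaults`, the map itself is never mutated after initialization, so lookups take no lock at all and each generator runs exactly once per key.

fn main() {
    // First lookup for "a" runs the closure and caches the result.
    let v = cache_lookup(&MY_CACHE, "a", || 42);
    assert_eq!(*v, 42);

    // A later lookup (from this or any other thread) reuses the cached
    // value; its closure is never run.
    let w = cache_lookup(&MY_CACHE, "a", || unreachable!());
    assert_eq!(*w, 42);
}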

@Stebalien
Member Author

> For the cache itself I would recommend OnceCell, which I usually use in cases like this, so that the cache becomes something like

(ok, I swear I tried to find that exact library and failed)
