
Check whether the calculation of the 'add_time' is incorrect? #34

Open
yipengLeo opened this issue Sep 11, 2024 · 2 comments

Comments

@yipengLeo

I found that the `_get_block_execution_time()` function in `vidur/entities/execution_time.py` counts `add_time` only once:

def _get_block_execution_time(self) -> float:
    return (
        self._get_attention_layer_execution_time()
        + self._get_mlp_layer_execution_time()
        + self._add_time
    )

But elsewhere, for example in `vidur/metrics/metrics_store.py`, `add_time` is counted twice:

self._push_metric(
    OperationMetrics.ADD, batch_id, execution_time.add_time * 2
)

Is this a bug?

@AgrawalAmey
Contributor

@yipengLeo thanks for pointing this out. As far as I remember, there should be two add operations per block in Llama models; let me look into this.

@nitinkedia7
Collaborator

Hi @yipengLeo, there is indeed a bug here. However, the impact is relatively small, since `add_time` is only a small percentage of the total execution time.

For all the models supported right now except phi-2, there are two add operations in each layer.
(only phi-2 has post_attn_norm: false in its model config)
1st add: https://github.com/microsoft/vidur/blob/main/vidur/profiling/mlp/mlp_impl.py#L172
2nd add: https://github.com/microsoft/vidur/blob/main/vidur/profiling/mlp/mlp_impl.py#L179

So, if we ignore phi-2, it would be more accurate to count `add_time` twice. A proper fix would account for these slight differences between model architectures. Please feel free to raise a PR fixing this!
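A minimal sketch of what such a fix could look like, deriving the number of add operations from the model's `post_attn_norm` setting mentioned above. Note this is an illustration only: the class shape, constructor parameters, and the `post_attn_norm` argument here are assumptions for the example, not the actual vidur API.

```python
class ExecutionTimeSketch:
    """Illustrative sketch: per-block execution time that scales the
    residual-add cost by the number of add operations in the layer."""

    def __init__(self, attention_time: float, mlp_time: float,
                 add_time: float, post_attn_norm: bool = True) -> None:
        self._attention_time = attention_time
        self._mlp_time = mlp_time
        self._add_time = add_time
        # Models with a post-attention norm (all currently supported
        # models except phi-2) perform two residual adds per layer.
        self._num_add_ops = 2 if post_attn_norm else 1

    def _get_block_execution_time(self) -> float:
        return (
            self._attention_time
            + self._mlp_time
            + self._add_time * self._num_add_ops
        )
```

For example, with `attention_time=10.0`, `mlp_time=20.0`, and `add_time=1.0`, a Llama-style model (two adds) yields 32.0, while a phi-2-style model (`post_attn_norm=False`) yields 31.0.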
