This repository has been archived by the owner on Jul 15, 2023. It is now read-only.

cccheck: Cached analysis gets horrendously slow over time #423

Open
yaakov-h opened this issue May 18, 2016 · 1 comment
Comments

@yaakov-h
Contributor

For database-based caching, cccheck uses Entity Framework with lazy-loading. This means that whenever Method.Assemblies is touched, it loads data for every single assembly associated with that method. An example of a piece of code that triggers this data load is here. The query that this expression generates is:

exec sp_executesql N'SELECT 
[Extent2].[AssemblyId] AS [AssemblyId], 
[Extent2].[Name] AS [Name], 
[Extent2].[Created] AS [Created], 
[Extent2].[IsBaseLine] AS [IsBaseLine], 
[Extent2].[SourceControlInfo] AS [SourceControlInfo]
FROM  [dbo].[AssemblyInfoMethods] AS [Extent1]
INNER JOIN [dbo].[AssemblyInfo] AS [Extent2] ON [Extent1].[AssemblyInfo_AssemblyId] = [Extent2].[AssemblyId]
WHERE [Extent1].[Method_Id] = @EntityKeyValue1',N'@EntityKeyValue1 bigint',@EntityKeyValue1=4
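
As a rough illustration only (this is not cccheck code), the sketch below reproduces the shape of that query against an in-memory SQLite database and contrasts it with a keyed lookup that touches a single link row. The table and column names come from the query above; the helper names, the composite primary key, and the idea that a caller might only need a membership check are all assumptions.

```python
import sqlite3


def build_cache(link_count: int, method_id: int = 4) -> sqlite3.Connection:
    """Create an in-memory stand-in for the cache database with
    `link_count` assembly entries linked to a single method."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE AssemblyInfo (AssemblyId INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE AssemblyInfoMethods (
            AssemblyInfo_AssemblyId INTEGER NOT NULL,
            Method_Id INTEGER NOT NULL,
            -- Assumed key; the real schema may differ.
            PRIMARY KEY (Method_Id, AssemblyInfo_AssemblyId)
        );
    """)
    for assembly_id in range(1, link_count + 1):
        db.execute("INSERT INTO AssemblyInfo VALUES (?, ?)",
                   (assembly_id, f"Assembly{assembly_id}"))
        db.execute("INSERT INTO AssemblyInfoMethods VALUES (?, ?)",
                   (assembly_id, method_id))
    return db


def load_all_assemblies(db: sqlite3.Connection, method_id: int) -> list:
    """Same shape as the lazy-load query: one row per linked assembly."""
    return db.execute("""
        SELECT a.AssemblyId, a.Name
        FROM AssemblyInfoMethods AS m
        INNER JOIN AssemblyInfo AS a
            ON m.AssemblyInfo_AssemblyId = a.AssemblyId
        WHERE m.Method_Id = ?""", (method_id,)).fetchall()


def is_linked(db: sqlite3.Connection, method_id: int, assembly_id: int) -> bool:
    """Hypothetical alternative: check one (method, assembly) pair
    without materialising the whole collection."""
    row = db.execute("""
        SELECT 1 FROM AssemblyInfoMethods
        WHERE Method_Id = ? AND AssemblyInfo_AssemblyId = ?
        LIMIT 1""", (method_id, assembly_id)).fetchone()
    return row is not None
```

With `db = build_cache(1000)`, `load_all_assemblies(db, 4)` returns 1,000 rows, while `is_linked(db, 4, 1000)` reads at most one. Whether the real call sites could get by with a lookup like this has not been verified.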

An assembly entry in the database represents a unique assembly being analyzed. Entries are not keyed by name; they appear to be keyed by some sort of hash, though I'm not quite sure. What I do know is that:

  • Analyzing the built assemblies from a project multiple times does not result in new assembly entries.
  • Rebuilding and analyzing the assemblies from a project multiple times results in new assembly entries for each rebuild.

Therefore, each touch of Method.Assemblies loads O(M × N) records, where:

  • M is the number of assemblies that a given method appears in, per build, and
  • N is the number of times a project has been built for static analysis
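
The growth described by M and N above can be reproduced with a small simulation. This is a minimal sketch, assuming SQLite in place of the real cache database and assuming that every rebuild adds one new assembly entry per assembly, each linked to the same shared method; the function name is made up for illustration.

```python
import sqlite3


def rows_loaded_per_touch(assemblies_per_build: int, rebuilds: int) -> int:
    """Simulate `rebuilds` rebuilds of a project and return how many
    AssemblyInfo rows the lazy-load join returns for one shared method."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE AssemblyInfo (AssemblyId INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE AssemblyInfoMethods (
            AssemblyInfo_AssemblyId INTEGER, Method_Id INTEGER);
    """)
    method_id = 4  # arbitrary id for the method shared by every assembly
    for _ in range(rebuilds):
        for i in range(assemblies_per_build):
            # Each rebuild yields a fresh assembly entry (assumption based
            # on the observed behaviour), linked to the shared method.
            cur = db.execute("INSERT INTO AssemblyInfo (Name) VALUES (?)",
                             (f"Assembly{i}",))
            db.execute("INSERT INTO AssemblyInfoMethods VALUES (?, ?)",
                       (cur.lastrowid, method_id))
    rows = db.execute("""
        SELECT a.AssemblyId, a.Name
        FROM AssemblyInfoMethods AS m
        INNER JOIN AssemblyInfo AS a
            ON m.AssemblyInfo_AssemblyId = a.AssemblyId
        WHERE m.Method_Id = ?""", (method_id,)).fetchall()
    db.close()
    return len(rows)
```

For example, `rows_loaded_per_touch(5, 1)` returns 5 and `rows_loaded_per_touch(5, 100)` returns 500, so the cost of a single touch keeps growing with every rebuild until the cache is cleaned.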

When analyzing the method System.Diagnostics.Contracts.ContractDeclarativeAssemblyAttribute.#ctor(), for example, which is added into every assembly that gets statically checked, cccheck loads in a tonne of records. At the scale that my team is operating at, with multiple analyses per day of projects containing multiple assemblies, the number of records loaded grows by about 10,000 for each day since the cache was last cleaned.

This causes an enormous slowdown over time. A full build with a fresh cache takes my team about 15-20 minutes, but over time this can grow to 60-80 minutes with a very large cache database. For comparison, it takes around 60 minutes without a cache at all.

Looking at the code in Clousot, I don't see any way to fix this performance hit without rewriting the caching layer from scratch.

@SergeyTeplyakov @hubuk Any ideas on how to make this work faster?

@SergeyTeplyakov
Contributor

I'm not that familiar with that piece of code, so I can't give any reasonable advice. Maybe cleaning this stuff up and rewriting it from scratch is not such a bad idea.
