[WIP] Caching exposed modules for Hackage packages in Pantry DB #4624

Closed
qrilka wants to merge 1 commit from the cache-package-modules branch

Conversation


@qrilka qrilka commented Mar 14, 2019

Note: Documentation fixes for https://docs.haskellstack.org/en/stable/ should target the "stable" branch, not master.

Please include the following checklist in your PR:

  • Any changes that could be relevant to users have been recorded in the ChangeLog.md
  • The documentation has been updated, if necessary.

Tested manually.

@qrilka qrilka requested a review from snoyberg March 14, 2019 11:36

qrilka commented Mar 14, 2019

Fixes #4536


qrilka commented Mar 14, 2019

@snoyberg could you take a look at this variant? On my machine for stack script --resolver=lts-13.0 hmm.hs from #4536 I get the following:

version                          time
master                           4.4s
this PR, migrated, not cached    5.4s
this PR, cached modules          1.2s
1.9.3                            0.4s

From the logs it looks like lock files could give a bit more improvement, and we could probably get down to something like a 2x slowdown compared to the current stable version. I also have an idea to store the module list as a blob (maybe even in the hackage_cabal table) - this could save us some time by avoiding some of the lookups.
Do you think it makes sense to explore that option?
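
A minimal sketch of what that blob-based variant could look like as a persistent entity - the entity name, field names, and serialization choice here are hypothetical, not part of this PR:

-- Hypothetical entity: one row per cabal file, with the whole module
-- list serialized into a single blob so a lookup needs no per-module rows.
HackageModuleList
    cabal BlobId
    modules ByteString -- e.g. a serialized list of module names
    UniqueHackageModuleList cabal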

@snoyberg snoyberg left a comment

We can discuss this if the overall idea isn't clear here. But: instead of performing so much logic on the Haskell side, what about pushing the lookup logic into the SQLite database? I'm picturing something like this:

SnapshotModuleCache
    someUniqueSnapshotIdentifier Text

SnapshotPackage
    snapshot SnapshotModuleCacheId
    cabal BlobId
    UniqueSnapshotPackage snapshot cabal

PackageExposedModule
    cabal BlobId
    module ModuleId
    UniquePackageExposedModule cabal module

Then on the Haskell side:

-- Ensures that the tables are filled in for a snapshot
populateSnapshotModuleCache :: Snapshot -> RIO env ()

-- Find all packages which contain the given module name
findPackagesWithModule :: Snapshot -> ModuleName -> RIO env [PackageName]

I have the details very wrong above, but I think with this you could:

  • Have a relatively slow populate action, with a sticky log message explaining "populating module name cache." This will happen once, and the first usage of a snapshot is already expected to be a bit slow, what with downloading the snapshot file and such.
  • Have a much smaller surface area for data transfer between Haskell and SQLite for each module name lookup (a possible shape for that lookup is sketched below)
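
Under the assumptions of that sketch (plus one more: that SnapshotPackage also records the package name, which snoyberg's version does not spell out), the lookup half could be a single raw query, with persistent's rawSql marshalling only the matching names back to Haskell. This is illustrative, not the PR's actual code:

{-# LANGUAGE OverloadedStrings #-}

import Database.Persist.Sql (Single (..), SqlBackend, rawSql, toPersistValue)
import RIO

-- SnapshotModuleCacheId comes from the sketched schema above;
-- ModuleNameP and PackageNameP are Pantry's persistable name wrappers.
findPackagesWithModule
  :: SnapshotModuleCacheId
  -> ModuleNameP
  -> ReaderT SqlBackend (RIO env) [PackageNameP]
findPackagesWithModule sid mname =
  map unSingle <$> rawSql
    -- The whole join runs inside SQLite, so only the matching package
    -- names cross the Haskell/SQLite boundary.
    "SELECT sp.package_name \
    \FROM snapshot_package AS sp \
    \JOIN package_exposed_module AS pem ON pem.cabal = sp.cabal \
    \JOIN module AS m ON m.id = pem.module \
    \WHERE sp.snapshot = ? AND m.name = ?"
    [toPersistValue sid, toPersistValue mname]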

@@ -62,6 +62,7 @@ dependencies:
 - resourcet
 - rio-prettyprint
 - mtl
+- extra
snoyberg commented on this diff:

I think we've been trying to avoid adding dependencies on these kinds of packages.

-- Cache of modules exposed by a Hackage package
HackageExposedModules
    cabal BlobId
    moduleName P.ModuleNameP
snoyberg commented on this diff:

Since there are likely to be a lot of duplicated module names, this would probably benefit from normalization, the same way we normalize package names. How about:

    moduleName ModuleId

Module
    name P.ModuleName
    UniqueModule name

Or something like that?
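
With that normalized table, population needs a get-or-insert step so that duplicate module names resolve to the same row. A minimal sketch using persistent's insertBy, which relies on the UniqueModule constraint to detect duplicates - the helper name is made up, not part of this PR:

import Database.Persist (entityKey, insertBy)
import Database.Persist.Sql (SqlBackend)
import RIO

-- Hypothetical helper: return the existing ModuleId when the name is
-- already interned, otherwise insert a new Module row and return its key.
getModuleId :: P.ModuleName -> ReaderT SqlBackend (RIO env) ModuleId
getModuleId name = either entityKey id <$> insertBy (Module name)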

@mpilgrem mpilgrem deleted the cache-package-modules branch October 16, 2022 14:19