lookup buildconfigs in shared indexed cache instead of listing #10923

Merged
merged 1 commit into from
Sep 28, 2016

bparees
Contributor

@bparees bparees commented Sep 15, 2016

fixes #10616

@bparees bparees force-pushed the indexed_buildconfig_cache branch 3 times, most recently from 6a58a10 to e5a4636 Compare September 15, 2016 08:20
@bparees
Contributor Author

bparees commented Sep 15, 2016

@Kargakis @smarterclayton ptal

@bparees
Contributor Author

bparees commented Sep 15, 2016

[test]

@bparees
Contributor Author

bparees commented Sep 15, 2016

[testextended][extended:core(builds)]

},
}
}

func (factory *ImageChangeControllerFactory) waitForSyncedStores() {
for !factory.BuildConfigIndexSynced() {
Contributor Author

@Kargakis this appears to never return true... am I missing some additional initialization step?

Contributor

I would suggest rewriting the controller to use Informers instead of the RunnableController framework. Currently nothing initializes the cache, so it never syncs. By using the shared informer framework, the master will do it for you.

Contributor Author

yeah i'm not signing up for rewriting the controllers just yet, that's a much larger piece of work so for now i just want to fix the performance problem.

when i remove the sync check, it actually does work, so the cache is getting properly populated. I couldn't tell from your other PR what is supposed to trigger the cache to be initialized..I assumed since it was a shared informer, that should already be happening even if i'm not using it...

@bparees bparees force-pushed the indexed_buildconfig_cache branch 2 times, most recently from 94fe34b to e30d636 Compare September 15, 2016 10:35
@@ -291,7 +291,9 @@ func (c *MasterConfig) RunBuildPodController() {
func (c *MasterConfig) RunBuildImageChangeTriggerController() {
bcClient, _ := c.BuildImageChangeTriggerControllerClients()
bcInstantiator := buildclient.NewOSClientBuildConfigInstantiatorClient(bcClient)
factory := buildcontrollerfactory.ImageChangeControllerFactory{Client: bcClient, BuildConfigInstantiator: bcInstantiator}
bcIndex := &oscache.StoreToBuildConfigListerImpl{c.Informers.BuildConfigs().Indexer()}
bcIndexSynced := c.Informers.BuildConfigs().Informer().HasSynced
Contributor Author

@Kargakis the informer is being created/initialized here, no?

Contributor

This method starts all shared informers and it's called in master here.

Contributor Author

right, so why is my cache never synced? my informer should be one of the ones getting started, right?

@bparees bparees force-pushed the indexed_buildconfig_cache branch from e30d636 to dfc6fdc Compare September 15, 2016 15:21
@smarterclayton
Contributor

You have to call .Informer() prior to RunBuildPodController - otherwise
your informer doesn't get started.
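
The ordering constraint described here can be illustrated with a minimal sketch. It assumes a factory that registers informers lazily on first request and whose `Start` only starts what has been registered so far; the `factory` and `informer` types below are hypothetical stand-ins, not the actual origin or Kubernetes implementations.

```go
package main

import "fmt"

// informer is a hypothetical stand-in for a shared informer.
type informer struct {
	started bool
}

// HasSynced reports true only once the informer has been started
// (a real informer would also have to finish its initial list).
func (i *informer) HasSynced() bool { return i.started }

// factory is a hypothetical shared informer factory that creates and
// registers informers lazily, on first request.
type factory struct {
	informers map[string]*informer
}

func newFactory() *factory {
	return &factory{informers: map[string]*informer{}}
}

// Informer returns the informer for a resource, registering it if needed.
func (f *factory) Informer(resource string) *informer {
	if inf, ok := f.informers[resource]; ok {
		return inf
	}
	inf := &informer{}
	f.informers[resource] = inf
	return inf
}

// Start starts only the informers that have been requested so far.
func (f *factory) Start() {
	for _, inf := range f.informers {
		inf.started = true
	}
}

// syncedAfterStart reports whether the informer ends up synced, depending
// on whether it was requested before or after Start was called.
func syncedAfterStart(requestBeforeStart bool) bool {
	f := newFactory()
	if requestBeforeStart {
		f.Informer("buildconfigs")
	}
	f.Start()
	return f.Informer("buildconfigs").HasSynced()
}

func main() {
	fmt.Println("requested before Start:", syncedAfterStart(true))
	fmt.Println("requested after Start: ", syncedAfterStart(false))
}
```

Under these assumptions, an informer first requested after `Start` is never started, so a loop polling its `HasSynced` would spin forever.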


@bparees
Contributor Author

bparees commented Sep 15, 2016

You have to call .Informer() prior to RunBuildPodController - otherwise your informer doesn't get started.

@smarterclayton i am:
https://github.com/openshift/origin/pull/10923/files#diff-232120b6845c4d46e68c543a21dbf8c8R295

the issue appears to be that despite putting this in a goroutine:
https://github.com/openshift/origin/pull/10923/files#diff-232120b6845c4d46e68c543a21dbf8c8R297

everything hangs inside the waitForSync called within the Create() function, and the sync never occurs unless the main thread is able to make progress. I've even added runtime.Gosched() calls inside my wait loop, but it seems to never give up control so it can return.

@bparees bparees force-pushed the indexed_buildconfig_cache branch from dfc6fdc to 584b411 Compare September 15, 2016 18:29
factory := buildcontrollerfactory.ImageChangeControllerFactory{Client: bcClient, BuildConfigInstantiator: bcInstantiator, BuildConfigIndex: bcIndex, BuildConfigIndexSynced: bcIndexSynced}
go func() {
	factory.Create().Run()
}()
Contributor Author

@smarterclayton @Kargakis this was the issue. go factory.Create().Run() hangs everything because the Create() call (which is where i'm doing the wait for sync operation) is not run in a goroutine. sigh.
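
The behavior behind this is that in a Go statement of the form `go f().Run()`, the receiver expression `f()` is evaluated in the calling goroutine and only `Run` is started concurrently. A minimal sketch follows; `create`, `controller`, and `recorder` are hypothetical stand-ins for the factory and controller in this PR, and the `gate` channel exists only to make the demo's event order deterministic.

```go
package main

import (
	"fmt"
	"sync"
)

// recorder collects events in the order they happen.
type recorder struct {
	mu     sync.Mutex
	events []string
}

func (r *recorder) add(e string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.events = append(r.events, e)
}

// controller is a hypothetical stand-in for what Create() returns.
type controller struct {
	rec  *recorder
	gate <-chan struct{}
	done *sync.WaitGroup
}

// Run waits for the gate so the recorded order is deterministic.
func (c controller) Run() {
	<-c.gate
	c.rec.add("run")
	c.done.Done()
}

// create stands in for factory.Create(); the real one may block on a sync.
func create(rec *recorder, gate <-chan struct{}, done *sync.WaitGroup) controller {
	rec.add("create")
	return controller{rec: rec, gate: gate, done: done}
}

// direct uses `go create(...).Run()`: create runs in the caller,
// and only Run is started in a new goroutine.
func direct() []string {
	rec, gate := &recorder{}, make(chan struct{})
	var done sync.WaitGroup
	done.Add(1)
	go create(rec, gate, &done).Run()
	rec.add("caller continues")
	close(gate)
	done.Wait()
	return rec.events
}

// wrapped moves the whole expression into the goroutine, so create
// no longer runs in (or blocks) the caller.
func wrapped() []string {
	rec, gate := &recorder{}, make(chan struct{})
	var done sync.WaitGroup
	done.Add(1)
	go func() {
		<-gate // demo only: let the caller record its event first
		create(rec, gate, &done).Run()
	}()
	rec.add("caller continues")
	close(gate)
	done.Wait()
	return rec.events
}

func main() {
	fmt.Println("direct: ", direct())
	fmt.Println("wrapped:", wrapped())
}
```

In `direct`, "create" is recorded before the caller continues; in `wrapped`, the caller continues first. If `create` blocked indefinitely, `direct` would hang its caller while `wrapped` would not.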

Contributor

You should not have to call Run(). master.Start should be calling Start for you after you exit. If that's not the case, debug until you find out why.

Contributor

Ok, I was looking at something else before. This is fine.

@smarterclayton
Contributor

Do not call factory.Create().Run() yourself. Load the informer in master
config.


@bparees
Contributor Author

bparees commented Sep 15, 2016

Hm?

That factory is for the controller, not the informer, it's the same pattern
all the controllers follow (create controller, via factory or otherwise,
and call run)

And the informer is being created the same way the other controllers are
doing it, which is in masterconfig...


func (factory *ImageChangeControllerFactory) waitForSyncedStores() {
	for !factory.BuildConfigIndexSynced() {
		glog.V(4).Infof("Waiting for the bc caches to sync before starting the imagechange buildconfig controller worker")
		<-time.After(StoreSyncedPollPeriod)
Contributor

select between this and Stop
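
One way to read this suggestion is a wait loop that selects between the poll timer and a stop channel, so the wait can be abandoned at shutdown. The sketch below assumes a stop channel is available to the factory; `waitForSync` and its parameters are hypothetical names, not the code that was merged.

```go
package main

import (
	"fmt"
	"time"
)

// waitForSync polls synced until it returns true or stop is closed.
// It returns true if the store synced and false if it was stopped first.
func waitForSync(synced func() bool, stop <-chan struct{}, period time.Duration) bool {
	for !synced() {
		select {
		case <-time.After(period):
			// Poll period elapsed; check synced again.
		case <-stop:
			return false
		}
	}
	return true
}

// demoSyncs simulates a store that reports synced on the third check.
func demoSyncs() bool {
	calls := 0
	synced := func() bool { calls++; return calls >= 3 }
	return waitForSync(synced, make(chan struct{}), time.Millisecond)
}

// demoStops simulates a store that never syncs with stop already closed.
func demoStops() bool {
	stop := make(chan struct{})
	close(stop)
	return waitForSync(func() bool { return false }, stop, time.Minute)
}

func main() {
	fmt.Println("synced:", demoSyncs())
	fmt.Println("synced:", demoStops())
}
```

With the stop case in place, a store that never syncs no longer keeps the caller in the loop after shutdown is requested.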

return configs, nil
}

func (s *StoreToBuildConfigListerImpl) buildconfigs(namespace string) storebuildconfigsNamespacer {
Contributor

Should be exported.

@@ -15,6 +17,40 @@ const (
func ImageStreamReferenceIndexFunc(obj interface{}) ([]string, error) {
switch t := obj.(type) {
// TODO: Add support for build configs
Contributor

remove TODO

@bparees bparees force-pushed the indexed_buildconfig_cache branch from 584b411 to 7e63abc Compare September 16, 2016 10:14
@bparees
Contributor Author

bparees commented Sep 16, 2016

@Kargakis thanks, comments addressed.

@bparees
Contributor Author

bparees commented Sep 16, 2016

flake #10951
[test]

@openshift-bot
Contributor

openshift-bot commented Sep 16, 2016

continuous-integration/openshift-jenkins/test Waiting: Determining build queue position

return nil, err
}
if !exists {
return nil, kapierrors.NewNotFound(buildapi.Resource("BuildConfig"), name)
Contributor

should be "buildconfigs"

Contributor

Yes - anyplace you see "Resource" should be treated like the lower case name as seen by the client.

Contributor Author

has been changed.

}

var configs []*buildapi.BuildConfig

Contributor

no newline here

}

func (s *StoreToBuildConfigListerImpl) BuildConfigs(namespace string) storebuildconfigsNamespacer {
return storebuildconfigsNamespacer{s.Indexer, namespace}
Contributor

storeBuildConfigsNamespacer (camelCase)

Contributor Author

will do.

// GetConfigsForImageStreamTrigger returns all the build configs that are triggered by the provided image stream.
// It looks them up via the ImageStreamReferenceIndex (build configs are indexed in the cache
// by image stream references).
func (s *StoreToBuildConfigListerImpl) GetConfigsForImageStreamTrigger(stream *imageapi.ImageStream) ([]*buildapi.BuildConfig, error) {
Contributor

I don't love taking an image stream here - it would be better to take a stream namespace and name.

Contributor

I don't love taking an image stream here. I'd prefer these look like StoreToReplicationControllerLister. DeploymentConfig is old.

Contributor Author

I don't see an equivalent example in StoreToReplicationControllerLister, GetPodControllers() for example takes a Pod object, not a namespace+podname. Which is done because it needs the pod labels. But there's no other similar "look up by this other index object" example.

I kind of prefer this existing signature because it also makes it clear the object you are to pass in must be an ImageStream, whereas just taking (string,string) does not make that as obvious. It also leaves us the freedom to use other parts of the imagestream for the lookup in the future w/o refactoring anything. What is the advantage/reason for preferring (namespace,name) as the signature?

Contributor

We're trying to converge these accessors on the client interfaces, so taking namespace and name is closer to the actual goal. Also creating the object in some contexts is silly (create a whole deployment config to provide a name and namespace) and can actually hide the fact that only namespace and name should be used.

Contributor Author

@deads2k - @smarterclayton says you get the final say on this. see discussion above.

Contributor

@deads2k - @smarterclayton says you get the final say on this. see discussion above.

Namespace and name please. For the same reasons clayton listed.

Contributor Author

will change.

// TODO: Add support for build configs
case *buildapi.BuildConfig:
var keys []string

Contributor

no newline

Contributor Author

i'll delete the one in the deploymentconfig case just below this as well.... though personally i like a newline separation between variable declarations and code logic.

// uses the input image of the buildconfig as the imagestream
// to trigger off, so we need to look that up.
if from == nil {
from = buildutil.GetInputReference(t.Spec.Strategy)
Contributor

You need to index on source.

Contributor Author

what source?

Contributor

Build source

Contributor Author

i'm still confused. Why do i need to index on build source? we're not doing any lookups on build source currently. (and by build source what build source do you mean? git source? image input source? secret-as-source?)

Contributor

Don't you have to check .Spec.Source.Images in order to find the other triggers?

Contributor Author

no. will add a more explanatory comment in the code.

@bparees bparees force-pushed the indexed_buildconfig_cache branch from 7e63abc to 177b021 Compare September 16, 2016 22:38
@bparees
Contributor Author

bparees commented Sep 16, 2016

@smarterclayton changes i understood have been made. still not clear what you're saying about invoking Run() on the controller after creation, or what you mean about indexing on source.

@bparees bparees force-pushed the indexed_buildconfig_cache branch 2 times, most recently from 1ed11db to aba0916 Compare September 16, 2016 22:43
@bparees
Contributor Author

bparees commented Sep 17, 2016

flake #10773
[test]

@bparees
Contributor Author

bparees commented Sep 19, 2016

@smarterclayton pushed changes in a new commit. unresolved comments:

  • indexing on build source
  • whether indexer has a bug requiring empty keys to be returned in some cases
  • refactoring the buildconfig lookup by imagestream method signature to (string,string)

@bparees
Contributor Author

bparees commented Sep 21, 2016

@smarterclayton bump on #10923 (comment)

@bparees
Contributor Author

bparees commented Sep 26, 2016

@smarterclayton need to resolve #10923 (comment)

@bparees bparees force-pushed the indexed_buildconfig_cache branch from 3ee36a2 to f5a2cd8 Compare September 26, 2016 22:14
@bparees
Contributor Author

bparees commented Sep 26, 2016

@smarterclayton ok the comment block in index.go has been updated to explain what's going on with the imagetrigger indexing.

other than the open question of what the right signature is for GetConfigsForImageStreamTrigger(stream *imageapi.ImageStream), I think all your comments have been addressed.

@bparees bparees force-pushed the indexed_buildconfig_cache branch from f5a2cd8 to 85adf0f Compare September 27, 2016 13:51
@bparees
Contributor Author

bparees commented Sep 27, 2016

@smarterclayton this is ready for final review.

// instead it triggers on the image being used as the builder/base image
// as referenced in the build strategy, so if this is an ICT w/ no
// explicit image reference, use the image referenced by the strategy
// because this is the default ICT.
Contributor

As a note for the next version of the API, we should default this onto the ICT so that it's explicit rather than being implicit. Implicit creates more complicated code and complicates clients.

Contributor Author

i think we needed this because our first version of the ICT didn't even let you specify the image, so this default behavior was needed to maintain compatibility.
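
The implicit default discussed in this thread can be sketched as an index function that falls back to the strategy's image when an image change trigger (ICT) names no image. All types below are simplified, hypothetical stand-ins for the build API; the namespace defaulting and the `namespace/name` key format are assumptions, and image tags are ignored.

```go
package main

import "fmt"

// imageRef is a simplified, hypothetical image stream reference.
type imageRef struct{ Namespace, Name string }

// imageChangeTrigger has a nil From when it is the default ICT.
type imageChangeTrigger struct{ From *imageRef }

// buildConfig is a simplified, hypothetical build config.
type buildConfig struct {
	Namespace, Name string
	StrategyFrom    *imageRef // builder/base image used by the strategy
	Triggers        []imageChangeTrigger
}

// imageStreamKeys returns the index keys for the image streams that
// trigger this build config.
func imageStreamKeys(bc buildConfig) []string {
	var keys []string
	for _, t := range bc.Triggers {
		from := t.From
		if from == nil {
			// Default ICT: trigger on the image referenced by the strategy.
			from = bc.StrategyFrom
		}
		if from == nil {
			continue
		}
		ns := from.Namespace
		if ns == "" {
			// Assumption: an unqualified reference lives in the config's namespace.
			ns = bc.Namespace
		}
		keys = append(keys, ns+"/"+from.Name)
	}
	return keys
}

// sampleConfig has one default ICT and one explicit, unqualified ICT.
func sampleConfig() buildConfig {
	return buildConfig{
		Namespace:    "myproject",
		Name:         "ruby-app",
		StrategyFrom: &imageRef{Namespace: "openshift", Name: "ruby"},
		Triggers: []imageChangeTrigger{
			{},
			{From: &imageRef{Name: "base"}},
		},
	}
}

func main() {
	fmt.Println(imageStreamKeys(sampleConfig()))
}
```

If the default were written onto the ICT explicitly, as suggested for a future API version, the `from == nil` fallback would not be needed.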

@smarterclayton
Contributor

Please squash

@bparees bparees force-pushed the indexed_buildconfig_cache branch from 85adf0f to 58ef19d Compare September 27, 2016 15:10
@smarterclayton
Contributor

I think you broke the shared build config test making it flake now


@smarterclayton
Contributor

Removing merge until we can be sure it isn't impacted.

@bparees
Contributor Author

bparees commented Sep 27, 2016

I think you broke the shared build config test making it flake now

got a link to the failure?

@bparees
Contributor Author

bparees commented Sep 27, 2016

@smarterclayton so the failure on TestConcurrentBuildImageChangeTriggerControllers is because the cache never syncs. I don't know why that isn't affecting other imagechangetrigger tests... is there something about multiple controllers running that is going to stop the cache from reaching a synced state?

I0927 11:18:44.082349    4355 factory.go:323] Waiting for the bc caches to sync before starting the imagechange buildconfig controller worker

@bparees
Contributor Author

bparees commented Sep 27, 2016

@smarterclayton @deads2k i think the failure was caused because the namespace and name arguments were reversed when i switched the api from taking an imagestream. Which is another example of why i think that's a bad api and it's better to take the explicit object instead of 2 strings that can be reversed and cause subtle bugs.
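
The failure mode described here is easy to reproduce in a sketch: with a `(namespace, name string)` signature, swapped arguments still compile and the lookup just returns nothing. The `imageStreamIndex` type and the `namespace/name` key format below are hypothetical simplifications, not the lister from this PR.

```go
package main

import "fmt"

// imageStreamIndex is a hypothetical index from an image stream key
// to the names of the build configs it triggers.
type imageStreamIndex map[string][]string

func streamKey(namespace, name string) string { return namespace + "/" + name }

// configsFor looks up build configs by image stream namespace and name.
func (idx imageStreamIndex) configsFor(namespace, name string) []string {
	return idx[streamKey(namespace, name)]
}

// sampleIndex holds one build config triggered by myproject/ruby.
func sampleIndex() imageStreamIndex {
	return imageStreamIndex{streamKey("myproject", "ruby"): {"ruby-app"}}
}

func main() {
	idx := sampleIndex()
	fmt.Println(idx.configsFor("myproject", "ruby")) // arguments in the intended order
	fmt.Println(idx.configsFor("ruby", "myproject")) // swapped: compiles, finds nothing
}
```

Neither call reports an error, so a swap like this surfaces only as missing triggers, which is consistent with it showing up as a test failure rather than at compile time.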

@bparees bparees force-pushed the indexed_buildconfig_cache branch from 58ef19d to afbbd76 Compare September 27, 2016 18:35
@bparees
Contributor Author

bparees commented Sep 27, 2016

[merge]

@bparees bparees force-pushed the indexed_buildconfig_cache branch from afbbd76 to 373b27b Compare September 28, 2016 14:48
@openshift-bot
Contributor

Evaluated for origin testextended up to 373b27b

@openshift-bot
Contributor

Evaluated for origin test up to 373b27b

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/testextended SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin_extended/529/) (Extended Tests: core(builds))

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/9406/)

@openshift-bot
Contributor

openshift-bot commented Sep 28, 2016

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/9406/) (Image: devenv-rhel7_5093)

@openshift-bot
Contributor

Evaluated for origin merge up to 373b27b

@bparees
Contributor Author

bparees commented Sep 28, 2016

[merge]

@openshift-bot openshift-bot merged commit 068f174 into openshift:master Sep 28, 2016
@bparees bparees deleted the indexed_buildconfig_cache branch September 28, 2016 21:04
Successfully merging this pull request may close these issues.

Build controller lists every build config on every image change
5 participants