Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Look into User Datasets issue "dataset does not contain both a dataset.json file and a meta.json file" #39

Open
dmgaldi opened this issue Jul 8, 2022 · 6 comments
Labels
bug Something isn't working

Comments

@dmgaldi
Copy link
Contributor

dmgaldi commented Jul 8, 2022

E-mail from Cristina with details:

see error below 20 mn ago today...
on a VectorBase user dataset gene list for user :
wdk_user = User #40

org.gusdb.wdk.model.WdkModelException: Dataset 4060910 for user 40 does not contain both a dataset.json file and a meta.json file

i checked and it does have the files...
so no idea why this is thrown... and no idea if it affects the UX in any way...

org.gusdb.wdk.model.WdkModelException: Dataset 4060910 for user 40 does not contain both a dataset.json file and a meta.json file
       at org.gusdb.wdk.model.user.dataset.irods.IrodsUserDatasetSession$Err.datasetMissingMeta(IrodsUserDatasetSession.java:794)
       at org.gusdb.wdk.model.user.dataset.irods.IrodsUserDatasetSession.collectionToDataset(IrodsUserDatasetSession.java:626)
       at org.gusdb.wdk.model.user.dataset.irods.IrodsUserDatasetSession.getUserDatasets(IrodsUserDatasetSession.java:200)
       at org.gusdb.wdk.service.service.user.UserDatasetService.getAllUserDatasetsJson(UserDatasetService.java:82)
       at org.gusdb.wdk.service.service.user.UserDatasetService.getAllUserDatasets(UserDatasetService.java:64)
       at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.base/java.lang.reflect.Method.invoke(Method.java:566)
       at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
       at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
       at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
       at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205)
       at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
       at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
       at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
       at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
       at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
       at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
       at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
       at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
       at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
       at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
       at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
       at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
       at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
       at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
       at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
       at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
       at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
       at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
       at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
       at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
       at org.tuckey.web.filters.urlrewrite.RuleChain.handleRewrite(RuleChain.java:176)
       at org.tuckey.web.filters.urlrewrite.RuleChain.doRules(RuleChain.java:145)
       at org.tuckey.web.filters.urlrewrite.UrlRewriter.processRequest(UrlRewriter.java:92)
       at org.tuckey.web.filters.urlrewrite.UrlRewriteFilter.doFilter(UrlRewriteFilter.java:389)
       at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
       at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
       at org.eupathdb.common.controller.ProjectIdMacroFilter.doFilter(ProjectIdMacroFilter.java:39)
       at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
       at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
       at org.eupathdb.common.controller.BetaLegacyFilter.doFilter(BetaLegacyFilter.java:53)
       at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
       at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
       at org.gusdb.wdk.controller.filter.HttpResponseHeaderLogger.doFilter(HttpResponseHeaderLogger.java:49)
       at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
       at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
       at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)
       at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
       at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
       at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
       at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
       at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
       at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:223)
       at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:311)
       at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:793)
       at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:722)
       at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:915)
       at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
       at java.base/java.lang.Thread.run(Thread.java:829)

@dmgaldi dmgaldi added the bug Something isn't working label Jul 8, 2022
@dmgaldi
Copy link
Contributor Author

dmgaldi commented Jul 11, 2022

Here's the code that is throwing. It could be some kind of eventual consistency issue with IRODS is it seems like the code isn't finding the files but we're finding them after the fact.

if (!dsCol.contains(AVU_DATASET_JSON) || !dsCol.contains(AVU_META_JSON))
throw Err.datasetMissingMeta(dsId, usId);

@dmgaldi
Copy link
Contributor Author

dmgaldi commented Jul 12, 2022

It looks like the tarball is unpacked into a staging area and then rsynced into the users directory.

https://github.com/VEuPathDB/EuPathDBIrods/blob/master/Scripts/ud.re#L119-L155

We assume when getting a list of user datasets for a user, that if the directory exists, it is a valid dataset. It's possible that the rsync is partially complete without the meta.json and dataset.json files existing which could cause us to fail to list the directory.

// Using a lazy iterator rather than a stream since we throw exceptions
final Iterator<ICatCollection> datasets = col.streamCollectionsShallow()
.iterator();

It's worth checking if this particular dataset was listed soon after it's creation, or to check the IRODS rules logs if we have them to see if they corroborate this picture.

@dmgaldi
Copy link
Contributor Author

dmgaldi commented Jul 13, 2022

Looks like the error came in on 08/Jul/2022|08:30:26

@dmgaldi
Copy link
Contributor Author

dmgaldi commented Jul 13, 2022

Looks like the write to IRODS came in 2 seconds earlier:

ESC[mESC[36m2022-07-08 08:30:23.140 [rid:     ] DEBUG Irods:47 - writing dataset dataset_u40_t1657283421542_p38.tgz to iRODS /ebrc/workspaces/lz
ESC[mESC[36m2022-07-08 08:30:24.007 [rid:     ] DEBUG Irods:65 - writing flag /ebrc/workspaces/flags/dataset_u40_t1657283421542_p38.txt to iRODS /ebrc/workspaces/flags

@dmgaldi
Copy link
Contributor Author

dmgaldi commented Jul 13, 2022

We confirmed that the genelist.txt file was created in the IRODS workspace at 08/Jul/2022|08:30:27 which is one second after the the request came in and two seconds after the tarball was written. This more or less confirms the above theory.

Next step is to follow up with front-end folks to see if they have any idea why we might be running into this issue now since theoretically this bug would have been there forever.

@aurreco-uga
Copy link
Member

i experienced this myself.. i think when you go quickly after upload to the All datasets page... it is clearly some kind of our of sync situation.. need more detail

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants