-
Notifications
You must be signed in to change notification settings - Fork 176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[v6r17] Introduce meta-filters in TS #3180
Conversation
With the last commit I've moved the logic from the client to the server side as explained in the first point of the work plan: The integration test still works, but I've removed the unit tests introduced with the first commit, since they are not valid anymore. |
Concerning the three other points of the work plan, detailed here:
So, in principle the workplan is completed. Let me know your comments. |
WRITE_METHODS = FileCatalogClientBase.WRITE_METHODS + [ "addFile", "removeFile" ] | ||
WRITE_METHODS = FileCatalogClientBase.WRITE_METHODS + [ "addFile", "removeFile", "setMetadata" ] | ||
|
||
NO_LFN_METHODS = [ "setMetadata" ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is setMetadata in NO_LFN_METHODS
? It takes an LFN as first argument no ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I admit I'm not sure about the definition of 'NO_LFN_METHODS', I've put it here in analogy with the FileCatalogClient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is in the doc of FileCatalog:
The names of those methods are reported by the plug-ins as "no_lfn" methods
in the getInterfaceMethods() call. For those methods there is obviously no
additional check of the structure of the LFNs argument and no corresponding
processing of the results ```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, but anyway, we should be coherent between the different catalog plugins, shouldn't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are both right. The setMetadata should be a normal "Write" method following LFN conventions. This is not like that right now in FileCatalogClient. I suggest to merge this PR as it is and in the next release review the metadata related methods consistently for all the catalogs. I will make an issue on that in order not to forget it.
if addFiles and fileMask: | ||
self.__addExistingFiles( transID, connection = connection ) | ||
mqDict = json.loads( fileMask ) | ||
res = catalog.findFilesByMetadata( mqDict ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should test res['OK']
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in commit: 36b3a10
# Add the files to the transformations | ||
fileIDs = [] | ||
for lfn in filesToAdd: | ||
if lfnFileIDs.has_key( lfn ): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if lfn in lfnFileIDs
Even better and much faster, make a set intersection between filesToAdd
and lfnFileIDs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
@@ -479,6 +494,8 @@ def __deleteTransformationParameters( self, transID, parameters = None, connecti | |||
|
|||
def addFilesToTransformation( self, transName, lfns, connection = False ): | |||
""" Add a list of LFNs to the transformation directly """ | |||
gLogger.info( "TransformationDB.addFilesToTransformation: Attempting to add %s files." % lfns ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(lfns)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
res = catalog.getFileUserMetadata( lfn ) | ||
if not res['OK']: | ||
failed[lfn] = res['Message'] | ||
return S_OK( {'Successful':successful, 'Failed':failed } ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you return after the first failure ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. Changed with:
if not res['OK']:
gLogger.error( "Failed to getFileUserMetadata for file", "%s: %s" % ( lfn, res['Message'] ) )
failed[lfn] = res['Message']
continue
gLogger.info( "setMetadata: Attempting to set metadata %s to %s" % (usermetadatadict, path) ) | ||
successful = {} | ||
failed = {} | ||
if isinstance( path, dict ): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what you are trying to achieve with this test ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found why I had introduced this test... I had some type mismatch problem, because I was using:
@checkCatalogArguments
with
def setMetadata( self, path, metadatadict ):
in Resources/Catalog/TSCatalogClient
So you are right, the test is not necessary.
filesToAdd = [] | ||
|
||
catalog = FileCatalog() | ||
isFile = catalog.isFile( path )['Value']['Successful'][path] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either test res['OK']
or use the .get
syntax
|
||
if not res['OK']: | ||
failed[path] = res['Message'] | ||
return S_OK( {'Successful':successful, 'Failed':failed } ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand now. You only take the first element of the directory, but still return it as a multi input structure ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure to understand your question. But one thing I have to check, is that I was assuming that the corresponding setMetadata method in the FileCatalogDB takes as input either a directory, either a single file, but not a list of files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is my question. But you only treat one element in our method, and return a dictionary of successful / failed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the method treats one element, as for the FileCatalogDB, which is either a file, either a directory. The reason why the method returns a dictionary of successful / failed is that, the effect of setMetadata in this case is to add to the transformations all the files that match the query conditions of the transformations. So the effect of setMetadata is on a set of files...
Does it make sense in this case to return a successful / failed dictionary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, because the Failed/Successful dict are indexed on the input path. Since there is a single one, I don't think it is worth having this extra level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I've just changed it in commit:
9faad26
with some other minor changes.
metadatadict.update( usermetadatadict ) | ||
gLogger.info( 'Filter file with metadata:', metadatadict ) | ||
fileTrans = self._filterFileByMetadata( metadatadict ) | ||
if not ( fileTrans ): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parenthesis are not needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
path = [path] | ||
else: | ||
res = catalog.findFilesByMetadata( metadatadict, path ) | ||
path = res['Value'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should test res['OK']
successful[path] = False | ||
elif isFile: | ||
filesToAdd.append( path ) | ||
path = [path] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you do that ?
# Add the files to the transformations | ||
gLogger.info( 'Files to add to transformations:', filesToAdd ) | ||
if filesToAdd: | ||
for transID, lfns in transFiles.items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iteritems
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
for transID, lfns in transFiles.items(): | ||
res = self.addFilesToTransformation( transID, lfns ) | ||
if not res['OK']: | ||
for lfn in lfns: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not returning res
?
failed[lfn] = res['Message'] | ||
else: | ||
for lfn in lfns: | ||
successful[lfn] = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A speedup:
successful = dict.fromkeys(lfns, True)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know about the logic of the implementation, but I just added a couple of coding suggestions
typeDict = res['Value']['FileMetaFields'] | ||
typeDict.update( res['Value']['DirectoryMetaFields'] ) | ||
|
||
for transID, query in queries: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
queries.iteritems
? Or is queries
a list of tuples ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a list of tuples
gLogger.info( "Apply query %s to metadata %s" % ( mq.getMetaQuery(), metadatadict ) ) | ||
res = mq.applyQuery(metadatadict) | ||
if not res['OK']: | ||
gLogger.error( "Error in applying query: %s" % res['Message'] ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return res
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Done in commit: 36b3a10.
res = mq.applyQuery(metadatadict) | ||
if not res['OK']: | ||
gLogger.error( "Error in applying query: %s" % res['Message'] ) | ||
elif res['Value'] == True: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
== True
is not needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
CatalogURL = Transformation/TransformationManager | ||
} | ||
|
||
... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally this test was supposed to be for the "TransformationSystem only". With the test that you are adding here it supposes instead that also the DFC is up and running. I think you should move this new test in a separate module (this would also simplify the management of CI infrastructure)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I've put it in a separate module. Please check that the module name and the location are appropriate or suggest a new one.
self.fc = FileCatalog() | ||
self.dm = DataManager() | ||
self.metaCatalog = 'DIRACFileCatalog' | ||
gLogger.setLevel( 'INFO' ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inside a test, please always use "DEBUG"
@@ -479,6 +494,8 @@ def __deleteTransformationParameters( self, transID, parameters = None, connecti | |||
|
|||
def addFilesToTransformation( self, transName, lfns, connection = False ): | |||
""" Add a list of LFNs to the transformation directly """ | |||
gLogger.info( "TransformationDB.addFilesToTransformation: Attempting to add %s files." % lfns ) | |||
gLogger.info( "TransformationDB.addFilesToTransformation: to Transformations: %s" % transName ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you want to merge the last 2 gLogger calls
IMHO this looks OK. I let Chris make the last comments, if any. I remove the "WIP". |
Ok Luisa I think that code-wise there are no more big changes to do. The only thing to add now would be documentation, still in this PR. |
OK, do you mean some documentation about how to use this functionality (and how it works) to be added in: |
Yes, for example. |
Ok. Just one question, I don't remember exactly why, but the doc at the url: |
Always work on a release branch, not on the master branch. This is true
also for documentation. I think at some point I made myself the mistake
of modifying the documentation on the master branch, so now they diverge
a little. If you can, try to synchronize at least this page.
Anyway, for this case just keep adding to this PR.
|
OK. Thank you. I've added the new documentation and tried to syncronizing the two branches. |
Very good documentation. Review OK. |
This PR is about the introduction of 'meta-filters' in the TS.
All the details are explained in the forum:
https://groups.google.com/forum/?hl=en#!topic/diracgrid-develop/BAQYOgLNCAI
As I explained in my last post, even if I already have in mind some improvements on the code, it should be safe to merge it, since the new functionality is optional. Moreover, it has been already successfully tested in the production setup of CTA. Also unit and integration tests pass.