Adding ‘Filter Configuration’ seems to do very little to change what is picked up
IN DETAIL
Looking at the log files (which are set to debug) I can see that, upon the first crawl of Alfresco, Manifold sends the following requests:
DEBUG 2015-10-28 05:24:35,056 (Worker thread '1') - Executing request GET /alfresco/service/node/actions/workspace/SpacesStore/267839b2-f466-42c5-9a35-cb3e41281bb9 HTTP/1.1
DEBUG 2015-10-28 05:24:35,056 (Worker thread '1') - http-outgoing-239 >> GET /alfresco/service/node/actions/workspace/SpacesStore/267839b2-f466-42c5-9a35-cb3e41281bb9 HTTP/1.1
DEBUG 2015-10-28 05:24:35,056 (Worker thread '1') - http-outgoing-239 >> "GET /alfresco/service/node/actions/workspace/SpacesStore/267839b2-f466-42c5-9a35-cb3e41281bb9 HTTP/1.1[\r][\n]"
DEBUG 2015-10-28 05:24:35,070 (Worker thread '1') - Executing request GET /alfresco/service/node/details/workspace/SpacesStore/267839b2-f466-42c5-9a35-cb3e41281bb9 HTTP/1.1
DEBUG 2015-10-28 05:24:35,070 (Worker thread '1') - http-outgoing-240 >> GET /alfresco/service/node/details/workspace/SpacesStore/267839b2-f466-42c5-9a35-cb3e41281bb9 HTTP/1.1
DEBUG 2015-10-28 05:24:35,070 (Worker thread '1') - http-outgoing-240 >> "GET /alfresco/service/node/details/workspace/SpacesStore/267839b2-f466-42c5-9a35-cb3e41281bb9 HTTP/1.1[\r][\n]"
DEBUG 2015-10-28 05:24:35,082 (Worker thread '1') - Executing request GET /alfresco/service/api/node/workspace/SpacesStore/267839b2-f466-42c5-9a35-cb3e41281bb9/content HTTP/1.1
DEBUG 2015-10-28 05:24:35,082 (Worker thread '1') - http-outgoing-241 >> GET /alfresco/service/api/node/workspace/SpacesStore/267839b2-f466-42c5-9a35-cb3e41281bb9/content HTTP/1.1
DEBUG 2015-10-28 05:24:35,082 (Worker thread '1') - http-outgoing-241 >> "GET /alfresco/service/api/node/workspace/SpacesStore/267839b2-f466-42c5-9a35-cb3e41281bb9/content HTTP/1.1[\r][\n]"
DEBUG 2015-10-28 05:24:40,263 (Worker thread '1') - Executing request GET /alfresco/service/node/actions/workspace/SpacesStore/72948f84-4bf1-4ec5-8378-1bed0951600a HTTP/1.1
DEBUG 2015-10-28 05:24:40,263 (Worker thread '1') - http-outgoing-242 >> GET /alfresco/service/node/actions/workspace/SpacesStore/72948f84-4bf1-4ec5-8378-1bed0951600a HTTP/1.1
DEBUG 2015-10-28 05:24:40,263 (Worker thread '1') - http-outgoing-242 >> "GET /alfresco/service/node/actions/workspace/SpacesStore/72948f84-4bf1-4ec5-8378-1bed0951600a HTTP/1.1[\r][\n]"
This picks up all of the content, e.g. documents.
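For reference, the per-node pattern visible in the log (actions, then details, then content) can be written down as a small Python sketch. The function name `node_request_paths` and its parameters are made up for illustration; only the three path shapes are taken from the log lines above, and this is not how Manifold itself builds them.

```python
def node_request_paths(uuid, store_protocol="workspace", store_id="SpacesStore"):
    """Return the three request paths seen in the log for a single node."""
    store = f"{store_protocol}/{store_id}"
    return [
        f"/alfresco/service/node/actions/{store}/{uuid}",      # actions for the node
        f"/alfresco/service/node/details/{store}/{uuid}",      # node details
        f"/alfresco/service/api/node/{store}/{uuid}/content",  # the content itself
    ]


for path in node_request_paths("267839b2-f466-42c5-9a35-cb3e41281bb9"):
    print("GET", path)
```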
Running a second crawl, without doing anything else in between, results in the following requests:
DEBUG 2015-10-28 05:26:31,854 (Startup thread) - Executing request GET /alfresco/service/node/changes/workspace/SpacesStore?lastTxnId=333&lastAclChangesetId=13&indexingFilters=%7B%22siteFilters%22%3A%5B%22Finance%22%5D%2C%22typeFilters%22%3A%5B%5D%2C%22mimetypeFilters%22%3A%5B%5D%2C%22aspectFilters%22%3A%5B%5D%2C%22metadataFilters%22%3A%7B%7D%7D HTTP/1.1
DEBUG 2015-10-28 05:26:31,854 (Startup thread) - http-outgoing-248 >> GET /alfresco/service/node/changes/workspace/SpacesStore?lastTxnId=333&lastAclChangesetId=13&indexingFilters=%7B%22siteFilters%22%3A%5B%22Finance%22%5D%2C%22typeFilters%22%3A%5B%5D%2C%22mimetypeFilters%22%3A%5B%5D%2C%22aspectFilters%22%3A%5B%5D%2C%22metadataFilters%22%3A%7B%7D%7D HTTP/1.1
DEBUG 2015-10-28 05:26:31,854 (Startup thread) - http-outgoing-248 >> "GET /alfresco/service/node/changes/workspace/SpacesStore?lastTxnId=333&lastAclChangesetId=13&indexingFilters=%7B%22siteFilters%22%3A%5B%22Finance%22%5D%2C%22typeFilters%22%3A%5B%5D%2C%22mimetypeFilters%22%3A%5B%5D%2C%22aspectFilters%22%3A%5B%5D%2C%22metadataFilters%22%3A%7B%7D%7D HTTP/1.1[\r][\n]"
So I can see that, in the first crawl, we are requesting the content of each node directly while, in the second, we are only asking for changes. The problem is that the second set of requests never returns any changes. The response from these calls is:
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "totalNodes" : "0", [\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "elapsedTime" : "8",[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "docs" : [[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " ],[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "last_txn_id" : "352",[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "last_acl_changeset_id" : "13",[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "store_id" : "SpacesStore",[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "store_protocol" : "workspace"[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << "}"
Regardless of what changes I make to the document that I have been using for testing, the document is not updated. The totalNodes value in the response to the calls for changes is always ‘0’.
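To check totalNodes and last_txn_id quickly after each crawl, the response body can be pulled back out of the wire log. The following is a minimal Python sketch, not part of Manifold or Alfresco. The helper name `response_body` is made up for illustration, and it assumes the `<< "..."` wire-log layout shown above. Since the excerpt starts after the opening brace of the JSON, the sketch restores it.

```python
import json
import re

# Response lines copied from the log excerpt (straight quotes throughout).
WIRE_LOG = r'''
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "totalNodes" : "0", [\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "elapsedTime" : "8",[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "docs" : [[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " ],[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "last_txn_id" : "352",[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "last_acl_changeset_id" : "13",[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "store_id" : "SpacesStore",[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << " "store_protocol" : "workspace"[\r][\n]"
DEBUG 2015-10-28 05:56:42,218 (Startup thread) - http-outgoing-257 << "}"
'''

# Everything between the first quote after "<<" and the last quote on the line.
PAYLOAD = re.compile(r'<< "(.*)"$')


def response_body(log_text):
    """Join the incoming ("<<") wire-log payloads into one JSON string."""
    parts = []
    for line in log_text.splitlines():
        match = PAYLOAD.search(line.strip())
        if match:
            # Drop the literal "[\r][\n]" markers the wire log prints for CRLF.
            parts.append(match.group(1).replace(r"[\r][\n]", ""))
    body = "".join(parts).strip()
    # Assumption: the excerpt omits the line carrying the opening brace.
    if not body.startswith("{"):
        body = "{" + body
    return body


summary = json.loads(response_body(WIRE_LOG))
print(summary["totalNodes"], summary["last_txn_id"])
```

With the excerpt as input this prints `0 352`. Both values come back as strings, because that is how the response quotes them.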
Within my test Alfresco environment I have one site set up (Finance). Within the Finance doc library I have three test docs. No other changes have been made to the Alfresco instance.
Running a crawl with no filter configurations set returns 81 items. This is via the URL in a browser.
If I then set the Site Filter configuration to ‘Finance’ and apply, I still get 81 items when I re-run the crawl.
I can see that the term ‘Finance’ is being added to the URL but this does not seem to change the behaviour.
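To confirm exactly what is being sent, the indexingFilters parameter from the logged changes request can be decoded. The following is a minimal Python sketch using only the standard library. The request line is copied from the log above; the variable names are my own.

```python
import json
from urllib.parse import parse_qs, urlsplit

# Request target copied from the logged GET for the second crawl.
LOGGED_REQUEST = (
    "/alfresco/service/node/changes/workspace/SpacesStore"
    "?lastTxnId=333&lastAclChangesetId=13"
    "&indexingFilters=%7B%22siteFilters%22%3A%5B%22Finance%22%5D%2C"
    "%22typeFilters%22%3A%5B%5D%2C%22mimetypeFilters%22%3A%5B%5D%2C"
    "%22aspectFilters%22%3A%5B%5D%2C%22metadataFilters%22%3A%7B%7D%7D"
)

# parse_qs percent-decodes each value and returns a list per parameter.
query = parse_qs(urlsplit(LOGGED_REQUEST).query)
filters = json.loads(query["indexingFilters"][0])

print(json.dumps(filters, indent=2))
```

The decoded value has siteFilters set to ["Finance"] and every other filter empty, so the filter is at least being sent in the request. Whether the Alfresco side applies it is a separate question that this sketch cannot answer.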