Recently, I was asked to find out why some documents in a particular library were not indexed and therefore did not show up in the search results. I began by inspecting the crawl log for errors or warnings. I found neither. In fact, I found that some of the documents had been crawled correctly.
Comparing the crawled documents with the non-crawled ones, it appeared that the non-crawled ones were minor versions (drafts). By default, the crawler account is granted the 'Full Read' permission, which means that it cannot see draft documents: these are visible only to authors who have the 'Edit' permission.
So what is the solution? You have three options:
- Either grant the crawler account the 'Edit' permission so that it can see unpublished files and crawl them. In this case, all draft documents will show up in search results for everyone, even for visitors who are not supposed to see them, because the search results are not security trimmed (1). However, if you do not have access to a document, you will still be denied access even if it shows up in the search results.
- Or keep the crawler account at 'Full Read' and publish the draft documents as major versions.
- Otherwise, accept that draft documents are not indexed.
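To make the trade-off between these options easier to see, here is a toy model in Python. It is only a sketch of the reasoning, not SharePoint code: it calls no SharePoint API, the names `Document`, `crawlable` and `publish` are made up for illustration, and a single `is_draft` flag stands in for the whole major/minor versioning scheme.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Document:
    """Hypothetical stand-in for a library item (not a SharePoint type)."""
    name: str
    is_draft: bool  # simplification: True means an unpublished minor version


def crawlable(docs, crawler_can_edit):
    """Return the names of the documents the crawler account can see.

    Assumption taken from the post: drafts are visible only to accounts
    that hold the 'Edit' permission.
    """
    return [d.name for d in docs if crawler_can_edit or not d.is_draft]


def publish(doc):
    """Model publishing a draft as a major version."""
    return replace(doc, is_draft=False)


library = [
    Document("policy.docx", is_draft=False),
    Document("budget.xlsx", is_draft=True),
]

# Crawler keeps 'Full Read', drafts stay drafts: drafts are not indexed.
full_read = crawlable(library, crawler_can_edit=False)

# Crawler is granted 'Edit': drafts are indexed too.
with_edit = crawlable(library, crawler_can_edit=True)

# Crawler keeps 'Full Read', drafts are published first.
after_publish = crawlable([publish(d) for d in library], crawler_can_edit=False)

print(full_read)
print(with_edit)
print(after_publish)
```

The second and third cases index the same set of documents. The difference is who made the drafts visible: the search administrator in one case, the authors in the other.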
I cannot recommend one solution over another. Every company should have a document management policy, and it is according to this policy that we can decide whether to raise the permissions of the crawler account or to keep draft documents out of the search scope.
Here are some interesting links to better understand SharePoint search behaviour:
What Does the Crawler Crawl and When?
SharePoint indexing/search behavior on major and minor versions
MOSS Enterprise Search - 16 things you might not know
Hope this helps.
(1) From what I noticed, draft items are the only ones that escape security trimming. For all other items, the results shown are trimmed at query time according to the permissions the user has.
Tuesday, November 3, 2009