Neural networks in Visual Studio and SharePoint Search

Here’s a wonderful video of Dr. James McCaffrey from Microsoft Research explaining the basics of neural networks and their implementation pitfalls:

Developing Neural Networks Using Visual Studio
http://channel9.msdn.com/Events/Build/2013/2-401.

Highly recommended for those who would like to get a better understanding of the science behind SharePoint Search ranking models. This video looks like the final missing piece of the SharePoint puzzle.

You might want to check more details in the following blog posts and official Microsoft pages

Custom meta tags for web pages in SharePoint Search

I’d like to share an amazing feature of SharePoint Search that has existed for a very long time, at least since the 2007 version, but that, according to my observations, very few people are aware of.

First of all, you can increase findability by doing basic SEO on intranet sites: add the well-known meta tags (http://en.wikipedia.org/wiki/Meta_element) to web pages: title, keywords, description. The SharePoint web crawler picks these tags up and propagates them to crawled properties, which are then automatically mapped to the appropriate managed properties; these have a very high impact on overall ranking.

Secondly, it is possible to crawl custom meta tags from web pages and leverage them in search (a few examples are at the end of the post).

All you need to do is add them the same way as you do the well-known tags.

<meta name="XXX" content="YYY">
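If the pages are produced by a script, the tags can be rendered from a plain name/value map. The following is only a minimal sketch in JavaScript: the function names and the sample values are made up for illustration, and only office_city comes from the examples later in this post.

```javascript
// Escape characters that would break an HTML attribute value.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Turn { name: content } pairs into <meta> tags, one per line.
function renderMetaTags(tags) {
  return Object.keys(tags)
    .map(function (name) {
      return '<meta name="' + escapeAttribute(name) +
        '" content="' + escapeAttribute(tags[name]) + '">';
    })
    .join("\n");
}

// Hypothetical values: description is a well-known tag, office_city a custom one.
var html = renderMetaTags({
  description: "Office locations and contact details",
  office_city: "Oslo"
});
console.log(html);
```

Escaping the values matters because a stray quote in the content would otherwise end the attribute early, which is the same kind of mistake as a missing closing quote in a hand-written tag.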

Then run a full crawl and go to Search Schema (SharePoint 2013) or Metadata Properties (SharePoint 2010 or FAST), then Crawled Properties Categories. Note: unfortunately, new properties are not picked up during an incremental crawl.

Then select the “Web” category, and there is your newly added crawled property. To use it, create a new managed property and manually map it to the crawled property. Then do a full crawl again.

That’s it.

Now a few ideas/examples of how you can use it efficiently in your search:

  • Increase findability by using well-known tags
  • Personalize search results leveraging custom tags:
    • In our portal we boost pages if the office_city value matches the city from the user profile. This can easily be done via XRANK and Query Rules in SharePoint 2013, or via custom pre-processing that adds boosting to the query in FAST Search (not an easy option, though).
  • Integrate easily with intranet portals to enrich search content with structured information.
    • In our portal we crawled the employee recognition intranet site, where a set of pages represents employee rewards and is enriched with custom tags such as “employee name”, “reward description”, and “reward category”. Then we created a structured search vertical that triggers when the user query matches the description/category of a reward. Technically, it was done using the Federated Web Part and Search Scopes in SharePoint 2010 Search (FAST) and later migrated to Result Sources and Result Blocks in SharePoint 2013 Search.
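
The city boost from the personalization example can be sketched as a small query rewrite. This is only an illustration under assumptions: the managed property name OfficeCity, the boost value of 100, and the function name are hypothetical; XRANK with a constant boost (cb) is the KQL operator referred to above.

```javascript
// Build a KQL query that boosts (but does not filter) results whose
// managed property matches the user's city.
function boostByCity(userQuery, city, managedProperty, boost) {
  if (!city) {
    return userQuery; // no city in the profile, nothing to boost on
  }
  // Drop double quotes from the city so the quoted phrase stays well-formed.
  var quotedCity = '"' + String(city).replace(/"/g, "") + '"';
  return "(" + userQuery + ") XRANK(cb=" + boost + ") " +
    managedProperty + ":" + quotedCity;
}

// Hypothetical managed property mapped to the office_city crawled property.
console.log(boostByCity("vacation policy", "Oslo", "OfficeCity", 100));
// prints: (vacation policy) XRANK(cb=100) OfficeCity:"Oslo"
```

Because XRANK only changes rank, pages from other cities still appear in the results; they just sort lower than pages that match the user's city.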

Microsoft Patents related to SharePoint Search

While investigating relevancy calculation in the new SharePoint 2013, I did some research, and it turned out that there are a lot of publicly available patents that cover search in SharePoint; judging by the names of the inventors, most of them came out of Microsoft Research programs. Although there is no direct evidence that it was implemented exactly as described in the patents, I created an Excel spreadsheet that mimics the logic described in the patents with actual values from SharePoint, and the numbers match!

I hope the most curious of you will find these helpful for a deep dive into Enterprise Search relevancy and for better understanding what happens behind the curtain.

Enterprise relevancy ranking using a neural network

Internal ranking model representation schema

Techniques to perform relative ranking for search results

Ranking and providing search results based in part on a number of click-through features

Document length as a static relevance feature for ranking search results

Relevant individual searching using managed property and ranking features

Generating search result summaries

Investigating search relevance in SharePoint 2013

The search result ranking model in SharePoint 2013 has changed significantly compared with SharePoint 2010 and with FAST Search for SharePoint. More detailed information can be found in the corresponding patent. In short, ranking uses several ranking models (including BM25F), whose results are then processed by a neural network to produce the final rank.

Fortunately, there is a fairly simple way to look behind the curtain and get better acquainted with the black magic. To demonstrate this with an example, I modified the search results Display Template by adding a link to the ExplainRank.aspx page, which is located in the {search_center_url}/_layouts/15/ folder.

1 - starwars

As parameters I used

  • d=ctx.CurrentItem.Path
  • q=QueryText from the ctx.ListData.Properties.SerializedQuery property, whose raw value is <Query Culture="en-US" EnableStemming="True" EnablePhonetic="False" EnableNicknames="False" IgnoreAllNoiseQuery="True" SummaryLength="180" MaxSnippetLength="180" DesiredSnippetLength="90" KeywordInclusion="0" QueryText="star wars" QueryTemplate="{searchboxquery}" TrimDuplicates="True" Site="1436c4c3-34b4-4e32-bd41-4d28fc0f1435" Web="5291616c-f750-45db-9e3b-68c7c2e82b9f" KeywordType="True" HiddenConstraints="" />

While rendering, the ExplainRank.aspx page performs the following actions, and in essence it is easy to build an equivalent of it:

  • Prepares a KeywordQuery, configured with the values from the URL parameters we passed to it. To be fair, it is possible to pass more than the two parameters used in our example, but this was not done in order to keep things simple.
  • Requests the rankdetail property in the query.
  • Executes the query.
  • Processes the value of the rankdetail property with RankLogParser (Microsoft.Office.Server.Search.Administration).
  • Renders the extracted information using a set of specialized RenderableRankingFeature controls (Microsoft.Office.Server.Search.WebControls).

The result is detailed information about the stages of the relevance calculation. Even without deep knowledge of search ranking models, you can see which document properties had hits and how much each feature contributed to the overall result.

3 - explainrank

The following chapters will give a more detailed analysis of the relevance calculation, as well as its practical applications for fine-tuning search result quality.

Explain rank in SharePoint 2013 Search

The default ranking model in SharePoint 2013 is completely different from what we’ve seen in FS4SP and is definitely a step forward compared with SP2010. In a nutshell, it uses multiple features (query terms and their proximity, document click-through, authority, anchor text, URL depth, etc.) that are combined by a neural network as the final step. Details can be found in the patent http://www.google.com/patents/US8296292.

Fortunately, there’s a way to shed more light on this black magic. I’ve modified the default Display Template and added an “Explain Rank” link to each item. This link takes the user to the ExplainRank page, which is hosted under the {search_center_url}/_layouts/15/ folder.
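
To make the "features combined by a neural network" idea concrete, here is a toy sketch of a two-layer network: each hidden node takes a weighted sum of the features, passes it through tanh, and the hidden outputs are added up into one score. The weights, biases, and feature values are invented for illustration and are not SharePoint's actual model; the tanh hidden layer is an assumption about the general shape, not a confirmed implementation detail.

```javascript
// Score one document: weighted sums of features go through tanh hidden
// nodes, and the hidden outputs are combined into a single rank value.
function scoreDocument(features, model) {
  var score = 0;
  for (var j = 0; j < model.hidden.length; j++) {
    var node = model.hidden[j];
    var sum = node.bias;
    for (var i = 0; i < features.length; i++) {
      sum += node.weights[i] * features[i];
    }
    score += node.outputWeight * Math.tanh(sum);
  }
  return score;
}

// Made-up model with two hidden nodes, for illustration only.
var model = {
  hidden: [
    { weights: [0.8, 0.1, -0.3], bias: 0.0, outputWeight: 1.5 },
    { weights: [0.2, 0.9, -0.6], bias: -0.1, outputWeight: 0.7 }
  ]
};

// Hypothetical features: [term match score, click-through, URL depth]
var strong = scoreDocument([2.0, 1.0, 0.2], model);
var weak = scoreDocument([0.3, 0.0, 0.8], model);
console.log(strong > weak); // the better-matching, shallower document ranks higher
```

The point of the hidden layer is that features do not simply add up: tanh saturates, so a very large value in one feature cannot dominate the final score without limit.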

1 - starwars

I used

  • d=ctx.CurrentItem.Path
  • q=ctx.DataProvider.$2_3.k  (I found this hidden property by trial and error)
  • another option is to take the value for q= from QueryText in ctx.ListData.Properties.SerializedQuery, whose value was <Query Culture="en-US" EnableStemming="True" EnablePhonetic="False" EnableNicknames="False" IgnoreAllNoiseQuery="True" SummaryLength="180" MaxSnippetLength="180" DesiredSnippetLength="90" KeywordInclusion="0" QueryText="star wars" QueryTemplate="{searchboxquery}" TrimDuplicates="True" Site="1436c4c3-34b4-4e32-bd41-4d28fc0f1435" Web="5291616c-f750-45db-9e3b-68c7c2e82b9f" KeywordType="True" HiddenConstraints="" />
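// Build the "Explain Rank" link; URL-encode both values so that spaces and
// special characters in the query or path do not break the link.
var explainHtml = String.format('<a href="/_layouts/15/ExplainRank.aspx?q={0}&d={1}" title="Explain">Explain Rank</a>',
    encodeURIComponent(ctx.DataProvider.$2_3.k),
    encodeURIComponent(ctx.CurrentItem.Path));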
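
The second option, reading QueryText out of SerializedQuery, avoids relying on the minified $2_3 property name. A minimal sketch, assuming the serialized value uses straight double quotes around attribute values and standard XML entity escaping (the helper name extractQueryText is made up):

```javascript
// Pull the QueryText attribute out of the serialized <Query ... /> element.
function extractQueryText(serializedQuery) {
  var match = /\bQueryText="([^"]*)"/.exec(serializedQuery);
  if (!match) {
    return null; // attribute not present
  }
  // Undo the XML entity escaping used inside attribute values.
  return match[1]
    .replace(/&quot;/g, '"')
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&");
}

// Shortened sample of the serialized value shown above.
var sample = '<Query Culture="en-US" EnableStemming="True" ' +
  'QueryText="star wars" QueryTemplate="{searchboxquery}" />';
console.log(extractQueryText(sample)); // prints: star wars
```

In a display template the input would be ctx.ListData.Properties.SerializedQuery, and the result can be passed to the link builder in place of ctx.DataProvider.$2_3.k.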

(upd: here’s a link to the complete display template: http://pastebin.com/SfbFj9yG)

The ExplainRank.aspx page essentially performs the following actions:

  • Prepares a KeywordQuery with the values specified in the URL (query, item URL, plus a number of configuration parameters that exist in SerializedQuery but were omitted in this demo for the sake of simplicity)
  • Requests the rank log property (rankdetail) in the results
  • Executes the query
  • Parses the rank log with the help of RankLogParser (Microsoft.Office.Server.Search.Administration)
  • Renders features/scores using controls derived from RenderableRankingFeature (Microsoft.Office.Server.Search.WebControls)
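
The first three steps can be approximated from the browser with the Search REST endpoint instead of a server-side KeywordQuery. This is a sketch under assumptions, not what ExplainRank.aspx does internally: it assumes the rank log is exposed through the rankdetail managed property, that a path: restriction is enough to isolate the document, and the function name and sample URLs are hypothetical.

```javascript
// Build a Search REST URL that asks for the rank detail of one document.
function buildRankDetailUrl(searchCenterUrl, queryText, documentPath) {
  // Restrict the user's query to the single document we want explained.
  // Single quotes are doubled because the value sits inside a REST string literal.
  var kql = (queryText + ' path:"' + documentPath + '"').replace(/'/g, "''");
  return searchCenterUrl.replace(/\/$/, "") +
    "/_api/search/query" +
    "?querytext='" + encodeURIComponent(kql) + "'" +
    "&selectproperties='Path,rankdetail'";
}

// Hypothetical site and document URLs.
var url = buildRankDetailUrl(
  "http://intranet/search/",
  "star wars",
  "http://intranet/docs/a.docx");
console.log(url);
```

Whether rankdetail comes back populated may depend on the caller's permissions, and the returned value still has to be parsed; the RankLogParser class mentioned above is the server-side way to do that.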

As a result, you can review a detailed explanation of the rank calculation and better understand why a given document received a specific rank.

Even if you’re not familiar with advanced information retrieval ranking models, you will easily see which properties had hits and which features contributed to the final score.

3 - explainrank

This is only the beginning of our journey into ranking and relevancy tuning of SharePoint 2013 Search. Stay tuned: in the next chapters we will dive deeper.
