Kenneth Solberg
Welcome to my blog

Microsoft Search Server Express 2008 and Umbraco

Introduction

Earlier this year Microsoft released Search Server Express 2008. The product is based on SharePoint technology and is comparable to the Google Mini search appliance, but MSSE is free and has very few limitations (the only one I know of is the lack of clustering). Through its very user-friendly, SharePoint-like interface you get full control over the sources to crawl and index, both local files and external websites.

1_msse_admin_640 2_msse_admin_sources_640

Microsoft Search Server Express (MSSE)

First, let's install:

  • Grab a host operating system, either Windows 2003 or 2008. I chose 2003.
  • From 'Configure your server' in Windows add the 'IIS' role and enable ASP.NET only.
  • Download the .NET Framework 3.0 runtime and install it.
  • Download Microsoft Search Server Express 2008 and start the installation.
    • Do NOT install Windows SharePoint Services first.
    • Run the 'Search Server Preparation Tool'.
    • Run the 'Install Search Server' and follow the instructions.
  • (Optional) Download and install Acrobat Reader v8.x to get the PDF IFilter for PDF indexing.
    • Download the 17x17 PDF icon (GIF) from here and save it as:
      C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\Template\Images\icpdf.gif
    • Edit the 'C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\Template\Xml\DocIcon.xml' file and insert the following line in the '<ByExtension>' section, in the appropriate place alphabetically for PDF:
      <Mapping Key="pdf" Value="icpdf.gif" />
    • Add the following registry key and set its value to 'pdf':
      HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Applications\\Gather\Search\Extensions\ExtensionList\38
    • Check that the following GUID values are correct in the registry (the default values should be {E8978DA6-047F-4E3D-9C78-CDBE46041603}):
      HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf
      HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf
    • Add "C:\Program Files\Adobe\Reader 8.0\Reader" to the system path.
    • Add the PDF file type to the search server by opening the administration console (http://MYSERVER:48560/ssp/admin/_layouts/managefiletypes.aspx) and adding an entry for 'pdf' (no dot).
    • Restart the Search Server Service (from the command line):
      net stop osearch
      net start osearch

 

Now that we have installed MSSE, let's index something:

  • Go to 'Content Sources' and click 'New Content Source'.

    3_msse_admin_add_640
    Check the last checkbox to start a full crawl right away.
    When indexing is done, you can see the crawl results in the log:

    4_msse_admin_log_640
  • (Optional) Add crawl rules for authentication, URLs to include/exclude, etc.

    5_msse_admin_rule_640
Of course, there are a lot of different configuration options, but I'll cover more of them in a later post. As you've probably noticed from the screenshots above, I indexed the Umbraco forum. The final index consists of about 14,000 records. MSSE also provides a search interface for querying your indexed sources, and it looks like this:

 

6_msse_search_640
PS! Be sure to add a user on this site via the 'Site Actions -> Site Settings -> People and Groups -> Add User'. I created one called 'searchuser' as you'll see further down in the post.

A search for 'macro' returns 1972 records in ~0.5 sec on a Windows 2003 VMware instance with 1GB of RAM and no specific optimization of background services and such. Relevance sorting is also extremely good, and I believe it's even better than a 'site:forum.umbraco.org macro' search on Google!

Now, let's create some querying controls for Umbraco...

Search Community Toolkit

A really nice set of controls for querying MSSE can be found at CodePlex. The project is called Search Community Toolkit and consists of two controls:

  • SearchInput which allows customisation of input controls including input box, search button and optionally a listbox with available scopes.
  • SearchResults to present the results of the query. The format of the query is defined in an xml file, and the results are transformed via an Xslt file.
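
The toolkit's own template file format is not shown in this post. As a rough illustration of what a query looks like at the web service level, the search.asmx service takes its query as an XML query packet. Below is a minimal sketch of a keyword query for 'macro', assuming the standard Microsoft.Search.Query schema; the language and the result count of 10 are illustrative choices, not values taken from the toolkit:

```xml
<QueryPacket xmlns="urn:Microsoft.Search.Query" Revision="1000">
  <Query>
    <SupportedFormats>
      <Format>urn:Microsoft.Search.Response.Document.Document</Format>
    </SupportedFormats>
    <Context>
      <!-- type="STRING" asks for a plain keyword query -->
      <QueryText language="en-US" type="STRING">macro</QueryText>
    </Context>
    <Range>
      <!-- Return the first 10 results (illustrative paging values) -->
      <StartAt>1</StartAt>
      <Count>10</Count>
    </Range>
  </Query>
</QueryPacket>
```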

Out of the box these two controls aren't all that "Umbraco-friendly" (read: controllable via public properties), so I created a usercontrol wrapper for each with some extra candy and wrapped them in an Umbraco package.

MSSE UserControls for Umbraco

Both UserControls expose all members of the underlying controls from CodePlex and default to the web.config settings with the same name if not specified. Furthermore, ResultUrl defaults to currentPage, and the XSLT transformation is performed in the Umbraco context, yes, with umbraco.library, $currentPage and the whole shebang.

Download

Here are the download links for the Visual Studio 2008 project files and the binary build:

You should also define default values for all macro parameters in web.config:

    <appSettings>
      <add key="SearchServiceUrl" value="http://msse/_vti_bin/search.asmx" />
      <add key="SearchServiceCredentialDomain" value="test01" />
      <add key="SearchServiceCredentialUser" value="SearchUser" />
      <add key="SearchServiceCredentialPassword" value="abc123" />
      <add key="SearchTemplates" value="/xml/LiveSearchTemplates.xml" />
      <add key="DefaultScope" value="All sites" />
      <add key="ExcludedScopes" value="Rank Demoted Sites,Global Query Exclusion" />
      <add key="XsltName" value="/xslt/Live.xslt" />
      :
    </appSettings>

Now, copy the /bin files and the two usercontrols to your site and create the macro, with its properties automatically fetched from the referenced usercontrols. Insert it in a template and try it out! Here's a screenshot from my test site:

7_umbraco_search 
In part 2 I'll discuss more advanced topics covering tighter integration with Umbraco with 'custom attribute mapping', searching other filetypes such as PDF files, customising the search result XSLT and more. Stay tuned!


Umbraco meetup in Oslo

Introduction
On Friday, 29 August, Xeed arranged the first Umbraco meetup in Norway. Inspired by the recent meetups in Belgium and the UK, we wanted to achieve something similar in Norway. Our goal was to gather people interested in Umbraco and start a national network for exchanging experiences and more.

We had participants from Trondheim, Bergen, Hamar and Kristiansand. They were:

  • Nicolas Van Etten
  • Siw Ørnhaug Nylund
  • Terje Dahl
  • Ståle Engen
  • Christian Melbye
  • Fredrik Kvivesen
  • Jørgen Tonvang
  • Fredrik Skarderud
  • Kenneth Solberg
  • Andre Brynhildsen
  • Frederik Vig

We started out by having each participant answer four questions: Who are you? What are you doing? Why Umbraco? Goals for the evening? The participants were roughly divided 50/50 between the relatively new and the experienced so the goals for the evening were many.

Presentation
During the first hour we highlighted the core concepts of Umbraco as well as its advantages for the different user roles. Download presentation.

umbraco-(1-of-3) umbraco-(2-of-3)

Demos
v3 was used to demonstrate the basic concepts. In the second half we demonstrated v4 and covered Database Provider, Membership Provider, Master Pages, Umbraco and Visual Studio, Package Repository with Boosts and Nitro, and more. The use of Windows Live Writer was also demonstrated. At the end, Umbraco-based sites were presented.

Discussions and QA
 umbraco-(3-of-3) The participants were encouraged to ask questions along the way:

  • How multilingual sites are handled in Umbraco. It's common practice to use separate structures, map host headers to root nodes and link each to a language so that the culture settings follow each site. There are other ways as well, but this is what we recommend. The translator role was also discussed and explained, with reference to the umbraco.tv service coming 15th September with, among other things, a video that explains the translation process in detail.
  • Umbraco does not use Windows Workflow Foundation. It has a pragmatic implementation of admin, editor and translator roles. Backend workflow where Umbraco acts as a participant in a workflow can be implemented in many ways, but we recommend looking into Base and ActionHandlers.
  • Umbraco has no GUI for scheduler services such as the one in EPiServer. However, you can configure scheduled tasks in umbracoSettings.config.
  • Cross publishing is possible in Umbraco. We recommend taking a closer look at Christian Palm's MultiplePagePicker and Tim Geyssens' Ultimate Picker.
  • Experiences with several editors. There is not much experience with several simultaneous editors, but it is expected to work like a charm :-) Training editors was really easy compared to other systems thanks to Umbraco's very simple user interface and the ability to set a starting content/media node and remove access to sections.
  • There is no whitepaper on Umbraco and security, but a good selling point is that anyone can look into the Umbraco source code, as it is open source. A good infrastructure and routines for installing security patches are also really important.
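
As an illustration of the scheduled tasks point above, here is a minimal sketch of the scheduledTasks section in umbracoSettings.config. The alias and URL are hypothetical placeholders, and the interval is given in seconds; check the comments in your own umbracoSettings.config for the attributes your version supports:

```xml
<scheduledTasks>
  <!-- Request the given URL every 60 seconds; alias and url are hypothetical examples -->
  <task log="true" alias="exampleTask" interval="60" url="http://localhost/umbraco/example.aspx" />
</scheduledTasks>
```

Each task is simply a URL that is requested at the given interval, so the actual work is done by whatever page or handler lives at that address.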

We also discussed the benefits of establishing a national network of Umbraco expertise:

  • Give confidence to potential customers by referencing national players using Umbraco.
  • Exchange knowledge and experience.
  • Recruitment to the Umbraco community.

Tips and Tricks
We ended the session with Umbraco tips and tricks:

  • Tabs per section
  • Empty tags in XSLT, best practice
  • Debugging bookmarklet
  • HTTP compression
  • umbracoNaviHide gives 404
  • Firebug and YSlow extensions in Firefox
  • and more

Summary
First and foremost, it was great fun to meet the other Umbraco enthusiasts in Norway, and I hope we can arrange more sessions in the future. Niels also made a video from Belgium where he, along with Tim and Ruben, worked with the Live Edit feature. The video wasn't available during the meetup, but it is now. So for those of you who participated (and others if curious), you can see the video by clicking the link below.
