Showing posts with label Frugal Admin. Show all posts

Friday, December 30, 2011

Frugal Admin, special edition: How to get your SharePoint Foundation 2010 server to index RTF files

Hi there,

In this special edition, I am going to tell you how to index rich text files (document files with the rtf extension).

(to see it done in action, go to http://www.livestream.com/callahanSPF4admins and watch "Enabling RTF indexing on SharePoint Foundation 2010")

Now I know, I know, you've got to be saying, "Callahan, how often does anyone need to upload a rich text file? I mean c'mon."

But hey, it can happen. How about having users that are working on different platforms and don't have Word installed? What if there is a piece of software on your network that puts out RTF files for some reason, and you need to have them in a library on your SharePoint site? Maybe your tech support site uses RTF files so they're compatible with everyone?

For whatever reason, it appears that there is a little something broken in the registry for SharePoint so it can't do something so simple, so assumed, as search rich text files.

You see, it all started when someone tweeted asking if SP2010 could index rtf files natively or if it "needed an ifilter" (meaning they'd have to go install one). I just so happened to be doing a lot of work with PDF ifiltering, so I was well qualified and ready to check into it.

I thought their question was sincere, so I started looking. It turns out that seconds after the question, someone tweeted back saying it couldn't be done.

Of course, I was busy digging, so I didn't know it couldn't be done.

And so..

...I did it.
(later I did find out that there is a book out there telling SharePoint Server people to just register the rtf ifilter DLL and it will work fine for them-- but that definitely doesn't work in SharePoint Foundation, and might've stopped me right there had I known...)

[for my tl;dr readers- the short form of how to get rtf ifiltering to work in SharePoint Foundation:
  • change the value of the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\14.0\Search\Setup\ContentIndexCommon\Filters\Extension\.rtf to the correct DLL CLSID: {e2403e98-663b-4df6-b234-687789db8560}
  • run the AddExtension.vbs script that you copy from the internet so it will permanently add an rtf extension to the extensionlist at key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\14.0\Search\Applications\6519b45e-2869-4f5a-9bb5-ec60370309fb\Gather\Search\Extensions\ExtensionList
  • reboot server (you have to, to get it to read the changes to the registry)
  • upload an rtf file that has at least one unique word in it to a library in SharePoint
  • then wait for search to run an index, or force a fullcrawl- when it's done, you'll be able to search your RTF by that unique word and have it show up in the search results.
And that's it. But to see why and how I knew to do this stuff, how to do it in step by step detail, and why it works for SPF, read on]


First I checked to make sure that SharePoint Foundation 2010 (SPF) could not, in fact, index RTF files by uploading one to a library, doing an iisreset, then a fullcrawl (stsadm -o spsearch -action fullcrawlstart -- keep in mind that to run that command in 2010, as opposed to earlier, more security conscious versions, the account you're logged in as must OWN the search database...). Then I did a search on the file name, which proved the full crawl worked. Finally I tried to search by text in the RTF file and had it fail- proving that RTF file indexing failed.

Once I knew it failed, I then went to the registry, because I knew that other than an ifilter's DLL, the settings in the registry were key to having ifiltering work in SharePoint Foundation.

Now, when using Adobe's PDF ifilter, I needed to go to the registry, add an entry to the "ExtensionList" for applications, and an Extension key for .pdf with the correct CLSID pointing to Adobe's PDF ifilter DLL. These two things were critical for success.

So I checked to see if there were any entries for "rtf" in the same places in the registry. I found something interesting.

There was no listing for "rtf" in the ExtensionList key (see figure below for details- the full path in the registry is listed at the bottom of the window). I've been given to believe (and I am correct) that an ifilter won't work for SPF without a listing for the file extension here.

Then I went to check the second registry entry I'd learned was important, a key under Setup\ContentIndexCommon\Filters\Extension. Each file type that SharePoint Foundation can possibly search is listed here with its own key. The key contains, at the minimum, a default value that is the CLSID of the DLL used by the ifilter for that file type. RTF did have a key.

To be thorough, I wanted to know what DLL that value was pointing to. It should be the CLSID for the file's ifilter DLL.

To check that, I selected the CLSID key under HKEY_CLASSES_ROOT and did a find (go to Edit on the menu bar, and click Find, or use ctrl+f keys) for the CLSID value listed for the rtf extension ({35500004-002C-0000-0000-000000000000} as it happens to be). What came up was the plain text filter's CLSID, not the one for rich text files:





Every CLSID key for an ifilter has to have an InProcServer32 sub-key. It will list the path to the DLL for that ifilter. In this case, to really prove it has nothing to do with rich text, the InProcServer32 sub-key's path goes to tquery.dll-- the dll used for simple, plain text indexing.
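If you'd rather script that lookup than click through regedit, here is a minimal sketch in Python. This is not what I used above. It assumes Python is installed on the server, and the function names are made up for illustration. The standard winreg module only exists on Windows, so it is imported inside the function that needs it.

```python
def inproc_subkey(clsid):
    """Build the subkey (under HKEY_CLASSES_ROOT) that holds a filter's DLL path."""
    return "CLSID\\" + clsid + "\\InProcServer32"


def filter_dll(clsid):
    """Return the DLL path registered for a CLSID. Windows only."""
    import winreg  # standard library, available on Windows only

    with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, inproc_subkey(clsid)) as key:
        # An empty value name reads the key's (Default) value.
        value, _value_type = winreg.QueryValueEx(key, "")
        return value
```

Calling filter_dll with the CLSID found in the .rtf Extension key should hand back the same path regedit shows under InProcServer32, so you can see at a glance which DLL the key really points to.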


I thought that couldn't be right. It looked like the wrong CLSID for the rtf key for ifiltering had been entered by the SPF installer during setup.

And I figured, if that was the case, I just needed to find the rtf ifilter, if it existed by default (which I had to assume it did, I mean, really), and use its CLSID instead.

So I went back up to the CLSID key under HKEY_CLASSES_ROOT, and did a Find for "RTF Filter". Why, you ask, did I know to use those exact words? Because the name for the CLSID for the PDF ifilter was PDF Filter, so I figured it would probably be like that for rtf.

And I found it. The value for the rtf ifilter was: {e2403e98-663b-4df6-b234-687789db8560}





Also notice in the picture that the DLL for the rtf filter is "rtffilt.dll". During all this I'd also looked on the internet to see if anyone had been trying to use an rtf ifilter. There were blog entries and forum posts about getting rtf ifilters online, downloading them and using those, and few for SharePoint except, ironically, two for SharePoint Search Express. One refers to a DLL that Microsoft apparently published several years ago named "rtffilt.dll" (now it appears built into server 2008 R2) and one that actually had you register a DLL that was already in system32, so I knew the file already existed on the server.

(to note: however, the blog entry that registers the DLL does something interesting, it has you copy the file from system32 to the sysWOW64 folder and register both: http://thetrainndt.posterous.com/?tag=ifilter Just mentioning it in case your system requires that for some reason- not sure why you would...)

Anywho, obviously, the correct CLSID for the existing rtf ifilter is the value I listed before the picture.

So I copied the correct CLSID value (I right clicked the CLSID key on the right side of the window, and selected "Copy Key Name"), then went back to the rtf Extensions key under ContentIndexCommon\Filters\Extension and changed its value to the correct one (never forget the curly brackets have to be on either end of the alphanumerics) by pasting the key name. You'll have to delete some of the key information so only the CLSID remains.
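Trimming the pasted key name by hand is an easy place to lose a curly bracket. A small sketch in Python of the same cleanup, assuming "Copy Key Name" gives you the full registry path with the CLSID at the end (the function name and pattern are mine, for illustration only):

```python
import re

# A CLSID: curly brackets around 8-4-4-4-12 hexadecimal digits.
CLSID_PATTERN = re.compile(
    r"\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}"
)


def clsid_from_key_name(key_name):
    """Pull the CLSID, curly brackets included, out of a copied key name."""
    match = CLSID_PATTERN.search(key_name)
    if match is None:
        raise ValueError("no CLSID with curly brackets found in: " + key_name)
    return match.group(0)
```

Anything that comes back without raising an error has both brackets and the right number of characters, which is the part that matters when you paste it in as the value of the .rtf key.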

Once that was done, I needed to add the .rtf extension to the Applications\Gather\Search\Extensions\ExtensionList (we checked that earlier in this entry, and it was missing). Now these extensions are numbered, so we have to add a string value of the next higher number (in my case that'd be 49, in yours it'll probably be 48). Then double click the value to enter "rtf" (without the quotes of course) as the value.
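Picking the "next higher number" is simple arithmetic, but here is a sketch in Python for anyone scripting it. It assumes you already have the value names from the ExtensionList key as strings; the function name is hypothetical.

```python
def next_extension_number(value_names):
    """Return the next free number, as a string, for a new ExtensionList value."""
    # Ignore anything that is not a plain number, such as the (Default) value's empty name.
    numbers = [int(name) for name in value_names if name.isdecimal()]
    if not numbers:
        raise ValueError("no numbered values found in the list")
    return str(max(numbers) + 1)
```

With values numbered up to 48 this returns "49", which matches what I had to add on my server.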

However, I have found that, with server 2008 R2 (especially with all the most recent updates and service pack) that ExtensionList key is protected, and no matter what I do (take ownership of the key, subkeys, etc., for example), the change is deleted in a few hours or on next reboot.

To overcome this, there is a simple visual basic script you can run to override that behavior and "register" your extension correctly in the ExtensionList. It won't disappear and it won't delete after reboot.

The easy way to get that script is to go to http://support.microsoft.com/kb/2518465 . In that KB article is the text for the visual basic script- just copy and paste it into a text file (if you don't feel like going to the KB, here it is for your convenience):
---------------------
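' Prints how to call the script.
Sub Usage

    WScript.Echo "Usage:    AddExtension.vbs extension"
    WScript.Echo

End Sub

Sub Main

    ' At least one argument is needed: the extension to add (rtf, for example).
    If WScript.Arguments.Count < 1 Then
        Usage
        WScript.Quit(1)
    End If

    Dim extension
    extension = WScript.Arguments(0)

    ' Create the gather manager object for SharePoint Foundation 2010 search.
    Set gadmin = WScript.CreateObject("SPSearch4.GatherMgr.1", "")

    ' Add the extension to every gather project of every gather application.
    For Each application In gadmin.GatherApplications
        For Each project In application.GatherProjects
            project.Gather.Extensions.Add(extension)
        Next
    Next

End Sub

Call Main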
-------------------

Once I copied the text above into a text file, I saved the text file as AddExtension.vbs (make sure you select All Files *.* for the "Save as Type" field, so it doesn't save the file with a txt extension anyway). Always pay attention to where you save files, it comes in handy later.



That script has to be run in order to make the necessary change in the registry. That's why I needed to know where the script was saved. So I opened an explorer window and browsed to the location where I put the new vbs file. Then I shift+right clicked in the window and selected "Open command window here".


I then entered the following command in the command prompt window and hit enter (of course):

wscript AddExtension.vbs rtf



That ran the script and added the correct entry in the registry, which now won't disappear if I reboot.  Which is good, because after the script runs, you have to reboot the server to get it to read the change (I know, that sucks, but at least you know for certain that it's necessary).

--You can confirm if the command ran by trying to run it again- it should give you a warning dialog box saying the object already exists. You can also go into the registry and check for a value in the extensionlist at key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\14.0\Search\Applications\6519b45e-2869-4f5a-9bb5-ec60370309fb\Gather\Search\Extensions\ExtensionList. If it's there, then the script worked.--
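The registry check can also be scripted. A minimal sketch in Python, assuming Python is on the server; the function names are made up for illustration, and the application GUID in the path is the one from my server as quoted above, so check yours before relying on it. The winreg module is Windows only, so it is imported inside the function that uses it.

```python
# Path under HKEY_LOCAL_MACHINE, copied from the key quoted in this post.
EXTENSION_LIST_KEY = (
    r"SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\14.0\Search"
    r"\Applications\6519b45e-2869-4f5a-9bb5-ec60370309fb"
    r"\Gather\Search\Extensions\ExtensionList"
)


def has_extension(values, extension="rtf"):
    """True if the extension appears among the ExtensionList values."""
    wanted = extension.lower().lstrip(".")
    return any(str(value).lower().lstrip(".") == wanted for value in values)


def read_extension_list():
    """Read every value under the ExtensionList key. Windows only."""
    import winreg  # standard library, available on Windows only

    values = []
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, EXTENSION_LIST_KEY) as key:
        index = 0
        while True:
            try:
                _name, value, _value_type = winreg.EnumValue(key, index)
            except OSError:
                break  # no more values under this key
            values.append(value)
            index += 1
    return values
```

On the server, has_extension(read_extension_list()) coming back True tells you the same thing as finding the rtf value by eye in regedit.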

Once the server rebooted, I needed something to test to confirm if rtf ifiltering would work. So I uploaded an rtf file with unique text in it:



Then I ran a full crawl (you can wait for the server to do it itself).

An example of how to do that using STSADM:

stsadm -o spsearch -action fullcrawlstart

[remember that to use a PowerShell or STSADM command to do a full crawl with SharePoint 2010 be sure the account you are logged in with owns the search database (yeah, I kid you not)]

People may say you need to restart the search service (net stop spsearch4 then net start spsearch4) before doing the full crawl, but that is not necessary- rebooting the server, by definition, restarts the service.

To test if the full crawl worked, after the master merge has been completed (you can see two entries in the Applications Event log under the category "Content Index Server"), I went to the SharePoint site where I uploaded the RTF file, and did a search using a word in the title of the file. When it came up in the search results, I saw two things. 1) it proved that the full crawl was successful, because SharePoint was at least able to index the metadata for the file's title. 2) if under the title of the file in the search results, a little summary of the text in the file is displayed, then SharePoint was able to index the content inside the file, meaning the rtf ifilter did work.


And, of course, the true test- doing a search on the site where the file is located, using one of the unique words in the rtf file itself- if it returns the rtf file in the search results, then it worked. And in my case, it did.

So the bottom line:


-Do not let anyone tell you that SharePoint Foundation 2010 cannot index/search RTF files. It can. Out of the box, with only two registry entries and a reboot.
-Do not let anyone tell you that you must BUY and install an RTF ifilter in order to be able to index RTF files. Spending money is NOT necessary, the file should already be in the system32 folder.
-The suggestions made to get SharePoint Server 2010 to index RTF files (namely, just registering the rtffilt.dll) do not work for SharePoint Foundation 2010. Just because that fix doesn't work for SharePoint Foundation does not mean SPF cannot search rtf files. That's just silly, and I've proven it. Thanks for reading this far. :)

Saturday, July 11, 2009

A few SharePoint-y things going on

I've got two things going on this month involving my sessions that I thought you might want to know about:


1-- The SSWUG folks are rebroadcasting every single session I've ever done for them (including the ones that are not my favorites, lol). If you're interested (and you are, or want to become, a member of that online SQL user group), the sessions will be streamed essentially back to back on the 22nd and 23rd of this month (July 2009). I'll be there for the live chats if you'd like to come visit. To see what those sessions are, check out their blurbs in the "Happenin' things" section of this blog on the right of the page.

(Some inside knowledge for my readers, the frugal admin series of sessions were done this year. The camera set up in the "new" studio was a bit awkward for me, but otherwise I was pretty well rested, familiar with the situation, and healthier. However, for the sessions that are being repeated from 2008, I wasn't so lucky. I was completely new to talking to a camera in an otherwise empty room, had a bit of food poisoning that day, and was suffering from a touch of undiagnosed clinical hypothyroidism (it's since been taken care of, but at the time I was struggling to do the hours and hours of presentation in the new environment). The first presentation of that set (and still my favorite content to do, ever) was about exploring (and exploiting) the free templates available online at Codeplex and Microsoft (namely the fantastic 40 templates). Since then I have expanded that session for the frugal admin series to include a bunch of nifty free tools... but I digress. That freebie session, as it were, is the best of the 2008 sessions, because it was the first one. After that I was so completely exhausted I was grey. So if you'd like to see me working under duress, feel free to stop by and check out those additional 2008 sessions... ; )


2-- On July 25th, I'll be in Baltimore (actually at the UMBC training center, 1442 South Rolling Road in Halethorpe, Maryland) at the free SharePoint Saturday event, presenting the Dashboards session of the Frugal Admin series. I am not sure exactly when my session will be that day, but I am hoping it'll be in the afternoon, for those of you wanting to drive up and check it out. I'll be giving away a free copy of my book (I only have three copies for giveaway there)-- and, of course, I'll be glad to sign it for you. This is a live and in person event, at no cost to you (except for travel), so if you can make it, I'd love to see you there.

In August I hope to have more time to finish creating the fall and winter Frugal Admin sessions. This will also be the time I put the spring sessions to pasture and offer them for free on my site (making those videos will take some time, so they'll be up no earlier than the end of the month). So stay tuned, this busy summer should soon be winding down, which will give me more time to post things here (and I do have all kinds of ideas).

Wednesday, April 22, 2009

Happy Earth Day!

Howdy everyone!

It's Earth Day today, http://www.earthday.gov/, and in a bit of interesting timing, it's also the first day of the SharePoint VConference (virtual conference) over at the SSWUG site: https://www.vconferenceonline.com/shows/spring09/sharepoint/.

Participating in virtual events is a great way to be green, as you save money on travel, hotel, wear and tear on your luggage, clothing, you, etc.

This time, for this vconference, I did sessions concerning getting the most out of WSS, seeing as many of us (all of us?) need to be able to stretch our time and dollars as far as we can. The sessions start with exploring the built-in web parts available with WSS. The point of the session, for me, was to give you an idea as to what's available in order to have you consider how useful they can be for you. Exploring, very, very quickly, the settings, views, zones, and configuration of the web parts. In addition, I have to admit that I was also focusing on getting the audience to understand web parts so they can use them for dashboards.

Why? Because the second session took those web parts, well at least the list view and content editor web parts, and made them into dashboards. I created two dashboards, one for users and one for management, during the session, securing them as well, then added them to the quick launch. Part of what I was doing with the dashboards was preparing you for the idea of having the resources of a site self-contained in that site.

Why? Because the final session was how, after setting up nice web parts and good dashboards, you can create a template out of the site itself. So it helps to show you how and why, first hand, centralized resources are useful in a site.

The final session, of course, covers how to create a site template out of the site used in the previous two sessions. It starts with what to consider, self-references, how to maximize the size limit for templates, how to create the site template, how to apply a site template, and then how to check to see if it really works and how to tweak it to be appropriate in the new place. I focus in particular as to what you might want to do with a site template, and what to worry about.

(and, because I had one minute and thirty seconds left, I tossed in a "contact us" page and showed an example of using google maps in the content editor web part by using the source editor-- just for those who stuck it out for all three sessions)

Now, for you, my dear readers, I will give you some insights into the sessions from my view, behind the scenes. Last time I recorded vconference sessions, I was really, really nervous and was suffering from a bit of food poisoning (had to eat at an airport on the way there...). This time, I had some real problems on the flight there (lost luggage, delayed flights, then cancelled flights, then a really, really late arrival-- 3 am my time), didn't get to sleep until 5am my time.

Needless to say, I was pretty groggy while at the studio. Sigh.

For the session slides, I was torn. I could have a few graphic filled slides and all demos, or I could have text that could be used as notes so attendees could use them later and a lot of demos. I went for word filled slides, and simply stated that I wouldn't be reading them all during the session, just hitting the highlights. I did try to go over as many as I could to a certain extent, but I also did simply do much of the content, then just gave some slides a mention.

Also, I was really afraid, because I was going to be going so fast to build web parts, dashboards, then site templates, that I was going to lose the more inexperienced viewers. So I might've gone too slow during the explanation portion of the session (tell them what you're going to do, tell them/show them how to do it, tell them what you did). If so, please forgive me, dear readers, I meant well. ; P

Finally, my big issue is, I am used to presenting exactly 1, 1.5, 3, or 6 hour sessions. For this vconference, I had fifty minutes. That's it. I lost ten minutes out of my carefully crafted Frugal Admin content.

And it did kick my butt a little. But, I tried to avoid compromising the content as much as I could.

So now you know, this is where I was while doing my sessions for this conference. I was exhausted, distracted by the truncated time, worried about not giving enough backstory so the attendees could follow along, worried about pacing (because of the loss of the ten minutes and my exhausted habit of rambling-- made worse when I am alone in a room instead of having a live audience), and determined to give you as much hands on proof of what works and what it looks like as I could.

If you want to see what I am talking about, my sessions are going to be broadcast tomorrow, Thurs., April 23rd, 2009. If you are interested, feel free to register (not free, but pretty cheap) for the vconference at http://www.vconferenceonline.com/shows/spring09/sharepoint/. Feel free to use my tell-a-friend code VCTAF502105-0, or the discount code SPVCCASP09. Keep in mind that you can also sign up for the on-demand feature, meaning you can download the videos to review later.

Another thing to keep in mind about sessions you've downloaded is that you can pause and rewind the video. So if there was a bit that went too fast during the live session, download the on-demand version, rerun the session, and pause where you need to.

Just a thought. : )

Happy Earth Day everybody!!

Thursday, January 15, 2009

Thoughts, plans, hopes, and dreams. Hello and welcome to 2009.

Howdy Everybody.

My apologies, again, for falling behind on my blog.

I'd finished all the work I needed to do for the SSWUG vconference in November, and was working on a two part vcast (of which I have the first half done and mostly edited) that I'd mentioned in one of my last posts, when my video card crapped out. Since, in my case, my video card is part of the motherboard of my machine, I had to wait weeks and weeks for it to be fixed. Falling weeks and weeks behind on blog entries, podcasts, vcasts, and work of any kind that required my trusty laptop.

Then the holidays came. Pushing my productivity down even further.

Then, on January 1st of 2009, I was sent an email congratulating me on being awarded an MVP- SharePoint Services.

Yup, for the year of 2009 at least, I am an MVP! Woo hoo.

That means I will be able to go to the infamous MVP Summit, visit the Redmond campus, meet the program team for WSS in person, and more. This is particularly important as the next version of WSS is right around the corner (well it may not be out til 2010, but there's got to be some beta testing to do), my timing is pretty good.

And now here I am, working on doing some more sessions for the spring SSWUG vconference (more on that in a second), planning for travel to the summit, and trying to get back to my vcasts and other things for this blog.

Concerning working on content for the vconference: I've been thinking of doing a full series of sessions, in part at the conference, or in whole. I may only be able to prep for about three for the conference, and may do the rest here.

You see, I have a problem coming up with titles for my session proposals (which are often, if accepted, used as session abstracts). The content is easier for me, describing just what I am going to do in the session and why that might be interesting. I can do that. But catchy titles? Not so much.

Because I was recently made an MVP I was able to send in some session proposals for TechEd 2009 (of course, I got the MVP code the day of the deadline, so I only had time to send in two). However, I really just didn't feel I did a good job with creating a catchy title for the proposals-- ones that popped, ones that really effectively sold what I was cookin' (so to speak).

Troubled, when I was asked to do some more sessions for SSWUG's vconference in the spring, I wasn't confident that I could really generate some good titles. And everyone knows it's the title that attendees (and the curious) click on. If the title isn't right, no one will bother to read the description.

So I hemmed and hawed, and hemmed some more. I thought of "Super Duper Admin Tricks" and "The Secret Life of WSS: things that even MOSS can't do". But they didn't quite fit for me.

Then I thought of something. Really, a lot of my motivation with sticking stubbornly to evangelizing WSS (instead of MOSS) is because I cringe at the idea of paying out the nose for something that isn't entirely going to be used. There are so many things that WSS does, for free, that there are good, solid reasons to never install MOSS. I like getting the most out of my servers, and their features for the money I spend, before I spend another penny.

And if I need to spend more money, I want to know why, exactly what I need to buy, and exactly what it needs to do before I write any checks.

And that's why I like doing presentations about WSS. Because I like to show you what you can do with what you have. Help you get the most out of the free product before you have to buy the server product, the standard CALs, and even the additional enterprise CALs. Push the envelope, think outside the Admin box.

At the very least, show you its limits so that you clearly know where the line is, and when it's time to pay for the server version of SharePoint.

In a word, I like to be frugal.

And because that sums up the point, the underlying motivation of a lot of my sessions, I've decided to do a "The Frugal Admin" series (well, if I don't get any feedback telling me not to).

Ideas I have for the series (please let me know which you like):

The Frugal Admin, How to get the most out of the built-in web parts: Don't just accept that your home pages are boring. Put some life into them without spending a penny. Explore the existing web part templates and broaden the horizons of existing list view web parts. Push them to the limit and turn your bland, hum drum home pages into the spectacular, useful, web part pages they were meant to be. Wow your users, impress your boss, and never wonder if you could have built yourself what you just paid someone else to make. Know for sure what your options really are, without any additional cost.

The Frugal Admin, Do it yourself dashboards. or maybe How to make your own Dashboards, without being a developer or SharePoint Designer: Dashboards are easy, depending on what you want to do with them. Why pay for one when you can roll your own. Come see the secrets of the simple dashboard; how to create the views, the web parts, and the web pages that make a site's home page more relevant from management to worker. See how far you can go out of the box before you spend a dime.

The Frugal Admin, Make your own Custom Site Templates. So you think only Microsoft can come up with useful site templates? Think again. Don't be trapped into thinking that if you want a nice site (especially one you'd like to deploy in a few places) you have to pay a developer to create it. Come see how we wrap up this three part series by bringing together the fancy web parts and dashboards to create our own unique site templates. Filled to the rim with useful goodness. With tips and tricks concerning workflows, resource libraries, and more. Learn how to make your templates self referencing, so they can pack up and go without having any extra files to worry about. Elevate your status, become that much closer to a WSS expert by seeing how it's really done-- all without special developer training or expensive additional software. Create the templates you wish Microsoft had thought of after attending this session.

The Frugal Admin, Exploiting what's out there. So you need to create a new user group site, or your managers want you to create a time sheet site to track each department's hours on SharePoint related projects. Maybe your IT department wants their own helpdesk site. Well, before you start either trying to create those yourself, or find someone to pay who will-- consider looking online at the resources already available from Microsoft and Codeplex. With the Fantastic 40 templates, Groupboard 2007, and the Community Kit for SharePoint, you've probably got all bases covered, and then some-- for FREE. So before you start making promises to anyone, stop by this session and get a glimpse of the good stuff, and discover all those pre-made templates before the need to make your own wears you down.

Stepping back a bit--

The Frugal Admin, So you're considering installing WSS? A quick run down on what you need to install WSS, tips concerning licensing, Authentication (AD, and a few of its cheap alternatives- AD Account Creation Mode and Forms Based Authentication), Planning, and a quick overview about Design. Consider who you want to use SharePoint and how they're going to use it. Know what you're getting into before you start, and you'll always save money in the end. Don't be surprised, plan ahead. (for those experienced admins, you might want to stop by to get some insight into why, sometimes, it is a good idea to install SQL and SharePoint on the same server... for those tips and more, stop by the session, it'll be worth it)

For those experienced Frugal Admins, I'd like to go into detail about getting more out of SharePoint doing Intranet/Extranet deployments (not always the cheapest thing to do, but I'll show you how to squeeze every penny out of it), the nitty gritty on ADAC, the inside scoop on Directory Management Services (and why it's rather a wet balloon), and more. There are also some interesting tools for those admins- for free of course- that I'd like to explore, such as the administration kit for SharePoint, and some of the solution accelerators, as well as good ol' Search Server Express.


So what do you think, would these sessions be worth attending? Anything you'd like to see that I haven't mentioned? Feel free to comment. Kthxbai. ; )