Woody Windischman

May-27-2011

Everything There is to Know About SharePoint 2010 SP1

Truly "Packed" for "Service!"

*** UPDATE: Service Pack 1 has now been released! ***

As mentioned in my previous post, Tech-Ed 2011 was somewhat subdued. We're pretty much "between launches", meaning that the emphasis was on real-world application of the existing technology stack, rather than The Next Big Thing. That said, there were some glimmers of things to come.

One of those glimmers was the announcement that Service Pack 1 for Office and SharePoint 2010 products was almost here. Although an exact date wasn't given, we were told that it would ship in "Late June 2011". That's about as specific as Microsoft ever gets for anything short of a major launch event.

It's In There...

Microsoft also released a few hints of what will be included in SP1. Before I go into detail, however, you may need a little background. The effective definition of the term "Service Pack" at Microsoft has always been something of a moving target. Although a Service Pack is officially just supposed to contain "bug fixes and stability and compatibility enhancements", Service Packs in the past have ranged from just that to conglomerations of new features that could almost have justified a full point release. (And let's not even get into actual point releases, "Feature Packs", and the infamous SharePoint 2007 "Infrastructure Update"!) SP1 for SharePoint 2010 falls somewhere in between those extremes.

Note: This article describes only the features of SP1 that were officially disclosed at Tech-Ed. These may or may not be the only enhancements to come...

Of course, the core of any service pack is those bug fixes I mentioned. While Microsoft didn't disclose any specific new bugs being fixed, they did say that everything in the previously released Cumulative Update packages (up to and including April 2011) will be included. The main difference between the fixes in a Cumulative Update (also called a "Hot Fix Rollup") and those in a Service Pack is that the Service Pack fixes have received more thorough regression testing, and are considered appropriate for deployment to all customers. CUs, though supported, are primarily meant for customers who know they're suffering from particular issues addressed in the package. This is also why, while Service Packs are distributed widely, CUs have to be specifically requested. Of course, all of these updates are much easier to come by than they used to be. Check out the Updates for SharePoint 2010 Products page for the latest and greatest details.

Going Beyond the Bug(s)

So, what are the actual new features in SharePoint 2010 SP1?

Support for Internet Explorer 9 Native Mode and Google Chrome

SharePoint has always had great browser support, but this update will officially add Google Chrome to the "A" list of browsers, supporting the vast majority of SharePoint features - including Office Web Apps. Ditto for Internet Explorer 9 in "Native Mode" - whatever that means... :)

Site Recycle Bin

Here's a feature that people have been asking for almost as long as SharePoint has been around! As of SharePoint 2010 SP1, administrators will be able to recover deleted sites and site collections without having to first restore a SQL Server database! There have been third party tools, and open source projects, to accomplish this in the past, but now the functionality will be baked in.

Shallow Copy

Shallow Copy needs a little explanation. No, it doesn't mean a clone with the personality of a Barbie(tm) doll. This feature is primarily of interest to folks using the Remote Blob Storage (RBS) feature of SQL Server 2008 R2 to reduce the size of their databases. Essentially, when you move a site collection from one database with RBS enabled to another, Shallow Copy allows the file-system-based files to remain where they are, with just the pointers in the Content Database updated. Otherwise, the files would need to be read off the disk, then resaved as part of the copy operation.
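To make the difference concrete, here is a toy sketch in Python. It is not SharePoint or RBS code; the `BlobStore` class, the dictionaries standing in for content databases, and the file names are all made up for illustration. It only counts how much file I/O each style of copy would need.

```python
from dataclasses import dataclass, field


@dataclass
class BlobStore:
    """Hypothetical stand-in for the file-system store that RBS points at."""
    files: dict = field(default_factory=dict)
    reads: int = 0
    writes: int = 0

    def read(self, blob_id):
        self.reads += 1
        return self.files[blob_id]

    def write(self, blob_id, data):
        self.writes += 1
        self.files[blob_id] = data


def deep_copy(source_db, target_db, store):
    """Without Shallow Copy: every file is read off disk and saved again."""
    for doc, blob_id in source_db.items():
        new_id = blob_id + "-copy"
        store.write(new_id, store.read(blob_id))
        target_db[doc] = new_id


def shallow_copy(source_db, target_db):
    """With Shallow Copy: only the pointers move; the files stay where they are."""
    target_db.update(source_db)


store = BlobStore(files={"blob-1": b"spec", "blob-2": b"budget"})
source_db = {"spec.docx": "blob-1", "budget.xlsx": "blob-2"}

deep_target = {}
deep_copy(source_db, deep_target, store)
deep_io = (store.reads, store.writes)

store.reads = store.writes = 0
shallow_target = {}
shallow_copy(source_db, shallow_target)
shallow_io = (store.reads, store.writes)

print("deep copy I/O (reads, writes):", deep_io)
print("shallow copy I/O (reads, writes):", shallow_io)
```

In this toy model the deep copy touches every file twice, while the shallow copy touches none of them, which is the whole point of the feature for large RBS-backed site collections.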

StorMan.aspx

This one is less a "new" feature than the return of an "oldie but goodie". When you had a quota assigned to a site collection, it could be very handy to have a report of where on your site you happened to be using up space. In SharePoint 2007, there was a utility page for this report called storman.aspx. For reasons I'm not sure of, this page was not included in SharePoint 2010. Service Pack 1 brings it back home.

SQL 11 Support

There was a lot of buzz at Tech-Ed about Denali (aka SQL 11 - who knows what the "real" name will be). SP1 brings official support for it to SharePoint. No official word on whether it will light up any new features, except maybe the Crescent real-time Reporting Services tool. I saw a Crescent demo at the show, and it was really cool. I might write more about that later. In the meantime, check out this SQL Reporting Services team blog post...

Fix, or Feature?

In the slide show I saw, several items were listed as "Fixes", though a lot of them sure sounded like new and/or improved functionality to me...

Office Web Apps

The Office Web Apps, or OWA, get a lot of love in SP1. I already mentioned the enhanced browser support (Chrome, IE9 "Native"). In addition, you get such goodies as:

  • Open Document Format (ODF) support for viewing and editing
  • Print Word documents in edit mode (not just preview mode)
  • Insert Charts with Excel Web App
  • Copy/Paste values and formulas in Excel Web App by dragging the "fill" handle
  • Print from PowerPoint Web App
  • Edit directly in more shapes in PowerPoint Web App
  • Insert Clip Art in PowerPoint Web App

All in all, very worthy improvements!

Indexing Connector for Documentum

Even though SharePoint offers all kinds of document management, there are still customers for whom Documentum is the product of choice. The Indexing Connector for Documentum allows SharePoint Search to crawl Documentum repositories and return appropriately ranked results within SharePoint. The specific updates for this connector include:

  • Improves overall crawl performance
  • Provides support for customized Documentum Foundation Services (DFS) URL
  • Provides support for Documentum Trusted Content Services (TCS) "Access Restriction" Access Control List (ACL) for security trimming
  • Provides support for custom security trimming solution for TCS enabled Documentum repository by extracting TCS ACLs into SharePoint crawled properties
  • Provides support for Documentum “superuser” permissions level

FAST Search Server 2010

FAST Search Server 2010 is Microsoft's high-end search product. It wasn't left out of the Service Pack 1 frenzy. Here's what you get:

  • Adds the possibility to add and remove indexer and search columns on a live system
  • Adds more flexible custom property extractors
  • Adds Greek spellchecking and stemming
  • Improves title extraction for Word and PowerPoint documents. Titles are now presented correctly and relevancy for Word and PowerPoint documents is improved.
  • Improves default schema which improves relevancy
  • Improves index backup/restore

Conclusion

And that's the whole thing, at least as far as the information that was released at Tech-Ed goes. Service Pack 1 has been over a year in the making, and it seems pretty clear that it will have been well worth the wait. Although we don't yet have the exact release date, you can always keep up to speed on what patches are current on the SharePoint Update pages.

Here's that Update Page link again

And here's the equivalent for SharePoint 2007 technologies

Have a great Memorial Day weekend!


May-27-2010

Successful SharePoint 2010 People Search

Finding your Way through the Configuration Maze

SharePoint has two basic configuration modes:

- SharePoint sets up "Everything" for you
- You set up "Everything" manually

There is precious little in between these two extremes. The good news is, if you let SharePoint configure everything, chances are everything will work. The bad news is, these settings rarely reflect best practices, and if (when?) you want to tweak some of those settings later, you often find that one change has to lead to another, and another, and another in order to get back to working order. By the time you're done, you may as well have done it manually in the first place.

Configuring SharePoint 2010 to do people search is one such area. The first half of the manual configuration (or reconfiguration) process is setting up the User Profile import. That is fairly well documented in several places. Probably the best is by fellow MVP Spencer Harbar in his article "A Rational Guide to Implementing SharePoint Server 2010 User Profile Synchronization".

The Bread of the Sandwich

Given how comprehensive Spencer's article is, you wouldn't think that there is anything more to say, and in truth, it is the meat of the issue and often the hardest part to get working. But as I said, that is only half of the story - getting user profile data into SharePoint. What my article is about is letting your users find the information. Since some of this comes before, and some comes after, the AD configuration in Spencer's article, you could think of this as the bread of the sandwich.

Once Central Administration is up and running, the first thing it offers is the opportunity to let another Wizard configure all of your service applications for you, and set up a default SharePoint web application. If you followed Spencer's advice, you said "No" to its kind offer. His article assumes you did, and gives instructions for setting things up completely manually. For this article, I'll assume you said "Yes" and want to fix things up. For completeness, I cover some of the same ground, and you can safely follow either set of instructions for creating the User Profile Sync service app.

Again, if you say "Yes", you'll get something that works. But if you look carefully, you'll discover two big things that violate good configuration practice for production environments:

  1. The Search service application is configured to use the Server Farm/Database Access account as the default content access account.
  2. My Sites and the Profile host site collection are configured to live within that first web application, which is named with the host name of your central administration server.

The first one is easy to address - on the surface. Create a suitable domain account, then in Central Administration, go to your Search service application and assign it to be the default content access account.

SharePoint will give it a default read policy on every web application associated with that service application. That's great as far as it goes, but hold that thought for a moment. I'll be coming back to it shortly.

As for the second issue, having the personal sites embedded in a content web application, you'll need to delete and re-create the User Profile Service application to resolve that. Or create the service application for the first time if you didn't invoke the wizard. Whether correcting from the wizard or creating the applications for the first time, other than the deletion, the steps (and some of the potential issues) are the same.

First, create a "normal" web application for your profiles and personal sites. Create a site collection at the root of the web application using either the "Blank" or "MySite Host" template.

Second, go to your Service Applications page and from the New button select User Profile Synchronization service application. Like most service applications, this one requires you to allocate an application pool and a number of databases. The page suggests leaving them as the default names, which you can, though if you do, make sure the databases from the original service application (if any) are deleted first. Otherwise, give them appropriate names for your environment.

Toward the end of the configuration page, specify the server in your farm that you want to host the profile sync service, and enter the web application you defined in the previous step.

[Image: MyWebApp]

After you accept your settings, wait for the service application to finish creating. (You will return to the UI before that process completes.) Now would be a good time to go read Spencer's article to see what you should have done to get to this point, and have your AD administrator set the permissions required for your profile import account.

By that time, you should be able to complete the User Profile service application configuration as instructed.

The Last Piece of Bread

In a perfect world, you would be done. Of course, we don't live in a perfect world. Chances are, you'll get a wonderful set of profiles imported, and you can navigate to them and see everything. If your users create MySites, you'll probably even be able to find their content. But do a people search, and you get a whole bunch of "nothing". That's because you're not actually crawling the profile store - at least not successfully.

Time to go back to Central Administration, and first look at your Search service application's management page. Click the Content Sources link on the left hand side, and open/edit your Local SharePoint Sites content source. In the Start Addresses section, you will see a box with entries similar to those below:

[Image: Start Addresses box]

Notice the sps3: line. This is the protocol SharePoint uses to read profiles. (Note: It isn't a "protocol", per se. It just instructs SharePoint to call a specific web service hosted at that address.) If you ran the wizard to configure your service applications, it will be pointing at the original web application created by it. You'll need to change it to reflect your new profile web application, then save the changes to your content source definition. Also, if you deleted the original wizard-created web application (or aborted its creation), you'll need to delete the regular http: line referencing it.
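If it helps to see the edit spelled out, here is a small Python sketch of the change to the start addresses. It is only an illustration of the text edit you make in the Start Addresses box, not a SharePoint API call, and the host names (`wizardhost`, `portal`, `mysites`) are hypothetical placeholders for your own web applications.

```python
def repoint_start_addresses(addresses, new_profile_host, deleted_hosts=()):
    """Return the start addresses with the sps3: entry moved to the new
    profile web application and any deleted web applications dropped."""
    deleted = {host.lower() for host in deleted_hosts}
    updated = []
    for address in addresses:
        scheme, _, rest = address.partition("://")
        host = rest.split("/", 1)[0]
        if scheme.lower() == "sps3":
            # The profile crawl entry always follows the profile web application.
            updated.append(f"sps3://{new_profile_host}")
        elif host.lower() in deleted:
            # Regular entry for a web application that no longer exists.
            continue
        else:
            updated.append(address)
    return updated


# Hypothetical "before" list, as the wizard might have left it.
before = ["http://wizardhost", "sps3://wizardhost", "http://portal"]
after = repoint_start_addresses(before, "mysites", deleted_hosts=["wizardhost"])
print(after)
```

Here the sps3: entry now targets the new profile web application, and the regular http: entry for the deleted wizard-created web application is gone.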

You might think (again) that that's all there is, but again you'd probably be wrong. Once you make the change above, you'll probably start seeing access denied errors on that "server". Remember when we assigned a new default content access account way back in step one? Well, even though it has permission to read the contents of the web site, the service under the sps3 protocol leads right back to the User Profile Synchronization service application, and you didn't tell that application to let the new content access account in.

To do that, navigate to the Manage Service Applications page, and highlight your User Profile Service Application. Click the Administrators icon in the ribbon.

[Image: ProfileAdmins]

You'll need to add your default content access account to the list of "administrators". It won't really be an administrator - notice that there is an array of permissions available. Once you add the account, highlight it and ensure that the "Retrieve People Data for Search Crawlers" permission is checked, as shown below:

[Image: PermissionDialog]

Click OK, and reset IIS on the profile import server. Maybe even reboot it.

Best Practices?

At last, you're done. You should now have functioning user profiles and people search, configured in accordance with "best" practices. (Yeah, "best" is relative...) Still, there are reasons for this kind of configuration. It gives you an easily manageable farm, with excellent control over My Sites - ensuring that personal content is in separate databases from your corporate portal data. The account used to crawl won't be the "all powerful" Farm account, and you can tell the difference through access and audit logs between administrative access to resources and the search crawler's.

Now, wasn't that a tasty sandwich?


Oct-12-2009

Knowing Your Limitations

"2.1 Billion ID's Should be Enough for Anybody!"

One of the more infamous stories about Bill Gates is that he once said "640K of memory should be enough for anyone." That wasn't true - he never said it, but it did point up the frustration that came from one of the design limits of the original IBM PC. The memory between 640K and 1MB (which was the physical limit of the CPU) was allocated by IBM for video, I/O buffers, and lots of other "housekeeping", and therefore couldn't be accessed by DOS. This was fine at the time, when the typical computer came with 64K of RAM, and even expanding to 512K was a luxury. But when applications (like Lotus 1-2-3, dBase III, and even Windows itself) became complicated enough to require that memory, and more powerful CPUs became available that allowed access to even more, that big "gap" before getting to the extended memory required more effort to program around than anyone could have predicted. (Yes, that's way over-simplified, but it is enough to get the point across...)

The reason I bring up this little history lesson is to point out that when you are designing products, you have to set limits somewhere. Sometimes these limits are intrinsic, like the 1MB maximum RAM of the 8088 CPU. Others are compromises, like how much of that 1MB to allocate for system housekeeping, and where to locate it in the address space. You hope you set these high enough that most users will never see them, but they are there.

SharePoint also has a number of limits. Most of them are well documented. Some of them are "soft" limits - places where you see performance degradation. Others are "hard" limits, like the maximum size of an integer value. But some limits are buried under the covers, because they are internal to a function, and users never see the processes that are impacted. If they are set high enough, the users will never even know they exist.

Crawling Forward

Unfortunately, there is a limit that wasn't set high enough. This was buried deep inside the MOSS and MSS search databases. Most database tables have a field for a unique identifier. This is automatically incremented every time a new row is added. Typically, a SQL Server Integer (int) is used for this ID, allowing up to just over 2 billion items to be added (2,147,483,647 if you must know). That's a lot. But this value just goes up - it isn't decremented if you delete a row.

In the SharePoint Search DB, there is a table that keeps track of all of the links in your crawled content. Whenever you do a new crawl, rows are added to and deleted from this table. This table originally used the int referenced above for its ID field. Now, there can be a lot of links in a SharePoint site, but still, 2.1 billion should take an awfully long time to reach, and in most cases it does. But reach it you can. For very large and complicated sites, if you do a full crawl every day (which deletes and replaces all of the link references) you can reach it faster than you might (and the developers did) think.
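For a rough feel of how quickly that can happen, here is a back-of-the-envelope Python sketch. It assumes the simplest possible model, in which every full crawl consumes one fresh ID per link; the 10 million links figure is purely hypothetical, and the real tables may well burn through IDs at a different rate.

```python
INT_MAX = 2**31 - 1  # largest SQL Server int value: 2,147,483,647


def days_until_exhausted(links_per_full_crawl, full_crawls_per_day=1, already_used=0):
    """Days of crawling left before the ID column runs out, assuming each
    full crawl uses one new ID per link (a simplifying assumption)."""
    remaining = INT_MAX - already_used
    ids_per_day = links_per_full_crawl * full_crawls_per_day
    return remaining // ids_per_day


# Hypothetical farm: 10 million links, one full crawl every day.
print(days_until_exhausted(10_000_000))  # prints 214
```

Under those assumptions, a brand new table would be out of IDs in about seven months, even though it never holds more than 10 million rows at any one time.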

So, what happens if SharePoint actually hits this limit, and runs out of IDs? It isn't pretty. Essentially the crawling process gets stuck. It asks the database for permission to write the next available row, and since there isn't an ID that can be given to it, the database just says "no". Unfortunately, SharePoint doesn't take no for an answer, and keeps asking. You will, occasionally, see an error in the event log talking about a SQL Identity failure, but unless you were aware of this possibility, it wouldn't make much sense.

Recovering

This also prevents you from effectively controlling search. Because SharePoint insists on finishing the last thing it was doing, you can't stop the crawl. Because there isn't much to go on in the logs, and it takes some SQL Server proficiency to accurately diagnose the problem, many times, this results in folks rebuilding their SSP, with all of the pain and agony that entails, just for the want of an ID.

Note: At this point, you need to consider the search index on this SSP corrupt. There is nothing that can recover the ability to crawl new content without resetting your index and doing a full crawl as described in the prevention section below.

Even if you can successfully diagnose it, there are very few supportable solutions that *don't* involve rebuilding the SSP one way or another. Remember, directly modifying the SharePoint databases yourself can result in an unsupported state. So, if you reset the seed of the maxed-out table to 1 in order to get control of the crawl back and stop it, you should restore the search database from a backup to reach a production state before you reset the crawled content (see below), which resets the database to an initialized state.

You can also restore your whole SSP from a backup, but that's almost as much fun as rebuilding it, and it assumes you have a restorable backup of your SSP.

An Ounce of Prevention

Obviously, it is much better to prevent this problem from occurring in the first place than to try recovering from it. There are a couple of ways to do this. The first and best is to upgrade your SharePoint environment to Service Pack 2. Among the many enhancements in SP2, the ID fields in the search databases that were prone to maxing out are updated to "big" integers. BigInts use twice as many bits as regular integers. That doesn't just double the capacity, though. It makes it about 4 billion times as large. (For those who really need to know, it makes the number of possible IDs 9,223,372,036,854,775,807!) So, if it took 6 months to reach the old limit, it would take roughly 24 billion months to reach the new one.
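If you want to check that arithmetic yourself, a few lines of Python will do it. The only assumption here is the usual one that a SQL Server int is a signed 32-bit value and a bigint is a signed 64-bit value; the 6-month figure is just the example from the paragraph above.

```python
INT_MAX = 2**31 - 1     # 2,147,483,647
BIGINT_MAX = 2**63 - 1  # 9,223,372,036,854,775,807

# How many times larger the new ID range is (whole-number part).
growth = BIGINT_MAX // INT_MAX

# If the old limit took 6 months to hit, scale that by the same factor.
months_to_new_limit = 6 * growth

print(f"{BIGINT_MAX:,}")
print(f"{growth:,}")
print(f"{months_to_new_limit:,}")
```

The exact factor comes out a little over 4.29 billion, so the rounded figures in the text are, if anything, on the conservative side.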

If you can't upgrade to SP2, you should consider adding a periodic reset of the index to your maintenance plans - especially if you have a very large corpus, with lots of links. The option to do this is available from the Quick Launch on the Search Administration page.

Resetting the crawled content doesn't impact your settings, keywords and best bets, etc... But it does delete your existing index and completely reset the search crawl database - including the table ID fields. After the reset, search results will not be available until a full crawl is performed, so you should schedule this to take place during a down time and/or notify your users of the search outage. If you have multiple content sources defined, you will need to crawl all of them.

When you select reset, you will get a screen asking if you want to turn off search alerts during the reset. It will default to being selected, and you should leave it that way.

The alerts can be reactivated once your crawls have been completed.

Conclusion

As Clint Eastwood once said as Dirty Harry, "A man's got to know his limitations." Everyone, and every thing, has limits.

Limits are only a problem when you don't know about them, and don't take them into account. SharePoint, as powerful as it is, has plenty of them. In addition to the hidden limit I covered in today's article, you might want to review some of the better-known limits in the SharePoint planning material: Planning for Software Boundaries.


Aug-19-2009

My Free SharePoint Twitter Integration Components

Yes - I Still Like Twitter!

If you've been following my saga over the last few weeks, you'll know that I was temporarily suspended from Twitter due to a cross-site attack that caused an inappropriate spam link to be injected into my tweetstream. While I am still disappointed that it took Twitter customer service almost two weeks to reinstate me, I do still like Twitter.

In an effort to "bury the hatchet", I am re-posting links to some components I wrote to bring Twitter into SharePoint. The first two are simple and fancy Federated Location Definitions for Search Server 2008, or MOSS Search (post-Infrastructure Update). The third is a simple Data View web part that can provide a Twitter search result on any SharePoint page, including WSS.

(Note: For all of the download links below, right-click and choose "Save target as" to retrieve them.)

Federated Locations

See the original articles: Part 1, Part 2

Download the "basic" Twitter search results Federated Location Definition

Download the "deluxe" Twitter search results Federated Location Definition

Data View Web Part

See the article on how to create this part.

Download this part.

You can see all three components in action here.