Making Sense of the SharePoint World

Apr-29-2010

SharePoint 2007 Security Vulnerability - Action Required

Stop the Presses!

Microsoft has announced the discovery of a cross-site scripting vulnerability in the SharePoint 2007 (and WSS 3.0) Help system. Although they are still investigating the root cause and working on a long-term solution, they have provided a workaround which will mitigate the only known (at the time of this writing) attack vector. You can read the details of the vulnerability and a server-side workaround in Security Advisory 983438. The Security team have also posted some more explanations about this class of vulnerability and some client-side mitigations in this blog post.

A Little More Info

The vulnerability is what is known as an "injection attack". Essentially, arbitrary JavaScript can be run by being passed as a carefully crafted parameter to the built-in SharePoint Help page. This script will run in the context of the current user's client session, and can therefore perform any actions against the SharePoint site that the user could.
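To make the idea concrete, here is a minimal Python sketch of a reflected-script injection. This is not the actual code of the SharePoint Help page (which I haven't seen); the function names and the "topic" parameter are made up for illustration. It only shows why echoing a request parameter into a page without encoding it lets a script tag through, and how HTML-encoding the value defuses it.

```python
import html


def render_help_unsafe(topic: str) -> str:
    """Hypothetical page that echoes the parameter straight into the markup."""
    return f"<h1>Help topic: {topic}</h1>"


def render_help_safe(topic: str) -> str:
    """Same page, but the parameter is HTML-encoded before it is echoed."""
    return f"<h1>Help topic: {html.escape(topic)}</h1>"


if __name__ == "__main__":
    # A "carefully crafted parameter": markup instead of a topic name.
    payload = "<script>alert('runs as the current user')</script>"
    print(render_help_unsafe(payload))  # the script tag reaches the browser intact
    print(render_help_safe(payload))    # the tag is rendered as harmless text
```

In the unsafe version the browser would execute the script with the visiting user's session, which is exactly the exposure described above.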

This does not turn the user into an administrator, or otherwise elevate their own privileges. As far as I can tell, it does not (as some reports have suggested) expose the user's password. Update: This is with the default SharePoint authentication. Custom authentication methods could potentially store credentials in an accessible manner. I have no way to test that scenario, but any attacker would need intimate knowledge of how that authentication module worked in order to exploit it. So, while your passwords are probably safe, this vulnerability could allow an attacker to probe for and read any information in SharePoint that the user does have access to, or to vandalize or destroy information the user is permitted to update. Therefore, for the time being I strongly suggest disabling the help.aspx file in the Layouts folder of your SharePoint servers, either by following the instructions in the security advisory or through other means. (At this time, I don't suggest just deleting the file.)

Update #2

It has been pointed out that, although the attack itself cannot (usually) directly glean the user's credentials, an injected script could prompt an unsuspecting user into providing them, thinking the request was coming from your site. This does not change my advice (applying the mitigation procedures), but it should increase your priority in doing so.


Oct-12-2009

Knowing Your Limitations


"2.1 Billion ID's Should be Enough for Anybody!"

One of the more infamous stories about Bill Gates is that he once said "640K of memory should be enough for anyone." That wasn't true - he never said it, but it did point up the frustration that came from one of the design limits of the original IBM PC. The memory between 640K and 1MB (which was the physical limit of the CPU) was allocated by IBM for video, I/O buffers, and lots of other "housekeeping", and therefore couldn't be accessed by DOS. This was fine at the time, when the typical computer came with 64K of RAM, and even expanding to 512K was a luxury; but when applications (like Lotus 123, dBase III, and even Windows itself) became complicated enough to require that memory, and more powerful CPUs became available that allowed access to even more, that big "gap" before getting to the extended memory required more effort to program around than anyone could have predicted. (Yes, that's way over-simplified, but it is enough to get the point across...)

The reason I bring up this little history lesson is to point out that when you are designing products, you have to set limits somewhere. Sometimes these limits are intrinsic, like the 1MB maximum RAM of the 8088 CPU. Others are compromises, like how much of that 1MB to allocate for system housekeeping, and where to locate it in the address space. You hope you set these high enough that most users will never see them, but they are there.

SharePoint also has a number of limits. Most of them are well documented. Some of them are "soft" limits - places where you see performance degradation. Others are "hard" limits, like the maximum size of an integer value. But some limits are buried under the covers, because they are internal to a function, and users never see the processes that are impacted. If they are set high enough, the users will never even know they exist.

Crawling Forward

Unfortunately, there is a limit that wasn't set high enough. This was buried deep inside the MOSS and MSS search databases. Most database tables have a field for a unique identifier. This is automatically incremented every time a new row is added. Typically, a SQL Server Integer (int) is used for this ID, allowing up to just over 2 billion items to be added (2,147,483,647 if you must know). That's a lot. But this value just goes up - it isn't decremented if you delete a row.

In the SharePoint Search DB, there is a table that keeps track of all of the links in your crawled content. Whenever you do a new crawl, rows are added to and deleted from this table. This table originally used the int referenced above for its ID field. Now, there can be a lot of links in a SharePoint site, but still, 2.1 billion should take an awfully long time to reach, and in most cases it does. But reach it you can. For very large and complicated sites, if you do a full crawl every day (which deletes and replaces all of the link references) you can reach it faster than you might (and the developers did) think.
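To get a feel for how a counter that only goes up can run out, here is a small Python sketch. It is a toy model, not the search database's real schema: the class and function names are my own, and the link counts are invented numbers rather than measurements from a real farm.

```python
INT_MAX = 2**31 - 1  # largest value of a SQL Server int: 2,147,483,647


class IdentityCounter:
    """Toy model of an auto-incrementing ID that is never reused."""

    def __init__(self, max_id: int = INT_MAX) -> None:
        self.max_id = max_id
        self.last_id = 0
        self.live_rows = set()

    def insert(self) -> int:
        if self.last_id >= self.max_id:
            raise OverflowError("no IDs left")
        self.last_id += 1
        self.live_rows.add(self.last_id)
        return self.last_id

    def delete(self, row_id: int) -> None:
        # Deleting frees the row, but the counter does not go back down.
        self.live_rows.discard(row_id)


def full_crawls_until_exhaustion(links_per_crawl: int, max_id: int = INT_MAX) -> int:
    """Number of full crawls that fit before the counter runs out,
    assuming every full crawl re-adds each link with a fresh ID."""
    return max_id // links_per_crawl


if __name__ == "__main__":
    table = IdentityCounter(max_id=5)
    for _ in range(2):  # two "full crawls" of two links each
        ids = [table.insert(), table.insert()]
        for row_id in ids:
            table.delete(row_id)
    # The table is empty again, yet four of the five IDs are already used up.
    print(len(table.live_rows), table.last_id)
    print(full_crawls_until_exhaustion(10_000_000))
```

With a hypothetical 10 million link rows rewritten per full crawl, that works out to 214 crawls, which is well under a year of daily full crawls.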

So, what happens if SharePoint actually hits this limit, and runs out of IDs? It isn't pretty. Essentially the crawling process gets stuck. It asks the database for permission to write the next available row, and since there isn't an ID that can be given to it, the database just says "no". Unfortunately, SharePoint doesn't take no for an answer, and keeps asking. You will, occasionally, see an error in the event log talking about a SQL Identity failure, but unless you were aware of this possibility, it wouldn't make much sense.

Recovering

This also prevents you from effectively controlling search. Because SharePoint insists on finishing the last thing it was doing, you can't stop the crawl. Because there isn't much to go on in the logs, and it takes some SQL Server proficiency to accurately diagnose the problem, many times, this results in folks rebuilding their SSP, with all of the pain and agony that entails, just for the want of an ID.

Note: At this point, you need to consider the search index on this SSP corrupt. There is nothing that can recover the ability to crawl new content without resetting your index and doing a full crawl as described in the prevention section below.

Even if you can successfully diagnose it, there are very few supportable solutions that *don't* involve rebuilding the SSP one way or another. Remember, directly modifying the SharePoint databases yourself can result in an unsupported state. So, if you reset the seed of the maxed-out table to 1 in order to get control of the crawl back and stop it, you should restore the search database from a backup to reach a production state before you reset the crawled content (see below), which resets the database to an initialized state.

You can also restore your whole SSP from a backup, but that's almost as much fun as rebuilding it, and it assumes you have a restorable backup of your SSP.

An Ounce of Prevention

Obviously, it is much better to prevent this problem from occurring in the first place than to try recovering from it. There are a couple of ways to do this. The first and best is to upgrade your SharePoint environment to Service Pack 2. Among the many enhancements in SP2, the ID fields in the search databases that were prone to maxing out are updated to "big" integers. BigInts have twice as many bits as regular integers (64 instead of 32). That doesn't just double the capacity, though. It makes it more than 4 billion times as large. (For those who really need to know, it makes the number of possible IDs 9,223,372,036,854,775,807!) So, if it took 6 months to reach the old limit, it would take roughly 26 billion months to reach the new one.
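For anyone who wants to check the arithmetic behind that comparison, here is a short Python sketch. The limits are the standard maximums for SQL Server int and bigint; the function names and the 6-month figure are only there for illustration.

```python
INT_MAX = 2**31 - 1     # SQL Server int maximum
BIGINT_MAX = 2**63 - 1  # SQL Server bigint maximum


def capacity_ratio() -> float:
    """How many times more IDs a bigint offers than an int."""
    return BIGINT_MAX / INT_MAX


def months_to_exhaust_bigint(months_to_exhaust_int: float) -> float:
    """Scale the time taken to exhaust an int up to a bigint, at the same usage rate."""
    return months_to_exhaust_int * capacity_ratio()


if __name__ == "__main__":
    print(f"int max:    {INT_MAX:,}")
    print(f"bigint max: {BIGINT_MAX:,}")
    print(f"ratio:      {capacity_ratio():,.0f}")
    print(f"months:     {months_to_exhaust_bigint(6):,.0f}")
```

The ratio comes out a little over 4 billion, so even a farm that burned through the old range in months would take billions of months to exhaust the new one.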

If you can't upgrade to SP2, you should consider adding a periodic reset of the index into your maintenance plans - especially if you have a very large corpus, with lots of links. The option to do this is available from Quick launch in the Search Administration page.


Resetting the crawled content doesn't impact your settings, keywords and best bets, etc... But it does delete your existing index and completely resets the search crawl database - including the table ID fields. After the reset, search results will not be available until a full crawl is performed, so you should schedule this to take place during a down time and/or notify your users of the search outage. If you have multiple content sources defined, you will need to crawl all of them.

When you select reset, you will get a screen asking if you want to turn off search alerts during the reset. It will default to being selected, and you should leave it that way.


The alerts can be reactivated once your crawls have been completed.

Conclusion

As Clint Eastwood once said as Dirty Harry, "A man's got to know his limitations." Everyone, and every thing, has limits.

Limits are only a problem when you don't know about them, and don't take them into account. SharePoint, as powerful as it is, has plenty of them. In addition to the hidden limit I covered in today's article, you might want to review some of the more well known limits in the SharePoint planning material: Planning for Software Boundaries.


Sep-20-2009

Indexing SharePoint List Columns


Helping SharePoint Help You

A SharePoint system manages a huge amount of data. Amazingly, in a SharePoint content database, all of the data, for every list and library item, in every site and subsite, is stored in a single table. Look at hundreds of sites, each with dozens of lists and libraries, each potentially containing hundreds or thousands of items, and you end up with one massive table!

Not So Limiting After All

Everybody has heard about the so-called "2000 item limit" in SharePoint. Remember that this isn't really a limit. SharePoint is quite capable of handling lists with tens, or even hundreds, of thousands of items. The issue is the "rendering" of those items, which starts becoming perceptibly slower if you have more than 2000 items in a single view.

While the indexing discussed in this article can have a minor effect on this rendering, it really is more general, and can improve list performance across the board.

Ask any DBA how to achieve maximum performance on a huge table and, while other options may also come up, at a minimum you'll always hear the word "indexing". And make no mistake - SharePoint (whether Windows SharePoint Services 3.0, or Office SharePoint Server 2007) does do a lot of indexing. But that is only dealing with the user data table as a whole.

Once SharePoint has figured out which site and list the data belongs to, normally it is pretty much done with indexing. When you perform a query - whether in code, or for a web part view - each item in the list is examined individually for a match. As the amount of information in your sites grows, this can take quite a bit of time and cause significant slowdowns. (This is independent of the "2000 item limit" - see sidebar.)

Fortunately, you don't have to put up with this default behavior. SharePoint gives you the additional ability to index the information within individual lists or libraries.

Look at the settings page for just about any list or library, and you will find a link for "Indexed columns".

When you click the link, you will be given the opportunity to select which columns in your list you wish to index. This is where an understanding of your information, and how you use it, comes into play. While you could just click everything, that isn't usually a good idea. For each column you index, SharePoint needs to store extra information about every item in your list.
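The trade-off can be sketched in a few lines of Python. This is only an analogy, not how SharePoint or SQL Server actually store list data; the item layout, column names, and function names are invented for the example. An unindexed query examines every item, while an indexed column keeps an extra lookup structure that has to hold an entry for every item in the list.

```python
from collections import defaultdict
from typing import Any, Dict, List

Item = Dict[str, Any]


def scan(items: List[Item], column: str, value: Any) -> List[Item]:
    """Unindexed query: examine every item in the list for a match."""
    return [item for item in items if item.get(column) == value]


def build_index(items: List[Item], column: str) -> Dict[Any, List[Item]]:
    """The extra information kept for an indexed column: value -> matching items."""
    index: Dict[Any, List[Item]] = defaultdict(list)
    for item in items:
        index[item.get(column)].append(item)
    return index


if __name__ == "__main__":
    items = [{"Title": f"Task {i}", "CreatedBy": f"user{i % 50}"} for i in range(10_000)]
    by_creator = build_index(items, "CreatedBy")

    # Both approaches return the same items; the index just skips the full scan.
    assert scan(items, "CreatedBy", "user7") == by_creator["user7"]

    # The set of unique values falls straight out of the index keys.
    print(len(by_creator), len(by_creator["user7"]))
```

Every additional indexed column means another structure like `by_creator` to build and maintain, which is why indexing everything is rarely worth it.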

You should only select columns for indexing that you will be using to query/filter, sort, and group your list. For this list, I'll usually need to do this with the item's title (or name), who created or modified an item, and when it starts. So those are the columns I'll select to index.

Note: When setting up indexed columns, you will almost always want to include the title or name field.

Once you click OK, SharePoint will build the appropriate extra indexes. While there won't be any change to the way your list looks, you should see the performance results almost immediately. (Of course, the more items in your list, the greater the impact will be...)

One place you usually see immediate results is when you click on the context menu of a column title to change the sorting or filtering. The list of unique values builds much faster on an indexed column.


That's all there is to it! Setting up indexed columns in your SharePoint list really is that simple. Give it a shot, and you might be surprised at how much faster your SharePoint applications can be.


Sep-16-2009

SharePoint Values Your Uniqueness

"Who Did You Say You Were?"

Today I'm going to talk about user accounts. In particular, Windows Active Directory accounts. You might be thinking, "This doesn't sound like a SharePoint topic!" But rest assured, it is.

I've talked about Windows accounts and SharePoint before. For example, I've told you about the many accounts you need to consider when setting up a SharePoint farm, and how to discover the Setup User account after the fact. I've also mentioned how you can let SharePoint know when user IDs change.

Normally, each person has one Windows account (user ID and password). They use this account to log into their PC in the morning, thus proving to the network who they are. This is called "authentication". Many systems that recognize Windows authentication (including SharePoint) will simply accept these credentials from the user's PC, without any further user intervention.  The systems use these credentials to determine what data and functions the user has been "authorized" to access by the administrator of the system.

Note: Behind the scenes, it is quite a bit more complicated than that. While a complete discussion of handshaking and protocols is beyond the scope of this article, understand that the negotiations taking place do result in one of the issues I will describe below.

On Being a Highlander - "There Can (Should) be Only One"

By and large, this system works pretty well. However, it is based on a pretty big assumption - that each user has only one account for their "day to day" system usage, and each account has only one user. Unfortunately, from time to time this assumption doesn't hold true. This can cause some subtle, and not-so-subtle (but hard to trace) problems in SharePoint if you aren't careful.

I'm going to discuss three main scenarios:

  1. One user has multiple accounts
  2. One account is shared across multiple users
  3. Accounts in different domains share the same UserID portion, but have different passwords

For each scenario, I'll talk about why it might occur, how it presents a challenge, a few variations on the theme, and what you can do to minimize the difficulties presented by it.

One User, Multiple Accounts

Why it might happen:

There are many specific reasons a user might have multiple accounts, but they generally fall into three categories - Administration, Test, and Transition.

Transitions can be wide-spread - such as when companies realign or change naming standards, or individual - such as when marital status changes. Regardless of the reason for a transition though, generally there will be an "old" account, and a "new" account.

For purposes of this section, I'm considering administrative and test accounts to be equivalent. They don't need to be "administrators" per se, and the account may exist mainly for applications other than SharePoint. Essentially these are any accounts that a user signs into for a particular task that has different privileges from their normal User ID (e.g. DBA), and their use is often dictated by policy or best-practice. Regardless of the reason, the challenges and resolutions are basically the same.

Why this is a challenge:

Aside from the obvious - making sure each account actually has access to the correct resources - SharePoint keys a lot of stuff based on the current user. From Created and Modified by tags, to personalized pages, and audience targeting, SharePoint knows and shows who you are. But the real hazard of multiple accounts is their effect on profiles and my-sites.

User profiles are created from Active Directory imports, as well as user-entered information. Certain features of My Sites, such as organizational relationships, are built using the imported information. Typically a user's "primary" account will populate such fields as their title, office location, manager, and other useful business information. Administration and Test accounts, on the other hand, generally just use the name to describe the purpose of the account, and little else.

These secondary accounts impact not only the current user, but others as well. For example, suppose you fully populate AD info for each of the accounts. When someone views the manager's profile, it will suddenly show all of the secondary accounts, as well as the "real" user, in the manager's reporting relationships. Or, what if the user has reports of their own? Since each report can have only one manager, you need to be very careful to assign it to the correct account; otherwise, someone may end up not appearing in org charts, or they may appear in the wrong place.


When someone starts using these secondary accounts for day-to-day activities in SharePoint, this is the account noted as the creator or modifier of data items, and tracked in logs. If you use Communicator, SharePoint's presence indicators can also be affected. They will show the presence and contact info for the account that actually made the change, rather than the potential real presence for the user.

Finally, aside from the profile, each of these accounts will register as separate for the creation of My Sites. Since Office can hook into a user's My Site as a default storage location, this can also cause confusion, as personal documents appear and disappear based upon which ID the user is using to perform their activities.

Minimizing the pain:

Wherever possible, strive to make transitions instantaneous - at least as far as SharePoint is concerned. Don't let users access SharePoint with more than one of these accounts at a time. When the time comes, make use of the stsadm "migrateuser" operation as soon as possible, so that users don't get confused or accidentally start generating content under the new account before their old information is reassigned.
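As a rough illustration of the "migrateuser" step, here is a small Python helper that assembles the stsadm command line for a single account. The helper name and the example logins are hypothetical; the -oldlogin, -newlogin, and -ignoresidhistory parameters are the ones I know the operation to accept, but check them against the stsadm documentation for your version before relying on this. The sketch only builds the argument list and does not run anything.

```python
from typing import List


def build_migrateuser_command(old_login: str, new_login: str,
                              ignore_sid_history: bool = False) -> List[str]:
    """Assemble (but do not execute) an stsadm migrateuser call for one account."""
    command = [
        "stsadm", "-o", "migrateuser",
        "-oldlogin", old_login,
        "-newlogin", new_login,
    ]
    if ignore_sid_history:
        command.append("-ignoresidhistory")
    return command


if __name__ == "__main__":
    # Hypothetical logins for a user whose account name changed.
    print(" ".join(build_migrateuser_command("CORP\\jsmith", "CORP\\jdoe")))
```

On a SharePoint server, the resulting list could be handed to something like subprocess.run once you have confirmed the logins, ideally as part of the same change window in which the new account goes live.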

For admin/test situations, make sure there is a clear distinction in the users' minds about the purposes of each account. In addition, make sure the metadata in Active Directory correctly reflects which account will be used for day-to-day operations.

One Account, Multiple Users

Why it might happen:

The reasons for multiple users sharing an account typically revolve around cost savings in one form or another - either licensing or administrative.

A typical example might be in a "shop" situation, where one PC is on the floor, and once it is logged in, everyone just accesses the information they need.

Why this is a challenge:

Some of the challenges here are similar to those you might face in an Internet-facing, anonymous access, scenario. Essentially, when someone does something, you don't know who it was.

But there is an added complication. When you are authenticated, this triggers some things in SharePoint. For example, since you have an account, SharePoint will treat you as a named user for "Created by" information. But you can't effectively use the "only their own" global list permission, or filter things by [Me], as everyone who shares the account will have that permission or see that information. Or on surveys that prevent multiple entries, only one person will be allowed to fill it in from that account. Discussion comments all appear to come from the same person.

Minimizing the pain:

By sharing a single account, you essentially remove the effectiveness of the "social" elements of SharePoint. Consider reducing confusion by turning off access to MySites and Personalization for the shared account. If you want to use interactive discussions, add a field for users to manually enter their names when posting, and make it a required field.

Frequently such shared resources are "read only". Consider simply allowing anonymous read-only access in those instances.

Another option is to enable a forms-based authentication zone for that group. This allows you to keep your AD clear of staff who otherwise don't use PCs, but still maintain individual control over SharePoint access, and monitor who is posting to writable areas.

Name's the Same, Different Domain

Why it might happen:

Unlike accounts within the same domain, which by definition must have unique IDs, it is possible for accounts in different domains to have the same "userID" part. For example, EMEA\SallieJo, AM\SallieJo, and APAC\SallieJo.

These may represent different users, or one user who travels. Sometimes organizations merge, and each already has their own Active Directory domain. Or for various reasons, your organization requires multiple domains on an ongoing basis. In some cases, this can be similar to "One User, Multiple Accounts". In fact, sometimes the situation is the same - accounts are held by the same user, and/or they are transitional. If this is the case, then the cautions mentioned in that section apply as well; but this situation presents its own unique challenges.

Why this is a challenge:

Here is where that "Behind the scenes" techie stuff I mentioned at the beginning comes into play. Essentially, when a user's web browser (Internet Explorer) and IIS (Internet Information Services) start negotiating authentication, only the UserID portion of their credentials, along with encrypted password information, is sent from the client to the server. (SharePoint relies on IIS and the ASP.NET engine for authentication.)

Here's the problem - IIS usually assumes the user will be in the same domain as it is, and will try to match the ID with an account in that domain. Only if it doesn't find a match will it make a deeper query of Active Directory. If it finds an ID match, it will try to validate the encrypted password against the current domain. If the passwords match, no problem - or maybe big problem. SharePoint thinks IIS has authenticated the local user. If both accounts are really for the same person, you're OK. If it is a different user from the other domain, who just happened to have the same ID and password, they could actually be seeing the local user's information!

If the passwords don't match, unlike the situation when the UserID is different, IIS won't continue its search. It will just return a fatal error - typically a "500". (If the user is persistent, this also has the potential of locking out accounts in IIS' local domain.)
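The lookup order described above can be modeled with a toy Python function. This is strictly a sketch of the behavior as I've described it, not IIS's or Windows' real authentication code: the directory layout, the plain-text passwords, and every name in it are made up to keep the example small.

```python
from typing import Dict

# domain -> {user_id: password}; plain text only to keep the toy simple
Directory = Dict[str, Dict[str, str]]


class AuthError(Exception):
    """Stands in for the fatal error returned when authentication fails."""


def authenticate(user_id: str, password: str,
                 local_domain: str, directory: Directory) -> str:
    """Return the DOMAIN\\user the server believes it is talking to."""
    local_accounts = directory.get(local_domain, {})
    if user_id in local_accounts:
        # An ID match in the server's own domain ends the search either way.
        if local_accounts[user_id] == password:
            return f"{local_domain}\\{user_id}"
        raise AuthError("ID matched locally, password did not; search stops here")
    # Only when the local domain has no such ID are the other domains consulted.
    for domain, accounts in directory.items():
        if domain != local_domain and accounts.get(user_id) == password:
            return f"{domain}\\{user_id}"
    raise AuthError("no matching account in any domain")


if __name__ == "__main__":
    directory = {
        "AM": {"SallieJo": "spring"},
        "EMEA": {"SallieJo": "spring", "Pierre": "hiver"},
    }
    # The EMEA SallieJo, with the same password, is treated as the AM account.
    print(authenticate("SallieJo", "spring", "AM", directory))
    # Pierre has no AM account, so the deeper search finds him in EMEA.
    print(authenticate("Pierre", "hiver", "AM", directory))
```

Changing the EMEA SallieJo's password to anything else makes the first call raise AuthError instead of falling through to EMEA, which is the hard failure described above.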

Note: This isn't an issue unique to SharePoint - you can encounter this problem with almost any system that requires NTLM pass-through authentication.

Minimizing the pain:

The best defense here is to make sure your user IDs are unique across all trusted domains accessing resources. Before initiating trusts, run reports and reconcile them, and set forest policies to prevent duplicates.
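If you can export account lists from each domain, the reconciliation report can be as simple as the Python sketch below. It assumes the accounts are available as DOMAIN\UserID strings (how you export them from Active Directory is up to you), and the function name is my own.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_duplicate_user_ids(accounts: Iterable[str]) -> Dict[str, List[str]]:
    """Map each UserID that appears in more than one domain to those domains.

    Accounts are expected as "DOMAIN\\UserID"; the comparison ignores case.
    """
    domains_by_user = defaultdict(set)
    for account in accounts:
        domain, _, user_id = account.partition("\\")
        domains_by_user[user_id.lower()].add(domain.upper())
    return {
        user_id: sorted(domains)
        for user_id, domains in domains_by_user.items()
        if len(domains) > 1
    }


if __name__ == "__main__":
    exported = ["EMEA\\SallieJo", "AM\\SallieJo", "APAC\\SallieJo", "AM\\BillyBob"]
    for user_id, domains in find_duplicate_user_ids(exported).items():
        print(user_id, "->", ", ".join(domains))
```

Running a report like this before a trust is established gives you the list of IDs to reconcile while it is still a paperwork problem rather than an authentication problem.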

When duplicates exist for one or two users, a workaround for users in the remote domain is to assign your site to a specific zone (e.g. trusted sites) and configure Internet Explorer to always prompt for authentication in that zone.


If this is a problem for a large number of users, configuration changes at the server may be in order.

When the majority of your users are in a domain other than the one hosting SharePoint, you can configure IIS to use digest authentication and a different default realm. You can also extend SharePoint into multiple zones, and configure a different realm for each site in IIS. Then ensure that each domain's DNS points to the correct SharePoint zone.


Of course, if you change configurations - whether they be IE or AD, IIS or SharePoint - make sure you document them!