Easily migrate content between SharePoint servers using stsadm.exe and fix the HTTP 401.1 site collection administrator local login problem

I needed to migrate content from one install of Windows SharePoint Services 3.0 on Windows Server 2003 to another physical server and ran into an issue when checking my site locally before changing all the DNS records over. The actual export and import process is quite easy, but it can take a while if there are a lot of subsites or files within your SharePoint portal.

Migrating between SharePoint installs using stsadm.exe

The easiest way to export an entire SharePoint site is to use stsadm.exe, which is installed with SharePoint and typically located in “C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN”. I added a reference to this in the PATH variable so I could use it everywhere, to make things easier. Thankfully the documentation is good (thanks Microsoft) and you can find more explicit details on exporting and importing on TechNet. Note that the process is similar for other versions of SharePoint, and more information on the various migration methods, including moving databases directly, can be found on Microsoft Support. Chris O’Brien also has a good post about the various approaches.
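For example, to make stsadm.exe available in the current command prompt session (add the directory under System Properties -> Environment Variables to make it permanent):

set PATH=%PATH%;C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN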

The simple stsadm command I used to export my portal (including all files and subsites etc) at http://portal.sitename.com was:

stsadm -o export -url http://portal.sitename.com -filename sitename.bak

This produced a series of ~30 MB files and a good log of everything that took place. All the Active Directory user permissions were also included in the export, which had been one of my big worries about moving to a new server. Local users are more of a problem, as they don’t exist on the new server and need to be recreated.
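If you want to be explicit about carrying security settings across, both the export and import operations also accept an -includeusersecurity flag, for example:

stsadm -o export -url http://portal.sitename.com -filename sitename.bak -includeusersecurity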

To import your site into another SharePoint install, you first have to make sure there is an existing web application with an associated site collection (on the root “/”) before copying all your exported files over to the new server. The first time I tried, I assumed the import would regenerate the site collection from scratch based on my export, but apparently not. The command I used to import was:

stsadm -o import -url http://portal.sitename.com -filename sitename.bak

Even though the export process will probably have produced a lot of files, you do not need to specify them all when importing, just the main one, in this case “sitename.bak”. After a while your new site will be populated with all the content from your export and is ready for testing before you go live and change your DNS records to point to the new server.
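One related tip: if the placeholder site collection does not exist yet, it can also be created from the command line rather than through Central Administration (a sketch; the owner login and email here are placeholders, and the web application itself must already exist):

stsadm -o createsite -url http://portal.sitename.com -ownerlogin DOMAIN\admin -owneremail admin@sitename.com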

Testing your newly migrated site locally, avoiding HTTP 401.1

As I was using remote desktop to access my new server to run the stsadm.exe import command, I wanted to test the site locally by logging in with my site collection administrator details before changing the DNS over. To do this I added an entry to the hosts file (“C:\Windows\system32\drivers\etc\hosts”) on the new server to point portal.sitename.com to localhost (127.0.0.1), then tried to visit http://portal.sitename.com within my remote desktop session. This is where I hit an HTTP 401.1 login error due to a security setting built into Windows Server 2003, even though I was logging in with the correct site collection administrator details.
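For reference, the hosts file entry was just a single line:

127.0.0.1    portal.sitename.com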

This is apparently a security fix in Windows Server 2003 SP1 that stops reflection attacks; according to Microsoft, “authentication fails if the FQDN or the custom host header that you use does not match the local computer name”. The details on how to fix this are on Microsoft Support, and I’ve noted the easiest fix below, which removes the loopback check entirely.

First you need to disable strict name checking by editing the registry on your server. Open the registry editor (run regedit.exe) and go to “HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanmanServer\Parameters”. Click “Edit -> New -> DWORD Value”, name it “DisableStrictNameChecking”, then right-click it and set it to decimal “1”.

Now go to “HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa”, click “Edit -> New -> DWORD Value”, name it “DisableLoopbackCheck”, then right-click it and set it to decimal “1” as well.
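If you prefer the command line over regedit, the same two values can be set with reg add (a sketch of the equivalent commands, run from a command prompt as an administrator):

reg add "HKLM\System\CurrentControlSet\Services\LanmanServer\Parameters" /v DisableStrictNameChecking /t REG_DWORD /d 1 /f
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Lsa" /v DisableLoopbackCheck /t REG_DWORD /d 1 /f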

You need to restart your server for the changes to take effect. Once you have, you should be able to log in to your local site at http://portal.sitename.com with your site collection administrator details without hitting an HTTP 401.1 error. Now you can test the site before changing the DNS records to point to the new server and removing your hosts file entry.

Easily connect and use PHP with SharePoint lists using cURL, RSS and NTLM authentication

Connecting to SharePoint from PHP is actually not that difficult if you have the cURL extension installed on your web server. In the case of my XAMPP Windows development server, I just made sure the following line in php.ini (“c:\xampp\php\php.ini” in my case) was uncommented before restarting Apache:

extension=php_curl.dll

In Ubuntu/Linux you can usually just install the cURL packages, and after restarting Apache the extension will become available. Just type the following on the command line:

sudo apt-get install curl libcurl3 libcurl3-dev php5-curl

Then restart Apache:

sudo /etc/init.d/apache2 restart

The following code is based on both Norbert Krupa’s comment on David’s IT Blog and a question about parsing HTML on StackOverflow. The important thing to note is that I needed cURL to authenticate my domain user when connecting to my secured SharePoint Services 3.0 test site. Apparently you can get away without cURL on sites that don’t need authentication, but the same cURL code listed below can be used with a blank username and password for the same effect.

The goal of this listing is to connect to SharePoint as a domain user (it can also be a local user if SharePoint is set up that way) and retrieve the contents of a SharePoint list. The trick is to supply the list’s RSS feed URL, which allows PHP to parse the feed and neatly list the contents of the SharePoint list. An advantage of using the RSS feeds of SharePoint lists is that they are secured with the same permissions as the list itself and require no extra configuration on the SharePoint side of things. You can also set the RSS feed to only show a set number of items or days, which is useful for regularly updated lists.

// generic function to get the contents of an HTML block
function get_inner_html( $node ) {
    $innerHTML = '';
    $children = $node->childNodes;
    foreach ($children as $child) {
        $innerHTML .= $child->ownerDocument->saveXML( $child );
    }
    return $innerHTML;
}

// username and password to use
$usr = 'DOMAIN\USERNAME';
$pwd = 'PASSWORD';
// URL to fetch; this is the address of the RSS feed (go into a list and click "Actions" -> "View RSS Feed" to get the URL)
$url = "http://www.sharepointsite.com/_layouts/listfeed.aspx?List=%7BCED7CDDC-49C0-4C46-BDE6-CFC2BA993C84%7D";
//Initialize a cURL session
$curl = curl_init();
//Return the transfer as a string of the return value of curl_exec() instead of outputting it out directly
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
//Set URL to fetch
curl_setopt($curl, CURLOPT_URL, $url);
//Force HTTP version 1.1
curl_setopt($curl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
//Use NTLM for HTTP authentication
curl_setopt($curl, CURLOPT_HTTPAUTH, CURLAUTH_NTLM);
//Username:password to use for the connection
curl_setopt($curl, CURLOPT_USERPWD, $usr . ':' . $pwd);
//Stop cURL from verifying the peer's certificate
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
//Execute cURL session
$result = curl_exec($curl);
//Close cURL session
curl_close($curl);

$xml = simplexml_load_string($result);

// display results on screen
foreach($xml->channel->item as $Item){
    echo "<br/>($Item->title)";
    $doc = new DOMDocument();
    $doc->loadHTML($Item->description);
    $ellies = $doc->getElementsByTagName('div');
    foreach ($ellies as $one_el) {
        if ($ih = get_inner_html($one_el))
        {
            echo ", $ih";
        }
    }
}

The SharePoint RSS feed is a little interesting: the “$Item->title” object holds the main column of the list, but the rest of the columns are encapsulated in <div> elements within “$Item->description”, hence the need to parse the HTML.
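To illustrate with a made-up item (the exact markup varies by list), the description payload looks roughly like this:

<div><b>Other Column A:</b> xxxx</div>
<div><b>Other Column B:</b> yyyy</div>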

For a SharePoint list with 3 columns the output will look something like:

(Item 1 Title) , Other Column A: xxxx, Other Column B: yyyy
(Item 2 Title) , Other Column A: zzzz, Other Column B: kkkk

The potential for this is great, as it allows us to securely synchronise SharePoint lists with external databases, use SharePoint to authenticate PHP applications, and so on. We are going to use this to automatically pull users from a SharePoint list to populate a separate PHP application, whilst keeping user-identifiable data locked away in SharePoint.
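As a starting point for that kind of synchronisation, here is a minimal sketch (assuming the feed markup illustrated above, where each <div> holds a “Column name: value” pair) that flattens each RSS item into an associative array ready for a database insert. The function name is my own, not part of any SharePoint API:

// sketch: flatten each SharePoint RSS item into an associative array
// $xml is the SimpleXMLElement produced by simplexml_load_string() above
function sharepoint_items_to_rows( $xml ) {
    $rows = array();
    foreach ($xml->channel->item as $item) {
        $row = array('Title' => (string) $item->title);
        $doc = new DOMDocument();
        $doc->loadHTML((string) $item->description);
        foreach ($doc->getElementsByTagName('div') as $div) {
            // each <div> should contain "Column name: value"
            $parts = explode(':', $div->textContent, 2);
            if (count($parts) == 2) {
                $row[trim($parts[0])] = trim($parts[1]);
            }
        }
        $rows[] = $row;
    }
    return $rows;
}

Each returned row can then be handed straight to your database layer.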

Allow file downloads larger than 50 MB from SharePoint (fix error 0x800700DF)

I ran into a 0x800700DF error when trying to download a 100 MB file from SharePoint using Windows Explorer. I tend to do all my file operations in SharePoint through the WebDAV interface by pointing Windows Explorer to \\portal.server.name\portal and logging in with an account that has permission to access the portal. The 0x800700DF error when attempting to copy large files is actually described as:

Error 0x800700DF: The file size exceeds the limit allowed and cannot be saved.

The fix involves changing a registry setting on your client machine to allow 4 GB downloads (the maximum possible). From Microsoft Answers:

FileSizeLimitInBytes is set to 50000000 by default, which limits your download size, so just set it to the maximum! (This is client-side, by the way, on Windows 7.)

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\WebClient\Parameters

  • Right-click on FileSizeLimitInBytes and click Modify
  • Click on Decimal
  • In the Value data box, type 4294967295, and then click OK. Note this sets the maximum you can download over WebDAV to 4 GB at one time; I haven’t figured out how to make it unlimited, so if you want to download more you need to split it up.
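The same change can be made from an elevated command prompt, followed by restarting the WebClient service so it takes effect (a sketch; rebooting works too):

reg add "HKLM\SYSTEM\CurrentControlSet\Services\WebClient\Parameters" /v FileSizeLimitInBytes /t REG_DWORD /d 4294967295 /f
net stop WebClient
net start WebClient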

Make SharePoint show you error messages rather than “An unexpected error has occurred”

This involves changing the web.config file (tip via thekid).

The solution is to change a single entry in web.config, by modifying the line…

<SafeMode MaxControls="200" CallStack="false"…

to…

<SafeMode MaxControls="200" CallStack="true"…

You will also need to set custom errors to ‘Off’:

<customErrors mode="Off"/>
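For context, the web.config you want is the one for your SharePoint web application, typically under “C:\inetpub\wwwroot\wss\VirtualDirectories\<port>” on a default WSS 3.0 install, and the two entries sit in different sections of the file (other attributes elided here):

<configuration>
  <SharePoint>
    <SafeMode MaxControls="200" CallStack="true" … />
  </SharePoint>
  <system.web>
    <customErrors mode="Off" />
  </system.web>
</configuration>

Remember to revert both settings once you have finished debugging, as call stacks should not be shown on a production site.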

Fixing SharePoint Services 3 search warnings

I was seeing the following events in the event log when SharePoint was trying to reindex for search:

The start address <sts3://servername.com/contentdbid={5f9ea3ed-16fc-4ba7-a839-e4cc944ac09b}> cannot be crawled.

Context: Application ‘Search index file on the search server’, Catalog ‘Search’

Details:
Access is denied. Verify that either the Default Content Access Account has access to this repository, or add a crawl rule to crawl this repository. If the repository being crawled is a SharePoint repository, verify that the account you are using has “Full Read” permissions on the SharePoint Web Application being crawled. (0x80041205)

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

It turns out there is an easy fix, which involves going into the registry and disabling the loopback check (found at nobrainer solutions):

Disable the loopback check

Follow these steps:

1. Click Start, click Run, type regedit, and then click OK.
2. In Registry Editor, locate and then click the following registry key:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa
3. Right-click Lsa, point to New, and then click DWORD Value.
4. Type DisableLoopbackCheck, and then press ENTER.
5. Right-click DisableLoopbackCheck, and then click Modify.
6. In the Value data box, type 1, and then click OK.
7. Quit Registry Editor, and then restart your computer.

After applying this fix the search crawler started working again with no warnings.

SharePoint indexing and 100% CPU in sqlservr.exe

It turned out that SharePoint search was set to reindex every 5 minutes, pinning sqlservr.exe at 100% CPU. This in turn slowed down the SharePoint portal significantly.

To set SharePoint search reindexing to a more sensible time (out of hours), open SharePoint Central Administration on your SharePoint server. In “Farm Topology”, click on the name of the server running “Windows SharePoint Services Search”, then click on “Windows SharePoint Services Search” in the list that pops up. Go to the “Indexing Schedule” section and set the schedule to something sensible, like daily between 12:00am and 12:15am (or whenever the server is not being heavily used).