Deploying Sitecore items to Azure PaaS using Octopus Deploy

Continuing our exploration of Sitecore as PaaS based on Azure Web App services, in this post I want to talk about how to integrate the CI/CD part, and specifically the deployment part.

Currently, in our on-premise setup, we use Bitbucket (code repository) + Jenkins (CI) + ProGet (package repository) + Octopus Deploy (for deploying code and Sitecore items). Sitecore items are deployed by triggering the TDS services on the Sitecore server through an Octopus tentacle.

While exploring Sitecore on Azure PaaS, we wanted to minimize changes to the CI/CD pipeline, as it is already in a mature state. All of the CI/CD tools mentioned above are going to be deployed as an IaaS setup. I did not see any challenge in how these tools work on-premise versus in the cloud with Azure PaaS, except for Octopus Deploy. Azure Web Apps have no concept of Octopus tentacles, but Octopus provides built-in support for deploying to an Azure Web App. This covers our code deployment (binaries, configuration), but I was also looking for a way to deploy Sitecore items (primarily through a Sitecore update package). I did not find a straightforward solution for this, although I did find some examples of Octopus + Unicorn based deployment. We wanted to stick to our current packaging setup and way of deploying Sitecore items, so I stitched together a solution, and the steps below provide some details about it.

  1. As a first step to using Octopus to deploy to an Azure Web App, we need to configure the Azure account in Octopus as explained here. I used a Service Principal based account configuration in this case.
  2. Next, create the Environment in Octopus.
  3. Create the Project in Octopus.
  4. Then create the Azure Web App deployment step, as explained from step 3.
  5. Pick the Sitecore items package (a Sitecore items update package, wrapped in a NuGet package) that you need to deploy.
  6. Pick the Azure account and the Azure Web App to which the package needs to be deployed.
  7. With the steps so far, running the deployment pushes the NuGet package to the Sitecore setup and extracts the Sitecore update package to the folder you specified. The challenge now is how to install this package in Sitecore. In the on-premise case we trigger the TDS service on the target Sitecore server through the Octopus tentacle, but there is no such option here, so I explored alternative ways to trigger the update package installation. Fortunately, TDS provides one: we can deploy a Sitecore update package to a specific folder and then request StartSitecorePackageDeployer.aspx, which picks up the package and installs it.
  8. Note: By default, when the TDS package deployer is installed, StartSitecorePackageDeployer.aspx is placed in the /sitecore/admin folder. On-premise we can access it there without logging in, but in the Web Apps scenario it prompts for a Sitecore login, so we need to move this aspx file to the website root folder.
  9. Now that we have a TDS service (StartSitecorePackageDeployer.aspx) that we can trigger to install the Sitecore update package, the next question is how to trigger it from Octopus. The option is, of course, a PowerShell script: we can trigger the page remotely using the Invoke-WebRequest PowerShell command. This successfully installs the update package on the target Sitecore server.
  10. In some cases you might want to do additional processing after the items package is installed, for example reading the status from the log generated by StartSitecorePackageDeployer and acting on it. In that case the PowerShell script needs to run on the Sitecore server itself and not remotely. With the on-premise setup this is feasible, because the Octopus tentacle lets us execute PowerShell on the target server. Without a tentacle, however, the script runs on the Octopus server and not on the target server.
  11. In this case the option is to use Azure Web Jobs. We can create an Azure Web Job by following the steps mentioned here.
  12. We can then trigger this Web Job from Octopus using PowerShell like below
# Select the Azure subscription that contains the Web App
Get-AzureRmSubscription -SubscriptionId $SubscriptionId | Select-AzureRmSubscription
# Run the triggered Web Job on the target Web App
Invoke-AzureRmResourceAction -ResourceGroupName $ResourceGroupName -ResourceType microsoft.web/sites/TriggeredWebJobs -ResourceName $SiteName/$WebJobName -Action run -ApiVersion $Apiversion -Force
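
The remote trigger described in step 9 is nothing more than an HTTP request to the deployer page. As a rough illustration, here is a minimal sketch of the same idea in Python (the post itself uses Invoke-WebRequest from PowerShell). The function names, the timeout value and the example host name are my own assumptions for illustration, and the sketch assumes the page has been moved to the website root as described in step 8.

```python
import urllib.request
from urllib.parse import urljoin


def deployer_url(site_base_url, page="StartSitecorePackageDeployer.aspx"):
    """Build the URL of the deployer page, assumed to sit in the website root."""
    # Ensure exactly one trailing slash so urljoin appends the page name
    # instead of replacing the last path segment.
    return urljoin(site_base_url.rstrip("/") + "/", page)


def trigger_deployer(site_base_url, timeout_seconds=600,
                     opener=urllib.request.urlopen):
    """Request the deployer page and return (status_code, body_text).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a failed
    request surfaces as an exception in the calling deployment step.
    The opener parameter exists only so the function can be tested offline.
    """
    url = deployer_url(site_base_url)
    with opener(url, timeout=timeout_seconds) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.status, body
```

A call such as trigger_deployer("https://my-site.azurewebsites.net") (a hypothetical host name) would request the page once. A long timeout is used here on the assumption that large packages take a while to install.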

Happy Deployment.

I will continue to share my further exploration of the Sitecore setup on Azure PaaS.


Options for Sitecore PaaS on Azure

We are currently in the process of evaluating Sitecore PaaS (Platform-as-a-Service) on Azure for one of our customers. As part of this, I looked at the options available. Below are the options I have found so far, and my opinion about them.

1. Sitecore Managed Cloud
  • Provisioning: Sitecore does the cloud provisioning based on the details provided by the customer. https://kb.sitecore.net/articles/062788
  • Service Catalog: https://kb.sitecore.net/articles/137210
  • Policies: https://kb.sitecore.net/articles/133931
  • Upgrades need to be managed by the customer.
  • Sitecore provides hotfixes and patches based on issues reported by the customer, and the customer then needs to deploy the patch (I find this no different from a PaaS setup that is provisioned and managed by the customer).

2. Azure Marketplace
  • Provides a nice wizard-based setup of the Sitecore instance and the necessary databases for a plain vanilla Sitecore setup (one-click deployment).

3. Sitecore Azure ARM Templates
  • Setup is more technical in nature.
  • You need to configure the ARM templates (read: JSON files) manually and use Windows PowerShell to configure the Sitecore instance and related DBs.

4. Sitecore Azure Toolkit
  • Setup is more technical in nature.
  • Provides tools, including the ARM templates, to deploy the Sitecore solution to Azure App Service.

My POV on these options so far is:

  • All these options are based on Microsoft Azure PaaS (based on Azure App Services)
  • First option – Sitecore Managed Cloud is more relevant for customers who do not have dedicated technical team to manage their Sitecore infrastructure
  • Second option can be used more as accelerated way to provision your Sitecore setup
  • Third and fourth options are more relevant for customers who want to customize the provisioning of their Sitecore setup as per their architecture

In addition, for the first and second options one needs to go with the Sitecore versions currently supported by Sitecore in Sitecore Managed Cloud / Azure Marketplace. With the third and fourth options, one should be able to migrate the current version of a Sitecore implementation to Azure without needing to upgrade.

As I continue to explore this, I will share my experience and POV about Sitecore PaaS on Azure.

Solr stability issues due to wrong Spellchecker configuration

We have been using SolrCloud in our Sitecore implementation for more than 18 months. In the early part of our Solr implementation we faced issues during maintenance activities, such as restarting Solr nodes to apply patches. Getting the Solr nodes back to the active state took a long time, and we found that one of the causes was the large number of Solr collections / cores in one Solr cluster: we had 1,000+ of them. We then optimized our usage of collections and also decided to run more than one Solr cluster. This resolved most of the maintenance issues, and our implementation was pretty stable.

We had been using Solr primarily for the Sitecore search indexes and the product search capabilities of the site, and Google Search Appliance (GSA) for content search. Since GSA was being sunset and we were already using Solr, we decided to switch content search to Solr as well. After this switch-over we started noticing a lot of issues with Solr. On restart, the Solr nodes took a very long time to come back to the active state, or never reached the active state at all and crashed. We correctly suspected that the issue was related to the content search switch-over, but we were not able to identify which configuration was causing it.

Apart from Solr nodes taking forever to reach the active state after a restart, we also started noticing the following:

Error opening new searcher. exceeded limit of maxWarmingSearchers=4, try again later

Index rebuilds were failing (as a rebuild also triggers the Create Alias command). We noticed that the Create Alias command was getting into the overseer queue and not being processed, and we saw the following exception:

org.apache.solr.common.SolrException; :org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /overseer/collection-queue-work/qn-0000009346

And finally, the major piece: we noticed a huge number of Solr transaction log files in the Solr collections, which we had not seen previously. We correctly suspected that this was the cause of the Solr nodes taking forever to restart and of all the Solr stability issues we were facing.
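
A quick way to spot this kind of buildup is to count the files in each core's transaction log directory. The sketch below is only an illustration and not the tooling we used: it assumes the usual Solr layout in which each core keeps its transaction logs in a directory named tlog, and the function names are hypothetical.

```python
import os
from collections import Counter


def count_tlog_files(solr_home):
    """Return a Counter mapping each 'tlog' directory path to its file count."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(solr_home):
        # Assumption: transaction logs live in directories named 'tlog'.
        if os.path.basename(dirpath) == "tlog":
            counts[dirpath] = len(filenames)
    return counts


def worst_offenders(solr_home, top=10):
    """List the tlog directories with the most files, largest first."""
    return count_tlog_files(solr_home).most_common(top)
```

Running worst_offenders against the Solr home directory of a node gives a short list of the cores to look at first.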

Now that we knew why we were facing the Solr stability issues, we had to find what had caused them. We started looking into the configuration changes made as part of the cut-over of content search to Solr. One of them was the SpellChecker configuration, added to support the “Did you mean” functionality.

We had configured IndexBasedSpellChecker,  the configuration looked like this

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="field">content</str>
    <str name="buildOnCommit">true</str>
    <!-- optional elements with defaults
    <str name="distanceMeasure">org.apache.lucene.search.spell.LevensteinDistance</str>
    <str name="accuracy">0.5</str>
    -->
  </lst>
</searchComponent>

The problem here was setting buildOnCommit to true. We then found information about the possible effects of setting buildOnCommit to true.

It was clearly written that,

Building on commit is very expensive and is discouraged for most production systems. For large indexes, one commit may take minutes since the building of spellcheck dictionary is single threaded. Use buildOnOptimize or explicit build instead.

 

We then decided to use DirectSolrSpellChecker instead of IndexBasedSpellChecker, as it did not have this negative impact. We cleaned up all the existing transaction log files on the Solr nodes, applied the new SpellCheck configuration and restarted the Solr nodes.

After that, all the Solr stability issues were resolved. The major learning from this: be very careful with any configuration change you roll out to your Solr cluster, as one wrong setting can cause significant performance and stability issues.

Sitecore 8.2 Master Index not getting updated with intervalAsyncMaster

One of the major blockers we faced after upgrading our implementation from Sitecore 7.2 U6 to Sitecore 8.2 Initial release was that the Sitecore search Master index was not getting updated. The Master index was configured to use the following indexing strategy:

<strategies hint="list:AddStrategy">
	<strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/intervalAsyncMaster" />
</strategies>

At the same time, we noticed that Sitecore search indexes configured with other indexing strategies, like intervalAsyncCore and onPublishEndAsync, were getting updated fine.

We then switched the crawling log to Debug mode to get further details, and found that for the Core index it was writing the entries below every 1 minute (the interval configured for intervalAsyncCore). For the Master index there were no such entries, although similar entries were expected every 5 seconds (the interval configured for intervalAsyncMaster).

14376 15:40:51 DEBUG [Index=sitecore_core_index] IntervalAsynchronousStrategy executing.
14376 15:40:51 DEBUG [Index=sitecore_core_index] Event Queue is empty. Incremental update returns
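
Scanning a large debug log by eye for a missing index is tedious, so this check can be scripted. The sketch below is a hypothetical helper, not something we used at the time: it assumes the line format shown in the excerpt above, and the index name sitecore_master_index in the usage note is assumed by analogy with sitecore_core_index.

```python
import re
from collections import defaultdict

# Matches lines such as:
# 14376 15:40:51 DEBUG [Index=sitecore_core_index] IntervalAsynchronousStrategy executing.
STRATEGY_LINE = re.compile(
    r"^\d+\s+(?P<time>\d{2}:\d{2}:\d{2})\s+DEBUG\s+"
    r"\[Index=(?P<index>[^\]]+)\]\s+IntervalAsynchronousStrategy executing\."
)


def strategy_runs(log_lines):
    """Map each index name to the times at which its strategy logged a run."""
    runs = defaultdict(list)
    for line in log_lines:
        match = STRATEGY_LINE.match(line.strip())
        if match:
            runs[match.group("index")].append(match.group("time"))
    return dict(runs)


def silent_indexes(log_lines, expected_indexes):
    """Return the expected indexes that never logged a strategy execution."""
    seen = strategy_runs(log_lines)
    return [name for name in expected_indexes if name not in seen]
```

Called with the lines of the crawling log and ["sitecore_core_index", "sitecore_master_index"], silent_indexes would return only the indexes whose interval strategy never fired.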

We then checked whether the EventQueue was being processed correctly, following this blog, and found that the EventQueue was being processed fine but the index was still not getting updated.

We then raised a Sitecore support ticket and shared a memory dump of the w3wp process for further analysis. After analyzing the memory dump, Sitecore came to the conclusion that some changes to the ‘AlarmClock’ implementation in Sitecore 8.2 were causing this issue.

The ‘AlarmClock’ instance must raise the ‘Ring’ event every 5 seconds and trigger the ‘IntervalAsynchronousStrategy’ (5 seconds is the interval configured for intervalAsyncMaster). If the time between the creation and the initialization of the ‘AlarmClock’ was longer than the interval of the strategy, the issue occurred, and this was the case for intervalAsyncMaster. For intervalAsyncCore it worked fine, because its interval was 1 minute. Sitecore then asked us to change the intervalAsyncMaster interval to confirm the issue. When we changed it to 1 minute, the Master index started getting updated.

Sitecore then provided patch #161393, which fixed the issue, and we reverted the intervalAsyncMaster interval to its default value of 5 seconds.

Sitecore 8.2 Exceptions when re-index from Control Panel

In our implementation we have been using SolrCloud version 5.2.1.
After we upgraded our implementation from Sitecore 7.2 U6 to Sitecore 8.2 Initial release, we started noticing the exception below whenever a user performed a re-index from the Sitecore Control Panel. In spite of the exception, the index was getting updated fine.

Exception: System.NullReferenceException
Message: Object reference not set to an instance of an object.
Source: Sitecore.ContentSearch.SolrProvider
   at Sitecore.ContentSearch.SolrProvider.SolrIndexSummary.get_NumberOfDocuments()

However when the re-index was performed from Content Editor Developer Tab -> Rebuild Index, we were not seeing any exception.

We then raised a Sitecore ticket and reported this exception. Sitecore support accepted it as a bug, provided patch #117163 to fix the issue, and shared the details below about it:

Please keep in mind it just handles related NullReferenceException and then shows a user-friendly message that stats are not available.
Currently there is no easy way to make those stats available for a SolrCloud collection, since it requires noticeable changes in API calls

After applying the patch, we no longer saw these exceptions, and re-indexing was not affected.

Sitecore upgrade from 7.2 to 8.2 with in-place upgrade approach

We recently upgraded our implementation from SC 7.2 U6 to SC 8.2 initial release. We evaluated two approaches for this,

  1. In-place upgrade: the traditional approach to a Sitecore upgrade, in which we need to take the upgrade path SC 7.2 U6 -> SC 7.5 Initial release -> SC 8.0 Initial release -> SC 8.1 Initial release -> SC 8.2 Initial release
  2. Leverage the Sitecore Express Migration tool. In this case:
    • Set up a plain vanilla SC 8.2 instance
    • Use the Express Migration tool to connect to the source (SC 7.2) and the target (SC 8.2) and migrate the content of the Core DB and Master DB
    • Use the Express Migration tool to copy the config files from the source (SC 7.2) to the target (SC 8.2)

Express Migration tool is a much simpler approach, except that one needs to setup a new instance (a pre-requisite) for this approach.

The Express Migration tool migrates only the Core and Master DBs and does not (officially) support migrating the Web DB. The expectation is that during the upgrade we set up a plain vanilla SC 8.2 Web DB and publish from Master to Web to bring the Web DB up to date. This was not feasible for us, as we had a large number of sites in our Sitecore instances and some of them were under a publishing freeze due to major redesign work. We did try to migrate the Web DB by entering the Web DB details in the Master DB field of the Express Migration tool, and it did migrate the Web DB, but this is not officially supported by Sitecore. Hence we decided to take the traditional in-place upgrade approach, in which we apply multiple update packages along the upgrade path listed above.

We had multiple Production instances, and for the first few of them the upgrade was a smooth affair. During the upgrade we set the following attributes to higher values so that we would not encounter request timeouts / SQL timeouts during the installation of the update packages:

<setting name="DefaultSQLTimeout" value="00:05:00"/>
<httpRuntime executionTimeout="3600" .... ... ... />

However, during the update package installation (SC 8.0 package) on one of the instances, the installation got a lot slower: it was taking 2+ minutes to process each item, and eventually the package installation timed out.

From the installation log we found that it slowed down while installing the /Sitecore/System/Dictionary/* items. The log showed that the dictionary was being reloaded for every language while each of the /Sitecore/System/Dictionary/* items was processed:

INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'pt-BR'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'es-CL'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'en-IN'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'en-US'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'el-GR'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'en'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'nl-NL'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'en-GB'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'fr-CA'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'en-AU'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'es-MX'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'es-US'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'zh-HK'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'ru-RU'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'de-DE'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'es-ES'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'it-IT'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'en-ZA'.
INFO Loading Dictionary from database. Domain: 'Dictionary'. Language: 'es-CL'.
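
To get a feel for how much repeated work this is, the reload entries can be counted per language. The following is a small sketch under the assumption that the log lines look exactly like the excerpt above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the reload entries shown in the excerpt above.
DICTIONARY_LINE = re.compile(
    r"Loading Dictionary from database\. Domain: '(?P<domain>[^']*)'\. "
    r"Language: '(?P<language>[^']*)'\."
)


def dictionary_loads(log_lines):
    """Count dictionary reload entries per language."""
    counts = Counter()
    for line in log_lines:
        match = DICTIONARY_LINE.search(line)
        if match:
            counts[match.group("language")] += 1
    return counts
```

Comparing sum(dictionary_loads(lines).values()) against the number of items in the package gives a rough idea of how many reloads each saved item triggers.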

Our analysis was that if we disabled the OnItemSaved handler of Sitecore.Globalization.ItemEventHandler during the package installation, the dictionary-related operations would be reduced and the installation would speed up. Since we were not using the Sitecore Dictionary, it was fine for us to disable this handler, and Sitecore Support confirmed the same. So we went ahead and commented out the following config entry:

<handler type="Sitecore.Globalization.ItemEventHandler, Sitecore.Kernel" method="OnItemSaved" />

This helped us overcome the repeated failures during the installation of the update package, and it also considerably sped up the update package installation on the subsequent instances.