Citrix Provisioning Server, questionable assumptions & why I don’t trust people.

This may seem like a strange title for a blog post, but it’s a subject I’ve been wrestling with for the longest time and I really just need to get it off my chest. I’ve put off writing this for ages, but I still feel it needs to be said.

Firstly, I’ll start by saying I think Citrix Provisioning Services is an amazing piece of technology. Hosting, maintaining and deploying a single image to as many servers as you like is a solution you would have killed for a few years ago, and the fact that it’s rolled into the Platinum licensing model only further whets the appetites of potential customers. In short, it’s a wonderful product and you would be mad (in most cases) not to use it.

Citrix Provisioning Services is also a consultant’s best friend: you need only install a single XenApp server, configure it to best practices, install some software, negotiate some caveats with antivirus/EdgeSight, document the change procedure, then deploy en masse. The customer is amazed you got the job done so quickly, and you can retreat in success, safe in the knowledge that if the customer’s staff are capable of reading and following instructions, it should be OK.

This works great for a small, local deployment where a single administrator oversees the changes and understands the process himself. But as it scales up, paranoia (for me) sets in.

Here’s the bit I don’t like:

1: Citrix Provisioning Services works on the assumption that you will store as many revisions of the golden image as you can afford, to reduce your chances of being caught out. This may not sound like a big issue to some, but with a high rate of change (say even 5 changes a month) and a large golden image (let’s say 100 GB), your storage costs will quickly become unjustifiable over time.
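To put back-of-the-envelope numbers on that, using the hypothetical figures above (a 100 GB image, 5 changes a month, and a full copy retained per revision):

```python
# Back-of-the-envelope storage cost for naively retaining every image revision.
# The figures are the hypothetical ones from the text, not measurements.
image_gb = 100          # size of one golden image revision
changes_per_month = 5   # rate of change
months = 12

revisions = changes_per_month * months    # full copies retained after a year
storage_tb = revisions * image_gb / 1000  # rough TB, ignoring dedupe/compression
print(f"{revisions} revisions ~ {storage_tb:.1f} TB after {months} months")
```

Thin provisioning or deduplication softens this, but the trend is the point: retention-based safety scales linearly with both image size and rate of change.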

I am aware Provisioning Services 6 has the ability to layer changes, but even Citrix recommend chaining no more than 8 layers before consolidating these images. Even with the above figures, this would spiral quickly.

2: Citrix Provisioning Services employs the assumption that the administrators are going to fully document the changes they made between images.

I’ll openly admit I don’t trust techies to follow even the simplest personal hygiene routine, never mind change management procedures. You can try to protect yourself with these procedures until you are blue in the face, but a process is only as good as your laziest or most forgetful administrator, and ultimately it’s your ass in the hot seat when the shit hits the fan.

3: Corruption. If corruption in an infrequently accessed part of the registry, the file system, or a rarely used application predates your oldest golden image backup, you are well and truly sunk, unless you have full and concise documentation of how that oldest image came to be and every step thereafter.

4: Ever troubleshot an application under intense pressure, made 4-5 changes at once and found it fixed the problem, only for urgency to dictate that this change goes live yesterday? Basically this.

Even the most meticulous administrator will take pressure-related shortcuts, and when the pressure is off it’s all too easy to forget that you still need to root-cause the issue. Provisioning Services allows these shortcuts all too easily.

But what are the alternatives to ensure control?

Who said scripting?

Products that use scripted installs, such as RES Automation Manager, FrontRange DSM, Microsoft’s WDS/MDT etc., have the benefit of forcing the technical guys to script the task ahead of time. This means they must fully test the install procedure before it reaches the production environment. Scripting the install forces documentation of some kind ahead of time and ensures the package sources are kept in a shared, backed-up location.

The downsides to this method are the time taken to script each deployment, the occasional (albeit rare) unscriptable change, and the fact that deploying an image from bare metal through to the last job can be quite lengthy.
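To illustrate the discipline, here’s a minimal sketch of what one scripted install step looks like. The share and package names are hypothetical (and it’s illustrative Python rather than any particular product’s tooling), but the msiexec switches are the standard Windows Installer ones:

```python
import ntpath  # Windows-style path handling, so the example behaves the same anywhere

# Hypothetical shared, backed-up package repository (illustrative UNC path).
PACKAGE_SHARE = r"\\fileserver\packages"

def build_install_command(msi_name, log_dir=r"C:\install-logs"):
    """Build a silent msiexec command line with a verbose per-package log.

    /qn = no UI, /norestart = defer reboots, /l*v = verbose log,
    which doubles as the paper trail change management wants.
    """
    msi_path = ntpath.join(PACKAGE_SHARE, msi_name)
    log_path = ntpath.join(log_dir, msi_name + ".log")
    return ["msiexec", "/i", msi_path, "/qn", "/norestart", "/l*v", log_path]

cmd = build_install_command("application1.msi")
print(" ".join(cmd))
```

The point isn’t the tooling; it’s that the package has to exist on the share and the switches have to be tested before anything reaches a production image.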

“But what if you use these tools to manage the golden image, then deploy via Citrix PVS?”

Well, you could do this, but then what are you really achieving? Sure, you have a scripted install of your golden image, but you’ve gone to the bother of scripting everything, performing a full reinstall (to ensure full operation), then importing the image into PVS. You also now have a licensing overhead (in most cases) for each server you deploy thereafter, as the vendor’s software is installed in the golden image.

I’ve spoken to three vendors in this market, and they all got a little uncomfortable with the idea of using their product up to the finish of the build, then uninstalling it in favour of PVS before sealing the image. In RES Automation Manager’s case, if you leave it installed you at least have the benefit of automating user-based tasks as well, so I suppose it’s not a great loss.

Whichever way you look at it, ask yourself: what are you really still achieving with PVS if you have already scripted the entire process?

And what could Citrix do to address my concerns?

To be honest, I’m not sure. I spoke about this frequently at Synergy last year, and there were quite a few people with similar concerns. Either way, here’s my 2 cents.

Scan and report:

I would like to see Provisioning Services scan the file system, registry and known keys (e.g. installed applications) of the old and new image, perform a comparison (like Regshot), and report on the changes each time a new revision is added. This scan would identify key changes to both structures and provide a report of change, e.g.:

Basic reporting:

  • New folder: c:\program files (x86)\application1
  • New file: c:\program files (x86)\application1\runme.exe
  • Newer file: c:\windows\system32\drivers\etc\hosts
  • Deleted file: c:\windows\system32\some.dll
  • New registry value: hklm\software\somevendor\dangerouskey – DWORD – 1

Smart reporting too:

  • New system ODBC connection: application1
  • New installed application: Application1
  • Windows updates applied: KBxxxxx, KBxxxxy

Etc., you get the idea…

This scan should be mandatory, and the results should be stored in the Provisioning Services database, where they can be referenced or searched for keywords at a later point. The scan should also require a mandatory sign-off from the administrator performing the upgrade, so you have accountability for who made the change.

I can’t see this being difficult; I’m sure a PowerShell script wouldn’t be too hard to write, and it would give you the change comparison you need to be sure of exactly what has happened between images.
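As a sketch of what that comparison could look like (illustrative Python rather than the PowerShell you’d actually run against an image, with made-up fingerprints standing in for file hashes):

```python
def snapshot_diff(before, after):
    """Compare two {path: fingerprint} snapshots taken from consecutive
    image revisions and report additions, changes and deletions."""
    report = []
    for path in sorted(after):
        if path not in before:
            report.append(("new", path))
        elif after[path] != before[path]:
            report.append(("changed", path))
    for path in sorted(before):
        if path not in after:
            report.append(("deleted", path))
    return report

# Hypothetical snapshots; fingerprints would be file hashes in practice.
before = {r"c:\windows\system32\drivers\etc\hosts": "hash-a",
          r"c:\windows\system32\some.dll": "hash-b"}
after = {r"c:\windows\system32\drivers\etc\hosts": "hash-c",
         r"c:\program files (x86)\application1\runme.exe": "hash-d"}

for change, path in snapshot_diff(before, after):
    print(f"{change}: {path}")
```

A real implementation would also walk known registry hives (installed applications, ODBC keys), so the “smart” report items above would fall out of the same pass.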

To really get across what I’m trying to say: as techies we love detailed exam questions, so below are some real-life examples of where I’ve seen Citrix Provisioning Services fall down in production environments.

Example:

George works as an architect in Company A, George’s SBC environment publishes hosted desktops for over 5,000 users on roughly 200 XenApp servers.

  • Company A has a large application landscape of roughly 1,000 applications.
  • Company A has a rapid rate of change, with an estimated 2-4 changes per week.
  • Company A’s golden image, including the App-V cache, is over 100 GB.
  • Due to the vast size of the golden image, only the last 20 images are stored.
  • George has employed a strategy to ensure all changes to the XenApp environment are fully documented in a SharePoint instance.
  • The corporate policy on acceptance testing dictates no additional changes can be processed until the acceptance test is complete. The acceptance testing period is 10 business days.

Scenario A: An issue has been reported recently: an Office application is crashing. The crash has been occurring for roughly 6 months but never generated enough steam until now. After troubleshooting, George finds a DLL called OpenEX.dll has been added to Excel, generating random crashes.

Due to a breakdown in documentation, there is no record of where this DLL came from. The DLL has been included in the last twenty images and it’s unclear who needs it…

A: FUUUUUUUUUU…

Scenario B:

After a recent purge and merge of layers into the master golden image, the Super Calculation 1 application requires an upgrade to version 2. During upgrade testing, it is found that version 1 must be uninstalled before version 2 can be installed.

The uninstallation fails and cannot be completed. No stored golden image predates the installation of Super Calculation 1. The support vendor cannot assist, as they insist a reinstall is required.

A: FUUUUUUUUU…

Scenario C: Roughly a year ago, a new piece of software (Craptastic App) was purchased by the Finance team. A consultant was employed to assist the administration team in deploying the software; due to a communication error between the project manager and the consultant, no documentation or source files were provided on how to install the application.

The administrator involved has since been fired for stealing pens.

Finance are performing end-of-year calculations and a seldom-used part of Craptastic App is no longer working. It worked last year when they performed the closing statements, but this year it errors out. At some point you realise an ODBC connection was removed from your image, but you are not sure which image contains the correct one. You have 20 images, one or none of which may contain the right value.

The software vendor has closed early for a Christmas party. They won’t be back until Monday; you won’t have a job on Monday.

This issue needs to be fixed yesterday.

A: FUUUUUUUUUU……

Andrew, you’re being a pedantic, paranoid eejit.

Maybe, but I’d rather be the above than suffer a major loss of face, or loss of job from the wrong recommendation!

I’d really be interested in your real-world experiences in scenarios similar to the above:

  • How do you enforce documentation and change management?
  • Do you need to regularly police these kinds of changes?
  • Do you share these concerns?
  • Is a certain amount of trust (or faith) in administrators required?
  1. June 1, 2012 at 8:00 pm

    Automation is free – there’s no excuse, but there’s also no replacement for following process. The most frustrating conversation to have is the one with the administrator who can’t see problems in his environment caused by no processes or automation.

  2. June 1, 2012 at 9:15 pm

    These concerns are not specific to PVS. They apply to image management and change management in general. These risks can be minimized by keeping your images as small and generic as possible. The following has worked for me:

    – Make application changes to the MSI packages or streamed applications, not to the OS image. Personally, I think allowing non-scripted application installations in something as important as a golden image should be avoided.
    – Apply file system and registry changes through Group Policy Preferences, not the OS image.

    Still, I like to keep backups of my images on tape.

  3. June 2, 2012 at 12:08 pm

    This is the battle I am currently fighting: trying really hard to persuade people that installing everything manually straight into the base image is not the way to go. When everything has been installed manually into the base image, there is no get-out-of-jail-free card.

  4. Darren
    June 3, 2012 at 9:26 pm

    Our current 4.5 PVS images are a mess. Sure, we document any changes (generally), but as you say, it’s the manual uninstalls and installs of apps etc. which leave behind so much crap I don’t know where to start. I actually tried rebuilding an image from scratch when we were approx 15 modifications in and it was a disaster!
    If you have lazy admins who don’t like to document, you are screwed…

    With our 6.5 image we are attempting to use MDT 2010 to sequence our image together with group policy. It obviously adds some complexity to start and we have to push the point to the rest of the IT dept that your change is not getting on unless it can be scripted. Time will tell how successful we are……

  5. June 4, 2012 at 5:25 am

    Great article Andrew! Totally agree with you that automation and documentation is key and that admins should really care about documenting the things they’re doing, even for their own sakes. (to prove they didn’t screw up the image)
    Some smart colleagues of mine developed a great framework (powershell) to automate the XenApp 6.5 installation, configuration, OS hardening and PVS vDisk creation. This way you should always be able to completely rebuild your image from scratch!

    I just had to go through such a discussion with a customer who didn’t want to believe that PVS doesn’t make everything easier. Sure, the deployment process is a lot faster, usually, but the work to get there is often forgotten and if you don’t follow the process/workflow, then some weird stuff can happen.

  6. June 7, 2012 at 2:08 pm

    Andrew – excellent article, and one that is so prevalent in large organisations. There are methods to minimise the risk of crap apps. These crap apps are usually for small numbers of users, and if you deploy a one-image-fits-all strategy you are wide open to the scenarios you have outlined. I would suggest that only if a particular application is required by, say, 70% of users does it qualify for installation on the master image. Silo off these crap apps to a bank of XenApp servers and stream them into the master image. OK, you now have two images to maintain. That’s my two pence worth…

    • June 7, 2012 at 2:13 pm

      Thanks for the comments, guys. There’s no single answer, as I assumed.

      A recommendation came from Carl Webster to consider using the Microsoft system state analyzer between images to detect changes. I hope to give this a try before the weekend and I’ll post back the results.

      A

      • Aleks
        August 31, 2012 at 1:49 pm

        Hi Andrew, any news on using the Microsoft System State Analyzer for tracking changes? Is it viable for use in a PVS environment, for example?

      • August 31, 2012 at 4:22 pm

        Hi Aleks,

        Sadly no, the system state analyzer is quite poor.
