Categories
blog howto server windows

printer driver installation error 0x00000057

note: This is a technical article meant for system administrators or advanced users. Make backups of files or registry keys that you manipulate. Use caution and proceed at your own responsibility.

I ran into this problem after cleaning up printer drivers and manually deleting files in the c:\windows\system32\spool directory.
When I tried to connect to a printer on the print server, error 0x00000057 appeared. The error occurs when Windows tries to install the printer driver.

If the driver is already present on the system, first try to remove it using the print server properties (printui.exe /s /t2). http://support.microsoft.com/kb/2771931 After removing it, try the installation again. If that doesn’t work, or if (as in my case) this particular driver was never installed on the system, read on.

If you have another computer or server that installs the printer without problems, then you can try this.
1) On the working computer open the registry editor and navigate to the key below (note: on 32-bit systems the environment key is Windows NT x86 instead of Windows x64):
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Print\Environments\Windows x64\Drivers\Version-3\
2) Find the key with the driver you need. Export this driver key to a REG file. Copy the file to the problem computer.
3) On the problem computer open the registry editor and check for the presence of the key, delete if present.
4) Import the REG file made in step 2 on the problem computer.
5) In the key, note the value of the InfPath item. It’s the location of the INF file. We need the whole containing folder and everything in it.
Should be something like: c:\windows\system32\DriverStore\FileRepository\xxx.inf-yyyyyy-zzzzz\
6) We need to copy this folder and its contents to the problem computer. This is not an easy task because of the permissions.
But you can do it like this from the working computer:
– Open an elevated command prompt (run CMD.exe as administrator)
– execute the following command: xcopy C:\Windows\System32\DriverStore\FileRepository\xxx.inf-yyyyyy-zzzzz\ \\problemcomputer\c$\Windows\System32\DriverStore\FileRepository\xxx.inf-yyyyyy-zzzzz\ /E /C /F /H /R /K /O
7) Now we have the registry key and the files present on the problem computer.
8) Restart the print spooler on the problem computer and try to connect the printer again or try to install the driver again.
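
Steps 5 and 6 can be sketched in Python as a small helper that turns the InfPath value into the xcopy command line. This is only an illustration, not part of the original procedure: the function name build_xcopy_command is made up, the path and host name are the same placeholders used above, and it assumes the path contains no spaces (the command is left unquoted, as in step 6).

```python
import ntpath  # Windows path rules, usable on any platform


def build_xcopy_command(inf_path: str, target_host: str) -> str:
    """Build the xcopy command from step 6 for the folder holding the INF file.

    Hypothetical helper: inf_path is the InfPath registry value from step 5,
    target_host is the name of the problem computer.
    """
    folder = ntpath.dirname(inf_path)        # driver package folder
    drive, rest = ntpath.splitdrive(folder)  # ("C:", "\\Windows\\...")
    # Administrative share of the same drive on the target, e.g. \\host\c$
    share = "\\\\{}\\{}$".format(target_host, drive[0].lower())
    destination = share + rest
    # Same switches as in step 6; trailing backslashes mark both as folders.
    return "xcopy {}\\ {}\\ /E /C /F /H /R /K /O".format(folder, destination)


if __name__ == "__main__":
    print(build_xcopy_command(
        r"C:\Windows\System32\DriverStore\FileRepository\xxx.inf-yyyyyy-zzzzz\xxx.inf",
        "problemcomputer",
    ))
```

Run the printed command from the elevated prompt on the working computer, exactly as in step 6.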

source: https://social.technet.microsoft.com/Forums/itmanagement/en-US/a225d71c-be8b-4530-bf50-63001559a978/windows-can-not-connect-to-the-printer-0x00000057?forum=itmanager

Categories
blog network server windows

FreeRadius.net service doesn’t work

The FreeRadius.net package built for Windows uses the XYZservice.exe wrapper tool to start a normal application as a service. However, on Windows 2008 and higher the service starts but RADIUS is not listening on the configured ports. You can check whether it is listening with netstat.
netstat -an | findstr 1812

The debug mode of FreeRadius.net (using the provided batch file) works fine, however. It seems the prebuilt radiusd.exe will not start with the default options on Windows 2008 and higher. You can check this by starting it manually from a command prompt:
C:\FreeRADIUS.net\bin\radiusd.exe -f -d C:/FreeRADIUS.net/etc/raddb
That is why the service appears to start while nothing is listening.

Solution: Configure XYZservice to start the application with the debug parameters.
Edit C:\FreeRADIUS.net\bin\XYZservice.ini
Change the CommandLine entry to:
CommandLine = C:\FreeRADIUS.net\bin\radiusd.exe -d C:/FreeRADIUS.net/etc/raddb -AX

Now if you start the FreeRadius.net service and check with netstat, you will see radiusd.exe listening.
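
The netstat check can also be scripted. One caveat with findstr 1812 is that it matches any line containing that text, including a port such as 18120. Below is a minimal Python sketch that compares the port exactly; the function name and the sample output are made up for illustration, and it assumes the usual Windows netstat -an column layout (protocol first, local address second).

```python
def udp_listeners(netstat_output: str, port: int) -> list:
    """Return local addresses in `netstat -an` output bound to a UDP port."""
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # A Windows UDP row looks like: UDP  0.0.0.0:1812  *:*
        if len(fields) >= 2 and fields[0].upper() == "UDP":
            local = fields[1]
            # Compare the text after the last colon so 18120 is not a match.
            if local.rsplit(":", 1)[-1] == str(port):
                matches.append(local)
    return matches


# Illustrative sample only, not captured from a real server.
SAMPLE = """
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  UDP    0.0.0.0:1812           *:*
  UDP    0.0.0.0:18120          *:*
"""

if __name__ == "__main__":
    print(udp_listeners(SAMPLE, 1812))
```

An empty list for port 1812 would mean the service started but RADIUS is not listening.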

Categories
blog network server windows

RAS or NPS forward RADIUS request to same server different port

The address of the remote RADIUS server x.x.x.x in remote RADIUS server group yyyyy Resolves to local address x.x.x.x.
The address will be ignored.

Use case scenario: You want to forward RADIUS requests incoming on the server to some software, possibly for setting up OTP authentication.

My scenario: Extra security for a PPTP VPN tunnel to a Windows server with RAS (Routing and Remote Access) by using VASCO Identikey OTP (one-time password) software (the same applies to other software such as RSAid). Normally the recommended setup uses two servers, one for the RAS connection and one with the VASCO Identikey middleware software on it. When you deploy like this you will not face the problem I’m about to describe. However, if you have only 1 Windows server at your disposal and you install the VASCO Identikey software on the same server as the RAS and NPS (Network Policy Server) roles, you will run into this problem.

Problem description: You have configured RAS correctly for PPTP MSCHAP v2 connections. In NPS you have configured a connection policy to forward the RADIUS requests (authentication and accounting) to a remote RADIUS server group. The authentication fails, and the VASCO audit viewer does not show any attempt to authenticate to the VASCO Identikey RADIUS server. In the Event Viewer application log there is an event ID 25 with the following error:
The address of the remote RADIUS server x.x.x.x in remote RADIUS server group yyyyy Resolves to local address x.x.x.x.

The problem is that NPS cannot forward RADIUS requests to the same IP address as itself, even if the software is listening on another port or you configure 2 IP addresses on the same network card. NPS insists that the IP address of the remote RADIUS server is the same as its own IP address and ignores your configuration to forward the RADIUS requests.

The solution is to use an address in the loopback IP range, for example 127.0.0.2. Unfortunately VASCO Identikey is licensed per IP address, so you can’t change it to listen on a loopback address without also requesting a new license. I have not tried this, so even with a new license I’m not sure VASCO Identikey will listen on a loopback address. Other OTP software may be able to do this; check with your vendor or the manual.
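
As a quick sanity check on why 127.0.0.2 is usable here: the whole 127.0.0.0/8 block is loopback, so any address in it stays on the local machine while still differing from the server’s configured address. A small Python sketch using the standard ipaddress module shows this; the helper name is made up, and whether NPS accepts such an address is based on the experience described in this post, not on anything the code proves.

```python
import ipaddress


def is_loopback(address: str) -> bool:
    """True if the address lies in the loopback range (127.0.0.0/8 for IPv4)."""
    return ipaddress.ip_address(address).is_loopback


if __name__ == "__main__":
    # 10.0.0.5 is just an example of an ordinary LAN address.
    for candidate in ("127.0.0.1", "127.0.0.2", "10.0.0.5"):
        print(candidate, is_loopback(candidate))
```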

What can you do? Use a RADIUS proxy that sits between NPS and VASCO Identikey. If you have a Linux server around, you can use the open-source FreeRadius software on that Linux box to proxy the RADIUS requests between RAS/NPS and VASCO Identikey.
If, like me, you have nothing but this 1 Windows server, you can use the FreeRadius.net software, a prebuilt binary of the open-source FreeRadius software for Windows. The software is quite old and no longer updated, but it still seems to work for our simple setup.

I have installed the FreeRadius.net software in C:\FreeRadius.net
I have configured it to accept RADIUS requests on interface 127.0.0.2 port 11812 and forward them to a RADIUS server on IP x.x.x.x on port 18120 (I changed the default RADIUS ports for VASCO and FreeRadius to avoid conflicts with NPS/RAS).

configuration file c:\FreeRadius.net\etc\raddb\clients.conf
I have commented out (#) all the default entries and added:
client 127.0.0.2 {
secret = testing123
shortname = localhost2
}

configuration file c:\FreeRadius.net\etc\raddb\radiusd.conf
I have commented out (#) the default listen directive, but you must leave the bind = * line in place, and added:
listen {
ipaddr = 127.0.0.2
port = 11812
type = auth
}
listen {
ipaddr = 127.0.0.2
port = 11813
type = acct
}

configuration file c:\FreeRadius.net\etc\raddb\proxy.conf
In this file I configured both the NULL realm for plain usernames and the DEFAULT realm for all others to forward to VASCO Identikey, which I have listening on ports 18120 and 18130 (auth and acct).
# This realm is for requests which don't have an explicit realm
# prefix or suffix. User names like "bob" will match this one.
#
realm NULL {
type = radius
authhost = 10.x.y.z:18120
accthost = 10.x.y.z:18130
secret = testing123
}

#
# This realm is for ALL OTHER requests.
#
realm DEFAULT {
type = radius
authhost = 10.x.y.z:18120
accthost = 10.x.y.z:18130
secret = testing123
}

You can now start FreeRadius.net in debug mode using the supplied batch file and test your configuration.

Below I will attach screenshots of my configuration for NPS, RAS and VASCO.
RAS settings (EAP can be enabled if you like)
NPS RADIUS client
NPS configure remote RADIUS server group
NPS connection policy screenshot
NPS Network Policy
VASCO port settings

With thanks to:
http://bent-blog.de/vasco-identikey-server-auf-microsoft-forefront-tmg-2010/

Categories
blog

CentOS 6 kernel panic selinux config file error

If you get a kernel panic on CentOS or RedHat as shown below and you recently changed the selinux config, then there might be an error in the config file.

None of the kernels in the GRUB boot menu will boot; each one ends in a kernel panic:

Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: init Not tainted 2.6.32-504.3.3.el6.x86_64 #1
panic+0xa7/0x16f
do_exit+0x862/0x870
fput+0x25/0x30
do_group_exit+0x58/0xd0
sys_exit_group+0x17/0x20
system_call_fastpath+0x16/0x1b

Booting in single user mode doesn’t work either.

Here’s how to fix this:
1) Reboot, and go in the GRUB menu. You have 3 seconds to strike the arrow keys before it will automatically boot the default kernel.

GRUB boot menu

2) Select the first line, the default kernel, and press the E key on the keyboard to edit the parameters. You will then see the following.

Edit GRUB boot options

3) Use the arrow keys to select the 2nd line, that starts with kernel. Press the E key to change this line, use the arrow keys to go to the end and type a space followed by enforcing=0

GRUB edit kernel line

4) Press Enter to confirm and then press B to boot the system.
It should boot up fine now.
Now edit the selinux config file (/etc/sysconfig/selinux) and correct your mistake.
In my case I had set disabled for the SELINUXTYPE variable. That’s wrong: it has to be set for the SELINUX variable. The screenshot below shows the correct settings in the config file to disable SELINUX.

SELINUX disabled correctly

5) Now reboot and everything should be fine!
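
The mistake from step 4 is easy to catch before rebooting. Here is a minimal Python sketch of such a check; it is an illustration only, the function name is made up, and it only tests for the specific mix-up described above (a mode such as disabled placed in SELINUXTYPE) plus an invalid SELINUX value. It assumes targeted as the example policy name, which is the CentOS 6 default.

```python
# The three modes that are valid for the SELINUX variable.
VALID_MODES = {"enforcing", "permissive", "disabled"}


def check_selinux_config(text: str) -> list:
    """Return a list of problems found in the contents of the selinux config."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()

    problems = []
    mode = settings.get("SELINUX")
    if mode not in VALID_MODES:
        problems.append("SELINUX must be enforcing, permissive or disabled "
                        "(found %r)" % mode)
    policy = settings.get("SELINUXTYPE")
    if policy in VALID_MODES:
        problems.append("SELINUXTYPE holds a mode (%r); it should name a "
                        "policy such as targeted" % policy)
    return problems


if __name__ == "__main__":
    broken = "SELINUX=enforcing\nSELINUXTYPE=disabled\n"
    print(check_selinux_config(broken))
```

On a live system you would pass in the contents of /etc/sysconfig/selinux; an empty list means neither of the two checks failed.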

Categories
exchange network server windows

Uninstall Exchange 2010 on crippled 2008 R2 DC (SBS2011)

FASTTRACK ARTICLE

Exchange 2010 was installed on a domain controller; it was actually a Small Business Server 2011. Something happened to the AD database and the backup was not good. The server would only boot in AD restore mode.

In AD restore mode we could log in using domain credentials because the other DC (backup) was providing logon and authentication. We could even start a whole bunch of services, including the Exchange services. We seized the roles on our other DC so that users would be able to log on without issues. In the following days we prepared to move the mailboxes away, in our case to Office365 in the cloud.
After that we followed the steps to remove the last Exchange server, including disconnecting all mailboxes from AD users, removing public folders and so on. At the final step, uninstalling Exchange, setup refused to continue, stating that the server was pending a reboot.
“A reboot from a previous installation is pending. Please restart the system and then rerun Setup.”
Restarting did not help of course, because we could only boot into directory services restore mode.
So I had to find a way to fix the AD.

1) Checking the AD database (ntds.dit) from restore mode:
Open a CMD command prompt (As Administrator);
execute command ntdsutil
at the prompt type: activate instance ntds
at the prompt type: files

A] If you get an error about corruption ->
Move all the .log files in C:\windows\ntds\ to another directory (desktop perhaps);
Open a new CMD command prompt (As Administrator);
execute command: ESENTUTL.EXE /p C:\windows\ntds\ntds.dit
execute command: ESENTUTL.EXE /g C:\windows\ntds\ntds.dit
Try the files command again in the open ntdsutil command prompt.

B] If you get an error about being in the recovered state ->
Open a new CMD command prompt (As Administrator);
execute command: ESENTUTL.EXE /g C:\windows\ntds\ntds.dit
The integrity check will show all is normal, otherwise see step A.
Reboot the computer and set the date in the BIOS back a couple of months, or to before the backup date if you tried an AD restore. If you are unsure, set it back a year or so. Try to boot in normal mode. Normally it should boot up, but it could take a while. Change the date/time back to the correct values once it has booted.

2) Replication and authentication to other domain controllers
If you have other domain controllers to replicate with, then you will probably need to change BurFlags for a non-authoritative restore (to fix NTFRS corruption on SYSVOL), after first resetting the machine account password for the secure channel to the other domain controllers. See this post: http://ares.gobien.be:8080/2013/07/sync-issues-krb_ap_err_modified-0x80090322-target-principal-name-incorrect/

3) Now you can either try to fix everything further or go ahead and uninstall Exchange. Try the uninstall the normal way first.
If you get errors about the state of the Active Directory, try it like this:
Open a new CMD command prompt (As Administrator);
execute command: cd %programfiles%\Microsoft\Exchange Server\v14\bin
execute command: setup.com /m=uninstall /dc:otherdc.domain.local

Make sure the server is still in the “Exchange Servers” security group.
Make sure there are no entries in the hosts file for your DCs, because they can also trigger the following error:
Setup encountered a problem while validating the state of Active Directory: ‘server.domain.local’ isn’t a fully qualified domain name (FQDN). Please provide a valid FQDN. For example: ‘SERVER’.

Happy uninstalling!

Log excerpt:

[12/31/2014 08:27:56.0273] [1] Active Directory session settings for 'Get-ExchangeServer' are: View Entire Forest: 'True', Configuration Domain Controller: 'SRV-APP1.contoso.com', Preferred Global Catalog: 'SRV-APP1.contoso.com', Preferred Domain Controllers: '{ SRV-APP1.contoso.com }'
[12/31/2014 08:27:56.0273] [1] Beginning processing Get-ExchangeServer -Identity:'SBS2011'
[12/31/2014 08:27:56.0273] [1] Searching objects "SBS2011" of type "Server" under the root "$null".
[12/31/2014 08:27:56.0273] [1] Previous operation run on domain controller 'SRV-APP1.contoso.com'.
[12/31/2014 08:27:56.0273] [1] Previous operation run on domain controller 'SRV-APP1.contoso.com'.
[12/31/2014 08:27:56.0273] [1] Preparing to output objects. The maximum size of the result set is "unlimited".
[12/31/2014 08:27:56.0273] [1] Ending processing Get-ExchangeServer
[12/31/2014 08:27:56.0491] [1] [REQUIRED] There is a pending reboot from a previous installation of a Windows Server 2008 role or feature. Please restart the system and rerun Setup.
[12/31/2014 08:27:56.0523] [1] Ending processing test-setuphealth
[12/31/2014 08:27:56.0538] [0] **************

[12/31/2014 08:28:01.0312] [1] Ending processing Get-ExchangeServer
[12/31/2014 08:28:01.0702] [1] [REQUIRED] There is a pending reboot from a previous installation of a Windows Server 2008 role or feature. Please restart the system and rerun Setup.
[12/31/2014 08:28:01.0702] [1] Ending processing test-setuphealth
[12/31/2014 08:34:16.0514] [0] End of Setup

[12/31/2014 10:17:29.0782] [1] Ending processing Get-ExchangeServer
[12/31/2014 10:17:30.0047] [1] [REQUIRED] Unable to read data from the Metabase. Ensure that Microsoft Internet Information Services is installed.
[12/31/2014 10:17:30.0047] [1] [REQUIRED] Setup encountered a problem while validating the state of Active Directory: Active Directory operation failed on SBS2011.contoso.com. The supplied credential for 'CONTOSO\Administrator' is invalid.

[12/31/2014 10:53:05.0881] [1] Searching objects "SBS2011" of type "Server" under the root "$null".
[12/31/2014 10:53:05.0897] [1] Previous operation run on domain controller 'SRV-APP1.contoso.com'.
[12/31/2014 10:53:05.0897] [1] Previous operation run on domain controller 'SRV-APP1.contoso.com'.
[12/31/2014 10:53:05.0897] [1] Preparing to output objects. The maximum size of the result set is "unlimited".
[12/31/2014 10:53:05.0912] [1] Ending processing Get-ExchangeServer
[12/31/2014 10:53:06.0287] [1] [REQUIRED] Setup encountered a problem while validating the state of Active Directory: 'SBS2011.contoso.com' isn't a fully qualified domain name (FQDN). Please provide a valid FQDN. For example: 'SBS2011'.
[12/31/2014 10:53:06.0318] [1] Ending processing test-setuphealth

[12/31/2014 10:54:32.0491] [1] Previous operation run on domain controller 'SRV-APP1.contoso.com'.
[12/31/2014 10:54:32.0491] [1] Previous operation run on domain controller 'SRV-APP1.contoso.com'.
[12/31/2014 10:54:32.0491] [1] Preparing to output objects. The maximum size of the result set is "unlimited".
[12/31/2014 10:54:32.0491] [1] Ending processing Get-ExchangeServer
[12/31/2014 10:54:33.0043] [1] [REQUIRED] Active Directory does not exist or cannot be contacted.
[12/31/2014 10:54:33.0043] [1] [REQUIRED] Setup encountered a problem while validating the state of Active Directory: 'SBS2011.contoso.com' isn't a fully qualified domain name (FQDN). Please provide a valid FQDN. For example: 'SBS2011'.
[12/31/2014 10:54:33.0043] [1] Ending processing test-setuphealth

[12/31/2014 10:56:00.0320] [1] Previous operation run on domain controller 'SRV-APP1.contoso.com'.
[12/31/2014 10:56:00.0320] [1] Preparing to output objects. The maximum size of the result set is "unlimited".
[12/31/2014 10:56:00.0320] [1] Ending processing get-EdgeSubscription
[12/31/2014 10:56:00.0574] [1] [REQUIRED] Setup encountered a problem while validating the state of Active Directory: 'SBS2011.contoso.com' isn't a fully qualified domain name (FQDN). Please provide a valid FQDN. For example: 'SBS2011'.
[12/31/2014 10:56:00.0670] [1] Ending processing test-se

[12/31/2014 12:31:07.0542] [1] Previous operation run on domain controller 'SVR-DC1.contoso.com'.
[12/31/2014 12:31:07.0542] [1] Previous operation run on domain controller 'SVR-DC1.contoso.com'.
[12/31/2014 12:31:07.0542] [1] Preparing to output objects. The maximum size of the result set is "unlimited".
[12/31/2014 12:31:07.0542] [1] Ending processing Get-ExchangeServer
[12/31/2014 12:31:07.0791] [1] [REQUIRED] Setup encountered a problem while validating the state of Active Directory: The user-specified domain controller SRV-APP1 does not exist.

Categories
blog howto

How to delete LinkedIn saved e-mail contacts

edit: Solution all the way at the end, skip to the direct link if you are thinking TLDR.

After receiving one too many e-mails from LinkedIn with the message that e-mail contact x had joined LinkedIn, I started looking for a way to delete these contacts or the address book and stop receiving these mails. It turns out this is not as easy as you would think.

First, disabling these e-mails is not possible: they are not listed among the options for opting out of the different kinds of e-mail communication from LinkedIn.
LinkedIn communication settings

If you read posts on the internet, everyone keeps referring to https://www.linkedin.com/contacts/manage_sources/ but this only lists options to add sources for synchronizing. I had nothing enabled, so I couldn’t disable anything. At the top of the content it showed me the number of LinkedIn connections and I could click on that, but it only showed me a list of my connections. Even after changing the filters to saved contacts it didn’t show anything besides connections.
LinkedIn contact settings

So the wise thing to do was to open a support ticket, and so I did. But 5 days later the support person could only send me standard responses and had not really read or understood my question.
LinkedIn support ticket

SOLUTION: I then finally found the answer by accident after I was changing my profile information.
When you get to this page: https://www.linkedin.com/fetch/importAndInviteEntry it shows Manage imported contacts at the top. That brings you to this page: https://www.linkedin.com/people/contacts where you will see all your imported contacts and can delete them all.

Finally, no thanks to LinkedIn support.

Categories
blog howto linux network server virtualization

Virtual Private Server on SSD storage

Update: After reviewing the offerings, I’m no longer running my VPS at DigitalOcean. Instead I’m using Linode at the moment.
www.linode.com


Recently I read about the virtual private servers you can create on www.DigitalOcean.com. They call them Droplets, and they get created in less than a minute if you don’t enable back-ups, or in just a couple of minutes with the back-up service enabled. You can choose between data centers in different locations: New York, San Francisco, London, Amsterdam and Singapore. You get one public IP address (or IPv6 if you prefer, but who does anyway).

You can choose from some preselected minimal OS installations such as Ubuntu, CentOS, Debian, Fedora and CoreOS. Or you can deploy your VPS complete with a LAMP stack (Linux, Apache, MySQL and PHP) or even with a WordPress or Drupal setup. When I looked at the price (10$/month or 12$ with back-up) for a VPS with 1 CPU, 1GB RAM, 30GB disk and 2TB data transfer, and compared that to what I was paying for 2 shared hosting plans, the math was clear. For a bit less than what I was paying I get my very own Virtual Private Server where I can configure everything I want and have full rights on everything.

For me, as an enthusiastic system engineer with experience on multiple Linux flavors, this was a very nice project. Starting from a Droplet with a minimal CentOS 7 installation, I quickly installed and configured Apache, Nginx, MySQL and PHP and started serving web pages. My first tests were a success. I configured different management tools and secured the system with a software firewall. Because your VPS has a public IP address you must think carefully about security. It took some time getting used to firewalld, the new firewall software in CentOS 7. After some cursing I had it set up as I wanted.

The next step was to migrate the first of my existing websites over to the new VPS. I chose to configure virtual hosts in an organized manner so that I could always expand to more websites if needed. After transferring the databases and website data, I set course for a new goal: making my sites more secure by using HTTPS encryption on the login pages. By using the free 1 year class 1 certificates from www.startssl.com I did not incur any extra costs. Update: Using Let’s Encrypt now and HTTPS on all pages! After some hours of configuring and testing I had everything running smoothly. I migrated all the DNS records to my new VPS and shortly after my 1st website was running live on the new VPS.

My next goal was to set up mailboxes for every virtual host and to use IMAP to connect to them. I chose POSTFIX as the SMTP server and DOVECOT as the IMAP server. POSTFIX was configured to use virtual mailboxes that don’t require a Linux user. DOVECOT was configured for SSL/TLS encrypted connections so passwords are never sent in clear text. To finish it off I installed ROUNDCUBE as a web mail solution.

After my successful first website migration the second one followed quickly and went smoothly as well. This time I also needed an FTP setup, so I chose VSFTPD and again made it possible to use SSL encryption.

The VPS is now running all of my websites, except this blog.

PS: If you are wondering why I don’t migrate this blog, running on my home server, that’s because it’s a challenge to keep a website running on a homeserver with minimal hardware costs and dynamic internet ip address. It also has some other uses for me besides serving this blog.

Categories
blog server windows

Windows 7 profile SID wrong mstsc can’t login

This is a very strange problem I came across on a Windows 7 Embedded thin client. I don’t quite understand what went wrong, but I’ll give you a detailed description.

CASE:
The user has a thin client with Windows 7 Embedded that has been joined to the Active Directory domain. On the public desktop of the thin client there is an RDP file to connect to a Remote Desktop Server (a.k.a. Terminal Server). The user logs on to the thin client using their AD credentials. The user was able to log on to the server using the RDP file without problems until today.

SYMPTOMS:
– User can’t log on to the Remote Desktop Server, the error received is:

The connection was denied because the user account is not authorized for remote login

TROUBLESHOOTING:
Normally this just means that the user is not a member of the “Remote Desktop Users” local group on the server.
– I verified the user was a member of the correct groups to log on to the server.
– I then tried to log on to the server with the same credentials from a different workstation. This worked without a problem, which led me to conclude that everything was OK on the server side.
– On the troublesome workstation (thin client with WIN 7 E in my case) I launched remote desktop with the “Run As Administrator” option and supplied credentials for an admin account. I tried to connect to the Remote Desktop server using the credentials of the troublesome user account. This worked without a problem.
– I tried again without the run as, and it failed again with the same error.

This led me to my conclusion that something was very wrong with the user profile on the workstation for this domain user.

SOLUTION:
I decided to delete the user profile on the local workstation since nothing is stored in it (they don’t work locally). However, when I opened Explorer and looked in “C:\Users” I saw 2 profile folders with the same name (the username of the troublesome user). I didn’t think it was possible for 2 folders to have the same name.
I deleted both folders!
I then opened REGEDIT and went to HKEY_LOCAL_MACHINE\SOFTWARE\MICROSOFT\WINDOWS NT\CURRENTVERSION\PROFILELIST
I saw multiple user SIDs and checked them all. To my surprise there were 2 different user SIDs that both had a value of c:\users\problem.username underneath them. So there were 2 different user SIDs for the same username, which I thought was impossible. I deleted both registry keys.
After deleting the profiles and the keys I logged back in as the user, the profile was recreated and the remote desktop worked perfectly.

So it seems that the remote desktop client was sending the wrong SID to the server and that was the reason for the unauthorized error message.
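
Spotting this situation by eye in REGEDIT is tedious when there are many profiles. The check can be sketched in Python as a function that takes a mapping of SID to profile path (the ProfileImagePath value under each ProfileList subkey) and reports paths claimed by more than one SID. This is an illustration only: the function name and the SIDs in the sample are made up, and reading the values from the registry is left out so the logic can be shown on its own.

```python
from collections import defaultdict


def duplicate_profile_paths(profiles: dict) -> dict:
    """Return profile paths that are claimed by more than one SID.

    profiles maps SID -> ProfileImagePath as found under ProfileList.
    """
    by_path = defaultdict(list)
    for sid, path in profiles.items():
        # Windows paths are case-insensitive, so compare in lower case.
        by_path[path.lower()].append(sid)
    return {path: sorted(sids)
            for path, sids in by_path.items() if len(sids) > 1}


# Hypothetical SIDs, for illustration only.
SAMPLE = {
    "S-1-5-21-1111111111-2222222222-3333333333-1104": r"C:\Users\problem.username",
    "S-1-5-21-4444444444-5555555555-6666666666-1104": r"c:\users\problem.username",
    "S-1-5-21-1111111111-2222222222-3333333333-1105": r"C:\Users\other.user",
}

if __name__ == "__main__":
    for path, sids in duplicate_profile_paths(SAMPLE).items():
        print(path, sids)
```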

Categories
blog howto server virtualization

HP VAAI plugin missing from Vmware ESXi 5.1 U1 HP Customized edition sep 2013

Also read the updated EDIT section at the end.

Summary:
HP has left the VAAI plugin out of the September 2013 ISOs for ESXi 5.1 and 5.5. People adding a host to a SAN (like a P2000G3) will run into trouble without this plugin.
New VMFS-5 datastores created on a host that has the VAAI plugin use the ATS-only locking mechanism. A host that is added later without the VAAI plugin will not be able to see those datastores correctly.

Situation:
An HP ProLiant DL380 G6 server, freshly installed with VMware ESXi 5.1 U1 using the “HP customized sep 2013 ISO”.
A couple of months earlier I had already installed 2 other servers with VMware ESXi 5.1 U1 using the “HP customized apr 2013 ISO”.
These 2 previously installed hosts had been connected via iSCSI to an HP P2000G3 SAN with 2 LUNs, and 2 datastores were created without problems. Both hosts saw these datastores.

Problem:

The new vmware host would briefly show one or both datastores after a RESCAN HBA, but they disappeared after 2 seconds. Also the capacity values shown during those 2 seconds were wrong.

Troubleshooting:
I switched the ISCSI from the Intel NIC to the Broadcom NIC, but the problem remained the same.
I updated all the firmware on the server using the Servicepack for Proliant CD.
After that I further updated using the “VMware vSphere 5 Supplement for HP Service Pack for ProLiant” and the included HP SUM.
I checked the vmkernel log file in /tmp/scratch and saw these errors:
2013-11-28T13:53:58.536Z cpu11:9191)WARNING: FSAts: 1304: Denying reservation access on an ATS-only vol 'P2000LUN12'
2013-11-28T13:53:58.536Z cpu11:9191)WARNING: HBX: 1955: ATS-Only VMFS volume 'P2000LUN12' not mounted. Host does not support ATS or ATS initialization has failed.
2013-11-28T13:53:58.536Z cpu11:9191)WARNING: HBX: 1968: Failed to initialize VMFS distributed locking on volume 51f29f9b-26f5059a-39c6-00237deeceda: Not supported
2013-11-28T13:53:58.536Z cpu11:9191)Vol3: 2359: Failed to get object 28 type 1 uuid 51f29f9b-26f5059a-39c6-00237deeceda FD 0 gen 0 :Not supported
2013-11-28T13:53:58.536Z cpu11:9191)WARNING: Fil3: 2492: Failed to reserve volume f530 28 1 51f29f9b 26f5059a 230039c6 daceee7d 0 0 0 0 0 0 0
2013-11-28T13:53:58.536Z cpu11:9191)Vol3: 2359: Failed to get object 28 type 2 uuid 51f29f9b-26f5059a-39c6-00237deeceda FD 4 gen 1 :Not supported
2013-11-28T13:53:58.581Z cpu11:9191)HBX: 707: Setting pulse [HB state abcdef02 offset 3440640 gen 1 stampUS 5568056524 uuid 529735be-0b3c273d-6396-18a90550fb2c jrnl drv 14.58] on vol 'P2000LUN12' failed: Not supported

After researching on the internet I read up on VAAI, which is responsible for the ATS locking of the iSCSI volumes. You could turn it off, but you have to do it on all the hosts and power all the VMs down. This was not desirable, and the other hosts worked fine, so I wanted to fix this one new host.
I logged in to the hosts using SSH and checked the VAAI status with this command:
esxcfg-scsidevs -l | egrep "Display Name:|VAAI Status:"
On the working hosts it showed my ISCSI disks and next to VAAI status showed: supported.
On the troublesome host it showed the ISCSI disks and the VAAI status showed: unknown.
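
When there are many disks, pairing each name with its status by eye gets tedious. Below is a minimal Python sketch that parses the filtered output of the command above into a device-to-status mapping. It assumes each Display Name line is followed by the VAAI Status line of the same device, as the egrep filter produces; the function name and the device names in the sample are made up for illustration.

```python
def vaai_status_by_device(filtered_output: str) -> dict:
    """Map each 'Display Name' to the 'VAAI Status' line that follows it."""
    statuses = {}
    current = None
    for line in filtered_output.splitlines():
        # Split on the first colon only; display names may contain colons.
        label, sep, value = line.strip().partition(":")
        if not sep:
            continue
        if label == "Display Name":
            current = value.strip()
        elif label == "VAAI Status" and current is not None:
            statuses[current] = value.strip()
            current = None
    return statuses


# Illustrative sample only; device names are placeholders.
SAMPLE = """
   Display Name: HP iSCSI Disk (naa.example1)
   VAAI Status: supported
   Display Name: HP iSCSI Disk (naa.example2)
   VAAI Status: unknown
"""

if __name__ == "__main__":
    for device, status in vaai_status_by_device(SAMPLE).items():
        print(device, "->", status)
```

Any device reported as unknown on one host but supported on another points to the missing plugin described in this post.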

Solution:
VMware support was no help: they just asked me for a collection of all the log files without reading my log excerpt containing the errors shown above. I gave them the logs, but 5 hours later they still hadn’t responded. That’s not what I call good support.
I read up some more on VAAI, and apparently it’s an HP plugin for VMware. So I looked around for a download, to force an update.
To my surprise I found it on this page: http://h18004.www1.hp.com/products/servers/software/vmware-esxi/driver_version.html
It’s listed as a component of the April 2013 version of the HP customized ISO, but it has been left out of the September 2013 version.
So I downloaded the HP VAAI plugin from here: http://h20566.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/swdDetails/?spf_p.tpst=swdMain&spf_p.prp_swdMain=wsrp-navigationalState=idx=|swItem=MTX_30e09de4fc7e4498bfd9102a99

Download the zip file and extract it. Inside you’ll find another ZIP file with the word bundle in the filename. Upload this zip to your ESXi server’s /tmp/scratch directory (to do this, enable SSH using the client and use WinSCP or FastSCP).
Log in to the SSH shell (using PuTTY) and execute:
esxcli software vib install -d /tmp/scratch/hp_vaaip_p2000_offline-bundle-210.zip
Reboot the host, and the problem is gone. Your datastores now appear under the storage tab for this host.

Notes:
Just a heads up: it seems HP has also left this HP VAAI plugin out of the ESXi 5.5 ISOs.
No idea whether they just forgot or are doing it intentionally.

I asked VMware to close the support case some 9 hours ago, stating that I had found the solution myself, but they haven’t even looked at it and it remains open.

EDIT:
It seems HP removed the VAAI plugin on purpose, because of a bug with some RAID controllers. Read the advisory here at this LINK.
Since I don’t have these RAID controllers, I don’t have any problem enabling the plugin.
You can read more in this forum topic:
VAAI support with ESX 5.1U1 on P2000 G3 MSA

Categories
blog howto server virtualization

MONITOR PANIC: Unable to decompress PPN from swap slot for VM

VMWARE ESXi 5.1U1

My VM would power off for no apparent reason.
Looking into the logs, this error appears.

MONITOR PANIC: Unable to decompress PPN from swap slot for VM

I believe the underlying storage (a single SATA disk in my case) is at fault, or possibly close to failing.
I used Storage vMotion to move the VM to another disk.