This post is an update on where we stand after upgrading to NetApp Data ONTAP 8.1 and the issues the upgrade has caused us. Since I posted in the NetApp Communities about the problems we saw after upgrading, such as high CPU and huge latency, many more people have replied reporting the exact same problems.
We have had a ticket open since day one of this issue, more than a month ago, and we are still experiencing problems. Some of the steps we have undertaken just to keep the system functioning:
- Disabled all deduplication
- Disabled all SnapVault jobs
- Aligned a few misaligned virtual machines
- Added another 24 x 600GB SAS disk shelf, created a new aggregate, and migrated/balanced virtual machines between aggregates
- Upgraded again to 8.1.1RC1
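On the alignment item: a guest partition is generally considered aligned when its starting byte offset is a multiple of the 4 KB block size used by the storage. Here is a minimal sketch of that check in Python. The function name and the sample partitions are illustrative only and are not output from any NetApp or VMware tool.

```python
BLOCK_SIZE = 4096  # bytes; 4 KB storage block size assumed for this sketch
SECTOR_SIZE = 512  # bytes; classic disk sector size


def is_aligned(start_sector: int, sector_size: int = SECTOR_SIZE,
               block_size: int = BLOCK_SIZE) -> bool:
    """Return True if the partition's starting byte offset falls on a block boundary."""
    return (start_sector * sector_size) % block_size == 0


# Hypothetical partitions: older guest OS installs often start at sector 63,
# newer ones at sector 2048 (1 MiB).
for name, start in [("old-vm", 63), ("new-vm", 2048)]:
    state = "aligned" if is_aligned(start) else "misaligned"
    print(f"{name}: starts at byte {start * SECTOR_SIZE} -> {state}")
```

A partition starting at sector 63 begins at byte 32256, which is not a multiple of 4096, so every guest I/O can straddle two storage blocks.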
After all this we were still experiencing huge latency. We have sent so many perfstats back and forth to NetApp that I have lost count. It was suggested that we install 2 x 256GB Flash Cache cards (one for each controller in the HA pair) to alleviate the latency and the workload on the disks. We were able to obtain some loan cards to see if they would resolve the issue. The conclusion is that the system responds a little faster than it has for the past month, but there are still timeouts on virtual machines and occasional spikes of high disk latency. These cause things like VDI sessions randomly pausing for a second or two before continuing, users being unable to connect to SQL servers, and so on.
We never had this issue with ONTAP 8.0.2, and back then we had everything, such as dedupe and SnapVault, running in the background. The reason for our upgrade to 8.1 was the vfiler interactive shell, which lets us run scripts within SnapManager that trigger SnapVault updates.
I’m still battling with this issue, and every day we try whatever new things NetApp support asks us to do.
With so many users and customers upgrading to ONTAP 8.1 and hitting the exact same issues, I’m actually surprised that NetApp has not removed 8.1 from the available software downloads.
I will post an update as we get closer to solving the issue.
Hey. We’re going through the same exact thing. Have you heard anything back?
Hi, we email support most days but hear very little back. I’m not sure if you have seen this, but there is a patch release for 8.1, 8.1P2, which looks like it addresses a few performance issues. We have not applied it, so I’m not sure whether it fixes ours, but if you are experiencing slow performance and high latency it might be worth a try. Here is the link: http://support.netapp.com/NOW/download/software/ontap/8.1P2/
If you upgrade to this version, could you please reply and let us know whether the issues remained the same or were resolved? Thanks
I think their support leaves something to be desired.
We were having the exact same things happen to us after upgrading from 8.0 to 8.1P1. After we changed the virtual SCSI controller to paravirtual on all NetBackup-related servers (backup helpers etc.), the issues went away. Some VMs were showing event log entries related to SCSI disk timeouts; changing the controller type to paravirtual also remedied those errors. We can’t go to 8.1P2 because we have Flash Cache, and the Upgrade Advisor specifically called that out as a risk. I noted the other day that 8.1P1 has seemingly gone walkabout and is no longer available.
Hi, you are unfortunately not alone. We have experienced many performance problems since our 8.1 upgrade (on a 3240) in May 2012. We recently upgraded to 8.1P3 and our problems are still there: CPU spikes and very long-running commands (vol delete or vol clone sometimes take up to 30 minutes to complete!).
We own many NetApp storage systems, but this problem will be the last one. My next storage system won’t be NetApp, because of the lack of real, effective support and the poor quality of the products (since the ONTAP 8.0 adventure). I’m getting tired and fed up with giving them a perfstat every day!
If I had $1 for each time I heard “send us a perfstat” from NetApp Support, I would be rich. The most annoying thing I find with gathering perfstats is that, because they work in iterations, the collector might be sleeping between iterations when the problem occurs, so it’s hard to capture the real data. You can also use the statit command to gather some information; it is not as comprehensive as perfstat, but it should show you a few interesting things. Has NetApp told you to make sure all your LUNs, VMs, etc. are aligned correctly?
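To illustrate the iteration problem: if a collector captures for a while and then sleeps, a short latency spike can fall entirely inside a sleep. Here is a rough sketch in Python. The durations are made up, and this models the capture/sleep pattern only, not how perfstat itself works internally.

```python
def spike_overlaps_capture(spike_start: float, spike_len: float,
                           capture_len: float, sleep_len: float) -> bool:
    """Return True if any part of the spike falls inside a capture window.

    The collector is modelled as repeating [capture, sleep] cycles from t=0.
    All times are in seconds.
    """
    cycle = capture_len + sleep_len
    spike_end = spike_start + spike_len
    first = int(spike_start // cycle)  # cycle in which the spike starts
    last = int(spike_end // cycle)     # cycle in which the spike ends
    for k in range(first, last + 1):
        win_start = k * cycle
        win_end = win_start + capture_len
        # Standard interval-overlap test against this capture window.
        if spike_start < win_end and spike_end > win_start:
            return True
    return False


# Hypothetical schedule: capture for 60 s, then sleep for 240 s.
print(spike_overlaps_capture(100, 20, 60, 240))  # spike sits inside the sleep
print(spike_overlaps_capture(290, 20, 60, 240))  # spike runs into the next capture
```

With a schedule like this the collector is only watching 20% of the time, so a 20-second spike is missed more often than it is caught.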
Hi
Can you provide the steps involved in the Data ONTAP upgrade process, such as the pre- and post-upgrade tasks?
Thanks
Hi, there are some general steps: trigger a manual AutoSupport (pre-upgrade), back up the hosts and rc files, check CPU utilization and make sure it isn’t above 50% on either controller during the upgrade, perform the upgrade, trigger another manual AutoSupport (upgrade complete), and then go through your usual checks on your systems. It also depends on what system you have. I would download the upgrade guide from the NOW site for the software version you want to run and follow it for your system model. I would also make sure your NetApp support contract is current in case something goes wrong.
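The CPU check in particular is easy to script once you have utilization samples from whatever monitoring you already use. Here is a minimal sketch in Python. The controller names, the sample values and the function name are all hypothetical, and the 50% limit is simply the rule of thumb mentioned above.

```python
CPU_LIMIT = 50.0  # percent; rule-of-thumb threshold from the steps above


def cpu_headroom_ok(cpu_samples: dict[str, list[float]],
                    limit: float = CPU_LIMIT) -> dict[str, bool]:
    """For each controller, report whether every CPU sample is at or below the limit.

    A controller with no samples is reported as not OK, since nothing was verified.
    """
    return {
        name: bool(samples) and max(samples) <= limit
        for name, samples in cpu_samples.items()
    }


# Hypothetical utilization samples (percent) collected before the upgrade.
samples = {
    "controller-a": [22.0, 31.5, 40.0],
    "controller-b": [35.0, 62.0, 48.0],
}
for name, ok in cpu_headroom_ok(samples).items():
    print(f"{name}: {'OK to proceed' if ok else 'too busy, wait'}")
```

The check uses the peak rather than the average because, during an upgrade on an HA pair, one controller temporarily carries the load of both.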