
A couple of months back I upgraded our Prime Infrastructure from 2.2 to 3.0. At the time I chose the inline upgrade, as it was supported. If you have worked with this product, you will know that "do a fresh install and import maps" is the safest approach for a Prime Infrastructure upgrade. You will of course lose historical data and have to do some manual work, but it is still worth doing.

When I upgraded CPI 2.2 to CPI 3.0, most of the settings were left at their defaults unless they had already been changed in 2.2. Within two months of the upgrade, I started getting the alert below stating that CPI was running low on disk space.

[Screenshot: PI3.0-BP-01]

When I checked in the CLI, the PI database size was 638G (97% of the allocated space). As suggested, I ran a disk cleanup, which recovered ~25G; within a day that space had been consumed by the database again and the alert above kept coming. You can check your CPI database utilization as below (optvol is the volume holding the CPI database, and it is the one running out of space):

prime/admin# root
Enter root password :
Starting root bash shell ...

ade # df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/smosvg-rootvol
                      3.8G  461M  3.2G  13% /
/dev/mapper/smosvg-varvol
                      3.8G  784M  2.9G  22% /var
/dev/mapper/smosvg-optvol
                      694G  638G   21G  97% /opt
/dev/mapper/smosvg-tmpvol
                      1.9G   36M  1.8G   2% /tmp
/dev/mapper/smosvg-usrvol
                      6.6G  1.3G  5.1G  20% /usr
/dev/mapper/smosvg-recvol
                       93M  5.6M   83M   7% /recovery
/dev/mapper/smosvg-home
                       93M  5.6M   83M   7% /home
/dev/mapper/smosvg-storeddatavol
                      9.5G  151M  8.9G   2% /storeddata
/dev/mapper/smosvg-altrootvol
                       93M  5.6M   83M   7% /altroot
/dev/mapper/smosvg-localdiskvol
                      130G   53G   71G  43% /localdisk
/dev/sda2              97M  5.6M   87M   7% /storedconfig
/dev/sda1             485M   25M  435M   6% /boot
tmpfs                 7.8G  2.6G  5.3G  33% /dev/shm
ade # exit
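
If you want to see what inside /opt is actually consuming the space (the Oracle database versus backups, logs and so on), du from the same root shell breaks it down. A minimal sketch using standard tools; the exact directory layout under /opt may differ between PI versions.

ade # du -sh /opt/* 2>/dev/null | sort -rh | head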

Here is how you can do the disk cleanup:

prime/admin# ncs cleanup
***************************************************************************
!!!!!!!                           WARNING                           !!!!!!!
***************************************************************************
The clean up can remove all files located in the backup staging directory.
Older log files will be removed and other types of older debug information
will be removed
***************************************************************************
Do you wish to continue? ([NO]/yes) yes
 
***************************************************************************
!!!!!!!                DATABASE CLEANUP WARNING                     !!!!!!!
***************************************************************************
Cleaning up database will stop the server while the cleanup is performed.
The operation can take several minutes to complete
***************************************************************************
Do you wish to cleanup database? ([NO]/yes) yes
 
***************************************************************************
!!!!!!!                USER LOCAL DISK WARNING                      !!!!!!!
***************************************************************************
Cleaning user local disk will remove all locally saved reports, locally
backed up device configurations. All files in the local FTP and TFTP
directories will be removed.
***************************************************************************
Do you wish to cleanup user local disk? ([NO]/yes) yes
===================================================
Starting Cleanup: Wed Nov 11 09:41:11 AEDT 2015
===================================================
{Wed Nov 11 09:44:07 AEDT 2015} Removing all files in backup staging directory
{Wed Nov 11 09:44:07 AEDT 2015} Removing all Matlab core related files
{Wed Nov 11 09:44:07 AEDT 2015} Removing all older log files
{Wed Nov 11 09:44:09 AEDT 2015} Cleaning older archive logs
{Wed Nov 11 09:45:01 AEDT 2015} Cleaning database backup and all archive logs
{Wed Nov 11 09:45:01 AEDT 2015} Cleaning older database trace files
{Wed Nov 11 09:45:01 AEDT 2015} Removing all user local disk files
{Wed Nov 11 09:47:31 AEDT 2015} Cleaning database
{Wed Nov 11 09:47:45 AEDT 2015} Stopping server
{Wed Nov 11 09:50:07 AEDT 2015} Not all server processes stop. Attempting to stop remaining
{Wed Nov 11 09:50:07 AEDT 2015} Stopping database
{Wed Nov 11 09:50:09 AEDT 2015} Starting database
{Wed Nov 11 09:50:23 AEDT 2015} Starting database clean
{Wed Nov 11 09:50:23 AEDT 2015} Completed database clean
{Wed Nov 11 09:50:23 AEDT 2015} Stopping database
{Wed Nov 11 09:50:37 AEDT 2015} Starting server
===================================================
Completed Cleanup
Start Time: Wed Nov 11 09:41:11 AEDT 2015
Completed Time: Wed Nov 11 10:01:41 AEDT 2015
===================================================

ade # df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/smosvg-rootvol
                      3.8G  461M  3.2G  13% /
/dev/mapper/smosvg-varvol
                      3.8G  784M  2.9G  22% /var
/dev/mapper/smosvg-optvol
                      694G  614G   45G  94% /opt
/dev/mapper/smosvg-tmpvol
                      1.9G   36M  1.8G   2% /tmp
/dev/mapper/smosvg-usrvol
                      6.6G  1.3G  5.1G  20% /usr
/dev/mapper/smosvg-recvol
                       93M  5.6M   83M   7% /recovery
/dev/mapper/smosvg-home
                       93M  5.6M   83M   7% /home
/dev/mapper/smosvg-storeddatavol
                      9.5G  151M  8.9G   2% /storeddata
/dev/mapper/smosvg-altrootvol
                       93M  5.6M   83M   7% /altroot
/dev/mapper/smosvg-localdiskvol
                      130G  188M  123G   1% /localdisk
/dev/sda2              97M  5.6M   87M   7% /storedconfig
/dev/sda1             485M   25M  435M   6% /boot
tmpfs                 7.8G  2.5G  5.4G  32% /dev/shm

Since the disk cleanup did not help, I reached out to TAC to see if they could assist. They logged on to the database and removed some old data (mainly alarms/alerts), but the recovered space was never released and disk utilization stayed the same as before. I believe this issue is tracked by the bug ID below.

CSCuv81529 – PI 2.2: Need a method to reclaim free space after data retention
Symptom:
PI 2.2 - Need a method to reclaim free space after data retention
As of now, when records get deleted from tables, the database engine does not automatically give those newly freed bytes of hard disk real estate back to the operating system.
That space will still be reserved and used later for writing into the database, so we need an enhancement in order to reclaim that unused space.

Conditions: NA
Workaround: NA
Last Modified: Nov 11, 2015
Status: Open
Severity: 6 Enhancement
Product: Network Level Service
Support Cases: 5
Known Affected Releases: 2.2(0.0.58)
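
This is standard Oracle behavior: deleting rows only frees blocks below the segment's high-water mark, and the datafiles keep their size on disk until they are explicitly shrunk. Purely to illustrate the mechanics, a DBA could reclaim space with statements like the ones below; the table name and datafile path are made up, and running SQL directly against the PI database is unsupported, so only TAC should attempt anything like this.

ade # sqlplus / as sysdba
SQL> -- hypothetical table: compact it so free blocks move to the end of the segment
SQL> ALTER TABLE alarm_history ENABLE ROW MOVEMENT;
SQL> ALTER TABLE alarm_history SHRINK SPACE;
SQL> -- only an explicit resize gives space back to the filesystem (hypothetical path)
SQL> ALTER DATABASE DATAFILE '/opt/oracle/pi_data01.dbf' RESIZE 400G;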

So at this point there was no option other than building CPI 3.0 fresh.

Because of this space-recovery issue in CPI 3.0, you have to make sure you modify the default data retention policies appropriately. Here are the values I modified in the new CPI 3.0 installation (Administration > Settings > System Settings > Data Retention). Note that some of these values were suggested by TAC.

[Screenshot: CPI-DB-01]

Under the Alarms and Events settings (Administration > Settings > System Settings > Alarms and Events > Alarms and Events) you have to modify the cleanup options. By default some of these options are not enabled, and if you leave them as they are, alarms and events will consume a considerable amount of disk space. When you migrate such a CPI system to 3.0, the database size is allocated based on the space the alarms and events have already consumed; later on, even if you delete those records, CPI 3.0 will not release that space back for anything else.

[Screenshot: CPI-DB-02]

You may have to modify some of the default data retention values under the "Clients and Users" settings as well.

[Screenshot: CPI-DB-03]

It is also a good idea to change some of the event notification thresholds. In particular, you do not want to hear the bad news only when the disk is already 90% utilized; I have reduced mine to 60%.
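
If you want an independent safety net outside CPI's own alerting, a small cron script on the server (or any host that can reach it) can watch /opt and warn early. Below is a minimal sketch, assuming root shell access and a working mail command; the 60% threshold and the recipient address are examples only.

#!/bin/sh
# Watch the /opt volume (which holds the CPI database) and send a mail
# once utilization passes the threshold. Threshold and recipient are
# examples; adjust for your environment.
THRESHOLD=60
USED=$(df -P /opt | awk 'NR==2 {gsub(/%/,"",$5); print $5}')
if [ "$USED" -ge "$THRESHOLD" ]; then
  echo "CPI /opt is at ${USED}% (threshold ${THRESHOLD}%)" | mail -s "CPI disk warning" noc@example.com
fi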

[Screenshot: CPI-DB-04]

After all those policy modifications in the fresh CPI 3.0 installation, I added all the network devices manually. With two weeks of data the database size is 100G, which is 11% of the allocated disk. I hope that with these modified settings the PI database will remain a manageable size.

ade # df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/smosvg-rootvol
                      3.8G  323M  3.3G   9% /
/dev/mapper/smosvg-varvol
                      3.8G  143M  3.5G   4% /var
/dev/mapper/smosvg-optvol
                      941G   98G  795G  11% /opt
/dev/mapper/smosvg-tmpvol
                      1.9G   36M  1.8G   2% /tmp
/dev/mapper/smosvg-usrvol
                      6.6G  1.3G  5.1G  20% /usr
/dev/mapper/smosvg-recvol
                       93M  5.6M   83M   7% /recovery
/dev/mapper/smosvg-home
                       93M  5.6M   83M   7% /home
/dev/mapper/smosvg-storeddatavol
                      9.5G  151M  8.9G   2% /storeddata
/dev/mapper/smosvg-altrootvol
                       93M  5.6M   83M   7% /altroot
/dev/mapper/smosvg-localdiskvol
                      174G  9.7G  155G   6% /localdisk
/dev/sda2              97M  5.6M   87M   7% /storedconfig
/dev/sda1             485M   18M  442M   4% /boot
tmpfs                  12G  3.9G  8.0G  33% /dev/shm

So here is my advice if you are moving to CPI 3.0 from an older version:

  1. Always go with a fresh installation and map import.
  2. Modify the data retention policies and the Alarms/Events settings; do not leave them at the defaults.
  3. If historical data is essential, make sure you delete unnecessary records prior to the inline migration, and be aware of the PI database size.
  4. Monitor the growth of the CPI 3.0 database over time and take action before running out of space (the watch script shown earlier is one option).
  5. You can copy the licenses from 2.x to 3.0 (/opt/CSCOlumos/license); see the sketch below.
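
For the license copy in step 5, here is a minimal sketch from the root (ade) shell of the new server. The old server's hostname is an example, and it assumes SCP access between the two boxes (otherwise stage the files through an intermediate host).

ade # scp -r root@old-cpi22.example.com:/opt/CSCOlumos/license /opt/CSCOlumos/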

I am not sure how many of you have experienced this issue (did an inline migration and later had to do a fresh build). If you are managing a large-scale environment, be aware of it.

*** Warnings – 2015-12-22 ***

Refer to this post if you are planning to apply any device pack or patch on your PI 3.0. I have not applied them in my setup, but those who have done so had to rebuild their PI, as the server would not start up after the installation.

References

1. Cisco Prime Infrastructure 3.0 Release Notes
2. Cisco Prime Infrastructure 3.0 Quick Start Guide
3. Cisco Prime Infrastructure 3.0 Administrator Guide
4. Cisco Prime Infrastructure 3.0 Documentation Overview

Related Posts

1. How to go there – PI 2.2
2. Cisco Prime – Device Mgt using SNMPv3
3. Upgrade Prime using CLI
4. WLC Config Backup using Prime