Training Example 2
(Video Quality Problems Across Site)
Back in October, the client had been complaining about poor video quality across the site, particularly on Archiver 2. The client also complained about not meeting retention goals.
Poor video quality (jumpiness or pixelation) implies dropped frames. It can be caused by the following:
- Network congestion, visible as windows of higher latency and dropped-packet events
- Storage performance, visible in volume write latency and volume write queue depth
- Congestion on the storage network, visible in volume write and read latency as well as NIC dropped uploads/downloads on the storage network port
- CPU load preventing the system from keeping up
Not making retention goals can be caused by the following:
- Not enough storage for application
- Cameras generating too much data
- Motion settings not properly configured
- Low-light settings
- Frame rate set too high
- Inefficient codec
- Data left around in recording partitions
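The first two causes reduce to simple arithmetic: retention is the available storage divided by the rate at which the cameras fill it. The following is a minimal sketch of that estimate, not a feature of the monitoring tool. The function name, the 10 TB volume size, and the bitrates are illustrative assumptions, and the calculation assumes continuous recording at a constant aggregate bitrate with decimal units and no filesystem overhead.

```python
SECONDS_PER_DAY = 86_400


def retention_days(storage_tb: float, total_bitrate_mbps: float) -> float:
    """Estimate days of retention for continuous recording.

    storage_tb         -- usable recording storage in terabytes (assumed value)
    total_bitrate_mbps -- combined bitrate of all cameras in megabits per second
    """
    if total_bitrate_mbps <= 0:
        raise ValueError("total_bitrate_mbps must be positive")
    # Megabits per second -> bytes per second -> bytes per day.
    bytes_per_day = total_bitrate_mbps * 1_000_000 / 8 * SECONDS_PER_DAY
    storage_bytes = storage_tb * 1e12
    return storage_bytes / bytes_per_day


if __name__ == "__main__":
    # Hypothetical 10 TB volume at two different aggregate bitrates.
    for mbps in (40, 20):
        print(f"{mbps} Mbps -> {retention_days(10, mbps):.1f} days")
```

Under these assumptions, halving the aggregate bitrate doubles the retention period, which is why camera settings are worth checking before adding storage.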
Triage and Analysis
There is some overlap in these symptoms. If the cameras are generating too much data, that data can saturate storage, causing missed retention goals, and it can also congest the network, causing dropped-frame events.
Go to Archiver 2, set the plot window to 10/25/17 through 11/1/17, and plot NIC: Download to see that the system is consistently absorbing 40 Mbps of data. CPU: Load and Volume: Free space both look stable.
Add another plot for NIC: Dropped downloads. Here we can see that the number of dropped packets is relatively high and climbing.
Add another plot for NIC: Upload, which shows a very large amount of data being sent to the network-attached iSCSI storage.
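"Climbing" can be checked numerically rather than by eye if the plotted samples are exported. The sketch below fits a least-squares line to evenly spaced samples and reports its slope. The function names and the sample values are hypothetical, and it assumes the samples are dropped-packet counts taken at a fixed interval.

```python
def slope_per_sample(values):
    """Least-squares slope of evenly spaced samples (units per sample)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def is_climbing(values, min_slope=0.0):
    """True if the fitted trend rises faster than min_slope (assumed threshold)."""
    return slope_per_sample(values) > min_slope


if __name__ == "__main__":
    # Hypothetical dropped-packet counts per sampling interval.
    dropped = [120, 135, 150, 170, 190, 215]
    print(slope_per_sample(dropped), is_climbing(dropped))
```

A positive slope over the whole plot window suggests a worsening condition rather than a one-off burst.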
Add another chart, choose “Cameras” and Select All. Then choose Camera: 10 second avg bitrate. This plot shows a set of cameras that constantly send a huge amount of data. That looks suspicious.
Duplicate this camera plot and then choose Camera: Tcp port sum. This shows a flat zero, which means the cameras are not using TCP sockets to send data. Change to Camera: Udp port sum instead. Now the plot jumps around a lot, implying changes to UDP sockets, which is correlated with pixelation.
A set of cameras appears to be generating far too much data. In particular, a camera that sits steadily at a very high rate shows the symptom of being unintentionally configured to send data constantly.
The customer confirmed that the camera settings were inadvertently far higher than intended. Reconfiguring the cameras reduced traffic and stabilized the infrastructure.
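The "steadily at a very high rate" pattern can be expressed as a high average bitrate combined with very little variation. The sketch below flags cameras that match it, assuming the bitrate samples have been exported per camera. The camera names, the 8000 kbps floor, and the 0.1 variation limit are hypothetical values to tune per site, not defaults of the monitoring tool.

```python
from statistics import mean, pstdev


def flag_constant_high(samples_by_camera, min_mean_kbps=8000.0, max_cv=0.1):
    """Return names of cameras whose bitrate is both high and nearly flat.

    samples_by_camera -- dict mapping camera name to a list of bitrate samples (kbps)
    min_mean_kbps     -- assumed floor for "very high" average bitrate
    max_cv            -- assumed ceiling on coefficient of variation (stdev / mean)
    """
    flagged = []
    for name, samples in samples_by_camera.items():
        if not samples:
            continue
        avg = mean(samples)
        if avg <= 0 or avg < min_mean_kbps:
            continue
        # A motion-triggered camera varies a lot; a constant sender does not.
        if pstdev(samples) / avg <= max_cv:
            flagged.append(name)
    return sorted(flagged)


if __name__ == "__main__":
    # Hypothetical exported samples in kbps.
    data = {
        "cam-a": [9000, 9100, 8950, 9050],    # high and flat
        "cam-b": [500, 6000, 300, 7000],      # bursty, modest average
        "cam-c": [2000, 16000, 1000, 15000],  # high average but bursty
    }
    print(flag_constant_high(data))
```

Cameras that are high on average but bursty are left out on purpose, since bursts are consistent with motion-triggered recording working as intended.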