Viakoo Release Notes 2.6
(March 30th, 2016)
Viakoo Release 2.6 is an update for the Viakoo Service. Release 2.6 expands some of the tools to throttle-down offline alerts for servers introduced in Release 2.5.1, and expands this capability to switches and cameras. The Priority controls for the server, switch or camera are used to adjust this behavior. Additionally, we introduce “Maintenance Mode” in this release which allows administrators to suspend alerting for sites that are under construction or revision.
Offline Alert Throttling
For some of our users, infrastructure on tenuous networks can experience repeated offline events for servers, switches and cameras that are only offline for short periods of time. In many situations these problems resolve themselves faster than users can respond, either because the device reboots itself quickly or because network latency and responsiveness issues make certain devices appear offline while they are still active.
In these situations, customers might experience significant numbers of ticket “open” notifications followed by “closed” notifications a short time later. For infrastructure that is not crucial, or where the response SLA is an hour or more, these kinds of tickets can be frustrating: teams begin the process of responding only to have the problem go away before they can start their investigation.
Also, because Viakoo can detect video stream failures on the order of minutes, for some customers with noisy networks the sheer number of camera or stream offline tickets can create confusion. Beyond the interruption of an email, users examining historical ticket events may find large numbers of closed tickets reflecting all these offline events and assume they represent a more serious issue.
In light of the above we’ve introduced the concept of ticket throttling. This is the ability to delay the escalation of problems to an alert level, which can reduce the number of non-critical alert events. The alert level is comprised of ‘CRITICAL’ and ‘FAILURE’ tickets, as opposed to ‘MINOR’ and ‘ELEVATED’ tickets which do not send alerts. This ticket alert throttling is controlled by the “Priority” attribute of each device.
Starting in Release 2.6, a NORMAL priority server or switch will initially open only an “ELEVATED” ticket when a failure is detected. If the problem persists longer than 40 minutes, implying human intervention is needed, the ticket will convert to a CRITICAL level ticket, which will then send out Alert notifications depending upon the settings of the users associated with the infrastructure.
Similarly, Cameras or Video Streams of NORMAL priority that fail (“go offline”) will delay opening a ticket for 40 minutes. VPU and status icons will indicate the current status of any video stream or camera immediately. However, a new ticket won’t be created unless the problem persists for longer than 40 minutes.
These changes can dramatically reduce the number of tickets a user might be prompted to deal with as well as help users focus on where they need to get involved.
CRITICAL Priority Removes Throttling
When users have servers, switches or cameras that are too important to delay dispatching alert messages, they can set their Priorities to CRITICAL. This tells the system to immediately open “Alert-level” tickets (CRITICAL or FAILED), which then causes alert emails or push notifications to be sent to those users who are configured to receive them.
In prior releases, we gave you the ability to set cameras or video streams to CRITICAL. In release 2.6, we extend that capability to Servers and Switches as well. To force the system to alert immediately, use the Details tab to set the Priority field for Servers or Switches or Cameras at the site or server level.
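The throttling rules described above can be summarized in a short sketch. This is only an illustration of the documented behavior, not Viakoo's implementation: the function name, the device-type strings and the "ALERT" stand-in (meaning an alert-level ticket, CRITICAL or FAILURE) are hypothetical, and treating a delayed camera ticket as alert-level is an assumption, since the notes only say that a ticket is opened.

```python
from datetime import timedelta
from typing import Optional

# The 40-minute window comes from the release notes; everything else
# in this sketch (names, return values) is hypothetical.
THROTTLE_WINDOW = timedelta(minutes=40)


def ticket_state(device_type: str, priority: str, offline_for: timedelta) -> Optional[str]:
    """Return "ALERT", "ELEVATED", or None (no ticket yet) for an offline device."""
    if priority == "CRITICAL":
        # CRITICAL priority removes throttling: alert-level ticket right away.
        return "ALERT"
    if device_type in ("server", "switch"):
        # NORMAL servers and switches start at ELEVATED and escalate
        # once the outage lasts longer than the throttle window.
        return "ALERT" if offline_for > THROTTLE_WINDOW else "ELEVATED"
    # NORMAL cameras and video streams: no ticket until the window has passed.
    # Assumption: the ticket opened afterwards is alert-level.
    return "ALERT" if offline_for > THROTTLE_WINDOW else None


if __name__ == "__main__":
    for device, priority, minutes in [
        ("server", "NORMAL", 5),
        ("server", "NORMAL", 45),
        ("camera", "NORMAL", 5),
        ("camera", "CRITICAL", 0),
    ]:
        state = ticket_state(device, priority, timedelta(minutes=minutes))
        print(device, priority, minutes, state)
```

Only the "ALERT" state would trigger emails or push notifications; ELEVATED tickets remain visible in the service without notifying anyone.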
Viewing and Setting a Server’s Priority
A server’s priority is now visible in the summary section on the server’s “Details” tab as well as in the Server table for a site.
From a Server’s Details tab, you can change the priority by clicking on the “Edit” button in the upper-righthand corner.
After you change the Priority and/or Type, click “Save” for the changes to take effect.
At the site level, you can change the priority of one or more servers by entering edit-mode for the associated server table in the Site Details tab. To go into edit-mode, click on the “Start Edit” button.
Edit-mode on a server table allows you to change the Type and/or Priority of each server individually by selecting the pulldown for that server’s field.
Server Priority can also be changed in bulk by selecting multiple servers at a site, setting the priority for all the selected servers, and then clicking the “Save” button.
Switch priority is viewed in switch tables and switch overview panes
Viakoo is a robust service that can detect configuration issues as well as failing infrastructure. Some users have expressed the desire to not receive alerts on problems they are causing during the process of modifying their configuration.
To address this issue users can now declare that their site is in “Maintenance Mode”. This indicates to the service that it should not send out Alerts on any tickets that are created for that site while “Maintenance Mode” is still active.
Putting a Site “In Maintenance”
Administrators can set a Site to “Maintenance Mode” by navigating to the “Details” tab of their Site, and clicking on the "Edit" button in the upper right corner. This will reveal the various editable options for the Site.
From this point you will now see the red “Enter Maintenance Mode” button.
After entering Maintenance Mode the site will stop sending alert emails or push notifications, and an indicator will show up next to the name of your site.
If you navigate to the site’s Ticket Tab while in Maintenance mode you will still be able to see tickets getting opened and closed which you can choose to ignore. We also recommend that upon completion of your maintenance cycle you review the open tickets in Viakoo to verify that all services have been properly restored. This can help to catch significant configuration mistakes before you move the site back out of Maintenance Mode.
You can check that a site is still in “Maintenance Mode” from the Overview and Details tabs by looking at the header. The label “[Undergoing Maintenance]” will appear after the Site’s name with an orange background.
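The effect of Maintenance Mode can be pictured with a minimal sketch: tickets keep being recorded, but alert delivery is skipped while the site is in maintenance. The `Site` class and its fields are hypothetical and exist only for illustration; they are not part of any Viakoo API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Site:
    """Hypothetical model of a site, for illustration only."""
    name: str
    in_maintenance: bool = False
    tickets: List[str] = field(default_factory=list)
    alerts_sent: List[str] = field(default_factory=list)

    def open_ticket(self, summary: str, alert_level: bool) -> None:
        # Tickets are always recorded, even during maintenance,
        # so they remain visible on the Tickets tab.
        self.tickets.append(summary)
        # Alerts go out only for alert-level tickets on sites
        # that are not in Maintenance Mode.
        if alert_level and not self.in_maintenance:
            self.alerts_sent.append(summary)

    def header(self) -> str:
        # The maintenance label follows the site name, as in the notes.
        if self.in_maintenance:
            return f"{self.name} [Undergoing Maintenance]"
        return self.name


if __name__ == "__main__":
    site = Site("Lobby", in_maintenance=True)
    site.open_ticket("Camera 3 offline", alert_level=True)
    print(site.header(), site.tickets, site.alerts_sent)
```

Because the tickets are still recorded, reviewing the open ones before leaving Maintenance Mode remains possible, as recommended above.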
Getting Out of Maintenance Mode
To take a Site out of “Maintenance Mode,” get back into “Edit” mode for the site using the aforementioned instructions. Now you will see a button that allows Administrators to “Exit Maintenance Mode” in the upper right corner of the display. Click to exit.
The Viakoo service is constantly capturing event information, ranging from changes in configuration attributes to Windows events. For a deeper understanding of issues, it is useful to pull up these events to see when certain problems occurred. Therefore, customers of Enterprise Edition (Viakoo Predictive) will now see an extra “Events” tab when they are in the context of a site or a device within a site.
There are potentially thousands of events that can occur in any window of time, so it is useful to filter these events by type and by time. The Events tab interface allows you to select an event type and a date range and then click “Get Events” for that type. Depending on how many events of that type fall in that range of time, you may need to filter the list down (using the filtering field) or step to the next, previous or N-th page of information.
The Events Type selector allows you to choose the type of events to load for the time frame you select.
Choose the event type, then the date range (defaults to “today”), and then click “Get Events” to get the list of events of that type. The following explains the event types:
- Component Events - Events related to a component (e.g., a NIC) being created or deleted
- Config - Events associated with a change in a configuration parameter
- Flapping - Events associated with a Device switching from offline-to-online repeatedly in a short window of time.
- StorStac - Events associated with Intransa-brand storage appliances
- Threshold - Events raised when a performance parameter goes above or below a configured boundary value.
- Ticket Events - Any event associated with when a ticket got created, modified or resolved.
- VSDI - Certain kinds of events that affect the VSDI KPI. Not all of these events create tickets. The KPI value may degrade or improve because, for example, dropped packets on a network are increasing or a RAIDed volume resides in a diskgroup that is rebuilding. Whenever the system changes the value of VSDI, it issues a VSDI event to explain why. Use this filter to find these kinds of events.
- Windows - Events from the Windows Events log. Not all Windows events are brought up to the cloud. However, important ones are and this choice helps you to view them.
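The select-by-type, select-by-date-range and paging workflow described for the Events tab can be sketched as follows. This is a local illustration of that filtering logic only: the `Event` record, `get_events` and `page` are hypothetical names, not a Viakoo API, and the default page size is an arbitrary assumption.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import List


@dataclass
class Event:
    """Hypothetical event record, for illustration only."""
    event_type: str      # e.g. "Config", "Windows", "Threshold"
    timestamp: datetime
    message: str


def get_events(events: List[Event], event_type: str, start: date, end: date) -> List[Event]:
    """Return events of one type dated within [start, end], oldest first."""
    selected = [
        e for e in events
        if e.event_type == event_type and start <= e.timestamp.date() <= end
    ]
    return sorted(selected, key=lambda e: e.timestamp)


def page(items: List[Event], page_number: int, page_size: int = 50) -> List[Event]:
    """Return the N-th page (1-based) of a result list."""
    first = (page_number - 1) * page_size
    return items[first:first + page_size]


if __name__ == "__main__":
    log = [
        Event("Config", datetime(2016, 3, 30, 9, 0), "parameter changed"),
        Event("Windows", datetime(2016, 3, 30, 10, 0), "service restarted"),
        Event("Config", datetime(2016, 3, 29, 8, 0), "older change"),
    ]
    today = date(2016, 3, 30)
    for event in page(get_events(log, "Config", today, today), 1):
        print(event.timestamp, event.message)
```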
- Links to download the latest agents are now in the About Box popup. Find the About Box in the Administration Menu in the upper right corner.
- The Switch Port Selector on the Performance Tab has been improved to make it clearer which port you are choosing.
- Fixed an issue with displaying same-name performance values for multiple servers within a site at once: for example, "Write IOPS for D:" from several servers would display as if those measures were the same trend line. Trend lines for same-named objects from different servers now get their own names.
- Users who do not yet have a registered Phone Device associated with their account will not be able to set/unset Push Notifications from the WebUI.
- The Bitrate column for video streams is now correctly displayed in units of bits per second.
- Most items in tables under the Details tab are now official web links, so right-clicking on these links allows you to open the associated location in another browser window or tab. This is also true of links in the Administration Menu.
- Stream Table Priority column now sorts correctly.
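As an illustration of the same-name performance value item in the list above, the sketch below shows one way measures with identical names can be kept on separate trend lines: by keying each line on the server as well as the measure name. This is an assumption about the general technique, not a description of how Viakoo implemented the fix; the function and label format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (server name, measure name, value); a hypothetical sample shape.
Sample = Tuple[str, str, float]


def group_trend_lines(samples: List[Sample]) -> Dict[str, List[float]]:
    """Group samples into trend lines labeled by server and measure name."""
    lines: Dict[str, List[float]] = defaultdict(list)
    for server, measure, value in samples:
        # Including the server in the label keeps same-named measures
        # from different servers on separate lines.
        lines[f"{server}: {measure}"].append(value)
    return dict(lines)


if __name__ == "__main__":
    samples = [
        ("srv1", "Write IOPS for D:", 10.0),
        ("srv2", "Write IOPS for D:", 20.0),
        ("srv1", "Write IOPS for D:", 12.0),
    ]
    for label, values in group_trend_lines(samples).items():
        print(label, values)
```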
If you have any questions, comments, bug reports, or suggestions, please reach out to us through the live-chat feature or contact us at email@example.com.
We love hearing from you!
1 (855) 585-3400