What Did We Learn During This Last Pandemic?

The last couple of years have been very chaotic for many people, and many companies are suffering from uncertainty, since it is even believed the situation could become somewhat more complicated.

This change in our routine (work and personal) arrived in a very turbulent way, because few entities, if any, were prepared for it, and although this is how great transformations begin, the change has been, and will continue to be, very complicated.

 

Some countries are beginning to return to normal, while others are starting to see a new wave of infections; even so, the end of this situation looks a little closer.

That is why adaptability has been a requirement for navigating this pandemic.

What have we learned from this pandemic?

Innovation must not be postponed.

In Mexico, estimates indicate that companies had to go through three years' worth of change in just one month. Companies often put off their investments in technology, which in the long run become a necessity in order to move forward.

That is why tools like Opmantek's have come onto the scene to empower and facilitate the changes companies need.

Healthy finances are necessary.

A reliable way to keep your chances of getting through the crisis is to maintain healthy finances, that is, to avoid unnecessary expenses in your business.

Take IT, for example, where an error caused by any event on your network could trigger a series of expenses.

Knowing all aspects of your company well is important, and returning to the IT example, you can rely on Opmantek's modules to save money and avoid downtime and bottlenecks in your network.

Data is the best resource

We learned to collect data about the company in general: its behaviour, its customers, and its sales and order history.

With these resources we can make sure we always make the best decisions, and we considerably reduce the degree of uncertainty.

Learn how you can audit everything on your network with Open-AudIT here.

These lessons have helped us get through difficult times, and putting them into practice has undoubtedly become a day-to-day task for every season. Make sure your business keeps moving forward: book a demonstration with our experts.


How to Manage Complex Event Responses

Managing complex event responses can seem like an overwhelming task, but with the right automated network management software, the process is simpler than ever. Let’s take a look at how an automated system can help you manage complex event responses.

What is a Complex Adaptive System (CAS)?

Complex Adaptive Systems (CAS) are made up of components (or agents) in a dynamic network of interactions that are designed to adapt and learn according to changing events. These interactions may be affected by other changes in the system and are non-linear and able to feed back on themselves. In the Australian healthcare system, for example, complex adaptive systems have been used to analyse systematic changes.

The overall behaviour of a CAS is not predicted by the behaviours of the agents individually. The past of CAS systems is partly responsible for their present behaviour and they are designed to evolve over time.

Event automation and remediation using opEvents

opEvents is an advanced fault management and operational automation system designed to make event management easier than ever. With opEvents, you can improve your business’s operational efficiency and decrease the workload of your staff by expanding on NMIS’s efforts and improving automated response techniques using scientific methods.

opEvents elevates NMIS’s Notification, Escalation and Thresholding systems by blacklisting and whitelisting events, handling event flap, event storms and event correlation and supporting custom email templates for each of your contacts.

Basic event automation

In order to carry out event automation successfully, there are a few simple steps that you need to take:

1. Network management – identify the top network events you respond to frequently (daily, weekly, etc.)
2. List the steps you take – troubleshooting and remediating – when the issue occurs
3. Identify how these steps can be automated
4. Create an action to respond to the event

Let’s take a look at how opEvents handles events natively:

Event action policy

Event Action policy is a flexible mechanism that dictates how opEvents reacts when an event is created. The policy outlines the order of actions as well as what actions are executed by using nested if/then statements.
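The exact policy format is part of opEvents’ configuration; purely as a conceptual illustration (the rule structure and action names below are hypothetical, not opEvents syntax), an ordered, nested if/then policy can be sketched like this:

# Hypothetical sketch of an ordered, nested if/then event-action policy.
# This is NOT opEvents' configuration format; it only illustrates the idea.
def evaluate_policy(event, policy):
    actions = []
    for rule in policy:                          # rules are evaluated in order
        if rule["if"](event):                    # the condition is a predicate on the event
            actions.extend(rule.get("then", []))                              # actions to run
            actions.extend(evaluate_policy(event, rule.get("policy", [])))    # nested rules
            if rule.get("break", False):         # stop processing further rules
                break
    return actions

policy = [
    {"if": lambda e: e["event"] == "Node Down",
     "then": ["email_oncall", "run_diagnostics"],
     "break": True},
    {"if": lambda e: True, "then": ["log_only"]},
]
print(evaluate_policy({"event": "Node Down", "node": "meatball"}, policy))
# -> ['email_oncall', 'run_diagnostics']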

Event correlation

Setting event correlation helps reduce event storms inside opEvents. opEvents will use rules that are outlined to group events together and create a synthetic event that contains event information from all events that have been correlated.
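The correlation rules themselves are defined in opEvents’ configuration; as a rough sketch only (the grouping key and the five-minute window below are assumptions made for illustration), time-window correlation into a synthetic event works roughly like this:

# Rough sketch of time-window event correlation: events with the same name
# whose timestamps fall within the window are folded into one synthetic event.
from collections import defaultdict

def correlate(events, window_secs=300):
    groups = defaultdict(list)                       # event name -> list of windows
    for ev in sorted(events, key=lambda e: e["time"]):
        windows = groups[ev["event"]]
        if windows and ev["time"] - windows[-1][-1]["time"] <= window_secs:
            windows[-1].append(ev)                   # still inside the current window
        else:
            windows.append([ev])                     # start a new window
    synthetic = []
    for name, windows in groups.items():
        for members in windows:
            synthetic.append({
                "event": f"Correlated: {name}",
                "count": len(members),
                "nodes": sorted({e["node"] for e in members}),
                "first": members[0]["time"],
                "last": members[-1]["time"],
            })
    return synthetic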

Event escalation

opEvents allows for custom event escalations for unacknowledged events. You can set custom rules based on your business or customers.
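A minimal sketch of the idea, assuming an escalation ladder keyed on how long an event has gone unacknowledged (the tiers and contact actions below are made up for illustration and are not opEvents defaults):

# Hypothetical escalation ladder for unacknowledged events.
ESCALATION_TIERS = [
    (10 * 60, "email_team"),         # after 10 minutes, email the team
    (30 * 60, "sms_oncall"),         # after 30 minutes, text the on-call engineer
    (60 * 60, "page_manager"),       # after 60 minutes, page the manager
]

def escalation_actions(event, now):
    # Return every escalation action that is due for an unacknowledged event.
    if event.get("acknowledged"):
        return []
    age = now - event["time"]
    return [action for delay, action in ESCALATION_TIERS if age >= delay]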

Event scripts

Events can call scripts that can be used to carry out actions such as troubleshooting, integration or remediation.
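How a script is attached to an event is configuration-specific; the script below is only a generic example of the kind of troubleshooting action an event might call, assuming the affected node name is passed as the first argument:

#!/usr/bin/env python3
# Generic example of a troubleshooting script an event could call:
# ping the affected node and save the output for later review.
import subprocess
import sys
from datetime import datetime

def main():
    node = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    result = subprocess.run(["ping", "-c", "5", node],
                            capture_output=True, text=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    with open(f"/tmp/ping-{node}-{stamp}.log", "w") as fh:
        fh.write(result.stdout + result.stderr)
    sys.exit(result.returncode)

if __name__ == "__main__":
    main()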

Event deduplication

All events that are related to stateful entities are automatically checked against the recent history of events and the known previous state of this entity.
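A simplified sketch of the idea (opEvents tracks this state internally; the in-memory dictionary below is only illustrative):

# Simplified deduplication check: a stateful event is only raised when the
# entity's state actually differs from the last state we recorded for it.
last_state = {}   # (node, entity) -> last known state, e.g. ("meatball", "eth0") -> "down"

def is_duplicate(event):
    key = (event["node"], event["entity"])
    if last_state.get(key) == event["state"]:
        return True                     # same state as before: suppress the duplicate
    last_state[key] = event["state"]    # record the new state
    return False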

Developing a CAS system

In order to develop a CAS system, it’s essential to complete the following steps:

1. Identify an individual event
2. List the steps you take – troubleshooting and remediating – when the issue occurs
3. Decide what automated action(s) can and should be carried out (data collection, remediation)
4. Identify who needs to be contacted, when (working hours, after hours, weekends) and how (Email, text, service desk)
5. Decide what should happen over time if the event is not acknowledged (remains active)

If you would like to learn more about Opmantek’s event management services, don’t hesitate to get in touch with our team or request a demo.


Minimising risk of cyber-attacks to telcos & network orchestration platforms

Software-defined networking (SDN) and network functions virtualisation (NFV) may provide opportunities to overcome key networking challenges – but security remains a key concern.

“Telecommunications regulators and national security agencies worldwide are very concerned – even alarmed – about the potential risks of cyber-attacks from state-based actors against centralised telecommunications end-to-end service and network orchestration technology platforms or solutions from single vendors,” says Roger Carvosso, Chief Product Officer. “They are also concerned about the dominance or concentration of market power to one or a few NFV orchestration vendors or standards in the telecommunications, carriage service provider and digital service provider space.”

“The importance of security planning, design and controls needed for any orchestrator that has privileged access to network elements for the purpose of service and network orchestration can’t be overstated.”

Security also remains a divisive issue within many telecommunications providers, with an intra-organisational divide between cybersecurity leaders such as vice-presidents of security products, professionals and operations team members and SDN/NFV networking engineers. These engineers are typically not as aware or knowledgeable about security or cybersecurity-as-a-service, or about the importance of security as a philosophy or discipline to be baked into SDN/NFV technologies, tools and processes.

At FirstWave, we are working to ensure our CyberCision platform security architecture, including APIs, has the highest level of security accreditation and validation – where the required components have privileged access to telecommunications network elements for service and network orchestration. Our platform effectively supports the diversification of orchestration vendors and a more competitive, secure sector.  Our products and culture can also help telecommunications providers bridge the internal security gap and capture the value possible through SDN and NFV.

For more information, contact us at: sales@firstwavecloud.com.


Why Do We Need a Dynamic Baseline and Thresholding Tool?

With the introduction of opCharts v4.2.5, richer and more meaningful data can be used in decision making. Forewarned is forearmed, as the proverb goes; a quick Google search tells me “prior knowledge of possible dangers or problems gives one a tactical advantage”. The reason we want to baseline and threshold our data is so that we can receive alerts forewarning us of issues in our environment, so that we can act to resolve smaller issues before they become bigger. Being proactive increases our Mean Time Between Failures. If you are interested in accessing the Dynamic Baseline and Thresholding Tool, please Contact Us.

Types of Metrics

When analysing time series data, you quickly start to identify a common trend in what you are seeing. Some metrics you are monitoring will be “stable”, that is, they have very repetitive patterns and change in a similar way over time, while other metrics are more chaotic, with a discernible pattern difficult to identify. Take, for example, two metrics: response time and route number (the number of routes in the routing table). You can see from the charts below that response time is more chaotic, with some pattern but really little stability in the metric, while the route number metric is solid and unwavering.

Comparing Metrics with Themselves

This router, meatball, is a small office router with little variation in its routing; a WAN distribution router would also be generally stable, but with a little more variability. How could I get an alarm from either of these without configuring some complex static thresholds?

The answer is to baseline the metric as it is and compare your current value against that baseline. This method is very useful for values which are very different on different devices but where you still want to know when the metric changes; examples are route number, the number of users logged in, the number of processes running on Linux, and response time in general, but especially the response time of a service.

The opCharts Dynamic Baseline and Threshold Tool

Overall, this is what opTrend does: the sophisticated statistical model it builds is very powerful and helps spot these trends. We have extended opTrend with some additional functionality in the baseline tool so that you can quickly get alerts from the metrics which are important to you.

What is really key here is that the baseline tool will detect downward changes as well as upward changes, so if your traffic was reducing outside the baseline you would be alerted.

Establishing a Dynamic Baseline

Current Value

Firstly, I want to calculate my current value. I could use the last value collected, but depending on the stability of the metric this might cause false positives. As NMIS has always supported, using a larger threshold period when calculating the current value can produce more relevant results.

For very stable metrics, using a small threshold period is no problem, but for wilder values a longer period is advised. For response time alerting, using a threshold period of 15 minutes or greater would be a good idea; that means there is some sustained issue and not just a one-off internet blip. However, with our route number we might be very happy to use the last value and get warned sooner.

Multi-Day Baseline

Currently, two types of baselines are supported by the baseline tool. The first is what I would call opTrend Lite, which is based on the work of Igor Trubin’s SEDS and SEDS Lite. This method calculates the average value for a small window of time looking back over the configured number of weeks, so if my baseline window was 1 hour for the last 4 weeks and the time now is 16:40 on 1 June 2020, it would look back and gather the following:

  • Week 1: 15:40 to 16:40 on 25 May 2020
  • Week 2: 15:40 to 16:40 on 18 May 2020
  • Week 3: 15:40 to 16:40 on 11 May 2020
  • Week 4: 15:40 to 16:40 on 4 May 2020

With the average of each of these windows of time calculated, I can now build my baseline and compare my current value against that baseline’s value.
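As a rough sketch of that look-back calculation (the fetch_samples helper below is a placeholder for however you retrieve raw datapoints, for example via the NMIS API, and is not an Opmantek function):

# Rough sketch of the multi-day (SEDS-style) baseline: average the same
# window of time at the same time of day over the last N weeks.
from datetime import datetime, timedelta
from statistics import mean

WEEK = timedelta(weeks=1)

def multi_day_baseline(fetch_samples, metric, now, window=timedelta(hours=1), weeks=4):
    window_averages = []
    for w in range(1, weeks + 1):
        end = now - w * WEEK                 # same time of day, w weeks ago
        start = end - window                 # e.g. 15:40 to 16:40
        samples = fetch_samples(metric, start, end)
        if samples:
            window_averages.append(mean(samples))
    return mean(window_averages) if window_averages else None

# Example: the baseline for 16:40 on 1 June 2020 uses 15:40-16:40 on
# 25 May, 18 May, 11 May and 4 May 2020.
baseline = multi_day_baseline(
    lambda metric, start, end: [42.0],       # stub fetcher for illustration
    "route_number",
    datetime(2020, 6, 1, 16, 40),
)
print(baseline)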

Same-Day Baseline

Depending on the stability of the metric, it might be preferable to use data from the same day. For example, if you had a rising and falling value, it might be preferable to use just the last 4 to 8 hours of the day for your baseline. Take this interface traffic as an example: the input rate varies, while the output rate is stable, then has a sudden plateau, and is then stable again.

[Chart: asgard – bits per second]

If this was a weekly pattern the multi-day baseline would be a better option, but if it happens more randomly, using the same-day baseline would generate an initial event on the increase, then the event would clear as the ~8Mbps became normal, and then when the value dropped again another alert would be generated.

Delta Baseline

The delta baseline is only concerned with the amount of change from the baseline. For example, from a sample of data from the last 4 hours we would see that the average of a metric is 100; we then take the current value, for example the spike of 145 below, and calculate the change as a percentage, which would be a change of 45%, resulting in a Critical event level.

[Chart: amor – number of processes]

The delta baseline configuration then allows the event level to be defined based on the percentage of change; with the defaults, the spike above would result in a Major event (as explained below). The table below is how to visualize the default configuration.

  • 10 – Warning
  • 20 – Minor
  • 30 – Major
  • 40 – Critical
  • 50 – Fatal

If the change is below 10% the level will be Normal, between 10% and 20% Warning, between 20% and 30% Minor, and so on, until above 50% it is considered Fatal.

In practice this spike was brief, and using the 15-minute threshold period (the current value is the average of the last 15 minutes), the value used for calculating change would be 136, and the resulting change would be 36%, so a Major event. The threshold period dampens the spikes, removing brief changes and allowing you to see changes which last longer.
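As an illustrative sketch of that calculation (not the tool’s internal code), combining the threshold-period average with the default change bands listed above:

# Sketch of the delta baseline: compare the threshold-period average with
# the baseline average and map the percentage change to an event level
# using the default bands shown above.
from statistics import mean

LEVELS = [(50, "Fatal"), (40, "Critical"), (30, "Major"),
          (20, "Minor"), (10, "Warning")]

def delta_level(baseline_samples, current_samples):
    baseline = mean(baseline_samples)        # e.g. average of the last 4 hours -> 100
    current = mean(current_samples)          # e.g. average of the last 15 minutes -> 136
    change = abs(current - baseline) / baseline * 100
    for threshold, level in LEVELS:
        if change >= threshold:
            return change, level
    return change, "Normal"

print(delta_level([100] * 16, [136]))        # (36.0, 'Major')
print(delta_level([100] * 16, [145]))        # (45.0, 'Critical')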

Installing the Baseline Tool

Copy the file to the server and run the following; upgrading uses the same process.

tar xvf Baseline-X.Y.tgz
cd Baseline/
sudo ./install_baseline.sh

Working with the Dynamic Baseline and Thresholding Tool

The Dynamic Baseline and Threshold Tool includes various configuration options so that you can tune the algorithm to learn differently depending on the metric being used. The tool comes with several metrics already configured. It is a requirement of the system that the stats modeling is completed for the metric you require to be baselined; this is how the NMIS API extracts statistical information from the performance database.

Conclusion

For more information about the installation and configuration steps required to implement opCharts’ Dynamic Baseline and Thresholding tool, it is all detailed in our documentation – here.


Why IP Address Management Is Important

Whether you’re a small organization or an enterprise, efficient management of IP addresses can be the difference between a functional network and an inaccessible service.

Increasing complexities, growing device numbers, Cloud Computing, IoT and BYOD continue to heighten the importance of managing your IP address space.

Relying on manual record keeping for network connectivity and core business functions can prove risky, even for the most organized of spreadsheets.

What’s Needed for an Efficient IP Address Management Strategy

Accuracy

Accurate IP delegation and record keeping, ensuring no conflicts or associated service outages.
opAddress can allocate and track IP addresses dynamically. Search, view and manage address information, ensuring a critical information baseline is established.

Simplicity

An easy-to-use system that minimizes data entry, making the process for IT teams faster, more efficient, and less tedious.
With powerful out-of-the-box capabilities, opAddress requires little or no configuration. Automatically discover the network addressing of production networks and quickly edit or reallocate addresses as needed.

Security

Accurate, up-to-date data to help identify new devices and ensure only those authorized are on your network.
New data is captured and recorded by opAddress every thirty minutes.
Gain full visibility over IP addresses by device and analyze historical information.

Scalability

Future proofing and capacity planning to accommodate increasing device numbers and network complexities.
opAddress is extensible to grow with your business. Handle complex environments such as multiple tenancies, subdomains and overlapping address spaces with ease.


5 Mistakes Evaluating NMS You Need To Avoid

So, your boss has just set up a blend of different software products or a SaaS product to take care of the network monitoring. Did your boss really do you a favour or just add to your headache? Has the situation truly improved, or do you just have more unresolved problems?

These are the five most common complaints we hear and solve on day one out of the box.

1. Too Many Alerts.

This is probably the most common problem with monitoring tools. Everything is turned on, either out of the box or by the administrator’s choosing, and organizations must rely on the logs to get the information they need. There is a fear of missing something, but setting up alerts should be a thoughtful process, standardized amongst your team, and carefully chosen. Careful and well-considered integrations with other tools like email, SMS, and ticketing systems are essential – but you can’t be inserting and sending out junk or it will be ignored.

2. The monitoring tool is a resource hog with a slow database.

Many popular monitoring tools are built on Microsoft technology using multiple on-premises servers. To scale, you usually need to build a replica of your multi-server setup and pay additional software licensing costs (Microsoft Server, SQL and the monitoring tool) every time you add a server. Then there’s the ongoing operational management of the multiple servers. With so much data constantly processed, the user experience is slow and poor.

3. One size does not fit all / no access to the API.

Many popular tools now are built in the cloud, and you do not own your data. Your data may be rolled up, removed, or you only have access to specific periods of your data. It is no good for longer-term trending or baseline troubleshooting. You need complete API access to your data to integrate it into your business operations.

4. Security.

Supply chain attacks are becoming more frequent. We all know what happened this year, with many telecommunications companies, Managed Service Providers, Internet Service Providers, and the US Federal Government forced to turn off their monitoring tools. While patches were developed to work around the issue, the depth of what the hackers got is still not well understood. I feel for MSPs as their SLAs are destroyed. Hopefully, those force majeure clauses get interpreted favourably.

With an on-premise platform, you control it 100%. Complete control ensures that the product works within your security parameters.

5. Automation.

If you have installed many different tools, setting up some automation between them is extremely difficult. Furthermore, the automation breaks when you need to update or reconfigure one or more underlying applications for other reasons (e.g. Security). A SaaS solution may have various actions that they class as automation; however, they lack the flexibility you need for your environment.

Here at Opmantek, we have a strong belief that monitoring tools should be customizable. We believe this helps the overall flexibility, extensibility, scalability and security posture of your organization, ensuring that in the end, you get what you’re really after and that is less downtime!

Solve these five problems and more – ask us how
