Skip to main content

Troubleshooting production issues and log analysis

TROUBLESHOOTING production issues and Splunk log analysis:
This skill is not something you can gain over night. It is a skill that you improve by being in tough and under pressure situations. Then over years you end up painting this whole scheme in your mind on how to approach problems. It almost becomes the second nature to you.
I strongly recommend it to all software engineers. Yes, we all like to be working on new code and writing new applications, but the 10 lines of code that you write will be so much better because you were in the shoes of troubleshooting production issues and you will consider things that other developers will not.
Don’t think of troubleshooting production issues as a downgrade in your career. I’d like to think of it as a necessary step in completing your software engineering skills and graduating. At the end of the day, if we write code, we should be maintaining that same code to appreciate the value of it. If that code is not stable, it is in our interest to make it better if we get called often to troubleshoot it in production.
Thank you for reading this article.
Almir Mustafic

Comments

Popular posts from this blog

Brand New programming language and one solution OR …

Brand New programming language and one solution OR Two existing programming languages, one solution for EACH? I understand that there is no right or wrong. It all depends on your software architecture, team structure, team skills and other factors, but I still want to explain the scenario as it may look familiar to some. Let me explain. Let’s assume that you have microservices and common libraries in two major programming languages. You have some teams who are experts in one and some teams experts in the other programming language. Now you need to come up with a solution for a scenario that all teams will need to leverage. Let’s assume that your cloud platform has an off-the-shelf approach for this but it is supported by a 3rd programming language that your teams do not have much experience in. What is the right thing for your organization and not just from the technical point of view? A) Do you embrace what your cloud platform gives you off the shelf and implement thi...

Owning and improving what you have is a sign of maturity in software engineering — Is it?

Re-factoring vs. Re-Writing? A lot of times we heard the talks about “Never be satisfied” in the context of innovation and driving your teams forward. In that context, this is totally fine, but when this mindset gets blindly used in the software engineering low level details, then you could be constantly  re-writing  code without taking the effort to truly understand it. Is this the right thing for business? Is it costly? Re-factoring  means that you took time to understand what you have, and you are improving it. On the other hand,  re-writing  does not necessarily mean that you took the time to understand the low-level details; it may mean that you went back to requirements and decided to re-write it without trying to understand the low-level details. There could be something in those details that you need to know so you don’t make the same mistake again in the process of re-writing it. At the end of the day, if you are constantly re-writing code, t...

Architecture Diagram for ...

ARCHITECTURE DIAGRAM representing interactions of your family of microservices is one of many important things to achieve the enterprise-level microservice architecture and microservices. After going through the design and contract definitions of your APIs/microservices, your team needs to put together a single diagram that explains the interaction among microservices (owned by your team) and how these microservices interact with other teams’ microservices. This is basically the blueprint for your team’s work and this picture needs to be constantly updated as you are evolving your services. What does this give you? It gives you the big picture view and it allows you to detect good or bad patterns that you cannot see when you are deep in your code. For example, you may be able to see that you are doing too many direct API calls for something that could be done with pub/sub approach. Almir Mustafic