Every day, we push the boundaries of how fast we can deliver software. Delivering something new, better, and faster than the competition can mean an incredible payoff, and we are constantly asked to cut costs and deliver more, faster, and cheaper. But then suddenly you wake up to 189 dead in a plane crash, or to having to take down and redesign your entire banking service because the architecture didn’t hold up to the load.
Critical failures such as these demonstrate the need for critical thinking and risk mitigation – making good testing key to delivering ethical software.
“It’s not my decision to go live, I only provide information,” you might think. But I imagine that a long chain of people, probably including us, must have made many decisions leading up to that result. So where does our responsibility as testers end, and how do we act ethically without becoming gatekeepers and/or bottlenecks?
Having researched and analyzed a number of software failures through articles, webinars, conference talks, and discussions, I have distilled the following concepts:
Know your risks and fail-safe
Failing fast and learning fast is key to continuous delivery, and at its core, I believe, are a number of important things: psychological safety, knowing your data, and risk awareness. Without them, “testing in production” is just an excuse to save money by not testing early and often. With them, it can be a wonderful way of working, where the team trusts that it can find and solve issues before they cause users harm, and where everyone knows they can make mistakes without being used as a scapegoat when something goes wrong.
Psychological safety is needed in order to have trust in each other – knowing that no one will put the blame on any one person if something happens. Instead, the team will focus on solving the problem, figuring out what went wrong, and learning how to improve and reduce the risk of that particular issue ever emerging again.
Being skilled at risk assessment and knowing your data means knowing where your risks are, how best to mitigate them, how to discover a failure fast, and of course how to recover from such a failure in the shortest possible time, with low to zero impact on the users and/or the business.
Whole team responsibility does not mean you shouldn’t act responsibly
Big catastrophes usually don’t happen because of one single huge bad decision. Behind every software bug or exploit there are probably a number of small bad choices, cut corners, and/or misunderstandings that in the end led to something completely out of proportion to anything we could have imagined at the time.
As software professionals, it should be our responsibility to consider the harm our software could do, act according to our ethical and moral baseline and never expect someone else to take action to save ourselves the inconvenience of standing up for something or someone.
“Testers don’t own quality or decide on go or no-go” should never be an excuse to turn a blind eye or to avoid taking responsibility when needed.
Quality is built-in, not tested in
One of the reasons we want to shift left is to find problems early, which not only saves money but also saves a ton of rework, creating a better product with less effort. This requires everyone involved to learn how to think critically about everything from “Do we really need this?” and “Can we solve this problem in a simpler way?” to “In what way could this be used to cause someone harm?” and “What criteria need to be met in order for this to fail?”.
Some testers might worry that the demand for their expertise will decrease, but I believe that the reality is the complete opposite. Emerging areas of quality include observability, accessibility, security and data protection, and the ever-increasing software complexity drives demand for skilled testers. The challenge for us is to evolve and add those new skills to our toolboxes.
Agile for good, not for bad
I argue that while the concepts and ideas behind agile development are sound and good, many organizations are (mis)using the name for bad while pretending to be good.
We should be making small, regular deliveries because they mean lower risk, less waste, an even workload, and more value delivered to our customers. We should analyze data and user and system behavior in production in order to learn, improve our mean time to recovery, and – again – deliver more value to our customers!
Unfortunately, those benefits require a lot of organizational changes such as switching to pulling work instead of pushing work, allowing teams to truly self-organize and own their deliveries and letting go of the old ways of planning and managing software development.
What I often see instead is “agile” simply being used as a way of cutting costs, forcing low-quality software into production, and keeping people under constant high pressure. That way we get all the drawbacks but none of the benefits, except being able to claim that we are working agile. As professionals, I believe it is part of our job in these cases to push back and not compromise on quality or risk.
To conclude: as professionals and ambassadors of quality, it can at times be extremely hard to balance the benefits of speed against the drawbacks of having less time to mitigate risks. At times I have also felt that testing is given less and less space in software development, but I have come to realize that the importance of good testing is actually growing, and I see more and more companies starting to realize that too. Consider the ever-increasing demand for software that is accessible to everyone, the possible implications of AI and machine learning, security shifting from a once-a-year activity to something embedded in the normal development cycle, and growing legal demands on everything from how user data is handled to following up on financial solvency for banks and financial institutions. It is clear that the field of testing is not shrinking – it is growing, and we all need to grow with it!