Sometimes I talk about the education-industrial complex on this blog, rarely with kindness. I captured much of that in Stop Stealing Dreams.
Readers will see that not once have I criticized a hard-working teacher who meant well. That’s because it’s the bureaucratic industrial system that’s at fault here, not the teachers.
Now more than ever, with teachers scrambling with remote learning, personal health and the shifts in our culture, they matter.
Teachers matter because they have the guts to buck the dominant test and measure system. Because they show up with care and energy, and because they lead.
By time spent, what percentage of the typical school experience is spent on: tests, test prep, comportment, homework, memorization, the curriculum and the social pressure of fitting in?
And what percentage is spent on daydreaming, inventing, creating from scratch, doing it without a manual and finding new solutions to difficult problems?
I don’t think it’s an accident that we spend a fortune on high school football and almost nothing on creative writing hackathons.
Change is going to come from parents and from teachers who care. The system defends the system, and the system requires adherence and stability.
The massive shift to remote learning opens the door to slip in the kind of challenging problem solving and connection that we need right now. We have to hurry, though, because surveillance and more testing are probably right around the corner.
I started a new job a few years back, with one of my responsibilities stated as “Move the existing automation to the next level”. The company already had a lot of automated tests in place and had worked really hard on making automation a part of the normal team delivery. Great! I've worked with a lot of problematic automation and I know a thing or two about what not to do. So, it sounded like a dream!
In my vision, I saw a pipeline where all tests run green, and if something fails, we fix it ASAP so it is green again the next morning. In reality, however, most days at least something had failed. To me, it felt like no one was very worried about it. I worried. And I had a hunch there was a pattern, but I couldn't pinpoint it when looking at each individual test run. So, I pulled all the historic data, put it into a spreadsheet and started twisting and turning it to look at it from different perspectives. At the same time, I started asking a lot of questions of people from the teams furthest along the automation route who were involved in and/or affected by the automation in different ways. Those included testers, scrum masters, developers, architects, product owners, operations and management.
The questions depended on the role, but revolved around finding out how much confidence people had in the automation, how much time they spent maintaining the tests vs. improving them, and how I could help move things forward.
I had a few questions and hypotheses I wanted to answer, such as:
• Were we addressing the actual problems, or did we just re-run failing tests locally and blame the environment?
• Were the tests run as often as we needed, or could they be sped up to a point where they provided more value to the teams?
• Were there problems that could be solved, given the right people and/or resources?
• What problems were we actually trying to solve with these suites?
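The spreadsheet analysis described above can be sketched in plain Python. This is a minimal illustration, not the actual setup: the run-history layout and test names are invented, and the point is simply to aggregate per test across runs instead of staring at the latest run.

```python
from collections import defaultdict

# Hypothetical test-run history exported from a CI system:
# (date, test_name, passed) tuples. The data is invented for illustration.
history = [
    ("2020-03-01", "test_login", True),
    ("2020-03-01", "test_checkout", False),
    ("2020-03-02", "test_login", True),
    ("2020-03-02", "test_checkout", False),
    ("2020-03-03", "test_login", False),
    ("2020-03-03", "test_checkout", False),
]

# Aggregate per test across all runs, never only the last one.
stats = defaultdict(lambda: {"runs": 0, "fails": 0})
for _date, name, passed in history:
    stats[name]["runs"] += 1
    if not passed:
        stats[name]["fails"] += 1

# A failure rate near 100% or near 0% is a pattern worth investigating.
for name, s in sorted(stats.items()):
    rate = s["fails"] / s["runs"]
    print(f"{name}: {s['fails']}/{s['runs']} failed ({rate:.0%})")
```

Even this crude aggregation makes the difference obvious between a test that flickers occasionally and one that fails every single night.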
A few things quickly stood out, clear as day.
• People were so caught up in the daily deliveries that they didn't feel they had time to work proactively on continuous improvements.
• People felt a lot of the issues were “impossible to fix” because they lacked the right competence in the team.
• We were trying to solve a multitude of problems with the same set of tests.
• A number of tests had never succeeded. And a number had never failed.
Ok, with the problems defined we can finally try to figure out a solution! Let us dive into each one and see what can be done.
Breaking out of the hamster wheel
Problem: People were so caught up in the daily deliveries that they didn't feel they had time to work proactively on continuous improvements.
One issue with most software development out there is that unless you specifically make room for it, refactoring is always a struggle. To me, it has always been part of continuous improvement, and I usually have no problem creating a business case for it. I do, however, see that a lot of people out there, testers as well as developers, aren't trained to sell that story. As a result, they often (feel like they) get too little time to do more than deliver the stories in the current pile of “to do now”.
But never looking at the bigger picture means building tech debt, which in turn means increased maintenance cost and less and less value delivered. What I saw here was a never-ending cycle of
• Analyse the last run (never trends, always the last run)
• Debug found issues
• Fix said issues
• Set up the nightly run
This meant that we never took time to work on improving (proactive), only fixing (reactive). And adding more tests, of course. As a result, completing the cycle took longer and longer, which further reduced the time available for improvement. And while you are in this loop, it is really hard to see it!
In this case I had a role and background where I could help, but if you don't: find a sponsor! A sponsor could be someone who can help you prepare the business case, someone with access to a budget who can finance it, or someone with the organizational power to prioritize your ideas. Depending on where you work this could be different people, but a product owner, a manager or a team lead are good starting points.
The power of relationships is a topic for a separate blog post (I talk about it in “My journey from developer to tester” and Alex Schladebeck & Huib Schoots talk about it in their “Jedi mind tricks for testers”) but trust me on this: People want to help people they like and we like people who like us. So save yourself a lot of trouble and work on those relationships!
Making the impossible possible
Problem: People felt a lot of the issues were “impossible to fix” because they lacked the right competence in the team.
So, talking to people about the failing tests, I kept hearing that the issues were “impossible to fix”. Therefore, the tests were just re-run locally, and if they passed, nothing more was done. That could of course be fine; some tests just randomly fail at times for no apparent reason. But looking a bit deeper, I found a number of actual problems related to certain areas.
Operations and hardware
In this group go all the problems I could trace back to things the teams felt were out of their control, and that also relate to the operations department. Things like: “Oh, our firewall does not allow us to do X, so we needed to do Y to get around it.” “The tests fail if they start late or take longer than usual, because at 3am they back up system A.” “Yes, they crash every 3rd Monday of the month, but that's only because we do patches then.”
Issues in this category could be time related, such as backups, patches, batch jobs or applications closed during certain hours, or network related, such as firewalls, DNSs, blocked/trusted sites or queues filling up.
Of course, if you don't have any experience with operations these things might look unavoidable, but most of the time they can be sorted out by explaining your problem to someone with experience and asking for help.
Potential solutions range from rescheduling tests, separating domains and re-configuring hardware, all the way to simply throwing money at it! Explain to someone with the power to change things why this would save money and/or time, or increase team/customer satisfaction.
Test environments and test data
Here go problems related to the bane of many a tester: environments, integrations and data. Comments I sorted into this group were: “The test failed because the data was used up by another test.” “The test run last night failed because someone was locking the environment.” “That's not a bug, it only failed because Service X was down/had the wrong data.”
Potential problems here could be concurrency, corrupted data, race conditions, data limitations, limited capacity of the environments, limited number of environments, or (unnecessary) dependencies on services that were down and/or had the wrong data/version/state/configuration.
These are, of course, not easily fixed. However, a lot can be improved by some pretty standard practices today:
• using synthesized data when possible
• removing dependencies with mocks/stubs when possible
• using virtual environments that can be spun up when needed
• cleaning up data after a run
• having tests create their own data instead of looking for existing data
• asking for more/better environments
• getting help from management and/or operations to get the resources and prerequisites needed for a good, modern setup of environments and data
(Throwing money at it can go a long way to ease the pain!)
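As an illustration of two of these practices (stubbing out a dependency and having the test create its own data), here is a minimal sketch using only Python's standard library. The pricing service and function names are invented; the original post does not name a stack.

```python
from unittest.mock import Mock

# Hypothetical function under test that normally calls a live pricing
# service; both names are invented for illustration.
def order_total(pricing_service, items):
    return sum(pricing_service.price_of(item) for item in items)

# Replace the dependency with a stub, so the test no longer depends on
# Service X being up and holding the right data.
pricing = Mock()
pricing.price_of.side_effect = lambda item: {"apple": 10, "pear": 20}[item]

# The test creates its own data instead of hunting for existing rows
# that another test might have used up.
items = ["apple", "pear", "apple"]
assert order_total(pricing, items) == 40
print("order_total OK")
```

The same shape works for databases, queues and third-party APIs: the test owns its data and its dependencies, so a red result means a real problem rather than an environment hiccup.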
Processes and communication
Oh my, this is a big one… In this big pile of opportunities for improvement you could find gold nuggets such as: “We didn't know the code had changed, so we didn't know the tests needed to be updated!” “We didn't have time to fix the tests because we were busy implementing new tests!” “Oh, yes, that happened because the application in that environment was version X; we can only test version X.Y.”
Potential problems in this group are, of course, communication and/or process related. It could be that code changes are not communicated within the team, or between teams with dependencies. It could be that those changes were communicated, but that no one prioritized fixing the tests.
It could be that no one had seen the benefit of being able to run multiple versions of a test suite, say to quickly test both planned releases and urgent patches. Or it could be that we were trying to solve multiple problems with the same set of tests, which I will get back to under “One size fits no one” below. Or, of course, a lot of other things that could each be their own series of posts, so let's stick to the ones mentioned above.
These are both really hard and really easy to fix, depending on the individuals, teams and/or organization. A lot should be obvious (but is apparently not always implemented in reality), such as making sure testers (whatever formal role they might have) are included from start to end. Good ways to improve this are Three Amigos sessions, pair programming, mobbing, or just making sure test is always included in discussions about both problem and solution.
Communication breaking down between roles and/or teams is a strong anti-pattern, and when you see it, working on that problem should be everyone's priority. Ask for help if it seems too big to handle yourselves. Asking the other teams what information would help them, and how they would prefer to get it, will improve both relationships and your understanding of each other's problems, needs and domain. And relationships matter, a lot!
Running tests that we know will fail (say, we didn't have time to update them all) is a waste of time and resources and will create noise. A simple solution is to disable them until you know they should pass again, and put some more effort into testing those areas in other ways. (Or take an informed decision to accept the risk for now.)
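A minimal way to make that disabling explicit, assuming a Python unittest stack (the post does not name a framework), is a skip decorator with a visible reason, so the test shows up as skipped in every report instead of as red noise:

```python
import unittest

class CheckoutTests(unittest.TestCase):
    def test_add_to_cart(self):
        self.assertEqual(1 + 1, 2)  # stands in for a real check

    # The flow changed and this test has not been updated yet: skip it
    # explicitly, with a reason, instead of letting it fail every night.
    @unittest.skip("checkout flow changed -- update pending")
    def test_legacy_checkout(self):
        self.fail("outdated expectations")

# Run the suite programmatically so the skip is visible in the summary.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CheckoutTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(f"skipped: {len(result.skipped)}, failures: {len(result.failures)}")
```

The reason string is the important part: it turns a silent gap in coverage into a visible, reviewable decision.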
Not version-controlling tests is a problem I was extremely flabbergasted by. Version control for other types of code has been standard for years, but I keep hearing about tests not following the same pattern. We should of course be able to run the exact version of a test and/or test suite that we need at that point, in that particular environment. This should be standard procedure, and since most companies use some kind of version control system today, it should be very low effort to implement. I see no reason not to do this already; it should not be a limitation anywhere today. And if you don't already: put your test automation code with your production code whenever possible! They should not be separated; they are part of a whole.
One size fits no one: Use the right tool for the job
Problem: We were trying to solve a multitude of problems with the same set of tests.
Another of the anti-patterns I saw was that the test automation suite was used as a band-aid to fulfil a number of different needs for information, on different levels. Sometimes these were even in direct conflict with each other.
We had the developers, who wanted ultra-fast feedback on changes. Having that would allow them to make changes without worrying that they broke something else. For them, a red test might be expected or even wanted, applying a test-first perspective to development. As long as it was fixed fast, of course; otherwise red tests are just noise.
Then we had testers, who wanted quick feedback on the stability of a certain release candidate as well as the security of knowing that the existing regression tests were still ok. Having that would allow them to focus their testing on changes and/or exploring certain aspects of the software instead of having to waste time on something unstable, finding obvious bugs or testing all of the old functionality again. They also needed to be able to test different release candidates at the same time and possibly with different sets of data. To them, a failing test should mean a problem had been found, meaning something needs to be fixed ASAP. It should not mean wasting time on analysing the result or missing actual problems because the report was full of noise.
The product owner and the scrum master wanted the tests to answer questions related to planning and stability, such as “Is this release candidate ready for release?”. To them, a failing test should mean either not releasing or taking a quick decision that the problem was not bad enough to stop the release.
And then we had people like me, managers, who wanted information about strategic planning and investments, such as “Are we spending our money in the right places?”. And to do that, I wanted to look at trends, improvements, root cause analyses and things like that.
Of course, all of these questions and needs can't be met by running a single suite of automated tests in a single environment, with a single set of test data.
The solution to this might require a separate blog post but using layers of tests for different purposes and using different versions, data sets and environments is the short answer.
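As a sketch of that short answer, the layers can be made explicit as data, so each audience knows which suite answers its question. Everything below (suite names, audiences, time budgets) is invented for illustration:

```python
# Hypothetical layering scheme: each suite answers one question for one
# audience, instead of one suite trying to answer them all.
LAYERS = {
    "commit": {"audience": "developers", "runs": "on every push", "budget_min": 5},
    "regression": {"audience": "testers", "runs": "per release candidate", "budget_min": 60},
    "release": {"audience": "product owner", "runs": "before go/no-go", "budget_min": 120},
}

def select_layer(question):
    # Map an information need to the cheapest layer that answers it.
    mapping = {
        "did my change break anything?": "commit",
        "is this candidate stable enough to explore?": "regression",
        "can we ship?": "release",
    }
    return mapping[question]

print(select_layer("did my change break anything?"))
```

The design point is the separation itself: a developer's five-minute commit suite and a product owner's release gate can then run on different schedules, against different environments and data sets, without fighting each other.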
Dare to delete. It's ok, I promise.
Problem: A number of tests had never succeeded. And a number had never failed.
One thing I found that I thought would be simple to fix turned out to be very hard to convince people about. Deleting tests.
To me, a test should only be run if its value is greater than its cost, but I realized that the emotional barrier to removing something you had invested in was a lot greater than I anticipated. Of course, there is no easy answer to when tests start costing too much, since that depends on your context, but measuring total run time and total maintenance time over time is a good idea. Monitor whether any of those metrics start rising without a good reason, and try to set at least yearly goals for improvement.
I suggested a few things that I wanted to do:
Remove the tests that had never, once, failed. To me, they were noise that simply made us feel good about having a higher percentage of passes, while increasing the complexity of the test suite and the cost of running it (time and resources). People really did not want to do this because they felt it would lower the test coverage, but I argue that unless tests check something relevant, they are not providing value. And to me, a test that has never failed is by nature a bit suspicious. It might, of course, be important! But I would at least make sure it isn't always successful because it asserts the wrong things. It might also be that we never, in several years, changed anything in that area; in that case, do we need to run it every night? Removing tests does not mean you can never get them back, and maybe there are more important things to spend those resources on right now?
Remove the tests that had never, once, passed. Honestly, if you haven't fixed a test in a year, why bother running it? Here, the argument was that we would lose coverage and that the tests were needed, but this is frankly a hill I am prepared to die on. Fix or delete; there is no “what if” here. Do it.
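Both removal candidates fall straight out of the run history. A small sketch, again with invented data and a layout that is an assumption:

```python
# Per-test pass/fail history, e.g. extracted from nightly run reports.
# The test names and results below are invented for illustration.
history = {
    "test_login":        [True, True, True, True],   # never failed
    "test_old_payments": [False, False, False],      # never passed
    "test_checkout":     [True, False, True],
}

# Candidates to review: suspiciously green, and permanently red.
never_failed = [name for name, runs in history.items() if all(runs)]
never_passed = [name for name, runs in history.items() if not any(runs)]

print("review (never failed):", never_failed)
print("fix or delete (never passed):", never_passed)
```

Neither list is an automatic delete order; the point is that a ten-line query turns a gut feeling into a concrete, discussable list.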
So, to sum this all up…
In the words of Ian Fleming: “Once is happenstance. Twice is coincidence. Three times is enemy action”. Looking at trends rather than only at the current state will help you see patterns that can help you find areas where changes will have big impacts.
Asking for help and/or input is a great way of improving. Involving people with other areas of expertise might solve problems you thought were insurmountable. I know for a fact that my perspective fixed a few, and I have had other people solve my unsolvable problems more than once. Impossible might be possible with another set of tools.
As a professional working with tests, it is your job to make room for continuous improvements. No one can create time, but everyone can raise the issue. And honestly, making time for this type of work will save time in the end. And if you are a manager, it is your job to help people see when they are getting caught up in the daily business and help make room.
And in the words of Marie Kondo: “Do these tests spark joy?” KonMari your tests on a regular basis!
At a number of conferences I attended in the past, people connected several fields to software development in general and testing in particular, fields that (at first) seem unrelated.
Ernie Miller presented “How to Build a Skyscraper” at Full Stack Fest 2015 in Barcelona, a talk that was not actually about building skyscrapers. Rather, he described lessons that can be learned from building them. One of the ‘pro tips’ he presented is this: “A solution that seems unremarkable to you might just change everything for others. (so share what you build)”
Since then, I have pondered this for a while (read: more than 4 years). With my background in physics, I see a number of parallels to software testing. This is also a great opportunity to answer a question I get asked frequently: How did you enter software testing, given your background in physics?
Let’s start with a definition for physics:
Physics is an experimental science. Physicists observe the phenomena of nature and try to find patterns that relate these phenomena.
— Young, Freedman. “Sears and Zemansky’s University Physics: With Modern Physics”. Pearson Education. Also see the wikipedia article on physics.
The patterns that relate those phenomena are the theories (or laws) of physics. They are models that describe an aspect of reality. None of these models is complete, in the sense that it describes everything. There is no one physical theory that explains everything. A nice view of the landscape of physical models is shown in the image by Dominic Walliman (see sciencealert.com for details):
In physics (as in science in general), experimental results and predictions created by models are compared, in order to find out how a model does not match observed behaviour. This is important: Experiments can only ever invalidate a model, but not generally confirm its correctness.
To me software systems are models, too: Even though they may represent reality closely, a software system is not the thing it represents. Peter Naur described the relationship between theory building and programming in his paper ‘Programming as Theory Building’ (Microprocessing and Microprogramming 15, 1985, pp. 253-261).
My mental model of software testing is very similar to that of physics: I see software systems as partial implementations of aspects of real expected behaviour. In my view, testing a system means experimenting with it and comparing observed results with expectations. The expectations may come from requirements (written or otherwise), previous experiences with similar systems (e.g. another web application from the same company) or other sources.
There are many approaches to testing, similar to the many approaches to physics. Some of them work well in one area but not so well in another. What kind of testing is done heavily depends on the kind of software system: testing embedded software used in medical devices is drastically different from testing, say, a text editor.
It is interesting to go one step further, from physics to science. The Cambridge Dictionary defines science as
the intellectual and practical activity encompassing the systematic study of the structure and behaviour of the physical and natural world through observation and experiment
As I connect with various colleagues and friends who now work from home, they all say the same thing. “I’m always working. I work all day and all night. The work never ends.” They pause. “You’ve worked from home for years now. How do you manage it?”
I create boundaries for my professional and home-based work.
Some of these colleagues have the added pressure of children around the house. Or checking in with their parents or neighbors. Or, extending those helping hands to people outside their homes.
These are not normal times.
That means that, especially in these abnormal times, we need to create boundaries. There's always more work: another email, another presentation, another report.
When I think about boundaries for work, I think about where I can use constraints. Consider these possibilities:
Create Time Boundaries
I have normal start and end times for my workday. And, if I need “more” time, I can flex when I start and end my day inside a 13-hour window.
I have a 13-hour window of possible work time. I do not work those entire 13 hours.
Why do I protect some of my time? Because I need slack time—time when I’m not working at all.
When I create the “right” balance of work and slack time, my throughput increases. I create more of everything I want to create.
When I work too much, I complain that I get stuck on writing. My slack time refills my creative well so I can create more.
Can I work 13-hour days? Sure, for limited time periods. Not for weeks or months on end.
Time is my go-to boundary. I also use space.
Create Space Boundaries
If you’re working from home with several other people, you might find it difficult to create space boundaries.
Back when my children were young, we had a rule about Mom’s Office Door: When the door was closed, no one was invited in. If the door was open, they could come in.
And, when the children were older, I wanted them to check in with me when they came home from school. We had working agreements.
As they got older, they didn’t want to talk to me, never mind to spend time in my office. (We’re past that again.)
You might not have a door to your office. If not, and if you have young children, consider explaining the problem and asking them for help. If your kids are like mine, they will develop creative possibilities. Those possibilities might mean they spend hours decorating “walls” or “sidewalks” or whatever they decide you need.
Once you create boundaries, you need to respect them.
Respect Your Boundaries
I assume you can create some sort of boundary on your work time. You might need to alternate work and home time throughout the day, but you can.
You need to respect your boundaries. If you don’t respect your boundaries, no one else will.
You might need to create working agreements with your work colleagues. Maybe even house agreements with the people in your life.
You can’t work, work, work all the time. Everyone needs slack time. Create and respect your boundaries.
That’s the question this week: How can you create boundaries?