This blog post is part of a series on Selenium. Although it’s highly recommended that you read the articles in order, you can still jump to the parts that interest you most.
In Part 1, we’ll focus on the high-level architecture of the framework. If you’re more interested in the core components of the framework, jump to Part 2. Part 3 will discuss the utilities that we can add to increase productivity.
Demand for web development and testing is huge. As of January 2018, there were over 1.3 billion websites on the internet serving 3.8+ billion internet users worldwide (statistics here). As a result, the tooling market is now more competitive than ever. Commercial tool vendors are fiercely stomping on each other to get a piece of the test tool pie. But so far, no one has outshone Selenium in terms of popularity and adoption.
The biggest selling point of Selenium is that it is open source; in other words, it is completely free to download and use. Selenium provides an API called WebDriver which enables testers to craft their tests in many programming languages, including Java, C#, Python, etc. Besides web browsers, you can also automate mobile platforms such as Android and iOS via Appium.
With all of those capabilities at your fingertips, you might feel invincible. Test automation is now problem-free, right? Actually, no.
Capabilities alone are not the end of the story; many test teams struggle day to day with the maintainability and scalability of their tests. It’s common that after the initial adoption phase, they regret not having spent enough time and effort building a good framework from the start.
How to build a good Selenium framework?
This article aims to guide you through the steps to build a good Selenium framework that satisfies the maintainability and scalability criteria mentioned above. Below is the outline for constructing a Selenium framework.
- Choose a programming language
- Choose a unit test framework
- Design the framework’s architecture
- Choose a reporting mechanism
- Decide how to build, version control and implement CI/CD
- Integrate your framework with other tools
As the article progresses, I’ll also include some best practices that you can apply to your project.
Choose a programming language
From an architectural perspective, the very first question you should ask is: in what form do I want to write my tests? There are several competing choices: keyword-driven, BDD, code, etc. Since I’m a Software Development Engineer in Test (SDET), I personally lean towards the code side. So my question becomes: in what programming language do I want to write my tests?
There are some factors that might help you decide:
- What language does your company/client currently use for their software development? If they work with Java to develop software, then maybe that’s the language you should start with.
- Who will use the framework to write the tests? Are they proficient in the chosen programming language?
Selenium is a widely used, open source, portable software testing framework for web applications. In the Selenium world, you have a wide range of programming languages to choose from, such as Java, C#, Ruby, Python, etc.
In my opinion, Java is a good choice because it is widely adopted and cross-platform. You can easily find code examples or troubleshooting tips when you’re stuck. It’s also the top priority for each release version of Selenium.
Another consideration: should you use the widespread method of Behavior-Driven Development (BDD)? In brief, BDD helps boost the readability of your tests by structuring a test flow into Given, When, and Then (GWT) statements. Not only technical testers but also domain experts and business testers can contribute to test creation, debugging, and updating.
The picture below shows an example of a test written in BDD.
Some tools that you can leverage if you choose BDD:
In my opinion, BDD is suitable for small or short-term projects. It’ll be hard to scale if you have to write a dozen “And/And/And…” steps in the GWT syntax. Again, I’d recommend writing tests directly in Java for better scalability.
Choose a unit test framework
Next, we need to select the unit test framework that our framework will be built on. A unit test framework helps us to:
- Mark a class or a method as part of the test using annotations (e.g. @Test)
- Perform the assertion/verification
- Execute test cases from IDE, command line, CI/CD, etc.
- Generate logs
- Produce XML/HTML reports of the test execution (test results)
- Group and prioritize test cases
- Execute tests in parallel
Since we already chose the Java language to write tests, I’d recommend using TestNG as the unit test framework. TestNG offers several important benefits:
- TestNG is similar to JUnit but more powerful, especially in terms of testing integrated classes. Better yet, TestNG offers all of the benefits that JUnit has to offer.
- TestNG eliminates most of the limitations of the older framework and lets developers write more flexible and powerful tests. Highlights include easy annotations, grouping, sequencing, and parameterization.
The below picture shows an example of two TestNG tests. Both tests share the same setup() method thanks to the @BeforeClass annotation.
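To make the shared-setup idea concrete, here is a minimal sketch. It is not TestNG itself: the `@BeforeClass` and `@Test` annotations are stand-ins declared locally so the example runs without TestNG on the classpath, `ToyRunner` is a toy substitute for the real runner, and `CalculatorTests` is a hypothetical test class. In a real project you would delete the stand-ins and import `org.testng.annotations.BeforeClass` and `org.testng.annotations.Test` instead.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Stand-ins for TestNG's annotations (assumption: TestNG is not available here).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface BeforeClass {}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Test {}

// Hypothetical test class: both tests rely on state prepared once in setup().
class CalculatorTests {
    final List<String> log = new ArrayList<String>();
    private int base;

    @BeforeClass
    void setup() {
        base = 40;
        log.add("setup");
    }

    @Test
    void addsTwo() {
        log.add("test");
        if (base + 2 != 42) throw new AssertionError("expected 42");
    }

    @Test
    void subtractsTwo() {
        log.add("test");
        if (base - 2 != 38) throw new AssertionError("expected 38");
    }
}

// Toy runner that only exists to make the sketch executable.
class ToyRunner {
    // Runs @BeforeClass methods once, then every @Test method.
    // Returns the number of tests that ran without throwing.
    static int run(Object suite) {
        Method[] methods = suite.getClass().getDeclaredMethods();
        int executed = 0;
        try {
            for (Method m : methods) {
                if (m.isAnnotationPresent(BeforeClass.class)) {
                    m.setAccessible(true);
                    m.invoke(suite);
                }
            }
            for (Method m : methods) {
                if (m.isAnnotationPresent(Test.class)) {
                    m.setAccessible(true);
                    m.invoke(suite);
                    executed++;
                }
            }
        } catch (ReflectiveOperationException e) {
            // A failing test surfaces here as an InvocationTargetException.
            throw new IllegalStateException("Test run failed", e);
        }
        return executed;
    }
}
```

Calling `ToyRunner.run(new CalculatorTests())` runs `setup()` once and then both tests. With real TestNG, the IDE, Maven, or a CI job plays the role of the runner.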
Design the framework’s architecture
Now, it’s time to take a look at our framework’s architecture. After many high-profile Selenium projects in my company, my team has gradually come up with a sustainable, maintainable, and scalable architecture as below.
The success factor of this architecture comes from the fact that there are two separate components called Selenium Core and Selenium Test. I’ll explain those components in detail in the following sections.
We’ve had an overview of the framework to be built. The next articles will explain how to build those components in detail.
Build the “Selenium Core” component
“Selenium Core” will control/manage the browser instances as well as element interactions. This core helps you to create, reuse, and destroy WebDriver objects. I normally use the Factory design pattern in creating WebDriver objects. One WebDriver object, as its name suggests, “drives” a browser instance – moving from web page to web page.
The test writers should not care how the browser instances are created. They just need a WebDriver object to execute a given test step in your test flow. Let’s see how we can add this abstraction to our framework using the Factory design pattern. Below is an example class diagram.
In the below code snippet, you can observe that DriverManager is an abstract class dictating that its implementations must provide methods such as createWebDriver(), getWebDriver() and quitWebDriver().
The ChromeDriverManager below extends the DriverManager abstract class shown in the class diagram above. It implements the createWebDriver() method by instantiating a new ChromeDriver with specified options. You must do the same for FirefoxDriverManager, EdgeDriverManager, or any other browser you want to support.
To easily manage the browsers that our project focuses on, we define an enum called DriverType:
DriverManagerFactory is the “factory” that manufactures DriverManager objects for you. You invoke the getDriverManager() method of this class with your DriverType to receive a DriverManager-type object. Since DriverManager is an abstract class, you won’t receive an actual DriverManager, just one of its implementations, such as ChromeDriverManager or FirefoxDriverManager.
In the below code snippet, I’m writing a test using one of the above DriverManagers. As you can see, the test writer doesn’t care whether the WebDriver for Chrome is called ChromeDriver. They only specify the browser type using one of the values inside the enum.
In the below test, we navigate to www.google.com and verify that the site’s title is named “Google.” Not much of a test but it demonstrates how you write a test.
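The pieces described above (DriverManager, ChromeDriverManager, DriverType, DriverManagerFactory, and a test that uses them) can be sketched together as follows. This is a sketch under assumptions, not the exact code of the framework: to keep it runnable without Selenium installed, the `WebDriver` interface here is a local stand-in that mirrors only three methods of `org.openqa.selenium.WebDriver`, and `FakeBrowser` and `GoogleTitleDemo` are hypothetical names. With Selenium on the classpath, you would import the real `WebDriver` and create a `ChromeDriver` or `FirefoxDriver` inside `createWebDriver()`.

```java
// Stand-in for org.openqa.selenium.WebDriver, reduced to what the sketch needs.
interface WebDriver {
    void get(String url);
    String getTitle();
    void quit();
}

// Hypothetical in-memory browser used in place of ChromeDriver/FirefoxDriver.
class FakeBrowser implements WebDriver {
    final String name;
    private String title = "";

    FakeBrowser(String name) { this.name = name; }

    public void get(String url) { title = "Fake page for " + url; }
    public String getTitle() { return title; }
    public void quit() { title = ""; }
}

abstract class DriverManager {
    protected WebDriver driver;

    // Each browser-specific subclass decides how its driver is built.
    protected abstract void createWebDriver();

    // Create the driver on first use, then reuse the same instance.
    public WebDriver getWebDriver() {
        if (driver == null) {
            createWebDriver();
        }
        return driver;
    }

    public void quitWebDriver() {
        if (driver != null) {
            driver.quit();
            driver = null;
        }
    }
}

class ChromeDriverManager extends DriverManager {
    @Override
    protected void createWebDriver() {
        // With Selenium: driver = new ChromeDriver(options);
        driver = new FakeBrowser("chrome");
    }
}

class FirefoxDriverManager extends DriverManager {
    @Override
    protected void createWebDriver() {
        // With Selenium: driver = new FirefoxDriver(options);
        driver = new FakeBrowser("firefox");
    }
}

enum DriverType { CHROME, FIREFOX }

class DriverManagerFactory {
    static DriverManager getDriverManager(DriverType type) {
        switch (type) {
            case CHROME:
                return new ChromeDriverManager();
            case FIREFOX:
                return new FirefoxDriverManager();
            default:
                throw new IllegalArgumentException("Unsupported browser: " + type);
        }
    }
}

// The test writer names a DriverType and never touches a concrete driver class.
class GoogleTitleDemo {
    static String openAndReadTitle(DriverType type) {
        DriverManager manager = DriverManagerFactory.getDriverManager(type);
        WebDriver driver = manager.getWebDriver();
        try {
            driver.get("https://www.google.com");
            return driver.getTitle();
        } finally {
            manager.quitWebDriver(); // always release the browser
        }
    }
}
```

Note that the only browser-specific detail a test writer sees is the `DriverType` value; everything else stays the same when the browser changes.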
By using this Factory design pattern, we can hide the WebDriver creation logic from the test classes containing the test cases. If there is a new requirement to run tests on a new browser, say Safari, it should not be a big deal. We just need to create a SafariDriverManager which extends DriverManager exactly like the ChromeDriverManager we saw earlier. Once it’s been created, test writers can simply use the new “SAFARI” value of the DriverType enum.
Additionally, it’s easy to integrate with Appium when we need to run tests against a native mobile app or a web app in a mobile browser. We just implement a new class to handle it, such as iOSDriverManager.
Build the “Selenium Test” component
This component contains all test cases that use the utilities provided by the “Selenium Core” we saw earlier. Just like the “Selenium Core” component, we have to design this component diligently. As I mentioned earlier, the design pattern we’ll apply here is called Page Object Model (POM).
Page Object pattern
Page Object Model (POM) has become a very popular pattern used in test automation frameworks because it reduces test maintenance cost and duplication of code.
Applying POM means you organize the UI elements into pages. A page can also include “actions” or business flows that you can perform on the page. For instance, if your web app includes several pages called the Login page, Home page, Register page, etc., you’ll create corresponding Page objects such as LoginPage, HomePage, RegisterPage, etc.
Thanks to this structure, if the UI of any page changes, we don’t need to update any tests. We just need to refactor the code of the page object, in one place.
Each class should strictly contain the methods related to the corresponding web page, and define the web elements (selectors) exclusively for it. A page object never contains methods of other pages or any tests, per se. For example, the login() method belongs only in the LoginPage class. HomePage should not contain login().
A simple Page object
Let’s zoom into a specific Page object. In the below example, we see that the LoginPage contains several important pieces of information:
- A constructor that receives a WebDriver object and sets its internal WebDriver to that object.
- The selectors that help the WebDriver object find the web elements you want to interact with. E.g. userNameTextBox
- Methods to perform on the Login page such as setUserName(), setPassword(), clickLogin(), and most importantly, a login() method that combines the three methods above.
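A LoginPage with those three ingredients might look like the sketch below. Several things here are assumptions for illustration: `WebDriver` and `WebElement` are simplified local stand-ins for Selenium's interfaces (the real `findElement` takes a `By` locator rather than a string), the selector values are made up, and `RecordingDriver` is a hypothetical fake that just logs each interaction so the example can run without a browser.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for Selenium's WebElement and WebDriver.
interface WebElement {
    void sendKeys(String text);
    void click();
}

interface WebDriver {
    WebElement findElement(String cssSelector); // Selenium uses a By locator here
}

// Hypothetical fake driver that records every interaction instead of
// driving a real browser.
class RecordingDriver implements WebDriver {
    final List<String> actions = new ArrayList<String>();

    public WebElement findElement(final String cssSelector) {
        return new WebElement() {
            public void sendKeys(String text) {
                actions.add("type " + cssSelector + " " + text);
            }
            public void click() {
                actions.add("click " + cssSelector);
            }
        };
    }
}

class LoginPage {
    private final WebDriver driver;

    // Selectors live in one place; the values are made up for this sketch.
    private final String userNameTextBox = "#username";
    private final String passwordTextBox = "#password";
    private final String loginButton = "#login";

    LoginPage(WebDriver driver) {
        this.driver = driver;
    }

    void setUserName(String userName) {
        driver.findElement(userNameTextBox).sendKeys(userName);
    }

    void setPassword(String password) {
        driver.findElement(passwordTextBox).sendKeys(password);
    }

    void clickLogin() {
        driver.findElement(loginButton).click();
    }

    // Business-level action that tests call instead of the three steps above.
    void login(String userName, String password) {
        setUserName(userName);
        setPassword(password);
        clickLogin();
    }
}
```

A test then needs only `new LoginPage(driver).login("alice", "s3cret")`. If the username field's selector changes, only the `userNameTextBox` line in LoginPage has to be edited.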
How to use a Page object
To use the LoginPage in your test, simply create a new LoginPage object and call its methods. Because the web element definitions (selectors) are abstracted away from test writers, they don’t need to know how to find the userNameTextBox. They just call the login() method and pass in the username and password. If the selectors happen to change, you don’t need to update all of your tests.
Up to this point, your test automation framework now has a solid skeleton. We can start adding more “meat” or utilities to the framework to increase productivity in the next part. Jump to Part 3. If you want to get an overview of the framework’s architecture, jump back to Part 1.
Choose a reporting mechanism
Reading the test results will be difficult if we don’t have a good reporting mechanism. We need to convert test results into insights that lead to immediate corrective actions. For example, when you receive a test failure, how do you investigate it quickly to determine whether it’s a bug in the application under test (AUT), an intentional functional change in the AUT, or a mistake in the automation?
There are a lot of options available for logging your automated tests. Reports provided by testing frameworks such as JUnit and TestNG are often generated in XML format, which can easily be consumed by other software like CI/CD servers (Jenkins) but is not very human-readable.
Third party libraries such as ExtentReport or Allure can create test result reports which are readable for humans. They also include pie charts and screenshots. There are some other open source reporting libraries such as ReportNG for Java – a simple HTML reporting plug-in for the TestNG unit-testing framework. ReportNG provides a simple, color-coded view of the test results. Additionally, setting up ReportNG is very easy.
A good report should provide detailed information such as the number of passed/failed test cases, the pass rate, the execution time, and the reason why test cases failed. Below is an example report generated by TestNG.
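To show how those figures relate to raw test results, here is a small plain-Java sketch. It is not how TestNG, ExtentReport, or Allure build their reports; `TestResult` and `ReportSummary` are hypothetical names, and the one-line output format is invented for this example.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical record of a single test outcome.
class TestResult {
    final String name;
    final boolean passed;
    final long durationMillis;
    final String failureReason; // null when the test passed

    TestResult(String name, boolean passed, long durationMillis, String failureReason) {
        this.name = name;
        this.passed = passed;
        this.durationMillis = durationMillis;
        this.failureReason = failureReason;
    }
}

class ReportSummary {
    // Builds a one-line summary followed by one line per failed test.
    static String summarize(List<TestResult> results) {
        int passed = 0;
        long totalMillis = 0;
        StringBuilder failures = new StringBuilder();
        for (TestResult r : results) {
            totalMillis += r.durationMillis;
            if (r.passed) {
                passed++;
            } else {
                failures.append("\n  FAILED ").append(r.name)
                        .append(": ").append(r.failureReason);
            }
        }
        int failed = results.size() - passed;
        // Guard against division by zero when no tests ran.
        double passRate = results.isEmpty() ? 0.0 : 100.0 * passed / results.size();
        return String.format(Locale.ROOT,
                "passed=%d failed=%d passRate=%.1f%% time=%dms%s",
                passed, failed, passRate, totalMillis, failures);
    }
}
```

A reporting library adds the presentation layer (charts, screenshots, history) on top of numbers like these.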
I’d recommend TestNG due to its popularity and ease of use. Additionally, it offers very good functionality.
Decide how to build, version control and implement CI/CD
There are other areas of concern that accompany building a complete Selenium framework.
- Build tools and dependency managers: Dependency managers help manage the dependencies and libraries that the framework is using. Examples of these tools include Maven, Gradle, Ant, NPM, and NuGet. Build tools assist in building the source code and dependent libraries, as well as in running tests. For example, in the below image, we used Maven to execute our tests (mvn clean test):
- Version control: All automation teams must collaborate and share source code with each other. Just like in a software development project, the source code of the tests and test utilities is stored in a source control system, also known as a version control system. Popular choices include Git (hosted on services such as GitHub or Bitbucket) and TFS.
- CI/CD integration: Popular CI systems include Jenkins, Bamboo, TFS, to name a few.
Integrate your framework with other tools
For some special needs, there are tools you could leverage:
- AutoIt is a freeware BASIC-like scripting language designed for automating the Windows GUI and general scripting. It will help in case we want to work with native applications like the download dialog of the browser.
- TestRail is a test case management (TCM) system that proves useful when your project has a large number of tests and related work items such as bugs. Our Selenium framework can automatically push test results to TestRail after finishing execution.
- Jira integration is useful if you want the automation framework to automatically post, close, or reopen bugs when test cases fail.
Selenium is a powerful tool for functional and regression testing. To get the maximum benefit out of it, you should have a good framework architecture from the beginning. First, choose a suitable programming language and a familiar unit test framework to improve productivity. Second, build the Selenium Core by applying the Factory design pattern. Third, build the Selenium Test component by applying the Page Object pattern. Finally, add more utilities such as a good reporting mechanism, source control, CI/CD integration, and third-party integrations whenever and wherever necessary.
Once you cement the foundation, anything you build afterwards is there to stay. From our 20+ years of experience in software testing, this upfront investment pays off with compounding interest in the long run. You won’t regret it.
Author: Truong Pham – TCoE Lead
Truong joined LogiGear Da Nang as a Test Automation Engineer. He now works for LogiGear’s Testing Center of Excellence, where he is responsible for building and enhancing test automation frameworks as well as providing high-quality testing services. Truong has a great passion for test automation. In his free time, Truong tinkers with new technologies for fun.