A Big and Complex Interview

For this interview, we talked to three people at Salesforce: Greg Wester, Senior Member Technical Staff; Craig Jennings, Senior Director, Quality Engineering; and Ritu Ganguly, QE Director.

Salesforce.com is a cloud-based enterprise software company specializing in software as a service (SaaS). Best known for its Customer Relationship Management (CRM) product, it was ranked number 27 in Fortune’s 100 Best Companies to Work For in 2012.

What is big or complex about your system (users, physical size, data, load, distribution, safety, regulation, security, other)?

This should give you an idea. Salesforce processes 700 million highly complex business transactions per day for nearly 3 million active users, whose raw processing needs are growing at a 50% compounded annual rate. We expect to soon exceed 1 billion transactions a day across 6 global data centers housing 20 computing clusters, which we call “pods”. We have a multitenancy architecture where each customer’s data lives with a group of other customers in one of these pods. Within a pod we have a horizontally scaled application tier on x86 commodity hardware that, among other things, also hosts a distributed cache. We have a homegrown message queueing system that allows asynchronous processing to be scheduled in either the database tier or the application tier.

Do you remember any remarkable event that changed your mind about how big or complex your system is?

This company has a strong culture of “putting our money where our mouth is”, so it should be no surprise that we use our own product to run our business. It’s also well-known that Salesforce employees collaborate, share, and align on our corporate social networking product, Chatter. When Chatter was still under development, our founder, Marc Benioff, encouraged every employee to share their vision and goals document on their Salesforce profile on Chatter. As a result, one of our non-customer facing pods showed a brief performance decrease while we added physical storage to the file servers. This affirmed that monitoring and management tools are often as important as the product software itself in achieving high uptime. You have to watch what’s happening, you have to respond quickly, and you have to learn from what’s happened. Our early movement towards being an open social enterprise exceeded estimations. However, we were prepared by our DNA of using our own product to run our own business.

Do you document testing as you have in the past, or has documentation become leaner even with a big or complex system?

At Salesforce, we think automated test cases describe how a feature works far better and more efficiently than a design document. We’re an Agile shop, so our design documentation isn’t voluminous. However, the aggregate of tests that have passed and failed is a more current, accurate, and detailed description of our product than any written test document. We have made our documentation more efficient based on our customers’ needs. We have reviewed traditional test plans and strategies and kept what is needed, but our philosophy is lean: less documentation and more testing. That’s not to say we don’t do test planning. Our Quality Engineers must first think about the feature at a high level, and we built a tool that encourages this. You must understand the customer’s use cases, list out all your assumptions, and plan out a testing strategy. After this, though, we get the engineers right into their coding environment and keep them there. They’ll stub out their test cases and write the intent of each test case and its expected result in Javadoc. When we check those tests in, we have a tool that parses the Javadoc and puts the test case name, description, and expected result into our test case repository automatically. It’s quite intelligent and keeps engineers productive since they don’t have to switch contexts at all.
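To make the Javadoc workflow described above concrete, here is a minimal sketch in Java. It is not Salesforce's tool: the `@expectedResult` tag, the `TestCaseEntry` shape, the class name, and the sample test are all assumptions made for illustration. The sketch reads a stubbed test method, pulls out its name, description, and expected result, and returns rows that could be loaded into a test case repository.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavadocTestCaseExtractor {

    /** One row for a hypothetical test case repository. */
    public static final class TestCaseEntry {
        public final String name;
        public final String description;
        public final String expectedResult;

        TestCaseEntry(String name, String description, String expectedResult) {
            this.name = name;
            this.description = description;
            this.expectedResult = expectedResult;
        }
    }

    // A Javadoc block (with no "*/" inside it), optional annotations,
    // then a "public void name(" declaration.
    private static final Pattern TEST_METHOD = Pattern.compile(
        "/\\*\\*((?:(?!\\*/).)*)\\*/\\s*(?:@\\w+\\s*)*public\\s+void\\s+(\\w+)\\s*\\(",
        Pattern.DOTALL);

    // Hypothetical custom tag; it is not a standard Javadoc tag.
    private static final String EXPECTED_TAG = "@expectedResult";

    /** Extracts one entry per documented test method found in the source text. */
    public static List<TestCaseEntry> extract(String source) {
        List<TestCaseEntry> entries = new ArrayList<>();
        Matcher m = TEST_METHOD.matcher(source);
        while (m.find()) {
            StringBuilder description = new StringBuilder();
            String expected = "";
            for (String rawLine : m.group(1).split("\n")) {
                // Drop the leading "*" that decorates each Javadoc line.
                String line = rawLine.trim().replaceFirst("^\\*\\s?", "").trim();
                if (line.isEmpty()) {
                    continue;
                }
                if (line.startsWith(EXPECTED_TAG)) {
                    expected = line.substring(EXPECTED_TAG.length()).trim();
                } else if (!line.startsWith("@")) {
                    if (description.length() > 0) {
                        description.append(' ');
                    }
                    description.append(line);
                }
            }
            entries.add(new TestCaseEntry(m.group(2), description.toString(), expected));
        }
        return entries;
    }

    /** A stubbed test as an engineer might check it in (invented example). */
    public static final String SAMPLE = String.join("\n",
        "/**",
        " * Verifies that a read-only user cannot edit a closed opportunity.",
        " * @expectedResult The save is rejected with an insufficient-privileges error.",
        " */",
        "@Test",
        "public void testReadOnlyUserCannotEditClosedOpportunity() {",
        "    // stub: implementation to follow",
        "}");

    public static void main(String[] args) {
        for (TestCaseEntry e : extract(SAMPLE)) {
            System.out.println(e.name + " | " + e.description + " | " + e.expectedResult);
        }
    }
}
```

A regular expression keeps the sketch short. A production tool would more likely use a real Java parser or a doclet, since a regex will miss unusual formatting.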

What type of SDLC do you follow? Have you found limitations in SDLC due to the size of the system you support?

Salesforce has its own flavor of Agile called the Adaptive Delivery Methodology, or ADM. It’s pretty much textbook Scrum with a few twists. Product Owners prioritize features from a backlog based on customer interest and business opportunity. Teams of four to a dozen engineers in development, quality, performance, user experience, and documentation meet in a daily stand-up meeting and collaborate to deliver “potentially releasable” features each iteration. Each team chooses its own iteration length: most are on two-week iterations, but some are on week-long sprints. Our customer base requires that we introduce features with ample notice and staging beforehand on sandbox environments. We are very careful about what goes into a patch release, because a fix for one customer can turn out to be a bug to another.

What is the biggest problem you face in delivering your system to users?

Our platform is basically an ecosystem that is built and managed by us, but controlled by the customers. Our sales are increasing at over 38% year over year. As a result of this success, the performance tuning tweaks we’ve verified and deployed to our system today may be suboptimal a year from now, even if we made no major code changes. Scale is key. We know we have some of the best Technical Operation and R&D teams in the world. Their coordinated success is ultimately the foundation of our business model.

How has your testing strategy changed as your system got bigger or more complex?

Definitely. We have had to look at our customers’ complex implementation needs, business processes, and customizations, and ensure we represent real customer scenarios in our testing. As we grew at an enormous rate, we learned how support cases that escalate to R&D drag the velocity of feature work. They mire teams in bug fixing. Too many bug fixes in patch releases also introduce risk to the product. The goal of our testing strategy is to minimize the amount of time spent supporting a feature after it’s released. In other words, our aim must be to find all of the impactful bugs, corner cases, and quirks. We leverage tests written by customers in our Apex Code language to verify their use cases before each major release. Since some bugs are inevitable even with a very thorough process, we then put effective monitoring and management systems in place so that we can react to issues immediately when they arise.

Did you change the experience or job requirements for test engineers as a result of a bigger or more complex system?

Salesforce’s customers have expectations that our system will have minimal downtime each year, and no unscheduled downtime. We are delivering a service at a scale where every test must be automated, and most features have more SLOCs of test code than application code. We hire only software engineers into Quality Engineering who can perform white and black box testing. Not every development engineer has the instinct to be a test engineer, and vice versa. Quality Engineering requires solid programming skills, a laser focus on customer service, a knack for risk management, and an eye for hidden or low frequency/high impact bugs. We have also expanded our performance testing team.

Have you changed your reliance on test automation due to size or complexity?

We rely on it in increasing amounts. This is the only way we can continue scaling. We think test automation has a maturity model by which you can measure the commitment to quality within an organization:

  • Level 1 is unit test coverage, representing a certification by the developer that individual implementations of software classes function in a particular way.
  • Level 2 includes functional testing of a module of software classes.
  • Level 3 is end to end testing of every module in a particular application while it is running on a single host or node.
  • Level 4 is testing the application under load for an extended period with all of its supporting subsystems including database, cache, message queues, etc.
  • Level 5 has the same parameters as Level 4, with the added requirement that every piece of hardware, every operating system library, and every configuration is as it appears in a production environment with customer data.

When we start seeing diminishing returns from one level, we move to the next.

What percentage of your tests are automated?

Over 90%. Our goal is that no teams run manual tests, unless you’re counting exploratory testing (which every team does). That last 10% is amazingly difficult. We’re forced to test manually when the tools for automation are in their infancy, such as on mobile platforms. We’ve made impressive leaps forward in these areas, but there’s still work to do. Talk to us next year. We’ll have solved some of those problems and be closer to 100%.

What do you see in the future for testing big or complex systems?

Mainstream tools like JUnit were designed for unit testing a single class but have evolved to accommodate complex functional testing scenarios. Unit testing on a simple piece of code has a binary outcome: pass or fail. Functional testing on a live distributed system with components designed on loose service level agreements to accommodate graceful degradation and failure of neighbors is different. The tools for this are in their infancy and require engineers as creative and talented as the ones who designed the system to write frameworks for testing it. This presents an opportunity for a thought leader to emerge with an industry standard for distributed software testing. We’re proud of our accomplishments in this area, and are aiming for that goal.
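The contrast drawn above between a binary unit test and a functional check on a system designed to degrade gracefully can be sketched in Java. This is only an illustration: the `Sample` type, the latency budget, and the minimum fraction are invented here and do not describe Salesforce's frameworks. Instead of failing on the first bad response, the check passes when a large enough share of requests succeeded within the budget.

```java
import java.util.ArrayList;
import java.util.List;

public class ServiceLevelCheck {

    /** Outcome of one request against the system under test (hypothetical shape). */
    public static final class Sample {
        final boolean succeeded;
        final long latencyMillis;

        public Sample(boolean succeeded, long latencyMillis) {
            this.succeeded = succeeded;
            this.latencyMillis = latencyMillis;
        }
    }

    /** Fraction of samples that succeeded within the latency budget. */
    public static double fractionWithinBudget(List<Sample> samples, long budgetMillis) {
        if (samples.isEmpty()) {
            return 0.0; // no evidence is treated as not meeting the objective
        }
        long ok = samples.stream()
            .filter(s -> s.succeeded && s.latencyMillis <= budgetMillis)
            .count();
        return (double) ok / samples.size();
    }

    /** Passes when at least minFraction of requests were good, rather than all of them. */
    public static boolean meetsObjective(List<Sample> samples, long budgetMillis, double minFraction) {
        return fractionWithinBudget(samples, budgetMillis) >= minFraction;
    }

    public static void main(String[] args) {
        List<Sample> samples = new ArrayList<>();
        for (int i = 0; i < 9; i++) {
            samples.add(new Sample(true, 100)); // fast, successful requests
        }
        samples.add(new Sample(false, 100));    // one failure from a degraded neighbor

        // Assumed objective for illustration: 80% of requests succeed within 200 ms.
        System.out.println("meets objective: " + meetsObjective(samples, 200, 0.80));
    }
}
```

A check like this can be called from an ordinary JUnit test, so the framework still reports pass or fail while the criterion underneath is a tolerance.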

LogiGear Corporation
LogiGear Corporation provides global solutions for software testing, and offers public and corporate software testing training programs worldwide through LogiGear University. LogiGear is a leader in the integration of test automation, offshore resources and US project management for fast, cost-effective results. Since 1994, LogiGear has worked with companies from the Fortune 500 to early-stage start-ups, creating unique solutions to meet their clients’ needs. With facilities in the US and Viet Nam, LogiGear helps companies double their test coverage and improve software quality while reducing testing time and cutting costs.
