Stress Testing Overview

Importance

Stress and load testing is an important topic in web development. The usability of websites and applications depends on fast responses from the web infrastructure; when responses are slow, users get frustrated and leave. Studies have shown that many website users expect a new page to load within 3 seconds.

Terms and Definitions

Stress and load testing refers to tests carried out against websites and web applications, using specialized tools, to ensure that their capacity meets expectations. Performance testing is a broader term that encompasses many types of testing, such as "spike tests", "endurance tests", "volume tests", "stress tests", "load tests", "soak tests", and "isolation tests". Some people consider the goal of stress testing to be system failure. This blog post states that breaking the web application is the true nature of stress testing.

Wikipedia has some definitions for what they call software performance testing.

"The term load testing is used in different ways in the professional software testing community. Load testing generally refers to the practice of modeling the expected usage of a software program by simulating multiple users accessing the program concurrently. As such, this testing is most relevant for multi-user systems; often one built using a client/server model, such as web servers. However, other types of software systems can also be load tested. For example, a word processor or graphics editor can be forced to read an extremely large document; or a financial package can be forced to generate a report based on several years' worth of data. The most accurate load testing simulates actual use, as opposed to testing using theoretical or analytical modeling."

Who Does It and Why?

Stress testing is usually carried out by a number of people in an organization, including system admins, web developers, database administrators, product managers, quality assurance managers, system architects, network administrators, marketing teams (website owners), line of business directors, and sometimes the customers. It is common for third-party organizations to be hired as consultants to build the stress test scripts, run the tests, and interpret the results. Some consultants are performance engineers: highly skilled staff who understand all aspects of web architecture and software development, because they must not only create relevant test plans but also find and eliminate bottlenecks in system performance.

Stress tests are conducted as a controlled process of analysis and measurement. Usually the system under test ("SUT") is functionally stable enough that a single user encounters few bugs. Load and stress tests are carried out by increasing the load on the SUT with tools such as JMeter, LoadStorm, LoadRunner, ApacheBench, Grinder, or Pylot. Go here for a full list of tools.

How to Do It?

A good stress test starts with a planning process. First, you must collect information to determine the goals of the testing. Do we want to break the SUT with high amounts of user traffic? Do we want to see if the system can recover from a failed hard drive? Do we want to see how the system responds when CPU is consumed by another process or when memory is low?

Second, you must build a good test plan that contains realistic scenarios for simulating usage. Analyzing the plan for errors or omissions and documenting each type of user behavior will produce a much better test plan and more accurate performance data when the test is executed. This process involves reviewing the plan and updating its activities regularly as development continues. Load and stress testing is not a one-time event. The plan should be adjusted as the developers add new features and as the system admins change configurations.
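
As a minimal sketch of what documenting each type of user behavior might look like, the Python below describes a test plan as a set of weighted scenarios and splits a target number of simulated users across them. The scenario names, steps, and weights are hypothetical placeholders, not figures from a real site; in practice they would come from your own traffic analysis.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One type of user behavior in the test plan (all values hypothetical)."""
    name: str
    weight: int            # relative share of traffic, e.g. 60 means 60 parts
    steps: tuple[str, ...]  # pages or actions the simulated user performs


def allocate_users(scenarios: list[Scenario], total_users: int) -> dict[str, int]:
    """Split total_users across scenarios in proportion to their weights.

    Uses the largest-remainder method so the counts always add up
    to total_users even when the shares are not whole numbers.
    """
    total_weight = sum(s.weight for s in scenarios)
    if total_weight <= 0:
        raise ValueError("scenario weights must sum to a positive number")

    exact = [(s.name, total_users * s.weight / total_weight) for s in scenarios]
    allocation = {name: int(share) for name, share in exact}

    # Hand out users lost to rounding down, largest fractional part first.
    leftover = total_users - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda p: p[1] - int(p[1]), reverse=True)
    for name, _share in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical plan: most visitors browse, fewer search, a few check out.
PLAN = [
    Scenario("browse", 60, ("home", "category", "product")),
    Scenario("search", 30, ("home", "search", "product")),
    Scenario("checkout", 10, ("cart", "payment", "confirmation")),
]

if __name__ == "__main__":
    print(allocate_users(PLAN, 1000))
```

Keeping the plan as data like this makes it easier to review for omissions and to update when new features change how users behave.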

Third, many engineers like to set up a test environment specifically for the technical investigations involved in functional and stress testing. A dedicated environment also allows for experimentation on components that affect scalability. You can add or remove items (e.g. load balancers, bigger database servers, web servers) to see how much response times change. After all the dust settles, it is always a good idea to stress test the system at least once after the release has been moved into production. You never know when something slightly different from the test environment will cause a degradation in the production system.

Fourth, run the tests. Many testers prefer to run a test with gradually increasing volume of users because it will clearly show at what load errors begin to show up from the SUT. You may want to start with a known low volume that your system can handle easily, then reach a load level by the end of the test that you believe is beyond your SUT's capability. For example, you may want to start with 500 concurrent users and increase load over an hour up to 10,000 concurrent users.
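
The ramp in that example can be sketched as a simple schedule. The Python below is an illustration only: it assumes a linear ramp in fixed time steps, and the function names, the 5-minute step, and the 1% error threshold are assumptions made for the sketch, not settings from any of the tools mentioned earlier.

```python
from __future__ import annotations


def ramp_schedule(start_users: int, end_users: int,
                  duration_min: int, step_min: int) -> list[tuple[int, int]]:
    """Return (minute, concurrent_users) pairs for a linear ramp."""
    if duration_min <= 0 or step_min <= 0:
        raise ValueError("duration and step must be positive")
    if duration_min % step_min != 0:
        raise ValueError("duration must be a multiple of the step")

    steps = duration_min // step_min
    span = end_users - start_users
    return [(i * step_min, start_users + round(span * i / steps))
            for i in range(steps + 1)]


def first_failing_load(error_rates: dict[int, float],
                       max_error_rate: float = 0.01) -> int | None:
    """Return the lowest load whose error rate exceeds the threshold.

    error_rates maps concurrent users -> fraction of failed requests
    observed at that load. Returns None if no load crossed the threshold.
    """
    failing = [users for users, rate in error_rates.items()
               if rate > max_error_rate]
    return min(failing) if failing else None


if __name__ == "__main__":
    # 500 -> 10,000 concurrent users over one hour, in 5-minute steps.
    for minute, users in ramp_schedule(500, 10_000, 60, 5):
        print(f"minute {minute:2d}: {users} users")
```

Recording the error rate at each step and passing it to `first_failing_load` gives a rough answer to the question of where errors begin to show up.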

Last, analyze the test reporting. Metrics such as response times, requests per second, and throughput are very important in the analysis because they show how much load was applied and how the SUT performed under stress. Each application or website is different, so there is no right or wrong answer. 1,000 concurrent users may be fantastic for some sites, but web applications such as Facebook must be able to respond with pages in a couple of seconds under a load of 1 million users. Your stakeholders (e.g. project manager, marketing department, business owners) can tell you what counts as success for your site. In stress testing, you should take that objective and try to push the system past that level of load.
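
To make those metrics concrete, here is a small Python sketch that summarizes raw test samples into an average response time, a 95th-percentile response time, requests per second, and an error rate. The sample format (response time in milliseconds plus a success flag) and the sample values are made up for illustration; real tools export their own formats, and the nearest-rank percentile used here is only one of several common definitions.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples, duration_s):
    """Summarize (response_ms, ok) samples collected over duration_s seconds."""
    if not samples or duration_s <= 0:
        raise ValueError("need at least one sample and a positive duration")

    response_ms = [ms for ms, _ok in samples]
    errors = sum(1 for _ms, ok in samples if not ok)
    return {
        "requests": len(samples),
        "avg_ms": mean(response_ms),
        "p95_ms": percentile(response_ms, 95),
        "requests_per_s": len(samples) / duration_s,
        "error_rate": errors / len(samples),
    }


# Made-up samples: ten requests taking 100..1000 ms, the slowest one failed.
samples = [(100 * i, i != 10) for i in range(1, 11)]
report = summarize(samples, duration_s=5)

if __name__ == "__main__":
    for metric, value in report.items():
        print(f"{metric}: {value}")
```

Reporting a high percentile alongside the average is a deliberate choice: an average can look healthy while a meaningful share of users still wait much longer.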

Summary

Stress and load testing is all about bombarding your site or application with many simulated users. You should build good, realistic test plans, and you should run test loads that are high enough to break the SUT. The test results should include performance metrics that are useful to your stakeholders such as average response times. After presenting these measurements back to the interested parties, if the performance is not deemed adequate, you must begin to tune and engineer your system - and that is a different topic altogether.