Texttest.org

If you run TextTest on a multicore machine, it will run as many tests in parallel as you have cores on the machine. In terms of the configuration this means that the "queuesystem" configuration module is automatically enabled. You can also run tests sequentially, using the "Run tests sequentially" checkbox on the static GUI's running tab, or the "-l" flag on the command line.

To disable this configuration and hence always run one test at a time, you can set

config_module:default

in your config file. To steer how many tests will be run simultaneously, you can set "queue_system_max_capacity" in your config file also.

As soon as each test finishes, the test will go green or red, and results will be presented. Unlike the default configuration, the tests will not naturally finish in order.

When you have more than one machine at your disposal for testing purposes, it is very beneficial to be able to utilise all of them from the same test run. This greatly speeds up testing, naturally, and means far more tests (or longer tests) can be run with somebody waiting on the results.

There are two basic setups here. The older "Grid Engine" setup presumes access to testing resources within your network, a shared file system on that network using something like NFS, and "Grid Engine" software installed and configured on it. The "cloud" setup (new in TextTest 3.27) presumes an account with a cloud provider (Amazon), and also a Virtual Private Cloud setup, so that the instances in the cloud can see your network also and can hence push results there when needed.

Three grid engine implementations are provided, though it is fairly straightforward to implement new ones if needed. The first is for the free open source grid engine SGE. Note that since Oracle bought Sun there are various descendants of the original SGE available, some open source, some with commercial support. See the SGE documentation for more details. The next is IBM's LSF (which is in theory more Windows-friendly, but costs money). Finally there is support for HTCondor, although this should be regarded as somewhat experimental. You choose between these by setting the config file entry “queue_system_module” to "SGE", "LSF", or "condor".

Acquiring machines and setting up NFS and the grid engine may be quite a large task though, and unless they already exist in your network it's probably a good idea to look into using a cloud instead. This will spread the cost out more, and also make it easier to adjust usage up and down over time.

Right now there is only support for Amazon's EC2 cloud. This can be selected by setting "queue_system_module" to "ec2cloud". It relies on the Python library boto, which you will need to install, for example via "easy_install boto". As stated above you will also need a setup where the EC2 instances can access your own machines, which implies setting up a Virtual Private Cloud.

It expects you to have set up the instances you wish to use outside of TextTest. These instances should have TextTest installed (just run 'sudo easy_install texttest' on them) and obviously any packages that your system under test will require. Also, TextTest does not maintain any mapping of paths between your system and the EC2 instances: it assumes they will always be the same. You'll therefore need to make sure that, for every location you use on your system, the equivalent location is writeable by the "ec2-user" user on your instances. It's probably best to have some kind of "instance launching" script that can ensure this.

TextTest will start any instances that are stopped, if necessary, but it will not stop them again when it is finished. The reason is that EC2 charges every time an instance is stopped and started, and test usage often entails running tests several times in a row. It's therefore fairly essential to configure a CloudWatch alarm for each of your instances, which will stop them if they have been idle for a while. Here's some sample code that sets up such an alarm. It stops the instance after 2 hours during which CPU utilization has stayed under a certain threshold (here 5% divided by the number of cores).

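import boto.ec2.cloudwatch

def addAlarm(instId, cores, regionName):
    # Connect to CloudWatch in the region where the instance lives
    cwConn = boto.ec2.cloudwatch.connect_to_region(regionName)
    # Allow 5% CPU utilization in total, shared between the cores
    threshold = 5.0 / cores
    # Built-in AWS action that stops the instance when the alarm fires
    action = 'arn:aws:automate:' + regionName + ':ec2:stop'
    # 120 consecutive 60-second periods under the threshold = 2 hours idle
    alarm = boto.ec2.cloudwatch.alarm.MetricAlarm(name="stop-" + instId,
                                                  metric="CPUUtilization",
                                                  namespace="AWS/EC2",
                                                  statistic="Maximum",
                                                  comparison="<", threshold=threshold,
                                                  period=60, evaluation_periods=120,
                                                  dimensions={"InstanceId" : instId},
                                                  alarm_actions=[action])
    cwConn.put_metric_alarm(alarm)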
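
The arithmetic behind the alarm parameters can be checked separately from AWS. The following is only a sketch: the helper name alarm_settings and its idle_hours and total_percent parameters are hypothetical and are not part of TextTest or boto. It reproduces the values used in the sample code, where a 60-second period evaluated 120 times gives a 2-hour idle window.

```python
def alarm_settings(cores, idle_hours=2, period=60, total_percent=5.0):
    """Return (threshold, period, evaluation_periods) for an idle-stop alarm.

    Hypothetical helper: it mirrors the arithmetic of the sample alarm
    and does not talk to AWS.
    """
    # Number of consecutive periods that must stay below the threshold
    evaluation_periods = int(idle_hours * 3600 // period)
    # Share the allowed CPU percentage between the cores
    threshold = total_percent / cores
    return threshold, period, evaluation_periods

print(alarm_settings(cores=4))
```

For a 4-core instance this gives a threshold of 1.25% with 120 evaluation periods of 60 seconds each.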

While TextTest is using the instances, it will add a "TextTest user" tag to them to provide a primitive kind of locking and stop others using them. It will also disable all CloudWatch alarms, to prevent CloudWatch from closing down an instance when it's in use. When tests are finished, it removes these tags and re-enables the alarms.

TextTest currently assumes all your EC2 instances are in the default region, and that your AWS credentials are available. These can be configured via boto's configuration files, either in ~/.boto or in /etc/boto.cfg. They should contain something like:

[Credentials]
aws_access_key_id = ABCSDSBFDFDBFDFBF
aws_secret_access_key = sdgsdDda87/sdgsd76r7sdgdJSAH/SGSDjds7

[Boto]
ec2_region_name = eu-west-1
cloudwatch_region_endpoint = monitoring.eu-west-1.amazonaws.com
ec2_region_endpoint = ec2.eu-west-1.amazonaws.com
cloudwatch_region_name = eu-west-1

It will make use of the "remote_shell_program" setting (default "ssh") to log in to the EC2 instances. This obviously needs to be possible without typing in a password every time: it is suggested that you make use of an ssh-agent. To test if TextTest will be able to work correctly, set up your ssh-agent and then log in with

my_machine$ ssh -A ec2-user@<ip>
ec2-user@<ip>$ ssh my_machine

i.e. it must be possible to log in and to then log in in reverse, each time without a password of course.
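
The same round-trip check could be scripted, for example as part of an instance launching script. This is a minimal sketch, not something TextTest provides: the function names are hypothetical, and it assumes OpenSSH, whose "BatchMode=yes" option makes ssh fail rather than prompt for a password. It may also fail if a host key has not yet been accepted.

```python
import subprocess

def round_trip_command(instance_ip, master_host, user="ec2-user"):
    """Build an ssh command that logs in to the instance and back again."""
    # BatchMode=yes makes ssh fail instead of asking for a password
    no_prompt = ["-o", "BatchMode=yes"]
    return (["ssh", "-A"] + no_prompt + [user + "@" + instance_ip] +
            ["ssh"] + no_prompt + [master_host, "true"])

def can_round_trip(instance_ip, master_host):
    # Exit status 0 means both logins succeeded without any prompt
    return subprocess.call(round_trip_command(instance_ip, master_host)) == 0
```

Calling can_round_trip with the instance's address and "my_machine" then returns True only if both logins work without a password.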

It will also make use of the "remote_copy_program" setting (default "rsync") in order to synchronise files back and forth between the master machine and the EC2 instance. Where possible, test runs will push their own results back to the master rather than waiting for the master to pull them.

By default it will make use of any instances it can find in the default region. You will usually need to restrict this somehow. This is done by adding appropriate EC2 tags to the instances, which are then requested via "queue_system_resource" (see below).

In both of these cases, it will by default submit all tests to the grid engine/cloud. This can be overridden by using the "Use Grid" (or "Use Cloud") radio buttons on the static GUI's running tab.

By default there are just two options, "Always" and "Never". As there is usually a small time penalty for using the grid or the cloud, it can however be useful to configure it to only submit when more than a handful of tests are requested. To do this, you can e.g. set

queue_system_min_test_count:3

In this case there would be a third radio button option, "If enough tests", which is selected by default. The behaviour is then to run locally if only one or two tests are requested. The "-l" option on the command line also works here, and can also take a numerical argument, where "-l 0" corresponds to "Always", "-l 1" to "Never" and "-l 2" to "If enough tests".
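
The decision described above can be summarised in a few lines. This is an illustrative sketch only: the function and constant names are hypothetical, and TextTest's own implementation may differ in detail.

```python
# Values of the numerical argument to "-l"
ALWAYS, NEVER, IF_ENOUGH_TESTS = 0, 1, 2

def use_grid(mode, test_count, min_test_count):
    """Decide whether a run should be submitted to the grid or cloud."""
    if mode == ALWAYS:
        return True
    if mode == NEVER:
        return False
    # "If enough tests": submit only once the configured minimum is reached
    return test_count >= min_test_count

print(use_grid(IF_ENOUGH_TESTS, 2, 3))
print(use_grid(IF_ENOUGH_TESTS, 3, 3))
```

With queue_system_min_test_count set to 3, two tests run locally and three or more are submitted.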

Here is a sample screenshot, running TextTest's 1600 acceptance tests in the cloud, which normally takes about 90 seconds using 48 CPUs:

Some tests have finished and gone green, but others are still running and hence yellow. TextTest reports their state in the cloud ("RUN") followed by the cloud instance each is running on in brackets.

Internally, TextTest submits itself to the grid engine and runs a slave process remotely, which runs the test in question and communicates the result back to the master process via a socket. The master process will also poll the grid engine periodically (every 15 seconds) to find out what is happening to its tests, to be able to e.g. pick up internal grid engine states like suspension and also to be able to report if a job dies without reporting in (for example because of hardware problems or because Python cannot be found remotely, or because you forgot to install TextTest on your cloud instance!)

As this functionality works with a different configuration module, additional config file entries, running options and support scripts are available, over and above those provided by default:

Resources are used to specify properties of machines where you wish your job to run. For example, you might want to request a machine running a particular flavour of Linux, or you might want a machine with at least 2GB of memory. Such requests are implemented by resources in the grid engines (see their documentation for how SGE and LSF understand resources), and by tags in the cloud. TextTest refers to them internally as resources.

TextTest must choose which resources to request on the command line. The procedure here is to request all of the resources as specified below:

The format in all cases is e.g. "os=RHEL6", the resource name followed by "=" followed by the value. Wildcard expansion (modelled on UNIX pathname expansion) is also supported in each case.
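
To illustrate what wildcard expansion means for such a request, the sketch below matches requests like "os=RHEL*" against a dictionary of tags using Python's fnmatch module, which implements UNIX shell-style patterns. The function name and the tag dictionary are hypothetical, and applying the wildcard to the value is an assumption: this is not necessarily how TextTest or the grid engines perform the matching.

```python
from fnmatch import fnmatchcase

def matches_resource(request, tags):
    """Check a "name=value" request against a dict of tag names to values."""
    name, _, pattern = request.partition("=")
    # A missing tag never matches; otherwise compare using shell-style wildcards
    return name in tags and fnmatchcase(tags[name], pattern)

# Hypothetical tags on one instance
instance_tags = {"os": "RHEL6", "arch": "x86_64"}
print(matches_resource("os=RHEL6", instance_tags))
print(matches_resource("os=RHEL*", instance_tags))
print(matches_resource("os=Ubuntu*", instance_tags))
```

The first two requests match this instance and the third does not.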

When tests complete, TextTest will keep the remote test process alive and try to reuse it for a test with compatible resource requirements. This bypasses the time needed to submit tests to the queue system and wait for them to be scheduled, and reduces network traffic. This can improve throughput considerably, particularly where a large number of short tests need to be run.

By default, until all tests have been dispatched, TextTest will reuse remote jobs in this way, but will also continually submit new jobs at the same time, until it reaches what it regards as the capacity, given by the config file entry "queue_system_max_capacity". In a grid engine it's advisable to set this slightly higher than the maximum number of parallel processes you can reasonably expect from your grid. With a cloud it is a way of regulating cost, so that you don't use more instances than you are prepared to pay for. (Often you get diminishing returns from adding more and more instances to your parallel testing).

A grid engine will be configured to have a number of queues, which will hold jobs (tests in our case) until a CPU becomes available somewhere, and then dispatch them to that machine. These queues also handle job priorities: it is generally possible to set up several queues so that jobs from one will cause jobs from the other to be suspended. In the case of testing, it is often useful to prioritise jobs by how long they are expected to take, so that a one-hour test can be suspended to allow a five-second test to run. To find out more, read the documentation of the grid engine of your choice.

As far as TextTest is concerned, it must decide which queue to submit each test to. The procedure is as follows:

It is often useful to write a derived configuration to modify this logic, for example to introduce some mechanism to select queues based on expected time taken.

As the queuesystem configuration is often used for very large test suites, it will start to try and clean up temporary files before the GUI is closed. Otherwise closing the GUI can appear to take a very long time.

When using the cloud, test results will be pushed to the master process as soon as they are available, and will then be deleted on the EC2 instance. Succeeded tests will by default not be transferred at all. The only way to override this is to use the -keeptmp flag.

The default behaviour with a grid engine is to remove all test data and files belonging to successful tests remotely, i.e. as soon as they complete. This can be overridden by providing the "-keepslave" option on the command line, or the equivalent switch from the Running/Advanced tab in the static GUI, in case you want to examine the filtering of a successful test for example.

The automatic failure interpretation (or "known bugs") feature allows you to trigger reruns of tests when known issues occur. When running with the cloud or the grid, this means that the rerun may be triggered on a different instance somewhere else (which might be useful if the problem was with the instance for example).

There is a parameter to limit this, "queue_system_max_reruns". This prevents excessive amounts of rerunning that might result if this feature was used too much. By default only 100 reruns per test suite will be allowed, irrespective of how many tests are originally run.
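
The effect of a suite-wide limit can be pictured as a shared budget that every rerun request draws from. The class below is a hypothetical sketch of that idea, not TextTest's implementation.

```python
class RerunLimiter(object):
    """Hypothetical counter illustrating a rerun limit shared by a whole suite."""

    def __init__(self, max_reruns=100):
        self.max_reruns = max_reruns
        self.used = 0

    def request_rerun(self):
        # Allow the rerun only while the shared budget is not exhausted
        if self.used >= self.max_reruns:
            return False
        self.used += 1
        return True

limiter = RerunLimiter(max_reruns=2)
print([limiter.request_rerun() for _ in range(3)])
```

With a limit of 2, the first two requests are granted and the third is refused, regardless of which tests asked for them.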

The queuesystem configuration also provides some improvements to the default configuration's functionality for comparing system resource usage in tests. This is essentially in the area of the concept of a performance test machine. In the default configuration, tests are run locally, so all we can do is see if our current machine is enabled for performance testing. With a grid engine or a cloud at our disposal, we can actually request a performance machine for particular tests.

The simplest way to do this is to check the “run on performance machines only” box from the static GUI (“-perf” on the command line). That will make sure the grid engine requests that the test only run on such machines.

It is also possible to say that once tests take a certain amount of time they should always be run on performance machines only (it is assumed that the performance of the longest tests is generally the most interesting). This can be done via the setting "min_time_for_performance_force". The time measure used is the one indicated by the setting "default_performance_stem" (which, if not set, defaults to the total CPU time used).

There is also an additional mechanism for specifying the performance machines, which with SGE or the cloud has to be used instead. The config file setting 'performance_test_resource' allows you to identify your performance machines via a resource (see above), for example to say "test performance on all c3.2xlarge instances" in the cloud. This is generally easier than writing out a long list of machines, and is compulsory with SGE or EC2. With LSF, you can write out the machines as for the default configuration if you want to.

TextTest will make sure the program runs with the environment variables specified in your "environment files", but it does not forward environment variables set externally by default. Sometimes it's useful to be able to set something externally and have TextTest forward it to your grid engine of choice. In that case you can list such variables in "queue_system_environment", and TextTest will transfer whatever value they have in the master process's environment. i.e.

queue_system_environment:ENVVAR1
queue_system_environment:ENVVAR2
...

etc. This format is independent of which grid engine you are using, and is preferred to using queue_system_submit_args (see below).
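
Conceptually, the forwarding amounts to picking the listed variables out of the master process's environment. The sketch below illustrates this. The function name is hypothetical, and skipping variables that are not set in the master is an assumption rather than documented TextTest behaviour.

```python
import os

def forwarded_environment(names, environ=None):
    """Return the listed variables with the values they have in the master."""
    environ = os.environ if environ is None else environ
    # Assumption: variables that are not set in the master are skipped
    return dict((name, environ[name]) for name in names if name in environ)

master_env = {"ENVVAR1": "one", "HOME": "/home/me"}
print(forwarded_environment(["ENVVAR1", "ENVVAR2"], master_env))
```

Here only ENVVAR1 is forwarded: HOME is not listed, and ENVVAR2 is not set in the master.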

From TextTest 3.23 enabling self-diagnostics also in the slave process requires using a separate flag "-xs" (alternatively a separate checkbox in the UI). It is no longer automatically inferred when self-diagnostics are requested with "-x". These diagnostics will then be written to subdirectories of the location where the master writes its logs, named after the slave job names.

You can provide additional arguments on the command line to the grid engine submission program ("qsub" in SGE or "bsub" in LSF) by specifying the variable "queue_system_submit_args" in your config file(s). For example, to forward an environment variable "ENVVAR" using SGE, you can use

queue_system_submit_args:-v ENVVAR

Note that since TextTest 3.27 the recommended way to do this is to use queue_system_environment, above.

You can configure the TextTest program that is run by the slave process via the environment variable "TEXTTEST_SLAVE_CMD", which defaults to just running "texttest.py". The main point of this is if you need a startup script to find the right version of Python on the remote machine, for example, or if you want to plug in developer tools like profilers and coverage analysers. It is also used internally in the TextTest HTML reports to provide a correct command-line suggestion for starting TextTest.

You can also configure the amount of time to wait before the polling of the grid engine (described above) starts, via the variable TEXTTEST_QS_POLL_WAIT. By default it waits 5 seconds before starting. The interval between polls is controlled by the variable TEXTTEST_QS_POLL_SUBSEQUENT_WAIT, which defaults to 15 seconds currently. The granularity (how frequently to check for completion and/or exiting while waiting) can be configured by TEXTTEST_QS_POLL_INTERVAL, which defaults to 0.5 seconds. These options are mostly useful when testing and debugging.
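
The three polling settings and their defaults can be summarised as follows. This is a sketch under the assumption that they are read from the environment in the same way as TEXTTEST_SLAVE_CMD. The function and dictionary names are hypothetical.

```python
import os

# Defaults in seconds, as described above
POLL_DEFAULTS = {
    "TEXTTEST_QS_POLL_WAIT": 5.0,              # delay before the first poll
    "TEXTTEST_QS_POLL_SUBSEQUENT_WAIT": 15.0,  # interval between polls
    "TEXTTEST_QS_POLL_INTERVAL": 0.5,          # granularity while waiting
}

def poll_settings(environ=None):
    """Return the polling settings, falling back to the defaults."""
    environ = os.environ if environ is None else environ
    return dict((name, float(environ.get(name, default)))
                for name, default in POLL_DEFAULTS.items())

print(poll_settings({"TEXTTEST_QS_POLL_WAIT": "0"}))
```

Setting TEXTTEST_QS_POLL_WAIT to 0, as in the example, would start polling immediately while leaving the other two values at their defaults.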

The queuesystem configuration also provides an improvement to the batch mode functionality for unattended runs. This basically involves adding a horizon when all remaining tests are killed off and reported as unfinished. This will be done if it receives the signal SIGUSR2 on UNIX.

Both grid engines can be set up to send this signal at a particular time themselves, which involves submitting the TextTest batch run itself to the grid engine. In LSF, use “bsub -t 8:00” to send SIGUSR2 to the job at 8am the next morning. In SGE, use “qsub -notify”, and then call “qdel” for the job at the allotted time: this will also cause the signal to be sent.

Jobs in queue systems need to have some location as their current working directory, which is also where core files are written if the job receives one of the signals above. To avoid race conditions TextTest will by default use the location "$TEXTTEST_TMP/grid_core_files" for this purpose, which after the first run will always exist.

If you generate TEXTTEST_TMP automatically, e.g. under a Maven target directory, you may find this won't work. In that case this can be set to some other global location, using the config setting "queue_system_core_file_location". It is also probably a good idea then to periodically clean this location.

Both LSF and SGE have mechanisms to send signals to jobs when they exceed a certain time limit. It is possible to configure the queues such that they send SIGXCPU if more than a certain amount of CPU time has been consumed, or SIGUSR1/SIGUSR2 if too much wallclock time is consumed.

TextTest will assume this meaning for these signals and report accordingly. SIGXCPU is always assumed to mean a CPU limit has been reached. SIGUSR2 is interpreted to mean a kill notification in SGE and a maximum wallclock time in LSF, while SIGUSR1 is used the other way round in the two grid engines.
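
The interpretation described above can be written out as a small lookup. This is purely an illustration of the mapping: the function name and the descriptive strings are hypothetical and are not the text TextTest reports.

```python
def interpret_signal(signal_name, queue_system):
    """Map a signal name to its assumed meaning under "SGE" or "LSF"."""
    # SIGXCPU means the same thing in both grid engines
    if signal_name == "SIGXCPU":
        return "CPU limit reached"
    # SIGUSR1 and SIGUSR2 swap roles between the two grid engines
    meanings = {
        ("SIGUSR2", "SGE"): "kill notification",
        ("SIGUSR2", "LSF"): "wallclock limit reached",
        ("SIGUSR1", "SGE"): "wallclock limit reached",
        ("SIGUSR1", "LSF"): "kill notification",
    }
    return meanings.get((signal_name, queue_system), "unknown")

print(interpret_signal("SIGUSR1", "SGE"))
```

For SIGUSR1 under SGE this prints "wallclock limit reached".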

TextTest will also install signal handlers for these three signals to the SUT process, such that they will be ignored unless the SUT decides otherwise. This is mostly to prevent unnecessary core files from SIGXCPU and to allow TextTest to receive the signals itself and terminate the SUT when it's ready to.

To configure SGE to play nicely with this, it's useful to set a notify period of about 60 seconds when jobs are killed (TextTest submits all jobs with the -notify flag). By default, SIGUSR1 is also used as a suspension notification in SGE, which TextTest does not expect or handle. It's therefore important to disable the NOTIFY_SUSP parameter in SGE, otherwise tests will fail spuriously with RUNLIMIT whenever they would be suspended.

Both grid engines have functionality for testing systems which are themselves parallel, setting aside several CPUs for the same test. TextTest integrates with this functionality also.

This is basically done via the config file setting "queue_system_processes", which says how many CPUs will be needed for each test under that point in the test suite. In LSF this basically translates to the “-n” option to “bsub”. In SGE, you need to use an SGE parallel environment (read the SGE docs!), which is specified via the config file entry “parallel_environment_name”.

The performance machine functionality described above still works here. In this case TextTest will ask the queue system for all machines that have been used, and only if they are all performance machines will performance be compared.

Sometimes certain hosts are reserved as database hosts, while many more may be used to run tests. In this case it is useful to set up a "proxy" which can perform the database setup and then start the real test process also via the grid. This is done by setting the "queue_system_proxy_executable" setting to point out the script which can perform this setup. The machines where it may run can be identified via resources, using "queue_system_proxy_resource", which works in the same way as "queue_system_resource".

This proxy program will be given the command to use to run the real test via the environment variable "TEXTTEST_SUBMIT_COMMAND_ARGS", which will be in Python list format. It is therefore obviously easiest to write your proxy in Python. The basic plan is to do whatever needs doing, set up the database and then start the test as instructed by TextTest. For example:

#!/usr/bin/env python

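import ast, os, subprocess, sys

# Do whatever we need to do, setup database etc.
...

# The real test command arrives as a Python list literal, so parse it
# with ast.literal_eval rather than evaluating arbitrary code
commandArgs = ast.literal_eval(os.getenv("TEXTTEST_SUBMIT_COMMAND_ARGS"))
# Run the real test and pass its exit status on to the grid engine
sys.exit(subprocess.call(commandArgs))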