Active Automata learning for web applications

 

Pascal McKeon 13136348

We Will Write a Custom Essay Specifically
For You For Only $13.90/page!


order now

University of Limerick

 

Abstract

This paper
is a study of Active Automata Learning and its application to model both simple
and complex web applications in continuous integration environments. Active
Automata Learning uses multiple algorithms. This paper will discuss the
benefits and limitations of each while modelling web applications.

 

Introduction

Web applications in a continuous integration
environment require continuous testing to see that there are performing as
expected. Large amounts of test cases are needed to get satisfactory test
coverage but even then, certain aspects of the application are often
overlooked.  Active Automata Learning
models black box systems such as web applications to see how a system will
perform and can often show what was overlooked by testing. Automata Learning
consists of two phases exploration and testing. During exploration phases
membership queries are used to construct hypothesis models of a system under
learning. In testing phases equivalence queries are used to compare respective
hypothesis models to the actual system. The phases are repeated until an
accurate model of the system is produced.

 

Algorithms

Direct
Hypothesis Construction

Direct Hypothesis Construction (DHC) uses a breadth
first search of states of automaton which are created on the go. The algorithm
creates a queue of states to be explored. All explored states are removed from
the queue, while newly discovered states are added to the queue. The algorithm creates
a tree of explored states in a breadth first search manner (see image below).

 

All discovered states are regarded to be equal to all
previously discovered states until a complete hypothesis is formed. DHC
reconstructs the tree after each iteration. DHC has a small memory footprint
which is useful for modelling systems with a very large number of states. As
hypothesis automata are created on the go, it leads to many unnecessary
membership queries. These unnecessary membership queries are expensive on
resources and would lead to unnecessarily high web traffic while testing web
applications.

 

L* Algorithm

L* algorithm uses observation tables which represent
the outcome of membership queries. Rows are labelled by prefixes and columns by
suffixes. The table cells contain the result of the corresponding membership
query.

 

 

?

a

?

0

0

a

0

1

aa

1

0

b

0

0

ab

0

0

aaa

0

0

aab

0

0

 

 

 

 

 

L ? algorithm supports the incremental construction of
hypothesis models and thus creates less traffic than DHC.

Discrimination
Tree Algorithm

Discrimination Trees are redundancy-free in the sense
that only Membership Queries that contribute to the distinction of states must
be performed. The discrimination tree algorithm consists of main function
Discrimination Tree and helper functions Sift, Hypothesis and Update Tree.

 

 

 

 

The 0 values in ? column are
enough to distinguish b from other states so values can be removed from the
observation table.

Discrimination trees have exactly one distinguishing
suffix for each pair of states. The process by which a string is assigned to a
leaf is called sifting. Starting at the root node of the tree recursively
follow the right branch if the Deterministic Finite Automata(DFA) accepts the
concatenation with the string labelling the current internal node of the tree.
Otherwise follow the left branch. This procedure is iterated until a leaf is
reached. Sifting thus requires a number of membership queries bounded by the
height of the tree.

 

 

 

 

 

 

 

 

 

 

 

Static vs
Dynamic Pages

The above image is a simple model of a web application
using the L* Algorithm. After logging in you go to the main web application
page. As the web application uses dynamic pages, the page reloads with
different information instead of loading a new page. As you can see at state 1
both edit profile and select desired topic were successful but didn’t move to a
new state. The above image is easy to follow with little information but
modelling large web applications that use dynamic pages could lead to a lot of
information that can be hard to follow. Static web applications can be modelled
in a clear manner with multiple states compared to dynamic pages.

 

 

 

 

 

 

 

 

 

References

Introduction to Active Automata Learning from a
Practical Perspective – Bernhard Ste?en, Falk Howar, and Maik Merten

 

Automata Learning with on-the-Fly Direct Hypothesis
Construction- Maik Merten, Falk Howar, Bernhard Ste?en1, and Tiziana Margaria

 

http://www.eziobartocci.com/learnlib-tutorial.pdf