UNIVERSIDAD DE MURCIA


UNIVERSIDAD DE MURCIA FACULTAD DE INFORMÁTICA

Model-Driven Modernisation of Legacy Graphical User Interfaces Modernización Dirigida por Modelos de Interfaces Gráficas de Usuario

D. Óscar Sánchez Ramón 2014

Model-Driven Modernisation of Legacy Graphical User Interfaces

A dissertation presented by Óscar Sánchez Ramón and supervised by J. García Molina & J. Sánchez Cuadrado

In partial fulfillment of the requirements for the degree of Doctor of Philosophy in the subject of Computer Science

University of Murcia October

Model-Driven Modernisation of Legacy Graphical User Interfaces

Summary

Motivation

Many companies are currently migrating their legacy systems in order to adapt them to new software technologies that offer better characteristics (for example, easier maintenance or a better user experience). Graphical User Interfaces (GUIs) are an important element in these migrations, since they are the means by which users interact with the system. Moreover, the emergence in recent years of a wide variety of devices capable of running applications (tablets, smartphones, smart TVs, etc.) has made user interface design an even greater challenge.

A typical example of legacy systems are the applications built with RAD (Rapid Application Development) environments such as Oracle Forms and Microsoft Visual Basic, which were widely adopted in the nineties. Our work focuses on this kind of application, which we will refer to as RAD applications. These environments offered a GUI-centred programming paradigm that allowed windows to be created in a short time. However, RAD applications had two fundamental characteristics that reflect practices discouraged in software engineering.

The first characteristic is that the position of the controls (such as text boxes or labels) was expressed with coordinates. This is bad practice because changing the position of one control may require modifying the position of others.
In addition, interfaces expressed with coordinates are optimised for only one resolution and window size (they do not adapt to the window size), so they are not displayed properly when windows are resized or when the application runs on devices with screens of different dimensions. In contrast, layout managers such as FlowLayout or BorderLayout in Java Swing are used nowadays, which adapt the content to the dimensions of the window.

The second characteristic is that the code of the interface event handlers frequently mixes different concerns, ranging from architectural concerns such as business logic or presentation to functional concerns such as form validation or the navigation flow between the views of the application. Nowadays it is common to use development frameworks that promote separation of concerns, because this greatly eases the maintenance and extensibility of applications, in contrast to RAD applications, which were harder to maintain.

Regarding the first characteristic and how it is handled in a migration, there is a variety of works on GUI reverse engineering [ ] [ ]; however, in many of them the migration of windows is limited to detecting controls and translating them to the toolkit of the target technology. Of particular relevance are the works [ ] [ ] and [ ], which present three approaches that pay attention to the layout of views (windows, web pages, etc.) and extract a model representing that layout. The main drawback of these approaches is that they obtain a single representation of the layout (for example, using the GridBagLayout of Java Swing), so generating interfaces with other kinds of layout (for example, CSS floating layers) is not straightforward.

With respect to the second characteristic, several works focus on the analysis of GUI code [ ] [ ] [ ]. The vast majority concentrate on extracting the transitions between the different views of the application, usually represented with some kind of state machine, with the aim of using this information for program comprehension or for unit testing.

Objective

Our objective is to facilitate the migration of RAD applications by creating a migration framework for the GUIs of legacy RAD systems. The framework is mainly intended to infer the layout of the original application and to separate the concerns that are tangled in the event handlers.
The analysis of several applications built with RAD environments and the study of related work led to a set of requirements that guided the design of the solution:

(R1) Explicit information extraction. An explicit, high-level representation of the user interface information must be obtained.
(R2) Modularity. It is desirable to split the reengineering process into simpler stages to ease its maintenance.
(R3) Automation. The process must be automated as far as possible.
(R4) Source and target independence. It must be possible to extend the process and reuse it with different source/target technologies with relatively little effort.
(R5) Matching the visual and logical structure. The logical structure of views, that is, how controls are contained in one another, must match the structure a user perceives when looking at the view.
(R6) High-level representation. The layout of the view must be expressed with high-level constructs, such as the layout managers of Java Swing, which control the spatial arrangement of components in a window.
(R7) Tolerance to misaligned controls. The solution must handle controls that are slightly misaligned.
(R8) Alternative solutions. The same layout can be achieved with several different combinations of layout managers, and it would be desirable for developers to know these alternatives.
(R9) Configurable layout. The set of layout managers to be used must be parameterisable.
(R10) Code abstraction. The code must be abstracted to ease its analysis. Since event handler code follows a number of recurring patterns, it would be useful to detect those patterns in order to abstract the code.
(R11) Code categorisation. It must be possible to identify the different architectural concerns of the application code, that is, the code of the business logic, the controllers and the user interface.
(R12) Identification of interactions and navigation flows. The solution must also allow other concerns to be identified, such as the interactions between controls (for example, ticking a checkbox enables editing of a given text field) or the navigation flow between the views of the application.

Development of the framework architecture

Our framework has been built by applying Model-Driven Engineering (MDE), which is characterised by the use of models at several levels of abstraction to represent different aspects of the system, with the aim of automating the development process. In our case, MDE brings two main benefits to our solution: the representation of aspects of the legacy system by means of models and metamodels, and the automation of the process and modularity of the solution by means of transformation chains that include model-to-model, model-to-code and code-to-model transformations.

The model architecture we have designed includes two models that make the solution independent of the source technology (the normalised GUI model and the RAD behaviour model), and a set of Concrete User Interface (CUI) models that provide independence from the target technology. We have defined several CUI models, each dealing with a different aspect (here we are not referring to architectural concerns), which promotes separation of concerns. The CUI models implemented are:

• Structure model: shows the logical structure of the views, that is, the distinguishable parts of the views and the controls they contain.
• Layout model: represents the spatial arrangement of the controls in the view in terms of layout managers.
• Separation of concerns model: expresses the code of the event handlers by means of code patterns and tags that code with the architectural concern it belongs to (business logic, GUI or controller).
• Interaction model: expresses the dependencies between the controls of the interface, as well as the navigation flow between the different views of the application.

Layout inference

The inference of the layout of views consists of three phases: i) region extraction, ii) representation of relative spatial relationships, and iii) discovery of the high-level layout. Two versions of the layout inference process have been developed. The first version addressed the three phases mentioned, the last of which was implemented with a heuristic approach. In the second version, the algorithm of the third phase was replaced by an exploratory algorithm, more sophisticated than that of the first version, which also required changes to the second phase.

In RAD applications there may be simple controls (non-containers, such as buttons) that are not contained in container controls (for example, panels) but instead overlap them. The region extraction process (the first phase of layout inference) first makes this containment relationship between controls explicit. To do so, a region is created for each control, and for those controls that visually have a border and contain other simple controls, the regions of the latter are added to the region of the control that visually contains them. Second, region extraction prevents simple controls from sitting at the same level as container controls. To this end, it creates new regions that contain the regions of those non-container controls that are at the same level as container controls. At the end of this process the view is organised as a tree of regions, in which the logical structure matches the visual structure.

The second phase of the layout inference strategy is the representation of relative spatial relationships from the region information. In essence, it expresses the relationships between adjacent controls by means of a graph of relative positions, in which the vertices are the controls and the edges are the spatial relationships. The implementation of this graph has changed between the first and the second version of the process.
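As a minimal sketch of the first step of region extraction (making visual containment explicit), the following Python fragment nests each widget in the smallest bordered widget whose bounding box encloses it. The `Widget` and `Region` classes, their fields and the function names are assumptions made for illustration and are not taken from the thesis; the second step (grouping simple widgets that sit beside containers into new regions) is omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Widget:
    # Hypothetical widget record with absolute coordinates, as in a RAD form.
    name: str
    x: int
    y: int
    width: int
    height: int
    has_border: bool = False  # assumption: only bordered widgets act as visual containers


@dataclass
class Region:
    widget: Widget
    children: List["Region"] = field(default_factory=list)


def area(w: Widget) -> int:
    return w.width * w.height


def contains(outer: Widget, inner: Widget) -> bool:
    """True if the bounding box of inner lies entirely within that of outer."""
    return (outer.x <= inner.x
            and outer.y <= inner.y
            and inner.x + inner.width <= outer.x + outer.width
            and inner.y + inner.height <= outer.y + outer.height)


def build_region_tree(widgets: List[Widget]) -> List[Region]:
    """Create one region per widget and attach it to the smallest bordered
    widget that visually contains it. Returns the top-level regions."""
    regions = {w.name: Region(w) for w in widgets}
    roots: List[Region] = []
    for w in widgets:
        # Requiring a strictly larger area avoids two identical boxes
        # becoming each other's parent.
        candidates = [c for c in widgets
                      if c is not w and c.has_border
                      and area(c) > area(w) and contains(c, w)]
        if candidates:
            parent = min(candidates, key=area)
            regions[parent.name].children.append(regions[w.name])
        else:
            roots.append(regions[w.name])
    return roots
```

For instance, given a bordered frame at (10, 10) of size 200×100, a label at (20, 20) of size 50×15 and a button at (300, 20), the label ends up as a child of the frame region while the button stays at the top level.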
In the first version, the position between two controls is represented explicitly by means of the relations above, below, left and right, and a significant distance between controls is represented by special vertices called gaps. In the second version, the position between two controls is represented by two Allen intervals [ ], one for the X axis and one for the Y axis, and the distance between nodes is measured in discrete levels that are computed dynamically by applying clustering techniques.

The third phase obtains the design expressed as a composition of layout managers. In the first version, a heuristic algorithm based on pattern matching was implemented. A pattern was defined for each type of layout manager, together with a fitness function that, applied to a set of nodes of the relative position graph, returns the percentage of nodes matched by the pattern. It works as follows: for each relative position graph that comes from a container region, the fitness functions of all the layout managers are applied, and the pattern associated with the function that yields the highest value is applied. This algorithm has one particularly significant drawback: it cannot detect nested patterns, so views with a complex design will often not be recognised correctly.

The second version of the high-level discovery uses an exploratory algorithm based on pattern matching and rewriting of the relative position graph. Each layout manager has an associated pattern. The algorithm first generates all the possible sequences of layout managers and tries to reach a solution by applying each sequence. For each sequence, the patterns are applied to the graph in the order given by the sequence, so that when a pattern matches a subgraph, the subgraph is replaced by a single node. Pattern matching and graph rewriting continue until a single node remains, which means that a solution has been reached. If the graph has not changed after a suitable number of iterations, the search stops because no solution can be found with that sequence. Each solution obtained is evaluated by a fitness function that indicates how good it is. At the end of the process we have a model that gives a set of possible layouts for each container of the view and also indicates which layout is best according to the fitness function. The set of layout managers used in the solution is configurable, so it can be restricted or extended according to the characteristics of the target technology.

Development of the event handler analysis approach

We have developed a solution for separating the concerns that are mixed in event handlers.
Specifically, we address the separation of the architectural concerns of the application (business logic, controller and user interface), as well as the extraction of the interactions between controls and between the views of the GUI. To achieve this, we perform a code abstraction phase prior to the separation of concerns. The abstraction consists of representing the source code of the event handlers in terms of primitives that express code patterns common in RAD applications. For example, Oracle Forms uses the PL/SQL language to implement event handlers, and in this language cursors can be used for database access. We simplify the instructions that open and read an explicit cursor into a primitive that denotes a database read. Some of the primitives we have defined are: database read, write to a control, and invocation of a business logic function. Code expressed in this way is simpler to analyse than the source code.

The code represented by primitives is then analysed to separate the architectural concerns, yielding the separation of concerns model. To this end, the primitives are divided into basic blocks [ ] that are organised into a control flow graph. Each basic block is in turn divided into fragments, which are sets of related instructions that belong to the same concern (business logic, controller or GUI) and must therefore be migrated together. Fragments are obtained by analysing the type of the primitives and their input and output variables. Because primitives keep references to the original code, the fragment flow graph can be used to classify the original code and guide the migration to a layered architecture.

Primitives are also used to identify interactions between controls and between views. The control flow of the primitives is analysed recursively to extract: i) the controls that raise the events, ii) the conditions under which the events are fired, iii) the controls on which an effect is produced, and iv) the effect produced on them. For example, selecting a particular option in a drop-down list may enable a form that was not shown before. With this information a multi-level graph is built in which the vertices are the controls and the views, and the edges are the interactions between them. The graph is multi-level because a vertex representing a view in turn contains the graph formed by the controls that belong to it. This graph can be useful for documenting the system, generating artefacts that describe the navigation flow between views, or detecting asynchronous calls in a web environment with Ajax.

Evaluation

Both versions of the layout inference solution have been evaluated.
For the first version, the evaluation was carried out through a case study on the migration of two Oracle Forms applications to Java. The evaluation process basically consisted of generating the Java code automatically and analysing the resulting windows manually. In particular, we measured the percentage of distinguishable parts that had been laid out correctly, as well as the percentage of controls placed in the right position. For the positioning of parts, the success rates were [...] and [...] for the two applications, and the percentages of correct controls were [...] and [...] respectively. The case study revealed several limitations of the first version of the approach, most notably its inability to detect complex layouts (those that cannot be expressed with a single layout manager).

The second version was designed to overcome the limitations of the first. In this case, the approach was tested in a scenario other than migration, namely the generation of a new web interface from wireframes created with a tool for that purpose. The evaluation was carried out with ICT professionals, who followed this process: read a brief documentation of the proposed application, draw the GUI wireframes, generate the code automatically, analyse the results and fill in a questionnaire. [...] of the participants indicated that the views had been generated entirely or largely as they expected, [...] totally or partially agreed that the generated windows could be used in real applications, and [...] agreed that the tool is useful. Two characteristics of our solution had a negative impact on the result: i) the configuration of the algorithm parameters, which in some cases was vital to obtain the right result, and ii) the fitness function, which obtained good solutions in terms of the number of layout managers used but did not always obtain the best solution from a visual point of view. To compare the second version with the first, the new algorithm was evaluated with one of the applications of the Oracle Forms case study, obtaining [...] accuracy in the arrangement of parts and [...] in the positioning of controls. Applying the layout inference approach in two different scenarios serves to show that the solution is applicable in any case where an interface is available in which the controls are positioned with coordinates.
The evaluation of the separation of structural concerns in event handlers was carried out with a case study on the migration of an Oracle Forms application to a layered client-server architecture, in which the presentation layer was implemented in the browser and the business logic remained on the server and was exposed to the client through a REST service. This case study also allowed us to evaluate the code abstraction approach, in which [...] of the code was matched by one of the defined patterns, and the rate of code correctly transformed into primitives was [...]. The error rate of [...] was caused by certain elements of PL/SQL code that are not handled in the current implementation, such as exceptions, and by other Oracle Forms-specific functions that are not translated correctly. With respect to the separation of concerns, [...] of the code was classified correctly, which shows that it is highly dependent on the success of the code abstraction process.

Conclusions

The MDE architecture we have developed has allowed us to satisfy requirements R1, R2, R3 and R4. Specifically, the explicit representation of information (R1) has been achieved by means of metamodels, modularity (R2) and automation (R3) have been achieved by means of transformation chains, and source and target independence (R4) has been obtained thanks to the metamodels designed for that purpose.

The requirement of matching the logical and visual structure (R5) is covered by the region model. The high-level representation (R6) is achieved by the layout model. Tolerance to misaligned controls (R7), alternative solutions (R8) and the configurable layout requirement (R9) have been achieved by implementing a parameterisable inference algorithm. It is worth noting that we have found no works that propose a solution to requirements R[...] and R[...], since existing works present ad-hoc algorithms for generating layouts composed of one layout manager [ ] [ ] [ ].

Code abstraction (requirement R10) has been achieved by means of the abstract behaviour primitives model, code categorisation (R11) has been achieved through the code fragment flow graph (separation of concerns models), and the requirement of identifying interactions and navigation flows has been met by the interaction model. We have not found any related work that uses a similar representation to abstract code. With respect to requirement R[...], existing works separate the application into layers [ ], but they require assistance from the developer, whereas in our solution this process has been automated.

Contributions

The contributions of this thesis are essentially three. The first is a model architecture that can be used to migrate RAD applications.
This architecture has a number of characteristics (reusability, extensibility, maintainability) that are very useful for migration. In addition, as part of that architecture we highlight the design of the CUI model, which promotes separation of concerns in the development of a GUI. The second contribution is the layout inference strategy, of which two versions are proposed. The proposed approach infers several layout options based on a parameterisable set of layout managers, and it can be used not only in a migration scenario but also in forward engineering, such as the generation of code from GUI wireframes. The third contribution is the solution for analysing event handler code to separate the different concerns that are mixed in the code, both architectural ones and others such as the interactions between the controls of the view.

Acknowledgements

M[...]. I have always wondered how I would feel at this moment, which marks the end of a stage for me. It represents several years of work, a great deal of effort condensed into one document, and many people who in one way or another have supported me and helped me to keep going.

I want to start with a few words of thanks to my parents Juan and Soledad, who always made sure that I focused on my studies and never lacked anything. Thanks also to my siblings Juan Miguel and Marisol, who have always supported me and shown me that they are there, and to Laura, Dani, Álvaro and Héctor, who sweeten our family with their innocence and joy.

Jesús García Molina and Jesús Sánchez Cuadrado, my thesis supervisors and friends, have been key to successfully completing this odyssey. Jesús García welcomed me into his group back in [...], and taught me that models do not only parade on catwalks. Years later, Jesús Sánchez agreed to jump on the user interface bandwagon and joined Jesús García in guiding me through the tortuous and uncertain world of research. How many times must I have cursed RubyTL!... (and later praised it). I owe my training to both of them, and I am grateful for the effort and time they have invested in me.

Many colleagues who have left their mark have crossed my research path. I started working in the research group developing code wrappers with Javier Cánovas, who sat at the next desk and has listened to me and put up with me so many times. At that time the laboratory also included Jesús Sánchez, Fernando Molina, Francisco Javier Lucas, Joaquín Lasheras and Miguel Ángel Martínez, and later Espinazo, Javier Bermúdez, Jesús Perera and Juanma arrived. I remember the discovery of Musicovery, the kidnapping of the stuffed toy, the electrician scene, the Cuadrado scale, the JISBD in Gijón... Thank you all for the good times we had, in which you taught me another way of 'researching'.

I have fond memories of my stay in Belgium in [...]. Despite living through the cloudiest spring I had ever seen, my lab mates François Beauvens and Jérémie Melchior, who spent Mondays arguing about Madrid and Barça, made the cold mornings in Louvain-La-Neuve more bearable. There I also met Vivian, Ugo, Diana, Cinthya, Diogo, Sophie, Nesrine, Mathieu, Edu and many others who have shown me that I have friends spread around the world. I want to make a special mention of my supervisor in Belgium, Jean Vanderdonckt, who gave me the opportunity to do the stay while hardly knowing me at all.

Nor do I want to forget my friends Edu, Anabel, Pablo, Laura, Daniel, Carras, the peña La Jarra, the camp monitors, and the latest visitors to the laboratory, Saad, Manal and Sofia. They have endured my worries and concerns and have been, in one way or another, witnesses to my achievements and failures during the course of the doctorate. To each and every one of the people mentioned, thank you.

Model-Driven Modernisation of Legacy Graphical User Interfaces

Abstract

Businesses are increasingly modernising the legacy systems they developed with Rapid Application Development (RAD) environments, so that they can benefit from new platforms and technologies. As a part of these systems, Graphical User Interfaces (GUIs) are an important concern, since they are what users actually see and manipulate. When facing the modernisation of GUIs of applications developed with RAD environments, developers must deal with two non-trivial issues. The first issue is that the GUI layout is implicitly provided by the position of the GUI elements (i.e. coordinates). However, taking advantage of current features of GUI technologies often requires an explicit, high-level layout model. The second issue is that developers must deal with event handling code that typically mixes concerns such as GUI and business logic. In addition, tackling a manual migration of the GUI of a legacy system, i.e., re-programming the GUI, is time-consuming and costly for businesses. This thesis is intended to address these issues by means of an MDE architecture that automates the migration of the GUI of applications created with RAD environments. To deal with the first issue we propose an approach to discover the layout that is implicit in widget coordinates. The underlying idea is to move from a coordinate-based positioning system to a representation based on relative positions among widgets, and then use this representation to infer the layout in terms of layout managers. Two versions of this approach have been developed: a greedy solution and a more sophisticated solution based on an exploratory algorithm. To deal with the second issue we have devised a reverse engineering approach to analyse event handlers of RAD-based applications. In our solution, event handling code is transformed into an intermediate representation that captures the high-level behaviour of the code.
This representation facilitates separation of concerns. In particular, it has allowed us to separate the architectural concerns in the original code and to identify interactions among widgets. All the models generated in the reverse engineering process have been integrated into a Concrete User Interface (CUI) model that represents the different aspects that are embraced by a GUI.

The two layout inference proposals and the event handler analysis have been tested with real applications that were developed in Oracle Forms. The exploratory version of the layout inference approach was in addition tested with wireframes, which provide a different context in which layout inference is also useful.
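The move from coordinates to relative positions described in this abstract can be illustrated with Allen's interval relations, which the summary reports using in the second version of the approach (one relation for the X axis and one for the Y axis). The following Python fragment is only a sketch under stated assumptions: the `Box` tuple layout, the function names and the relation labels are hypothetical and do not reproduce the thesis implementation, and intervals are assumed to satisfy start < end.

```python
from typing import Tuple

Interval = Tuple[int, int]        # (start, end), assumed start < end
Box = Tuple[int, int, int, int]   # hypothetical widget box: (x, y, width, height)


def allen_relation(a: Interval, b: Interval) -> str:
    """Return which of Allen's 13 interval relations holds between a and b."""
    a_start, a_end = a
    b_start, b_end = b
    # Disjoint or touching intervals.
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met-by"
    # From here on the intervals share more than one point.
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    return "overlaps" if a_start < b_start else "overlapped-by"


def relative_position(a: Box, b: Box) -> Tuple[str, str]:
    """Describe where box a lies relative to box b as a pair of Allen
    relations: one for the X axis and one for the Y axis."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return (allen_relation((ax, ax + aw), (bx, bx + bw)),
            allen_relation((ay, ay + ah), (by, by + bh)))
```

For a label at (10, 10) of size 50×20 and a text box at (70, 10) of size 100×20, `relative_position` gives `("before", "equals")`: the label lies to the left of the text box and both span the same vertical extent. A description of this kind no longer depends on absolute coordinates.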


[...] You push at the boundary for a few years. Until one day, the boundary gives way. And, that dent you’ve made is called a Ph.D.

Of course, the world looks different to you now:

So, don’t forget the bigger picture.

Keep pushing.¹

¹http://matt.might.net/articles/phd-school-in-pictures/

Contents

INTRODUCTION
    Motivation
    Problem statement
    Development
    Outline

BACKGROUND
    Software modernisation
    Graphical User Interfaces (GUI)
        Visual GUI features
        Legacy GUI features
        Use scenarios of GUI reverse engineering
    Model Driven Engineering (MDE)
        Metamodelling
        Domain-Specific Languages (DSLs)
        Model transformations
        Model-Driven Modernisation (MDM)

STATE OF THE ART
    Analysis of layout recognition approaches
        Lutteroth
        Rivero et al.
        Sinha and Karim
        Other approaches
        Discussion
    Analysis of behaviour extraction approaches
        Memon (GUIRipping)
        Heckel et al.
        Morgado et al. (ReGUI)
        Other approaches
        Discussion
    GUI representation approaches
        Knowledge Discovery Metamodel (KDM)
        Interaction Flow Modeling Language (IFML)
        Cameleon framework
        User Interface Description Languages (UIDLs)
            UsiXML
            Maria
            XAML
        Discussion

OVERVIEW
    Goal
    Architecture of the solution
        The Concrete User Interface model
        Overview of the migration architecture
        Requirement implementation

LAYOUT INFERENCE: GREEDY APPROACH
    MDE architecture for layout inference
    Reverse engineering metamodels
    Challenges in layout reverse engineering
    Detecting regions and containers
    Uncovering relative positions
    High-level layout
    Detailed example
        Injection of Forms models
        Mapping Oracle Forms to RAD models
        Identification of the regions
        Recovering the low-level layout
        Recovery of the high level layout
        Generation of Java Swing code
    Case study: from Oracle Forms to Java
        Methodology
        Evaluation results
        Limitations of the approach
    Implementation
        Injection
        Mapping Oracle Forms to Normalised models
        Reverse engineering
        Forward engineering
    Conclusions

LAYOUT INFERENCE REVISITED: EXPLORATORY APPROACH
    MDE architecture for layout inference (revisited)
    Reverse engineering metamodels
        Structure metamodel
        Layout metamodel
    Changing the positioning system
        Creating the view graph
        Representing widget relative positions
        Representing widget distances
        Tile model example
    Inferring a high-level layout
        The layout patterns
        Layout inference algorithm
        Layout inference example
        Performance evaluation
    Case study: from Wireframes to fluid web interfaces
        Context of the case study
        Evaluation of the approach
            Methodology
            Quantitative results
            User assessment
            Approach limitations
    Implementation
        Mapping WireframeSketcher to Normalised models
        Mapping Normalised models to Structure models
        Generation of the web interface
        The tool
    Comparison of the greedy and exploratory approaches
    Conclusions

EVENT HANDLER ANALYSIS
    Architecture for analysing events
    Running example
    Representing event handling code
        Metamodel description
        Deriving a RADBehaviour model
        Example
    Separating concerns
        Metamodel description
        Fragment identification
            Creating a control flow graph of fragments
            Giving a descriptive name to the fragments
            Setting dependencies among fragments
    Generating layered code
    Capturing dependencies among the GUI elements
        Metamodel description
        From RADBehaviour to the Interaction model
        Example
    Evaluation of the approach
        Evaluation of the code abstraction
        Evaluation of the separation of concerns
    Conclusions

CONCLUSIONS
    Discussion
        Goal 1: Architecture for migrating legacy GUIs
        Goal 2: Analysis of GUI definitions for migration
        Goal 3: Analysis of the code of event handlers for migration
    Contributions
        First contribution: MDE-based migration architecture
        Second contribution: Layout inference approach
        Third contribution: Event handler analysis approach
    Future work
        CUI metamodel
        Region identification
        High-level layout inference
        Event handler code abstraction
        Identification of widget dependencies
    Publications related to the thesis
        Journals with impact factor
        Renowned international conferences
        Other journals
        Other international and national conferences and workshops
    Other publications in the MDE area
        Journals with impact factor
        International conferences and workshops
    Projects that are related to this thesis
    Contracts supporting this thesis
    Research stays
    Transfer of technology

REFERENCES

Listing of figures

Tag cloud of the elements blended in a legacy GUI.
The Horseshoe model
Example view for entering personal information. Widgets are placed with explicit coordinates.
An excerpt of the GUI tree for the window in Figure .
Login window created with WireframeSketcher.
(a) Fragment of the original GUI tree. (b) The expected GUI tree.
A calendar component emulated by a grid of buttons.
Example of mixing of concerns in an Oracle Forms application
Fragment of a Delphi event handler that checks if a task is active before deleting it.
MDE applied to reengineering
Schema of the Rivero et al. approach
Sinha and Karim approach
Approach of Heckel et al.
Approach of Morgado et al. (ReGUI)
KDM layers and packages
KDM metamodel. UI package (UIResources)
KDM metamodel. UI package (UIRelations)
KDM metamodel. UI package (UIActions)
Example of user interface (left) and corresponding IFML model (right)
Cameleon framework
Abstraction, reification and translation in the Cameleon framework
UsiXML models conforming to Cameleon
Concrete User Interface models in our solution
Architecture of the solution (GUIZMO framework)
Part of the architecture explained in this chapter.
Model-based architecture used to migrate legacy GUIs.
Excerpt of the Normalised metamodel.
Simplified CUI metamodel.
Example view for entering personal information. (Same window as Figure . ).
Region metamodel.
Left: example window for the region detection. Right: the logical structure of the widgets.
Structure of the regions after step for the example in Figure .
Case A. Left: example window with a base region R . Right: a new extra region R created to contain CloseWindowButton.
Case B. Left: example window with a base region R and an extra region R . Right: the base region R is augmented to include SearchButton completely and the extra region R is diminished.
Case C. Left: example window with a base region R and an extra region R . Right: a new extra region R is created to contain NextButton, and the region R is diminished.
Tile metamodel.
Adjacency example
Horizontal intersection value example
Example window
Excerpt of the RAD Model for the example window in Figure .
Some regions identified for the example window in Figure .
Excerpt of the Region Model for the example window in Figure .
Representation of the tiles in the upper part of the window
Excerpt of the Tile Model for the example window in Figure .
Representation of the tiles in the lower part of the window
Properties of the lower-left tile of buttons
Excerpt of the CUI Model for the example window split into two parts
The example window shown in Figure . migrated to Java Swing
Scatter plot that represents the accuracy of part detection for the case study A.
Scatter plot that represents the accuracy of part detection for the case study B.
Scatter plot that represents the accuracy of widget placement for the case study A.
Scatter plot that represents the accuracy of widget placement for the case study B.
Missing part identification problem
Non-regular layout detection problem
Model-based architecture used to migrate legacy GUIs.
Excerpt of the Oracle Forms metamodel.
Model-based architecture used to migrate legacy GUIs.
Steps to explicitly infer the layout information.
Relation between the CUI and the Structure and Layout metamodels.
Structure metamodel.
Layout metamodel.
Tile metamodel (new version)
Allen intervals
Allen interval example for a pair of widgets
Problem when setting fixed limits for the closeness levels.
Closeness assignment example. (a) Widgets and distances between them. (b) Result graph.
Login window created with WireframeSketcher.
Graph representation of the login window example.
Pattern matching example on four widgets
Border layout supported patterns.
Examples of widgets that do not match any pattern
Example of non-valid match for the Vertical Flow Layout pattern.
Example of match split for the Vertical Flow Layout pattern.
Inference example. Permutation {HFlow, VFlow, Form} applied to the graph in Figure .
Inference example. Permutation {VFlow, HFlow, Form} applied to the graph in Figure .
Alignment columns for the Login window.
Execution time for widgets in a single container.
Execution time for widgets arranged in containers (a container every widgets).
Are the generated views as I expected?
Are the margins, gaps and alignment correct?
When resizing the windows, are the widgets resized appropriately?
Could the generated windows be used in a real application?
Is the layout inference tool useful?
Example of the closeness problem.
Example window.
Example horizontal-vertical flow.
Example horizontal-vertical flow resized.
Example vertical-horizontal flow.
Example vertical-horizontal flow resized.
Parts of the MDE architecture related to the Wireframes to ZK case study.
Excerpt of the WireframeSketcher metamodel.
The login window generated in ZK.
Layout inference parameters.
Example of an Oracle Forms window.
Generated window by the first approach for the Oracle Forms window.
Generated window by the second approach for the Oracle Forms window.
Part of the GUIZMO architecture explained in this chapter
Model-based architecture for reengineering RAD-based applications. Solid lines mean transformations and dashed lines are model dependencies.
Grants example
PL/SQL trigger for the checkbox change event
Excerpt of the RADBehaviour metamodel.
PL/SQL to RADBehaviour mappings
RADBehaviour example for the checkbox event
Excerpt of the EventConcerns metamodel
EventConcerns model derived from the model in Figure . Labels A, B, C, D are used to show the primitives that originate the basic blocks.
Fragment identification example
Horseshoe model applied to the separation of concerns
Interaction metamodel
Interaction model for the event handlers of the window shown in Figure

List of Tables

GUI features of three different RAD environments
Summary of layout inference approaches
Summary of the behaviour extraction approaches (PC stands for Program Comprehension)
Relationships between the requirements and the discussion of the state of the art.
Requirements that cover bad practices in RAD environments.
Implementation of the requirements
Evaluation results for the case study A.
Evaluation results for the case study B.
Forms to Normalised mappings.
Classification of the approach of this chapter
Evaluation results.
Evaluation results for the case study A.
Classification of the approach of this chapter
RAD primitives
RADBehaviour evaluation
EventConcerns evaluation
Classification of the approach of this chapter
Fulfilment of the requirements of goal G1
Fulfilment of the requirements of goal G2
Fulfilment of the requirements of goal G3

Straight ahead of him, nobody can go very far...
Antoine de Saint-Exupéry, The Little Prince
(Suggested by Daniel Medina)

1

Introduction

Graphical User Interfaces (GUIs) represent a crucial part of software systems as they are what users actually see and manipulate to interact with them. Therefore, the design and implementation of GUIs must not be neglected, and developers typically devote great effort to building application GUIs. Lately there has been a significant growth in the types of devices that can run applications (smartphones, tablets, televisions, and so forth), each one having different screen sizes, resolutions and even interaction modalities (e.g., tactile screens). GUI technologies have also evolved to offer new possibilities that improve the user experience, particularly in the web setting with the emergence of HTML5 and Ajax (Asynchronous JavaScript And XML). This variety of devices and sophistication in technologies has made GUI design more challenging than ever. In fact, companies spend a large share of their budget creating interfaces that must be functional, appealing and, at the same time, usable, because they are aware that this is key to succeeding in their business. The challenge of creating quality GUIs does not only concern the development of new applications; it is also faced at present by companies that are migrating their legacy applications to modern technologies because these offer a better user experience.

Software modernisation refers to understanding and evolving existing software assets to maintain their business value. A legacy system is modernised when maintenance is not enough to achieve the desired improvements (e.g., new capabilities or greater maintainability) and the system must be extensively changed. Software migration is a form of modernisation that involves moving an application, as a whole or in part, from the platform on which it is currently operating to a target platform that provides better features. A migration can be done in a disciplined way by applying a software reengineering process that consists of three stages: reverse engineering the legacy system to obtain a representation of the system at a higher abstraction level, restructuring these representations according to the new architecture, and finally creating the code of the new system from the restructured information [ ] [ ]. Reverse engineering techniques are therefore essential to understand the system and obtain representations at a high level of abstraction when a reengineering process is applied.

GUI migration has typically been regarded as a straightforward research topic, in which the only concern is to establish mappings between widgets of the source and target technologies. However, dealing with current technologies and devices requires a thorough analysis of the user interface so that it can be suitably reengineered. This analysis affects both the structural and behavioural aspects of a GUI, and sophisticated reverse engineering algorithms must be designed to cope with it.

Model Driven Software Engineering (MDSE or simply MDE) has emerged as a new area of software engineering that emphasizes the systematic use of models in the software lifecycle in order to improve productivity and software quality aspects such as maintainability and interoperability. MDE techniques, e.g. metamodeling and model transformations, allow tackling the complexity of software by raising its abstraction and automation levels [ ]. These techniques are useful not only for developing new software applications [ ] [ ] but also for reengineering legacy systems [ ] [ ] and dynamically configuring running systems [ ]. In recent years, MDE techniques have been applied to a variety of modernisation scenarios, especially in the migration of applications [ ] [ ], and some MDE tools have been created [ ] [ ] [ ]. A notable effort is the Architecture-Driven Modernization (ADM) initiative [ ], which was launched in 2003 and is targeted at offering a set of standard metamodels for representing information that is frequently involved in modernisation. Although MDE is increasingly gaining acceptance in the software community [ ], “the adoption of this approach has been surprisingly slow” [ ] and there is still a need for successful experiences of using MDE in real projects.
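As a rough illustration of the widget-mapping view of GUI migration mentioned above, the following Python sketch expresses it as a one-to-one model-to-model transformation. The classes `LegacyWidget` and `TargetWidget`, the `TYPE_MAPPING` table and the widget type names are hypothetical and are not taken from this thesis; they merely play the role of very small source and target metamodels.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class LegacyWidget:
    """Element of a hypothetical source model: a widget placed with coordinates."""
    kind: str   # illustrative type names such as "TextItem" or "PushButton"
    name: str
    x: int
    y: int


@dataclass
class TargetWidget:
    """Element of a hypothetical target model for a modern toolkit."""
    kind: str
    name: str


# Hypothetical mapping between source and target widget types.
TYPE_MAPPING = {
    "Prompt": "Label",
    "TextItem": "TextField",
    "PushButton": "Button",
}


def transform(source: list[LegacyWidget]) -> list[TargetWidget]:
    """Map every source widget to a target widget using only its type."""
    return [TargetWidget(TYPE_MAPPING[w.kind], w.name) for w in source]


legacy_view = [
    LegacyWidget("Prompt", "nameLabel", 10, 10),
    LegacyWidget("TextItem", "nameField", 80, 10),
    LegacyWidget("PushButton", "saveButton", 80, 40),
]
migrated_view = transform(legacy_view)
```

Such a mapping simply discards the coordinates of the source widgets, so nothing about the layout reaches the target model. This is the kind of information that a more thorough analysis of the user interface needs to recover.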

The purpose of this thesis is to bring together the fields of Reengineering, Reverse Engineering, Model Driven Engineering and Graphical User Interfaces (GUIs) in order to create a solution for migrating the GUIs of legacy systems to modern frameworks and technologies. In particular, we have designed and implemented a solution for migrating applications created with Rapid Application Development (RAD) environments, but the proposed approach is applicable to other legacy systems that share the requirements we have considered for RAD-based applications.

The rest of this chapter is organised as follows: first, the motivation of the work is presented; then, the goals of this thesis are outlined; afterwards, the development of the solution is explained and the main contributions of the thesis are enumerated; finally, the contents of the rest of this manuscript are summarised.

1.1 Motivation

Most information systems dating from the 90's were built using RAD environments. The RAD paradigm appeared in the early ’s as a response to the non-agile development processes that existed [ ], and a number of Integrated Development Environments (IDEs) supporting fourth generation languages (4GLs) for the RAD paradigm also appeared. Oracle Forms, Visual Basic and Delphi are well-known examples of RAD environments. These IDEs provided a programming paradigm centred on the application GUI, allowing developers to create initial prototypes rapidly and reducing development time by facilitating GUI design and coupling data access to graphical components. However, this gain in productivity was achieved at the expense of software quality. Next, we discuss two features of applications created with a RAD environment (hereinafter referred to as RAD applications) that negatively affect software quality: the use of coordinates and the tangling of concerns.

In RAD environments, the position of widgets was expressed in terms of absolute or relative coordinates (normally pixels), so the windows created with them were optimised for just one size. This is a bad practice because such interfaces are difficult to maintain. Consider a GUI defined by coordinates and a change consisting of adding a new widget: that change may force developers to shift the coordinates of other widgets. Furthermore, designing user interfaces for a fixed resolution and screen format is no longer admissible. With the popularisation of smartphones and tablets, there has been an explosive growth of devices that can run graphical applications (either natively or by means of a web browser). Therefore, applications can be executed on a variety of devices with different features such as screen size, computing capacity or modality (e.g. tactile or voice) that produce different user experiences. Developers now have to meet the challenge of implementing GUIs that can be accessed via different devices with different screen features. As a result, in the last few years, flexible interfaces (non-fixed layouts) have gained in popularity, because designing different interfaces for the same application but targeted at different devices is impractical. Layout managers came up in the late nineties to overcome the weaknesses of coordinate-based GUIs by offering a mechanism to place widgets in such a way that they adapt to their container elements.

On the other hand, in RAD environments, event handlers (which were sometimes included in the same file as the GUI definition) usually contained code belonging to several aspects of the application. For example, an event handler could validate a form and, if the validation succeeded, perform some calculations by applying some business rules and finally write the calculated data to a database by itself. This tangling of aspects is nowadays considered a bad practice since it has a negative impact on software maintenance and reuse. Moreover, RAD developers often implemented event handlers attached to widgets that accessed the database and at the same time manipulated the GUI. This makes migration difficult, in particular to web platforms, since database code cannot be executed on the client side.

RAD environments have been used to develop a great number of desktop applications as part of information systems, many of which are still in production. However, the evolution of these applications is hindered in the long term because of the two aforementioned issues: fixed GUIs (non-adaptable GUIs) and tangling of aspects in the GUI code. This has motivated a large number of businesses to manually migrate their RAD legacy systems to new platforms (typically Web platforms), which better meet their needs of extensibility, maintainability or distribution, among others. Another reason for this migration is that some vendors are increasingly ceasing support in favour of other platforms. As pointed out in [ ], migrating a legacy business application to a new technology requires tackling three main aspects: data access, business logic and graphical user interface (GUI). Besides, migration would be facilitated by tools that help to discover architectural concerns that are only implicit (and mixed together) in the source code, such as database access, navigation flow, validation or exception handling. Figure 1.1 shows many of the aspects that are tangled in a legacy GUI.
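The tangling of concerns described above can be illustrated with a small Python sketch. It is not code from any RAD application analysed in this thesis: the form dictionary, the `FakeDatabase` class, the unit price and every function name are assumptions made up for the example. The first handler mixes validation, a business rule, data access and GUI updates, whereas the second one delegates each concern to its own function.

```python
class FakeDatabase:
    """In-memory stand-in for a database (hypothetical, for illustration only)."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)


def on_save_clicked_tangled(form, db):
    """Tangled handler: every concern lives in the same block of code."""
    if not form["quantity"].isdigit():                # validation
        form["status"] = "Quantity must be a number"  # GUI manipulation
        return
    total = int(form["quantity"]) * 9.5               # business rule (made-up price)
    db.insert({"total": total})                       # data access
    form["status"] = f"Saved, total = {total}"        # GUI manipulation


def validate(form):
    """Validation concern."""
    return form["quantity"].isdigit()


def compute_total(quantity, unit_price=9.5):
    """Business logic concern (the unit price is a made-up value)."""
    return quantity * unit_price


def store_total(db, total):
    """Data access concern."""
    db.insert({"total": total})


def on_save_clicked_layered(form, db):
    """Handler that only coordinates the concerns and updates the GUI."""
    if not validate(form):
        form["status"] = "Quantity must be a number"
        return
    total = compute_total(int(form["quantity"]))
    store_total(db, total)
    form["status"] = f"Saved, total = {total}"
```

Both handlers behave identically from the user's point of view, but in the layered version the database code can be moved to a server tier without touching validation or GUI updates, which is what a migration to a web platform typically requires.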

Figure 1.1: Tag cloud of the elements blended in a legacy GUI.

To our knowledge, just a few works have dealt with the migration of RAD-based legacy systems [ , ], and they regard GUI migration as a straightforward task that is addressed by mapping GUI components between the source and target views. However, dealing with current technologies and devices requires a thorough analysis of the user interface so that it can be suitably reengineered. Notably, there are two main types of artefacts involved in a GUI migration: GUI definitions and event handlers. Throughout this document, we will use the term GUI definition to refer to the software artefact or set of artefacts that describe the widgets that compose the view, their location and their graphical properties, and which are normally generated by a GUI builder.

Reverse engineering the layout of the user interface (i.e. obtaining an explicit model from the spatial relationships among widgets) is crucial to migrate the GUI of a RAD application to modern GUI toolkits. However, works about the migration of RAD applications reveal that layout inference is often neglected. In fact, just a few works have reported a restructuring of coordinate-based GUIs into views where the layout is managed by the toolkit [ ] [ ] [ ]. In contrast, there is a variety of works coping with static or dynamic analysis of event handlers in order to obtain a state machine or a similar representation of the flow of windows and events, which is mostly used for testing or program comprehension purposes [ ] [ ] [ ] [ ]. Nevertheless, we have not found reverse engineering literature dealing with the comprehension and automated migration of event handlers in the context of RAD applications.
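To give an idea of what obtaining an explicit model from the spatial relationships among widgets can mean, the following Python sketch groups coordinate-based widgets into rows. It is a deliberately naive heuristic written for this illustration only and is not the layout inference algorithm proposed in this thesis; the `Widget` class, the function names and the sample coordinates are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Widget:
    """A widget placed with explicit coordinates (hypothetical structure)."""
    name: str
    x: int
    y: int
    width: int
    height: int


def overlaps_vertically(a, b):
    """Return True when the vertical extents of two widgets intersect."""
    return a.y < b.y + b.height and b.y < a.y + a.height


def infer_rows(widgets):
    """Group widgets into rows (top to bottom), each ordered left to right.

    A widget joins the current row when it overlaps vertically with the
    first widget of that row; otherwise it starts a new row.
    """
    rows = []
    for w in sorted(widgets, key=lambda w: (w.y, w.x)):
        if rows and overlaps_vertically(rows[-1][0], w):
            rows[-1].append(w)
        else:
            rows.append([w])
    return [[w.name for w in sorted(row, key=lambda w: w.x)] for row in rows]


view = [
    Widget("nameField", 80, 10, 120, 20),
    Widget("nameLabel", 10, 12, 60, 16),
    Widget("ageLabel", 10, 42, 60, 16),
    Widget("ageField", 80, 40, 40, 20),
    Widget("okButton", 80, 80, 60, 24),
]
rows = infer_rows(view)
# rows == [["nameLabel", "nameField"], ["ageLabel", "ageField"], ["okButton"]]
```

The resulting nested list no longer mentions pixels: it states that the view is a vertical sequence of rows, which could be mapped to a layout manager of a modern toolkit. A realistic approach must additionally deal with containers, alignment, gaps and irregular arrangements.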

.

P

The hypothesis we intend to demonstrate in this thesis is the following: We claim that the migration of a legacy GUI should consider the recognition of the graphical structures that compose the layout of the original application, and should also separate the concerns that are blended into event handlers. Then, it is possible to develop algorithms and techniques to uncover the GUI layout and disentangle the application concerns. Furthermore, we believe that MDE is a paradigm that facilitates the achievement of this goal since it has some features, namely metamodelling and model transformations, which ease the development of an automated solution.

When we refer to legacy systems throughout the thesis, we specifically mean the applications created mostly during the 1990s with the aid of RAD environments and Fourth Generation Languages (4GLs), such as Oracle Forms, Delphi, or Visual Basic. The term legacy system embraces many more platforms than RAD environments; however, we restrict the term to this context, for which we have analysed and tested some applications. Nevertheless, our proposal may be used in other scenarios; for instance, the layout inference process can be applied to the generation of final GUIs from mockups, as we will see in Chapter .

Three high-level goals are derived from the previous statement, namely:

(G1) Design an MDE architecture for migrating legacy GUIs. We need to create a solution to tackle the migration of legacy systems to modern technologies, which will be built on MDE because it provides the foundations to explicitly represent the information extracted (by means of models), and to automate the generation of these models (by means of model transformation chains). There will be models that represent the information of every GUI aspect considered, such as the layout. Each of these models will be described by a metamodel, and the mappings between two related models will be defined by a model transformation. The construction of a full-fledged MDE solution will involve the use of several MDE tools such as model injectors (for transforming text into models), model transformation languages (to define mappings between models) and template languages (to generate code from models).

(G2) Separate and make explicit the information of GUI definitions. It is important to separate and make explicit the information contained in the GUI definitions (the definition of the views). Specifically, the logical structure of the views, the style and the layout should be separated. In a legacy application the layout is implicitly expressed in coordinates. Representing that layout by means of high-level elements such as layout managers is a particularly challenging problem that must be addressed to achieve a good-quality migration. Therefore, data structures (metamodels) for representing GUIs and layout inference algorithms must be part of the solution.

(G3) Separate and make explicit the information of event handlers. As noted above, the code of the event handlers usually tangles several concerns, such as view manipulation (e.g., enabling/disabling form fields), navigation to other views and form validation. It would be desirable to decouple them all to promote the evolution of the system. To this end, considering the distinctive nature of RAD applications may lead to better results than addressing the problem from a general point of view. Performing some kind of preprocessing of the code of the event handlers before dealing with the separation of concerns will also be helpful, for example, obtaining a representation that summarises the meaning of snippets of code and extracts the variables of each type.

.

D

In the late nineties many companies began to migrate their Oracle Forms applications to modern platforms like JavaEE or .NET. The ModelUM group started a research project to investigate to what extent an MDE-based solution could automate such migrations [ ]. This pilot project was carried out in collaboration with the Sinergia IT software company, and the main aim was to develop a framework for automating the migration of the GUI and the data access layers. Some research problems related to the GUI migration arose at the early stages of the project, which settled the objectives of this thesis.

First, we performed a literature review to learn the state of the art in reverse engineering and reengineering of the GUI of legacy systems. We inspected applications in Oracle Forms, Delphi and Visual Basic to learn the features that typify RAD environments, and we also analysed several User Interface Description Languages (UIDL) and Concrete User Interface (CUI) models, with the intention of using a technology-independent representation of the GUI. These languages and models represented the layout in a simple way, and they were not focused on separation of concerns, so we finally decided to create our own CUI representation and we defined a first version of our architecture.

The inspection of views of different RAD applications revealed that widgets were always placed by means of coordinates expressed in pixels or other fixed units, whereas modern technologies discourage fixed measures and advise some sort of layout management system. We then developed a reverse engineering approach to deal with the inference of the layout, which was presented in the ASE conference [ ]. An extension of this contribution, which described the approach in detail and the validation accomplished, was published in the ASE journal [ ]. Before developing the approach, we reviewed the literature in search of works that performed some sort of layout inference from coordinates, and we found only one relevant approach [ ], which was not easily extended to different layout management systems. Our solution was tested with two real case studies with positive results.

Then we moved to the analysis of event handlers. Since the analysis of code is a totally different area, we performed a new literature review and learned the foundations from the existing approaches. We became aware of the specific features of the event handlers of RAD applications and decided to take advantage of them by identifying frequently occurring code constructions. We developed a program comprehension approach to disentangle the different concerns that are mixed in the code of the event handlers. The cornerstone of that approach was a model representing the behaviour of the code in an abstract way, which facilitated the later reverse engineering tasks. The work resulted in contributions in the WCRE [ ], JISBD [ ] and UIDL [ ] conferences.

The PhD candidate did a research stay in Louvain-la-Neuve (Belgium), in the LILab group, which is a renowned team in Human-Computer Interaction (HCI) led by Jean Vanderdonckt. During the stay, a tool for analysing web pages and producing UsiXML [ ] specifications was developed, and we cooperated in a work presented in the RCIS conference [ ]. This work made us reconsider the layout inference solution in order to implement some improvements.
Back at ModelUM, we started to work on the idea of applying our layout inference solution to generate GUIs from wireframes. We then considered the approach developed during the stay in Belgium to overcome some of the limitations detected in the first version of our layout inference algorithm. Therefore, the last period of the research was devoted to designing and implementing a new version of the layout inference approach and developing a tool to automatically generate GUIs from wireframes. The new version of the approach was tested with a real wireframing tool, and an article that described our work was submitted to the IST journal [ ], which is under review at present.

.

O

The structure of the rest of this document is as follows:

• Chapter introduces the background needed for a better understanding of this thesis. It comprises basic concepts of software modernisation, the features of legacy GUIs, scenarios in which extracting information of the GUI is useful, and the principles of the MDE paradigm.

• Chapter analyses the state of the art in three areas, namely layout inference, code analysis of event handlers, and MDE approaches for reengineering GUIs. For the first two areas, some dimensions will be defined to compare the works, and a discussion in each area will present the gaps and weaknesses of current approaches for reverse engineering legacy GUIs.

• Chapter outlines our proposal for migrating legacy GUIs. It describes the overall challenges we have found when addressing the problem, and we identify the requirements that we believe a proper solution should meet. We will also present the general architecture of the solution, which includes models specially designed to deal with each concern, and we will outline how the solution copes with each of the elicited requirements.

• Chapter explains the first approach we devised to tackle the layout inference of legacy GUIs. We will expound the data structures and algorithms involved in the solution. Finally we will present the validation of the approach through a case study of the migration of Oracle Forms applications to the Java platform. Besides the evaluation of the approach, the case study has served to disclose the limitations of the current solution and draw conclusions.

• Chapter expounds a second version of the layout inference algorithm. The chapter starts by stating the reasons that motivated a new version and shows the changes that have been made in the solution to incorporate the new requirements. The main change affects the high-level layout inference algorithm, which is now an exploratory algorithm based on graph rewriting and pattern matching, and which will be explained in depth. A new case study about reengineering of wireframes will be introduced, which will be used to test the new approach. We also include a brief evaluation to compare both approaches in the context of the migration of legacy GUIs, specifically Oracle Forms windows.

• Chapter focuses on the part of the reengineering architecture that is devoted to the analysis of event handlers. We will present the metamodel (i.e., data structure) we have built to represent the code in a concise manner, and the pattern recognition algorithm designed to extract that representation. We put this metamodel into practice in two cases: the separation of the code in layers (business logic, the controller and the GUI code), and the identification of the interactions among widgets and among views. The separation in layers will also be tested with a case study to migrate Oracle Forms event handlers to an Ajax application.

• Chapter concludes this thesis by analysing the level of achievement of the goals we presented in Chapter and the requirements enumerated in Chapter . Our solutions are contrasted with the related work and a discussion about the benefits and disadvantages with regard to those works is included, which leads to the future work proposal. Finally, the results of this thesis in terms of publications and projects are enumerated.

The world is full of obvious things which nobody by any chance ever observes. Mark Haddon, The Curious Incident of the Dog in the Night-Time (Suggested by Javier Cánovas)

2

Background

This chapter introduces the background needed for a better understanding of this thesis, which consists of: the basic concepts in the area of Software Modernisation, some common notions about GUIs, the particular features of the GUI of legacy systems, GUI modernisation scenarios in which an inference process is useful, and the foundations of Model Driven Engineering such as metamodelling and model transformations, and their applicability to software modernisation.

.

S

Modernisation is a form of software evolution of legacy systems which involves deeper and more extensive changes than maintenance, but in which the system still has some business value that is preserved [ ]. A modernisation process is applied when the desired properties of a legacy system cannot be achieved by means of maintenance. Two kinds of modernisation are distinguished: white-box and black-box. In the former, the internal details of the system must be understood and some significant changes in the system structure are required (e.g. code restructuring). In the latter, the analysis of legacy systems is based on their input and output; for example, a wrapper is a commonly used technique to achieve black-box modernisation.

Migration is a kind of modernisation in which an entire source application or a part of it is moved to a different technology, for instance a source code translation or a database engine change. Reengineering is also a form of modernisation that applies software engineering practices to an existing system to meet new requirements [ ]. Tilley and Smith [ ] define reengineering as “the systematic transformation of an existing system into a new form to realise quality improvements in operation, system capability, functionality, performance, or evolvability at a lower cost, schedule, or risk to the customer”.

A reengineering process can be applied in three stages [ ]. Firstly, a reverse engineering stage analyses the existing system and extracts knowledge which is represented at different abstraction levels. A second stage restructures these abstract representations in order to establish a mapping between the existing system and the target system. Finally, a forward engineering stage is applied to obtain the artefacts of the new system from the output of the restructuring stage. As the horseshoe model [ ] illustrates (see Figure . ), the reverse engineering process can be applied in several steps which form a transformation chain. That chain is intended to increase the level of abstraction of the extracted knowledge until it reaches an architectural representation of the system. Restructuring and forward engineering can then be applied at different abstraction levels for any of the obtained representations to derive artefacts of the new system.

Figure . : The Horseshoe model

Reverse engineering is an essential activity in a reengineering process which is based on code and data comprehension techniques. Chikofsky and Cross [ ] define reverse engineering as “the process of analyzing a subject system to i) identify the system’s components and their interrelationships, and ii) create representations of the system in another form or at a higher level of abstraction”. Reverse engineering techniques are commonly classified in two major groups [ ]: static analysis is based on the inspection of the application artefacts (normally source code), and dynamic analysis examines the state of a running application. Each type of technique has its limitations: with static analysis it is difficult to achieve good coverage of highly dynamic applications, while dynamic analysis has difficulty guaranteeing that the generated models fully capture the behaviour of the system. A third technique is hybrid analysis, which combines static and dynamic analysis to take the best of each.

There are numerous forms of reengineering [ ]. A platform migration typically combines several of these forms; for instance, source code transformation, program modularisation, and data reengineering can be involved in a RAD-to-Java platform migration. Revamping is connected with the modernisation of user interfaces, in which only the user interface is changed to improve some aspects like usability. These days, with increasingly high interest in the Internet, the most popular form of revamping is adding a web interface to legacy systems. In the past a very common practice was replacing a text interface with a graphical user interface. One of the methods for this kind of revamping was screen scraping, a black-box method in which an application (usually an existing component) is ‘redirected’ from a console screen into a graphical frame of a web interface [ ]. This method is relatively cheap and the results of the modernisation are clearly visible. Nevertheless, the UI is just a wrapper on the old system, which remains unchanged, so adding new functionality or performing further maintenance is still very difficult, because system extensibility has not been improved. When these improvements are needed, a white-box approach should be applied to move the GUI legacy code to the target platform, for instance, when an Oracle Forms application is converted into a Java Server Faces (JSF) one.

.

G

U

I

(GUI)

A User Interface (UI) is the part of a software/hardware system that is designed to interact with users. A Graphical User Interface (GUI) is a UI that takes advantage of computer graphics to facilitate the interaction with users. Before the popularisation of touch devices such interaction was typically performed by means of a cursor on the screen controlled by a mouse, which lets the user select graphical elements such as menu items or buttons. User interfaces have a static component which is related to the presentation of the information (i.e., the structure, the layout, the usability, the accessibility or the aesthetics), and a dynamic part that is associated with the behaviour when the user interacts with it (i.e. the events that are triggered and perform actions and/or changes in the interface).

A GUI toolkit (or widget toolkit) is a library that supports building GUIs for a particular programming language and sometimes is tied to a framework or operating system. Examples are Gtk+ for desktop applications in C/C++ under Windows/Linux/Mac, and the Java Android SDK for mobile applications in Android. Each toolkit provides different features for the static and dynamic aspects of the GUI.

We will use the term view to refer to the graphics displayed on device screens. Common examples of views are windows in desktop applications, web pages in web applications, and views in mobile applications. The elements displayed in views are widgets, controls or visual components (e.g., buttons or combo boxes). The term widget will be used extensively throughout this document. There are different kinds of widgets, and every widget is characterised by a type, a set of graphical properties such as background colour or font type, and status properties such as visibility (whether the widget is visible) or editability (whether the widget can be edited). In general, widget types are commonly classified according to their purpose: entering data (e.g., text fields), showing information (e.g., data grids) or interacting with the system (e.g., buttons). There are also widgets (like panels) that are used to structure views, in such a way that buttons or text fields are contained in panels (similarly to the Composite pattern). In this sense, views are containers too and they are actually the topmost components in the aggregation hierarchy of the GUI elements, which is sometimes referred to as the GUI tree. Figure . shows an example view for recording user data which contains NameLabel, NameBox, PaymentFrame and some other widgets, and PaymentFrame is in turn the container of CardLabel, CardCombo, DiscountLabel, and DiscountCheck.
A part of the GUI tree of this view is shown in Figure . .

Figure . : Example view for entering personal information. Widgets are placed with explicit coordinates.

RecordWindow: Canvas
  PaymentFrame: Frame
    CardLabel: Label
    CardCombo: ComboBox
    DiscountLabel: Label
    DiscountCheck: CheckBox
  ...

Figure . : An excerpt of the GUI tree for the window in Figure . .

The layout of a graphical user interface is the spatial distribution of the elements in the views of the application. There are GUI toolkits that define explicit components for laying out content (e.g., the hbox and vbox in ZK [ ]), while in other cases the layout is defined by properties (e.g., float in CSS [ ]) or by assigning predefined layout types to certain groups of widgets (e.g., Java AWT [ ] layouts). The latter are commonly known as layout managers. They are software components that automatically lay out the widgets on a view based on relative relations and restrictions that are inherent to the layout type and partly specified by the programmer.

In every modern GUI technology, the GUI behaviour is implemented by an event-driven approach. Each widget is able to trigger some types of events under certain conditions. For instance, typical types of events for a button are click (the button has been pushed) and hover (the cursor is over the button), and common types of events available for text boxes are change (the content of the text field has been modified) and focus (the text field has been selected and is ready for writing). Different types of widgets can trigger the same event types, but not all the types of events are available for all the types of widgets. For example, buttons and text fields can trigger the hover event, but the change event makes no sense for buttons. Note that the set of events supported by widgets is not standard; each GUI toolkit may implement a different one. Actions can be attached to widgets; they are implemented by code (i.e., event handlers) that is executed every time a certain event happens on the widget. These actions can provide some application functionality, modify the appearance of the current view, or change the view, among others. In short, an event is characterised by three elements: i) a widget, ii) an event type, and iii) an event handler that deals with it.
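The widget, event type and event handler triple can be illustrated with a minimal Python sketch. It is not tied to any real toolkit: the `Widget` class, the `SUPPORTED_EVENTS` table and the `on`/`trigger` methods are hypothetical names introduced only for illustration, and the supported event sets simply mirror the examples given in the text.

```python
from typing import Callable, Dict, List

# Hypothetical toolkit: the event types each widget type supports.
SUPPORTED_EVENTS = {
    "Button": {"click", "hover"},
    "TextBox": {"change", "focus", "hover"},
}

Handler = Callable[["Widget"], None]


class Widget:
    def __init__(self, name: str, widget_type: str) -> None:
        self.name = name
        self.widget_type = widget_type
        self._handlers: Dict[str, List[Handler]] = {}

    def on(self, event_type: str, handler: Handler) -> None:
        """Attach a handler; reject event types this widget type cannot trigger."""
        if event_type not in SUPPORTED_EVENTS[self.widget_type]:
            raise ValueError(f"{self.widget_type} does not support '{event_type}'")
        self._handlers.setdefault(event_type, []).append(handler)

    def trigger(self, event_type: str) -> None:
        """Run every handler registered for this event type on this widget."""
        for handler in self._handlers.get(event_type, []):
            handler(self)


log: List[str] = []
ok_button = Widget("ok", "Button")
name_box = Widget("nameBox", "TextBox")

ok_button.on("click", lambda w: log.append(f"{w.name}: click"))
name_box.on("change", lambda w: log.append(f"{w.name}: change"))

ok_button.trigger("click")
name_box.trigger("change")
ok_button.trigger("hover")  # supported, but no handler attached: nothing happens
```

Attaching a `change` handler to `ok_button` raises `ValueError` in this sketch, which reflects the remark that not every event type is available for every widget type.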

V

GUI

Widgets are not randomly distributed on the screen but form some sort of design (layout) that deeply affects the readability and usability of the GUI. The layout is probably the most complex element of the visual part of a GUI, as it cannot be defined by a single value or a list of values, but is the result of applying several features to different widgets or groups of them. We have identified several features that characterise the layout, which we discuss next.

• Visual structure. It is related to the human perception of the widget arrangement, and is a key feature to allow adapting the content of a view to different text or screen sizes. Knowing the visual structure requires analysing the positions of all the elements in the view to recognise the ‘shapes’ they form and how they are visually grouped. A horizontal flow of widgets or a grid of elements are examples of visual structures. Note that different layout arrangements may produce similar visual structures that can be equally valid for the same view. For example, in the login view shown in Figure . , nameLabel and nameField form a line, passwordLabel and passwordField form a second line, and the ok and cancel buttons form a third line. Another layout possibility would be to put nameLabel and passwordLabel in one column, and the rest of the widgets in another column. Sometimes there are widgets that are surrounded by a rectangle because they are related to the same topic, or there are groups of widgets that are visually distant from other groups. In these cases, the groups should be identified and handled as a unit with respect to the rest of the elements.

Figure . : Login window created with WireframeSketcher.

• Sizing. The size of the widgets is another feature that must be considered. Sizes can be expressed in absolute units like pixels, or in relative units, for example in percentages of the container element. It is advisable to always use relative units so that the measures are independent of the concrete screen of the device.

• Spacing. The spacing between the widgets in the view is also relevant. We must distinguish between gaps and margins. Gaps are the spaces between single widgets (e.g. the separation between a label and a text field). Margins are the distances between single widgets and their container. Note that gaps and margins are either horizontal or vertical, depending on the axis in which they are observed. Like sizes, gaps and margins can be expressed in absolute or relative units, though the latter are preferred.

• Alignment. The alignment is either horizontal or vertical, and it is defined for a widget with respect to other widgets, or with respect to its container. For instance, in Figure . the widgets nameField, passwordField and cancel are aligned to the right with regard to each other. Looking at these three widgets carefully we can see that they are not perfectly aligned, though it seems that the intention is that they are. Therefore, when dealing with the layout, some degree of misalignment should be accepted, i.e., the analysis of the positions of the elements must be flexible. In addition, the area taken by a widget may sometimes slightly overlap other widgets, so small overlaps must also be tolerated.
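The spacing and alignment features above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the `Box` type, the helper functions, the tolerance of 3 units and all coordinates are hypothetical and are not measurements taken from the login window in the figure.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Box:
    """Bounding box of a widget in absolute coordinates (hypothetical values)."""
    name: str
    x: int
    y: int
    width: int
    height: int

    @property
    def right(self) -> int:
        return self.x + self.width

    @property
    def bottom(self) -> int:
        return self.y + self.height


def right_aligned(boxes: List[Box], tolerance: int = 3) -> bool:
    """Widgets count as right-aligned if their right edges differ by at most `tolerance`."""
    edges = [b.right for b in boxes]
    return max(edges) - min(edges) <= tolerance


def horizontal_gap(left: Box, right: Box) -> int:
    """Gap between two widgets on the horizontal axis; a negative value means overlap."""
    return right.x - left.right


def margins(widget: Box, container: Box) -> Dict[str, int]:
    """Distances between a widget and each side of its container."""
    return {
        "left": widget.x - container.x,
        "top": widget.y - container.y,
        "right": container.right - widget.right,
        "bottom": container.bottom - widget.bottom,
    }


window = Box("loginWindow", 0, 0, 270, 130)
name_label = Box("nameLabel", 20, 20, 70, 22)
name_field = Box("nameField", 100, 20, 150, 22)          # right edge at 250
password_field = Box("passwordField", 100, 50, 151, 22)  # right edge at 251
cancel = Box("cancel", 190, 85, 62, 24)                  # right edge at 252

aligned_with_tolerance = right_aligned([name_field, password_field, cancel])
aligned_exactly = right_aligned([name_field, password_field, cancel], tolerance=0)
label_to_field_gap = horizontal_gap(name_label, name_field)
label_margins = margins(name_label, window)
```

With these values the three widgets are treated as right-aligned when a small tolerance is allowed, but not under an exact comparison, which is the flexibility the alignment bullet asks for.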

L

GUI

The GUI of a legacy system commonly has some features that are not present in modern GUI technologies. Some of them are discouraged practices in software engineering that are no longer implemented. We have studied the GUI definition and code of three different RAD environments, namely Oracle Forms, Visual Basic and Delphi. The following table summarises the main features of the studied environments.

| Feature | Oracle Forms | Microsoft Visual Basic | Borland Delphi |
|---|---|---|---|
| Year | | | |
| Implicit layout | Yes | Yes | Yes |
| Proprietary units | Yes (points) | Yes (twips) | No (pixels) |
| Clustering elements (containers) | Canvases, Frames, Rectangles, ... | Frames only | Panels, GroupBoxes, RadioGroups... |
| Container overlapping | Yes | Not compulsory | Not compulsory |
| Widget set | standard | standard, + complex controls | standard, + complex controls |
| Widget-database links | Yes | Yes (ADO) | Yes (ADO) |
| Table widget | Multirecord text-fields | ADODC | TDBGrid |
| Code mixes aspects | Yes | Yes | Yes |
| GUI definition format | binary | property=value | property=value |
| Event handler format | PL/SQL triggers (binary) | Visual Basic subroutines, mixed with GUI definition | Delphi methods |

Table . : GUI features of three different RAD environments

Based on this table, we list some features that can be frequently found in GUI definitions that have been built with RAD environments:

• Implicit layout. The position of widgets is stated by means of a pair of coordinates that are relative to the main window or another container, and rarely, relative to another widget (e.g., a label to its text box). The size (width and height) of a widget is also given explicitly by the RAD environment. This means that, for example, when a window is resized the widgets are not resized or rearranged accordingly. As can be seen, the three studied RAD environments have an implicit layout. In some cases, these technologies do not use standard units like pixels or centimetres, but proprietary units. For example, in Visual Basic the default measurement unit is the twip, which is 1/20 of a typographical point (1/1440 of an inch). Twips are screen-independent units created to avoid the disadvantages of fixed units like pixels, but they are not commonly found in modern IDEs.

• Clustering elements. There are special widgets which are intended to group and/or highlight semantically-related widgets. In particular, we distinguish between elements that arrange a window in parts (in some legacy environments they can also be reused between windows), and elements that are used to highlight a set of widgets in close proximity, frequently by means of a border.

• Overlapping. Widgets are often loosely contained in their container, that is, they overlap the container instead of having explicit containment relationships. A container could also overlap another container. This means that a container may not have any child widget in the GUI tree, although there may be some widgets that would (visually) be expected to be contained. In Visual Basic and Delphi, containers and widgets may be overlapped, but in Oracle Forms this overlapping is unavoidable. For example, in relation to the view in Figure . , Figure . (a) shows a fragment of the original GUI tree created by a RAD environment like Oracle Forms, and Figure . (b) shows the expected GUI tree. In that view it can be seen that PaymentFrame surrounds CardLabel, CardCombo, DiscountLabel and DiscountCheck (the checkbox next to DiscountLabel), but these widgets are only visually contained in the frame, that is, their parent element in the model is not PaymentFrame, but rather RecordWindow (see Figure . (a)). We would expect the GUI tree to be like the one in Figure . (b).

a)
RecordWindow: Canvas
  CardLabel: Label
  CardCombo: ComboBox
  DiscountLabel: Label
  DiscountCheck: CheckBox
  PaymentFrame: Frame
  ...

b)
RecordWindow: Canvas
  PaymentFrame: Frame
    CardLabel: Label
    CardCombo: ComboBox
    DiscountLabel: Label
    DiscountCheck: CheckBox
  ...

Figure . : (a) Fragment of the original GUI tree. (b) The expected GUI tree.

• Widget set. RAD environments as well as modern frameworks share a common set of standard widgets, such as text boxes, buttons, combo boxes, tables, and so forth. However, some environments like Delphi include technology-dependent widgets that may not have an equivalent in other environments. Developers sometimes wanted to use complex widgets that were not available in the GUI technology in which they were programming, and they worked around this by emulating those complex widgets by means of a composition of the available widgets. For example, a calendar (which is nowadays a common component) was typically emulated in Oracle Forms by means of a grid of buttons (see Figure . ). Another example is a table with a scrollbar, in which the parts of the scrollbar were emulated by buttons.

Figure . : A calendar component emulated by a grid of buttons.

• Widget-database links. Sometimes widgets are tied to columns in database tables. In Oracle Forms, Visual Basic and Delphi, the property sheets of widgets include some properties to indicate that information. Particularly, in Visual Basic and Delphi, widgets contain datasource and datafield properties. The former is configured by means of an ADO control that indicates the connection string and the data table, and the latter is the column name in the database table. Oracle Forms does not use ADO, but it has rather similar properties to indicate the database connection.

The code of event handlers in legacy systems also has some characteristics that are not typically found in modern applications. Next we list some of them.

• Tangling of concerns. Code managing the GUI is mixed with business logic and database access. There is no clear separation among the different concerns of the application. For example, the event handler shown in Figure . b takes the value of ABE_IMPP, divides it by the euro exchange value obtained from the database, and places the result in ABE_IMPE. As can be seen, the database access and the GUI are tightly coupled.
• Simple behaviour. Event handlers do not perform complex algorithms or calculations. They are hardly ever complex because complex functionality is typically implemented in separate functions or stored procedures that are called by the handlers.
• Restricted looping. Loops are only used to iterate over database tables. This is a consequence of the previous point, since the algorithms used to solve problems are programmed in procedures. Loops are only found when using collections or statements that iterate over database rows.
• Conditional paths. Several levels of nested conditional statements are common, where conditions check values from the GUI or the database. Actions such as updating the GUI or modifying the database are normally performed in the innermost blocks.
• Idiom-based programming. Applications usually repeat a series of idioms. Some of them are specific to each RAD environment, while others are conventions dependent on the company. Querying a value from the database and placing it in a text field after some kind of modification is a recurrent pattern in event handlers of legacy applications, as is done in the example code of Figure . b. Another example is shown in Figure . , an excerpt of an event handler in Delphi . The code checks whether a task is active before deleting it, and if the task is active, it aborts the deletion operation. Checking whether a value exists in a database before performing an operation is also a common pattern.
• Similar programming abstractions. Although each legacy environment has its own programming language to write event handling code, most of them provide similar constructs. As was seen in Figure . b, in Oracle Forms simple database access can be performed with implicit PL/SQL cursors, and in Delphi it can be accomplished through a TADOQuery object.
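To make the tangling of concerns and the query-modify-display idiom more concrete, the following is a minimal Python sketch. It is not the PL/SQL code of the figure: only the field names ABE_IMPP and ABE_IMPE come from the text, whereas the `Form` class, the `exchange_rates` table, its columns and the rate value are hypothetical and chosen for illustration.

```python
import sqlite3


class Form:
    """Hypothetical stand-in for a RAD form: widget values keyed by name."""

    def __init__(self):
        self.fields = {"ABE_IMPP": 0.0, "ABE_IMPE": 0.0}


def post_change_handler(form, conn):
    """Event handler that mixes GUI access, database access and business logic."""
    # Database access: a single-row query, similar in spirit to an implicit cursor.
    row = conn.execute(
        "SELECT rate FROM exchange_rates WHERE currency = 'EUR'"
    ).fetchone()
    if row is None:
        return  # no exchange value available, leave the GUI untouched
    # Business logic (the division) and GUI update happen in the same place.
    form.fields["ABE_IMPE"] = round(form.fields["ABE_IMPP"] / row[0], 2)


# Illustrative set-up with an in-memory database (assumed schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exchange_rates (currency TEXT, rate REAL)")
conn.execute("INSERT INTO exchange_rates VALUES ('EUR', 166.386)")

form = Form()
form.fields["ABE_IMPP"] = 1663.86
post_change_handler(form, conn)
```

Separating the three concerns would mean moving the query and the division out of the handler, which is exactly the information a reverse engineering process has to recover from code like this.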

(a) Example window fragment

(b) POST_CHANGE Event handler associated with ABE_AYU_INSE (PL/SQL)

Figure . : Example of mixing of concerns in an Oracle Forms application

Figure . : Fragment of a Delphi event handler that checks if a task is active before deleting it.

. .

U

GUI

Inferring information about the GUI, such as the layout or the aspects involved in the code, and representing it explicitly is useful in a variety of cases. Next, we briefly comment on several scenarios in which GUI reverse engineering enables GUI reengineering and other types of activities:

• Revamping. As we have already mentioned, this is the case in which the business logic of the legacy system is reused, and only the views are changed. Frequently, this scenario involves wrapping [ ] the legacy code in order to be able to access it from the code of the new GUI technology. A few changes are made to event handlers just to adapt them to the new views. A particular case of revamping is the layout-preserving migration, which takes place when a migration project has a requirement specifying that the original GUI layout must be preserved in the target application because users are averse to change.
• GUI testing. There are different strategies to accomplish GUI testing. One strategy is to generate a mock application with the views of the original application, which tracks user input in order to generate test cases [ ]. Another strategy consists of symbolically executing the code to generate test inputs [ ]. Other works instrument event handler code to record user interactions which are later analysed [ ].
• GUI adaptation. Migrating to a new GUI technology requires taking advantage of the target technology's features (e.g. usability standards, high-level layout models of modern GUI toolkits, etc.). Deep changes in views and event handlers are usually required in this scenario. A particular case of this category would be the migration to technologies with constraints related to the screen size, such as mobile devices.
• Quality improvement. Perfective maintenance tasks may be required to improve the system quality, such as the detection of usability issues, non-visible widget removal, GUI resizing and beautification [ ], separation of concerns [ ], code refactoring [ ] or dead code removal.
• Forward engineering. Forward engineering approaches to develop new systems can also benefit from GUI reverse engineering. In software development methodologies, GUI designs are validated in the early stages of development with a mockup (Figure . shows a plain mockup), which is a GUI representation that is created before the final product so stakeholders can check it. The same approach used in reverse engineering an existing system can be applied to the development of a new one just by taking mockups as source artefacts. Then, final GUIs for different platforms or technologies can be generated from the GUI representations.

Model-Driven Engineering (MDE)

Model-Driven Software Engineering (MDSE or simply MDE) is an emerging area of Software Engineering which addresses the systematic use of models to improve software productivity. Models can be used in the different stages of the software lifecycle to raise the abstraction level and automate development tasks. There exist several MDE paradigms, such as Model Driven Architecture (MDA) [ ] or Domain-Specific Development [ ] [ ], which share the same four basic principles [ ]: (i) models are used to represent aspects of a software system at some abstraction level; (ii) they are expressed using DSLs (a.k.a. modelling languages), (iii) which are built by applying metamodelling techniques; and (iv) model transformations provide automation in the software development process.

Metamodelling

A metamodel is a model that describes the concepts and relationships of a certain domain. A metamodel is commonly defined by means of an object-oriented conceptual model expressed in a metamodelling language such as Ecore [ ] or MOF [ ]. A metamodelling language is in turn described by a model called meta-metamodel; therefore, a metamodel is an instance of a meta-metamodel and a model is an instance of a metamodel. Metamodelling languages generally provide four main constructs to express metamodels: classes (normally referred to as metaclasses) for representing domain concepts, attributes for representing properties of a domain concept, association relationships (e.g., aggregations and references) between pairs of classes to represent connections between domain concepts, and inheritance between child metaclasses and their parent metaclasses for representing specialisation between domain concepts. In the following chapters we will use metamodels to describe the data structures involved in the proposed solution.
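As an illustration of the four constructs, the following Python sketch encodes a tiny metamodelling vocabulary and uses it to describe a toy GUI metamodel. It is only a sketch under our own assumptions: the names `MetaClass`, `Attribute`, `Reference` and the Widget/Button/Panel example are hypothetical and do not reproduce the Ecore or MOF APIs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attribute:
    """A property of a domain concept."""
    name: str
    type_name: str


@dataclass
class Reference:
    """An association towards another metaclass."""
    name: str
    target: "MetaClass"
    containment: bool = False  # True models an aggregation


@dataclass
class MetaClass:
    """A domain concept, optionally specialising a parent metaclass."""
    name: str
    parent: Optional["MetaClass"] = None
    attributes: List[Attribute] = field(default_factory=list)
    references: List[Reference] = field(default_factory=list)

    def all_attributes(self) -> List[Attribute]:
        """Own attributes preceded by the inherited ones."""
        inherited = self.parent.all_attributes() if self.parent else []
        return inherited + self.attributes


# A toy GUI metamodel: Button and Panel specialise Widget,
# and a Panel aggregates child widgets.
widget = MetaClass("Widget", attributes=[Attribute("name", "String")])
button = MetaClass("Button", parent=widget,
                   attributes=[Attribute("text", "String")])
panel = MetaClass("Panel", parent=widget)
panel.references.append(Reference("children", widget, containment=True))
```

In this sketch, `button.all_attributes()` yields both the inherited `name` attribute and the own `text` attribute, which mirrors how specialisation works in metamodelling languages.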

Domain-Specific Languages (DSLs)

In contrast to General Purpose Languages (GPLs), Domain-Specific Languages (DSLs) are languages that are defined to solve problems in a specific domain. In the MDE context, the terms DSL and modelling language are commonly used to refer to the languages used to build models, which are usually created by applying metamodelling, that is, the language allows creating models whose structure is determined by a metamodel.

A DSL consists of three basic elements: abstract syntax, concrete syntax and semantics. The abstract syntax describes the set of language concepts and their relationships, along with the rules to combine them. Metamodelling provides a good foundation for this component, and it is the most widespread formalism in MDE, but other formalisms have also been used over the years, such as grammars for programming languages and DTD/XML schemas for XML documents. The concrete syntax defines the notation of the DSL, which can be textual or graphical (or a combination of both). The semantics defines the behaviour of the DSL; there are several approaches for defining it [ ], but it is typically provided by building a translator (i.e., a compiler) to another language that already has a well-defined semantics (e.g., a programming language) or an interpreter. Mockup tools (e.g., Balsamiq [ ]) are an example of graphical DSLs for creating quick designs of GUIs, as they conform to a formalism (metamodels or DTD/XML Schema in most cases), they have a graphical notation (widgets) and they have the semantics of the GUI toolkits for which the GUI code can be generated.

DSLs have been used since the early years of programming; however, MDE has substantially increased the interest in them. Most MDE solutions involve the definition of one or more DSLs so that users can create the models that are required.
When MDE is applied in reengineering legacy systems, concrete syntaxes are not needed for the metamodels that represent the information gathered in that process if such information is not intended to be understood by users. Actually, in our case we have not defined a concrete syntax for any of the metamodels we will present; instead, models (i.e. instances of metamodels) have been directly manipulated by model transformations, which we introduce next.

Model transformations

Model transformations allow automating the conversion of models between different levels of abstraction. An MDE solution usually consists of a model transformation chain that generates the desired software artefacts from the source models. Three kinds of model transformations are commonly used: model-to-model (M2M), model-to-text (M2T) and text-to-model (T2M).

M2M transformations generate a target model from a source model by establishing mappings between the elements defined in their metamodels. One or more models can be the input and output of an M2M transformation. M2M transformations are used in a transformation chain as intermediate stages that reduce the semantic gap between the source and target representations. The complexity of model transformations mainly depends on the abstraction level of the metamodels to which the models conform. The most frequently used M2M transformation languages (e.g., QVT [ ], ATL [ ], ETL [ ]) have a hybrid nature, since M2M transformations can be too complex to be expressed only by using declarative constructs [ ]. These languages allow transformations to be imperatively implemented by using different techniques: i) imperative constructs can be used in declarative rules (e.g., ATL and ETL), ii) a declarative language is combined with an imperative one (e.g., QVT Relations and QVT Operational), or iii) the language is designed as a DSL embedded into a general purpose language (e.g., RubyTL [ ] into Ruby). Using model transformations to solve reverse engineering problems is an example of a scenario where a high degree of information processing is required and the complexity of transformations can become very high. A survey on model transformation languages can be found in [ ].

M2T transformations generate textual information (e.g. source code) from an input model. M2T transformations produce the target artefacts at the last stage of the chain. MOF2Text [ ] and XPand [ ] are some of the most widely used M2T transformation languages.

Finally, T2M transformations (also called injectors) are used to extract models from the source artefacts of an existing system, and are mainly used in software modernisation to obtain the initial model to be reverse engineered. Hence, they are less frequently used than M2M and M2T transformations. Among the tools for extracting models from code we highlight MoDisco [ ], which implements parsers (called discoverers) for Java and other languages; the XML injector of the Eclipse Modeling Framework (EMF) [ ], which obtains Ecore models from XML schemas; and Gra2MoL [ ], which is a textual DSL especially designed to define T2M transformations when the source artefact consists of text that conforms to a grammar, by establishing mappings between that source grammar and a target metamodel.
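The three kinds of transformations can be chained as in the following Python sketch, which is a toy illustration rather than an example in any of the languages cited above. The input format, the `SourceWidget` and `TargetField` classes and the generated markup are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceWidget:
    """Low-level model element, close to the source text."""
    kind: str
    name: str
    x: int
    y: int


@dataclass
class TargetField:
    """More abstract model element: coordinates replaced by a row index."""
    name: str
    row: int


def text_to_model(text: str) -> List[SourceWidget]:
    """T2M (injector): parse lines of the assumed form 'kind name x y'."""
    model = []
    for line in text.strip().splitlines():
        kind, name, x, y = line.split()
        model.append(SourceWidget(kind, name, int(x), int(y)))
    return model


def model_to_model(source: List[SourceWidget]) -> List[TargetField]:
    """M2M: map text boxes to fields ordered by their vertical position."""
    boxes = sorted((w for w in source if w.kind == "textbox"),
                   key=lambda w: w.y)
    return [TargetField(w.name, row) for row, w in enumerate(boxes)]


def model_to_text(target: List[TargetField]) -> str:
    """M2T: emit HTML-like text from the target model."""
    return "\n".join(
        f'<input name="{f.name}" data-row="{f.row}">' for f in target
    )


legacy = """
textbox amount 10 40
label title 10 5
textbox total 10 80
"""
generated = model_to_text(model_to_model(text_to_model(legacy)))
```

Each stage only depends on the model produced by the previous one, so the injector can be replaced for another source technology without touching the later stages.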

Model-Driven Modernisation (MDM)

MDE is increasingly gaining acceptance, mainly because it is being successfully used in building new software systems (forward engineering) [ ] [ ]. But MDE techniques, such as metamodelling and model transformations, are also useful to evolve existing systems, as they can help to reduce software maintenance and modernisation costs by automating many basic activities in software evolution processes. In this setting, Model-Driven Modernisation (MDM)¹ has emerged as an MDE approach to be applied in the software modernisation scenario. Several experiences of applying MDM have been recently published [ ] [ ] [ ], which have shown how MDE techniques facilitate obtaining representations that have an abstraction level higher than source code, and how modernisation tasks can be automated, e.g., providing metrics to analyse the impact of the changes or automatically generating software artefacts of the evolved system.

Figure . : MDE applied to reengineering

In the MDM context, reengineering is accomplished by applying model transformations in each of the three stages of the process (see Figure . ). Reverse engineering obtains models from the source artefacts which are not just a model representation of the code, but provide a higher abstraction level. Frequently, this step is tackled by a T2M transformation that obtains a low-level representation of the code (and is therefore dependent on the type of source artefact), followed by one or more M2M transformations that obtain more abstract representations. A crucial aspect is the definition of the metamodels that are appropriate to represent the knowledge collected in each step of the transformation chain. Model-Driven Reverse Engineering (MDRE) [ ] [ ] is a common term used to refer to the use of MDE in the reverse engineering stage. In the restructuring stage the models are transformed into other ones that conform to some aspects of the target architecture, which is accomplished by one or more M2M transformations. Finally, the forward engineering stage takes the models obtained in the restructuring stage and generates artefacts of the new system, which can be performed by an M2T transformation. If there is a wide semantic gap between the models obtained after the restructuring stage and the target code, an M2M transformation chain finished by an M2T transformation is frequently advised.

To increase the interest in applying MDE to modernise legacy systems, OMG launched the Architecture Driven Modernisation (ADM) initiative in [ ], whose objective is to develop a set of standard metamodels for common tasks in software modernisation in order to facilitate the interoperability among tools. Several modernisation scenarios in which ADM metamodels have proven to bring benefits are described in [ ] [ ]. Among these metamodels, the Knowledge Discovery Metamodel (KDM) [ ] plays a main role because it is targeted at representing application code at different abstraction levels, from GPL statements to business rules. It is, therefore, an arguably large metamodel structured in four layers, namely Infrastructure, Program elements, Resource, and Abstractions. The Abstract Syntax Tree Metamodel (ASTM) is a metamodel that complements KDM and is devised to represent code in the Abstract Syntax Tree (AST) form. In [ ] a detailed explanation of how to use KDM and ASTM to model PL/SQL code can be found, together with a case study on gathering software metrics. Other ADM metamodels are the Software Metrics Metamodel (SMM) for representing metrics, and Automated Function Point (AFP) for automating the extraction of function points. Up to the present time, the impact of the ADM standards has been very limited, mainly due to the complexity of KDM [ ], and few works that illustrate real case studies have been published.

In [ ] some MDM tools that have been recently developed are presented, among which MoDisco has received the greatest attention. MoDisco [ ] is an extensible open source MDRE framework to develop model-driven tools that support use cases of existing software modernisation. MoDisco aims at supporting the description, understanding and transformation of existing software by providing four elements: i) metamodel implementations such as the relational database, KDM and JavaSE metamodels, ii) discoverers to automatically inject models of these systems, such as a discoverer from Java code to KDM models, iii) generic tools to understand and transform complex models created out of existing systems, and iv) use cases illustrating how MoDisco can support modernisation processes.

¹ Model-Driven Reengineering (MDR) is an approach related to MDM that advocates the use of models in reengineering.

We ourselves feel that what we are doing is just a drop in the ocean. But the ocean would be less because of that missing drop.
Mother Teresa of Calcutta (Suggested by Jesús García Molina)

3

State of the art

Our work tackles the problem of reverse engineering the GUI of legacy systems, specifically two aspects: layout and behaviour. To cope with it we have used MDE techniques. Consequently, the analysis of the state of the art has been classified in three sections: layout recognition approaches, behaviour extraction approaches, and MDE approaches for representing GUIs.

.

A

In this section we will present some works related to GUI layout inference. Three works are of special relevance for our work, namely [ ] [ ] [ ], since they deal with the extraction of a layout expressed in coordinates, and each of them deserves its own section to be analysed in detail. Other works related to reverse engineering the layout and structure of GUIs will be summarised in a single section, as they are not as close to the topic as the former ones. We have identified a set of dimensions which are useful to classify layout inference approaches. The three aforementioned works will be categorised according to the following dimensions:

1. Source/target independence: whether the proposed approach is generic, i.e. it is independent of the source and target technology.
2. Tested source technology: the technology or type of tool which was originally used to create the GUI definitions in the case studies of the approach. For example, a RAD environment such as Oracle Forms, or a wireframing tool like Balsamiq.
3. Tested target technology: the platform and toolkit in which the final GUI is created in the case studies of the approach. For instance, the ZK web framework, or the Java Swing toolkit for desktop applications.
4. Reverse engineered information: the kind of information that is extracted in the GUI reverse engineering process. Different approaches may describe a user interface by using different types of information, for example, the sizes of the widgets or how the widgets are contained in other widgets (containment hierarchy).
5. Layout model: the data structure devised to explicitly represent all the information that has been extracted from the original GUI. A layout model based on combining horizontal and vertical elements (HVLayout) is one simple example. This representation is a cornerstone of the approach, since any forward engineering approach to generate a final GUI will use this representation.
6. Algorithm type: the algorithmic strategy involved in the discovery of the layout, such as backtracking or heuristics, and/or the theoretical basis to solve the problem, e.g. linear programming.
7. Implementation technology: the technological basis used to implement the approach, for instance, an MDE-based approach.
8. Automation degree: whether the approach is totally automated or mostly automated with user intervention in many cases (semi-automated).

Next we analyse the three approaches that are closely related to ours.

Lutteroth

Lutteroth [ ] claims that most GUIs are specified in the form of source code, which hard-codes information relating to the layout of graphical controls. He points out that hard-coded GUIs lack dynamic layout, as the position and size of the elements are expressed in pixels, and that this representation is very low-level and makes GUIs hard to maintain. He suggests a reverse engineering approach that is able to recover a higher-level layout representation called the Auckland Layout Model (ALM).

The author argues that GUIs using pixel units have many disadvantages. GUIs can be executed on different devices with different resolutions, and even the visible part of the GUI is modified when the window is resized. He claims that, in those cases, pixel-based GUIs do not guarantee a correct display. Moreover, when the content of a widget changes, the size of the widget has to be manually redefined, and when some widgets are added or removed, it is likely that other widgets have to be manually modified. None of these adjustments happens automatically in a pixel-based GUI.

The ALM is a mathematical model that captures the invariants of a GUI by using linear programming. An invariant is a condition to be satisfied, e.g., the width of a widget must be less than the width of the panel that contains it. Those invariants are used as constraints in an optimisation process that results in the calculation of an adapted layout whenever circumstances change (e.g., the dimension of the window is altered). ALM offers different layers of abstraction on top of bare linear programming (which is very low-level) that make it possible to specify the invariants of typical GUIs more conveniently. ALM allows developers to define linear constraints in terms of tabstops and areas:

1. A tabstop represents a position in the coordinate system of a GUI. All positions and sizes in a layout are defined symbolically using tabstops as variables. Tabstops form a grid in which all the controls are aligned.
2. An area is a rectangular portion of space defined by the tabstops of the upper-left corner and the lower-right corner, the control that occupies the space, and the preferred, minimum and maximum sizes of the space. Heuristics are applied for choosing the preferred, maximum and minimum size of the area depending on the control. For example, buttons do not normally change their size when the window is resized, whereas text areas commonly take the extra space of the window.

Two types of constraints can be specified: hard constraints and soft constraints. Hard constraints always have to be satisfied, whereas soft constraints may be left partially unsatisfied if circumstances do not permit otherwise.
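The notions of tabstop and area can be illustrated with the following Python sketch. It is our own simplification and not Lutteroth's algorithm or the ALM API: the names `Rect`, `infer_tabstops`, `to_areas` and `width_at_least` are hypothetical, and the linear programming step that actually solves the constraints is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Rect:
    """A control with hard-coded pixel coordinates."""
    name: str
    left: int
    top: int
    right: int
    bottom: int


def infer_tabstops(rects: List[Rect]) -> Tuple[List[int], List[int]]:
    """Collect the distinct x and y positions used by the controls."""
    xs = sorted({r.left for r in rects} | {r.right for r in rects})
    ys = sorted({r.top for r in rects} | {r.bottom for r in rects})
    return xs, ys


def to_areas(rects: List[Rect], xs: List[int],
             ys: List[int]) -> Dict[str, Tuple[str, str, str, str]]:
    """Describe each control symbolically as (x_left, y_top, x_right, y_bottom)."""
    return {
        r.name: (f"x{xs.index(r.left)}", f"y{ys.index(r.top)}",
                 f"x{xs.index(r.right)}", f"y{ys.index(r.bottom)}")
        for r in rects
    }


def width_at_least(area: Tuple[str, str, str, str], minimum: int,
                   values: Dict[str, int]) -> bool:
    """Check a hard constraint x_right - x_left >= minimum for given tabstop values."""
    x_left, _, x_right, _ = area
    return values[x_right] - values[x_left] >= minimum


controls = [
    Rect("label", 10, 10, 60, 30),
    Rect("text", 60, 10, 200, 30),
    Rect("ok", 150, 40, 200, 60),
]
xs, ys = infer_tabstops(controls)
areas = to_areas(controls, xs, ys)
# Current pixel positions of the horizontal tabstops, used to evaluate constraints.
x_values = {f"x{i}": v for i, v in enumerate(xs)}
```

Once controls are expressed over tabstop variables instead of pixels, a resize only changes the values assigned to the tabstops, while the areas and constraints remain the same.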

The input of the reverse engineering process is a hard-coded GUI, and the output is a set of areas containing the child controls and a set of linear constraints (equations/inequations with the tabstops as variables). From the point of view of the developer, a layout manager is provided that resolves the linear constraints and adapts the layout accordingly. There is an implementation of the layout manager for C#, so developers can use this layout manager to lay out containers such as Form elements. The reverse engineering algorithm uses some criteria to beautify the recovered layout, namely:

1. Controls can be slightly misplaced when creating the GUI. The algorithm can correct these misplacements by introducing some additional constraints.
2. Margins are standardised. Distances between controls or between controls and borders are adjusted so they are similar.
3. Sizes are standardised. For example, the controls in the same column are given the same height.
4. Rows/columns of similar controls are kept in a certain proportion to other rows/columns.
5. Real-world units such as centimetres are used so that GUIs are rendered consistently on different screen resolutions.

Rivero et al.

In [ ] the authors state that mockups have become a very popular artefact to capture GUI requirements in agile methods, but most development approaches use them informally without providing ways to reuse them in development processes. They advocate taking advantage of mockups during software development to automate the creation of GUIs, and they propose a model-driven approach for importing mockups and transforming them into a technology-dependent model that can be used to generate code for web technologies. They have set their approach in the context of a WebTDD process, though they claim that it can also be used with RUP-based processes or Extreme Programming.

The approach can be seen in Figure . . For each mockup tool, a parser needs to be created (step ). Then, the controls are rearranged as explained below (step ) and the Abstract Mockup model is obtained (step ), which helps to abstract mockups in a tool-independent way. This model can be used to derive UI class stubs or models implemented with a concrete technology. For each concrete technology of interest, a code generator must be constructed (step ).

Figure . : Schema of the Rivero et al. approach (extracted from [ ]).

Unlike common UI frameworks, mockup tools do not generally provide ways of defining UI control composition; instead, all the controls are at the same level (controls are not contained in other controls). The Abstract Mockup metamodel takes this issue into account in order to derive complete UI specifications for concrete technologies. The mockup parsers scan the UI specifications looking for controls and storing their properties (e.g., position or size), and they also detect clusters of controls, so each cluster represents a set of components in a unique graphic space (e.g., a page, a window or another grouping concept). Then, the Processing engine creates a hierarchy of controls as follows: if a control is graphically contained in another one and the containing control is a composite control (i.e., a panel), the contained control is added as a child of the containing one.

Because of the myriad of different web technologies, an absolute positioning scheme is not sufficient to model a UI in a platform-independent way. To avoid this problem, the Processing engine arranges components in a platform-independent layout. In particular, a GridBag layout similar to the Java Swing layout manager of the same name has been implemented. This layout manager arranges components in the same way as HTML tables do, and it was selected because the authors consider that it is richer and more flexible than others. The algorithm to obtain a GridBag layout starts by placing all the components in a single cell, and iteratively divides it, thus creating a grid of cells. In every iteration, a new column or row is created, and the algorithm stops when every cell is occupied by at most one widget. There may be widgets that take more than one cell, e.g. a text field t that occupies the space of two cells (t.colspan = 2).

Since their approach can be used in iterative processes in agile methodologies, UI evolution is an important concern. Between two iterations, existing UI controls may be modified, which could entail a problem if the automatically generated UI component identifiers change from the previous iteration. The solution proposed is to indirectly reference UI components by means of an identifier translation function (reference translator), which maps logical identifiers of UI components to real identifiers assigned by the code generator. Therefore, every time a control needs to be accessed, the reference translator is used. Then, the problem can be solved by correcting the real identifiers in the reference translator between iterations.

The proposed architecture is extensible, given that a developer can take the framework and extend it. In order to add a new mockup tool, a parser that returns a collection of control clusters must be implemented. In order to add a new target UI technology, a code generator must be implemented. The framework provides some helper classes (e.g. indentation for code generators) and uses object-oriented patterns such as the abstract factory or visitor pattern to make extension easy. As a proof of concept, the authors have tested the approach with different mockup tools (Pencil, GUI Design Studio and Balsamiq) and target web technologies (YUI and Ext JS).
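The idea of deriving grid cells and column spans from absolute coordinates, discussed in this section, can be sketched in Python as follows. This is a simplified, non-iterative variant written for illustration, not the cell-division algorithm of Rivero et al.: it assumes that the left and top edges of the widgets define the grid lines, and the `Widget` class and `infer_grid` function are hypothetical names.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Widget:
    """A mockup control positioned with absolute coordinates."""
    name: str
    x: int
    y: int
    width: int
    height: int


def infer_grid(widgets: List[Widget]) -> Dict[str, Tuple[int, int, int]]:
    """Return name -> (row, column, colspan), using left/top edges as grid lines."""
    col_starts = sorted({w.x for w in widgets})
    row_starts = sorted({w.y for w in widgets})
    cells = {}
    for w in widgets:
        col = col_starts.index(w.x)
        row = row_starts.index(w.y)
        # Number of column starts lying before the widget's right edge.
        end = bisect_left(col_starts, w.x + w.width)
        cells[w.name] = (row, col, max(1, end - col))
    return cells


form = [
    Widget("name_label", 10, 10, 80, 20),
    Widget("name_field", 100, 10, 120, 20),
    Widget("notes", 10, 40, 210, 60),
]
layout = infer_grid(form)
```

In this example the `notes` widget starts in the first column and extends past the start of the second one, so it is assigned a column span of two, analogous to the `t.colspan` case mentioned above.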

Sinha and Karim

A recent work by N. Sinha and R. Karim [ ] proposes a model-based approach to compile mockups to flexible web interfaces. The authors define a flexible layout as a layout that is fluid (when the window is resized the content scales accordingly) and elastic (the content resizes on changes in font sizes). Two phases are defined in the process of obtaining high-quality web pages from mockup editors (see Figure . ). The first phase is to infer the right page layout, i.e. the vertical/horizontal flow of content that preserves the relative sizes and alignment of individual elements. The second phase is to faithfully encode the inferred layout in an HTML page.

Figure . : Sinha and Karim approach (extracted from [ ]).

A mockup is defined as a collection of rectangular objects (boxes), each box having its visual properties (e.g., size or colour). Given that a native web application is laid out with HTML/CSS boxes, the authors propose a box-based layout. They suggest two box-based layouts: grid layout (a unique grid with n × m cells) and HVBox layout (a hierarchy of horizontal/vertical boxes), and they claim that the HVBox layout is preferred since grids result in fine-grained layouts which have additional overhead. They therefore decided to infer an HVBox layout from mockups.

In order to infer the box hierarchy their approach employs a combinatorial search, which is inspired by the explore-fail-learn paradigm used in constraint solving problems. The algorithm starts with the single boxes and applies a bottom-up approach to merge pairs of boxes until a solution is reached. When a pair of merging boxes intersects other boxes, the configuration is discarded since it will not reach a valid solution. After obtaining the layout tree, nodes that have children of the same type (vertical or horizontal) are compacted.

The HVBox layout is not natively supported in HTML/CSS; therefore, the boxes must be encoded to create the desired layout. They have a set of modular rules to encode the layout in HTML/CSS, such as rules to pre-compute the offset and height/width of an element relative to its parent (enclosing) box, rules to compute the size and margin in percentages of the width of the parent (height is left unconstrained), or rules to mark HTML tags as floating. The authors mention the following four additional implementation considerations:

• Rounding: prevent rounding errors in margin and size calculations from causing child content to overflow its parent.
• User guidance: the mockup may be ambiguous and not fully capture the designer's intent. There may be multiple valid merge choice sequences and therefore multiple feasible layouts. Consequently, the algorithm may not obtain the desired layout. The tool allows users to guide the algorithm by indicating in a configuration file which boxes can be merged or not.
• Browser incompatibilities: the pages may not be displayed correctly in browsers that do not implement CSS . completely.
• Overlapping boxes: the framework discards overlapping boxes before inferring the layout.

The approach has been tested with a mockup builder called Maqetta for a set of web pages constructed by the authors which follow common design patterns extracted from the web. They have also verified its correctness in some up-to-date web browsers. Tests resulted in high-quality replicas of the original mockups in most cases. Sometimes undesired boxes were merged together and user guidance was required, and in other cases fine-grained tweaks were required to fix the layout.
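The bottom-up merging of boxes described in this section can be sketched in Python as follows. This is a greedy simplification written for illustration and is not the combinatorial search of Sinha and Karim: it takes the first valid pair instead of exploring and learning from failed configurations, it does not compact nodes of the same type, and the function names are hypothetical.

```python
from itertools import combinations


def bounds(a, b):
    """Bounding box of two rectangles given as (x0, y0, x1, y1)."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def intersects(a, b):
    """True if the two rectangles share some interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def infer_hvbox(leaves):
    """leaves: dict name -> (x0, y0, x1, y1). Returns a nested (kind, first, second) tree."""
    nodes = [(name, rect) for name, rect in leaves.items()]  # (tree, rectangle)
    while len(nodes) > 1:
        for (i, (ta, ra)), (j, (tb, rb)) in combinations(enumerate(nodes), 2):
            merged = bounds(ra, rb)
            others = [r for k, (_, r) in enumerate(nodes) if k not in (i, j)]
            if any(intersects(merged, r) for r in others):
                continue  # the merged box would cut through another box: discard
            # Side-by-side boxes form a horizontal box; anything else is
            # treated as vertical (a simplifying assumption of this sketch).
            side_by_side = ra[2] <= rb[0] or rb[2] <= ra[0]
            kind = "H" if side_by_side else "V"
            axis = 0 if kind == "H" else 1
            pair = sorted([(ra, ta), (rb, tb)], key=lambda p: p[0][axis])
            subtree = (kind, pair[0][1], pair[1][1])
            nodes = [n for k, n in enumerate(nodes) if k not in (i, j)]
            nodes.append((subtree, merged))
            break
        else:
            raise ValueError("no valid merge found (overlapping boxes?)")
    return nodes[0][0]


mockup = {
    "label": (0, 0, 50, 20),
    "field": (60, 0, 200, 20),
    "button": (0, 30, 200, 50),
}
tree = infer_hvbox(mockup)
```

For this mockup the label and the field are first merged into a horizontal box, which is then stacked on top of the button in a vertical box. A greedy strategy such as this one can get stuck or pick an unintended layout where a search with backtracking or user guidance would not.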

O

In this section we will show some other works that do not strictly deal with layout inference but are somewhat related to the topic.

A well-known example of a GUI builder with code generation facilities for the NetBeans IDE is Matisse [ ]. It is a full-fledged design tool that supports the user in the GUI design and generates code that perfectly fits the design. The generated code targets Java Swing and is based on GroupLayout, a layout manager which was intentionally introduced to work with IDEs; the tool is tied to the NetBeans IDE.

An approach with which to migrate Windows applications to Visual Basic .NET can be found in [ ]. Its aim is to replicate the GUI's look and feel by mapping runtime objects to .NET objects, so explicit layout recovery is not tackled.

In [ ], the authors propose a pixel-based approach based on real-time interpretation of the GUI to identify the hierarchical model of complex widgets. This information is then used to modify an existing GUI (e.g. to translate the text of the widgets) independently of the interface implementation.

VAQUISTA [ ] is a tool which performs the reverse engineering of web pages into XIML [ ] models according to flexible heuristics, and requires user interaction during the reverse engineering process. In this case, the sources are web pages written in HTML which were laid out with tables, and the tool maps each table cell to a target element, so the table layout is replicated.

In [ ] an approach for extracting the web content structure based on the visual representation is proposed, which simulates how users understand the web layout structure based on their visual perception. The approach is tightly coupled to the nature of the HTML code and cannot be applied to coordinate-based interfaces.

Some other related works propose the reengineering of web pages, particularly to adapt them to mobile devices. The following two works fall into this area. In [ ] an approach with which to structure web pages in a two-level hierarchy is presented, in such a way that if a user selects a part of the web page, this part will be displayed at the full screen size, like a zoom-in. In [ ], a solution for generating dynamic web migratory interfaces is explained. The authors rely on the analysis of HTML tags in order to split the original web pages into regions that are transformed into web pages with hyperlinks between them. It is worth noting that UI reengineering approaches for web pages work on DOM trees, which are tree-based representations of the HTML code, in which the GUI structure is already explicitly expressed by means of HTML tags.

Discussion

We have presented several works related to the reverse engineering of GUIs, and we have focused on three of them that deal with layout inference, which are summarised in Table . . Next we contrast these approaches and indicate desirable features of a layout inference solution.

In two of the proposals (Rivero et al., Sinha and Karim) the source technology is a mockup and the target technology is a web technology, whereas in Lutteroth the source technology is a GUI programmed with object-oriented code and the target is a desktop toolkit. Two of the approaches (Lutteroth and Rivero et al.) are general, i.e., they can be used with any pair of source/target technologies, while the work of Sinha and Karim is tightly tied to the web target platform. It is clear that a generic solution (not tied to source/target technologies) is desirable. Since hard-coded GUIs and mockups both have implicit layouts expressed in pixel coordinates, the same approach could be used for both cases.

With regard to the extracted information, all these approaches collect some common data (sizes, margins), but they recreate the layout based on different information: Lutteroth uses constraints; Rivero et al. identify the widgets in each grid; and Sinha and Karim extract HTML boxes. We believe that a good layout inference approach should explicitly extract all the information we presented in Chapter . . Granted, some kinds of information can be used in place of others to obtain a similar visual appearance. For example, Lutteroth extracts information about constraints and margins, but does not obtain explicit information about the alignment between widgets, so two widgets placed one below the other with the same left margin may look aligned even though the layout manager does not explicitly know that they are aligned. Having explicit information about alignment and other features of the source GUI can ease the forward engineering step and lead to better adapted layouts.
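To make the point about explicit alignment concrete, the following is a minimal sketch of how left alignment could be derived from pixel coordinates. It is not taken from any of the reviewed approaches: the `Widget` type, the `left_aligned_pairs` function and the 2-pixel tolerance are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Widget:
    """A widget positioned with absolute pixel coordinates (hypothetical type)."""
    name: str
    x: int       # left edge
    y: int       # top edge
    width: int
    height: int


def left_aligned_pairs(widgets, tolerance=2):
    """Return pairs of widget names whose left edges differ by at most `tolerance` pixels."""
    pairs = []
    for a, b in combinations(widgets, 2):
        if abs(a.x - b.x) <= tolerance:
            pairs.append((a.name, b.name))
    return pairs


widgets = [
    Widget("nameLabel", x=10, y=10, width=60, height=20),
    Widget("nameField", x=80, y=10, width=120, height=20),
    Widget("ageLabel", x=11, y=40, width=60, height=20),
]
print(left_aligned_pairs(widgets))  # [('nameLabel', 'ageLabel')]
```

With such a relation stored explicitly in the layout model, the forward engineering step no longer has to rediscover alignment from margins alone.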
Table . : Summary of layout inference approaches

Approach                   | Lutteroth                      | Rivero et al.                           | Sinha and Karim
---------------------------|--------------------------------|-----------------------------------------|-------------------------
Source/target independence | Yes (*)                        | Yes                                     | No (target must be web)
Tested source technology   | Hard-coded GUIs (C#)           | Mockups (Pencil and others)             | Mockups (Maqetta)
Tested target technology   | Desktop toolkit                | Web (YUI, Ext JS)                       | Web (HTML/CSS)
Information extracted      | Positions, margins, sizes      | Containment hierarchy, layout structure | Boxes, margins, sizes
Layout model               | ALM (constraint model)         | GridBagLayout                           | HVFlow
Algorithm type             | Linear programming, heuristics | Heuristics                              | Exploratory
Implementation technology  | Programming language           | Model-based approach                    | Modular rules
Automation degree          | Automated                      | Automated                               | Automated

(*) Requires implementing the layout manager in every target technology

In Sinha and Karim and Rivero et al., the layout model that results from the reverse engineering process is a concrete layout manager model that can be found in numerous GUI frameworks (particularly GridBagLayout and HVFlow). In contrast, Lutteroth obtains a model with information about widget constraints. Given that nowadays most GUI frameworks offer layout managers, representing the design of the GUI in terms of layout managers makes the forward engineering step much easier than using other models such as the ALM model. The proof of concept of Lutteroth generates a C# GUI, which involved the creation of a layout manager in C# to deal with the linear constraints, so using his solution with another target technology would require programming the layout manager again. This is likely to be a more complex solution than mapping a predefined layout manager (e.g., GridBagLayout in Rivero et al.) to the set of layout managers of the target technology.

The representation used to define the GUI structure (the layout model) has a great impact on the forward engineering step of the process. It must be flexible enough to represent any design, but at the same time it must be close to the well-known existing layout managers in order to make the mapping to other GUI toolkits easy. The works of Sinha and Karim and Rivero et al. rely on single concrete layout managers, so the whole reverse engineering process is aimed at generating a design using a certain layout. However, when designing a GUI (either by programming or with visual builders) developers do not normally use a single layout manager but a composition of them. For this reason, we believe that the layout model should comprise a set of generic layout managers, in such a way that a layout is defined by using the layout managers that are most suitable for the concrete GUI. Moreover, it would also be desirable for the set of layout managers used in the layout model to be parameterised. The rationale is to avoid emulating layout managers or implementing new ones when they are not available in the target technology.

There is a variety of algorithmic techniques that can be used in the inference approach (linear programming and heuristics in Lutteroth, heuristics in Rivero et al.
and an exploratory algorithm in Sinha and Karim), and any of them can be equally valid.

The implementation technology may have some importance in the overall solution. Rivero et al. propose a model-based approach to implement the solution, whereas the others use imperative or object-oriented programming. We think that, compared with classical programming, a model-based approach brings additional benefits for implementing good-quality solutions. For instance, transformation chains offer a straightforward way to obtain source/target independence. MDE also brings other benefits such as automation, thanks to model transformations.

In short, we believe that a good layout inference solution should:
• be source/target independent
• provide explicit information for every layout feature
• use a layout model made up of a variety of layout managers to facilitate the layout definition, which can be selected by developers
• be implemented using a paradigm (e.g., MDE) that provides architectural benefits such as extensibility.
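As an illustration of the third requirement, a layout model built as a composition of generic layout managers could be sketched as follows. This is only a sketch under assumptions: the `Layout` and `Leaf` classes, the layout kind names and the `unsupported` helper are hypothetical and do not correspond to any metamodel presented in the reviewed works.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Leaf:
    """A single widget placed inside a layout."""
    widget: str


@dataclass
class Layout:
    """A container arranged by one generic layout manager ('flow', 'grid', 'border', ...)."""
    kind: str
    children: List[Union["Layout", Leaf]] = field(default_factory=list)


def kinds_used(node):
    """Collect the set of layout manager kinds that appear in a layout tree."""
    if isinstance(node, Leaf):
        return set()
    used = {node.kind}
    for child in node.children:
        used |= kinds_used(child)
    return used


def unsupported(node, allowed):
    """Layout kinds used by the model that the target toolkit does not offer."""
    return kinds_used(node) - set(allowed)


# A form described as a composition of layout managers rather than a single one.
form = Layout("border", [
    Layout("grid", [Leaf("nameLabel"), Leaf("nameField")]),
    Layout("flow", [Leaf("okButton"), Leaf("cancelButton")]),
])
print(sorted(kinds_used(form)))
print(unsupported(form, allowed=["border", "flow"]))
```

Restricting the inference to the `allowed` set is one way to read the parameterisation idea: the developer selects the layout managers available in the target technology, so none has to be emulated afterwards.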

.

A

In this section we comment on some works that perform some kind of reverse engineering of the GUI behaviour. We emphasise three of them that we considered most relevant to accomplishing the separation of concerns in legacy GUIs, though the discussion takes into account all nine works mentioned throughout this section, as they can be compared using the same criteria. Works presenting code analysis solutions that are not focused on the GUI but on other concerns (e.g., business rules), such as [ ] and [ ], which present C++ static analysis solutions to generate UML models, have been excluded from this discussion.

We classify each work according to the following four criteria:

1. Source artefacts: the artefacts that are the input of the analysis process (including the programming languages and UI toolkits used). For example, Gtk C++ files.
2. Extracted information: the output of the analysis. For instance, a state machine model representing the flow of events.
3. Goal: the purpose for which the information extracted by the analysis is going to be used.
4. Analysis type: it can be static (the source code is analysed statically), dynamic (it analyses information that is collected when executing the code in some way), or hybrid (it uses both static and dynamic analysis).

Memon (GUIRipping)

In [ ] an approach called GUIRipping is described, which reverse engineers a runtime GUI into three models, namely a GUI forest, an event-flow model and an integration tree¹. These models, which we explain next, are intended to be used to automatically generate test cases. The approach has been implemented in a tool called GUIRipper.

¹ The author later refers to all the aforementioned models as an event-flow model.

The GUI forest is a representation that indicates, for each window, which other windows are opened when performing an event in it. Two kinds of windows are distinguished: modal windows and modeless windows. The former, once invoked, monopolise the GUI interaction, whereas the latter do not restrict the user focus. The author defines a component as a modal window together with the modeless windows that have been directly or indirectly invoked from it.

In an event-flow graph for a specific component (a modal dialog), the vertices represent all the events in the component. The outgoing directed edges from a vertex indicate which vertices can be reached from that vertex (i.e., which events can be performed immediately after the event associated with that vertex). Five types of events are identified:
• Restricted-focus events: open modal windows.
• Unrestricted-focus events: open modeless windows.
• Termination events: close modal windows.
• Menu-open events: open menus.
• System-interaction events: interact with the underlying software to perform some action.

The integration tree is constructed to show the invocation relationships among components (modal dialogs) in a GUI. It is obtained by integrating the information of the GUI forest and the event-flow model. This decomposition of the GUI makes the testing process intuitive for the test designer, who can focus on a specific part of the GUI.

GUIRipper first obtains the GUI forest by performing a depth-first traversal of the hierarchical structure of the GUI. The runtime GUI is analysed (e.g., using the Windows API in the case of a Windows application) to get the top-level windows, the executable widgets (widgets that invoke other GUI windows), and the windows that are opened by performing events on executable widgets.
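The two structures described above can be sketched in a few lines. This is a simplified illustration and not the GUIRipper implementation: the input format (a dictionary from each window to the windows it can open), the function names and the example window and event names are all assumptions.

```python
def rip_gui_forest(opens, top_level):
    """Depth-first traversal recording which window opens which.

    `opens` (hypothetical input format) maps a window name to the windows
    that appear when events are fired on its executable widgets.
    """
    forest = {}
    visited = set()

    def visit(window):
        visited.add(window)
        forest[window] = []
        for child in opens.get(window, []):
            if child not in visited:      # each window appears once in the forest
                forest[window].append(child)
                visit(child)

    for window in top_level:
        if window not in visited:
            visit(window)
    return forest


def can_follow(event_flow, first, second):
    """True if `second` can be performed immediately after `first`."""
    return second in event_flow.get(first, set())


opens = {"Main": ["Open", "Preferences"], "Open": ["Error"], "Preferences": ["Error"]}
forest = rip_gui_forest(opens, top_level=["Main"])

# Event-flow graph of one component: vertex -> events that may follow it.
event_flow = {"File": {"Open...", "Exit"}, "Open...": {"OK", "Cancel"}}
print(forest)
print(can_follow(event_flow, "File", "Exit"))
```

In a real ripper the `opens` relation is not given in advance but discovered by executing events on the running application, which is why the traversal and the extraction are interleaved.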
During the traversal of the GUI, the event type is also determined by using low-level calls. After the automated ripping process has finished, manual inspection is required, since some information cannot be extracted by GUIRipper.

The event-flow model can be used in the definition of event-space exploration strategies for automated model-based testing, particularly: i) goal-directed search for model checking, ii) graph exploration for test-case generation, and iii) operator execution for test-oracle creation. The author delves into these strategies for several scenarios in [ ].

Heckel et al.

In [ ] a methodology to deal with the evolution of legacy systems to three-tier architectures and Service Oriented Architectures (SOA) is proposed. This methodology is based on the Horseshoe Model introduced in Section  and consists of three steps, namely reverse engineering, redesign, and forward engineering, preceded by a preparatory step of code annotation, as can be seen in Figure . .

Figure . : Approach of Heckel et al. (extracted from [ ]).

The source code elements (packages, classes, methods, or code fragments) are annotated with code categories (step ) with respect to their architectural function in the target system, e.g., GUI, application logic or data. Annotations are manually written by developers in the original source code in the form of comments, and they are propagated through the code by categorisation rules defined at the level of abstract syntax trees, so developers do not need to annotate all the source code elements.

From the annotated source code, a graph model is created (step ), whose level of detail depends on the annotation. The graph model is a reduced Abstract Syntax Tree (AST) representation where the nodes are packages, classes, methods, parameters and variables, and additionally CodeBlocks to represent groups of statements, and the edges represent the order of the nodes. Moreover, there is a node type to represent the categories of a code element. Then, all the contiguous statements that are annotated in the same way are grouped in the same CodeBlock node, which is associated with a category. This step is a straightforward translation of the relevant part of the code into its graph-based representation. The relation between the original (annotated) source code and the graph model (relation R ) is kept to support traceability.

During the redesign phase (step ) the source graph model is restructured to reflect the association between code fragments and target architectural elements. Code categories guide the automation of the transformation process. This transformation is specified by graph transformation rules aimed at performing code refactoring. The relation with the original source code is kept (relation R ) in order to support the code generation.

The target code is either generated from the target graph model and the original source code or obtained through the use of refactorings at the code level (step ). The result of this step is the annotated code of the new system written in the target language.
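The grouping of contiguous statements into CodeBlock nodes can be illustrated with a short sketch. It assumes that every statement already carries its category as a plain string, which skips the annotation and propagation rules of the original approach; the function name and the example statements are hypothetical.

```python
from itertools import groupby


def group_into_code_blocks(statements):
    """Group contiguous statements sharing a category into (category, [statements]) blocks.

    `statements` is a list of (statement_text, category) pairs in source order.
    """
    return [
        (category, [stmt for stmt, _ in run])
        for category, run in groupby(statements, key=lambda pair: pair[1])
    ]


annotated = [
    ("name = field.getText()", "GUI"),
    ("age = ageField.getText()", "GUI"),
    ("total = compute(name, age)", "Logic"),
    ("db.save(total)", "Data"),
    ("label.setText(total)", "GUI"),
]
blocks = group_into_code_blocks(annotated)
for category, stmts in blocks:
    print(category, stmts)
```

Note that the last GUI statement forms its own block: only contiguous statements are merged, so the order of the original code is preserved in the graph.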

Morgado et al. (ReGUI)

This work [ ] presents a dynamic reverse engineering approach and a tool (ReGUI) aimed at reducing the effort of producing visual and formal representations of the GUI, which enable the verification of properties or can serve as the input of Model-Based GUI Testing techniques.

Figure . : Approach of Morgado et al. (ReGUI) (extracted from [ ]).

The approach, which is depicted in Figure . , has two main components: the analyser and the abstractor. The analyser component uses UI Automation, the accessibility framework for the Microsoft Windows operating systems that support Windows Presentation Foundation. With this framework, the runtime instances of a Windows application can be explored. During the exploration process, every menu option is navigated to extract its initial state (i.e., enabled or disabled), and each menu option is triggered to check which windows are opened because of that interaction and whether the state of any element has changed.

The analyser extracts information about the GUI elements and their interactions. In particular, the analyser distinguishes two kinds of GUI elements: Windows, which can be modal or modeless, and Controls, which can be menu items or other controls. The interactions between the GUI elements can be of five different types: Open, a window is opened; Close, a window is closed; Expansion, new controls become accessible (e.g., the expansion of a menu); Update, one or more properties of one or more GUI elements are updated; Skip, nothing happens.

The abstractor component generates different views on the extracted information, which are:
• ReGUI tree: represents the different aspects of the structure of the GUI (e.g., the containment hierarchy of a menu).
• Navigation graph: stores information about which user actions must be performed in order to open the different windows of the application.
• Window graph: a subset of the information represented in the navigation graph that describes the windows that may be opened in the application.
• Disabled graph: shows which nodes are accessible but disabled at the beginning of the execution.
• Dependency graph: a dependency between two elements means that interacting with the former modifies the value of a property in the latter. This representation shows all the dependencies among controls.

Apart from these views, which can be used to inspect the GUI, a Spec# model and a Symbolic Model Verification (SMV) model can be generated. Spec# is a formal specification language that can be used as input to Spec Explorer [ ], an automatic model-based testing tool for test generation. An SMV model can be used in combination with model checking techniques to verify properties, which is useful, for example, in usability analysis.
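The five interaction types lend themselves to a small sketch that classifies the effect of one user action by comparing the GUI state before and after it. This is not the ReGUI implementation: the snapshot format, the function name and the precedence among the checks are assumptions made for illustration.

```python
def classify_interaction(before, after):
    """Classify the effect of one user action by comparing two GUI snapshots.

    A snapshot (hypothetical format) is a dict with:
      "windows":  set of open window names
      "controls": dict mapping accessible control names to their properties
    The order of the checks is an assumption of this sketch.
    """
    if after["windows"] - before["windows"]:
        return "Open"
    if before["windows"] - after["windows"]:
        return "Close"
    if set(after["controls"]) - set(before["controls"]):
        return "Expansion"
    # A control that changed (or is no longer accessible) counts as an update here.
    if any(after["controls"].get(name) != props
           for name, props in before["controls"].items()):
        return "Update"
    return "Skip"


before = {"windows": {"Main"}, "controls": {"Save": {"enabled": False}}}
after_typing = {"windows": {"Main"}, "controls": {"Save": {"enabled": True}}}
after_about = {"windows": {"Main", "About"}, "controls": {"Save": {"enabled": False}}}

print(classify_interaction(before, after_typing))  # Update
print(classify_interaction(before, after_about))   # Open
print(classify_interaction(before, before))        # Skip
```

An Update result such as the first one is also the kind of observation from which a dependency graph edge could be derived, linking the control that was acted upon to the control whose property changed.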

O

We now summarise other works that analyse UI behaviour and are less relevant to the purpose of separating concerns. First we comment on two static analysis approaches [ ] [ ], and then we review four dynamic analysis approaches [ ] [ ] [ ] [ ].

In [ ] a static analysis for GUIs is presented, which extracts information about the GUI from the source code. It is targeted at applications written in programming languages such as C/C++ using GUI libraries such as GTK [ ] or Qt [ ]. The goal is to extract, from the source code, the widget hierarchies forming the windows together with the widget attributes and event handlers. GUI detection is performed to determine which types, variables, functions and files are relevant to the GUI. Then the ISSA (Interprocedural Static Single Assignment) form is used to detect the widget hierarchy, and also to determine the widget attributes and event handlers. After detecting the GUI and obtaining the widget hierarchy, a window graph is created. In this graph the nodes are windows, and an edge indicates that an event raised in the first window can create or show the second window. Edges are therefore labelled with events or sets of events. In order to create the outgoing edges for the nodes, the algorithm inspects all the event handlers for the events issued by members of the widget hierarchy of the window. An event handler gives rise to an edge if the handler itself, or some function directly or transitively called by it, creates or shows a window, and if no window is created or shown along the control-flow path in between.

The work presented in [ ] proposes an approach to obtain state machines of the transitions between windows based on source code written in Java. The approach is implemented by three tools: FileParser, which parses a particular code file; ASTAnalyser, which slices the Abstract Syntax Tree (AST) obtained by FileParser; and Graph, which generates metadata files with the state machines. The approach uses Strategic Programming and Program Slicing to isolate the parts of the code which are related to the GUI, in order to make the approach easily retargetable to different programming languages and GUI toolkits.
The state machine representation they propose is a graph where states represent windows and transitions include: i) the internal state of the window (useful, for example, to detect window complexity), ii) the user action that triggers the event, and iii) the condition that must hold for the transition to occur.

Stroulia et al. [ ] propose a method for migrating Text-based User Interfaces (TUIs) in the context of the CelLEST project. These TUIs are part of legacy distributed systems in which terminals interact with a mainframe by means of a communication protocol. The novelty of the method lies in modelling the dynamic behaviour of the system based on traces of the user interaction with the system, instead of focusing on the structure of the system code. The reverse engineering phase is based on the analysis of the dynamic traces generated by real user interaction. In order to obtain traces, they propose using an emulator that provides users with a text-based interface that mimics the original hardware terminals used to access the host system, on which the legacy application resides, by implementing the communication protocol between the host and the emulator user interface. The emulator is instrumented so that it also records the interaction between the legacy application and its users. A trace recorded by this emulator consists of a sequence of snapshots of the screens forwarded by the legacy application to the user's terminal. Between every two snapshots, the user keystrokes are recorded. The result is a model of the TUI behaviour represented as a directed state-transition graph. The graph nodes correspond to the distinct interface screens, which are identified by clustering all the screen snapshots contained in the recorded trace according to their visual similarity. Each edge of the graph corresponds to an action that can be taken, i.e., a command that can be executed when the source-screen node is visible to the user and that leads to the destination-screen node.

A GUI test generation approach based on symbolic execution is presented in [ ]. The GUI testing framework (named Barad) generates values for data widgets and enables a systematic approach that uniformly addresses the data flow as well as the event flow for white-box testing of a GUI application. The approach is applied to Java event handlers. First, the event handler bytecode is instrumented, i.e., it is modified to execute custom code after every statement. During the instrumentation, an inline version of the program (with branching statements removed) is generated, in which primitives, strings, and conditional instructions are replaced with the corresponding symbolic values. Then, the code is symbolically executed. Basically, symbolic execution uses symbolic values instead of actual data, and represents the values of program variables as symbolic expressions. The symbolic execution is performed by applying chronological backtracking that visits all the branches of the program. For a branch to be explored, the set of constraints of the states must be satisfied.
When an entire branch has been executed, the test case for that branch is generated, and the program state (the set of values of the variables) is restored. After the test cases have been generated, some heuristics are applied to reduce the test suite. The resulting suite maximises the code coverage while minimising the number of tests needed to systematically check the GUI.

In [ ] the authors present a reverse engineering approach for abstracting Finite State Machines representing the client-side behaviour offered by Rich Internet Applications (RIAs). The reverse engineering process consists of two activities: extraction and abstraction. During the extraction activity, the user interacts with the RIA in a controlled environment and the sequences of events are registered. The abstraction activity is composed of three tasks: RIA Transition Graph building, Clustering, and Concept assignment. The first task builds the Transition Graph from the traces stored in the extraction activity. This graph models the flow of RIA views that were generated. The second task analyses the Transition Graph and clusters the nodes and edges that are equivalent. The Finite State Machine models the event listeners that are associated with DOM elements of a web page, which can be: user event listeners, time event listeners (due to the occurrence of timeout conditions) and HTTP response event listeners (due to the reception of responses to some HTTP request). It also models the transitions between web pages and the events that caused those transitions. These events can be associated with web page requests (traditional HTTP requests) or XmlHttpRequests (asynchronous Ajax requests).

Mesbah et al. [ ] describe a technique for crawling Ajax-based applications through automatic dynamic analysis of user interface state changes in web browsers. The analysis process infers a state machine that models the navigational paths within an Ajax application, which can be used in program comprehension, analysis and testing. The analysis works in the following way. First, the Controller traverses the web page to find clickable elements, which are elements that have event listeners and can cause a state transition. For each element, the crawler instructs the Robot to fill in the form fields and fire events on the elements in the browser. When the events are triggered on the clickable elements, changes in the DOM tree are produced. Then the DOM Analyzer compares the current DOM tree with the previous one by using some heuristics. If a state change is detected, a new state is created and added to the state machine. If a similar state is recognised, that state is used for adding a new edge (no new state is created). The algorithm uses backtracking to recursively traverse all the code branches until all the code is executed. When applying backtracking, the DOM tree has to be set to a previous state. This is achieved by using the browser history if the Ajax application supports it, or otherwise by reproducing the event sequence from the initial state.
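The crawling loop described for Ajax applications can be sketched over a simulated application. This is only an illustration under strong assumptions: the "DOM" is a plain string, the `clickables` and `fire` functions stand in for the Controller and the Robot, state comparison is plain equality rather than a heuristic, and restoring a previous DOM is free because the simulated `fire` has no side effects.

```python
def crawl(initial_dom, clickables, fire, same_state=lambda a, b: a == b):
    """Infer a state machine (states, edges) by firing events and comparing DOMs.

    clickables(dom) -> iterable of element names that have event listeners
    fire(dom, element) -> DOM obtained after firing the event on `element`
    Both callables are hypothetical stand-ins for a real browser driver.
    """
    states = [initial_dom]
    edges = []  # (source state index, element, target state index)

    def index_of(dom):
        for i, known in enumerate(states):
            if same_state(known, dom):
                return i
        return None

    def explore(i):
        for element in clickables(states[i]):
            new_dom = fire(states[i], element)
            if same_state(states[i], new_dom):
                continue                      # no state change detected
            j = index_of(new_dom)
            is_new = j is None
            if is_new:
                states.append(new_dom)
                j = len(states) - 1
            edges.append((i, element, j))
            if is_new:
                explore(j)                    # recursion plays the role of backtracking

    explore(0)
    return states, edges


# Toy application: DOM name -> {clickable element: resulting DOM}.
APP = {
    "home": {"news": "news-list", "about": "about-page"},
    "news-list": {"back": "home"},
    "about-page": {"back": "home"},
}
states, edges = crawl("home", lambda dom: sorted(APP[dom]), lambda dom, el: APP[dom][el])
print(states)
print(edges)
```

In a real browser the recursion cannot simply return to an earlier DOM, which is why the approach resorts to the browser history or to replaying the event sequence from the initial state.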

Discussion

We have reviewed some of the most relevant approaches to date on the reverse engineering and reengineering of UI behaviour. We now reflect on these works.

First of all, the majority of the works ( out of ) coincide in representing the behaviour by means of some sort of state machine (transition graph) where the states represent views and the transitions represent the events that trigger the changes. The granularity of the states and events represented differs between the works. For instance, in [ ] events represent transitions between complete views, so the state machine is used as a model of the navigation among them. In contrast, in [ ] events represent changes in parts of a view, as happens in Ajax applications, which is a much finer granularity than in the previous work. In [ ] several models that focus on specific behaviour are even created, such as a model to find out which elements that are disabled at the beginning become accessible after a sequence of events. Therefore, depending on the purpose of the reverse engineering or reengineering, different information represented in the form of a state machine may be useful.

Table . : Summary of the behaviour extraction approaches (PC stands for Program Comprehension)

Approach          | Source artefacts           | Extracted information               | Goal                   | Analysis type
------------------|----------------------------|-------------------------------------|------------------------|--------------
Memon et al.      | Runtime GUI (Java/Windows) | Transition graph                    | Testing                | Dynamic
Heckel et al.     | Annotated code (Java)      | AST-like graph with code categories | Migration to 3 tiers   | Static
Morgado et al.    | Runtime GUI (Windows)      | Interaction model                   | PC, verif. properties  | Dynamic
Staiger           | GTK/Qt code (C/C++)        | Widget hierarchy, transition graph  | Maintenance            | Static
Silva et al.      | Java code                  | Transition graph                    | PC, testing            | Static
Stroulia et al.   | TUI runtime traces         | Transition graph                    | Migration to the web   | Dynamic
Ganov et al.      | Java bytecode              | Symbolic tree, test suite           | Test generation        | Dynamic
Amalfitano et al. | Instrumented RIA           | Transition graph                    | Maintenance, testing   | Dynamic
Mesbah et al.     | Ajax web applications      | Transition graph                    | PC, analysis, testing  | Dynamic

With regard to the goal of the reverse engineering, most of the works are aimed at testing or program comprehension ( out of ), and a few works ( out of ) are targeted at generating a new system. In [ ] the separation of legacy applications into layers in order to generate web applications is proposed, and the idea of abstracting the source code into a model that guides the generation of the new system is introduced. It is worth remarking that, unlike the rest of the works, it addresses a separation of concerns, particularly from the point of view of the architecture of the application (business logic, UI, data access). In that work, the reverse engineering is assisted by the developer, who must tag the code parts so that the tool knows which layer the code belongs to. This procedure is useful, but developers must spend time inspecting the whole code by hand.

Most approaches ( out of ) are based on dynamic analysis while the rest apply static analysis. This is because it is easier to determine which views are displayed from other views with dynamic analysis. In general, static and dynamic analysis provide different kinds of information: static analysis can access all the code (whether it is executed or not), so the information about all the possible states of the application is available, whereas dynamic analysis can only obtain data about the states that are actually reached during execution. Moreover, when no source code is available, dynamic analysis is the only option.
A scenario in which static analysis is not enough to obtain proper information is the reverse engineering of Ajax applications [ ], in which case dynamic analysis is also required. On the other hand, static analysis can access all the code, which is necessary to accomplish a faithful migration of the code. In addition, static analysis is faster and easier to perform than dynamic analysis, which implies executing the code and possibly redeploying the application or running the source runtime platform (e.g., the Oracle Forms runtime environment).

To sum up, we consider that an approach for extracting the behaviour of the GUI aimed at migration should:
• separate the different concerns that are tangled in the code of event handlers but, unlike [ ], avoid marking code by hand.
• represent the transitions between views and the dependencies between widgets by means of a state-machine-like representation, as there is a wide consensus on this.
• rely on static analysis if source code is available, given that the whole information about the GUI is needed and runtime information alone is not enough.

.

GUI

This section describes well-known metamodels (KDM, IFML) and User Interface Description Languages (UIDLs) that can be used to represent user interfaces. We also introduce the Cameleon framework: although it is neither a metamodel nor a UIDL, it establishes different abstraction levels that are desirable for modelling user interfaces, and it is used by many UIDLs. Since these approaches are rather heterogeneous, we do not classify them as we did in the previous sections, but restrict ourselves to describing them and giving some examples.

Knowledge Discovery Metamodel (KDM)

Section . . introduced KDM as the core element of the ADM initiative. KDM is a metamodel aimed at representing software systems at different levels of abstraction, which range from program elements to business rules. KDM is intended to facilitate the interoperability between software modernisation tools, as a common representation for software artefacts. It is a very large metamodel that is composed of twelve packages organised in four layers: Infrastructure, Program elements, Runtime resources and Abstractions (see Figure . ). Each package defines a set of metamodel elements whose purpose is to represent a certain independent facet of knowledge related to existing software systems.

Figure . : KDM layers and packages (extracted from [ ]).

The packages defined in the specification are:
• Core and Kdm: define common elements that constitute the infrastructure for other packages.
• Source: enumerates the artefacts of the existing software system and defines the mechanism of traceability links between the KDM elements and their original representation.
• Code: focuses on representing common program elements supported by various programming languages, such as data types, data items, classes, procedures, macros, prototypes, and templates, and several basic structural relationships between them.
• Action: along with the Code package, it represents the implementation-level assets of the existing software system. This package focuses on behaviour descriptions and the control-flow and data-flow relationships determined by them.
• Platform: defines a set of elements whose purpose is to represent the runtime operating environments of existing software systems.
• UI: represents facets of information related to user interfaces, including their composition, their sequence of operations, and their relationships to the existing software systems.
• Event: specifies the high-level behaviour of applications, in particular event-driven state transitions.
• Data: describes the organisation of data in the existing software system.
• Structure: represents architectural components of existing software systems, such as subsystems, layers, packages, etc., and defines the traceability of these elements to other KDM facts for the same system.
• Conceptual: provides constructs for creating a conceptual model during the analysis phase of knowledge discovery from existing code.
• Build: represents the facts involved in the build process of the given software system (including but not limited to the engineering transformations of the "source code" to "executables").

From the point of view of the migration of graphical user interfaces, four of these packages can be useful: the Code, Action, UI and Event packages. The Code and Action packages can be used together to represent programming code independently of the specific programming language. The UI package was conceived to represent the elements and behaviour of GUIs. Next we delve into the UI package to analyse its usefulness. Figures . , . and . compose the UI package. As there are many dependencies among the packages, we only comment briefly on those metaclasses that are relevant for us.

Figure . : KDM metamodel. UI package (UIResources) (extracted from [ ]).

Figure . shows the UIResources that can be defined: Screens, Reports, UIFields and UIEvents. Screens are units of display in an application, such as windows or web pages, and Reports are printed units of display, like a printed report. UIField is a generic element that represents any field in a Screen or Report, such as a text field or a combo box. UIEvents can be declared and associated with a UIAction; a UIAction can have zero or more associated events (e.g., a UIAction called 'navigate' can be triggered by many UIEvents such as 'click' or 'select'). Note that UIResources can contain other UIResources (e.g., a Screen can contain UIFields that in turn contain UIEvents).
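The containment relationship among UIResources can be illustrated with a minimal Python sketch. The class names mirror the KDM metaclasses mentioned above, but the attributes and methods (`children`, `add`, `walk`, `events`) and the sample screen are assumptions made for illustration only; they are not part of the KDM specification or of any KDM tool API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class UIResource:
    """Generic UI resource; any resource may contain other resources."""
    name: str
    children: List["UIResource"] = field(default_factory=list)

    def add(self, child: "UIResource") -> "UIResource":
        """Nest a resource inside this one and return the nested resource."""
        self.children.append(child)
        return child

    def walk(self) -> Iterator["UIResource"]:
        """Yield this resource and every nested resource, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


class Screen(UIResource):
    """Unit of display such as a window or a web page."""


class UIField(UIResource):
    """Generic field (widget) inside a Screen or Report."""


class UIEvent(UIResource):
    """Event that can be declared on a resource."""


@dataclass
class UIAction:
    """Action associated with zero or more events (simplified, hypothetical)."""
    name: str
    events: List[UIEvent] = field(default_factory=list)


# Hypothetical screen: a Screen contains UIFields that in turn contain UIEvents.
login = Screen("Login")
login.add(UIField("username"))
button = login.add(UIField("okButton"))
click = button.add(UIEvent("click"))
select = button.add(UIEvent("select"))
navigate = UIAction("navigate", events=[click, select])

names = [resource.name for resource in login.walk()]
```

Note that every widget in this sketch is a plain `UIField`: the metamodel, as described, does not distinguish a text field from a button.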

Figure . : KDM metamodel. UI package (UIRelations) (extracted from [ ]).

In Figure . there are two generic relationships, UILayout and UIFlow. UILayout indicates the layout of a UIResource, and UIFlow allows defining the flow of Screens (without indicating the event that originated it).

Figure . : KDM metamodel. UI package (UIActions) (extracted from [ ]).

The diagram of Figure . defines several relationships between a UIResource and an ActionElement. The latter is defined in the Action package and refers to a block of code. These relationships represent the effect of a block of code on the UIResources: Displays allows a UIResource to be shown, ReadsUI takes the value of a UIField, WritesUI puts a value in a UIField, DisplaysImage shows an image, and ManagesUI represents other accesses to the UIResources.

As can be seen, the UI package can be used to represent the logical structure of views, the spatial relationships among the UI elements (layout), and the events associated with them. However, the specification offers just a few generic concepts for them. For example, for the logical structure it defines Screen as a container and UIField as a generic widget, and for the layout of the elements it defines a single generic layout (UILayout).

Finally, the Event package is not aimed at expressing the event flow of the GUI (which is actually addressed in the UI package), but at describing the behaviour of the entire system as a state machine. It could be used to some extent to express the behaviour of the UI, though it was not conceived for that goal. . .

Interaction Flow Modeling Language (IFML)

The Interaction Flow Modeling Language (IFML) [ ] has been recently adopted (March, ) as an OMG specification for building visual models of user interactions and front-end behaviour in software systems. As indicated in [ ], IFML can be seen as the consolidation of the Web Modelling Language (WebML) [ ], defined and patented about years ago as a conceptual model for data-intensive web applications. In fact, WebRatio, which has been supporting WebML over the years, is now adopting IFML as its official notation.

The objective of IFML is to provide system architects, software engineers, and software developers with tools for the definition of Interaction Flow Models that describe the principal dimensions of an application front-end: the view part of the application, made of containers and view components; the objects that embody the state of the application and the business logic actions that can be executed; the binding of view components to data objects and events; the control logic that determines the sequence of actions to be executed after an event occurrence; and the distribution of control, data and business logic at the different tiers of the architecture.

An IFML diagram consists of one or more top-level view containers. Each view container can be internally structured in a hierarchy of sub-containers. The child view containers nested within a parent view container can be displayed simultaneously or in mutual exclusion. A view container can contain view components, which denote the publication of content or interface elements for data entry (e.g., input forms). A view component can have input and output parameters. A view container and a view component can be associated with events, to denote that they support the user's interaction. Events are rendered as interactors, which depend on the specific platform and therefore are not modelled in IFML but produced by the PIM to PSM transformation rules.

The effect of an event is represented by an interaction flow connection, which connects the event to the view container or component affected by the event. The interaction flow expresses a change of state of the user interface: the occurrence of the event causes a transition of state that produces a change in the user interface. An event can also trigger an action, which is executed prior to updating the state of the user interface. An input-output dependency between view elements (view containers and view components) or between view elements and actions is denoted by parameter bindings associated with navigation flows (interaction flows for navigating between view elements).
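The concepts described above (nested view containers, mutual exclusion, events, interaction flows and parameter bindings) can be sketched in a few lines of Python. This is only an illustrative approximation: the class names, the `xor` flag, the `fire` helper and the artist/album containers are hypothetical and do not reproduce the IFML metamodel or any IFML tool API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ViewContainer:
    name: str
    xor: bool = False                      # True: children are mutually exclusive
    children: List["ViewContainer"] = field(default_factory=list)
    active: Optional[str] = None           # child currently shown when xor is True

    def visible(self) -> List[str]:
        """Names of the containers currently displayed, depth first."""
        shown = [self.name]
        for child in self.children:
            if not self.xor or child.name == self.active:
                shown.extend(child.visible())
        return shown


@dataclass
class InteractionFlow:
    target: str                            # container affected by the event
    bindings: Dict[str, str] = field(default_factory=dict)  # output -> input


def fire(parent: ViewContainer, flows: Dict[str, InteractionFlow],
         event: str, **outputs: str) -> Dict[str, str]:
    """Follow the flow attached to `event`: switch the displayed child of
    `parent` and return the parameters bound to the target's inputs."""
    flow = flows[event]
    parent.active = flow.target
    return {dst: outputs[src] for src, dst in flow.bindings.items()}


# Hypothetical master-detail view: a list plus two mutually exclusive details.
details = ViewContainer("Details", xor=True,
                        children=[ViewContainer("Artist"), ViewContainer("Album")])
top = ViewContainer("AlbumsAndArtists",
                    children=[ViewContainer("ArtistList"), details])
flows = {
    "selectArtist": InteractionFlow("Artist", {"selectedArtist": "artist"}),
    "selectAlbum": InteractionFlow("Album", {"selectedAlbum": "album"}),
}

before = top.visible()
inputs = fire(details, flows, "selectArtist", selectedArtist="Miles Davis")
after = top.visible()
```

Keeping the flows in a dictionary separate from the containers reflects the idea that events and their effects are modelled explicitly rather than buried in handler code.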

Figure . : Example of user interface (left) and corresponding IFML model (right) (extracted from [ ]).

The left part of Figure . shows two states of the same view, and the right part represents the IFML diagram. In the example there is one top-level container (Albums&Artists) that comprises three view containers: one with a list of artists and of their albums, one with the details of an artist, and one with the details of an album. The latter two view containers are mutually exclusive, so if a user selects an artist, the details of that artist are displayed, and if the user selects an album, the details of the album are displayed. . .

C

The Cameleon framework [ ] is a model-based approach devised to cover the design, maintenance and evolution of a multi-target user interface. This framework does not describe concrete metamodels but recommends an architecture of models and the way it can be used to deal with forward engineering and reengineering of user interfaces. The overall architecture is shown in Figure . (arrows indicate which models originate other ones). Three types of models are differentiated: ontological, archetypal and observed.

Figure . : Cameleon framework (extracted from [ ]).



The ontological models (left side of the figure) are metamodels of the concepts (and their relationships) involved in a multi-target UI. These models are instantiated into archetypal and/or observed models, which depend on the domain and the interactive system being developed.

• Archetypal models are declarative models that serve as input to the design of a particular interactive system. They are instances of the ontological models for a specific target.
• Observed models are executable models that support the adaptation process at runtime. They have been omitted in Figure . because they are out of the scope of our work and will not be explained.

The types of ontological models are:

• Domain Models: cover the domain concepts and user tasks. Domain concepts denote the entities that users manipulate in their tasks. Tasks refer to the activities users undertake in order to reach their goals with the system.
• Context of Use Models: describe the context of use in terms of the user, the platform and the environment.
• Adaptation Models: specify the reaction to adopt when the context of use changes. They include information about the new UI to switch to, and the particular transition UI to be used during the adaptation process.

The ontological models are independent of any domain and interactive system, and define key dimensions for a given retargeting. In contrast, archetypal models are instances of the ontological models in a specific context (a specific domain, platform, etc.). The information of the archetypal models is used to express a UI at four levels of abstraction, from the task specification to the running interface:

• Task and Concepts level. It corresponds to the Computational-Independent Model (CIM) in MDA [ ] and considers: (a) the logical activities (tasks) that need to be performed in order to reach the user goals and (b) the domain objects manipulated by these tasks. Often tasks are represented hierarchically along with indications of the temporal relations among them and their associated attributes. This level uses the information of the Concepts, Tasks and User models.
• Abstract User Interface (AUI). It corresponds to the Platform-Independent Model (PIM) in MDA. It is an expression of the UI in terms of interaction spaces (or presentation units), independently of which interactors are available and even independently of the modality of interaction (graphical, vocal, haptic, etc.). An interaction space is a grouping unit that supports the execution of a set of logically connected tasks.
• Concrete User Interface (CUI). It corresponds to the Platform-Specific Model (PSM) in MDA. It is an expression of the UI in terms of concrete interactors, which depend on the type of platform and media available and have a number of attributes that define more concretely how they should be perceived by the user. Concrete interactors are, in fact, an abstraction of the actual UI components generally included in toolkits. The CUI model uses the information of the Platform and Environment models.
• Final User Interface (FUI). It is related to the code level in MDA and consists of source code in any programming language or mark-up language (e.g. Java, HTML , VoiceXML, X+V, ...). It can then be interpreted or compiled. A given piece of code will not always be rendered in the same way, since the result depends on the software environment (virtual machine, browser, etc.). For this reason, Cameleon considers two sublevels of the FUI: the source code and the running interface.

Figure . : Abstraction, reification and translation in the Cameleon framework (extracted from [ ]).

When using Cameleon in a development, three different paths can be followed, namely reification, abstraction and translation, which are depicted by downward, upward and bidirectional arrows in Figure . . Reification is the transformation of a description (or of a set of descriptions) into another one that is less abstract than the former. Abstraction is the transformation of a description into another one whose semantic content and scope are higher than the content and scope of the initial description (i.e., it is more abstract). In the context of reverse engineering, abstraction is the elicitation of descriptions that are more abstract than the artefacts that serve as input to this process. Finally, a translation shifts the interface from one type of platform to another, or more generally, from one context to another (e.g., a legacy UI migration). . .
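The four abstraction levels and the three kinds of paths can be summarised in a small Python sketch. The enumeration, the `classify` function and the context labels are hypothetical names introduced here for illustration; Cameleon itself does not prescribe any such API, and a step that changes both level and context is classified by its level change only, which is a simplifying assumption.

```python
from enum import IntEnum


class Level(IntEnum):
    """Cameleon abstraction levels, ordered from least to most abstract."""
    FINAL_UI = 0            # code level in MDA
    CONCRETE_UI = 1         # PSM
    ABSTRACT_UI = 2         # PIM
    TASKS_AND_CONCEPTS = 3  # CIM


# Correspondence with MDA as stated in the text.
MDA_EQUIVALENT = {
    Level.TASKS_AND_CONCEPTS: "CIM",
    Level.ABSTRACT_UI: "PIM",
    Level.CONCRETE_UI: "PSM",
    Level.FINAL_UI: "code",
}


def classify(src_level: Level, src_context: str,
             dst_level: Level, dst_context: str) -> str:
    """Name the kind of path followed by one transformation step."""
    if dst_level < src_level:
        return "reification"    # towards a less abstract description
    if dst_level > src_level:
        return "abstraction"    # towards a more abstract description
    if src_context != dst_context:
        return "translation"    # same level, different context of use
    return "none"               # nothing changes


# Hypothetical migration path: legacy code -> CUI -> CUI of a new context -> code.
steps = [
    classify(Level.FINAL_UI, "legacy", Level.CONCRETE_UI, "legacy"),
    classify(Level.CONCRETE_UI, "legacy", Level.CONCRETE_UI, "web"),
    classify(Level.CONCRETE_UI, "web", Level.FINAL_UI, "web"),
]
```

Read in order, the three steps describe a typical reengineering path: abstract the legacy interface, translate it to the new context, and reify it into code.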

User Interface Description Languages (UIDLs)

UIDLs are DSLs for defining user interfaces. In the following subsections we present three of the most widespread UIDLs: UsiXML [ ], Maria [ ] and XAML [ ]. Some other examples of UIDLs are: User Interface Markup Language (UIML) [ ], eXtensible Interface Markup Language (XIML) [ ], eXtensible Interaction Scenario Language (XISL) [ ] and XML User Interface Language (XUL) [ ]. . . .

UsiXML

User Interface eXtensible Markup Language (UsiXML) is a DSL used in Human-Computer Interaction (HCI) and Software Engineering (SE) to describe any user interface of any interactive application independently of any implementation technology. The language is able to represent user interfaces that vary depending on the context of use (in which the user is carrying out her interactive task), the device or the computing platform (on which the user is working), the language (used by the user), the organisation (to which the user belongs), the user profile or the interaction modalities (e.g., graphical, vocal, tactile or haptic). UsiXML has the following features, which are interesting in GUI migrations:

• Model-driven: it is defined according to the principles of MDE. Metamodels are expressed in MOF and OWL . Full [ ].
• Multi-level of abstraction: it is compliant with the four levels of abstraction of the Cameleon framework (as shown in Figure . ). It provides a metamodel for Abstract User Interfaces, and a metamodel for Concrete User Interfaces which can be used for different modalities (graphical, vocal, haptic, etc.). The Task level is based on ConcurTaskTrees (CTT) [ ] and the domain is expressed with UML class and object diagrams [ ].
• Complete lifecycle support: it provides means for conceptual modelling of tasks, domain, abstract user interface, concrete user interface, and contexts of use as defined in the Cameleon framework. In addition, it covers transformation, mapping, adaptation, and interactor modelling, so all the paths of reengineering (reverse engineering, restructuring, and forward engineering) can be tackled by means of the UsiXML metamodels and tools.

Figure . : UsiXML models conforming to Cameleon (extracted from [ ]).

Figure . shows an example of the four levels of the Cameleon framework in UsiXML. In the bottom part of the figure we see an HTML form with a text field and two buttons to perform a search. These controls are represented at the CUI level independently of the concrete technology (HTML). The AUI level abstracts the elements of the CUI model so that they are independent of the modality (GUI controlled by keyboard and mouse). Finally, the Task level captures the sequence of tasks needed to perform a search, that is, writing some keywords in the text field and then clicking on one of the two buttons.

There is a variety of tools supporting UsiXML for creating the models (IdealXML, KnowiXML), obtaining UsiXML models from code (ReversiXML) or other representations (SketchiXML, VisiXML, TransformiXML, etc.) or generating new systems (FormiXML, GrafiXML, FlashiXML, etc.).

. . .

M

MARIA, Model-based lAnguage foR Interactive Applications [ ], is a universal, declarative, multiple abstraction-level, XML-based language for modelling interactive applications in ubiquitous environments. The language inherits the modular approach of its predecessor, TERESA XML [ ], with one language for the abstract description and then a number of platform-dependent languages that refine the abstract one depending on the interaction resources considered. Some features of the language that are relevant for our purposes are:

• Model-driven: the language has been described by means of MOF metamodels.
• Multi-level of abstraction: MARIA conforms to the Cameleon framework and defines metamodels for the four abstraction levels: it defines an Abstract description metamodel and a few Concrete description metamodels for the desktop, mobile, vocal and multimodal platforms, and relies on CTT for the Task level.
• Events at abstract and concrete levels: an event model has been introduced at both the abstract and the concrete levels of abstraction. The event model allows specifying how the user interface responds to events triggered by the user.
• Extended Dialog Model: the dialog model contains constructs for specifying the dynamic behaviour of a presentation, that is, what events can be triggered at a given time. The dialog expressions are connected using CTT operators in order to define their temporal relationships.
• Continuous update of fields: it is possible to specify that a given field should be periodically updated by invoking an external function (i.e., it supports Ajax scripts). This can be defined at the abstract level and detailed at the concrete level.
• Dynamic Set of User Interface Elements: the language contains constructs for specifying partial presentation updates (dynamically changing the content of entire groupings) and the possibility to specify conditional navigation between presentations. This is useful for supporting Ajax techniques.

The Maria language is supported by the Maria tool.

. . .

XAML

Extensible Application Markup Language (XAML) is a markup language developed by Microsoft for declarative programming of user interfaces in the .NET framework. XAML is used extensively in .NET Framework . and .NET Framework . technologies, particularly in the Windows Presentation Foundation (WPF) [ ], Silverlight, Windows Workflow Foundation (WF), Windows Runtime XAML Framework and Windows Store apps. In WPF, XAML forms a user interface markup language to define UI elements, data binding, events, and other features. In WF contexts, XAML is used to describe potentially long-running declarative logic, such as that created by process modelling tools and rules systems.

The scope of this language is more ambitious than that of most user interface markup languages, since program logic and styles are also embedded in the XAML document. Functionally, it can be seen as a combination of XUL, SVG, CSS, and JavaScript into a single XML schema. XAML directly represents the instantiation of objects in a specific set of backing types defined in assemblies (.NET libraries). This is unlike most other markup languages, which are typically interpreted without such a direct tie to a backing type system. XAML is supported by Microsoft environments such as Visual Studio and can be used to generate desktop applications, Silverlight applications, Windows Phone apps and Windows Store apps, among others. . .

D

We have presented the different approaches we analysed for representing GUIs. However, we found several disadvantages that eventually led us to discard them and define our own metamodels. Next we enumerate the reasons for this decision.

KDM is a complex metamodel which can be used to model an entire software system. The Code and Action packages, which are the most extensive of KDM, can be used to represent programming code in a generic fashion. However, given that KDM intends to be language-independent, some of its packages are too generic to be useful as-is, and need to be extended in some way. For instance, we saw that the UI package does not offer a widget or layout classification, so if distinguishing the different widget types or layouts is needed, extending KDM is required. KDM itself offers an extension mechanism, but as mentioned in [ ] it is poor in practice, and using it means losing the interoperability among tools, which is one of the presumed benefits of KDM. Moreover, using a metamodel as large as KDM involves a lot of unnecessary complexity which most of the time does not pay off (e.g., model transformations become far more complex than with a simple metamodel). In addition, representing event handlers is awkward in KDM, because it is possible to define which events are triggered by each widget, but we cannot specify the code that is executed in each case.

With respect to IFML, it allows expressing the events and the effect they produce in the GUI, and it can be used to model the behaviour of the GUI in a technology-independent fashion. Since it appeared only recently, it was not considered in our solution.

Regarding the UIDLs, UsiXML and Maria are technology-independent languages which have been designed to cope with multi-modal UIs in ubiquitous environments. A strength of UsiXML is that it has several graphical DSLs supporting the creation of the different models, and an interesting feature of Maria is that it includes elements to deal with Ajax applications. Both offer a wide widget hierarchy, but the layout representation is somewhat limited. For instance, UsiXML . just offers a generic TableLayout (similar to HTML tables) to represent layout. For this reason, these UIDLs are not suitable to be used in the reverse engineering stage as intermediate representations to manipulate the data. Nevertheless, both UsiXML and MariaXML could be used to represent a generic GUI at the CUI level (i.e., the technology-independent level, which is the abstraction level at which our reverse engineering proposal is placed).

XAML, in turn, is a UIDL devised to work with Windows frameworks, and it includes information that is dependent on those frameworks. Moreover, it has a complex specification because it mixes different kinds of information, and in fact some people are critical of this design, as many standards (JavaScript, CSS, etc.) already exist for doing these things. For all these reasons, XAML is not suitable for representing generic GUIs.

Fall in love with the problem, don’t fall in love with the solution. Paul Graham (Suggested by Jérémie Melchior)

4

Overview

So far we have set the context of this thesis. In the introductory chapter we motivated and stated the problem that is tackled. There we introduced two main issues to be addressed when migrating RAD applications to modern platforms: coordinate-based layouts and tangling of concerns. These two issues were explained in more detail in Section . . , where we discussed the main features of legacy RAD applications, which are summarised in Table . . Now we will clearly define the requirements that we expect of our solution, and we will outline its generic architecture. This chapter serves as a brief guide to the entire work and summarises the solution that we will describe in detail in the following chapters.

.

G

Our main goal is to develop a migration framework for GUIs of legacy systems built with RAD environments, in order to migrate them to modern platforms and/or different GUI frameworks in such a way that the implementation of the new system follows common best practices. As stated in Section . , to reach this objective we have identified three high-level goals: to define an architecture (goal G ) for the framework to be developed, which should separate and make explicit the different aspects involved in the GUI of RAD applications, and to deal with the layout and the event handlers (goals G and G ). This implies addressing the two aforementioned issues: coordinate-based layouts and tangling of concerns.

We propose to apply static analysis to the GUI-related artefacts of the source legacy systems in order to extract relevant information for implementing the migrations. In particular, we intend to analyse the view definitions in order to extract a layout definition, and to analyse the code of event handlers to obtain an abstract representation that lets us achieve separation of concerns and obtain other useful information such as the navigation flow. Naturally, a prerequisite for applying our solution is that the source code is available.

From the examination of GUIs of RAD applications expounded in Section . . and the study of the state of the art presented in Chapter we have elicited a set of requirements for our solution, which can be organised in three groups: general requirements, requirements specific to the layout inference, and requirements of the analysis of event handlers. Our solution is driven by the following general requirements:

(R ) Explicit GUI information. A high-level representation of the GUI must be discovered, i.e., metadata concerning the GUI. The metadata should be of interest for migrating the GUI to different platforms and GUI frameworks. It must be possible to analyse and automatically transform this metadata.
(R ) Modularity. Owing to the wide semantic gap between RAD environments and current technologies, it would be desirable to split the reengineering process into simpler stages to make it maintainable. In addition, a solution split into decoupled stages would facilitate extension (for instance, adding new processing stages) and reuse in different projects.
(R ) Automation. We are interested in automating the process as much as possible so that it can be easily applied to a large number of applications with minimum effort. Ideally, the process would end in a generation task that produces artefacts that can be seamlessly integrated in the new system.
(R ) Source and target independence. The reengineering process should be easy to reuse with different source technologies (source independence). Furthermore, it must be extensible, so that new target platforms can be added without changing the reverse engineering and restructuring stages (target independence).

With respect to the coordinate-based layout issue, we expect to fulfil the following requirements:

(R ) Matching between the visual and logical structure. The logical structure of the views (the GUI tree, i.e., the nesting of the widgets) must mirror the visual structure that users perceive when they see those views.
(R ) High-level layout representation. The layout of the views must be represented in terms of high-level structures that ensure a proper visualisation in different screen sizes and resolutions, such as the well-known layout managers that are used at present in a myriad of GUI frameworks.
(R ) Misalignment tolerance. The proposed solution must take into account that some graphical editors (e.g., RAD editors) do not include alignment guidelines, and therefore some minor misalignments can occur if developers are not careful. Hence, the solution must allow a certain degree of imprecision when recognising widget location.
(R ) Alternative solutions. It would be useful for the layout inference solution to output different ranked alternatives in order to know which options are better according to some criteria. Then, if the solution marked as 'best' does not produce the desired results, developers could inspect the other options and choose a different alternative.
(R ) Configurable layout set. Developers should be able to choose which layout managers can be used for laying out views. This can be useful if some layout types are not available in the target toolkit or if using certain types can result in an awkward or unexpected design.

Regarding the issue of tangling of concerns, we intend to endow our solution with the following features:

(R ) Code abstraction. Code should be abstracted so that it is possible to understand what it does. This abstraction consists of moving from the code representation ('how to do it') to the intention of the code ('what it does'). For example, opening a database cursor is a recurrent pattern in PL/SQL, which typically requires several instructions. From the reverse engineering and restructuring point of view it is useful to know that these statements just perform a database access. Raising the abstraction level in this way facilitates later processing.
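The idea of replacing a recurrent group of statements with a single primitive can be sketched as follows. This is a deliberately naive illustration and not the technique developed in this thesis: statements are reduced to hypothetical `(kind, argument)` pairs, the cursor idiom is assumed to be exactly OPEN, FETCH, CLOSE on the same cursor, and the primitive name `READ_FROM_DB` is invented for the example.

```python
from typing import List, Tuple

Statement = Tuple[str, str]  # (statement kind, argument), e.g. ("OPEN", "c_emp")

# Assumed shape of the cursor idiom: three consecutive statements on one cursor.
CURSOR_PATTERN = ("OPEN", "FETCH", "CLOSE")


def abstract(statements: List[Statement]) -> List[Statement]:
    """Replace each occurrence of the cursor idiom with one primitive."""
    result: List[Statement] = []
    size = len(CURSOR_PATTERN)
    i = 0
    while i < len(statements):
        window = statements[i:i + size]
        kinds = tuple(kind for kind, _ in window)
        same_cursor = len({arg for _, arg in window}) == 1
        if kinds == CURSOR_PATTERN and same_cursor:
            # 'How to do it' (three statements) becomes 'what it does'.
            result.append(("READ_FROM_DB", window[0][1]))
            i += size
        else:
            result.append(statements[i])
            i += 1
    return result


# Hypothetical event handler reduced to statement kinds.
handler = [
    ("OPEN", "c_emp"),
    ("FETCH", "c_emp"),
    ("CLOSE", "c_emp"),
    ("SET_FIELD", "name"),
]
abstracted = abstract(handler)
```

After this step, later stages only need to reason about one database-access primitive instead of three cursor statements.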

(R ) Code categorisation. Related to the previous requirement, the solution must provide automated categorisation of pieces of code, so that separation of concerns in the source system can be enabled. In this way, it should be possible to differentiate between statements related to the GUI, the control or the business logic in order to structure the new system in multiple tiers (n-tier architecture). In [ ] this is regarded as an important activity to disentangle spaghetti code. Furthermore, sub-concerns of the GUI such as validation rules should also be detectable.
(R ) Explicit interaction and navigation flows. The solution must be able to explicitly represent the interactions that exist between widgets and the transitions that may occur between views. For example, this is useful for migrating to many modern frameworks that provide a means to declaratively express the navigation flow (e.g., Java Server Faces).

These requirements cover the ones that we extracted in the discussion of the state of the art for layout recognition approaches and behaviour extraction approaches, as can be seen in Table . .

Source/target independence                                       R
Provide explicit information                                     R
Layout model with a variety of layout managers                   R , R
Use of implementation paradigm with architectural benefits       R , R , R
Separate tangled concerns                                        R
Represent transitions and dependencies with State Machine        R

Table . : Relationships between the requirements and the discussion of the state of the art.

In Table . we also show how these requirements try to address some of the bad practices that are present in the applications created with RAD environments.

Implicit layout                R , R
Overlapping                    R
Widget-database links          R

Table . : Requirements that cover bad practices in RAD environments.
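To illustrate the requirement on explicit navigation flows and the state-machine representation mentioned in the first table, the transitions between views can be written as data rather than left implicit in event handlers. The sketch below is a minimal illustration under stated assumptions: the class, the view names and the event names are hypothetical and do not correspond to any model defined in this thesis.

```python
from typing import Dict, Tuple


class NavigationFlow:
    """Finite state machine whose states are views and whose transitions
    are labelled with the events that cause them."""

    def __init__(self, initial: str,
                 transitions: Dict[Tuple[str, str], str]) -> None:
        self.current = initial
        self.transitions = dict(transitions)

    def fire(self, event: str) -> str:
        """Move to the target view if the event is defined for the current
        view; otherwise stay on the current view."""
        self.current = self.transitions.get((self.current, event), self.current)
        return self.current


# Hypothetical application with two views.
nav = NavigationFlow("Login", {
    ("Login", "okButton.click"): "Main",
    ("Main", "exitMenu.select"): "Login",
})

visited = [nav.fire("okButton.click"),
           nav.fire("unknown.event"),
           nav.fire("exitMenu.select")]
```

Because the transition table is plain data, it could in principle be turned into the declarative navigation rules that some target frameworks expect.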

.

A

In this section we present the architecture of the framework we have devised for migrating legacy GUIs, which deals with the aforementioned requirements and which is called GUIZMO (GUI to MOdels). The MDE paradigm presented in Section . provides mechanisms (mainly metamodels and transformations) and benefits which fit the architectural requirements R , R , R and R . Therefore, we decided to implement our solution with an MDE-based architecture. This section is organised as follows. Firstly, we present the Concrete User Interface (CUI) model that we use to represent GUIs. Secondly, we show how the CUI model is integrated in the context of the migration of legacy GUIs by means of the MDE-based architecture. Finally, we indicate how the requirements we have listed are fulfilled by GUIZMO. . .

T

C

U

I

According to the Cameleon reference framework introduced in Section . . , a CUI model is a technology-independent representation of a GUI that can be seen as an abstraction of the Final User Interface (FUI). Actually, we have not devised a single CUI model, but a series of models arranged in a star. As can be seen in Figure . , the CUI model has a base model representing the structure of a GUI and different models connected to it that represent aspects of that GUI that are interesting to cope with in a migration. In this thesis we have dealt with three aspects: layout, event concerns and interactions. Validation and style are other aspects that should be considered in a good-quality migration, but they have not been addressed in this thesis. Next we outline all of these models.

Figure . : Concrete User Interface models in our solution

The Structure model is the pivot of the CUI model. It describes the logical structure of views, that is, the hierarchy of widgets that compose the views. This hierarchy must be aligned with what the user sees on the screen. Moreover, it must include support for internationalisation (i18n) and has to be backed up by a Resource model (omitted in the figures) that contains the paths to the actual resources (images, icons, language files, etc.). The rest of the models reference the widgets defined in this one.

The spatial arrangement of the GUI is represented by the Layout model. The layout is made up of a composition of high-level layout components, such as layout managers (e.g., Flow layout or Grid layout). The composition should ensure that the view will be displayed properly under different screen sizes and resolutions, and when the views are resized.

The Style model defines the look and feel of the views, that is, background and foreground colours, font types and sizes, border types and so forth. With this model, styles (groups of visual properties) would be defined, and inherited from other styles in order to promote reuse (somewhat similar to the CSS philosophy).

The code of the event handlers is represented in the EventConcerns model in a language-independent fashion. This model presents an abstraction of the code where groups of statements of the original code that match some pattern are replaced by application primitives that express the semantics of the code. Moreover, the code fragments are tagged with the concern (view, controller, business logic) they are related to, and they are also structured in a control-flow graph.

The information of the Interaction model is twofold: it specifies the dependencies among views, and also the dependencies among the widgets contained in these views. It represents the navigation flows of the application by means of a Finite State Machine in which the states are the different views of the application and the transitions are the events that cause the change from one view to another. For each view, the dependencies among widgets are represented through a dependency graph, in which dependencies are expressed with an event-condition-action schema (similar to transitions between views). For example, selecting a specific checkbox triggers an event that enables
