Mathematics: Special Issue "Big Data: Multidimensional Design and Modeling"
Sergio Luján-Mora, Juan Trujillo. Journal of Computer Information Systems (JCIS), XLVI(5), p. 30-58, Special issue 2006. ISSN: 0887-4417. https://doi.org/10.1080/08874417.2006.11645923
The design, development and deployment of a data warehouse (DW) is a complex, time consuming and prone to fail task. This is mainly due to the different aspects taking part in a DW architecture such as data sources, processes responsible for Extracting, Transforming and Loading (ETL) data into the DW, the modeling of the DW itself, specifying data marts from the data warehouse or designing end user tools. In the last years, different models, methods and techniques have been proposed to provide partial solutions to cover the different aspects of a data warehouse. Nevertheless, none of these proposals addresses the whole development process of a data warehouse in an integrated and coherent manner providing the same notation for the modeling of the different parts of a DW. In this paper, we propose a data warehouse development method, based on the Unified Modeling Language (UML) and the Unified Process (UP), which addresses the design and development of both the data warehouse back-stage and front-end. We use the extension mechanisms (stereotypes, tagged values and constraints) provided by the UML and we properly extend it in order to accurately model the different parts of a data warehouse (such as the modeling of the data sources, ETL processes or the modeling of the DWitself) by using the same notation . To the best of our knowledge, our proposal provides a seamless method for developing data warehouses. Finally, we apply our approach to a case study to show its benefit.
Keywords: Data warehouse, design, UML, Unified Process