Duration: 2 Days
Course Overview
Business intelligence code, and the data warehouse it runs against, are special cases of application development when it comes to testing. The main testing task is to validate that the warehouse was built from the source data with its integrity intact, so that queries against the warehouse can be trusted.
The other aspect is testing the data science code itself: the code that runs business intelligence queries against the data source.
In this specialist module, we investigate both areas of quality assurance and describe strategies for testing them efficiently, with a clear focus on constructing test data sets and on testing ETL and BI code.
The module is 50% taught, with the remaining 50% based on small-group design discussions mixed with practical hands-on exercises using the tools typical for testing in these environments.
Course Content
Introduction
• Strategies for testing the data warehouse
• Business intelligence applications
• Trusting the data
Schema testing and validation
• What is schema testing?
• Star schemas, Facts and Dimensions
• Validating the schema
Data testing
• Using data sets to validate warehouse construction
• Automating the warehouse queries
• Integrating warehouse tests with unit test runners
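As a rough illustration of the data testing topics above, the sketch below shows warehouse checks written as plain test functions that a unit test runner such as pytest could collect. It is a minimal sketch under stated assumptions, not course material: the table names (source_orders, dim_customer, fact_orders), the sample rows, and the helper functions are all hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3


def build_example_warehouse():
    """Create a hypothetical source table and a tiny star schema in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE source_orders (order_id INTEGER, customer TEXT, amount REAL);
        INSERT INTO source_orders VALUES
            (1, 'acme', 10.0), (2, 'acme', 5.5), (3, 'globex', 7.25);

        -- Dimension: one row per distinct customer, with a surrogate key.
        CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer TEXT);
        INSERT INTO dim_customer (customer)
            SELECT DISTINCT customer FROM source_orders;

        -- Fact: one row per source order, pointing at the dimension.
        CREATE TABLE fact_orders (order_id INTEGER, customer_key INTEGER, amount REAL);
        INSERT INTO fact_orders
            SELECT s.order_id, d.customer_key, s.amount
            FROM source_orders s
            JOIN dim_customer d ON d.customer = s.customer;
        """
    )
    return conn


def scalar(conn, sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


def test_row_counts_match():
    # Every source row should have been loaded exactly once.
    conn = build_example_warehouse()
    source = scalar(conn, "SELECT COUNT(*) FROM source_orders")
    loaded = scalar(conn, "SELECT COUNT(*) FROM fact_orders")
    assert source == loaded


def test_totals_match():
    # A measure summed in the warehouse should reconcile with the source.
    conn = build_example_warehouse()
    source = scalar(conn, "SELECT SUM(amount) FROM source_orders")
    loaded = scalar(conn, "SELECT SUM(amount) FROM fact_orders")
    assert abs(source - loaded) < 1e-9


def test_no_orphaned_fact_rows():
    # Every fact row should reference an existing dimension row.
    conn = build_example_warehouse()
    orphans = scalar(
        conn,
        """
        SELECT COUNT(*) FROM fact_orders f
        LEFT JOIN dim_customer d ON d.customer_key = f.customer_key
        WHERE d.customer_key IS NULL
        """,
    )
    assert orphans == 0
```

Because the functions follow the test_ naming convention, pytest would discover them if the file were saved under a name such as test_warehouse.py (a hypothetical file name). Against a real warehouse, build_example_warehouse would be replaced by a connection to the system under test.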
Extract, Transform and Load test planning
• Creating test plans and estimating completion times
• Designing test cases and selecting test data sets
• Test automation and execution
ETL testing
• Identifying data sources
• Data acquisition testing
• Dimensional design validation
• Testing the data population process
Business Intelligence and Data Analysis
• What is BI and Data Science?
• Case study: R
• Comparing R and Python
• Performing data science operations with R
• Testing Python and R with Pytest
• Validating BI operations automatically