Thursday 26 April 2007

MI world on who is responsible

There is a confution in MI world on who is responsible for what. The following article was published on a supplement to the December issue of DM Review.

As organizations move their data warehouse toward production and user acceptance, they must begin planning for the long-term support of this capability.1 Like any other system, it will require a number of disciplines to manage and maintain it and to improve performance throughout its life cycle. This article provides a heuristic model for helping organizations plan for the operations and maintenance (O&M) work that will ensue. The estimated time commitments are based on industry practices and observations of O&M staff in a number of post- production data warehouse environments.While the primary purpose of this article is to define the roles, responsibilities and estimated time commitments for O&M, the last section describes ideas and methods for enhancing the warehouse's capabilities in subsequent design/build iterations. It is important to note, however, that the time commitments described apply only to O&M and have not taken into account any additional work effort associated with new requirements that would drive the need for subsequent design/build iterations.

Roles and Responsibilities

As the data warehouse moves into production, a number of roles must be established and responsibilities assigned to each role. From a macro point of view, three distinct groups will participate in operating and maintaining the warehouse and its components: warehouse support staff, additional support staff and business staff.

This article makes no assumptions about the warehouse maintenance staff and to which organizational entity they report. Similarly, this section describes the roles that must be performed but makes no distinction about how many named individuals will be assigned to fulfill these roles and responsibilities or whether the staff will be assigned on a full-time or part-time basis.

Figure 1 shows the three macro-level support groups, roles assigned to each and some of the more substantive relationships among the roles. Note that some of the roles apply only to new development rather than to O&M. These roles are described in the New Development subsection. Note that the warehouse maintenance staff may play roles in both O&M and new development, or separate teams for each may be used.


Figure 1: Macro-Level Support Groups

Warehouse Support Staff

The roles of the warehouse support staff include:

  • Data Warehouse Administrator
  • OLAP/Reporting Tool Administrator/Developer
  • Data Modeler
  • ETL Specialist
  • Development DBA

Data Warehouse Administrator

The data warehouse administrator is responsible for day-to-day oversight of the production warehouse. This individual serves as a conductor-cum-project manager and the rest of the warehouse support staff reports to him.

Responsibilities:

  • Day-to-day oversight of the operational warehouse and warehouse support staff
  • Single point of contact for warehouse-related issues and problems
  • Assignment of tasks to the warehouse support staff
  • Creating operational procedures and guidelines including problem escalation procedures
  • Liaison to IT
  • Reporting status and issues to data warehouse champion on an as- needed basis
  • Assuring yearly maintenance payments are made to the product vendors
  • Configuration management of the production environment
  • Coordinating and scheduling product upgrades
  • Defining the service level agreement (SLA) for the warehouse
  • Resolving conflicts (i.e., troubleshooting disagreements) within the support team and between the support team and ITS
  • Communicating status, issues and problems in a real-time manner to business users of the warehouse

Skills:

  • Knowledge and understanding of business intelligence/data warehouse concepts and principles
  • Project management
  • General business knowledge of enterprise
  • General knowledge and understanding of system components such as RDBMS, operating system, hardware, ETL tools and processes, data modeling, etc.

OLAP/Reporting Tool Administrator/Developer

Responsibilities:

  • Liaison between the enterprise and the OLAP/reporting tool vendor for all technical issues
  • Supporting power users in problem resolution
  • Day-to-day tool administration
  • Upgrading products and applying patches as necessary
  • Maintaining and modifying reports/queries on an as-needed basis
  • Coordinating schema changes with the data modeler
  • Troubleshooting operational issues
  • Managing and supporting the security implementation
  • Developing and maintaining technical documentation about the implementation
  • Scheduling batch jobs with the IT operations staff
  • Testing changes/modifications
  • Troubleshooting database connectivity issues
  • Optimizing performance
  • Providing assistance to other technical areas as required

Skills:

  • Knowledge and understanding of business intelligence/data warehouse concepts and principles
  • High proficiency in understanding relational and dimensional data models
  • Proficiency in structured query language (SQL) including the use of SQL editors, stored procedures, database triggers and optimizing SQL statements
  • Familiarity with a data modeling tool
  • Understanding of open database connectivity (ODBC) concepts and ability to configure and troubleshoot client- to-server connectivity issues
  • Experience with and working knowledge of client and server operating systems
  • General understanding of the RDBMS
  • Excellent written and verbal communication skills
  • Strong analytical and problem solving abilities
  • Working knowledge of business functions, terms and processes

Data Modeler

Responsibilities:

  • Maintaining and modifying data warehouse, staging area and data mart logical and physical schemas
  • Maintaining and controlling schema definitions and meta data
  • Establishing and maintaining meta data standards
  • Coordinating with the business staff and the warehouse support staff in matters regarding meta data capture and meanings, schema modifications and enhancements
  • Developing and maintaining technical documentation about the data
  • Coordinating with the ETL specialist on data mappings and conversion rules
  • Coordinating with the development DBA for instantiating logical schemas into physical schemas
  • Identifying opportunities to optimize warehouse performance
  • Data model version control

Skills:

  • Knowledge and understanding of business intelligence/data warehouse concepts and principles
  • In-depth knowledge of relational and dimensional data models
  • Proficiency in data modeling tools
  • General understanding of the RDBMS
  • Excellent written and verbal communication skills
  • Strong analytical and problem solving abilities
  • Working knowledge of business functions, terms and processes

Extract, Transform and Load (ETL) Specialist

Responsibilities:

  • Managing ETL sessions and load order
  • Managing source input availability
  • Scheduling sessions
  • Automating ETL processes at the operating system level
  • Modifying and validating mappings
  • Revalidating sessions
  • Maintaining, backing up and restoring the ETL repository
  • Performance tuning of sessions with load-order logic
  • Reviewing session exception reports and identifying ways to resolve them
  • Developing and maintaining technical documentation about the ETL product implementation
  • Coordinating with the business staff to correct rejected records/fields whether it be through the source systems or through enhanced/modified conversion rules
  • Communicating status of ETL processes to the data warehouse administrator and/or the business users when the warehouse/marts have been refreshed and/or when the ETL processes have failed or records have been rejected
  • Modification and instantiation of business rules into "code"
  • Testing all ETL processes after changes have been made

Skills:

  • In- depth knowledge of and experience with ETL tool (or programming language) as both a developer and administrator
  • Liaison between the enterprise and the ETL tool vendor for all technical issues
  • Upgrading ETL software and applying patches as necessary
  • Knowledge and understanding of business intelligence/data warehouse concepts and principles
  • High proficiency in understanding relational and dimensional data models
  • Proficiency in SQL including the use of SQL editors, stored procedures, database triggers and optimizing SQL statements
  • Familiarity with the data modeling tool
  • Experience with the RDBMS
  • Working knowledge of and experience with the client and server operating systems
  • Knowledge of and experience with XML
  • Excellent written and verbal communication skills
  • Strong analytical and problem solving abilities
  • Working knowledge of business functions, terms and processes

Development DBA

Responsibilities:

  • Instantiating logical and physical schemas into physical tables
  • Supporting the data modeler in defining and modifying logical and physical schemas
  • Identifying ways to optimize database performance and coordinating with the production DBA to make sure changes are made
  • Troubleshooting database issues
  • Supporting the ETL specialist in loading the data into the warehouse/marts
  • Capacity planning and analysis and communicating anticipated disk capacity upgrades to IT staff
  • Developing and maintaining technical documentation about the database implementation

Skills:

  • In-depth knowledge and experience with the RDBMS
  • Knowledge and understanding of business intelligence/data warehouse concepts and principles
  • High proficiency in understanding relational and dimensional data models
  • Proficiency in SQL including the use of SQL editors, stored procedures, database triggers and optimizing SQL statements
  • Familiarity with the data modeling tool
  • Working knowledge of and experience with the client and server operating systems
  • Excellent written and verbal communication skills
  • Strong analytical and problem solving abilities
  • Working knowledge of business functions, terms and processes

IT Support Staff

Because information technology (IT) organizations are generally well established within the organizations, this section will only describe their responsibilities as they relate to operating and maintaining the warehouse environment. This section will not address skills. It is noted that the roles, as defined here, may not exactly match the roles and job titles that already exist within IT organizations.

The roles of IT include the following:

  • Operating system administrator
  • Operations support (including help desk)
  • Production DBA

Operating System Administrator

Responsibilities:

  • Operating system security including the addition and deletion of users and the granting of privileges
  • Supporting the installation and upgrades of vendor tools such as OLAP/reporting and ETL
  • Troubleshooting operational problems with the warehouse support team
  • Communicating and coordinating operating system upgrades and patches with the warehouse support team

Operations Support (Including Help Desk)

Responsibilities:

  • Serving as the single point of contact (help desk) for all business users' questions and escalating issues to the appropriate technical staff member. The help desk is not expected to answer user questions about any of the warehouse tools (e.g., OLAP/reporting meta data, data quality, etc.), but rather to contact the appropriate individual(s) who can answer the question or resolve the issue.
  • Working with the ETL specialist and OLAP/reporting administrator/developer to schedule jobs optimally
  • Monitoring scheduled jobs/tasks and notifying the appropriate warehouse administration team member when the tasks do not run to completion
  • Assuring sufficient server, disk and network capacity exists for the warehouse
  • Backing up and restoring the databases, repositories, reports, etc. that are stored on servers
  • Securing additional server capacity as needs are identified

Production DBA

Responsibilities:

  • Working with the development DBA to identify ways to increase and optimize performance
  • Monitoring query usage and recommending ways to modify tables for increased performance (e.g., adding indices, joining tables, adding foreign keys, etc.)
  • Granting privileges to all consumers of the databases
  • Managing disk space usage and allocation
  • Securing additional disk capacity as needs are identified
  • Working with the data warehouse administration team to troubleshoot and correct errors
  • Communicating and coordinating RDBMS upgrades and patches with the warehouse support team

Business Staff

Responsibilities:

  • Working with the ETL specialist to identify the appropriate way to correct records/fields that were rejected during the ETL processes whether they are due to a business rule deficiency or a data quality error. These corrections may take the form of amending the business rules used for ETL purposes or re- entering/amending the data within the source system(s).
  • Upon completion of each warehouse/data mart refresh cycle, certifying that the data reconciles with the source system(s) of record

OLAP/Reporting Tool User

It is expected that the users of OLAP/reporting tools will create and modify reports and drill paths as a part of their normal duties and responsibilities. Therefore, these functions are not considered within the context of this document. However, they do have certain responsibilities as they relate to O&M.

Responsibilities:

  • Communicating bugs/problems/issues to either the power user or help desk
  • Seeking help from the designated power user prior to contacting the help desk
  • Attending training on the usage of the OLAP/reporting tool

Power User

In general, power users are identified within a given business area and are the "go-to" people for general questions and/or the creation of more complex and sophisticated queries and reports. These users tend to have an even mixture of business and technical knowledge. Regardless of how many power users are identified and trained, each end user should be assigned a power user as their initial "go-to" person for questions and support.

Responsibilities:

  • Serving as an expert on the usage of the OLAP/reporting tool from a report and query perspective
  • Supporting general business users and answering their questions
  • Designing and creating the more complex and sophisticated set of reports for their business area
  • Escalating problems/issues that they cannot resolve to the OLAP/reporting tool developer/administrator

Resource Estimates

Figure 2 provides estimates for the amount of time it will take to perform each role described in the preceding sections. These estimates are stated as the percentage of a person's workday or workweek that will be involved with supporting the warehouse. These percentages represent good estimates for planning purposes. It is also important to note that an individual can simultaneously perform in more than one role, depending on individual strengths, experience and deep-rooted technical and/or business abilities.






Figure 2: Estimated Time Commitments in an O&M Environment

The percentages have be

en stated for the first three months the warehouse is in production, the next three months and then seven months and beyond. Because these estimates represent only work associated with warehouse O&M, it is anticipated that the amount of time needed will decrease over time. This decrease is attributable to the following factors: the number of bugs and performance issues will decrease over time; ETL processes will be enhanced to take into account additional business or data anomalies; support functions will be automated and proceduralized; the individuals responsible for supporting the warehouse will gain experience and become more efficient.

Other Considerations

While it is generally understood that the roles and responsibilities may fall on a few key individuals, each with one or two deep- rooted skills, this will not be enough to assure the continued success of the warehouse in a production environment. It will be necessary to make sure that each skill exists within more than a single person so that continuity exists when team members take vacations, are ill or leave the organization entirely. Ways to accomplish this include:

  • Cross-training of team members
  • Assigning roles and responsibilities to more than one individual so that they are shared equally
  • Assuring that the technical environment and the processes and procedures are well documented and easily available to those who need them
  • Carefully managing leaves of absence (vacations, business trips, training, etc.) so that critical skills are always present within the organization on any given business day
  • Outsourcing the warehouse administration functions

New Development

This section describes the roles and responsibilities that should be established or maintained in order to move the usage and value of the warehouse beyond that delivered in any single iteration.

Data Warehouse Champion

Responsibilities:

  • Serving as the visionary for identifying ways in which the warehouse can be enhanced to provide added value to both the organization and the organization's customers
  • Continually polling the business areas for ideas and needs that can be met with the warehouse
  • Creating and maintaining the data warehouse steering committee charter
  • Scheduling and administering the warehouse steering committee meetings as well as creating the agenda and publishing the minutes
  • Creating actionable items resulting from the steering committee meetings
  • Gaining consensus for warehouse-related priorities
  • Estimating budgetary needs for fulfilling the needs of the steering committee
  • Presenting the appropriate documents in order to gain budget approval
  • Securing the services of the organization's staff and/or consultants to implement each new design/build iteration
  • Coordinating with IT to assure that appropriate budget and infrastructure exists for each iteration of the warehouse

Data Warehouse Steering Committee

The steering committee should be composed of individuals from each business area seeking to benefit from the warehouse. These individuals should be high enough within the organizational hierarchy so that they may speak authoritatively for their business area and are authorized to commit their budget to warehouse- related activities. The steering committee should meet on a regularly scheduled basis, whether it is monthly, bimonthly or quarterly.

Responsibilities (of each member):

  • Actively participating in steering committee meetings
  • Speaking (or being empowered to speak) authoritatively for their own business areas
  • Identifying priorities with the organization as a whole and working with the data warehouse champion to create realistic and valuable iteration packages
  • Including warehouse funding in their yearly budgeting process
  • Assigning individuals within their area to work with the development team so that all business requirements can be fully articulated and tested

In the excitement around bringing a BI/DW solution to fruition and looking for the next mountain to climb, it's easy to forget that a new system is now in production - one that requires just as much attention as other mission-critical systems. As organizations plan for BI/DW implementations, the post-production ramifications must be factored into the planning and cost equations. In doing so, a smooth transition from development to production can be expected and organizations will be better able to absorb the additional workload associated with the warehouse.

Reference:
1. The terms "data warehouse" and "warehouse" are used generically to describe the total set of products, processes and functions that constitute the business intelligence solution which is commonly referred to as a data warehouse. These terms include the data warehouse database itself, the data marts, the staging area, the reports and online analytical processing (OLAP) capabilities, extract/transform/load (ETL), data modeling and meta data, as well as all the supporting hardware, software and network devices.


Performance and space usage in Data Warehousing

I have compiled the following sources to help me and other users of the blog how to use different tools and techniques to improve performance and space usage in data warehouse.
The following links are just few of the many that can help in accomplishing most of the issues on performance and space usage in data warhousing

  • Power point presentation on practical space management in data warehouse
  • PRDBMS Performance Tuning Guide for Data Warehousing
  • 6 parts practical guide to data warehouseing in Oracle
  • Tips for Optimizing SQL Server OLAP/Analysis Services
  • Value-Driven Data Warehousing

    Jay Foulkrod a new columist on DWReview has written an article on value-driven data warehousing. He tried to address where the DW/BI (MI world) industry is heading. To read more Value-Driven Data Warehousing Interpreting Industry Currents - Promise or Problems?

    Tuesday 17 April 2007

    Building a Hybrid Data Warehouse Model

    James Madison has published an article on building a hybrid data model. He has highlighted why it is sometimes good to build using hybrid model. More detail Building a Hybrid Data Warehouse Model

    Monday 19 March 2007

    RALPH KIMBALL in London

    RALPH KIMBALL the author of Data Warehouse Lifecycle Toolkit is coming to London to give a four day course on dimenstional modelling in depth. More details can be found at

    www.Q4K.com

    Tesh

    Thursday 15 February 2007

    Misconception of data warehouse?

    I used to think data warehouse to be a VLDB. I then started asking some DBA's/Developers. To my surprise, 4 out of 5 people whom I expected them to know what a data warehouse told me what I thought it was at the first place. So, what is data warehouse? How is it implemented ? Why use data warehouse ? The above are some of the questions that I would like to discuss. If you have further comments or are board of reading my posts please let me know and I will promise will stop ?

    Friday 8 December 2006

    Kimball Group

    Find all interesting webcasts on method of analysis, design and implement data warehouse and business intelligence. It is an expert forcast and if you spend a few minutes you will definately gain some knowledge. You can some of the groups forcasts on microsoft web site by clicking this link. Click here Kimball Group Webcasts