Jul 14, 2014

How to handle Year Wise Cubes in a Multiprovider


Applies to: Up to BI 7.0

Introduction

                 After the BW implementation is over in any client, the data from the Source systems will keep on extracting to BW system on a daily basis. The size of the data volume increases day by day incrementally. This fills up(accumulates) all our Infocubes what we have developed in our Data Mart Layer over a period of time. Then the question of maintenance will come into the picture. That's exactly, I am going to demonstrate about the steps and Strategy to handle it.

Disadvantages with the current Infocubes

  • Large data volumes in Cubes, which leads to higher administration efforts
  • Dropping and Creating indices takes longer times which can delay process chains
  • Aggregates Roll-up will be longer
  • Attribute Change run takes substantial time during Master data loads
  • Reporting performance will be a big question
  • Recreating Aggregates may require every now & then to meet reporting performance.
  • Initial Fill up will take substantial times to fill up new aggregates
  • Adhoc Querying during Year Ends will be tough, while fetching data from bigger Cubes/Aggregates

                 I am sharing my current Data Mart Layer which is under a Multiprovider. The Go-Live had happened in 2008.This tells you that the data is available from 2008 to till date in our Infocubes. Our implementation partner had created Year wise Cubes like below.

Yearwise MP.JPG

                This above Layer has been serving good so far. Because our data volume is able to adjust with the available cubes. Now the data volume has substantially hiked since 2012 on wards. The data volume is roughly around 13 Crores records in (2012 and 2013) Cube. The previous years data was also some big figure. We have other small cubes in the Multiprovider also. Hope you got the bottle neck situation now.

                I have proposed a solution to overcome above mentioned all disadvantages. Archiving is not possible in my case as it is against our management policy. But we are trying our best to convince the management to get the approvals . So I have made a proposal to organize all years data as per the extent of usage. You could see "Very Less usage" and "Mostly Used in Trend Analysis" in the above figure for your understanding.

                Strategy 1 :
I have proposed to club the older years cubes by keeping the extent of usage in mind. Usually we do Trend Analysis by considering Last Year and Current year figures. Some companies may go with last 3 years also, depends up on the client's requirement.

                                                                      My Proposed Dart Mart LayerYearwise MP1.JPG

      � �          I have clubbed 2008 to 2011 into a single Infocube as it is hardly used for business analysis. If you have observed, I have split  (2012 & 2013) Cube into separate Cube for each year. Because, As I stated in the beginning, I have 13 Crore records for both years in a single Cube. It has been observed that reports are running slower when the users do trend analysis, as we approach towards the year end.

                 Eventually, we will have to club 2012, 2013 and 2014(may be) into a single Infocube when the importance of that period is very less. The whole idea is to merge older Cubes as the time passes by based on the importance and usage of data by Business Analysts.

Advantages of this Strategy

  • Lower Administration efforts by focusing on  current Cubes
  • Faster dropping and re-building indices during data loads
  • Trend Analysis can be done quickly
  • All the other disadvantages (which are mentioned in the beginning) can be overcome by this approach

Steps to carry out this entire activity without disturbing the existing setup

  1. Draw flow charts like above in a white paper. So that you will have great idea on what you are going to do in your system
  2. Take any one year Cube and make 4 copies(only Structures) as per the second image
  3. Do not forget to partition these new cubes according to the respective years by 0calmonth. Because, the copied Cubes will have old partitions based on first image. This is a key step to organize your data systematically
  4. Note: We have EDW Layer(Central DSO which is our total warehouse) under Data Mart Layer. So I will be uploading data to these new Cubes from the central DSO
  5. Copy the transformation bet ween DSO and existing Cube(any year, because transformation is uniform for all year wise Cubes) to the DSO and New Cubes
  6. Create Full load DTPs for all new Cubes and load all the respective Cubes by chunks
  7. Copy the existing Multiprovider with a new technical name and extend the description by "Back up".
                    
         Strategy 2 :  The idea in copying is, I will detach the existing year wise Cubes from the original Multiprovider and add the new year wise Cubes to the original Multiprovider(make sure assignments are done to the new cubes). By this, we need not to wo rry about our Bex Queries, WAD and Portal. Because, all the Queries which are based on the existing Multiprovider will continue to run as usual. The only difference is, the data will be retrieved from the new year wise Cubes. This strategy will completely avoid the risk of touching front-end.

  8.   Regenerate all your Queries in RSRT to refresh them from the database level. This will ensure all your queries will fetch data from the new Cubes
  9.   Make sure to change the 0INFOPROV restriction to the new Cubes in your queries, if any
10.   Modify all your Process Chains with the new Cube variants
11.   Verify the data thoroughly in your Portal Report s by comparing with the data display of you Back Up Multiprovider with same Query restrictions/Filters etc..
12.   You can also verify the data by copying Queries from your main Multiprovider to Back up Multiprovider by using RSZC
13.   Observe your Portal reports for 2-3 days by comparing with the Back up Multiprovider. If everything is ok, then you can drop all the old Cubes

Conclusion

                 This is a pure Production Support/maintenance activity, which will have to be taken up over a period of time. As Archiving is not in my scope, I have proposed above ideas and implemented the same.

No comments: