Tip of the day – Compact Repository

Hi Frenz,

This morning, while I was talking to one of my colleagues, he asked me, “Sudheer, what’s today’s BODI tip?” At the spur of the moment an idea popped into my mind (somehow my mind is racing and working pretty sharply these days 🙂): why can’t I share the DI tips I’m aware of that would be useful?

Note: You can find all my technical tips by selecting “Tip of the day” in the category list, starting from today.

Today’s tip topic – Compact Repository

Compacting/compressing the local repository on a regular basis will definitely help your repo respond quickly. Here is what happens: whenever you save an object in DI Designer, a new version of the object’s information is written into your local repo. The repository compacting process removes unnecessary and obsolete information from the repo, which helps it respond faster.
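Conceptually, compacting boils down to keeping only the latest version of each object and discarding the obsolete ones. Here is a tiny Python sketch of that idea; the data structure is purely illustrative and has nothing to do with the actual repository schema:

```python
# Illustrative sketch of what "compact repository" does conceptually:
# keep only the newest version of each object, drop obsolete versions.
# The (object_name, version_number) structure is hypothetical.

def compact(versions):
    """versions: list of (object_name, version_number) tuples."""
    latest = {}
    for name, ver in versions:
        if name not in latest or ver > latest[name]:
            latest[name] = ver
    return list(latest.items())

history = [
    ("DF_LOAD_CUSTOMER", 1),
    ("DF_LOAD_CUSTOMER", 2),
    ("WF_CUSTOMER", 1),
    ("DF_LOAD_CUSTOMER", 3),
]
print(compact(history))  # only the latest version of each object survives
```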

ETL Revision History and its importance

Well, I’m an ETL consultant and I mostly work at client sites on short- or long-term assignments. As soon as I get an assignment with the proper environment set up, I have to jump into ETL activities: writing ETL code, changing existing code per new requirements, enhancements, and so on.

When it comes to development, I strongly believe that writing ETL code from scratch is much easier than understanding existing code written by someone else. Sometimes… no… many times in the consulting world, you will touch others’ code, and you might change any object: a job, workflow, dataflow, table, script, function, etc.

Assume you need to work on an object (a job/workflow/dataflow) that was developed by XYZ, who has already left the organization. Now the actual problem: there is no proper documentation describing the flow of the job; or the appropriate document was misplaced; or no one knows its exact location; or it was never updated after the initial draft. In short, for whatever reason, no document describes the actual flow of the job, and no other person in the organization can guide you through it.

Many questions might arise in your mind:

  • How many times has that dataflow been revised?
  • Who modified it, and when was it last modified?
  • What is the latest version of the code, and where is it available?
  • What are the base documents (ETL design docs and mapping docs) used to build the dataflow, and where can you find them (URLs to the Teamsite/SharePoint location)?
  • What is the latest build number? What defects were handled in the last build?

You might get answers to these questions from defect-tracking tools, code version-control tools, and the like. But can you get those answers as soon as you receive the code? No way: you have to do all your analysis (which costs a good amount of time), and then you may need to gather information from different people on the team who can help you out.

If you have a REVISION HISTORY inside your ETL code, it can answer some of these questions immediately, in one go, such as:

  • Who created the code and when it was created.
  • Who modified it (if at all) and when it was modified.
  • A brief description of the previous change in the dataflow.
  • The defects fixed in the current version of the code.

You can use a script step in BODI before the parent workflow. However, different organizations use different templates: some people use BODI annotations inside/outside the dataflow, and some use scripts before the dataflow/workflow, etc.

Here is the revision history / change management template that I would like to share. If you have something similar in your jobs, you can easily track these things painlessly and in no time.

################################################################################################
# PROGRAM NAME          : JOB_CUSTOMER_DIM
# PROGRAM DESCRIPTION   : Job to load Customer dimension
# CREATED ON            : 04/04/2006
# CREATED BY            : Sudheer Sharma Goda
# VERSION NUMBER        : 1.0
# CHANGES MADE          : New Code
# DESIGN DOC REFERENCE  : URL of the SharePoint location
# MAPPING DOC REFERENCE : URL of the SharePoint location
################################################################################################

################################################################################################
#################################### CHANGE MANAGEMENT #########################################
################################################################################################

################################################################################################
# CHANGE DATE           : 05/05/2006
# CHANGED BY            : Sudheer Sharma Goda
# VERSION NUMBER        : 1.1
# CHANGES MADE          : 1- Added new query transform (qry_getkeys) to get the keys
#                       : 2- Replaced source customer table with customer_type
#                       : 3- Applied lookup function in the 2nd query transform
#                       : 4-
#                       : 5-
#
################################################################################################
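If you want the same header format across all jobs, a small generator can render it from a few fields. This Python sketch is just one way to do that; the field names mirror the template above, and the formatting choices are my own:

```python
# Sketch: render a revision-history header like the one above from a
# dict of fields, so every job gets an identically formatted block.
# Purely illustrative; adapt the width and field set to your standard.

def render_header(fields, width=96):
    border = "#" * width
    lines = [border]
    for key, value in fields.items():
        lines.append(f"# {key.upper():<22}: {value}")
    lines.append(border)
    return "\n".join(lines)

header = render_header({
    "program name":        "JOB_CUSTOMER_DIM",
    "program description": "Job to load Customer dimension",
    "created on":          "04/04/2006",
    "created by":          "Sudheer Sharma Goda",
    "version number":      "1.0",
    "changes made":        "New Code",
})
print(header)
```

Paste the rendered block into a script step or annotation, and update it with each change.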


2009 dwhnotes.com site visitor stats

Monthly history
Month Unique visitors Number of visits Pages Hits Bandwidth
Jan 2009 0 0 0 0 0
Feb 2009 0 0 0 0 0
Mar 2009 0 0 0 0 0
Apr 2009 0 0 0 0 0
May 2009 0 0 0 0 0
Jun 2009 0 0 0 0 0
Jul 2009 12 30 2427 13169 40.65 MB
Aug 2009 163 301 6783 16864 184.46 MB
Sep 2009 203 281 1270 12343 231.93 MB
Oct 2009 175 294 2240 20995 210.86 MB
Nov 2009 218 336 3297 19864 245.16 MB
Dec 2009 256 391 6800 26719 297.84 MB
Total 1027 1633 22817 109954 1.18 GB

MDM

Nowadays many companies acknowledge that MDM is part of their future for maintaining data governance. Let’s talk about Master Data Management (MDM).

  • What is MDM?

Let’s take a small example. In my organization we have two source systems: one is a legacy system (mainframe) and the other is a revamped system (which will eliminate the existing legacy system once our enterprise DW goes live). To elaborate with a small piece: someone enters member information through the legacy system, which has a fixed approach to entering member nomenclature. Say the legacy system stores the name in its database as Sudheer G Sharma, whereas the same member, entered from the other (revamped) system, is stored as Goda, Sudheer Sharma.

Although these systems store the same member information, they use different naming conventions in their respective databases. You face real challenges when you reconcile the data: you’ve got to de-dupe it, and you must have complex logic for data reconciliation.

If you have MDM in place, you load the member information into the MDM database (which maintains high-quality data), and from there you pull it into the respective databases. That way you don’t lose data quality, and you don’t have to do the reconciliation work: keep a master copy in a consolidated, centralized database, and pick records from this harmonized source whenever you need them.
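To see why the reconciliation is painful without MDM, consider the two name formats above. Even simple token matching fails, because the legacy record carries an initial (“G”) where the other carries the full surname. Here is a toy Python sketch of the kind of matching rule you end up hand-writing; it is simplistic and purely illustrative, not production de-dupe logic:

```python
# Toy illustration of hand-written reconciliation logic: match two
# name strings that may differ in token order and use of initials.
# Simplistic by design; real MDM matching is far more involved.

def tokens(name):
    """Lowercase tokens, ignoring comma-based ordering."""
    return name.replace(",", " ").lower().split()

def same_person(a, b):
    """Match if every token in one name has a counterpart in the other:
    an exact match, or an initial ('g') matching a full token ('goda')."""
    ta, tb = tokens(a), tokens(b)

    def covered(t, other):
        return any(
            t == o
            or (len(t) == 1 and o.startswith(t))
            or (len(o) == 1 and t.startswith(o))
            for o in other
        )

    return all(covered(t, tb) for t in ta) and all(covered(t, ta) for t in tb)

print(same_person("Sudheer G Sharma", "Goda, Sudheer Sharma"))  # True
```

With an MDM hub, this fragile logic lives in one place instead of being reinvented in every downstream system.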

  • What are the major activities of MDM?

Planning, implementation, and control activities to assure the consistency of data; in other words, maintaining a “golden version” of contextual data values.

  • Planning: First of all, understand the reference and master data integration needs, then understand the sources; based on these inputs, design the architecture.
  • Implementation: Define rules, establish the golden version of records, and implement the appropriate data integration solutions.
  • Other activities: Replicate and distribute reference and master data.
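The “establish the golden version of records” step above can be sketched in a few lines. Once duplicates have been matched, the hub builds one golden record by picking each field from the most trusted source. The source names and priorities below are made up for illustration:

```python
# Hypothetical survivorship sketch: build a "golden" record from matched
# duplicates by trusting sources in priority order. Source names and
# priorities are invented for this example.

SOURCE_PRIORITY = {"revamped": 1, "legacy": 2}  # lower = more trusted

def golden_record(records):
    """records: list of (source, {field: value}) for one matched entity."""
    ranked = sorted(records, key=lambda r: SOURCE_PRIORITY[r[0]])
    golden = {}
    for _, fields in ranked:
        for field, value in fields.items():
            golden.setdefault(field, value)  # most trusted source wins
    return golden

matched = [
    ("legacy",   {"name": "Sudheer G Sharma", "phone": "555-0101"}),
    ("revamped", {"name": "Goda, Sudheer Sharma"}),
]
print(golden_record(matched))
```

Note how the name survives from the revamped system, while the phone number, missing there, is filled in from the legacy record.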

This is just a small piece of MDM and how it works. I hope you can now see the importance of MDM.

First SAP MDM implementation using BusinessObjects XI- DI/DQ services

I was going through some MDM articles on the net and found a nice video post about an SAP MDM implementation using the SAP BusinessObjects family. Browse http://sapmdm.ning.com/ for more information.

Lexmark MDM Implementation from Nick Chapman on Vimeo.

Courtesy : http://sapmdm.ning.com/
