SAP Data Services
SAP Business Objects Data Services Overview
Data Services is a GUI based ETL tool from SAP Business Objects. It is a client server technology and an integrated tool-set for designing, running, monitoring, scheduling and administrating the ETL jobs.
Data Integrator/Data Services allows to develop data warehouse applications in both Batch and Real-Time job mode. It manages the data from heterogeneous and homogeneous sources and transforms the data according to business logic, integrates into single data source for analysis. It extracts the data from operational source systems and external source systems typically like RDBMS, Flat files, XML files, XML Messages, SAP IDoc’s, PeopleSoft, Siebel etc. We can develop complex dataflow mappings using simple GUI, point & click techniques
Here I’m going to share some basic stuff in DI/DS, which is useful for beginners (novice).
Data Services Topics
- Case
- Date_Generation
- Data_Transfer
- Effective_Date
- Hierarchy_Flattening
- History_Preserving
- Key_Generation
- Map_CDC_Operation
- Map_Operation
- Merge
- Pivot
- Query
- Reverse Pivot
- Row_Generation
- SQL
- Table_Comparison
- Validation
- XML_Pipeline
- Lookups
- Migration of Repositories (Export/Import)
- System Configuration – Multiple Datastore settings
- DI Central Repo vs IBM’s Rational ClearCase (Version Controlling)
Best Practices
- Significance of Overflow option
- Flatfile Lookup
- Cummulative Measure
- Loading DI Log files(Error, Trace, Monitor) into a table
- Repo Password Encryption
- How to get job execution statistics using repository tables.
- How to import atl into repository using al_engine
Tips & Tweaks
- How to view more than 1000 records in View data section window
- How can I see complete Workflow/Dataflow names in the workspace
- Troubleshooting DI standard toolbar docking
- Importance of Annotations
- Compact Repository
- Pushdown_Sql()
What’s New in SAP BODI/DS
- Business Objects Data Integrator 6.1
- Business Objects Data Integrator 6.5
- Business Objects Data Integrator 11.5 XI
- Business Objects Data Integrator 11.7 Accelerated
- SAP Business Objects Data Services 12.0
- SAP Business Objects Data Services 12.1
- SAP Business Objects Data Services 4.0 Vision & Roadmap
- SAP Business Objects Data Services 4.0

Dear Sudheer,
We got very good stuff on database,Datawarehousing,BODI in single blog, This stuff is more
than enough for BODI learners. Please update with new features in BODI.
Cheers
Balamurali(BM)
Dear Sudheer,
The info on BODI is magnanimous.This is the real good stuff for the BODI learners.After checking the blog we came to know some hidden featues in BODI.Please update with remaining features. We all will contribute u with some new info to make the blog more successful.
Cheers
Sant:)sh
Dear sudheer,
Doing great job, Keep it up.
Thanks & Regards,
Sunil Lakhani
Nice blog sudheer, will keep you posted of any updates on my blog ref webservices and other new ideas
http://www.bodiwebservices.blogspot.com/
Thanks again
Phatz
Hi Sudheer,
can you plz also add some info about other remaining transforms and functions.This is the only Blog which has helped me gaining some confidence on BODI.The explanation is awesome.
Thank you
SK
Hi,
We really appreciate your effort in putting this together. Can you please add more information about scheduling DI jobs as a web service?
Thank you.
Is this stuff is enough to get the certification of BO DI XI R2 Accelerated.
Hey Sandeep,
No, this stuff is not sufficed to get the certification in DI. My technical dairy would give you an idea that how to implement things in DI, that’s it.
You need to grind technical manuals completely and I would say you needed to have a practical experience first.
Pay attention on these concepts
DI Designer Transformations
DI Management console/Web Admin
Metadata Management
If you have any outstanding questions/doubts you can e-mail me. Will try to reply as soon as possible
Cheers
Sudheer
Hi,
Im quite new to BO-DI! While surfing I got to see ur blog. Really amazing… as u have put all the info at a single POC. Actually Im interested in the installation of BO-DI on Windows(Im having Oracle 10g on my system). I have gone thru the topic “Data Integrator Installation Steps, but it just says how to create a repo. Can u suggest me the steps to install, coz I got the following error during installation: The installer cannot configure the CMS Integrator because BOE Java SDK is not installed.” From where can I get the S/W of BOE Java SDK ? Even is it licensed?
Please help me out in resolving the installation issue.
Thanks,
Lalitha
Hello Lalitha,
Thanks for visiting my site and leaving your valuable comments on it.
You haven’t mentioned which version of DI you’re using. I think you’re trying to configure Metadata repository thru Metadata integrator using the CMS server, you should have BOE Client/server Application installed on the same machine where Metadata Integrator is installed and BOE should be installed before Metadata Integrator.
If you want to use only DI/DS designer and related web components (DI Admin & Metada Reports). While installing you can uncheck Metadata Integrator component.
Steps
1. Run the setup
2. Keying the Product License
3. Select the path
4. Select the componets (Here you need to uncheck the Metadata Integrator component)
Just give a right click on the component you want to uncheck and say ”Entire feature will be unavailable”
Now, you can see a cross mark image on the left side of the component and go ahead with the other installation steps.
Hope this helps you! let me know if you have any concerns or questions
Cheers,
Sudheer
Hi Sudheer,
I found this forum very intresitng ..where one get a complete knowledge about DI. However i have one question…how to avoid access violation error in DI. Some of my daily jobs are failing due to access violation error. Hence please give me some idea about this error and how to resolve/avoid this one.
Thanks
Sarada
Thanks for visiting my notes and for your comments.
In order to solve your issue, I need more specific information from you….
1. Please provide error description and error #
2. A brief description of your flow
3. Version of DI and Database information
If I know these details precisely, I could figure out and try to give my best solution to get rid of this issue.
Cheers,
Sudheer
Really a gud stuff…..
Sudheer,
Very useful stuff. Thank you!
The link to How to get job execution stats using repository tables seems to be broken. Can you please fix that?
Thank you.
Thanks for your comments and letting me know about the broken links issue Sushmitha and Hari
Broken links are been corrected.
Thanks,
Sudheer
Hi
I watched you are site once. It is awesome.
But from past 2 days it is not opening please try to check this.
Good site !!!!
Thank
Hello Sudheer,
I was looking Data Integrator stuff on net and I came across your wonderful site. I really appreciate your efforts for preparing such a wonderful notes in DW related area to help people like me. It would be great if you could provide some information about other transformations and more useful features DI. Thanks for sharing this useful information.
-Sri
Hi Sudheer,
Previously i have mentioned about an error that is Access violation error and below are the details–
Source — SAP
Target — Datawarehouse (Oracle)
below is the error–
(11.7) 01-22-10 23:06:42 (E) (7488:8624) SYS-170101: |Session JOB_W_DMN_Sales_Partner_SAP_GEB|Workflow WFL_Sls_Ptnr_Orphan|Dataflow DFL_Orphan_Sold_to_Cust
System Exception occurred. Process dump is written to
and
Process dump is written to and
Call stack:
001B:00D0DBCB, vcharUserInterfaceRegular::allocVchar()+0059 byte(s), x:\src\eval\vchar.cpp, line 0334
001B:00D1493A, XRow_data::set_attr()+0650 byte(s), x:\src\eval\row.cpp, line 2110+0105 byte(s)
001B:00D17E3D, XRow_data::cloneVarchars()+0285 byte(s), x:\src\eval\row.cpp, line 0264+0014 byte(s)
001B:00D17FBC, XRow_data::cloneUtilCopy()+0300 byte(s), x:\src\eval\row.cpp, line 0396
001B:00D18405, XRow_data::clone()+0117 byte(s), x:\src\eval\row.cpp, line 0531
001B:FFFFFFFF, WahNotifyAllProcesses()+-1908358535 byte(s)
001B:00D3D384, XTran_desc::execute()+0244 byte(s), x:\src\core\xtran.cpp, line 0546
001B:00C48D33, Rww_thread::main()+0083 byte(s), x:\src\rww\rww.cpp, line 0394
001B:01FEC20B, RWThreadFunctionImp::run()+0139 byte(s)
001B:01FDC9D2, RWRunnableImp::exec()+0226 byte(s)
001B:01FEC4DC, RWThreadImp::exec()+0076 byte(s)
001B:01FED819, RWThreadImp::_setTimeSliceQuantum()+0249 byte(s)
001B:77E6482F, GetModuleHandleA()+0223 byte(s)
Registers:
EAX=00000015 EBX=00000015 ECX=059D8C68 EDX=01174140 ESI=059D8C28
EDI=00000000 EBP=00000015 ESP=0815FC34 EIP=00D0DBCB FLG=00010202
CS=001B DS=0023 SS=0023 ES=0023 FS=003B GS=0000
Exception code: C0000005 ACCESS_VIOLATION
Fault address: 00D0DBCB 01:005CDBCB E:\DATAIN~1\Bin\acta.dll
Hey Sarada…
Process dump can be happen when your Job/WF/DF consumes more memory. Try to optimize the DF as much as you can… There is a feature/setting called pageable cache that may help you to get rid of this issue. However, you have to find out that what causes this memory consumption??
Cheers,
Sudheer
Hi Sudheer,
It’s very nice site to get complete information on Data Integretor. Thanks for putting all the info in one place. I am quite new to BODI. Using the BODI 11.5 version. One of the requirement is notifying supervisor with a mail when there is a bad reacord/ duplicate record with first 20 lines of error log. Could you suggest me how do we do this. Appriciate you suggestions.
Thanks,
Prahas.
My apologies for late reply and Thanks a lot for your comments Prahas. Well, for your scenario the best approach is you need to use 3rd party mailing utilities which can send attachments (for ex:- BLAT32). And for the first top n lines of error log, you can use “head” unix command, I’m assuming yours is a unix box. If it is Windows box, try with same command if it doesn’t work then you may need to download Windows 2003 Resource kit from Microsoft website.
Please let me know if you have any questions.
Cheers,
-Sudheer
Hi sudheer,
I have put the settings to pageable cache but it is not working. The job runs daily but it is not failing everyday…somedays it is failing…my concern is like what is the reason for this…..and i guess memory is not an issue…because ther is sufficient memory available…..I am eager to find out the root cause of this issue.
Thanks
Sarada
I would suggest you to talk to SAP support people or post it on DI forum (http://www.forumtopics.com/busobj), hope you will receive better relies there.
Thanks,
Sudheer
Hey Sudheer,
I am trying to find a solution to this ABAP issue all over.
Already an ABAP programme is created by SAP consultant. this truncate/reloads data in one of the SAP table.
How can we call that ABAP programme in DI XIR2
Thanks in advance
Subhash,
I never get a chance to work on ABAP calls using DI. I would highly recommend you to post this on DI forum http://www.forumtopics.com/busobj/ . Hope you’ll get better responses there.
Cheers,
Sudheer
Hi Sudheer,
Hope you remember me Dude,
This site is really very good. I learn t a lot of stuff on DI from http://www.dwhnotes.com.
Thank you very much Sudheer for sharing the information.The way you presented TOPICS (Transformations and Best Practices) are awesome.
I feel proud to work with you.
Regards
Sudheer.:)
Thanks for your comments Sudheer.
Really this Blog giving A very gud stuff for learners….superb one…Its enough for Beginners…
Hi Sudheer,
Im working on BODI and got stuck up with few jobs. The number of rows in the jobs are huge. I want to test the job with limited number of rows. What can i do to limit that? Thnx
Hi Nikhil,
You could filter out rows using some where conditions in query transform. Have a look at this post and give a try
http://dwhnotes.com/archives/1325
Let me know if this helps.
-Sudheer
Hi Sudheer,
I am facing a new issue. What am i supposed to do to import the data from source system to a CVS file in local system?
Thnx
-Nikhil
Elaborate your issue, so that I can provide some inputs to get rid of that.
Cheers,
-Sudheer
Hey nice interface..!!! We are importing data and getting it into our system as permanent tables after running our jobs.But now user wants the data of few tables in .csv file. What need do i make to make it successful.?Thnx
-Nikhil
Thanks….As a spectator, even I like the interface. Well… If I understood your porblem correctly then create a job with csv delimited flatfile as a target and dump all your source data in a specified location.
Let me know if you need any help.
Cheers,
Sudheer
Hi ,
Thanks for the help.Having an issue, how do we map the columns in the flat file as a target??
Okay…
1. Fisrt of all place your soruce table and then
2. Connect it to the query transform and drag the required coulmns to out schema
3. Now place a defined flatfile(csv delimited) as target and connect it to query transform.
4. Now check all the connections, names, fileds and their datatypes properly.
5. Save the Dataflow/job definition and validate the job to make sure without any errors.
That’s it your done with the design, now execute the job. Data will be populated into your file.
Cheers,
Sudheer
Hey Sudheer ,
thanks .. I figured the last issue out . Its working fine now. thanks again man ..
Wow.. that’s great… Not a problem dude.
-Sudheer
Hey .. is there any method where we can send the target data directly to the e-mail? If there is whats the procedure for that ??
Unfortunately there is no straight way to send email attachments from BODI/DS. You have to setup thrid party tools(like BLAT32 etc.,) to send email attachments from BODI/DS other option is using scheduler tools you can achieve this, again perhaps your organization uses 3rd party scheduler tools and that should have the functionality to send email attachments.
-Sudheer
Okie .. thanks man .. we are working on server and doesnt have rights to save the data on the server. where we have rights to save data , there we cannot view data.
Have to check with the manager regading this .. anyways thanks for your support.. Will disturb you as soon as i get sumthin
Not a problem… bug me anytime…
-Sudheer
Hi Sudheer,
Is there any way in which we can make log the jobs running in DATA MANAGEMENT CONSOLE inot excel automatically?? As in the jobs run directly go into the excel,so that we can analyse the jobs there using filters there ..
Hi Sudheer,
Is there any way in which we can make log the jobs running in DATA MANAGEMENT CONSOLE inot excel automatically?? As in the jobs run directly go into the excel,so that we can analyse the jobs there using filters there ..
Thnx..
Nikhil
Not exactly… but certainly my previous post could help you out. Just change the target object(Define excel file formats for each log and place them as targets).
http://dwhnotes.com/data-integrator/loademtfiles
Cheers,
Sudheer
Hi Sudheer,
We have have a table generated in ORACLE database which has information about all other tables which we are saving in the database. I want to know more about this auto generated table. As in do we have this table for each repository or for all the repository there will just be one table.?
Also what is the information that is there in that table about other tables?
What im planning to do is import this table into the BO and then take its data to the excel sheet for weekly maintainance for our jobs ?? What do u say about it ..??
Thnx
Nikhil
Well, in any project we’ll used to store some metadata information about required DI objects and database objects in a control tables. And we maintain these control tables accross all the environments. Generally these tables stores info about job name,wf name, df name, src & tgt table names, jobid, job start time, end time, no. of rows, …. etc., In some organizations they’ll have their own mornitoring tools to get the data from the these control tables. In some organizations they wont mantain anything… Support ppl maintains kind of excel sheets to log these details.
Coming to your thought… that’s not a bad idea… you can give a try… but tell me what you’re trying to putup in excel sheet? what you want to maintain weekly?
-Sudheer
Hey Sudhir,
can you please let me know how to create a DS job with the below info
A specific folder contains around 200 flat fiels..and I need only the name of the flat files to get loaded into the target table not the data of the flat files.
how to do this in DS??
Thanks in advance
Sarada
Hey Sarada,
Well I could think of these two methods,
Method 1 and the easiest way:
Define a fileformat with required parameters and there is an option called DI_Filename or something like that… on the Fileformat editor page… you can define a global variable and pass it over there to read the file names. Now inside the Dataflow you can read the file as a source and map only DI_Filename as a target coulmn and ignore other columns. This way you can achieve.
Method 2:
Write a batch scirpt/shell script to store all the filenames and put it on the server. Now read that filenames file as a source file in DI and populate the data into respective tables.
Let me know if this helps.
Cheers,
Sudheer
Hi Sudheer ,
Im loading data from sap server to local rep. using BO ETL. but there are some tables which have huge data and while loading the tables the error im getting is “TIME LIMIT EXCEEDED”. and the job fails. Can you please suggest some ways to get this rectified.
Nikhil
Dear Nikhil,
Seems like there is time setting while extracting info from SAP Server. I would suggestyou to post on http://www.forumtopics.com/busobj/viewforum.php?f=83&sid=2061addd53064a06411cae9085ec2141
Hope you would get a better responses over there and post it with a clear content like
1. Error number
2. Error Description
3. Source & Target Databases and their versions
4. ETL version
5. and small idea about your issue.
Cheers,
Sudheer
Hi
Can you please tell me in detail how to use the Matched Transform? What all steps do we need to do so that it starts working.
A user wants to use this to compare data from 2 tables and is having a RED CROSS on the matched transform stage while doing it.What can be the error and how to remove it ?
Thanks
Nikhil
Hi Nikhil,
I would need more details, probably a screen shot could help to figure out the root cause of the issue. And I would recommend you to go thru the technical manuals for all the transformations in detail before you work on that component. If I get time will definitely update content on blog with new Transformations & their functionalities(which are due from my side).
Cheers,
Sudheer
Hi
While using the Data Management Console in BODS, im getting the Error -
ORA-00933: SQL command not properly ended .
What can be the cause of such an error . How can we resolve this issue.
Thanks
Hi,
Can you tell me what exactly you’re trying to access in management console?
-Sudheer
Hi ,
The 3rd tab in data managment console is for Repository Schedule .. while clicking on that tab this error is coming on the screen.
thnx
Sudheer ..
Can i have your mail id so that i can mail you directly ..??
Hi Nikhil, I setup a login credentials for you. You could use it to post comments. Check you personal email Id for the same.
Cheers,
-Sudheer
Seems like your metadata information is not writing/updating correctly. Try to access other components in Management console and let me know if you face any issues.
-Sudheer
Hi,
I tried running the jobs . They are running fine . Just that while scheduling this problem occurred. I dont know how, but while scheduling i got this error
I’m suspecting some of your repository tables might be corrupted. How many users are there for the same repo, I mean how many developer are accessing the same repository? If more than one user is accessing the same repo then try to access the same page in another system.
That page in another system is showing valid. As in someone other rep , the same page is showing valid. In our rep. only this prob is happening. I think i know the job which is showing error. Should i drop that table and again rerun the job. how will it work then?
No do not drop any repository tables just like that. Have a backup of everything(copy of repo in atl, database backup of your complete repo) and now, upgrade your repository using repository manager. Remember click on upgrade option. It will update the repository tables, once this is done retest the issue. Let me know if you find any difficulties.
-Sudheer
i think its an SQL error . While scheduling the job this error occured. What can be the possible solution for the error. I know the table which is causing this issue.
Sudheer ,
I think there is a error statement in the database in the back end . I think we need to roll back that statement to make that run. How can we do that ?
What’s that error statement that you see on the screen? send me screen shot to godasudheer@dwhnotes.com and mention the short notes/steps that you’re following
-Sudheer
Hey Sudheer,
I deleted the job and reimported it , its working fine now. I think there is an issue in the table , and it has got corrupted. Lets see how this is going to work out . Anyways , thanks for your help . love u brother.
Nikhil
Great man…Great… now, you have got the solution.
Cheers,
-Sudheer
Hi sudheer ,
The same problem has occured again
. Now while scheduling any job in data Management Console, when i schedule a job, after filling in the schedule details, when i click the confirm button the error message comes there.
ORA-00933: SQL command not properly ended
This is the only thing that comes there . Can it be a database thing ??
thats the error im getting .
Could be metadata tables problem. Follow the below steps carefully…
1. Remove your repository from mgmt. console.
2. Compact your local repository once.
3. Now, turn on calculate column mappings.
4. Save your repo with all data and close.
5. Now attach your repo to mgmt console and retest the issue
If it still persists
1. Take then entire repo back(both database level backup and atl version)
2. Now, upgrade your repo using repo manager
3. re-configure your repo in management console.
Now, retest the same issue.
Let me know if you face any issues.
Cheers,
-Sudheer
Hi Sudheer,
Very interesting and useful blog. I have a few questions to you. Please reply, as and when you get a chance.
1) User-id will be created at BO side or DB side?
2) One repository can be accessed by different users with different credentials (ID/PWD)?
3) What is the best practice to create BODI landscape? I mean DEV/QA/PRD? Shall we install separately on servers? OR install one time and create separate repositories for DEV/QA/PRD?
Thanks,
Sree
Hi Sree,
Very interesting and useful blog.
Sudheer> Thanks for visiting my site and appreciate your valuable comments
1) User-id will be created at BO side or DB side?
Sudheer> When you say user-id, I’m assuming it as a repository. User should be created at DB side only
2) One repository can be accessed by different users with different credentials (ID/PWD)?
Sudheer>Yes. One repository can be accessed by different users, but its not at all recommended.
The best practice for this is…
a) Create no. of users at DB as per your requirement (lets say 10 users)
b) Then I would recommend you to create two logical job servers(with diff unique ports) on DI/DS installed server machine to configure these users
c) Now assign, 5 users(repositories) to one logical job server (JS1) and attach the other 5 users(repositories) to another logical job server(JS2).
3) What is the best practice to create BODI landscape? I mean DEV/QA/PRD? Shall we install separately on servers? OR install one time and create separate repositories for DEV/QA/PRD?
It depends, if your organization is not afford to have multiple servers, then you can go with server/two servers (One for Dev & QA and one server dedicated for PROD) which is not recommendable again but depends. Most of the organization go with multiple dedicated server licenses. In that case, your organization may have to buy multiple server licenses(one server for each environment)
Hope this helps!….do let me know if you need more clarification on this.
Cheers,
Sudheer
Thanks for your prompt respone. Appreciate your inputs. I have a question.
What are the pros/cons, if install 3 repositories(Dev/QA/PRD) into one server and Dev/PRD separately on different servers?
Thanks,
Sree
Sridhar,
More elaborating my question. i.e. What are the advantage and disadvantages on below cases.
1) Installing BOE on one server and create 2 repositories for Dev/Prod ?
2) Installing BOE on two different servers and create Dev/Prod repositories on separate servers?
Please let me know, what BO recommends for?
Thanks,
Sudheer, Sorry worngly mentioned your name as Sridhar. Please reply me, as early as possible on landscape installaion, i.e. one BOE installation for Dev & Prod and separate BOE installation for Dev & Prod on different servers. Want to know the advantages and disadvantages.
Never mind sree…
If you’re looking for BOE(OLAP tool) environment setup, then I would recommend you to go thru these docs once….
Lifecycle Manager
http://help.sap.com/businessobject/product_guides/boexir31SP2/en/xi31_sp2_lcm_user_en.pdf
Technical Requirements
http://help.sap.com/businessobject/product_guides/planningxir2_112/en/xir2_BusinessObjects_Planning_Technical_Requirements_11.20_en.pdf
The suggested way for any tools is better to have one dedicated server for Prod and one server for both(Dev & QA/UAT). The main advantage I could think of is it improves the tool response performance.
If you’re looking for an ETL tool setup then its a different story again. Please mention which tool environment setup you’re looking for precisely
Cheers,
Sudheer
Sudheer,
Thanks a lot. One more question, i.e. How to maintain Business Objects security and user authorizations (BODI/BODS).
Please send me, if you have any documentation?
Thanks,
Sree
Sree,
I would call it as Data Integrator/Data Services rather Business Objects. I do not have any related docs to share(since those are non-disclosure docs), however, will definitely post some information about security & admin soon.
Cheers,
Sudheer
Sudheer,
Thanks a bunch.