SAP Data Services



SAP BusinessObjects Data Services Overview

Data Services is a GUI-based ETL tool from SAP BusinessObjects. It is a client-server technology and an integrated tool set for designing, running, monitoring, scheduling, and administering ETL jobs.

Data Integrator/Data Services lets you develop data warehouse applications in both batch and real-time job modes. It manages data from heterogeneous and homogeneous sources, transforms it according to business logic, and integrates it into a single data store for analysis. It extracts data from operational and external source systems such as RDBMSs, flat files, XML files, XML messages, SAP IDocs, PeopleSoft, and Siebel. Complex dataflow mappings can be developed with simple GUI point-and-click techniques.

Here I’m going to share some basic DI/DS material that is useful for beginners.

Data Services Topics

Data Services Installation Steps
Data Services Transformations

Tips & Tweaks

What’s New in SAP BODI/DS


196 comments on “SAP Data Services”

  1. Dear Sudheer,

    We got very good stuff on databases, data warehousing, and BODI in a single blog. This stuff is more
    than enough for BODI learners. Please update it with new features in BODI.

    Cheers
    Balamurali(BM)

  2. Dear Sudheer,
    The info on BODI is tremendous. This is really good stuff for BODI learners. After checking the blog we came to know some hidden features in BODI. Please update it with the remaining features. We will all contribute some new info to make the blog more successful.
    Cheers
    Sant:)sh

  3. Hi Sudheer,

    Can you please also add some info about the other remaining transforms and functions? This is the only blog that has helped me gain some confidence in BODI. The explanation is awesome.

    Thank you
    SK

  4. Hi,

    We really appreciate your effort in putting this together. Can you please add more information about scheduling DI jobs as a web service?

    Thank you.

  5. Hey Sandeep,

    No, this stuff is not sufficient to get the certification in DI. My technical diary gives you an idea of how to implement things in DI, that’s it.

    You need to go through the technical manuals completely, and I would say you need to have practical experience first.

    Pay attention to these concepts:

    DI Designer Transformations
    DI Management console/Web Admin
    Metadata Management

    If you have any outstanding questions/doubts, you can e-mail me. I will try to reply as soon as possible.

    Cheers
    Sudheer

  6. Hi,

    I’m quite new to BO-DI! While surfing I came across your blog. Really amazing, as you have put all the info in one place. Actually, I’m interested in installing BO-DI on Windows (I have Oracle 10g on my system). I have gone through the topic “Data Integrator Installation Steps”, but it just says how to create a repo. Can you suggest the steps to install? I got the following error during installation: “The installer cannot configure the CMS Integrator because BOE Java SDK is not installed.” Where can I get the BOE Java SDK software? Also, is it licensed?

    Please help me out in resolving the installation issue.

    Thanks,
    Lalitha

  7. Hello Lalitha,

    Thanks for visiting my site and leaving your valuable comments on it.

    You haven’t mentioned which version of DI you’re using. I think you’re trying to configure the Metadata repository through the Metadata Integrator using the CMS server. You should have the BOE client/server application installed on the same machine where the Metadata Integrator is installed, and BOE should be installed before the Metadata Integrator.

    If you want to use only the DI/DS Designer and related web components (DI Admin & Metadata Reports), you can uncheck the Metadata Integrator component during installation.

    Steps:
    1. Run the setup.
    2. Key in the product license.
    3. Select the path.
    4. Select the components (here you need to uncheck the Metadata Integrator component).

    Just right-click the component you want to uncheck and select “Entire feature will be unavailable”.
    Now you can see a cross mark on the left side of the component; go ahead with the other installation steps.

    Hope this helps you! Let me know if you have any concerns or questions.

    Cheers,
    Sudheer

  8. Hi Sudheer,

    I found this forum very interesting; one gets complete knowledge about DI here. However, I have one question: how do I avoid the access violation error in DI? Some of my daily jobs are failing due to an access violation error. Please give me some idea about this error and how to resolve/avoid it.

    Thanks
    Sarada

    • Thanks for visiting my notes and for your comments.

      In order to solve your issue, I need more specific information from you:

      1. Please provide the error description and error #
      2. A brief description of your flow
      3. The version of DI and database information

      Once I know these details precisely, I can figure it out and try to give you my best solution to get rid of this issue.

      Cheers,
      Sudheer

  9. Sudheer,

    Very useful stuff. Thank you!
    The link to “How to get job execution stats using repository tables” seems to be broken. Can you please fix that?

    Thank you.

    • Thanks for your comments, and for letting me know about the broken-links issue, Sushmitha and Hari.
      The broken links have been corrected.
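
      In the meantime, a rough Python sketch of pulling execution stats straight from the repository. AL_HISTORY is the usual history table, but the column names below are assumptions and vary by version, so check your repository first:

      import cx_Oracle  # assuming an Oracle-hosted repository

      # Placeholders -- substitute your repository credentials and TNS alias.
      conn = cx_Oracle.connect("repo_user", "repo_pwd", "repo_tns")
      cur = conn.cursor()

      # SERVICE / START_TIME / END_TIME / HAS_ERROR are assumed column names.
      cur.execute("""
          SELECT service, start_time, end_time, has_error
          FROM   al_history
          ORDER  BY start_time DESC
      """)
      for job_name, started, ended, has_error in cur.fetchmany(20):
          status = "FAILED" if has_error else "OK"
          print(f"{job_name}: {started} -> {ended} [{status}]")
      conn.close()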

      Thanks,
      Sudheer

  10. Hi

    I visited your site once. It is awesome.

    But for the past 2 days it has not been opening; please check this.

    Good site!

    Thanks

  11. Hello Sudheer,

    I was looking for Data Integrator material on the net and I came across your wonderful site. I really appreciate your efforts in preparing such wonderful notes on DW-related areas to help people like me. It would be great if you could provide some information about other transformations and more useful features of DI. Thanks for sharing this useful information.

    -Sri

  12. Hi Sudheer,

    Previously I mentioned an error, the access violation error; below are the details:

    Source — SAP
    Target — Datawarehouse (Oracle)

    Below is the error:

    (11.7) 01-22-10 23:06:42 (E) (7488:8624) SYS-170101: |Session JOB_W_DMN_Sales_Partner_SAP_GEB|Workflow WFL_Sls_Ptnr_Orphan|Dataflow DFL_Orphan_Sold_to_Cust
    System Exception occurred. Process dump is written to
    and

    Process dump is written to and

    Call stack:
    001B:00D0DBCB, vcharUserInterfaceRegular::allocVchar()+0059 byte(s), x:\src\eval\vchar.cpp, line 0334
    001B:00D1493A, XRow_data::set_attr()+0650 byte(s), x:\src\eval\row.cpp, line 2110+0105 byte(s)
    001B:00D17E3D, XRow_data::cloneVarchars()+0285 byte(s), x:\src\eval\row.cpp, line 0264+0014 byte(s)
    001B:00D17FBC, XRow_data::cloneUtilCopy()+0300 byte(s), x:\src\eval\row.cpp, line 0396
    001B:00D18405, XRow_data::clone()+0117 byte(s), x:\src\eval\row.cpp, line 0531
    001B:FFFFFFFF, WahNotifyAllProcesses()+-1908358535 byte(s)
    001B:00D3D384, XTran_desc::execute()+0244 byte(s), x:\src\core\xtran.cpp, line 0546
    001B:00C48D33, Rww_thread::main()+0083 byte(s), x:\src\rww\rww.cpp, line 0394
    001B:01FEC20B, RWThreadFunctionImp::run()+0139 byte(s)
    001B:01FDC9D2, RWRunnableImp::exec()+0226 byte(s)
    001B:01FEC4DC, RWThreadImp::exec()+0076 byte(s)
    001B:01FED819, RWThreadImp::_setTimeSliceQuantum()+0249 byte(s)
    001B:77E6482F, GetModuleHandleA()+0223 byte(s)
    Registers:
    EAX=00000015 EBX=00000015 ECX=059D8C68 EDX=01174140 ESI=059D8C28
    EDI=00000000 EBP=00000015 ESP=0815FC34 EIP=00D0DBCB FLG=00010202
    CS=001B DS=0023 SS=0023 ES=0023 FS=003B GS=0000
    Exception code: C0000005 ACCESS_VIOLATION
    Fault address: 00D0DBCB 01:005CDBCB E:\DATAIN~1\Bin\acta.dll

    • Hey Sarada…

      A process dump can happen when your job/WF/DF consumes too much memory. Try to optimize the DF as much as you can. There is a feature/setting called pageable cache that may help you get rid of this issue. However, you have to find out what causes this memory consumption.

      Cheers,
      Sudheer

  13. Hi Sudheer,
    It’s a very nice site to get complete information on Data Integrator. Thanks for putting all the info in one place. I am quite new to BODI, using version 11.5. One of the requirements is notifying a supervisor by mail when there is a bad record/duplicate record, with the first 20 lines of the error log. Could you suggest how we can do this? I appreciate your suggestions.

    Thanks,
    Prahas.

    • My apologies for the late reply, and thanks a lot for your comments, Prahas. Well, for your scenario the best approach is to use a third-party mailing utility which can send attachments (for example, BLAT32). And for the first n lines of the error log, you can use the “head” Unix command; I’m assuming yours is a Unix box. If it is a Windows box, try the same command; if it doesn’t work, you may need to download the Windows 2003 Resource Kit from the Microsoft website.
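
      For illustration, a minimal Python sketch of that approach, with smtplib standing in for BLAT32; the SMTP host, addresses, and log path are placeholders, not a known setup:

      import smtplib
      from email.message import EmailMessage

      LOG_PATH = "/opt/ds/log/error_log.txt"  # hypothetical error-log location
      with open(LOG_PATH) as f:
          first_lines = "".join(f.readlines()[:20])  # same idea as `head -20`

      msg = EmailMessage()
      msg["Subject"] = "Bad/duplicate records detected"
      msg["From"] = "etl@example.com"          # placeholder sender
      msg["To"] = "supervisor@example.com"     # placeholder recipient
      msg.set_content("First 20 lines of the error log:\n\n" + first_lines)

      with smtplib.SMTP("mailhost.example.com") as smtp:  # assumed SMTP relay
          smtp.send_message(msg)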

      Please let me know if you have any questions.

      Cheers,
      -Sudheer

  14. Hi Sudheer,

    I have applied the pageable cache setting, but it is not working. The job runs daily but it is not failing every day; some days it fails. My concern is what the reason for this is, and I guess memory is not an issue, because there is sufficient memory available. I am eager to find out the root cause of this issue.

    Thanks
    Sarada

  15. Hey Sudheer,

    I am trying to find a solution to this ABAP issue all over.
    An ABAP program has already been created by an SAP consultant; it truncates/reloads data in one of the SAP tables.
    How can we call that ABAP program from DI XI R2?

    Thanks in advance

  16. Hi Sudheer,

    Hope you remember me, dude!

    This site is really very good. I learnt a lot about DI from http://www.dwhnotes.com.
    Thank you very much, Sudheer, for sharing the information. The way you presented the topics (transformations and best practices) is awesome.

    I feel proud to have worked with you.

    Regards
    Sudheer.:)

  17. Hi Sudheer,

    I’m working on BODI and got stuck on a few jobs. The number of rows in the jobs is huge. I want to test the jobs with a limited number of rows. What can I do to limit that? Thanks

    • Okay…
      1. First of all, place your source table, and then
      2. connect it to a Query transform and drag the required columns to the output schema.
      3. Now place a defined flat file (CSV delimited) as the target and connect it to the Query transform.
      4. Now check all the connections, names, fields, and their data types properly.
      5. Save the dataflow/job definition and validate the job to make sure there are no errors.
      That’s it, you’re done with the design; now execute the job. Data will be populated into your file.
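
      And if you only need the first N rows of that extract for testing, a quick standalone Python sketch (file names are placeholders) can cut the file down:

      N = 1000  # number of test rows to keep

      # Copy the header plus the first N data rows into a smaller test file.
      with open("extract.csv") as src, open("extract_small.csv", "w") as dst:
          dst.write(src.readline())  # header row
          for i, line in enumerate(src):
              if i >= N:
                  break
              dst.write(line)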

      Cheers,
      Sudheer

  18. Hey, is there any method where we can send the target data directly by e-mail? If there is, what’s the procedure for that?

    • Unfortunately, there is no straightforward way to send email attachments from BODI/DS. You have to set up third-party tools (like BLAT32, etc.) to send email attachments from BODI/DS. Another option is to use a scheduler tool; perhaps your organization uses a 3rd-party scheduler tool that has the functionality to send email attachments.

      -Sudheer

  19. Okay, thanks man. We are working on a server where we don’t have rights to save data. Where we do have rights to save data, we cannot view the data.

  20. I’ll have to check with the manager regarding this. Anyway, thanks for your support. Will disturb you as soon as I get something 🙂

  21. Hi Sudheer,

    Is there any way we can automatically log the jobs run in the Data Management Console into Excel? That is, the job runs would go straight into Excel so that we can analyse them there using filters.

    Thanks,
    Nikhil

  23. Hi Sudheer,

    We have a table generated in the Oracle database which has information about all the other tables we are saving in the database. I want to know more about this auto-generated table: do we have this table for each repository, or is there just one table for all repositories?
    Also, what information is there in that table about the other tables?
    What I’m planning to do is import this table into BO and then take its data to an Excel sheet for weekly maintenance of our jobs. What do you say about it?

    Thanks
    Nikhil

    • Well, in any project we usually store some metadata about the required DI objects and database objects in control tables, and we maintain these control tables across all the environments. Generally these tables store info such as job name, WF name, DF name, source & target table names, job ID, job start time, end time, number of rows, etc. Some organizations have their own monitoring tools to get the data from these control tables; in others, the support people maintain Excel sheets to log these details.
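
      As an illustration, a Python sketch that dumps such a control table to CSV so it can be filtered in Excel; the table and column names below are hypothetical examples, not a standard DI schema:

      import csv
      import cx_Oracle

      # Placeholders -- substitute your warehouse credentials and TNS alias.
      conn = cx_Oracle.connect("etl_user", "etl_pwd", "dwh_tns")
      cur = conn.cursor()

      # ETL_JOB_CONTROL and its columns are hypothetical names.
      cur.execute("""
          SELECT job_name, df_name, job_start_time, job_end_time, row_count
          FROM   etl_job_control
          WHERE  job_start_time >= TRUNC(SYSDATE) - 7
      """)

      with open("weekly_job_stats.csv", "w", newline="") as f:
          writer = csv.writer(f)
          writer.writerow([col[0] for col in cur.description])  # header row
          writer.writerows(cur)                                 # data rows
      conn.close()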

      Coming to your thought… that’s not a bad idea; you can give it a try. But tell me, what are you trying to put in the Excel sheet? What do you want to maintain weekly?

      -Sudheer

  24. Hey Sudheer,

    Can you please let me know how to create a DS job with the below info:

    A specific folder contains around 200 flat files, and I need only the names of the flat files to be loaded into the target table, not the data in the flat files.

    How can I do this in DS?

    Thanks in advance
    Sarada

    • Hey Sarada,

      Well, I can think of these two methods.

      Method 1, and the easiest way:
      Define a file format with the required parameters; there is an option called DI_Filename or something like that on the file format editor page. You can define a global variable and pass it there to read the file names. Now, inside the dataflow, read the file as a source and map only DI_Filename to a target column, ignoring the other columns. This way you can achieve it.

      Method 2:
      Write a batch script/shell script to store all the filenames in a file and put it on the server. Now read that filenames file as a source file in DI and populate the data into the respective tables.
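
      A minimal Python sketch of Method 2 (paths are placeholders):

      import os

      SRC_DIR = "/data/inbound"                  # folder holding the flat files
      LIST_FILE = "/data/inbound/file_list.txt"  # list file DI reads as a source

      # Write one file name per line so DI can load it as a simple flat file.
      with open(LIST_FILE, "w") as out:
          for name in sorted(os.listdir(SRC_DIR)):
              if name != os.path.basename(LIST_FILE):  # skip the list itself
                  out.write(name + "\n")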

      Let me know if this helps.

      Cheers,
      Sudheer

  25. Hi Sudheer,

    I’m loading data from an SAP server to the local repository using BO ETL, but some tables have huge data, and while loading those tables I get the error “TIME LIMIT EXCEEDED” and the job fails. Can you please suggest some ways to rectify this?

    Nikhil

  26. Hi

    Can you please tell me in detail how to use the Match transform? What steps do we need to take so that it starts working?
    A user wants to use it to compare data from two tables and is getting a RED CROSS on the Match transform stage while doing it. What can the error be, and how do we remove it?

    Thanks
    Nikhil

    • Hi Nikhil,

      I would need more details; a screenshot could probably help figure out the root cause of the issue. And I would recommend you go through the technical manuals for all the transformations in detail before you work with a component. If I get time, I will definitely update the blog with new transformations and their functionality (which is due from my side).

      Cheers,
      Sudheer

  27. Hi

    While using the Data Management Console in BODS, I’m getting the error:
    ORA-00933: SQL command not properly ended

    What can be the cause of such an error? How can we resolve this issue?

    Thanks

  28. Hi,

    The 3rd tab in the Data Management Console is for the repository schedule. While clicking on that tab, this error comes up on the screen.

    Thanks

  29. Hi,

    I tried running the jobs; they are running fine. It’s just that this problem occurred while scheduling. I don’t know how, but while scheduling I got this error.

  30. I’m suspecting some of your repository tables might be corrupted. How many users are there for the same repo? I mean, how many developers are accessing the same repository? If more than one user is accessing the same repo, then try to access the same page from another system.

  31. That page is showing as valid from another system. That is, in someone else’s repo the same page shows as valid; this problem is happening only in our repo. I think I know the job which is showing the error. Should I drop that table and rerun the job? How will it work then?

    • No, do not drop any repository tables just like that. Take a backup of everything (a copy of the repo as an ATL file, and a database backup of your complete repo), and then upgrade your repository using the Repository Manager. Remember to click the Upgrade option. It will update the repository tables; once this is done, retest the issue. Let me know if you run into any difficulties.

      -Sudheer

  32. I think it’s an SQL error; it occurred while scheduling the job. What can be the possible solution for the error? I know the table which is causing this issue.

  33. Sudheer ,

    I think there is an error statement in the database on the back end. I think we need to roll back that statement to make it run. How can we do that?

  34. Hey Sudheer,

    I deleted the job and re-imported it; it’s working fine now. I think there was an issue with the table, and it had got corrupted. Let’s see how this works out. Anyway, thanks for your help. Love you, brother.

    Nikhil

  35. Hi Sudheer,

    The same problem has occurred again :(. Now, while scheduling any job in the Data Management Console, after I fill in the schedule details and click the Confirm button, the error message appears:

    ORA-00933: SQL command not properly ended

    This is the only thing that comes up there. Can it be a database thing? That’s the error I’m getting.

    • Could be a metadata tables problem. Follow the steps below carefully…

      1. Remove your repository from the Management Console.
      2. Compact your local repository once.
      3. Now turn on “calculate column mappings”.
      4. Save your repo with all data and close.
      5. Now attach your repo to the Management Console and retest the issue.

      If it still persists:
      1. Take a backup of the entire repo (both a database-level backup and an ATL export).
      2. Now upgrade your repo using the Repository Manager.
      3. Re-configure your repo in the Management Console.
      Now retest the same issue.

      Let me know if you face any issues.

      Cheers,
      -Sudheer

  36. Hi Sudheer,

    Very interesting and useful blog. I have a few questions for you. Please reply as and when you get a chance.
    1) Will the user ID be created on the BO side or the DB side?
    2) Can one repository be accessed by different users with different credentials (ID/PWD)?
    3) What is the best practice for the BODI landscape, i.e. DEV/QA/PRD? Shall we install separately on servers, or install once and create separate repositories for DEV/QA/PRD?

    Thanks,
    Sree

    • Hi Sree,

      Very interesting and useful blog.
      Sudheer> Thanks for visiting my site, and I appreciate your valuable comments.

      1) Will the user ID be created on the BO side or the DB side?
      Sudheer> When you say user ID, I’m assuming you mean a repository user. The user should be created on the DB side only.

      2) Can one repository be accessed by different users with different credentials (ID/PWD)?
      Sudheer> Yes, one repository can be accessed by different users, but it’s not recommended at all.
      The best practice for this is:
      a) Create the number of users in the DB as per your requirement (let’s say 10 users).
      b) Then I would recommend you create two logical job servers (with different unique ports) on the DI/DS server machine to serve these users.
      c) Now assign 5 users (repositories) to one logical job server (JS1) and attach the other 5 users (repositories) to the other logical job server (JS2).

      3) What is the best practice for the BODI landscape, i.e. DEV/QA/PRD? Shall we install separately on servers, or install once and create separate repositories for DEV/QA/PRD?
      It depends. If your organization cannot afford multiple servers, then you can go with one or two servers (one for Dev & QA and one dedicated to PROD), which again is not recommended, but it depends. Most organizations go with multiple dedicated servers; in that case, your organization may have to buy multiple server licenses (one for each environment).

      Hope this helps! Do let me know if you need more clarification on this.

      Cheers,
      Sudheer

  37. Thanks for your prompt response. I appreciate your inputs. I have a question:
    What are the pros/cons of installing 3 repositories (Dev/QA/PRD) on one server versus installing Dev/PRD separately on different servers?

    Thanks,
    Sree

  38. Sridhar,

    Elaborating on my question: what are the advantages and disadvantages in the cases below?
    1) Installing BOE on one server and creating 2 repositories for Dev/Prod?
    2) Installing BOE on two different servers and creating Dev/Prod repositories on separate servers?

    Please let me know what BO recommends.

    Thanks,

  39. Sudheer, sorry, I wrongly mentioned your name as Sridhar. Please reply as early as possible on the landscape installation, i.e. one BOE installation for Dev & Prod versus separate BOE installations for Dev & Prod on different servers. I want to know the advantages and disadvantages.

  40. Sudheer,

    Thanks a lot. One more question: how do we maintain Business Objects security and user authorizations (BODI/BODS)?
    Please send me any documentation you have.

    Thanks,
    Sree

    • Sree,

      I would call it Data Integrator/Data Services rather than Business Objects. I do not have any related docs to share (since those are non-disclosure docs); however, I will definitely post some information about security & admin soon.

      Cheers,
      Sudheer
