
Monday 1 September 2014

The following are the different phases involved in an ETL project development life cycle.

1) Requirement Gathering
a) Business Requirement Collection (BRD)
--> Business requirement gathering is carried out by the business analyst, the onsite technical lead, and the client's business users.
--> In this phase, the business analyst prepares the business requirement document (BRD), also called the business requirement specification (BRS).
--> BR collection takes place at the client location.
--> The outputs of business requirement analysis are:
--> BRS: business requirement specification
--> SRS: system requirement specification
b) System Requirement Collection (SRS)
Senior technical people or the ETL architect prepare the SRS, which contains the software and hardware details.
The SRS includes:
a) the OS to be used (Windows or Unix)
b) the RDBMS required to build the database (Oracle, Teradata, etc.)
c) the ETL tools required (Informatica, DataStage)
d) the OLAP tools required (Cognos, BO)
The SRS is also called the technical requirement specification (TRS).

2) Design Phase
a) High Level Design document (HLD)
An ETL architect and a DWH architect participate in designing the solution to build the DWH.
The HLD document is prepared based on the business requirements.
b) Low Level Design document (LLD)
Based on the HLD, a senior ETL developer prepares the low level design document.
The LLD contains the more technical details of the ETL system.
The LLD contains the data flow diagram (DFD) and details of the sources and targets of each mapping.
The LLD also contains information about full and incremental loads.
After the LLD, the development phase starts.
3) Development Phase
a) Mapping design
--> Based on the LLD, the ETL team creates the mappings (ETL code).
--> After the mappings are designed, the code (mappings) is reviewed by the developers.
b) Code review
--> Code review is done by a developer.
--> In code review, the developer reviews the code and the logic, but not the data.
--> The following activities take place in code review:
--> check the naming standards of the transformations, mappings, etc.
--> check the source-to-target mapping (whether the correct logic is placed in the mapping or not).
c) Peer review
--> The code is reviewed by a team member (a third-party developer).
4) Testing
a) Unit testing
--> A unit test for the DWH is a white box test; it checks the ETL procedures and mappings.
--> The following test cases can be executed by an ETL developer:
1) verify that there is no data loss
2) compare the number of records in the source and target
3) data load / insert
4) data load / update
5) incremental load
6) data accuracy
7) verify naming standards
8) verify column mapping
--> The unit test is carried out by the ETL developer in the development phase.
--> The ETL developer also has to do the data validations in this phase; a couple of these checks are sketched in SQL below.
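For example, test cases 1 and 2 can be validated with plain SQL; the table names src_customer and dim_customer below are hypothetical stand-ins for an actual source and target:

    -- Test case 2: record counts in the source and target should match
    SELECT (SELECT COUNT(*) FROM src_customer) AS source_count,
           (SELECT COUNT(*) FROM dim_customer) AS target_count
    FROM   dual;

    -- Test case 1: verify data loss -- source keys missing from the target
    SELECT s.customer_id
    FROM   src_customer s
    WHERE  NOT EXISTS (SELECT 1
                       FROM   dim_customer t
                       WHERE  t.customer_id = s.customer_id);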

b) Development integration testing
--> Run all the mappings in sequence order.
--> First run the source-to-stage mappings.
--> Then run the mappings related to the dimensions and facts.
c) System integration testing
--> After the development phase, we move the code to the QA environment.
--> In this environment, the testing people are given read-only permissions.
--> They test all the workflows.
--> They test the code according to their own standards.
d) User acceptance testing (UAT)
--> This test is carried out in the presence of client-side technical users to verify the data migration from source to destination.
5) Pre-production
6) Production (go-live)

Regards,
Venkat

Could anyone please tell me how to explain an Informatica project in an interview?

What are your daily routines?
            Checking whether any important mails have come; discussing with the team lead if there is any work to do; sending a mail to the team lead about the work by EOD; attending the weekly status meeting.

How many mappings have you created altogether in your project?

      As of now I have done 3 projects. First project: 48 workflows; second: 52 workflows; third project: 156 workflows so far.

In which account does your project fall?
          Account? I believe it is manufacturing and advanced services.

What is your reporting hierarchy?

         me --> team lead --> project manager --> programming manager
         me --> AR --> HR

How many complex mappings have you created? Could you please describe the situation for which you developed that complex mapping?
          I think 8. One was a character-by-character comparison (not possible with Informatica alone), so I wrote a single SQL query of about 950 lines; one was for dynamic hierarchy distribution; and one was for multi-byte characters (Japanese, Chinese, etc.), and so on.
What is your involvement in performance tuning of your project?
   Sometimes; in any case, the performance team takes care of my code.
What is the schema of your project? And why did you opt for that particular schema?
     some_Prd --> that is the source for us.

What are your roles in this project?

          As a developer: designing the workflows, unit testing, and so on.


Can I have one situation you handled by which performance improved dramatically?
         Yes; in my first project I faced a performance issue after go-live.

Were you involved in more than two projects simultaneously?
         Of course; I was involved in 3 projects at a time.

Do you have any experience in production support?

         No.

What kinds of testing have you done on your project (unit, integration, system, or UAT)? And were enhancements done after testing?
        Unit testing and sometimes integration testing. UAT is done by the business, not by the BI team.
How many dimension tables are there in your project and how are they linked to the fact table?

         In the current project, 18 dimensions and 2 facts. They are always linked to the fact table through the dimension keys.


How do we do the fact load?
           Loading the fact table? It runs after the dimension load completes, as sketched below.
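As a rough SQL sketch of such a fact load (all table and column names here are hypothetical), the stage rows look up the surrogate keys of the already-loaded dimensions:

    -- Fact load sketch: runs only after the dimension loads complete,
    -- because it needs the dimension surrogate keys
    INSERT INTO fact_sales (customer_key, product_key, date_key, amount)
    SELECT c.customer_key,
           p.product_key,
           d.date_key,
           s.amount
    FROM   stg_sales    s
    JOIN   dim_customer c ON c.customer_id = s.customer_id
    JOIN   dim_product  p ON p.product_id  = s.product_id
    JOIN   dim_date     d ON d.cal_date    = s.sale_date;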

How did you implement CDC in your project?

        Change data capture? It is always based on the source modification date, as sketched below.
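A minimal sketch of that kind of CDC extract, assuming a hypothetical src_orders table with a last_modified_date column and an Informatica mapping parameter $$LAST_RUN_DATE holding the previous successful run time:

    -- Incremental (CDC) extract: pick up only rows modified since the last run
    SELECT *
    FROM   src_orders
    WHERE  last_modified_date > TO_DATE('$$LAST_RUN_DATE', 'YYYY-MM-DD HH24:MI:SS');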


What does your file-to-load mapping look like?

            source --> ODS --> flat file (.dat), cap delimiter

What does your load-to-stage mapping look like?

         It depends on the project:
          source --> ODS --> stage

What does your stage-to-ODS mapping look like?

          Stage to ODS? Never.

What is the size of your data warehouse?
           10 TB
What is your daily feed size and weekly feed size?
        Feed size?
Which approach (top-down or bottom-up) was used in building your project?
            Bottom-up. I assume you are asking about dimension-to-fact versus fact-to-dimension?
How do you access your sources (are they flat files or relational)?

             Relational, and sometimes flat files as well.

Have you developed any stored procedures or triggers in this project? How did you use them and in which situation?

            No.

Did your project go live? What are the issues that you faced while moving your project from the test environment to the production environment?

         Yes; I faced some issues.

What is the biggest challenge that you encountered in this project?

        Dynamic hierarchy data distribution, and moving the files from the Unix box to the Informatica directory through a shell script; the data had to be cleansed in the flat file itself.


What is the scheduler tool you have used in this project? How did you schedule jobs using it?

          Dollar Universe, by using the $U session task.

Friday 8 August 2014

Slowly Changing Dimensions(SCD) full Description

Slowly Changing Dimensions: Slowly changing dimensions are dimensions in which the data changes slowly, rather than changing regularly on a time basis.
Let's say I have a customer dimension with mainly these columns:
(Customer Id, Customer First Name, Customer Last Name, Customer Country)
Customer Id | Customer First Name | Customer Last Name | Customer Country
1           | Sudheer             | Sharma             | India

Now, this customer has moved to the US. In the source the country has been changed to US, and we need to update our target dimension to reflect this change.
SCD Type 1: The new incoming record (the changed/modified data) replaces the existing old record in the target.
Customer Id | Customer First Name | Customer Last Name | Customer Country
1           | Sudheer             | Sharma             | US

The old value (India) is overwritten by the new value (US), and there is no way to find the old version of the data; the table holds only the current version.
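A minimal SQL sketch of a Type 1 load (Oracle MERGE syntax; the table names src_customer and dim_customer are hypothetical):

    -- SCD Type 1: overwrite in place, keep no history
    MERGE INTO dim_customer t
    USING src_customer s
    ON (t.customer_id = s.customer_id)
    WHEN MATCHED THEN UPDATE SET
         t.customer_country = s.customer_country
    WHEN NOT MATCHED THEN INSERT
         (customer_id, customer_first_name, customer_last_name, customer_country)
         VALUES (s.customer_id, s.customer_first_name, s.customer_last_name, s.customer_country);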

SCD Type 2: In this case, an additional record is added to the customer dimension. The beauty of this approach is that it maintains two versions: you will find two records, the older version and the current version. In other words, it maintains history. We can implement Type 2 by the following methods:
1. Versioning
2. Effective dates
3. Setting current-flag values / record indicators
Method 1: Versioning

Customer Id | Customer First Name | Customer Last Name | Customer Country | Effective Year | Version
1           | Sudheer             | Sharma             | India            | 2008           | 0
1           | Sudheer             | Sharma             | US               | 2009           | 1

Method 2: Effective Dates

Customer Id | Customer First Name | Customer Last Name | Customer Country | Effective Start Date | Effective End Date
1           | Sudheer             | Sharma             | India            | 01/01/2008           | 12/31/2008
1           | Sudheer             | Sharma             | US               | 01/01/2009           | till date

Method 3: Effective Dates & Current Record Indicators

Customer Id | Customer First Name | Customer Last Name | Customer Country | Effective Start Date | Effective End Date | Current Record IND
1           | Sudheer             | Sharma             | India            | 01/01/2008           | 12/31/2008         | N
1           | Sudheer             | Sharma             | US               | 01/01/2009           | till date          | Y
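A minimal SQL sketch of a method 3 load for this change (the dim_customer table is hypothetical; the "till date" sentinel is represented as a far-future date):

    -- Step 1: close out the current version of the row
    UPDATE dim_customer
    SET    effective_end_date = DATE '2008-12-31',
           current_record_ind = 'N'
    WHERE  customer_id = 1
    AND    current_record_ind = 'Y';

    -- Step 2: insert the new version as the current record
    INSERT INTO dim_customer
           (customer_id, customer_first_name, customer_last_name, customer_country,
            effective_start_date, effective_end_date, current_record_ind)
    VALUES (1, 'Sudheer', 'Sharma', 'US',
            DATE '2009-01-01', DATE '9999-12-31', 'Y');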

SCD Type 3: In this approach, only the information about the previous value of a dimension attribute is written into the database. An 'old' or 'previous' column is created to store the immediately previous value of the attribute.
Product ID | Product Name | Current Year | Current Price | Previous Year | Previous Price
1          | Close-up     | 2008         | 50.00         | 2007          | 45.00


The problem with this approach is that if the product price changes continuously over the years, the complete history is not stored; only the latest change is kept. For example, if in 2009 the price changes to 60, we would no longer be able to see the 2007 price, since the old values would have been overwritten with the 2008 information.
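The 2009 price change from the example would be applied like this in a Type 3 table (column names follow the hypothetical table above):

    -- SCD Type 3: shift current values into the "previous" columns,
    -- then overwrite them; only one prior version survives
    UPDATE dim_product
    SET    previous_year  = current_year,
           previous_price = current_price,
           current_year   = 2009,
           current_price  = 60.00
    WHERE  product_id = 1;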

Top 30 Informatica Interview Questions and Answers

1.What is Data Warehousing?
Ans:
Data warehouses are widely used within the largest and most complex businesses in the world.
Usage within moderately large organizations, even those with more than 1,000 employees, remains surprisingly low at the moment.
We are confident that use of this technology will grow dramatically in the next few years. In challenging times, good decision-making becomes critical.
The best decisions are made when all the relevant data available is taken into consideration.
The best possible source for that data is a well-designed data warehouse.
To make any new decision or to introduce a new plan, data warehousing is very important.
*ETL is one of the main processes in data warehousing. ETL means extract, transform, and load data into the data warehouse.
Informatica is an ETL tool. It is very flexible and cheaper compared with other ETL tools.

2.What is Business Intelligence (BI)?
Ans:
BI is an abbreviation of the two words Business Intelligence: bringing the right information at the right time to the right people in the right format.
It is a 5-step process to run your business smarter, starting with registering the right data correctly, collecting the data from multiple sources,
then transforming, combining, and storing it in a data warehouse.

3.What is a Dimension Table?
Ans:
A dimension table stores attributes, or dimensions, that describe the objects in a fact table.
A dimension table has a primary key column that uniquely identifies each dimension record (row).
Types of Dimensions:
 * Slowly Changing Dimensions
 * Rapidly Changing Dimensions
 * Junk Dimensions
 * Inferred Dimensions
 * Conformed Dimensions
 etc.

4.What is a Fact Table?
Ans:

A fact table stores quantitative information for analysis and is often denormalized.

A fact table works with dimension tables. The fact table holds the data to be analyzed, and the dimension tables store data about the ways in which that data can be analyzed. Thus, the fact table consists of two types of columns: the foreign key columns allow joins with the dimension tables, and the measure columns contain the data that is being analyzed.
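To illustrate the two column types, a typical star query sums a measure from the fact table while grouping by an attribute from a joined dimension (table and column names here are hypothetical):

    -- Measures come from the fact table; the descriptive attribute and the
    -- grouping come from the dimension joined via the foreign key
    SELECT d.customer_country,
           SUM(f.amount) AS total_amount
    FROM   fact_sales   f
    JOIN   dim_customer d ON d.customer_key = f.customer_key
    GROUP  BY d.customer_country;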

5.What are the Fundamental Stages of Data Warehousing?
Ans:
i. Offline Operational Databases: data warehouses in this initial stage are developed by simply copying the database of an operational system to an offline server, where the processing load of reporting does not impact the operational system's performance.
ii. Offline Data Warehouse: data warehouses in this stage of evolution are updated on a regular time cycle (usually daily, weekly, or monthly) from the operational systems, and the data is stored in an integrated, reporting-oriented data structure.
iii. Real Time Data Warehouse: data warehouses at this stage are updated on a transaction or event basis, every time an operational system performs a transaction (e.g. an order, a delivery, a booking, etc.).
iv. Integrated Data Warehouse: data warehouses at this stage are used to generate activity or transactions that are passed back into the operational systems for use in the daily activity of the organization.

8.What are the Different Methods of Loading Dimension tables?
Ans:
i. Conventional load: before loading the data, all the table constraints are checked against the data.
ii. Direct load (faster loading): all the constraints are disabled and the data is loaded directly. Later, the data is checked against the table constraints, and the bad data is not indexed.

9.What is Data Mining?
Ans:
Data mining is the practice of automatically searching large stores of data to discover patterns and trends that go beyond simple analysis.
Data mining uses sophisticated mathematical algorithms to segment the data and evaluate the probability of future events.
Data mining is also known as Knowledge Discovery in Data (KDD).

The key properties of data mining are:

Automatic discovery of patterns

Prediction of likely outcomes

Creation of actionable information

Focus on large data sets and databases

Data mining can answer questions that cannot be addressed through simple query and reporting techniques.

10.What is the Difference between a View and a Materialized View?
Ans:
A view is nothing but a saved SQL query: it takes the output of a query and makes it appear like a virtual table, which does not take up any storage space or contain any data.

A materialized view, however, is a schema object that stores the result of a query in a separate schema object (i.e. it takes up storage space and contains data). This means the materialized view returns a physically separate copy of the table data.
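Side by side, the difference shows in the DDL (Oracle syntax; the names v_active_customers and mv_active_customers are hypothetical):

    -- A view stores only the query text; rows are fetched from the
    -- base tables every time the view is queried
    CREATE VIEW v_active_customers AS
    SELECT customer_id, customer_country
    FROM   dim_customer
    WHERE  current_record_ind = 'Y';

    -- A materialized view stores the result rows themselves and
    -- must be refreshed to pick up changes in the base tables
    CREATE MATERIALIZED VIEW mv_active_customers
    REFRESH COMPLETE ON DEMAND
    AS
    SELECT customer_id, customer_country
    FROM   dim_customer
    WHERE  current_record_ind = 'Y';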

11.What is a staging area?
Ans:
The staging area is the place where you hold temporary tables on the data warehouse server.
Staging tables are connected to the work area or to the fact tables. We basically need a staging area to hold the data and to perform data cleansing and merging before loading the data into the warehouse.

12.What is OLAP?
Ans:
OLAP (online analytical processing) enables a user to easily and selectively extract and view data from different points-of-view.

13.What is the Difference between OLTP and OLAP?


Ans:
                       OLTP                                        OLAP
  1. Application:      Operational: ERP, CRM, legacy apps          Management Information System, Decision Support System
  2. Typical users:    Staff                                       Managers, executives
  3. Horizon:          Weeks, months                               Years
  4.                   Normalization is promoted                   Denormalization is promoted
  5.                   On-Line Transaction Processing              On-Line Analytical Processing

14.What is ODS?
Ans:
> ODS - Operational Data Store.
> The ODS comes between the staging area and the data warehouse.
> The ODS is essentially a staging area in which you can keep your OLTP-type data, i.e. your day-to-day transactional data.

15.What is ER Diagram?
Ans:
An entity-relationship diagram is a data modeling technique that creates a graphical representation of the entities,
and the relationships between entities, within an information system.

16.What are Lookup Tables?
Ans:
The Lookup stage performs the same action as the Merge and Join stages do, but here we have a lookup table that shares a key column with the source table. If the key columns in the source match the lookup table, only those rows are passed to the output; the others can be sent to reject.

17.What are Aggregate Tables?
Ans:
An aggregate table contains a summary of the existing warehouse data, grouped accordingly (by month, quarter, etc.) depending on the business needs. For example, the fact table may contain transactions for all customers, but sometimes you want the sum of the amounts for a particular customer over a month; in that case you would have to run the query against a fact table containing millions of rows, which is time-consuming. To avoid this problem, we pre-aggregate the amounts over each month for each customer in an aggregate table; this improves performance, and you can retrieve the result very fast.
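A sketch of the monthly aggregate described above (the fact_sales table and its sale_date column are hypothetical):

    -- Pre-aggregate amount per customer per month so reports no longer
    -- have to scan millions of detail rows in the fact table
    CREATE TABLE agg_sales_monthly AS
    SELECT customer_key,
           TRUNC(sale_date, 'MM') AS sale_month,
           SUM(amount)            AS total_amount
    FROM   fact_sales
    GROUP  BY customer_key, TRUNC(sale_date, 'MM');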

18.What are Conformed Dimensions?
Ans: Conformed dimensions are dimensions which are common across cubes (cubes are the schemas containing fact and dimension tables). Consider Cube-1 containing F1, D1, D2, D3 and Cube-2 containing F2, D1, D2, D4 as the facts and dimensions; here D1 and D2 are the conformed dimensions.

19.How do you Load the Time Dimension?
Ans:
We can use SCD Type 1/2/3 to load any dimension, based on the requirement.

OR

You can load the time dimension manually by writing scripts in PL/SQL to populate the time dimension table with values for a period.
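Instead of a PL/SQL loop, one way to sketch such a manual load is with Oracle's CONNECT BY row generator (the dim_date table and its columns are hypothetical):

    -- Generate one row per day for a year and load the time dimension
    INSERT INTO dim_date (date_key, cal_date, cal_year, cal_month, cal_day)
    SELECT TO_NUMBER(TO_CHAR(d, 'YYYYMMDD')),
           d,
           EXTRACT(YEAR  FROM d),
           EXTRACT(MONTH FROM d),
           EXTRACT(DAY   FROM d)
    FROM  (SELECT DATE '2014-01-01' + LEVEL - 1 AS d
           FROM   dual
           CONNECT BY LEVEL <= 365);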

20.What is a Level of Granularity of a Fact Table?
Ans:
The granularity is the lowest level of information stored in the fact table; the depth of the data level is known as granularity.
In a date dimension, the level of granularity could be year, quarter, month, period, week, or day.

21.What is a Factless Fact Table?
Ans: A factless fact table is a fact table that does not contain any measures; it contains only keys from the different dimension tables. It is often used to resolve a many-to-many cardinality issue.
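For example, a student-attendance factless fact table holds only keys, and the analysis is simply counting rows (all names hypothetical):

    -- No measure columns: the existence of a row is the recorded fact
    SELECT d.cal_month,
           COUNT(*) AS attendance_days
    FROM   fact_attendance f
    JOIN   dim_date d ON d.date_key = f.date_key
    GROUP  BY d.cal_month;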

22.What are Slowly Changing Dimensions (SCD)?
Ans:
Slowly changing dimensions are dimensions in which the data changes slowly, rather than changing regularly on a time basis.

23.What is a Surrogate Key?
Ans: A surrogate key is a system-generated sequence number, an artificial key, which can be used in maintaining the history of the data.

24.What is Junk Dimension?
Ans:
A "junk" dimension is a collection of random transactional codes, flags, and/or text attributes that are unrelated to any particular dimension.

The junk dimension is simply a structure that provides a convenient place to store the junk attributes. A good example would be a trade fact in a company that brokers equity trades.

25.What is a Data Mart?
Ans: A data mart is a part of the data warehouse.

26.What is Schema? & Types of Schema?
Ans:
There are 3 types of schemas:
1.Star Schema
2.Snowflake Schema
3.Galaxy/Integrated/Hybrid/Constellation Schema
1) Star schema: the fact table is centrally located, surrounded by the dimension tables. Here the dimension tables are denormalized and the fact table is normalized. (Or) A star schema is defined as a fact table and two or more dimension tables related through foreign keys.
2) Snowflake schema: the denormalized dimension tables are split into two or more normalized dimension tables. Here both the dimension tables and the fact table are normalized.
3) Hybrid (constellation) schema: a dimension table is shared by two or more fact tables.

27.Explain the Paradigm of Bill Inmon and Ralph Kimball.
Ans:
    Bill Inmon’s paradigm: Data warehouse is one part of the overall business intelligence system. An enterprise has one data warehouse, and data marts source their information from the data warehouse. In the data warehouse, information is stored in 3rd normal form.

    Ralph Kimball’s paradigm: Data warehouse is the conglomerate of all data marts within the enterprise. Information is always stored in the dimensional model.



Thursday 3 July 2014

Top 20 Informatica Interview Questions with Answers-Part-5

1]    HOW CAN YOU STOP A BATCH?
Answer: By using server manager or pmcmd.

2]    WHAT IS A COMMAND THAT USED TO RUN A BATCH?
Answer: pmcmd is used to start a batch.

3]    WHAT IS A DIMENSION TABLE EXACTLY?
Answer: Dimension tables give descriptions about something. For example, if we take Student as a dimension table, we have various attributes like college name, age, gender, etc., which give some description about it.
4]    WHEN DOES THE INFORMATICA SERVER MARK A BATCH AS FAILED?
Answer: If one of the sessions is configured to "run if previous completes" and that previous session fails.
5]  What is Batch in Informatica & types?
  Ans:  Batches provide a way to group sessions for either serial or parallel execution by the
     Informatica  Server. There are two types of batches:
   a) Sequential: Runs sessions one after the other.
    b) Concurrent: Runs sessions at the same time.
 You might create a sequential batch if you have sessions with source-target dependencies that you   want to run in a specific order. You might also create a concurrent batch if you have several independent sessions you need scheduled at the same time. You can place them all in one batch, then schedule the batch as needed instead of scheduling each individual session.
You can create, edit, start, schedule, and stop batches with the Server Manager. However, you cannot copy or abort batches. With pmcmd, you can start and stop batches.

6]    HOW DOES THE SERVER RECOGNISE THE SOURCE AND TARGET DATABASES?
Answer: By using an ODBC connection if the source is relational, and an FTP connection if it is a flat file. We configure the connections for both sources and targets in the properties of the session.
7]    WHAT IS THE LIMIT TO THE NUMBER OF SOURCES AND TARGETS YOU CAN HAVE IN A MAPPING?
Answer: There is no such restriction on the number of sources or targets inside a mapping.

8] Briefly explain the Aggregator transformation?
Ans: It allows one to do aggregate calculations such as sums, averages, etc. Unlike the Expression transformation, it lets one do calculations on groups of rows.

9] Describe Expression transformation?
Ans: Values can be calculated in a single row before writing to the target in this form of transformation. It can be used to perform non-aggregate calculations. Conditional statements can also be tested before the output results go to the target tables.

10]  What do you mean by filter transformation?
Ans: It is a means of filtering rows in a mapping. Data is passed through the filter transformation and the filter condition is applied. All ports of a filter transformation are input/output, and only the rows that meet the condition pass through the filter.

11] What is Joiner transformation?
Ans: The Joiner transformation combines two related heterogeneous sources residing in different locations, while a Source Qualifier transformation can combine data coming from a common source.
12] What is Lookup transformation?
Ans: It is used for looking up data in a relational table through a mapping. A lookup definition can be imported from any relational database to which both the client and server can connect. One can use multiple Lookup transformations in a mapping.

13]  What are the types of metadata that stores in repository?
Ans: The types of metadata includes Source definition, Target definition, Mappings, Mapplet, Transformations.

14]  What is the difference between a database, a data warehouse and a data mart?
Ans: A database contains a set of sensibly related data which is normally small in size compared with a data warehouse. A data warehouse contains assortments of all sorts of data, and data is extracted from it only according to the customer's needs. A data mart, on the other hand, is a set of data designed to cater to the needs of different domains; for instance, an organization may have a separate chunk of data for each of its departments, i.e. sales, finance, marketing, etc.

15]  What is meant by a domain?
Ans: When all related relationships and nodes are covered by a sole organizational point, it is called a domain. Through this, data management can be improved.

16]  What is the difference between a repository server and a powerhouse?
Ans: Repository  server controls the complete repository which includes tables, charts, and various procedures etc. Its main function is to assure the repository  integrity and consistency. While a powerhouse server governs the implementation of various processes among the factors of server’s database repository.

17]  How can one identify whether a mapping is correct or not without connecting a session?
Ans: One can find out whether the mapping is correct or not, without connecting a session, with the help of the debugging option.


Wednesday 2 July 2014

Top 20 Informatica Interview Questions with Answers-Part-4

1]    CAN U TELL ME HOW TO GO FOR SCDs AND THEIR TYPES? WHERE DO WE USE THEM MOSTLY?
Answer: The "Slowly Changing Dimension" problem is a common one particular to data warehousing. In a nutshell, this applies to cases where the attribute for a record varies over time.

2]    WHAT IS MEANT BY EDW?
Answer: EDW is an Enterprise Data Warehouse, i.e. a centralised DW for the whole organization. This is the Inmon approach, which relies on having a single central warehouse.

3]    IF YOU WANT TO CREATE INDEXES AFTER THE LOAD PROCESS WHICH TRANSFORMATION YOU CHOOSE?
Answer: It is usually not done at the mapping (transformation) level but at the session level. Create a command task which executes a shell script (on Unix) or any other script that contains the create index statements.

4]    WHAT IS A WORKLET, WHAT IS THE USE OF A WORKLET, AND IN WHICH SITUATION CAN WE USE IT?
Answer: A set of workflow tasks is called a worklet. Workflow tasks include 1) timer, 2) decision, 3) command, 4) event wait, 5) event raise, 6) email, etc. We use a worklet to reuse such a set of tasks in different situations.

5]    CAN YOU COPY THE BATCHES?
Answer: NO.

6]    WHAT ARE COST BASED AND RULE BASED APPROACHES AND THE DIFFERENCE
Answer: Cost-based and rule-based approaches are optimization techniques used with databases, where we need to optimize a SQL query. Basically, Oracle provides two types of optimizer: cost-based and rule-based.

7]    HOW MANY WAYS YOU CAN UPDATE A RELATIONAL SOURCE DEFINTION AND WHAT R THEY?
Answer: Two ways: 1. Edit the definition. 2. Reimport the definition.
8]    HOW MANY WAYS CAN YOU CREATE PORTS?
Answer: Two ways: 1. Drag the port from another transformation. 2. Click the Add button on the Ports tab.

9]    WHERE ARE THE JOINER AND SOURCE QUALIFIER JOIN CONDITIONS EXECUTED?
Answer: The joiner condition is executed within the Informatica staging layer; the source qualifier join condition is executed at the database level.

10]    WHAT IS A VIEW? HOW IS IT RELATED TO DATA INDEPENDENCE? WHAT ARE THE DIFFERENT TYPES OF VIEWS, AND WHAT IS A MATERIALIZED VIEW?
Answer: A view is a combination of one or more tables. A view does not store the data; it just stores the query. When we execute the view, the query fetches the data from the tables.

11]    WHAT IS MEANT BY JUNK ATTRIBUTE IN INFORMATICA?
Answer: A dimension is called a junk dimension if it contains attributes which are rarely changed or modified. For example, in the banking domain we can fetch four such attributes into a junk dimension.

12]    WHAT IS THE BEST WAY TO SHOW METADATA (NUMBER OF ROWS AT SOURCE, TARGET AND EACH TRANSFORMATION LEVEL, ERROR RELATED DATA) IN A REPORT FORMAT?
Answer: You can select these details from the repository tables; you can use the view REP_SESS_LOG to get these data.

13]    HOW TO JOIN TWO TABLES WITHOUT USING THE JOINER TRANSFORMATION.
Answer: It is possible to join two or more tables by using the Source Qualifier, provided the tables have a relationship. When you drag and drop the tables, you get a source qualifier for each source; the join can then be specified in the source qualifier, as sketched below.
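For instance, the joined query given to the source qualifier (via its SQL override) might look like this, assuming hypothetical orders and customers tables related on customer_id:

    -- The database executes the join, so no Joiner transformation is needed
    SELECT o.order_id,
           o.order_date,
           o.amount,
           c.customer_name
    FROM   orders o,
           customers c
    WHERE  o.customer_id = c.customer_id;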

14]    HOW CAN WE ELIMINATE DUPLICATE ROWS FROM FLAT FILE?
Answer: Use Sorter Transformation. When you configure the Sorter Transformation to treat output rows as distinct, it configures all ports as part of the sort key.

15]    HOW TO GET TWO TARGETS T1 CONTAINING DISTINCT VALUES AND T2 CONTAINING DUPLICATE VALUES FROM ONE SOURCE S1.
Answer: Use a filter transformation for loading the target with no duplicates, and load the other target directly from the source.

16]    WHAT IS THE DIFF B/W STOP & ABORT IN INFORMATICA SESS LEVEL ?
Answer: Stop: we can restart the session. Abort: we cannot restart the session; we should truncate the whole pipeline and then start the session again.

17]    CAN YOU GENERATE REPORTS IN INFORMATICA?
Answer: Yes. By using the Metadata Reporter we can generate reports in Informatica.

18]    HOW DO WE ANALYSE THE DATA AT DATABASE LEVEL?
Answer: Data can be viewed using Informatica's designer tool. If you want to view the data on source/target we can preview the data but with some limitations. We can use data profiling too.

19]    HOW TO RETRIEVE THE RECORDS FROM A REJECTED FILE. EXPLAIN WITH SYNTAX OR EXAMPLE
Answer: During the execution of a workflow, all the rejected rows are stored in bad files (under the directory where your Informatica server is installed, e.g. C:\Program Files\Informatica PowerCenter 7.1\Server). These bad files can be..

20]    WHAT IS MEANT BY COMPLEX MAPPING?
Answer: Complex mapping means a mapping involving more logic and more business rules. Actually, in my bank project I was involved in constructing a data warehouse ..

Top 20 Informatica Interview Questions with Answers-Part-3

1]    HOW TO EXPORT MAPPINGS TO THE PRODUCTION ENVIRONMENT?
Answer: In the Designer, go to the main menu, where one can see the export/import options. Import the exported mapping into the production repository with the replace option.

2]    HOW TO GENERATE THE METADATA REPORTS IN INFORMATICA?
Answer: You can generate PowerCenter Metadata Reporter from a browser on any workstation, even a workstation that does not have PowerCenter tools installed.

3]    IN MY SOURCE TABLE THERE ARE 1000 RECORDS. I WANT TO LOAD RECORDS 501 TO 1000 INTO MY TARGET TABLE. HOW CAN YOU DO THIS?
Answer: You can override the SQL query in the Workflow Manager, like: select * from tab_name where rownum <= 1000 minus select * from tab_name where rownum <= 500; This will work fine. Try it.

4]    CAN U GENERATE REPORTS IN INFORMATICA?
Answer: It is an ETL tool; you cannot make reports from it, but you can generate a metadata report, which is not meant to be used for business analysis.

5]    IF I MAKE ANY MODIFICATIONS TO MY TABLE IN THE BACK END, DO THEY REFLECT IN THE INFORMATICA WAREHOUSE OR MAPPING DESIGNER OR SOURCE ANALYZER?
Answer: NO. Informatica is not at all concerned with the back-end database. It displays all the information that is stored in the repository. If you want back-end changes reflected on the Informatica screens, you have to reimport the definitions.

6]    HOW TO RECOVER THE STANDALONE SESSION?
Answer: A standalone session is a session that is not nested in a batch. If a standalone session fails, you can run recovery using a menu command or pmcmd. These options are not available for batched sessions.

7]    CAN U COPY THE SESSION TO A DIFFERENT FOLDER OR REPOSITORY?
Answer: In addition, you can copy the workflow from the Repository Manager. This will automatically copy the mapping, associated sources, targets, and session to the target folder.

8]    WHAT IS DATA CLEANSING..?
Answer: Data cleansing is a two-step process including DETECTION and then CORRECTION of errors in a data set.

9]    WHAT ARE THE ACTIVE AND PASSIVE TRANSFORMATIONS?
Answer: An active transformation can change the number of rows that pass through it. A passive transformation does not change the number of rows that pass through it.

10]    THE DESIGNER INCLUDES A "FIND" SEARCH TOOL AS PART OF THE STANDARD TOOL BAR. WHAT CAN IT BE USED TO FIND?
Answer: This is used for finding source columns in the workspace.

11]    CAN U USE THE MAPPING PARAMETERS OR VARIABLES CREATED IN ONE MAPPING IN ANOTHER MAPPING?
Answer: NO. You might want to use a workflow parameter/variable if you want it to be visible to other mappings/sessions.

12]    WHERE SHOULD YOU PLACE THE FLAT FILE TO IMPORT THE FLAT FILE DEFINITION TO THE DESIGNER?
Answer: Place it in a local folder.

13]    IF YOUR WORKFLOW IS RUNNING SLOW IN INFORMATICA, WHERE DO YOU START TROUBLESHOOTING AND WHAT ARE THE STEPS YOU FOLLOW?
Answer: When the workflow is running slowly, you have to find the bottlenecks, in this order: target, source, mapping, session, system.

14]    WHAT DO THE EXPRESSION AND FILTER TRANSFORMATIONS DO IN THE INFORMATICA SLOWLY GROWING TARGET WIZARD?
Answer: The Expression transformation detects and flags the rows from the source. The Filter transformation filters out the rows that are not flagged and passes the flagged rows to the Update Strategy transformation.

15]    HOW CAN WE JOIN TABLES IF THEY HAVE NO PRIMARY AND FOREIGN KEY RELATION AND NO MATCHING PORT TO JOIN?
Answer: Without a common column or a common data type, we can join two sources using dummy ports: 1. Add one dummy port in each source. 2. In an Expression transformation, assign '1' to each dummy port. 3. Use a Joiner transformation to join on the dummy ports.

16]    WHAT IS POLLING?
Answer: It displays the updated information about the session in the monitor window. The monitor window displays the status of each session when you poll the informatica server.

17]    WHY DO YOU USE REPOSITORY CONNECTIVITY?
Answer: Each time you edit or schedule a session, the Informatica server directly communicates with the repository to check whether the session and the users are valid. All the metadata of sessions and mappings is stored in the repository.

18]    WHAT IS CODE PAGE USED FOR?
Answer: A code page is used to identify characters that might be in different languages. If you are importing Japanese data into a mapping, you must select the Japanese code page for the source data.

19]    WHAT ARE VARIOUS TYPES OF AGGREGATION?
Answer: Various types of aggregation are SUM, AVG, COUNT, MAX, MIN, FIRST, LAST, MEDIAN, PERCENTILE, STDDEV, and VARIANCE.

20]    WHAT IS CHANGE DATA CAPTURE?
Answer: Change data capture (CDC) is a set of software design patterns used to determine the data that has changed in a database so that action can be taken using the changed data.