Informatica Interview Questions and Answers for beginners and experts. List of frequently asked Informatica Interview Questions with answers by Besant Technologies.
We hope these Informatica interview questions and answers are useful and will help you get the best job in the industry. These Informatica interview questions and answers were prepared by Informatica professionals based on the expectations of MNC companies. Stay tuned; we will update new Informatica interview questions with answers frequently. If you want to learn practical Informatica, please go through this Informatica Training in Chennai.
Besant Technologies supports students by providing Informatica interview questions and answers for job placements and job purposes. Informatica is an important course at present because there are many job openings and high salary packages for Informatica and related roles. These are the top Informatica interview questions and answers, prepared by our institute's experienced trainers.
Best Informatica Interview Questions and Answers
Here is the list of the most frequently asked Informatica Interview Questions and Answers in technical interviews. These Informatica questions and answers are suitable for both freshers and experienced professionals at any level. The questions are aimed at intermediate to somewhat advanced Informatica professionals, but even if you are just a beginner or fresher, you should be able to understand the answers and explanations given here.
In this post, you will get the most important and top Informatica Interview Questions and Answers, which will be very helpful and useful to those who are preparing for jobs.
Q1). What is ETL and what are ETL Tools?
ETL is the process of Extracting, Transforming and Loading data into a data warehouse in the required format for decision-making.
ETL TOOLS:
- IBM Datastage
- Informatica PowerCenter
- Abinitio
- Talend Studio etc.
Q2). What is Informatica?
Informatica is an ETL tool provided by ‘Informatica Corporation’ which is used to Extract, Transform and Load data across all types of databases.
Q3). What are Informatica PowerCenter Tools?
Informatica PowerCenter has 2 types of tools:
Server Tools: Administrator Console, Integration Service
Client Tools: Repository Manager, Designer, Workflow Manager, Workflow Monitor
Q4). What is Repository Manager?
It is used to organize and secure metadata by creating folders.
Q5). What is Designer?
The designer is used to create source definitions, target definitions, and transformations to build the mappings. Also, it is used to create mapplets which can be reused in different mappings.
Q6). What is Workflow Manager?
It is used to store workflow metadata and connection object information in the repository. A workflow contains sessions and different tasks according to the requirement. Tasks can include session, email notification, assignment task, decision task, etc. Each task in a workflow needs to be connected via links, and we can also provide link conditions based on the requirements.
Q7). What is Workflow Monitor?
It is used to retrieve workflow run status and session logs. A workflow is a set of instructions that tells an Integration Service how to run the tasks. Basically, the Workflow Monitor is used to monitor workflows and tasks.
Q8). Define Mapping?
It is a set of Source and Target definitions linked by transformation objects that define the rules for transformation.
Q9). Define Session?
It is a set of instructions that describe how and when to move the data from source to target.
Q10). Define Transformation?
It is a repository object that generates, modifies or passes data.
Active and Passive Transformation?
An active transformation can change the number and position of rows that pass through it, e.g. Aggregator, Rank, Filter, etc.
A passive transformation doesn’t change the number of rows that pass through it, e.g. Expression, Sequence Generator, etc.
Connected and Unconnected Transformation?
A connected transformation is connected to a source definition, target definition or other transformation in a mapping.
An unconnected transformation is not connected to any of the Informatica objects in a mapping.
Source Qualifier Transformation
Q11). What is SQ Transformation?
An SQ is an active and connected transformation that reads the rows from a relational database or flat file source while running a session.
Q12). What tasks can be performed using SQ?
- SQ can be configured to join data originating from the same source database.
- We can specify the number of sorted ports, and the Integration Service adds an ‘ORDER BY’ clause to the default SQ query.
- We can use a source filter to specify a certain condition.
- The Select Distinct option can be used to select only unique records.
- Pre- and Post-SQL statements can be executed before and after the SQ query.
- Also, we can write a user-defined SQL query to override the default query in the SQ (see the sketch after this list).
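As an illustration of the last point, a user-defined SQL override might look like the following. This is a minimal sketch; the EMPLOYEES table and its columns are hypothetical:
SELECT EMP_ID, EMP_NAME, DEPT_ID FROM EMPLOYEES WHERE DEPT_ID = 10 ORDER BY EMP_ID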
Q13). What happens if we include the keyword WHERE in the Source Filter property of the SQ transformation, e.g. WHERE EMP_ID > 10?
Normally, we use a Source filter to reduce the number of source records. If we include the string WHERE in the source filter, the Integration Service fails the session. In the above case, the correct syntax will be EMP_ID > 10.
Expression Transformation
Q14). What is an Expression Transform?
The Expression is a passive and connected transformation which is used to calculate values in a single row. It can also be used to test conditional statements before writing the output results to target tables or other transformations.
For example, we might need to adjust employee salaries, concatenate first and last names, or convert strings to numbers and vice versa (see the sketch below).
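For instance, output ports in an Expression transformation might be defined as follows. This is a sketch; the port names FIRST_NAME, LAST_NAME and SALARY are hypothetical:
FULL_NAME: FIRST_NAME || ' ' || LAST_NAME
NEW_SALARY: SALARY * 1.10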
Q15). How do we convert a Date field coming as data type string from a flat file?
By using the date conversion functions, for example:
IIF( IS_DATE( Column1 ) = 1, TO_DATE( Column1, 'YYYY-MM-DD' ), NULL )
Sorter Transformation
Q16). What is a Sorter Transformation?
The sorter is an Active and Connected transformation which is used to sort data in ascending or descending order according to specified sort keys.
Q17). Why is Sorter an Active Transformation?
In the Sorter transformation we can select the ‘Distinct’ option in the properties. When a Sorter transformation is configured to treat output rows as distinct, it assigns all ports as part of the sort key. The Integration Service discards duplicate rows compared during the sort operation. Since the number of input rows can differ from the number of output rows, it is treated as an active transformation.
Q18). How to select distinct rows for flat file sources?
Since the Distinct option in the Source Qualifier is disabled for flat file sources, we can use a Sorter transformation and check the Distinct option to get the unique rows. When we select the Distinct option, all the columns will be selected as keys, in ascending order by default.
Aggregator Transformation
Q19). What is an Aggregator Transformation?
An aggregator is an Active and Connected transformation which is used to perform aggregate calculations like AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, and VARIANCE.
Q20). What are all expressions supported by Aggregator Transformation?
- Conditional Clauses: The conditional clause can be any clause that evaluates TRUE or FALSE.
For example, SUM(SALARY, DEPT_ID = 10)
- Non-Aggregate expressions.
IIF(DEPT_ID=10, DEPT_NM, DEPT_ID || '_' || DEPT_NM)
- Nested Aggregation Expression: One aggregate function within another aggregate function, e.g.:
- MAX (COUNT (EMP))
Q21). How to improve the performance of Aggregator Transformation?
- Use sorted input, which reduces the amount of data cached and improves session performance.
- Filter the unnecessary data before aggregating it.
- Connect only the necessary input/output ports to subsequent transformations in order to reduce the size of the data cache.
Q22). Under which conditions can selecting Sorted Input in the Aggregator fail the session?
- If the input data is not sorted correctly, the session will fail.
- Even if the input data is properly sorted, the session may fail if the sort-order ports and the group-by ports of the Aggregator are not in the same order.
Q23). How can we delete duplicate records using Aggregator?
By using the Group By checkbox on the ports having duplicate occurring data.
Filter Transformation
Q24). What is a Filter Transformation?
A Filter transformation is an active and connected transformation which tests each row that passes through it against a specified condition. If the condition is satisfied, the row is passed to the next transformation or target; otherwise it is dropped.
Q25). What is the difference between the Source Qualifier transformation’s Source Filter option and the Filter transformation?
| SQ Source Filter | Filter Transformation |
| --- | --- |
| Source Qualifier transformation filters rows while reading from the source. | Filter transformation filters rows from within a mapping. |
| Limits the records extracted from a source. | Limits the records sent to a target. |
| It can only filter rows from relational sources. | Filters rows coming from any type of source system. |
| The filter condition in the Source Qualifier transformation only uses standard SQL. | Filter Transformation can define a condition using any statement or transformation function that returns either TRUE or FALSE. |
Router Transformation
Q26). What is Router Transformation?
A router is an active and connected transformation which is used to test the source data based on multiple conditions.
Q27). What are the different groups available in Router Transformation?
- Input Group
- Output Group/User Defined Group
- Default Group
Q28). What is the difference between Router and Filter?
| Router | Filter |
| --- | --- |
| Router transformation divides the incoming records into multiple groups based on conditions. | Filter transformation restricts or blocks the incoming record set based on a single condition. |
| Records which do not match any condition pass to the default group. | Records which do not match the condition are dropped. |
| Router acts like a CASE WHEN statement in SQL. | Filter acts like a WHERE condition in SQL. |
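For illustration, a Router transformation testing a hypothetical DEPT_ID port might define two user-defined groups; rows matching neither condition are routed to the default group:
Group DEPT_10: DEPT_ID = 10
Group DEPT_20: DEPT_ID = 20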
Joiner Transformation
Q29). What is a Joiner Transformation?
A Joiner is an Active and Connected transformation used to join two data sources coming from the same or heterogeneous databases or files.
Q30). What are the different types of Joins available in Joiner Transformation?
- Normal: Discards all rows of data from the master and detail source that do not match the join condition.
- Master Outer: Keeps all rows of data from the detail source and the matching rows from the master source.
- Detail Outer: Keeps all rows of data from the master source and the matching rows from the detail source.
- Full Outer: Keeps all rows of data from both the master and detail sources.
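In SQL terms, these join types roughly correspond to the following. This is a sketch assuming hypothetical MASTER and DETAIL tables joined on a KEY column:
Normal: SELECT * FROM DETAIL D INNER JOIN MASTER M ON D.KEY = M.KEY
Master Outer: SELECT * FROM DETAIL D LEFT OUTER JOIN MASTER M ON D.KEY = M.KEY
Detail Outer: SELECT * FROM DETAIL D RIGHT OUTER JOIN MASTER M ON D.KEY = M.KEY
Full Outer: SELECT * FROM DETAIL D FULL OUTER JOIN MASTER M ON D.KEY = M.KEY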
Q31). Which input pipeline will be set as the master pipeline?
During a session run, the Integration Service compares each row of the master source against the detail source.
When the Integration Service processes a Joiner transformation, it caches the rows from the master source and blocks the detail source. Once it finishes reading and caching all master rows, it unblocks the detail source and reads the detail rows. Therefore, if we select the source containing fewer input rows as the master, the cache size will be smaller and performance will improve.
Rank Transform
Q32). What is Rank Transform?
Rank is an Active and Connected transformation used to select a set of top or bottom values of data.
Q33). What is a RANK port and RANKINDEX?
The Rank port is an input/output port used to specify the column for which we want to rank the source values. RANKINDEX is the default port created by Informatica for each Rank transformation; it stores the ranking position for each row in a group.
Q34). How can you get ranks based on different groups?
We can configure one of its input/output ports as a group by port. For each unique value in the group port, the transformation creates a group of rows falling within the rank definition (top or bottom, and a particular number in each rank).
Sequence Generator Transformation
Q35). What is a Sequence Generator Transformation?
A Sequence Generator is a Passive and Connected transformation that generates numeric values. It is used to create unique primary key values or surrogate keys which uniquely identify each record.
Q36). Define the Properties available in Sequence Generator transformation
- Start Value: Start value of the generated sequence that we want the Integration Service to use if we use the Cycle option. If we select Cycle, the Integration Service cycles back to this value when it reaches the end value. The default is 0.
- Increment By: Difference between two consecutive values from the NEXTVAL port. The default is 1.
- End Value: Maximum value generated by the Sequence Generator. After reaching this value the session will fail if the Sequence Generator is not configured to cycle.
- Current Value: Current value of the sequence. Enter the value we want the Integration Service to use as the first value in the sequence. The default is 1.
- Cycle: If selected, when the Integration Service reaches the configured end value for the sequence, it wraps around and starts the cycle again, beginning with the configured Start Value.
- Number of Cached Values: Number of sequential values the Integration Service caches at a time. The default value for a standard Sequence Generator is 0. The default value for a reusable Sequence Generator is 1,000.
- Reset: Restarts the sequence at the current value each time a session runs. This option is disabled for reusable Sequence Generator transformations.
Q37). Suppose we have a source table populating two target tables. We connect the NEXTVAL port of the Sequence Generator to the surrogate keys of both the target tables. Will the Surrogate keys in both the target tables be same? If not how can we flow the same sequence values in both of them?
- When we connect the NEXTVAL output port of the Sequence Generator directly to the surrogate key columns of the target tables, a block of sequence numbers is sent to one target table’s surrogate key column. The second target receives a block of sequence numbers from the Sequence Generator transformation only after the first target table receives its block of sequence numbers.
- Suppose we have 5 rows coming from the source, so TGT1 will have the sequence values as (1,2,3,4,5) and TGT2 (6,7,8,9,10).
- Now, suppose we need to have the same surrogate keys in both the targets.
- Then we need to put an Expression transformation in between the Sequence Generator and the Target tables. The Sequence Generator will pass unique values to the expression transformation, and then the rows are routed from the expression transformation to the targets.
Q38). Suppose we have 100 records coming from the source and the Current Value is 0 and End Value of Sequence generator is set to 80. What will happen?
End Value is the maximum value the Sequence Generator will generate. After it reaches the End value the session fails with the following error message:
- TT_11009 Sequence Generator Transformation: Overflow error.
Failure of the session can be avoided if the Sequence Generator is configured to Cycle.
Stored Procedure Transformation
Q39). What is a Stored Procedure Transformation?
Stored Procedure is a Passive transformation used to execute stored procedures prebuilt on the database through Informatica. It can also be used to call functions to return calculated values. It can be Connected or Unconnected.
Q40). What are the uses of Stored Procedure Transformation?
- Check the status of a target database before loading data into it.
- Determine if enough space exists in a database.
- Perform a specialized calculation.
- Drop and recreate indexes.
Q41). What is PROC_RESULT in stored procedure transformation?
PROC_RESULT is a system variable, where the output of an unconnected stored procedure transformation is assigned by default.
Q42). What is the status code?
The status code provides error handling for the Integration Service. The stored procedure issues a status code that notifies whether or not the stored procedure completed successfully. This value cannot be seen by the user; it is only used by Informatica to determine whether to continue running the session or stop.
Lookup Transformation
Q43). What is lookup transformation?
A Lookup can be a connected or an unconnected transformation, and it is used to perform the below tasks:
- Get related value
- Perform a calculation
- Update slowly changing dimension tables.
Q44). What are the differences between Connected and Unconnected Lookup?
| Connected Lookup | Unconnected Lookup |
| --- | --- |
| Connected lookup participates in the data flow and receives input directly from the pipeline. | Unconnected lookup receives input values from the result of a :LKP expression in another transformation. |
| Connected lookup can use both dynamic and static cache. | Unconnected lookup cache can NOT be dynamic. |
| Connected lookup can return more than one column value (output port). | Unconnected lookup can return only one column value, i.e. the return port. |
| Connected lookup caches all lookup columns. | Unconnected lookup caches only the lookup output ports in the lookup conditions and the return port. |
| Supports user-defined default values (i.e. value to return when lookup conditions are not satisfied). | Does not support user-defined default values. |
Q45). What are the different Lookup Caches?
- Persistent cache: We can save the lookup cache files and reuse them the next time the Informatica server processes a lookup transformation configured to use the cache.
- Recache from the database: If the persistent cache is not synchronized with the lookup table, we can configure the Lookup transformation to rebuild the lookup cache.
- Static cache: We can configure a static or read-only cache for the lookup table. By default, the Informatica server creates a static cache. It caches the lookup table and lookup values in the cache for each row that comes into the transformation. When the lookup condition is true, the Informatica server does not update the cache while it processes the lookup transformation.
- Dynamic cache: If we want to cache the target table and insert new rows into the cache and the target, we can create a lookup transformation to use a dynamic cache. The Informatica server dynamically inserts data into the target table.
- Shared cache: We can share the lookup cache between multiple transformations. We can share an unnamed cache between transformations in the same mapping.
Q46). How can we ensure data is not duplicated in the target when the source has duplicate records, using lookup transformation?
By using a Dynamic Lookup Cache on the target table, associating the input ports with the lookup ports, and checking the Insert Else Update option, we can eliminate the duplicate records coming from the source and load only unique records into the target.
Q47). What is Normalizer Transformation?
The normalizer transformation normalizes the records from COBOL and relational sources, allowing us to organize the data according to our own needs. Apart from that, it is used to convert multiple occurring columns in a row into multiple rows.
Union transformation
Q48). What is Union transformation?
A union is an Active and Connected transformation used to merge data from multiple pipelines or sources into one pipeline branch. Similar to the UNION ALL SQL statement, the Union transformation does not remove duplicate rows.
Q49). What are the restrictions of Union Transformation?
- All input groups and the output group must have matching ports. The precision, data type, and scale must be identical across all groups.
- We can create multiple input groups, but only one default output group.
- The Union transformation does not remove duplicate rows.
- We cannot use a Sequence Generator or Update Strategy transformation upstream from a Union transformation.
Q50). How come union transformation is active?
The union is an active transformation because it combines two or more data streams into one. Though the total number of rows passing into the Union is the same as the total number of rows passing out of it, and the sequence of rows from any given input stream is preserved in the output, the positions of the rows are not preserved, i.e. row number 1 from input stream 1 might not be row number 1 in the output stream. Union does not even guarantee that the output is repeatable.
Update Strategy transform
Q51). What is Update Strategy transform?
The Update Strategy transformation is used to flag rows for insert, update, delete or reject, so that we can maintain history data or just the most recent changes in the target table.
Q52). How can we update a record in target table without using Update strategy?
A target table can also be updated without using “Update Strategy”. For this, we need to define the key in the target table in Informatica level and then we need to connect the key and the field we want to update in the mapping Target. In the session level, we should set the target property as “Update as Update” and enable the “Update” check-box.
Let’s assume we have a target table “Customer” with fields as “Customer ID”, “Customer Name” and “Customer Address”. Suppose we want to update “Customer Address” without an Update Strategy. Then we have to define “Customer ID” as primary key in Informatica level and we will have to connect Customer ID and Customer Address fields in the mapping. If the session properties are set correctly as described above, then the mapping will only update the customer address field for all matching customer IDs.
Q53). What is Data Driven?
The integration service follows the instructions coded in mapping to flag the rows for insert, update, delete or reject.
Treat source rows as Data Driven: This is the default session property option selected when an Update Strategy transformation is used in a mapping.
Q54). What happens when DD_UPDATE is defined in update strategy and Treat source rows as INSERT is selected in Session?
If anything other than Data Driven is selected in the session, the Update Strategy in the mapping is ignored.
Q55). By default, any row in Informatica that is not altered is treated as INSERT. So when do we need DD_INSERT?
Suppose we want to perform an insert or update target table in a single pipeline. Then we can write the below expression in update strategy transformation to insert or update based on the incoming row.
IIF (LKP_EMPLOYEE_ID IS NULL, DD_INSERT, DD_UPDATE)
Slowly Changing Dimension
Q56). What is Slowly Changing Dimension?
A dimension that changes over time is known as a slowly changing dimension.
Type 1: Slowly Changing Dimension
- In Type 1 Slowly Changing Dimension, the new information simply overwrites the original information. In other words, no history is kept.
- In our example, recall we originally have the following table:
| Customer Key | Name | State |
| --- | --- | --- |
| 1001 | Charu | Hyderabad |
- After Charu moved from Hyderabad to Bengaluru, the new information replaces the old record, and we have the following table:
| Customer Key | Name | State |
| --- | --- | --- |
| 1001 | Charu | Bengaluru |
Advantages:
- This is the easiest way to handle the Slowly Changing Dimension problem since there is no need to keep track of the old information.
Disadvantages:
- All history is lost. By applying this methodology, it is not possible to trace back in history. For example, in this case, the company would not be able to know that Charu lived in Hyderabad before.
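In SQL terms, a Type 1 change is a simple in-place update. A minimal sketch, assuming a hypothetical CUSTOMER_DIM table matching the example above:
UPDATE CUSTOMER_DIM SET STATE = 'Bengaluru' WHERE CUSTOMER_KEY = 1001;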
Type 2: Slowly Changing Dimension
- In Type 2 Slowly Changing Dimension, a new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key.
- In our example, recall we originally have the following table:
| Customer Key | Name | State |
| --- | --- | --- |
| 1001 | Charu | Hyderabad |
- After Charu moved from Hyderabad to Bengaluru, we add the new information as a new row into the table:
| Customer Key | Name | State |
| --- | --- | --- |
| 1001 | Charu | Hyderabad |
| 1005 | Charu | Bengaluru |
Advantages:
- This allows us to accurately keep all historical information.
Disadvantages:
- This will cause the size of the table to grow fast. In cases where the number of rows for the table is very high to start with, storage and performance can become a concern.
- This necessarily complicates the ETL process.
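In SQL terms, a Type 2 change adds a new row with a new surrogate key. A sketch using the same hypothetical CUSTOMER_DIM table; in practice the new key would come from a sequence or a Sequence Generator:
INSERT INTO CUSTOMER_DIM (CUSTOMER_KEY, NAME, STATE) VALUES (1005, 'Charu', 'Bengaluru');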
Type 3: Slowly Changing Dimension
- In Type 3 Slowly Changing Dimension, there will be two columns to indicate the particular attribute of interest, one indicating the original value, and one indicating the current value. There will also be a column that indicates when the current value becomes active.
- In our example, recall we originally have the following table:
| Customer Key | Name | State |
| --- | --- | --- |
| 1001 | Charu | Hyderabad |
To accommodate Type 3 Slowly Changing Dimension, we will now have the following columns:
- Customer Key
- Name
- Original State
- Current State
- Effective Date
After Charu moved from Hyderabad to Bengaluru, the original information gets updated, and we have the following table (assuming the effective date of change is January 15, 2018):
| Customer Key | Name | Original State | Current State | Effective Date |
| --- | --- | --- | --- | --- |
| 1001 | Charu | Hyderabad | Bengaluru | 15-JAN-2018 |
Advantages:
- This does not increase the size of the table, since the new information updates existing columns in place.
- This allows us to keep some part of history.
Disadvantages:
Type 3 will not be able to keep all history where an attribute is changed more than once. For example, if Charu later moves to Chennai on April 30, 2018, the Bengaluru information will be lost.
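In SQL terms, a Type 3 change updates the original-value, current-value and effective-date columns in place. A sketch using the same hypothetical CUSTOMER_DIM table, extended with the Type 3 columns:
UPDATE CUSTOMER_DIM SET ORIGINAL_STATE = 'Hyderabad', CURRENT_STATE = 'Bengaluru', EFFECTIVE_DATE = '15-JAN-2018' WHERE CUSTOMER_KEY = 1001;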
Q57). What are the contents of session log?
- Allocation of system shared memory
- Execution of Pre-session command/Post-session command
- Session Initialization
- Creation of SQL commands for reader/writer threads
- Cache creation information
- Start/End timings for target loading
- Errors encountered during the session
- Load summary of Reader/Writer/DTM Statistics
Q58). What are the two types of processes with which Informatica runs a session?
- Load Manager Process
- DTM Process
Q59). What is Load Manager?
- Manages session and batch scheduling.
- Locks the session and reads session properties.
- Reads the parameter file.
- Expands the server and session variables and parameters.
- Verifies permissions and privileges.
- Validates source and target code pages.
- Creates the session log file.
- Creates the Data Transformation Manager (DTM), which executes the session.
Q60). What is DTM Process?
After the load manager performs validations for the session, it creates the DTM process. The DTM process is the second process associated with the session run. The primary purpose of the DTM process is to create and manage threads that carry out the session tasks.
- MASTER THREAD – the main thread of the DTM process; it creates and manages all other threads.
- MAPPING THREAD – one thread for each session; fetches session and mapping information.
- PRE- AND POST-SESSION THREAD – one thread each to perform pre- and post-session operations.
- READER THREAD – one thread for each partition for each source pipeline.
- WRITER THREAD – one thread for each partition, if a target exists in the source pipeline, to write to the target.
- TRANSFORMATION THREAD – one or more transformation threads for each partition.
Q61). How to setup Event Raise and Event Wait tasks?
- Create an Event: right-click the workflow and Edit->events tab->create an event
- Place event raise task in the workflow space and give the created event name in the “user defined event” (Properties tab)
- Place event wait task in the workflow space->there are two options in events tab,
Pre-defined: This is a file watch.
User-defined: This is used for an event created in the workflow properties. When the Event-Raise task executes, it raises the event, which triggers the Event-Wait task to continue.
Q62). Explain Timer task?
We can specify the period of time to wait before the integration service runs the next task in the workflow with the timer task.
Two options:
- Absolute Time: Give the exact time when to start the next task or refer a date-time variable.
- Relative Time: Give the hours, minutes and seconds to wait, measured from the start time of the task, from the start time of the parent workflow/worklet, or from the start time of the top-level workflow.
Q63). What is decision task?
Decision task is used instead of multiple link conditions in a workflow. Instead of specifying multiple link conditions, use the predefined condition variable in a decision task to simplify link conditions.
Q64). What is assignment task?
We can assign a value to a user-defined workflow variable with the Assignment task.
Q65). What is control task?
We can use the control task to stop, abort or fail the top-level workflow or the parent workflow based on input link condition.
Suppose we have ten source flat files of the same structure. How can we load all the files into the target database in a single batch run using a single mapping?
To load a set of source files, we need to create a file, say final.txt, containing the source flat file names (ten files in our case) and set the Source filetype option to Indirect, as sketched below.
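For illustration, the list file (hypothetically named final.txt) simply contains one source file path per line, for example:
/data/src/sales_01.txt
/data/src/sales_02.txt
/data/src/sales_03.txt
and so on for the remaining seven files.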
Q66). What is the difference between mapplet and reusable transformation?
- Mapplet consists of a set of transformations that are reusable. A reusable transformation is a single transformation that can be reusable.
- Variables or parameters created in a mapplet cannot be used in another mapping or mapplet, whereas variables created in a reusable transformation can be used in any other mapping or mapplet.
- We can not include source definitions in reusable transformations. But we can add sources to a mapplet.
- Whole transformation logic will be hidden in case of mapplet. But it is transparent in case of reusable transformation.
- We can’t use COBOL source qualifier, Joiner or Normalizer transformations in a mapplet, whereas we can make them reusable transformations.
Q67). What are the mapping parameters and mapping variables?
In Informatica, Mapping parameter represents a constant value that we can define in parameter file before running a session. Informatica mapping parameter retains the same value throughout the entire session.
Unlike a mapping parameter, Informatica mapping variable represents a value that can change throughout the session. The integration service saves the value of mapping variable to the repository at the end of session run and uses that value next time we run the session.
Q68). What is the target load order?
We can specify the target load order based on source qualifiers in a mapping. If we have multiple source qualifiers connected to the multiple targets, we can designate the order in which Informatica server loads data into the targets.
Q69). What is constraint-based load ordering?
If we have one Source Qualifier transformation that provides data for multiple target tables having primary and foreign key relationships, we go for constraint-based load ordering.
In the Workflow Manager, we can Configure Constraint-based load ordering for a session. The Integration Service orders the target load on a row-by-row basis. For every row generated by an active source, the Integration Service loads the corresponding transformed row first to the primary key table, then to the foreign key table.
Q70). Difference between Stop and Abort.
When we issue the STOP command on the executing session task, the Integration Service stops reading data from source. It continues processing, writing and committing the data to targets. If the Integration Service cannot finish processing and committing data, we can issue the abort command.
In contrast, the ABORT command has a timeout period of 60 seconds. If the Integration Service cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.
Q71). What is the type of partitioning in Informatica?
(1) Round Robin Partitioning: The Integration Service distributes data evenly among all partitions. It is used where we want each partition to process approximately the same number of rows.
(2) Hash Partitioning: The Integration Service applies a hash function to a partition key to group data among the partitions. Use hash auto-keys partitioning at or before Rank, Sorter, Joiner and unsorted Aggregator transformations to ensure that rows are grouped properly before they enter these transformations. In hash user-keys partitioning, we define the ports that make up the partition key.
(3) PassThrough Partition: In passthrough partitioning, the Integration Service processes data without redistributing rows among partitions. Use passthrough partitioning when we want to increase data throughput, but we do not want to increase the number of partitions.
(4) Key Range Partitioning: We specify one or more ports to form a compound partition key. The Integration Service passes data to each partition depending on the ranges we specify for each port. Use key range partitioning where the sources or targets in the pipeline are partitioned by key range. For example: customers 1-100 in one partition, 101-200 in another, and so on. We define the range for each partition.
Q72). What is Pushdown Optimization?
We can push transformation logic to the source or target database using pushdown optimization. When we run a session configured for pushdown optimization, the Integration Service translates the transformation logic into SQL queries and sends the SQL queries to the database. The source or target database executes the SQL queries to process the transformations.
FAQs
How can I clear my Informatica interview?
Tips
- Understand the question completely and thoroughly.
- Try to link your computer science concepts to real-world problems.
- Prepare Unix interview questions well, along with Software Testing questions (especially programming test cases) and Software Engineering topics.
How many sessions can one group in a batch? One can group any number of sessions, but it would be easier for migration if the number of sessions in a batch is smaller.
How many input parameters can exist in an unconnected lookup? Any number of input parameters can exist. For instance, you can provide input parameters like column 1, column 2, column 3, and so on. But the return value would only be one.
Can we update the target without a primary key in Informatica? Your only way to update without a primary key is to write a target update override in the target. By doing this, you don’t need a primary key for updates.
What is the difference between stop and abort in Informatica? When you say "Stop", the Informatica Server waits for the running server processes to finish before stopping the workflow. When you say "Abort", the Informatica Server stops the workflow immediately, irrespective of whether the server processes have finished or not.
What is a mapplet in Informatica? A mapplet is a reusable object containing a set of transformations that you can use in multiple mappings. Use a mapplet in a mapping, or validate the mapplet as a rule. Transformations in a mapplet can be reusable or non-reusable. If you add a Sequence Generator transformation to a mapplet, it must be reusable.
How do you load even and odd records in Informatica? The even and odd records are split into two targets. This scenario uses a Router transformation: create the two target tables, then split the records by their position in the source file as even and odd using an Expression and a Router transformation, as sketched below.
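A minimal sketch, with hypothetical port names: assign a running row number from a Sequence Generator’s NEXTVAL port (here called ROW_NUM), then define two Router groups using the MOD function:
EVEN group: MOD(ROW_NUM, 2) = 0
ODD group: MOD(ROW_NUM, 2) = 1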
How do I limit the number of rows in Informatica? To limit the number of rows in Informatica PowerCenter mappings, add an SQL query in the Source Qualifier. Click the Properties tab, select the Sql Query transformation attribute, and add a new value/query that will limit the number of records retrieved (see the sketch below).
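A minimal sketch of such a query, assuming an Oracle source and a hypothetical EMPLOYEES table:
SELECT EMP_ID, EMP_NAME FROM EMPLOYEES WHERE ROWNUM <= 100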
How many types of transformations are available in Informatica? There are 3 Informatica transformations, viz. External Procedure, Lookup, and Stored Procedure, which can be unconnected in a valid mapping (a mapping which the Integration Service can execute).
What is SQL override in Informatica?
We have an option to customize the default query provided by the transformations and enter a custom SQL query by using SQL override. The most common transformations which support SQL overrides are the Source Qualifier and Lookup transformations (see the sketch below).
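For example, a Lookup SQL override might restrict the lookup to active rows only. This is a sketch; the CUSTOMERS table and ACTIVE_FLAG column are hypothetical:
SELECT CUSTOMER_ID, CUSTOMER_NAME FROM CUSTOMERS WHERE ACTIVE_FLAG = 'Y'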
What is cache in Informatica? With a shared cache, the Informatica server creates the cache memory once for multiple lookup transformations in the mapping; once the lookup is done for the first lookup, the memory is released and reused by the other lookup transformations. We can share the lookup cache between multiple transformations.
How do I return multiple columns in unconnected lookup? Call the unconnected lookup from an Expression transformation and use various output ports to retrieve the lookup values based on the concatenated return value. Use SUBSTR and INSTR to extract the column values from the concatenated return field, as sketched below.
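A sketch of this technique with hypothetical names: in the lookup (here LKP_DEPT), build a concatenated return port, then split it in the Expression transformation:
Lookup return port: DEPT_NAME || '~' || DEPT_LOCATION
V_RESULT: :LKP.LKP_DEPT(DEPT_ID)
DEPT_NAME_OUT: SUBSTR(V_RESULT, 1, INSTR(V_RESULT, '~') - 1)
DEPT_LOC_OUT: SUBSTR(V_RESULT, INSTR(V_RESULT, '~') + 1)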
Can we generate surrogate keys in Informatica? Surrogate keys are used to store history, and they are generated using a Sequence Generator. You can also version the records with a sequence number using Lookup and Expression transformations.
Why is Sorter an active transformation? It is called an active transformation because when we want distinct rows, duplicate rows are eliminated once the sort happens. This is done by setting the Distinct Output Rows property to true. Hence the Sorter transformation is called an active transformation.
What is update override in Informatica? Target update override is similar to the source qualifier override. It is useful to update the target without using an Update Strategy transformation. You can find the Target Update Override option in the target definition Properties tab. The general form of the update statement to be specified in Target Update Override is shown below.
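The general form uses the :TU qualifier to reference ports from the target definition; the table and port names here are placeholders:
UPDATE target_table SET column1 = :TU.port1 WHERE column2 = :TU.port2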
What is update strategy in Informatica? Update Strategy is an active and connected transformation used for inserting, updating, deleting and rejecting records in the target table. The target table should contain a primary key.
What is the difference between bulk and normal load in Informatica? The main difference is that in normal load the database writes logs for the loaded rows, while in bulk load no log is created. That is why bulk load loads the data fast, but if anything goes wrong the data cannot be recovered.
How do I run parallel sessions in Informatica?
- First, open the Workflow Manager and click Edit Workflow/Worklet in the right-click menu for the workflow.
- Then, inside the General tab, enable the Configure Concurrent Execution property.
What is workflow recovery in Informatica? Workflow recovery allows you to continue processing the workflow and workflow tasks from the point of interruption. You can recover a workflow if the Integration Service can access the workflow state of operation. The workflow state of operation includes the status of tasks in the workflow and workflow variable values.
How many ways can we remove duplicates in Informatica? Duplicates are silently removed, and therefore there is no auditing. Use an Aggregator transformation and group by the keys on which you want to remove duplicates. To improve speed, sorted ports can be used for presorted data. Make sure the first port listed in the ORDER BY clause in the Source Qualifier is the same as the Group By port in the Aggregator.
How do I remove duplicates in a flat file in Informatica? The Distinct property is used to eliminate duplicate rows in Informatica from a database source. For a flat file, we can eliminate duplicate rows by using the group by function in an Aggregator, or by using a Sorter transformation with the Distinct option.
How do I separate duplicates and non-duplicates in Informatica?
- If the source is a DBMS, you can use the property in the Source Qualifier to select distinct records.
- You can use an Aggregator and select all the ports as keys to get the distinct values.
- You can use a Sorter and use the Sort Distinct property to get the distinct values.
What is the difference between $ and $$ in Informatica? Actually, $ means an internal parameter/variable (such as the $DBConnection prefix or $PMSessionLogDir), whereas $$ is used for user-defined parameters or variables (which can be defined at the mapping or workflow/worklet level).
What are the types of logins in Informatica?
- Native user authentication.
- LDAP user authentication.
- Kerberos network authentication.
- Security Assertion Markup Language (SAML)-based single sign-on.
Can a mapplet contain sources and targets? Yes, a mapplet can contain sources, targets and transformations.
How can I prepare for Informatica?
- A detailed introduction about your profile, followed by project details.
- General questions on recent projects and domain.
- Data warehouse concepts.
- ETL tool Specific Questions.
- Questions linking both tool and project.
- Basic PLSQL Questions.
- Basic unix commands.
How will you improve performance of the lookup transformation?
- Use the optimal database driver.
- Cache lookup tables.
- Optimize the lookup condition.
- Filter lookup rows.
- Index the lookup table.
- Optimize multiple lookups.
What is the Informatica ETL tool? Informatica is a data integration tool based on ETL architecture. It provides data integration software and services for various businesses, industries and government organizations, including telecommunications, health care, financial and insurance services.
What is target load order in Informatica? A target load order constraint restricts how the Data Integration Service loads and commits rows to two target instances related to each other in the same mapping. In the Developer tool, you can configure constraints to restrict the order in which the Data Integration Service loads rows to target tables.
How do I select unique records in Informatica?
- In the Mapping Developer, edit an Application Source Qualifier transformation, and select the Properties tab.
- Check Select Distinct.
- Click OK to close the dialog box and save your changes. The Designer adds SELECT DISTINCT to the default query.
Where should the Filter transformation be placed in a mapping? The Filter transformation filters data out of the data flow based on a specified filter condition. To improve job performance, place the Filter transformation close to mapping sources to remove unnecessary data from the data flow. A filter condition is an expression that returns TRUE or FALSE.
Why is unconnected lookup faster? An unconnected lookup is always better; it takes less time compared to a connected lookup. Because the unconnected transformation is isolated within the mapping, we can call this lookup with the help of an Expression transformation, whereas a connected lookup is connected in a single pipeline from source to target.
Which is better, Joiner or Lookup, in Informatica? Generally, a sorted Joiner is more effective on flat files than a Lookup, because a sorted Joiner uses a merge join and caches fewer rows, while a Lookup always caches the whole file. If the file is not sorted, they can be comparable.
What are cached and uncached lookups in Informatica? If the lookup is cached (by default a static cache), the Informatica server builds the cache when it processes the first lookup request. But if it is an uncached lookup, it queries the lookup source (directly against the database) instead of building and querying the cache.