All about BI & Data Integration: 2014

23 Aralık 2014 Salı

Oracle - Big Table Joins with Nested Loop

Hi

In this entry i am gonna show faster execution sample for big data queries.

Environment: Oracle 11g

First we have the following SQL sample ( try to join 50M vs 88M data which takes 7 minutes to return with parallel hint )

select /*+parallel( m1 , 12) parallel( s1 12) */ *
from
test_table1 m1 , --50 million rows
test_table2 s1 --88 million rows
where
m1.col1=s1.col2 --Return Time : 7minutes

By Default optimizer choose HASH_JOIN which causes long execution time

Plan
SELECT STATEMENT ALL_ROWS Cost: 11 M Bytes: 268092 G Cardinality: 90 G
	3 HASH JOIN Cost: 11 M Bytes: 268092 G Cardinality: 90 G
		1 TABLE ACCESS FULL TABLE TEST_TABLE2 Cost: 474 K Bytes: 109 G Cardinality: 97 M
		2 TABLE ACCESS FULL TABLE TEST_TABLE1 Cost: 446 K Bytes: 114 G Cardinality: 61 M

To perform better first we will create an index over join column (COL1) in our smaller table.( TABLE1)

Then over same statement , we add USE_NL hint and we got the result as follows:

select /*+ USE_NL( m1, s1)*/ *
from
test_table1 m1 ,
test_table2 s1
where
m1.col1=s1.col2 -- Return Time: 421 MSEC (Much faster !! )

Plan
SELECT STATEMENT ALL_ROWS Cost: 72 G Bytes: 268092 G Cardinality: 90 G
	4 TABLE ACCESS BY INDEX ROWID TABLE TEST_TABLE1 Cost: 1 K Bytes: 2 M Cardinality: 923
		3 NESTED LOOPS Cost: 72 G Bytes: 268092 G Cardinality: 90 G
			1 TABLE ACCESS FULL TABLE TEST_TABLE2 Cost: 474 K Bytes: 109 G Cardinality: 97 M
			2 INDEX RANGE SCAN INDEX X_INDEX Cost: 2 Cardinality: 1 K

3 Kasım 2014 Pazartesi

Informatica Numeric - AlphaNumeric check

Hi

In this entry i will explain how to deal with AlphaNumeric characters.

Assume that we have a target column called TRG1 which is a number field.

From source table lets say we have to a map column COL1 which contain Alphanumeric characters.

By only doing TO_DECIMAL operation only cause informatica set Alphanumeric data to 0 such as :

Lets say we have following source data

T1245
X58924
23783
1245

By just using TO_DECIMAL transformation we get following in our target table:

0
0
23783
1245

So in order to fix this issue following code can helps.

IIF
(
REG_MATCH(COL1 , '\d+')=1 ,
TO_DECIMAL(COL1 ),
TO_DECIMAL( SUBSTR(COL1 ,2) )
)

Result:
1245
58924
23783
1245

24 Eylül 2014 Çarşamba

Informatica how to run post or pre SQL statements in workflow sessions

Hi

Following script is a sample struncture for how to run scripts in Informatica Workflow session's
pre-post sql statement.

Instead of standart ; change with \;

BEGIN

PROCEDURE1\;

PROCEDURE2\;

PROCEDURE3\;

END\;

16 Temmuz 2014 Çarşamba

ORACLE NULL VALUE SUM CASE

Hi

In this entry , I will explain summation problem about NULL values.

Case:

Environment: Oracle 10g

SELECT NUM1+NUM2+NUM3 AS TOTAL
FROM
(
SELECT 1 AS NUM1 , 5 AS NUM2 , NULL AS NUM3
FROM
DUAL
)

Here query that i wrote above result is NULL ( NULL value of NUM3 causing result error )

True form of this query as follows:

SELECT NUM1+NUM2+NUM3 AS TOTAL

FROM

(

SELECT 1 AS NUM1 , 5 AS NUM2 , NVL(NULL, 0 ) AS NUM3

FROM

DUAL

)

In here result is 6...

Basically when doing arithmetic operation , it is quite important to put NVL check over columns.

( If we are not sure about data value within the columns )

21 Haziran 2014 Cumartesi

ORACLE NOT EQUAL ( != ) CASE

Hi

In this entry i will explain the effect of null records in not equal statements.Basically null records omitted by not equal statement and Oracle shows us missing resultset.

Environment: Oracle11g

Step1: Here below our department table which contains department_names and some rows have null department names.

Step2: Here following resultset shows us only 24 records which it should be 26
( null records are filtered by the statement plus with the row which department name 'Marketing')

Step3: To fix this sql statement can be typed as below. ( which we get 26 records )

16 Haziran 2014 Pazartesi

ORACLE - TIMESTAMP FORMAT

Hi

Assume that we have following data which comes from a text file.

2013-02-17-15.26.18.014055

To load this data into Oracle DB we need specific timestamp format. ( via SQLLOADER)

Which is

TIMESTAMP "YYYY-MM-DD-HH24.MI.SS.FF"

Don try to give exact microsecond digits such as

TIMESTAMP "YYYY-MM-DD-HH24.MI.SS.FFFFFF"

That causes following error :

ORA-01810: format code appears twice

15 Haziran 2014 Pazar

ODI 12c Knowledge Module for Hierarchical Queries

Hi

In this entry , I will explain how to implement hierarchical queries over ODI via Knowledge Module.

Case Study:
What we have a table which contains employee information and for each employee there is a manager defined.Basic Rule a manager can be an employee which means while A is the manager of B , C is also can be the manager of A etc. Basically in oracle we can use CONNECT BY to show information about hierarchical structures and in this example we will create a KM that can implement a standart CONNECT BY query.

Query as follows:

Our Source table: ( EMPLOYEE)

Target Table: (EMPLOYEE_HR)

Step1:
First of all we gonna create standart mapping as follows:

Filter part contains following sql:
level <= 3 AND department_id = 80

Step2:
For Execution we choose our new KM called IKM_SQL_Control_Hierarchical

Everthing is almost as same as standart IKM SQL Control Append.We just add specific options for CONNECT BY and START WITH scripts.

KM Options as follows:

OPTIONS:
IS_NOCYCLE: IF NULL KM use 'CONNECT BY' IF 'YES' KM USE 'CONNECT BY NOCYCLE '

START_WITH_OPTION: IF NULL KM does not use 'START WITH' script Otherwise KM use
the value which is typed into the textbox.

CONNECT_OPTION:
KM use the value which is typed into the textbox.

When Execution completed output look like this:

For more detail you can download the Knowledge Module link below:

http://sharesend.com/ynq43t7q

8 Mayıs 2014 Perşembe

ODI 12c Delete / Insert IKM Knowledge Module

Hi

In this post , i will share a IKM Knowledge Module that can make Delete/Insert operation which sometimes a better alternative for Incremental Update.

Case Study:
What i want to do is Inserting resultset of the following query into my target table FINAL_SET2 and while doing that i want to delete the existing rows.

Sample Query:
select CUST_ID,
CUST_NAME,
PRODUCT,
TOTAL_NUM_OF_ORDERS,
PER_QUANTITY

FROM (
select FINAL_SET.CUST_ID CUST_ID,
FINAL_SET.CUST_NAME CUST_NAME,
FINAL_SET.PRODUCT PRODUCT,
FINAL_SET.TOTAL_NUM_OF_ORDERS TOTAL_NUM_OF_ORDERS,
FINAL_SET.PER_QUANTITY PER_QUANTITY
from ODI_DEMO.FINAL_SET FINAL_SET , ODI_DEMO.FINAL_SET2 TRG
where (1=1) and (FINAL_SET.CUST_ID = TRG.CUST_ID)
)

Step1:
At First i am changing my target tables alias name to TRG ( need for Knowledge Module )

My Source table :

My Target Table: (FINAL_SET2)

What i expect is executing my mapping with my new IKM and replace the three rows showed above with the ones in the source table.

Step2:
Preparing mapping object according to our query.

Step3:
Setting my Knowledge Module

Step4:
The KM basically has 3 steps. First i am creating a temp table which is as same as my target table.By this way before deleting the rows in my target , i can save the resulset into another table for upcoming insert operation.

Step5:
We just inserted the resulset into temp table and now we can delete the existing records.

Step6:
And for final step, New resultset data which is inside of our temp table will be inserted into our target table.

Final view of our target table:

Note:
You can download the KM file below:
http://speedy.sh/dwcx4/KM-IKM-SQL-Control-DeleteInsert-2.xml

19 Nisan 2014 Cumartesi

ODI 12c Implementing Complex queries ( Analytic Functions & SubSelect Query )

Hi

In this entry i will implement more advance query over ODI 12c.Again I am gonna use ODIDEMO environment and implement following query.

select
sales.CUST_ID,CUST.CUST_NAME,
sales.PRODUCT,
sales.TOTAL_NUM_OF_ORDERS,
'%'||ROUND( (SUM_QUANTITY*100)/SUM_QUANTITY_ALL ) PER_QUANTITY
from
(
select
trg_customer.CUST_ID ,trg_customer.CUST_NAME
from
trg_customer where age>= 30 and age<=50
) cust ,
(
select
trg_sales.CUST_ID,
trg_product.PRODUCT ,
SUM(trg_sales.QTY) OVER (PARTITION BY trg_sales.CUST_ID) SUM_QUANTITY_ALL ,
SUM(trg_sales.QTY) OVER (PARTITION BY trg_sales.CUST_ID, trg_product.PRODUCT ) SUM_QUANTITY ,
COUNT(trg_sales.CUST_ID) OVER (PARTITION BY trg_sales.CUST_ID ) TOTAL_NUM_OF_ORDERS
from
trg_sales , trg_product
where
trg_sales.PRODUCT_ID = trg_product.PRODUCT_ID
) sales
where
sales.CUST_ID=cust.CUST_ID

At first we need prepare our datasource structure in mapping ( Sales and Customer )

Step1:
Prepare data source for sales subquery ( For this i will extract SALES and PRODUCT tables from MODEL--> SALES_ADMINISTRATION into SALES dataset and define the Join )

Step2:
Again in Sales part , we will gonna define our expressions
( TOTAL_NUM_OF_ORDERS ,SUM_QUANTITY , SUM_QUANTITY_ALL ).
To do that , first i will drag the columns which i need to use for calculation into expression object.
( CUST_ID , PRODUCT , QTY )

Step3:
Now i am gonna prepare CUSTOMER part of the query.For this i will extract TRG_CUSTOMER table from MODEL--> SALES_ADMINISTRATION into mapping and using filter object
( where age>= 30 and age<=50 )

Step4:
Okay , Currently we have two different datasource ( SALES , CUSTOMER) now according to query i will gonna merge these two set into one via join object.

Step5
As a final step , i will gonna map dataset into a target table.

Physical diagram as follows:

Note: For this example Source and Target tables are in same Oracle DB , so Default Oracle Knowledge Modules are used.

For Extraction: XKM Oracle Extract GLOBAL
For Integration: IKM Oracle Insert GLOBAL

13 Nisan 2014 Pazar

ODI 12c Subquery Filter

Hi

In this entry i am gonna show how to implement an oracle pl/sql statement with subquery condition in ODI 12c.

Environment : ODI 12c & Oracle 11g

Sample Query which i will implement:

select custid

from src_customer

where

custid in

(

select

CUST_ID

from src_orders

group by CUST_ID

having count(order_id) >3

)

Tables that we are using for this implementation:

Source table 1:

Source table 2:

Target Table:

Mapping:

In above you can see the mapping we suppose use for this operation.Trick is hidden inside of the subquery filter object.

ODI already automatically set SRC_ORDERS table as a subquery table.Now only thing we have to do is to set Subquery filter filter properly .( Check the image above).For KMs standart knowledge modules works fine , you dont have to do any change.( For this example LKM Exract and IKM Oracle Global Knowledge modules were used. )

24 Mart 2014 Pazartesi

ORACLE - Add List partition with for loop

Following script let you create list partition between dates you specify.
Dates , tablename and tablespace are all depend on variables that you can change manually.

DECLARE

p_BAS_TAR DATE ;

p_BIT_TAR DATE;

pTABLE_NAME VARCHAR2( 500 CHAR);

pTABLESPACE_NAME VARCHAR2( 500 CHAR);

DIFF NUMBER ;

INC_VAL NUMBER :=0;

V_DYNTASK LONG;

BEGIN

DIFF :=0;

p_BAS_TAR:= TO_DATE('20140316','YYYYMMDD'); -- INCULING THIS DAY

p_BIT_TAR:= TO_DATE('20200101','YYYYMMDD'); -- TILL THIS DAY NOT INCLUDED

pTABLE_NAME := 'TABLE_NAME; -- TABLE_NAME

pTABLESPACE_NAME:='TABLESPACE_NAME'; -- TABLESPACE_NAME

SELECT trunc(TO_DATE(''||p_BIT_TAR||'') ) - TO_date(TO_DATE(''||p_BAS_TAR||'', 'dd/mm/yyyy') ) INTO DIFF FROM DUAL;

DBMS_OUTPUT.PUT_LINE(DIFF);

FOR Lcntr IN 1..DIFF

LOOP

V_DYNTASK:='ALTER TABLE '||pTABLE_NAME||' ADD PARTITION P_'||to_char(p_BAS_TAR, 'YYYYMMDD')||' VALUES (TO_DATE('''||p_BAS_TAR||''', ''dd.mm.yyyy'')) '||(case when pTABLESPACE_NAME is null then null else 'TABLESPACE '||pTABLESPACE_NAME end);

DBMS_OUTPUT.PUT_LINE(V_DYNTASK);

EXECUTE IMMEDIATE V_DYNTASK;

COMMIT;

p_BAS_TAR := p_BAS_TAR + 1;

END LOOP;

END;

5 Mart 2014 Çarşamba

Oracle - How to find invalid number row

Following sql can help to find rows which contains invalid number

select rowid as row_nums,

column_name

from

TABLE_NAME

where not

REGEXP_LIKE (column_name, '[[:digit:]]')

All about BI & Data Integration