There are quite a few tools and ways to build and manage cubes in Oracle so in this post I’ll try to clarify some concepts.
Good top level introduction could be found in article by Arup Nanda – Data Warehousing and OLAP.
Detailed step by step tutorial – Building OLAP Cubes.
The easiest way to start is to explore existing sample schema.
Download GLOBAL schema using this link.
Following objects should be created after installation
SQL> select object_type, object_name 2 from user_objects 3 where object_type <> 'INDEX' 4 order by object_type, object_name; OBJECT_TYPE OBJECT_NAME --------------- ------------------------------ DIMENSION CHANNEL_DIMENSION DIMENSION CUSTOMER_DIMENSION DIMENSION PRODUCT_DIMENSION DIMENSION TIME_DIMENSION SEQUENCE CUBE_BUILD_ID TABLE ACCOUNT TABLE CHANNEL_DIM TABLE CUSTOMER_DIM TABLE PRICE_FACT TABLE PRODUCT_CHILD_PARENT TABLE PRODUCT_DIM TABLE TIME_DIM TABLE UNITS_FACT 13 rows selected.
To deploy cubes run Analytic Workspace Manager (can be downloaded from this page – Oracle OLAP Downloads: Software, Sample Schemas and Sample Code) and choose menu “Create Analytic Workspace from Template”.
There are two predefined workspaces in the archive – GLOBAL_MV.XML and GLOBAL.XML. First uses cube organized MVs while second does no as implied in the names.
Let’s start with second one.
After importing workspace in AWM the list of user objects looks as following (indexes and partitions excluded)
OBJECT_TYPE OBJECT_NAME CREATED_FROM_AWM --------------- ------------------------------ -------------------- CUBE PRICE_CUBE Y CUBE UNITS_CUBE Y CUBE DIMENSION CHANNEL Y CUBE DIMENSION CUSTOMER Y CUBE DIMENSION PRODUCT Y CUBE DIMENSION TIME Y DIMENSION CHANNEL_DIMENSION DIMENSION CUSTOMER_DIMENSION DIMENSION PRODUCT_DIMENSION DIMENSION TIME_DIMENSION LOB SYS_LOB0000243395C00009$$ Y LOB SYS_LOB0000243399C00004$$ Y SEQUENCE CUBE_BUILD_ID SEQUENCE GLOBAL_S$ Y TABLE ACCOUNT TABLE AW$GLOBAL Y TABLE CHANNEL_DIM TABLE CUBE_TEMPLATES Y TABLE CUSTOMER_DIM TABLE PRICE_FACT TABLE PRODUCT_CHILD_PARENT TABLE PRODUCT_DIM TABLE TIME_DIM TABLE UNITS_FACT VIEW CHANNEL_PRIMARY_VIEW Y VIEW CHANNEL_VIEW Y VIEW CUSTOMER_MARKET_VIEW Y VIEW CUSTOMER_SHIPMENTS_VIEW Y VIEW CUSTOMER_VIEW Y VIEW PRICE_CUBE_VIEW Y VIEW PRODUCT_PRIMARY_VIEW Y VIEW PRODUCT_VIEW Y VIEW TIME_CALENDAR_VIEW Y VIEW TIME_FISCAL_VIEW Y VIEW TIME_VIEW Y VIEW UNITS_CUBE_VIEW Y 36 rows selected.
Last columns is used to highlight whether object was created using initial script or from AWM.
We see that two CUBEs and few CUBE DIMENSIONs were created as well as several views.
Cubes and their dimensions may be queried using cube_table function.
SQL> select * from table(cube_table('GLOBAL.PRICE_CUBE')); no rows selected
Query above returned zero rows because in order to populate cube with data “Maintain Cube …” command should be run in AWM by clicking right mouse button on corresponding cube.
Once cube is maintained (I’d prefer to call it “processed” though) – all the data is populated in AW$GLOBAL.
SQL> delete from aw$global; 1773 rows deleted. SQL> select count(*) cnt from table(cube_table('GLOBAL.PRICE_CUBE')); select count(*) cnt from table(cube_table('GLOBAL.PRICE_CUBE')) * ERROR at line 1: ORA-33272: Analytic workspace GLOBAL.GLOBAL cannot be opened. ORA-01403: no data found SQL> rollback; Rollback complete. SQL> select count(*) cnt from table(cube_table('GLOBAL.PRICE_CUBE')); CNT ---------- 4779
One cube is stored in a BLOB column in one row but AW$GLOBAL contains a lot of metadata as well as dimensions’ data – this explains why almost two thousand rows were deleted.
Let’s have a closer look at function cube_table.
SQL> select count(*) cnt from table(cube_table('GLOBAL.PRICE_CUBE')); CNT ---------- 4779 SQL> select * from table(dbms_xplan.display_cursor(format => 'basic')); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------- EXPLAINED SQL STATEMENT: ------------------------ select count(*) cnt from table(cube_table('GLOBAL.PRICE_CUBE')) Plan hash value: 953371085 --------------------------------- | Id | Operation | Name | --------------------------------- | 0 | SELECT STATEMENT | | | 1 | SORT AGGREGATE | | | 2 | CUBE SCAN | | --------------------------------- 14 rows selected.
cube_table function generates access path “CUBE SCAN” (may generate several other variations but always with the word “CUBE”).
Its definition is following
function cube_table(cube in varchar2) return sys.anydataset pipelined using olapimpl_t;
and declaration of olapimpl_t is following
type olapimpl_t authid current_user as object(shared_ctx raw(8), mutable_ctx raw(8), tdo_ctx sys.anytype, stmt_id number);
Function look like typical one with the ODCITable interface but type’s source code is quite unexpected.
It does not contain any methods and is not even mapped to external library. Let’s try to create the same function with another name
SQL> create or replace function cube_table0(cube in varchar2) return sys.anydataset pipelined using olapimpl_t; 2 / Function created. SQL> select count(*) cnt from table(cube_table0('GLOBAL.PRICE_CUBE')); select count(*) cnt from table(cube_table0('GLOBAL.PRICE_CUBE')) * ERROR at line 1: ORA-03113: end-of-file on communication channel
Function has been created successfully but cannot be executed. Looks like there are some internal code with the hardcoded function name.
So just to summarize: by default OLAP engine stores cubes in BLOB column in AW$GLOBAL table and cube_table function is used to access cube measures/dimensions from that BLOB column.
It’s worth to mention that it’s impossible to use parallel execution with such approach.
Also cube_table has negative side effect like oll other function which return anydataset – creation a number of SYSTP%== types.
After importing workspace (deploying cubes) AWM created several views to encapsulate cube_table function.
Now let’s build UNITS_CUBE based on cube organized materialized views.
In order to do that it’s required to change each dimension data constraints to “Star Consistent” and check the box “Enable Materialized View Refresh of the Cube” in cube properties.
Once all changes are applied AWM will create following objects
OBJECT_TYPE OBJECT_NAME -------------------- ------------------------------ MATERIALIZED VIEW CB$CHANNEL_PRIMARY MATERIALIZED VIEW CB$CUSTOMER_MARKET MATERIALIZED VIEW CB$CUSTOMER_SHIPMENTS MATERIALIZED VIEW CB$PRODUCT_PRIMARY MATERIALIZED VIEW CB$TIME_CALENDAR MATERIALIZED VIEW CB$TIME_FISCAL MATERIALIZED VIEW CB$UNITS_CUBE TABLE CB$CHANNEL_PRIMARY TABLE CB$CUSTOMER_MARKET TABLE CB$CUSTOMER_SHIPMENTS TABLE CB$PRODUCT_PRIMARY TABLE CB$TIME_CALENDAR TABLE CB$TIME_FISCAL TABLE CB$UNITS_CUBE TABLE CR$UNITS_CUBE 15 rows selected.
In order to correctly populate all the aggregations cube must be rebuilt.
If rebuilding executed using SQL statement (from let’s say SQL*PLus) then AW should be detached from AWM – otherwise it will hang on enq: AW – user access for AW.
All AW sessions with corresponding attach modes are listed in v$aw_olap.
Below you can see data in CHANNEL dimension before and after rebuild.
DIM_KEY LEVEL_N CHANNEL_TOTAL_ID CHANNEL_CHANNEL_ID --------------- ------- -------------------- -------------------- CHANNEL_CAT CHANNEL CHANNEL_DIR CHANNEL CHANNEL_INT CHANNEL TOTAL_TOTAL TOTAL SQL> exec dbms_cube.build('UNITS_CUBE'); PL/SQL procedure successfully completed. SQL> select * from cb$channel_primary; DIM_KEY LEVEL_N CHANNEL_TOTAL_ID CHANNEL_CHANNEL_ID LONG_DESCRIPTIO SHORT_DESCRIPTI --------------- ------- -------------------- -------------------- --------------- --------------- CHANNEL_CAT CHANNEL TOTAL CAT Catalog Catalog CHANNEL_DIR CHANNEL TOTAL DIR Direct Sales Direct Sales CHANNEL_INT CHANNEL TOTAL INT Internet Internet TOTAL_TOTAL TOTAL TOTAL Total Channel Total Channel
Following access path was used to query data from cube organized MV
-------------------------------------- | Id | Operation | Name | -------------------------------------- | 0 | SELECT STATEMENT | | | 1 | MAT_VIEW CUBE ACCESS | | --------------------------------------
If we have a look at definition of cb$channel_primary
select dbms_metadata.get_ddl('MATERIALIZED_VIEW','CB$CHANNEL_PRIMARY') from dual;
create materialized view "GLOBAL"."CB$CHANNEL_PRIMARY" ( "DIM_KEY" ,"LEVEL_NAME" ,"CHANNEL_TOTAL_ID" ,"CHANNEL_CHANNEL_ID" ,"LONG_DESCRIPTION" ,"SHORT_DESCRIPTION" ) ORGANIZATION CUBE ON "GLOBAL" ( DIMENSION "CHANNEL" IS "DIM_KEY" ATTRIBUTE "CHANNEL_LEVELREL" IS "LEVEL_NAME" ATTRIBUTE "CHANNEL_CHANNEL_TOTAL_ID_UNIQU" IS "CHANNEL_TOTAL_ID" ATTRIBUTE "CHANNEL_CHANNEL_CHANNEL_ID_UNI" IS "CHANNEL_CHANNEL_ID" ATTRIBUTE "CHANNEL_LONG_DESCRIPTION" IS "LONG_DESCRIPTION" ATTRIBUTE "CHANNEL_SHORT_DESCRIPTION" IS "SHORT_DESCRIPTION" ) build deferred refresh force on demand using enforced constraints disable query rewrite as (select (case grouping_id(t1."TOTAL_ID", t1."CHANNEL_ID") when 1 then to_char(('TOTAL_' || t1."TOTAL_ID")) else to_char(('CHANNEL_' || t1."CHANNEL_ID")) end) "DIM_KEY" ,(case grouping_id(t1."TOTAL_ID", t1."CHANNEL_ID") when 1 then to_char('TOTAL') else to_char('CHANNEL') end) level_name ,t1."TOTAL_ID" "CHANNEL_TOTAL_ID" ,t1."CHANNEL_ID" "CHANNEL_CHANNEL_ID" ,(case grouping_id(t1."TOTAL_ID", t1."CHANNEL_ID") when 1 then max(t1."TOTAL_DSC") else max(t1."CHANNEL_DSC") end) "LONG_DESCRIPTION" ,(case grouping_id(t1."TOTAL_ID", t1."CHANNEL_ID") when 1 then max(t1."TOTAL_DSC") else max(t1."CHANNEL_DSC") end) "SHORT_DESCRIPTION" from global."CHANNEL_DIM" t1 group by (t1."TOTAL_ID"), rollup((t1."CHANNEL_ID")))
It contains keywords “ORGANIZATION CUBE” but there is no information about this clause in documentation so presumably all such views should be generated using UI tools, in particular AWM or subprograms like DBMS_CUBE.CREATE_MVIEW.
On the other hand it’s possible to create cube organized view manually and query it but it does not make much sense since it should be used along with other dimensions and facts.
Crucial point here is that data for cube organized MV is not stored in segment(s) like ordinary MV. So query below return zero rows.
select * from dba_segments where segment_name like 'CB$%'
Now, it’s interesting to compare performance of cubes stored as BLOBs vs cubes based on cube organized MVs vs warehouse which does not use OLAP option at all.
Generic comparison can be found in this white paper – Comparing Materialized Views and Analytic Workspaces in Oracle Database 11g, more detailed performance analysis is a subject for another blog post.