Saturday, February 25, 2012
Data warehousing - Design
operational Database, I=B4m creating dimension tables and=20
facta tables, my question is following.
When creating dimension tables I=B4m sometimes taking=20
column from tables in the operational database where=20
there is no indexing, should I create index for the=20
column in the data warehouse ?.
When creating facta tables and dimension tables for=20
Analyser services should I have dimension tables with=20
primary keys, and facta with foreign keys to the=20
dimension tables.
Does somone knows about some good article for designing=20
Data warehouse for Analyser services
Looking forward to hear from you
Regards
J=F3n SveinssonTry these
http://www.microsoft.com/technet/tr...part5/c2061.asp
http://www.ralphkimball.com/html/articles.html
HTH
Ray Higdon MCSE, MCDBA, CCNA
--
"Jn Sveinsson" <jon@.nordural.is> wrote in message
news:035701c3fab5$f1f389c0$a401280a@.phx.gbl...
I'm designing data wareouse where Im taking Data from an
operational Database, Im creating dimension tables and
facta tables, my question is following.
When creating dimension tables Im sometimes taking
column from tables in the operational database where
there is no indexing, should I create index for the
column in the data warehouse ?.
When creating facta tables and dimension tables for
Analyser services should I have dimension tables with
primary keys, and facta with foreign keys to the
dimension tables.
Does somone knows about some good article for designing
Data warehouse for Analyser services
Looking forward to hear from you
Regards
Jn Sveinsson
Data Warehouse Structure and Indexing when using MOLAP
Hi there,
I am building a cube in AS2005 which is made up of one fact table (20 million rows) and 7 dimension tables. I am planning to use MOLAP storage and my question is this. Is it important to add indexes, primary keys, foreign keys to my data source to improve performance? I can appreciate that it might speed up processing time, but since all the data and aggregations would be stored in the cube, I can't see how it would have any imact on performance (querying the cube).
If there is a benefit to adding indexes, keys etc. then could somebody please tell the best strategy, i.e where to put them, clustered or nonclustered etc.
Many thanks,
David
I have the same opinion, with MOLAP indexes will only help when you process a cube. MOLAP is separated from the data source when the cube is finished/processed.
If you use the same datamart for Reporting Services you will benefit when running queries.
The strategy for index is clustered indexes for dimension primary keys and non-clustered on their foreign keys in the fact table. A non clustered composite index on all the foreigh keys in the fact table is also recommended. Clustered index in the fact table is not recommended.
You will pay for using indexes when you do inserts, deletes and updates so it is often recommended to drop indexes when you update your data mart or data warehouse, and recreate the when the update/load is finished.
Primary and foreign keys are good help if you like to receive errors when you are loading the data mart/warehouse. With them you know that all fact records have related dimension members Else these errors will appear when you process the cube.
Here is a web site I can recommend for performance issues:
http://www.sql-server-performance.com/default.asp
Regards
Thomas Ivarsson
Data Warehouse Structure and Indexing when using MOLAP
Hi there,
I am building a cube in AS2005 which is made up of one fact table (20 million rows) and 7 dimension tables. I am planning to use MOLAP storage and my question is this. Is it important to add indexes, primary keys, foreign keys to my data source to improve performance? I can appreciate that it might speed up processing time, but since all the data and aggregations would be stored in the cube, I can't see how it would have any imact on performance (querying the cube).
If there is a benefit to adding indexes, keys etc. then could somebody please tell the best strategy, i.e where to put them, clustered or nonclustered etc.
Many thanks,
David
I have the same opinion, with MOLAP indexes will only help when you process a cube. MOLAP is separated from the data source when the cube is finished/processed.
If you use the same datamart for Reporting Services you will benefit when running queries.
The strategy for index is clustered indexes for dimension primary keys and non-clustered on their foreign keys in the fact table. A non clustered composite index on all the foreigh keys in the fact table is also recommended. Clustered index in the fact table is not recommended.
You will pay for using indexes when you do inserts, deletes and updates so it is often recommended to drop indexes when you update your data mart or data warehouse, and recreate the when the update/load is finished.
Primary and foreign keys are good help if you like to receive errors when you are loading the data mart/warehouse. With them you know that all fact records have related dimension members Else these errors will appear when you process the cube.
Here is a web site I can recommend for performance issues:
http://www.sql-server-performance.com/default.asp
Regards
Thomas Ivarsson
Friday, February 24, 2012
Data types question - Varbinary and timestamps
Greetings once again SSIS friends,
I have some source tables which contain timestamp fields (that's timestamp data type not datetime). My dimension table holds the maximum timestamp value as a varbinary(8).
I want my package to have a variable that holds that value but I don't know which data type to use for this. The reason for this is because I want to use that variable to then retrieve all records from my source table that have a timestamp value greater than the value stored in the variable.
Please advise on what data type is suitable.
Thanks for your help in advance.
Greetings once again guys,
I found the following article which gives a workable solution to my problem, but I am still interested to find out the equivalent of timestamp data type in SSIS.
http://solidqualitylearning.com/blogs/erik/archive/2005/12/09/1499.aspx
|||
DT_BYTES
binary, varbinary, timestamp
binary, varbinary, timestamp
BigBinary, VarBinary
RAW
RAW
You will want to use DT_BYTES. See this article on MSDN.
http://msdn2.microsoft.com/en-us/library/ms141036.aspx
Does this answer your question?
Sunday, February 19, 2012
Data Type in DSV
I have Varchar column called "Age", which capture patient age. and I am trying to use grouping property of the Age dimension, since it is varchar type, the grouping is messed up, such as 20 - 4, 40 - 5. So I say, ok, I need to cast to int, so I change my age dimension to int, but the fact table need to be changed too, so I did cast(age as int) as Age, however I got error:
maxlength applies to string data type only, you cannot set column 'Age' property MaxLength to be non-negative number.
Any advice?
Dear Friend,
First when U Chages any data type in Fact Table. First remove all references in Cube. Suppose U want to change Age Datatype then first remove Age Measures from Cube or if any other reference is there like dimension using this field remove that from cube, then save, then process again. Now U can change that age datatype. Do fresh DataSourceView. Then add the things U removed. And Re-process your cube now this error will not come.