Saturday, February 25, 2012

DataAdapter Update Problems

Hi, I am populating a datagrid with data from one SQL table, I add additional columns and then wish to write it back out to another table. I have written the following code to achieve this, however it keeps throwing the following exception - "Update requires a valid UpdateCommand when passed DataRow collection with modified rows".

I've looked and I can't see what the problem is - can anybody help please?

Private Sub FillData()
    'SQL connection
    Me.sqlconnection = New System.Data.SqlClient.SqlConnection
    Me.sqlconnection.ConnectionString = ConfigurationSettings.AppSettings("CuroConnection")

    Dim sql As String
    If Request("action") = "gen" Then
        sql = "select item from tblPickList where GroupName = 'xxx' AND Category = '" & Category2 & "' ORDER BY item ASC"
    ElseIf Request("action") = "fav" Then
        sql = "select identifiertext as item from tblfavourites where HRID = " & Session("hrid") & " and type = 3 ORDER BY identifiertext ASC"
    ElseIf Request("action") = "spec" Then
        sql = "select item from tblPickList where GroupName = '" & GroupID & "' AND Category = '" & Category2 & "' ORDER BY item ASC"
    End If

    ds = New DataSet
    da = New SqlClient.SqlDataAdapter(sql, sqlconnection)
    Try
        sqlconnection.Open()
        da.Fill(ds, "tblPickListTemp")
        sqlconnection.Close()
    Catch ex As Exception
        sqlconnection.Close()
    End Try

    ds.Tables(0).Columns.Add("HRID")
    ds.Tables(0).Columns.Add("Flag")
    'may possibly add a favourites column here at a later date to ensure no duplication

    Dim dr As DataRow
    For Each dr In ds.Tables("tblPickListTemp").Rows
        dr("HRID") = Session("hrid")
        dr("Flag") = 0
    Next

    DgrdFavourites.DataSource = ds
    DgrdFavourites.DataBind()
End Sub

Public Sub CreateTempRecords()
    'SQL connection
    Me.sqlconnection = New System.Data.SqlClient.SqlConnection
    Me.sqlcommand = New System.Data.SqlClient.SqlCommand
    Me.da = New SqlClient.SqlDataAdapter
    Me.sqlconnection.ConnectionString = ConfigurationSettings.AppSettings("CuroConnection")

    Dim sql As String
    sql = "Insert into tblPickListTemp (Item, HRID, Flag) values (@Item, @HRID, @Flag)"
    sqlcommand.CommandText = sql
    sqlcommand.Connection = sqlconnection
    sqlcommand.Parameters.Add("@Item", "Item")
    sqlcommand.Parameters.Add("@HRID", "HRID")
    sqlcommand.Parameters.Add("@Flag", "Flag")
    da.InsertCommand = sqlcommand
    da.Update(ds.Tables(0))
End Sub

Hello Naimh,

Your program retrieves records from the database, you modify them and try to update the records in the database.

In that case you need to add an UpdateCommand to your SqlDataAdapter.

The Update call will then handle new rows and modified rows.

Jeroen Molenaar.

|||I thought that was what I was doing with the da.InsertCommand on the 2nd to last line. I'm not sure I've really got to grips with the DataAdapter syntax, could you show me please?|||

The InsertCommand you use is correct. But you are not adding new rows to the table, so you also need the UpdateCommand.

sql = "Update tblPickList Set HRID = @.HRID, Flag = @.Flag where Item = @.Item"

// Add the parameters here ...

sqlcommand.CommandText = sql

da.UpdateCommand = sqlcommand

// etc.
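Spelled out a little more, the wiring might look like this (a sketch only: the parameter types and sizes are assumptions, it reuses the page-level da and sqlconnection from the snippet above, and it assumes Imports System.Data):

' Sketch: an UpdateCommand whose parameters are mapped back to the table's source columns.
Dim updateCmd As New System.Data.SqlClient.SqlCommand( _
    "Update tblPickList Set HRID = @HRID, Flag = @Flag where Item = @Item", sqlconnection)
updateCmd.Parameters.Add("@HRID", SqlDbType.Int, 4, "HRID")
updateCmd.Parameters.Add("@Flag", SqlDbType.Int, 4, "Flag")
updateCmd.Parameters.Add("@Item", SqlDbType.NVarChar, 255, "Item")
da.UpdateCommand = updateCmd
da.Update(ds.Tables(0))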

Hope you get it working now.

Jeroen Molenaar.

|||

jeroenm:

The InsertCommand you use is correct. But you are not adding new rows to the table, so you also need the UpdateCommand.

Okay, really confused now, as that is exactly what I want to do: I populate the dataset from one table, manipulate it in the dataset and then insert entirely new rows into a completely different table, erk! I shall give your suggestion a try but I have to fess up to being none the wiser.|||

jeroenm:

sql?= "Update tblPickListTemp Set HRID = @.HRID, Flag = @.Flag where Item = @.Item"

See, now this is never going to work for me: tblPickListTemp (the table I want to insert rows into) is empty at the start of this process, and I cannot update and set fields to new values when no records currently exist. Ah well, back to the drawing board, thanks anyway.|||

Do you have an AcceptChanges statement somewhere before the data is saved to the database?

When you modify the dataset after AcceptChanges, the data adapter thinks it needs an Update statement instead of an Insert statement.

Jeroen Molenaar.

|||No I am afraid I don't, all the code is in the snippet above. It's most confusing. I have also ensured that the primary key is set in the recipient table, which is another common problem which throws this error. I am at a real loss and almost at the point of scrapping the whole thing and taking a different approach, it's just that if I could have got this to work it would have been less trips to the Db and a lot less spaghetti code.|||

You can add some ASP.NET code. Perhaps this gives another clue.

|||

<form id="Form1" method="post" runat="server"><TABLE id="Table2" cellSpacing="0" cellPadding="0" width="280" align="center" border="0"><TR height="24"><TD class="fntmed" style="WIDTH: 280px" background="../images/header.jpg">         PickList</TD></TR><TR><TD class="fntnorm" style="WIDTH: 280px; HEIGHT: 36px"><IMG height="6" src="../images/6pixspacer.gif"><br><asp:image id="imgLine" runat="server" Width="260px" ImageUrl="../Images/line.jpg"></asp:image></TD></TR><TR><TD><div style="OVERFLOW: auto; WIDTH: 270px; POSITION: static; HEIGHT: 190px"><asp:DataGrid id="DgrdFavourites" autogeneratecolumns="False" BackColor="#ffffff" showheader="False"OnItemCommand="DgrdFavourites_ItemCommand" runat="server" Width="256px" Height="25px" BorderWidth="0"CssClass="fntsmall" AlternatingItemStyle-CssClass="fntsmall"><AlternatingItemStyle BackColor="#e2eaee" /><Columns><asp:BoundColumn DataField="Item" /><asp:templatecolumn ItemStyle-HorizontalAlign="Right"><ItemTemplate><asp:button CssClass="buttonlike" Text="Select" commandname="Select" runat="server" ID="btnSelect" /></ItemTemplate></asp:templatecolumn><asp:templatecolumn ItemStyle-HorizontalAlign="Right"><ItemTemplate><asp:button CssClass="buttonlike" Text="Favourite" runat="server" commandname="Add" ID="btnAdd" /></ItemTemplate></asp:templatecolumn></Columns></asp:DataGrid></div></TD></TR></TABLE><asp:Label ID="oops" Runat="server"></asp:Label></form></body>
That's the entirety of the front-end code; basically, at page load inside an "If Not IsPostBack" the FillData() method is invoked, followed by the CreateTempRecords() method. Not a lot to it really.|||

Hello Naimh,

Sorry, but I've been steering you in the wrong direction. My first idea was correct, but I didn't read your code properly.

When you call FillData the dataset ds is filled from the table tblPickList. In FillData you add 2 columns and modify all rows with the session HRID. At this stage your rows have the status Modified in the dataset ds. (This generates the error message when calling da.Update())

Before "DgrdFavourites.DataSource = ds" you can enter the command ds.AcceptChanges(). All modifications are accepted and no update statement will be generated by the dataadapter.

The second thing is to make some changes to CreateTempRecords. Create a new dataset dstemp as a copy of ds. After that, insert the rows from ds into the new dataset dstemp, for example with a For Each loop. Now the new dataset is ready for the insert statement that you have written in CreateTempRecords(). Change da.Update(ds.Tables(0)) into da.Update(dstemp.Tables(0)).
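In code, that suggestion might look roughly like this (a sketch; it reuses the ds and da members from the snippet above and assumes the InsertCommand has already been set up in CreateTempRecords):

ds.AcceptChanges()                              ' accept the HRID/Flag edits so no rows stay Modified

Dim dstemp As DataSet = ds.Clone()              ' same schema, no rows
For Each dr As DataRow In ds.Tables(0).Rows
    dstemp.Tables(0).Rows.Add(dr.ItemArray)     ' rows added this way have RowState Added
Next

da.Update(dstemp.Tables(0))                     ' fires the InsertCommand for every row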

Now your code should do the job!

Jeroen Molenaar.

DataAdapter does NOT load relations from DB

Why does the DataAdapter NOT load relations from the DB?

The relations are made using SQL Server Management Studio, but when I fill a DataTable using a DataAdapter the relations are not loaded.

Why? What is the solution to this problem?

Because it doesn't, nor would it be appropriate for it to. Your code may want to set up different relationships than those enforced by the database. Or set up no relationships at all. If you want the relationships in your code, then you'll need to add them yourself.
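For example, a minimal sketch of adding one yourself after both tables are filled (the table and column names here are invented, and it assumes Imports System.Data):

Dim parentCol As DataColumn = ds.Tables("Customers").Columns("CustomerID")
Dim childCol As DataColumn = ds.Tables("Orders").Columns("CustomerID")
ds.Relations.Add(New DataRelation("FK_Orders_Customers", parentCol, childCol))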

DataAdapter always returns 0 or empty rows

Good day,

I just like to ask if anybody has experienced getting empty rows from SQL data adapter? I'm using SQL Server 2005. Problem is when the sql is run on Query Analyzer it retrieves a number of rows but when used in my application it returns 0 or empty rows.

I thought the connection is not the problem since I got my columns right. Below is my code snippet.

Thanks!

const string COMMAND_TEXT = @"select distinct somefield as matchcode, count(somefield) "
    + "as recordcount from filteredaccount where StateCode = 0 group by somefield having count(somefield) > 1";
SqlDataAdapter adapter = new SqlDataAdapter(COMMAND_TEXT, connection);
DataTable dt = new DataTable(sometablename);
adapter.Fill(dt);

Hi,

I might be wrong, but don't you need a 'Return dt;' at the end of all this?

Hope this helps.

Paul

|||

You cannot use an aggregate function like COUNT in a WHERE condition. You need to use a subquery.

const string COMMAND_TEXT = @"select distinct matchcode, recordcount from
    (select distinct somefield as matchcode, count(somefield) as recordcount
     from filteredaccount where StateCode = 0 group by somefield) As mytbl
where recordcount > 1";

|||Thanks for the replies. I will try your recommended SQL update, my question now is why my SQL is returning rows when run on Query Analyzer but not on the web application?|||

Hi,

The reason is that in your code you are returning nothing! Personally, I didn't find anything wrong with the original query, it worked fine in my tests. I was using an ObjectDataSource calling an object and my function was pretty much identical to yours - except it had a Return statement and mine is in VB rather than C#. However, the code is:

' Initialise the DataTable
Dim dad As New SqlDataAdapter("SELECT DISTINCT LastName AS Matchcode, COUNT(Lastname) AS RecordCount " _
    & "FROM Contacts GROUP BY LastName HAVING COUNT(LastName)>1", _
    conString)
' Create a DataTable
Dim dt As New DataTable()
' Populate the DataTable
dad.Fill(dt)
' Return results
Return dt
This worked without any issues on my system. By the way, the function returns a DataTable (i.e. Function nameOfFunction() As DataTable).
Hope this helps.
Paul
|||

Hi Paul,

I do return the DataTable, I just didn't put it there - sorry about that. I put logs after the adapter.Fill and checked Columns.Count and Rows.Count.

columns were ok - 2 columns but rows.count is always zero.

I'm wondering if anybody has experienced the same, maybe a bug/ setting in SQL server 2005 or ADO.NET that I can check?

Thanks.

|||

Hi,

If I'm understanding correctly, the problem isn't the returning of your data table, but is in giving you a Rows.Count value? If so, I've just tried it in my code without any problems. The adapted code looks like this:

' Initialise the DataTable
Dim dad As New SqlDataAdapter("SELECT DISTINCT LastName AS Matchcode, COUNT(Lastname) AS RecordCount " _
    & "FROM Contacts GROUP BY LastName HAVING COUNT(LastName)>1", _
    conString)
' Create a DataTable
Dim dt As New DataTable()
' Populate the DataTable
dad.Fill(dt)
' Log result & Return
log.Write("Columns & Rows", "Test Grid", "Columns= " & dt.Columns.Count.ToString & " Rows= " & dt.Rows.Count.ToString, 0, "Test", "test")
Return dt

If you can't get this to work (I'm sure your logging code won't be the same as mine, but hopefully you get the idea), you could try writing to your log file from the OnDataBound event. Strangely, if you have AutoGenerateColumns set to true, the column count returns zero here, but the row count is OK. This problem is fixed by ensuring you have set out the columns in your GridView.

I hope this (at last!) helps sort your problem. If not, I suggest you post all of the relevant code.

Cheers,

Paul

|||

Thanks Paul. The problem is my datatable when filled always has zero rows. While the SQL query when run on Query Analyzer returns a number of rows.

I'm sure that there's nothing wrong with my logging.

Anyway, I kind of implemented a workaround: my SQL selects from a view... what I did was to select directly from the table instead. Hence, I believe my SQL view is somewhat corrupt or problematic.

Thanks again for the replies.

What bothers me though is that ADO.NET does not fire an exception or anything... it just does not fill the DataTable...

|||

Hi,

Do you mean that you can't get any rows back when selecting from the view in QA? Try to select from the table directly and fill your DataTable with the DataAdapter, then check if you get a number of rows. If you can, the problem should be caused by the view you use. So what you should do is rebuild the view and check the select in QA first. My personal view is that there should not be anything wrong with your DataAdapter or DataTable. Thanks.

|||

Yes, that's what I did - I select directly from the table rather than from the view. It is working fine now.

My SQL select from the view, run in Query Analyzer, works fine and returns a number of rows,

BUT my SQL select from the view, run from my application, always returns 0 rows.

What's strange is that I didn't get any errors when filling up my DataTable; the rows are just zeroed out.

|||

Hi,

i. You may open your SQL Profiler and start a trace. In the meantime, pls run your application and show us the textdata in the profiler.

ii. Please show us the whole code, including the view in your database.

Thanks.

|||

Thanks Michael, I appreciate your suggestions... below is the whole method. I'll try the SQL Profiler... what filter should I use to catch the SQL statement?

public DataTable RetrieveAllGroups(int entityTypeCode)
{
    const string COMMAND_TEXT = @"select distinct somefield as matchcode, count(somefield) "
        + "as recordcount from filteredaccount where StateCode = 0 group by somefield "
        + "having count(somefield) > 1 order by recordcount desc, matchcode asc";

    SqlConnection connection = GetDataConnection();

    if (connection != null)
    {
        SqlDataAdapter adapter = new SqlDataAdapter(COMMAND_TEXT, connection);
        DataTable dt = new DataTable(entityTableName);
        adapter.Fill(dt);
        return dt;
    }

    return null; // no connection available
}

DataAdapter - SELECT Statement - items in last 30 days

I'm using a DataList to return values stored in a SQL database, of which one of the fields contains the date the record was added.

I am trying to fill the dataset with items only from the last 30 days.
I've tried a few different ways, but all the database rows are returned.

What is the WHERE clause I should use to do this??

Thanks

Try the following SQL statement, I believe it should work.

select * from <tablename> where datediff(day, <columnname>, getdate()) < 30
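A side note: because DATEDIFF is applied to the column, SQL Server cannot use an index on that column for this filter. An index-friendly, parameterised variant might look like this in VB (a sketch; the table, column and conString names are made up, and it assumes Imports System.Data and System.Data.SqlClient):

Dim da As New SqlDataAdapter("select * from Orders where OrderDate >= @Cutoff", conString)
da.SelectCommand.Parameters.Add("@Cutoff", SqlDbType.DateTime).Value = DateTime.Today.AddDays(-30)
Dim ds As New DataSet()
da.Fill(ds, "RecentOrders")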

Hope it solves your issue.|||Thanks very much, it worked a treat

Data/Transaction log files for two databases have same file name

We have two db's. One live and one test.

When I right click on the live one in SQL Enterprise Manager and
select properties -> Data Files ->
File Name is LIVE.MDF
Location is F:\Data\LIVE.MDF

When I right click on the test one in SQL Enterprise Manager and
select properties -> Data Files ->
File Name is LIVE.MDF
Location is F:\Data\TEST.MDF

Same thing applies to Transaction log files too.

My concern is File Name is same in both the above cases even though
the location is different. What are the consequences of this.

Thanks for your help

GVVGirish (kattukuyil@.hotmail.com) writes:
> When I right click on the live one in SQL Enterprise Manager and
> select properties -> Data Files ->
> File Name is LIVE.MDF
> Location is F:\Data\LIVE.MDF
> When I right click on the test one in SQL Enterprise Manager and
> select properties -> Data Files ->
> File Name is LIVE.MDF
> Location is F:\Data\TEST.MDF
> Same thing applies to Transaction log files too.
> My concern is File Name is same in both the above cases even though
> the location is different. What are the consequences of this.

Nothing. The filename in question is just a logical name within the
database.

--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

Data/Table Last Accessed/Modified

Is anyone aware of a method of determining when a table was last modified or accessed?

Some of our databases have tables that I am sure are not being used and I would like to generate a list of tables that have not been accessed or modified for some period of time.

I looked for a system procedure but didn't see anything that satisfied my need.

Currently I rename suspect tables and wait for someone or some process to gripe, but I don't care for that method for obvious reasons.

Thanks!

Again... are you talking about the data or the object?
check what brett says here:
http://www.dbforums.com/t995468.html|||Both actually. I would like to know when someone queries against it or when the data is modified. The trigger method is good for data mods...I can use something similar. But it might be hard to see if a table has been accessed or read.

Thanks!|||The only effective way I know to do that is using SQL Profiler (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/adminsql/ad_mon_perf_86ib.asp). Be forewarned: this can use enormous amounts of disk if you don't set it up very carefully!

This is when I wish that the design team had left the SELECT triggers in place, although that could have been a nightmare too!

-PatP|||That may do the trick. Thanks!

Data/records in other languages

Hi,

Can the English version of Sql Server (or other DBMSs) store data/records in
other languages (e.g. French, Chinese, etc.) or
do you have to use the French version of Sql Server to store French data?

Where can I go to read more on this?

Thanks.
Eric"Eric" <ericd_@.hotmail.com> wrote in message news:<Gpjjb.3561$Z_2.317631@.news20.bellglobal.com>...
> Hi,
> Can the English version of Sql Server (or other DBMSs) store data/records in
> other languages (e.g. French, Chinese, etc.) or
> do you have to use the French version of Sql Server to store French data?
> Where can I go to read more on this?
> Thanks.
> Eric

You can use Unicode data types (nchar, nvarchar, ntext) to store
Unicode data, and it doesn't matter what language your version of
MSSQL is. You can get more information in Books Online in the sections
on "Using Unicode Data" and "Collations".

Simon

Data/Preview windows not showing same data return

When I test my query in the "Data" window (tab), I get a healthy number of
rows returned. When I test under the "Preview" window (tab), I get a tiny
fraction of rows returned. Can someone tell why this is? Perhaps, something
in the "Layout" that I am missing? I'm not even sure how the data I am
getting in the "Preview" window is unique compared to the other data I did
not get.
Thanks in advance for any clues!

Here is what might be happening. The preview tab caches data. If the
parameter is not changed it re-uses the data. Look where your .rdl file is
and you should see files with a .data extension. Either delete this file to
force it to hit the database again or just run your report with a different
parameter.
Bruce Loehle-Conger
MVP SQL Server Reporting Services
"Lu" <Lu@.discussions.microsoft.com> wrote in message
news:E1372BDD-F320-4687-9390-F703136400FF@.microsoft.com...
> When I test my query in the "Data" window (tab), I get a healthy number of
> rows returned. When I test under the "Preview" window (tab), I get a tiny
> fraction of rows returned. Can someone tell why this is? Perhaps,
> something
> in the "Layout" that I am missing? I'm not even sure how the data I am
> getting in the "Preview" window is unique compared to the other data I did
> not get.
> Thanks in advance for any clues!|||Thanks, Bruce. I flushed the old data, but this did not resolve the issue.
I forgot to mention that I deployed as well thinking it may have been cached
data. However, the problem also persists on production. I have the
parameters the same on both test preview and the raw query (data tab) and
even in the SQL Srvr Mgmt Studio...the preview and production windows only
retrieve 3 rows of data as opposed to the expected 3,000+ rows of data. It
should also be noted that something must've changed that I didn't catch
because the query was pulling more info than it is now...not all of it, but
was still pulling more than now.
"Bruce L-C [MVP]" wrote:
> Here is what might be happening. The preview tab caches data. If the
> parameter is not changed it re-uses the data. Look where your .rdl file is
> and you should see files with a .data extension. Either delete this file to
> force it to hit the database again or just run your report with a different
> parameter.
>
> --
> Bruce Loehle-Conger
> MVP SQL Server Reporting Services
> "Lu" <Lu@.discussions.microsoft.com> wrote in message
> news:E1372BDD-F320-4687-9390-F703136400FF@.microsoft.com...
> > When I test my query in the "Data" window (tab), I get a healthy number of
> > rows returned. When I test under the "Preview" window (tab), I get a tiny
> > fraction of rows returned. Can someone tell why this is? Perhaps,
> > something
> > in the "Layout" that I am missing? I'm not even sure how the data I am
> > getting in the "Preview" window is unique compared to the other data I did
> > not get.
> > Thanks in advance for any clues!
>
>

Data, data, who's got the data? (basic theory question)

I'm trying to understand aggregations, storage theory (MOLAP,ROLAP,HOLAP), and proactive caching and loading data into my cube.

We are currently building both an OLTP db and OLAP db 100% in SQL Server.

The OLTP users will add several hundred thousand rows a day.

I'm going to create an SSIS package to initially take data from the OLTP db and load it into the OLAP db and then build my cube.

Now my understanding becomes fuzzy. I've built partitions and created aggregations and enabled proactive caching, but I'm not sure why I am doing so. When talking about real time MOLAP and proactive caching and MOLAP to ROLAP to MOLAP, the various literature refers to "changes in the relational data" triggering various actions based on my settings. Are the authors referring to changes in the original OLTP db or changes in the OLAP db. If the latter, then changes will only result when I run another SSIS package to update my data.

Or does this discussion assume I am using a UDM db which seems to say I can build my star schema on top of the OLTP db. But then who has that data?

Where the heck is my data?

You want to update your cubes after data has been added to your relational OLAP database. Your relational OLAP database is the source of the data for your cubes so there really isn't any reason to process your cubes until they have been updated.

Though data will be added to your OLTP system throughout the day, how frequently will you be adding data to your OLAP database? If it's on a fixed schedule, you may want to do without the proactive caching and just start the processing yourself as part of your SSIS package.

Regarding the UDM, I would not recommend building your DSV using references to tables in the OLTP system. Though this can be done, it just seems like this is bound to cause you problems.

Good luck,

Bryan

|||

My problem is my users want "near real time" data so I'm looking for solutions. Proactive caching seems like a piece of the puzzle, but I wasn't sure what the data source was for the proactive caching.

If all the hubbub is about detecting data that changes in the OLAP db, I don't get it. Data will only change in the OLAP db when I tell it to (run an SSIS package), so I would think I'd know about it and could go ahead and process the cube.

|||

What's nice about the Proactive Caching is that you could have SSAS handle the processing on your behalf. You can configure it to detect a change but then wait for some period of time for things to settle down before processing. During that time, you might have 2 or 3 ETL cycles, some with changes and some without. Really just depends on your needs. If you can process quickly so that it doesn't interfere with your ETL cycles, then your suggested approach is probably the best for you.

Bryan

|||Does it "detect a change" in the OLAP db or the original OLTP db? I suspect OLAP, which is only going to change when I run an SSIS update so I don't see much value to detecting a change sincs I am causing the change and am fully aware of when it happens. As part of my SSIS package I suppose I could trigger the reprossing of my cube.|||

Just so anyone else reading this is clear on what we are talking about, let me define a few things. OLTP and OLAP refer to how data is used. OLTP systems typically support entity-relationship models while OLAP systems typically support dimensional models (though others are available).

Both entity-relationship and dimensional models are supported through relational database technologies. The SQL Server Database Engine is an example of a relational database technology. Dimensional models are also supported by multidimensional database technologies, such as SQL Server Analysis Services, which store data in a different kind of structure and allow access through languages like MDX.

So, when you say "OLAP db", I interpret that to mean the relational database containing your dimensional model supporting OLAP activities, and when you say "OLTP db", I interpret that to mean the relational database containing your entity-relationship model supporting OLTP activities.

Whew!

With that out of the way, the answer to your question is normally you are watching your OLAP db for changes. Watching isn't the most accurate way to describe this. You can certainly configure SSAS to evaluate the underlying tables for changes. However, you can also configure some relational database technologies to notify SSAS when a change occurs. And there are a ton of other options that make ProActive caching really useful in some scenarios.

Still, my general philosophy is to keep your system as simple as possible while achieving your critical goals (one of which is supportability!). If you can avoid using proactive caching by taking a more straightforward approach while still meeting your goals, I would recommend doing so.

Good luck,

Bryan

|||

I'm sorry, I did leave out quite a bit, but yes, that is what I am talking about. In my case the only reason to have a "middle" OLAP db is to create some calculated fields and pull together some columns from different tables. I could probably do most everything during cube processing and build it right on my OLTP which is a single SQL Server db. But the main reason I'm using the OLAP "staging" db is so I am not placing an additional load on my OLTP db which is expected to receive up to a million new rows of data a day. I foresee huge latency and functionality issues and am trying to get ahead of the curve.

But, from your answer, it looks like I can use either db to trigger a cache update (if that's the proper SSAS term).

Thanks.

|||

Just wanted to mention one thing.... You refer to your OLAP db as a "staging" database. We often implement a full-blown dimensional model in a relational database. Lots of reasons for this. (The Kimball books go into this in detail.)

This database + the OLAP cube serve as the data warehouse layer of our BI infrastructure. You could build a cube off a DSV that assembles data right out of your OLTP database, but we don't often recommend this.

Good Luck,
Bryan

|||

Yes. The main reason I'm building an OLAP db is to offload data retrieval requests from the main db. So I tend to think of it as a "staging" db, which really refers to a real db one might require were one gathering data from diverse data sources.

We are currently building both the OLTP db and BI solution. In theory, I could build my cube right on top of the live db (OLTP) or a snapshot of it or some other replicated image. My "truth" lies in my OLTP db, changes in which I may be able to use to update caching. I was reading the Sam's book and it referred to changes in the relational database triggering updates to your cache and I just wondered WHICH relational db.

Given the constant updating that is going to be done on the OLTP source, I probably am better off to update caching and/or cube processing on a timed schedule. I realize there are a couple of configurable "windows" but my data updates probably will be constant throughout the day. Of course clients want terabyte size data cubes to generate instantaneous reports on data with nano-second latency.

I'm trying to get my arms around the basic theory of this new (to me) data model (multi-dimensional). I can recite Codd's 12 rules in my sleep (or is it 13...I think I need a nap), which is perhaps part of my problem: I am trying to force everything through a relational prism.

But you've been quite helpful.

Thanks.

John

Data XML file to SQL Server Table...

Hi,

I have the following XML document, which I want to put into a SQL Server 2000 table. How can I do that? The table name is "Orders". I am using C#.

<Order>
  <Date>2007/09/02</Date>
  <OrderID>200</OrderID>
  <PartNumber id="ABC10">
    <Delivery_Date>
    </Delivery_Date>
    <Delivery_Quanity>
    </Delivery_Quanity>
    <Price>
    </Price>
  </PartNumber>
  <PartNumber id="ABC98">
    <Delivery_Date>
    </Delivery_Date>
    <Delivery_Quanity>
    </Delivery_Quanity>
    <Price>
    </Price>
  </PartNumber>
  <PartNumber id="ABC11">
    <Delivery_Date>
    </Delivery_Date>
    <Delivery_Quanity>
    </Delivery_Quanity>
    <Price>
    </Price>
  </PartNumber>
</Order>

Any help would be highly appreciated...

Regards,

Hi again,

Just to explain my question again,

I want the table to look like the following,

Orders:

Auto: Date: OrderID: Partnumber: Delivery_Date: Delivery_Quantity: Price:

0001 2007/09/02 200 ABC10 0 1 0

0002 2007/09/02 200 ABC10 0 2 0

0003 2007/09/02 200 ABC98 0 56 0

0004 2007/09/02 200 ABC11 0 34 0

Orders.XML

<Order>
  <Date>2007/09/02</Date>
  <OrderID>200</OrderID>
  <PartNumber id="ABC10">
    <Delivery_Date> <!-- Can have multiple Delivery_Date, Delivery_Quantity and Price -->
    </Delivery_Date>
    <Delivery_Quanity>
      1
    </Delivery_Quanity>
    <Price>
    </Price>
    <Delivery_Date> <!-- Can have multiple Delivery_Date, Delivery_Quantity and Price -->
    </Delivery_Date>
    <Delivery_Quanity>
      2
    </Delivery_Quanity>
    <Price>
    </Price>
  </PartNumber>
  <PartNumber id="ABC98">
    <Delivery_Date>
    </Delivery_Date>
    <Delivery_Quanity>
      56
    </Delivery_Quanity>
    <Price>
    </Price>
  </PartNumber>
  <PartNumber id="ABC11">
    <Delivery_Date>
    </Delivery_Date>
    <Delivery_Quanity>
      34
    </Delivery_Quanity>
    <Price>
    </Price>
  </PartNumber>
</Order>
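For what it's worth, one possible approach is to walk the document with System.Xml and insert one row per Delivery_Quanity element. A rough, untested sketch (in VB.NET rather than C#; the connection string and the exact Orders column list are assumptions):

Imports System.Xml
Imports System.Data.SqlClient

Sub LoadOrders()
    Dim doc As New XmlDocument()
    doc.Load("Orders.xml")

    Dim orderDate As String = doc.SelectSingleNode("/Order/Date").InnerText
    Dim orderId As String = doc.SelectSingleNode("/Order/OrderID").InnerText

    Using conn As New SqlConnection("<your connection string>")
        conn.Open()
        For Each part As XmlNode In doc.SelectNodes("/Order/PartNumber")
            Dim partNumber As String = part.Attributes("id").Value
            ' One row per delivery quantity under this part number.
            For Each qty As XmlNode In part.SelectNodes("Delivery_Quanity")
                Dim cmd As New SqlCommand( _
                    "INSERT INTO Orders ([Date], OrderID, PartNumber, Delivery_Quantity) " & _
                    "VALUES (@Date, @OrderID, @PartNumber, @Qty)", conn)
                cmd.Parameters.AddWithValue("@Date", orderDate)
                cmd.Parameters.AddWithValue("@OrderID", orderId)
                cmd.Parameters.AddWithValue("@PartNumber", partNumber)
                cmd.Parameters.AddWithValue("@Qty", qty.InnerText.Trim())
                cmd.ExecuteNonQuery()
            Next
        Next
    End Using
End Sub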

Regards,

data with apostrophe in it

How do I add data that has an apostrophe in it?
For example,
update logon set question='Mother's maiden name'
won't work as there is an apostrophe in it. I need to be able to allow
apostrophes in some of my fields.
Thanks,
Tom

Delimit with a second apostrophe:
update logon set question='Mother''s maiden name'
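(A sketch of the parameter route recommended at the end of this reply, in VB - the column size and the connection object are assumptions:)

Dim cmd As New System.Data.SqlClient.SqlCommand("update logon set question = @Question", connection)
cmd.Parameters.Add("@Question", System.Data.SqlDbType.VarChar, 100).Value = "Mother's maiden name"
cmd.ExecuteNonQuery()   ' the embedded apostrophe needs no escaping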
If you're passing this in through a client, then use a parameter, which you
assign the string to. That will handle the apostrophes for you.|||Hi tshad
for this u need to use 2 single apostrophies
update logon set question='Mother''s maiden name'
best Regards,
Chandra
---
"tshad" wrote:

> How do I add data that has an apostrophe in it?
> For example,
> update logon set question='Mother's maiden name'
> won't work as there is an apostrophe in it. I need to be able to allow
> apostrophes in some of my fields.
> Thanks,
> Tom
>
>|||
Hi tshad,
If you are passing the value in from a client (e.g. VB), escape the apostrophes with a string Replace first.
Example:
Dim sSQL As String
Dim sParam As String
sParam = Replace("Mother's maiden name", "'", "''")
sSQL = "update logon set question='" & sParam & "'"
"Chandra" <Chandra@.discussions.microsoft.com> wrote in message news:23556560-7C78-481D-A84F
-C59DC9C0F8CB@.microsoft.com...
> Hi tshad
> for this u need to use 2 single apostrophies
>
> update logon set question='Mother''s maiden name'
>
>
> --
> best Regards,
> Chandra
> ---
>
>
>
> "tshad" wrote:
>|||... or use a command object with parameter objects. That will take care of
it for you.
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
"Kriste L" <whiteegg@.hotmail.com> wrote in message news:e24XPHFVFHA.628@.tk2m
sftngp13.phx.gbl...
Hi tshad,
If you are passing eg. via a client VB, for those parameters, parse with a r
eplace string function
1st
Example:
Dim sSQL as string
Dim sParam as string
sParam = replace(Mother's maiden name, '"'", "''") -- note: the
bold are double quotes
while unbold are single quote
sSQL = "update logon set question='" & sParam & "'" -- note: the bold ar
e double quotes while
unbold are single quote
"Chandra" <Chandra@.discussions.microsoft.com> wrote in message
news:23556560-7C78-481D-A84F-C59DC9C0F8CB@.microsoft.com...
> Hi tshad
> for this u need to use 2 single apostrophies
> update logon set question='Mother''s maiden name'
>
> --
> best Regards,
> Chandra
> ---
>
> "tshad" wrote:
>|||"Tibor Karaszi" <tibor_please.no.email_karaszi@.hotmail.nomail.com> wrote in
message news:e4GFjKHVFHA.3544@.TK2MSFTNGP12.phx.gbl...
> ... or use a command object with parameter objects. That will take care of
> it for you.
It did.
Thanks,
Tom

> --
> Tibor Karaszi, SQL Server MVP
> http://www.karaszi.com/sqlserver/default.asp
> http://www.solidqualitylearning.com/
>
> "Kriste L" <whiteegg@.hotmail.com> wrote in message
> news:e24XPHFVFHA.628@.tk2msftngp13.phx.gbl...
> Hi tshad,
> If you are passing eg. via a client VB, for those parameters, parse with a
> replace string function 1st
> Example:
> Dim sSQL as string
> Dim sParam as string
> sParam = replace(Mother's maiden name, '"'", "''") -- note:
> the bold are double quotes while unbold are single quote
> sSQL = "update logon set question='" & sParam & "'" -- note: the bold
> are double quotes while unbold are single quote
>
>
> "Chandra" <Chandra@.discussions.microsoft.com> wrote in message
> news:23556560-7C78-481D-A84F-C59DC9C0F8CB@.microsoft.com...
>|||"Rob Farley" <RobFarley@.discussions.microsoft.com> wrote in message
news:82BCBE73-3EF2-4A48-AF62-0D296EAB0C9D@.microsoft.com...
> Delimit with a second apostrophe:
> update logon set question='Mother''s maiden name'
>
I assume I would do a replace on all my strings that a user would put
apostrophes, first.
Tom

> If you're passing this in through a client, then use a parameter, which
> you
> assign the string to. That will handle the apostrophes for you.
I would use both.
Thanks,
Tom

Data will not show on report preview

I have a proc that combines the results from 2 queries. I can see all of the
data in QA and the Data tab of SRS there are no issues with that part. When
I preview the report only the second part shows up, nothing from the first
query. I have started over a few times, still the same problem. Any ideas?
Thanks

Make sure your SP returns only one resultset. RS does not handle multiple
resultsets. I.e. you should only have one select statement. If you have
multiple it will only use the first one.
Bruce Loehle-Conger
MVP SQL Server Reporting Services
"darwin" <darwin@.discussions.microsoft.com> wrote in message
news:4AE2623F-3AC3-456C-BC85-72B228C0C1FD@.microsoft.com...
> I have a proc that combines the results from 2 queries. I can see all of
the
> data in QA and the Data tab of SRS there are no issues with that part.
When
> I preview the report only the second part shows up, nothing from the first
> query. I have started over a few times, still the same problem. Any
ideas'
> Thanks|||There is only one select statement, it joins 2 temp tables. The record set
looks fine in SRS Data tab and in QA, just not in the actual report.
"Bruce L-C [MVP]" wrote:
> Make sure your SP returns only one resultset. RS does not handle multiple
> resultsets. I.e. you should only have one select statement. If you have
> multiple it will only use the first one.
>
> --
> Bruce Loehle-Conger
> MVP SQL Server Reporting Services
>
> "darwin" <darwin@.discussions.microsoft.com> wrote in message
> news:4AE2623F-3AC3-456C-BC85-72B228C0C1FD@.microsoft.com...
> > I have a proc that combines the results from 2 queries. I can see all of
> the
> > data in QA and the Data tab of SRS there are no issues with that part.
> When
> > I preview the report only the second part shows up, nothing from the first
> > query. I have started over a few times, still the same problem. Any
> ideas'
> > Thanks
>
>|||Try it using the generic query designer in the data tab. The code used there
is exactly what is used during report preview time.
--
Brian Welcker
Group Program Manager
Microsoft SQL Server Reporting Services
This posting is provided "AS IS" with no warranties, and confers no rights.
"darwin" <darwin@.discussions.microsoft.com> wrote in message
news:913433BC-A552-4559-8CD8-7FC508719402@.microsoft.com...
> There is only one select statement, it joins 2 temp tables. The record set
> looks fine in SRS Data tab and in QA, just not in the actual report.
> "Bruce L-C [MVP]" wrote:
>> Make sure your SP returns only one resultset. RS does not handle multiple
>> resultsets. I.e. you should only have one select statement. If you have
>> multiple it will only use the first one.
>>
>> --
>> Bruce Loehle-Conger
>> MVP SQL Server Reporting Services
>>
>> "darwin" <darwin@.discussions.microsoft.com> wrote in message
>> news:4AE2623F-3AC3-456C-BC85-72B228C0C1FD@.microsoft.com...
>> > I have a proc that combines the results from 2 queries. I can see all
>> > of
>> the
>> > data in QA and the Data tab of SRS there are no issues with that part.
>> When
>> > I preview the report only the second part shows up, nothing from the
>> > first
>> > query. I have started over a few times, still the same problem. Any
>> ideas'
>> > Thanks
>>|||This should work. One thing to make sure of with your SP is that you do not
explicitly drop your temp tables. Just let them fall out of scope. Also,
have your select be the last statement.
Bruce Loehle-Conger
MVP SQL Server Reporting Services
"darwin" <darwin@.discussions.microsoft.com> wrote in message
news:913433BC-A552-4559-8CD8-7FC508719402@.microsoft.com...
> There is only one select statement, it joins 2 temp tables. The record set
> looks fine in SRS Data tab and in QA, just not in the actual report.
> "Bruce L-C [MVP]" wrote:
>> Make sure your SP returns only one resultset. RS does not handle multiple
>> resultsets. I.e. you should only have one select statement. If you have
>> multiple it will only use the first one.
>>
>> --
>> Bruce Loehle-Conger
>> MVP SQL Server Reporting Services
>>
>> "darwin" <darwin@.discussions.microsoft.com> wrote in message
>> news:4AE2623F-3AC3-456C-BC85-72B228C0C1FD@.microsoft.com...
>> > I have a proc that combines the results from 2 queries. I can see all
>> > of
>> the
>> > data in QA and the Data tab of SRS there are no issues with that part.
>> When
>> > I preview the report only the second part shows up, nothing from the
>> > first
>> > query. I have started over a few times, still the same problem. Any
>> ideas'
>> > Thanks
>>|||I started all over again, this time I created the parameter dataset first
then added the call to the stored proc. This worked, no clue as to why. As
to the temp tables, I have always dropped them after the last select with no
issues.
thanks all ;)
"Bruce L-C [MVP]" wrote:
> This should work. One thing to make sure of with your SP is that you do not
> explicitly drop your temp tables. Just let them fall out of scope. Also,
> have your select be the last statement.
>
> --
> Bruce Loehle-Conger
> MVP SQL Server Reporting Services
> "darwin" <darwin@.discussions.microsoft.com> wrote in message
> news:913433BC-A552-4559-8CD8-7FC508719402@.microsoft.com...
> > There is only one select statement, it joins 2 temp tables. The record set
> > looks fine in SRS Data tab and in QA, just not in the actual report.
> >
> > "Bruce L-C [MVP]" wrote:
> >
> >> Make sure your SP returns only one resultset. RS does not handle multiple
> >> resultsets. I.e. you should only have one select statement. If you have
> >> multiple it will only use the first one.
> >>
> >>
> >> --
> >> Bruce Loehle-Conger
> >> MVP SQL Server Reporting Services
> >>
> >>
> >> "darwin" <darwin@.discussions.microsoft.com> wrote in message
> >> news:4AE2623F-3AC3-456C-BC85-72B228C0C1FD@.microsoft.com...
> >> > I have a proc that combines the results from 2 queries. I can see all
> >> > of
> >> the
> >> > data in QA and the Data tab of SRS there are no issues with that part.
> >> When
> >> > I preview the report only the second part shows up, nothing from the
> >> > first
> >> > query. I have started over a few times, still the same problem. Any
> >> ideas'
> >> > Thanks
> >>
> >>
> >>
>
>

Data Warehousing:Best way to write SP?

I'm writing a set of queries designed to test the data quality of our data warehouse at the fact level. The intent is to ensure that there are no keys at the fact level that do not exist at the dimension level.

I'm dealing with 8 fact tables, 8 dimension tables and 9 keys. I'm currently doing:

SELECT COUNT(DISTINCT key)
FROM t_fact
WHERE key NOT IN
(
SELECT DISTINCT key FROM t_dimension
)

If I do one of those for each key-fact table combo, there are about 50 queries in total. Not every key exists in every fact table.

I'm a Stored Procedure novice. What is the best way to check all of the fact tables, aside from running 50 counts with subqueries? If I run the queries one fact table at a time, it will take about 30 minutes. I've tried to run one query per fact table, by counting all keys, and doing a subselect to each dimension table, but got misleading results.

Any tips will be greatly appreciated. Abandoning data warehousing isn't a current option!

Mike

In Oracle I would find missing keys like this:

SELECT key FROM t_fact
MINUS
SELECT key FROM t_dimension

BTW, why not use a foreign key constraint to ensure all fact keys are based on dimension keys?

Data Warehousing question

I asked some of this question in another group so pardon my cross posting.
I'm rather bored right now and really curious about this.
Do most of the BI folks do the technical stuff on their own, or do they
have the technical people do it for them? Is it common to have someone doing
the technical work but not a lot of business involvement? If I were to go from
being a DBA to a DW guy tomorrow, would I be thrown much more in the mix
with the business folks? I never thought about this but I would have to
imagine it to be the case.
TIA, ChrisR
Sorry...

> Do most of the BI folks do the technical stuff on they're own, or do they
> have the technical people do it for them? Is it common to have someone
> doing the technical but not alot of business involvement?
This was all in regards to Data Warehousing.
"ChrisR" <noemail@.bla.com> wrote in message
news:e3e5izQpFHA.3568@.TK2MSFTNGP10.phx.gbl...
>I asked some of this question in another group so pardon my cross posting.
>Im rather bored right now and really curios about this.
>
> Do most of the BI folks do the technical stuff on they're own, or do they
> have the technical people do it for them? Is it common to have someone
> doing the technical but not alot of business involvement? If I were to go
> from being a DBA to a DW guy tomorrow, would I be thrown much more in the
> mix with the business folks? I never thought about this but I would have
> to imagine it to be the case.
> TIA, ChrisR
>
>


data warehousing olap

Can anyone help me regarding data warehousing & Business Intelligence? Does anyone have notes or links to download details & tutorials regarding InfoCubes?
regards
raj chokshi

InfoCubes is an SAP-centric term; are you interested in SAP Business Warehouse or in data warehousing? If you are interested in DW, I recommend you start by reading Kimball's articles: http://intelligententerprise.com/ports/search_dw_fund.shtml

If you are interested in BW, you might start at the SAP developer network here https://www.sdn.sap.com/

HTH|||Hi

You referred me to the SDN site link. I did get connected and registered myself but, to my dissatisfaction, I was not able to find any serious content regarding BW. I do have the help .pdf file but it is huge, and the links regarding the learning procedure are also not clear. It would be more helpful if you have any such content relating to BW to share with me.

regards
raj|||I'm fairly new to SAP BW, from what I hear they have forums there that are supposed to be pretty good. Another place to try is the SAP Service marketplace at https://websmp204.sap-ag.de/~SAPIDP/002006825000000234912001E

HTH

Data Warehousing Learning Sample

Greetings,
I'm new with data warehousing. Just wondering, do any of you know of a
free sample/example that would help me learn the ABC's of data
warehousing?
The SQL 2000 sample is only available with SQL Enterprise..
Thanks in advance,
Don
The Analysis Services installation comes with a sample database called Food
Mart 2000 which is an OLAP database that sits on top of an .mdb file. Is
this the one that you have already investigated?
Jay Nathan, MCP
http://www.jaynathan.com/blog
"don larry" <donlarry17@.hotmail.com> wrote in message
news:O4t01EM3EHA.3932@.TK2MSFTNGP12.phx.gbl...
> Greetings,
> I'm new with data warehousing. Just wondering, do any of you know of a
> free sample/example that would help me learn the ABC's of data
> warehousing?
> The SQL 2000 sample is only available with SQL Enterprise..
> Thanks in advance,
> Don
|||Greetings,
Hi Jay, it is a great sample. Just what I was looking for.
Thank you,
Don

Data warehousing design

Hi

We are starting with designing a datawarehouse for my company. I have done some reading on the concepts and steps involved, but what I am seriously lacking is some examples. I'd like to read through some real examples of data warehouses that worked including the full design diagrams.
Can anyone direct me to some good sites for this?

Thanks,
TeaaZA

If you are building a true data warehouse, then get the books written by Bill Inmon. If you are building a small data mart, then you might consider Kimball's books.

Data warehousing - Design

I'm designing a data warehouse where I'm taking data from an
operational database. I'm creating dimension tables and
fact tables, and my question is the following.
When creating dimension tables I'm sometimes taking a
column from tables in the operational database where
there is no indexing; should I create an index for that
column in the data warehouse?
When creating fact tables and dimension tables for
Analysis Services, should I have dimension tables with
primary keys, and fact tables with foreign keys to the
dimension tables?
Does someone know of a good article on designing a
data warehouse for Analysis Services?
Looking forward to hearing from you
Regards
Jón Sveinsson

Try these
http://www.microsoft.com/technet/tr...part5/c2061.asp
http://www.ralphkimball.com/html/articles.html
HTH
Ray Higdon MCSE, MCDBA, CCNA
--
"Jn Sveinsson" <jon@.nordural.is> wrote in message
news:035701c3fab5$f1f389c0$a401280a@.phx.gbl...
I'm designing data wareouse where Im taking Data from an
operational Database, Im creating dimension tables and
facta tables, my question is following.
When creating dimension tables Im sometimes taking
column from tables in the operational database where
there is no indexing, should I create index for the
column in the data warehouse ?.
When creating facta tables and dimension tables for
Analyser services should I have dimension tables with
primary keys, and facta with foreign keys to the
dimension tables.
Does somone knows about some good article for designing
Data warehouse for Analyser services
Looking forward to hear from you
Regards
Jn Sveinsson

Data Warehousing

Hi,
I have read a few pages of the Microsoft Data Warehouse Toolkit by Ralph
Kimball. I find it a little difficult and vague to understand the concepts, so I
kindly request anybody who has worked with data warehousing to recommend
me a book that gives a clear picture to start with, so I can then concentrate on
the design part.
Thank you

Maybe try this book:
Microsoft SQL Server 2005 Analysis Services Step by Step
but I don't know what its contents are.
"vidhya" <vidhya@.chellasoft.com> wrote in message
news:u4MZEqxrGHA.4512@.TK2MSFTNGP02.phx.gbl...
> HAI
> i have read about few pages in Microsoft dataware housing toolkit by ralph
> kimball.i find it little difficult and vague to understand the concepts.so
> i kindly request anybody who have worked with dataware housing to
> recommened me a book to get a clear picture to start with and then to
> concentrate on design part.
> thank you
Hi, I'm Thangesh.
I want details on a data warehousing project and its design in Informatica. Please give
details, and also how to prepare a resume and technical interview questions.

data warehousing

We have some old data that we are not using; it's taking up so much space for
no reason. We want to keep it separate from the new data, so we're
planning to insert it into a separate table and still make it accessible
to the customers whenever they want to access the archived data. Now I
really need your advice on this. What is the best way to do this? There are 8
million rows in this table; how do I implement this archiving system?
Thank you.

I would put it in a separate database... that way it could be backed up and
restored separately... It would also make it easier to move to another
server if (later) you decided to do so.
--
Wayne Snyder, MCDBA, SQL Server MVP
Computer Education Services Corporation (CESC), Charlotte, NC
www.computeredservices.com
(Please respond only to the newsgroups.)
I support the Professional Association of SQL Server (PASS) and it's
community of SQL Server professionals.
www.sqlpass.org
"Fulya Erol" <ferol@.no.nospam.mezun.com> wrote in message
news:eU3qGrAwDHA.3744@.TK2MSFTNGP11.phx.gbl...
> We have some old data that we are not using, it's taking so much space for
> no reason, we want to keep them seperately from the new data so we 're
> planning to insert them to a seperate table and still make them accessible
> by the customers whenever they want to access the arcieved data. Now I
> really need your advice on this. What is the best way to do this? There 's
8
> million data on this table, how do I implement this arcieving system?
> Thank you.
>

data warehouse truncate question

I have an SSIS project and I have a package that drops and creates the
database and creates tables, and then a package that loads the data into the
data warehouse. Now, I got a request to only load the data and not drop
and create the database every time the package runs, because the package is
going to run every 5-10 minutes. I had never created an SSIS package until
now, so how can I load the tables each time it runs
instead of dropping and creating the database every time the package is run?
So, in a nutshell:
I need a package that loads the data only if the data does not exist, and does not
drop and create the database every time.
Any suggestions on how this can be done?
Two methods come to mind.
1) Simple insert where not exists statements for each table. Can be VERY
inefficient, but easy to construct.
2) Build system that tracks the last-inserted key value for each table.
Reference the appropriate key in the where clause of each insert statement.
If you need to also track modified data, then it gets MUCH more complicated
because you need triggers on every base table to insert the modified row's
PKs into a refresh holding table, which would also be referenced by the
warehouse update statements.
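For the first method, a rough sketch (the table and key names are invented; the same statement could just as well sit in an Execute SQL Task):

Imports System.Data.SqlClient

Dim sql As String = _
    "INSERT INTO dw.FactSales (SalesKey, Amount) " & _
    "SELECT s.SalesKey, s.Amount FROM staging.Sales AS s " & _
    "WHERE NOT EXISTS (SELECT 1 FROM dw.FactSales AS f WHERE f.SalesKey = s.SalesKey)"

Using conn As New SqlConnection("<warehouse connection string>")
    conn.Open()
    Dim cmd As New SqlCommand(sql, conn)
    cmd.ExecuteNonQuery()   ' inserts only the rows whose keys are not already present
End Using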
TheSQLGuru
President
Indicium Resources, Inc.
"Mike" <igotyourdotnet@.gmail.com> wrote in message
news:%23%23HSmywaHHA.1300@.TK2MSFTNGP02.phx.gbl...
> have an SSIS project and I have a package that drops and creates the
> database, creates tables, then I have a package that loads the data into
> the
> data warehouse. Now, I got the request in to only load the data and not
> drop
> and create the database every time the package runs due to the package is
> going to run every 5-10 minutes. Now I never created an SSIS package until
> now, so how can I accomplish in only loading the tables every time its ran
> instead of dropping and creating the database every time the package is
> ran?
> so in a nutshell.
> I need a package to load the data only if the data does not exsist and not
> drop and create the database every time.
> any suggestions on how this can be done?
>
>


Data Warehouse Structure and Indexing when using MOLAP

Hi there,

I am building a cube in AS2005 which is made up of one fact table (20 million rows) and 7 dimension tables. I am planning to use MOLAP storage and my question is this: is it important to add indexes, primary keys, and foreign keys to my data source to improve performance? I can appreciate that it might speed up processing time, but since all the data and aggregations would be stored in the cube, I can't see how it would have any impact on performance (querying the cube).

If there is a benefit to adding indexes, keys, etc., then could somebody please tell me the best strategy, i.e. where to put them, clustered or nonclustered, etc.

Many thanks,

David

I have the same opinion: with MOLAP, indexes will only help when you process the cube. MOLAP is separated from the data source once the cube is finished/processed.

If you use the same datamart for Reporting Services you will benefit when running queries.

The strategy for indexes is clustered indexes on the dimension primary keys and non-clustered indexes on their foreign keys in the fact table. A non-clustered composite index on all the foreign keys in the fact table is also recommended. A clustered index on the fact table is not recommended.
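A rough illustration of that strategy, using made-up dimension and fact tables (DimCustomer, FactSales) rather than anything from the original question:

-- Clustered index on the dimension's surrogate primary key.
CREATE TABLE dbo.DimCustomer (
    CustomerKey int NOT NULL CONSTRAINT PK_DimCustomer PRIMARY KEY CLUSTERED,
    CustomerName nvarchar(100) NOT NULL
);

-- Non-clustered indexes on each foreign key in the fact table, plus one
-- composite index over all of them; no clustered index on the fact table itself.
CREATE NONCLUSTERED INDEX IX_FactSales_CustomerKey ON dbo.FactSales (CustomerKey);
CREATE NONCLUSTERED INDEX IX_FactSales_DateKey ON dbo.FactSales (DateKey);
CREATE NONCLUSTERED INDEX IX_FactSales_AllKeys ON dbo.FactSales (CustomerKey, DateKey, ProductKey);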

You will pay for using indexes when you do inserts, deletes and updates, so it is often recommended to drop the indexes when you load your data mart or data warehouse and recreate them when the update/load is finished.

Primary and foreign keys are helpful if you want errors to surface while you are loading the data mart/warehouse. With them you know that all fact records have related dimension members; otherwise these errors will only appear when you process the cube.

Here is a web site I can recommend for performance issues:

http://www.sql-server-performance.com/default.asp

Regards

Thomas Ivarsson


Data warehouse question

Hi,
I am building a data warehouse in SQL 2005 and have the following
questions I want to clarify.
1. Any method or suggestion for incremental loading of the data warehouse? If
I create a field in each record in my source table and select only the ones that haven't
been loaded yet in the ETL, is that a reasonable method, or is there a better suggestion?
2. If there are schema changes in the source table and the data warehouse,
is it necessary (or best practice) to reload all data?
Thanks a lot.
Ryan
Ryan:
I am stumbling through my ETL system's initial design and have had to work
this one out as well. Take the following bit of free advice with caution:
I created a tracking table in a separate db with TableName and LastLoadDate
columns. After running a package that is my "Initial Load" package (i.e.
truncating the whole fact table), it runs a SQL task that populates the
LastLoadDate in the corresponding record.
For incremental loads, I run two data flows, one for 'changed' rows and one
for 'new' rows:
"Changed": SELECT * FROM x WHERE [Created] < LastLoadDate AND [Modified] >
LastLoadDate. That goes into a staging table. At the end of the package,
another SQL task runs a simple UPDATE query that updates the fact table from
the staging table.
"New rows": SELECT * FROM x WHERE [Created] > LastLoadDate. That data flow
gets appended directly to the fact table.
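A hedged T-SQL sketch of those two flows, assuming a hypothetical tracking table dbo.LoadTracking(TableName, LastLoadDate), a source table src.Orders with [Created]/[Modified] columns, and pre-created staging and fact tables:

DECLARE @LastLoadDate datetime;
SELECT @LastLoadDate = LastLoadDate FROM dbo.LoadTracking WHERE TableName = 'FactOrders';

-- "Changed" flow: created before the last load but modified since; lands in a staging
-- table, from which a later UPDATE refreshes the fact table.
INSERT INTO staging.FactOrders_Changed (OrderID, CustomerKey, Amount, Created, Modified)
SELECT OrderID, CustomerKey, Amount, Created, Modified
FROM src.Orders
WHERE Created < @LastLoadDate AND Modified > @LastLoadDate;

-- "New rows" flow: appended directly to the fact table.
INSERT INTO dw.FactOrders (OrderID, CustomerKey, Amount, Created, Modified)
SELECT OrderID, CustomerKey, Amount, Created, Modified
FROM src.Orders
WHERE Created > @LastLoadDate;

-- Record the new high-water mark once the load succeeds.
UPDATE dbo.LoadTracking SET LastLoadDate = GETDATE() WHERE TableName = 'FactOrders';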
Check out books by Ralph Kimball on Microsoft Data Warehouse Design.
HTH
Todd Chittenden
"Ryan" wrote:

> Hi,
> I am building the data warehouse in SQL 2005 and have the following
> questions want to clarify.
> 1. Any method and suggestion for incremental loading for data warehouse? If
> I create a field in each record in my source table and select the one havent
> been loaded in ETL, is it a method or any other better suggestion?
> 2. If there are schema changes in the source table and the data warehouse,
> is it necessary (or best practise) to reload all data?
> Thanks a lot.
> Ryan


Data Warehouse Presentation

Hello

I have been tasked to give an introductory presentation on data warehousing to health professionals and health information analysts. I would be grateful if you could share with me any slides and/or presentations on the topic that you might have. Anything that graphically explains the concept of cubes and their applications would be much appreciated.

Help. Thanks.

If you would like to give a general presentation about data warehousing in the health industry, you should try running a search for general information about data warehousing and see how some of it applies to the health industry.

This forum is focusing mostly on questions about the Microsoft SQL Server Analysis Services product.


You might find interesting a case study of a data warehouse implementation using Analysis Services by one of the most prominent Israeli health providers: http://members.microsoft.com/CustomerEvidence/Search/EvidenceDetails.aspx?EvidenceID=13653&LanguageID=1

Hope that helps.

Edward.
--
This posting is provided "AS IS" with no warranties, and confers no rights.

Data warehouse or data mart

Hi,
I'm working on a project at the moment where we have an existing warehouse, and
need to bring in data from new source systems, which are from a similar (but
not quite the same) part of the business. These two parts of the business
merged recently, and they want to have a consolidated view of the data, as
well as replace some existing (Excel and Access based) reporting.
We are coming to the end of the analysis and design phase, and I want to get
some opinions on whether the approach I'm taking is valid.
I have tried to bring everything into the one database. I've added some new
fact tables for new source systems, as the original and new source systems
don't quite work the same. I've also tried to conform the dimensions as far
as possible, so that the data will aggregate consistently for the management
reporting.
This seems to fit pretty well, and I haven't had to alter the current data
model very much.
My question is basically: Is this the right way to go, or am I better off
breaking the warehouse into separate data marts?
A lot of the reporting will only apply to each part of the business, but
there will be some management reporting that wants a consolidated view of the
new merged business.
Any comments will be appreciated.
Thanks.
Hi,
In my view, there is no need to break them into separate data
marts in this situation, because they have a similar model and need to be
merged in the same analysis report. You may consider creating different
cubes, and you could at least share some common dimensions such as
time/location etc.
Regards,
Peter Yang
MCSE2000/2003, MCSA, MCDBA
Microsoft Online Partner Support
When responding to posts, please "Reply to Group" via your newsreader so
that others may learn and benefit from your issue.
=====================================================
This posting is provided "AS IS" with no warranties, and confers no rights.
--
|||Hi Peter,
That's pretty much the approach I'm taking. The new source systems will have
their own fact tables, with conformed measures, joining to conformed
dimensions.
I'll build separate cubes for the fact tables, and put the old and new cubes
in a single virtual cube, joined on the common dimensions.
Thanks,
Wreck.
"Peter Yang [MSFT]" wrote:

> Hi,
> Based on my scope, there is no necessary to break them into seperate data
> marts for this situation because they have similar model and need to be
> merged in the same analysis report. You may consider to create different
> cube and you could at least share some common dimensions such as
> time/location etc.
> Regards,
> Peter Yang
> MCSE2000/2003, MCSA, MCDBA
> Microsoft Online Partner Support
> When responding to posts, please "Reply to Group" via your newsreader so
> that others may learn and benefit from your issue.
> ================================================== ===
>
> This posting is provided "AS IS" with no warranties, and confers no rights.
>
> --
>


Data Warehouse Nulls

Hello..

I was wondering if anyone out there could tell me how they deal with
NULL values in a data warehouse? I am looking to implement a warehouse
in SQL 2005 and have some fields which will have NULL values and I
would like some further ideas on how to deal with them. At my last job
in dealing with Oracle we were just going to leave the fields NULL, but
in SQL Server how would you best recommend cleaning the data? I greatly
appreciate your help and look forward to your responses.

Thank you.
Very interesting question. The answer is "it depends."

As a general rule, I'll usually leave money as NULL. My logic is that
if a dollar amount is unknown, that is different than the dollar amount
zero, and I probably need to deal with the unknowns wherever and
whenever the amounts are shown to the end user. This leads to
interesting discussions with the users, as you get to explain to them
the issues, and ask them what they want the defaults to be, or whether
they want to skip the data, and how they want the reports to be
documented.

For text fields, I will usually convert to ' '. OLAP likes that
better. Sometimes I will convert NULLs to 'BLANK'. It just kind of
depends.
Dates need to stay null. A null date is the easiest thing to deal
with.
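As a rough sketch of those conventions in a load query (the claim-related table and column names below are invented purely for illustration):

-- Money stays NULL (unknown is not the same as zero); text NULLs become a blank
-- or a 'BLANK' marker; dates stay NULL.
INSERT INTO dw.FactClaims (ClaimAmount, ClaimStatus, ClaimClosedDate)
SELECT
    ClaimAmount,                    -- left NULL when unknown
    ISNULL(ClaimStatus, 'BLANK'),   -- or ISNULL(ClaimStatus, ' ')
    ClaimClosedDate                 -- left NULL while the claim is still open
FROM staging.Claims;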

Does this answer your questions? Are you doing OLAP? You might try
creating a cube with your denormalized data. OLAP is pretty neat for
data warehouses where users want to extract data.
Regards,
Doug|||Thanks so much Doug! Your answer is of great help!

Data Warehouse Load Hints?

(SQL 2000) So if we are loading large tables while no one else is logged in
and no other processes will occur, should we provide hints for the table
inserts and updates for a performance boost? If so, which hints? Can any of them
help avoid deadlocks? And will this be different for 2005?
Does setting the db to single-user help speed up updates?
Thanks
Greg C
A deadlock is when two processes have locked resources (typically in an
uncommitted transaction) that the other needs to continue. How would this
happen if the ETL is the only process running?
http://support.microsoft.com/defaul...kb;en-us;169960
http://msdn.microsoft.com/library/d...a_3hdf.asp
As for maximizing the performance of your ETL, I would recommend the "bulk
load" method, because it supports features such as minimal transaction
logging.
http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/incbulkload.mspx
Setting the database to DBO Use Only would also be a good idea; it may
improve performance, but mostly to ensure that no one and nothing else logs
in.
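One way that advice might be applied, assuming a flat-file source and that the database's recovery model and the table's indexes meet the prerequisites for minimal logging (the database, table, and file path below are placeholders):

-- Keep everyone else out while the load runs (roughly the "DBO Use Only" idea).
ALTER DATABASE MyWarehouse SET RESTRICTED_USER WITH ROLLBACK IMMEDIATE;

-- Bulk load with a table lock; with the right recovery model and indexes dropped,
-- this can qualify for minimal transaction logging.
BULK INSERT dbo.FactSales
FROM 'D:\loads\factsales.txt'
WITH (TABLOCK, FIELDTERMINATOR = '|', ROWTERMINATOR = '\n');

ALTER DATABASE MyWarehouse SET MULTI_USER;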
"gc" <nospam@.hotmail.com> wrote in message
news:Xns977C861DB2E90nospam@.24.93.43.121...
> (SQL 2000) So if we are loading large tables while no one else is logged
> in
> and no other processes will occur, should we provide hints for the table
> inserts and updates for a performance boost? If so, which hints? Can any
> help avoid deadlocks? And will this be different for 2005?
> Does setting the db to singleuser help speed-up updates?
> Thanks
> Greg C
>
>|||Just a couple of tips that I've learned:
1. If you have a clustered index, make sure that it's located on a
monotonically increasing value. This will minimize page splitting.
2. The clustered index does not have to be your primary key; in fact,
if you use a natural key, then it should not be. Our data is very date
and time sensitive, so I index the date and time of load; others use a
sequential numbering system (like an identity column), but I needed the
date of load for other reasons, and I prefer not to add columns that
have no meaning.
3. Depending on the size of your data, it may be more appropriate to
drop all of your non-clustered indexes, load the data, and rebuild the
indexes afterwards (see the sketch below).
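A minimal sketch of tip 3, using made-up index and table names:

-- Drop the non-clustered indexes before the load...
DROP INDEX FactSales.IX_FactSales_CustomerKey;
DROP INDEX FactSales.IX_FactSales_DateKey;

-- ...run the load here (DTS / BULK INSERT / INSERT ... SELECT)...

-- ...then rebuild the indexes afterwards.
CREATE NONCLUSTERED INDEX IX_FactSales_CustomerKey ON FactSales (CustomerKey);
CREATE NONCLUSTERED INDEX IX_FactSales_DateKey ON FactSales (DateKey);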
HTH,
Stu|||"JT" <someone@.microsoft.com> wrote in
news:uTboj28PGHA.720@.TK2MSFTNGP14.phx.gbl:
Well...you are asking someone who really needs to ask the question. Did
that make sense? :-) I am not a dba, and have learned through trial and
error what seems to work, at least most of the time.
I thought 'bulk load' recovery was just beneficial when importing text
files or using 'select into', no?
In our ETL, we have a number of procedures (which can call other
procedures) to test and process the data after the raw data is imported
w/ DTS. The DTS part is always fast; some of the subsequent processing
will lock up periodically...I assume this would be a deadlock or lock
escalation? (When I look at the locks I will see many pages on one
object.) It just happened to occur on one machine...I stopped and
restarted w/o mods to the code and the next time it processed w/o
halting.
I drop indexes for the DTS; for the data transformation I optimize the SQL
for performance, individual statements are broken into transactions,
most of the largest updates/inserts are broken into smaller transactions,
temp tables are used when needed, and I aggressively manage the multiple data
and log files on multiple RAIDs and hard drives. But I have not been
able to understand the hints, or more importantly why a process will run
fine most times and then decide to freeze once (btw, I also check for open
transactions to make certain that there was no minor error leaving an
open tran).
thanks for the references

> A deadlock is when two processes have locked resources (typically in
> an uncommitted transaction) that the other needs to continue. How
> would this happen if the ETL is the only process running?
> http://support.microsoft.com/defaul...kb;en-us;169960
> http://msdn.microsoft.com/library/default.asp
?url=/library/en-us/acdata/ac_8_con_7a_3hdf.asp As for maximizing the
performance of your ETL, I
> would reccomend the "bulk load" method, becuase it supports features
> such as minimal transaction logging.
http://www.microsoft.com/technet/pr...tain/incbulkloa
d.mspx Setting the database to DBO Use Only would also be a good
> idea; it may improve performance, but mostly to insure that noone or
> nothing else logs in.
> "gc" <nospam@.hotmail.com> wrote in message
> news:Xns977C861DB2E90nospam@.24.93.43.121...
>
>|||"Stu" <stuart.ainsworth@.gmail.com> wrote in news:1141535576.352832.290970
@.e56g2000cwe.googlegroups.com:

> 1. If you have a clustered index, make sure that it's located on a
> monotonically increasing value. This will minimize page splitting.
Not sure that I understand this one.
If I load some data into the field of the clustered key, are you saying
that it makes a difference what order I load the records in?
If so, is that different depending on whether the clustered index is the primary key
or not?

> 2. The clustered index does not have to be your primary key; in fact,
> if you use a natural key, then it should not be. Our data is very date
> and time sensitive, so I index the date and time of load; others use a
> sequential numbering system (like an identity column), but I needed the
> date of load for other reasons, and I prefer not to add columns that
> have no meaning.
I use natural keys for something like a client id, but convert almost all
of the dimensions to surrogate (identity) keys.

> 3 Depending on the size of your data, it may be more appropriate to
> drop all of your non-clustered indexes , load the data, and rebuild the
> indexes.
>
I do this every time. 1.5 out of 3... :-)|||You can tell if a process is being blocked by another process by executing
sp_who2 and seeing if the [BlkBy] column for a spid (process id) contains
the spid of another, blocking process.
When importing large amounts of data, use the bulk copy program (BCP.EXE) or
BULK INSERT command.
http://msdn.microsoft.com/library/d...>
bcp_9esz.asp
Try reducing the batch size of your bulk copy process:
http://support.microsoft.com/defaul...=kb;en-us;81339
Consider what impact transaction logging is having on your process. There
are a few conditions (such as dropping indexes and specifying the TABLOCK
hint) that are required for minimal logging to take effect.
http://support.microsoft.com/defaul...=kb;en-us;59462
http://support.microsoft.com/defaul...kb;en-us;110139
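A short illustration of both points - checking the BlkBy column and bounding the batch size - with made-up names and sizes:

-- A non-empty BlkBy value in the output shows which spid is blocking this one.
EXEC sp_who2;

-- A bounded batch size makes each batch commit (and log) separately instead of
-- as one huge transaction.
BULK INSERT dbo.FactSales
FROM 'D:\loads\factsales.txt'
WITH (TABLOCK, BATCHSIZE = 50000);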
"gc" <nospam@.hotmail.com> wrote in message
news:Xns977CEEB88B493nospam@.24.93.43.121...
> "JT" <someone@.microsoft.com> wrote in
> news:uTboj28PGHA.720@.TK2MSFTNGP14.phx.gbl:
> Well...you are asking someone who really needs to ask the question. Did
> that make sense? :-) I am not a dba, and have learned through trial and
> error what seems to work, at least most of the time.
> I thought 'bulk load' recovery was just beneficial when importing text
> files or using 'select into', no?
> In our ETL, we have a number of procedures (which can call other
> procedures) to test and process the data after the raw data is imported
> w/ DTS. The DTS part is always fast, some of the subsequent processing
> will lock-up periodically...I assume this would be a deadlock or lock
> escalation? (When I look at the locks I will see many pages on one
> object). It just happened to occurr on one machine...I stopped and
> restarted w/o mods to the code and the next time it processed w/o
> halting.
> I drop indexes for the DTS, for the data transformation I optimize sql
> for performance, individual statements are broken into transactions,
> most of the largest updates/inserts are broken into smaller transactions,
> temp tables are used when needed, I aggressively manage the multiple data
> and log files on multiple raids and hard drives. But I have not been
> able to understand the hints but more importantly why a process will run
> fine most times then decide to freeze once (btw, I also check for open
> transactions to make certain that there was no minor error leaving an
> open tran).
> thanks for the references
>
> ?url=/library/en-us/acdata/ac_8_con_7a_3hdf.asp As for maximizing the
> performance of your ETL, I
> http://www.microsoft.com/technet/pr...tain/incbulkloa
> d.mspx Setting the database to DBO Use Only would also be a good
>|||Ugh. Page splitting is difficult to explain; think of your table as a
blank notebook. The clustered index is kind of like a page number for
your other indexes to go and retrieve information; it's a way for the
optimizer to go figure out where row X is located in your table.
As you write data to your notebook, it begins to fill up each page. As
you continue to write, it skips to the next page; if your clustered
index does not increase in a sequential fashion, then when the optimizer
encounters a value that is out of order, it has to split the page at
the insertion point and move the records below that point on the page to a
new page; that's why you should cluster on a sequential value that is
independent of the order of the data outside of the warehouse. That's
why I recommend a datetime representation, so it increases without
If your primary key is a natural key (like a combination of clientID
and rowID), then the likelihood is that your rows will be out of order
as they load into the target database, and hence some reordering will
be necessary if you cluster on that.
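To make that concrete, here is a sketch of clustering on a load timestamp, as suggested earlier in the thread (all names are hypothetical):

-- The load datetime only ever increases, so new rows append at the end of the
-- clustered index and pages do not split mid-table.
ALTER TABLE FactSales ADD LoadDate datetime NOT NULL
    CONSTRAINT DF_FactSales_LoadDate DEFAULT (GETDATE());

CREATE CLUSTERED INDEX IX_FactSales_LoadDate ON FactSales (LoadDate);

-- Clustering on a natural key such as (ClientID, RowID) instead would mean rows
-- arrive out of clustered-key order and force page splits during the load.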
Clear as mud?
Stu|||Also consider that if a good candidate for a clustered index cannot be
found, then it is best not to implement one.
"Stu" <stuart.ainsworth@.gmail.com> wrote in message
news:1141539867.013564.240070@.e56g2000cwe.googlegroups.com...
> Ugh. Page splitting is difficult to explain; think of your table as a
> blank notebook. The clustered index is kind of like a page number for
> your other indexes to go and retrieve information; it's a way for the
> optimizer to go figure out where row X is located in your table.
> As you write data to your notebook, it begins to fill up each page. As
> you continue to write, it skips to the next page; if your clustered
> index does not increase in a sequential fashion, when the optimizer
> encounters a a value that is out of order, it has to split the page at
> the insertion point, and move records below that point on a page to a
> new page; that's why you should cluster on a sequential value that is
> independent of the order of the data outside of the warehouse. That's
> why I recommend a datetime representation so it increases without
> splitting the pages.
> If your primary key is a natural key (like a combination of clientID
> and rowID), then the likelihood is that your rows will be out of order
> as they load into the target database, and hence some reordering will
> be necessary if you cluster on that.
> Clear as mud?
> Stu
>|||gc (nospam@.hotmail.com) writes:
> In our ETL, we have a number of procedures (which can call other
> procedures) to test and process the data after the raw data is imported
> w/ DTS. The DTS part is always fast, some of the subsequent processing
> will lock-up periodically...I assume this would be a deadlock or lock
> escalation? (When I look at the locks I will see many pages on one
> object). It just happened to occurr on one machine...I stopped and
> restarted w/o mods to the code and the next time it processed w/o
> halting.
A deadlock is when two (or more) processes are waiting for each other
to release resources. SQL Server detects a deadlock, and will select one
of the processes as a deadlock victim and cancel execution for that
process, so that at least one of the processes can continue working.
If your processing "locks up", this can be due to blocking, but also due
to long-running queries. With the information you have provided, it is
difficult to tell. As JT said, use sp_who2 to check for blocking.
Another issue to watch out for is auto-grow of the database. By default,
SQL Server auto-grows a database by 10% when you run out of space. If
your database is huge - and data warehouses often are - then 10% can
take quite some time to grow. This can be dealt with by expanding the database
to a reasonable size in advance.
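For instance, growing the data file once, up front (the database, logical file name, and sizes are placeholders):

-- Pre-size the data file and use a fixed growth increment instead of paying for
-- repeated 10% auto-grow events in the middle of a load.
ALTER DATABASE MyWarehouse
MODIFY FILE (NAME = MyWarehouse_Data, SIZE = 50000MB, FILEGROWTH = 1000MB);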
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx