Showing posts with label developed. Show all posts
Showing posts with label developed. Show all posts

Thursday, March 29, 2012

Database comparision

Language : VB.NET
Database : MSSQLserver2000
i have developed a tool to compare two databases, but it is taking long time to compare. First it will take the whole table into main memory(order by primary key). Same procedure for table2 of second database. Then it will start comparison by taking row by row. i tested with one third party tool, which is taking only 2 min to compare 800000 records of a table. My tool is taking 1 hour,40 mins to do that. Is there any other optimized method?

Regards,

Loka

Did you consider that instead of checking column by column, that you 'might' be able to compare a checksum() of the column, or even of the entire row?|||

Hi,

The checksum idea is quite good - it should help you speed up the process. Just a couple of more thoughts on the subject:

(a) Any implementation with a client application would require an access to the data in some form - even if you use a server-side cursor, you would have to fetch the data for comparison and it will travel (maybe even through network if you're not running the app local to SQL Server) to the application. Thus the best approach is to implement the comparison inside SQL itself - as stored procedures, queries, views, etc. Be careful though - the comparison may consume SQL resources and starve existing processes and clients. You should be able to find a ballance between the server impact and the comparison task performance.

(b) You could take advantage of some of the new T-SQL enhancements in SQL 2005 like "intersect" and "except".

(c) If in the end, a standalone application proves to be needed, use OLEDB or ODBC directly with SQL Native Client to achieve maximum performance. Plan carefully for the types of cursors you need (ideally a simple firehose should be the fastest), the potential for blocking or other isolation issues, the proximity and network utilization between client and server, etc.

HTH,

Jivko Dobrev - MSFT

--
This posting is provided "AS IS" with no warranties, and confers no rights.

|||

Some other random ideas:

1. bcp out the tables and do a file compare of the resulting output files, if nothing is different this should be really fast.

2. Comparing rows in bcp out might actually be faster than scanning records using a recordset for sparse differences, hard to tell. For example, say you bcp out the tables into c:\foo1.txt and c:\foo2.txt. Then you can scan a single line at a time from each file and just memcmp the entire line in one command. If there is a missing pkey, they you have to scan each file to resync the keys but in general it should work pretty fast I think.

3. Computing a checksum on the server side should be faster than trying to download the data and checksum downloaded data I think.

Database compare

Language : VB.NET
Database : MSSQLserver2000
i have developed a tool to compare two databases, but it is taking long time to compare. First it will take the whole table into main memory(order by primary key). Same procedure for table2 of second database. Then it will start comparison by taking row by row. i tested with one third party tool, which is taking only 2 min to compare 800000 records of a table. My tool is taking 1 hour,40 mins to do that. Is there any other optimized method?

Regards,
Lokait is likely that you wrote inefficient code. without providing us the information in Brett's sticky at the top of this page, you are unlikely to get any replies you find helpful.|||Are you taking advantage of the CHECKSUM and BINARY_CHECKSUM functions for comparing records?|||BINARY_CHECKSUM rocks! We use it for this very same situation, but use a stored proc that does a full outer join (to grab missing rows from both tables also, in addition to catching data differences with the BINARY_CHECKSUM) and this morning's run took 4.26 minutes to pass/compare two tables on two different servers each with 12,224,072 rows.

yep, we bad, we bad...|||...as long as you don't need ABSOLUTE EXACT ACCURACY, binary_checksum is very efficient.|||I seem to recall some trouble with it, but your comment irritates the hell out of...err...*blush* I mean, piques my interest :)

How can you be more absolute, exactly accurate than a binary comparison? Guess I need to go look at BOL again ;0

I was tripped up a time or two because it actually seems to be a byte-by-byte comparison. As I recall, if you are comparing two columns named "price", and one is defined as a FLOAT, and one as a MONEY, and both contain 2.45, the BINARY COMPARE will fail.

Also, even if both your columns DATE contain '12/21/2006', but one is defined as a datetime, and the other a smalldatetime, the BINARY COMPARE will fail.

In both cases, if you think about it, it makes sense. from the BINARY perspective the bits involved are going to be different because the variable size is different. But I still got tripped up with them early on.|||Thanks for your quick response... We implemented a tool to compare the databases in VB.NET and we should not put any stored procedure in the any of databases. Is there any other way to compare long datatypes in VB.NET.
if there is 10 lakh records in each database then main emory is not at all sufficient to take the whole table at a time. Please suggest me a better approach to solve this issue..|||How can you be more absolute, exactly accurate than a binary comparison? Because it is possible for two different strings to yield the same binary checksum value. In some scenarios, it is almost likely to happen.

We implemented a tool to compare the databases in VB.NET and we should not put any stored procedure in the any of databases.Your system could create a temporary stored procedure in the database and then delete it upon completion.|||Binary Checksum does exactly what the name implies, it computes a binary checksum. You can compute a sum from a row that is almost 8 Kb long, and even longer if you use TEXT or IMAGE columns. The checksum is a whole bunch smaller than the actual data being summed, so there are some combinations of data that will produce the same checksum value.

Binary Checksum is good, but it is a long way from being perfect (no checksum can be perfect in that sense).

-PatP|||Is it possible to create temporary stored procedure in the remote server?
"Binary check sum" is database function or .Net function? what is the syntax?
If you have sample stored procedure to compare two databases, please give it to me.

Regards,
Loka|||First it will take the whole table into main memory
Very bad idea. What do you do if the two tables do not fit into memory?|||Yes that's a problem, how to overcome this?|||Is it possible to create temporary stored procedure in the remote server?
"Binary check sum" is database function or .Net function? what is the syntax?
If you have sample stored procedure to compare two databases, please give it to me.

Regards,
Loka
BINARY_CHECKSUM is a SQLSVR database function.
You can create a stored procedure on the database at the start of your process using a simple CREATE PROCEDURE statement, and then drop it when you are finished with DROP PROCEDURE. The login used will need to have sufficient permissions to create procedures.|||Yes that's a problem, how to overcome this?I would recommend a cursor.

...just funnin' ya guys.

That's the purpose of the stored procedure, which can just select and (outer) join the two tables to find any differences.

And yup, you guys that dis' the checksum, binary or otherwise...I suppose you gotta know thy data, and thy checksum algorithm, and see if the risk is worth the benefit. As Mr. Pat pointed out, as soon as you start using a checksum, you introduce the possibility/probability that a checksum comparison will not catch a difference in data. There is some thought out there that the more data you scrunch into your checksum fields/components, the less likely to have this problem show up, but I am not sure I buy that at face value.

As with most things in life, it depends.

...and that is as close to philosophical as I am willing to get on Christmas Eve Eve Eve.

Thursday, March 22, 2012

Database Backup Security

Hi

We have developed and deployed a database which contanis very sensitive
information. Our client is now concerned about the security of the back
ups. In essense, if someone in the organization can get hold of the
backup of the database, he can simply restore it on any sql server in
the world with sa permission.

I know Microsoft provides flexibility of adding the "Password" in the
Backup t-sql statement but it wouldn't be of much use if the back up
task is saved as a script and password will be written inside the
script.

your suggestions are really appreciated!

Thanks<muzamil@.hotmail.com> wrote in message
news:1115055348.283887.64500@.z14g2000cwz.googlegro ups.com...
> Hi
> We have developed and deployed a database which contanis very sensitive
> information. Our client is now concerned about the security of the back
> ups. In essense, if someone in the organization can get hold of the
> backup of the database, he can simply restore it on any sql server in
> the world with sa permission.
> I know Microsoft provides flexibility of adding the "Password" in the
> Backup t-sql statement but it wouldn't be of much use if the back up
> task is saved as a script and password will be written inside the
> script.
> your suggestions are really appreciated!
> Thanks

If your client believes they cannot trust their own IT staff and/or cannot
secure their own backups, then I would suggest they have a number of serious
issues to address. On the technical side, they can implement a few standard
practices such as backing up to NTFS drives with appropriate ACLs, limiting
physical access to backup drives and tapes to a minimum number of trusted
staff, using OS-level auditing to see who accesses the files etc.
Ultimately, though, someone has to have access to backups, domain admin and
Exchange admin accounts etc., so they need to have people they can rely on,
and that probably isn't a problem you can or should try to solve for them -
it's really a human resources issue.

However, I appreciate that in reality, and especially in smaller companies,
these things may be not always be so easy. One possibility is to consider
encrypting the sensitive data using a key which is compiled into your
application. Your application can then encrypt/decrpyt the data when users
acess it, but if someone queries the tables directly then they only see the
encrypted data:

http://www.sqlsecurity.com/DesktopDefault.aspx?tabid=22

Simonsql

Thursday, March 8, 2012

Database access log

SQL 2K
We developed a VB.NET application (SQL 2k db) where users can log into the
system and extract reports. We are planning to create a report with user
list who accessed the application in last 30 days.
Can this be achieved by querying directly to the database '
Thanks
JohnHello John,
This will have to be done when your application authenticates. It will need
to create an audit of who logs in and when. This isn't a function of the
database but a function of the way your application is designed.
Aaron Weiker
http://aaronweiker.com/

> SQL 2K
> We developed a VB.NET application (SQL 2k db) where users can log into
> the system and extract reports. We are planning to create a report
> with user list who accessed the application in last 30 days.
> Can this be achieved by querying directly to the database '
> Thanks
> John|||Hi John,
You can get this information, if you would have done one of the following;
(a) Enabled logging in your VB app (the user and datetime)
(b) Logging in the database tables. Again this should have been done with
your application. (the user and datetime)
(c) Enabled Auditing in the database server.
Are you using one super user to connect to SQL Server for all the
application users or are you connecting to SQL Server for every user
accessing your VB application?
In the former case, you will always see one user at the database level.
Thanks
Yogish

Database access log

SQL 2K
We developed a VB.NET application (SQL 2k db) where users can log into the
system and extract reports. We are planning to create a report with user
list who accessed the application in last 30 days.
Can this be achieved by querying directly to the database '
Thanks
JohnHi John,
You can get this information, if you would have done one of the following;
(a) Enabled logging in your VB app (the user and datetime)
(b) Logging in the database tables. Again this should have been done with
your application. (the user and datetime)
(c) Enabled Auditing in the database server.
Are you using one super user to connect to SQL Server for all the
application users or are you connecting to SQL Server for every user
accessing your VB application?
In the former case, you will always see one user at the database level.
Thanks
Yogish

Database access log

SQL 2K
We developed a VB.NET application (SQL 2k db) where users can log into the
system and extract reports. We are planning to create a report with user
list who accessed the application in last 30 days.
Can this be achieved by querying directly to the database ?
Thanks
John
Hi John,
You can get this information, if you would have done one of the following;
(a) Enabled logging in your VB app (the user and datetime)
(b) Logging in the database tables. Again this should have been done with
your application. (the user and datetime)
(c) Enabled Auditing in the database server.
Are you using one super user to connect to SQL Server for all the
application users or are you connecting to SQL Server for every user
accessing your VB application?
In the former case, you will always see one user at the database level.
Thanks
Yogish

Database access log

SQL 2K
We developed a VB.NET application (SQL 2k db) where users can log into the
system and extract reports. We are planning to create a report with user
list who accessed the application in last 30 days.
Can this be achieved by querying directly to the database '
Thanks
JohnHi John,
You can get this information, if you would have done one of the following;
(a) Enabled logging in your VB app (the user and datetime)
(b) Logging in the database tables. Again this should have been done with
your application. (the user and datetime)
(c) Enabled Auditing in the database server.
Are you using one super user to connect to SQL Server for all the
application users or are you connecting to SQL Server for every user
accessing your VB application?
In the former case, you will always see one user at the database level.
Thanks
Yogish