Database corrupted: Database backup issue

I am trying to figure out how disk space is used when overwriting backup
files. Here is a description of our situation...
We have a 700+GB reporting database (SQL 2000) that we backup weekly to a 1
TB drive. This has worked fine for quite a while but we have seen the
server crash twice in the last month while performing the backup. The
backup is set to backup to 5 files which overwrite the previous weeks backup
(the old backup has already been backed up to tape at this point). The
backups are run via a scheduled job, but when the server has crashed we see
no evidence that it ever ran in the job history. The dba that set up the
job never set it up to go to a log file so we don't have that to look at
yet. The only thing I do have are a great deal of messages in the SQL
Server log that occured while the last backup was running. The following
messages began appearing a couple hours after the backup began and continued
until the server was rebooted.
SQL Server has encountered 3 occurrence(s) of IO requests taking longer than
15 seconds to complete on file [V:\MSSQL\Data\tempdev_2_Data.NDF] in
database [tempdb] (2). The OS file handle is 0x00000598. The offset of the
latest long IO is: 0x000004cab00000
Time out occurred while waiting for buffer latch type 4,bp 0x129ffc0, page
3:2305080), stat 0x40d, object ID 2:1682105033:0, EC 0x2015A3F0 : 0,
waittime 300. Not continuing to wait.
Waiting for type 0x4, current count 0x100022, current owning EC 0x2015A3F0.
Just some additional info on what I mean by "the server crashed". This
server is both a database server and a webserver which runs and admin
website used to monitor database extracts. At some point the server stopped
responding to all forms of remote control, and the actual screen in our
datacenter went black. No one could log onto the server or see anything.
Enterprise manager seemd to be connecting on, although it was extremely
slow, and the website was working as well. When the 4 hour backup had run a
full 12 hours over the normal time we had the datacenter do a hard power
down and restart the server. Everything came up fine.
Getting back to the backups, and my question...
I know when I run backups on large databases I will see the zero byte
file(s) created and the size doesn't appear to change until the backup is
complete. I'm wondering if the data is somehow being cached or stored in
temporary files until all the data is available then it written all at once.
I can't imagine SQL Server is caching 750GB, and if it uses temporary files
on the disk I would think my backups should have been failing for months.
It was suggested by our server admins that it may be overwriting the
previous backups but not updating the file system tables to show this, which
means we would have corrupt backup files on the server.
For the time being I have the job set up to delete the previous backups
prior to beginning the new backup, which I hope alleviates the problem.
Regardless of whether this works or not, I want to understand what is
happening and what my disk space requirements are going forward.
Can anyone explain how the disk space would be utilized in this situation?
Thanks in advance.
- Jim
What is this 1TB drive? Is it USB? The IO errors are because of a disk
bottleneck where the data resides but the lockup and black screen is another
issue. It sounds like a drive or hardware issue.
Jason Massie
www: http://statisticsio.com
rss: http://feeds.feedburner.com/statisticsio
"Jim Underwood" <james.underwood_nospam@.fallonclinic.org> wrote in message
news:etxeMjveIHA.2000@.TK2MSFTNGP05.phx.gbl...
>I am trying to figure out how disk space is used when overwriting backup
>files. Here is a description of our situation...
> We have a 700+GB reporting database (SQL 2000) that we backup weekly to a
> 1
> TB drive. This has worked fine for quite a while but we have seen the
> server crash twice in the last month while performing the backup. The
> backup is set to backup to 5 files which overwrite the previous weeks
> backup
> (the old backup has already been backed up to tape at this point). The
> backups are run via a scheduled job, but when the server has crashed we
> see
> no evidence that it ever ran in the job history. The dba that set up the
> job never set it up to go to a log file so we don't have that to look at
> yet. The only thing I do have are a great deal of messages in the SQL
> Server log that occured while the last backup was running. The following
> messages began appearing a couple hours after the backup began and
> continued
> until the server was rebooted.
> SQL Server has encountered 3 occurrence(s) of IO requests taking longer
> than
> 15 seconds to complete on file [V:\MSSQL\Data\tempdev_2_Data.NDF] in
> database [tempdb] (2). The OS file handle is 0x00000598. The offset of
> the
> latest long IO is: 0x000004cab00000
> Time out occurred while waiting for buffer latch type 4,bp 0x129ffc0, page
> 3:2305080), stat 0x40d, object ID 2:1682105033:0, EC 0x2015A3F0 : 0,
> waittime 300. Not continuing to wait.
> Waiting for type 0x4, current count 0x100022, current owning EC
> 0x2015A3F0.
> Just some additional info on what I mean by "the server crashed". This
> server is both a database server and a webserver which runs and admin
> website used to monitor database extracts. At some point the server
> stopped
> responding to all forms of remote control, and the actual screen in our
> datacenter went black. No one could log onto the server or see anything.
> Enterprise manager seemd to be connecting on, although it was extremely
> slow, and the website was working as well. When the 4 hour backup had run
> a
> full 12 hours over the normal time we had the datacenter do a hard power
> down and restart the server. Everything came up fine.
> Getting back to the backups, and my question...
> I know when I run backups on large databases I will see the zero byte
> file(s) created and the size doesn't appear to change until the backup is
> complete. I'm wondering if the data is somehow being cached or stored in
> temporary files until all the data is available then it written all at
> once.
> I can't imagine SQL Server is caching 750GB, and if it uses temporary
> files
> on the disk I would think my backups should have been failing for months.
> It was suggested by our server admins that it may be overwriting the
> previous backups but not updating the file system tables to show this,
> which
> means we would have corrupt backup files on the server.
> For the time being I have the job set up to delete the previous backups
> prior to beginning the new backup, which I hope alleviates the problem.
> Regardless of whether this works or not, I want to understand what is
> happening and what my disk space requirements are going forward.
> Can anyone explain how the disk space would be utilized in this situation?
> Thanks in advance.
> - Jim
>
>
|||Thanks for the response.
The backup drive is a local raid5 drive atttached internally. We did just
upgrade the firmware on wednesday to eliminate that as a potential cause of
the problem. However, I'm actually more interested in how SQL Server
handles the space when backing up a database over existing backup files.
The hardware issues are somethign of a lesser priority for me, since we have
folks far better equipped than I to handle those issues.
Should I have double the space available to handle temporary files and the
actual backup files? Does SQL Server overwrite the existing files but not
show the file changes in windows until it completes (I assume this would
make the files corrupt if the backup gets interrupted)?
"jason" <jason-r3move@.statisticsio.com> wrote in message
news:B41F9781-564C-4B80-AD12-651B85C5E39C@.microsoft.com...
> What is this 1TB drive? Is it USB? The IO errors are because of a disk
> bottleneck where the data resides but the lockup and black screen is
> another issue. It sounds like a drive or hardware issue.
> --
> Jason Massie
> www: http://statisticsio.com
> rss: http://feeds.feedburner.com/statisticsio
>
> "Jim Underwood" <james.underwood_nospam@.fallonclinic.org> wrote in message
> news:etxeMjveIHA.2000@.TK2MSFTNGP05.phx.gbl...
>
|||Jim, The backups do not overwrite existing files. - RLF
"Jim Underwood" <james.underwood_nospam@.fallonclinic.org> wrote in message
news:exS9yJxeIHA.1204@.TK2MSFTNGP03.phx.gbl...
> Thanks for the response.
> The backup drive is a local raid5 drive atttached internally. We did just
> upgrade the firmware on wednesday to eliminate that as a potential cause
> of the problem. However, I'm actually more interested in how SQL Server
> handles the space when backing up a database over existing backup files.
> The hardware issues are somethign of a lesser priority for me, since we
> have folks far better equipped than I to handle those issues.
> Should I have double the space available to handle temporary files and the
> actual backup files? Does SQL Server overwrite the existing files but not
> show the file changes in windows until it completes (I assume this would
> make the files corrupt if the backup gets interrupted)?
>
> "jason" <jason-r3move@.statisticsio.com> wrote in message
> news:B41F9781-564C-4B80-AD12-651B85C5E39C@.microsoft.com...
>
|||Russel, can you elaborate? We specify the name of the backup files, so the
files are replaced by the new backup. When the backup is complete the same
file names now contain the new backup and old backup is gone.
I should have noted the command we are using to perform the backup...
BACKUP DATABASE MyDB
TO DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_01.BAK',
DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_02.BAK',
DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_03.BAK',
DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_04.BAK',
DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_05.BAK'
WITH INIT
"Russell Fields" <russellfields@.nomail.com> wrote in message
news:eJdomOxeIHA.1188@.TK2MSFTNGP04.phx.gbl...
> Jim, The backups do not overwrite existing files. - RLF
> "Jim Underwood" <james.underwood_nospam@.fallonclinic.org> wrote in message
> news:exS9yJxeIHA.1204@.TK2MSFTNGP03.phx.gbl...
>
|||The WITH INIT option forces the overwrite.
I do not know the answer to your question about when the old file
space is released. Based on the events you reported I would assume
you should continue to delete the files before the next run, as you
are doing now.
Roy Harvey
Beacon Falls, CT
On Fri, 29 Feb 2008 16:45:34 -0500, "Jim Underwood"
<james.underwood_nospam@.fallonclinic.org> wrote:

>Russel, can you elaborate? We specify the name of the backup files, so the
>files are replaced by the new backup. When the backup is complete the same
>file names now contain the new backup and old backup is gone.
>I should have noted the command we are using to perform the backup...
>BACKUP DATABASE MyDB
>TO DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_01.BAK',
> DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_02.BAK',
> DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_03.BAK',
> DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_04.BAK',
> DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_05.BAK'
>WITH INIT
>
>"Russell Fields" <russellfields@.nomail.com> wrote in message
>news:eJdomOxeIHA.1188@.TK2MSFTNGP04.phx.gbl...
>
|||Jim,
This error is referring to tempdb so I suspect there is more to it than you
think. Is tempdb on the same physical array as the db you are trying to
backup? Are you using compression software? If not you probably should
consider it as it will make your backup files much smaller and probably
eliminate any issue with disk space. But in the mean time you might want to
try and delete the old backup before you attempt the new one since you
already have it on tape. Due to the I/O error I suspect your issue is
hardware related. Either the storage can't handle the load or there is a
problem somewhere. The errors of I/O taking longer than 15 minutes are sure
signs of one of the two. Using something such as LiteSpeed can dramatically
speed up an operation such as this.
Andrew J. Kelly SQL MVP
Solid Quality Mentors
"Jim Underwood" <james.underwood_nospam@.fallonclinic.org> wrote in message
news:etxeMjveIHA.2000@.TK2MSFTNGP05.phx.gbl...
>I am trying to figure out how disk space is used when overwriting backup
>files. Here is a description of our situation...
> We have a 700+GB reporting database (SQL 2000) that we backup weekly to a
> 1
> TB drive. This has worked fine for quite a while but we have seen the
> server crash twice in the last month while performing the backup. The
> backup is set to backup to 5 files which overwrite the previous weeks
> backup
> (the old backup has already been backed up to tape at this point). The
> backups are run via a scheduled job, but when the server has crashed we
> see
> no evidence that it ever ran in the job history. The dba that set up the
> job never set it up to go to a log file so we don't have that to look at
> yet. The only thing I do have are a great deal of messages in the SQL
> Server log that occured while the last backup was running. The following
> messages began appearing a couple hours after the backup began and
> continued
> until the server was rebooted.
> SQL Server has encountered 3 occurrence(s) of IO requests taking longer
> than
> 15 seconds to complete on file [V:\MSSQL\Data\tempdev_2_Data.NDF] in
> database [tempdb] (2). The OS file handle is 0x00000598. The offset of
> the
> latest long IO is: 0x000004cab00000
> Time out occurred while waiting for buffer latch type 4,bp 0x129ffc0, page
> 3:2305080), stat 0x40d, object ID 2:1682105033:0, EC 0x2015A3F0 : 0,
> waittime 300. Not continuing to wait.
> Waiting for type 0x4, current count 0x100022, current owning EC
> 0x2015A3F0.
> Just some additional info on what I mean by "the server crashed". This
> server is both a database server and a webserver which runs and admin
> website used to monitor database extracts. At some point the server
> stopped
> responding to all forms of remote control, and the actual screen in our
> datacenter went black. No one could log onto the server or see anything.
> Enterprise manager seemd to be connecting on, although it was extremely
> slow, and the website was working as well. When the 4 hour backup had run
> a
> full 12 hours over the normal time we had the datacenter do a hard power
> down and restart the server. Everything came up fine.
> Getting back to the backups, and my question...
> I know when I run backups on large databases I will see the zero byte
> file(s) created and the size doesn't appear to change until the backup is
> complete. I'm wondering if the data is somehow being cached or stored in
> temporary files until all the data is available then it written all at
> once.
> I can't imagine SQL Server is caching 750GB, and if it uses temporary
> files
> on the disk I would think my backups should have been failing for months.
> It was suggested by our server admins that it may be overwriting the
> previous backups but not updating the file system tables to show this,
> which
> means we would have corrupt backup files on the server.
> For the time being I have the job set up to delete the previous backups
> prior to beginning the new backup, which I hope alleviates the problem.
> Regardless of whether this works or not, I want to understand what is
> happening and what my disk space requirements are going forward.
> Can anyone explain how the disk space would be utilized in this situation?
> Thanks in advance.
> - Jim
>
>
|||Jim,
As best I can tell from experimentation, if you use NOINIT there is some
hiding and renaming of files going on. If the backup completes
successfully, it deletes the old backup and replaces it. But it does not
directly overwrite. Experiment with a 500 MB database:
BACKUP DATABASE [Test] TO
DISK = N'D:\Backup\Test.BAK'
WITH NOFORMAT, NOINIT, NAME = N'Test-Full Database Backup',
SKIP, NOREWIND, NOUNLOAD, STATS = 10
Run to completion, it creates George.BAK. Wait a minute, then run it again,
let it get about 70 percent done, then cancel the second backup. The file
date of George.BAK is updated with the time of the second backup, but the
contents are still from the original backup. You can verify by:
RESTORE DATABASE [Test] FROM
DISK = N'D:\Backup\Test.BAK'
WITH FILE = 1, NOUNLOAD, REPLACE, STATS = 10
It restores successfully from the first backup.
However, if you use INIT, then the second backup will overwrite the
original, but when you cancel the backup you will not be left with a useful
file.
If you do the second backup to a different backup file Test1.BAK, it will
appear as 0 bytes and eventually grow in size. If you cancel this backup,
Test1.BAK will disappear.
THEREFORE, my answer should be qualified by whether you use INIT or NOINIT.
RLF
"Jim Underwood" <james.underwood_nospam@.fallonclinic.org> wrote in message
news:ux5ErxxeIHA.5788@.TK2MSFTNGP02.phx.gbl...
> Russel, can you elaborate? We specify the name of the backup files, so
> the files are replaced by the new backup. When the backup is complete the
> same file names now contain the new backup and old backup is gone.
> I should have noted the command we are using to perform the backup...
> BACKUP DATABASE MyDB
> TO DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_01.BAK',
> DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_02.BAK',
> DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_03.BAK',
> DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_04.BAK',
> DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_05.BAK'
> WITH INIT
>
> "Russell Fields" <russellfields@.nomail.com> wrote in message
> news:eJdomOxeIHA.1188@.TK2MSFTNGP04.phx.gbl...
>
|||"Jim Underwood" <james.underwood_nospam@.fallonclinic.org> wrote in message
news:ux5ErxxeIHA.5788@.TK2MSFTNGP02.phx.gbl...
> Russel, can you elaborate? We specify the name of the backup files, so
> the files are replaced by the new backup. When the backup is complete the
> same file names now contain the new backup and old backup is gone.
>
If I understand what you're doing, I think you risk having something bad
happen.
Imagine this (I can because it's happened to me). 1/2 way through the
backup, it fails due to a data-corruption issue.
You know have a database with potential issues and no good backup since
you've overwritten 1/2 of your one good one.
Always keep at least one good backup when createing the next one.
I'm also not clear why you have 5 files written to the same drive.
Different physical drives can improve performance, but here I think it's
just complicating things.

> I should have noted the command we are using to perform the backup...
> BACKUP DATABASE MyDB
> TO DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_01.BAK',
> DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_02.BAK',
> DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_03.BAK',
> DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_04.BAK',
> DISK='O:\MSSQL\Backup\MyDB\MyDB_Backup-Set_05.BAK'
> WITH INIT
>
> "Russell Fields" <russellfields@.nomail.com> wrote in message
> news:eJdomOxeIHA.1188@.TK2MSFTNGP04.phx.gbl...
>
Greg Moore
SQL Server DBA Consulting Remote and Onsite available!
Email: sql (at) greenms.com http://www.greenms.com/sqlserver.html
|||Perhaps you are using too many files simulaneously and that could be causing
the RAID5 volume to become overwhelmed with write requests. Try backing up
to just one or perhaps 2 files instead of 5.
Kevin G. Boles
Indicium Resources, Inc.
SQL Server MVP
kgboles a earthlink dt net
"Jim Underwood" <james.underwood_nospam@.fallonclinic.org> wrote in message
news:etxeMjveIHA.2000@.TK2MSFTNGP05.phx.gbl...
>I am trying to figure out how disk space is used when overwriting backup
>files. Here is a description of our situation...
> We have a 700+GB reporting database (SQL 2000) that we backup weekly to a
> 1
> TB drive. This has worked fine for quite a while but we have seen the
> server crash twice in the last month while performing the backup. The
> backup is set to backup to 5 files which overwrite the previous weeks
> backup
> (the old backup has already been backed up to tape at this point). The
> backups are run via a scheduled job, but when the server has crashed we
> see
> no evidence that it ever ran in the job history. The dba that set up the
> job never set it up to go to a log file so we don't have that to look at
> yet. The only thing I do have are a great deal of messages in the SQL
> Server log that occured while the last backup was running. The following
> messages began appearing a couple hours after the backup began and
> continued
> until the server was rebooted.
> SQL Server has encountered 3 occurrence(s) of IO requests taking longer
> than
> 15 seconds to complete on file [V:\MSSQL\Data\tempdev_2_Data.NDF] in
> database [tempdb] (2). The OS file handle is 0x00000598. The offset of
> the
> latest long IO is: 0x000004cab00000
> Time out occurred while waiting for buffer latch type 4,bp 0x129ffc0, page
> 3:2305080), stat 0x40d, object ID 2:1682105033:0, EC 0x2015A3F0 : 0,
> waittime 300. Not continuing to wait.
> Waiting for type 0x4, current count 0x100022, current owning EC
> 0x2015A3F0.
> Just some additional info on what I mean by "the server crashed". This
> server is both a database server and a webserver which runs and admin
> website used to monitor database extracts. At some point the server
> stopped
> responding to all forms of remote control, and the actual screen in our
> datacenter went black. No one could log onto the server or see anything.
> Enterprise manager seemd to be connecting on, although it was extremely
> slow, and the website was working as well. When the 4 hour backup had run
> a
> full 12 hours over the normal time we had the datacenter do a hard power
> down and restart the server. Everything came up fine.
> Getting back to the backups, and my question...
> I know when I run backups on large databases I will see the zero byte
> file(s) created and the size doesn't appear to change until the backup is
> complete. I'm wondering if the data is somehow being cached or stored in
> temporary files until all the data is available then it written all at
> once.
> I can't imagine SQL Server is caching 750GB, and if it uses temporary
> files
> on the disk I would think my backups should have been failing for months.
> It was suggested by our server admins that it may be overwriting the
> previous backups but not updating the file system tables to show this,
> which
> means we would have corrupt backup files on the server.
> For the time being I have the job set up to delete the previous backups
> prior to beginning the new backup, which I hope alleviates the problem.
> Regardless of whether this works or not, I want to understand what is
> happening and what my disk space requirements are going forward.
> Can anyone explain how the disk space would be utilized in this situation?
> Thanks in advance.
> - Jim
>
>

Thursday, March 22, 2012

Database backup issue - Space requirements

No comments:

Post a Comment

Database corrupted

Blog Archive

About Me