Friday, February 24, 2012

Data Validation Challenge

We have a data migration process that transfers around 450 million records from
one database to another. The migrated data is then validated, and that
validation is time-consuming (it takes a few days). The following checks are
run on the migrated data:
NULL check
Length check
Numeric precision check
So it loops through 450 million rows * 30 columns, roughly 13.5 billion
individual checks. It takes forever to complete, and the space requirements
keep growing rapidly as well :-(. Is there a better approach for validation of
this kind? We are planning to try a partitioning approach. If there is a
better way, please share your recommendations.
Regards,
Murali

Which tool are you using?
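
One possible direction, sketched under assumptions rather than as a definitive answer: instead of looping over every row and column, let the database count all violations in a single set-based scan using conditional aggregation. The table `migrated`, its columns, the `RULES` dictionary, and the helper names are all hypothetical, and SQLite is used only so the example is self-contained. The same SUM(CASE ...) pattern would need adapting to the actual database and tool.

```python
import sqlite3

# Hypothetical validation rules for a hypothetical table `migrated`.
RULES = {
    "name": {"nullable": False, "max_length": 10},
    "amount": {"nullable": False, "precision": 8, "scale": 2},
}


def build_validation_query(table, rules):
    """Build one SELECT that counts every kind of violation in a single scan."""
    parts = ["COUNT(*) AS total_rows"]
    for col, rule in rules.items():
        if not rule.get("nullable", True):
            parts.append(
                f"SUM(CASE WHEN {col} IS NULL THEN 1 ELSE 0 END) AS {col}_null"
            )
        if "max_length" in rule:
            parts.append(
                f"SUM(CASE WHEN LENGTH({col}) > {rule['max_length']} "
                f"THEN 1 ELSE 0 END) AS {col}_too_long"
            )
        if "precision" in rule:
            # Largest allowed magnitude for DECIMAL(precision, scale).
            limit = 10 ** (rule["precision"] - rule["scale"])
            parts.append(
                f"SUM(CASE WHEN ABS({col}) >= {limit} "
                f"OR ROUND({col}, {rule['scale']}) <> {col} "
                f"THEN 1 ELSE 0 END) AS {col}_bad_precision"
            )
    return f"SELECT {', '.join(parts)} FROM {table}"


def validate(conn, table, rules):
    """Run the single-pass query and return {check_name: violation_count}."""
    cur = conn.execute(build_validation_query(table, rules))
    row = cur.fetchone()
    return {desc[0]: value for desc, value in zip(cur.description, row)}


def demo():
    """Validate a tiny in-memory sample table (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE migrated (id INTEGER, name TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO migrated VALUES (?, ?, ?)",
        [
            (1, "alice", 10.5),              # passes every check
            (2, None, 12.25),                # NULL name
            (3, "a-very-long-name", 3.0),    # name longer than 10 characters
            (4, "bob", 1.125),               # more than 2 decimal places
            (5, "carol", None),              # NULL amount
            (6, "dave", 1234567.5),          # too many integer digits
        ],
    )
    result = validate(conn, "migrated", RULES)
    conn.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Because the query returns only counts, it needs very little extra space. It could also be run once per partition (for example per key range) by adding a WHERE clause, which fits the partitioning idea mentioned in the question.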
