Friday, February 24, 2012

Data Validation

Being relatively new to SSIS, I'm looking for advice, or a best practice, regarding data validation before extracting the data for a transformation.

One of my project's require that certain data be validated in staging tables before it is loaded. The validations include checking for null values, verifying that a field is populated with apropriate values etc... The entire batch of data (good records and bad records) may be rejected depending on the validations.

I have a couple of different thoughts on how this could be handled...

    Run a series of validation queries on the data before executing an SSIS package Run some kind of validation transformation (does one exist or should I write a custom transformation?) Place contraints on the target tables so that bad records error out on the load Something else... I could be missing the completely obvious

#3 doesn't seem to viable as the entire load may be rejected if some of the data is bad...

Any thoughts?

You could filter out the bad data very easily using a Conditional Split transform. Better still, you can pipe that bad data somewhere for later analysis.

Does that help?

-Jamie

|||I like both of thoses suggestions. With the conditional split transform, could I then update the original record with a status flag?|||

Once you pipe the data off elsewhere from the Conditional Split you can do what you want with it so yes, you should be able to do this! You would use an OLE DB Command transform.

You may have a problem (i.e. blocking) when trying to update a table that you are selecting from in the source - but this can be easily alleviated by dropping the update dataset into a raw file and then doing the update in a seperate data-flow.

-Jamie

|||

Thank you!

I was just testing an update using the OLE DB Command Transform. The raw file seems like good work around if the row is locked!

No comments:

Post a Comment