Home > SSIS Best Practices > SSIS – Raw Files or Not

SSIS – Raw Files or Not

I wrote a blog, SSIS – Staging in Relational Environment or in Raw Files?, to argue that raw file format has it’s place in ETL process, but not the only choice. The blog is “inspired” by a SSIS package design I saw that has no milestone points and no staging, not in relational database, not in flat files either.

I read a blog today, 31 Days of SSIS – Raw Files Are Awesome (1/31). I agree that the raw file format can be incredibly useful.  It can be used for either a data source or data destination.

Here are some of the scenarios that the author is suggesting where raw files can be awesome:

1) if you want to build an SSIS package that processes data in multiple steps, temporary versions of the data can be stored in raw files and passed between data flows.  This will allow packages to be built such that they can be restarted at different points in the data processing.

2) Raw files can be used to exchange data between SSIS packages.  You can develop one package to convert a clients file to a standardized structure in the raw file format.  And then use a couple other packages to read that format in the next packages that process the data in a standard format.  All of this can be done without hitting the database.

3) You can export data from a database to a raw file to use as a reference file for multiple SSIS packages.  If you need to cycle through a group of IDs or states multiple time, store the data in a raw file and use it as a data cache.  Sure you can use a Lookup Cache instead – but that doesn’t fit all scenarios and not everyone is using SQL Server 2008.

Categories: SSIS Best Practices
  1. No comments yet.
  1. No trackbacks yet.

Leave a comment