cdun2


Hello,

In a Data Flow Task, I have an insert that occurs into a SQL Server 2000 table from a fixed width flat file. The SQL Server table that the data goes into is accessed through an OLE DB connection manager that uses the Native OLE DB\Microsoft OLE DB Provider for SQL Server.

In the OLE DB Destination, I changed the access mode from Table or View - fast load to Table or View because I needed to implement OLE DB Destination Error Output. The Error output goes to a SQL Server 2000 table that uses the same connection manager.

The OLE DB Destination Editor Error Output 'Error' option is configured to 'Redirect' the row. 'Set this value to selected cells' is set to 'Fail component'.

Was changing the access mode the simple reason why the insert from the flat file takes so much longer, or could there be other problems

Thank you for your help!

cdun2




Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

Phil Brammer


Yes, that's the reason... You always want to use fast load if you can.  I don't understand why you got rid of using it...  OLE DB Destinations still have error outputs when using the fast load option.

http://msdn2.microsoft.com/en-us/library/ms141237.aspx






Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

Rafael Salas

Phil,

I am not quite sure you can have both at the same time; fast load option processes batches of rows, then when a row causes an error the OLE DB Destination fails; so there is not chance of sending rows to the error output *I think*.

To the original poster:

By using 'table or view' option the destination component will commit after each row is inserted; then you are able to use the Error output but as you saw it is very slow. You may want to catch any possible faulty row before it reaches the destination and send it to a different data pipeline; that way you can take advantage of the Fast load option







Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

Phil Brammer

 Rafael Salas wrote:

Phil,

I am not quite sure you can have both at the same time; fast load option processes batches of rows, then when a row causes an error the OLE DB Destination fails; so there is not chance of sending rows to the error output *I think*.



Works okay for me. Just turned on a primary key and reloaded a table using fast load... Ran it once with no error output and it fails as a result of primary key violations. Redirected the OLE DB Dest error output to a row counter and all rows processed were redirected.  Doesn't generate a valuable error message in those rows when inspecting with a data viewer, but then I wouldn't think it could.  The whole batch is thrown to the error output, not just the row that caused the error though.





Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

Phil Brammer

I just want to clarify... When using the fast load option, if one row is in error, the whole batch as indicated by "Rows per batch" is considered in error. So if you have 10,000 records, and the batch size is 1000, and the error record is in the 500th line, the first batch of 1000 records will be redirected to the error output (If the redirect option is set). Then the remaining 9,000 lines will continue processing.





Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

Phil Brammer

The other option for the original poster is to use a lookup in midstream to see and test records before inserting them, and if they are in error, you can redirect those to whereever, while pushing the valid records to an OLE DB Destination using fast load. If at all possible, use the processing logic of SSIS to test your records -- don't let the database do it for you.





Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

cdun2

In this case, the SQL Server table has a Unique Non-Clustered index on it with IGNORE_DUP_KEY. The error that occurs during the insert from the flat file is 'Duplicate Key was Ignored'. The error output sends the duplicate record to another SQL Server table. I could just ignore the error, but I need to know what the duplicate record is.



Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

Phil Brammer

 cdun2 wrote:
In this case, the SQL Server table has a Unique Non-Clustered index on it with IGNORE_DUP_KEY. The error that occurs during the insert from the flat file is 'Duplicate Key was Ignored'. The error output sends the duplicate record to another SQL Server table. I could just ignore the error, but I need to know what the duplicate record is.


So use a lookup and join on the key fields. If you have a match, this record would be in error (duplicate key). The "error" output of the lookup would be your records that are not duplicate and hence can be loaded via fast load. Use a non-caching lookup if you need current real-time results, but that will obviously be slower.





Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

cdun2

Thanks, I will try this approach.

cdun2





Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

cdun2

Wait...I'm confused...my source is a fixed witdth flat file. The destination is a SQL Server table. I have not used a Lookup Transformation before. Is the idea to somehow lookup records from the flat file In the Lookup Transformation Editor, I can only use an OLE DB connection manager. How can a lookup help uncover duplicate records in the flat file if there is nothing to look up in the destination SQL Server table



Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

Phil Brammer

 cdun2 wrote:
Wait...I'm confused...my source is a fixed witdth flat file. The destination is a SQL Server table. I have not used a Lookup Transformation before. Is the idea to somehow lookup records from the flat file In the Lookup Transformation Editor, I can only use an OLE DB connection manager. How can a lookup help uncover duplicate records in the flat file if there is nothing to look up in the destination SQL Server table


No, you use a lookup against your destination. Then you join the fields from your flat file (in the data flow) to the lookup table.  This will tell you if the flat file record already exists in the destination.

If you want to filter out duplicates in your flat file, use a sort transformation, checking the discard duplicate values box.





Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

cdun2

Thanks again. I might try it both ways, as a I may get a 'new opinion' on saving the duplicate records.



Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

cdun2

I'll work with the Lookup option first, so please bear with me as I try to understand how this should work.

I have the Lookup Transformation set up between my flat file source, and the Lookup Error Output and Lookup Output. The Lookup Transformation Reference Table is called ProblemRecords_DailySettlementData. This is where I intend for the matching (duplicate) records to go, so I have this table set as my Lookup Output.

I have my Lookup error output configured so that a table called Staging_DailySettlementData is populated when no match is made in the lookup (no duplicate records). This table has a unique nonclustered index on it with IGNORE_DUP_KEY. I have mapped only the unique columns between the source and the destination (columns 3,5,6,7,60).

I have Fast Load set up for both the output and the error output.

I'm not sure what the Lookup Operation should be for the mapped columns. I have it set up to <add as new column> for all columns.

According to how I understand this, when I run the DataFlow, The lookup should not find a matching record in ProblemRecords_DailySettlementData (the Lookup Output), and send the unmatched records to the Error Output. Otherwise the matching record (duplicate) should go to ProblemRecords_DailySettlementData.

When I run the dataflow, everything goes to the Error Output table, even the matching records. The Insert into the Error Output table fails when a duplicate is encountered.

What have I not set up correctly

Thanks again!





Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

Phil Brammer

Lookups are case sensitive, and you need to be sure that you are trimming any trailing whitespace that might be on CHAR columns.

The mapped columns should just be the key columns.  You don't need to add any output columns from the lookup table.  Just hook the tables up and use the outputs of the lookup transformation to direct the traffic.  The error output (red line) are new records; the "normal" output (green line) are duplicates.





Re: Slow Running Insert After Changing OLE DB Destination Data Access Mode

Rafael Salas

See if this can take you further:

http://blogs.conchango.com/jamiethomson/archive/2006/09/12/SSIS_3A00_-Checking-if-a-row-exists-and-if-it-does_2C00_-has-it-changed.aspx

If you are getting everything in the error output, means either all rows are non-duplicates or the mappings in the columns tab of the LUp transform is wrong. It may also be the result of trailing/leading blank spaces diffrent casing that prevent the match...