Buffer Size in SSIS
Behind the scenes in SSIS, the data flow engine uses a buffer-oriented architecture to efficiently load and manipulate datasets in memory. The benefit of this in-memory processing is that you do not need to physically copy and stage data at each step of the data integration. Rather, the data flow engine manipulates data as it is transferred from source to destination.
So how many buffers does it create? How many rows fit into a single buffer? How does it impact performance?
The number of buffers created depends on how many rows fit into a buffer, which in turn depends on three factors. The first is the estimated row size, which is the sum of the maximum sizes of all the columns in the incoming records. The second is the DefaultBufferSize property of the data flow task, which specifies the default maximum size of a buffer. Its default value is 10MB, and its lower and upper bounds are set by two internal, non-configurable SSIS properties: MinBufferSize (64KB) and MaxBufferSize (100MB). In other words, a buffer can be as small as 64KB and as large as 100MB. The third is DefaultBufferMaxRows, also a property of the data flow task, which specifies the default maximum number of rows in a buffer. Its default value is 10,000.
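To make the interplay of these three factors concrete, here is a rough sketch in Python. It is not the engine's actual code: the function names are made up for illustration, and the rounding rules are a simplifying assumption. It only mirrors the behaviour described above, where the engine starts from DefaultBufferMaxRows rows and adjusts the row count when the resulting size falls outside the allowed range.

```python
# Illustrative sketch only; names and rounding are assumptions, not SSIS internals.
MIN_BUFFER_SIZE = 64 * 1024           # MinBufferSize: 64KB (internal, not configurable)
MAX_BUFFER_SIZE = 100 * 1024 * 1024   # MaxBufferSize: 100MB (internal, not configurable)


def estimate_rows_per_buffer(row_bytes,
                             default_buffer_size=10 * 1024 * 1024,
                             default_buffer_max_rows=10_000):
    """Approximate how many rows end up in one buffer."""
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    # DefaultBufferSize itself is bounded by the two internal limits.
    limit = max(MIN_BUFFER_SIZE, min(default_buffer_size, MAX_BUFFER_SIZE))
    preliminary = row_bytes * default_buffer_max_rows
    if preliminary > limit:
        # Too big: reduce the number of rows so the buffer fits.
        return max(1, limit // row_bytes)
    if preliminary < MIN_BUFFER_SIZE:
        # Too small: increase the number of rows to reach the minimum size.
        return -(-MIN_BUFFER_SIZE // row_bytes)  # ceiling division
    return default_buffer_max_rows


def estimate_buffer_count(total_rows, rows_per_buffer):
    """Number of buffers needed to move total_rows through the data flow."""
    return -(-total_rows // rows_per_buffer)  # ceiling division


if __name__ == "__main__":
    rows = estimate_rows_per_buffer(2000)          # 2,000-byte rows, default settings
    print(rows, estimate_buffer_count(1_000_000, rows))
```

With the defaults, a 500-byte row leaves the buffer at 10,000 rows, while a 2,000-byte row pushes the preliminary size past 10MB, so the row count drops.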
SSIS does a good job of tuning these properties to create an optimum number of buffers: if the estimated row size multiplied by DefaultBufferMaxRows exceeds DefaultBufferSize, it reduces the number of rows in the buffer. For better buffer performance you can do two things. First, you can remove unwanted columns from the source and set the data type of each column appropriately, especially if your source is a flat file. This lets you fit as many rows as possible into a buffer. Second, if your system has sufficient memory available, you can tune these properties to get a small number of large buffers, which could improve performance. Beware, though: if you change these values to the point where page spooling begins, performance suffers. So before you settle on values for these properties, test thoroughly in your own environment.
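As a small worked example of the first tip, the sketch below sums the maximum column widths to get an estimated row size and shows how dropping one wide, unused column changes the number of rows that fit in a 10MB buffer. The column names and byte widths are invented for illustration, and the calculation is a simplification rather than the engine's exact logic.

```python
# Hypothetical columns and widths (in bytes); not taken from any real package.
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024   # DefaultBufferSize default: 10MB
DEFAULT_BUFFER_MAX_ROWS = 10_000         # DefaultBufferMaxRows default


def estimated_row_size(columns):
    """Sum of the maximum sizes of all columns in the row."""
    return sum(columns.values())


def rows_that_fit(columns):
    """Rows per buffer, capped by both the size limit and the row limit."""
    by_size = DEFAULT_BUFFER_SIZE // estimated_row_size(columns)
    return min(DEFAULT_BUFFER_MAX_ROWS, by_size)


all_columns = {"id": 4, "name": 50, "notes": 4000, "amount": 8}
needed_columns = {k: v for k, v in all_columns.items() if k != "notes"}

if __name__ == "__main__":
    print(estimated_row_size(all_columns), rows_that_fit(all_columns))
    print(estimated_row_size(needed_columns), rows_that_fit(needed_columns))
```

In this made-up case the wide "notes" column is what keeps the buffer from holding the full 10,000 rows. Once it is removed, the row limit rather than the size limit becomes the cap.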
You can enable logging of the BufferSizeTuning event to learn how many rows a buffer contains, and you can monitor the “Buffers spooled” performance counter to see whether SSIS has begun page spooling.
More info:
Improving the Performance of the Data Flow
Integration Services: Performance Tuning Techniques
SQL Server 2005 Integration Services – Performance – Part 30
My dream is to have the possibility to adjust the buffer size dynamically from the package (so I could control it based on system parameters).
I'm also curious why SSIS does not implement a so-called “fluid” algorithm for adjusting it.
I know it does that only at design/initial creation time (it samples some rows).
You just refer to this link
http://www.sqlis.com/post/Log-Events-and-Pipeline-Events.aspx
Hi James,
I have a question: do maxbuffersize and maxbufferrows really improve the performance of a data flow task? I have adjusted those values, and sometimes it works fine and sometimes it does not. One more thing: along with adjusting these values, I have changed the RowsPerBatch value to the maxbufferrows value.