In part1 of Database partitioning we talked about why we need it and how to implement it into SQL Server database. Also, we have seen how to add new range & filegroups to the existing partitions. Now, in this article, we would be talking about the next bit of it where we feel the need of getting rid of the existing/empty partitions.
So, before jumping onto how to achieve this, let’s see when we get around with an empty partition or we can say why we need to remove some existing partition. In other words, there is a very useful application of the database partitioning in achieving the database archiving.
Database Archiving / Moving data between tables: Now a days, it has become very useful but no expensive for keeping the older records for long for many analysis purposes. But, from the database perspective, we generally maintain the older data in terms of a separate ‘archive/history’ table and the current data (based upon some date or so) in the main table. Now, moving data from the main table to the archive table on timely basis (mostly daily) can be an expensive affair if the data volume to be moved is large enough to give a significant performance hit & can be time consuming too if we copy data from main to archive table using simple select/insert & delete statements; even if being done using smaller batches.
And this is where database partitioning proves to be an amazing approach.
First, let’s see how data is distributed across all the existing partitions into our table (dbo.Fact_Orders).
As you can see that there are 6 partitions at the moment and the oldest partition #1 which contains data for the first quarter of 2014 and having 55 rows is the recordset we need to move to the archive table and then to remove the same from this main table.
So, basically there are two parts to it, first is to move the data from main to the archive/history table i.e. SWITCH PARTITION and second is to delete this data/partition from the main table i.e. MERGE RANGE. We will discuss these one by one.
SWITCH: There can be two types of data movement. Either moving data from the partitioned table (main table) to a non-partitioned table (archive/history) table or moving data from some other partitioned table into the main partitioned table (we can say that both main as well the archive/history tables are partitioned here).
Let’s first focus on our requirement of moving data of the oldest partition from the main (partitioned) table to the history table (a non-partitioned table) i.e. ‘Switching Data Out’.
Assume there is a history table ‘dbo.Fact_Orders_History’ with exactly the same schema as that of the main table – ‘dbo.Fact_Orders’.
Now, we need to attach one partition (partition number 1) of dbo.Fact_Orders to the history table – dbo.Fact_Orders_History i.e. all the data from partition#1 to be moved into the history table. This is achieved using ALTER TABLE command.
–as of now no record is there in the history table.
–All 55 records from partition#1 are now moved to history table now.
Also, look at the partition distribution of the main table, there is no record into the partition #1 i.e. an empty partition.
Likewise, we can do ‘Switch Data In’ i.e. moving data from a partitioned table to another partitioned table. Let’s take an example where our archive table is also a partitioned table. So, basically we want to move the latest partition from the main table to the archive table and then deleting the oldest one from the archive table.
Again, this can be achieved using ALTER TABLE command. The only difference here is that earlier (while Switching Data Out) we did move data to a non-partitioned table and hence no check was actually required, but since now we are trying to attach one partition of a table to another partitioned table by specifying the partition number, data being moved should adhere to the partition number of the table data is being moved into.
Therefore, there must be some check constraint on the partitioned column of the main table as per the definition of the partition it is moved into of the archive table.
This will also move all the data from the main table to the archive table’s 5th partition.
And, in both the cases, main table’s partition will be left empty which is of no use and hence should be removed. This is achieved using MERGE RANGE command.
MERGE RANGE: This is to remove some existing partitions i.e. it drops a partition and merges any values that exist in the partition into one of the remaining partitions. Therefore it is very useful to get rid of an empty or not in use partition. Merging partitions is self-explanatory-take two partitions and make them one.
If the two partitions to be merged are empty, the operation is instantaneous (no I/O involved). If the partitions aren’t empty, data from one of the partitions is physically moved to the other (remember that each partition resides on a certain filegroup). Hence it’s advisable to empty/switch the partition not in use and then merge it.
Now, look at the data distribution again…
Partition#1, the empty one, has been removed now.
- It’s very useful in case of larger data sets to implement the database partitioning
- Helps in moving data chunks from one partitioned table to another partitioned/non-partitioned table
- After moving the data, empty partition should be removed using MERGE RANGE and then a new partition can be added (into the main table to allow new records to be inserted) using SPLIT RANGE.