Friday, 29 May 2015

Pages & Extents



The fundamental unit of data storage in SQL Server is the page. The disk space allocated to a data file (.mdf or .ndf) in a database is logically divided into pages numbered contiguously from 0 to n. Disk I/O operations are performed at the page level. That is, SQL Server reads or writes whole data pages.
Extents are a collection of eight physically contiguous pages and are used to efficiently manage the pages. All pages are stored in extents.

PAGES

In SQL Server, the page size is 8 KB. This means SQL Server databases have 128 pages per megabyte. Each page begins with a 96-byte header that is used to store system information about the page. This information includes the page number, page type, the amount of free space on the page, and the allocation unit ID of the object that owns the page.
The following table shows the page types used in the data files of a SQL Server database.



Data rows are put on the page serially, starting immediately after the header. A row offset table starts at the end of the page, and each row offset table contains one entry for each row on the page. Each entry records how far the first byte of the row is from the start of the page. The entries in the row offset table are in reverse sequence from the sequence of the rows on the page.



Large Row Support
Rows cannot span pages, however portions of the row may be moved off the row's page so that the row can actually be very large. The maximum amount of data and overhead that is contained in a single row on a page is 8,060 bytes (8 KB). However, this does not include the data stored in the Text/Image page type. This restriction is relaxed for tables that contain varchar, nvarchar, varbinary, or sql_variant columns. When the total row size of all fixed and variable columns in a table exceeds the 8,060 byte limitation, SQL Server dynamically moves one or more variable length columns to pages in the ROW_OVERFLOW_DATA allocation unit, starting with the column with the largest width. This is done whenever an insert or update operation increases the total size of the row beyond the 8060 byte limit. When a column is moved to a page in the ROW_OVERFLOW_DATA allocation unit, a 24-byte pointer on the original page in the IN_ROW_DATA allocation unit is maintained. If a subsequent operation reduces the row size, SQL Server dynamically moves the columns back to the original data page. For more information, see Row-Overflow Data Exceeding 8 KB.


Extents

Extents are the basic unit in which space is managed. An extent is eight physically contiguous pages, or 64 KB. This means SQL Server databases have 16 extents per megabyte.
To make its space allocation efficient, SQL Server does not allocate whole extents to tables with small amounts of data. SQL Server has two types of extents:
Uniform extents are owned by a single object; all eight pages in the extent can only be used by the owning object.
Mixed extents are shared by up to eight objects. Each of the eight pages in the extent can be owned by a different object.
A new table or index is generally allocated pages from mixed extents. When the table or index grows to the point that it has eight pages, it then switches to use uniform extents for subsequent allocations. If you create an index on an existing table that has enough rows to generate eight pages in the index, all allocations to the index are in uniform extents.


Managing Extent allocations
SQL server uses two types of pages to optimize this allocation process.

GAM(Global Allocation Map): GAM pages records what extents have been allocated for any use. GAM has bit for every extent. If the bit is 1, the corresponding extent is free, if the bit is 0, the corresponding extent is in use as uniform or mixed extent.A GAM page can hold information of around 64000 extents. That is, a GAM page can hold information of (64000X8X8)/1024 = 4000 MB approximately. In short,  a data file of size 7 GB will have two GAM pages.

SGAM (Shares Global Allocation Map): SGAM pages record what extents are currently being used as mixed extent and also have at least one unused page. SGAM has bit for every extent. If the bit is 1, the corresponding extent is used as a mixed extent and has at least one page free to allocate. If the bit is 0, the extent is either not used as a mixed extent or it is mixed extent and with all its pages being used. A SGAM page can hold information of 64000 extents. That is, a SGAM page can hold information of (64000X8X8)/1024 = 4000 MB. In short, a data file of size 7 GB will have two SGAM page.


GAM and SGAM pages helps the database engine in extent management. To allocate an extent, the database engine searches the GAM page for a bit 1 and set the bit to 0. If that extent is allocating as mixed extent, it sets  the corresponding extent's bit in SGAM page to 1. If that extent is allocating as uniform extent, there is no need to change the corresponding SGAM bit. To find a mixed extent with free pages, the database engine searches the SGAM page for a bit 1. If there is no free extent, the data file is full. To deallocate an extent, the database engine sets the corresponding GAM bit set to 1 and SGAM bit to 0.

In any data file, the third page(page no 2) is GAM and fourth page (page no 3) is  SGAM page. The first page (page no 0) is file header and second page (page no 1) is PFS (Page Free Space) page.  We can see the GAM and SGAM pages using DBCC page command.

 

  QUESTION-

1-What is pfs(Page free space) in sql server?

Page Free Space (PFS) pages record the allocation status of each page, whether an individual page has been allocated, and the amount of free space on each page. The PFS has one byte for each page, recording whether the page is allocated, and if so, whether it is empty, 1 to 50 percent full, 51 to 80 percent full, 81 to 95 percent full, or 96 to 100 percent full.

After an extent has been allocated to an object, the Database Engine uses the PFS pages to record which pages in the extent are allocated or free. This information is used when the Database Engine has to allocate a new page. The amount of free space in a page is only maintained for heap and Text/Image pages. It is used when the Database Engine has to find a page with free space available to hold a newly inserted row. Indexes do not require that the page free space be tracked, because the point at which to insert a new row is set by the index key values.

A PFS page is the first page after the file header page in a data file (page number 1). This is followed by a GAM page (page number 2), and then an SGAM page (page 3). There is a PFS page approximately 8,000 pages in size after the first PFS page. There is another GAM page 64,000 extents after the first GAM page on page 2, and another SGAM page 64,000 extents after the first SGAM page on page 3. The following illustration shows the sequence of pages used by the Database Engine to allocate and manage extents.



 

2-What is Extent and Page?

Extent is a basic unit of storage to provide space for tables. Every extent has a number of data pages. As new records are inserted new data, pages are allocated. There are eight data pages in an extent. So as soon as the eight pages are consumed, it allocates a new extent with data pages.

While extent is basic unit storage from a database point of view, page is a unit of allocation within extent.

OR

A page is a unit of data storage in SQL. The size of a page is 8Kb. A page has a header and a body. Different types of pages are: Date, text, index , page free space etc. The data rows are put on the page serially after the header.

Extents are units in which space is allocated to tables and indexes. An extent is 8 continuous pages. SQL Server has two types of extents: uniform and mixed extent. For efficient allocation, the SQL server does not allocate whole extents to tables with small amounts of data .

 

3-What are the Different Sections in Page?

Page has three important sections:

Page header, Actual data i.e. Data row, Row pointers or Row offset 

Page header has information like timestamp, next page number, previous page number etc.

Data rows are where your actual row data is stored. For every data row, there is a row offset which points to that data row.

 

4-What are Page Splits?

Pages are contained in extent. Every extent will have around eight data pages. But all the eight data pages are not created at once; they are created depending on data demand. So when a page becomes full it creates a new page, this process is called as “Page Split”.

 

5-What is IAM(Index Allocation Map)?

Information pertaining to extents that are used by a table or index per allocation unit.

An Index Allocation Map (IAM) page maps the extents in a 4-gigabyte (GB) part of a database file used by an allocation unit. An allocation unit is one of three types:

·         IN_ROW_DATA: Holds a partition of a heap or index.

·         LOB_DATA: Holds large object (LOB) data types, such as xml, varbinary(max), and varchar(max).

·         ROW_OVERFLOW_DATA: Holds variable length data stored in varchar, nvarchar, varbinary, or sql_variant columns that exceed the 8,060 byte row size limit.

 

6- What is BCM (Bulk Changed Map)?

This tracks the extents that have been modified by bulk logged operations since the last BACKUP LOG statement. If the bit for an extent is 1, the extent has been modified by a bulk logged operation after the last BACKUP LOG statement. If the bit is 0, the extent has not been modified by bulk logged operations.

 

7- What is DCM (Differential Changed Map)?

This tracks the extents that have changed since the last BACKUP DATABASE statement. If the bit for an extent is 1, the extent has been modified since the last BACKUP DATABASE statement. If the bit is 0, the extent has not been modified.

Differential backups read just the DCM pages to determine which extents have been modified. This greatly reduces the number of pages that a differential backup must scan.

 





No comments:

Post a Comment