Codex

LVMTHIN

Section: (7)

Updated: LVM TOOLS (2) (2014-09-01)

Index?action=index Return to Main Contents


NAME

lvmthin --- LVM thin provisioning

DESCRIPTION

Blocks in a standard logical volume are allocated when the LV is created, but blocks in a thin provisioned logical volume are allocated as they are written. Because of this, a thin provisioned LV is given a virtual size, and can then be much larger than physically available storage. The amount of physical storage provided for thin provisioned LVs can be increased later as the need arises.

Blocks in a standard LV are allocated (during creation) from the VG, but blocks in a thin LV are allocated (during use) from a special "thin pool LV". The thin pool LV contains blocks of physical storage, and blocks in thin LVs just reference blocks in the thin pool LV.

A thin pool LV must be created before thin LVs can be created within it. A thin pool LV is created by combining two standard LVs: a large data LV that will hold blocks for thin LVs, and a metadata LV that will hold metadata. The metadata tracks which data blocks belong to each thin LV.

Snapshots of thin LVs are efficient because the data blocks common to a thin LV and any of its snapshots are shared. Snapshots may be taken of thin LVs or of other thin snapshots. Blocks common to recursive snapshots are also shared in the thin pool. There is no limit to or degradation from sequences of snapshots.

As thin LVs or snapshot LVs are written to, they consume data blocks in the thin pool. As free data blocks in the pool decrease, more free blocks may need to be supplied. This is done by extending the thin pool data LV with additional physical space from the VG. Removing thin LVs or snapshots from the thin pool can also free blocks in the thin pool. However, removing LVs is not always an effective way of freeing space in a thin pool because the amount is limited to the number of blocks not shared with other LVs in the pool.

Incremental block allocation from thin pools can cause thin LVs to become fragmented. Standard LVs generally avoid this problem by allocating all the blocks at once during creation.

Thin Terms

ThinDataLV

thin data LV

large LV created in a VG

used by thin pool to store ThinLV blocks:

ThinMetaLV

thin metadata LV

small LV created in a VG

used by thin pool to track data block usage:

ThinPoolLV

thin pool LV

combination of ThinDataLV and ThinMetaLV

contains ThinLVs and SnapLVs:

ThinLV

thin LV

created from ThinPoolLV

appears blank after creation:

SnapLV

snapshot LV

created from ThinPoolLV

appears as a snapshot of another LV after creation:

Thin Usage

The primary method for using lvm thin provisioning:

1. create ThinDataLV

Create an LV that will hold thin pool data.

lvcreate -n ThinDataLV -L LargeSize VG

Example

  1. lvcreate -n pool0 -L 10G vg

2. create ThinMetaLV

Create an LV that will hold thin pool metadata.

lvcreate -n ThinMetaLV -L SmallSize VG

Example

  1. lvcreate -n pool0meta -L 1G vg
  2. lvs

  LV        VG Attr       LSize

  pool0     vg -wi-a----- 10.00g

  pool0meta vg -wi-a----- 1.00g

3. create ThinPoolLV

lvconvert --thinpool VG/ThinDataLV --poolmetadata VG/ThinMetaLV

Example

  1. lvconvert --thinpool vg/pool0 --poolmetadata vg/pool0meta
  2. lvs vg/pool0

  LV    VG Attr       LSize  Pool Origin Data% Meta%

  pool0 vg twi-a-tz-- 10.00g      0.00   0.00

  1. lvs -a

  LV            VG Attr       LSize

  pool0         vg twi-a-tz-- 10.00g

  [pool0_tdata] vg Twi-ao---- 10.00g

  [pool0_tmeta] vg ewi-ao---- 1.00g

4. create ThinLV

lvcreate -n ThinLV -V VirtualSize --thinpool VG/ThinPoolLV

Example

Create a thin LV in a thin pool:

  1. lvcreate -n thin1 -V 1T --thinpool vg/pool0

Create another thin LV in the same thin pool:

  1. lvcreate -n thin2 -V 1T --thinpool vg/pool0
  2. lvs vg/thin1 vg/thin2

  LV    VG Attr       LSize Pool  Origin Data%

  thin1 vg Vwi-a-tz-- 1.00t pool0        0.00

  thin2 vg Vwi-a-tz-- 1.00t pool0        0.00

5. create SnapLV

Create snapshots of an existing ThinLV or SnapLV.

Do not specify -L, --size when creating a thin snapshot.

A size argument will cause an old COW snapshot to be created.

lvcreate -n SnapLV -s VG/ThinLV

lvcreate -n SnapLV -s VG/PrevSnapLV

Example

Create first snapshot of an existing ThinLV:

  1. lvcreate -n thin1s1 -s vg/thin1

Create second snapshot of the same ThinLV:

  1. lvcreate -n thin1s2 -s vg/thin1

Create a snapshot of the first snapshot:

  1. lvcreate -n thin1s1s1 -s vg/thin1s1
  2. lvs vg/thin1s1 vg/thin1s2 vg/thin1s1s1

  LV        VG Attr       LSize Pool  Origin

  thin1s1   vg Vwi---tz-k 1.00t pool0 thin1

  thin1s2   vg Vwi---tz-k 1.00t pool0 thin1

  thin1s1s1 vg Vwi---tz-k 1.00t pool0 thin1s1

6. activate SnapLV

Thin snapshots are created with the persistent "activation skip" flag, indicated by the "k" attribute. Use -K with lvchange or vgchange to activate thin snapshots with the "k" attribute.

lvchange -ay -K VG/SnapLV

Example

  1. lvchange -ay -K vg/thin1s1
  2. lvs vg/thin1s1

  LV      VG Attr       LSize Pool  Origin

  thin1s1 vg Vwi-a-tz-k 1.00t pool0 thin1

Thin Topics

Specify devices for data and metadata LVs

Tolerate device failures using raid

Spare metadata LV

Metadata check and repair

Automatic pool metadata LV

Activation of thin snapshots

Removing thin pool LVs, thin LVs and snapshots

Manually manage free data space of thin pool LV

Manually manage free metadata space of a thin pool LV

Using fstrim to increase free space in a thin pool LV

Automatically extend thin pool LV

Data space exhaustion

Metadata space exhaustion

Zeroing

Discard

Chunk size

Size of pool metadata LV

Create a thin snapshot of an external, read only LV

Convert a standard LV to a thin LV with an external origin

Single step thin pool LV creation

Single step thin pool LV and thin LV creation

Merge thin snapshots

XFS on snapshots

Specify devices for data and metadata LVs

The data and metadata LVs in a thin pool are best created on separate physical devices. To do that, specify the device name(s) at the end of the lvcreate line. It can be especially helpful to use fast devices for the metadata LV.

lvcreate -n ThinDataLV -L LargeSize VG LargePV

lvcreate -n ThinMetaLV -L SmallSize VG SmallPV

lvconvert --thinpool VG/ThinDataLV --poolmetadata VG/ThinMetaLV

Example

(5) thin_pool_metadata_require_separate_pvs

controls the default PV usage for thin pool creation.

Tolerate device failures using raid

To tolerate device failures, use raid for the pool data LV and pool metadata LV. This is especially recommended for pool metadata LVs.

lvcreate --type raid1 -m 1 -n ThinMetaLV -L SmallSize VG PVA PVB

lvcreate --type raid1 -m 1 -n ThinDataLV -L LargeSize VG PVC PVD

lvconvert --thinpool VG/ThinDataLV --poolmetadata VG/ThinMetaLV

Example

Spare metadata LV

The first time a thin pool LV is created, lvm will create a spare metadata LV in the VG. This behavior can be controlled with the option --poolmetadataspare y|n. (Future thin pool creations will also attempt to create the pmspare LV if none exists.)

To create the pmspare ("pool metadata spare") LV, lvm first creates an LV with a default name, e.g. lvol0, and then converts this LV to a hidden LV with the _pmspare suffix, e.g. lvol0_pmspare.

One pmspare LV is kept in a VG to be used for any thin pool.

The pmspare LV cannot be created explicitly, but may be removed explicitly.

Example

The "Metadata check and repair" section describes the use of the pmspare LV.

Metadata check and repair

If thin pool metadata is damaged, it may be repairable. Checking and repairing thin pool metadata is analagous to running fsck on a file system.

When a thin pool LV is activated, lvm runs the thin_check command to check the correctness of the metadata on the pool metadata LV.

(5) thin_check_executable

can be set to an empty string ("") to disable the thin_check step. This is not recommended.

(5) thin_check_options

controls the command options used for the thin_check command.

If the thin_check command finds a problem with the metadata, the thin pool LV is not activated, and the thin pool metadata should be repaired.

Command to repair a thin pool:

lvconvert --repair VG/ThinPoolLV

Repair performs the following steps:

1. Creates a new, repaired copy of the metadata.

lvconvert runs the thin_repair command to read damaged metadata from the existing pool metadata LV, and writes a new repaired copy to the VG's pmspare LV.

2. Replaces the thin pool metadata LV.

If step 1 is successful, the thin pool metadata LV is replaced with the pmspare LV containing the corrected metadata. The previous thin pool metadata LV, containing the damaged metadata, becomes visible with the new name ThinPoolLV_tmetaN (where N is 0,1,...).

If the repair works, the thin pool LV and its thin LVs can be activated, and the LV containing the damaged thin pool metadata can be removed. It may be useful to move the new metadata LV (previously pmspare) to a better PV.

If the repair does not work, the thin pool LV and its thin LVs are lost.

If metadata is manually restored with thin_repair directly, the pool metadata LV can be manually swapped with another LV containing new metadata:

lvconvert --thinpool VG/ThinPoolLV --poolmetadata VG/NewThinMetaLV

Automatic pool metadata LV

A thin data LV can be converted to a thin pool LV without specifying a thin pool metadata LV. LVM will automatically create a metadata LV from the same VG.

lvcreate -n ThinDataLV -L LargeSize VG

lvconvert --thinpool VG/ThinDataLV

Example

Activation of thin snapshots

When a thin snapshot LV is created, it is by default given the "activation skip" flag. This flag is indicated by the "k" attribute displayed by lvs:

This flag causes the snapshot LV to be skipped, i.e. not activated, by normal activation commands. The skipping behavior does not apply to deactivation commands.

A snapshot LV with the "k" attribute can be activated using the -K (or --ignoreactivationskip) option in addition to the standard -ay (or --activate y) option.

Command to activate a thin snapshot LV:

lvchange -ay -K VG/SnapLV

The persistent "activation skip" flag can be turned off during lvcreate, or later with lvchange using the -kn (or --setactivationskip n) option. It can be turned on again with -ky (or --setactivationskip y).

When the "activation skip" flag is removed, normal activation commands will activate the LV, and the -K activation option is not needed.

Command to create snapshot LV without the activation skip flag:

lvcreate -kn -n SnapLV -s VG/ThinLV

Command to remove the activation skip flag from a snapshot LV:

lvchange -kn VG/SnapLV

(5) auto_set_activation_skip

controls the default activation skip setting used by lvcreate.

Removing thin pool LVs, thin LVs and snapshots

Removing a thin LV and its related snapshots returns the blocks it used to the thin pool LV. These blocks will be reused for other thin LVs and snapshots.

Removing a thin pool LV removes both the data LV and metadata LV and returns the space to the VG.

lvremove of thin pool LVs, thin LVs and snapshots cannot be reversed with vgcfgrestore.

vgcfgbackup does not back up thin pool metadata.

Manually manage free data space of thin pool LV

The available free space in a thin pool LV can be displayed with the lvs command. Free space can be added by extending the thin pool LV.

Command to extend thin pool data space:

lvextend -L Size VG/ThinPoolLV

Example

Other methods of increasing free data space in a thin pool LV include removing a thin LV and its related snapsots, or running fstrim on the file system using a thin LV.

Manually manage free metadata space of a thin pool LV

The available metadata space in a thin pool LV can be displayed with the lvs -o+metadata_percent command.

Command to extend thin pool metadata space:

lvextend -L Size VG/ThinPoolLV_tmeta

Example

1. A thin pool LV is using 12.40% of its metadata blocks.

2. Display a thin pool LV with its component thin data LV and thin metadata LV.

3. Double the amount of physical space in the thin metadata LV.

4. The percentage of used metadata blocks is half the previous value.

Using fstrim to increase free space in a thin pool LV

Removing files in a file system on top of a thin LV does not generally add free space back to the thin pool. Manually running the fstrim command can return space back to the thin pool that had been used by removed files. fstrim uses discards and will not work if the thin pool LV has discards mode set to ignore.

Example

A thin pool has 10G of physical data space, and a thin LV has a virtual size of 100G. Writing a 1G file to the file system reduces the free space in the thin pool by 10% and increases the virtual usage of the file system by 1%. Removing the 1G file restores the virtual 1% to the file system, but does not restore the physical 10% to the thin pool. The fstrim command restores the physical space to the thin pool.

The "Discard" section covers an option for automatically freeing data space in a thin pool.

Automatically extend thin pool LV

An lvm daemon (dmeventd) will by default monitor the data usage of thin pool LVs and extend them when the usage reaches a certain level. The necessary free space must exist in the VG to extend the thin pool LVs.

Command to enable or disable the monitoring and automatic extension of an existing thin pool LV:

lvchange --monitor {y|n} VG/ThinPoolLV

(5) thin_pool_autoextend_threshold

(5) thin_pool_autoextend_percent

control the default autoextend behavior.

thin_pool_autoextend_threshold is a percentage value that defines when the thin pool LV should be extended. Setting this to 100 disables automatic extention. The minimum value is 50.

thin_pool_autoextend_percent defines how much extra data space should be added to the thin pool, in percent of its current size.

Warnings are emitted through syslog when the use of a pool reaches 80%, 85%, 90% and 95%.

Example

If thin_pool_autoextend_threshold is 70 and thin_pool_autoextend_percent is 20, whenever a pool exceeds 70% usage, it will be extended by another 20%. For a 1G pool, using 700M will trigger a resize to 1.2G. When the usage exceeds 840M, the pool will be extended to 1.44G, and so on.

Data space exhaustion

If thin pool data space is exhausted, writes to thin LVs will be queued until the the data space is extended. Reading is still possible.

When data space is exhausted, the lvs command displays 100 under Data% for the thin pool LV:

A thin pool can run out of data blocks for any of the following reasons:

1. Automatic extension of the thin pool is disabled, and the thin pool is not manually extended. (Disabling automatic extension is not recommended.)

2. The dmeventd daemon is not running and the thin pool is not manually extended. (Disabling dmeventd is not recommended.)

3. Automatic extension of the thin pool is too slow given the rate of writes to thin LVs in the pool. (This can be addressed by tuning the thin_pool_autoextend_threshold and thin_pool_autoextend_percent.)

4. The VG does not have enough free blocks to extend the thin pool.

The response to data space exhaustion is to extend the thin pool. This is described in the section "Manually manage free data space of thin pool LV".

Metadata space exhaustion

If thin pool metadata space is exhausted (or a thin pool metadata operation fails), errors will be returned for IO operations on thin LVs.

When metadata space is exhausted, the lvs command displays 100 under Meta% for the thin pool LV:

The same reasons for thin pool data space exhaustion apply to thin pool metadata space.

Metadata space exhaustion can lead to inconsistent thin pool metadata and inconsistent file systems, so the response requires offline checking and repair.

1. Deactivate the thin pool LV, or reboot the system if this is not possible.

2. Repair thin pool with lvconvert --repair.

   See "Metadata check and repair".

3. Extend pool metadata space with lvextend VG/ThinPoolLV_tmeta.

   See "Manually manage free metadata space of a thin pool LV".

4. Check and repair file system with fsck.

Zeroing

When a thin pool provisions a new data block for a thin LV, the new block is first overwritten with zeros. The zeroing mode is indicated by the "z" attribute displayed by lvs. The option -Z (or --zero) can be added to commands to specify the zeroing mode.

Command to set the zeroing mode when creating a thin pool LV:

lvconvert -Z{y|n} --thinpool VG/ThinDataLV

--poolmetadata VG/ThinMetaLV

Command to change the zeroing mode of an existing thin pool LV:

lvchange -Z{y|n} VG/ThinPoolLV

If zeroing mode is changed from "n" to "y", previously provisioned blocks are not zeroed.

Provisioning of large zeroed chunks impacts performance.

(5) thin_pool_zero

controls the default zeroing mode used when creating a thin pool.

Discard

The discard behavior of a thin pool LV determines how discard requests are handled. Enabling discard under a file system may adversely affect the file system performance (see the section on fstrim for an alternative.) Possible discard behaviors:

ignore: Ignore any discards that are received.

nopassdown: Process any discards in the thin pool itself and allow the no longer needed extends to be overwritten by new data.

passdown: Process discards in the thin pool (as with nopassdown), and pass the discards down the the underlying device. This is the default mode.

Command to display the current discard mode of a thin pool LV:

lvs -o+discards VG/ThinPoolLV

Command to set the discard mode when creating a thin pool LV:

lvconvert --discards {ignore|nopassdown|passdown}

<DD CLASS="c2|--thinpool VG/ThinDataLV --poolmetadata VG/ThinMetaLV:

Command to change the discard mode of an existing thin pool LV:

lvchange --discards {ignore|nopassdown|passdown} VG/ThinPoolLV

Example

(5) thin_pool_discards

controls the default discards mode used when creating a thin pool.

Chunk size

The size of data blocks managed by a thin pool can be specified with the --chunksize option when the thin pool LV is created. The default unit is kilobytes and the default value is 64KiB. The value must be a power of two between 4KiB and 1GiB.

When a thin pool is used primarily for the thin provisioning feature, a larger value is optimal. To optimize for a lot of snapshotting, a smaller value reduces copying time and consumes less space.

Command to display the thin pool LV chunk size:

lvs -o+chunksize VG/ThinPoolLV

Example

(5) thin_pool_chunk_size

controls the default chunk size used when creating a thin pool.

Size of pool metadata LV

The amount of thin metadata depends on how many blocks are shared between thin LVs (i.e. through snapshots). A thin pool with many snapshots may need a larger metadata LV.

The range of supported metadata LV sizes is 2MiB to 16GiB.

The default size is estimated with the formula:

ThinPoolLVSize / ThinPoolLVChunkSize * 64b.

When creating a thin metadata LV explicitly, the size is specified in the lvcreate command. When a command automatically creates a thin metadata LV, the --poolmetadatasize option can be used specify a non-default size. The default unit is megabytes.

Create a thin snapshot of an external, read only LV

Thin snapshots are typically taken of other thin LVs or other thin snapshot LVs within the same thin pool. It is also possible to take thin snapshots of external, read only LVs. Writes to the snapshot are stored in the thin pool, and the external LV is used to read unwritten parts of the thin snapshot.

lvcreate -n SnapLV -s VG/ExternalOriginLV --thinpool VG/ThinPoolLV

Example

Convert a standard LV to a thin LV with an external origin

A new thin LV can be created and given the name of an existing standard LV. At the same time, the existing LV is converted to a read only external LV with a new name. Unwritten portions of the thin LV are read from the external LV. The new name given to the existing LV can be specified with --originname, otherwise the existing LV will be given a default name, e.g. lvol#.

Convert ExampleLV into a read only external LV with the new name NewExternalOriginLV, and create a new thin LV that is given the previous name of ExampleLV.

lvconvert --type thin --thinpool VG/ThinPoolLV

<DD CLASS="c2|--originname NewExternalOriginLV --thin VG/ExampleLV:

Example

Single step thin pool LV creation

A thin pool LV can be created with a single lvcreate command, rather than using lvconvert on existing LVs. This one command creates a thin data LV, a thin metadata LV, and combines the two into a thin pool LV.

lvcreate -L LargeSize --thinpool VG/ThinPoolLV

Example

Single step thin pool LV and thin LV creation

A thin pool LV and a thin LV can be created with a single lvcreate command. This one command creates a thin data LV, a thin metadata LV, combines the two into a thin pool LV, and creates a thin LV in the new pool.

-L LargeSize specifies the physical size of the thin pool LV.

-V VirtualSize specifies the virtual size of the thin LV.

lvcreate -L LargeSize -V VirtualSize -n ThinLV

--thinpool VG/ThinPoolLV

Equivalent to:

lvcreate -L LargeSize --thinpool VG/ThinPoolLV

lvcreate -n ThinLV -V VirtualSize --thinpool VG/ThinPoolLV

Example

Merge thin snapshots

A thin snapshot can be merged into its origin thin LV using the lvconvert --merge command. The result of a snapshot merge is that the origin thin LV takes the content of the snapshot LV, and the snapshot LV is removed. Any content that was unique to the origin thin LV is lost after the merge.

Because a merge changes the content of an LV, it cannot be done while the LVs are open, e.g. mounted. If a merge is initiated while the LVs are open, the effect of the merge is delayed until the origin thin LV is next activated.

lvconvert --merge VG/SnapLV

Example

Example

XFS on snapshots

Mounting an XFS file system on a new snapshot LV requires attention to the file system's log state and uuid. On the snapshot LV, the xfs log will contain a dummy transaction, and the xfs uuid will match the uuid from the file system on the origin LV.

If the snapshot LV is writable, mounting will recover the log to clear the dummy transaction, but will require skipping the uuid check:

mount /dev/VG/SnapLV /mnt -o nouuid

Or, the uuid can be changed on disk before mounting:

xfs_admin -U generate /dev/VG/SnapLV

mount /dev/VG/SnapLV /mnt

If the snapshot LV is readonly, the log recovery and uuid check need to be skipped while mounting readonly:

mount /dev/VG/SnapLV /mnt -o ro,nouuid,norecovery

SEE ALSO

lvm?(8), (5), lvcreate?(8), lvconvert?(8), lvchange?(8), lvextend?(8), lvremove?(8), lvs?(8), thin_dump?(8), thin_repair?(8) thin_restore?(8)


Index

NAME

DESCRIPTION

Thin Terms

Thin Usage

1. create ThinDataLV

2. create ThinMetaLV

3. create ThinPoolLV

4. create ThinLV

5. create SnapLV

6. activate SnapLV

Thin Topics

Specify devices for data and metadata LVs

Tolerate device failures using raid

Spare metadata LV

Metadata check and repair

Automatic pool metadata LV

Activation of thin snapshots

Removing thin pool LVs, thin LVs and snapshots

Manually manage free data space of thin pool LV

Manually manage free metadata space of a thin pool LV

Using fstrim to increase free space in a thin pool LV

Automatically extend thin pool LV

Data space exhaustion

Metadata space exhaustion

Zeroing

Discard

Chunk size

Size of pool metadata LV

Create a thin snapshot of an external, read only LV

Convert a standard LV to a thin LV with an external origin

Single step thin pool LV creation

Single step thin pool LV and thin LV creation

Merge thin snapshots

XFS on snapshots

SEE ALSO