mfsscadmin(1) | This is part of MooseFS | mfsscadmin(1) |
NAME
mfsscadmin - MooseFS storage class administration tool
SYNOPSIS
mfscreatesclass [-?] [-M MOUNTPOINT] -K KEEP_LABELS [-c description] [-p priority] [-g export_group] [-a admin_only] [-m labels_mode] [-o arch_mode] [-C CREATE_LABELS] [-A ARCH_LABELS [-d arch_delay] [-s min_file_length]] [-T TRASH_LABELS [-t min_trashretention]] SCLASS_NAME...
mfsmodifysclass [-?] [-M MOUNTPOINT] [-c description] [-p priority] [-g export_group] [-a admin_only] [-m labels_mode] [-o arch_mode] [-C CREATE_LABELS] [-K KEEP_LABELS] [-A ARCH_LABELS] [-d arch_delay] [-s min_file_length] [-T TRASH_LABELS] [-t min_trashretention] SCLASS_NAME...
mfsdeletesclass [-?] [-M MOUNTPOINT] SCLASS_NAME...
mfsclonesclass [-?] [-M MOUNTPOINT] SRC_SCLASS_NAME DST_SCLASS_NAME...
mfsrenamesclass [-?] [-M MOUNTPOINT] SRC_SCLASS_NAME DST_SCLASS_NAME
mfslistsclass [-?] [-M MOUNTPOINT] [-l] [-i] [SCLASS_NAME_GLOB_PATTERN]
mfsimportsclass [-?] [-M MOUNTPOINT] [-r] [-n filename]
DESCRIPTION
This is a set of tools for creating and modifying storage classes, which can later be applied to MooseFS objects with the mfssclass tools (see mfssclass(1)). A storage class is a set of labels expressions and options that indicate on which chunkservers the files in this class should be written and later kept.
mfscreatesclass creates a new storage class with the given options, described below, and names it SCLASS_NAME; more than one name can be provided, in which case multiple storage classes with the same definition are created
mfsmodifysclass changes the given options in a class or classes indicated by SCLASS_NAME parameter(s)
mfsdeletesclass removes the class or classes indicated by SCLASS_NAME parameter(s); a class that is not empty (i.e. is still used by some MooseFS objects) is not removed, and the tool prints an error message and returns an error; empty classes are removed in any case
mfsclonesclass copies class indicated by SRC_SCLASS_NAME under a new name provided with DST_SCLASS_NAME
mfsrenamesclass changes the name of a class from SRC_SCLASS_NAME to DST_SCLASS_NAME
mfslistsclass lists all the classes, or only those whose names match SCLASS_NAME_GLOB_PATTERN if one is given
mfsimportsclass imports storage classes definitions from stdin or a file and creates them; input format should be identical to mfslistsclass -l output.
OPTIONS
-C optional parameter, that tells the system to which chunkservers, defined by the CREATE_LABELS expression, the chunk should be first written just after creation; if this parameter is not provided for a class, the KEEP_LABELS chunkservers will be used
-K mandatory parameter, that tells the system on which chunkservers, defined by the KEEP_LABELS expression, the chunk(s) should be kept always, except for special conditions like creating, archiving and deleting (moving to Trash), if defined
-A optional parameter, that tells the system on which chunkservers, defined by the ARCH_LABELS expression, the chunk(s) should be kept for archiving purposes; the system starts to treat a chunk as archive, when atime/mtime/ctime (as set by -o) of the file it belongs to is older than the number of hours specified with -d option; see also ARCHIVE BEHAVIOUR section below
-d optional parameter that defines after how much time from atime/mtime/ctime (as set by -o) a file (and its chunks) is treated as archive; minimum unit is hours, default is 24, for value formatting see TIME
-o optional parameter that defines archive flags: C - ctime, M - mtime, A - atime, R - reversible, F - fastmode, P - per chunk; default is C; see ARCHIVE BEHAVIOUR section below for details
-s optional parameter that defines minimum file length in bytes that can be archived; default is 0
-T optional parameter, that tells the system on which chunkservers, defined by the TRASH_LABELS expression, the chunk(s) of files in Trash should be kept; see also -t
-t optional parameter, that defines how much time in Trash must be left for the system to actually use the schema defined in -T for a chunk; minimum unit is hours, default is 0, for value formatting see TIME
-c optional parameter, that defines a class description, for user's convenience (a string, maximum length is 255 bytes)
-p optional parameter, that defines a class priority; default is 0, see STORAGE CLASSES PRIORITY section
-g optional parameter, that defines a class export group; possible values are 0 to 15, default is 0; see mfsexport.cfg(5) for explanation
-a can be either 1 or 0 and indicates if the storage class is available to everyone (0) or admin only (1)
-m label mode used; possible values are l (or L, loose, Loose, LOOSE) for LOOSE mode, d (or D, std, Std, STD) for DEFAULT mode and s (or S, strict, Strict, STRICT) for STRICT mode; if no mode is defined, DEFAULT mode is assumed; behaviour of label modes is described below in LABEL MODES section
-l list also definitions, not only the names of existing storage classes
-i case insensitive storage class name matching
-r replace (overwrite) existing classes when importing storage classes
-n use provided filename as the source of storage classes definitions for importing, instead of stdin
-M MooseFS mount point, doesn't need to be specified if a tool is run inside MooseFS mounted directory or MooseFS is mounted in /mnt/mfs/
-? displays short usage message
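As an illustration of how the options above combine, the following minimal Python sketch assembles an mfscreatesclass command line. The helper create_sclass_argv, the class name demo_class and all option values are hypothetical; only the option letters and their nesting (-d and -s given together with -A) are taken from the SYNOPSIS.

```python
import shlex


def create_sclass_argv(name, keep, arch=None, arch_delay=None, min_length=None):
    """Build an argument list for mfscreatesclass (hypothetical helper)."""
    # -K is the only mandatory option when creating a class.
    argv = ["mfscreatesclass", "-K", keep]
    if arch is not None:
        argv += ["-A", arch]
        # In the SYNOPSIS, -d and -s are nested inside the -A group.
        if arch_delay is not None:
            argv += ["-d", arch_delay]
        if min_length is not None:
            argv += ["-s", str(min_length)]
    argv.append(name)
    return argv


# Hypothetical class: two copies on labels A and B, EC 4+1 after 7 days,
# files shorter than 1 MiB are never archived.
argv = create_sclass_argv("demo_class", keep="A,B", arch="@4+1",
                          arch_delay="7d", min_length=1048576)
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run on a host where the MooseFS client tools are installed and a MooseFS mount is reachable.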
TIME
For time variables their value can be defined as a number of seconds or hours (integer), depending on minimum unit of the variable, or as a time period in one of two possible formats:
first format: #.#T where T is one of: s-seconds, m-minutes, h-hours, d-days or w-weeks; fractions of minimum unit will be rounded to integer value
second format: #w#d#h#m#s, any number of definitions can be omitted, but the remaining definitions must be in order (so #d#m is still a valid definition, but #m#d is not); ranges: s,m: 0 to 59, h: 0 to 23, d: 0 to 6, w is unlimited and the first definition is also always unlimited (i.e. for #d#h#m d will be unlimited)
If a minimum unit of a variable is larger than seconds, units below the minimum one will not be accepted. For example, a variable that has hours as a minimum unit will not accept s and m units.
Examples:
1.5d is the same as 1d12h, is the same as 36h
2.5w is the same as 2w3d12h, is the same as 420h; 2w84h is not a valid time period (h is not the first definition, so it is bound by range 0 to 23)
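The rules above can be modelled with a short Python sketch. It is not the parser used by the MooseFS tools; the function name parse_period is hypothetical, and rounding fractions half up is an assumption, since the text only says that fractions are rounded to an integer value.

```python
import re

# Seconds per unit; UNIT_ORDER lists units from largest to smallest.
UNIT_SECONDS = {"w": 604800, "d": 86400, "h": 3600, "m": 60, "s": 1}
UNIT_ORDER = "wdhms"
# Upper bounds that apply to every definition except the first one.
UNIT_LIMIT = {"d": 6, "h": 23, "m": 59, "s": 59}


def parse_period(text, min_unit="h"):
    """Return the period as a whole number of min_unit (hypothetical helper)."""
    base = UNIT_SECONDS[min_unit]
    # Units smaller than the minimum unit are rejected.
    allowed = UNIT_ORDER[: UNIT_ORDER.index(min_unit) + 1]

    # Bare integer: already expressed in the minimum unit.
    if re.fullmatch(r"\d+", text):
        return int(text)

    # First format: #.#T, for example "1.5d".
    m = re.fullmatch(r"(\d+(?:\.\d+)?)([wdhms])", text)
    if m:
        value, unit = float(m.group(1)), m.group(2)
        if unit not in allowed:
            raise ValueError(f"unit {unit!r} is below the minimum unit")
        # Assumption: fractions of the minimum unit are rounded half up.
        return int(value * UNIT_SECONDS[unit] / base + 0.5)

    # Second format: #w#d#h#m#s with any subset of definitions, in order.
    parts = re.findall(r"(\d+)([wdhms])", text)
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"malformed time period: {text!r}")
    positions = [UNIT_ORDER.index(u) for _, u in parts]
    if positions != sorted(set(positions)):
        raise ValueError("definitions are repeated or out of order")
    total = 0
    for index, (number, unit) in enumerate(parts):
        if unit not in allowed:
            raise ValueError(f"unit {unit!r} is below the minimum unit")
        if index > 0 and int(number) > UNIT_LIMIT[unit]:
            raise ValueError(f"{number}{unit} is out of range")
        total += int(number) * UNIT_SECONDS[unit]
    return total // base


print(parse_period("1.5d"), parse_period("1d12h"), parse_period("36h"))
```

With hours as the minimum unit, the three values printed are equal, matching the first example above, while parse_period("2w84h") raises ValueError.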
LABELS EXPRESSIONS
Labels are letters (A-Z - 26 letters) that can be assigned to chunkservers. Each chunkserver can have multiple (up to 26) labels. Labels are defined in mfschunkserver.cfg file, for more information refer to the appropriate manpage.
Labels expression is a set of subexpressions separated by commas. For full copies each subexpression specifies the storage schema of one copy of a file. Subexpression can be: an asterisk or a label schema. Label schema can be one label or an expression with sums, multiplications, negations and brackets. Sum means a file can be stored on any chunkserver matching any element of the sum (logical or). Multiplication means a file can be stored only on a chunkserver matching all elements (logical and). Asterisk means any chunkserver. Negation means any chunkserver but the one matching negated subexpression. Identical subexpressions can be shortened by adding a number in front of one instead of repeating it a number of times.
For EC, the labels expression starts with the @ sign, followed by the number of data parts, then the + sign and a number that says how many parity parts the chunk should have. Possible numbers of data parts are 4 or 8. Possible numbers of parity parts are 1 (CE version) or 1 to 9 (PRO version). So, for example, @4+1 means EC with 4 data parts and 1 parity part, @8+3 means EC with 8 data parts and 3 parity parts. If the number of data parts is omitted, the master uses the default value defined by DEFAULT_EC_DATA_PARTS - see mfsmaster.cfg(5). In this case @2 means @8+2 or @4+2. Then, a maximum of two subexpressions can follow, separated by commas. If only one is present, it defines where all the parts should be kept. If both are present, the first subexpression defines where data parts should be kept, the second subexpression defines where parity parts should be kept.
A labels expression can be either a regular labels expression or an EC labels expression (i.e. an EC labels expression cannot be a subexpression). An EC labels expression can only be used in place of ARCH_LABELS or TRASH_LABELS in the storage class definition; a regular labels expression can be used in any place.
At the end of each labels expression one or two extensions, each introduced by its own separator, can be added. The first possible extension is the distinguish extension, and its separator is the slash (/) sign. The second is the labels mode override, and this extension is separated by the colon (:) sign.
Distinguish extension can be a list of labels or one of the following special strings:
[IP] or [I] - distinguish by IP number
[RACK] or [R] - distinguish by RACK, as defined in topology, see mfstopology.cfg(5)
If present, the distinguish part lets the system know that it should try to distribute full copies so that each copy is either on a different label from the list or on a chunkserver with different IP address or from a different rack. For EC the distinguish part is currently ignored.
NOTICE! If CHUNKS_UNIQUE_MODE is set in mfsmaster.cfg to a value other than 0, it will override any distinguish setting in storage classes. For more information about this parameter refer to the mfsmaster.cfg(5) manual.
Labels mode override extension can be one of three characters: d (alternatively D or in string form std or Std or STD), s (alternatively S or in string form strict or Strict or STRICT) or l (alternatively L or in string form loose or Loose or LOOSE) and they mean that the DEFAULT, STRICT or LOOSE label mode, respectively, should be applied only to this one labels expression. For explanation about label modes see the LABEL MODES section.
One or both extensions can be present for each labels expression, each has to start with their separator and if both are present, the order has to be kept, i.e. the distinguish extension has to be first and the label mode extension needs to be second.
Examples of labels expressions:
A,B - files will have two copies, one copy will be stored on chunkserver(s) with label A, the other on chunkserver(s) with label B
A,* - files will have two copies, one copy will be stored on chunkserver(s) with label A, the other on any chunkserver(s)
A,!A - files will have two copies, one copy will be stored on chunkserver(s) with label A, the other on any chunkserver(s) that doesn't have the label A
*,* - files will have two copies, stored on any chunkservers (different for each copy)
AB,C+D+E - files will have two copies, one copy will be stored on any chunkserver(s) that has both labels A and B (multiplication of labels), the other on any chunkserver(s) that has either the C label or the D label or the E label (sum of labels)
A,B[X+Y],C[X+Y] - files will have three copies, one copy will be stored on any chunkserver(s) with A label, the second on any chunkserver(s) that has the B label and either X or Y label, the third on any chunkserver(s) that has the C label and either X or Y label
2A expression is equivalent to A,A expression
A,3BC expression is equivalent to A,BC,BC,BC expression
2 expression is equivalent to 2* expression is equivalent to *,* expression
3*/[IP] - files will have 3 copies, each copy will be kept on a chunkserver with different IP address
A,B/[RACK] - files will have two copies, one copy will be stored on chunkserver(s) with label A, the other on chunkserver(s) with label B in a different rack than the other copy
S,H,H/ABX-Z - files will have 3 copies, one on server with label S, two on servers with label H, but each copy will be on a server with different label from the set of A, B, X, Y, Z
@4+1 - files will be kept in EC format, 4 data parts and 1 parity part
@8+3 - files will be kept in EC format, 8 data parts and 3 parity parts
@2 - files will be kept in EC format, default number of data parts, 2 parity parts
@4+3,Z - files will be kept in EC format, 4 data parts and 3 parity parts - all on chunkservers with label Z.
@2,A(X+Y) - files will be kept in EC format, default number of data parts, 2 parity parts, all parts will be kept on chunkservers with label A and either X or Y
@3,S,H - files will be kept in EC format, default number of data parts will be kept on chunkservers with label S, 3 parity parts will be kept on chunkservers with label H
AB,AC:l - files will be kept in copies format, one copy on a server with labels A and B, the second on a server with labels A and C and the behaviour of this should be LOOSE
@4+2,X,Y:s - files will be kept in EC format, 4 data parts will be kept on servers with label X, 2 parity (checksum) parts should be kept on servers with label Y and the behaviour of this should be STRICT
2A/[IP]:s - files should be kept in 2 copies, both copies on servers with label A, but each server should have different IP, behaviour of this when accounting for labels should be STRICT
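The structure of a regular (non-EC) labels expression can be illustrated with a minimal Python sketch. Both function names are hypothetical, the code does not evaluate label schemas or handle EC expressions (those starting with @), and it only mirrors the separators and count prefixes described above.

```python
import re


def split_labels_expression(expr):
    """Split an expression into (body, distinguish, mode); hypothetical helper."""
    body, mode, distinguish = expr, None, None
    # The labels mode override comes last and follows a colon.
    if ":" in body:
        body, mode = body.rsplit(":", 1)
    # The distinguish extension follows a slash and precedes the mode.
    if "/" in body:
        body, distinguish = body.split("/", 1)
    return body, distinguish, mode


def expand_copies(body):
    """Expand count prefixes so that each list item describes one copy."""
    copies = []
    for sub in body.split(","):
        m = re.fullmatch(r"(\d+)(.*)", sub)
        if m:
            # A bare number such as "2" is shorthand for "2*".
            count, schema = int(m.group(1)), m.group(2) or "*"
        else:
            count, schema = 1, sub
        copies.extend([schema] * count)
    return copies


body, distinguish, mode = split_labels_expression("2A/[IP]:s")
print(expand_copies(body), distinguish, mode)
```

Applied to the examples above, expand_copies("A,3BC") yields one entry for A and three entries for BC, and expand_copies("2") yields two asterisk entries.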
LABEL MODES
It is important to specify what to do when it is not possible to meet the labels requirement of a storage class, i.e. there is no space available on any server with the needed labels, there are not enough servers with the needed labels, or the servers with the needed labels are all busy. The question is whether the system should create chunks on other servers (with non-matching labels) or not. This decision must be made by the user.
There are 3 modes of operation: DEFAULT, LOOSE and STRICT. The modes work slightly differently depending on whether a chunk is stored in copies or EC format, due to the different nature and algorithms that each of those formats uses.
For copies format the 3 modes behave as follows:
In DEFAULT mode in case of overloaded servers the system will wait for them, but in case of no space available it will use other servers and will replicate data to correct servers when it becomes possible. This means if some servers are in busy state for a long time, it might not be possible to create new chunks with certain storage classes and endangered (undergoal) chunks from those classes are at higher risk of being completely lost due to delayed replications.
In STRICT mode, while writing a new file, the system will return an error (ENOSPC) in case of no space available on servers marked with labels specified for chunk creation. It will still wait for overloaded servers. Undergoal replications will not be performed if there is no space on servers with labels matching the storage class. This means high risk of losing data if servers with some labels are permanently filled up with data!
In LOOSE mode the system will immediately use other servers in case of overloaded servers or no space on servers and will replicate data to correct servers when it becomes possible. There is no delay or error on file creation and undergoal replications are always done as soon as possible.
This table sums up the modes behaviour for chunks stored in copy format:
OPERATION - CONDITION | DEFAULT   | STRICT  | LOOSE     |
CREATE - BUSY         | WAIT      | WAIT    | WRITE ANY |
CREATE - NO SPACE     | WRITE ANY | ENOSPC  | WRITE ANY |
REPLICATE - BUSY      | WAIT      | WAIT    | WRITE ANY |
REPLICATE - NO SPACE  | WRITE ANY | NO COPY | WRITE ANY |
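For reference, the table above can be expressed as a small Python lookup. The names COPY_MODE_BEHAVIOUR and copy_mode_action are hypothetical, and the values are copied directly from the table rather than derived from MooseFS code.

```python
# Behaviour for chunks kept in copy format, keyed by (operation, condition).
COPY_MODE_BEHAVIOUR = {
    ("CREATE", "BUSY"): {
        "DEFAULT": "WAIT", "STRICT": "WAIT", "LOOSE": "WRITE ANY"},
    ("CREATE", "NO SPACE"): {
        "DEFAULT": "WRITE ANY", "STRICT": "ENOSPC", "LOOSE": "WRITE ANY"},
    ("REPLICATE", "BUSY"): {
        "DEFAULT": "WAIT", "STRICT": "WAIT", "LOOSE": "WRITE ANY"},
    ("REPLICATE", "NO SPACE"): {
        "DEFAULT": "WRITE ANY", "STRICT": "NO COPY", "LOOSE": "WRITE ANY"},
}


def copy_mode_action(operation, condition, mode="DEFAULT"):
    """Look up what the system does for a copy-format chunk."""
    return COPY_MODE_BEHAVIOUR[(operation, condition)][mode]


print(copy_mode_action("CREATE", "NO SPACE", "STRICT"))
```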
For chunks stored in EC format the 3 modes behave as follows:
In general, chunks will only be converted from copy format to EC
format if there are enough servers in the system to safely store all the
parts of the EC format. For EC @N+X format, where N is number of data parts
and can be either 4 or 8 and X is number of parity/checksum parts and can be
equal to 1 (CE version) or any number from 1 to 9 (PRO version), the general
requirements are:
- at least N+2X chunk servers to convert new chunks from copy format to EC
format
- at least N+X chunk servers to keep chunks that are already in EC format
still in this format
- if there are fewer than N+X servers, all chunks will revert to copy (KEEP
definition) format.
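The server-count requirements above can be sketched in Python as follows. The function names are hypothetical, and the servers argument stands for however many chunk servers the system counts for the chunk, which depends on the label mode as described in the rest of this section.

```python
def ec_server_requirements(data_parts, parity_parts):
    """Return the server counts needed for an EC @N+X definition."""
    if data_parts not in (4, 8):
        raise ValueError("data parts must be 4 or 8")
    if not 1 <= parity_parts <= 9:
        raise ValueError("parity parts must be between 1 and 9")
    return {
        "convert": data_parts + 2 * parity_parts,  # N+2X: copy -> EC
        "keep": data_parts + parity_parts,         # N+X: stay in EC
    }


def chunk_format(data_parts, parity_parts, servers, already_ec):
    """Decide between 'ec' and 'copy' for a chunk, given counted servers."""
    need = ec_server_requirements(data_parts, parity_parts)
    if servers < need["keep"]:
        return "copy"  # fewer than N+X servers: revert to the KEEP definition
    if already_ec:
        return "ec"    # at least N+X servers: existing EC chunks stay in EC
    # New chunks are converted only when at least N+2X servers are counted.
    return "ec" if servers >= need["convert"] else "copy"


print(ec_server_requirements(4, 1), chunk_format(4, 1, 5, already_ec=True))
```

For @4+1 this gives 6 servers to convert and 5 to keep, so with exactly 5 counted servers an existing EC chunk stays in EC format while a chunk still in copy format is not converted.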
In LOOSE mode the system will try to use first the servers matching the label expression defined in the used storage class, but if not enough servers with "correct" labels are available (because they are busy or have no space or are just not defined), it will use any available chunk servers regardless of label; so the N+2X and N+X are calculated from all available chunk servers when the system decides what format to use to keep a chunk. Also, when one part of a chunk in EC format becomes unavailable or corrupted, restoration of such part will also be done to any available server, if a server with "correct" labels cannot currently be used.
It's important to remember that if not enough servers with "correct" labels are available for a chunk in LOOSE mode, the system may use however many it wants of the "other" chunk servers, not just the minimal amount that is missing from the "correct" number of servers.
In STRICT mode the system will only use the servers matching the label expression defined in the used storage class, so only available or short-term busy servers matching the defined label expression will be used for calculation of N+2X and N+X when the system decides what format to use to keep a chunk. When one part of a chunk in EC format becomes unavailable or corrupted, restoration of such part can only be done to a server with a "correct" label. If such a server is unavailable long term (i.e. it is neither available outright nor only temporarily busy), the chunk has to be reverted to copy (KEEP) format anyway: if the missing part is a parity/checksum part, the chunk will just revert to copy format using all available data parts; if a data part is missing, it will be restored to a chunk server hosting another part of the same chunk (which is not allowed under normal circumstances) and then the conversion to copy format will follow immediately.
In DEFAULT mode the system will behave like in STRICT mode when it needs to decide whether to convert a new chunk from copy format to EC format, that is, the N+2X in this step is calculated only from "correctly" labeled servers. But to decide whether existing chunks need to be converted back from EC format to copy format it will look at all available servers, regardless of labels, so the N+X in this step is calculated from all available servers, like in LOOSE mode. In case of missing parts, if it's not possible to restore them to chunk servers with "correct" labels, the system will also adopt the LOOSE mode behaviour and try to use any available servers.
Notice! When a chunk is converted from copy format to EC format, the system first performs a "local split" operation, that is it picks one copy of the chunk and calculates all EC parts necessary on the server occupied by this selected copy. Then these parts are moved to separate chunkservers, matching the labels in the storage class definition for used EC mode. But temporarily, between the split and the "moving out" of the parts, they can be recorded on a "wrong" chunk server even in STRICT mode. This is because of the mechanics of the "local split" operation.
ARCHIVE BEHAVIOUR
Chunks have archive flag set during file maintenance loop, which means that the time to archiving defined by -d option is the minimum time that has to pass before the flag is set, not the exact time.
Default behaviour of the system is that once a chunk has the archive bit set on, it IS NOT switched off even if atime/ctime/mtime changes, unless R flag is set by option -o. Writing to a chunk will always switch its archive flag off.
Archive flags:
C - use file's ctime to determine if archive flag should be set on - this is the default flag
M - use file's mtime to determine if archive flag should be set on
A - use file's atime to determine if archive flag should be set on
R - reversible, if atime/mtime/ctime changes for a file, system verifies if archive flag should be turned off for its chunks
F - fastmode, chunk has archive flag set to on as soon as possible, whatever is defined with -d option is disregarded
P - "per chunk" mode, use chunk's mtime to determine if archive flag should be set on
The archive flag can be modified manually, see mfsarchive(1).
STORAGE CLASSES PRIORITY
Storage classes are assigned to files, but one chunk (one fragment of a file) can belong to many files, courtesy of the snapshot mechanism (see mfssnapshots(1)). If one chunk belongs to many files with different storage classes, one storage class must be picked to specify how this chunk's copies should be kept in the system. Up to MooseFS version 4.56.0 a predefined class was artificially assigned to such a chunk. Currently one of the files' classes will be used, according to priorities assigned by the user; to be exact, the system will pick the class with the highest priority out of all the files' classes.
Example 1: there are 3 classes defined:
ClassA, with priority 100,
ClassB, with priority 206,
ClassC, with priority 1001.
A chunk that belongs to 2 files, one in ClassA, the other in
ClassB, will be stored according to the definition provided by ClassB
(higher priority than ClassA).
A chunk that belongs to 3 files, one in ClassA, one in ClassB, one in ClassC,
will be stored according to the definition provided by ClassC (higher
priority than both ClassA and ClassB).
If two or more classes have the same priority, then the following
factors will be considered, in order of importance, to determine, which
class will be picked:
- a class with higher redundancy level (RL) will be picked (maximum from each
class's KEEP and ARCHIVE redundancy levels will be considered as this
class's redundancy level),
- a class that has EC format in ARCHIVE state will be picked over a class
without EC,
- a class that uses labels for KEEP or ARCHIVE state will be picked over a
class without labels,
- a class that has EC format in TRASH state will be picked over a class
without EC,
- a class that uses labels for TRASH state will be picked over a class without
labels,
- if none of the above conditions decides, the class with the higher class id
will be used.
Example 2: there are 5 classes defined, all with the same priority
(e.g. the default priority 0):
ClassA (id=1) has 3 copies in KEEP state (RL=2),
ClassB (id=2) has 2 copies in KEEP state and EC4+1 in ARCHIVE state (RL=1, has
EC in ARCHIVE),
ClassC (id=3) has 2 copies in KEEP state, stored on labels X (RL=1, no EC in
ARCHIVE, has labels in KEEP),
ClassD (id=4) has 2 copies in KEEP state, stored on labels Y (RL=1, no EC in
ARCHIVE, has labels in KEEP),
ClassE (id=5) has 2 copies in KEEP state (RL=1).
There is also a class ClassF defined, which has a priority of 77 and 2 copies
in KEEP state (RL=1).
A chunk, that belongs to files in classes: ClassA, ClassC and
ClassE will be stored according to definition of ClassA (highest RL).
A chunk, that belongs to files in classes: ClassB, ClassC will be stored
according to definition of ClassB (same RL, but ClassB has EC).
A chunk, that belongs to files in classes: ClassC and ClassE will be stored
according to definition of ClassC (same RL, but ClassC has labels).
A chunk, that belongs to files in classes: ClassC and ClassD will be stored
according to definition of ClassD (same RL, no EC, both have labels, so
higher class ID is picked).
A chunk, that belongs to files in 6 classes, from ClassA to ClassF, will be
stored according to definition of ClassF, because this one has higher
priority than all the other classes (77>0).
In a system with these 6 storage classes ClassE will never be used for a chunk
belonging to multiple files, because it has the lowest possible priority (0)
and no extra conditions to justify its choice (lowest existing RL, no EC and
no labels).
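The selection rules of this section can be modelled as a lexicographic comparison, sketched below in Python. The SClass type, its field names and the function pick_class are hypothetical; redundancy levels are entered as given in Example 2 rather than computed, and the id of ClassF, which the example does not state, is assumed to be 6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SClass:
    name: str
    class_id: int
    priority: int = 0
    rl: int = 1                              # redundancy level, as given
    ec_in_archive: bool = False
    labels_in_keep_or_archive: bool = False
    ec_in_trash: bool = False
    labels_in_trash: bool = False


def selection_key(c):
    # Tuples compare element by element, which reproduces the
    # "in order of importance" tie-breaking described above.
    return (c.priority, c.rl, c.ec_in_archive, c.labels_in_keep_or_archive,
            c.ec_in_trash, c.labels_in_trash, c.class_id)


def pick_class(classes):
    """Return the class used for a chunk shared by files in these classes."""
    return max(classes, key=selection_key)


A = SClass("ClassA", 1, rl=2)
B = SClass("ClassB", 2, ec_in_archive=True)
C = SClass("ClassC", 3, labels_in_keep_or_archive=True)
D = SClass("ClassD", 4, labels_in_keep_or_archive=True)
E = SClass("ClassE", 5)
F = SClass("ClassF", 6, priority=77)         # id 6 is an assumption

print(pick_class([C, D]).name, pick_class([A, B, C, D, E, F]).name)
```

Because False sorts before True and the class id comes last in the key, the sketch picks ClassD over ClassC and ClassF over all other classes, in line with Example 2.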
PREDEFINED STORAGE CLASSES
A new MooseFS instance will have the following classes predefined:
2CP - only KEEP state defined, keep 2 copies on any labels (default class for / directory)
3CP - only KEEP state defined, keep 3 copies on any labels
EC4+1 - in KEEP state, keep 2 copies on any labels, in ARCHIVE state, keep chunks in EC4+1 format on any labels, archive delay is 1 day and is calculated using file's ctime, files smaller than 512 KiB will not be converted to EC format
EC4+2 - (pro only) in KEEP state, keep 3 copies on any labels, in ARCHIVE state, keep chunks in EC4+2 format on any labels, archive delay is 1 day and is calculated using file's ctime, files smaller than 512 KiB will not be converted to EC format
EC8+1 - in KEEP state, keep 2 copies on any labels, in ARCHIVE state, keep chunks in EC8+1 format on any labels, archive delay is 1 day and is calculated using file's ctime, files smaller than 512 KiB will not be converted to EC format
EC8+2 - (pro only) in KEEP state, keep 3 copies on any labels, in ARCHIVE state, keep chunks in EC8+2 format on any labels, archive delay is 1 day and is calculated using file's ctime, files smaller than 512 KiB will not be converted to EC format
These classes are fully modifiable and deletable and can be replaced with user's choice of classes.
Up to version 4.56.0 of MooseFS the predefined classes were different. The following information pertains to old MooseFS behaviour. In newer versions of MooseFS the classes mentioned below might exist as a result of an upgrade, but will behave exactly as any user-defined classes. This information is left here purely for informative reasons and will be removed from this manual page at some point:
(Behaviour up to version 4.56.0) "For compatibility reasons, every fresh or freshly upgraded instance of MooseFS has 9 predefined storage classes. Their names are single digits, from 1 to 9, and their definitions are * to 9*. They are equivalents of simple numeric goals from previous versions of the system. In case of an upgrade, all files that had goal N before upgrade, will now have N storage class. These classes can be modified only when option -f is specified. It is advised to create new storage classes in an upgraded system and migrate files with mfsxchgsclass tool, rather than modify the predefined classes. The predefined classes CANNOT be deleted."
REPORTING BUGS
Report bugs to <bugs@moosefs.com>.
COPYRIGHT
Copyright (C) 2025 Jakub Kruszona-Zawadzki, Saglabs SA
This file is part of MooseFS.
MooseFS is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2 (only).
MooseFS is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with MooseFS; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02111-1301, USA or visit http://www.gnu.org/licenses/gpl-2.0.html
SEE ALSO
mfsmount(8), mfstools(1), mfssclass(1), mfsarchive(1), mfsmaster.cfg(5), mfschunkserver.cfg(5), mfstopology.cfg(5)
February 2025 | MooseFS 4.57.5-1 |