mount_namespaces(7)     Miscellaneous Information Manual    mount_namespaces(7)

NAME
       mount_namespaces - overview of Linux mount namespaces

DESCRIPTION
       For an overview of namespaces, see namespaces(7).

       Mount  namespaces  provide  isolation  of the list of mounts seen by the
       processes in each namespace instance.  Thus, the processes  in  each  of
       the mount namespace instances will see distinct single-directory hierar-
       chies.

       The  views  provided  by  the /proc/pid/mounts, /proc/pid/mountinfo, and
       /proc/pid/mountstats files (all described in proc(5)) correspond to  the
       mount  namespace in which the process with the PID pid resides.  (All of
       the processes that reside in the same mount namespace will see the  same
       view in these files.)
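
       As a minimal sketch, one way to check whether two processes reside in
       the same mount namespace (and so see the same view in these files) is
       to compare their /proc/pid/ns/mnt symbolic links, which are described
       in namespaces(7).  The function name same_mntns and its optional third
       argument (an alternative proc root, handy for experimenting) are
       assumptions made for this illustration.

```shell
# same_mntns PID1 PID2 [PROCROOT]
# Exit status 0 if both processes are in the same mount namespace,
# 1 if they are not, 2 if either namespace link cannot be read.
same_mntns() {
    procroot=${3:-/proc}                       # hypothetical override
    ns1=$(readlink "$procroot/$1/ns/mnt") || return 2
    ns2=$(readlink "$procroot/$2/ns/mnt") || return 2
    [ "$ns1" = "$ns2" ]
}
```

       For example, "same_mntns $$ 1 && echo same" compares the current shell
       with PID 1.  Reading the link of a process owned by another user may
       require privilege.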

       A  new  mount  namespace  is created using either clone(2) or unshare(2)
       with the CLONE_NEWNS flag.  When a new mount namespace is  created,  its
       mount list is initialized as follows:

       •  If  the  namespace  is  created using clone(2), the mount list of the
          child's namespace is a copy of the mount list in the parent process's
          mount namespace.

       •  If the namespace is created using unshare(2), the mount list  of  the
          new  namespace  is  a copy of the mount list in the caller's previous
          mount namespace.

       Subsequent modifications to the mount list (mount(2) and  umount(2))  in
       either  mount namespace will not (by default) affect the mount list seen
       in the other namespace (but see the following discussion of shared  sub-
       trees).

SHARED SUBTREES
       After  the  implementation of mount namespaces was completed, experience
       showed that the isolation that they provided was,  in  some  cases,  too
       great.  For example, in order to make a newly loaded optical disk avail-
       able  in  all  mount  namespaces, a mount operation was required in each
       namespace.  For this use case, and others, the  shared  subtree  feature
       was introduced in Linux 2.6.15.  This feature allows for automatic, con-
       trolled  propagation of mount(2) and umount(2) events between namespaces
       (or, more precisely, between the mounts that are members of a peer group
       that are propagating events to one another).

       Each mount is marked (via mount(2)) as having one of the following prop-
       agation types:

       MS_SHARED
              This mount shares events with members of a peer group.   mount(2)
              and  umount(2) events immediately under this mount will propagate
              to the other mounts that are members of the peer group.  Propaga-
              tion here means that the same mount(2) or umount(2) will automat-
              ically occur under all of the other mounts  in  the  peer  group.
              Conversely,  mount(2)  and umount(2) events that take place under
              peer mounts will propagate to this mount.

       MS_PRIVATE
              This mount is private; it does not have a peer  group.   mount(2)
              and umount(2) events do not propagate into or out of this mount.

       MS_SLAVE
              mount(2)  and  umount(2)  events propagate into this mount from a
              (master) shared peer group.  mount(2) and umount(2) events  under
              this mount do not propagate to any peer.

              Note that a mount can be the slave of another peer group while at
              the  same  time sharing mount(2) and umount(2) events with a peer
              group of which it is a member.  (More precisely, one  peer  group
              can be the slave of another peer group.)

       MS_UNBINDABLE
              This is like a private mount, and in addition this mount can't be
              bind  mounted.   Attempts to bind mount this mount (mount(2) with
              the MS_BIND flag) will fail.

              When a recursive bind mount (mount(2) with the MS_BIND and MS_REC
              flags) is performed on a directory subtree, any unbindable mounts
              within the subtree are automatically pruned (i.e., not replicated)
              when replicating that subtree to produce the target subtree.

       For  a  discussion  of the propagation type assigned to a new mount, see
       NOTES.

       The propagation type is a per-mount-point setting; some  mounts  may  be
       marked  as  shared  (with each shared mount being a member of a distinct
       peer group), while others are private (or slaved or unbindable).

       Note that a mount's propagation type  determines  whether  mount(2)  and
       umount(2)  of  mounts immediately under the mount are propagated.  Thus,
       the propagation type does not affect propagation of  events  for  grand-
       children  and  further  removed  descendant mounts.  What happens if the
       mount itself is unmounted is determined by the propagation type that  is
       in effect for the parent of the mount.

       Members  are  added to a peer group when a mount is marked as shared and
       either:

       (a)  the mount is replicated during the creation of a  new  mount  name-
            space; or

       (b)  a new bind mount is created from the mount.

       In  both of these cases, the new mount joins the peer group of which the
       existing mount is a member.

       A new peer group is also created when a child mount is created under  an
       existing  mount  that  is marked as shared.  In this case, the new child
       mount is also marked as shared and the resulting peer group consists  of
       all the mounts that are replicated under the peers of parent mounts.

       A  mount  ceases to be a member of a peer group when either the mount is
       explicitly unmounted, or when the mount is implicitly unmounted  because
       a mount namespace is removed (because it has no more member processes).

       The  propagation  type of the mounts in a mount namespace can be discov-
       ered via the "optional fields"  exposed  in  /proc/pid/mountinfo.   (See
       proc(5) for details of this file.)  The following tags can appear in the
       optional fields for a record in that file:

       shared:X
              This  mount  is  shared  in  peer group X.  Each peer group has a
              unique ID that is automatically generated by the kernel, and  all
              mounts  in the same peer group will show the same ID.  (These IDs
              are assigned starting from the value 1, and may be recycled  when
              a peer group ceases to have any members.)

       master:X
              This mount is a slave to shared peer group X.

       propagate_from:X (since Linux 2.6.26)
              This  mount  is a slave and receives propagation from shared peer
              group X.  This tag will always appear in conjunction with a  mas-
              ter:X  tag.  Here, X is the closest dominant peer group under the
              process's root directory.  If X is the immediate  master  of  the
              mount, or if there is no dominant peer group under the same root,
              then  only  the  master:X  field  is  present  and not the propa-
              gate_from:X field.  For further details, see below.

       unbindable
              This is an unbindable mount.

       If none of the above tags is present, then this is a private mount.
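
       The tag rules above can be sketched as a small filter that classifies
       each record.  This is an illustration only: the function name
       mount_propagation is an assumption, and the script relies on the
       optional fields starting at field 7 and ending at the "-" separator,
       as described in proc(5).

```shell
# mount_propagation: read mountinfo records on standard input and print
# "<mount point> <propagation type>" for each record.
mount_propagation() {
    awk '{
        shared = 0; master = 0; unbindable = 0
        # Optional fields run from field 7 up to the "-" separator.
        for (i = 7; i <= NF && $i != "-"; i++) {
            if ($i ~ /^shared:/)         shared = 1
            else if ($i ~ /^master:/)    master = 1
            else if ($i == "unbindable") unbindable = 1
        }
        if (unbindable)            kind = "unbindable"
        else if (shared && master) kind = "slave+shared"
        else if (shared)           kind = "shared"
        else if (master)           kind = "slave"
        else                       kind = "private"   # no tags present
        print $5, kind
    }'
}
```

       A typical invocation would be "mount_propagation < /proc/self/mountinfo".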

   MS_SHARED and MS_PRIVATE example
       Suppose that on a terminal in the initial mount namespace, we  mark  one
       mount  as  shared  and  another  as private, and then view the mounts in
       /proc/self/mountinfo:

           sh1# mount --make-shared /mntS
           sh1# mount --make-private /mntP
           sh1# cat /proc/self/mountinfo | grep '/mnt' | sed 's/ - .*//'
           77 61 8:17 / /mntS rw,relatime shared:1
           83 61 8:15 / /mntP rw,relatime

       From the /proc/self/mountinfo output, we see  that  /mntS  is  a  shared
       mount  in  peer group 1, and that /mntP has no optional tags, indicating
       that it is a private mount.  The first two fields in each record in this
       file are the unique ID for this mount, and the mount ID  of  the  parent
       mount.  We can further inspect this file to see that the parent mount of
       /mntS and /mntP is the root directory, /, which is mounted as private:

           sh1# cat /proc/self/mountinfo | awk '$1 == 61' | sed 's/ - .*//'
           61 0 8:2 / / rw,relatime

       On  a  second  terminal,  we create a new mount namespace where we run a
       second shell and inspect the mounts:

           $ PS1='sh2# ' sudo unshare -m --propagation unchanged sh
           sh2# cat /proc/self/mountinfo | grep '/mnt' | sed 's/ - .*//'
           222 145 8:17 / /mntS rw,relatime shared:1
           225 145 8:15 / /mntP rw,relatime

       The new mount namespace received a copy of the initial mount namespace's
       mounts.  These new mounts maintain the same propagation types, but  have
       unique  mount  IDs.   (The  --propagation  unchanged option prevents un-
       share(1) from marking all mounts as private when creating  a  new  mount
       namespace, which it does by default.)

       In the second terminal, we then create submounts under each of /mntS and
       /mntP and inspect the set-up:

           sh2# mkdir /mntS/a
           sh2# mount /dev/sdb6 /mntS/a
           sh2# mkdir /mntP/b
           sh2# mount /dev/sdb7 /mntP/b
           sh2# cat /proc/self/mountinfo | grep '/mnt' | sed 's/ - .*//'
           222 145 8:17 / /mntS rw,relatime shared:1
           225 145 8:15 / /mntP rw,relatime
           178 222 8:22 / /mntS/a rw,relatime shared:2
           230 225 8:23 / /mntP/b rw,relatime

       From  the  above, it can be seen that /mntS/a was created as shared (in-
       heriting this setting from its parent mount) and /mntP/b was created  as
       a private mount.

       Returning  to  the first terminal and inspecting the set-up, we see that
       the new mount created under the shared mount  /mntS  propagated  to  its
       peer  mount  (in the initial mount namespace), but the new mount created
       under the private mount /mntP did not propagate:

           sh1# cat /proc/self/mountinfo | grep '/mnt' | sed 's/ - .*//'
           77 61 8:17 / /mntS rw,relatime shared:1
           83 61 8:15 / /mntP rw,relatime
           179 77 8:22 / /mntS/a rw,relatime shared:2
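
       The comparison made by eye above can also be sketched as a script.
       The sketch assumes that the mountinfo output of each shell has been
       saved to a file; the function name mounts_missing_from and any file
       names are hypothetical.  Mounts are matched by mount point (field 5),
       which works here because both namespaces use the same pathnames.

```shell
# mounts_missing_from A B: print the mount points that are listed in
# mountinfo snapshot B but have no counterpart in snapshot A.
mounts_missing_from() {
    awk 'FILENAME == ARGV[1] { seen[$5] = 1; next }   # first file: remember
         !($5 in seen)       { print $5 }' "$1" "$2"  # second file: report
}
```

       Given snapshots of the two shells at this point in the session, the
       function would report /mntP/b as present only in the second namespace,
       while /mntS/a appears in both.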

   MS_SLAVE example
       Making a mount a slave allows it  to  receive  propagated  mount(2)  and
       umount(2)  events  from  a master shared peer group, while preventing it
       from propagating events to that master.  This is useful if  we  want  to
       (say)  receive a mount event when an optical disk is mounted in the mas-
       ter shared peer group (in another mount namespace), but want to  prevent
       mount(2) and umount(2) events under the slave mount from having side ef-
       fects in other namespaces.

       We  can demonstrate the effect of slaving by first marking two mounts as
       shared in the initial mount namespace:

           sh1# mount --make-shared /mntX
           sh1# mount --make-shared /mntY
           sh1# cat /proc/self/mountinfo | grep '/mnt' | sed 's/ - .*//'
           132 83 8:23 / /mntX rw,relatime shared:1
           133 83 8:22 / /mntY rw,relatime shared:2

       On a second terminal, we create a new mount namespace  and  inspect  the
       mounts:

           sh2# unshare -m --propagation unchanged sh
           sh2# cat /proc/self/mountinfo | grep '/mnt' | sed 's/ - .*//'
           168 167 8:23 / /mntX rw,relatime shared:1
           169 167 8:22 / /mntY rw,relatime shared:2

       In the new mount namespace, we then mark one of the mounts as a slave:

           sh2# mount --make-slave /mntY
           sh2# cat /proc/self/mountinfo | grep '/mnt' | sed 's/ - .*//'
           168 167 8:23 / /mntX rw,relatime shared:1
           169 167 8:22 / /mntY rw,relatime master:2

       From  the  above  output, we see that /mntY is now a slave mount that is
       receiving propagation events from the shared peer group with the ID 2.

       Continuing in the new namespace, we create submounts under each of /mntX
       and /mntY:

           sh2# mkdir /mntX/a
           sh2# mount /dev/sda3 /mntX/a
           sh2# mkdir /mntY/b
           sh2# mount /dev/sda5 /mntY/b

       When we inspect the state of the mounts in the new mount  namespace,  we
       see  that  /mntX/a  was  created  as  a new shared mount (inheriting the
       "shared" setting from its parent mount) and /mntY/b  was  created  as  a
       private mount:

           sh2# cat /proc/self/mountinfo | grep '/mnt' | sed 's/ - .*//'
           168 167 8:23 / /mntX rw,relatime shared:1
           169 167 8:22 / /mntY rw,relatime master:2
           173 168 8:3 / /mntX/a rw,relatime shared:3
           175 169 8:5 / /mntY/b rw,relatime

       Returning to the first terminal (in the initial mount namespace), we see
       that  the  mount  /mntX/a propagated to the peer (the shared /mntX), but
       the mount /mntY/b was not propagated:

           sh1# cat /proc/self/mountinfo | grep '/mnt' | sed 's/ - .*//'
           132 83 8:23 / /mntX rw,relatime shared:1
           133 83 8:22 / /mntY rw,relatime shared:2
           174 132 8:3 / /mntX/a rw,relatime shared:3

       Now we create a new mount under /mntY in the first shell:

           sh1# mkdir /mntY/c
           sh1# mount /dev/sda1 /mntY/c
           sh1# cat /proc/self/mountinfo | grep '/mnt' | sed 's/ - .*//'
           132 83 8:23 / /mntX rw,relatime shared:1
           133 83 8:22 / /mntY rw,relatime shared:2
           174 132 8:3 / /mntX/a rw,relatime shared:3
           178 133 8:1 / /mntY/c rw,relatime shared:4

       When we examine the mounts in the second mount namespace, we see that in
       this case the new mount has been propagated to the slave mount, and that
       the new mount is itself a slave mount (to peer group 4):

           sh2# cat /proc/self/mountinfo | grep '/mnt' | sed 's/ - .*//'
           168 167 8:23 / /mntX rw,relatime shared:1
           169 167 8:22 / /mntY rw,relatime master:2
           173 168 8:3 / /mntX/a rw,relatime shared:3
           175 169 8:5 / /mntY/b rw,relatime
           179 169 8:1 / /mntY/c rw,relatime master:4

   MS_UNBINDABLE example
       One of the primary purposes of unbindable mounts is to avoid the  "mount
       explosion"  problem  when repeatedly performing bind mounts of a higher-
       level subtree at a lower-level mount.  The problem is illustrated by the
       following shell session.

       Suppose we have a system with the following mounts:

           # mount | awk '{print $1, $2, $3}'
           /dev/sda1 on /
           /dev/sdb6 on /mntX
           /dev/sdb7 on /mntY

       Suppose furthermore that we wish to recursively bind mount the root  di-
       rectory under several users' home directories.  We do this for the first
       user, and inspect the mounts:

           # mount --rbind / /home/cecilia/
           # mount | awk '{print $1, $2, $3}'
           /dev/sda1 on /
           /dev/sdb6 on /mntX
           /dev/sdb7 on /mntY
           /dev/sda1 on /home/cecilia
           /dev/sdb6 on /home/cecilia/mntX
           /dev/sdb7 on /home/cecilia/mntY

       When  we  repeat this operation for the second user, we start to see the
       explosion problem:

           # mount --rbind / /home/henry
           # mount | awk '{print $1, $2, $3}'
           /dev/sda1 on /
           /dev/sdb6 on /mntX
           /dev/sdb7 on /mntY
           /dev/sda1 on /home/cecilia
           /dev/sdb6 on /home/cecilia/mntX
           /dev/sdb7 on /home/cecilia/mntY
           /dev/sda1 on /home/henry
           /dev/sdb6 on /home/henry/mntX
           /dev/sdb7 on /home/henry/mntY
           /dev/sda1 on /home/henry/home/cecilia
           /dev/sdb6 on /home/henry/home/cecilia/mntX
           /dev/sdb7 on /home/henry/home/cecilia/mntY

       Under /home/henry, we have not only  recursively  added  the  /mntX  and
       /mntY  mounts,  but also the recursive mounts of those directories under
       /home/cecilia that were created in the previous  step.   Upon  repeating
       the  step for a third user, it becomes obvious that the explosion is ex-
       ponential in nature:

           # mount --rbind / /home/otto
           # mount | awk '{print $1, $2, $3}'
           /dev/sda1 on /
           /dev/sdb6 on /mntX
           /dev/sdb7 on /mntY
           /dev/sda1 on /home/cecilia
           /dev/sdb6 on /home/cecilia/mntX
           /dev/sdb7 on /home/cecilia/mntY
           /dev/sda1 on /home/henry
           /dev/sdb6 on /home/henry/mntX
           /dev/sdb7 on /home/henry/mntY
           /dev/sda1 on /home/henry/home/cecilia
           /dev/sdb6 on /home/henry/home/cecilia/mntX
           /dev/sdb7 on /home/henry/home/cecilia/mntY
           /dev/sda1 on /home/otto
           /dev/sdb6 on /home/otto/mntX
           /dev/sdb7 on /home/otto/mntY
           /dev/sda1 on /home/otto/home/cecilia
           /dev/sdb6 on /home/otto/home/cecilia/mntX
           /dev/sdb7 on /home/otto/home/cecilia/mntY
           /dev/sda1 on /home/otto/home/henry
           /dev/sdb6 on /home/otto/home/henry/mntX
           /dev/sdb7 on /home/otto/home/henry/mntY
           /dev/sda1 on /home/otto/home/henry/home/cecilia
           /dev/sdb6 on /home/otto/home/henry/home/cecilia/mntX
           /dev/sdb7 on /home/otto/home/henry/home/cecilia/mntY

       The mount explosion problem in the above scenario can be avoided by mak-
       ing each of the new mounts unbindable.  The effect of doing this is that
       recursive mounts of the root directory will not replicate the unbindable
       mounts.  We make such a mount for the first user:

           # mount --rbind --make-unbindable / /home/cecilia

       Before going further, we show that unbindable mounts are indeed  unbind-
       able:

           # mkdir /mntZ
           # mount --bind /home/cecilia /mntZ
           mount: wrong fs type, bad option, bad superblock on /home/cecilia,
                  missing codepage or helper program, or other error

                  In some cases useful info is found in syslog - try
                  dmesg | tail or so.

       Now we create unbindable recursive bind mounts for the other two users:

           # mount --rbind --make-unbindable / /home/henry
           # mount --rbind --make-unbindable / /home/otto

       Upon examining the list of mounts, we see there has been no explosion of
       mounts,  because  the  unbindable  mounts were not replicated under each
       user's directory:

           # mount | awk '{print $1, $2, $3}'
           /dev/sda1 on /
           /dev/sdb6 on /mntX
           /dev/sdb7 on /mntY
           /dev/sda1 on /home/cecilia
           /dev/sdb6 on /home/cecilia/mntX
           /dev/sdb7 on /home/cecilia/mntY
           /dev/sda1 on /home/henry
           /dev/sdb6 on /home/henry/mntX
           /dev/sdb7 on /home/henry/mntY
           /dev/sda1 on /home/otto
           /dev/sdb6 on /home/otto/mntX
           /dev/sdb7 on /home/otto/mntY
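
       The mount counts in the two scenarios above can be reproduced with a
       little arithmetic.  The following sketch is a simplified model and
       does not perform any mounts: it assumes that a plain recursive bind
       of / replicates every existing mount, and that in the unbindable case
       only the initial mounts are replicated.  The function name
       mounts_after is hypothetical.

```shell
# mounts_after INITIAL USERS MODE
# MODE "plain":      each rbind copies all existing mounts (doubling).
# MODE "unbindable": each rbind copies only the INITIAL mounts, because
#                    the earlier per-user copies are not replicated.
mounts_after() {
    total=$1
    i=0
    while [ "$i" -lt "$2" ]; do
        case $3 in
            plain)      total=$((total * 2)) ;;
            unbindable) total=$((total + $1)) ;;
            *)          echo "unknown mode: $3" >&2; return 1 ;;
        esac
        i=$((i + 1))
    done
    echo "$total"
}

mounts_after 3 3 plain        # prints 24
mounts_after 3 3 unbindable   # prints 12
```

       With three initial mounts and three users, the results match the 24
       and 12 mounts listed in the sessions above.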

   Propagation type transitions
       The following table shows the effect that  applying  a  new  propagation
       type (i.e., mount --make-xxxx) has on the existing propagation type of a
       mount.   The  rows  correspond  to  existing  propagation types, and the
       columns are the new propagation settings.  For reasons of space,
       "private" is abbreviated as "priv" and "unbindable" as "unbind".

                     make-shared   make-slave      make-priv  make-unbind
       ─────────────┬───────────────────────────────────────────────────────
       shared       │shared        slave/priv [1]  priv       unbind
       slave        │slave+shared  slave [2]       priv       unbind
       slave+shared │slave+shared  slave           priv       unbind
       private      │shared        priv [2]        priv       unbind
       unbindable   │shared        unbind [2]      priv       unbind

       Note the following details to the table:

       [1]  If  a shared mount is the only mount in its peer group, making it a
            slave automatically makes it private.

       [2]  Slaving a nonshared mount has no effect on the mount.
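
       As an aid to reading the table, the transitions can be written out as
       a lookup function.  This is only a sketch of the table and its two
       notes, not of the kernel's implementation; the function name
       next_propagation and its third argument are assumptions.

```shell
# next_propagation CURRENT REQUEST [PEERS]
# CURRENT: shared | slave | slave+shared | private | unbindable
# REQUEST: make-shared | make-slave | make-private | make-unbindable
# PEERS:   number of other mounts in the peer group (default 1); only
#          consulted when a shared mount is made a slave (note [1]).
next_propagation() {
    case $2 in
        make-private)    echo private ;;
        make-unbindable) echo unbindable ;;
        make-shared)
            case $1 in
                slave|slave+shared) echo slave+shared ;;
                *)                  echo shared ;;
            esac ;;
        make-slave)
            case $1 in
                shared)
                    # Note [1]: a shared mount with no peers becomes private.
                    if [ "${3:-1}" -gt 0 ]; then echo slave; else echo private; fi ;;
                slave|slave+shared) echo slave ;;
                *)  echo "$1" ;;    # note [2]: no effect on nonshared mounts
            esac ;;
        *)  echo "unknown request: $2" >&2; return 1 ;;
    esac
}
```

       For instance, "next_propagation slave make-shared" prints slave+shared,
       and "next_propagation shared make-slave 0" prints private.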

   Bind (MS_BIND) semantics
       Suppose that the following command is performed:

           mount --bind A/a B/b

       Here, A is the source mount, B is the destination mount, a is  a  subdi-
       rectory path under the mount point A, and b is a subdirectory path under
       the  mount  point  B.  The propagation type of the resulting mount, B/b,
       depends on the propagation types of the mounts A and B,  and  is  summa-
       rized in the following table.

                                  source(A)
                          shared  private    slave         unbind
       ──────────────────┬──────────────────────────────────────────
       dest(B)  shared   │shared  shared     slave+shared  invalid
                nonshared│shared  private    slave         invalid

       Note  that  a  recursive bind of a subtree follows the same semantics as
       for a bind operation on each mount in the subtree.   (Unbindable  mounts
       are automatically pruned at the target mount point.)

       For  further details, see Documentation/filesystems/sharedsubtree.rst in
       the kernel source tree.
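
       The bind table can likewise be expressed as a lookup.  The sketch
       below encodes only the table shown in this section; the function name
       bind_propagation and its return codes are assumptions made for
       illustration.

```shell
# bind_propagation SOURCE DEST
# SOURCE: propagation type of A: shared | private | slave | unbindable
# DEST:   propagation type of B: shared | nonshared
# Prints the propagation type of the new mount B/b.
# Returns 1 for an invalid bind, 2 for unrecognized arguments.
bind_propagation() {
    case "$2:$1" in
        *:unbindable)      echo invalid; return 1 ;;
        shared:shared)     echo shared ;;
        shared:private)    echo shared ;;
        shared:slave)      echo slave+shared ;;
        nonshared:shared)  echo shared ;;
        nonshared:private) echo private ;;
        nonshared:slave)   echo slave ;;
        *)  echo "unknown types: source=$1 dest=$2" >&2; return 2 ;;
    esac
}
```

       For example, "bind_propagation slave shared" prints slave+shared.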

   Move (MS_MOVE) semantics
       Suppose that the following command is performed:

           mount --move A B/b

       Here, A is the source mount, B is the destination mount, and b is a sub-
       directory path under the mount point B.  The propagation type of the re-
       sulting mount, B/b, depends on the propagation types of the mounts A and
       B, and is summarized in the following table.

                                  source(A)
                          shared  private    slave         unbind
       ──────────────────┬─────────────────────────────────────────────
       dest(B)  shared   │shared  shared     slave+shared  invalid
                nonshared│shared  private    slave         unbindable

       Note: moving a mount that resides under a shared mount is invalid.

       For further details, see Documentation/filesystems/sharedsubtree.rst  in
       the kernel source tree.

   Mount semantics
       Suppose that we use the following command to create a mount:

           mount device B/b

       Here, B is the destination mount, and b is a subdirectory path under the
       mount  point  B.  The propagation type of the resulting mount, B/b, fol-
       lows the same rules as for a bind mount, where the propagation  type  of
       the source mount is considered always to be private.

   Unmount semantics
       Suppose that we use the following command to tear down a mount:

           umount A

       Here, A is a mount on B/b, where B is the parent mount and b is a subdi-
       rectory path under the mount point B.  If B is shared, then all most-re-
       cently-mounted mounts at b on mounts that receive propagation from mount
       B and do not have submounts under them are unmounted.

   The /proc/pid/mountinfo propagate_from tag
       The   propagate_from:X  tag  is  shown  in  the  optional  fields  of  a
       /proc/pid/mountinfo record in cases where a process can't see a  slave's
       immediate master (i.e., the pathname of the master is not reachable from
       the  filesystem  root  directory)  and  so cannot determine the chain of
       propagation between the mounts it can see.

       In the following example, we first create a two-link master-slave  chain
       between the mounts /mnt, /tmp/etc, and /mnt/tmp/etc.  Then the chroot(1)
       command  is  used  to make the /tmp/etc mount point unreachable from the
       root directory, creating a situation where the master of /mnt/tmp/etc is
       not reachable from the (new) root directory of the process.

       First, we bind mount the root directory onto /mnt and  then  bind  mount
       /proc  at  /mnt/proc  so  that  after  the  later  chroot(1) the proc(5)
       filesystem remains visible at the correct location in the chroot-ed  en-
       vironment.

           # mkdir -p /mnt/proc
           # mount --bind / /mnt
           # mount --bind /proc /mnt/proc

       Next,  we  ensure  that  the  /mnt mount is a shared mount in a new peer
       group (with no peers):

           # mount --make-private /mnt  # Isolate from any previous peer group
           # mount --make-shared /mnt
           # cat /proc/self/mountinfo | grep '/mnt' | sed 's/ - .*//'
           239 61 8:2 / /mnt ... shared:102
           248 239 0:4 / /mnt/proc ... shared:5

       Next, we bind mount /mnt/etc onto /tmp/etc:

           # mkdir -p /tmp/etc
           # mount --bind /mnt/etc /tmp/etc
           # cat /proc/self/mountinfo | grep -E '/mnt|/tmp/' | sed 's/ - .*//'
           239 61 8:2 / /mnt ... shared:102
           248 239 0:4 / /mnt/proc ... shared:5
           267 40 8:2 /etc /tmp/etc ... shared:102

       Initially, these two mounts are in the same peer group, but we then make
       /tmp/etc a slave of /mnt/etc, and then make /tmp/etc shared as well, so
       that it can propagate events to the next slave in the chain:

           # mount --make-slave /tmp/etc
           # mount --make-shared /tmp/etc
           # cat /proc/self/mountinfo | grep -E '/mnt|/tmp/' | sed 's/ - .*//'
           239 61 8:2 / /mnt ... shared:102
           248 239 0:4 / /mnt/proc ... shared:5
           267 40 8:2 /etc /tmp/etc ... shared:105 master:102

       Then we bind mount /tmp/etc onto /mnt/tmp/etc.  Again,  the  two  mounts
       are  initially  in  the same peer group, but we then make /mnt/tmp/etc a
       slave of /tmp/etc:

           # mkdir -p /mnt/tmp/etc
           # mount --bind /tmp/etc /mnt/tmp/etc
           # mount --make-slave /mnt/tmp/etc
           # cat /proc/self/mountinfo | grep -E '/mnt|/tmp/' | sed 's/ - .*//'
           239 61 8:2 / /mnt ... shared:102
           248 239 0:4 / /mnt/proc ... shared:5
           267 40 8:2 /etc /tmp/etc ... shared:105 master:102
           273 239 8:2 /etc /mnt/tmp/etc ... master:105

       From the above, we see that /mnt is the master of  the  slave  /tmp/etc,
       which in turn is the master of the slave /mnt/tmp/etc.

       We then chroot(1) to the /mnt directory, which renders the mount with ID
       267 unreachable from the (new) root directory:

           # chroot /mnt

       When  we  examine  the state of the mounts inside the chroot-ed environ-
       ment, we see the following:

           # cat /proc/self/mountinfo | sed 's/ - .*//'
           239 61 8:2 / / ... shared:102
           248 239 0:4 / /proc ... shared:5
           273 239 8:2 /etc /tmp/etc ... master:105 propagate_from:102

       Above, we see that the mount with ID 273 is a slave whose master is  the
       peer  group 105.  The mount point for that master is unreachable, and so
       a propagate_from tag is displayed, indicating that the closest  dominant
       peer group (i.e., the nearest reachable mount in the slave chain) is the
       peer group with the ID 102 (corresponding to the /mnt mount point before
       the chroot(1) was performed).
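
       When inspecting longer slave chains, it can help to extract just the
       master and propagate_from IDs from each record.  The following filter
       is a minimal sketch; the function name slave_sources and the output
       layout are assumptions, and a "-" is printed when no propagate_from
       tag is present.

```shell
# slave_sources: for each slave mount among the mountinfo records read
# from standard input, print
#     <mount point> master=<X> propagate_from=<Y or ->
slave_sources() {
    awk '{
        master = ""; from = "-"
        # Optional fields run from field 7 up to the "-" separator.
        for (i = 7; i <= NF && $i != "-"; i++) {
            if ($i ~ /^master:/)              master = substr($i, 8)
            else if ($i ~ /^propagate_from:/) from = substr($i, 16)
        }
        if (master != "")
            print $5, "master=" master, "propagate_from=" from
    }'
}
```

       Run as "slave_sources < /proc/self/mountinfo" inside the chroot-ed
       environment above, this would be expected to print a single line for
       /tmp/etc with master=105 and propagate_from=102.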

STANDARDS
       Linux.

HISTORY
       Linux 2.4.19.

NOTES
       The  propagation type assigned to a new mount depends on the propagation
       type of the parent mount.  If the mount has a parent (i.e., it is a non-
       root mount point) and the propagation type of the parent  is  MS_SHARED,
       then  the  propagation  type of the new mount is also MS_SHARED.  Other-
       wise, the propagation type of the new mount is MS_PRIVATE.

       Notwithstanding the fact that the default propagation type for a new
       mount is in many cases MS_PRIVATE, MS_SHARED is typically more useful.  For
       this  reason,  systemd(1) automatically remounts all mounts as MS_SHARED
       on system startup.  Thus, on most modern systems, the  default  propaga-
       tion type is in practice MS_SHARED.

       Since, when one uses unshare(1) to create a mount namespace, the goal is
       commonly  to  provide full isolation of the mounts in the new namespace,
       unshare(1) (since util-linux 2.27) in turn reverses the  step  performed
       by  systemd(1), by making all mounts private in the new namespace.  That
       is, unshare(1) performs the equivalent of the following in the new mount
       namespace:

           mount --make-rprivate /

       To prevent this, one can use the --propagation unchanged option  to  un-
       share(1).

       An  application  that  creates  a  new  mount  namespace  directly using
       clone(2) or unshare(2) may desire to prevent propagation of mount events
       to other mount namespaces (as is done by unshare(1)).  This can be  done
       by  changing  the propagation type of mounts in the new namespace to ei-
       ther MS_SLAVE or MS_PRIVATE, using a call such as the following:

           mount(NULL, "/", MS_SLAVE | MS_REC, NULL);

       For a discussion of propagation types when moving mounts  (MS_MOVE)  and
       creating bind mounts (MS_BIND), see Documentation/filesystems/sharedsub-
       tree.rst.

   Restrictions on mount namespaces
       Note the following points with respect to mount namespaces:

       [1]  Each  mount  namespace  has  an owner user namespace.  As explained
            above, when a new mount namespace is created,  its  mount  list  is
            initialized as a copy of the mount list of another mount namespace.
            If  the  new  namespace and the namespace from which the mount list
            was copied are owned by different user  namespaces,  then  the  new
            mount namespace is considered less privileged.

       [2]  When  creating a less privileged mount namespace, shared mounts are
            reduced to slave mounts.  This ensures that mappings  performed  in
            less  privileged mount namespaces will not propagate to more privi-
            leged mount namespaces.

       [3]  Mounts that come as a single unit  from  a  more  privileged  mount
            namespace  are  locked  together and may not be separated in a less
            privileged mount namespace.  (The unshare(2) CLONE_NEWNS  operation
            brings  across  all of the mounts from the original mount namespace
            as a single unit, and recursive mounts that propagate between mount
            namespaces propagate as a single unit.)

            In this context, "may not be separated" means that the  mounts  are
            locked  so  that  they may not be individually unmounted.  Consider
            the following example:

                $ sudo sh
                # mount --bind /dev/null /etc/shadow
                # cat /etc/shadow       # Produces no output

            The above steps, performed in a more  privileged  mount  namespace,
            have  created a bind mount that obscures the contents of the shadow
            password file, /etc/shadow.  For security reasons, it should not be
            possible to umount(2) that mount in a less privileged  mount  name-
            space, since that would reveal the contents of /etc/shadow.

            Suppose  we  now  create  a new mount namespace owned by a new user
            namespace.  The new mount namespace will inherit copies of  all  of
            the  mounts  from  the  previous  mount  namespace.  However, those
             mounts will be locked because the new mount namespace is less
             privileged.  Consequently, an attempt to umount(2) the mount fails,
             as shown in the following step:

                # unshare --user --map-root-user --mount \
                               strace -o /tmp/log \
                               umount /etc/shadow
                umount: /etc/shadow: not mounted.
                # grep '^umount' /tmp/log
                umount2("/etc/shadow", 0)     = -1 EINVAL (Invalid argument)

            The  error  message  from umount(8) is a little confusing, but the
            strace(1) output reveals that the underlying umount2(2) system call
            failed with the error EINVAL, which is the error  that  the  kernel
            returns to indicate that the mount is locked.

            Note,  however,  that it is possible to stack (and unstack) a mount
            on top of one of the inherited locked mounts in a  less  privileged
            mount namespace:

                # echo 'aaaaa' > /tmp/a    # File to mount onto /etc/shadow
                # unshare --user --map-root-user --mount \
                    sh -c 'mount --bind /tmp/a /etc/shadow; cat /etc/shadow'
                aaaaa
                # umount /etc/shadow

            The  final  umount(8) command above, which is performed in the ini-
            tial mount namespace, makes the original /etc/shadow file once more
            visible in that namespace.

       [4]  Following on from point [3], note that it is possible to  umount(2)
            an  entire  subtree of mounts that propagated as a unit into a less
            privileged mount namespace, as illustrated in the  following  exam-
            ple.

            First,  we  create  new user and mount namespaces using unshare(1).
            In the new mount namespace, the propagation type of all  mounts  is
            set  to private.  We then create a shared bind mount at /mnt, and a
            small hierarchy of mounts underneath that mount.

                $ PS1='ns1# ' sudo unshare --user --map-root-user \
                                       --mount --propagation private bash
                ns1# echo $$        # We need the PID of this shell later
                778501
                ns1# mount --make-shared --bind /mnt /mnt
                ns1# mkdir /mnt/x
                ns1# mount --make-private -t tmpfs none /mnt/x
                ns1# mkdir /mnt/x/y
                ns1# mount --make-private -t tmpfs none /mnt/x/y
                ns1# grep /mnt /proc/self/mountinfo | sed 's/ - .*//'
                986 83 8:5 /mnt /mnt rw,relatime shared:344
                989 986 0:56 / /mnt/x rw,relatime
                990 989 0:57 / /mnt/x/y rw,relatime

            Continuing in the same shell session, we then create a second shell
            in a new user namespace and a new (less privileged) mount namespace
            and check the state of the propagated mounts rooted at /mnt.

                ns1# PS1='ns2# ' unshare --user --map-root-user \
                                       --mount --propagation unchanged bash
                ns2# grep /mnt /proc/self/mountinfo | sed 's/ - .*//'
                1239 1204 8:5 /mnt /mnt rw,relatime master:344
                1240 1239 0:56 / /mnt/x rw,relatime
                1241 1240 0:57 / /mnt/x/y rw,relatime

            Of note in the above output is that the  propagation  type  of  the
            mount  /mnt  has  been reduced to slave, as explained in point [2].
            This means that submount events will propagate from the master /mnt
            in "ns1", but propagation will not occur in the opposite direction.

            From a separate terminal window, we then use  nsenter(1)  to  enter
            the mount and user namespaces corresponding to "ns1".  In that ter-
            minal window, we then recursively bind mount /mnt/x at the location
            /mnt/ppp.

                $ PS1='ns3# ' sudo nsenter -t 778501 --user --mount
                ns3# mount --rbind --make-private /mnt/x /mnt/ppp
                ns3# grep /mnt /proc/self/mountinfo | sed 's/ - .*//'
                986 83 8:5 /mnt /mnt rw,relatime shared:344
                989 986 0:56 / /mnt/x rw,relatime
                990 989 0:57 / /mnt/x/y rw,relatime
                1242 986 0:56 / /mnt/ppp rw,relatime
                1243 1242 0:57 / /mnt/ppp/y rw,relatime shared:518

            Because the propagation type of the parent mount, /mnt, was shared,
            the recursive bind mount propagated a small subtree of mounts under
            the  slave  mount  /mnt into "ns2", as can be verified by executing
            the following command in that shell session:

                ns2# grep /mnt /proc/self/mountinfo | sed 's/ - .*//'
                1239 1204 8:5 /mnt /mnt rw,relatime master:344
                1240 1239 0:56 / /mnt/x rw,relatime
                1241 1240 0:57 / /mnt/x/y rw,relatime
                1244 1239 0:56 / /mnt/ppp rw,relatime
                1245 1244 0:57 / /mnt/ppp/y rw,relatime master:518

            While it is not possible to umount(2) a part of the propagated sub-
            tree (/mnt/ppp/y) in "ns2", it is possible to umount(2) the  entire
            subtree, as shown by the following commands:

                ns2# umount /mnt/ppp/y
                umount: /mnt/ppp/y: not mounted.
                ns2# umount -l /mnt/ppp      # Succeeds...
                ns2# grep /mnt /proc/self/mountinfo | sed 's/ - .*//'
                1239 1204 8:5 /mnt /mnt rw,relatime master:344
                1240 1239 0:56 / /mnt/x rw,relatime
                1241 1240 0:57 / /mnt/x/y rw,relatime
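
            Which mounts make up such a subtree can be read from the  mount  ID
            and  parent  ID  fields (the first two fields) of /proc/pid/mount-
            info.  The sketch below is illustrative only: the name  mount_sub-
            tree  is  hypothetical, a POSIX awk(1) is assumed, and every mount
            at the given mount point is treated as a subtree root, which is  a
            simplification when mounts are stacked.

```shell
# Hypothetical helper: print the mount points of the mount at "$1" and of
# all mounts below it, using mountinfo text read from standard input.
mount_subtree() {
    awk -v top="$1" '
        { id[NR] = $1; parent[NR] = $2; point[NR] = $5 }
        END {
            # Start with the mount(s) at the requested mount point
            for (i = 1; i <= NR; i++)
                if (point[i] == top) member[id[i]] = 1
            # Add children until nothing new is found, so the result
            # does not depend on the order of the input lines
            do {
                added = 0
                for (i = 1; i <= NR; i++)
                    if ((parent[i] in member) && !(id[i] in member)) {
                        member[id[i]] = 1
                        added = 1
                    }
            } while (added)
            for (i = 1; i <= NR; i++)
                if (id[i] in member) print point[i]
        }'
}
```

            In "ns2", before the unmount shown above, "mount_subtree  /mnt/ppp
            < /proc/self/mountinfo" would be expected to list /mnt/ppp and
            /mnt/ppp/y, the two mounts that propagated as a unit.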

       [5]  The settings of the mount(2) flags MS_RDONLY, MS_NOSUID, MS_NOEXEC,
            and the "atime"  flags  (MS_NOATIME,  MS_NODIRATIME,  MS_RELATIME)
            become  locked when propagated from a more privileged to a less
            privileged mount namespace, and may not  be  changed  in  the  less
            privileged mount namespace.

            This point is illustrated in the following example where, in a more
            privileged mount namespace, we create a bind mount that  is  marked
            as  read-only.   For security reasons, it should not be possible to
            make the mount writable in a less privileged mount  namespace,  and
            indeed the kernel prevents this:

                $ sudo mkdir /mnt/dir
                $ sudo mount --bind -o ro /some/path /mnt/dir
                $ sudo unshare --user --map-root-user --mount \
                               mount -o remount,rw /mnt/dir
                mount: /mnt/dir: permission denied.
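
            To see which of these settings are currently in effect on a  mount,
            one  can  inspect  the  per-mount options (the sixth field) in
            /proc/pid/mountinfo.  The following is a rough  sketch:  the  name
            lockable_options is hypothetical, a POSIX awk(1) is assumed, and
            only the options corresponding to the flags listed above are  con-
            sidered.

```shell
# Hypothetical helper: for the mount point "$1", print those per-mount
# options (field 6 of mountinfo, read from standard input) that correspond
# to MS_RDONLY, MS_NOSUID, MS_NOEXEC, and the "atime" flags.
lockable_options() {
    awk -v point="$1" '$5 == point {
        n = split($6, opt, ",")
        out = ""
        for (i = 1; i <= n; i++)
            if (opt[i] ~ /^(ro|nosuid|noexec|noatime|nodiratime|relatime)$/)
                out = out (out == "" ? "" : " ") opt[i]
        print out
    }'
}
```

            For the read-only bind mount created above, "lockable_options
            /mnt/dir < /proc/self/mountinfo" would be expected to include "ro"
            in its output.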

       [6]  A file or directory that is a mount point in  one  mount  namespace
            but  not in another may be renamed, unlinked, or removed (rmdir(2))
            in the mount namespace in which it is not a mount  point  (subject
            to  the  usual  permission checks).  Consequently, the mount point
            is removed in the mount namespace where it was a mount point.

            Previously  (before  Linux  3.18), attempting to unlink, rename, or
            remove a file or directory that was a mount point in another  mount
            namespace would result in the error EBUSY.  That behavior had tech-
            nical problems of enforcement (e.g., for NFS) and permitted denial-
            of-service  attacks against more privileged users (i.e., preventing
            individual files from being updated by  bind  mounting  on  top  of
            them).

EXAMPLES
       See pivot_root(2).

SEE ALSO
       unshare(1),   clone(2),   mount(2),   mount_setattr(2),   pivot_root(2),
       setns(2),  umount(2),  unshare(2),  proc(5),  namespaces(7),  user_name-
       spaces(7),   findmnt(8),   mount(8),   pam_namespace(8),  pivot_root(8),
       umount(8)

       Documentation/filesystems/sharedsubtree.rst in the kernel source tree.

Linux man-pages 6.9.1              2024-06-15               mount_namespaces(7)
