I want to isolate a potentially dangerous process without needing root privileges.
I create new user and mount namespaces with clone3, but in the child the pivot_root syscall fails with EINVAL:
struct clone_args ca = {
    .flags = CLONE_NEWUSER | CLONE_NEWNS,
};
pid = syscall(SYS_clone3, &ca, sizeof ca);
if (pid == 0) {
    /* Make every mount private so changes do not propagate back out. */
    mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL);
    /* Bind-mount the current directory onto itself to turn it into a mount point. */
    mount(".", ".", NULL, MS_BIND | MS_REC, NULL);
    mkdir("./oldroot", 0777);
    syscall(SYS_pivot_root, ".", "./oldroot"); // Fails with EINVAL
}
I added printk calls to my kernel, recompiled it, and confirmed that pivot_root fails at this check: if (new_mnt->mnt.mnt_flags & MNT_LOCKED).
What does MNT_LOCKED mean? What can I do to make new_mnt not have this flag? Is it even possible to do that in a user namespace, as a regular user?
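In case it helps anyone reproduce this, here is a minimal sketch of a helper that could be called in the child right before pivot_root to dump the mount table. It assumes /proc is mounted and readable in the child, and dump_mountinfo is just a name I made up for illustration. As far as I know, /proc/self/mountinfo does not report MNT_LOCKED itself, but it does show mount IDs, parent IDs, and mount points, which should at least reveal which mounts exist after the bind mount.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy /proc/self/mountinfo to `out`.
 * Returns the number of complete lines written, or -1 if the file
 * cannot be opened (for example when /proc is not mounted). */
static int dump_mountinfo(FILE *out)
{
    FILE *f = fopen("/proc/self/mountinfo", "r");
    if (f == NULL)
        return -1;

    char line[4096];
    int count = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        fputs(line, out);
        /* A very long entry may span several reads; count it once,
         * when its terminating newline is seen. */
        if (strchr(line, '\n') != NULL)
            count++;
    }
    fclose(f);
    return count;
}
```

Calling dump_mountinfo(stderr) once after the bind mount and once before it makes it possible to compare the two tables and check whether the bind mount shows up as a separate entry.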