Commit 5dffd6e

Add KEP-5502 for EmptyDir volume sticky bit support
This KEP proposes adding an optional `stickyBit` field to EmptyDirVolumeSource that sets directory permissions to 01777 instead of 0777, preventing users from deleting files they don't own. References: Enhancement issue: #5502 Implementation PR: kubernetes/kubernetes#130277
1 parent b14e88a commit 5dffd6e

3 files changed: +425 -0 lines changed
Lines changed: 395 additions & 0 deletions
@@ -0,0 +1,395 @@
1+
# KEP-5502: EmptyDir Volume Sticky Bit Support
2+
3+
<!-- toc -->
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User Stories](#user-stories)
    - [Story 1: Shared Temporary Storage for Multi-User Workloads](#story-1-shared-temporary-storage-for-multi-user-workloads)
  - [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
  - [API Changes](#api-changes)
  - [Implementation](#implementation)
  - [Test Plan](#test-plan)
    - [Prerequisite testing updates](#prerequisite-testing-updates)
    - [Unit tests](#unit-tests)
    - [Integration tests](#integration-tests)
    - [e2e tests](#e2e-tests)
  - [Graduation Criteria](#graduation-criteria)
    - [Alpha](#alpha)
    - [Beta](#beta)
    - [GA](#ga)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
  - [Version Skew Strategy](#version-skew-strategy)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
  - [Feature Enablement and Rollback](#feature-enablement-and-rollback)
  - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
  - [Monitoring Requirements](#monitoring-requirements)
  - [Dependencies](#dependencies)
  - [Scalability](#scalability)
  - [Troubleshooting](#troubleshooting)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
<!-- /toc -->
38+
39+
## Release Signoff Checklist
40+
41+
Items marked with (R) are required *prior to targeting to a milestone / release*.
42+
43+
- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
44+
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
45+
- [ ] (R) Design details are appropriately documented
46+
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
47+
- [ ] e2e Tests for all Beta API Operations (endpoints)
48+
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
49+
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
50+
- [ ] (R) Graduation criteria is in place
51+
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) within one minor version of promotion to GA
52+
- [ ] (R) Production readiness review completed
53+
- [ ] (R) Production readiness review approved
54+
- [ ] "Implementation History" section is up-to-date for milestone
55+
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
56+
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
57+
58+
[kubernetes.io]: https://kubernetes.io/
59+
[kubernetes/enhancements]: https://git.k8s.io/enhancements
60+
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
61+
[kubernetes/website]: https://git.k8s.io/website
62+
63+
## Summary
64+
65+
This KEP proposes adding support for the sticky bit permission (mode 01777) to emptyDir volumes in Kubernetes. The sticky bit is a Unix file permission that restricts file deletion within a directory: only the file owner, the directory owner, or root can delete or rename files, even if all users have write permission. Some applications refuse, for security reasons, to use a world-writable directory as a temporary directory unless it has the sticky bit set. For those applications emptyDir is currently unusable, and users have to resort to ephemeral volumes instead.
66+
67+
## Motivation
68+
69+
The emptyDir volume currently creates directories with mode 0777, allowing any process with write access to delete or rename any file in the volume, regardless of who created it. This behavior can cause problems in multi-user or multi-process workloads where:
70+
71+
1. Multiple containers or processes running as different users share the same emptyDir volume
72+
2. One process accidentally or maliciously deletes files created by another process
73+
3. Init containers and main containers need to share files, but the main container should not be able to delete the init container's files
74+
75+
The sticky bit (mode 01777) is a standard Unix permission that solves this problem by ensuring that only the owner of a file (or the directory owner, or root) can delete or rename it, even when the directory is world-writable.
76+
77+
### Goals
78+
79+
- Add an optional `stickyBit` field to the emptyDir volume specification
80+
- When enabled, create emptyDir volumes with mode 01777 instead of 0777
81+
- Maintain backward compatibility by keeping the default behavior (mode 0777) unchanged
82+
- Support the feature on all platforms that support Unix file permissions
83+
84+
### Non-Goals
85+
86+
- Changing the default behavior of existing emptyDir volumes (mode 0777 remains the default)
87+
- Adding support for other advanced file permission features
88+
- Implementing this feature for volume types other than emptyDir
89+
- Supporting this feature on platforms that don't support Unix-style file permissions (e.g., Windows)
90+
91+
## Proposal
92+
93+
Add a new optional boolean field `stickyBit` to the `EmptyDirVolumeSource` API type. When set to `true`, the kubelet will create the emptyDir volume with mode 01777 (0777 | sticky bit) instead of the default 0777.
94+
95+
### User Stories
96+
97+
#### Story 1: Shared Temporary Storage for Multi-User Workloads
98+
99+
For containerized Ruby apps, `/tmp` folders will be rejected if they do not have the sticky bit set. This means `emptyDir` cannot be reliably used for tmp folders. Instead, ephemeral volumes (which are more complex to manage) or RWX volumes (which are not well supported by many providers) have to be used.
100+
101+
Allowing emptyDir to be mounted with the sticky bit set would substantially reduce complexity for these applications.
102+
103+
### Risks and Mitigations
104+
105+
**Risk**: Users might not understand the sticky bit behavior and be confused when they cannot delete files created by other users.
106+
107+
**Mitigation**: Document the feature clearly with examples. The feature is opt-in, so users must explicitly enable it.
108+
109+
**Risk**: The feature might not work correctly on all container runtimes or storage backends.
110+
111+
**Mitigation**: The sticky bit is a standard Unix permission supported by all major filesystems. The feature is opt-in (users must explicitly set `stickyBit: true`), allowing for gradual adoption and testing.
112+
113+
**Risk**: Existing workloads might be affected if the default changes.
114+
115+
**Mitigation**: The feature is opt-in via a new API field. Existing workloads will continue to use mode 0777 unless explicitly configured otherwise.
116+
117+
## Design Details
118+
119+
### API Changes
120+
121+
Add a new optional field to the `EmptyDirVolumeSource` struct:
122+
123+
```go
type EmptyDirVolumeSource struct {
	// ... existing fields ...

	// StickyBit sets the emptyDir permission to 01777 instead of 0777.
	// When enabled, only the owner of a file can delete or rename it,
	// even if the directory is world-writable.
	// This is similar to the /tmp directory behavior on Unix systems.
	// +optional
	StickyBit *bool `json:"stickyBit,omitempty" protobuf:"varint,3,opt,name=stickyBit"`
}
```
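
The pointer type matters for serialization: a `*bool` with `omitempty` lets clients distinguish an unset field from an explicit `false`. The following is a minimal sketch of that behavior using a local stand-in struct. The names `emptyDirSource` and `encode` are hypothetical and exist only for this illustration; this is not the real `k8s.io/api` type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// emptyDirSource is a hypothetical, trimmed-down stand-in for
// EmptyDirVolumeSource; only the proposed field is modeled here.
type emptyDirSource struct {
	StickyBit *bool `json:"stickyBit,omitempty"`
}

// encode marshals the stand-in struct to its JSON form.
func encode(src emptyDirSource) string {
	b, err := json.Marshal(src)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	on, off := true, false
	fmt.Println(encode(emptyDirSource{}))                // nil pointer: field omitted
	fmt.Println(encode(emptyDirSource{StickyBit: &on}))  // explicit true is emitted
	fmt.Println(encode(emptyDirSource{StickyBit: &off})) // explicit false is emitted
}
```

With a plain `bool` and `omitempty`, an explicit `false` would be dropped from the output, so the unset and disabled cases could not be told apart.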
135+
136+
### Implementation
137+
138+
The implementation is in the emptyDir volume plugin in `pkg/volume/emptydir/empty_dir.go`:
139+
140+
1. Define constants for the sticky bit mode:
141+
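   ```go
   const (
   	// Go encodes the sticky bit as os.ModeSticky, not as octal 01000.
   	// os.Mkdir and os.Chmod only honor os.ModePerm plus the dedicated
   	// os.Mode* flags, so a literal 01000 would be silently dropped.
   	stickyBitMode os.FileMode = os.ModeSticky
   	defaultPerm   os.FileMode = 0777
   )
   ```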
147+
148+
2. When creating the emptyDir directory, check if the `StickyBit` field is set:
149+
   ```go
   perm := defaultPerm
   if ed.stickyBit != nil && *ed.stickyBit {
   	perm = defaultPerm | stickyBitMode
   }
   ```
155+
156+
3. Apply the appropriate permissions when creating the directory
157+
158+
### Test Plan
159+
160+
[x] I/we understand the owners of the involved components may require updates to
161+
existing tests to make this code solid enough prior to committing the changes necessary
162+
to implement this enhancement.
163+
164+
#### Prerequisite testing updates
165+
166+
No prerequisite testing updates are required. The emptyDir volume plugin already has good test coverage.
167+
168+
#### Unit tests
169+
170+
Unit tests have been added to verify:
171+
- Directory creation with sticky bit enabled results in mode 01777
172+
- Directory creation with sticky bit disabled or unset results in mode 0777
173+
174+
Coverage:
175+
- `pkg/volume/emptydir`: Unit tests cover the sticky bit implementation and default behavior
176+
177+
#### Integration tests
178+
179+
If needed, integration tests could additionally verify:
180+
- A pod with emptyDir volume and stickyBit enabled mounts correctly
181+
- Older kubelets ignore the field gracefully
182+
183+
#### e2e tests
184+
185+
TBD - e2e tests will be added as part of the implementation.
186+
187+
### Graduation Criteria
188+
189+
#### Alpha
190+
191+
- API field implemented and functional
192+
- Unit tests passing
193+
- Documentation available
194+
195+
#### Beta
196+
197+
- No major bugs reported during alpha
198+
- Gather feedback from users
199+
200+
#### GA
201+
202+
- Stable for at least two releases
203+
- No major issues reported
204+
205+
### Upgrade / Downgrade Strategy
206+
207+
No special upgrade/downgrade handling is needed. The `stickyBit` field is optional and ignored by older kubelets that don't recognize it.
208+
209+
### Version Skew Strategy
210+
211+
The feature is kubelet-only. Older kubelets will ignore the `stickyBit` field and create emptyDir volumes with the default mode 0777. This is safe as it matches the previous behavior.
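
The "ignored by older kubelets" behavior relies on unknown fields being dropped during decoding. The sketch below shows that mechanism with Go's standard `encoding/json`, which ignores unknown fields by default. It is only an analogy: `oldEmptyDirSource` and `decodeOld` are hypothetical names, and the decoding path actually used by Kubernetes components (and field handling in older API servers) is not modeled here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oldEmptyDirSource is a hypothetical stand-in for the type as a component
// built before this KEP would know it: it has no StickyBit field.
type oldEmptyDirSource struct {
	Medium string `json:"medium,omitempty"`
}

// decodeOld parses JSON into the older shape of the type.
func decodeOld(data string) (oldEmptyDirSource, error) {
	var src oldEmptyDirSource
	err := json.Unmarshal([]byte(data), &src)
	return src, err
}

func main() {
	// The unknown stickyBit key is skipped; decoding succeeds.
	src, err := decodeOld(`{"medium":"Memory","stickyBit":true}`)
	fmt.Println(src.Medium, err)
}
```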
212+
213+
## Production Readiness Review Questionnaire
214+
215+
### Feature Enablement and Rollback
216+
217+
###### How can this feature be enabled / disabled in a live cluster?
218+
219+
- [ ] Feature gate (also fill in values in `kep.yaml`)
220+
- Feature gate name:
221+
- Components depending on the feature gate:
222+
- [x] Other
223+
- Describe the mechanism: The feature is enabled per-volume by setting `stickyBit: true` on an emptyDir volume in the pod spec. No feature gate is required as this is a simple opt-in API field.
224+
- Will enabling / disabling the feature require downtime of the control plane? No
225+
- Will enabling / disabling the feature require downtime or reprovisioning of a node? No
226+
227+
###### Does enabling the feature change any default behavior?
228+
229+
No. The feature only takes effect when users explicitly set `stickyBit: true` on an emptyDir volume. Existing emptyDir volumes and new emptyDir volumes without the field continue to use mode 0777.
230+
231+
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
232+
233+
Yes. Since there is no feature gate, the feature is controlled per-pod by setting or omitting the `stickyBit` field. To "disable" the feature, simply remove `stickyBit: true` from pod specs.
234+
235+
If rolling back to an older kubelet version that doesn't support the field, the field will be ignored and emptyDir volumes will be created with mode 0777.
236+
237+
**Impact on existing workloads**: Pods that were running with sticky bit enabled will continue to run unchanged (the directory permissions don't change after creation). However, new pods or pods that are rescheduled will have emptyDir volumes created with mode 0777 instead of 01777, which could affect application behavior if the application relies on the sticky bit behavior.
238+
239+
###### What happens if we reenable the feature if it was previously rolled back?
240+
241+
The feature will work as expected for new pods. Existing pods that were created while the feature was disabled will continue to use mode 0777 until they are deleted and recreated.
242+
243+
###### Are there any tests for feature enablement/disablement?
244+
245+
Yes, unit tests verify that:
246+
- When `stickyBit: true`, the directory is created with mode 01777
247+
- When `stickyBit` is false or unset, the directory is created with mode 0777
248+
- The default behavior (mode 0777) is preserved when the field is not specified
249+
250+
### Rollout, Upgrade and Rollback Planning
251+
252+
###### How can a rollout or rollback fail? Can it impact already running workloads?
253+
254+
**Rollout failure scenarios**:
255+
- If the feature has bugs that cause emptyDir volume creation to fail, pods using `stickyBit: true` will fail to start
256+
- If the host OS or filesystem doesn't support sticky bit (unlikely on standard Linux), volume creation could fail
257+
258+
**Impact on running workloads**:
259+
- Already running workloads are not affected by enabling or disabling the feature
260+
- Only new pods or rescheduled pods are affected
261+
- The feature is opt-in, so workloads that don't use it are unaffected
262+
263+
**Rollback scenarios**:
264+
- Rolling back to an older kubelet is safe and will not affect running pods
265+
- On older kubelets, new pods with `stickyBit: true` will get mode 0777 instead of 01777 (the field is ignored), which is a functional change but not a failure
266+
267+
###### What specific metrics should inform a rollback?
268+
269+
Increased pod startup failures or volume mount errors correlated with pods using `stickyBit: true`.
270+
271+
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
272+
273+
Not yet. Will be tested manually before release.
274+
275+
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
276+
277+
No.
278+
279+
### Monitoring Requirements
280+
281+
###### How can an operator determine if the feature is in use by workloads?
282+
283+
Operators can:
284+
1. Query the API server for pods with emptyDir volumes that have `stickyBit: true`:
285+
   ```bash
   kubectl get pods -A -o json | jq '.items[] | select(.spec.volumes[]?.emptyDir?.stickyBit == true)'
   ```
288+
2. Check kubelet logs for messages related to sticky bit creation
289+
3. Inspect pod specifications directly
290+
291+
###### How can someone using this feature know that it is working for their instance?
292+
293+
- [x] Other (treat as last resort)
294+
- Details: Users can verify the feature is working by:
295+
1. Creating a pod with an emptyDir volume with `stickyBit: true`
296+
2. Exec into the pod and check the directory permissions: `ls -ld /path/to/emptydir`
297+
3. Verify the permissions show `drwxrwxrwt` (mode 01777, the 't' at the end indicates sticky bit)
298+
4. Test the behavior by creating a file as one user and attempting to delete it as another user
299+
300+
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
301+
302+
This feature should not affect existing SLOs. The performance impact should be negligible:
303+
304+
- emptyDir volume creation time should not be measurably affected
305+
- Pod startup time should not be measurably affected
306+
307+
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
308+
309+
- [ ] Metrics
310+
- Metric name: storage_operation_duration_seconds (existing metric)
311+
- Components exposing the metric: kubelet
312+
- This metric can be filtered by operation_name="setup" to track emptyDir volume creation time
313+
314+
Operators should monitor:
315+
- Pod startup failures
316+
- Volume mount failures
317+
- kubelet errors
318+
319+
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
320+
321+
No additional metrics are needed. The feature is a simple file permission change and can be observed using existing pod and volume metrics.
322+
323+
### Dependencies
324+
325+
###### Does this feature depend on any specific services running in the cluster?
326+
327+
No. The feature only depends on:
328+
- The host OS supporting the sticky bit permission (standard on all Linux systems)
329+
- The filesystem supporting sticky bit (standard on all major filesystems)
330+
331+
### Scalability
332+
333+
###### Will enabling / using this feature result in any new API calls?
334+
335+
No.
336+
337+
###### Will enabling / using this feature result in introducing new API types?
338+
339+
No. It adds a new field to an existing API type (EmptyDirVolumeSource).
340+
341+
###### Will enabling / using this feature result in any new calls to the cloud provider?
342+
343+
No.
344+
345+
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
346+
347+
- API type(s): Pod (EmptyDirVolumeSource)
348+
- Estimated increase in size: One additional boolean field per emptyDir volume that uses the feature, when set
349+
350+
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
351+
352+
No. The performance impact should be negligible.
353+
354+
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
355+
356+
No. The feature only changes one argument to a mkdir system call.
357+
358+
###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
359+
360+
No.
361+
362+
### Troubleshooting
363+
364+
###### How does this feature react if the API server and/or etcd is unavailable?
365+
366+
The feature is implemented in the kubelet and does not depend on the API server or etcd after the pod spec has been retrieved.
367+
###### What are other known failure modes?
368+
369+
None beyond the standard emptyDir failure modes.
370+
371+
###### What steps should be taken if SLOs are not being met to determine the problem?
372+
373+
This feature should not affect SLOs. If pod startup or volume mounting SLOs are not being met, check if the affected pods are using `stickyBit: true` and verify kubelet logs for errors.
374+
375+
## Implementation History
376+
377+
- 2025-02-19: Initial implementation started (kubernetes/kubernetes#130277)
- 2025-08-25: KEP issue created (kubernetes/enhancements#5502)
- 2026-01-30: KEP created for alpha in v1.36
380+
381+
## Drawbacks
382+
383+
- Adds a new API field, slightly increasing API surface
384+
- Users unfamiliar with Unix permissions may be confused by sticky bit behavior
385+
- Not supported on Windows (but emptyDir permissions work differently there anyway)
386+
387+
## Alternatives
388+
389+
### Alternative 1: Provide more flexible mount options on emptyDir
390+
391+
There appears to be interest in providing more configuration options for mounting, which could include setting permissions.
392+
393+
References: https://github.com/kubernetes/enhancements/pull/5856
394+
395+
## Infrastructure Needed (Optional)
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
# PRR approval file for alpha
2+
# To be filled by prod-readiness team
3+
approver: TBD
