[17.06] Graceful upgrade of containerd and runc state files upon live-restore#117
Conversation
| return fmt.Errorf("container %s is already active", containerID) | ||
| } | ||
|
|
||
| if errUpgrade := v17_06_1.Upgrade( |
There was a problem hiding this comment.
by this time, it's too late if I'm not mistaken.
containerd on restart tries to load all of those files and would have failed to unmarshal them
There was a problem hiding this comment.
thanks, this is not supposed to be here... only the other Upgrade() call. Will remove.
| ids[i] = fi.Name() | ||
| } | ||
| f.Close() | ||
| for _, id := range ids { |
There was a problem hiding this comment.
I don't think we should upgrade all the entries, we should only upgrade the one which ids are found in /run/docker/libcontainerd/containerd
Nothing prevent the user to having different version of containerd/runc on the machine and we don't want to break the containers started with those.
There was a problem hiding this comment.
Thanks for catching this. Will update.
|
In latest build result: |
9796c06 to
4f96326
Compare
|
Ping @mlaventure |
4f96326 to
d6f885e
Compare
| MountLabel string `json:"mount_label"` | ||
| Hostname string `json:"hostname"` | ||
| Namespaces configs.Namespaces `json:"namespaces"` | ||
| Capabilities linuxCapabilities `json:"capabilities"` |
There was a problem hiding this comment.
@mlaventure just realized that linuxCapabilities uses specs.LinuxCapabilities (https://github.com/docker/docker-ce/pull/117/files#diff-40363fc1cc8aab85c96e0f6d4e7d4315R97) while this is supposed to be configs.Capabilities, which thankfully happens to be the same.
Maybe it's fine for this version and we can think of something more accurate after?
| if err != nil { | ||
| return err | ||
| } | ||
| // TODO: copy caps or not copy caps? |
There was a problem hiding this comment.
@mlaventure I mean:
copy(boundingCaps, caps)
copy(effectiveCaps, caps)
...
Otherwise, l.V.Bounding[i] == l.V.Effective[i], and if someone writes code to change an element of Bounding it will change all the other ones too, unwanted side-effect. Although it's only in very special cases, where the same underlying array is used.
There was a problem hiding this comment.
It's the case currently in the daemon code too: https://github.com/moby/moby/blob/master/daemon/oci_linux.go#L258-L266
There was a problem hiding this comment.
@mlaventure okay thanks, so let's keep it orthogonal.
|
retriggered a new build https://jenkins.dockerproject.org/job/docker-ce-17.06-pr/188/console |
d6f885e to
7148d7b
Compare
|
@mlaventure I pushed a better version, that does all the decoding first, before overwriting the files. This is so that if one of the later files fails to be decoded, the previous ones keep their initial state and we don't end up in a mixed state. |
7148d7b to
ec6a007
Compare
|
And now it also encodes into a bytes.Buffer before writing to file, just in case the reencoding phase fails (not just the decoding). I hope the extra buffering doesn't become an issue when there's a lot of containers. |
|
Ping @mlaventure |
|
LGTM after moving to atomic file write. |
|
I agree with @stevvooe it's better to write to a tmp file, then rename it. |
…e-restore Vendors new dependency github.com/crosbymichael/upgrade Signed-off-by: Tibor Vass <tibor@docker.com>
ec6a007 to
358c36e
Compare
|
LGTM I've tested this PR by building a deb package and upgrading from a docker-ce 17.03 daemon that had a container started with live-restore. After the upgrade, container still seen as active from |
|
LGTM |
| defer f.w.Close() | ||
| } | ||
| var errs []string | ||
| for _, f := range files { |
There was a problem hiding this comment.
Ok, so this sucks everything up into memory, creates an atomic file for each, then writes the buffer.
|
LGTM |
|
@vieux Line for changelog: |
|
@tiborvass @vieux I've also verified this PR works on upgrade from docker-ce 17.05.0. |
|
Note to selves: after conversation with @tiborvass may need to look into having all 3 json files updated atomically. As this PR stands, looks good to have containers still running with live restore. |
[18.06] update package descriptions and info Upstream-commit: 2baa88e2bcf7c451ea8405fa54948c07f722cf9e Component: packaging
fixes #117 Print healthcheck information in pretty mode. Signed-off-by: Stephane Jeandeaux <stephane.jeandeaux@gmail.com> Upstream-commit: 05674a50961d99eaf5320f32beb91646fd1355b4 Component: cli
Signed-off-by: Stephane Jeandeaux <stephane.jeandeaux@gmail.com> Upstream-commit: d4ad7a94d217a8925923dd8e232fd949b3ec1bd4 Component: cli
[engine] Graceful upgrade of containerd and runc state files upon live-restore
To address issue: