2024/11/25 Updated by

Building your own image that accepts ssh connections

Ubuntu Docker Image on Windows (GPU)

Prerequisites

A Docker host is installed on a Windows 11 machine with an NVIDIA GPU, following the steps in "Docker on Windows (GPU)".
Ubuntu (WSL2) is installed on Windows, and the docker command can be run from it.
The nvidia-smi command can be run on Ubuntu (WSL2).

$ nvidia-smi
nitta@galleria:~$ nvidia-smi
Mon Nov 25 14:42:14 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.72                 Driver Version: 566.14         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 980M        On  |   00000000:01:00.0  On |                  N/A |
| N/A   57C    P0             26W / 1000W |     866MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A        26      G   /Xwayland                                   N/A      |
|    0   N/A  N/A        35      G   /Xwayland                                   N/A      |
|    0   N/A  N/A        41      G   /Xwayland                                   N/A      |
+-----------------------------------------------------------------------------------------+

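On Ubuntu (WSL2), trying to install nvcc in order to check the cuDNN version fails. Why? (Note that nvcc --version reports the CUDA toolkit version rather than the cuDNN version.)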

$ nvcc --version ← I want to run this, but the shell says the command is not found and suggests how to install it.
Command 'nvcc' not found, but can be installed with:
sudo apt install nvidia-cuda-toolkit
$ sudo apt install nvidia-cuda-toolkit ← Running the install command as suggested ends in an error and fails.
Error

Create a folder named work and move into it.

mkdir work
cd work

Create a Dockerfile.

Dockerfile

FROM nvcr.io/nvidia/tensorflow:23.09-tf2-py3

ARG USERNAME=guest
ARG PASSWORD=password
ARG UID=1000
ARG GID=1000

# Setup timezone.
ENV TZ=Asia/Tokyo
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
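# Refresh the package lists in the same layer that uses them, so a cached
# "apt-get update" layer cannot leave later installs with stale lists.
RUN apt-get update && apt-get upgrade -y

# Software.
RUN apt-get update && apt-get install -y wget git emacs tmux less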

## Install anaconda.
#RUN apt-get install -y libgl1-mesa-glx libegl1-mesa libxrandr2 libxrandr2 libxss1 \
#    libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6
#RUN wget -P /opt https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh && \
#    bash /opt/Anaconda3-2020.02-Linux-x86_64.sh -b -p /opt/anaconda3 && \
#    rm /opt/Anaconda3-2020.02-Linux-x86_64.sh

# SSH
RUN apt-get update && apt-get install openssh-server sudo -y
RUN groupadd -g ${GID} developer
RUN useradd -rm -d /home/${USERNAME} -s /bin/bash -g developer -u ${UID} ${USERNAME} 
RUN gpasswd -a ${USERNAME} sudo
RUN echo "${USERNAME}:${PASSWORD}" | chpasswd
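# The daemon started here does not survive the build. The step is kept because
# the init script creates /run/sshd, which sshd needs when the container starts.
RUN service ssh start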

EXPOSE 22 80 8888
CMD ["/usr/sbin/sshd","-D"]

To find your UID, use the id -u command; to find your GID, use the id -g command.

$ id -u
1000
$ id -g
1000
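
The Dockerfile above takes UID and GID as build arguments, so the values printed by id can be passed straight to docker build. The following is a minimal sketch of that idea, not part of the original procedure: the variable names uid, gid, and build_cmd are made up for illustration, and the command is only printed so that it can be reviewed before it is run.

```shell
# Look up the numeric IDs of the current user.
uid="$(id -u)"
gid="$(id -g)"

# Assemble the build command with the IDs filled in.
# PASSWORD is left out here; add --build-arg PASSWORD=... as needed.
build_cmd="docker build --build-arg UID=${uid} --build-arg GID=${gid} -t nitta/tensorflow:23.09-ssh ."

# Print the command instead of running it (dry run).
echo "$build_cmd"
```

If the printed command looks right, paste it into the shell in the work folder. When both IDs are 1000 the extra arguments are not needed, because 1000 is already the default in the Dockerfile.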

Build the Docker image.

$ docker build --build-arg PASSWORD=river<digits> -t nitta/tensorflow:23.09-ssh .

Listing the Docker images shows that the image has been created.

nitta@gtunes3:~/work$ docker image ls -a
REPOSITORY                       TAG             IMAGE ID       CREATED          SIZE
nitta/tensorflow                 23.09-ssh       e195c1da6512   11 minutes ago   14.8GB
nvcr.io/nvidia/tensorflow        23.08-tf2-py3   fcda27c68c2c   16 months ago    14.2GB
docker101tutorial                latest          3419775773fa   2 years ago      28.9MB
vermeer777/docker101tutorial     latest          3419775773fa   2 years ago      28.9MB
alpine/git                       latest          b80d2cac43e4   2 years ago      43.6MB
nvcr.io/nvidia/k8s/cuda-sample   nbody           06d607b1fa6f   2 years ago      321MB
hello-world                      latest          feb5d9fea6a5   3 years ago      13.3kB
nvcr.io/nvidia/tensorflow        21.08-tf2-py3   2abe022b55d1   3 years ago      11.5GB

Create a container.

docker run --shm-size=1g \
   --ulimit memlock=-1 \
   --ulimit stack=67108864 \
   --gpus all \
   -p 8888:8888 \
   -it nitta/tensorflow:23.09-ssh

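To reach sshd from the host, also publish container port 22 (here as host port 7077), give the container a name, and mount a host directory:

docker run --name 4semi7 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
     --gpus all -p 7077:22 -p 8087:8888 -v /home/docker/4semi7:/root/doc \
     -it nitta/tensorflow:23.09-ssh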
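
The name and host ports in the command above look like they follow a pattern (container 4semi7 uses 7077 and 8087, and the 4semi9 example later on this page uses 7079 and 8089). Assuming that pattern is intended, which the page does not state, a small helper can generate the command for any container number. The function name print_run_command and its variables are hypothetical, and the command is printed rather than executed.

```shell
# Print (not run) the docker run command for container number N.
# Assumed convention, inferred from the examples on this page:
#   host port 7070+N -> container port 22 (ssh)
#   host port 8080+N -> container port 8888
print_run_command() {
    n="$1"
    name="4semi${n}"
    ssh_port=$((7070 + n))
    web_port=$((8080 + n))
    echo "docker run --name ${name} --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864" \
         "--gpus all -p ${ssh_port}:22 -p ${web_port}:8888" \
         "-v /home/docker/${name}:/root/doc -it nitta/tensorflow:23.09-ssh"
}

print_run_command 7
```

Printing the command keeps the sketch safe to try on a machine without Docker; copy the output into the shell once the name, ports, and image tag look right.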

If the NVIDIA driver installed on the Windows Docker host is too old, the following error appears. In that case, update the NVIDIA driver.

This container was built for NVIDIA Driver Release 535.86 or later, but
version 512.89 was detected and compatibility mode is UNAVAILABLE.        ← error
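
The driver version can be checked before starting the container. The sketch below compares the version reported by nvidia-smi with the 535.86 minimum quoted in the error message. It assumes that nvidia-smi reports the same version the container inspects, which the page does not confirm; version_ge, required, and detected are names made up for this example.

```shell
# Return success when version $1 is greater than or equal to version $2.
# Versions are compared numerically, field by field (fields split on ".").
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            if ((x[i] + 0) > (y[i] + 0)) exit 0
            if ((x[i] + 0) < (y[i] + 0)) exit 1
        }
        exit 0
    }'
}

required="535.86"   # minimum taken from the error message above

if command -v nvidia-smi >/dev/null 2>&1; then
    # Ask nvidia-smi for the driver version only; keep the first GPU's line.
    detected="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$detected" "$required"; then
        echo "driver $detected is new enough (>= $required)"
    else
        echo "driver $detected is older than $required; update the driver on Windows"
    fi
else
    echo "nvidia-smi not found; cannot check the driver version"
fi
```

Run it on Ubuntu (WSL2), where nvidia-smi is available according to the prerequisites.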

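Note that when the container starts without errors, the prompt does not return, because the image's CMD runs sshd in the foreground (/usr/sbin/sshd -D).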

nitta@galleria:~/work$ docker run --name 4semi7 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 --gpus all -p 7077:22 -p 8087:8888 -v /home/docker/4semi7:/root/doc -it nitta/tensorflow:23.09-ssh

================
== TensorFlow ==
================

NVIDIA Release 23.09-tf2 (build 68583340)
TensorFlow Version 2.13.0

Container image Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Copyright 2017-2023 The TensorFlow Authors.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

Access the Docker guest from another terminal. You can ssh in right away.

$ ssh -p 7077 guest@localhost
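
To avoid typing the port and user name each time, the connection settings can be stored as an ssh host entry. This is a sketch only: the alias tf-guest is hypothetical, and the entry is written to a scratch file (used through ssh -F) so that an existing ~/.ssh/config is left untouched.

```shell
# Write a host entry to a scratch file instead of ~/.ssh/config.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
Host tf-guest
    HostName localhost
    Port 7077
    User guest
EOF

# Print the command that would use the entry; -F points ssh at this file.
echo "ssh -F $cfg tf-guest"
```

Once the entry works, the same four lines can be appended to ~/.ssh/config, after which ssh tf-guest is enough.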

To change the password, run the following commands in a shell inside the container. In other words, chpasswd must be run with root privileges.
```
$ sudo /bin/bash
# echo 'guest:<new password>' | chpasswd
# exit
$
```

About the image and container I created

On the Docker host at [email protected], I created my own image, nitta/tensorflow:20-08-ssh. Because I want to use tensorflow-2.2.0 (GPU), I based it on nvcr.io/nvidia/tensorflow:23.09-tf2-py3 and wrote the Dockerfile so that the user name is guest and the ssh server starts automatically.

docker build --build-arg PASSWORD=<that-password> -t nitta/tensorflow:20-08-ssh .

I created the Docker container with the following command.

docker run --name 4semi9 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 --gpus all -p 7079:22 -p 8089:8888 -v /home/docker/4semi9:/root/doc -it nitta/tensorflow:20-08-ssh

After starting the container, I logged in and changed the password to <that-password>.
