Frequently Asked Questions (FAQ)
File copying from/to Lab-IA#
If you can log in to Lab-IA via ssh, then you can use rsync over ssh or scp to transfer files.
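For example, assuming slurm.lab-ia.fr as the entry point (replace LOGIN with your account name; the paths are illustrative):

```shell
# Copy a local directory to your Lab-IA home with rsync over ssh
# (-a preserves attributes, -v is verbose, -z compresses in transit)
rsync -avz ./my_data/ LOGIN@slurm.lab-ia.fr:~/my_data/

# Copy a single file back from the cluster with scp
scp LOGIN@slurm.lab-ia.fr:~/results/output.csv ./
```

rsync has the advantage of resuming interrupted transfers and only sending changed files on repeated runs.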
Access to Internet from Lab-IA#
The cluster is in a protected network; accessing the Internet from Lab-IA hosts requires going through a specific proxy host, which supports http and https transfers.
The proxy host is webproxy.lab-ia.fr. You can define the following environment variables (for example in your ~/.bashrc file), in lowercase and in uppercase:
export http_proxy=http://webproxy.lab-ia.fr:8080
export https_proxy=http://webproxy.lab-ia.fr:8080
export HTTP_PROXY=http://webproxy.lab-ia.fr:8080
export HTTPS_PROXY=http://webproxy.lab-ia.fr:8080
Note: on the slurm.lab-ia.fr host these environment variables are automatically set up for bash.
For specific applications#
Some applications provide their own way to set proxy settings; see their respective documentation. Examples:
git:
git config --global http.proxy http://webproxy.lab-ia.fr:8080
curl:
curl --proxy http://webproxy.lab-ia.fr:8080 my.page.com
Other tools: use the http_proxy/https_proxy environment variables.
pip (package installer for Python):
Use the http_proxy/https_proxy environment variables or the command-line option:
python3 -m pip install --proxy=http://webproxy.lab-ia.fr:8080 [...]
conda (package manager for Anaconda and Miniconda):
Add the proxy servers to your ~/.condarc file:
proxy_servers:
  http: http://webproxy.lab-ia.fr:8080
  https: http://webproxy.lab-ia.fr:8080
We recommend using Miniconda. You will be able to install the libraries you need with its conda package manager, and to work in dedicated environments. Installing it in your Lab-IA home directory makes the installation available on all computing nodes.
This tool is not specific to Lab-IA; you can also install it on your own computers for development or tests.
Note: Python 2 is no longer developed or supported since 2020-01-01. You are strongly advised to move to Python 3.
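As a sketch, a typical Miniconda setup in your home directory could look like this (the installer URL is the official Miniconda download location at the time of writing; the environment name and Python version are just examples):

```shell
# Download the Miniconda installer and install it under your home directory
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p "$HOME/miniconda3"

# Make conda available in this shell, then create and use a Python 3 environment
source "$HOME/miniconda3/etc/profile.d/conda.sh"
conda create -y -n myenv python=3.10   # "myenv" is an example name
conda activate myenv
```

Since the home directory is shared with the computing nodes, an environment created this way is usable from any node.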
Need singularity / docker#
Some computing solutions are provided as prepared installations for Docker or Singularity.
These tools require root access on the hosts. They are not installed (and will not be installed) on Lab-IA. You must find an alternative installation method.
Slurm job script.sh failed to execute#
You may get an error like:
slurmstepd: error: execve(): script.sh: Permission denied
srun: error: n51: task 0: Exited with exit code 13
This is a known slurm issue. Rename your job script to a name that is not a unix command.
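For example (the file names are illustrative):

```shell
# "script.sh" can clash with an existing command name; give the job
# script a distinctive name and submit it again
mv script.sh my_job.sh
sbatch my_job.sh
```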
Email reception at the end of jobs#
In your job script you provide your email address and specify for which events you want an email:
#SBATCH --mail-user=firstname.lastname@example.org
#SBATCH --mail-type=ALL
Emails will come from sender:
Mail-type events can be NONE, BEGIN, END, FAIL, REQUEUE or ALL (= all of the previous). They can also be TIME_LIMIT, TIME_LIMIT_90, TIME_LIMIT_80 or TIME_LIMIT_50 (reached the given percentage of the time limit). Multiple events can be specified with a comma separator.
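Putting it together, a minimal job script with email notifications could look like this (the job name, resources and script name are placeholders):

```shell
#!/bin/bash
#SBATCH --job-name=my_job          # illustrative job name
#SBATCH --gres=gpu:1               # request one GPU
#SBATCH --time=02:00:00            # 2-hour time limit
#SBATCH --mail-user=firstname.lastname@example.org
#SBATCH --mail-type=END,FAIL       # only mail when the job ends or fails

srun python3 my_training.py        # my_training.py is a placeholder
```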
Quotas#
Quotas are enforced on Lab-IA so that every user has enough space to work. There are two kinds of quotas:
- User quotas,
- Project quotas.
Disk quotas are set on home and project directories. Each user has an initial 50GB quota on their home directory and a 500GB quota on each project.
Quotas can be adapted on justified request sent to the Lab-IA staff.
User vs Project quota#
Quotas depend on the group ID (gid) of the files.
On Lab-IA, files created in the user's home directory belong to the user. Files created in any project directory belong to the project.
If a user copies a file or a folder from their home directory to a project directory, the copy will belong to the project.
However, if a user moves a file from their home directory to a project directory, the moved file will still belong to the user and count against their quota. To transfer the quota accounting of the files to the project, the user must change their group ownership:
chgrp -R project_name project_files_or_directories
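As an illustration (the project name and paths are placeholders; use ls -l to see which group a file belongs to):

```shell
# Copy: the new file is created inside the project directory, takes the
# project group, and counts against the project quota
cp ~/data.csv /projects/my_project/          # hypothetical project path

# Move: the file keeps its original group (the user's), so it still
# counts against the user quota until the group is changed
mv ~/results.csv /projects/my_project/
chgrp my_project /projects/my_project/results.csv

# Check group ownership (third column is owner, fourth is group)
ls -l /projects/my_project/
```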
Cuda, libcudnn, libnccl#
On a node, you can list all available cuda versions with this command:
ls -ld /usr/local/cuda*
You will find:
- /usr/local/cuda → cuda-11.2
- /usr/local/cuda-11 → cuda-11.2
If you don't want to use the default version, set your environment variables accordingly.
Note: check the sub-directories for the libcudnn and libnccl libraries.
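For example, to select cuda-11.2 explicitly instead of the /usr/local/cuda default (a sketch; CUDA_HOME is a common convention, not a Lab-IA requirement, and the version should be adapted to your needs):

```shell
# Point build tools and the dynamic loader at a specific CUDA version
export CUDA_HOME=/usr/local/cuda-11.2
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"
```

If nvcc is installed on the node, `nvcc --version` should then report the selected toolkit.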
Jupyter notebook#
You must have Jupyter Notebook installed within your conda Python setup; in your account, do this once (in the example we create a notebook conda environment):
conda create -n notebook
conda activate notebook
conda install -c conda-forge notebook
Open an interactive session onto a node. You can use sgpu command to check available GPUs:
$ sgpu
LIST OF AVAILABLE GPU PER NODE
NAME | AVAIL. GPU | TOTAL GPU COUNT
(...)
n53  |     0      |       3
n54  |     0      |       3
n55  |     1      |       3
n101 |     0      |       4
n102 |     0      |       4
In this example, there is a free GPU on n55. To run a 10-hour interactive bash session on it (possibly from a tmux terminal):
srun --time=10:00:00 --gres=gpu:1 --nodelist=n55 --pty bash
Once slurm has scheduled your job and the bash session has started on the node, activate the conda environment, then start the jupyter notebook (select an unused port in the range 1024-65535):
conda activate notebook
jupyter notebook --no-browser --port=8889 --ip=0.0.0.0
When your notebook job is running, set up a tunnel between a port on your computer (here 8888) and the remote notebook process (here 8889 on n55):
ssh -N -L localhost:8888:n55:8889 LOGIN@slurm.lab-ia.fr
Note: to run this command, your ssh configuration file must be properly set up. Check Getting started - Lab-IA user documentation.
Finally, open a browser to the remote process via the tunnel: http://localhost:8888