Table of Contents
About
This tutorial shows you how to create a customized Jupyter environment on GCP. The process is automated with Terraform, so you don't have to do tedious manual setup in the cloud provider's console. Terraform is an IaC (Infrastructure as Code) tool; it reduces both the risk of human error and the time spent managing infrastructure.
You might wonder why you wouldn't just use Colab or a Kaggle kernel. They are certainly options, but even the paid tiers impose limits on session time and available resources. By setting up your own cloud environment, you can use whatever resources you want, for as long as you want. Just remember that you pay for what you use.
Environment
・Ubuntu 20.04.4 LTS (Focal Fossa)
Step1 Install terraform
Run this shell script if you haven't installed Terraform yet.
#!/bin/sh
sudo apt-get update && sudo apt-get install -y gnupg software-properties-common curl
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt-get update && sudo apt-get install -y terraform
Step2 Create VM instance by terraform
Create a service account in GCP for this work, download its JSON credential, and create a Terraform file.
Here is an example Terraform file.
By setting preemptible to true, you can use a preemptible VM at up to a 91% discount.
You can also set a startup script; in this example it is init.sh.
If you don't need one, you can delete that line.
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "3.5.0"
    }
  }
}

provider "google" {
  credentials = file("../credential/your_credential.json")
  project     = "your_project_id"
  region      = "asia-northeast1"
  zone        = "asia-northeast1-a"
}

resource "google_compute_instance" "default" {
  name         = "test"
  machine_type = "n1-standard-1"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-10"
      size  = 20
    }
  }

  scheduling {
    preemptible       = true
    automatic_restart = false
  }

  metadata_startup_script = file("init.sh")

  metadata = {
    enable-oslogin = "TRUE"
  }

  network_interface {
    network = "default"
    access_config {}
  }
}
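The startup script referenced by metadata_startup_script runs as root on the instance's first boot. A minimal sketch of what init.sh could contain (the marker file here is just an assumption so you can confirm the script ran; put whatever setup your workflow actually needs):

```shell
#!/bin/sh
# Hypothetical init.sh: executed as root when the instance boots.
# Leave a marker so you can later confirm the script actually ran
# (check with: cat /tmp/startup_done).
echo "startup script ran at $(date)" > /tmp/startup_done
echo "startup script finished"
```

On the instance you can verify it ran with `cat /tmp/startup_done`; the script's output also appears in the instance's serial console log.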
Then run these commands to create the Google Compute Engine instance.
You will find the created VM instance in your GCP console.
$ cd path_to_terraform_file
$ terraform init
$ terraform fmt
$ terraform validate
$ terraform apply
Step3 Set up your instance
Access the VM instance via SSH, then run this shell script to set up the Docker container.
#!/bin/sh
curl https://get.docker.com | sh
sudo usermod -aG docker $USER
sudo systemctl start docker
sudo systemctl enable docker
sudo curl -L "https://github.com/docker/compose/releases/download/1.16.1/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
sudo service docker start
# Start jupyter lab
sudo docker-compose up
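Note that `usermod -aG docker` only affects new login sessions, which is why the script above still uses sudo for docker-compose. After reconnecting over SSH you can confirm the group change took effect:

```shell
# List the groups of the current session; "docker" should appear
# once you have logged out and back in (or run: newgrp docker).
id -nG
```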
Here is the Dockerfile.
FROM python:3.8.12-buster
USER root
RUN pip install --no-cache-dir \
    numpy==1.22.3 \
    scipy==1.8.0 \
    jupyterlab==3.3.2
Here is the YAML file for Docker Compose.
version: "3"
services:
  notebook:
    build: ./docker_images/jupyter
    volumes:
      - .:/home/work
      - ./.jupyter:/root/.jupyter
    ports:
      - "7777:7777"
    tty: true
    environment:
      - JUPYTER_ENABLE_LAB=yes
    command: jupyter lab --ip=0.0.0.0 --port=7777 --allow-root --no-browser
In this example, the Dockerfile is placed in the docker_images/jupyter directory, matching the build path in the compose file.
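Putting the files together, the compose file assumes a project layout like the following. The project name myproject is arbitrary; only docker_images/jupyter comes from the build line, and docker-compose.yml is the conventional compose file name:

```shell
# Sketch of the assumed project layout, created from scratch:
mkdir -p myproject/docker_images/jupyter myproject/.jupyter
touch myproject/docker-compose.yml \
      myproject/init.sh \
      myproject/docker_images/jupyter/Dockerfile
# docker-compose.yml               <- compose file shown above
# init.sh                          <- VM startup script
# docker_images/jupyter/Dockerfile <- Jupyter image definition
ls -R myproject
```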
Step4 Access JupyterLab hosted on the VM instance
You can use SSH port forwarding to reach your VM instance like this.
$ gcloud compute ssh --project=[project id] --zone=[zone] [instance name] -- -L [local port#]:[instance IP address]:7777
Then you can use JupyterLab by opening http://localhost:7777/ in your browser (assuming you chose 7777 as the local port).
Step5 Clean up
To delete the GCP resources, run this command.
$ terraform destroy