Supercharge your AI Research with Google TPUs in VS Code



Prologue: Kaggle Sucks! (Kinda) 🤔
Have you ever thought of doing AI research or training your own LLM, but you don't have the beefy machine required? You've probably heard of (or used) Google Colab or Kaggle, leading platforms that provide free, powerful compute to train on their machines. But that isn't enough. In this technical blog, you'll be guided step-by-step through accessing your own TPU in VS Code.
If you'd like the commands only, head over to:

If you're a developer like me, we're probably alike in that we code in our favorite IDEs/text editors (JetBrains, Visual Studio, VS Code) and can run all sorts of programming languages and frameworks (C/C++, Java, JavaScript, React, NextJS, etc.).
But Colab and Kaggle aren't like these editors: they provide a notebook-only UI and only support Python and R. We also lose the custom macros, memorized key bindings, and installed extensions we've accumulated over the years, leading to slower development.

Moreover, Kaggle can only run two notebooks per session, so we can't run much in parallel. Nonetheless, one of the merits of these platforms is their free and powerful accelerators:
- Kaggle: GPU T4 x2, GPU P100, & TPU VM v3-8
- Google Colab: TPU v2-8 & TPU v4
"Great, that will greatly speed up my AI training!" you might think. Hold on; they're limited:
- Kaggle GPUs: 30 hours/week
- Kaggle TPUs: 20 hours/week
- Google Colab: not published
Additionally, you can only run notebook sessions for at most half a day:
- Kaggle: 9 hours of max runtime
- Google Colab: 12 hours of max runtime
They're also volatile: if your internet gets cut off, there goes your progress if you haven't saved (which can also happen on your own machine). You might think, "Why don't I just SSH into the machines from my editor? That way, I also get full access to my development environment." That doesn't work. Not only do Kaggle and Colab not support SSH, but forcing it can get your account banned (trust me, I tried).
Tensor Processing Units (TPUs)
You might notice they both have TPUs. TPUs are custom-developed application-specific integrated circuits (ASICs) designed to speed up machine learning workloads. It makes sense to have them in Colab, but why does Kaggle have them? Google acquired Kaggle! Of course, you can also create and access TPUs on your own Google Cloud Platform account. However...

...they are extremely expensive! Above is the price of the minimum TPU v4-8 in us-central2-b, the lowest spec available for v4. Being machines of high compute and demand, it makes sense for them to be pricey, but I didn't expect this much. Nevertheless, our AI training journey will not end here.
TPU Research Cloud
Enter TRC, a program providing researchers access to a pool of thousands of Cloud TPUs at no charge, should you be accepted. You are, however, required to share your TRC-supported research with the world through publications, open-source code, blog posts, or other means. Should you apply even though you're not a researcher? I don't see why not! But do keep track of your contributions and findings, since you'll be sharing them. After this tutorial, you'll be accessing beasts with the following specifications.

Act 1: Getting Access to TRC 🚀
Before we proceed, do note that you need a Google Cloud Platform account with billing activated. I'm also doing this on Linux, but open PowerShell if you're on Windows.
- Apply to Google's TPU Research Cloud to avail of 30 days of free select TPUs, accessible via SSH.
- You will instantly receive an email with a form, with the subject [TRC] Welcome to the TPU Research Cloud!
- Follow the steps in the received email, create your project, and submit the form.
- Within 3-4 days, you will then receive an email with the subject [TRC] You have access to free Cloud TPUs.

Act 2: Creating TPUs on GCP
EXTREMELY IMPORTANT:
- Keep in mind which regions your quota from the email covers and adjust accordingly; you might get charged otherwise.
- Preemptible TPUs will get deleted if left unused for a few hours or so. Prefer on-demand TPUs with the latest TPU version (an on-demand TPU v4-8 in us-central2-b, in my case).

That out of the way, here are the steps:
- Sign in and go to the same project you sent in the Google forms in Google Cloud Platform
- Click `activate cloud shell` on the top right on the desktop view or type G then S for shortcut
- On the shell, type the following commands to switch to your project:

gcloud projects list
gcloud config set project <GCP_PROJECT_ID>

- Create a network:

gcloud compute networks create my-network --subnet-mode=auto

- Create a subnet:

gcloud compute networks subnets create my-subnet \
  --network=my-network \
  --region=us-central2 \
  --range=172.16.1.0/24

- Enable SSH access:

gcloud compute firewall-rules create allow-ssh \
  --network=my-network \
  --allow=tcp:22

- Either create your TPU using the GUI or with the following command. Replace <CLOUD_TPU_PROJECT_NAME> with something to call your instance (it can be anything):

gcloud compute tpus tpu-vm create <CLOUD_TPU_PROJECT_NAME> \
  --zone=us-central2-b \
  --accelerator-type=v4-8 \
  --version=tpu-ubuntu2204-base \
  --network=my-network \
  --subnetwork=my-subnet

- From here on, use your local machine. Install the Google Cloud CLI here or execute:

mkdir google_cloud
cd google_cloud
curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-linux-x86_64.tar.gz
tar -xf google-cloud-cli-linux-x86_64.tar.gz
./google-cloud-sdk/install.sh
./google-cloud-sdk/bin/gcloud init

- Run ./google-cloud-sdk/bin/gcloud auth login. This will open a browser; choose the Google account that owns the project.
- SSH into your TPU instance. Replace <CLOUD_TPU_PROJECT_NAME> with what you called it when creating the TPU:

./google-cloud-sdk/bin/gcloud compute tpus tpu-vm ssh \
  --zone="us-central2-b" \
  "<CLOUD_TPU_PROJECT_NAME>" \
  --project="<GCP_PROJECT_ID>"

- If this is your first time, the terminal will prompt you to generate an SSH key. Press Enter through the prompts, but take note of the key path and username.
GCP Terminal Command Visualization





Local Machine Command Visualization




Act 3: Connecting using VS Code 🔧
Awesome. You now have access to a monstrous computer with TPUs for 30 days. However, here's why I believe it's not enough:
- I cannot code a whole deep learning project using nano or vim.
- I need a development environment with all the extensions and macros I'm used to for faster development.
- I cannot open VS Code on the TPU VM even if I install it, because it's just a terminal instance without a GUI.
You can also connect with other editors, but I'll use VS Code since that's what I use.
- Go to your TPU project in Google Cloud Platform
- Copy the external IP of your TPU VM
- On your computer, edit ~/.ssh/config on Mac/Linux or /c/Users/PC_USER_NAME/.ssh/config on Windows, then add the following with your details. The identity file is ~/.ssh/google_compute_engine on Linux and /c/Users/PC_USER_NAME/.ssh/google_compute_engine on Windows.
Host <EXTERNAL IP>
    User <SSH username from the key-generation step>
    IdentityFile <path to ssh file>
    Port 22

- Open VS Code and install the Remote - SSH extension
- In VS Code, press ctrl + shift + p, type Remote-SSH: Connect to Host, and select the host you just added
- Now, inside the SSH session in VS Code, create a directory and open it using `code <your created directory> -r` in the terminal


Act 4: Profit (and must-knows) 💡
Congratulations! You've now SSH'd into your TPU using VS Code. You can now perform the most intensive ML/deep learning tasks to your heart's content. Here are the key things to keep in mind:
- Always remember that, depending on your framework, you have to opt in to using the TPU in your code, even inside the TPU VM in VS Code or anywhere else. If you don't, you'll be using CPUs only.
- Setting up the network in Act 2 will incur charges, but they'll be offset by the free $300 credits if you still have them. Make sure to delete the network afterward.
- You must follow and abide by the TPU Program requirements
- You only have up to the quota listed in the activation email.
- Google has safeguards around TRC program GCP projects: you will not be allowed to create TPU sizes/types for which you do not have quota.
- Do note that access will initially last 30 days from Act 1, step 4.
- But you can request TPU access extensions! (Another 30 days)
- Contact trc-support@google.com if you're having trouble
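To make the first point above concrete, here's a quick sanity check you can run on the TPU VM before kicking off training. This is a minimal sketch assuming JAX; the fallback string is my own placeholder, and PyTorch/XLA and TensorFlow have analogous device checks.

```python
# Sanity check: confirm your framework actually sees the TPU before training.
# Assumes JAX is installed on the TPU VM; on a machine without JAX,
# this falls back gracefully instead of crashing.
try:
    import jax
    backend = jax.default_backend()  # "tpu" on the TPU VM, "cpu" otherwise
    n_devices = jax.device_count()   # number of visible accelerator chips
except ImportError:
    backend, n_devices = "jax-not-installed", 0

print(f"backend={backend}, devices={n_devices}")
```

If this reports a CPU backend inside the TPU VM, your framework isn't wired to the TPU, and training will silently run on CPUs only.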

Acknowledgements 🤝
TRC played a big role in our undergraduate multimodal deep learning thesis. Without access to their powerful machines and TPUs, our hardware simply wouldn’t have handled the workload. By working with TRC, we introduced TPU-driven methods to our university, opening doors for future researchers. We hope this post inspires reproducibility and new ideas.
Hope you enjoyed the technical content. Follow me on LinkedIn if you're interested in AI, Deep Learning Research, and Software Development.