This guide walks you through running llama_mmlu_eval.py on Google Colab to evaluate Llama 3.2-1B on the MMLU benchmark using free GPU resources.
Google Account - For accessing Google Colab
Hugging Face Account - For model access
Hugging Face Token - With Llama license accepted
Sign in with your Google account
Open Google Drive
Open or create a folder named Colab_Projects.
Inside that folder, create a folder named RunningLMM (or use your own project name).
Open the folder and click New (+ sign) -> More -> Google Colaboratory
This creates a Python notebook. Name it "llama_mmlu_eval.ipynb".
Open the notebook - this will return you to Colab.
Copy the following code into the first cell and run it. Put this code in the first cell of every notebook, replacing "RunningLMM" with the name of your project if it is different. It makes sure that all files your program creates are saved on Google Drive instead of disappearing when your session ends.
    # 1. Mount Google Drive
    from google.colab import drive
    drive.mount('/content/drive')

    # 2. Create and move into a specific project folder
    import os
    project_folder = '/content/drive/MyDrive/Colab_Projects/RunningLMM'
    if not os.path.exists(project_folder):
        os.makedirs(project_folder)
        print(f"Created folder: {project_folder}")

    # 3. Change the working directory to this folder
    os.chdir(project_folder)
    print(f"Current Directory: {os.getcwd()}")
IMPORTANT: You must enable GPU for faster execution!
Click Runtime in the menu bar
Select Change runtime type
In the dialog:
Hardware accelerator: Select GPU (T4 is free tier)
GPU type: T4, V100, or A100 (if available)
Click Save
Verify GPU is enabled:
    # Run this in a cell
    !nvidia-smi

You should see GPU information (Tesla T4, V100, etc.)
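The same check can be done from Python, which is handy if you want a cell that prints a clear message when no GPU is attached. This is a minimal sketch: the helper name `gpu_summary` is hypothetical, and it assumes `nvidia-smi` is on the PATH only when a GPU runtime is active.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the GPU name(s) reported by nvidia-smi, or None if unavailable."""
    # Assumption: nvidia-smi is only present (and only succeeds) on a GPU runtime.
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


summary = gpu_summary()
if summary is None:
    print("No GPU detected: check Runtime > Change runtime type.")
else:
    print(f"GPU detected: {summary}")
```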
Click on Settings (gear icon)
Click on AI Assistance and check the box "Consented to use generative AI features"
Installing Claude Code is optional: you could instead run Claude in a separate window and copy and paste back and forth, or just use the Gemini assistant built into Colab. To install Claude Code, run this in a Colab code cell:
    !curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash - && \
    sudo apt-get install -y nodejs && \
    sudo npm install -g @anthropic-ai/claude-code && \
    export PATH=/usr/bin:$PATH

Then you can run Claude Code by running this in a cell:
    !claude

Copy this into a cell of the notebook and run it:
    # Install required packages
    !pip install -q transformers torch datasets accelerate tqdm huggingface_hub bitsandbytes

The -q flag makes output quieter. Remove it if you want to see installation progress. Modify this to add whatever other libraries you may need.
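Since -q hides most of the installer output, a quick check that the packages can be found may be useful before moving on. The sketch below uses only the standard library; the helper name `missing_packages` is hypothetical, and the list assumes each package's import name matches its pip name, which holds for the packages listed above.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the packages installed by the pip command above
required = [
    "transformers", "torch", "datasets", "accelerate",
    "tqdm", "huggingface_hub", "bitsandbytes",
]
missing = missing_packages(required)
print("Missing packages:", missing if missing else "none")
```

This only checks that each package can be located, not that it imports without errors.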
Option A: Interactive Login
    # Login to Hugging Face
    !hf auth login

When prompted:
Paste your Hugging Face token
Press Enter
Type y when asked to save as git credential
Option B: Set Token Directly (Faster for repeated runs)
Copy this into a cell of your notebook and run it:
    import os
    os.environ['HF_TOKEN'] = 'hf_YourTokenHere'  # Replace with your token

⚠️ Security Warning: If using Option B, do NOT share your notebook publicly with the token visible!
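One way to keep the convenience of Option B without typing the token into the notebook is to prompt for it with a hidden input. This is a sketch, not part of the original script: the helper name `set_hf_token` is hypothetical, and it relies on the same HF_TOKEN environment variable used above.

```python
import os
from getpass import getpass


def set_hf_token(prompt=getpass, env=os.environ):
    """Store a token in HF_TOKEN without writing it into the notebook.

    Returns True if a new token was stored, False if one was already set.
    """
    # Reuse a token that is already set in this session.
    if env.get("HF_TOKEN"):
        return False
    token = prompt("Hugging Face token: ").strip()
    if not token:
        raise ValueError("No token entered")
    env["HF_TOKEN"] = token
    return True
```

Calling `set_hf_token()` in a cell asks for the token once per session. The value lives only in the running session, so it is not saved with the notebook.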
Option A: Copy-Paste Code Directly into notebook (recommended)
Create a new cell and paste in your code. This is recommended because it will make it easy for Gemini to debug and modify your code.
Option B: Upload from Your Computer
Click the 📁 Files icon in the left sidebar
Click the 📤 Upload button
Select llama_mmlu_eval.py
The file will appear in /content/
If you pasted the code: Just run the cell with the code.
If you uploaded the file:
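Run this in a cell of the notebook. The full path is needed because the first cell changed the working directory to your Drive project folder, while the uploaded file is in /content/:

    !python /content/llama_mmlu_eval.py

You'll see: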
Model loading progress
Subject-by-subject evaluation with progress bars
Real-time accuracy per subject
Final summary with top/bottom subjects
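Because a full run takes a while, it can help to keep a copy of this output in a file. The sketch below is one way to do that from Python when running the uploaded script; `run_and_log` and the log file name are hypothetical and not part of llama_mmlu_eval.py.

```python
import subprocess
import sys


def run_and_log(cmd, log_path):
    """Run cmd, echoing its output live and saving a copy to log_path."""
    with open(log_path, "w", encoding="utf-8") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge error output into the same stream
            text=True,
        )
        for line in proc.stdout:
            sys.stdout.write(line)
            log.write(line)
        return proc.wait()  # exit code of the command
```

For example, `run_and_log([sys.executable, "-u", "llama_mmlu_eval.py"], "run_log.txt")` would run the script with unbuffered output and write the log to the current working directory. Adjust the script path if the file is somewhere else. Progress bars will probably show up in the log as many separate lines.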
Expected runtime:
With T4 GPU: ~30-60 minutes for all 57 subjects
With V100/A100: ~15-30 minutes
CPU: Several hours (not recommended)
Your results files should appear in the Google Drive project folder you created.
You can also view and download your files directly in Colab: click Files in the left sidebar, mount Google Drive by clicking the Drive icon (if it is not already mounted), and then browse to your project folder.
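To confirm from a notebook cell that the run wrote its files, you can list the contents of the working directory, newest first. This is a small sketch with a hypothetical helper name. It assumes the working directory is still the project folder set in the first cell, and it makes no assumption about what the results files are called.

```python
from pathlib import Path


def list_files_newest_first(folder="."):
    """Return (name, size_in_bytes) for each file in folder, newest first."""
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [(p.name, p.stat().st_size) for p in files]


for name, size in list_files_newest_first():
    print(f"{size:>12,d} bytes  {name}")
```

In Colab, a listed file can then be downloaded with `files.download(name)` after `from google.colab import files`.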