Download and test a Hugging Face model

Goal

By the end of this tutorial you will have:

  • Set up a Python environment on an ECI VM
  • Understood the Hugging Face Hub download and cache layout
  • Run text-classification inference with the Transformers library

A CPU VM is fine for this exercise

You don't need a GPU; a CPU instance works too. Large models will simply take longer.


Step 1: Create the VM

Field          Value
Instance type  GPU or CPU instance
Image          Ubuntu 22.04
Public IP      Create new (needed for internet access to download models)

Step 2: Install the environment

pip install transformers torch accelerate
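
Before downloading anything, it can help to confirm that the three packages actually landed in the active environment. The check below is a minimal sketch that uses only the standard library; the helper name installed_versions is hypothetical and not part of any of these packages.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["transformers", "torch", "accelerate"])
for name, version in report.items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

If any line prints NOT INSTALLED, rerun the pip command in the same environment you will use to run the script.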

Step 3: Download a model and run inference

# test_model.py
from transformers import pipeline

# The model is downloaded automatically into ~/.cache/huggingface/hub/ on first use
classifier = pipeline(
    "text-classification",
    model="snunlp/KR-FinBert-SC",
)

# Inference
texts = [
    "The KOSPI rose 2% today.",
    "Corporate earnings fell well short of expectations.",
    "A new product launch date was announced.",
]

for text in texts:
    # The pipeline returns a list of dicts; [0] is the top prediction
    result = classifier(text)[0]
    print(f"{text[:30]}... → {result['label']} ({result['score']:.2f})")

Run the script:

python3 test_model.py
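
The pipeline also accepts a list of strings and returns one result dict per input, so the per-text loop can be replaced by a single call. The helper below is a sketch under that assumption; classify_batch is a hypothetical name, and it takes the classifier as an argument so it works with any callable that returns dicts with label and score keys.

```python
def classify_batch(classifier, texts):
    """Classify all texts in one call and pair each text with its top label."""
    # One {'label': ..., 'score': ...} dict is expected per input text.
    results = classifier(texts)
    return [
        (text, result["label"], round(result["score"], 2))
        for text, result in zip(texts, results)
    ]
```

With the classifier and texts from test_model.py, classify_batch(classifier, texts) yields (text, label, score) tuples in input order.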

Step 4: Cache management

Models are cached under ~/.cache/huggingface/hub/. Running the same model again loads it from the cache instead of downloading it a second time.

# Cached models
ls ~/.cache/huggingface/hub/

# Cache size
du -sh ~/.cache/huggingface/
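
The same information can be read from Python. Each cached model lives in a folder named models--<org>--<name>, with the downloaded files stored under blobs/ and symlinked from snapshots/. The sketch below walks that layout with the standard library only. The function names are hypothetical, and it counts only blobs/ so that symlinked snapshot files are not counted twice.

```python
from pathlib import Path


def folder_to_repo_id(folder_name):
    """Turn 'models--snunlp--KR-FinBert-SC' into 'snunlp/KR-FinBert-SC'."""
    kind, _, rest = folder_name.partition("--")
    # Only model folders are handled; other folders return None.
    return rest.replace("--", "/") if kind == "models" and rest else None


def cached_models(hub_dir):
    """Return {repo_id: size_in_bytes} for every model folder under hub_dir."""
    sizes = {}
    hub = Path(hub_dir).expanduser()
    if not hub.is_dir():
        return sizes
    for folder in sorted(hub.iterdir()):
        repo_id = folder_to_repo_id(folder.name)
        if repo_id is None or not folder.is_dir():
            continue
        blobs = folder / "blobs"
        if blobs.is_dir():
            sizes[repo_id] = sum(f.stat().st_size for f in blobs.iterdir() if f.is_file())
        else:
            sizes[repo_id] = 0
    return sizes


for repo_id, size in cached_models("~/.cache/huggingface/hub").items():
    print(f"{repo_id}: {size / 1e6:.1f} MB")
```

The folder-name conversion is a simplification: a repository whose own name contains "--" would be converted incorrectly.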

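If your block storage is filling up, point the cache at a larger volume by setting HF_HOME before you start Python. New downloads then go to $HF_HOME/hub; models already cached in the old location are not moved automatically.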

export HF_HOME=/data/huggingface_cache
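
To check which directory will be used after changing the environment, the lookup order can be sketched as follows. This assumes the precedence described in the huggingface_hub documentation (HF_HUB_CACHE first, then $HF_HOME/hub, then the default under ~/.cache, which honors XDG_CACHE_HOME). resolve_hub_cache is a hypothetical helper, not a library function.

```python
import os
from pathlib import Path


def resolve_hub_cache(env=None):
    """Best-effort guess at the Hub cache directory for a given environment."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        # An explicit hub cache path wins over everything else.
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        # HF_HOME is the parent directory; models live in its hub/ subfolder.
        return Path(env["HF_HOME"]).expanduser() / "hub"
    base = env.get("XDG_CACHE_HOME") or "~/.cache"
    return Path(base).expanduser() / "huggingface" / "hub"


print(resolve_hub_cache())
```

With the export above in effect, this prints /data/huggingface_cache/hub. The variable is read when the libraries are imported, so set it in the shell (or at the very top of the script) rather than after importing transformers.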

Step 5: GPU acceleration

On a GPU instance, set device=0 or device="cuda":

classifier = pipeline(
    "text-classification",
    model="snunlp/KR-FinBert-SC",
    device=0,  # use the first GPU
)
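
If the same script should run on both CPU and GPU instances, the device can be chosen at runtime instead of hard-coded. The sketch below is one way to do that; pick_device is a hypothetical helper, and it relies on the pipeline convention that device=-1 means CPU.

```python
def pick_device():
    """Return 0 (first CUDA GPU) when PyTorch can see one, else -1 (CPU)."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU backend to query; fall back to CPU.
        return -1
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print("Using", "GPU 0" if device == 0 else "CPU")
```

Pass the result to the pipeline as device=pick_device() in place of device=0.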

Next steps