Troubleshooting Guide

This guide covers common issues encountered when using OmniGenBench, organized by category. Each problem lists its symptoms, followed by step-by-step solutions with code examples.

Quick Navigation:

Installation Issues - ViennaRNA, PyTorch, dependency conflicts
Command-Line Interface (CLI) Issues - Command not found, invalid arguments, benchmark and trainer names
Runtime Errors - Out of memory, sequence length, ViennaRNA fold errors
Model Loading Issues - Tokenizer, architecture mismatch, trust_remote_code
Training & Evaluation Issues - NaN loss, poor performance, evaluation metrics
Platform-Specific Issues - Windows encoding, path issues, long path support

Command-Line Interface (CLI) Issues

Problem: Command Not Found - ‘ogb’ or ‘autoinfer’

Symptoms:

bash: ogb: command not found
bash: autoinfer: command not found

Solutions:

Verify Installation:

pip show omnigenbench
# Check that 'Location' points to active Python environment

Check PATH Configuration:

# Verify pip scripts directory is in PATH
which ogb  # Linux/Mac
where ogb  # Windows

# If not found, add to PATH
export PATH="$HOME/.local/bin:$PATH"  # Linux
export PATH="$(python -m site --user-base)/bin:$PATH"  # macOS

Reinstall in Correct Environment:

conda activate omnigen_env
pip install --force-reinstall omnigenbench

Use Python Module Execution (Fallback):

python -m omnigenbench.cli.ogb_cli autobench --help
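When the shell cannot find `ogb`, it can help to ask the active interpreter where it installs console scripts and whether the command resolves at all. The following is a minimal standard-library sketch; the helper name `locate_cli` is illustrative and not part of OmniGenBench.

```python
import shutil
import sysconfig


def locate_cli(command="ogb"):
    """Report where `command` resolves on PATH and where this interpreter installs scripts."""
    return {
        "command": command,
        # Full path of the executable, or None if it is not on PATH
        "resolved_path": shutil.which(command),
        # Directory where pip places console scripts for this interpreter
        "scripts_dir": sysconfig.get_path("scripts"),
    }


info = locate_cli("ogb")
print(info)
if info["resolved_path"] is None:
    print(f"'ogb' is not on PATH; consider adding {info['scripts_dir']} to PATH")
```

If the package was installed with `pip install --user`, the scripts directory may differ from the one reported here, which is why the `python -m site --user-base` check above is still worth running.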

Problem: Invalid Argument Errors

Symptoms:

error: unrecognized arguments: --seeds 0 1 2
error: argument --model: expected one argument

Solutions:

Check Argument Syntax (Note: --seeds takes multiple values):

# CORRECT: Multiple seeds without commas
ogb autobench --model model --benchmark RGB --seeds 0 1 2

# WRONG: Comma-separated or quoted
# ogb autobench --model model --benchmark RGB --seeds "0,1,2"
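To see why space-separated seeds are accepted while a comma-separated string is rejected, the sketch below builds a small `argparse` parser with a multi-value integer option. It is an illustration only: the parser here is hypothetical and is assumed, not confirmed, to resemble how `ogb` defines `--seeds`.

```python
import argparse

# Illustrative parser only; OmniGenBench's real parser may be defined differently.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("--model", required=True)
parser.add_argument("--seeds", type=int, nargs="+", default=[0])


def try_parse(argv):
    """Return the parsed seeds, or None if argparse rejects the input."""
    try:
        return parser.parse_args(argv).seeds
    except SystemExit:  # argparse exits after printing its error message
        return None


# Three separate tokens are each converted to int and collected into a list
print(try_parse(["--model", "model", "--seeds", "0", "1", "2"]))

# "0,1,2" arrives as a single token that cannot be converted to int
print(try_parse(["--model", "model", "--seeds", "0,1,2"]))
```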

Verify Required Arguments:

# View all required and optional arguments
ogb autobench --help
ogb autotrain --help
ogb autoinfer --help
ogb rna_design --help

Common Argument Patterns:

# AutoBench: model and benchmark are required
ogb autobench --model yangheng/OmniGenome-186M --benchmark RGB

# AutoTrain: dataset and model are required
ogb autotrain --dataset ./data --model yangheng/OmniGenome-186M

# AutoInfer: model and (sequence OR input-file) required
ogb autoinfer --model yangheng/ogb_tfb_finetuned --sequence "ATCG"
ogb autoinfer --model yangheng/ogb_tfb_finetuned --input-file data.json

# RNA Design: structure is required
ogb rna_design --structure "(((...)))"

Problem: Benchmark Name Not Recognized

Symptoms:

ValueError: Benchmark 'rgb' not found. Available: RGB, BEACON, PGB, GUE, GB

Solutions:

Use Correct Case-Sensitive Names:

# CORRECT: Uppercase benchmark names
ogb autobench --model model --benchmark RGB
ogb autobench --model model --benchmark BEACON
ogb autobench --model model --benchmark PGB
ogb autobench --model model --benchmark GUE
ogb autobench --model model --benchmark GB

# WRONG: Lowercase will fail
# ogb autobench --model model --benchmark rgb

List Available Benchmarks:

from omnigenbench import BenchHub
print(BenchHub.list_benchmarks())
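In scripts that accept a benchmark name from user input, normalizing the name before passing it on avoids the case-sensitivity error. The sketch below is a hypothetical helper; the list of names is copied from the error message above and may not match the installed version.

```python
# Names taken from the error message above; the installed version may differ.
KNOWN_BENCHMARKS = ("RGB", "BEACON", "PGB", "GUE", "GB")


def normalize_benchmark(name):
    """Return the canonical benchmark name, ignoring case and surrounding whitespace."""
    candidate = name.strip().upper()
    if candidate not in KNOWN_BENCHMARKS:
        raise ValueError(
            f"Unknown benchmark {name!r}. Expected one of: {', '.join(KNOWN_BENCHMARKS)}"
        )
    return candidate


print(normalize_benchmark(" rgb "))
```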

Problem: Trainer Backend Not Available

Symptoms:

ValueError: Trainer 'Accelerate' not recognized. Use 'native', 'accelerate', or 'hf_trainer'

Solutions:

Use Lowercase Trainer Names:

# CORRECT: Lowercase trainer names
ogb autobench --model model --benchmark RGB --trainer native
ogb autobench --model model --benchmark RGB --trainer accelerate
ogb autobench --model model --benchmark RGB --trainer hf_trainer

# WRONG: Capitalized will fail
# ogb autobench --model model --benchmark RGB --trainer Accelerate

Install Missing Dependencies:

# For accelerate trainer
pip install accelerate

# For hf_trainer
pip install "transformers[torch]"  # Quotes keep zsh from expanding the brackets

Verify Trainer Availability:

try:
    from accelerate import Accelerator
    print("Accelerate trainer available")
except ImportError:
    print("Install: pip install accelerate")
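The same check can be generalized to every trainer name without importing the heavy packages themselves. This is a sketch under the assumption that each trainer needs only the package named in the install commands above; the mapping and the helper `trainer_available` are illustrative.

```python
import importlib.util

# Trainer name -> importable package it needs.
# This mapping is an assumption based on the install commands above.
TRAINER_REQUIREMENTS = {
    "native": None,
    "accelerate": "accelerate",
    "hf_trainer": "transformers",
}


def trainer_available(trainer):
    """Return True if the package needed by `trainer` can be found without importing it."""
    name = trainer.lower()
    if name not in TRAINER_REQUIREMENTS:
        raise ValueError(f"Unknown trainer {trainer!r}")
    package = TRAINER_REQUIREMENTS[name]
    return package is None or importlib.util.find_spec(package) is not None


for trainer in TRAINER_REQUIREMENTS:
    status = "ok" if trainer_available(trainer) else "missing dependency"
    print(f"{trainer}: {status}")
```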

Installation Issues

Problem: ViennaRNA Installation Fails

Symptoms:

ERROR: Could not build wheels for viennarna
error: command 'gcc' failed with exit status 1

Solutions:

Use Conda (Recommended):

conda install -c bioconda viennarna
pip install omnigenbench

Install Build Dependencies (Linux):

sudo apt-get update
sudo apt-get install build-essential python3-dev
pip install viennarna

Use Pre-built Binary (macOS):

brew install viennarna
pip install viennarna

Windows Users: Use WSL2 or Docker:

# Inside WSL2
sudo apt-get install python3-viennarna

Problem: PyTorch CUDA Mismatch

Symptoms:

RuntimeError: CUDA error: no kernel image is available for execution on the device
torch.cuda.is_available() returns False

Solutions:

Check CUDA Version:

nvidia-smi  # "CUDA Version" (top right) is the highest version the driver supports
python -c "import torch; print(torch.version.cuda)"  # CUDA version PyTorch was built with

Reinstall Matching PyTorch:

# For CUDA 11.8
pip uninstall torch
pip install torch --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1
pip install torch --index-url https://download.pytorch.org/whl/cu121

Verify Installation:

import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
print(f"Device count: {torch.cuda.device_count()}")

Problem: Import Errors After Installation

Symptoms:

ModuleNotFoundError: No module named 'omnigenbench'
ImportError: cannot import name 'OmniModelForSequenceClassification'

Solutions:

Verify Installation:

pip show omnigenbench
# Check that package is installed and version is correct

Check Python Environment:

# Ensure you're in the correct environment
which python  # Linux/Mac
where python  # Windows

# Reinstall in correct environment
pip install --upgrade omnigenbench

Development Installation (if working from source):

cd /path/to/OmniGenBench
pip install -e .  # Editable install

Clear Python Cache:

# Remove cached bytecode
find . -type d -name "__pycache__" -exec rm -rf {} +  # Linux/Mac
# Windows: Delete __pycache__ folders manually
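Because `find` is not available in a standard Windows shell, a small Python script can clear the bytecode caches on any platform. The sketch below uses only the standard library; `clear_pycache` is a hypothetical helper, and the demo runs against a throwaway directory so no project files are touched.

```python
import shutil
import tempfile
from pathlib import Path


def clear_pycache(root="."):
    """Delete every __pycache__ directory under `root`; return how many were found."""
    # Materialize the list first so deletion does not disturb the directory walk
    targets = [p for p in Path(root).rglob("__pycache__") if p.is_dir()]
    for target in targets:
        shutil.rmtree(target, ignore_errors=True)
    return len(targets)


# Demo on a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "pkg" / "__pycache__"
    cache.mkdir(parents=True)
    (cache / "mod.cpython-312.pyc").write_bytes(b"")
    removed = clear_pycache(tmp)
    still_exists = cache.exists()

print(f"Removed {removed} cache directories")
```

To clean a real checkout, call `clear_pycache("/path/to/OmniGenBench")` from the environment where the import error occurs.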

Problem: Windows Emoji/Unicode Display Issues

Symptoms:

UnicodeEncodeError: 'charmap' codec can't encode character
Display shows garbled characters instead of progress bars

Solutions:

Set UTF-8 Encoding (Recommended):

# PowerShell
$env:PYTHONIOENCODING="utf-8"
[Console]::OutputEncoding = [System.Text.Encoding]::UTF8

# Git Bash (recommended for Windows)
export PYTHONIOENCODING=utf-8

Use Git Bash Instead of CMD:

Git Bash provides better Unicode support and is the recommended terminal for Windows users.

Disable Emoji Output (if issues persist):

OmniGenBench avoids emoji in output by design for Windows compatibility. If you see encoding errors, they likely come from custom print statements or third-party libraries.

Problem: Windows Path Issues

Symptoms:

FileNotFoundError: [Errno 2] No such file or directory: 'results\\model'
OSError: [WinError 123] Invalid filename, directory name, or volume label

Solutions:

Use Forward Slashes:

# Good - works on all platforms
model.save_model("results/my_model")
dataset = OmniDataset("data/sequences.json", tokenizer)

# Avoid - Windows-specific backslashes
# model.save_model("results\\my_model")

Use pathlib for Cross-Platform Compatibility:

from pathlib import Path

output_dir = Path("results") / "models" / "experiment_1"
output_dir.mkdir(parents=True, exist_ok=True)
model.save_model(str(output_dir))

Avoid Long Paths (Windows 260-character limit):

# Enable long path support (Windows 10+, requires admin)
# Run in PowerShell as Administrator:
New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" `
                 -Name "LongPathsEnabled" -Value 1 -PropertyType DWORD -Force

Solutions (general environment checks):

Verify Installation:

pip show omnigenbench
which python  # Ensure correct environment

Check Virtual Environment:

conda activate omnigen_env
pip list | grep omnigenbench

Reinstall in Clean Environment:

conda create -n omnigen_fresh python=3.12
conda activate omnigen_fresh
pip install omnigenbench

Runtime Errors

Problem: Out of Memory (OOM) Errors

Symptoms:

RuntimeError: CUDA out of memory. Tried to allocate 2.00 GiB
torch.cuda.OutOfMemoryError

Solutions:

Reduce Batch Size:

# Python API
bench.run(batch_size=8)  # Instead of 32

# CLI
ogb autobench --model model --benchmark RGB --batch-size 8

Enable Gradient Checkpointing (for training):

from omnigenbench import AutoTrain

trainer = AutoTrain(
    dataset="data",
    config_or_model="model",
    gradient_checkpointing=True  # Trades compute for memory
)

Use Mixed Precision:

ogb autotrain --dataset data --model model --autocast

Monitor GPU Memory:

watch -n 1 nvidia-smi
# Or
nvidia-smi dmon -s u -d 1

Clear GPU Cache:

import torch
torch.cuda.empty_cache()
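Reducing the batch size by hand can be automated with a retry loop that halves the batch size whenever an out-of-memory error is raised. The sketch below is a generic pattern, not an OmniGenBench feature: `run_with_backoff` and `fake_step` are hypothetical, and the error detection assumes the OOM surfaces as a `RuntimeError` whose message contains "out of memory", as in the symptoms above.

```python
def run_with_backoff(step_fn, batch_size=32, min_batch_size=1):
    """Call step_fn(batch_size), halving the batch size whenever it raises an OOM error."""
    while True:
        try:
            return batch_size, step_fn(batch_size)
        except RuntimeError as err:
            is_oom = "out of memory" in str(err).lower()
            # Re-raise unrelated errors, or give up once the minimum is reached
            if not is_oom or batch_size <= min_batch_size:
                raise
            batch_size = max(min_batch_size, batch_size // 2)
            print(f"OOM detected, retrying with batch_size={batch_size}")


# Stand-in for a real training or benchmark step (hypothetical)
def fake_step(batch_size):
    if batch_size > 8:
        raise RuntimeError("CUDA out of memory. Tried to allocate 2.00 GiB")
    return "ok"


used, result = run_with_backoff(fake_step, batch_size=32)
print(used, result)
```

With a real GPU workload, calling `torch.cuda.empty_cache()` before each retry helps release memory held by the failed attempt.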

Problem: Sequence Length Exceeds Max Length

Symptoms:

ValueError: Input length 8192 exceeds maximum length 512
RuntimeError: The size of tensor a (8192) must match the size of tensor b (512)

Solutions:

Increase max_length Parameter:

dataset = OmniDatasetForSequenceClassification(
    dataset_name_or_path="data.json",
    tokenizer=tokenizer,
    max_length=8192  # Increase from default 512
)

# For models, ensure they support longer sequences
model = OmniModelForSequenceClassification(
    "yangheng/OmniGenome-186M",
    tokenizer=tokenizer,
    num_labels=2
)
# Note: Base model must support the target sequence length

Use drop_long_seq to Filter:

dataset = OmniDatasetForSequenceClassification(
    dataset_name_or_path="data.json",
    tokenizer=tokenizer,
    max_length=512,
    drop_long_seq=True  # Drop sequences > max_length instead of truncating
)

Chunking Long Sequences (for very long genomic regions):

def chunk_sequence(seq, chunk_size=512, overlap=50):
    """Split long sequence into overlapping chunks."""
    chunks = []
    for i in range(0, len(seq), chunk_size - overlap):
        chunk = seq[i:i + chunk_size]
        if len(chunk) >= chunk_size // 2:  # Keep meaningful chunks
            chunks.append(chunk)
    return chunks

# Process each chunk separately
long_seq = "ATCG" * 3000  # 12000 bp sequence
chunks = chunk_sequence(long_seq, chunk_size=512)
results = [model.inference(chunk) for chunk in chunks]
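Note that `chunk_sequence` above discards a trailing chunk shorter than half of `chunk_size`, so the last bases of a sequence may go unprocessed. When every position must be covered, or when per-chunk results need to be mapped back to genomic coordinates, a variant that keeps the tail and records start offsets may be more suitable. The following is a sketch; `chunk_with_offsets` is a hypothetical helper, and `chunk_size` counts characters, so it is assumed that the tokenizer leaves room for any special tokens it adds.

```python
def chunk_with_offsets(seq, chunk_size=512, overlap=50):
    """Return (start, chunk) pairs that together cover every position of seq."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    pieces = []
    for start in range(0, len(seq), step):
        pieces.append((start, seq[start:start + chunk_size]))
        if start + chunk_size >= len(seq):  # This chunk already reaches the end
            break
    return pieces


pieces = chunk_with_offsets("ATCG" * 3000, chunk_size=512, overlap=50)
covered = max(start + len(chunk) for start, chunk in pieces)
print(f"{len(pieces)} chunks covering {covered} bases")
```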

Problem: ViennaRNA Fold Function Errors

Symptoms:

AttributeError: module 'RNA' has no attribute 'fold'
ImportError: No module named 'RNA'

Solutions:

Install ViennaRNA (Required for RNA structure prediction/design):

# Conda (recommended)
conda install -c bioconda viennarna

# Linux
sudo apt-get install python3-viennarna

# macOS
brew install viennarna

Verify Installation:

try:
    import RNA
    structure, mfe = RNA.fold("GCGAAACGC")
    print(f"Structure: {structure}, MFE: {mfe}")
except ImportError:
    print("ViennaRNA not installed")

Windows Users: ViennaRNA has limited Windows support. Options:
- Use WSL2 (Windows Subsystem for Linux)
- Use Docker container with ViennaRNA
- Use online RNA folding services as fallback

Use Model with Longer Context (applies to the sequence-length errors above):

# Models with long-context support
model = ModelHub.load("LongSafari/hyenadna-medium-160k-seqlen-hf")  # 160k tokens

Truncate Sequences:

dataset = OmniDatasetForSequenceClassification(
    dataset_name_or_path="data.json",
    tokenizer=tokenizer,
    max_length=512,
    truncation=True  # Enable truncation
)

Problem: HuggingFace Hub Authentication Errors

Symptoms:

HTTPError: 401 Client Error: Unauthorized
Repository not found

Solutions:

Login to HuggingFace:

huggingface-cli login
# Enter your token from https://huggingface.co/settings/tokens

Set Environment Variable:

export HF_TOKEN=hf_your_token_here  # Variable name read by huggingface_hub

Use Access Token in Code:

from huggingface_hub import login
login(token="hf_your_token_here")

Problem: Windows Encoding Errors

Symptoms:

UnicodeEncodeError: 'charmap' codec can't encode character '\u2713'

Solutions:

Set Terminal Encoding (PowerShell):

[Console]::OutputEncoding = [System.Text.Encoding]::UTF8

Use Git Bash (Recommended for Windows):

export PYTHONIOENCODING=utf-8

Disable Unicode in Output:

ogb autobench --model model --benchmark RGB --no-unicode

Model Loading Issues

Problem: Tokenizer Not Found

Symptoms:

OSError: Can't load tokenizer for 'yangheng/OmniGenome-186M'

Solutions:

Verify Model Exists:

# Check on HuggingFace Hub
# https://huggingface.co/yangheng/OmniGenome-186M

Specify Tokenizer Explicitly:

from omnigenbench import ModelHub, OmniSingleNucleotideTokenizer

tokenizer = OmniSingleNucleotideTokenizer.from_pretrained("model")
model = ModelHub.load("model", tokenizer=tokenizer)

Use Local Tokenizer:

tokenizer = OmniSingleNucleotideTokenizer(
    vocab_file="./tokenizer_vocab.json"
)

Problem: Model Architecture Mismatch

Symptoms:

RuntimeError: Error(s) in loading state_dict for BertModel:
size mismatch for embeddings.word_embeddings.weight

Solutions:

Use Correct Task-Specific Model Class:

# WRONG: Generic model class
from omnigenbench import OmniModel

# RIGHT: Task-specific model class
from omnigenbench import OmniModelForSequenceClassification

model = OmniModelForSequenceClassification(
    config_or_model="yangheng/ogb_tfb_finetuned",
    num_labels=919  # Match training configuration
)

Check Model Configuration:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("model")
print(config)  # Verify num_labels, hidden_size, etc.

Problem: trust_remote_code Error

Symptoms:

ValueError: Loading this model requires you to execute code in the model
repository. You can enable this by setting `trust_remote_code=True`

Solutions:

Enable trust_remote_code:

model = ModelHub.load(
    "yangheng/OmniGenome-186M",
    trust_remote_code=True
)

Understand Security Implications: Only use for trusted models from reputable sources.

Training & Evaluation Issues

Problem: Training Loss is NaN

Symptoms:

Epoch 1: loss = nan
RuntimeError: loss is nan

Solutions:

Reduce Learning Rate:

trainer = AutoTrain(
    dataset="data",
    config_or_model="model",
    learning_rate=1e-5  # Instead of default 2e-5
)

Enable Gradient Clipping:

trainer = AutoTrain(
    dataset="data",
    config_or_model="model",
    max_grad_norm=1.0  # Clip gradients
)

Check Data Quality:

# Verify no NaN or Inf in labels
import json
import math

with open("train.json") as f:
    data = json.load(f)
labels = [d["label"] for d in data]
bad = [l for l in labels if isinstance(l, float) and not math.isfinite(l)]
print(f"Non-finite label count: {len(bad)}")  # Counts both NaN and +/-Inf

Use Mixed Precision Carefully:

# Try without autocast first
ogb autotrain --dataset data --model model

Problem: Poor Model Performance

Symptoms:

Test accuracy: 0.52 (close to random)
MCC: 0.05

Solutions:

Increase Training Epochs:

ogb autotrain --dataset data --model model --num-epochs 100

Adjust Learning Rate:

# Try learning rate sweep
for lr in [1e-6, 5e-6, 1e-5, 5e-5, 1e-4]:
    trainer = AutoTrain(
        dataset="data",
        config_or_model="model",
        learning_rate=lr
    )
    trainer.run()

Use Multi-Seed Evaluation:

bench = AutoBench(
    benchmark="RGB",
    config_or_model="model"
)
bench.run(seeds=[0, 1, 2, 3, 4])  # Average over 5 runs

Verify Data Quality:

# Check class balance
import json
from collections import Counter

with open("train.json") as f:  # Close the file once it has been read
    data = json.load(f)
labels = [d['label'] for d in data]
print(Counter(labels))  # Should not be extremely imbalanced
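To put a number on "extremely imbalanced", the class counts can be turned into an imbalance ratio and a set of inverse-frequency weights. The sketch below uses the common heuristic `total / (n_classes * count)`; the helper name and the toy labels are illustrative, and this guide does not state whether `AutoTrain` accepts class weights, so applying them is left to the training code in use.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {label: weight} where rarer classes get proportionally larger weights."""
    counts = Counter(labels)
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * count) for label, count in counts.items()}


# Toy labels for illustration: 90 negatives and 10 positives
labels = [0] * 90 + [1] * 10
counts = Counter(labels)
weights = inverse_frequency_weights(labels)
imbalance_ratio = max(counts.values()) / min(counts.values())

print(f"Imbalance ratio: {imbalance_ratio:.1f}")
print(f"Class weights: {weights}")
```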

Try Different Model:

# Models with different architectures
ogb autotrain --dataset data --model zhihan1996/DNABERT-2-117M
ogb autotrain --dataset data --model yangheng/OmniGenome-186M

Problem: Slow Training Speed

Symptoms:

Training speed: 0.5 it/s (expected 5-10 it/s)

Solutions:

Use Multi-GPU Training:

ogb autotrain --dataset data --model model --trainer accelerate

Increase Batch Size:

ogb autotrain --dataset data --model model --batch-size 64

Enable Mixed Precision:

ogb autotrain --dataset data --model model --autocast

Use DataLoader Workers:

trainer = AutoTrain(
    dataset="data",
    config_or_model="model",
    num_workers=4  # Parallel data loading
)

Profile Bottlenecks:

import time

start = time.time()
# Training step
print(f"Time per batch: {(time.time() - start):.3f}s")
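A single timestamp pair does not show whether data loading or the model step dominates. The sketch below accumulates time per named phase with a context manager; the `timed` helper and the stand-in workloads are hypothetical and would be replaced by the real loader and training step.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(name):
    """Accumulate wall-clock seconds spent inside the block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = timings.get(name, 0.0) + time.perf_counter() - start


# Stand-ins for the real data-loading and forward/backward steps
for _ in range(3):
    with timed("data_loading"):
        time.sleep(0.01)
    with timed("train_step"):
        # On GPU, call torch.cuda.synchronize() before the block ends,
        # because CUDA kernels run asynchronously
        sum(i * i for i in range(10_000))

for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f}s total")
```

If `data_loading` dominates, more DataLoader workers are likely to help; if `train_step` dominates, mixed precision or a larger batch size is the more promising direction.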

RNA Design Issues

Problem: RNA Design Not Converging

Symptoms:

Generation 100: Best score = 5 (Hamming distance from target)
No perfect matches found

Solutions:

Increase Population Size:

ogb rna_design --structure "(((...)))" --num-population 500

Increase Generations:

ogb rna_design --structure "(((...)))" --num-generation 300

Adjust Mutation Rate:

# Try lower mutation rate for fine-tuning
ogb rna_design --structure "(((...)))" --mutation-ratio 0.2

Use Different Model:

# Try RNA-specialized model
ogb rna_design --structure "(((...)))" --model yangheng/OmniGenome-186M

Verify Structure Validity:

def validate_structure(structure):
    """Check if dot-bracket notation is balanced."""
    stack = []
    for char in structure:
        if char == '(':
            stack.append(char)
        elif char == ')':
            if not stack:
                return False
            stack.pop()
    return len(stack) == 0

print(validate_structure("(((...)))"))  # True
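`validate_structure` only checks bracket balance, so a string such as `"((x))"` would still pass. A stricter sketch that also rejects unexpected characters and returns the base-pair positions is shown below. `parse_dot_bracket` is a hypothetical helper and supports only `(`, `)` and `.`; pseudoknot notation such as `[` and `]` is assumed to be out of scope.

```python
def parse_dot_bracket(structure):
    """Return a sorted list of (open_index, close_index) base pairs.

    Raises ValueError on unbalanced brackets or unsupported characters.
    """
    stack, pairs = [], []
    for i, char in enumerate(structure):
        if char == "(":
            stack.append(i)
        elif char == ")":
            if not stack:
                raise ValueError(f"Unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif char != ".":
            raise ValueError(f"Unsupported character {char!r} at position {i}")
    if stack:
        raise ValueError(f"Unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


print(parse_dot_bracket("(((...)))"))
```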

Problem: ViennaRNA Not Found

Symptoms:

ModuleNotFoundError: No module named 'RNA'
ImportError: cannot import name 'RNA'

Solutions:

Install ViennaRNA:

conda install -c bioconda viennarna

Verify Installation:

import RNA
print(RNA.fold("GCGAAACGC"))

Common Questions (FAQ)

Q: How do I know which trainer backend to use?

A: Follow these guidelines:

Development/Debugging: Use trainer="native" for explicit control
Production Training: Use trainer="accelerate" for multi-GPU scaling
Advanced Features: Use trainer="hf_trainer" for DeepSpeed, callbacks

Q: What’s the difference between predict() and inference()?

predict(): Returns raw model outputs (logits, hidden states)
inference(): Returns formatted predictions with probabilities and class labels

Q: How many seeds should I use for benchmarking?

Quick experiments: 1 seed
Paper results: 3-5 seeds (recommended)
Critical comparisons: 10+ seeds with significance testing
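Results from several seeds are usually reported as a mean with a standard deviation. The sketch below shows one way to summarize them with the standard library; the scores are hypothetical placeholder values, not measured results, and `summarize_runs` is an illustrative helper.

```python
import statistics


def summarize_runs(scores):
    """Return (mean, sample standard deviation) for per-seed scores."""
    mean = statistics.mean(scores)
    # Standard deviation is undefined for a single run; report 0.0 in that case
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, std


# Hypothetical per-seed metric values, for illustration only
scores = [0.81, 0.79, 0.83]
mean, std = summarize_runs(scores)
print(f"{mean:.3f} ± {std:.3f} over {len(scores)} seeds")
```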

Q: Why is my model loading slow?

A: First-time loading downloads from HuggingFace Hub. Subsequent runs use cached models.

Q: Can I use OmniGenBench for protein sequences?

A: Currently optimized for DNA/RNA. Protein support is experimental.

Q: How do I contribute to OmniGenBench?

A: See CONTRIBUTING.md in the repository root.

Getting Help

If your issue persists after trying these solutions:

Check GitHub Issues: https://github.com/yangheng95/OmniGenBench/issues
Open New Issue: Include error messages, system info, minimal reproduction code
Documentation: see the OmniGenBench documentation site
API Reference: see the API Reference section of the documentation

System Information Template (include when reporting issues):

python -c "import omnigenbench; print(omnigenbench.__version__)"
python -c "import torch; print(f'PyTorch: {torch.__version__}, CUDA: {torch.version.cuda}')"
python -c "import transformers; print(f'Transformers: {transformers.__version__}')"
nvidia-smi  # If using GPU
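The one-liners above fail with a traceback if a package is missing, which is itself useful information but awkward to paste into an issue. A single script that reports "not installed" instead may be more convenient. This is a sketch using `importlib.metadata`; the helper name `collect_versions` is illustrative, and the default package list is an assumption based on the commands above.

```python
import platform
from importlib import metadata


def collect_versions(packages=("omnigenbench", "torch", "transformers")):
    """Return {name: version or 'not installed'} plus Python and OS details."""
    report = {"python": platform.python_version(), "platform": platform.platform()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


for key, value in collect_versions().items():
    print(f"{key}: {value}")
```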