paligemma2_vizwiz_gqa

This model is a PEFT adapter fine-tuned from ebrukilic/paligemma2_vizwiz_ft2 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4691
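
Because this repository ships a PEFT adapter rather than full model weights, it must be attached to a compatible PaliGemma 2 checkpoint at load time. A minimal loading sketch follows; the base checkpoint name is an assumption, since this card does not state which full model the adapter chain ultimately targets:

```python
# Minimal loading sketch -- the base checkpoint below is an ASSUMPTION;
# this card does not name the full model the adapter targets.
import torch
from peft import PeftModel
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

base_id = "google/paligemma2-3b-pt-224"        # assumed base checkpoint
adapter_id = "ebrukilic/paligemma2_vizwiz_gqa"  # this repository

model = PaliGemmaForConditionalGeneration.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the adapter
processor = AutoProcessor.from_pretrained(base_id)
```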

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: paged_adamw_8bit (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 2
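
For reference, here is a sketch of a transformers TrainingArguments object mirroring these values. The output_dir name is illustrative, and the surrounding model, dataset, and Trainer setup are omitted:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above; output_dir is illustrative.
training_args = TrainingArguments(
    output_dir="paligemma2_vizwiz_gqa",
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,   # total train batch size: 1 * 4 = 4
    num_train_epochs=2,
    lr_scheduler_type="linear",
    optim="paged_adamw_8bit",        # betas=(0.9, 0.999), eps=1e-8 are the defaults
    seed=42,
)
```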

Training results

Training Loss   Epoch    Step   Validation Loss
0.8199          0.0288    200   0.6891
0.6335          0.0575    400   0.6061
0.5961          0.0863    600   0.5755
0.5773          0.1150    800   0.5565
0.5723          0.1438   1000   0.5471
0.5853          0.1725   1200   0.5325
0.5973          0.2013   1400   0.5301
0.5319          0.2300   1600   0.5167
0.5250          0.2588   1800   0.5178
0.5420          0.2875   2000   0.5051
0.5971          0.3163   2200   0.5006
0.5757          0.3450   2400   0.5005
0.5026          0.3738   2600   0.4914
0.5441          0.4025   2800   0.4802
0.5473          0.4313   3000   0.4861
0.4936          0.4600   3200   0.4880
0.4442          0.4888   3400   0.4821
0.5176          0.5175   3600   0.4668
0.4546          0.5463   3800   0.4720
0.4810          0.5750   4000   0.4726
0.4701          0.6038   4200   0.4693
0.5143          0.6325   4400   0.4691

Framework versions

  • PEFT 0.18.0
  • Transformers 4.57.3
  • PyTorch 2.9.0+cu126
  • Datasets 4.4.2
  • Tokenizers 0.22.1