SentenceTransformer based on microsoft/mdeberta-v3-base

This is a sentence-transformers model fine-tuned from microsoft/mdeberta-v3-base. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: microsoft/mdeberta-v3-base
Maximum Sequence Length: 512 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
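
The Pooling module above enables only pooling_mode_mean_tokens, so a sentence embedding is the average of its token embeddings, with padding positions left out. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name `mean_pool` and the toy inputs are hypothetical and chosen only for illustration.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("at least one token must be unmasked")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy example: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

In the real model each token vector has 768 entries, so the pooled sentence embedding has 768 entries as well.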

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("BlackBeenie/mdeberta-v3-base-sbert")
# Run inference
sentences = [
    'hot dogs',
    'Habitual physical exercise has beneficial effects on telomere length in postmenopausal women. OBJECTIVE: It has been reported that women benefit from the maintenance of telomere length by estrogen. Exercise may favorably influence telomere length, although results are inconsistent regarding the duration and type of exercise and the cell type used to measure telomere length. The purpose of this study was to investigate the relationship between habitual physical exercise and telomere length in peripheral blood mononuclear cells (PBMCs) in postmenopausal women. Postmenopausal women were chosen as study participants because they are typically estrogen deficient. METHODS: This experimental-control, cross-sectional study included 44 healthy, nondiabetic, nonsmoking, postmenopausal women. Habitual exercisers and sedentary participants were matched for age and body mass index. Body weight, height, blood pressure, and waist and hip circumference were measured. Mitochondrial DNA copy number and telomere length in PBMCs were determined, and biochemical tests were performed. Habitual physical exercise was defined as combined aerobic and resistance exercise performed for at least 60 minutes per session more than three times a week for more than 12 months. RESULTS: The mean age of all participants was 58.11 ± 6.84 years, and participants in the habitual exercise group had been exercising more than three times per week for an average of 19.23 ± 5.15 months. Serum triglyceride levels (P = 0.01), fasting insulin concentrations (P < 0.01), and homeostasis model assessment of insulin resistance (P < 0.01) were significantly lower and high-density lipoprotein cholesterol levels (P < 0.01), circulating adiponectin (P < 0.01), mitochondrial DNA copy number (P < 0.01), and telomere length (P < 0.01) were significantly higher in the habitual exercise group than in the sedentary group. '
    'In a stepwise multiple regression analysis, habitual exercise (β = 0.522, P < 0.01) and adiponectin levels (β = 0.139, P = 0.03) were the independent factors associated with the telomere length of PBMCs in postmenopausal women. CONCLUSIONS: Habitual physical exercise is associated with greater telomere length in postmenopausal women. This finding suggests that habitual physical exercise in postmenopausal women may reduce telomere attrition.',
    'Dietary modification of human macular pigment density. PURPOSE: The retinal carotenoids lutein (L) and zeaxanthin (Z) that form the macular pigment (MP) may help to prevent neovascular age-related macular degeneration. The purpose of this study was to determine whether MP density in the retina could be raised by increasing dietary intake of L and Z from foods. METHODS: Macular pigment was measured psychophysically for 13 subjects. Serum concentrations of L, Z, and beta-carotene were measured by high-performance liquid chromatography. Eleven subjects modified their usual daily diets by adding 60 g of spinach (10.8 mg L, 0.3 mg Z, 5 mg beta-carotene) and ten also added 150 g of corn (0.3 mg Z, 0.4 mg L); two other subjects were given only corn. Dietary modification lasted up to 15 weeks. RESULTS: For the subjects fed spinach or spinach and corn, three types of responses to dietary modification were identified: Eight "retinal responders" had increases in serum L (mean, 33%; SD, 22%) and in MP density (mean, 19%; SD, 11%); two "retinal nonresponders" showed substantial increases in serum L (mean, 31%) but not in MP density (mean, -11%); one "serum and retinal nonresponder" showed no changes in serum L, Z, or beta-carotene and no change in MP density. For the two subjects given only corn, serum L changed little (+11%, -6%), but in one subject serum Z increased (70%) and MP density increased (25%). CONCLUSIONS: Increases in MP density were obtained within 4 weeks of dietary modification for most, but not all, subjects. When MP density increased with dietary modification, it remained elevated for at least several months after resuming an unmodified diet. Augmentation of MP for both experimental and clinical investigation appears to be feasible for many persons.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
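
For semantic search, the same embeddings can be used to rank documents against a query. The sketch below is one possible way to do that with plain cosine similarity; `cosine` and `rank_documents` are hypothetical helper names, and the only assumption about `model` is that it has an `encode` method returning one vector per input text, as a loaded SentenceTransformer does.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(model, query: str, documents: List[str]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted by descending cosine similarity."""
    # Encode the query and the documents in a single call.
    vectors = model.encode([query] + documents)
    query_vec, doc_vecs = vectors[0], vectors[1:]
    scored = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With the model and sentences loaded as above, `rank_documents(model, sentences[0], sentences[1:])` would order the two abstracts by their similarity to the query 'hot dogs'.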

Evaluation

Metrics

Information Retrieval

Dataset: eval
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.1975
cosine_accuracy@3	0.3117
cosine_accuracy@5	0.3673
cosine_accuracy@10	0.4074
cosine_precision@1	0.1975
cosine_precision@3	0.179
cosine_precision@5	0.1827
cosine_precision@10	0.1599
cosine_recall@1	0.0125
cosine_recall@3	0.0282
cosine_recall@5	0.0462
cosine_recall@10	0.0746
cosine_ndcg@10	0.1771
cosine_mrr@10	0.2604
cosine_map@100	0.1089
dot_accuracy@1	0.142
dot_accuracy@3	0.2346
dot_accuracy@5	0.2685
dot_accuracy@10	0.3302
dot_precision@1	0.142
dot_precision@3	0.1379
dot_precision@5	0.1395
dot_precision@10	0.1309
dot_recall@1	0.0053
dot_recall@3	0.0198
dot_recall@5	0.0291
dot_recall@10	0.0492
dot_ndcg@10	0.1368
dot_mrr@10	0.1959
dot_map@100	0.0882
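
To make the metric names above concrete, here is a small sketch of how accuracy@k, precision@k, recall@k and the reciprocal rank can be computed for a single query; the reported values are averages over all evaluation queries. The function `retrieval_metrics` and the document ids are hypothetical, and this follows the common definitions of these metrics rather than reproducing the evaluator's code.

```python
from typing import Dict, List, Set


def retrieval_metrics(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> Dict[str, float]:
    """Per-query accuracy@k, precision@k, recall@k and reciprocal rank@k."""
    top_k = ranked_ids[:k]
    hits = [doc_id in relevant_ids for doc_id in top_k]
    num_hits = sum(hits)
    # Reciprocal rank of the first relevant document within the top k, else 0.
    reciprocal_rank = 0.0
    for position, is_hit in enumerate(hits, start=1):
        if is_hit:
            reciprocal_rank = 1.0 / position
            break
    return {
        "accuracy": 1.0 if num_hits > 0 else 0.0,
        "precision": num_hits / k,
        "recall": num_hits / len(relevant_ids) if relevant_ids else 0.0,
        "mrr": reciprocal_rank,
    }


# Toy query: four relevant documents, one of them retrieved at rank 2.
metrics = retrieval_metrics(["d7", "d2", "d9"], {"d2", "d4", "d5", "d6"}, k=3)
print(metrics)
```

Recall divides by the number of relevant documents for the query, so it can stay low even when precision is moderate if a query has many relevant documents.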

Training Details

Training Dataset

Unnamed Dataset

Size: 110,575 training samples
Columns: sentence_0, sentence_1, and label
Approximate statistics based on the first 1000 samples:

	sentence_0	sentence_1	label
type	string	string	int
details	min: 3 tokens mean: 6.46 tokens max: 19 tokens	min: 27 tokens mean: 394.71 tokens max: 512 tokens	1: 100.00%

Samples:

sentence_0	sentence_1	label
`chronic diseases`	Role of antioxidants in cancer therapy. Oxidative stress is a key component in linking environmental toxicity to the multistage carcinogenic process. Reactive oxygen species (ROS) are generated in response to both endogenous and exogenous stimuli. To counterbalance ROS-mediated injury, an endogenous antioxidants defense system exists; however, when oxidation exceeds the control mechanisms, oxidative stress arises. Chronic and cumulative oxidative stress induces deleterious modifications to a variety of macromolecular components, such as DNA, lipids, and proteins. A primary mechanism of many chemotherapy drugs against cancer cells is the formation of ROS, or free radicals. Radiotherapy is based on the fact that ionizing radiation destroys tumor cells. Radiotherapy induces direct lesions in the DNA or biological molecules, which eventually affect DNA. Free radicals produced by oncology therapy are often a source of serious side effects as well. The objective of this review is to provide information about the effects of antioxidants during oncology treatments and to discuss the possible events and efficacy. Much debate has arisen about whether antioxidant supplementation alters the efficacy of cancer chemotherapy. There is still limited evidence in both quality and sample size, suggesting that certain antioxidant supplements may reduce adverse reactions and toxicities. Significant reductions in toxicity may alleviate dose-limiting toxicities so that more patients are able to complete prescribed chemotherapy regimens and thus, in turn, improve the potential for success in terms of tumor response and survival. Copyright © 2013 Elsevier Inc. All rights reserved.	`1`
`plant-based diets`	Diet, infection and wheezy illness: lessons from adults. An increase in asthma and atopic disease has been recorded in many countries where society has become more prosperous. We have investigated two possible explanations: a reduction in childhood infections and a change in diet. In a cohort of people followed up since 1964, originally selected as a random sample of primary school children, we have investigated the relevance of family size and the common childhood infectious diseases to development of eczema, hay fever and asthma. Although membership of a large family reduced risks of hay fever and eczema (but not asthma), this was not explained by the infections the child had suffered. Indeed, the more infections the child had had, the greater the likelihood of asthma, although measles gave a modest measure of protection. We have investigated dietary factors in two separate studies. In the first, we have shown the risks of bronchial hyper-reactivity are increased seven-fold among those with the lowest intake of vitamin C, while the lowest intake of saturated fats gave a 10-fold protection. In the second, we have shown that the risk of adult-onset wheezy illness is increased five-fold by the lowest intake of vitamin E and doubled by the lowest intake of vitamin C. These results were supported by direct measurements of the vitamins and triglycerides in plasma. We have proposed that changes in the diet of pregnant women may have reflected those observed in the population as a whole and that these may have resulted in the birth of cohorts of children predisposed to atopy and asthma. The direct test of this is to study the diet and nutritional status of a large cohort of pregnant women and to follow their offspring forward. This is our current research.	`1`
`liver health`	Effect of a very-high-fiber vegetable, fruit, and nut diet on serum lipids and colonic function. We tested the effects of feeding a diet very high in fiber from fruit and vegetables. The levels fed were those, which had originally inspired the dietary fiber hypothesis related to colon cancer and heart disease prevention and also may have been eaten early in human evolution. Ten healthy volunteers each took 3 metabolic diets of 2 weeks duration. The diets were: high-vegetable, fruit, and nut (very-high-fiber, 55 g/1,000 kcal); starch-based containing cereals and legumes (early agricultural diet); or low-fat (contemporary therapeutic diet). All diets were intended to be weight-maintaining (mean intake, 2,577 kcal/d). Compared with the starch-based and low-fat diets, the high-fiber vegetable diet resulted in the largest reduction in low-density lipoprotein (LDL) cholesterol (33% +/- 4%, P <.001) and the greatest fecal bile acid output (1.13 +/- 0.30 g/d, P =.002), fecal bulk (906 +/- 130 g/d, P <.001), and fecal short-chain fatty acid outputs (78 +/- 13 mmol/d, P <.001). Nevertheless, due to the increase in fecal bulk, the actual concentrations of fecal bile acids were lowest on the vegetable diet (1.2 mg/g wet weight, P =.002). Maximum lipid reductions occurred within 1 week. Urinary mevalonic acid excretion increased (P =.036) on the high-vegetable diet reflecting large fecal steroid losses. We conclude that very high-vegetable fiber intakes reduce risk factors for cardiovascular disease and possibly colon cancer. Vegetable and fruit fibers therefore warrant further detailed investigation. Copyright 2001 by W.B. Saunders Company	`1`

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
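
As a rough illustration of what this loss computes: each sentence_0 in a batch is scored against every sentence_1 in the same batch using cosine similarity multiplied by the scale, and cross-entropy pushes the matching pair above the in-batch negatives. The sketch below is a simplified pure-Python version under that reading, not the library's implementation; `cos_sim` and `mnr_loss` are hypothetical names.

```python
import math
from typing import List, Sequence


def cos_sim(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mnr_loss(anchors: List[Sequence[float]], positives: List[Sequence[float]], scale: float = 20.0) -> float:
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


aligned = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)  # near 0 when pairs match, near 20 when they are swapped
```

When every candidate receives the same score, this loss equals the natural log of the batch size. For a batch of 32 that is about 3.47, which is close to the first logged training loss.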

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 32
per_device_eval_batch_size: 32
fp16: True
multi_dataset_batch_sampler: round_robin

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 32
per_device_eval_batch_size: 32
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 3
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: False
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
eval_use_gather_object: False
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin

Training Logs

Epoch	Step	Training Loss	eval_cosine_map@100
0.1447	500	3.4744	-
0.2894	1000	3.3463	-
0.4340	1500	3.2119	-
0.5787	2000	3.0852	-
0.7234	2500	2.9736	-
0.8681	3000	2.8964	-
1.0	3456	-	0.0628
1.0127	3500	2.8117	-
1.1574	4000	2.7464	-
1.3021	4500	2.6987	-
1.4468	5000	2.6423	0.0795
1.5914	5500	2.584	-
1.7361	6000	2.5438	-
1.8808	6500	2.4891	-
2.0	6912	-	0.0948
2.0255	7000	2.4555	-
2.1701	7500	2.442	-
2.3148	8000	2.4161	-
2.4595	8500	2.3882	-
2.6042	9000	2.3545	-
2.7488	9500	2.3274	-
2.8935	10000	2.3134	0.1082
3.0	10368	-	0.1089
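
The step and epoch columns in this table follow from the dataset size and batch size. A short check, assuming training on a single device so that the effective batch size equals per_device_train_batch_size; the variable and function names are illustrative only:

```python
import math

train_samples = 110_575  # size of the training dataset
batch_size = 32          # per_device_train_batch_size
epochs = 3               # num_train_epochs

# dataloader_drop_last is False, so the final partial batch counts as a step.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs


def epoch_at(step: int) -> float:
    """Fractional epoch reached after a given number of steps."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch, total_steps, epoch_at(500))  # 3456 10368 0.1447
```

These values match the rows logged at steps 500, 3456 and 10368.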

Framework Versions

Python: 3.10.12
Sentence Transformers: 3.1.1
Transformers: 4.44.2
PyTorch: 2.4.1+cu121
Accelerate: 0.34.2
Datasets: 3.0.0
Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
