On-demand answers available? [guardrailing]

#156
by Quetzalcoatl-homotopy - opened

Context:
From the last sentence of the Mistral documentation:
"The answers of Mistral 7B-Instruct without prompt and with Mistral prompts are available on demand as they contain examples of text that may be considered unsafe, offensive, or upsetting."

General question: What does "on demand" mean here?

Question 1: How can we get a sample of their (prompt, answer) pairs for Mistral 7B-Instruct that were "considered unsafe, offensive"?

Question 2: Can we get a sample of the dataset alluded to in the content moderation section: "We evaluated self-reflection on our manually curated and balanced dataset of adversarial and standard prompts and got a precision of 99.4% for a recall of 95.6% (considering acceptable prompts as positives)."?
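
For reference, here is how I read "considering acceptable prompts as positives" when computing precision and recall. This is only a minimal sketch: the labels below are a made-up toy example, not the Mistral dataset, and the `precision_recall` helper and the label names `"acceptable"` / `"adversarial"` are my own assumptions.

```python
def precision_recall(y_true, y_pred, positive="acceptable"):
    """Precision and recall with `positive` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    # True positives: acceptable prompts that were classified as acceptable.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: adversarial prompts that were classified as acceptable.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: acceptable prompts that were classified as adversarial.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical balanced toy set: 4 acceptable prompts, 4 adversarial prompts.
y_true = ["acceptable"] * 4 + ["adversarial"] * 4
# Hypothetical classifier output: one acceptable prompt is wrongly flagged,
# and one adversarial prompt is wrongly let through.
y_pred = ["acceptable", "acceptable", "acceptable", "adversarial",
          "adversarial", "adversarial", "adversarial", "acceptable"]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this reading, high precision means few adversarial prompts are let through as acceptable, and high recall means few acceptable prompts are wrongly flagged. Having access to a sample of the dataset would let us check whether this interpretation matches the reported numbers.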

Quetzalcoatl-homotopy changed discussion title from On-demand answers available? [gaurdailing] to On-demand answers available? [guardrailing]
