For reference, our API is detailed here.
Our API returns a document_classification field, which indicates the most likely classification of the document. The possible values are HUMAN_ONLY, MIXED, and AI_ONLY. We also provide a probability for each classification in the class_probabilities field, whose keys are human, ai, and mixed. To get the probability for the most likely classification, look up the value of the predicted_class field in class_probabilities. The class probability corresponding to the predicted class can be interpreted as the chance that the detector is correct in its classification. For example, 90% means that the detector's prediction is correct 90% of the time on similar documents.
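As a minimal sketch of that lookup, assume the prediction for a single document has already been parsed into a Python dictionary containing the fields described above. The flat layout, the sample values, and the use of fractions between 0 and 1 for probabilities are assumptions for illustration, not taken from the API reference.

```python
# Hypothetical prediction for one document. The field names come from the text
# above; the flat layout, the values, and the 0-1 probability scale are assumed.
prediction = {
    "document_classification": "AI_ONLY",
    "predicted_class": "ai",
    "class_probabilities": {"human": 0.03, "ai": 0.90, "mixed": 0.07},
    "confidence_category": "high",
}


def predicted_class_probability(pred: dict) -> float:
    """Return the probability assigned to the predicted class."""
    return pred["class_probabilities"][pred["predicted_class"]]


print(predicted_class_probability(prediction))  # 0.9
```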
Lastly, each prediction comes with a confidence_category field, which can be high, medium, or low. Confidence categories are tuned such that when the confidence_category field is high, 99.1% of human articles are classified as human and 98.4% of AI articles are classified as AI.
If you would like to customize predictions further for your use case, for example to increase the rate at which AI documents are detected, you can set thresholds to transform the document_classification and predicted_class. For example, you can require that if the predicted_class is human or mixed, the corresponding class probability must be greater than 65%; otherwise the prediction is overridden to ai.
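The 65% rule could be sketched roughly as follows. The function name increase_ai_sensitivity is hypothetical, and the sketch assumes that the prediction is a flat dictionary, that probabilities are fractions between 0 and 1, and that an ai prediction corresponds to a document_classification of AI_ONLY.

```python
def increase_ai_sensitivity(pred: dict, min_probability: float = 0.65) -> dict:
    """Override a weak human or mixed prediction to ai.

    Assumes probabilities are fractions in [0, 1] and that the ai class
    corresponds to the AI_ONLY document classification.
    """
    adjusted = dict(pred)  # shallow copy so the original prediction is untouched
    predicted = pred["predicted_class"]
    if predicted in ("human", "mixed"):
        probability = pred["class_probabilities"][predicted]
        # Keep the prediction only if its probability is greater than the threshold.
        if probability <= min_probability:
            adjusted["predicted_class"] = "ai"
            adjusted["document_classification"] = "AI_ONLY"
    return adjusted


# Hypothetical example: a human prediction at 60% is overridden to ai.
weak_human = {
    "document_classification": "HUMAN_ONLY",
    "predicted_class": "human",
    "class_probabilities": {"human": 0.60, "ai": 0.30, "mixed": 0.10},
}
print(increase_ai_sensitivity(weak_human)["predicted_class"])  # ai
```

Returning a copy rather than mutating the input keeps the detector's original output available for auditing.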
Important: We discourage increasing detector sensitivity for academic use cases since false negatives (AI/mixed document classified as human) are preferred to false positives (human document classified as AI/mixed).
Similar logic can be applied to reduce false positives by overriding a prediction to human if, for example, the ai or mixed class probability is less than 80%.
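One way to sketch the 80% rule is shown here. The function name reduce_false_positives is hypothetical, and the sketch assumes that the prediction is a flat dictionary, that probabilities are fractions between 0 and 1, and that a human prediction corresponds to a document_classification of HUMAN_ONLY.

```python
def reduce_false_positives(pred: dict, min_probability: float = 0.80) -> dict:
    """Override a weak ai or mixed prediction to human.

    Assumes probabilities are fractions in [0, 1] and that the human class
    corresponds to the HUMAN_ONLY document classification.
    """
    adjusted = dict(pred)  # shallow copy so the original prediction is untouched
    predicted = pred["predicted_class"]
    if predicted in ("ai", "mixed"):
        probability = pred["class_probabilities"][predicted]
        # Fall back to human when the ai or mixed probability is below the threshold.
        if probability < min_probability:
            adjusted["predicted_class"] = "human"
            adjusted["document_classification"] = "HUMAN_ONLY"
    return adjusted


# Hypothetical example: a mixed prediction at 55% is overridden to human.
weak_mixed = {
    "document_classification": "MIXED",
    "predicted_class": "mixed",
    "class_probabilities": {"human": 0.25, "ai": 0.20, "mixed": 0.55},
}
print(reduce_false_positives(weak_mixed)["predicted_class"])  # human
```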