To see the full schema and try examples yourself, check out our API documentation.
Our classifier returns a document-level score, completely_generated_prob, that specifies the probability that the entire document was AI-generated. We recommend using this score when deciding whether AI played a significant role in generating the text.
On our validation dataset, here is how the results change when you classify all documents with completely_generated_prob under the threshold as human, and those above it as AI:
- At a threshold of 0.88, 85% of AI documents are classified as AI, and 99% of human documents are classified as human
- At a threshold of 0.5, 96% of AI documents are classified as AI, and 96% of human documents are classified as human
We recommend using a threshold of 0.88 or higher to minimize the number of false positives, as we think it is currently more harmful to falsely detect human writing as AI than vice versa.
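As an illustration, the thresholding described above could be sketched in Python as follows. Only the completely_generated_prob field name and the 0.88 and 0.5 thresholds come from this page; the function name, the label strings, and the choice to treat a score exactly equal to the threshold as AI are assumptions made for the example.

```python
# Minimal sketch of thresholding the document-level score.
# Assumption: a score exactly equal to the threshold counts as AI.

RECOMMENDED_THRESHOLD = 0.88  # value recommended above to limit false positives


def classify_document(completely_generated_prob: float,
                      threshold: float = RECOMMENDED_THRESHOLD) -> str:
    """Return "ai" or "human" (hypothetical labels) for a document score."""
    if not 0.0 <= completely_generated_prob <= 1.0:
        raise ValueError("completely_generated_prob must be between 0 and 1")
    return "ai" if completely_generated_prob >= threshold else "human"


if __name__ == "__main__":
    # The same score can flip labels depending on the threshold chosen.
    print(classify_document(0.60))                 # below 0.88
    print(classify_document(0.60, threshold=0.5))  # above 0.5
```

Raising the threshold trades recall on AI documents for fewer human documents flagged as AI, which is the trade-off the two bullet points above quantify.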
Additionally, we highlight sentences that have been detected as written by AI. API users can access this highlighting through the highlight_sentence_for_ai field. The sentence-level classification should not be used on its own to indicate that an essay contains AI (such as ChatGPT plagiarism). Rather, when a document gets an AI_ONLY classification, the highlighted sentences indicate where in the document we believe this occurred.
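The sketch below shows one way to apply that guidance: highlighted sentences are only reported once the document itself has been classified as AI_ONLY. The highlight_sentence_for_ai field and the AI_ONLY value come from this page; the shape of each sentence entry (a dictionary with a "sentence" text key) and the function name are assumptions, so check the API documentation for the actual schema.

```python
from __future__ import annotations

from typing import Any


def ai_highlighted_sentences(document_classification: str,
                             sentences: list[dict[str, Any]]) -> list[str]:
    """Return highlighted sentence texts, but only for AI_ONLY documents.

    Assumption: each entry looks like
    {"sentence": "<text>", "highlight_sentence_for_ai": <bool>}.
    """
    # Sentence-level flags are not treated as evidence on their own.
    if document_classification != "AI_ONLY":
        return []
    return [s["sentence"] for s in sentences
            if s.get("highlight_sentence_for_ai")]


if __name__ == "__main__":
    sample = [
        {"sentence": "First sentence.", "highlight_sentence_for_ai": True},
        {"sentence": "Second sentence.", "highlight_sentence_for_ai": False},
    ]
    print(ai_highlighted_sentences("AI_ONLY", sample))
```

Gating on the document-level result first keeps a single highlighted sentence in an otherwise human-written essay from being read as proof of AI use.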