New and improved content moderation tooling

373 Views
No Comments

New and improved content moderation tooling

New and improved content moderation tooling

To help developers protect their applications against possible misuse, we are introducing the faster and more accurate Moderation endpoint. This endpoint provides OpenAI API developers with free access to GPT-based classifiers that detect undesired content—an instance of using AI systems to assist with human supervision of these systems. We have also released both a technical paper describing our methodology and the dataset used for evaluation.

When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm—content prohibited by our content policy. The endpoint has been trained to be quick, accurate, and to perform robustly across a range of applications. Importantly, this reduces the chances of products “saying” the wrong thing, even when deployed to users at-scale. As a consequence, AI can unlock benefits in sensitive settings, like education, where it could not otherwise be used with confidence.

input text

Violence
Self-harm
Hate
Sexual

Moderation endpoint

Flagged
Flagged

The Moderation endpoint helps developers to benefit from our infrastructure investments. Rather than build and maintain their own classifiers—an extensive process, as we document in our paper—they can instead access accurate classifiers through a single API call.

As part of OpenAI’s commitment to making the AI ecosystem safer, we are providing this endpoint to allow free moderation of all OpenAI API-generated content. For instance, Inworld, an OpenAI API customer, uses the Moderation endpoint to help their AI-based virtual characters remain appropriate for their audiences. By leveraging OpenAI’s technology, Inworld can focus on their core product: creating memorable characters. We currently do not support monitoring of third-party traffic.

Get started with the Moderation endpoint by checking out the documentation. More details of the training process and model performance are available in our paper. We have also released an evaluation dataset, featuring Common Crawl data labeled within these categories, which we hope will spur further research in this area.

Read More 

END
可以使用微信扫码关注公众号(ID:xzluomor)
post-qrcode
 0
Comment(No Comments)

文心AIGC

March 2023
M T W T F S S
 12345
6789101112
13141516171819
20212223242526
2728293031  
文心AIGC
文心AIGC
人工智能ChatGPT,AIGC指利用人工智能技术来生成内容,其中包括文字、语音、代码、图像、视频、机器人动作等等。被认为是继PGC、UGC之后的新型内容创作方式。AIGC作为元宇宙的新方向,近几年迭代速度呈现指数级爆发,谷歌、Meta、百度等平台型巨头持续布局
文章搜索
热门文章
最新评论
热评文章