This project aims to test and evaluate an AI agent in its development stage. By asking questions and making requests across various categories (training is provided), you will attempt to provoke the AI agent into saying something harmful, offensive, dangerous, or toxic, and evaluate its responses.
The end goal is to prevent the AI agent from sharing any harmful, offensive, dangerous, or toxic content with end users and to make it safer.
This role is fully remote and full-time, with a duration of several months (it may be extended depending on project progress).
Full online training will be provided before the project begins.
* The salary benchmark is based on the target salaries of market leaders in the relevant sectors. It is intended as a guide to help Premium Members assess open positions and negotiate salaries. The benchmark is not provided by the company itself, and actual compensation could be significantly higher or lower.