How To Do More With Less Effort: Amazon Mechanical Turk
Amazon Mechanical Turk, or MTurk, is a marketplace for human services launched by the US-based retail giant in 2005. The name derives from a chess-playing device built in the 18th century. Players from all over the world competed against the Turk, believing they were facing an automated machine. The Turk, however, was a deception: challengers were tricked into believing they were playing against an artificial mind while, in fact, their opponent was a person hiding inside.
Before we move on to the main part, let’s look at the terminology involved:
HIT or Human Intelligence Task — a task that needs to be performed by a human. HIT contains all the task-related information for Workers and allows them to send results to Requesters via a dedicated user interface.
Worker — a person who performs HITs.
Requester — a person who creates HITs.
Why do I need Amazon Mechanical Turk?
The entire system is based on the premise that certain tasks are usually much easier for humans than for computers, e.g. cataloging, content moderation, surveys, audio or video transcription, and many more. Thus, you can effortlessly complete these tasks by delegating them to other people.
Custom flow
To create a HIT, a Requester specifies all the required fields, including the title, description, price, maximum number of Workers, qualification, and other parameters. Once that is done, the HIT becomes available on the marketplace and Workers can apply to perform the task. After the Requester approves the most suitable application, the Worker can start working. When the Worker finishes, all the data is sent to MTurk, and the Requester can either accept or reject the result. If everything is OK, the Worker receives the payment. You can also stick to the default flow, in which Workers don’t need to apply for the task.
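To make the list of required fields more concrete, here is a minimal Python sketch that collects them into one dictionary. The key names follow the parameters of the `create_hit` call in boto3 (the AWS SDK for Python). The helper name `build_hit_request` and all sample values are our own assumptions for illustration, not part of MTurk.

```python
from decimal import Decimal


def build_hit_request(title, description, reward_usd, max_assignments,
                      lifetime_seconds, assignment_duration_seconds,
                      question_xml, qualification_requirements=None):
    """Collect the fields a Requester specifies into one dict.

    Hypothetical helper: the keys mirror boto3's MTurk `create_hit`
    parameters, but nothing is sent to AWS here.
    """
    if max_assignments < 1:
        raise ValueError("max_assignments must be at least 1")
    request = {
        "Title": title,
        "Description": description,
        # The reward is passed as a string in US dollars, e.g. "0.50".
        "Reward": str(Decimal(str(reward_usd)).quantize(Decimal("0.01"))),
        # How many different Workers may complete the HIT.
        "MaxAssignments": max_assignments,
        # How long the HIT stays on the marketplace.
        "LifetimeInSeconds": lifetime_seconds,
        # How long one Worker has to finish after accepting.
        "AssignmentDurationInSeconds": assignment_duration_seconds,
        "Question": question_xml,
    }
    if qualification_requirements:
        request["QualificationRequirements"] = qualification_requirements
    return request


# Assumed usage (requires boto3 and AWS credentials, so it is commented out):
#   import boto3
#   client = boto3.client("mturk", region_name="us-east-1")
#   client.create_hit(**build_hit_request(...))
```

Keeping the request in a plain dictionary makes it easy to review or log the price and limits before anything is published.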
Two-level tasks
If the quality of task results directly affects your product, you can also use MTurk to double-check them. Here is an example from our experience. Recently, we needed a voice recording of a large piece of text. After we created the main task, we created a second one so that other Workers could make sure the recording was flawless. Ultimately, MTurk helped us improve the quality with almost no intervention on our side.
How to create a HIT
The first thing you need to do is decide on the type of your task and the method for checking the result. Luckily, MTurk provides a very flexible API that allows you to create a HIT in five different ways. The first two are the simplest ones and work not only through the API but also through the web UI.
A one-time task
The screenshot above is an example of a QuestionForm. To submit one, you need to fill in all the required fields. There are several answer formats to choose from: free text, multiple choice, pictures, etc. In addition, you can create exactly the same form, or an even more complex one, through the API using XML.
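As an illustration of the XML route, the sketch below builds a one-question QuestionForm with either a free-text or a multiple-choice answer. The element names and the namespace are taken from the QuestionForm schema as we understand it, so please check them against the current API reference. The function name `build_question_form` is our own.

```python
import xml.etree.ElementTree as ET

# Namespace of the QuestionForm schema (verify against the MTurk API docs).
QUESTION_FORM_NS = ("http://mechanicalturk.amazonaws.com/"
                    "AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd")


def build_question_form(question_id, prompt, choices=None):
    """Return QuestionForm XML with one question.

    `choices` is an optional list of (identifier, label) pairs; when it is
    omitted, the Worker answers in free text.
    """
    form = ET.Element("QuestionForm", xmlns=QUESTION_FORM_NS)
    question = ET.SubElement(form, "Question")
    ET.SubElement(question, "QuestionIdentifier").text = question_id
    ET.SubElement(question, "IsRequired").text = "true"
    content = ET.SubElement(question, "QuestionContent")
    ET.SubElement(content, "Text").text = prompt
    spec = ET.SubElement(question, "AnswerSpecification")
    if choices:
        selection = ET.SubElement(spec, "SelectionAnswer")
        ET.SubElement(selection, "StyleSuggestion").text = "radiobutton"
        selections = ET.SubElement(selection, "Selections")
        for identifier, label in choices:
            item = ET.SubElement(selections, "Selection")
            ET.SubElement(item, "SelectionIdentifier").text = identifier
            ET.SubElement(item, "Text").text = label
    else:
        ET.SubElement(spec, "FreeTextAnswer")
    # ElementTree escapes special characters in the text for us.
    return ET.tostring(form, encoding="unicode")


if __name__ == "__main__":
    print(build_question_form(
        "quality", "Is the picture sharp?",
        choices=[("yes", "Yes"), ("no", "No")],
    ))
```

The resulting string is what you would pass as the question when creating the HIT through the API.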
MTurk Project
MTurk Project is a good option for tasks that fundamentally stay the same but involve different data every time. For example, say you need ten pictures evaluated against five particular criteria. Every task will cover the same number of pictures and criteria (10 and 5 respectively), while only the evaluation questions change.
To create a HIT, you will need to initiate the project first. To do that, use the dedicated constructor available from the web UI. Then, you will need to specify the project ID and the type of data involved. It can be done either through the API or the web UI, i.e. either by explicitly filling the data in a special form or by uploading it from a CSV file.
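When you go through the API instead of uploading the file, each CSV row has to be turned into the data for one HIT. The sketch below does only that conversion. It assumes that the project ID corresponds to what boto3’s `create_hit` calls `HITLayoutId`, and that the per-HIT data goes into `HITLayoutParameters` as name/value pairs. The column names in the sample are made up.

```python
import csv
import io


def layout_parameters_from_csv(csv_text):
    """Turn CSV text into one list of name/value pairs per row.

    Each inner list has the shape we assume `HITLayoutParameters` expects:
    [{"Name": <column>, "Value": <cell>}, ...]
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        [{"Name": name, "Value": value} for name, value in row.items()]
        for row in reader
    ]


if __name__ == "__main__":
    # Hypothetical data: one picture and one criterion per row.
    sample = (
        "image_url,criterion\n"
        "https://example.com/1.jpg,sharpness\n"
        "https://example.com/2.jpg,color\n"
    )
    for parameters in layout_parameters_from_csv(sample):
        # Assumed call, commented out because it needs AWS credentials:
        #   client.create_hit(HITLayoutId=project_id,
        #                     HITLayoutParameters=parameters, ...)
        print(parameters)
```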
Flexible forms. Creating a HIT for almost any task
The two approaches below are available through the API only. With their help, you can create any form you like, which significantly expands the range of tasks that can be performed using MTurk.
ExternalQuestion
Instead of creating a description, you send a link to the HTML file containing your form. MTurk will grab the form and place it into an iframe so that a Worker can access it. This approach is especially convenient when you have a static form that either cannot be described using QuestionForm or needs JavaScript. In this case, the best option is to upload both the form and all its assets to Amazon S3 and use that link to create a HIT.
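The link itself is wrapped in a small XML envelope. Here is a minimal sketch of building it. The namespace and element names reflect the ExternalQuestion schema as we know it and should be verified against the API reference. The URL is a placeholder, and `build_external_question` is our own helper name.

```python
from xml.sax.saxutils import escape

# Namespace of the ExternalQuestion schema (verify against the MTurk API docs).
EXTERNAL_QUESTION_NS = ("http://mechanicalturk.amazonaws.com/"
                        "AWSMechanicalTurkDataSchemas/2006-07-14/"
                        "ExternalQuestion.xsd")


def build_external_question(form_url, frame_height=600):
    """Return ExternalQuestion XML pointing at a hosted form."""
    return (
        '<ExternalQuestion xmlns="{ns}">'
        "<ExternalURL>{url}</ExternalURL>"
        "<FrameHeight>{height}</FrameHeight>"
        "</ExternalQuestion>"
    ).format(
        ns=EXTERNAL_QUESTION_NS,
        # Query strings often contain "&", which must be escaped in XML.
        url=escape(form_url),
        height=int(frame_height),
    )


if __name__ == "__main__":
    # Placeholder URL; in practice this would be the form uploaded to S3.
    print(build_external_question("https://example.com/form.html?a=1&b=2"))
```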
HTMLQuestion
Essentially, HTMLQuestion works almost the same as ExternalQuestion, except that you don’t need to worry about where to store the files. What you do here is dynamically generate a form and send it to MTurk. The main advantage of this approach is its flexibility. However, if the form is overly complex, we recommend using ExternalQuestion.
Also, don’t forget about the JavaScript code that you need to add to the form. Otherwise, no Worker will be assigned to your task.
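Below is a rough sketch of generating such a form dynamically, including the JavaScript part. The script URL, the form action, and the `turkSetAssignmentID()` call follow the HTMLQuestion example in Amazon’s documentation as far as we remember it, so treat them as assumptions and compare them with the current docs. The field list and the helper name `build_html_question` are ours.

```python
import html

# Namespace of the HTMLQuestion schema (verify against the MTurk API docs).
HTML_QUESTION_NS = ("http://mechanicalturk.amazonaws.com/"
                    "AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd")

# Page skeleton. The helper script and the hidden assignmentId field are the
# JavaScript-related parts mentioned above (assumed from Amazon's example).
HTML_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
  <script type="text/javascript" src="https://s3.amazonaws.com/mturk-public/externalHIT_v1.js"></script>
</head>
<body>
  <form name="mturk_form" method="post" id="mturk_form" action="https://www.mturk.com/mturk/externalSubmit">
    <input type="hidden" value="" name="assignmentId" id="assignmentId"/>
{fields}
    <p><input type="submit" id="submitButton" value="Submit"/></p>
  </form>
  <script language="Javascript">turkSetAssignmentID();</script>
</body>
</html>"""


def build_html_question(fields, frame_height=450):
    """Return HTMLQuestion XML for a list of (name, label) text fields."""
    inputs = "\n".join(
        '    <p><label>{label} <input type="text" name="{name}"/></label></p>'
        .format(label=html.escape(label), name=html.escape(name, quote=True))
        for name, label in fields
    )
    page = HTML_TEMPLATE.format(fields=inputs)
    if "]]>" in page:
        # This sequence would terminate the CDATA section early.
        raise ValueError("form content must not contain ']]>'")
    return (
        '<HTMLQuestion xmlns="{ns}">'
        "<HTMLContent><![CDATA[{page}]]></HTMLContent>"
        "<FrameHeight>{height}</FrameHeight>"
        "</HTMLQuestion>"
    ).format(ns=HTML_QUESTION_NS, page=page, height=int(frame_height))


if __name__ == "__main__":
    print(build_html_question([("transcript", "Type what you hear")]))
```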
Tip: MTurk Project comes with its own form editor. It’s much easier to create a form prototype using the editor first and then dynamically generate it on your side.
HIT on the user side
All the previous approaches have interface-related limitations, because the forms are displayed inside an iframe. However, there is a solution. In the HIT, you put a link to a web page that contains the functionality. Having finished the task, the Worker receives a code to enter into the HIT form. A slight disadvantage of this approach is that building the functionality on your own side takes extra effort.
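How the code is generated is up to you. One possible scheme, sketched below, is to hand out a random token signed with a secret key that only your server knows, so that a submitted code can be checked later without a database lookup. This is our own illustration and not something MTurk provides. The function names and code format are hypothetical.

```python
import hashlib
import hmac
import secrets


def _sign(secret_key, token):
    """Return a short signature of the token (first 8 hex digits, uppercase)."""
    digest = hmac.new(secret_key, token.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:8].upper()


def issue_code(secret_key):
    """Create a completion code such as 'A1B2C3D4-9F8E7D6C' for a Worker."""
    token = secrets.token_hex(4).upper()
    return token + "-" + _sign(secret_key, token)


def verify_code(secret_key, submitted):
    """Check that a code entered into the HIT form was issued by us."""
    token, separator, signature = submitted.strip().upper().partition("-")
    if not separator:
        return False
    expected = _sign(secret_key, token)
    # Constant-time comparison; bytes are used so any input is accepted.
    return hmac.compare_digest(expected.encode("utf-8"),
                               signature.encode("utf-8"))


if __name__ == "__main__":
    key = b"replace-with-a-real-secret"  # placeholder secret
    code = issue_code(key)
    print(code, verify_code(key, code))
```

Note that this sketch does not stop a Worker from reusing or sharing a valid code, so you would still want to record which tokens have already been redeemed.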
General advice for users
- Don’t forget to specify qualification. This will help you find the most experienced Workers.
- If your task involves explicit content, make sure to specify that.
- Setting a deadline is also a good practice.
- Try lowering the payment until you find an optimal price.
- Don’t forget to monitor the balance on your account.
General advice for developers
- Use the sandbox (available here and here). But remember that this is not your personal sandbox: other Requesters publish their tasks there too. So if you don’t find your task on the first page of the list, check the second one.
- Though Amazon documentation is very informative, it’s quite poorly structured. You may need these three links: https://docs.aws.amazon.com/AWSMechTurk/latest/AWSMechanicalTurkGettingStartedGuide/Welcome.html http://docs.aws.amazon.com/AWSMechTurk/latest/AWSMechanicalTurkRequester/Welcome.html http://docs.aws.amazon.com/AWSMechTurk/latest/AWSMturkAPI/Welcome.html
- MTurk gem is not available on GitHub. However, you can download it from here. And make sure to check the “examples” folder. You will find some really useful examples there.
- Don’t forget to check the balance before creating a new HIT.
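The balance check from the last tip can be automated. The sketch below estimates the cost of a HIT and compares it with the available balance. It relies on boto3’s `get_account_balance` call, which returns the balance as a string under `AvailableBalance`. The fee rate is passed in as a parameter because MTurk’s commission depends on the task and may change, so look it up on the pricing page. The endpoint constant is the Requester sandbox URL as we know it. Both helper names are our own.

```python
from decimal import Decimal

# Requester sandbox endpoint; pass it as endpoint_url when creating the client.
SANDBOX_ENDPOINT = "https://mturk-requester-sandbox.us-east-1.amazonaws.com"


def estimated_cost(reward_usd, max_assignments, fee_rate):
    """Reward for all assignments plus MTurk's commission.

    `fee_rate` is an assumption supplied by the caller, e.g. "0.2" for 20%.
    """
    reward = Decimal(str(reward_usd))
    return reward * max_assignments * (1 + Decimal(str(fee_rate)))


def can_afford(client, reward_usd, max_assignments, fee_rate):
    """Return True if the account balance covers the estimated cost."""
    available = Decimal(client.get_account_balance()["AvailableBalance"])
    return available >= estimated_cost(reward_usd, max_assignments, fee_rate)


# Assumed usage (requires boto3 and AWS credentials, so it is commented out):
#   import boto3
#   client = boto3.client("mturk", region_name="us-east-1",
#                         endpoint_url=SANDBOX_ENDPOINT)
#   if can_afford(client, "0.10", 5, "0.2"):
#       client.create_hit(...)
```

Using `Decimal` instead of floats avoids rounding surprises when the cost lands exactly on the balance.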
Wrap up
The practice of using human intelligence to perform various tasks is becoming more popular every day. To a large extent, this is because having a person perform your task is sometimes a lot cheaper and more effective than developing and maintaining a complex algorithm on your own. But the most important thing here is to find the right balance between automation and control: you can either assign tasks to the first Workers to apply and pay them automatically whatever the results are, or thoroughly control every step. In the latter case, though, you will need someone to do the controlling.