A Matter of Trust - People are smarter than algorithms

December 13, 2005

I just read the Turk Lurkers post about building trust relationships between requesters and turkers on Amazon Mechanical Turk. That’s a good method, but the problem is that you shouldn’t trust any kind of automated quality control system. Somehow, some way, someone will be able to sneak by it. Sure, you can run a HIT through multiple times, but that ends up costing more and some bad work can still slip through the cracks. Having a piece of code as your quality control just doesn’t hack it.

The weakness of the approval methods for most of the HITs on Amazon Mechanical Turk is that they are just like using the “ask the audience” lifeline from Who Wants to Be a Millionaire (which of course didn’t work all the time). To raise the ‘quality’ of this method, you need to raise the redundancy rate. This is bad for the workers (getting rejected for good work) and worse for the requester (having to pay for one piece of work multiple times).
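To put rough numbers on that trade-off, here is a minimal Python sketch. It assumes a simple right-or-wrong HIT, workers who answer independently with the same accuracy, and a flat reward per assignment. None of these figures come from real AMT data, and the function names are made up for illustration.

```python
from math import comb


def majority_correct_probability(worker_accuracy: float, redundancy: int) -> float:
    """Chance that a strict majority of independent workers gives the right answer.

    Assumption: every worker is right with the same probability and the
    question has only a right and a wrong answer.
    """
    if redundancy % 2 == 0:
        raise ValueError("use an odd redundancy so a majority always exists")
    needed = redundancy // 2 + 1
    return sum(
        comb(redundancy, k)
        * worker_accuracy**k
        * (1 - worker_accuracy) ** (redundancy - k)
        for k in range(needed, redundancy + 1)
    )


def cost_per_hit(reward: float, redundancy: int) -> float:
    """Total paid for one piece of work when it is run `redundancy` times."""
    return reward * redundancy


if __name__ == "__main__":
    # Hypothetical figures: 80% accurate workers, 3 cents per assignment.
    for n in (1, 3, 5, 7):
        p = majority_correct_probability(0.8, n)
        print(f"redundancy={n}  majority right={p:.3f}  cost=${cost_per_hit(0.03, n):.2f}")
```

Under these assumptions the cost climbs in a straight line with redundancy, while the chance of a wrong majority shrinks but never reaches zero.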

The only solution I have come up with that can provide QUALITY work results for the requesters is a two-pass verification system. Paying another, higher-ranked turker to double-check a HIT will keep the quality up. By using qualification scores, the higher-ranked turker will then have to make fewer corrections because the poor workers have been weeded out. Eventually, skilled and trusted workers will get paid good money to click an ‘Approve’ button all day.
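The two-pass idea can be sketched in a few lines of Python. This is only an illustration: the `Submission` and `Verdict` types and the reviewer callback are hypothetical names, not part of the Mechanical Turk API, and the reviewer stands in for a human turker doing the second pass.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Submission:
    worker_id: str
    answer: str


@dataclass
class Verdict:
    approved: bool      # True if the first-pass work was accepted as-is
    final_answer: str   # what the requester actually receives


# A reviewer looks at a submission and returns None to approve it,
# or a corrected answer if the first-pass work was not good enough.
Reviewer = Callable[[Submission], Optional[str]]


def two_pass(submission: Submission, reviewer: Reviewer) -> Verdict:
    """Second pass: a trusted person approves or corrects the first-pass work."""
    correction = reviewer(submission)
    if correction is None:
        return Verdict(approved=True, final_answer=submission.answer)
    return Verdict(approved=False, final_answer=correction)


if __name__ == "__main__":
    # Stand-in for a human reviewer who knows the right answer is "hola".
    def reviewer(sub: Submission) -> Optional[str]:
        return None if sub.answer == "hola" else "hola"

    print(two_pass(Submission("worker-1", "hola"), reviewer))
    print(two_pass(Submission("worker-2", "ola"), reviewer))
```

Either way the requester ends up with one answer that a trusted person has signed off on, and each piece of work is paid for twice at most.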

BTTS is using a two-pass qualification system to build quality trust relationships. The L1 qualifications are handed out to anyone who can pass the basic test; L2 qualifications are only given to people I can converse with and whose credentials I can review. The L2 person is responsible for ensuring the end HIT is perfect and approves the L1 work. When an L1’s work gets approved, his/her quality score goes up, eventually leading to promotion to L2. If the work is not approved, the score goes down and the worker eventually gets locked out from the work. I do realize this doesn’t scale well for work that needs a mass amount of workers ASAP. However, over time BTTS will have a small army of trusted translators that will provide high quality work.
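As a rough sketch of that bookkeeping in Python: the thresholds and the one-point step below are hypothetical, since the actual numbers are not given here. Because L2 still requires a conversation and a credentials check, the sketch only flags an L1 as eligible for promotion rather than promoting automatically.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
PROMOTION_SCORE = 10   # score at which an L1 becomes eligible for L2
LOCKOUT_SCORE = -3     # score at which an L1 is locked out of the work


@dataclass
class Translator:
    name: str
    score: int = 0
    eligible_for_l2: bool = False
    locked_out: bool = False


def record_review(worker: Translator, approved: bool) -> Translator:
    """Update an L1 worker's quality score after an L2 reviews one HIT."""
    if worker.locked_out:
        return worker  # locked-out workers no longer receive work
    worker.score += 1 if approved else -1
    if worker.score >= PROMOTION_SCORE:
        worker.eligible_for_l2 = True  # still needs the manual L2 vetting
    elif worker.score <= LOCKOUT_SCORE:
        worker.locked_out = True
    return worker


if __name__ == "__main__":
    good = Translator("reliable")
    for _ in range(10):
        record_review(good, approved=True)
    print(good)

    bad = Translator("sloppy")
    for _ in range(3):
        record_review(bad, approved=False)
    print(bad)
```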

So in closing, use AMT for your work AND your quality control. Writing a piece of code that can block people from making easy money is probably a lot harder than writing advanced image recognition code that can spot business storefronts :)


2 Responses

  1. The method I proposed was not meant to be a complete system to ensure quality in submissions. It’s a tool to measure trust that is heavily geared towards stopping botters. It’s similar to what you mention in that you have to have a core group of trusted people to get started with it.

    I also agree you’re never going to make a perfect system to block cheaters and botters. The A9 HITs were bad from the very beginning since they were so heavily skewed towards the “NotO” variety.

    I’d love to discuss any of this in further detail. Feel free to drop me an email.

  2. Ah, forgive my poor writing :) My solution is to ensure high quality work; the side effect of that is it will shut out botters.

    Thanks for commenting on my blog, I’m glad there are people out there working on ideas on improving Turk.
