Microsoft has revealed changes to its facial recognition technology, delivered through its Face API, which significantly reduce the error rate in identifying gender across skin tones. The rework follows industry-wide concern about apparent bias in A.I.-driven technology, which resulted in less accurate gender identification for individuals without light skin tones. By incorporating just three major alterations into the machine learning system behind its Face API, Microsoft was able to improve the accuracy of the system dramatically. The updated system now makes as much as nine times fewer errors when identifying women overall and as much as twenty times fewer errors when identifying women with darker skin tones. The problem was traced to the dataset behind the API. According to Microsoft senior researcher Hanna Wallach, the bias was not necessarily deliberate; it effectively came down to unrecognized biases in the way the dataset was created.
More directly, the samples used to train the A.I. simply weren’t diverse enough. To rectify the situation, the training regimen for Face API was expanded and new data collection began with a focus on a wider array of skin tones alongside other attributes such as age and gender. The classifier, the mathematical algorithm that sorts input into categories, was also tweaked. As it continues to improve the API, Microsoft says it will need to extend its machine learning across skin tones to take factors such as hairstyle, jewelry, and eyewear into account. By covering as wide a range of factors as possible across a range of skin tones, the company hopes to give the teams working on the project, and the A.I. itself, a more ‘nuanced’ understanding of the task at hand so as to eliminate bias.
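To illustrate the kind of analysis this involves, here is a minimal sketch, not Microsoft's actual code, of how a gender classifier's error rate might be broken down by skin-tone group on a labeled evaluation set. The records, group labels, and values are hypothetical placeholders; a large gap between groups would signal the dataset bias described above.

```python
# Illustrative sketch: per-group error rates for a gender classifier.
# Records are hypothetical (predicted label, true label, skin-tone group).
from collections import defaultdict

predictions = [
    ("female", "female", "lighter"),
    ("male",   "female", "darker"),   # misclassification
    ("female", "female", "darker"),
    ("male",   "male",   "lighter"),
    ("female", "male",   "darker"),   # misclassification
    ("male",   "male",   "darker"),
]

errors = defaultdict(int)
totals = defaultdict(int)

for predicted, actual, group in predictions:
    totals[group] += 1
    if predicted != actual:
        errors[group] += 1

# A model trained on insufficiently diverse data tends to show a much
# higher error rate for the under-represented group.
for group in sorted(totals):
    rate = errors[group] / totals[group]
    print(f"{group}: error rate {rate:.0%} ({errors[group]}/{totals[group]})")
```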
Having said that, according to senior Microsoft researcher Ece Kamar, future work with A.I. will need to go beyond purely technical challenges. Machine learning is conducted within the confines of society, she suggests, and it reflects the biases of that society. So researchers and developers working with the technology will need to learn to recognize when a system is ‘mimicking’ the decisions of a naturally biased society. More importantly, they will need to learn when and how to proactively mitigate problems and what best practices should govern the development of these systems. That, according to Kamar, will need to happen across the entire process, from conceptualization to deployment and monitoring.