By Jonathan Lee

Way Too Advanced: A.I. Tech Turns Your Voice into a Face - Chilling Thought: What's Next (Video)

Monday, July 1, 2019 14:23

7-1-19

Scary technology, and the inventors couldn't care less about the future consequences. Heads up, folks!

New A.I. Tech Turns Your Voice into a Face & Commonly Prescribed Drugs Linked to Dementia

Leak Project

In the AI era, your voice could give away your face

MIT researchers trained a machine learning model to reconstruct a very rough likeness of someone’s face based only on a short audio clip.

From algorithms that can automatically tag you in photos, to face recognition built into city surveillance systems, to voice generators that can put words in people’s mouths, AI is dismantling privacy. A new tool is peeling back the curtain a little more, with a method to figure out what your face looks like from your voice.

In research published on arXiv, a publishing site for non-peer-reviewed papers, MIT researchers created a way to reconstruct a very rough likeness of some people based on a short audio clip. The paper, “Speech2Face: Learning the Face Behind a Voice,” explains how they took a dataset made up of millions of clips from YouTube and created a neural network-based model that learns vocal attributes associated with facial features from the videos. Now, when the system hears a new sound bite, the AI can use what it’s learned to guess what the face might look like.

The researchers, led by MIT postdoctoral researcher Tae-Hyun Oh, do briefly acknowledge the privacy concerns in the paper, explaining in an “Ethical Consideration” section that Speech2Face was trained to capture visual features like gender and age that are common, and only when there was enough evidence from the voice to do so. In other words, the system is not trying or able to produce images of specific people.

Still, the researchers speculate, the AI “may support useful applications, such as attaching a representative face to phone/video calls based on the speaker’s voice.”

[Image: Tae-Hyun Oh et al., “Speech2Face: Learning the Face Behind a Voice”]

The resulting images are certainly very rough. But while they are not quite the quality of the latest computer-generated images that police departments are putting out to find missing children or crime suspects, many of the images get in the right ballpark for age, ethnicity, and gender. Previous research has explored methods for predicting age and gender from speech, but in this case, the researchers claim they have also detected correlations with some facial patterns. “Beyond these dominant features, our reconstructions reveal non-negligible correlations between craniofacial features (e.g., nose structure) and voice,” they write.

The system struggled with people of certain identities, however. Under the ethics section, the researchers acknowledge cases where attributes like spoken language or voice pitch caused the model to create highly erroneous associations and approximations of what the speaker looks like. This reflects the limits of machine learning, and the limits of the premise that a voice can be used to predict a face beyond basic stereotypes. With enough data, AI can find insignificant patterns anywhere.

The errors are also a result of the limited nature of the training data, as the researchers acknowledge—a problem that has led to racial and gender bias in AI systems.

“The training data we use is a collection of educational videos from YouTube, and does not represent equally the entire world population,” the authors write. “Therefore, the model—as is the case with any machine learning model—is affected by this uneven distribution of data.” They recommend “that any further investigation or practical use of this technology will be carefully tested to ensure that the training data is representative of the intended user population.”

The MIT tool hasn’t been released, but clips can be played here, with the screenshot from the YouTube video it was pulled from, as well as the generated face.

This isn’t the first time researchers have reconstructed a face from just a voice. Other research from groups in Ireland and Spain can create faces from short audio clips, but with realistic results only for previously heard voices. A project out of Japan can also imagine faces from voices to varying degrees. Researchers at Carnegie Mellon are working on technology that is useful enough that law enforcement is turning to it for help narrowing down what a suspect might look like based on just a vocal sample.

While Speech2Face’s results look more believable than these, it might have problems that the other papers don’t. Bhiksha Raj, who worked on the CMU research, points out that the MIT paper does not include any code for outside developers to test out and thinks the paper overstates what it can accomplish. “In other words, there’s nothing in the paper that shows that they’re performing anything more extraordinary than predicting people’s gender, age, and ethnicity from their voice, with some error, and drawing a face which matches those,” says Raj.

As Slate reported, not everyone who was part of the dataset was thrilled to become an unwitting part of the project. Nick Sullivan, a technology researcher at Cloudflare, tweeted about discovering his face and voice were in the paper, and his attempt to learn how he became part of it. Many public and non-public face recognition databases rely on faces scraped from the web. For now, that kind of data harvesting may be protected by law: YouTube content is considered publicly available data, and any claims to copyright could likely be countered with a fair use argument.

[Image: Tae-Hyun Oh et al., “Speech2Face: Learning the Face Behind a Voice”]

Voice privacy has taken a backseat to the push to regulate face recognition, but there are plenty of places where our voices are already being used as a biometric data point, with or without our knowledge. Chase started using technology called “Voice ID” last year to recognize credit card customers when they call the bank, collecting and storing a sample of your voice unless you explicitly opt out. Correctional institutions across the country are building a database of “voiceprints” of thousands of incarcerated people.

Other research wants AI to be able to do sentiment analysis of your voice, the next frontier of machines knowing more about us than we might like. Amazon filed for a patent earlier this year that one day could allow Alexa to recognize your emotional state and target ads based on your mood. Amazon said in a statement to The New York Times it did not use voice recordings for targeted advertising. Of course, that doesn’t mean the company won’t do so in the future—or run all manner of other algorithms on people’s voices.

Fast Company

Comments

Total 1 comment

Koroboom

All these modern technologies really move us forward, even though at first you might think that some of them are useless. When I first got familiar with Apple’s Siri, which became one of the first uses of voice recognition technology, I thought it was completely useless, but the technology has since improved significantly, and it is now used not only in our smartphones and similar gadgets but also in business. For example, top-notch call center QA teams use the technology to reduce costs and improve their overall performance.

Oct 23, 2019, 2:22 pm
