How might we use object detection to help people living with visual disabilities?

Theia Vision was a start-up that aimed to help people living with visual impairments navigate the world. It was a phone app that used a neural network to process the video feed from a Wi-Fi-enabled camera and convert it into audio feedback.

Timeline: 6 months

People Involved: Software developers (machine vision and app developers)

With an exciting 2023 natural language processing reimagining at the end!

The Problem

 

Living with a visual disability can be tough. This is especially true for people living with severe visual impairment, as our built environments do not cater to somebody with poor vision. Everything from wayfinding to crossing roads to something as simple as finding a coffee mug can be a real challenge for someone with a visual disability.

By the numbers

 

With the youngest of the baby boomers hitting 65 by 2029, the number of people with visual impairment or blindness in the United States is expected to double to more than 8 million by 2050.


People living with visual impairments have to almost “hack” their way through life, as society as a whole relies primarily on vision for communication. Everything from wayfinding to identifying the correct buttons on a microwave is complicated by vision loss.

The Technology

 

The idea was to use a neural network that runs real-time object detection to identify objects and convert visual information into audio feedback for people living with visual disabilities. The idea started out over a cup of coffee with a friend who was working on TensorFlow.
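
For the technically curious, here is a minimal sketch of what that detection step could look like with a pretrained model from TensorFlow Hub. To be clear, this is an illustration, not the exact model or code we shipped:

```python
# A minimal sketch of the detection step, assuming TensorFlow 2.x and
# tensorflow_hub. The model URL and threshold are illustrative, not the
# exact setup Theia Vision used.
import tensorflow as tf
import tensorflow_hub as hub

# Pretrained SSD MobileNet v2 trained on COCO, served from TensorFlow Hub.
detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2")

def detect_objects(frame, score_threshold=0.5):
    """Run the detector on one RGB frame (uint8 array of shape (H, W, 3))."""
    batch = tf.expand_dims(tf.convert_to_tensor(frame, dtype=tf.uint8), 0)
    result = detector(batch)
    scores = result["detection_scores"][0].numpy()
    classes = result["detection_classes"][0].numpy().astype(int)
    keep = scores >= score_threshold
    # Return (COCO class id, confidence) pairs for confident detections.
    return list(zip(classes[keep], scores[keep]))
```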



Research

 

What I did

  • Surveys

  • Desk Research

  • Literature Reviews

  • Market Research

  • Empathic Modelling

What I read

 

Visually Impaired

23.7 million people are visually impaired in the United States alone.


Severely Impaired

Out of the 23.7 million, 10 million are considered severely visually impaired.


Cataracts/Glaucoma

Cataracts and glaucoma are the second and third biggest causes of visual impairment, at 33% and 2% respectively.


Legally blind

1.3 million people are legally blind, with a best-corrected vision of 20/200 or worse.

What I heard

 

“As far as navigation is concerned, the biggest obstacle is fear”

 

“Overhanging objects that cannot be detected from the ground with a white cane, such as tree branches and other low-hanging objects, are the most scary”

“The primary predator of the blind person is an automobile”

 

“My brother-in-law put my niece’s play-pen in the middle of the room and forgot to tell me about it. I tripped on it and hurt myself pretty bad.”

 

What I felt

 
 

Insights


Shampoo and conditioner bottles feel exactly the same. The same is true for so many other products and buttons.


While Braille signboards exist, there is no standardization of where they are placed.


All bills in the US are exactly the same size, which makes differentiating a $100 bill from a $1 bill difficult.


Tactile pavements exist, but not everywhere. It is difficult to know where to feel for them and where not to.

What people said


We sent out surveys asking people living with visual disabilities about the challenges they faced and what their biggest problems were. We also asked them how they navigated these challenges and what their own personal “hacks” were. (n=21)

Synthesis

 

Stakeholder Map

I charted out the various stakeholders involved in the ecosystem that surrounds a person living with severe visual disabilities to understand the different intervention points that come into play.

 
 
 

Empathy Map

Instead of creating user personas and journey maps, I decided to create an empathy map for my target audience. This helped me verbalize the biggest pain and gain points and effectively communicate my findings. I felt that a journey map or a persona would be too limiting as the target audience is very broad in terms of likes, interests and other personality traits. The only thing they have in common is their visual disability.


The idea

 

Why?

Total loss of vision in both eyes is considered to be 100% visual impairment and 85% impairment of the whole person. It is the 21st century and technology has finally reached the point where we can make the world more accessible for everyone.

 

How?

By pairing a Wi-Fi-enabled camera with a smartphone app that runs real-time object detection on the live video feed. The smartphone processes the video and provides audio feedback to the user.
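
As a rough illustration, the capture-detect-speak loop could look something like the sketch below. It reuses the detect_objects() helper from the earlier sketch and assumes OpenCV for reading the camera stream and pyttsx3 for offline text-to-speech; the stream URL and label subset are placeholders:

```python
# Sketch of the capture -> detect -> speak loop, reusing detect_objects()
# from the earlier sketch. The stream URL and the tiny label map below
# are placeholders, not the product's actual configuration.
import cv2
import pyttsx3

LABELS = {1: "person", 44: "bottle", 47: "cup"}  # small COCO subset, for illustration

def announce_loop(stream_url="rtsp://192.168.0.10/live"):
    cap = cv2.VideoCapture(stream_url)
    tts = pyttsx3.init()
    last_spoken = None
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # OpenCV decodes frames as BGR; the detector expects RGB.
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        names = sorted({LABELS.get(cls, "object")
                        for cls, _ in detect_objects(rgb)})
        # Only speak when the scene changes, so feedback stays calm.
        if names and names != last_spoken:
            tts.say("I can see " + ", ".join(names))
            tts.runAndWait()
            last_spoken = names
    cap.release()
```

Announcing only when the set of visible objects changes keeps the audio feedback from turning into a constant, fatiguing stream.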

 

User flow


 Version 1


I consciously decided to create as simple an interface as possible. Most users interact with apps by using their phone’s talk-back feature. However, a small subset of users with severe visual disabilities are still able to see objects up close. To meet their needs, the screens have big, bold, high-contrast text. The different modes also have different colors to make it easier to differentiate between screens.
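
“High contrast” is also something you can measure: WCAG 2.x defines a contrast ratio based on relative luminance, and AA compliance asks for at least 4.5:1 for normal-size text. Here is a small sketch of that check; the colors are placeholders, not the app’s actual palette:

```python
# A quick sanity check that a text/background pair is genuinely high
# contrast, using the WCAG 2.x relative-luminance formula. The colors
# below are placeholders, not the app's actual palette.
def _linearize(channel):
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    r, g, b = (_linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(color_a, color_b):
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)),
        reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

# White text on a near-black screen: roughly 18:1, far above the
# 4.5:1 minimum that WCAG AA asks for with normal-size text.
print(contrast_ratio((255, 255, 255), (20, 20, 20)))
```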

 

Checking accessibility for color blindness

 
 

 Version 2


Based on user feedback, I modified the screens to include tabs that indicate the current mode. I also updated the visual design to simplify the screens and get rid of unnecessary elements.

 

Checking accessibility for color blindness

Protanopia

Deuteranopia

Protanomaly

Deuteranomaly

The product


Development

Form development

Renders

Physical 3D printed Mockups

Final Renders

The moment we decided to design a product that mounted onto glasses, we knew it would be impossible to make it invisible. So we decided to make the product a fashion accessory instead: something users would be proud to wear.


What happens if we add NLP to the mix?

So, everything you saw above was built back in 2019 using the machine learning algorithms that were available then. It’s 2023 now, though, so I asked myself how this project would change if LLMs (large language models, think ChatGPT or Bard) could describe images.

Here’s how I think something like that could work:
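
As a sketch only: assuming a hosted multimodal model such as OpenAI’s GPT-4 with vision (the model name, prompt, and helper below are illustrative, not a shipped design), each frame could be sent off for a free-form description instead of being matched against a fixed label set.

```python
# A rough sketch, assuming the OpenAI Python SDK (v1+) and a vision-capable
# model; the model name, prompt, and this helper are illustrative, not a
# shipped design.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def describe_frame(jpeg_bytes):
    """Ask a vision-capable LLM for a short, hazard-first scene description."""
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",
        max_tokens=100,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this scene briefly for a blind pedestrian. "
                         "Mention anything that could be a hazard first."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    # The description can be handed straight to the same text-to-speech step.
    return response.choices[0].message.content
```

The big shift is that the output is no longer limited to a fixed list of object classes: a model like this can lead with hazards, read signs, and describe spatial relationships, which maps directly onto the fears people described in our research.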
