How might we use object detection to help people living with visual disabilities?
Thiea Vision was a start-up that aimed to help people living with visual impairments navigate the world. It was a phone app that used a neural network to process the video feed from a Wi-Fi-enabled camera and convert it into audio feedback.
Timeline: 6 months
People Involved: Software developers (machine vision and app developers)
With an exciting 2023 natural language processing reimagination at the end!

The Problem
Living with a visual disability can be tough. This is especially true for people living with severe visual impairment, as most of our built environments do not cater to someone with poor vision. Everything from wayfinding to navigating roads to something as simple as finding a coffee mug can be a real challenge for someone with a visual disability.
By the numbers
With the youngest of the baby boomers hitting 65 by 2029, the number of people with visual impairment or blindness in the United States is expected to double to more than 8 million by 2050.
People living with visual impairments have to almost “hack” their way around life, as society as a whole relies primarily on vision for communicating. Everything from wayfinding to identifying the correct buttons on a microwave is complicated by vision loss.
The Technology
The idea was to run a real-time object-detection neural network on a live camera feed and convert what it sees into audio feedback for people living with visual disabilities. The idea started out over a cup of coffee with a friend who was working on TensorFlow.
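As a rough illustration, here is a minimal desktop-prototype sketch of that core loop: a pre-trained SSD MobileNet detector from TensorFlow Hub, OpenCV for the camera feed, and pyttsx3 for offline text-to-speech. The specific model, confidence threshold, and label subset are my assumptions for the sketch, not the pipeline Thiea Vision actually shipped.

```python
# Prototype sketch: real-time object detection -> spoken feedback.
# Assumes a pre-trained SSD MobileNet V2 model from TensorFlow Hub; a phone
# app would more likely run an on-device (e.g. TensorFlow Lite) model.
import cv2                      # camera capture
import tensorflow as tf
import tensorflow_hub as hub
import pyttsx3                  # offline text-to-speech

detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2")
tts = pyttsx3.init()

# Partial COCO label map for illustration; a real app would load the full map.
LABELS = {1: "person", 44: "bottle", 47: "cup", 62: "chair", 73: "laptop"}
CONFIDENCE = 0.5                # assumed threshold for announcing a detection

cap = cv2.VideoCapture(0)       # 0 = default webcam; the product used a Wi-Fi camera
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break

        # The model expects a uint8 RGB tensor of shape [1, height, width, 3].
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        result = detector(tf.convert_to_tensor(rgb, dtype=tf.uint8)[tf.newaxis, ...])

        scores = result["detection_scores"][0].numpy()
        classes = result["detection_classes"][0].numpy().astype(int)

        # Speak every confident detection we have a label for.
        seen = {LABELS[c] for c, s in zip(classes, scores)
                if s >= CONFIDENCE and c in LABELS}
        for name in sorted(seen):
            tts.say(name)
        tts.runAndWait()
except KeyboardInterrupt:       # Ctrl+C stops the prototype
    pass
finally:
    cap.release()
```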
Research
What I did
Surveys
Desk Research
Literature Reviews
Market Research
Empathic Modelling
What I read
Visually Impaired
23.7 Million People are visually impaired in the United States alone.
Severely Impaired
Out of the 23.7 million, 10 million are considered severely visually impaired.
Cataracts/Glaucoma
Cataracts and Glaucoma are the 2nd and 3rd biggest causes of visual impairment at 33% and 2% respectively.
Legally blind
1.3 million people are legally blind, with a best-corrected vision of 20/200 or worse.
What I heard
“As far as navigation is concerned, the biggest obstacle is fear”
“Overhanging objects that cannot be detected from the ground with a white cane, such as trees and low-hanging objects, are the most scary”
“The primary predator of the blind person is an automobile”
“My brother in law put my niece’s play-pen in the middle of the room and forgot to tell me about it. I tripped on it and I hurt myself pretty bad”
What I felt
Insights
Shampoo and conditioner bottles are exactly the same. This is true for so many products and buttons.
While Braille sign-boards exist, there is no standardization of where they are placed.
All bills in the US are exactly the same size. This makes differentiating $100 from $1 difficult.
Tactile pavements exist, but not everywhere, so it is difficult to know where to feel for them and where not to.
What people said
We sent out surveys asking people living with visual disabilities about some of the challenges they felt and what their biggest problems were. We also asked them how they navigated these challenges and what their own personal “hacks” were. (n=21)
Synthesis
Stakeholder Map
I charted out the various stakeholders involved in the ecosystem that surrounds a person living with severe visual disabilities to understand the different intervention points that come into play.
Empathy Map
Instead of creating user personas and journey maps, I decided to create an empathy map for my target audience. This helped me verbalize the biggest pain and gain points and effectively communicate my findings. I felt that a journey map or a persona would be too limiting as the target audience is very broad in terms of likes, interests and other personality traits. The only thing they have in common is their visual disability.
The idea
Why?
Total loss of vision in both eyes is considered to be 100% visual impairment and 85% impairment of the whole person. It is the 21st century and technology has finally reached the point where we can make the world more accessible for everyone.
How?
By pairing a Wi-Fi-enabled camera with a smartphone. The smartphone runs real-time object detection on the live video feed and provides audio feedback to the user.
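A sketch of what the phone-side loop could look like: pull frames from the camera's stream, analyse only a subsample of them, and announce only newly appearing objects so the audio doesn't overwhelm the user. The stream URL, sampling rate, and the detect_labels() stub (which stands in for the detector sketched earlier) are all placeholders, not the shipped implementation.

```python
# Sketch of the phone-side loop: sample frames from the Wi-Fi camera's stream
# and throttle announcements so repeated detections aren't spoken constantly.
import time
import cv2
import pyttsx3

STREAM_URL = "rtsp://192.168.1.42:8554/live"   # hypothetical Wi-Fi camera stream
SAMPLE_EVERY_S = 1.0                           # assumed: analyse ~1 frame per second

def detect_labels(frame):
    """Placeholder for the object detector (see the TensorFlow sketch above)."""
    return set()

def announce(labels, tts):
    for name in sorted(labels):
        tts.say(name)
    tts.runAndWait()

def main():
    tts = pyttsx3.init()
    cap = cv2.VideoCapture(STREAM_URL)
    last_spoken = set()
    last_sample = 0.0

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        now = time.monotonic()
        if now - last_sample < SAMPLE_EVERY_S:
            continue                      # skip frames between samples
        last_sample = now

        labels = detect_labels(frame)
        new = labels - last_spoken        # only speak objects that just appeared
        if new:
            announce(new, tts)
        last_spoken = labels

    cap.release()

if __name__ == "__main__":
    main()
```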
User flow
Version 1
I consciously decided to create as simple an interface as possible. Most users interact with apps through their phone's screen reader (the talk-back feature). However, a small subset of users with severe visual disabilities can still see objects up close. To meet their needs, the screens use big, bold, high-contrast text. Each mode also has its own color to make it easier to differentiate between screens.
Checking accessibility for color blindness
Version 2
Based on user feedback, I modified the screens to include tabs that indicate the current mode. I also updated the visual design to simplify the screens and remove unnecessary elements.
Checking accessibility for color blindness
Protanopia
Deuteranopia
Protanomaly
Deuteranomaly
The product
Development
Form development
Renders
Physical 3D printed Mockups
Final Renders
The moment we decided to design a product that mounts onto glasses, we knew it would be impossible to make it invisible. So we decided to make it a fashion accessory instead, something users would be proud to wear.
What happens if we add NLP to the mix?
Everything you saw above was built back in 2019 using the machine learning algorithms available then. It's 2023 now, though, so I asked myself how this project would change if LLMs (large language models, think ChatGPT or Bard) could describe images.
Here’s how I think something like that could work:
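A minimal sketch of the idea, assuming access to a vision-capable LLM through the OpenAI Python client: send a captured frame with a short prompt, get back a one-sentence scene description, and speak it aloud. The model name, prompt wording, and the pyttsx3 speech step are my assumptions; any multimodal model that accepts image input could fill the same role.

```python
# Sketch: send a camera frame to a multimodal LLM and speak its description.
import base64
import pyttsx3
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def describe_frame(jpeg_path: str) -> str:
    with open(jpeg_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4o",            # assumed vision-capable model
        max_tokens=120,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": ("Describe this scene for a blind pedestrian in one "
                          "short sentence, mentioning obstacles first.")},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    description = describe_frame("frame.jpg")   # hypothetical captured frame
    tts = pyttsx3.init()
    tts.say(description)
    tts.runAndWait()
```

Given the round-trip latency of a hosted model, something like this would probably complement rather than replace the fast on-device detector: instant alerts for obstacles, richer scene descriptions on demand.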