Solving Problems Rather Than Using the Coolest Technique

Something I Learned From Work

Posted by Katherine Li on June 23, 2018

Understanding the Problem Isn’t Just a Routine

Why do I still fail to achieve my goal even after trying the coolest deep learning algorithm?

Well, I admit that I have had this confusion several times at work and in my studies. After several trials and failures, I learned the hard way how important it is to understand the problem correctly up front.

As project initiators, whether business managers or programmers, we need to ask ourselves several questions before investing money in a project:

1) As humans, would it be feasible for us to do the task given the data or resources we currently have?

I bring this up because some people believe that Artificial Intelligence is a black box that solves every problem. But it isn’t. It’s like the Hasbro board game Clue: we need to feed clues to the machine so that it can learn the latent relationship. If you assign me the task of predicting a man’s salary from data about his favorite fruits, even Andrew Ng will sigh. Even deep learning can’t work magic here. So just think of AI as another you, maybe a bit smarter, waiting for relevant clues to reach a conclusion.

2) Will it still add value to our business even if the model built is not 100% accurate?

In other words, is it costly to let our model make some mistakes? ‘Costly’ here means losing customers or a drop in production rate. Realistically, no AI model achieves a 100% accuracy rate. If a tiny error leads to a huge loss, then it’s better to list all the possibilities and hardcode them, even though that takes longer. That’s the tradeoff between efficiency and precision.
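One rough way to put numbers on this question is to estimate what the model's mistakes would cost in total. The sketch below is only an illustration: the function name `expected_error_cost` and every figure in it are made-up assumptions, not numbers from a real project.

```python
def expected_error_cost(n_predictions, accuracy, cost_per_error):
    """Expected total cost of the mistakes a model makes.

    All inputs are hypothetical estimates supplied by the business side.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    error_rate = 1.0 - accuracy
    return n_predictions * error_rate * cost_per_error


# Made-up scenario: 10,000 predictions a month at 95% accuracy,
# once with cheap mistakes and once with expensive ones.
cheap_mistakes = expected_error_cost(10_000, 0.95, cost_per_error=2)
costly_mistakes = expected_error_cost(10_000, 0.95, cost_per_error=500)

print(f"Low-stakes errors:  ${cheap_mistakes:,.0f} per month")
print(f"High-stakes errors: ${costly_mistakes:,.0f} per month")
```

With the same accuracy, the second scenario costs 250 times more than the first, and that is the kind of gap that can justify slower, hand-written rules.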

3) If the answer to question 2 is yes, what’s the lowest accuracy score we can tolerate? And do we have enough data to support that?

The higher the accuracy score we expect, or the more complicated the problem is (such as one involving natural language), the more data we need to collect. Without sufficient historical data, the other option is to build a tool that automatically collects the training dataset starting now, and to incorporate machine learning a year later.
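For the first half of this question, a back-of-the-envelope break-even point can serve as a starting number. Assume, purely for illustration, that each correct prediction earns a fixed value and each wrong one costs a fixed amount. Then the model only pays off when accuracy × value is at least (1 − accuracy) × cost. The function name and the dollar figures below are hypothetical.

```python
def break_even_accuracy(value_per_correct, cost_per_error):
    """Lowest accuracy at which the model stops losing money.

    Solves a * value = (1 - a) * cost for a, under the simplifying
    assumption that every correct or wrong prediction has a fixed value or cost.
    """
    if value_per_correct <= 0 or cost_per_error < 0:
        raise ValueError("value must be positive and cost non-negative")
    return cost_per_error / (value_per_correct + cost_per_error)


# Hypothetical figures: a correct prediction saves $1, a wrong one costs $4.
threshold = break_even_accuracy(value_per_correct=1.0, cost_per_error=4.0)
print(f"Break-even accuracy: {threshold:.0%}")
```

A real project rarely has such clean numbers, but even a crude threshold like this gives us something concrete to compare against the amount of data we have.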

Data Science Is the Combination of Consulting and Coding

The reason I am passionate about DS is that it gives me the opportunity to explore both business and technology.

I understand that I might not be the one who develops cutting-edge algorithms, like someone at Google DeepMind dedicated to making a scientific breakthrough. I really do admire those talents, but it might not be my path at this point.

Nevertheless, I still feel a sense of achievement when I choose the right existing algorithm to solve a problem and bring monetary value to my company. That motivates me to read new papers, follow new tutorials, and take online courses to keep pace with the technology. It is not easy to fully understand how each algorithm works, what its use cases are, and what its drawbacks are. In a broader sense, a data scientist should have a comprehensive skill set spanning statistics, machine learning, deep learning, natural language processing, A/B testing, front-end development, and data structures. The more I learn, the more I know I need to learn.