AIQ: How People and Machines are Smarter Together

AIQ Review

A short and educational read explaining how Artificial Intelligence is nothing more than statistics with a ton of data on steroids. The book does a great job at raising the curtains that make AI seem so magical and mysterious. It also grounds all the misconceptions and hype around a Skynet type general AI. Computers are much better than humans at remembering things, processing large amounts of data, and doing repetitive tasks very quickly. Similarly, a hammer is much better than humans at hitting things without being hurt or damaged. Both of these are tools that benefit humans, and the fear of being taken over by the tools we built is overhyped science fiction.

As a software engineer who works closely with machine learning scientists and has gone through Andrew Ng’s Coursera ML course, I still got a ton of value out of this book. I would recommend it to anyone who wants to freshen up their stats knowledge or expand their knowledge of history. In particular, the book does a great job of connecting modern advancements and uses of ML/AI to scientific developments of the past 100 years.

One of the things I hadn’t realized is that the big data, machine learning and cloud computing boom started around 2010. I was already in college and literally saw this revolution happen before my eyes!

Over the past few decades, engineers/scientists in this field really made a lot of mathematical and scientific breakthroughs. However, after reading this book, my previous assumptions were validated. Almost nobody does “real machine learning” nowadays. Since various frameworks (pytorch, tensorflow, keras, caffe, etc…) have commoditized the ability to build and train neural networks, feature extraction has become the most difficult part of machine learning. The process of data collection, data cleansing, data analysis, building data pipelines, and deciding how to represent that data is all machine learning has really come down to. Everything else is just trial and error along with throwing a ton of data (i.e. steroids) at the problem. The book does a great job of illustrating this by referring to two letters: Big N (a big number of independent data points) and Big D (a lot of detail in every data point).

“It may seem like we depend on smart machines for everything these days, but in reality, they depend on us a lot more.”

==== Some cool facts from the book that stood out ====

Moore’s Law

One of the enablers of AI/ML advancement in the past decade is Moore’s Law. However, it’s not just Moore’s Law in the context of processors, but also in the context of data quantity and the availability of cloud compute. All of this, along with many other enablers, played a key role in AI’s hyper growth.

UBI

A common concern amongst many who are afraid of AI is the number of jobs (e.g. truck driving) it’ll make disappear. As a big Andrew Yang fan myself, and a big proponent of Universal Basic Income, I really connected with the authors’ similar viewpoint. I do not want technological advancement to be hindered by a fear of job loss, but am aware that it could lead to an increase in unemployment. For example, truck driving, which is the most popular job in the US, will most certainly be impacted by the ubiquity of autonomous vehicles. I believe that as long as we put UBI policies in place, and provide the infrastructure for individuals to be retrained in a different career path if they choose to, we can move forward without looking back.

Netflix

In 2006, Netflix offered $1MM to anyone who could beat their recommendation algorithm by at least 10%. A lot of teams came really close, but it took nearly 3 years for a team to clear that bar, with a 10.06% improvement that beat the threshold by just 0.06%. Ironically, another team also managed to submit a winning solution about 19 minutes after the first.

House of Cards was the first Netflix original and, as we all know, a great success. This is because Netflix had gathered a lot of data on what kinds of shows appeal to the general public, and therefore knew it was going to be a success.

The authors discuss how traditional television networks waste hundreds of millions to pilot many different shows to see what sticks. Netflix’s data allows them to create successful shows with a very low probability of failure.

World War 2

Abraham Wald played a key role in mathematical and statistical developments during WWII.

The planes that were returning from battle had a lot of holes in the fuselage, so the generals thought of adding more armor to that section of the plane. However, Wald correctly identified that we should pay attention to the planes that did not return, which were likely shot in the engine, and that the engine is where more armor should be added.

The key point of comparison between the returning bombers and Netflix’s recommendation system is that both use conditional probability with large data sets and missing information. Abraham Wald didn’t know where the non-returning planes were getting hit, and Netflix doesn’t know viewers’ opinions on most movies because most users do not watch most movies.
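
To make the conditional probability argument concrete, here is a minimal sketch of Wald's reasoning. All of the numbers are made up for illustration (they are not from the book), and it assumes every plane is hit exactly once, either in the engine or in the fuselage.

```python
# Hypothetical inputs, chosen only for illustration.
p_hit = {"engine": 0.5, "fuselage": 0.5}               # P(hit location)
p_return_given_hit = {"engine": 0.2, "fuselage": 0.9}  # P(return | hit location)

# Law of total probability: overall chance that a plane returns.
p_return = sum(p_hit[loc] * p_return_given_hit[loc] for loc in p_hit)
p_lost = 1 - p_return

# Bayes' theorem: where were the hits, given that the plane returned (or was lost)?
among_returned = {
    loc: p_hit[loc] * p_return_given_hit[loc] / p_return for loc in p_hit
}
among_lost = {
    loc: p_hit[loc] * (1 - p_return_given_hit[loc]) / p_lost for loc in p_hit
}

for loc in p_hit:
    print(f"{loc}: {among_returned[loc]:.0%} of returned planes, "
          f"{among_lost[loc]:.0%} of lost planes")
```

With these made-up inputs, engine hits are rare among the planes the generals could inspect but dominate among the planes that never came back, which is exactly the trap of only looking at survivors.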

After WWII, the US spent $17B in the early 60s to install microphones across the whole Atlantic. How cool is that!

Asia - Toilet Paper Theft

Individuals in Japan and China used to steal toilet paper from public spaces. Authorities tried to limit the toilet paper “per visit” to 6 sheets. While this slowed things down, the thieves were just walking around and returning to the same booths periodically. The final solution was to use facial recognition and only dispense toilet paper per person every X hours.

Physicists

Hubble used the pulsating star theory, and collected a lot of data, to prove that Andromeda is a separate galaxy outside of our own, thereby disproving the “single galaxy theory”.

Elina Berglund, who helped discover the Higgs boson, decided to use big data to kill the birth control pill. A smartphone app, along with input from the user concerning body temperature and menstrual cycle, became highly accurate at determining when a woman is ovulating.

Moravec’s paradox

From Wikipedia: “It is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility”

Bayes’ theorem

At the basis of all probability analyses is Bayes’ theorem, which is actually quite straightforward: the probability that an event A occurs given that another event B has already occurred is equal to the probability that event B occurs given that A has already occurred, multiplied by the probability of event A and divided by the probability of event B.
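
Here is a small sketch of that formula, P(A | B) = P(B | A) * P(A) / P(B), in code. The function name and the medical test numbers are my own hypothetical example, not something from the book.

```python
def bayes(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return P(A | B) = P(B | A) * P(A) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b


# Hypothetical numbers, chosen only for illustration.
p_disease = 0.01            # P(A): 1% of people have the disease
p_pos_given_disease = 0.95  # P(B | A): test catches 95% of true cases
p_pos_given_healthy = 0.05  # P(B | not A): 5% false positive rate

# P(B): overall chance of a positive test (law of total probability).
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

posterior = bayes(p_pos_given_disease, p_disease, p_pos)
print(f"P(disease | positive test) = {posterior:.3f}")
```

Even with a test that looks accurate, the posterior in this made-up scenario comes out to roughly 16%, because the disease is rare to begin with.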

SLAM

At the time the book was written, GPS was only accurate to about 5m. This is good enough for most use cases but not for autonomous vehicles, drones, etc… This is where SLAM (simultaneous localization and mapping) prevails. It collects data from various sensors (color and grayscale cameras, IMUs, depth sensors, etc…) and estimates (i.e. predicts) the device’s location/pose.

Grace Hopper

I had always heard of the Grace Hopper conference for women, but never realized how influential she was in the world of computer science.

Hopper grew up in a military family and managed to become one of the first female naval officers.

She became one of the first American computer experts after she was assigned to work on the Mark I at Harvard. After that, she started working on the UNIVAC, which was the first large scale electronic computer adopted by various corporations such as ADP and DuPont for database operations.

She was a pioneer in NLP and computer compilers, and played a key role in making sure that the American government adapted to using modern computer systems.

Coin Clipping

The book went into great depth about coin clipping in the 17th century. Due to the variability in coin weight, it was easy to clip “excess” metal off heavy coins and make a little bit of extra money. Isaac Newton, as Warden of the Royal Mint, oversaw the Great Recoinage of 1696, which replaced the old hammered coins to avoid this sort of theft in the future.

Cool Applications of AI

There is a smart knife that tells surgeons whether they’re cutting/burning healthy or cancerous tissue. This is done by sending the burning smoke to a mass spectrometer, analyzing the data and feeding it into a prediction rule. This takes 3 seconds and has a 100% accuracy rate.
Alibaba ships products to nearby warehouses before items are purchased to guarantee faster delivery, since they can predict which products are going to be bought when and where!

Florence Nightingale

Florence Nightingale was a nurse and data scientist. In the 1800s, she was driven by both maths and helping people. She used her skills and interests to build efficient, data-driven hospitals/camps to help as many injured soldiers as possible. She was the first woman elected to the Royal Statistical Society, the UK’s data analytics organization, and was truly a remarkable and influential individual.

Compounding

If something has a 0.9 chance of happening, the compounding rule states that the chance of it happening 10 times in a row is 0.9^10, provided each occurrence is independent. However, the compounding rule does not apply to population averages, where there are a lot of lurking variables that need to be taken into consideration, and misapplying it often leads to biased conclusions.

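For example, if 90% of the population has a tails-only coin and the rest have a heads-only coin, then 90% of all flips will result in tails. However, anyone holding a tails-only coin will flip 10 tails in a row every time, so the chance of 10 tails in a row is 0.9, not 0.9^10. The compounding rule will apply ONLY IF every individual coin has the same 0.9 chance of landing tails on each flip.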
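
A quick sketch of the difference, using the coin example above. It assumes (my assumption, to make the 90% figure work) that the remaining 10% of the population holds a heads-only coin; the variable names are just for illustration.

```python
n_flips = 10

# Case 1: everyone holds an identical coin that lands tails 90% of the time.
# Flips are independent, so the compounding rule applies.
p_tails = 0.9
p_all_tails_identical = p_tails ** n_flips

# Case 2: 90% of people hold a tails-only coin and, by assumption,
# the other 10% hold a heads-only coin. The population average is still
# 90% tails per flip, but each person's flips are not a 0.9 gamble.
share_tails_only = 0.9
p_all_tails_mixed = (share_tails_only * 1.0 ** n_flips
                     + (1 - share_tails_only) * 0.0 ** n_flips)

print(f"Identical 0.9 coins: {p_all_tails_identical:.4f}")
print(f"Mixed population:    {p_all_tails_mixed:.4f}")
```

Both populations show 90% tails on any single flip, yet the chance of 10 tails in a row is about 0.35 in the first case and 0.9 in the second. That gap is what gets lost when the compounding rule is applied to a population average.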