Hello World – Machine Learning Recipes #1

[MUSIC PLAYING] Six lines of code is all it takes to write your first Machine Learning program. My name’s Josh Gordon, and today I’ll walk you through writing Hello World for Machine Learning. In the first few episodes of the series, we’ll teach you how to get started with Machine Learning from scratch. To do that, we’ll work with two open source libraries, scikit-learn and TensorFlow. We’ll see scikit in action in a minute. But first, let’s talk quickly about what Machine Learning is and why it’s important.

You can think of Machine Learning as a subfield of artificial intelligence. Early AI programs typically excelled at just one thing. For example, Deep Blue could play chess at a championship level, but that’s all it could do. Today we want to write one program that can solve many problems without needing to be rewritten. AlphaGo is a great example of that. As we speak, it’s competing in the World Go Championship. But similar software can also learn to play Atari games. Machine Learning is what makes that possible. It’s the study of algorithms that learn from examples and experience instead of relying on hard-coded rules. So that’s the state-of-the-art. But here’s a much
simpler example we’ll start coding up today. I’ll give you a problem that sounds easy but is impossible to solve without Machine Learning: can you write code to tell the difference between an apple and an orange? Imagine I asked you to write a program that takes an image file as input, does some analysis, and outputs the type of fruit. How can you solve this?

You’d have to start by writing lots of manual rules. For example, you could write code to count how many orange pixels there are and compare that to the number of green ones. The ratio should give you a hint about the type of fruit. That works fine for simple images like these. But as you dive deeper into the problem, you’ll find the real world is messy, and the rules you write start to break. How would you write code to handle black-and-white photos, or images with no apples or oranges in them at all? In fact, for just about any rule you write, I can find an image where it won’t work. You’d need to write tons of rules, and that’s just to tell the difference between apples and oranges. If I gave you a new problem, you’d need to start all over again. Clearly, we need something better.

To solve this, we need an algorithm that can figure out the rules for us, so we don’t have to write them by hand. And for that, we’re going to train a classifier. For now, you can think of a classifier as a function: it takes some data as input and assigns a label to it as output. For example, I could have a picture and want to classify it as an apple or an orange. Or I could have an email, and I want to classify it as spam or not spam. The technique to write the classifier automatically is called supervised learning. It begins with examples of
the problem you want to solve.

To code this up, we’ll work with scikit-learn. Here, I’ll download and install the library. There are a couple different ways to do that, but for me, the easiest has been to use Anaconda. This makes it easy to get all the dependencies set up and works well cross-platform. With the magic of video, I’ll fast forward through downloading and installing it. Once it’s installed, you can test that everything is working properly by starting a Python script and importing sklearn. Assuming that worked, that’s line one of our program down, five to go.

To use supervised learning, we’ll follow a recipe with a few standard steps. Step one is to collect training data. These are examples of the problem we want to solve. For our problem, we’re going to write a function to classify a piece of fruit. For starters, it will take a description of the fruit as input and predict whether it’s an apple or an orange as output, based on features like its weight and texture.

To collect our training data, imagine we head out to an orchard. We’ll look at different apples and oranges and write down measurements that describe them in a table. In Machine Learning, these measurements are called features. To keep things simple, here we’ve used just two: how much each fruit weighs in grams, and its texture, which can be bumpy or smooth. A good feature makes it easy to discriminate between different types of fruit. Each row in our training data is an example. It describes one piece of fruit. The last column is called the label. It identifies what type of fruit is in each row, and there are just two possibilities: apples and oranges. The whole table is our training data. Think of these as all the examples we want the classifier to learn from. The more training data you have, the better a classifier you can create.

Now let’s write down our training data in code. We’ll use two variables: features and labels. Features contains the first two columns, and labels contains the last. You can think of features as the input to the classifier and labels as the output we want. I’m going to change the variable types of all features to ints instead of strings, so I’ll use 0 for bumpy and 1 for smooth. I’ll do the same for our labels, so I’ll use 0 for apple and 1 for orange. These are lines two and
three in our program.

Step two in our recipe is to use these examples to train a classifier. The type of classifier we’ll start with is called a decision tree. We’ll dive into the details of how these work in a future episode. But for now, it’s OK to think of a classifier as a box of rules. That’s because there are many different types of classifiers, but the input and output type is always the same. I’m going to import the tree. Then on line four of our script, we’ll create the classifier. At this point, it’s just an empty box of rules. It doesn’t know anything about apples and oranges yet. To train it, we’ll need a learning algorithm. If a classifier is a box of rules, then you can think of the learning algorithm as the procedure that creates them. It does that by finding patterns in your training data. For example, it might notice oranges tend to weigh more, so it’ll create a rule saying that the heavier a fruit is, the more likely it is to be an orange. In scikit, the training algorithm is included in the classifier object, and it’s called Fit. You can think of Fit as being a synonym for “find patterns in data.” We’ll get into the details of how this happens under the hood in a future episode.

At this point, we have a trained classifier. So let’s take it for a spin and use it to classify a new fruit. The input to the classifier is the features for a new example. Let’s say the fruit we want to classify is 150 grams and bumpy. The output will be 0 if it’s an apple or 1 if it’s an orange. Before we hit Enter and see what the classifier predicts, let’s think for a sec. If you had to guess, what would you say the output should be? To figure that out, compare this fruit to our training data. It looks like it’s similar to an orange because it’s heavy and bumpy. That’s what I’d guess anyway, and if we hit Enter, it’s what our classifier predicts as well.

If everything worked for you, then that’s it for your first Machine Learning program. You can create a new classifier for a new problem just by changing the training data. That makes this approach far more reusable than writing new rules for each problem. Now, you might be wondering why we described our fruit using a table of features instead of using pictures of the fruit as training data. Well, you can use pictures, and we’ll get to that in a future episode. But, as you’ll see later on, the way we did it here is more general. The neat thing is that programming with Machine Learning isn’t hard. But to get it right, you need to understand a few important concepts. I’ll start walking you through those in the next few episodes. Thanks very much for watching, and I’ll see you then. [MUSIC PLAYING]
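For reference, here is the six-line program described above as a minimal Python 3 sketch. The video itself used Python 2, which is why the on-screen `print clf.predict(...)` raises a SyntaxError for several commenters below; the feature values used here are the ones quoted in the comments.

```python
# Hello World for Machine Learning: train a decision tree to tell
# apples from oranges. Each example is [weight in grams, texture],
# where texture 0 = bumpy and 1 = smooth; labels 0 = apple, 1 = orange.
from sklearn import tree

features = [[140, 1], [130, 1], [150, 0], [170, 0]]
labels = [0, 0, 1, 1]

clf = tree.DecisionTreeClassifier()  # an empty "box of rules"
clf = clf.fit(features, labels)      # Fit = find patterns in data

# Classify a new fruit: 150 grams and bumpy.
print(clf.predict([[150, 0]]))  # prints [1], i.e. orange
```

Two details that come up repeatedly in the comments: `fit` returns the classifier itself, so the assignment on the fifth line is optional, and `predict` expects a list of examples, which is why the single fruit is wrapped in double brackets. If you are curious what rules the tree actually learned, recent scikit-learn versions can print them with `tree.export_text(clf)`.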


100 Responses

  1. Shyam Kumar says:

    I had installed Anaconda and other packages with it and when I tried the example got this as output sir

    C:\Users\pikachu>python H:\py.py

    File "H:\py.py", line 6

    print clf.predict([[150, 0]])


    SyntaxError: invalid syntax

    my code

    from sklearn import tree

    features =[[140,1],[130,1],[170,0],[150,0]]

    labels =[0,0,1,1]

    clf =tree.DecisionTreeClassifier()

    clf =clf.fit(features,labels)

    print clf.predict([[150, 0]])
    please help me

  2. rudrakshya1 says:

    I have completed Coursera, udemy machine learning A-Z. Udacity course.
    But this is a
    Best getting started I have ever seen for a developer.

  3. anim x says:

    this is how i did it and its working
    but make sure u are using python 3
    and also have installed scikit-learn

    import sklearn

    from sklearn import tree

    features = [[140,1],[130,1],[150,0],[170,0]]

    labels = [0,0,1,1]

    clf = tree.DecisionTreeClassifier()

    clf = clf.fit(features, labels)

    [x] = clf.predict(X = [[150, 1]])

    if x == 1:
        print("orange")
    else:
        print("apple")

  4. Hand Of LEGION says:

    Well….. understood nothing.

  5. Timothée Oliveau says:

    Tried to run the code, got "Launching humanity destruction process".

  7. Abdullah Hussain says:

    Google developer using Mac ?

  8. Kun Yu Tsai says:

    Awesome class for a beginner. Thank you!!

  9. ImSalman says:

    How do I make it say orange or apple instead of giving me a number?

  10. Jay Kadam says:

    Thats amazing, Thanks!

  11. Arwa Kalavadwala says:

    so with this code can i make a tiny robot project which can differentiate between fruits?

  12. Solve Everything says:

    I'm a mathematician and proper ML is too difficult . No way I'm going to learn the inner nature of the perceptor, neural layers and the stochastic models for the approximators and classifiers. Good luck learning the core of the inference behind GANs and convolutional models without years of intense study.

  13. Dwi Fajar Dandy Saputra says:

    Thank you Google

  14. MygenteTV says:

    how to use this with c++ instead of python?

  15. Seba Contreras says:

    1:34 Cristiano Ronaldo do not agree with that

  16. Maximilian Karelshteyn says:

    It would be fun to program an AI which can play LoL and learn from every game from scratch.I dont know anything about self learning AIs my knowledge is like how to program on java a tree, a house and how to get them on another position or change their colour and i know how to start a database that's , literally , it.So my question is : Is it even realistic or is it something i need to go to a university for?

  17. Aakash Kumar says:

    What editor is used to make this video : )

  18. David Lloyd-Jones says:

    Brilliant way to number your videos — with Roman numbers (of an irregular number of digits) tacked on at the right-hand end of the title.

  19. sanjiv 070 says:

    i m only getting y as the output

  20. Nguyen Thien Lam says:

    why do you need the double [] in the print command?

  21. Andrew Lemley says:

    Hey you might want to show what to do when you download anaconda because when I did it didn’t show up nor did anything else work

  22. Venkatesh R says:

    I was thinking of Machine learning and boom here it is! What kind of sorcery is this Google?

  23. Rabeeh T A says:

    is there any ML libraries for JavaScript, i know that a little bit than python.

  24. Tirth Patel says:

    Why do we pass 2D array in this line instead of 1D array? – clf.predict([[150, 0]])

  25. Lucas Lima says:

    #version: 3.7.2

    from sklearn import tree

    features = [[140, 1], [130, 1], [150, 0], [170, 0]]

    labels = ['Orange', 'Orange', 'Apple', 'Apple']

    clf = tree.DecisionTreeClassifier()

    clf = clf.fit(features, labels)

    print(clf.predict([[145,1]]))

  26. ScratchPatch says:

    6:40 – The text got blurred, the ai doesnt want people to learn to code them lol

  27. Allan Cheah says:

    Hey this is awesome! =)

    Oh by the way, is it ok if I can also seek your advice in this open source android app I have posted below? Just need some feedback about it…

    Kindly search ' pub:Path Ahead ' in Play Store (P & A are case sensitive).

    give_thanks you !

  28. Dimitar Tsvetkov says:

    print clf.predict([[150, 0 ]])
    SyntaxError: invalid syntax
    That's say 🙁

  29. Pushkaraj Sadegaonkar says:

    Really nice video about introduction to ML programming.

  30. lisichka ggg says:

    In the video he told that : "The more training data you have – the better", however what about overfitting?

  31. livingthedream says:

    -Don't think; let the machine do it for you!
    -Thanks "They Live-movie"-Cow.

  32. Richard Benoit says:

    Thanks for the post.

  33. Ryan K. says:

    Problem: every time i type import sklearn or from sklearn imort tree, it gives me a unresolved error. Please help.

  34. Kirill Bezzubkine says:

    Awesome sip of ml

  35. Sime Arsov says:

    Wow, google developers that use a mac book. How do I take your video seriously after this?

  36. Jaydan Doano says:

    i cant download conda it comes out with a pkg anything help please

  37. Navneet Kumar says:

    Nice Presentation for a Beginner like me. Good Lecture

  38. Jorge hernandez says:

    Is it possible to do it
    using Java instead of Python ?

  39. Krishnadas PC says:

    Great introduction.

  40. Akin Pounds says:

    what ide is this?

  41. Defcon1Gaming says:

    Fire whoever used a red apple in the green and orange pixel example.

  42. Buddhika P. De Silva says:

    code in the video didnt work for me. it shows msg like this >
    AttributeError: module 'sklearn.tree' has no attribute 'DecisionTreeClassifier'

    after that i make some changes and its work!
    from sklearn.tree import DecisionTreeClassifier
    clf = DecisionTreeClassifier()

    #weight in grams
    #0 – bumpy 1 – smooth
    features = [[140,1],[130,1],[150,0],[170,0]]
    # 1 – org 0 – apples
    labels = [0,0,1,1]

    clf = clf.fit(features,labels)
    print (clf.predict([[130,1]]))

  43. Krishiv Agarwal says:

    No, the easiest way to download any python library is to use Pycharm.

  44. SkvProgrammer says:

    very useful content

  45. Rajeshwar S says:

    But let me know what decision it makes if the data input is out of training data, <100,bumpy what decision it takes? thats why we study machine learning else i could have satisfied with c program itself atleast DOS

  46. Samir Maliqi says:

    WTF to do with that pkg FILE???


    Awesome! Just wrote a bot for 2048 which learns from me (And I suck lol) using the sklearn toolkit to predict the best move 🙂

  49. Nijimura San says:

    Please Bring java too

  50. G V V Karthikeya says:

    Its showing a syntax error at line number 6 in your program

  51. Anirban Maitra says:

    Can somebody tell me the best source to learn machine learning….Please provide its link too i would be greatful

  52. G. Visal says:

    thanks for your videos

  53. Miguel Ramirez says:

    Awesome little tidbit! Had to run it in Anaconda3.

  56. Donald Faulknor says:

    Doesn't make sense. Nothing was stored in memory or a database. So how does the "machine" retain information in order to learn from the previous guesses?

  57. Antopia HK says:

    would you recommend making a graph like such when making machine learning?

  58. Guz Man says:

    Python 2.7 ….

  59. U live u learn And regret says:

    if it says python 3.7 (32-bit) on windows is that the same as hello-world.py on mac?

  60. Obi-Wan Kenobi says:

    Sweet, I’ve always wanted to get into machine learning

  61. Antonio Williams says:

    Overfitting, high variance

  62. Adrian Snipes says:

    Great video but just two things i wanna point out:
    1. While technically not wrong, when you labeled bumpy as 0 but orange as 1, then smooth as 1 but apple as 0. Idk just since you want them to correlate makes more sense to have them 0:0 1:1
    2. The example you wanted it yo predict was already one of the small sample size so it didn't really show its capabilities as well as it could've.
    Nice vid regardless though.
    edit few missed words :p

  63. Jacky Wong says:

    This is the tutorial for me

  64. Jerry Liang says:

    Excuse me! For the training statement, "clf = clf.fit(features, labels)", is the assignment required? I see in the later recipes, the training statement is simply "clf.fit(…, …)" without assignment. Could you please help? Thanks!

  65. Shanilka Ariyarathne says:

    can someone tell me , what is the IDE he is using there !!!

  66. pjossy joshi says:

    What if we have apple orange and banana.

  67. CUNEYT TASLI says:

    Great video. Looking forward to watching the rest.

  68. Fernando Lovera says:

    Hello little dude, ML is not a recipe. Stop confusing people, this is a disgrace to the field.

  69. Deniz Boz says:

    This guy's great.

  70. Ahmet GÜRBÜZ says:

    what is the difference between data mining and machine learning?

  71. ridhwaans says:

    my ML guru

  72. Akhil Y says:

    What is the point of using Anaconda? Can someone please help me out

  73. vikas says:

    Very very informative video. Big ?? to you bro

  74. Rizwan Rauf says:

    very well explained.

  75. Charles - says:

    how do I write brackets on a french qwerty keyboard?

  76. Kenton Banyai says:

    Well thats not fair, you can't compare apples and oranges

  77. Nikkolos The Kidd says:

    Does miniconda work too?

  78. rohith kattamuri says:

    Build your first music recommendation system model. Feel free to fork and star this project: github.com/rkat

  79. ᎯᏌᎿᏫᎦᏂᏫᎿᏃ says:

    Well done sir. Thanks for the help. I really appreciate it and considering I'm understanding this at a young age (12) tells me that other people should understand it.


  81. Abel Arredondo says:

    Siraj Raval has this if your interested in learning more

  82. Kayumuzzaman Robin says:

    too good! <3 i'm loving it!

  83. Marco Scale says:

    don't recognise "import sklearn"
    I have installed Anaconda

  84. Michelle Barraclough says:

    cant import sklearn on windows i have everything installed why?

  85. LEARN! SHARE! and GROW! says:

    Thanks josh sir!

  86. Austin Ma says:

    I feel like you skipped the billion steps it takes to open your magical python file. I'm on a windows, but how did you go from Anaconda to a normal python file?

  87. thundertwinPlaysMC says:

    help, i keep getting this error

    Traceback (most recent call last):
    File "C:\Users\kaiser\Desktop\mlscript.py", line 1, in <module>
    from sklearn import tree
    File "C:\Users\kaiser\Anaconda3\lib\site-packages\sklearn\__init__.py", line 76, in <module>
    from .base import clone
    File "C:\Users\kaiser\Anaconda3\lib\site-packages\sklearn\base.py", line 13, in <module>
    import numpy as np
    File "C:\Users\kaiser\Anaconda3\lib\site-packages\numpy\__init__.py", line 140, in <module>
    from . import _distributor_init
    File "C:\Users\kaiser\Anaconda3\lib\site-packages\numpy\_distributor_init.py", line 34, in <module>
    from . import _mklinit
    ImportError: DLL load failed: The specified module could not be found.


    this is my code

    from sklearn import tree

    info = [31, 21, 40, 71, 60, 80]
    labels = [1, 1, 1, 0, 0, 0]
    clf = tree.DecisionTreeClassifier()
    clf = clf.fit(info, labels)
    print (clf.predict([27]))

  88. Funny says:

    Best video ever

  89. Ollie White says:

    What application are you using to code in here?

  90. SNKRhead Games says:

    What is the System.out.println(“hello world”) of machine learning?

  91. JamBear says:

    This is so handy, thank you!

  92. Linjo 100 says:

    is he also a robot?

  93. Samuel Davidson says:

    I was trying to follow this project on my Raspberry Pi 2 Raspbian OS but couldn't even install scikit-learn. Any suggestions? I tried a few forums but couldn't find a solution that worked.

  94. pablo marcel says:

    good info!

  95. AcromaticGaming - Minecraft & More says:

    Of Code
    And 6
    Of Video

    Did you see my comment was 7-1 lines a second ago????

  96. Alpha Garrett says:

    Outstanding communicative clarity! How rare.

  97. تليفزيون اليوتيوب says:

    Thank you really this video show me alot

  98. Anusha K says:

    Does Transfer learning comes in Machine Learning or Deep learning? Neural networks come in Deep learning and transfer learning uses
    Neural networks..ugh I'm confused ??? please help..

  99. Kvarks says:

    What should I study to understand this?
