Saturday, 22 October 2011

Machine Learning - Week 2: Multivariate Linear Regression

Introducing Octave

Fitted line graph - the blue line is derived from the red crosses
If you got past the title you are doing well! More line fitting, but we're now into multi-dimensional space, captain! Put simply, this means that you have more parameters (different types of information) to match. So in the house price example used in the course, the simple version matches area (sq ft) to the price of the house; the multivariate version might use area, number of rooms, and age to get you a more accurate fit. Fortunately, it works as you'd hope, in that you just add up the effects of the different terms, or in programming terms you just have two loops:

for i = 1 to number_of_houses
  for j = 1 to number_of_things_i_know_about_a_house
    add the effect of thing j to the estimate for house i
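To make the two loops concrete, here is a minimal sketch in Python (used only for illustration, it is not the course's Octave code). The house data, the base price and the weights are all made up; in practice the weights would come out of the fitting step rather than being typed in by hand.

```python
# Made-up example data: each row is one house as [area_sq_ft, rooms, age_years].
houses = [
    [2000, 3, 10],
    [1500, 2, 30],
]

# Hypothetical numbers, not fitted values: a base price plus one weight per feature.
base_price = 50000
weights = [100, 5000, -500]


def estimate_prices(houses, base_price, weights):
    """Estimate a price for every house by adding up the effect of each feature."""
    estimates = []
    for house in houses:                  # outer loop: one pass per house
        estimate = base_price
        for j in range(len(weights)):     # inner loop: one pass per thing we know
            estimate += weights[j] * house[j]
        estimates.append(estimate)
    return estimates


print(estimate_prices(houses, base_price, weights))  # [260000, 195000]
```

The outer loop walks the houses, the inner loop walks the things known about each house, and the running total is the estimate.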

Cost function plot from Octave
Which leads nicely to Octave, a programming language designed specifically for doing maths; it combines with gnuplot to draw pretty pictures of your data. The sort of data we are talking about now comes in tables. Whether printed, or as database tables or spreadsheets, doesn't really matter; as a programmer you would normally bring this stuff in as arrays of whatever dimension and do the above. Octave, however, understands matrices and tables intuitively and can manipulate them directly, so the above might become:

number_of_houses * work_it_out(number_of_things_i_know_about_a_house)
A lot simpler and, because Octave is built for this sort of job, much quicker.
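
For the mathematically inclined, the same idea can be written out as below. The symbols are my assumptions rather than anything defined in this post: m houses, n things known about each house, x_j^{(i)} for thing j of house i (with x_0^{(i)} = 1 to carry the base price), theta_j for the weight on thing j, and X for the table of houses with one row per house.

```latex
% Loop form: the estimate for house i adds up the effect of each term
h_\theta\bigl(x^{(i)}\bigr) = \sum_{j=0}^{n} \theta_j \, x_j^{(i)},
\qquad i = 1, \dots, m

% Matrix form: every house at once, with X of size m \times (n+1)
h = X\theta
```

The double loop collapses into a single matrix-vector product, which is the sort of thing Octave lets you write directly with its `*` operator (something like `X * theta`, if your variables have those names).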

Onwards and upwards. The top graph represents the point of all this: if you know the area of your house you can estimate its cost. The advantage of getting a computer to do this is that you can have enormous training sets (all house prices in the country, say) and do lots of subset plots: all semis, all semis in Suffolk, etc.
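
As a rough sketch of that last step, the Python below fits a straight line to a few area/price pairs and then estimates the price for a new area. The numbers are invented and deliberately lie on an exact line so the arithmetic is easy to follow. It also uses the textbook closed-form least-squares formula for a single variable, which is my substitution for brevity and not necessarily how the course fits the line.

```python
def fit_line(areas, prices):
    """Least-squares fit of price = intercept + slope * area (one variable)."""
    n = len(areas)
    mean_area = sum(areas) / n
    mean_price = sum(prices) / n
    # Covariance-style and variance-style sums around the means.
    sxy = sum((a - mean_area) * (p - mean_price) for a, p in zip(areas, prices))
    sxx = sum((a - mean_area) ** 2 for a in areas)
    slope = sxy / sxx
    intercept = mean_price - slope * mean_area
    return intercept, slope


# Made-up training set: areas in sq ft and the matching prices.
areas = [1000, 1500, 2000]
prices = [200000, 300000, 400000]

intercept, slope = fit_line(areas, prices)

# Estimate the price of a hypothetical 1200 sq ft house from the fitted line.
estimate = intercept + slope * 1200
print(estimate)  # 240000.0
```

Doing a subset plot is then just a matter of filtering the training set (all semis, say) before calling `fit_line`.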
