Posts

HOW TO PREDICT NEW DATA VALUES WITH R

HOW TO GET THE DATA VALUES

For example, a car manufacturer has three designs for a new car and wants to know the predicted mileage based on the weight of each new design. To do this, you first create a data frame with the new values, for example like this:

    > new.cars <- data.frame(wt=c(1.7, 2.4, 3.6))

Always make sure the variable names you use are the same as those used in the model. Once you have done that, you simply call the predict() function with the appropriate arguments, like this:

    > predict(Model, newdata=new.cars)
           1        2        3
    28.19952 24.45839 18.04503

So, the lightest car has a predicted mileage of 28.2 miles per gallon and the heaviest car has a predicted mileage of 18 miles per gallon, according to this model. Of course, if you use an inadequate model, your predictions can be far off as well.

CONFIDENCE IN YOUR PREDICTIONS

In order to have an idea about the accuracy of the predictions, you can ask for intervals around yo...
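The excerpt does not show how Model was fitted. The sketch below assumes it is a simple linear regression of mpg on wt using R's built-in mtcars data set; that assumption is not stated in the excerpt, although it should give point predictions close to the ones quoted. The interval = "confidence" argument is one standard way to ask predict() for intervals around the fitted values, and may not be the exact call the full post uses.

```r
# Assumption: Model is a linear regression of mileage (mpg) on weight (wt),
# fitted to the built-in mtcars data set.
Model <- lm(mpg ~ wt, data = mtcars)

# New designs; the column name must match the predictor used in the model.
new.cars <- data.frame(wt = c(1.7, 2.4, 3.6))

# Point predictions of mileage for each new design.
pred <- predict(Model, newdata = new.cars)
print(pred)

# Point predictions plus 95% confidence intervals for the mean response.
# The result is a matrix with columns fit, lwr and upr.
pred.ci <- predict(Model, newdata = new.cars, interval = "confidence")
print(pred.ci)
```

If the column in new.cars were named anything other than wt, predict() would not find the predictor it needs, which is why matching variable names matters.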

Ten Machine Learning Algorithms You Should Know to Become a Data Scientist

Machine Learning practitioners have different personalities. Some of them say, “I am an expert in X, and X can train on any type of data”, where X is some algorithm; others are “right tool for the right job” people. Many also subscribe to a “Jack of all trades. Master of one” strategy, where they have one area of deep expertise and know a little about other fields of Machine Learning. That said, as practicing Data Scientists we need to know the basics of some common machine learning algorithms, which help us engage with problems in a new domain when we come across them. This is a whirlwind tour of common machine learning algorithms, with quick resources to help you get started on them.

1. Principal Component Analysis (PCA)/SVD

PCA is an unsupervised method to understand global properties of a dataset consisting of vectors. Covariance Matrix o...
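As a rough illustration of how PCA relates to the covariance matrix and to SVD, here is a minimal sketch in R. The choice of the built-in iris data set and all variable names are assumptions made for this example, not taken from the post.

```r
# Illustrative data (an assumption): the four numeric columns of iris.
X <- as.matrix(iris[, 1:4])

# Center each column so the covariance structure is measured around the mean.
Xc <- scale(X, center = TRUE, scale = FALSE)

# PCA via the covariance matrix: eigenvectors are the principal axes,
# eigenvalues are the variances along those axes (in decreasing order).
eig <- eigen(cov(Xc))

# Project the centered data onto the principal axes.
scores <- Xc %*% eig$vectors

# Proportion of total variance explained by each component.
var.explained <- eig$values / sum(eig$values)
print(round(var.explained, 3))

# The SVD route gives the same variances: squared singular values of the
# centered data, divided by n - 1, equal the covariance eigenvalues.
sv <- svd(Xc)
svd.values <- sv$d^2 / (nrow(Xc) - 1)
```

In practice the built-in prcomp() function wraps the SVD route; the manual steps are spelled out here only to show the link between the two views.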