Posts

Showing posts from January, 2020

A/B testing significance calculator

A/B testing significance checker: enter the impressions and clicks for the normal group and for the variation to see the A/B test result. This is a very basic template A/B testing tool for the click/impression scenario. The significance calculation uses a normality assumption and the Satterthwaite approximation. If you want to re-enter values, refresh the page and enter the new values. Thanks!
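The kind of calculation described above can be sketched as follows. This is only an illustration, not the calculator's actual code: the function name `ab_test` and its arguments are hypothetical, and each impression is assumed to be an independent 0/1 click indicator. The Satterthwaite degrees of freedom are computed and returned, but as a simplification the p-value is taken from the normal tail, which is close to the t distribution when impression counts are large.

```python
import math

def ab_test(imp_a, clk_a, imp_b, clk_b):
    """Two-sided test for a difference in click-through rate (hypothetical helper)."""
    p_a, p_b = clk_a / imp_a, clk_b / imp_b
    # Unbiased sample variance of the 0/1 click indicators in each group.
    var_a = p_a * (1 - p_a) * imp_a / (imp_a - 1)
    var_b = p_b * (1 - p_b) * imp_b / (imp_b - 1)
    # Squared standard errors of the two group means.
    se_a2, se_b2 = var_a / imp_a, var_b / imp_b
    se = math.sqrt(se_a2 + se_b2)
    if se == 0:
        raise ValueError("no variation in clicks; the test is undefined")
    stat = (p_b - p_a) / se
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se_a2 + se_b2) ** 2 / (
        se_a2 ** 2 / (imp_a - 1) + se_b2 ** 2 / (imp_b - 1)
    )
    # Two-sided p-value from the normal tail (assumed simplification).
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, df, p_value

stat, df, p_value = ab_test(1000, 50, 1000, 80)
print(f"statistic={stat:.3f}, df={df:.1f}, p={p_value:.4f}")
```

A small p-value (for example below 0.05) suggests the difference in click-through rate between the two groups is unlikely to be due to chance alone.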

If a student is selected in both IIT as well as in ISI which one should be preferred

Introduction: I think I am eligible to answer this question, as I am a student of ISI Bangalore and have also successfully cracked IIT. So I have stood at this same fork in the road. Here you have to make your decision carefully. First, ask yourself the following questions: How much do you love mathematics? What is your long-term goal: being a researcher or being a top-notch industrialist? These two questions are sorting questions. If you are not dedicated enough to mathematics, and love physics, programming, and other stuff as much as or more than mathematics, then you should go for IIT. If you want to become a big industrialist, then you should go to IIT for sure. At IIT you will be doing engineering, and therefore will be exposed to a lot of subjects and will learn diverse things. Now, let's say you are a math nerd, you have solved a lot of RMO and INMO level problems, and you want to learn a lot of mathematics in your life: boss, you are welcome to ISI. Now, you are like

How do I prepare for Msc in datascience at CMI in 2020 within one year?

Introduction: I have seen people ask this question many times, on Quora as well as in conversations. So I thought I would take the time to answer it outright in a blog post. Two of my friends from ISI went through this exam this year. How you prepare depends on your background. I suggest you prepare your statistics, probability, and general bachelor's level mathematics properly. For basic probability, I suggest going through Introduction to Probability by Sheldon Ross; read the concepts, solve the examples, and proceed. This book is good enough to prepare you for the probability questions you will face. Now, let's come to the statistics part. For statistics, I suggest reading through Casella and Berger, and C. R. Rao. These books are pretty rigorous, and you may find them really hard. But you can skip some of the hard theoretical parts and just follow the flow. That will give you a good idea of what can come. Now,

What is Data-science?(details descriptions provided)

Introduction: For a little more than a year now, I have been learning and practicing data science, and data science has thoroughly amazed me throughout this journey. But one question struck me really hard when an online coaching company asked me last month, "What is data science?", and I was suddenly lost for words. A thing you work on daily, think about, and try to be creative with slowly becomes abstract to you, so that you can no longer capture it in a few words. That is why, in this post, I will try to express my view on the small but big question, "What is data science?" We will discuss the following topics in this post: What is data science? What are the pillars of data science? Why do we need data science? What are the different data science roles? What are the common data science problems in different industries? How does a general data science problem get solved?: CRISP-DM What are the pre-requisites to be

A Complete Guide for Understanding Random forest model

Introduction: Random forest is one of the most widely used models in classical machine learning. Its robustness on very noisy data and its strong ability to learn irregular patterns make the random forest a worthy candidate for modeling in many fields, such as genomics. In this post, we will discuss: definition of a random forest; what is the meaning of bagging; what do you mean by ensemble model; what is a decision tree; the mathematics behind decision tree; training a Random forest; Random forest packages; hyperparameter tuning Random forest; how many trees should there be in Random forest; further study links. Definition of a random forest: "Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest." --Leo Breiman Random forest is an ensemble model consisting of a number of decision trees
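To make the ideas of bagging and ensembling concrete, here is a toy sketch in pure Python. It is not a real random forest: as a loudly stated simplification, each "tree" is a one-split decision stump trained on a bootstrap sample and a single randomly chosen feature, and the ensemble predicts by majority vote. All function names (`fit_stump`, `fit_forest`, `predict`) and the data are made up for illustration.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature with the fewest training errors."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls on the right side, reuse the left label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left)
        errors += sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left_label, right_label)
    return best[1:]

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(n_features)         # random feature choice
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature)
        )
    return forest

def predict(forest, row):
    """Ensemble prediction by majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]

# Made-up, well-separated toy data: two features, two classes.
rows = [(0.1, 1.0), (0.2, 1.2), (0.3, 0.9), (0.15, 1.1),
        (0.9, 3.0), (1.0, 3.2), (1.1, 2.9), (0.95, 3.1)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
print(predict(forest, (0.0, 0.5)), predict(forest, (2.0, 4.0)))
```

In practice one would use full decision trees from an established package rather than stumps; the sketch only shows how bootstrap sampling, random feature choice, and voting fit together.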

accuracy and interpretation of multiple linear regression

Introduction: You have completed your linear regression fitting and prediction, and now you want to know how to report the accuracy of the regression. We discuss different accuracy metrics of linear regression in this post. (1) adjusted R-square and mean-squared-error: In linear regression, the R-square is a measure of the accuracy of the fit. For a least-squares fit with an intercept, R-square ranges from 0 to 1, so it can also be read as 0 to 100%. Roughly speaking, R-square denotes the proportion of variance in the data that is explained by the linear regression. The more variance the regression explains, the better the regression is. So, to describe how well a linear model fits, one can look at how high the adjusted R-square percentage is. While adjusted R-square is an accuracy measure from the statistical point of view, the measure from the more applied point of view is the mean-squared-error (mse). Mean-squared-error is the mean of the squares
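A minimal sketch of how these three metrics can be computed from observed and predicted values is given here. It uses the standard definitions (R-square as one minus the ratio of residual to total sum of squares, the usual degrees-of-freedom correction for adjusted R-square, and the mean of squared residuals for mse). The function name `regression_metrics`, its arguments, and the sample numbers are hypothetical.

```python
def regression_metrics(y_true, y_pred, n_predictors):
    """Return (R-square, adjusted R-square, mean-squared-error)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    # Residual sum of squares: variation left unexplained by the model.
    ss_res = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))
    # Total sum of squares: variation of y around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    r2 = 1 - ss_res / ss_tot
    # Adjusted R-square penalizes the number of predictors used.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    mse = ss_res / n
    return r2, adj_r2, mse

# Made-up observed values and predictions from a one-predictor model.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]
r2, adj_r2, mse = regression_metrics(y_true, y_pred, n_predictors=1)
print(f"R2={r2:.4f}, adjusted R2={adj_r2:.4f}, mse={mse:.4f}")
```

Adjusted R-square is never larger than R-square, and the gap grows as more predictors are added relative to the number of observations.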