Machine learning and statistics with python

Posts

Showing posts from January, 2021

What is topic modeling?

What is topic modeling? Written by: Shyambhu Mukherjee. Introduction: Topic modeling is one of the best-known natural language processing tasks. Topic modeling refers to assigning one or more topics to a set of documents. In this article, we will cover topic modeling at a basic, intermediate and advanced level, and discuss multiple Python libraries that can be used to do topic modeling. Summary of the article: In this article, we will first define what topic modeling and topic classification modeling are, how they differ, and how to do both, as an overview. Then we will discuss to...

Issues with current data science mentorship programs in India

Introduction: I have been working with mentorbruh, a close impact mentorship program, for quite some time now, and have had the opportunity to meet many data aspirants and mentor a few. Many of them come with prior experience from internship programs, coaching schools and other types of certification courses. But most of them have one thing in common: "Their knowledge from these programs didn't make them employable." That may sound like an exaggeration. But in this article, I am going to explain three types of programs and break down in detail why they don't work, fully or partially. Summary of article: I am going to take a deep dive into the popular programs currently running in India for data science coaching, and thoroughly explain their flaws and the reasons for their very small to zero success rates. The SMB coaching centers: The lowest tier in data science mentorship is the small and medium businesses...

What is lazypredict automl library and how to use it?

Lazypredict: The automl library. Introduction: I recently started my journey with automl, and explored Sberbank's LightAutoML framework as well as auto-EDA with pandas profiling in this post. In this post, I will explore the lazypredict framework written by Shankar Pandala sir. We will first show how to use the library and what outputs we get from it, and then go in depth into the code to see how lazypredict does what it does. Usage: For this part, we will just use the github repo's code example. There are two classes, LazyClassifier and LazyRegressor, for classification and regression respectively. You can import the classifier class if your problem is classification, and the regressor class if you have a regression problem. X_train, X_test, y_train, y_test = train_te...

5 mistakes I made in my first year of machine learning and what I learned from them

5 mistakes of an ML beginner. Introduction: This is a lightweight, non-technical post. I have been practicing the craft named machine learning for the last 2.5 years. I recently started working with a few data science aspirants, and found them making many of the same mistakes I used to make. In this article, I will share 5 mistakes that I made, and that a lot of machine learning and data science beginners make. First mistake: not reading your data: Data science is exciting, and machine learning starts with learning a lot of models and algorithms. Hence, when we start learning machine learning and data science, we often don't learn the most important step. The most important step is to read your data. Now, I used to find it stupid to read the data, because I didn't know what reading data means. Reading data means u...

What is DALL.E and how does it create images out of text?

DALL-E: a ground-breaking piece of machine learning news. Introduction: It was a Tuesday morning. I woke up and, sipping my morning cup of coffee, found out that OpenAI had dropped a bomb again. This time they didn't stop with language: they created a neural network architecture which takes a text prompt and creates an image for that text. It sent another ripple through the data science and deep learning communities within a few days; and within 5 days, there were 1000s of pieces, from news to technical articles, written ...

Introduction to Rasa: the NLU chatbot framework

Introduction to Rasa. Written by: Shyambhu Mukherjee. Motivation: With the onset of 2021, I planned to up-skill in chatbot creation. For chatbot creation, there are a number of frameworks available, such as Dialogflow, RASA and others. I already wrote about what chatbots are and created a small appointment scheduler chatbot using Dialogflow by following Google's developer course on the same. If you don't know what a chatbot is, read the above linked article first, and then continue with this post. Summary of the article: In this article, we will first describe what Rasa is and what a normal chatbot anatomy looks like. Then we will quickly go over a few concepts related to chatbot frameworks. Finally we will ...

Spacy errors and their solutions

Introduction: There are a bunch of errors in spacy which never make sense until you dig into them. In this post, we will analyze the attribute error E046 and why it occurs. (1) AttributeError: [E046] Can't retrieve unregistered extension attribute 'tag_name'. Did you forget to call the set_extension method? Let's first understand what the error means on a superficial level. There is a tag_name extension in your code, i.e. you are probably calling doc._.tag_name on a doc object. But spacy suggests that you probably forgot to call the set_extension method. So what to do from here? The problem at hand is that your extension was not created where it should have been. In general, this means that your pipeline is incorrect at some level. So how should you solve it? Look into the pipeline of your spacy language object. Chances are that the pipeline component which creates the extension is not included in the pipeline. To check the pipe eleme...
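
The E046 situation described in this teaser can be reproduced in a few lines. This is only a minimal sketch, assuming spaCy is installed: it uses a blank English pipeline (so no model download is needed), and registering tag_name directly on Doc with a default of None is an illustrative choice, not necessarily how the pipeline component in the original post creates it.

```python
import spacy
from spacy.tokens import Doc

# A blank pipeline has no components, so nothing registers `tag_name` for us.
nlp = spacy.blank("en")
doc = nlp("A short example sentence.")

# Inspect which components the pipeline actually contains.
pipe_names = nlp.pipe_names
print("Pipeline components:", pipe_names)

# Reading an unregistered extension raises AttributeError with code E046.
try:
    doc._.tag_name
    raised = False
    message = ""
except AttributeError as err:
    raised = True
    message = str(err)
print("Raised:", raised)
print(message)

# Registering the extension makes the attribute available.
# default=None is an assumption made for this sketch.
if not Doc.has_extension("tag_name"):
    Doc.set_extension("tag_name", default=None)

value_after = doc._.tag_name
print("Value after registration:", value_after)
```

If the extension is normally created by a custom pipeline component, the same check on nlp.pipe_names is a quick way to see whether that component is missing from the pipeline.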
