vurmotion.blogg.se

Label encoding in python

Why do I have to encode categorical variables?

Since you are here, reading this article, I assume you already know that any Machine Learning algorithm has to perform math on actual numbers and not on text. After all, computers, in general, only recognize numbers. There are frameworks and libraries out there that will perform the transformation (string -> number) for you, but handing the job over without understanding it is not the appropriate way of doing things in Data Science. You have to know exactly what is being done in order to judge and select the solution that fits the particular problem's needs.

Having said that, there is a wide list of go-to solutions for turning your categorical variables from strings into numbers that a machine can crunch. The most commonly used ones are Label Encoding (where a list of strings is converted to a list of integers but still remains one list) and One-Hot Encoding (where, for n categories, n binary variables are created, with 1 representing the presence of the category and 0 its absence).
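To make the difference between the two encodings concrete, here is a minimal sketch in plain Python, with no library involved, so that every step is visible. The function names, the `colors` data, and the choice to number categories in sorted order are all assumptions made for illustration; they are not taken from any particular framework.

```python
def label_encode(values):
    """Map each distinct string to an integer; the result is still one list."""
    # Assumption of this sketch: categories are numbered in sorted order,
    # which keeps the mapping deterministic between runs.
    classes = sorted(set(values))
    mapping = {label: index for index, label in enumerate(classes)}
    return [mapping[value] for value in values], mapping


def one_hot_encode(values):
    """Create one binary column per category: 1 if present, 0 if absent."""
    classes = sorted(set(values))
    rows = [[1 if value == label else 0 for label in classes] for value in values]
    return rows, classes


colors = ["red", "green", "blue", "green"]  # hypothetical example data

codes, mapping = label_encode(colors)
rows, classes = one_hot_encode(colors)

print(codes)    # [2, 1, 0, 1]
print(classes)  # ['blue', 'green', 'red']
print(rows)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Note that label encoding gives the categories an order (blue < green < red) that the data itself may not have, while one-hot encoding avoids this at the cost of n columns instead of one.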

…in this post I am going to show you three ways of encoding your categorical variables, and then how to save your encoders so that you can use them on new data at the deployment phase.
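The idea of saving an encoder and reusing it at deployment can be sketched as follows. This is only an illustration under stated assumptions, not necessarily one of the three ways the post refers to: the encoder here is a plain dictionary, it is stored with the standard-library `pickle` module, and the helper names, the file name `size_encoder.pkl`, and the sample data are hypothetical.

```python
import os
import pickle
import tempfile


def fit_label_mapping(values):
    """Learn a string -> integer mapping from the training data."""
    return {label: index for index, label in enumerate(sorted(set(values)))}


def transform(values, mapping):
    """Encode values with an existing mapping, failing loudly on unseen categories."""
    unknown = sorted(set(values) - set(mapping))
    if unknown:
        raise ValueError(f"unseen categories: {unknown}")
    return [mapping[value] for value in values]


# Training phase: fit the encoder and save it to disk.
train_sizes = ["small", "large", "medium", "small"]  # hypothetical training column
mapping = fit_label_mapping(train_sizes)

encoder_path = os.path.join(tempfile.mkdtemp(), "size_encoder.pkl")  # hypothetical path
with open(encoder_path, "wb") as f:
    pickle.dump(mapping, f)

# Deployment phase: load the saved encoder and apply it to new data.
with open(encoder_path, "rb") as f:
    loaded = pickle.load(f)

new_codes = transform(["medium", "small"], loaded)
print(new_codes)  # [1, 2]
```

The point of loading the saved mapping instead of refitting is that new data gets exactly the same numbers as the training data did. Only unpickle files you created yourself, since `pickle` can execute arbitrary code when loading untrusted input.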












