2020.04.05(pm): Binary Classification – Movie Review Classification

We run into binary classification all the time in everyday life: dogs versus cats, 100-won coins versus 500-won coins, iPhones versus Samsung Galaxy phones. This time, I’m going to classify movie reviews. Binary classification is often considered the most widely used kind of problem in machine learning. Let’s …

2020.03.29(pm): Statistical Inference and Hypothesis Testing

This time, let’s look at the concepts of probability and statistics that form the basis of machine learning algorithms. When training a model in supervised learning, the most important thing is variable selection. Numerical interpretation and verification are required to make sure the variables are well chosen. So, what is needed is the …

2020.03.21(pm): Support Vector Machine(SVM)

The SVM covered in this post is a supervised learning algorithm for solving classification problems. SVM extends the input data to create more complex models that are not defined by simple hyperplanes. SVM can be applied to both classification and regression. Linear models and nonlinear characteristics: Because lines and hyperplanes are not flexible, linear models …

2020.03.08(am): K-means Clustering

Unsupervised learning: Machine learning algorithms come in two types, supervised learning and unsupervised learning. Unsupervised learning refers to all kinds of machine learning in which the learning algorithm must be trained without any known output or information. The most difficult thing in unsupervised learning is evaluating whether the algorithm has learned something useful. Unsupervised learning …

2020.01.18(pm): Feature Selection and Dimension Reduction

Feature Engineering: Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work. Feature engineering is fundamental to the application of machine learning, and is both difficult and expensive. The need for manual feature engineering can be obviated by automated feature learning (from: https://en.wikipedia.org/wiki/Feature_engineering) …

2020.01.01(pm): Statistical analysis using pandas

1. pandas features. pandas has the following characteristics: easy handling of missing values; creation of data with automatic and explicit label positions; data-intensive; advanced label-based slicing, extraction, and subsetting of large datasets; intuitive combination of datasets; flexible transformation of datasets; descriptive labeling of axes; powerful I/O for various data formats; inherent processing …