Posts

The Point that you're missing

Tips for your data analysis and AI code

# Imports used by the tips below
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder

# 1. Easy on your eyes, easy on your analysis
sns.countplot(data=df, x='target_column', hue='satisfaction')
df['target_column'].value_counts().plot(kind='bar')

# 2. Save time (one vectorized call instead of chained .loc assignments)
df['Bool'] = df['Bool'].replace({1: 'Y', 0: 'N'})

# 3. Don't forget to inverse transform
le = LabelEncoder()
df['column'] = le.fit_transform(df['column'])
le.inverse_transform(result)  # result: encoded predictions, mapped back to the original labels

# 4. Why not pretty?
corr_data = df.corr()
fig, ax = plt.subplots(figsize=(12, 10))
mask = np.zeros_like(corr_data, dtype=bool)  # boolean mask hiding the upper triangle
mask[np.triu_indices_from(mask)] = True
sns.heatmap(corr_data, annot=True, mask=mask, linewidths=1., cbar_kws={"shrink": .5}, vmin=-1, vmax=1, ax=ax)
plt.show()
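Tips 2 and 3 can be tried end to end on a toy frame. This is only a minimal sketch: the column names and values are made up for illustration, and `replace` is one of several ways to recode a flag column.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data; the column names are placeholders, not from a real dataset.
df = pd.DataFrame({
    "Bool": [1, 0, 1, 1],
    "column": ["red", "blue", "red", "green"],
})

# Tip 2: recode the 1/0 flag in a single vectorized call.
df["Bool"] = df["Bool"].replace({1: "Y", 0: "N"})

# Tip 3: encode the labels as integers, then map the integers back.
le = LabelEncoder()
codes = le.fit_transform(df["column"])   # classes are sorted alphabetically
restored = le.inverse_transform(codes)   # back to the original strings

print(list(df["Bool"]))
print(list(codes), list(restored))
```

Keeping the fitted `le` object around matters: without it there is no way to turn model predictions back into readable labels.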

[Kaggle] Titanic Survivor Classification

Titanic Survivor Classification Challenge From Kaggle.

"""
- 0. Modules
- 1. Train Data Load
- 2. Null Data
- 3. Outliers & One-Hot Encoding
  - 3.1. Outliers
  - 3.2. One-Hot-Encoding
  - 3.3. Merge DF
- 4. Correlation Analysis
  - 4.1. Correlation Check (include dummies)
  - 4.2. Get Original Categorical Column Names
  - 4.3. Handle Categorical Columns Using Corr (del & dummy)
- 5. Data Split-1 [Data and Label]
- 6. Scaling
- 7. Data Split-2 [Train and Validation]
- 8. Test Data Load
- 9. Machine Learning
  - 9.0. Comparison
  - 9.1. ML - Decision Tree Classifier - Grid Search
  - 9.2. ML - Random Forest Classifier - Grid Search
  - 9.3. ML - Logistic Regressor - Grid Search
  - 9.4. ML - XGBoost Classifier - Grid Search
  - 9.5. ML - LGBM Classifier - Grid Search
  - 9.6. ML - CatBoost Classifier - Grid Search
- 10. Deep Learning
  - 10.1. Network Model
  - 10.2. Test Score
- 11. Final Score Comparison
- 12. Submit
"""
#============...
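The "Decision Tree Classifier - Grid Search" step in the outline could look roughly like the sketch below. It is not the post's actual code: the data is synthetic (`make_classification` stands in for the Titanic features), and the parameter grid is an assumption chosen only to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed Titanic features and labels.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Hold out a validation split, stratified so both classes stay balanced.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Assumed, deliberately tiny grid; a real search would cover more values.
param_grid = {"max_depth": [2, 4, 6], "min_samples_leaf": [1, 5]}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# GridSearchCV refits the best model on the full training split by default.
valid_accuracy = search.score(X_valid, y_valid)
print(search.best_params_, round(valid_accuracy, 3))
```

The same pattern carries over to the other classifiers in the outline by swapping the estimator and its grid.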

[Kaggle] Pizza or Not Classification (Computer Vision)

Pizza or Not Binary Classification Challenge From Kaggle.

"""
- 0. Modules
- 1. Train Data Load
  - 1.1. Path and Check
  - 1.2. Amount Cnt
  - 1.3. Path & Files List
  - 1.4. Img Size Check
- 2. Labeling
  - 2.1. Make Img & Label List
  - 2.2. Handling Img Ch Error
- 3. Tensorization
- 4. Data Split
- 5. Test Data Load
- 6. Learning
  - 6.1. Transfer - EfficientNet-B5
  - 6.2. Transfer - ResNet50 V2
  - 6.3. Transfer - Inception V3
  - 6.4. CNN
- 7. Test Score
- 8. Submit
"""
#======================================== ========================================
import os
from glob import glob
import pathlib
from PIL import Image
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from keras.models import Model, Sequential
from keras.optimizers import Adam, Adamax, SGD, Nadam
from sklearn.model_selection import train_test_split
from keras.applications import EfficientNetB0, EfficientNetB4
from keras...
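The "Handling Img Ch Error" and "Tensorization" steps usually come down to forcing every image into the same channel layout before stacking. The sketch below shows one way to do that with PIL and NumPy. The helper name `to_rgb_array`, the 64x64 target size, and the in-memory sample images are all assumptions for illustration, not taken from the post.

```python
import numpy as np
from PIL import Image

def to_rgb_array(img, size=(64, 64)):
    """Convert a PIL image of any mode to a float32 RGB array scaled to [0, 1]."""
    rgb = img.convert("RGB").resize(size)  # grayscale/RGBA become 3-channel RGB
    return np.asarray(rgb, dtype=np.float32) / 255.0

# In-memory stand-ins for files with mismatched channel counts.
samples = [
    Image.new("L", (80, 60), color=128),                  # 1 channel
    Image.new("RGBA", (50, 50), color=(255, 0, 0, 255)),  # 4 channels
    Image.new("RGB", (64, 64), color=(0, 255, 0)),        # 3 channels
]

# After conversion every array has the same shape, so stacking works.
batch = np.stack([to_rgb_array(im) for im in samples])
print(batch.shape)  # (3, 64, 64, 3)
```

Without the `convert("RGB")` call, `np.stack` would fail on the first grayscale or RGBA image because the array shapes differ.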

[Kaggle] Automobile Parts Classification (Computer Vision)

Automobile Parts Classification from Kaggle Challenge

"""
- 0. Modules
- 1. Train Data Load
  - 1.1. Path and Check
  - 1.2. Amount Cnt
  - 1.3. Path & Files List
  - 1.4. Img Size Check
- 2. Labeling
  - 2.1. Make Img & Label List
  - 2.2. Handling Img Ch Error
- 3. Tensorization
- 4. Data Split
- 5. Test Data Load
- 6. Learning
  - 6.1. Transfer - EfficientNet-B5
  - 6.2. Transfer - ResNet50 V2
  - 6.3. Transfer - Inception V3
  - 6.4. CNN
- 7. Test Score
- 8. Submit
"""
#======================================== ========================================
import os
from glob import glob
import pathlib
from PIL import Image
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from keras.models import Model, Sequential
from keras.optimizers import Adam, Adamax, SGD, Nadam
from sklearn.model_selection import train_test_split
from keras.applications import EfficientNetB0, EfficientNetB4
from keras.laye...
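For the "Path & Files List" and "Make Img & Label List" steps, a common layout is one subfolder per class. That layout is an assumption here, as are the helper name `list_images_with_labels`, the `.jpg` extension, and the class names "bolt" and "gear". Under those assumptions the labeling step might be sketched like this:

```python
import tempfile
from pathlib import Path

def list_images_with_labels(root):
    """Return (paths, integer labels, class names) for a root/<class>/<file>.jpg layout."""
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_index = {name: i for i, name in enumerate(class_names)}
    paths, labels = [], []
    for name in class_names:
        for path in sorted((root / name).glob("*.jpg")):
            paths.append(path)
            labels.append(class_to_index[name])
    return paths, labels, class_names

# Build a throwaway directory tree with empty placeholder files to demo the helper.
with tempfile.TemporaryDirectory() as tmp:
    for cls, count in [("bolt", 2), ("gear", 1)]:
        (Path(tmp) / cls).mkdir()
        for i in range(count):
            (Path(tmp) / cls / f"{i}.jpg").touch()
    paths, labels, class_names = list_images_with_labels(tmp)

print(class_names, labels)
```

Sorting the class folders keeps the label indices stable between runs, which matters when predictions are later mapped back to class names.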

[Kaggle] Fashion Shopping Review Classification (NLP)

Load image files and make the Convolutional Deep learning model.

"""
- 0. Modules
- 1. Train Data Load
- 2. Data Preprocessing
  - 2.1. Null Data
  - 2.2. Language, Word filtering and check duplicated data
  - 2.3. Label Check
  - 2.4. Label Encoding
- 3. Data Split
  - 3.1. Features & Labels
  - 3.2. Train & Valid
- 4. Test Data Load
- 5. Machine Learning
  - 5.0. Total Comparison
  - 5.1. Naive Bayes Classification
  - 5.2. LSTM Model
- 6. Deep Learning
  - 6.1. Tokenizing & Padding
  - 6.2. LSTM Model
  - 6.3. Test Data
- 7. Final Score
- 8. Submit
"""
#======================================== ========================================
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['font.family'] = 'Arial'
plt.rcParams['axes.unicode_minus'] = False
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.feat...
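The "Naive Bayes Classification" step can be illustrated with a bag-of-words pipeline. The sketch below is an assumption about how such a baseline might be set up, not the post's code: the four review strings and their labels are invented, and `CountVectorizer` is just one possible feature extractor.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy reviews: 1 = positive, 0 = negative.
texts = [
    "love this dress, fits perfectly",
    "great fabric and great fit",
    "terrible quality, returned it",
    "poor fit and cheap fabric",
]
labels = [1, 1, 0, 0]

# Word counts feed a multinomial Naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

pred = model.predict(["great dress", "cheap and terrible"])
print(list(pred))  # [1, 0]
```

Wrapping the vectorizer and classifier in one pipeline means the vocabulary is learned only from the training texts and reused unchanged at prediction time.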

[Kaggle] Boston House Price Regression

"""
- 0. Modules
- 1. Train Data Load
- 2. Null Data
- 3. Outliers & One-Hot Encoding
- 4. Correlation Analysis
- 5. Data Split-1 [Data and Label]
- 6. Scaling
- 7. Data Split-2 [Train and Validation]
- 8. Test Data Load
- 9. Machine Learning
  - 9.0. Comparison
  - 9.1. ML - Decision Tree Regressor - Grid Search
  - 9.2. ML - Random Forest Regressor - Grid Search
  - 9.3. ML - Logistic Regression - Grid Search
  - 9.4. ML - LGBM Regressor - Grid Search
  - 9.5. ML - CatBoost Regressor - Grid Search
- 10. Deep Learning
- 11. Final Score
- 12. Submit
"""
#======================================== ========================================
import sklearn as sk
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler, StandardScaler, RobustScaler, LabelEncoder
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import confusion...