# klib.describe - functions for visualizing datasets
- klib.cat_plot(df) # returns a visualization of the number and frequency of categorical features
- klib.corr_mat(df) # returns a color-encoded correlation matrix
- klib.corr_plot(df) # returns a color-encoded heatmap, ideal for correlations
- klib.dist_plot(df) # returns a distribution plot for every numeric feature
- klib.missingval_plot(df) # returns a figure containing information about missing values

# klib.clean - functions for cleaning datasets
- klib.data_cleaning(df) # performs data cleaning (drop duplicates & empty rows/cols, adjust dtypes, ...)
- klib.clean_column_names(df) # cleans and standardizes column names, also called inside data_cleaning()
- klib.convert_datatypes(df) # converts existing to more efficient dtypes, also called inside data_cleaning()
- klib.drop_missing(df) # drops missing values, also called in data_cleaning()
- klib.mv_col_handling(df) # drops features with high ratio of missing vals based on informational content
- klib.pool_duplicate_subsets(df) # pools subset of cols based on duplicates with min. loss of information

# klib.preprocess - functions for data preprocessing (feature selection, scaling, ...)
- klib.train_dev_test_split(df) # splits a dataset and a label into train, optionally dev and test sets
- klib.feature_selection_pipe() # provides common operations for feature selection
- klib.num_pipe() # provides common operations for preprocessing of numerical data
- klib.cat_pipe() # provides common operations for preprocessing of categorical data
- ColumnSelector() # selects num or cat columns, ideal for a Feature Union or Pipeline
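
To make the cleaning steps more concrete, here is a minimal sketch in plain pandas of the operations that the data_cleaning() description names (standardizing column names, dropping duplicates and empty rows/columns, adjusting dtypes). This is an approximation for illustration only, not klib's actual implementation. The helper name `basic_cleaning` and the sample data are hypothetical.

```python
import pandas as pd


def basic_cleaning(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper that approximates the listed cleaning steps.

    This is NOT klib's implementation; it only mirrors the steps that
    the function descriptions mention, using plain pandas calls.
    """
    cleaned = df.copy()
    # Standardize column names: trim whitespace, lowercase, spaces -> underscores.
    cleaned.columns = [
        str(col).strip().lower().replace(" ", "_") for col in cleaned.columns
    ]
    # Drop columns, then rows, that contain only missing values.
    cleaned = cleaned.dropna(axis=1, how="all").dropna(axis=0, how="all")
    # Drop exact duplicate rows and rebuild the index.
    cleaned = cleaned.drop_duplicates().reset_index(drop=True)
    # Let pandas infer more specific dtypes where possible.
    return cleaned.convert_dtypes()


# Made-up sample data: one duplicate row, one empty row, one empty column.
raw = pd.DataFrame(
    {
        "Customer ID": [1, 2, 2, None],
        " City ": ["Berlin", "Paris", "Paris", None],
        "Empty Col": [None, None, None, None],
    }
)

cleaned = basic_cleaning(raw)
print(cleaned)
print(cleaned.dtypes)
```

With klib installed, the equivalent call would presumably be the single line `klib.data_cleaning(df)` from the list above; the sketch only spells out what such a step involves.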