Test Combo
JudiLing.test_combo — Method
test_combo(test_mode; kwargs...)
A wrapper function for a full model for a specific combination of parameters. A detailed introduction is in Test Combo Introduction.
test_combo is deprecated. While it will remain in the package, it is no longer actively maintained.
Obligatory Arguments
- test_mode::Symbol: which test mode to use; currently supports :train_only, :pre_split, :careful_split and :random_split.
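The call shape test_combo(test_mode; kwargs...) takes one positional mode symbol followed by keyword options. The sketch below is not JudiLing code: `run_combo_stub` is a hypothetical stand-in that only illustrates how a mode symbol might be validated and how keyword options can be collected and forwarded. The mode symbols are written with underscores here, and the keyword names and defaults are taken from this page.

```julia
# Hypothetical stand-in for illustration; this is NOT the JudiLing implementation.
function run_combo_stub(test_mode::Symbol; grams::Int=3, learn_mode::Symbol=:cholesky,
                        output_dir::String="out", verbose::Bool=false)
    supported = (:train_only, :pre_split, :careful_split, :random_split)
    test_mode in supported ||
        throw(ArgumentError("unsupported test_mode: $test_mode"))
    # Return the resolved settings instead of training a model.
    return (test_mode=test_mode, grams=grams, learn_mode=learn_mode,
            output_dir=output_dir, verbose=verbose)
end

# Keyword options can be collected once in a NamedTuple and splatted into the call.
opts = (grams=3, learn_mode=:wh, verbose=true)
settings = run_combo_stub(:train_only; opts...)
```

Options that are not passed keep their defaults, so `settings.output_dir` is still "out" in this sketch.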
Optional Arguments
- train_sample_size::Int64=0: the desired number of training data
- val_sample_size::Int64=0: the desired number of validation data
- val_ratio::Float64=0.0: the desired portion of validation data; takes effect only if val_sample_size is 0
- extension::String=".csv": the extension for data n_features_inflections
- n_grams_target_col::Union{String, Symbol}=:Word: the column name for target strings
- n_grams_tokenized::Bool=false: if true, the dataset target is assumed to be tokenized
- n_grams_sep_token::Union{Nothing, String}=nothing: separator
- grams::Int64=3: the number of grams for cues
- n_grams_keep_sep::Bool=false: if true, keep separators in cues
- start_end_token::String=":": start and end token in boundary cues
- path_sep_token::String=":": path separator in the assembled path
- random_seed::Int64=314: the random seed
- sd_base_mean::Int64=1: the sd mean of base features
- sd_inflection_mean::Int64=1: the sd mean of inflectional features
- sd_base::Int64=4: the sd of base features
- sd_inflection::Int64=4: the sd of inflectional features
- isdeep::Bool=true: if true, the mean of each feature is also randomized
- add_noise::Bool=true: if true, add additional Gaussian noise
- sd_noise::Int64=1: the sd of the Gaussian noise
- normalized::Bool=false: if true, most values range between -1 and 1; some may slightly exceed this range depending on the sd
- if_combined::Bool=false: if true, features are combined from both training and validation data
- learn_mode::Symbol=:cholesky: which learning mode to use; currently supports :cholesky and :wh
- method::Symbol=:additive: whether :additive or :multiplicative decomposition is required
- shift::Float64=0.02: shift value for :additive decomposition
- multiplier::Float64=1.01: multiplier value for :multiplicative decomposition
- output_format::Symbol=:auto: force the output format to dense (:dense) or sparse (:sparse), or let the program decide (:auto)
- sparse_ratio::Float64=0.05: the ratio used to decide whether a matrix is sparse
- wh_freq::Union{Nothing, Vector}=nothing: the learning sequence
- init_weights::Union{Nothing, Matrix}=nothing: the initial weights
- eta::Float64=0.1: the learning rate
- n_epochs::Int64=1: the number of epochs to be trained
- max_t::Int64=0: the maximum number of timesteps
- A::Union{Nothing, Matrix}=nothing: the adjacency matrix
- A_mode::Symbol=:combined: the adjacency matrix mode; currently supports :combined or :train_only
- max_can::Int64=10: the maximum number of candidate paths to keep in the output
- threshold_train::Float64=0.1: the support threshold for training data; an n-gram is taken into consideration if its support is higher than this value
- is_tolerant_train::Bool=false: if true, select a specified number (given by max_tolerance_train) of n-grams whose supports are below the threshold but above a second tolerance threshold to be added to the path for training data
- tolerance_train::Float64=-0.1: the second threshold (in tolerant mode) for training data; if the support of an n-gram lies between this value and the threshold and max_tolerance_train has not been reached, the n-gram may be added to the path
- max_tolerance_train::Int64=2: maximum number of below-threshold n-grams allowed in a path for training data
- threshold_val::Float64=0.1: the support threshold for validation data; an n-gram is taken into consideration if its support is higher than this value
- is_tolerant_val::Bool=false: if true, select a specified number (given by max_tolerance_val) of n-grams whose supports are below the threshold but above a second tolerance threshold to be added to the path for validation data
- tolerance_val::Float64=-0.1: the second threshold (in tolerant mode) for validation data; if the support of an n-gram lies between this value and the threshold and max_tolerance_val has not been reached, the n-gram may be added to the path
- max_tolerance_val::Int64=2: maximum number of below-threshold n-grams allowed in a path for validation data
- n_neighbors_train::Int64=10: the top n form neighbors to be considered for training data
- n_neighbors_val::Int64=20: the top n form neighbors to be considered for validation data
- issparse::Bool=false: if true, keep the sparse matrix format when learning paths
- output_dir::String="out": the output directory
- verbose::Bool=false: if true, more information will be printed
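The interaction between the threshold, tolerance and max_tolerance options is easier to see on a small example. The following is a simplified sketch, not the JudiLing path-learning code: `select_ngrams` is a hypothetical helper, the support values are made up for illustration, and candidates are assumed to be processed in order with a single tolerance budget.

```julia
# Hypothetical helper; a simplified reading of the threshold/tolerance options,
# not the actual JudiLing algorithm.
function select_ngrams(supports::Vector{Float64}; threshold::Float64=0.1,
                       is_tolerant::Bool=false, tolerance::Float64=-0.1,
                       max_tolerance::Int=2)
    selected = Int[]   # indices of the n-grams that are kept
    n_tolerated = 0    # number of below-threshold n-grams let through so far
    for (i, s) in enumerate(supports)
        if s > threshold
            # Support is above the main threshold: always keep.
            push!(selected, i)
        elseif is_tolerant && s > tolerance && n_tolerated < max_tolerance
            # Tolerant mode: support lies between tolerance and threshold,
            # and the tolerance budget is not yet used up.
            push!(selected, i)
            n_tolerated += 1
        end
    end
    return selected
end

supports = [0.35, 0.05, -0.02, 0.12, -0.3, 0.0]   # made-up support values
strict = select_ngrams(supports)                    # default: not tolerant
tolerant = select_ngrams(supports; is_tolerant=true)
```

With the defaults listed on this page, `strict` keeps only candidates 1 and 4. In tolerant mode, candidates 2 and 3 are admitted as well, while candidate 6 is rejected because the budget of two tolerated n-grams is already spent.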