Utils
JudiLing.iscorrect — Function
Check whether the predictions are correct.
JudiLing.display_pred — Function
Display predictions nicely.
JudiLing.translate — Function
Translate indices into words or utterances.
JudiLing.translate_path — Function
Append indices together to form a path.
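To illustrate what translating a path of indices into a word can look like, here is a minimal sketch. It is not JudiLing's implementation: the helper `path_to_word`, the lookup table `i2f`, and the boundary marker `#` are all assumptions made for this example.

```julia
# Hypothetical helper (not part of JudiLing): rebuild a word from a path of
# n-gram indices. `i2f` is an assumed lookup table from index to n-gram.
function path_to_word(path::AbstractVector{<:Integer}, i2f::AbstractDict;
                      boundary::Char = '#')
    isempty(path) && return ""
    grams = [collect(i2f[i]) for i in path]   # n-grams as character vectors
    chars = copy(grams[1])                    # keep the first n-gram whole
    for g in grams[2:end]
        push!(chars, g[end])                  # each later n-gram adds one new character
    end
    # Drop the assumed boundary markers before joining the characters.
    return join(filter(c -> c != boundary, chars))
end

i2f = Dict(1 => "#ab", 2 => "abc", 3 => "bc#")
path_to_word([1, 2, 3], i2f)  # "abc"
```

The sketch assumes that consecutive n-grams overlap in all but one character, so only the last character of each later n-gram is new.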
JudiLing.is_truly_sparse — Function
Check whether a matrix is truly sparse regardless of its format, where M is originally in a sparse matrix format.
Check whether a matrix is truly sparse regardless of its format, where M is originally in a dense matrix format.
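The criterion for "truly sparse" is not stated here. As a rough sketch of a format-independent check, one could compare the share of nonzero entries against a threshold. The helper name `looks_sparse` and the 0.25 threshold are assumptions for illustration, not JudiLing's definition.

```julia
using SparseArrays

# Hypothetical helper (not part of JudiLing): judge sparsity by content rather
# than by storage format. The default threshold is an arbitrary assumption.
function looks_sparse(M::AbstractMatrix; max_density::Real = 0.25)
    isempty(M) && return true
    # count(!iszero, M) works for dense and sparse matrices alike and
    # ignores explicitly stored zeros in sparse formats.
    density = count(!iszero, M) / length(M)
    return density <= max_density
end

looks_sparse([1 0 0 0; 0 0 0 0])   # true: dense format, mostly zeros
looks_sparse(sparse(ones(3, 3)))   # false: sparse format, no zeros
```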
JudiLing.isattachable — Function
Check whether a gram can attach to another gram.
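One common notion of attachment for n-grams is overlap: a gram can follow another when the tail of the first equals the head of the second. The sketch below shows that idea only. `can_attach` is a hypothetical name, and the overlap rule is an assumption rather than a statement of how `isattachable` is implemented.

```julia
# Hypothetical helper (not part of JudiLing): two n-grams of equal length can
# be chained when the first without its first character equals the second
# without its last character.
function can_attach(a::AbstractString, b::AbstractString)
    ca, cb = collect(a), collect(b)            # character vectors, safe for Unicode
    length(ca) == length(cb) || return false   # only compare grams of the same size
    return ca[2:end] == cb[1:end-1]
end

can_attach("#ab", "abc")  # true: the overlap is "ab"
can_attach("#ab", "bcd")  # false: "ab" does not match "bc"
```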
JudiLing.iscomplete — Function
Check whether a path is complete.
JudiLing.isstart — Function
Check whether a gram can start a path.
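As a sketch of how start and completeness checks can work for n-gram paths, assume that word boundaries are marked with a boundary character. The helper names and the `#` marker below are assumptions for illustration and may differ from JudiLing's actual behaviour.

```julia
# Hypothetical helpers (not part of JudiLing). The boundary marker '#' is an
# assumption of this sketch.
starts_path(gram::AbstractString; boundary::Char = '#') =
    !isempty(gram) && first(gram) == boundary

ends_path(gram::AbstractString; boundary::Char = '#') =
    !isempty(gram) && last(gram) == boundary

# In this sketch a path is complete when it opens with a gram that starts at a
# boundary and closes with a gram that ends at one.
function path_is_complete(path::AbstractVector{<:AbstractString}; boundary::Char = '#')
    isempty(path) && return false
    return starts_path(first(path); boundary = boundary) &&
           ends_path(last(path); boundary = boundary)
end

starts_path("#ab")                         # true
path_is_complete(["#ab", "abc", "bc#"])    # true
path_is_complete(["#ab", "abc"])           # false: no closing boundary
```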
JudiLing.isnovel — Function
Check whether a predicted path is novel, that is, whether it is absent from the training data.
JudiLing.check_used_token — Function
Check whether there are tokens already used in the dataset as n-gram components.
JudiLing.cal_max_timestep — Function
function cal_max_timestep(
    data_train::DataFrame,
    data_val::DataFrame,
    target_col::Union{String, Symbol};
    tokenized::Bool = false,
    sep_token::Union{Nothing, String, Char} = "",
)
Calculate the max timestep given training and validation datasets.
Obligatory Arguments
data_train::DataFrame: the training dataset
data_val::DataFrame: the validation dataset
target_col::Union{String, Symbol}: the column with the target word forms
Optional Arguments
tokenized::Bool = false: whether the word forms in the target_col are already tokenized
sep_token::Union{Nothing, String, Char} = "": the token with which the word forms are tokenized
Examples
JudiLing.cal_max_timestep(latin_train, latin_val, :Word)
function cal_max_timestep(
    data::DataFrame,
    target_col::Union{String, Symbol};
    tokenized::Bool = false,
    sep_token::Union{Nothing, String, Char} = "",
)
Calculate the max timestep given a single dataset.
Obligatory Arguments
data::DataFrame: the dataset
target_col::Union{String, Symbol}: the column with the target word forms
Optional Arguments
tokenized::Bool = false: whether the word forms in the target_col are already tokenized
sep_token::Union{Nothing, String, Char} = "": the token with which the word forms are tokenized
Examples
JudiLing.cal_max_timestep(latin, :Word)
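To make the role of tokenized and sep_token concrete, the following sketch computes the length of the longest word form, counted in characters or, for tokenized forms, in separator-delimited tokens. It is not JudiLing's implementation: `longest_form_length` is a hypothetical helper, it works on a plain vector of strings rather than a DataFrame, and how cal_max_timestep derives the final timestep count from such a length (for example, any offset for boundary markers) is not specified here.

```julia
# Hypothetical helper (not part of JudiLing): length of the longest word form.
# Untokenized forms are measured in characters; tokenized forms are split on
# `sep_token` and measured in tokens.
function longest_form_length(forms::AbstractVector{<:AbstractString};
                             tokenized::Bool = false,
                             sep_token::Union{Nothing, String, Char} = "")
    if tokenized && !isnothing(sep_token) && sep_token != ""
        return maximum(length(split(f, sep_token)) for f in forms)
    end
    return maximum(length(f) for f in forms)
end

longest_form_length(["vocoo", "vocaas"])                                          # 6
longest_form_length(["v-o-c-o", "v-o-c-a-s"]; tokenized = true, sep_token = "-")  # 5
```

With a DataFrame, the column of word forms could be passed as `data[:, target_col]`. For separate training and validation sets, the same idea extends by taking the larger of the two values, for example `max(longest_form_length(train_forms), longest_form_length(val_forms))`, where `train_forms` and `val_forms` are assumed vectors of word forms.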