Evaluation
JudiLing.Comp_Acc_Struct
— Type
A structure that stores information about comprehension accuracy.
JudiLing.eval_SC
— Function
Assess model accuracy on the basis of the correlations of row vectors of Chat and C or Shat and S. Ideally, the target words have the highest correlations on the diagonal of the pertinent correlation matrices. Support for homophones is available as an option.
JudiLing.eval_SC_loose
— Function
Assess model accuracy on the basis of the correlations of row vectors of Chat and C or Shat and S. A prediction counts as correct if one of the top k candidates is correct. Support for homophones is available as an option.
JudiLing.accuracy_comprehension
— Method
accuracy_comprehension(S, Shat, data)
Evaluate comprehension accuracy for training data.
In case of homophones/homographs in the dataset, the correct/incorrect values for base and inflections may be misleading! See below for more information.
Obligatory Arguments
S::Matrix: the (gold standard) S matrix
Shat::Matrix: the (predicted) Shat matrix
data::DataFrame: the dataset
Optional Arguments
target_col::Union{String, Symbol}=:Words: the column name for target strings
base::Vector=nothing: base features (typically a lexeme)
inflections::Union{Nothing, Vector}=nothing: other features (typically inflectional features)
Examples
accuracy_comprehension(
    S_train,
    Shat_train,
    latin_train,
    target_col=:Words,
    base=[:Lexeme],
    inflections=[:Person, :Number, :Tense, :Voice, :Mood]
)
Note
In case of homophones/homographs in the dataset, the correct/incorrect values for base and inflections may be misleading! Consider the following example: the wordform "Äpfel" in German can be nominative plural, genitive plural and accusative plural. Assume a dataset in which "Äpfel" occurs in all three case/number combinations (i.e. there are homographs). If all these wordforms have the same semantic vector (e.g. because the vectors are derived from word2vec or fastText, which typically provide a single vector per unique wordform), the predicted semantic vector of the wordform "Äpfel" will be equally correlated with all three case/number combinations in the dataset. In such cases, the algorithm in this function can unambiguously conclude that the correct surface form "Äpfel" was comprehended, but which of the three possible rows is picked as the correct one is somewhat non-deterministic (see https://docs.julialang.org/en/v1/base/collections/#Base.argmax). It is thus possible that the algorithm uses the genitive plural instead of the intended nominative plural as the ground truth, and reports that "case" was comprehended incorrectly.
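The tie described in this note can be reproduced with a few lines of plain Julia. The following is only a sketch with made-up vectors, and it does not call JudiLing: three rows share one wordform and one semantic vector, and `argmax` (which returns the first of several maximal elements) selects a row whose case differs from the intended one.

```julia
using Statistics

# Toy data (hypothetical values): rows 1-3 are homographs that share one
# semantic vector; row 4 is a different word.
words = ["Äpfel", "Äpfel", "Äpfel", "Birne"]
cases = ["nominative", "genitive", "accusative", "nominative"]
S = [1.0 2.0 3.0 0.5;
     1.0 2.0 3.0 0.5;
     1.0 2.0 3.0 0.5;
     3.0 1.0 0.2 2.0]

# Predicted vector for the accusative row (row 3).
intended = 3
shat = [1.1, 1.9, 3.2, 0.4]

# Correlate the prediction with every gold row.
r = [cor(shat, S[j, :]) for j in 1:size(S, 1)]

# The three homograph rows tie exactly, and argmax returns the first of them.
best = argmax(r)

println("wordform: ", words[best], " (intended: ", words[intended], ")")
println("case:     ", cases[best], " (intended: ", cases[intended], ")")
```

The wordform is recovered, but the case is read off a different row than the intended one, which is how a correctly comprehended form can still be reported with an incorrect inflectional feature.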
JudiLing.accuracy_comprehension
— Method
accuracy_comprehension(
    S_val,
    S_train,
    Shat_val,
    data_val,
    data_train;
    target_col = :Words,
    base = nothing,
    inflections = nothing,
)
Evaluate comprehension accuracy for validation data.
In case of homophones/homographs in the dataset, the correct/incorrect values for base and inflections may be misleading! See below for more information.
Obligatory Arguments
S_val::Matrix: the (gold standard) S matrix of the validation data
S_train::Matrix: the (gold standard) S matrix of the training data
Shat_val::Matrix: the (predicted) Shat matrix of the validation data
data_val::DataFrame: the validation dataset
data_train::DataFrame: the training dataset
Optional Arguments
target_col::Union{String, Symbol}=:Words: the column name for target strings
base::Vector=nothing: base features (typically a lexeme)
inflections::Union{Nothing, Vector}=nothing: other features (typically inflectional features)
Examples
accuracy_comprehension(
    S_val,
    S_train,
    Shat_val,
    latin_val,
    latin_train,
    target_col=:Words,
    base=[:Lexeme],
    inflections=[:Person, :Number, :Tense, :Voice, :Mood]
)
Note
In case of homophones/homographs in the dataset, the correct/incorrect values for base and inflections may be misleading! Consider the following example: the wordform "Äpfel" in German can be nominative plural, genitive plural and accusative plural. Assume a dataset in which "Äpfel" occurs in all three case/number combinations (i.e. there are homographs). If all these wordforms have the same semantic vector (e.g. because the vectors are derived from word2vec or fastText, which typically provide a single vector per unique wordform), the predicted semantic vector of the wordform "Äpfel" will be equally correlated with all three case/number combinations in the dataset. In such cases, the algorithm in this function can unambiguously conclude that the correct surface form "Äpfel" was comprehended, but which of the three possible rows is picked as the correct one is somewhat non-deterministic (see https://docs.julialang.org/en/v1/base/collections/#Base.argmax). It is thus possible that the algorithm uses the genitive plural instead of the intended nominative plural as the ground truth, and reports that "case" was comprehended incorrectly.
JudiLing.eval_SC
— Method
eval_SC(SChat::AbstractArray, SC::AbstractArray)
Assess model accuracy on the basis of the correlations (or Euclidean distances or cosine similarities) of row vectors of Chat and C or Shat and S. Ideally, the target words have the highest correlations (lowest distances/highest similarities) on the diagonal of the pertinent correlation (distance/similarity) matrices.
If freq is added, token-based accuracy is computed. Token-based accuracy weighs accuracy values according to the words' frequencies: for example, if a word has a frequency of 30 and there are 3000 tokens overall (the frequencies of all types sum to 3000), this word's accuracy contributes with a weight of 30/3000.
If there are homophones/homographs in the dataset, this evaluation method may be misleading: the predicted vector will be equally correlated with the target vectors of both words, and the one on the diagonal will not necessarily be selected as the most correlated. In such cases, supplying the dataset and target_col is recommended, which makes it possible to take homophones/homographs into account.
Obligatory Arguments
SChat::Union{SparseMatrixCSC, Matrix}: the Chat or Shat matrix
SC::Union{SparseMatrixCSC, Matrix}: the C or S matrix
Optional Arguments
digits: the specified number of digits after the decimal place (or before if negative)
R::Bool=false: if true, the pairwise correlation/distance/similarity matrix R is returned
freq::Union{Missing, Array{Int64, 1}, Array{Float64,1}}=missing: list of frequencies of the wordforms in X and Y
method::Union{Symbol, String}=:correlation: method for computing similarities, one of {:correlation, :euclidean, :cosine}
eval_SC(Chat_train, cue_obj_train.C)
eval_SC(Chat_val, cue_obj_val.C)
eval_SC(Shat_train, S_train)
eval_SC(Shat_val, S_val)
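The diagonal criterion and the frequency weighting can be illustrated without JudiLing. The sketch below uses only the `Statistics` standard library; `diag_accuracy` is a hypothetical helper written for this illustration, covers the correlation method only, and is not the package's implementation.

```julia
using Statistics

# Hypothetical helper, not part of JudiLing: row i counts as correct when the
# gold row most correlated with predicted row i is row i itself.
function diag_accuracy(SChat::AbstractMatrix, SC::AbstractMatrix; freq = missing)
    # cor on matrices correlates columns, so transpose to correlate rows.
    R = cor(permutedims(SChat), permutedims(SC))
    correct = [argmax(R[i, :]) == i for i in 1:size(R, 1)]
    # Type-based accuracy: every word counts once.
    ismissing(freq) && return mean(correct)
    # Token-based accuracy: word i contributes freq[i] / sum(freq).
    return sum(correct .* freq) / sum(freq)
end

# Made-up gold and predicted matrices with three words.
S = [1.0 0.0 0.0 2.0;
     0.0 1.0 0.0 1.0;
     0.0 0.0 1.0 3.0]
Shat = [0.9 0.1 0.0 1.8;   # closest to gold row 1
        0.1 0.8 0.1 1.1;   # closest to gold row 2
        0.9 0.0 0.1 2.1]   # closest to gold row 1, so word 3 is wrong

acc_type  = diag_accuracy(Shat, S)
acc_token = diag_accuracy(Shat, S; freq = [30, 2950, 20])
println((acc_type, acc_token))
```

With these toy values two of three types are correct, while the token-based score is much higher because the misclassified word accounts for only 20 of the 3000 tokens.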
JudiLing.eval_SC
— Method
eval_SC(SChat::AbstractArray, SC::AbstractArray, SC_rest::AbstractArray)
Assess model accuracy on the basis of the correlations (or Euclidean distances or cosine similarities) of row vectors of Chat and C or Shat and S. Ideally, the target words have the highest correlations (lowest distances/highest similarities) on the diagonal of the pertinent correlation (distance/similarity) matrices.
If freq is added, token-based accuracy is computed. Token-based accuracy weighs accuracy values according to the words' frequencies: for example, if a word has a frequency of 30 and there are 3000 tokens overall (the frequencies of all types sum to 3000), this word's accuracy contributes with a weight of 30/3000.
The order is important. The first gold standard matrix has to correspond to the SChat matrix, as in eval_SC(Shat_train, S_train, S_val) or eval_SC(Shat_val, S_val, S_train).
If there are homophones/homographs in the dataset, this evaluation method may be misleading: the predicted vector will be equally correlated with the target vectors of both words, and the one on the diagonal will not necessarily be selected as the most correlated. In such cases, supplying the dataset and target_col is recommended, which makes it possible to take homophones/homographs into account.
Obligatory Arguments
SChat::Union{SparseMatrixCSC, Matrix}: the Chat or Shat matrix
SC::Union{SparseMatrixCSC, Matrix}: the training/validation C or S matrix
SC_rest::Union{SparseMatrixCSC, Matrix}: the validation/training C or S matrix
Optional Arguments
digits: the specified number of digits after the decimal place (or before if negative)
R::Bool=false: if true, the pairwise correlation/distance/similarity matrix R is returned
freq::Union{Missing, Array{Int64, 1}, Array{Float64,1}}=missing: list of frequencies of the wordforms in X and Y
method::Union{Symbol, String}=:correlation: method for computing similarities, one of {:correlation, :euclidean, :cosine}
eval_SC(Chat_train, cue_obj_train.C, cue_obj_val.C)
eval_SC(Chat_val, cue_obj_val.C, cue_obj_train.C)
eval_SC(Shat_train, S_train, S_val)
eval_SC(Shat_val, S_val, S_train)
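Why the argument order matters can be seen in a small sketch. It assumes that the candidate set is formed by stacking SC on top of SC_rest, so that row i of SChat is expected to match row i of the stacked matrix; this is an assumption made for illustration, and `diag_accuracy_with_rest` is a hypothetical helper, not a JudiLing function.

```julia
using Statistics

# Hypothetical helper, not part of JudiLing. Assumption: candidates are the
# rows of SC followed by the rows of SC_rest.
function diag_accuracy_with_rest(SChat, SC, SC_rest)
    candidates = vcat(SC, SC_rest)
    R = cor(permutedims(SChat), permutedims(candidates))
    # Row i of SChat has to match row i of the stacked matrix, which only
    # holds when the matrix belonging to SChat comes first.
    return mean([argmax(R[i, :]) == i for i in 1:size(SChat, 1)])
end

# Made-up validation and training matrices.
S_val = [0.0 1.0 0.0 1.0;
         0.0 0.0 1.0 3.0]
S_train = [1.0 0.0 0.0 2.0]          # a single training word (1×4 matrix)
Shat_val = [0.1 0.8 0.1 1.1;         # closest to validation row 1
            0.9 0.0 0.1 2.1]         # closest to the training word

acc_val_only = diag_accuracy_with_rest(Shat_val, S_val, zeros(0, 4))
acc_with_train = diag_accuracy_with_rest(Shat_val, S_val, S_train)
println((acc_val_only, acc_with_train))
```

In this toy case the second validation word is correct when only validation words compete, but a training word becomes its best match once the training matrix is added to the candidates.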
JudiLing.eval_SC
— Method
eval_SC(SChat::AbstractArray, SC::AbstractArray, data::DataFrame, target_col::Union{String, Symbol})
Assess model accuracy on the basis of the correlations (or Euclidean distances or cosine similarities) of row vectors of Chat and C or Shat and S. Ideally, the target words have the highest correlations (lowest distances/highest similarities) on the diagonal of the pertinent correlation (distance/similarity) matrices. This method supports homophones.
If freq is added, token-based accuracy is computed. Token-based accuracy weighs accuracy values according to the words' frequencies: for example, if a word has a frequency of 30 and there are 3000 tokens overall (the frequencies of all types sum to 3000), this word's accuracy contributes with a weight of 30/3000.
Obligatory Arguments
SChat::Union{SparseMatrixCSC, Matrix}: the Chat or Shat matrix
SC::Union{SparseMatrixCSC, Matrix}: the C or S matrix
data::DataFrame: the dataset
target_col::Union{String, Symbol}: target column name
Optional Arguments
digits: the specified number of digits after the decimal place (or before if negative)
R::Bool=false: if true, the pairwise correlation/distance/similarity matrix R is returned
freq::Union{Missing, Array{Int64, 1}, Array{Float64,1}}=missing: list of frequencies of the wordforms in X and Y
method::Union{Symbol, String}=:correlation: method for computing similarities, one of {:correlation, :euclidean, :cosine}
eval_SC(Chat_train, cue_obj_train.C, latin, :Word)
eval_SC(Chat_val, cue_obj_val.C, latin, :Word)
eval_SC(Shat_train, S_train, latin, :Word)
eval_SC(Shat_val, S_val, latin, :Word)
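One plausible reading of homophone support is that a prediction counts as correct when its best-matching gold row carries the same target string, even if that row is not on the diagonal. The sketch below illustrates that reading on made-up data; `rowcor` is a hypothetical helper, and the comparison rule is an assumption rather than a description of JudiLing's internals.

```julia
using Statistics

# Hypothetical helper: Pearson correlation between every predicted row of A
# and every gold row of B.
rowcor(A, B) = [cor(A[i, :], B[j, :]) for i in 1:size(A, 1), j in 1:size(B, 1)]

# Rows 1 and 2 are homophones with identical gold vectors (made-up values).
targets = ["Äpfel", "Äpfel", "Birne"]
S = [1.0 2.0 3.0 0.5;
     1.0 2.0 3.0 0.5;
     3.0 1.0 0.2 2.0]
Shat = [1.1 1.9 3.2 0.4;
        0.9 2.1 2.9 0.6;
        2.8 1.2 0.1 2.1]

R = rowcor(Shat, S)
best = [argmax(R[i, :]) for i in 1:size(R, 1)]

# Strict diagonal check: row 2 ties with row 1 and loses to it.
strict = mean(best .== 1:size(R, 1))
# Homophone-aware check (assumed rule): compare target strings instead.
lenient = mean(targets[best] .== targets)
println((strict, lenient))
```

Under the strict check the second "Äpfel" row is counted as an error because `argmax` returns the first of the tied rows; comparing target strings removes that artefact.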
JudiLing.eval_SC
— Method
eval_SC(SChat::AbstractArray, SC::AbstractArray, SC_rest::AbstractArray, data::DataFrame, data_rest::DataFrame, target_col::Union{String, Symbol})
Assess model accuracy on the basis of the correlations (or Euclidean distances or cosine similarities) of row vectors of Chat and C or Shat and S. Ideally, the target words have the highest correlations (lowest distances/highest similarities) on the diagonal of the pertinent correlation (distance/similarity) matrices.
If freq is added, token-based accuracy is computed. Token-based accuracy weighs accuracy values according to the words' frequencies: for example, if a word has a frequency of 30 and there are 3000 tokens overall (the frequencies of all types sum to 3000), this word's accuracy contributes with a weight of 30/3000.
The order is important. The first gold standard matrix and the first dataset have to correspond to the SChat matrix, as in eval_SC(Shat_train, S_train, S_val, latin_train, latin_val, :Word) or eval_SC(Shat_val, S_val, S_train, latin_val, latin_train, :Word).
Obligatory Arguments
SChat::Union{SparseMatrixCSC, Matrix}: the Chat or Shat matrix
SC::Union{SparseMatrixCSC, Matrix}: the training/validation C or S matrix
SC_rest::Union{SparseMatrixCSC, Matrix}: the validation/training C or S matrix
data::DataFrame: the training/validation dataset
data_rest::DataFrame: the validation/training dataset
target_col::Union{String, Symbol}: target column name
Optional Arguments
digits: the specified number of digits after the decimal place (or before if negative)
R::Bool=false: if true, the pairwise correlation/distance/similarity matrix R is returned
freq::Union{Missing, Array{Int64, 1}, Array{Float64,1}}=missing: list of frequencies of the wordforms in X and Y
method::Union{Symbol, String}=:correlation: method for computing similarities, one of {:correlation, :euclidean, :cosine}
eval_SC(Chat_train, cue_obj_train.C, cue_obj_val.C, latin_train, latin_val, :Word)
eval_SC(Chat_val, cue_obj_val.C, cue_obj_train.C, latin_val, latin_train, :Word)
eval_SC(Shat_train, S_train, S_val, latin_train, latin_val, :Word)
eval_SC(Shat_val, S_val, S_train, latin_val, latin_train, :Word)
JudiLing.eval_SC
— Method
eval_SC(SChat::AbstractArray, SC::AbstractArray, batch_size::Int64)
Assess model accuracy on the basis of the correlations of row vectors of Chat and C or Shat and S. Ideally the target words have highest correlations on the diagonal of the pertinent correlation matrices. For large datasets, pass batch_size to process evaluation in chunks.
If there are homophones/homographs in the dataset, this evaluation method may be misleading: the predicted vector will be equally correlated with the target vector of both words and the one on the diagonal will not necessarily be selected as the most correlated. In such cases, supplying the dataset and target_col is recommended which enables taking into account homophones/homographs.
Currently only available for correlation.
Obligatory Arguments
SChat: the Chat or Shat matrix
SC: the C or S matrix
batch_size: batch size
Optional Arguments
digits: the specified number of digits after the decimal place (or before if negative)
verbose::Bool=false: if true, more information is printed
eval_SC(Chat_train, cue_obj_train.C, 5000)
eval_SC(Chat_val, cue_obj_val.C, 5000)
eval_SC(Shat_train, S_train, 5000)
eval_SC(Shat_val, S_val, 5000)
JudiLing.eval_SC
— Method
eval_SC(SChat::AbstractArray, SC::AbstractArray, data::DataFrame, target_col::Union{String, Symbol}, batch_size::Int64)
Assess model accuracy on the basis of the correlations of row vectors of Chat and C or Shat and S. Ideally the target words have highest correlations on the diagonal of the pertinent correlation matrices. For large datasets, pass batch_size to process evaluation in chunks. Support homophones.
Currently only available for correlation.
Obligatory Arguments
SChat::AbstractArray: the Chat or Shat matrix
SC::AbstractArray: the C or S matrix
data::DataFrame: the dataset
target_col::Union{String, Symbol}: target column name
batch_size::Int64: batch size
Optional Arguments
digits: the specified number of digits after the decimal place (or before if negative)
verbose::Bool=false: if true, more information is printed
eval_SC(Chat_train, cue_obj_train.C, latin, :Word, 5000)
eval_SC(Chat_val, cue_obj_val.C, latin, :Word, 5000)
eval_SC(Shat_train, S_train, latin, :Word, 5000)
eval_SC(Shat_val, S_val, latin, :Word, 5000)
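The idea behind batch_size can be sketched as follows: instead of building the full correlation matrix at once, only a block of batch_size rows is correlated with the gold matrix at a time. `batched_accuracy` is a hypothetical helper for illustration (correlation only, no homophone handling) and makes no claim about how JudiLing chunks the computation internally.

```julia
using Statistics

# Hypothetical helper, not part of JudiLing: diagonal accuracy computed in
# row blocks so that at most batch_size rows of correlations exist at once.
function batched_accuracy(SChat, SC, batch_size::Int)
    n = size(SChat, 1)
    SCt = permutedims(SC)
    n_correct = 0
    for start in 1:batch_size:n
        rows = start:min(start + batch_size - 1, n)
        # Correlations between this block of predictions and all gold rows.
        R = cor(permutedims(SChat[rows, :]), SCt)
        for (local_i, i) in enumerate(rows)
            n_correct += argmax(R[local_i, :]) == i
        end
    end
    return n_correct / n
end

# Made-up data: the third prediction is closest to gold row 1.
S = [1.0 0.0 0.0 2.0;
     0.0 1.0 0.0 1.0;
     0.0 0.0 1.0 3.0]
Shat = [0.9 0.1 0.0 1.8;
        0.1 0.8 0.1 1.1;
        0.9 0.0 0.1 2.1]

acc_batched = batched_accuracy(Shat, S, 2)   # blocks of rows 1:2 and 3:3
acc_full = batched_accuracy(Shat, S, 3)      # one block with all rows
println((acc_batched, acc_full))
```

The block size changes only the peak memory use, not the result, because each row is still compared against every gold row.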
JudiLing.eval_SC_loose
— Method
eval_SC_loose(SChat, SC, k)
Assess model accuracy on the basis of the correlations (or Euclidean distances or cosine similarities) of row vectors of Chat and C or Shat and S. Ideally, the target words have the highest correlations (lowest distances/highest similarities) on the diagonal of the pertinent correlation (distance/similarity) matrices. A prediction counts as correct if one of the top k candidates is correct.
If there are homophones/homographs in the dataset, this evaluation method may be misleading: the predicted vector will be equally correlated with the target vectors of both words, and it is not guaranteed that the target on the diagonal will be among the k neighbours. In particular, eval_SC and eval_SC_loose with k=1 are not guaranteed to give the same result. In such cases, supplying the dataset and target_col is recommended, which makes it possible to take homophones/homographs into account.
Obligatory Arguments
SChat::Union{SparseMatrixCSC, Matrix}: the Chat or Shat matrix
SC::Union{SparseMatrixCSC, Matrix}: the C or S matrix
k: top k candidates
Optional Arguments
digits: the specified number of digits after the decimal place (or before if negative)
method::Union{Symbol, String}=:correlation: method for computing similarities, one of {:correlation, :euclidean, :cosine}
eval_SC_loose(Chat, cue_obj.C, k)
eval_SC_loose(Shat, S, k)
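The top-k criterion can be sketched with `partialsortperm` from Base, which returns the indices of the k largest correlations in a row. `loose_accuracy` is a hypothetical helper for illustration, restricted to the correlation method and without homophone handling.

```julia
using Statistics

# Hypothetical helper, not part of JudiLing: row i counts as correct when
# gold row i is among the k gold rows most correlated with predicted row i.
function loose_accuracy(SChat, SC, k::Int)
    R = cor(permutedims(SChat), permutedims(SC))
    hits = map(1:size(R, 1)) do i
        # Indices of the k highest correlations in row i.
        topk = partialsortperm(R[i, :], 1:k; rev = true)
        i in topk
    end
    return mean(hits)
end

# Made-up data: for word 3 the gold row is only the second-best match.
S = [1.0 0.0 0.0 2.0;
     0.0 1.0 0.0 1.0;
     0.0 0.0 1.0 3.0]
Shat = [0.9 0.1 0.0 1.8;
        0.1 0.8 0.1 1.1;
        0.9 0.0 0.1 2.1]

acc_k1 = loose_accuracy(Shat, S, 1)
acc_k2 = loose_accuracy(Shat, S, 2)
println((acc_k1, acc_k2))
```

Raising k from 1 to 2 rescues the third word in this toy example, since its gold row is the second-best match.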
JudiLing.eval_SC_loose
— Method
eval_SC_loose(SChat, SC, k, data, target_col)
Assess model accuracy on the basis of the correlations (or Euclidean distances or Cosine Similarities) of row vectors of Chat and C or Shat and S. Ideally the target words have highest correlations (lowest distance/highest similarity) on the diagonal of the pertinent correlation (distance/similarity) matrices. Count it as correct if one of the top k candidates is correct. Support for homophones.
Obligatory Arguments
SChat::Union{SparseMatrixCSC, Matrix}: the Chat or Shat matrix
SC::Union{SparseMatrixCSC, Matrix}: the C or S matrix
k: top k candidates
data: the dataset
target_col: target column name
Optional Arguments
digits: the specified number of digits after the decimal place (or before if negative)
method::Union{Symbol, String}=:correlation: method for computing similarities, one of {:correlation, :euclidean, :cosine}
eval_SC_loose(Chat, cue_obj.C, k, latin, :Word)
eval_SC_loose(Shat, S, k, latin, :Word)
JudiLing.eval_manual
— Method
eval_manual(res, data, i2f)
Create extensive reports for the outputs from build_paths and learn_paths.
JudiLing.eval_acc
— Method
eval_acc(res, gold_inds::Array)
Evaluate the accuracy of the results from learn_paths or build_paths.
Obligatory Arguments
res::Array: the results from learn_paths or build_paths
gold_inds::Array: the gold paths' indices
Optional Arguments
digits: the specified number of digits after the decimal place (or before if negative)
verbose::Bool=false: if true, more information is printed
Examples
# evaluation on training data
acc_train = JudiLing.eval_acc(
    res_train,
    cue_obj_train.gold_ind,
    verbose=false
)
# evaluation on validation data
acc_val = JudiLing.eval_acc(
    res_val,
    cue_obj_val.gold_ind,
    verbose=false
)
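Conceptually, this accuracy is the share of words whose predicted path equals the gold path. The sketch below makes a simplifying assumption for illustration: each word's result is reduced to the n-gram index sequence of its best candidate, stored as a plain vector. The real result objects returned by learn_paths and build_paths are richer than that, and the variable names here are hypothetical.

```julia
using Statistics

# Assumption for illustration: one index sequence per word for the best
# candidate, with an empty vector where no path was found.
top_paths = [[3, 7, 9], [2, 5], Int[]]
gold_inds = [[3, 7, 9], [2, 6], [4, 8]]

# A word is correct only if its whole index sequence matches the gold one.
correct = top_paths .== gold_inds
acc = mean(correct)
println(acc)
```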
JudiLing.eval_acc
— Method
eval_acc(res, cue_obj::Cue_Matrix_Struct)
Evaluate the accuracy of the results from learn_paths or build_paths.
Obligatory Arguments
res::Array: the results from learn_paths or build_paths
cue_obj::Cue_Matrix_Struct: the C matrix object
Optional Arguments
digits: the specified number of digits after the decimal place (or before if negative)
verbose::Bool=false: if true, more information is printed
Examples
acc = JudiLing.eval_acc(res, cue_obj)
JudiLing.eval_acc_loose
— Method
eval_acc_loose(res, gold_inds)
Lenient evaluation of the accuracy of the results from learn_paths or build_paths, counting a prediction as correct when the correlation of the predicted and gold standard semantic vectors is among the n top correlations, where n is equal to max_can in the learn_paths or build_paths function.
Obligatory Arguments
res::Array: the results from learn_paths or build_paths
gold_inds::Array: the gold paths' indices
Optional Arguments
digits: the specified number of digits after the decimal place (or before if negative)
verbose::Bool=false: if true, more information is printed
Examples
# evaluation on training data
acc_train_loose = JudiLing.eval_acc_loose(
    res_train,
    cue_obj_train.gold_ind,
    verbose=false
)
# evaluation on validation data
acc_val_loose = JudiLing.eval_acc_loose(
    res_val,
    cue_obj_val.gold_ind,
    verbose=false
)
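The difference between strict and lenient path evaluation can be sketched on toy data. The example assumes, for illustration only, that each word keeps a ranked list of candidate index sequences (at most max_can of them) and that the lenient score accepts a word when the gold sequence appears anywhere in that list; the variable names and data layout are hypothetical and simpler than JudiLing's result objects.

```julia
using Statistics

# Assumption for illustration: ranked candidate paths per word, best first.
candidates = [
    [[3, 7, 9], [3, 7, 8]],   # gold path ranked first
    [[2, 5], [2, 6]],         # gold path ranked second
    [[4, 1], [4, 2]],         # gold path not among the candidates
]
gold_inds = [[3, 7, 9], [2, 6], [4, 8]]

# Strict: only the best candidate may match the gold path.
strict = mean([first(c) == g for (c, g) in zip(candidates, gold_inds)])
# Lenient: any retained candidate may match the gold path.
loose = mean([g in c for (c, g) in zip(candidates, gold_inds)])
println((strict, loose))
```

The lenient score can never be lower than the strict one, because the best candidate is always part of the retained list.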
JudiLing.extract_gpi
— Function
extract_gpi(gpi, threshold=0.1, tolerance=(-1000.0))
Extract, using gold paths' information, how many n-grams for a gold path are below the threshold but above the tolerance.