ml

ml.FitAndPredict(path)

Splits the data and trains a KNN model to classify the article embeddings. Before the train-test split, the articles fetched from the news API are separated from the rest of the data. These articles are predicted after testing and saved to a separate file.

Parameters:

path (str) – the directory containing the data, such as the .ttl file and the article list
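
The flow described above (set aside the unlabelled articles, split the rest, fit, test, then predict) can be sketched as follows. This is a minimal illustration, not the module's actual code: the helper names, the toy 2-D vectors, the labels, and the choice of k are all assumptions, and it uses only the standard library so that it runs anywhere. The real function may rely on a machine-learning library instead, which this page does not state. Saving the predictions to a file is omitted.

```python
import math
import random
from collections import Counter


def split_unlabelled(rows):
    """Separate rows without a label (to predict later) from labelled rows."""
    labelled = [(vec, label) for vec, label in rows if label is not None]
    unlabelled = [vec for vec, label in rows if label is None]
    return labelled, unlabelled


def train_test_split(labelled, test_fraction=0.25, seed=0):
    """Shuffle the labelled rows and cut them into train and test lists."""
    shuffled = labelled[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, vec, k=3):
    """Return the majority label among the k training vectors nearest to vec."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], vec))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy 2-D "embeddings" (hypothetical data); None marks an article
# that still has to be classified.
rows = [
    ((0.0, 0.1), "sport"), ((0.1, 0.0), "sport"),
    ((0.2, 0.1), "sport"), ((0.1, 0.2), "sport"),
    ((1.0, 0.9), "politics"), ((0.9, 1.0), "politics"),
    ((1.1, 1.0), "politics"), ((1.0, 1.1), "politics"),
    ((0.05, 0.05), None), ((0.95, 1.0), None),
]

labelled, to_predict = split_unlabelled(rows)
train, test = train_test_split(labelled)

# Evaluate on the held-out test rows, then classify the set-aside articles.
accuracy = sum(knn_predict(train, v) == y for v, y in test) / len(test)
predictions = [knn_predict(train, v) for v in to_predict]
```

Setting the unlabelled articles aside before the split keeps them out of both the training and the test data, so the test accuracy is measured only on rows with known labels.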

ml.get_embeddings(ttl_path, entities)

Uses the RDF2Vec algorithm to produce the embedding vectors. It first loads the knowledge graph (KG), then creates the embeddings with a random walker from the pyRDF2Vec package.

Parameters:
  • ttl_path (str) – the directory of the Turtle (.ttl) file

  • entities (list[str]) – the URIs of the entities to create embeddings for

Returns:

a tuple containing the extracted embeddings and, if defined, the literals for the entities

Return type:

tuple
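
The steps described for this function (load the KG, walk it with a random walker, fit the embeddings) might look roughly like the sketch below. It assumes the pyRDF2Vec 0.2.x API; the function name `embed_entities`, the walk depth, the number of walks, and the epoch count are illustrative choices, not values taken from this module. It also assumes `ttl_path` points at the Turtle file itself.

```python
def embed_entities(ttl_path, entities):
    """Sketch: walk a local Turtle KG and embed the given entity URIs."""
    # Imported lazily so the sketch can be loaded without pyRDF2Vec installed.
    from pyrdf2vec import RDF2VecTransformer
    from pyrdf2vec.embedders import Word2Vec
    from pyrdf2vec.graphs import KG
    from pyrdf2vec.walkers import RandomWalker

    # Load the knowledge graph from the local file (assumed Turtle format).
    kg = KG(ttl_path, fmt="turtle")

    # Random walks of depth 4, at most 10 walks per entity (assumed values).
    transformer = RDF2VecTransformer(
        Word2Vec(epochs=10),
        walkers=[RandomWalker(4, 10)],
    )

    # fit_transform returns the embeddings together with the literals.
    embeddings, literals = transformer.fit_transform(kg, entities)
    return embeddings, literals
```

The literals are only populated when the KG is configured with predicate chains to extract, which would match the "if defined" wording in the return description.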

ml.read_data(path)

Loads the .tsv file (tab-separated values, like CSV but with tabs instead of commas) that contains the URI-label pairs.

Parameters:

path (str) – The directory where the tsv is stored

Returns:

a tuple containing the list of URIs and the list of labels

Return type:

tuple
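
As a rough illustration of reading URI-label pairs from tab-separated data, the sketch below uses the standard `csv` module. The function name `read_uri_label_pairs`, the column order (URI first, label second), the absence of a header row, and the example URIs are assumptions; the actual file layout is not described here.

```python
import csv
import io


def read_uri_label_pairs(handle):
    """Return (uris, labels) from a tab-separated stream of URI/label rows."""
    uris, labels = [], []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:  # skip blank or malformed lines
            continue
        uris.append(row[0])
        labels.append(row[1])
    return uris, labels


# In-memory stand-in for the .tsv file; with a real file, pass
# open(tsv_file, newline="", encoding="utf-8") instead.
sample = io.StringIO(
    "http://example.org/article/1\tsport\n"
    "http://example.org/article/2\tpolitics\n"
)
uris, labels = read_uri_label_pairs(sample)
```

Returning two parallel lists keeps the URIs ready to pass as the `entities` argument of `ml.get_embeddings`, with the labels in matching order.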