The goal of this visualization is to provide interactive browsing of association rules within named entities of an RDF graph through three synchronized and complementary visualization techniques:

**Overview of Rules:** We use a scatter plot, slightly modified so that the user can see how many rules correspond to each <confidence, interestingness> pair. The chart's x-axis represents interestingness values, while the y-axis represents confidence values. Each pair of values that occurs is drawn as a diamond symbol, whose color encodes the measures of interest, whose texture encodes symmetry, and whose size encodes the number of rules for that <confidence, interestingness> pair.

**Circular Paginated View of Subsets:** We use a chord diagram to provide a clear and simultaneous representation of relationships between items and their measures of interest. The arcs correspond to single items and the ribbons represent the rules; ribbon color encodes the values of confidence and interestingness, and ribbon texture encodes symmetry. Each ribbon carries arrowheads at its extremities to indicate the rule's direction: the extremity with the arrowhead marks the item as a consequent. The order of arcs around the circumference can be changed by sorting the items alphabetically or by the number of association rules involving each item.

**Exploratory Graph View of Items:** The association graph is meant to give an intuitive portrayal of the antecedent and consequent items involved in rules by representing them as nodes placed on the left and right sides of the rules. Two vertical stacks of labeled rectangles at the left and right extremities of the window represent the items, and diamond-shaped nodes at the center of the visualization space represent the association rules between them.
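
The size encoding in the overview scatter plot relies on counting how many rules share the same <confidence, interestingness> pair. A minimal Python sketch of that aggregation follows; the function name `count_rules_per_pair` and the rounding to two decimals are assumptions made for illustration, not details taken from the tool.

```python
from collections import Counter
from typing import Iterable, Tuple


def count_rules_per_pair(
    pairs: Iterable[Tuple[float, float]], ndigits: int = 2
) -> Counter:
    """Count rules per (confidence, interestingness) pair.

    Values are rounded to `ndigits` decimals (an assumed binning choice)
    so that rules with nearly identical measures share one symbol.
    """
    return Counter(
        (round(conf, ndigits), round(inter, ndigits)) for conf, inter in pairs
    )


# Each tuple is the (confidence, interestingness) of one hypothetical rule.
example = [(0.8, 0.4), (0.8, 0.4), (0.9, 0.31)]
counts = count_rules_per_pair(example)
# counts[(0.8, 0.4)] is the value that would drive the size of that diamond.
```

Each key of the resulting counter corresponds to one diamond in the plot, and its count would be mapped to the symbol size.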

To extract the association rules, we used the CORD-19 Named Entities Knowledge Graph, which describes the named entities embedded in the publications and links them to the DBpedia, Wikidata, and BioPortal datasets. At the moment, we use only the named entities linked to the Wikidata dataset. Furthermore, we only consider publications from between 1990 and 2020.

We use the

- Antecedents: either a named entity or a pair of named entities.
- Consequents: either a named entity or a pair of named entities, which follow from the presence of the antecedents in the publication.
- Support: the probability of finding the named entities X and Y together in a transaction. It is estimated as the number of transactions in which both X and Y appear, divided by the total number of transactions. The resulting value is between 0 and 1.
  - Supp(X → Y) = P(X ∩ Y)
- Confidence: the probability of finding the named entity Y in a transaction, given that the named entity X is in the same transaction. It is estimated by the corresponding observed frequency (the number of transactions in which X and Y appear together divided by the number of transactions in which X appears). The resulting value is between 0 and 1.
  - Conf(X → Y) = P(Y | X) = P(X ∩ Y) / P(X) = Supp(X → Y) / Supp(X)
- Interestingness: the serendipity of the rule, which serves to penalize rules or named entities that appear with high frequency in the database.
  - Interestingness(X → Y) = (Supp(X → Y) / Supp(X)) × (Supp(X → Y) / Supp(Y)) × (1 - (Supp(X → Y) / Tot. No. of transactions))
- isSymmetric: whether the rule also holds in the opposite direction, i.e. whether there is another rule in which the antecedent and the consequent are swapped.
- Cluster: the cluster to which the rule belongs. Clusters are generated automatically and have no assigned semantic meaning.
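
The three measures above can be sketched in a few lines of Python. This is an illustrative sketch, not the code used to mine the rules: the function names and the representation of a transaction as a set of entity labels are assumptions. The last factor of the interestingness is applied literally as written above, with support taken as a fraction between 0 and 1.

```python
from typing import Iterable, List, Set


def support(itemset: Iterable[str], transactions: List[Set[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str], transactions: List[Set[str]]) -> float:
    """Conf(X -> Y) = Supp(X -> Y) / Supp(X). Assumes X occurs at least once."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def interestingness(x: Iterable[str], y: Iterable[str], transactions: List[Set[str]]) -> float:
    """Interestingness(X -> Y), following the formula given in the list above.

    Assumption: Supp is the fraction defined in `support`, and the last
    factor divides it by the total number of transactions, as stated.
    Assumes X and Y each occur at least once.
    """
    supp_xy = support(set(x) | set(y), transactions)
    supp_x = support(x, transactions)
    supp_y = support(y, transactions)
    total = len(transactions)
    return (supp_xy / supp_x) * (supp_xy / supp_y) * (1 - supp_xy / total)


# Hypothetical transactions: each set holds the named entities of one publication.
transactions = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"c"}]
supp_ab = support({"a", "b"}, transactions)         # 2 of 4 transactions
conf_ab = confidence({"a"}, {"b"}, transactions)    # Supp(a, b) / Supp(a)
int_ab = interestingness({"a"}, {"b"}, transactions)
```

In this toy example, "a" and "b" co-occur in two of four transactions, so the support of the rule a → b is 0.5 and its confidence is 0.5 / 0.75.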

- the *confidence*, which defines a confidence threshold (≥ 0.7) that determines whether the rule is kept.
- the *interestingness*, which defines an interestingness threshold (≥ 0.3) that determines whether the rule is kept.
- the *redundancy*, which removes every rule that meets the following definition of redundancy: A,B,C → D is redundant if Conf(A,B → D) ≥ Conf(A,B,C → D).
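
The three filters can be sketched as follows. This is a minimal illustration under stated assumptions: the `Rule` class and function names are hypothetical, the redundancy test is generalized to any rule with the same consequent and a strict subset of the antecedent, and redundancy is checked against all candidate rules rather than only those that pass the thresholds.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Rule:
    """Hypothetical container for one association rule and its measures."""
    antecedent: FrozenSet[str]
    consequent: FrozenSet[str]
    confidence: float
    interestingness: float


def is_redundant(rule: Rule, rules: List[Rule]) -> bool:
    """A,B,C -> D is redundant if Conf(A,B -> D) >= Conf(A,B,C -> D).

    Assumption: any simpler rule (strict subset of the antecedent, same
    consequent) with at least the same confidence makes `rule` redundant.
    """
    return any(
        other.consequent == rule.consequent
        and other.antecedent < rule.antecedent  # strict subset
        and other.confidence >= rule.confidence
        for other in rules
    )


def filter_rules(
    rules: List[Rule], min_confidence: float = 0.7, min_interestingness: float = 0.3
) -> List[Rule]:
    """Keep rules that pass both thresholds and are not redundant."""
    return [
        r
        for r in rules
        if r.confidence >= min_confidence
        and r.interestingness >= min_interestingness
        and not is_redundant(r, rules)
    ]


# Hypothetical candidate rules.
candidates = [
    Rule(frozenset({"A", "B"}), frozenset({"D"}), 0.9, 0.5),
    Rule(frozenset({"A", "B", "C"}), frozenset({"D"}), 0.8, 0.6),  # redundant
    Rule(frozenset({"E"}), frozenset({"F"}), 0.6, 0.9),            # low confidence
    Rule(frozenset({"G"}), frozenset({"H"}), 0.95, 0.2),           # low interestingness
]
kept = filter_rules(candidates)
```

With these candidates, only A,B → D survives: the longer rule A,B,C → D has lower confidence than its simpler counterpart, and the other two fall below one of the thresholds.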

- Clustering of Publications: 963 rules
- Clustering of Named Entities: 116 rules
- Clustering of Publications and Named Entities: 432 rules
- No Clustering: 261 rules

- Aline Menin, Lucie Cadorel, Andrea G. B. Tettamanzi, Alain Giboin, Fabien Gandon, Marco Winckler. ARViz: Interactive Visualization of Association Rules for RDF Data Exploration. IV 2021 - 25th International Conference Information Visualisation, July 2021, Sydney, Australia. (To appear)
- Lucie Cadorel, Andrea G. B. Tettamanzi. Mining RDF Data of COVID-19 Scientific Literature for Interesting Association Rules. WI-IAT'20 - IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, Dec 2020, Melbourne, Australia.

- Aline Menin, Postdoctoral Researcher, Univ. Côte d'Azur, Inria, CNRS, Laboratory I3S
- Lucie Cadorel, PhD Student, Univ. Côte d'Azur, Inria, CNRS, Laboratory I3S
- Andrea G. B. Tettamanzi, Professor, Univ. Côte d'Azur, Inria, CNRS, Laboratory I3S
- Alain Giboin, Researcher, Univ. Côte d'Azur, Inria, CNRS, Laboratory I3S
- Fabien Gandon, Researcher, Univ. Côte d'Azur, Inria, CNRS, Laboratory I3S
- Marco Winckler, Professor, Univ. Côte d'Azur, Inria, CNRS, Laboratory I3S