1 Introduction

In this vignette we explain in more detail how to perform sharing analyses with ISAnalytics and its dedicated sharing functions.

2 Installation and options

ISAnalytics can be installed quickly in different ways:

  • You can install it via Bioconductor
  • You can install it via GitHub using the package devtools

There are always 2 versions of the package active:

  • RELEASE is the latest stable version
  • DEVEL is the development version: the most up-to-date version, where all new features are introduced

2.1 Installation from Bioconductor

RELEASE version:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("ISAnalytics")

DEVEL version:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

# The following initializes usage of Bioc devel
BiocManager::install(version='devel')

BiocManager::install("ISAnalytics")

2.2 Installation from GitHub

RELEASE:

if (!require(devtools)) {
  install.packages("devtools")
}
devtools::install_github("calabrialab/ISAnalytics",
                         ref = "RELEASE_3_15",
                         dependencies = TRUE,
                         build_vignettes = TRUE)

DEVEL:

if (!require(devtools)) {
  install.packages("devtools")
}
devtools::install_github("calabrialab/ISAnalytics",
                         ref = "master",
                         dependencies = TRUE,
                         build_vignettes = TRUE)
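
After installing, it can be useful to confirm which version ended up in the library. The following is a minimal sketch using only base R utilities (requireNamespace() and packageVersion()); the guard is there so the snippet does not fail when the package is absent:

```r
# Report the installed ISAnalytics version, if any
pkg <- "ISAnalytics"
if (requireNamespace(pkg, quietly = TRUE)) {
  installed_version <- as.character(utils::packageVersion(pkg))
  message(pkg, " version: ", installed_version)
} else {
  installed_version <- NA_character_
  message(pkg, " is not installed")
}
```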

2.3 Setting options

ISAnalytics has a verbose option that allows some functions to print additional information to the console while they’re executing. To disable or re-enable this feature:

# DISABLE
options("ISAnalytics.verbose" = FALSE)

# ENABLE
options("ISAnalytics.verbose" = TRUE)

Some functions also produce reports in a user-friendly HTML format. To disable or enable this feature:

# DISABLE HTML REPORTS
options("ISAnalytics.reports" = FALSE)

# ENABLE HTML REPORTS
options("ISAnalytics.reports" = TRUE)
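
If the options only need to change for part of a session, one possible pattern is to save the previous values and restore them afterwards. This sketch relies only on base R: options() returns the old values of whatever it sets, and getOption() reads a single option. The option names are the ones shown above; the workflow around them is an assumption about how you might use them:

```r
# options() returns the previous values, so they can be restored later
old <- options(ISAnalytics.verbose = FALSE, ISAnalytics.reports = FALSE)

# Inspect the current settings
verbose_now <- getOption("ISAnalytics.verbose")
reports_now <- getOption("ISAnalytics.reports")

# ... call the functions that should run quietly here ...

# Restore whatever was set before
options(old)
```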

3 Shared integration sites

An integration site is always characterized by a triple of values: (chr, integration_locus, strand), hence these attributes are always present in integration matrices.

library(ISAnalytics)
#> Loading required package: magrittr
data("integration_matrices")
data("association_file")
#>    chr integration_locus strand     GeneName GeneStrand
#> 1:  16          68164148      +       NFATC3          +
#> 2:   4         129390130      + LOC100507487          +
#> 3:   5          84009671      -        EDIL3          -
#> 4:  12          54635693      -         CBX5          -
#> 5:   5          84009671      -        EDIL3          -
#> 6:  12          54635693      -         CBX5          -
#>                                                 CompleteAmplificationID
#> 1: PJ01_POOL01_LTR75LC38_PT001_PT001-103_lenti_GLOBE_PB_1_SLiM_0060_MNC
#> 2:  PJ01_POOL01_LTR53LC32_PT001_PT001-81_lenti_GLOBE_BM_1_SLiM_0180_MNC
#> 3:  PJ01_POOL01_LTR53LC32_PT001_PT001-81_lenti_GLOBE_BM_1_SLiM_0180_MNC
#> 4:  PJ01_POOL01_LTR83LC66_PT001_PT001-81_lenti_GLOBE_BM_1_SLiM_0180_MNC
#> 5:  PJ01_POOL01_LTR83LC66_PT001_PT001-81_lenti_GLOBE_BM_1_SLiM_0180_MNC
#> 6:  PJ01_POOL01_LTR27LC94_PT001_PT001-81_lenti_GLOBE_BM_1_SLiM_0180_MNC
#>    seqCount fragmentEstimate
#> 1:      182        102.94572
#> 2:    23219         68.73747
#> 3:    20205         67.12349
#> 4:    13269         65.15760
#> 5:    14748         61.46981
#> 6:    12588         60.84781
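
Because the triple acts as a key, counting distinct integration sites amounts to counting unique combinations of the three columns. Here is a minimal base R sketch on a toy table: the coordinates are borrowed from the rows printed above, while the sample IDs and the reduced set of columns are made up for illustration:

```r
# Toy matrix: the site (5, 84009671, -) appears in two samples
toy_matrix <- data.frame(
  chr = c("16", "5", "5"),
  integration_locus = c(68164148, 84009671, 84009671),
  strand = c("+", "-", "-"),
  CompleteAmplificationID = c("sample_A", "sample_A", "sample_B"),
  stringsAsFactors = FALSE
)

is_key <- c("chr", "integration_locus", "strand")

# Distinct integration sites: unique rows over the three key columns
distinct_sites <- unique(toy_matrix[, is_key])
nrow(distinct_sites)
```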

We can aggregate our data in different ways according to our needs, thereby obtaining different groups (to know more about this topic, see vignette("workflow_start", package = "ISAnalytics")). Each group has an associated set of integration sites.

## Aggregation by standard key
agg <- aggregate_values_by_key(integration_matrices,
                               association_file,
                               value_cols = c("seqCount", "fragmentEstimate"))
agg <- agg %>% dplyr::filter(TimePoint %in% c("0030", "0060"))
#> # A tibble: 419 × 11
#>    chr   integration_locus strand GeneName GeneStrand SubjectID CellMarker
#>    <chr>             <dbl> <chr>  <chr>    <chr>      <chr>     <chr>     
#>  1 1               8464757 -      RERE     -          PT001     MNC       
#>  2 1               8464757 -      RERE     -          PT001     MNC       
#>  3 1               8607357 +      RERE     -          PT001     MNC       
#>  4 1              11339120 +      UBIAD1   +          PT001     MNC       
#>  5 1              11339120 +      UBIAD1   +          PT001     MNC       
#>  6 1              16186297 -      SPEN     +          PT001     MNC       
#>  7 1              16186297 -      SPEN     +          PT001     MNC       
#>  8 1              16602483 +      FBXO42   -          PT001     MNC       
#>  9 1              25337264 -      MIR4425  +          PT002     MNC       
#> 10 1              25337264 -      MIR4425  +          PT002     MNC       
#>    Tissue TimePoint seqCount_sum fragmentEstimate_sum
#>    <chr>  <chr>            <dbl>                <dbl>
#>  1 BM     0030               542                 3.01
#>  2 BM     0060                 1                 1.00
#>  3 BM     0060                 1                 1.00
#>  4 BM     0060              1605                 8.03
#>  5 PB     0060                 1                 1.00
#>  6 BM     0030                 1                 1.00
#>  7 PB     0060                 1                 1.00
#>  8 BM     0060              2947                 9.04
#>  9 BM     0030                23                 9.14
#> 10 PB     0060                36                 7.07
#> # … with 409 more rows
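
Conceptually, aggregating by key means attaching the grouping attributes to each sample and then summing the quantifications per site and per group. The sketch below reproduces that idea with base R merge() and aggregate() on toy data; it is not the implementation used by aggregate_values_by_key(), and all values and sample IDs are hypothetical:

```r
# Toy matrix: samples s1 and s2 both contain the site (1, 100, -)
toy_matrix <- data.frame(
  chr = c("1", "1", "1"),
  integration_locus = c(100, 100, 200),
  strand = c("-", "-", "+"),
  CompleteAmplificationID = c("s1", "s2", "s1"),
  seqCount = c(10, 5, 7),
  stringsAsFactors = FALSE
)
# Toy association table: both samples belong to the same group
toy_af <- data.frame(
  CompleteAmplificationID = c("s1", "s2"),
  SubjectID = c("PT001", "PT001"),
  Tissue = c("BM", "BM"),
  stringsAsFactors = FALSE
)

# Attach the metadata, then sum seqCount per site and per group
joined <- merge(toy_matrix, toy_af, by = "CompleteAmplificationID")
toy_agg <- aggregate(
  seqCount ~ chr + integration_locus + strand + SubjectID + Tissue,
  data = joined,
  FUN = sum
)
toy_agg
```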

An integration site is shared between two or more groups if the same triple is observed in all the groups considered.
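
This definition can be illustrated by hand with two toy groups: an inner join on the three key columns keeps exactly the sites observed in both. The data below is hypothetical and the sketch uses base R merge() rather than any ISAnalytics function:

```r
# Two toy groups, each a table of distinct integration sites
group_a <- data.frame(
  chr = c("1", "1", "5"),
  integration_locus = c(100, 200, 300),
  strand = c("+", "-", "-"),
  stringsAsFactors = FALSE
)
group_b <- data.frame(
  chr = c("1", "5", "5"),
  integration_locus = c(200, 300, 300),
  strand = c("-", "-", "+"),
  stringsAsFactors = FALSE
)

# merge() on all three columns keeps only the triples present in both groups.
# Note that (5, 300, +) does not match (5, 300, -): strand is part of the key.
shared_sites <- merge(group_a, group_b,
                      by = c("chr", "integration_locus", "strand"))
nrow(shared_sites)
```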

4 Automated sharing counts

ISAnalytics provides the function is_sharing() for computing automated sharing counts. The function has several arguments that can be tuned according to user needs.

4.1 SCENARIO 1: single input data frame and single grouping key

sharing_1 <- is_sharing(agg, 
                        group_key = c("SubjectID", "CellMarker", 
                                      "Tissue", "TimePoint"),
                        n_comp = 2,
                        is_count = TRUE,
                        relative_is_sharing = TRUE,
                        minimal = TRUE,
                        include_self_comp = FALSE, 
                        keep_genomic_coord = TRUE)
#> Calculating combinations...
#> Done!
sharing_1
#>                    g1                g2 shared           is_coord count_g1
#>  1: PT001_MNC_BM_0030 PT001_MNC_BM_0060     21 <data.table[21x3]>       54
#>  2: PT001_MNC_BM_0030 PT001_MNC_PB_0060      8  <data.table[8x3]>       54
#>  3: PT001_MNC_BM_0060 PT001_MNC_PB_0060     29 <data.table[29x3]>      114
#>  4: PT001_MNC_PB_0030 PT001_MNC_PB_0060     10 <data.table[10x3]>       28
#>  5: PT001_MNC_BM_0030 PT002_MNC_BM_0030      0  <data.table[0x3]>       54
#>  6: PT001_MNC_BM_0060 PT002_MNC_BM_0030      1  <data.table[1x3]>      114
#>  7: PT001_MNC_PB_0060 PT002_MNC_BM_0030      1  <data.table[1x3]>       59
#>  8: PT001_MNC_PB_0030 PT002_MNC_BM_0030      0  <data.table[0x3]>       28
#>  9: PT001_MNC_BM_0030 PT002_MNC_PB_0060      0  <data.table[0x3]>       54
#> 10: PT001_MNC_BM_0060 PT002_MNC_PB_0060      0  <data.table[0x3]>      114
#> 11: PT001_MNC_PB_0060 PT002_MNC_PB_0060      0  <data.table[0x3]>       59
#> 12: PT002_MNC_BM_0030 PT002_MNC_PB_0060      8  <data.table[8x3]>       98
#> 13: PT002_MNC_PB_0030 PT002_MNC_PB_0060      7  <data.table[7x3]>       15
#> 14: PT001_MNC_PB_0030 PT002_MNC_PB_0060      0  <data.table[0x3]>       28
#> 15: PT002_MNC_BM_0060 PT002_MNC_PB_0060      5  <data.table[5x3]>       33
#> 16: PT001_MNC_BM_0030 PT002_MNC_PB_0030      0  <data.table[0x3]>       54
#> 17: PT001_MNC_BM_0060 PT002_MNC_PB_0030      1  <data.table[1x3]>      114
#> 18: PT001_MNC_PB_0060 PT002_MNC_PB_0030      0  <data.table[0x3]>       59
#> 19: PT002_MNC_BM_0030 PT002_MNC_PB_0030      3  <data.table[3x3]>       98
#> 20: PT001_MNC_PB_0030 PT002_MNC_PB_0030      0  <data.table[0x3]>       28
#> 21: PT002_MNC_BM_0060 PT002_MNC_PB_0030      2  <data.table[2x3]>       33
#> 22: PT001_MNC_BM_0030 PT001_MNC_PB_0030      7  <data.table[7x3]>       54
#> 23: PT001_MNC_BM_0060 PT001_MNC_PB_0030      7  <data.table[7x3]>      114
#> 24: PT001_MNC_BM_0030 PT002_MNC_BM_0060      1  <data.table[1x3]>       54
#> 25: PT001_MNC_BM_0060 PT002_MNC_BM_0060      0  <data.table[0x3]>      114
#> 26: PT001_MNC_PB_0060 PT002_MNC_BM_0060      0  <data.table[0x3]>       59
#> 27: PT002_MNC_BM_0030 PT002_MNC_BM_0060      5  <data.table[5x3]>       98
#> 28: PT001_MNC_PB_0030 PT002_MNC_BM_0060      0  <data.table[0x3]>       28
#>                    g1                g2 shared           is_coord count_g1
#>     count_g2 count_union     on_g1     on_g2   on_union
#>  1:      114         147 38.888889 18.421053 14.2857143
#>  2:       59         105 14.814815 13.559322  7.6190476
#>  3:       59         144 25.438596 49.152542 20.1388889
#>  4:       59          77 35.714286 16.949153 12.9870130
#>  5:       98         152  0.000000  0.000000  0.0000000
#>  6:       98         211  0.877193  1.020408  0.4739336
#>  7:       98         156  1.694915  1.020408  0.6410256
#>  8:       98         126  0.000000  0.000000  0.0000000
#>  9:       18          72  0.000000  0.000000  0.0000000
#> 10:       18         132  0.000000  0.000000  0.0000000
#> 11:       18          77  0.000000  0.000000  0.0000000
#> 12:       18         108  8.163265 44.444444  7.4074074
#> 13:       18          26 46.666667 38.888889 26.9230769
#> 14:       18          46  0.000000  0.000000  0.0000000
#> 15:       18          46 15.151515 27.777778 10.8695652
#> 16:       15          69  0.000000  0.000000  0.0000000
#> 17:       15         128  0.877193  6.666667  0.7812500
#> 18:       15          74  0.000000  0.000000  0.0000000
#> 19:       15         110  3.061224 20.000000  2.7272727
#> 20:       15          43  0.000000  0.000000  0.0000000
#> 21:       15          46  6.060606 13.333333  4.3478261
#> 22:       28          75 12.962963 25.000000  9.3333333
#> 23:       28         135  6.140351 25.000000  5.1851852
#> 24:       33          86  1.851852  3.030303  1.1627907
#> 25:       33         147  0.000000  0.000000  0.0000000
#> 26:       33          92  0.000000  0.000000  0.0000000
#> 27:       33         126  5.102041 15.151515  3.9682540
#> 28:       33          61  0.000000  0.000000  0.0000000
#>     count_g2 count_union     on_g1     on_g2   on_union

In this configuration we set:

  • A single input data frame: agg
  • A single grouping key by setting the argument group_key. In this specific case, our groups will be identified by a unique combination of SubjectID, CellMarker, Tissue and TimePoint
  • n_comp is the number of groups compared at a time: 2 means we’re interested in knowing the sharing for pairs of distinct groups
  • We want to keep the counts of distinct integration sites for each group by setting is_count to TRUE
  • relative_is_sharing, if set to TRUE, adds sharing expressed as a percentage. More precisely, it adds a column on_g1, calculated as the absolute number of shared integrations divided by the cardinality of the first group; on_g2 is analogous but computed on the cardinality of the second group; finally, on_union is computed on the cardinality of the union of the two groups.
  • By setting the argument minimal to TRUE we tell the function to avoid redundant comparisons: in this way only combinations and not permutations are included in the output table
  • include_self_comp adds rows in the table where a group is compared with itself: these rows always show 100% sharing. There are a few scenarios where this is useful, but for now we set it to FALSE since we don’t need it
  • keep_genomic_coord allows us to keep the genomic coordinates of the shared integration sites as a separate table
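
As a worked check of the relative sharing columns, the values in the first row of sharing_1 can be recomputed by hand. The numbers below are copied from that row; the variable names are just for illustration:

```r
# Values copied from the first row of sharing_1
shared   <- 21
count_g1 <- 54
count_g2 <- 114

# Size of the union: shared sites are counted only once
count_union <- count_g1 + count_g2 - shared

# Relative sharing, expressed as percentages
on_g1    <- shared / count_g1 * 100
on_g2    <- shared / count_g2 * 100
on_union <- shared / count_union * 100

round(c(count_union = count_union, on_g1 = on_g1,
        on_g2 = on_g2, on_union = on_union), 4)
```

The results agree with the printed row: a union of 147 sites and percentages of roughly 38.89, 18.42 and 14.29.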

4.1.1 Changing the number of comparisons

sharing_1_a <- is_sharing(agg,
                          group_key = c("SubjectID", "CellMarker",
                                        "Tissue", "TimePoint"),
                          n_comp = 3,
                          is_count = TRUE,
                          relative_is_sharing = TRUE,
                          minimal = TRUE,
                          include_self_comp = FALSE,
                          keep_genomic_coord = TRUE)
#> Calculating combinations...
#> Done!
sharing_1_a
#>                    g1                g2                g3 shared
#>  1: PT001_MNC_BM_0030 PT001_MNC_BM_0060 PT001_MNC_PB_0060      6
#>  2: PT001_MNC_BM_0030 PT001_MNC_PB_0030 PT001_MNC_PB_0060      1
#>  3: PT001_MNC_BM_0060 PT001_MNC_PB_0030 PT001_MNC_PB_0060      2
#>  4: PT001_MNC_BM_0030 PT001_MNC_BM_0060 PT002_MNC_BM_0030      0
#>  5: PT001_MNC_BM_0030 PT001_MNC_PB_0060 PT002_MNC_BM_0030      0
#>  6: PT001_MNC_BM_0060 PT001_MNC_PB_0060 PT002_MNC_BM_0030      1
#>  7: PT001_MNC_PB_0030 PT001_MNC_PB_0060 PT002_MNC_BM_0030      0
#>  8: PT001_MNC_BM_0030 PT001_MNC_PB_0030 PT002_MNC_BM_0030      0
#>  9: PT001_MNC_BM_0060 PT001_MNC_PB_0030 PT002_MNC_BM_0030      0
#> 10: PT001_MNC_BM_0030 PT001_MNC_BM_0060 PT002_MNC_PB_0060      0
#> 11: PT001_MNC_BM_0030 PT001_MNC_PB_0060 PT002_MNC_PB_0060      0
#> 12: PT001_MNC_BM_0060 PT001_MNC_PB_0060 PT002_MNC_PB_0060      0
#> 13: PT001_MNC_PB_0030 PT001_MNC_PB_0060 PT002_MNC_PB_0060      0
#> 14: PT001_MNC_BM_0030 PT002_MNC_BM_0030 PT002_MNC_PB_0060      0
#> 15: PT001_MNC_BM_0060 PT002_MNC_BM_0030 PT002_MNC_PB_0060      0
#> 16: PT001_MNC_PB_0060 PT002_MNC_BM_0030 PT002_MNC_PB_0060      0
#> 17: PT001_MNC_PB_0030 PT002_MNC_BM_0030 PT002_MNC_PB_0060      0
#> 18: PT001_MNC_BM_0030 PT002_MNC_PB_0030 PT002_MNC_PB_0060      0
#> 19: PT001_MNC_BM_0060 PT002_MNC_PB_0030 PT002_MNC_PB_0060      0
#> 20: PT001_MNC_PB_0060 PT002_MNC_PB_0030 PT002_MNC_PB_0060      0
#> 21: PT002_MNC_BM_0030 PT002_MNC_PB_0030 PT002_MNC_PB_0060      1
#> 22: PT001_MNC_PB_0030 PT002_MNC_PB_0030 PT002_MNC_PB_0060      0
#> 23: PT002_MNC_BM_0060 PT002_MNC_PB_0030 PT002_MNC_PB_0060      1
#> 24: PT001_MNC_BM_0030 PT001_MNC_PB_0030 PT002_MNC_PB_0060      0
#> 25: PT001_MNC_BM_0060 PT001_MNC_PB_0030 PT002_MNC_PB_0060      0
#> 26: PT001_MNC_BM_0030 PT002_MNC_BM_0060 PT002_MNC_PB_0060      0
#> 27: PT001_MNC_BM_0060 PT002_MNC_BM_0060 PT002_MNC_PB_0060      0
#> 28: PT001_MNC_PB_0060 PT002_MNC_BM_0060 PT002_MNC_PB_0060      0
#> 29: PT002_MNC_BM_0030 PT002_MNC_BM_0060 PT002_MNC_PB_0060      1
#> 30: PT001_MNC_PB_0030 PT002_MNC_BM_0060 PT002_MNC_PB_0060      0
#> 31: PT001_MNC_BM_0030 PT001_MNC_BM_0060 PT002_MNC_PB_0030      0
#> 32: PT001_MNC_BM_0030 PT001_MNC_PB_0060 PT002_MNC_PB_0030      0
#> 33: PT001_MNC_BM_0060 PT001_MNC_PB_0060 PT002_MNC_PB_0030      0
#> 34: PT001_MNC_PB_0030 PT001_MNC_PB_0060 PT002_MNC_PB_0030      0
#> 35: PT001_MNC_BM_0030 PT002_MNC_BM_0030 PT002_MNC_PB_0030      0
#> 36: PT001_MNC_BM_0060 PT002_MNC_BM_0030 PT002_MNC_PB_0030      0
#> 37: PT001_MNC_PB_0060 PT002_MNC_BM_0030 PT002_MNC_PB_0030      0
#> 38: PT001_MNC_PB_0030 PT002_MNC_BM_0030 PT002_MNC_PB_0030      0
#> 39: PT001_MNC_BM_0030 PT001_MNC_PB_0030 PT002_MNC_PB_0030      0
#> 40: PT001_MNC_BM_0060 PT001_MNC_PB_0030 PT002_MNC_PB_0030      0
#> 41: PT001_MNC_BM_0030 PT002_MNC_BM_0060 PT002_MNC_PB_0030      0
#> 42: PT001_MNC_BM_0060 PT002_MNC_BM_0060 PT002_MNC_PB_0030      0
#> 43: PT001_MNC_PB_0060 PT002_MNC_BM_0060 PT002_MNC_PB_0030      0
#> 44: PT002_MNC_BM_0030 PT002_MNC_BM_0060 PT002_MNC_PB_0030      1
#> 45: PT001_MNC_PB_0030 PT002_MNC_BM_0060 PT002_MNC_PB_0030      0
#> 46: PT001_MNC_BM_0030 PT001_MNC_BM_0060 PT001_MNC_PB_0030      5
#> 47: PT001_MNC_BM_0030 PT001_MNC_BM_0060 PT002_MNC_BM_0060      0
#> 48: PT001_MNC_BM_0030 PT001_MNC_PB_0060 PT002_MNC_BM_0060      0
#> 49: PT001_MNC_BM_0060 PT001_MNC_PB_0060 PT002_MNC_BM_0060      0
#> 50: PT001_MNC_PB_0030 PT001_MNC_PB_0060 PT002_MNC_BM_0060      0
#> 51: PT001_MNC_BM_0030 PT002_MNC_BM_0030 PT002_MNC_BM_0060      0
#> 52: PT001_MNC_BM_0060 PT002_MNC_BM_0030 PT002_MNC_BM_0060      0
#> 53: PT001_MNC_PB_0060 PT002_MNC_BM_0030 PT002_MNC_BM_0060      0
#> 54: PT001_MNC_PB_0030 PT002_MNC_BM_0030 PT002_MNC_BM_0060      0
#> 55: PT001_MNC_BM_0030 PT001_MNC_PB_0030 PT002_MNC_BM_0060      0
#> 56: PT001_MNC_BM_0060 PT001_MNC_PB_0030 PT002_MNC_BM_0060      0
#>                    g1                g2                g3 shared
#>              is_coord count_g1 count_g2 count_g3 count_union     on_g1    on_g2
#>  1: <data.table[6x3]>       54      114       59         175 11.111111 5.263158
#>  2: <data.table[1x3]>       54       28       59         117  1.851852 3.571429
#>  3: <data.table[2x3]>      114       28       59         157  1.754386 7.142857
#>  4: <data.table[0x3]>       54      114       98         244  0.000000 0.000000
#>  5: <data.table[0x3]>       54       59       98         202  0.000000 0.000000
#>  6: <data.table[1x3]>      114       59       98         241  0.877193 1.694915
#>  7: <data.table[0x3]>       28       59       98         174  0.000000 0.000000
#>  8: <data.table[0x3]>       54       28       98         173  0.000000 0.000000
#>  9: <data.table[0x3]>      114       28       98         232  0.000000 0.000000
#> 10: <data.table[0x3]>       54      114       18         165  0.000000 0.000000
#> 11: <data.table[0x3]>       54       59       18         123  0.000000 0.000000
#> 12: <data.table[0x3]>      114       59       18         162  0.000000 0.000000
#> 13: <data.table[0x3]>       28       59       18          95  0.000000 0.000000
#> 14: <data.table[0x3]>       54       98       18         162  0.000000 0.000000
#> 15: <data.table[0x3]>      114       98       18         221  0.000000 0.000000
#> 16: <data.table[0x3]>       59       98       18         166  0.000000 0.000000
#> 17: <data.table[0x3]>       28       98       18         136  0.000000 0.000000
#> 18: <data.table[0x3]>       54       15       18          80  0.000000 0.000000
#> 19: <data.table[0x3]>      114       15       18         139  0.000000 0.000000
#> 20: <data.table[0x3]>       59       15       18          85  0.000000 0.000000
#> 21: <data.table[1x3]>       98       15       18         114  1.020408 6.666667
#> 22: <data.table[0x3]>       28       15       18          54  0.000000 0.000000
#> 23: <data.table[1x3]>       33       15       18          53  3.030303 6.666667
#> 24: <data.table[0x3]>       54       28       18          93  0.000000 0.000000
#> 25: <data.table[0x3]>      114       28       18         153  0.000000 0.000000
#> 26: <data.table[0x3]>       54       33       18          99  0.000000 0.000000
#> 27: <data.table[0x3]>      114       33       18         160  0.000000 0.000000
#> 28: <data.table[0x3]>       59       33       18         105  0.000000 0.000000
#> 29: <data.table[1x3]>       98       33       18         132  1.020408 3.030303
#> 30: <data.table[0x3]>       28       33       18          74  0.000000 0.000000
#> 31: <data.table[0x3]>       54      114       15         161  0.000000 0.000000
#> 32: <data.table[0x3]>       54       59       15         120  0.000000 0.000000
#> 33: <data.table[0x3]>      114       59       15         158  0.000000 0.000000
#> 34: <data.table[0x3]>       28       59       15          92  0.000000 0.000000
#> 35: <data.table[0x3]>       54       98       15         164  0.000000 0.000000
#> 36: <data.table[0x3]>      114       98       15         222  0.000000 0.000000
#> 37: <data.table[0x3]>       59       98       15         168  0.000000 0.000000
#> 38: <data.table[0x3]>       28       98       15         138  0.000000 0.000000
#> 39: <data.table[0x3]>       54       28       15          90  0.000000 0.000000
#> 40: <data.table[0x3]>      114       28       15         149  0.000000 0.000000
#> 41: <data.table[0x3]>       54       33       15          99  0.000000 0.000000
#> 42: <data.table[0x3]>      114       33       15         159  0.000000 0.000000
#> 43: <data.table[0x3]>       59       33       15         105  0.000000 0.000000
#> 44: <data.table[1x3]>       98       33       15         137  1.020408 3.030303
#> 45: <data.table[0x3]>       28       33       15          74  0.000000 0.000000
#> 46: <data.table[5x3]>       54      114       28         166  9.259259 4.385965
#> 47: <data.table[0x3]>       54      114       33         179  0.000000 0.000000
#> 48: <data.table[0x3]>       54       59       33         137  0.000000 0.000000
#> 49: <data.table[0x3]>      114       59       33         177  0.000000 0.000000
#> 50: <data.table[0x3]>       28       59       33         110  0.000000 0.000000
#> 51: <data.table[0x3]>       54       98       33         179  0.000000 0.000000
#> 52: <data.table[0x3]>      114       98       33         239  0.000000 0.000000
#> 53: <data.table[0x3]>       59       98       33         184  0.000000 0.000000
#> 54: <data.table[0x3]>       28       98       33         154  0.000000 0.000000
#> 55: <data.table[0x3]>       54       28       33         107  0.000000 0.000000
#> 56: <data.table[0x3]>      114       28       33         168  0.000000 0.000000
#>              is_coord count_g1 count_g2 count_g3 count_union     on_g1    on_g2
#>         on_g3  on_union
#>  1: 10.169492 3.4285714
#>  2:  1.694915 0.8547009
#>  3:  3.389831 1.2738854
#>  4:  0.000000 0.0000000
#>  5:  0.000000 0.0000000
#>  6:  1.020408 0.4149378
#>  7:  0.000000 0.0000000
#>  8:  0.000000 0.0000000
#>  9:  0.000000 0.0000000
#> 10:  0.000000 0.0000000
#> 11:  0.000000 0.0000000
#> 12:  0.000000 0.0000000
#> 13:  0.000000 0.0000000
#> 14:  0.000000 0.0000000
#> 15:  0.000000 0.0000000
#> 16:  0.000000 0.0000000
#> 17:  0.000000 0.0000000
#> 18:  0.000000 0.0000000
#> 19:  0.000000 0.0000000
#> 20:  0.000000 0.0000000
#> 21:  5.555556 0.8771930
#> 22:  0.000000 0.0000000
#> 23:  5.555556 1.8867925
#> 24:  0.000000 0.0000000
#> 25:  0.000000 0.0000000
#> 26:  0.000000 0.0000000
#> 27:  0.000000 0.0000000
#> 28:  0.000000 0.0000000
#> 29:  5.555556 0.7575758
#> 30:  0.000000 0.0000000
#> 31:  0.000000 0.0000000
#> 32:  0.000000 0.0000000
#> 33:  0.000000 0.0000000
#> 34:  0.000000 0.0000000
#> 35:  0.000000 0.0000000
#> 36:  0.000000 0.0000000
#> 37:  0.000000 0.0000000
#> 38:  0.000000 0.0000000
#> 39:  0.000000 0.0000000
#> 40:  0.000000 0.0000000
#> 41:  0.000000 0.0000000
#> 42:  0.000000 0.0000000
#> 43:  0.000000 0.0000000
#> 44:  6.666667 0.7299270
#> 45:  0.000000 0.0000000
#> 46: 17.857143 3.0120482
#> 47:  0.000000 0.0000000
#> 48:  0.000000 0.0000000
#> 49:  0.000000 0.0000000
#> 50:  0.000000 0.0000000
#> 51:  0.000000 0.0000000
#> 52:  0.000000 0.0000000
#> 53:  0.000000 0.0000000
#> 54:  0.000000 0.0000000
#> 55:  0.000000 0.0000000
#> 56:  0.000000 0.0000000
#>         on_g3  on_union

Setting n_comp to 3 means that we want to calculate the sharing between 3 different groups. Note that the shared column contains the counts of integrations that are shared by ALL groups, which is equivalent to a set intersection.
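
The set-intersection reading extends to any number of groups. A small sketch in base R, where each site is encoded as a hypothetical "chr_locus_strand" string and the three groups are toy data:

```r
# Each group is a vector of site keys of the form "chr_locus_strand"
groups <- list(
  g1 = c("1_100_+", "1_200_-", "5_300_-"),
  g2 = c("1_200_-", "5_300_-", "7_400_+"),
  g3 = c("5_300_-", "7_400_+")
)

# Sites observed in ALL groups: fold intersect() over the list
shared_all <- Reduce(intersect, groups)
# Sites observed in AT LEAST one group
union_all <- Reduce(union, groups)

length(shared_all)
length(union_all)
# Sharing relative to the union, as a percentage
length(shared_all) / length(union_all) * 100
```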

Keep in mind that the more comparisons are requested, the longer the computation takes.
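
To get a feel for how quickly the workload grows, the number of combinations can be estimated with choose(). The sketch assumes 8 distinct groups, which is the number of distinct group labels appearing in the outputs above:

```r
n_groups <- 8  # distinct groups in agg, counted from the outputs above

pairs   <- choose(n_groups, 2)  # combinations of 2 groups
triples <- choose(n_groups, 3)  # combinations of 3 groups
quads   <- choose(n_groups, 4)  # combinations of 4 groups

c(pairs = pairs, triples = triples, quads = quads)
```

The first two values, 28 and 56, match the number of rows in sharing_1 and sharing_1_a.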

4.1.2 A case when it is useful to set minimal = FALSE

Setting minimal = FALSE produces all possible permutations of the groups and the corresponding values. In combination with include_self_comp = TRUE, this is useful when we want to know the sharing between pairs of groups and plot results as a heatmap.

sharing_1_b <- is_sharing(agg,
                          group_key = c("SubjectID", "CellMarker",
                                        "Tissue", "TimePoint"),
                          n_comp = 2,
                          is_count = TRUE,
                          relative_is_sharing = TRUE,
                          minimal = FALSE,
                          include_self_comp = TRUE)
#> Calculating combinations...
#> Calculating self groups (requested)...
#> Calculating permutations (requested)...
#> Done!
sharing_1_b
#>                    g1                g2 shared count_g1 count_g2 count_union
#>  1: PT001_MNC_BM_0030 PT001_MNC_BM_0030     54       54       54          54
#>  2: PT001_MNC_BM_0030 PT001_MNC_BM_0060     21       54      114         147
#>  3: PT001_MNC_BM_0060 PT001_MNC_BM_0030     21      114       54         147
#>  4: PT001_MNC_BM_0060 PT001_MNC_BM_0060    114      114      114         114
#>  5: PT001_MNC_BM_0030 PT001_MNC_PB_0060      8       54       59         105
#>  6: PT001_MNC_PB_0060 PT001_MNC_BM_0030      8       59       54         105
#>  7: PT001_MNC_BM_0060 PT001_MNC_PB_0060     29      114       59         144
#>  8: PT001_MNC_PB_0060 PT001_MNC_BM_0060     29       59      114         144
#>  9: PT001_MNC_PB_0060 PT001_MNC_PB_0060     59       59       59          59
#> 10: PT001_MNC_PB_0030 PT001_MNC_PB_0060     10       28       59          77
#> 11: PT001_MNC_PB_0060 PT001_MNC_PB_0030     10       59       28          77
#> 12: PT001_MNC_BM_0030 PT002_MNC_BM_0030      0       54       98         152
#> 13: PT002_MNC_BM_0030 PT001_MNC_BM_0030      0       98       54         152
#> 14: PT001_MNC_BM_0060 PT002_MNC_BM_0030      1      114       98         211
#> 15: PT002_MNC_BM_0030 PT001_MNC_BM_0060      1       98      114         211
#> 16: PT001_MNC_PB_0060 PT002_MNC_BM_0030      1       59       98         156
#> 17: PT002_MNC_BM_0030 PT001_MNC_PB_0060      1       98       59         156
#> 18: PT002_MNC_BM_0030 PT002_MNC_BM_0030     98       98       98          98
#> 19: PT001_MNC_PB_0030 PT002_MNC_BM_0030      0       28       98         126
#> 20: PT002_MNC_BM_0030 PT001_MNC_PB_0030      0       98       28         126
#> 21: PT001_MNC_BM_0030 PT002_MNC_PB_0060      0       54       18          72
#> 22: PT002_MNC_PB_0060 PT001_MNC_BM_0030      0       18       54          72
#> 23: PT001_MNC_BM_0060 PT002_MNC_PB_0060      0      114       18         132
#> 24: PT002_MNC_PB_0060 PT001_MNC_BM_0060      0       18      114         132
#> 25: PT001_MNC_PB_0060 PT002_MNC_PB_0060      0       59       18          77
#> 26: PT002_MNC_PB_0060 PT001_MNC_PB_0060      0       18       59          77
#> 27: PT002_MNC_BM_0030 PT002_MNC_PB_0060      8       98       18         108
#> 28: PT002_MNC_PB_0060 PT002_MNC_BM_0030      8       18       98         108
#> 29: PT002_MNC_PB_0060 PT002_MNC_PB_0060     18       18       18          18
#> 30: PT002_MNC_PB_0030 PT002_MNC_PB_0060      7       15       18          26
#> 31: PT002_MNC_PB_0060 PT002_MNC_PB_0030      7       18       15          26
#> 32: PT001_MNC_PB_0030 PT002_MNC_PB_0060      0       28       18          46
#> 33: PT002_MNC_PB_0060 PT001_MNC_PB_0030      0       18       28          46
#> 34: PT002_MNC_BM_0060 PT002_MNC_PB_0060      5       33       18          46
#> 35: PT002_MNC_PB_0060 PT002_MNC_BM_0060      5       18       33          46
#> 36: PT001_MNC_BM_0030 PT002_MNC_PB_0030      0       54       15          69
#> 37: PT002_MNC_PB_0030 PT001_MNC_BM_0030      0       15       54          69
#> 38: PT001_MNC_BM_0060 PT002_MNC_PB_0030      1      114       15         128
#> 39: PT002_MNC_PB_0030 PT001_MNC_BM_0060      1       15      114         128
#> 40: PT001_MNC_PB_0060 PT002_MNC_PB_0030      0       59       15          74
#> 41: PT002_MNC_PB_0030 PT001_MNC_PB_0060      0       15       59          74
#> 42: PT002_MNC_BM_0030 PT002_MNC_PB_0030      3       98       15         110
#> 43: PT002_MNC_PB_0030 PT002_MNC_BM_0030      3       15       98         110
#> 44: PT002_MNC_PB_0030 PT002_MNC_PB_0030     15       15       15          15
#> 45: PT001_MNC_PB_0030 PT002_MNC_PB_0030      0       28       15          43
#> 46: PT002_MNC_PB_0030 PT001_MNC_PB_0030      0       15       28          43
#> 47: PT002_MNC_BM_0060 PT002_MNC_PB_0030      2       33       15          46
#> 48: PT002_MNC_PB_0030 PT002_MNC_BM_0060      2       15       33          46
#> 49: PT001_MNC_BM_0030 PT001_MNC_PB_0030      7       54       28          75
#> 50: PT001_MNC_PB_0030 PT001_MNC_BM_0030      7       28       54          75
#> 51: PT001_MNC_BM_0060 PT001_MNC_PB_0030      7      114       28         135
#> 52: PT001_MNC_PB_0030 PT001_MNC_BM_0060      7       28      114         135
#> 53: PT001_MNC_PB_0030 PT001_MNC_PB_0030     28       28       28          28
#> 54: PT001_MNC_BM_0030 PT002_MNC_BM_0060      1       54       33          86
#> 55: PT002_MNC_BM_0060 PT001_MNC_BM_0030      1       33       54          86
#> 56: PT001_MNC_BM_0060 PT002_MNC_BM_0060      0      114       33         147
#> 57: PT002_MNC_BM_0060 PT001_MNC_BM_0060      0       33      114         147
#> 58: PT001_MNC_PB_0060 PT002_MNC_BM_0060      0       59       33          92
#> 59: PT002_MNC_BM_0060 PT001_MNC_PB_0060      0       33       59          92
#> 60: PT002_MNC_BM_0030 PT002_MNC_BM_0060      5       98       33         126
#> 61: PT002_MNC_BM_0060 PT002_MNC_BM_0030      5       33       98         126
#> 62: PT001_MNC_PB_0030 PT002_MNC_BM_0060      0       28       33          61
#> 63: PT002_MNC_BM_0060 PT001_MNC_PB_0030      0       33       28          61
#> 64: PT002_MNC_BM_0060 PT002_MNC_BM_0060     33       33       33          33
#>                    g1                g2 shared count_g1 count_g2 count_union
#>          on_g1      on_g2    on_union
#>  1: 100.000000 100.000000 100.0000000
#>  2:  38.888889  18.421053  14.2857143
#>  3:  18.421053  38.888889  14.2857143
#>  4: 100.000000 100.000000 100.0000000
#>  5:  14.814815  13.559322   7.6190476
#>  6:  13.559322  14.814815   7.6190476
#>  7:  25.438596  49.152542  20.1388889
#>  8:  49.152542  25.438596  20.1388889
#>  9: 100.000000 100.000000 100.0000000
#> 10:  35.714286  16.949153  12.9870130
#> 11:  16.949153  35.714286  12.9870130
#> 12:   0.000000   0.000000   0.0000000
#> 13:   0.000000   0.000000   0.0000000
#> 14:   0.877193   1.020408   0.4739336
#> 15:   1.020408   0.877193   0.4739336
#> 16:   1.694915   1.020408   0.6410256
#> 17:   1.020408   1.694915   0.6410256
#> 18: 100.000000 100.000000 100.0000000
#> 19:   0.000000   0.000000   0.0000000
#> 20:   0.000000   0.000000   0.0000000
#> 21:   0.000000   0.000000   0.0000000
#> 22:   0.000000   0.000000   0.0000000
#> 23:   0.000000   0.000000   0.0000000
#> 24:   0.000000   0.000000   0.0000000
#> 25:   0.000000   0.000000   0.0000000
#> 26:   0.000000   0.000000   0.0000000
#> 27:   8.163265  44.444444   7.4074074
#> 28:  44.444444   8.163265   7.4074074
#> 29: 100.000000 100.000000 100.0000000
#> 30:  46.666667  38.888889  26.9230769
#> 31:  38.888889  46.666667  26.9230769
#> 32:   0.000000   0.000000   0.0000000
#> 33:   0.000000   0.000000   0.0000000
#> 34:  15.151515  27.777778  10.8695652
#> 35:  27.777778  15.151515  10.8695652
#> 36:   0.000000   0.000000   0.0000000
#> 37:   0.000000   0.000000   0.0000000
#> 38:   0.877193   6.666667   0.7812500
#> 39:   6.666667   0.877193   0.7812500
#> 40:   0.000000   0.000000   0.0000000
#> 41:   0.000000   0.000000   0.0000000
#> 42:   3.061224  20.000000   2.7272727
#> 43:  20.000000   3.061224   2.7272727
#> 44: 100.000000 100.000000 100.0000000
#> 45:   0.000000   0.000000   0.0000000
#> 46:   0.000000   0.000000   0.0000000
#> 47:   6.060606  13.333333   4.3478261
#> 48:  13.333333   6.060606   4.3478261
#> 49:  12.962963  25.000000   9.3333333
#> 50:  25.000000  12.962963   9.3333333
#> 51:   6.140351  25.000000   5.1851852
#> 52:  25.000000   6.140351   5.1851852
#> 53: 100.000000 100.000000 100.0000000
#> 54:   1.851852   3.030303   1.1627907
#> 55:   3.030303   1.851852   1.1627907
#> 56:   0.000000   0.000000   0.0000000
#> 57:   0.000000   0.000000   0.0000000
#> 58:   0.000000   0.000000   0.0000000
#> 59:   0.000000   0.000000   0.0000000
#> 60:   5.102041  15.151515   3.9682540
#> 61:  15.151515   5.102041   3.9682540
#> 62:   0.000000   0.000000   0.0000000
#> 63:   0.000000   0.000000   0.0000000
#> 64: 100.000000 100.000000 100.0000000
#>          on_g1      on_g2    on_union
heatmaps <- sharing_heatmap(sharing_1_b)

The function sharing_heatmap() automatically plots sharing between 2 groups. There are several arguments to this function that allow us to obtain heatmaps for the absolute sharing values or the relative (percentage) values.

heatmaps$absolute

heatmaps$on_g1
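
To see why a heatmap benefits from minimal = FALSE and include_self_comp = TRUE, consider that relative sharing is not symmetric: on_g1 for the pair (A, B) generally differs from on_g1 for (B, A). The sketch below pivots a toy long-format table into a square matrix with base R xtabs(); it only illustrates the data layout and is not how sharing_heatmap() is implemented:

```r
# Toy long-format table with both permutations and the self comparisons
toy_sharing <- data.frame(
  g1 = c("A", "A", "B", "B"),
  g2 = c("A", "B", "A", "B"),
  on_g1 = c(100, 40, 20, 100),
  stringsAsFactors = FALSE
)

# Pivot to a square matrix: rows are g1, columns are g2
sharing_matrix <- xtabs(on_g1 ~ g1 + g2, data = toy_sharing)
sharing_matrix
```

Without the permutations, one triangle of the matrix would have no data, and without the self comparisons the diagonal would be empty as well.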