Usage

General usage

This page describes all available parameters. With default parameters, the 1d tool takes around 1 minute for 2000 cells and around 30 seconds for 300 cells, independently of the number of Gaussians. The 2d tool takes around 3 minutes for 300 cells and around 15 minutes for 2000 cells. Increasing the number of samples in the MCMC or in the burn-in phase increases the run time.

baredSC_1d

Run MCMC to get the pdf for a given gene using normal distributions. The full documentation is available at https://baredsc.readthedocs.io

usage: baredSC_1d [-h] (--input INPUT | --inputAnnData INPUTANNDATA)
                  --geneColName GENECOLNAME
                  [--metadata1ColName METADATA1COLNAME]
                  [--metadata1Values METADATA1VALUES]
                  [--metadata2ColName METADATA2COLNAME]
                  [--metadata2Values METADATA2VALUES]
                  [--metadata3ColName METADATA3COLNAME]
                  [--metadata3Values METADATA3VALUES] [--xmin XMIN]
                  [--xmax XMAX] [--xscale {Seurat,log}]
                  [--targetSum TARGETSUM] [--nx NX] [--osampx OSAMPX]
                  [--osampxpdf OSAMPXPDF] [--minScale MINSCALE]
                  [--nnorm NNORM] [--nsampMCMC NSAMPMCMC]
                  [--nsampBurnMCMC NSAMPBURNMCMC]
                  [--nsplitBurnMCMC NSPLITBURNMCMC] [--T0BurnMCMC T0BURNMCMC]
                  [--seed SEED] [--minNeff MINNEFF] [--force] --output OUTPUT
                  [--figure FIGURE] [--title TITLE]
                  [--removeFirstSamples REMOVEFIRSTSAMPLES]
                  [--nsampInPlot NSAMPINPLOT] [--prettyBins PRETTYBINS]
                  [--logevidence LOGEVIDENCE] [--coviscale COVISCALE]
                  [--nis NIS] [--version]

Named Arguments

--version

show program’s version number and exit

Required arguments

--input

Input table (tab-separated, with header) with one line per cell, columns with raw counts, and one column nCount_RNA with the total number of UMI per cell. It can optionally contain other metadata columns used to filter cells.
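As an illustration of the table layout described above, the following minimal sketch writes a small tab-separated table with Python's standard library. The gene names, the `cluster` metadata column, and all the values are made up for the example; only the `nCount_RNA` column name comes from the description.

```python
import csv
import io

# Hypothetical cells: raw counts for two genes, the total UMI count per cell
# (nCount_RNA), and one optional metadata column that could be used to filter.
rows = [
    {"gene_A": 0, "gene_B": 3, "nCount_RNA": 1200, "cluster": "c1"},
    {"gene_A": 2, "gene_B": 0, "nCount_RNA": 3400, "cluster": "c1"},
    {"gene_A": 5, "gene_B": 1, "nCount_RNA": 2800, "cluster": "c2"},
]


def write_table(rows, handle):
    """Write one line per cell, tab-separated, with a header line."""
    writer = csv.DictWriter(
        handle, fieldnames=list(rows[0]), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Written to an in-memory buffer here; pass an open file to write to disk.
buffer = io.StringIO()
write_table(rows, buffer)
table_text = buffer.getvalue()
print(table_text)
```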

--inputAnnData

Input annData (for example from Scanpy).

--geneColName

Name of the column with gene counts.

--output

Output file basename (will be npz) with results of MCMC.
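A minimal invocation only needs one input, the gene column name, and the output. The sketch below assembles such a command line from Python; the file names and the gene name are placeholders, and whether the `.npz` extension should be part of the output name is not stated here, so adjust it to your setup.

```python
import shlex


def build_baredsc_1d_command(input_table, gene, output, nnorm=2):
    """Assemble the argument list for a baredSC_1d run (paths are placeholders)."""
    return [
        "baredSC_1d",
        "--input", input_table,
        "--geneColName", gene,
        "--nnorm", str(nnorm),
        "--output", output,
    ]


cmd = build_baredsc_1d_command("cells.txt", "gene_A", "results/gene_A_2gauss")
command_line = shlex.join(cmd)
print(command_line)

# To actually run it (requires baredSC to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```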

Optional arguments to select input data

--metadata1ColName

Name of the column with metadata1 to filter.

--metadata1Values

Comma separated values for metadata1 of cells to keep.

--metadata2ColName

Name of the column with metadata2 to filter.

--metadata2Values

Comma separated values for metadata2 of cells to keep.

--metadata3ColName

Name of the column with metadata3 to filter.

--metadata3Values

Comma separated values for metadata3 of cells to keep.
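The metadata options come in pairs: a column name and a comma-separated list of values to keep. The sketch below shows one plausible reading of that behaviour. It assumes that a cell is kept only if it passes every filter that is given, which is an assumption and not something stated above; the column names and values are invented for the example.

```python
def keep_cell(cell, filters):
    """Return True if the cell matches every (column, comma-separated values) filter."""
    for column, values in filters:
        allowed = values.split(",")
        if str(cell[column]) not in allowed:
            return False
    return True


# Hypothetical cells with two metadata columns.
cells = [
    {"name": "cell1", "cluster": "c1", "sample": "s1"},
    {"name": "cell2", "cluster": "c2", "sample": "s2"},
    {"name": "cell3", "cluster": "c3", "sample": "s1"},
]

# Equivalent in spirit to:
#   --metadata1ColName cluster --metadata1Values c1,c2
#   --metadata2ColName sample  --metadata2Values s1
filters = [("cluster", "c1,c2"), ("sample", "s1")]
kept = [cell["name"] for cell in cells if keep_cell(cell, filters)]
print(kept)
```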

Optional arguments to run MCMC

--xmin

Minimum value to consider in x axis.

Default: 0

--xmax

Maximum value to consider in x axis.

Default: 2.5

--xscale

Possible choices: Seurat, log

scale for the x-axis: Seurat (log(1+targetSum*X)) or log (log(X))

Default: “Seurat”

--targetSum

Factor used when the Seurat scale is selected: log(1+targetSum*X) (default is 10^4; use 0 for the median of nCount_RNA).

Default: 10000
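The two x-axis scales and the special value 0 for targetSum can be written out as a short sketch. It assumes that X is a normalized expression value (for instance a fraction of a cell's UMIs) and that the logarithm is the natural one; the helper names are hypothetical and are not part of baredSC.

```python
import math
import statistics


def resolve_target_sum(target_sum, n_count_rna_values):
    """Return the factor to use: 0 means the median of nCount_RNA."""
    if target_sum == 0:
        return statistics.median(n_count_rna_values)
    return target_sum


def seurat_scale(x, target_sum=10_000):
    """Seurat scale: log(1 + targetSum * X)."""
    return math.log1p(target_sum * x)


def log_scale(x):
    """Log scale: log(X), only defined for X > 0."""
    return math.log(x)


totals = [1000, 2000, 3000]  # made-up nCount_RNA values
factor = resolve_target_sum(0, totals)
print(factor)
print(seurat_scale(1e-4))          # X = 1e-4 with the default factor
print(seurat_scale(1e-4, factor))  # same X with the median factor
```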

--nx

Number of values in x to check how your evaluated pdf is compatible with the model.

Default: 100

--osampx

Oversampling factor of x values when evaluating pdf of Poisson distribution.

Default: 10

--osampxpdf

Oversampling factor of x values when evaluating pdf at each step of the MCMC.

Default: 5

--minScale

Minimum value of the scale of the Gaussians. It cannot be smaller than the maximum of twice the bin size of the pdf evaluation and half the bin size.

Default: 0.1
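The lower bound on minScale can be worked out from the grid parameters. The sketch below assumes that the bin size is (xmax - xmin) / nx and that the bin size of the pdf evaluation is the bin size divided by osampxpdf; both definitions are assumptions. How the tool reacts to a smaller value (error or adjustment) is not described here, so the function only returns the bound.

```python
def min_scale_floor(xmin, xmax, nx, osampxpdf):
    """Smallest allowed scale: max(2 * pdf bin size, bin size / 2)."""
    bin_size = (xmax - xmin) / nx
    pdf_bin_size = bin_size / osampxpdf  # assumed definition
    return max(2 * pdf_bin_size, bin_size / 2)


# Documented 1d defaults: xmin=0, xmax=2.5, nx=100, osampxpdf=5
default_floor = min_scale_floor(0, 2.5, 100, 5)
print(default_floor)  # well below the default minScale of 0.1

# A coarser grid (nx=10) raises the bound above 0.1
coarse_floor = min_scale_floor(0, 2.5, 10, 5)
print(coarse_floor)
```

With the default grid the bound is about 0.0125, so the default minScale of 0.1 is valid; with only 10 bins the bound becomes 0.125 and a larger minScale would be needed.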

--nnorm

Number of Gaussians to fit.

Default: 2

--nsampMCMC

Number of samplings (iterations) of MCMC.

Default: 100000

--nsampBurnMCMC

Number of samplings (iterations) in the burn-in phase of MCMC (default is nsampMCMC / 4).

--nsplitBurnMCMC

Number of steps in the burn-in phase of MCMC.

Default: 10

--T0BurnMCMC

Initial temperature in the burn-in phase of MCMC (>1).

Default: 100.0

--seed

Change seed for another output.

Default: 1

--minNeff

Rerun the MCMC with 10 times more samples until the number of effective samples reaches this value (by default it is not set, so the MCMC is not rerun).
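The rerun logic can be pictured as a simple loop. This is only a sketch of the idea: `run_mcmc` is a stand-in for the real sampler, its fake effective sample size is invented for the example, and the `max_reruns` safeguard is an addition of this sketch, not a documented option.

```python
def run_until_neff(run_mcmc, nsamp, min_neff=None, max_reruns=3):
    """Rerun with 10 times more samples while neff is below min_neff."""
    neff = run_mcmc(nsamp)
    reruns = 0
    while min_neff is not None and neff < min_neff and reruns < max_reruns:
        nsamp *= 10
        neff = run_mcmc(nsamp)
        reruns += 1
    return nsamp, neff


def fake_run_mcmc(nsamp):
    """Stand-in sampler: pretend 1 in 1000 samples is effective."""
    return nsamp // 1000


# Without minNeff the MCMC runs once.
print(run_until_neff(fake_run_mcmc, 100_000))
# With minNeff=500 one rerun with 10 times more samples is needed.
final_nsamp, final_neff = run_until_neff(fake_run_mcmc, 100_000, min_neff=500)
print(final_nsamp, final_neff)
```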

--force

Force to redo the mcmc even if output exists.

Optional arguments to get plots and text outputs

--figure

Output figure filename.

--title

Title in figures.

--removeFirstSamples

Number of samples to ignore before making the plots (default is nsampMCMC / 4).

--nsampInPlot

Approximate number of samples to use in plots.

Default: 100000

--prettyBins

Number of bins to use in plots (Default is nx).

Optional arguments to get logevidence

--logevidence

Output file in which to write the logevidence value.

--coviscale

Scale factor to apply to covariance of parameters to get random parameters in logevidence evaluation.

Default: 1

--nis

Size of sampling of random parameters in logevidence evaluation.

Default: 1000

combineMultipleModels_1d

Combine mcmc results from multiple models to get a mixture using logevidence to infer weights.
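One common way to turn log evidences into mixture weights is to make each weight proportional to the evidence of its model. The sketch below assumes exactly that, with equal prior probability for each model; it illustrates the idea and is not taken from the tool's source. Subtracting the maximum before exponentiating avoids numerical underflow for large negative values.

```python
import math


def model_weights(log_evidences):
    """Weights proportional to exp(logevidence), normalized to sum to 1."""
    highest = max(log_evidences)
    unnormalized = [math.exp(value - highest) for value in log_evidences]
    total = sum(unnormalized)
    return [value / total for value in unnormalized]


# Made-up log evidences for models with 1, 2 and 3 Gaussians.
log_evidences = [-1010.0, -1000.0, -1002.0]
weights = model_weights(log_evidences)
for n_gauss, weight in enumerate(weights, start=1):
    print(f"{n_gauss} Gaussian(s): weight {weight:.4f}")
```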

usage: combineMultipleModels_1d [-h]
                                (--input INPUT | --inputAnnData INPUTANNDATA)
                                --geneColName GENECOLNAME
                                [--metadata1ColName METADATA1COLNAME]
                                [--metadata1Values METADATA1VALUES]
                                [--metadata2ColName METADATA2COLNAME]
                                [--metadata2Values METADATA2VALUES]
                                [--metadata3ColName METADATA3COLNAME]
                                [--metadata3Values METADATA3VALUES] --outputs
                                OUTPUTS [OUTPUTS ...] [--xmin XMIN]
                                [--xmax XMAX] [--xscale {Seurat,log}]
                                [--targetSum TARGETSUM] [--nx NX]
                                [--osampx OSAMPX] [--osampxpdf OSAMPXPDF]
                                [--minScale MINSCALE] [--seed SEED] --figure
                                FIGURE [--title TITLE]
                                [--removeFirstSamples REMOVEFIRSTSAMPLES]
                                [--nsampInPlot NSAMPINPLOT]
                                [--prettyBins PRETTYBINS]
                                [--logevidences LOGEVIDENCES [LOGEVIDENCES ...]]
                                [--coviscale COVISCALE] [--nis NIS]
                                [--version]

Named Arguments

--version

show program’s version number and exit

Required arguments

--input

Input table (tab-separated, with header) with one line per cell, columns with raw counts, and one column nCount_RNA with the total number of UMI per cell. It can optionally contain other metadata columns used to filter cells.

--inputAnnData

Input annData (for example from Scanpy).

--geneColName

Name of the column with gene counts.

--outputs

Output file basenames (will be npz) of the different MCMC results to combine.

--figure

Output figure basename.
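Because --outputs accepts several values, a typical call lists the results of several earlier runs (for example with different numbers of Gaussians) followed by the figure. The sketch below builds such a command line; all file names are placeholders, and whether the `.npz` extension must be included in each name is not stated here.

```python
import shlex


def build_combine_1d_command(input_table, gene, outputs, figure):
    """Assemble the argument list for combineMultipleModels_1d (paths are placeholders)."""
    cmd = [
        "combineMultipleModels_1d",
        "--input", input_table,
        "--geneColName", gene,
        "--outputs",
    ]
    cmd.extend(outputs)  # one entry per MCMC result to combine
    cmd.extend(["--figure", figure])
    return cmd


# Hypothetical results of three earlier runs with 1, 2 and 3 Gaussians.
outputs = [f"results/gene_A_{n}gauss" for n in (1, 2, 3)]
combine_cmd = build_combine_1d_command(
    "cells.txt", "gene_A", outputs, "results/gene_A_combined"
)
print(shlex.join(combine_cmd))
```

The same input, gene, and MCMC grid options used for the individual runs would normally be repeated here, since this tool lists them among its own arguments.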

Optional arguments used to run MCMC

--xmin

Minimum value to consider in x axis.

Default: 0

--xmax

Maximum value to consider in x axis.

Default: 2.5

--xscale

Possible choices: Seurat, log

scale for the x-axis: Seurat (log(1+targetSum*X)) or log (log(X))

Default: “Seurat”

--targetSum

Factor used when the Seurat scale is selected: log(1+targetSum*X) (default is 10^4; use 0 for the median of nCount_RNA).

Default: 10000

--nx

Number of values in x to check how your evaluated pdf is compatible with the model.

Default: 100

--osampx

Oversampling factor of x values when evaluating pdf of Poisson distribution.

Default: 10

--osampxpdf

Oversampling factor of x values when evaluating pdf at each step of the MCMC.

Default: 5

--minScale

Minimum value of the scale of the Gaussians. It cannot be smaller than the maximum of twice the bin size of the pdf evaluation and half the bin size.

Default: 0.1

--seed

Change seed for another output.

Default: 1

Optional arguments to select input data

--metadata1ColName

Name of the column with metadata1 to filter.

--metadata1Values

Comma separated values for metadata1 of cells to keep.

--metadata2ColName

Name of the column with metadata2 to filter.

--metadata2Values

Comma separated values for metadata2 of cells to keep.

--metadata3ColName

Name of the column with metadata3 to filter.

--metadata3Values

Comma separated values for metadata3 of cells to keep.

Optional arguments to customize plots and text outputs

--title

Title in figures.

--removeFirstSamples

Number of samples to ignore before making the plots (default is nsampMCMC / 4).

--nsampInPlot

Approximate number of samples to use in plots.

Default: 100000

--prettyBins

Number of bins to use in plots (Default is nx).

Optional arguments to evaluate logevidence

--logevidences

Output files of precalculated logevidence values (if not provided, they will be calculated).

--coviscale

Scale factor to apply to covariance of parameters to get random parameters in logevidence evaluation.

Default: 1

--nis

Size of sampling of random parameters in logevidence evaluation.

Default: 1000

baredSC_2d

Run MCMC to get the pdf in 2D for 2 given genes using normal distributions. The full documentation is available at https://baredsc.readthedocs.io

usage: baredSC_2d [-h] (--input INPUT | --inputAnnData INPUTANNDATA)
                  --geneXColName GENEXCOLNAME --geneYColName GENEYCOLNAME
                  [--metadata1ColName METADATA1COLNAME]
                  [--metadata1Values METADATA1VALUES]
                  [--metadata2ColName METADATA2COLNAME]
                  [--metadata2Values METADATA2VALUES]
                  [--metadata3ColName METADATA3COLNAME]
                  [--metadata3Values METADATA3VALUES] [--xmin XMIN]
                  [--xmax XMAX] [--nx NX] [--osampx OSAMPX]
                  [--osampxpdf OSAMPXPDF] [--minScalex MINSCALEX]
                  [--ymin YMIN] [--ymax YMAX] [--ny NY] [--osampy OSAMPY]
                  [--osampypdf OSAMPYPDF] [--minScaley MINSCALEY]
                  [--scalePrior SCALEPRIOR] [--scale {Seurat,log}]
                  [--targetSum TARGETSUM] [--nnorm NNORM]
                  [--nsampMCMC NSAMPMCMC] [--nsampBurnMCMC NSAMPBURNMCMC]
                  [--nsplitBurnMCMC NSPLITBURNMCMC] [--T0BurnMCMC T0BURNMCMC]
                  [--seed SEED] [--minNeff MINNEFF] [--force] --output OUTPUT
                  [--figure FIGURE] [--title TITLE]
                  [--splity SPLITY [SPLITY ...]]
                  [--removeFirstSamples REMOVEFIRSTSAMPLES]
                  [--nsampInPlot NSAMPINPLOT] [--prettyBinsx PRETTYBINSX]
                  [--prettyBinsy PRETTYBINSY] [--log1pColorScale]
                  [--logevidence LOGEVIDENCE] [--coviscale COVISCALE]
                  [--nis NIS] [--version]

Named Arguments

--version

show program’s version number and exit

Required arguments

--input

Input table (tab-separated, with header) with one line per cell, columns with raw counts, and one column nCount_RNA with the total number of UMI per cell. It can optionally contain other metadata columns used to filter cells.

--inputAnnData

Input annData (for example from Scanpy).

--geneXColName

Name of the column with gene counts for gene in x.

--geneYColName

Name of the column with gene counts for gene in y.

--output

Output file basename (will be npz) with results of MCMC.

Optional arguments to select input data

--metadata1ColName

Name of the column with metadata1 to filter.

--metadata1Values

Comma separated values for metadata1 of cells to keep.

--metadata2ColName

Name of the column with metadata2 to filter.

--metadata2Values

Comma separated values for metadata2 of cells to keep.

--metadata3ColName

Name of the column with metadata3 to filter.

--metadata3Values

Comma separated values for metadata3 of cells to keep.

Optional arguments to run MCMC

--xmin

Minimum value to consider in x axis.

Default: 0

--xmax

Maximum value to consider in x axis.

Default: 2.5

--nx

Number of values in x to check how your evaluated pdf is compatible with the model.

Default: 50

--osampx

Oversampling factor of x values when evaluating pdf of Poisson distribution.

Default: 10

--osampxpdf

Oversampling factor of x values when evaluating pdf at each step of the MCMC.

Default: 4

--minScalex

Minimum value of the scale of the Gaussians on x. It cannot be smaller than the maximum of twice the bin size of the pdf evaluation and half the bin size on the x axis.

Default: 0.1

--ymin

Minimum value to consider in y axis.

Default: 0

--ymax

Maximum value to consider in y axis.

Default: 2.5

--ny

Number of values in y to check how your evaluated pdf is compatible with the model.

Default: 50

--osampy

Oversampling factor of y values when evaluating pdf of Poisson distribution.

Default: 10

--osampypdf

Oversampling factor of y values when evaluating pdf at each step of the MCMC.

Default: 4

--minScaley

Minimum value of the scale of the Gaussians on y. It cannot be smaller than the maximum of twice the bin size of the pdf evaluation and half the bin size on the y axis.

Default: 0.1

--scalePrior

Scale of the truncnorm used in the prior for the correlation.

Default: 0.3

--scale

Possible choices: Seurat, log

scale for the x-axis and y-axis: Seurat (log(1+targetSum*X)) or log (log(X))

Default: “Seurat”

--targetSum

Factor used when the Seurat scale is selected: log(1+targetSum*X) (default is 10^4; use 0 for the median of nCount_RNA).

Default: 10000

--nnorm

Number of 2D Gaussians to fit.

Default: 1

--nsampMCMC

Number of samplings (iterations) of MCMC.

Default: 100000

--nsampBurnMCMC

Number of samplings (iterations) in the burn-in phase of MCMC (default is nsampMCMC / 4).

--nsplitBurnMCMC

Number of steps in the burn-in phase of MCMC.

Default: 10

--T0BurnMCMC

Initial temperature in the burn-in phase of MCMC.

Default: 100.0

--seed

Change seed for another output.

Default: 1

--minNeff

Rerun the MCMC with 10 times more samples until the number of effective samples reaches this value (by default it is not set, so the MCMC is not rerun).

--force

Force to redo the mcmc even if output exists.

Optional arguments to get plots and text outputs

--figure

Output figure basename.

--title

Title in figures.

--splity

Threshold value(s) on geney used to split the cells into 2 categories and plot the density of genex for each category.

--removeFirstSamples

Number of samples to ignore before making the plots (default is nsampMCMC / 4).

--nsampInPlot

Approximate number of samples to use in plots.

Default: 100000

--prettyBinsx

Number of bins to use in x in plots (Default is nx).

--prettyBinsy

Number of bins to use in y in plots (Default is ny).

--log1pColorScale

Use log1p color scale instead of linear color scale.

Default: False

Optional arguments to get logevidence

--logevidence

Output file in which to write the logevidence value.

--coviscale

Scale factor to apply to covariance of parameters to get random parameters in logevidence evaluation.

Default: 1

--nis

Size of sampling of random parameters in logevidence evaluation.

Default: 1000

combineMultipleModels_2d

Combine mcmc 2D results from multiple models to get a mixture using logevidence to infer weights.

usage: combineMultipleModels_2d [-h]
                                (--input INPUT | --inputAnnData INPUTANNDATA)
                                --geneXColName GENEXCOLNAME --geneYColName
                                GENEYCOLNAME
                                [--metadata1ColName METADATA1COLNAME]
                                [--metadata1Values METADATA1VALUES]
                                [--metadata2ColName METADATA2COLNAME]
                                [--metadata2Values METADATA2VALUES]
                                [--metadata3ColName METADATA3COLNAME]
                                [--metadata3Values METADATA3VALUES] --outputs
                                OUTPUTS [OUTPUTS ...] [--xmin XMIN]
                                [--xmax XMAX] [--nx NX] [--osampx OSAMPX]
                                [--osampxpdf OSAMPXPDF]
                                [--minScalex MINSCALEX] [--ymin YMIN]
                                [--ymax YMAX] [--ny NY] [--osampy OSAMPY]
                                [--osampypdf OSAMPYPDF]
                                [--minScaley MINSCALEY] [--scale {Seurat,log}]
                                [--scalePrior SCALEPRIOR]
                                [--targetSum TARGETSUM] [--seed SEED] --figure
                                FIGURE [--title TITLE]
                                [--splity SPLITY [SPLITY ...]]
                                [--removeFirstSamples REMOVEFIRSTSAMPLES]
                                [--nsampInPlot NSAMPINPLOT]
                                [--prettyBins PRETTYBINS]
                                [--prettyBinsx PRETTYBINSX]
                                [--prettyBinsy PRETTYBINSY]
                                [--log1pColorScale] [--getPVal]
                                [--logevidences LOGEVIDENCES [LOGEVIDENCES ...]]
                                [--coviscale COVISCALE] [--nis NIS]
                                [--version]

Named Arguments

--version

show program’s version number and exit

Required arguments

--input

Input table (tab-separated, with header) with one line per cell, columns with raw counts, and one column nCount_RNA with the total number of UMI per cell. It can optionally contain other metadata columns used to filter cells.

--inputAnnData

Input annData (for example from Scanpy).

--geneXColName

Name of the column with gene counts for gene in x.

--geneYColName

Name of the column with gene counts for gene in y.

--outputs

Output file basenames (will be npz) of the different MCMC results to combine.

--figure

Output figure basename.

Optional arguments used to run MCMC

--xmin

Minimum value to consider in x axis.

Default: 0

--xmax

Maximum value to consider in x axis.

Default: 2.5

--nx

Number of values in x to check how your evaluated pdf is compatible with the model.

Default: 50

--osampx

Oversampling factor of x values when evaluating pdf of Poisson distribution.

Default: 10

--osampxpdf

Oversampling factor of x values when evaluating pdf at each step of the MCMC.

Default: 4

--minScalex

Minimum value of the scale of the Gaussians on x. It cannot be smaller than the maximum of twice the bin size of the pdf evaluation and half the bin size on the x axis.

Default: 0.1

--ymin

Minimum value to consider in y axis.

Default: 0

--ymax

Maximum value to consider in y axis.

Default: 2.5

--ny

Number of values in y to check how your evaluated pdf is compatible with the model.

Default: 50

--osampy

Oversampling factor of y values when evaluating pdf of Poisson distribution.

Default: 10

--osampypdf

Oversampling factor of y values when evaluating pdf at each step of the MCMC.

Default: 4

--minScaley

Minimum value of the scale of the Gaussians on y. It cannot be smaller than the maximum of twice the bin size of the pdf evaluation and half the bin size on the y axis.

Default: 0.1

--scale

Possible choices: Seurat, log

scale for the x-axis and y-axis: Seurat (log(1+targetSum*X)) or log (log(X))

Default: “Seurat”

--scalePrior

Scale of the truncnorm used in the prior for the correlation.

Default: 0.3

--targetSum

Factor used when the Seurat scale is selected: log(1+targetSum*X) (default is 10^4; use 0 for the median of nCount_RNA).

Default: 10000

--seed

Change seed for another output.

Default: 1

Optional arguments to select input data

--metadata1ColName

Name of the column with metadata1 to filter.

--metadata1Values

Comma separated values for metadata1 of cells to keep.

--metadata2ColName

Name of the column with metadata2 to filter.

--metadata2Values

Comma separated values for metadata2 of cells to keep.

--metadata3ColName

Name of the column with metadata3 to filter.

--metadata3Values

Comma separated values for metadata3 of cells to keep.

Optional arguments to customize plots and text outputs

--title

Title in figures.

--splity

Threshold value(s) on geney used to split the cells into 2 categories and plot the density of genex for each category.

--removeFirstSamples

Number of samples to ignore before making the plots (default is nsampMCMC / 4).

--nsampInPlot

Approximate number of samples to use in plots.

Default: 100000

--prettyBins

Number of bins to use in plots (Default is nx).

--prettyBinsx

Number of bins to use in x in plots (Default is nx).

--prettyBinsy

Number of bins to use in y in plots (Default is ny).

--log1pColorScale

Use log1p color scale instead of linear color scale.

Default: False

--getPVal

Use fewer samples to get an estimate of the p-value.

Default: False

Optional arguments to evaluate logevidence

--logevidences

Output files of precalculated logevidence values (if not provided, they will be calculated).

--coviscale

Scale factor to apply to covariance of parameters to get random parameters in logevidence evaluation.

Default: 1

--nis

Size of sampling of random parameters in logevidence evaluation.

Default: 1000