OHDSI/PropensityScoreEvaluation

Name: PropensityScoreEvaluation

Owner: Observational Health Data Sciences and Informatics

Created: 2017-12-01 18:08:31

Updated: 2017-12-01 18:15:03

Pushed: 2017-12-01 18:14:56

Size: 90 KB

Language: R

README

Propensity Score Method Evaluation [UNDER DEVELOPMENT]

This study will evaluate the performance of different propensity score methods.

To install:

install.packages("devtools")
library(devtools)
install_github("ohdsi/OhdsiRTools")
install_github("ohdsi/SqlRender")
install_github("ohdsi/DatabaseConnector")
install_github("ohdsi/Cyclops", ref = "HDPS")
install_github("ohdsi/FeatureExtraction")
install_github("ohdsi/CohortMethod", ref = "hdps_clean")

Obtain a CohortMethodData object. This calls functions in CohortMethod; the example below follows the CohortMethod single studies vignette, but any existing CohortMethodData object works as well.

library(PropensityScoreEvaluation)
connectionDetails <- createConnectionDetails(dbms = "postgresql",
                                             user = "joe",
                                             password = "secret",
                                             server = "myserver")

file <- "inst/sql/sql_server/coxibVsNonselVsGiBleed.sql"
exposureTable <- "coxibVsNonselVsGiBleed"
outcomeTable <- "coxibVsNonselVsGiBleed"
cdmVersion <- "4"
cdmDatabaseSchema <- "my_schema"
resultsDatabaseSchema <- "my_results"

# Use HDPS covariates or regular FeatureExtraction covariates
hdpsCovariates <- TRUE

cohortMethodData <- createCohortMethodData(connectionDetails = connectionDetails,
                                           file = file,
                                           exposureTable = exposureTable,
                                           outcomeTable = outcomeTable,
                                           cdmVersion = cdmVersion,
                                           cdmDatabaseSchema = cdmDatabaseSchema,
                                           resultsDatabaseSchema = resultsDatabaseSchema,
                                           hdpsCovariates = hdpsCovariates)
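
If a CohortMethodData object has already been saved to disk, it can be loaded instead of rebuilt; a minimal sketch using CohortMethod's save/load helpers (the folder path is a placeholder):

# Assumes the object was previously saved with saveCohortMethodData()
cohortMethodData <- loadCohortMethodData("path/to/cohortMethodData")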

Create study population and simulation profile

# Defaults to cross-validation to find the Lasso hyperparameter with useCrossValidation = TRUE
# Set useCrossValidation = FALSE and specify a priorVariance to use a specific variance

outcomeId <- 3 # outcome cohort id, as in the CohortMethod single studies vignette

studyPop <- createStudyPopulation(cohortMethodData = cohortMethodData,
                                  outcomeId = outcomeId,
                                  firstExposureOnly = FALSE,
                                  washoutPeriod = 0,
                                  removeDuplicateSubjects = FALSE,
                                  removeSubjectsWithPriorOutcome = TRUE,
                                  minDaysAtRisk = 1,
                                  riskWindowStart = 0,
                                  addExposureDaysToStart = FALSE,
                                  riskWindowEnd = 30,
                                  addExposureDaysToEnd = TRUE)

simulationProfile <- createCMDSimulationProfile(cohortMethodData, studyPop = studyPop, outcomeId = outcomeId,
                                                useCrossValidation = FALSE, priorVariance = 1)
saveSimulationProfile(simulationProfile, file = file)
simulationProfile <- loadSimulationProfile(file = file)

The simulation is divided into two steps: a setup step and a run step. The setup step selects covariates to treat as unmeasured confounders and, if specified, draws a subsample of the desired size; it also estimates the LASSO-regularized propensity score and the exposure-based hdPS. The run step simulates event times and estimates propensity score adjusted outcome models.

Confounding proportion: a number between 0 and 1 giving the fraction of covariates to discard from propensity score estimation (these serve as unmeasured confounders). Set to NA for no unmeasured confounding.

Sample size: NA to use the full cohort; otherwise the given size is used (it should be smaller than the full cohort).

# Can turn off cross-validation and specify a variance for the L1-regularized propensity score
# hdpsFeatures is assumed here to mirror the hdpsCovariates flag set above
hdpsFeatures <- TRUE
simulationSetup <- setUpSimulation(simulationProfile, cohortMethodData, useCrossValidation = TRUE,
                                   confoundingProportion = NA, sampleSize = NA, hdpsFeatures = hdpsFeatures)

saveSimulationSetup(simulationSetup, file = file)
simulationSetup <- loadSimulationSetup(file = file)
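
For example, to introduce unmeasured confounding and subsample the cohort (values below are illustrative, taken from the lists used later in this README):

simulationSetup <- setUpSimulation(simulationProfile, cohortMethodData, useCrossValidation = TRUE,
                                   confoundingProportion = 0.25, sampleSize = 5000, hdpsFeatures = hdpsFeatures)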

We can create simulation setups en masse:

confoundingProportionList <- c(NA, 0.25)
sampleSizeList <- c(NA, 5000)
outputFolder <- "path/to/outputFolder"

setUpSimulations(simulationProfile, cohortMethodData, confoundingProportionList,
                 useCrossValidation = TRUE, sampleSizeList, outputFolder, hdpsFeatures = hdpsFeatures)

For each simulation setup, we can run simulations for a given true effect size and outcome prevalence with runSimulationStudy.

For each simulation setup, we can also run a list of effect sizes and outcome prevalences with runSimulationStudies. This requires the folder name of a saved simulation setup.

# Defaults to 1-1 propensity score matching. Can use 10-fold stratification with stratify = TRUE
# Defaults to smoothed baseline hazard estimators. Can use discrete baseline estimators with discrete = TRUE
simulationStudy <- runSimulationStudy(simulationProfile = simulationProfile, simulationSetup = simulationSetup,
                                      cohortMethodData = cohortMethodData, simulationRuns = 10,
                                      trueEffectSize = 1.0, outcomePrevalence = 0.05, hdpsFeatures = hdpsFeatures)
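
A variant using stratification instead of matching, per the option noted in the comments above (illustrative):

simulationStudy <- runSimulationStudy(simulationProfile = simulationProfile, simulationSetup = simulationSetup,
                                      cohortMethodData = cohortMethodData, simulationRuns = 10,
                                      trueEffectSize = 1.0, outcomePrevalence = 0.05, hdpsFeatures = hdpsFeatures,
                                      stratify = TRUE)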

trueEffectSizeList <- c(log(1), log(1.5), log(2), log(4))
outcomePrevalenceList <- c(0.001, 0.01, 0.05)
simulationSetupFolder <- "path/to/simulationSetupFolder"
outputFolder <- "path/to/outputFolder"

simulationStudies <- runSimulationStudies(simulationProfile, cohortMethodData, simulationRuns = 100,
                                          trueEffectSizeList = trueEffectSizeList,
                                          outcomePrevalenceList = outcomePrevalenceList,
                                          hdpsFeatures = hdpsFeatures,
                                          simulationSetupFolder = simulationSetupFolder,
                                          outputFolder = outputFolder)

simulationStudies <- loadSimulationStudies(outputFolder)

Look at the results. The calculateMetrics function takes a simulationStudy output and returns the bias, standard deviation, RMSE, coverage of the true effect size, and AUC, along with the number of covariates with standardized difference greater than a given threshold before and after matching, for each PS method.

# View settings used in the simulation
simulationStudy$settings

# Calculate the set of metrics for a single simulation
metrics <- calculateMetrics(simulationStudy, simulationProfile$partialCMD, stdDiffThreshold = .05)

# Calculate metrics for simulation studies (a nested list over effect sizes and outcome prevalences)
metricsList <- calculateMetricsList(simulationStudies, cohortMethodData, stdDiffThreshold = .05)
# e.g. the first effect size and first outcome prevalence
metricsList[[1]][[1]]

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.