Name: PropensityScoreEvaluation
Owner: Observational Health Data Sciences and Informatics
Description: null
Created: 2017-12-01 18:08:31.0
Updated: 2017-12-01 18:15:03.0
Pushed: 2017-12-01 18:14:56.0
Homepage: null
Size: 90
Language: R
This study will evaluate the performance of different propensity score methods.
```r
install.packages("devtools")
library(devtools)
install_github("ohdsi/OhdsiRTools")
install_github("ohdsi/SqlRender")
install_github("ohdsi/DatabaseConnector")
install_github("ohdsi/Cyclops", ref = "HDPS")
install_github("ohdsi/FeatureExtraction")
install_github("ohdsi/CohortMethod", ref = "hdps_clean")
```
Obtain a CohortMethodData object by calling functions in CohortMethod. The example below follows the single-studies vignette; any existing CohortMethodData object works as well.
```r
library(PropensityScoreEvaluation)

connectionDetails <- createConnectionDetails(dbms = "postgresql",
                                             user = "joe",
                                             password = "secret",
                                             server = "myserver")

file <- "inst/sql/sql_server/coxibVsNonselVsGiBleed.sql"
exposureTable <- "coxibVsNonselVsGiBleed"
outcomeTable <- "coxibVsNonselVsGiBleed"
cdmVersion <- "4"
cdmDatabaseSchema <- "my_schema"
resultsDatabaseSchema <- "my_results"

# Use HDPS covariates or regular FeatureExtraction covariates
hdpsCovariates <- TRUE

cohortMethodData <- createCohortMethodData(connectionDetails = connectionDetails,
                                           file = file,
                                           exposureTable = exposureTable,
                                           outcomeTable = outcomeTable,
                                           cdmVersion = cdmVersion,
                                           cdmDatabaseSchema = cdmDatabaseSchema,
                                           resultsDatabaseSchema = resultsDatabaseSchema,
                                           hdpsCovariates = hdpsCovariates)
```
Create the study population and simulation profile. By default, cross-validation is used to find the Lasso hyperparameter (`useCrossValidation = TRUE`); set `useCrossValidation = FALSE` and specify a `priorVariance` to use a specific variance.
```r
studyPop <- createStudyPopulation(cohortMethodData = cohortMethodData,
                                  outcomeId = outcomeId,
                                  firstExposureOnly = FALSE,
                                  washoutPeriod = 0,
                                  removeDuplicateSubjects = FALSE,
                                  removeSubjectsWithPriorOutcome = TRUE,
                                  minDaysAtRisk = 1,
                                  riskWindowStart = 0,
                                  addExposureDaysToStart = FALSE,
                                  riskWindowEnd = 30,
                                  addExposureDaysToEnd = TRUE)

simulationProfile <- createCMDSimulationProfile(cohortMethodData,
                                                studyPop = studyPop,
                                                outcomeId = outcomeId,
                                                useCrossValidation = FALSE,
                                                priorVariance = 1)
saveSimulationProfile(simulationProfile, file = file)
simulationProfile <- loadSimulationProfile(file = file)
```
The simulation is divided into two steps: a simulation setup step and a run simulation step. The setup step selects unmeasured-confounding covariates and a desired sample size, if specified, and estimates the LASSO-regularized propensity score and the exposure-based hdPS. The run step simulates event times and estimates propensity-score-adjusted outcome models.
- Confounding proportion: a number between 0 and 1 giving the fraction of covariates to discard in propensity score estimation; set to `NA` for no unmeasured confounding.
- Sample size: `NA` to use the full cohort; otherwise the given size, which should be smaller than the full cohort.
- Cross-validation can be turned off in order to specify a particular variance for the L1-regularized propensity score.
```r
simulationSetup <- setUpSimulation(simulationProfile, cohortMethodData,
                                   useCrossValidation = TRUE,
                                   confoundingProportion = NA,
                                   sampleSize = NA,
                                   hdpsFeatures = hdpsFeatures)
saveSimulationSetup(simulationSetup, file = file)
simulationSetup <- loadSimulationSetup(file = file)
```
We can create simulation setups en masse:
```r
confoundingProportionList <- c(NA, 0.25)
sampleSizeList <- c(NA, 5000)
outputFolder <- outputFolder
setUpSimulations(simulationProfile, cohortMethodData, confoundingProportionList,
                 useCrossValidation = TRUE, sampleSizeList, outputFolder,
                 hdpsFeatures = hdpsFeatures)
```
For each simulation setup, we can run simulations for a given true effect size and outcome prevalence with `runSimulationStudy`. We can also run a list of effect sizes and outcome prevalences with `runSimulationStudies`, which needs the folder name of a saved simulation setup.
Defaults to 1:1 propensity score matching; 10-fold stratification can be used instead with `stratify = TRUE`.
Defaults to smoothed baseline hazard estimators; discrete baseline estimators can be used with `discrete = TRUE`.
```r
simulationStudy <- runSimulationStudy(simulationProfile = simulationProfile,
                                      simulationSetup = simulationSetup,
                                      cohortMethodData = cohortMethodData,
                                      simulationRuns = 10,
                                      trueEffectSize = 1.0,
                                      outcomePrevalence = 0.05,
                                      hdpsFeatures = hdpsFeatures)
```
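As a sketch of the non-default options described above, the same run with 10-fold stratification and discrete baseline hazard estimators might look as follows (this assumes `stratify` and `discrete` are passed directly as arguments of `runSimulationStudy`, as the notes above suggest):

```r
# Hypothetical variant of the call above: 10-fold stratification instead of
# 1:1 matching, and discrete instead of smoothed baseline hazard estimators.
# The stratify and discrete arguments are assumed from the description above.
simulationStudyStratified <- runSimulationStudy(simulationProfile = simulationProfile,
                                                simulationSetup = simulationSetup,
                                                cohortMethodData = cohortMethodData,
                                                simulationRuns = 10,
                                                trueEffectSize = 1.0,
                                                outcomePrevalence = 0.05,
                                                hdpsFeatures = hdpsFeatures,
                                                stratify = TRUE,
                                                discrete = TRUE)
```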
```r
trueEffectSizeList <- c(log(1), log(1.5), log(2), log(4))
outcomePrevalenceList <- c(0.001, 0.01, 0.05)
simulationSetupFolder <- simulationSetupFolder
outputFolder <- outputFolder
simulationStudies <- runSimulationStudies(simulationProfile, cohortMethodData,
                                          simulationRuns = 100,
                                          trueEffectSizeList = trueEffectSizeList,
                                          outcomePrevalenceList = outcomePrevalenceList,
                                          hdpsFeatures = hdpsFeatures,
                                          simulationSetupFolder = simulationSetupFolder,
                                          outputFolder = outputFolder)
simulationStudies <- loadSimulationStudies(outputFolder)
```
Look at the results. The `calculateMetrics` function takes a `simulationStudy` output and returns the bias, standard deviation, RMSE, coverage of the true effect size, AUC, and the number of covariates with a standardized difference greater than a given threshold before and after matching, for each propensity score method.
```r
# View settings used in the simulation
simulationStudy$settings

# Calculate the set of metrics for a simulation
metrics <- calculateMetrics(simulationStudy, simulationProfile$partialCMD, stdDiffThreshold = .05)

# Calculate metrics for simulation studies (nested list over effect sizes and outcome prevalences)
metricsList <- calculateMetricsList(simulationStudies, cohortMethodData, stdDiffThreshold = .05)
metricsList[[1]][[1]]
```
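To inspect every combination rather than a single element, the nested list can be walked directly. A minimal sketch, assuming `metricsList` is indexed first by effect size and then by outcome prevalence as described above (the structure of each individual metrics object depends on `calculateMetricsList`):

```r
# Walk the nested metrics list (effect sizes x outcome prevalences) and
# print the metrics for each combination. Indexing order is assumed from
# the description above.
for (i in seq_along(metricsList)) {
  for (j in seq_along(metricsList[[i]])) {
    cat("Effect size setting", i, "/ outcome prevalence setting", j, "\n")
    print(metricsList[[i]][[j]])
  }
}
```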