Details
- Bug
- Status: Closed
- Resolution: Won't Fix
Description
In 2.1, a Cohort shifted from being a TreeSet of Integers to a TreeSet of new CohortMembership objects.
This adds performance overhead to basic Cohort operations. For instance, adding a set of 100,000 patient ids to a Cohort jumps from taking milliseconds to taking a second or two. This may seem insignificant, but Cohorts are used extensively for reporting and may be manipulated dozens of times in a single report, so this can add up.
We should find a way to maintain the performance of a Cohort for the majority use case that does not require start date and end date.
I tested switching from storing as a TreeSet to a HashSet... this gave a marginal improvement in performance, certainly not a game-changer.
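For anyone wanting to reproduce the comparison, here is a rough timing sketch. It is not the test I ran: `MembershipStub` is a hypothetical stand-in for CohortMembership (the real class has more fields and its own ordering), and `CohortAddBenchmark` and `timeAdds` are names made up for illustration. Naive `System.nanoTime()` timing is also sensitive to JIT warm-up, so treat the numbers as indicative only.

```java
import java.util.Collection;
import java.util.Date;
import java.util.HashSet;
import java.util.TreeSet;
import java.util.function.IntFunction;

// Hypothetical stand-in for CohortMembership; not the real OpenMRS class.
final class MembershipStub implements Comparable<MembershipStub> {
    final int patientId;
    final Date startDate; // null means "no start date"
    final Date endDate;   // null means "no end date"

    MembershipStub(int patientId) {
        this.patientId = patientId;
        this.startDate = null;
        this.endDate = null;
    }

    // Assumption: order by patient id only; the real ordering may differ.
    @Override
    public int compareTo(MembershipStub other) {
        return Integer.compare(patientId, other.patientId);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MembershipStub && ((MembershipStub) o).patientId == patientId;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(patientId);
    }
}

final class CohortAddBenchmark {
    // Adds n elements produced by the factory to target; returns elapsed nanoseconds.
    static <T> long timeAdds(Collection<T> target, int n, IntFunction<T> factory) {
        long start = System.nanoTime();
        for (int id = 1; id <= n; id++) {
            target.add(factory.apply(id));
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 100_000;
        long treeOfInts = timeAdds(new TreeSet<Integer>(), n, id -> Integer.valueOf(id));
        long treeOfStubs = timeAdds(new TreeSet<MembershipStub>(), n, MembershipStub::new);
        long hashOfStubs = timeAdds(new HashSet<MembershipStub>(), n, MembershipStub::new);
        System.out.println("TreeSet<Integer>        ms: " + treeOfInts / 1_000_000);
        System.out.println("TreeSet<MembershipStub> ms: " + treeOfStubs / 1_000_000);
        System.out.println("HashSet<MembershipStub> ms: " + hashOfStubs / 1_000_000);
    }
}
```

Because the stub is much lighter than the real CohortMembership, this will likely understate the gap described above; swapping in the real class would be needed to measure the actual overhead.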
Some potential solutions include:
- Reverting Cohort back to its original design and creating a new CohortWithDateRange object... this would introduce some backwards incompatibility
- Changing the implementation of Cohort so that it supports an underlying data storage of either Set<Integer> or Set<CohortMembership>, and only starts to use CohortMembership if the consumer specifies a start and/or end date when adding a Patient. This might work, but seems hacky and error-prone... we'd need to tightly restrict consumers from accessing the underlying model.
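To make the second option concrete, here is a minimal sketch of what the dual-storage approach might look like. Everything in it is hypothetical: `HybridCohort`, `DatedMembership`, and the method names are invented for illustration and are not the OpenMRS API. It also assumes one membership per patient, which the real model may not.

```java
import java.util.Collections;
import java.util.Date;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for CohortMembership.
final class DatedMembership implements Comparable<DatedMembership> {
    final int patientId;
    final Date startDate; // may be null
    final Date endDate;   // may be null

    DatedMembership(int patientId, Date startDate, Date endDate) {
        this.patientId = patientId;
        this.startDate = startDate;
        this.endDate = endDate;
    }

    // Simplifying assumption: one membership per patient, ordered by id.
    @Override
    public int compareTo(DatedMembership other) {
        return Integer.compare(patientId, other.patientId);
    }
}

// Hypothetical Cohort variant that stays on plain ids until a date is supplied.
final class HybridCohort {
    private Set<Integer> plainIds = new TreeSet<>();  // cheap storage for the common case
    private Set<DatedMembership> memberships = null;  // created lazily on first dated add

    void addPatient(int patientId) {
        addPatient(patientId, null, null);
    }

    void addPatient(int patientId, Date startDate, Date endDate) {
        if (memberships == null && startDate == null && endDate == null) {
            plainIds.add(patientId); // fast path: no membership object created
            return;
        }
        upgradeToMemberships();
        memberships.add(new DatedMembership(patientId, startDate, endDate));
    }

    // One-way switch: wrap every existing id in a membership with null dates.
    private void upgradeToMemberships() {
        if (memberships != null) {
            return;
        }
        memberships = new TreeSet<>();
        for (Integer id : plainIds) {
            memberships.add(new DatedMembership(id, null, null));
        }
        plainIds = null;
    }

    boolean usesMemberships() {
        return memberships != null;
    }

    int size() {
        return memberships == null ? plainIds.size() : memberships.size();
    }

    // Read-only result, so consumers cannot touch the underlying storage.
    Set<Integer> getPatientIds() {
        if (memberships == null) {
            return Collections.unmodifiableSet(plainIds);
        }
        Set<Integer> ids = new TreeSet<>();
        for (DatedMembership m : memberships) {
            ids.add(m.patientId);
        }
        return Collections.unmodifiableSet(ids);
    }
}
```

The sketch also shows why this option feels fragile: every accessor has to branch on which storage is active, and the first dated add pays a one-off cost to wrap all existing ids.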
Issue Links
- is related to
  - TRUNK-5375 New Core 2.1 Cohort module significantly slows down Cohort manipulation (Closed)
  - TRUNK-5331 CohortMembership should not require a startDate (Ready for Work)
- relates to
  - TRUNK-5379 Cohort Membership: Resolve design inconsistents (Closed)
  - TRUNK-5211 Cohort membership should allow null start date, and should default to this (Closed)