Birth tables
For a population model to reproduce a target population, it helps to recreate the size of each
cohort in that population. Historical birth counts can misestimate the cohort sizes, since
migration changes them over time. The miscore.processes.birth.calc_adjusted_birth_table()
function helps to replicate the correct cohort sizes in a population model. It looks up the total
cumulative mortality of each cohort and finds the cohort size at birth that, with that level of
mortality, leads to the current cohort size. This tutorial explains how the function works and
gives several examples of its use.
The method explained
Suppose we have a population in which, as of 2021, 10,000 people born in 1960 and 12,000 people
born in 1970 are alive. We want to simulate their entire lifetimes, but also want to make sure
that we get the same distribution of birth years in 2021. To get the birth counts needed to recreate
this distribution, we call miscore.processes.birth.calc_adjusted_birth_table().
from miscore.processes.birth import calc_adjusted_birth_table
from miscore.processes.oc.data import get_cohort_table

# Each entry is (number of people alive in the reference year, birth year).
current_cohort_sizes = [
    (10000, 1960),
    (12000, 1970)
]

abt = calc_adjusted_birth_table(
    data=current_cohort_sizes,
    sex="total",
    country="NL",
    year=2021
)

print(abt[1960])
11114.20561956452
For each birth year, the function finds the corresponding life table, specific to the sex and
country of our target population. It then divides the current population size by the cumulative
survival of the cohort to get the birth count that would replicate its current size. We can
replicate the result by calling miscore.processes.oc.data.get_cohort_table() ourselves
and dividing the population size by the cumulative survival at age 61 for the 1960 cohort:
survival = 1 - get_cohort_table(
    country="NL", cohorts=[1960], sex="total", cohort_table=True
)[1960][61][0]

print(10000 / survival)
11114.20561956452
Note
We subtract the output from get_cohort_table from 1 to get the cumulative
survival, since get_cohort_table reports cumulative mortality by default.
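The arithmetic behind the method can be sketched without MISCore. The snippet below is a minimal illustration, not the library's implementation: the helper name adjusted_birth_count and the cumulative mortality of 0.10 are hypothetical, chosen for readability rather than taken from a Dutch life table.

```python
def adjusted_birth_count(current_size: float, cumulative_mortality: float) -> float:
    """Return the birth count that leaves `current_size` survivors.

    Hypothetical helper: it mirrors the division described in the text,
    not MISCore's actual code.
    """
    if not 0 <= cumulative_mortality < 1:
        raise ValueError("cumulative_mortality must be in [0, 1)")
    cumulative_survival = 1 - cumulative_mortality
    return current_size / cumulative_survival


# Assumed value: 10% of the cohort died before the reference year.
births = adjusted_birth_count(10_000, 0.10)
print(round(births))
```

Multiplying the result by the cumulative survival gives back the current cohort size, which is the property the adjusted birth table is meant to guarantee.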
Implementation in a population model
To demonstrate the method, we implement it in the basic population model from the
Run a population model tutorial.
In this example, the population contains individuals born in 1995 and 1996, and we want to
replicate their cohort sizes in the year 2021: 2000 and 3000, respectively.
We can then pass the result from calc_adjusted_birth_table to our PopulationModel.run() call.
from miscore import Universe
from miscore.processes import Birth, OC
from miscore.processes.birth import calc_adjusted_birth_table
from miscore.tools.cohort import Cohort, PopulationModel

birth1995 = Birth(year=1995)
oc1995 = OC(age=80)
universe1995 = Universe(name="universe", processes=[birth1995, oc1995])
cohort1995 = Cohort(name="cohort1995", universes=[universe1995])

birth1996 = Birth(year=1996)
oc1996 = OC(age=81)
universe1996 = Universe(name="universe", processes=[birth1996, oc1996])
cohort1996 = Cohort(name="cohort1996", universes=[universe1996])

model = PopulationModel(cohorts=[cohort1995, cohort1996])

n = calc_adjusted_birth_table(
    data=[(2000, 1995), (3000, 1996)],
    sex="total",
    country="NL",
    year=2021,
    cohort_names=["cohort1995", "cohort1996"],
    sum_to=10000
)

result = model.run(
    n=n,
    seed=123
)
Note
We supply cohort_names with the same names given to the cohorts, so that
PopulationModel.run() knows how to assign the cohort sizes. By default, the keys in the
Dict returned by calc_adjusted_birth_table are the birth years as integers. When birth
years are repeated in the input data, cohort_names is required to prevent duplicate
keys.
We also supply the argument sum_to, which allows us to scale the birth counts so that they
sum to a target population. This can be helpful when we want to perform larger simulations with
a number of simulated individuals in excess of the target population size.
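One way to picture the effect of sum_to is as proportional rescaling. The sketch below assumes that every birth count is multiplied by the same factor so that the total equals the target; that is our reading of the description above, not a confirmed detail of MISCore. The helper name scale_to_total and the input counts are hypothetical.

```python
def scale_to_total(counts: dict, target_total: float) -> dict:
    """Rescale counts proportionally so that they sum to target_total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    factor = target_total / total
    return {name: value * factor for name, value in counts.items()}


# Hypothetical unscaled birth counts for the two cohorts.
unscaled = {"cohort1995": 2010.0, "cohort1996": 3015.0}
scaled = scale_to_total(unscaled, 10_000)
print(scaled)
```

Under this assumption the ratio between the cohorts is unchanged; only the overall number of simulated individuals grows.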
Note
MISCore gives a warning:
UserWarning: Cohort(s) {1995.0, 1996.0} not available, supplying cohort with initial birth year 1990 instead.
There is no built-in cohort life table for cohorts 1995 and 1996.
MISCore therefore uses the table for 1990 instead, which is the one closest to 1995 and 1996.
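The fallback described in this note amounts to a nearest-neighbour lookup on birth year. The sketch below is an illustration only: the list of available cohorts is hypothetical, and this tutorial does not state how MISCore breaks ties between two equally distant cohorts.

```python
def closest_cohort(requested: int, available: list) -> int:
    """Return the available birth year closest to the requested one."""
    return min(available, key=lambda year: abs(year - requested))


# Hypothetical set of cohorts that have a built-in table.
available_cohorts = [1960, 1970, 1980, 1990]

for birth_year in (1995, 1996):
    print(birth_year, "->", closest_cohort(birth_year, available_cohorts))
```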
Use with different reference years and sexes
We may want to get cohort sizes relative to different reference years. For example,
we may want to evaluate an intervention that started in different years for the two
cohorts in the population, and we only know their sizes at the time of the
intervention. In this case, we pass year not as a single value but as a list with the
correct year for each cohort in data.
from miscore.processes.birth import calc_adjusted_birth_table

current_cohort_sizes = [
    (3000, 1980),
    (5000, 1980)
]

abt = calc_adjusted_birth_table(
    data=current_cohort_sizes,
    sex="total",
    country="NL",
    year=[2010, 2020],
    cohort_names=["early_intervention", "late_intervention"]
)

print(abt)
{'early_intervention': 3056.477821605622, 'late_intervention': 5120.439179101247}
To have 3000 people alive for the early intervention in 2010, we need to simulate 3056 people. For the late intervention cohort, we need to simulate 5120 people to have 5000 alive in 2020.
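The ratio between each input size and the returned birth count gives the cumulative survival that the function applied. The snippet below only redoes that division with the numbers printed above; the variable names are ours.

```python
# Cohort sizes passed in and birth counts printed in the example above.
sizes = {"early_intervention": 3000, "late_intervention": 5000}
births = {
    "early_intervention": 3056.477821605622,
    "late_intervention": 5120.439179101247,
}

# Implied cumulative survival from birth to the reference year.
implied_survival = {name: sizes[name] / births[name] for name in sizes}

for name, survival in implied_survival.items():
    print(f"{name}: {survival:.4f}")
```

The late cohort shows the lower survival, which is what we would expect: both groups were born in 1980, but the late group is followed to 2020 instead of 2010.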
In most cases, we model the sexes separately, since they often have
different parameters for most processes. However, for a population run it might be easier to
model the two sexes together. Just as with year, we can pass sex as a sequence to get
cohort sizes for cohorts of different sexes. For example, suppose that in the previous example
the early intervention group was male and the late intervention group was female:
abt = calc_adjusted_birth_table(
    data=current_cohort_sizes,
    sex=["male", "female"],
    country="NL",
    year=[2010, 2020],
    cohort_names=["early_intervention", "late_intervention"]
)

print(abt)
{'early_intervention': 3068.3054117452757, 'late_intervention': 5095.047028250785}
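Both calls in this section use the birth year 1980 twice, which is why cohort_names is supplied. The plain-Python sketch below shows what goes wrong when results are keyed by birth year alone. It does not call MISCore, and the birth counts are rounded placeholders.

```python
# (birth count, birth year) pairs; the counts are rounded placeholders.
results = [(3056, 1980), (5120, 1980)]

# Keyed by birth year: the second entry silently overwrites the first.
by_year = {year: count for count, year in results}

# Keyed by explicit names: both cohorts are kept.
names = ["early_intervention", "late_intervention"]
by_name = {name: count for name, (count, _) in zip(names, results)}

print(by_year)
print(by_name)
```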
Use of cohort data files
To reproduce the Dutch population, the Dutch cohort sizes in the year 2021 can be retrieved from
miscore.processes.birth.data.nl_2021. Note that for cohorts that are very old, or very
recent, the exact cohort-specific life table will not be available. A warning will be raised, and
the closest matching cohort life table will be used.
import pandas as pd

from miscore.processes.birth import calc_adjusted_birth_table
from miscore.processes.birth.data import nl_2021

cohort_sizes = [
    *nl_2021.male_cohort_counts,
    *nl_2021.female_cohort_counts
]

sex = [
    *(['male'] * len(nl_2021.male_cohort_counts)),
    *(['female'] * len(nl_2021.female_cohort_counts))
]

cohort_names = [
    sex[i] + "_" + str(cohort_sizes[i][1]) for i in range(len(sex))
]

adjusted_birth_table = calc_adjusted_birth_table(
    data=cohort_sizes,
    sex=sex,
    country="NL",
    year=2021,
    cohort_names=cohort_names
)

print(pd.DataFrame.from_dict(adjusted_birth_table, "index", columns=["Birth count"]))
               Birth count
male_2020     87003.034961
male_2019     88052.662508
male_2018     88036.907592
male_2017     89402.955643
male_2016     91132.582880
...                    ...
female_1925  122343.865786
female_1924  134409.750522
female_1923  147407.178222
female_1922  172594.819047
female_1921  192942.499192

[200 rows x 1 columns]
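The list construction in this example can be tried without the data file. The sketch below uses two short stand-in lists in place of nl_2021.male_cohort_counts and nl_2021.female_cohort_counts. The counts are made up, and the (size, birth year) tuple layout is assumed to match the data argument used earlier in this tutorial.

```python
# Stand-in data: (current cohort size, birth year). Values are made up.
male_counts = [(900, 2020), (950, 2019)]
female_counts = [(880, 2020), (940, 2019)]

cohort_sizes = [*male_counts, *female_counts]
sex = ["male"] * len(male_counts) + ["female"] * len(female_counts)

# One unique name per (sex, birth year) pair.
cohort_names = [
    f"{s}_{birth_year}" for s, (_, birth_year) in zip(sex, cohort_sizes)
]

print(cohort_names)
```

Combining sex and birth year in the name keeps every key unique, even though each birth year appears once per sex.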