Birth tables

For a population model to reproduce a target population, it helps to recreate the size of each cohort in that population. Historical birth counts can misestimate cohort sizes, because migration changes cohort sizes over time. The miscore.processes.birth.calc_adjusted_birth_table() function helps to replicate the correct cohort sizes in a population model. It looks up the total cumulative mortality of each cohort and finds the cohort size at birth that, given that mortality, would lead to the current cohort size. This tutorial explains how the function works and gives several examples of its use.

The method explained

To demonstrate, suppose we have a population in which, as of 2021, 10,000 people born in 1960 and 12,000 people born in 1970 are alive. We want to simulate their entire lifetimes, but we also want the simulated population to have the same distribution of birth years in 2021. To get the birth counts needed to recreate this distribution, we call miscore.processes.birth.calc_adjusted_birth_table().

from miscore.processes.birth import calc_adjusted_birth_table
from miscore.processes.oc.data import get_cohort_table

# Each entry is (current cohort size, birth year).
current_cohort_sizes = [
    (10000, 1960),
    (12000, 1970)
]

abt = calc_adjusted_birth_table(
    data=current_cohort_sizes,
    sex="total",
    country="NL",
    year=2021
)

print(abt[1960])

11114.20561956452

For each birth year, the function finds the corresponding life table, specific to the sex and country of our target population. It then divides the current cohort size by the cumulative survival of the cohort to get the birth count that would replicate that size. We can replicate the result by calling miscore.processes.oc.data.get_cohort_table() ourselves and dividing the population size by the cumulative survival at age 61 for the 1960 cohort:

survival = 1 - get_cohort_table(
    country="NL", cohorts=[1960], sex="total", cohort_table=True
)[1960][61][0]

print(10000 / survival)

11114.20561956452

Note

We subtract the output from get_cohort_table from 1 to get the cumulative survival, since get_cohort_table reports cumulative mortality by default.
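The adjustment itself is a single division. As a minimal sketch that does not depend on MISCore, the hypothetical helper `adjusted_birth_count` below takes a current cohort size and a cumulative mortality. The mortality of 0.25 is made up for illustration and is not taken from any life table.

```python
def adjusted_birth_count(current_size: float, cumulative_mortality: float) -> float:
    """Return the birth count that leaves `current_size` survivors.

    Hypothetical helper for illustration; not part of MISCore.
    """
    if not 0 <= cumulative_mortality < 1:
        raise ValueError("cumulative_mortality must be in [0, 1)")
    # Cumulative survival is the complement of cumulative mortality.
    survival = 1 - cumulative_mortality
    return current_size / survival


# Illustrative value only: a quarter of the cohort has died by the reference year.
births = adjusted_birth_count(7500, 0.25)
print(births)
```

With a quarter of the cohort gone, 7,500 survivors imply 10,000 births. A cumulative mortality of 1 is rejected, because no birth count could then produce any survivors.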

Implementation in a population model

To demonstrate the use of the method, we implement it in the basic population model from the Run a population model tutorial. In this example, the population contains individuals born in 1995 and 1996, and we want to replicate their cohort sizes in the year 2021, which are 2000 and 3000, respectively. We can then pass the result from calc_adjusted_birth_table to our PopulationModel.run() call.

from miscore import Universe
from miscore.processes import Birth, OC
from miscore.processes.birth import calc_adjusted_birth_table
from miscore.tools.cohort import Cohort, PopulationModel

birth1995 = Birth(year=1995)
oc1995 = OC(age=80)
universe1995 = Universe(name="universe", processes=[birth1995, oc1995])
cohort1995 = Cohort(name="cohort1995", universes=[universe1995])

birth1996 = Birth(year=1996)
oc1996 = OC(age=81)
universe1996 = Universe(name="universe", processes=[birth1996, oc1996])
cohort1996 = Cohort(name="cohort1996", universes=[universe1996])

model = PopulationModel(cohorts=[cohort1995, cohort1996])

n = calc_adjusted_birth_table(
    data=[(2000, 1995), (3000, 1996)],
    sex="total",
    country="NL",
    year=2021,
    cohort_names=["cohort1995", "cohort1996"],
    sum_to=10000
)

result = model.run(
    n=n,
    seed=123
)

Note

We supply cohort_names with the same names we gave to the cohorts, so that PopulationModel.run() knows how to assign the cohort sizes. By default, the keys of the dict returned by calc_adjusted_birth_table are the birth years as integers. When birth years are repeated in the input data, cohort_names is required to prevent duplicate keys. We also supply the argument sum_to, which scales the birth counts so that they sum to a target population size. This is helpful when we want to run larger simulations, with more simulated individuals than the target population contains.
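The idea behind scaling birth counts to a fixed total can be sketched in a few lines. This is only an illustration, assuming plain proportional rescaling; how MISCore implements sum_to internally (for example, whether it rounds) is not covered by this tutorial. The helper `scale_to_total` and the input counts are hypothetical.

```python
def scale_to_total(birth_counts: dict, total: float) -> dict:
    """Rescale birth counts proportionally so that they sum to `total`.

    Hypothetical helper for illustration; not part of MISCore.
    """
    current_total = sum(birth_counts.values())
    if current_total <= 0:
        raise ValueError("birth counts must sum to a positive number")
    factor = total / current_total
    return {name: count * factor for name, count in birth_counts.items()}


# Made-up birth counts, not the adjusted counts computed by MISCore.
unscaled = {"cohort1995": 2000.0, "cohort1996": 3000.0}
scaled = scale_to_total(unscaled, 10000)
print(scaled)
```

Proportional scaling keeps the ratio between the cohorts unchanged, so the distribution of birth years is preserved while the simulation size grows.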

Note

MISCore gives a warning: UserWarning: Cohort(s) {1995.0, 1996.0} not available, supplying cohort with initial birth year 1990 instead. There is no built-in cohort life table for the 1995 and 1996 cohorts. MISCore therefore uses the table for the 1990 cohort instead, which is the closest one available.
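The fallback described in the warning amounts to a nearest-neighbour lookup over the birth years for which a table exists. A minimal sketch follows; the list of available years is hypothetical, the helper `closest_cohort` is not part of MISCore, and how MISCore breaks ties is an assumption left open here.

```python
def closest_cohort(birth_year: int, available: list) -> int:
    """Return the available birth year nearest to `birth_year`.

    Hypothetical helper; on a tie, min() keeps the first candidate in the list.
    """
    return min(available, key=lambda year: abs(year - birth_year))


# Hypothetical set of birth years with a built-in table.
available_years = [1970, 1980, 1990]

for requested in (1995, 1996):
    print(requested, "->", closest_cohort(requested, available_years))
```

Both requested cohorts map to 1990 under these assumed inputs, which matches the substitution reported in the warning.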

Use with different reference years and sexes

We may want to get cohort sizes relative to different reference years. For example, we may want to evaluate an intervention that started in a different year for each of the two cohorts in the population, while we only know the cohort sizes at the time the intervention started. In this case, we pass for year not a single year, but a list with the correct year for each cohort in data.

from miscore.processes.birth import calc_adjusted_birth_table

current_cohort_sizes = [
    (3000, 1980),
    (5000, 1980)
]

abt = calc_adjusted_birth_table(
    data=current_cohort_sizes,
    sex="total",
    country="NL",
    year=[2010, 2020],
    cohort_names=["early_intervention", "late_intervention"]
)

print(abt)

{'early_intervention': 3056.477821605622, 'late_intervention': 5120.439179101247}

To get 3000 people alive for the early intervention in 2010, we need to simulate 3056 people. To get 5000 people alive for the late intervention in 2020, we need to simulate 5120 people.
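Both cohorts were born in 1980, so the two reference years correspond to different attained ages, and therefore to different cumulative survival. A minimal sketch of that mapping, assuming the age is simply the reference year minus the birth year (consistent with the age-61 lookup for the 1960 cohort in 2021 earlier in this tutorial); the variable names are illustrative:

```python
# (cohort name, birth year, reference year), taken from the example above.
cohorts = [
    ("early_intervention", 1980, 2010),
    ("late_intervention", 1980, 2020),
]

# Age at which cumulative survival would be evaluated for each cohort.
ages = {
    name: reference_year - birth_year
    for name, birth_year, reference_year in cohorts
}

print(ages)
```

The late intervention cohort is evaluated ten years later in life, which is why it needs a larger upward adjustment relative to its current size.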

In most cases, we model populations of different sexes separately, since they often have different parameters for most processes. However, for a population run it might be easier to model the two sexes together. Just like year, sex can be passed as a sequence to get cohort sizes for cohorts of different sexes. For example, if in the previous example the early intervention group was male and the late intervention group was female:

abt = calc_adjusted_birth_table(
    data=current_cohort_sizes,
    sex=["male", "female"],
    country="NL",
    year=[2010, 2020],
    cohort_names=["early_intervention", "late_intervention"]
)

print(abt)

{'early_intervention': 3068.3054117452757, 'late_intervention': 5095.047028250785}

Use of cohort data files

To reproduce the Dutch population, the Dutch cohort sizes in the year 2021 can be retrieved from miscore.processes.birth.data.nl_2021. Note that for cohorts that are very old, or very recent, the exact cohort-specific life table will not be available. A warning will be raised, and the closest matching cohort life table will be used.

import pandas as pd

from miscore.processes.birth import calc_adjusted_birth_table
from miscore.processes.birth.data import nl_2021

cohort_sizes = [
    *nl_2021.male_cohort_counts,
    *nl_2021.female_cohort_counts
]

sex = [
    *(['male'] * len(nl_2021.male_cohort_counts)),
    *(['female'] * len(nl_2021.female_cohort_counts))
]

# Build unique names such as "male_2020" from each sex and birth year.
cohort_names = [
    f"{s}_{entry[1]}" for s, entry in zip(sex, cohort_sizes)
]

adjusted_birth_table = calc_adjusted_birth_table(
    data=cohort_sizes,
    sex=sex,
    country="NL",
    year=2021,
    cohort_names=cohort_names
)

print(pd.DataFrame.from_dict(adjusted_birth_table, "index", columns=["Birth count"]))

               Birth count
male_2020     87003.034961
male_2019     88052.662508
male_2018     88036.907592
male_2017     89402.955643
male_2016     91132.582880
...                    ...
female_1925  122343.865786
female_1924  134409.750522
female_1923  147407.178222
female_1922  172594.819047
female_1921  192942.499192

[200 rows x 1 columns]