Nordpred India Forecasts

This package contains scripts for processing disease data, interpolating and forecasting population data from census years, and running nordpred analyses on the results.

Requirements

Python 3.6+
Required packages:
- pandas
- numpy
- scipy
- matplotlib

Installation

Create a virtual environment (recommended):

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install required packages:

pip install pandas numpy scipy matplotlib

Set up R environment:
- Install R from CRAN.
- Install required R packages:
```
Rscript -e "install.packages(c('dplyr', 'ggplot2', 'tidyr'), repos='https://cran.rstudio.com/')"
```
- Copy the nordpred.s file to your working directory (required for nordpred analysis)

Disease Data Processing: Usage

Note: Use python3 in place of python if that is how Python 3 is invoked on your system. Adding a shell alias for your Python interpreter can make this easier.

Navigate to the project directory:
```
cd <workspace>/nordpred-india-forecasts
```
Run the script:
```
python reading-data/read_diabetes_data.py <input_csv> --output-base <output_base>
```
- <input_csv>: Path to the input CSV file (e.g., 18-groups-1991-2021-input-data/Nagaland-18groups-1991-2021.csv).
- <output_base>: Base name for the output files (e.g., NewScripts/processed-files/nagaland_processed).
If --output-base is not provided, the script will prompt you to enter a base name.
Output:
- The script generates two files:
  - <output_base>_male.txt
  - <output_base>_female.txt
- These files are in nordpred format (tab-separated text, years as columns, age groups as rows).
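
As a quick sanity check on the generated files, the following sketch loads a nordpred-format table with pandas. The sample values, the age-group labels, and the assumption that the first column holds the age-group labels are illustrative only; check them against an actual output file.

```python
import io

import pandas as pd

# Hypothetical sample in the layout described above: tab-separated,
# one row per age group, one column per year. The values are made up.
sample = (
    "\t1991\t1992\t1993\n"
    "0-4\t10\t12\t11\n"
    "5-9\t20\t21\t23\n"
)

# index_col=0 assumes the first column holds the age-group labels.
# For a real file, pass its path instead of the StringIO object.
cases = pd.read_csv(io.StringIO(sample), sep="\t", index_col=0)

print(cases.shape)          # (number of age groups, number of years)
print(list(cases.columns))  # year labels, read as strings
```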

Example

To process the Nagaland data:

python3 reading-data/read_diabetes_data.py 18-groups-1991-2021-input-data/Nagaland-18groups-1991-2021.csv --output-base NewScripts/processed-files/nagaland_processed

To process the Global data:

python3 reading-data/read_diabetes_data.py 18-groups-1991-2021-input-data/GlobalType1-18groups-1991-2021.csv --output-base NewScripts/processed-files/global_processed

Notes

The script automatically detects the CSV header format (multi-header or single-header) and extracts the year column accordingly.
Confidence intervals in the data are handled by extracting the first value from each cell.
Ensure that the age-groups match between what the script expects and what is in your csv file. To see what the script expects, visit nordpred-india-forecasts/reading-data/read_diabetes_data.py and check age_map.
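
The confidence-interval handling described above can be pictured with a small sketch. The cell layout used here (a point estimate followed by an interval in parentheses) and the helper name `first_value` are assumptions for illustration, not taken from read_diabetes_data.py.

```python
import re

# Matches the first number in a cell, allowing thousands separators
# and a decimal part, e.g. "1,234.5 (1,100.0-1,400.2)".
_FIRST_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def first_value(cell: str) -> float:
    """Return the first numeric value found in a cell (hypothetical helper)."""
    match = _FIRST_NUMBER.search(cell)
    if match is None:
        raise ValueError(f"no numeric value in cell: {cell!r}")
    return float(match.group(0).replace(",", ""))


print(first_value("12.5 (10.1-14.9)"))
print(first_value("1,234 (1,100-1,400)"))
```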

Population file generation: Usage

Population Data Processing

The population processing script (process-population.py) processes census data and generates population predictions.

Required Files

Census Data Files:
- 1991.csv: 1991 census data
- 2001.csv: 2001 census data
- 2011.csv: 2011 census data
- These files should be in the input directory
- CSV format with columns: State, Age, Males, Females
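
To check a census file against this layout before running the script, a minimal pandas sketch might look like the following. The rows, age labels, and counts are made up; only the four column names come from the description above.

```python
import io

import pandas as pd

# Hypothetical rows in the documented layout (State, Age, Males, Females).
# The age labels and counts are made up for illustration.
sample = io.StringIO(
    "State,Age,Males,Females\n"
    "Goa,0-4,50000,48000\n"
    "Goa,5-9,52000,50000\n"
    "Nagaland,0-4,70000,68000\n"
)

census = pd.read_csv(sample)

# Fail early if a required column is missing.
required = {"State", "Age", "Males", "Females"}
missing = required - set(census.columns)
if missing:
    raise ValueError(f"missing columns: {sorted(missing)}")

# Select one state and one sex, indexed by age group.
goa_males = census.loc[census["State"] == "Goa"].set_index("Age")["Males"]
print(goa_males)
```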

Running the Script

python NewPopulationScripts/process-population.py --state <state> --gender <gender> --input-dir <directory> --output-dir <directory> [--start-year <year>] [--end-year <year>] [--forecast-years <years>]

Options:

--state: State name (e.g., Goa)
--gender: Gender (Male/Female)
--input-dir: Directory containing census CSV files
--output-dir: Directory to save output files (default: "output")
--start-year: Start year for interpolation (default: 1990)
--end-year: End year for interpolation (default: 2021)
--forecast-years: Comma-separated list of years to forecast (default: "2025,2030,2035,2040")
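
For orientation, the options above could be declared with argparse roughly as follows. This is a hypothetical reconstruction from the option list, not the parser in process-population.py; in particular, which options are required and the `choices` restriction on --gender are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options listed above."""
    parser = argparse.ArgumentParser(description="Process census population data")
    parser.add_argument("--state", required=True, help="State name, e.g. Goa")
    parser.add_argument("--gender", required=True, choices=["Male", "Female"])
    parser.add_argument("--input-dir", required=True, help="Directory with census CSV files")
    parser.add_argument("--output-dir", default="output")
    parser.add_argument("--start-year", type=int, default=1990)
    parser.add_argument("--end-year", type=int, default=2021)
    parser.add_argument("--forecast-years", default="2025,2030,2035,2040")
    return parser


args = build_parser().parse_args(
    ["--state", "Goa", "--gender", "Male", "--input-dir", "NewPopulationScripts"]
)
# Turn the comma-separated string into a list of integer years.
forecast_years = [int(year) for year in args.forecast_years.split(",")]
print(args.output_dir, args.start_year, args.end_year, forecast_years)
```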

Output Files

Historical Population: population-{gender}-{state}.txt
- Contains interpolated population data from 1991 to 2021
- Space-separated values
- Years as columns, age groups as rows
- Example: population-male-goa.txt
Predicted Population: population-{gender}-{state}-pred.txt
- Contains population predictions for 2025-2040
- Space-separated values
- Years as columns, age groups as rows
- Example: population-male-goa-pred.txt
Visualization: population_forecast.png
- Shows historical and predicted population trends
- Includes all age groups
- Historical data (solid lines) and predictions (dashed lines)

Example

python NewPopulationScripts/process-population.py --state Goa --gender Male --input-dir NewPopulationScripts --output-dir output

This will:

Read census data from NewPopulationScripts/1991.csv, 2001.csv, and 2011.csv
Process data for Goa, male population
Generate interpolated data (1991-2021)
Create population predictions (2025-2040)
Save output files in the output directory
Create visualization of the population trends

Output Files

The script generates the following output files in the specified output directory:

population-{gender}-{state}.txt: Contains interpolated population data from 1991 to 2021
population-{gender}-{state}-pred.txt: Contains forecasted population data for future years
population_forecast.png: Visualization of population trends
population_forecast_log_scale.png: Log-scale visualization of population trends

Example

To process male population data for Nagaland:

python3 process-population.py --state "Nagaland" --gender Male --input-dir ../population-interpolation-forecast-scripts

This will:

Read census data (1991.csv, 2001.csv, and 2011.csv) from the directory given by --input-dir
Process data for Nagaland, male population
Generate interpolated data (1991-2021, with the default year options)
Create population predictions (2025-2040, with the default forecast years)
Save output files in the output directory
Create visualization of the population trends

Notes

The script uses cubic spline interpolation for years between census data
For forecasting, it extends the series beyond the census years using cubic spline or linear extrapolation
All population values are rounded to integers
Negative values are not allowed in the output
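
The notes above can be illustrated with a short sketch using scipy's CubicSpline. The counts are made up, the clipping of negatives to zero is one possible reading of "negative values are not allowed", and the code is not taken from process-population.py.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Census years and made-up counts for a single age group.
census_years = np.array([1991, 2001, 2011])
census_counts = np.array([50000.0, 56000.0, 59000.0])

# Fit a cubic spline through the census points.
spline = CubicSpline(census_years, census_counts)

# Evaluate for every year from 1991 to 2021. Years after 2011 lie
# outside the census range, so the spline is extrapolating there.
years = np.arange(1991, 2022)
estimates = spline(years)

# Round to whole persons and clip any negative values to zero.
population = np.clip(np.rint(estimates), 0, None).astype(int)

for year, count in zip(years[:3], population[:3]):
    print(year, count)
```

With only three census points, extrapolated values can drift quickly, so it is worth plotting the result before relying on it.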

Nordpred Analysis

The nordpred analysis script (run-nordpred-analysis.R) performs age-standardized rate predictions using the nordpred package.
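
For readers unfamiliar with the term, direct age standardization weights each age-specific rate by a fixed standard population. The sketch below shows that arithmetic only; the counts and weights are made up, and it is not a description of how run-nordpred-analysis.R or nordpred computes its results.

```python
def age_standardized_rate(cases, population, weights, per=100_000):
    """Directly standardized rate per `per` persons.

    cases, population, weights: sequences with one entry per age group.
    """
    if not (len(cases) == len(population) == len(weights)):
        raise ValueError("inputs must have one entry per age group")
    total_weight = sum(weights)
    # Weight each age-specific rate (cases / population) by its standard share.
    weighted = sum(
        (c / p) * w for c, p, w in zip(cases, population, weights)
    )
    return per * weighted / total_weight


# Made-up example with three age groups.
cases = [10, 20, 30]
population = [100_000, 100_000, 50_000]
weights = [0.5, 0.3, 0.2]  # hypothetical standard population shares

print(age_standardized_rate(cases, population, weights))
```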

Required Files

Cases file: {state}-t1_{gender}.txt
- Contains incidence data
- Space-separated values
- Years as columns, age groups as rows
- Example: goa-t1_male.txt
Historical Population: population-{gender}-{state}.txt
- Contains historical population data
- Space-separated values
- Years as columns, age groups as rows
- Example: population-male-goa.txt
Predicted Population: population-{gender}-{state}-pred.txt
- Contains future population predictions
- Space-separated values
- Years as columns, age groups as rows
- Example: population-male-goa-pred.txt
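
Before running the analysis, it can help to confirm that all three inputs are present. The helper names below are hypothetical; only the file-name patterns come from the list above.

```python
from pathlib import Path


def required_files(input_dir, state, gender):
    """Build the three input paths from the naming patterns listed above."""
    base = Path(input_dir)
    return [
        base / f"{state}-t1_{gender}.txt",               # cases
        base / f"population-{gender}-{state}.txt",       # historical population
        base / f"population-{gender}-{state}-pred.txt",  # predicted population
    ]


def missing_files(input_dir, state, gender):
    """Return the expected input files that do not exist yet."""
    return [p for p in required_files(input_dir, state, gender) if not p.is_file()]


for path in missing_files("test", "goa", "male"):
    print(f"missing: {path}")
```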

Running the Analysis

Rscript run-nordpred-analysis.R --input-dir <directory> --state <state> --gender <gender> --plot-type <type>

Options:

--input-dir: Directory containing input files (default: "test")
--state: State name (e.g., goa)
--gender: Gender (male/female)
--plot-type: Type of plot to generate
- main: Main prediction plot (default)
- trends: Trend scenarios plot
- both: Generate both plots

Trend Scenarios

The trends plot shows three different prediction scenarios:

No trend (solid black line): Assumes no change in rates
Full trend (dashed red line): Uses the full observed trend
Recent trend (dotted blue line): Uses a weighted trend, giving more weight to recent years

Output

nordpred_plot_{state}_{gender}.png: Main prediction plot
nordpred_trends_{state}_{gender}.png: Trend scenarios plot
nordpred_predictions_{state}_{gender}.csv: Predicted rates

Example

To process the Nagaland data:

python3 reading-data/read_diabetes_data.py 18-groups-1991-2021-input-data/Nagaland-18groups-1991-2021.csv --output-base nagaland_type1

To process the Global data:

python3 reading-data/read_diabetes_data.py 18-groups-1991-2021-input-data/GlobalType1-18groups-1991-2021.csv --output-base NewScripts/processed-files/global_processed

To run nordpred analysis for Goa:

Rscript run-nordpred-analysis.R --input-dir test --state goa --gender male --plot-type both

License

This project is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). See the LICENSE file for details.

You are free to:

Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material for any purpose, even commercially

Under the following terms:

Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made.

pavanimajety/nordpred-india-forecasts
