Fetching#

ahlive has many datasets readily available!

list datasets#

To list the available datasets, use list_datasets. Here, “urban” is the search pattern, and a sample of 8 matching datasets is shown.

[1]:
import ahlive as ah
ah.list_datasets(pattern="urban", sample=8)
- owid_urban_and_rural_populations_in_the_united_states_us_census_bureau_2010
- owid_urban_definition_population_threshold_un_2018
- owid_levels_of_urbanization_and_per_capita_gnp_in_various_regions_bairoch_1988
- owid_capital_city_population_un_urbanization_prospects_2018
- owid_urbanization_share_european_commission_atlas_of_the_human_planet
- owid_urban_and_rural_population_1950_2050_un_world_urbanization_prospects_2018
- owid_historical_urban_fraction_estimates_and_total_computed_urban_areas_hyde_31_2010
- owid_city_populations_1950_2035_un_urbanization_prospects_2018

open dataset#

To open a dataset as a dataframe, pass its label to open_dataset.

[2]:
import ahlive as ah
ah.open_dataset("annual_co2")
ANNUAL CO2

Source: NOAA ESRL
https://www.esrl.noaa.gov/

[2]:
year co2_ppm uncertainty
0 1959 315.98 0.12
1 1960 316.91 0.12
2 1961 317.64 0.12
3 1962 318.45 0.12
4 1963 318.99 0.12
... ... ... ...
59 2018 408.72 0.12
60 2019 411.66 0.12
61 2020 414.24 0.12
62 2021 416.45 0.12
63 2022 418.56 0.12

64 rows × 3 columns

built-in keywords#

Some datasets, such as iem_asos, accept built-in keywords; look for “adjustable keywords” in the output of ah.list_datasets.

[3]:
import ahlive as ah
ah.open_dataset("iem_asos", stn="DEN", end="2020-01-02", elev=True, data="tmpf")
IEM ASOS

Source: Iowa Environment Mesonet ASOS
https://mesonet.agron.iastate.edu/ASOS/

[3]:
station tmpf
valid
2020-01-01 00:00:00 DEN NaN
2020-01-01 00:05:00 DEN NaN
2020-01-01 00:10:00 DEN NaN
2020-01-01 00:15:00 DEN NaN
2020-01-01 00:20:00 DEN NaN
... ... ...
2020-01-01 23:40:00 DEN NaN
2020-01-01 23:45:00 DEN NaN
2020-01-01 23:50:00 DEN NaN
2020-01-01 23:53:00 DEN 41.0
2020-01-01 23:55:00 DEN NaN

314 rows × 2 columns
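Most of the tmpf values in the output above are NaN. A minimal sketch of keeping only the rows that carry a temperature, using plain pandas on a small frame whose values are copied by hand from the last rows of that output (it does not call ahlive or fetch anything):

```python
import numpy as np
import pandas as pd

# Hand-copied stand-in for the tail of the iem_asos output shown above.
index = pd.to_datetime(
    ["2020-01-01 23:50:00", "2020-01-01 23:53:00", "2020-01-01 23:55:00"]
)
df = pd.DataFrame(
    {"station": ["DEN"] * 3, "tmpf": [np.nan, 41.0, np.nan]},
    index=index,
)
df.index.name = "valid"

# Drop the rows where no temperature was reported.
observed = df.dropna(subset=["tmpf"])
print(observed)
```

The same dropna call should work on the dataframe returned by open_dataset, assuming it has the tmpf column shown above.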

pandas keywords#

Besides the built-in keywords, all pd.read_csv keywords are supported.

[4]:
import ahlive as ah
ah.open_dataset("annual_co2", names=["Year", "CO2"], usecols=["Year", "CO2"])
ANNUAL CO2

Source: NOAA ESRL
https://www.esrl.noaa.gov/

[4]:
Year CO2
0 1959 315.98
1 1960 316.91
2 1961 317.64
3 1962 318.45
4 1963 318.99
... ... ...
59 2018 408.72
60 2019 411.66
61 2020 414.24
62 2021 416.45
63 2022 418.56

64 rows × 2 columns
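To illustrate what names and usecols do in pd.read_csv itself, here is a small sketch that reads an in-memory CSV. The CSV text is a made-up stand-in with the same column layout as the output above; it is not the file that ahlive fetches, and the header=0 argument is an assumption needed only because this stand-in has a header row.

```python
import io

import pandas as pd

# Illustrative stand-in data; not the real NOAA file.
csv_text = (
    "year,co2_ppm,uncertainty\n"
    "1959,315.98,0.12\n"
    "1960,316.91,0.12\n"
)

df = pd.read_csv(
    io.StringIO(csv_text),
    header=0,  # replace the existing header row with `names`
    names=["Year", "CO2", "Uncertainty"],  # new column names
    usecols=["Year", "CO2"],  # keep only these columns
)
print(df)
```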

For the full list of valid keywords, see the pandas documentation for pd.read_csv.
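One way to list these keywords from Python is to inspect the function signature. This sketch uses only pandas and the standard library; the exact list depends on the installed pandas version.

```python
import inspect

import pandas as pd

# Collect the parameter names accepted by pd.read_csv.
read_csv_keywords = list(inspect.signature(pd.read_csv).parameters)
print(read_csv_keywords)
```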

raw dataset#

For some datasets, ahlive subsets and renames columns after fetching. To get the raw dataset without any preprocessing, pass raw=True:

[5]:
import ahlive as ah
ah.open_dataset("iem_asos", raw=True)
IEM ASOS

Source: Iowa Environment Mesonet ASOS
https://mesonet.agron.iastate.edu/ASOS/

[5]:
station valid tmpf dwpf relh drct sknt p01i alti mslp ... wxcodes ice_accretion_1hr ice_accretion_3hr ice_accretion_6hr peak_wind_gust peak_wind_drct peak_wind_time feel metar snowdepth
0 CMI 2020-01-01 00:00 NaN NaN NaN 260.0 13.0 NaN 29.89 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN KCMI 010000Z AUTO 26013G18KT 10SM OVC025 00/M0... NaN
1 CMI 2020-01-01 00:05 NaN NaN NaN 260.0 13.0 NaN 29.89 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN KCMI 010005Z AUTO 26013KT 10SM OVC025 00/M02 A... NaN
2 CMI 2020-01-01 00:10 NaN NaN NaN 260.0 13.0 NaN 29.90 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN KCMI 010010Z AUTO 26013KT 10SM OVC024 00/M02 A... NaN
3 CMI 2020-01-01 00:15 NaN NaN NaN 260.0 11.0 NaN 29.89 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN KCMI 010015Z AUTO 26011KT 10SM OVC024 00/M02 A... NaN
4 CMI 2020-01-01 00:20 NaN NaN NaN 260.0 11.0 NaN 29.89 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN KCMI 010020Z AUTO 26011KT 10SM OVC024 00/M02 A... NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
620 CMI 2020-01-02 23:40 NaN NaN NaN 180.0 5.0 NaN 29.66 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN KCMI 022340Z AUTO 18005KT 10SM CLR 07/05 A2965... NaN
621 CMI 2020-01-02 23:45 NaN NaN NaN 180.0 5.0 NaN 29.66 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN KCMI 022345Z AUTO 18005KT 10SM CLR 07/05 A2965... NaN
622 CMI 2020-01-02 23:50 NaN NaN NaN 180.0 5.0 NaN 29.67 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN KCMI 022350Z AUTO 18005KT 10SM CLR 07/05 A2967... NaN
623 CMI 2020-01-02 23:53 44.1 41.0 88.77 190.0 6.0 0.0 29.67 1005.1 ... NaN NaN NaN NaN NaN NaN NaN 40.1 KCMI 022353Z 19006KT 10SM CLR 07/05 A2967 RMK ... NaN
624 CMI 2020-01-02 23:55 NaN NaN NaN 190.0 5.0 NaN 29.67 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN KCMI 022355Z AUTO 19005KT 10SM CLR 07/05 A2967... NaN

625 rows × 30 columns

verbose description#

To see a description of the dataset and the URL the data was fetched from, pass verbose=True:

[6]:
import ahlive as ah
ah.open_dataset("annual_co2", verbose=True)
ANNUAL CO2

Source: NOAA ESRL
https://www.esrl.noaa.gov/

Description: The carbon dioxide data on Mauna Loa constitute the longest record of direct measurements of CO2 in the atmosphere. They were started by C. David Keeling of the Scripps Institution of Oceanography in March of 1958 at a facility of the National Oceanic and Atmospheric Administration [Keeling, 1976]. NOAA started its own CO2 measurements in May of 1974, and they have run in parallel with those made by Scripps since then [Thoning, 1989].

Data: https://www.esrl.noaa.gov/gmd/webdata/ccgg/trends/co2/co2_annmean_mlo.txt

[6]:
year co2_ppm uncertainty
0 1959 315.98 0.12
1 1960 316.91 0.12
2 1961 317.64 0.12
3 1962 318.45 0.12
4 1963 318.99 0.12
... ... ... ...
59 2018 408.72 0.12
60 2019 411.66 0.12
61 2020 414.24 0.12
62 2021 416.45 0.12
63 2022 418.56 0.12

64 rows × 3 columns

return meta#

The metadata can also be returned as a variable, which is useful for adding citations to your animation. To return it, pass return_meta=True:

[7]:
import ahlive as ah
df, meta = ah.open_dataset("annual_co2", return_meta=True)
for key, val in meta.items():
    print(f"{key}: {val}")
label: annual_co2
source: NOAA ESRL
base_url: https://www.esrl.noaa.gov/
description: The carbon dioxide data on Mauna Loa constitute the longest record of direct measurements of CO2 in the atmosphere. They were started by C. David Keeling of the Scripps Institution of Oceanography in March of 1958 at a facility of the National Oceanic and Atmospheric Administration [Keeling, 1976]. NOAA started its own CO2 measurements in May of 1974, and they have run in parallel with those made by Scripps since then [Thoning, 1989].
data_url: https://www.esrl.noaa.gov/gmd/webdata/ccgg/trends/co2/co2_annmean_mlo.txt
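As a sketch of how the metadata might be turned into a citation string, the snippet below formats a plain dict. The dict is copied by hand from the printed output above (description omitted), so the example runs without ahlive; the citation layout itself is just one hypothetical choice.

```python
# Hand-copied from the output above; in practice this would be the
# `meta` returned by ah.open_dataset(..., return_meta=True).
meta = {
    "label": "annual_co2",
    "source": "NOAA ESRL",
    "base_url": "https://www.esrl.noaa.gov/",
    "data_url": "https://www.esrl.noaa.gov/gmd/webdata/ccgg/trends/co2/co2_annmean_mlo.txt",
}

# Build a short caption-style citation from the source name and URL.
citation = f"Source: {meta['source']} ({meta['base_url']})"
print(citation)
```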

use cache#

By default, all datasets are cached at $HOME/.ahlive/.

[8]:
import ahlive as ah
print(ah.DEFAULTS["cache_kwds"])
{'directory': '/home/docs/.ahlive/'}

To change the default cache directory:

[9]:
import os
import ahlive as ah
ah.config_defaults("cache", directory=os.path.expandvars("$HOME/.ahlive/new_cache"))
print(ah.DEFAULTS["cache_kwds"])
{'directory': '/home/docs/.ahlive/new_cache'}
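Note that os.path.expandvars leaves "$HOME" unexpanded when that environment variable is not set, which can happen on some platforms. A possible alternative, sketched here with the standard library only, is to build the path with pathlib.Path.home(); the ahlive call is left as a comment so the snippet runs on its own.

```python
from pathlib import Path

# Build the cache path from the user's home directory.
cache_dir = Path.home() / ".ahlive" / "new_cache"
print(cache_dir)

# The resulting string could then be passed as in the cell above:
# ah.config_defaults("cache", directory=str(cache_dir))
```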

To disable reading from the cache, pass use_cache=False:

[10]:
import ahlive as ah
ah.open_dataset("annual_co2", use_cache=False)
ANNUAL CO2

Source: NOAA ESRL
https://www.esrl.noaa.gov/

[10]:
year co2_ppm uncertainty
0 1959 315.98 0.12
1 1960 316.91 0.12
2 1961 317.64 0.12
3 1962 318.45 0.12
4 1963 318.99 0.12
... ... ... ...
59 2018 408.72 0.12
60 2019 411.66 0.12
61 2020 414.24 0.12
62 2021 416.45 0.12
63 2022 418.56 0.12

64 rows × 3 columns